How the metric is calculated and what its limitations are
The concentration tile answers the question: do a company's OpenTelemetry contributions come from many people, or does the company rely heavily on one or two individuals?
A company with high concentration is more exposed to contributor churn — if the top one or two people leave or reduce their involvement, the company's overall activity drops sharply. A distributed company is more resilient.
The tile uses the Herfindahl-Hirschman Index (HHI), a standard measure of market concentration adapted here to contribution share.
For a company with contributors 1…n, each holding a share sᵢ of the company's total GitHub contributions (expressed as a percentage 0–100):

HHI = s₁² + s₂² + … + sₙ²
HHI ranges from near 0 (contributions spread evenly across many people) to 10,000 (a single contributor holds 100%). Because each share is squared before summing, the index penalises large shares more than small ones: a company where one person holds 60% scores at least 3,600, while one where ten people each hold 10% scores 1,000, even if both companies have the same total number of contributions.
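As a minimal sketch of the calculation, assuming per-contributor counts are available as a plain list (the function name `hhi` and the input shape are illustrative, not taken from the tile's code):

```python
def hhi(contributions):
    """Herfindahl-Hirschman Index for a list of per-contributor counts.

    Shares are expressed as percentages (0-100), so the result lies
    between 0 and 10,000.
    """
    total = sum(contributions)
    if total <= 0:
        raise ValueError("total contributions must be positive")
    # Square each percentage share and add the squares together.
    return sum((100.0 * c / total) ** 2 for c in contributions)


# Ten contributors with equal counts: each holds 10%.
even = hhi([5] * 10)

# One contributor holds 60%, four others hold 10% each.
skewed = hhi([60, 10, 10, 10, 10])
```

With these illustrative inputs, `even` evaluates to 1,000 and `skewed` to 4,000, which shows how a single large share dominates the score.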
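Thresholds are modelled on the concentration bands used in US antitrust analysis (the 2010 US Horizontal Merger Guidelines place their boundaries at 1,500 and 2,500), adapted for contributor share. They align with natural breaks observed in the OTel top-10 company distribution: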
| Label | HHI range | What it means |
|---|---|---|
| Distributed | HHI < 1,500 | No single contributor dominates; activity is spread across a broad group |
| Moderate | 1,500 – 3,000 | A few key contributors stand out, but a meaningful supporting group exists |
| Concentrated | HHI > 3,000 | One or two contributors account for most activity; high dependency risk |
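The table can be read as a simple lookup. The sketch below assumes that the boundary values 1,500 and 3,000 both fall in the Moderate band, which is how the ranges above read; the function name `concentration_label` is hypothetical.

```python
def concentration_label(hhi_value):
    """Map an HHI value (0-10,000) to the tile's label."""
    if hhi_value < 1500:
        return "Distributed"
    if hhi_value <= 3000:
        # Assumption: exactly 1,500 and exactly 3,000 count as Moderate.
        return "Moderate"
    return "Concentrated"


labels = [concentration_label(v) for v in (1000, 2200, 4000)]
```

For the three sample values, `labels` comes out as Distributed, Moderate, and Concentrated in that order.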
The contributor leaderboard in the local cache is built from GitHub handles matched against
affiliation files (affiliations.json from CNCF gitdm and
github-companies.json from GitHub profile data). It captures GitHub-mediated
activity only: commits, pull requests, code reviews, and GitHub issues.
The LF Insights API also counts contributions on Jira, Confluence, Gerrit, GitLab, and plain Git. These are included in each company's total contribution count shown at the top of the panel, but they cannot be broken down per contributor in the current data pipeline. Companies that are heavy users of non-GitHub platforms will therefore show a gap between the API-reported total and the sum of the per-contributor counts in the cache.
Before computing HHI, the tile checks coverage: the ratio of the sum of all cached contributor contributions to the company's API-reported total.
If coverage is below 30%, the tile shows "Not enough GitHub data to assess" instead of a label. This protects against misleading classifications for companies whose OTel activity is dominated by Jira, Confluence, or other non-GitHub platforms — where the few GitHub contributors we do see may look artificially concentrated simply because most contributors are invisible to the cache.
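The coverage gate might look roughly like the following. This is a sketch under stated assumptions: the names `coverage`, `coverage_message`, and `MIN_COVERAGE` are hypothetical, and treating a zero API total as zero coverage is a choice made here rather than something the tile is known to do.

```python
MIN_COVERAGE = 0.30  # threshold described above
NOT_ENOUGH_DATA = "Not enough GitHub data to assess"


def coverage(cached_counts, api_total):
    """Share of the API-reported total that the cached contributors explain."""
    if api_total <= 0:
        # Assumption: with no reported total, treat coverage as zero.
        return 0.0
    return sum(cached_counts) / api_total


def coverage_message(cached_counts, api_total):
    """Return the fallback message when coverage is too low, else None."""
    if coverage(cached_counts, api_total) < MIN_COVERAGE:
        return NOT_ENOUGH_DATA
    return None


# 20 cached contributions against an API total of 100 is 20% coverage.
low = coverage_message([12, 8], 100)

# 45 cached contributions against the same total is 45% coverage.
ok = coverage_message([30, 15], 100)
```

Here `low` holds the fallback message and `ok` is `None`, meaning the tile would go on to compute HHI and show a label.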
Company-to-contributor mapping follows this priority:
1. CNCF gitdm (affiliations.json): highest confidence, manually curated
2. GitHub profile data (github-companies.json): self-reported, may be stale or informal

A contributor whose GitHub profile lists an informal company name (e.g. "@Elastic" vs "Elastic") may be missed or double-counted across the two affiliation files.
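The lookup order can be sketched as below. The real schemas of affiliations.json and github-companies.json are not described here, so this assumes each file has already been loaded into a dict that maps a GitHub handle to a company name; `resolve_company` and the sample handles are hypothetical.

```python
def resolve_company(handle, gitdm, profiles):
    """Return (company, source) for a GitHub handle, or (None, None).

    gitdm    -- assumed dict built from affiliations.json
    profiles -- assumed dict built from github-companies.json
    """
    # The curated CNCF gitdm mapping takes priority.
    if handle in gitdm:
        return gitdm[handle], "affiliations.json"
    # Fall back to the self-reported GitHub profile company.
    if handle in profiles:
        return profiles[handle], "github-companies.json"
    return None, None


# Made-up sample data for illustration only.
gitdm = {"alice": "Elastic"}
profiles = {"alice": "@Elastic", "bob": "Example Corp"}

alice = resolve_company("alice", gitdm, profiles)
bob = resolve_company("bob", gitdm, profiles)
carol = resolve_company("carol", gitdm, profiles)
```

In this sample, `alice` resolves through the curated file even though her profile lists "@Elastic", `bob` falls back to the profile data, and `carol` is unmatched.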
The cache is refreshed daily for short periods (30d–1y) and weekly for longer ones. The concentration label reflects the state at the last cache refresh, not real time. Contributor affiliations in CNCF gitdm and GitHub profiles are also updated on a schedule, so recent job changes may not be reflected immediately.