
Contribution Concentration

How the metric is calculated and what its limitations are

What it measures

The concentration tile answers the question: do a company's OpenTelemetry contributions come from many people, or does the company rely heavily on one or two individuals?

A company with high concentration is more exposed to contributor churn — if the top one or two people leave or reduce their involvement, the company's overall activity drops sharply. A distributed company is more resilient.

The metric: Herfindahl-Hirschman Index (HHI)

The tile uses the Herfindahl-Hirschman Index, a standard measure of market concentration adapted here to contribution share.

For a company with contributors 1…n, each holding a share sᵢ of the company's total GitHub contributions (expressed as a percentage 0–100):

HHI = Σ sᵢ²

HHI ranges from near 0 (contributions spread perfectly evenly across many people) to 10,000 (a single contributor holds 100%). With n contributors holding equal shares, HHI equals 10,000 / n. The key property is that squaring penalises large shares more than small ones: a company where one person holds 60% scores at least 3,600, while one where ten people each hold 10% scores 1,000, even if both companies have the same total number of contributions.
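As a worked illustration of the formula, here is a minimal Python sketch that computes HHI from raw per-contributor counts. The function name `hhi` and the sample counts are hypothetical and are not taken from the tracker's code.

```python
from typing import Sequence


def hhi(contributions: Sequence[float]) -> float:
    """Return the HHI (0 to 10,000) for a list of per-contributor counts."""
    total = sum(contributions)
    if total <= 0:
        raise ValueError("total contributions must be positive")
    # Convert each count to a percentage share (0-100), then sum the squares.
    return sum((100.0 * c / total) ** 2 for c in contributions)


# Hypothetical sample data, for illustration only.
even_team = [10] * 10             # ten people, 10% each
top_heavy = [60, 10, 10, 10, 10]  # one person holds 60%

print(hhi(even_team))  # 1000.0
print(hhi(top_heavy))  # 4000.0
```

Both sample companies have 100 contributions in total, yet the top-heavy one scores four times higher because the 60% share alone contributes 3,600 points.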

Classification thresholds

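Thresholds are modelled on the concentration bands used in US antitrust analysis (the 2010 Horizontal Merger Guidelines use 1,500 and 2,500) and adapted for contributor share, with the upper boundary set at 3,000. They align with natural breaks observed in the OTel top-10 company distribution: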

Label | HHI range | What it means
Distributed | HHI < 1,500 | No single contributor dominates; activity is spread across a broad group
Moderate | 1,500 – 3,000 | A few key contributors stand out, but a meaningful supporting group exists
Concentrated | HHI > 3,000 | One or two contributors account for most activity; high dependency risk
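The thresholds above can be sketched as a small lookup function. This is an illustrative sketch only: the name `classify_hhi` is hypothetical, and placing values of exactly 1,500 and 3,000 in the Moderate band is an assumption read off the ranges as written.

```python
def classify_hhi(hhi_value: float) -> str:
    """Map an HHI value (0 to 10,000) to a concentration label."""
    if hhi_value < 1500:
        return "Distributed"
    # Assumption: both boundary values (1,500 and 3,000) count as Moderate.
    if hhi_value <= 3000:
        return "Moderate"
    return "Concentrated"


print(classify_hhi(1000))  # Distributed
print(classify_hhi(4000))  # Concentrated
```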

Limitations

GitHub activity only

The contributor leaderboard in the local cache is built from GitHub handles matched against affiliation files (affiliations.json from CNCF gitdm and github-companies.json from GitHub profile data). It captures GitHub-mediated activity only: commits, pull requests, code reviews, and GitHub issues.

The LF Insights API also counts contributions on Jira, Confluence, Gerrit, GitLab, and plain Git. These are included in each company's total contribution count shown at the top of the panel, but they cannot be broken down per contributor in the current data pipeline. Companies that are heavy users of non-GitHub platforms will therefore show a gap between the API-reported total and the sum of the per-contributor counts in the cache.

Coverage check and the "not enough data" fallback

Before computing HHI, the tile checks coverage: the ratio of the summed per-contributor counts in the cache to the company's API-reported total.

coverage = Σ contributor_contributions (cache) / org_total (API)

If coverage is below 30%, the tile shows "Not enough GitHub data to assess" instead of a label. This protects against misleading classifications for companies whose OTel activity is dominated by Jira, Confluence, or other non-GitHub platforms — where the few GitHub contributors we do see may look artificially concentrated simply because most contributors are invisible to the cache.
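The coverage gate can be sketched as follows. This is a minimal illustration under stated assumptions, not the tracker's implementation: the names `coverage`, `concentration_label`, and `MIN_COVERAGE` are hypothetical, a non-positive API total is treated as zero coverage, and shares are computed against the cached GitHub total, as in the HHI definition.

```python
from typing import Sequence

MIN_COVERAGE = 0.30  # from the text: below 30% coverage, no label is shown
NOT_ENOUGH_DATA = "Not enough GitHub data to assess"


def coverage(cached: Sequence[float], org_total: float) -> float:
    """Ratio of cached per-contributor counts to the API-reported total."""
    if org_total <= 0:
        return 0.0  # assumption: a missing or zero total means no coverage
    return sum(cached) / org_total


def concentration_label(cached: Sequence[float], org_total: float) -> str:
    """Return a concentration label, or the fallback when coverage is low."""
    if coverage(cached, org_total) < MIN_COVERAGE:
        return NOT_ENOUGH_DATA
    github_total = sum(cached)  # positive here, since coverage >= 30%
    hhi = sum((100.0 * c / github_total) ** 2 for c in cached)
    if hhi < 1500:
        return "Distributed"
    if hhi <= 3000:
        return "Moderate"
    return "Concentrated"


# Hypothetical data: the same five contributors, two different API totals.
print(concentration_label([60, 10, 10, 10, 10], org_total=1000))  # fallback
print(concentration_label([60, 10, 10, 10, 10], org_total=100))   # Concentrated
```

In the first call only 10% of the company's activity is visible in the cache, so the apparent 60% share of one contributor is not treated as evidence of concentration.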

Affiliation accuracy

Company-to-contributor mapping follows this priority:

A contributor whose GitHub profile lists an informal company name (e.g. "@Elastic" vs "Elastic") may be missed or double-counted across the two affiliation files.

Snapshot in time

The cache is refreshed daily for short periods (30d–1y) and weekly for longer ones. The concentration label reflects the state at the last cache refresh, not real time. Contributor affiliations in CNCF gitdm and GitHub profiles are also updated on a schedule, so recent job changes may not be reflected immediately.
