What the index covers

The GCPI tracks publicly posted inference API prices, standardized to a single unit: USD per 1,000,000 input tokens. Coverage is restricted to synchronous, token-priced inference APIs — the dominant billing surface for AI application developers and enterprise buyers.

Dimension       | Scope (v0.1)                 | Future editions
Service type    | Inference API (token-priced) | Training, edge compute
Pricing layer   | Public list price            | Negotiated / volume tiers
Geography       | North America, Europe        | Asia-Pacific (Vol. II+)
Reference arch. | Llama 3.1 family             | Additional open-weight families

Source hierarchy and confidence tiers

Each price observation carries a confidence tag reflecting the reliability of its source.

Tier | Source                                              | Tag
L1   | Official vendor pricing pages (direct URL)          | high
L2   | Vendor announcements and press releases             | high
L3   | Third-party comparison platforms (cross-validation) | medium
L4   | Web archive snapshots (Wayback Machine, back-fill)  | reconstructed

Only high and medium sources contribute to the GCPI headline. Points tagged reconstructed are published separately and are appropriate for qualitative trend analysis only.
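As a minimal sketch of this eligibility rule, the filter below keeps only observations tagged high or medium. The record layout (dicts with provider, price and confidence keys) and the sample values are hypothetical and are not the published file schema.

```python
# Confidence tags that may feed the headline, per the source hierarchy above.
HEADLINE_TAGS = {"high", "medium"}

def headline_eligible(observations):
    """Return only the observations whose confidence tag may enter the headline."""
    return [obs for obs in observations if obs["confidence"] in HEADLINE_TAGS]

# Hypothetical records; field names and prices are illustrative only.
sample = [
    {"provider": "A", "price": 0.10, "confidence": "high"},
    {"provider": "B", "price": 0.12, "confidence": "medium"},
    {"provider": "C", "price": 0.20, "confidence": "reconstructed"},
]
eligible = headline_eligible(sample)
```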

v0.1.0 Status: The inaugural GCPI headline (April 2026) is computed from high and medium observed data only. The historical series (2024-Q3 through 2025-Q4) is constructed entirely from reconstructed data and is published as a supplementary series for trend analysis only. Do not use the reconstructed series as primary input to quantitative models. Each reconstructed price file carries explicit source attributions in the notes column.

Reference architecture and tier definitions

The Llama 3.1 family serves as the reference architecture. Its weights are public, widely deployed, and priced on a per-token basis by the majority of covered vendors — enabling like-for-like comparison across heterogeneous hardware.

Tier       | Reference model | Parameter range
S (Small)  | Llama 3.1 8B    | ≤ 10B
M (Medium) | Llama 3.1 70B   | 10B – 100B
L (Large)  | Llama 3.1 405B  | > 100B

A constituent whose price exceeds 5× the within-tier median is flagged with outlier_flag = true and excluded from that period's headline calculation, but it is still published in the data appendix.
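The outlier rule can be sketched as follows. This assumes the within-tier median is taken over all constituents in the tier for that period, including the candidate itself; the text does not say whether the candidate is excluded first. Provider names and prices are hypothetical.

```python
from statistics import median

OUTLIER_MULTIPLE = 5.0  # rule above: price greater than 5x the within-tier median

def flag_outliers(prices_by_provider):
    """Map provider -> outlier_flag for one tier in one period."""
    # Assumption: the median includes every constituent, the candidate too.
    tier_median = median(prices_by_provider.values())
    return {
        provider: price > OUTLIER_MULTIPLE * tier_median
        for provider, price in prices_by_provider.items()
    }

# Hypothetical S-tier prices in USD per 1M input tokens.
s_tier = {"A": 0.05, "B": 0.06, "C": 0.08, "D": 0.50}
flags = flag_outliers(s_tier)
```

Here the median is 0.07, so the threshold is 0.35 and only provider D is flagged.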

Weighted geometric mean

The GCPI headline is constructed as a weighted geometric mean of constituent prices:

$$\mathrm{GCPI}_t = \prod_i P_{i,t}^{\,w_i} \quad \text{subject to} \quad \sum_i w_i = 1$$

where $P_{i,t}$ is the standardized price of constituent $i$ in period $t$ (USD / 1M input tokens) and $w_i$ is the weight of constituent $i$.
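A minimal sketch of the headline calculation, assuming prices and weights are supplied as dicts keyed by constituent. The function name, constituent labels and numbers are hypothetical. The product is evaluated in log space, which is algebraically identical to the formula above.

```python
import math

def gcpi(prices, weights):
    """Weighted geometric mean of constituent prices; weights must sum to 1."""
    if set(prices) != set(weights):
        raise ValueError("prices and weights must cover the same constituents")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    # exp(sum_i w_i * ln P_i) equals prod_i P_i ** w_i.
    log_index = sum(weights[k] * math.log(prices[k]) for k in prices)
    return math.exp(log_index)

# Hypothetical constituents (USD per 1M input tokens) and weights.
prices = {"A": 0.05, "B": 0.20, "C": 0.50}
weights = {"A": 0.5, "B": 0.3, "C": 0.2}
headline = gcpi(prices, weights)
```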

Why geometric mean? The geometric mean is less sensitive to extreme values than the arithmetic mean — appropriate given the current 10× price spread. Under a multiplicative structure, period-over-period changes decompose additively as log-price contributions, consistent with standard index number theory (Diewert 1978).
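The additive decomposition follows from taking logs of the index formula. A short derivation, assuming the weights are held fixed between periods t − 1 and t (as they are between rebalancing dates):

```latex
\begin{aligned}
\ln \mathrm{GCPI}_t &= \sum_i w_i \ln P_{i,t} \\
\ln \mathrm{GCPI}_t - \ln \mathrm{GCPI}_{t-1}
  &= \sum_i w_i \left( \ln P_{i,t} - \ln P_{i,t-1} \right)
\end{aligned}
```

Each term on the right-hand side is constituent i's contribution to the log change in the headline. Across a rebalancing date the weights differ between the two periods, so this identity no longer holds exactly.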

Weights proxy estimated API traffic market share, derived from reported API volumes, Hugging Face Hub statistics, and funding disclosures. Weights are rebalanced quarterly; between rebalancing dates they are frozen and only price inputs update.

Weight disclosure (v0.1.0): Current weights are analyst estimates, not empirical market-share statistics. They are based on qualitative triangulation of: (1) public statements from providers about request/token volumes; (2) Hugging Face Hub model download statistics as a demand proxy; (3) funding round announcements and reported revenue for scale calibration. No official industry statistics are available for this market. A sensitivity analysis of the GCPI headline to ±5pp weight shifts is provided in the data appendix (Data page). Formal weight estimation is planned for Vol. II.
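One way to picture a ±5pp sensitivity check is sketched below. It assumes the shift is applied to a single constituent and that the remaining weights are rescaled proportionally so the total stays at 1. The text does not specify how the appendix redistributes the shifted weight, so this scheme, the helper names and all numbers are hypothetical.

```python
import math

def shift_weight(weights, target, delta):
    """Move target's weight by delta and rescale the others so the sum stays 1."""
    new_target = weights[target] + delta
    rest = 1.0 - weights[target]
    if rest <= 0.0 or not 0.0 <= new_target <= 1.0:
        raise ValueError("shift is not feasible for these weights")
    scale = (1.0 - new_target) / rest  # proportional rescaling (an assumption)
    return {k: new_target if k == target else w * scale for k, w in weights.items()}

def geo_mean(prices, weights):
    """Weighted geometric mean, evaluated in log space."""
    return math.exp(sum(weights[k] * math.log(prices[k]) for k in prices))

prices = {"A": 0.05, "B": 0.20, "C": 0.50}   # hypothetical prices
weights = {"A": 0.5, "B": 0.3, "C": 0.2}     # hypothetical weights
base = geo_mean(prices, weights)
up = geo_mean(prices, shift_weight(weights, "A", +0.05))
down = geo_mean(prices, shift_weight(weights, "A", -0.05))
```

Because A is the cheapest constituent in this example, giving it more weight lowers the index and giving it less weight raises it.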

AWS derived price disclosure (§3.3): Providers AWS G5 (spot) and AWS P5 (on-demand) do not publish per-token prices. A derived price is computed from the published GPU-hour rate using benchmark throughput assumptions:

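  • g5.xlarge: throughput assumption of 600 tokens/sec (Llama 3.1 8B, vLLM, FP16).
    Formula: P_token = P_gpu_hr / (600 × 3,600) × 1,000,000, in USD per 1M tokens.
  • p5.48xlarge (8×H100): throughput assumption of 2,800 tokens/sec (Llama 3.1 8B, vLLM, FP8, 8-way tensor parallel).
    Formula: P_token = P_gpu_hr / (2,800 × 3,600) × 1,000,000, in USD per 1M tokens.

Here P_gpu_hr is the published hourly instance rate in USD, and 3,600 converts tokens per second to tokens per hour.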

Throughput benchmarks are sourced from public vLLM benchmark reports (LMSys, 2024). Actual throughput may vary ±30%. These assumptions are locked for the duration of the v0.1.0 data series and any revision will trigger a data version bump with full re-computation of affected periods.
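A minimal sketch of the derived-price conversion, assuming tokens per hour equal tokens/sec × 3,600 and that the target unit is USD per 1M tokens. The function names and the 1.00 USD hourly rate are hypothetical; the 600 tokens/sec figure and the ±30% band come from the text above.

```python
SECONDS_PER_HOUR = 3600
TOKENS_PER_MILLION = 1_000_000

def derived_price_per_million(gpu_hour_usd, tokens_per_sec):
    """USD per 1M tokens implied by an hourly instance rate and a throughput."""
    tokens_per_hour = tokens_per_sec * SECONDS_PER_HOUR
    return gpu_hour_usd / tokens_per_hour * TOKENS_PER_MILLION

def throughput_band(gpu_hour_usd, tokens_per_sec, tolerance=0.30):
    """(low, central, high) price as throughput varies by +/- tolerance."""
    central = derived_price_per_million(gpu_hour_usd, tokens_per_sec)
    # Higher throughput means a lower price per token, and vice versa.
    low = derived_price_per_million(gpu_hour_usd, tokens_per_sec * (1 + tolerance))
    high = derived_price_per_million(gpu_hour_usd, tokens_per_sec * (1 - tolerance))
    return low, central, high

# Hypothetical 1.00 USD/hour rate with the g5.xlarge throughput assumption.
low, central, high = throughput_band(1.00, 600)
```

Price is inversely proportional to throughput, so a symmetric ±30% throughput band maps to an asymmetric price band: the upside, central / 0.7, is larger than the downside, central / 1.3.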

Series          | Frequency | Weight update
GCPI (official) | Monthly   | Quarterly rebalancing
GCPI–Spot       | Weekly    | Frozen to latest rebalancing

Event-study framework

The GRP is published as an independent series, decoupled from the GCPI headline. It decomposes the portion of GCPI movement attributable to geopolitical supply shocks.

GRP is estimated using a standard event-study methodology (MacKinlay 1997):

Window            | Definition
Estimation window | [τ − 90, τ − 10] days
Event window      | [τ, τ + 30] days

Average abnormal price changes (the analogue of average abnormal returns, AAR, in MacKinlay's framework) are computed for the affected provider cohort relative to the estimation-window baseline. Cumulative abnormal returns (CAR) are then summed across events to produce the headline GRP percentage.
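The window arithmetic can be sketched as below. Several choices here are assumptions not stated in the text: daily log price changes stand in for returns, the normal-change model is a constant mean over the estimation window, and the cohort average is an unweighted mean. All names are hypothetical, and this is not the procedure behind the published v0.1.0 GRP.

```python
import math

EST_START, EST_END = -90, -10   # estimation window, days relative to tau
EVT_START, EVT_END = 0, 30      # event window, days relative to tau

def cumulative_abnormal_change(prices, tau):
    """CAR of daily log price changes for one daily series around index tau."""
    if tau + EST_START < 1 or tau + EVT_END >= len(prices):
        raise ValueError("price series does not cover both windows")

    def log_change(day):
        return math.log(prices[day] / prices[day - 1])

    est = [log_change(tau + d) for d in range(EST_START, EST_END + 1)]
    baseline = sum(est) / len(est)  # constant-mean model (an assumption)
    abnormal = [log_change(tau + d) - baseline
                for d in range(EVT_START, EVT_END + 1)]
    return sum(abnormal)

def average_car(series_by_provider, tau):
    """Unweighted mean of per-provider CARs across the affected cohort."""
    cars = [cumulative_abnormal_change(p, tau) for p in series_by_provider.values()]
    return sum(cars) / len(cars)

# Hypothetical series: flat at 1.00, then a 10% step up on the event day.
stepped = [1.0] * 100 + [1.1] * 31
car = cumulative_abnormal_change(stepped, tau=100)
```

With a flat pre-event baseline, the CAR in this example equals ln(1.1), the log size of the step. Summing such figures across events would follow the aggregation described above.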

GRP disclosure (v0.1.0): The GRP series published in this edition is an analyst estimate, not a statistically computed result from the event-study framework above. Insufficient contemporaneous price observations existed at publication time to estimate credible estimation-window baselines. The v0.1.0 GRP represents calibrated expert judgment informed by the documented event log. It carries confidence: analyst_estimate in the underlying data files. The formal statistical GRP will be introduced in Vol. II (planned May 2026), once 12+ months of directly observed monthly price data are available. Full event log and risk scores are on the Geo Risk page.

When and how history is revised

Condition                             | Treatment
Vendor retroactively corrects a price | Revise affected period; log in CHANGELOG
Unit-of-measure error                 | Revise affected period; bump data version
Methodology change                    | Run old and new series in parallel ≥ 2 periods
Quarterly weight rebalancing          | Does not revise history; applies from current period

Methodology versioning follows MAJOR.MINOR.PATCH. Data releases follow YYYY-MM-[letter] (e.g., 2026-04-a). All changes are recorded in CHANGELOG.md.
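A small sketch of how the two identifier formats might be validated. The patterns assume no leading zeros in methodology version parts and a single lowercase letter as the data-release suffix; neither detail is confirmed by the text, and the helper names are hypothetical.

```python
import re

# MAJOR.MINOR.PATCH, e.g. 0.1.0 (methodology version).
METHODOLOGY_RE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")
# YYYY-MM-[letter], e.g. 2026-04-a (data release); lowercase letter assumed.
DATA_RELEASE_RE = re.compile(r"\d{4}-(0[1-9]|1[0-2])-[a-z]")

def is_methodology_version(text):
    """True if text looks like MAJOR.MINOR.PATCH."""
    return METHODOLOGY_RE.fullmatch(text) is not None

def is_data_release(text):
    """True if text looks like YYYY-MM-[letter] with a valid month."""
    return DATA_RELEASE_RE.fullmatch(text) is not None
```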

What the index does not capture

  1. Weight uncertainty. Constituent weights are estimated proxies, not official statistics.
  2. List price vs. negotiated price. Enterprise volume discounts may be materially below list. GCPI represents an upper bound for high-volume buyers.
  3. Reconstructed series noise. Back-filled points rely on archive snapshots and may have gaps. Use for qualitative trend analysis only.
  4. Reference architecture basis risk. Standardization on Llama 3.1 introduces cross-architecture approximation for providers with different primary offerings.
  5. Research use only. GCPI does not constitute financial, investment, or procurement advice.

Transparency notes for v0.1.0

The following disclosures describe material aspects of the data that readers should be aware of before drawing conclusions from the GCPI time series.

Reconstructed historical series

The GCPI index values for 2024-Q3 through 2025-Q4 were computed after the fact (i.e., back-filled), not recorded contemporaneously. Each price snapshot was reconstructed from web archive sources, third-party trackers, and provider announcements. All reconstructed observations carry confidence: reconstructed in the data files. The inaugural April 2026 value is the only directly observed period in v0.1.0. The reconstructed series is published as a supplementary trend series to provide context; it is not a substitute for a formal prospective price series.

Index composition changes across periods

The set of providers differs across historical periods:

  • Octo.ai (2024-Q3 only): Octo.ai was included in Q3 2024 at its published list price. The service was subsequently discontinued following its acquisition by NVIDIA. Octo.ai is excluded from all post-Q3 2024 periods.
  • Clarifai and Replicate added in 2024-Q4: These providers were added in Q4 2024 to reflect their growing market presence. Earlier periods without these providers are not directly composition-comparable.
  • AWS added in 2025-Q1: AWS G5 (spot) and AWS P5 (on-demand) derived prices were added from Q1 2025 onwards. Their inclusion shifts the GCPI level upward slightly due to higher derived prices; this is a composition effect, not a real price increase.

A fixed-composition index holding only providers present in all periods would show a larger price decline. The published series uses a chained approach: each period uses the constituents available at that time, which better reflects the market structure of each period but limits strict period-over-period comparability. Users requiring strict composition consistency should download the underlying CSV files and compute custom series.
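For readers building a custom series, a fixed-composition variant might look like the sketch below. It assumes prices have already been loaded into a dict of period -> provider -> price, and that a single weight vector is renormalized over the providers present in every period. The data layout, names and numbers are hypothetical and do not reflect the published CSV schema.

```python
import math

def fixed_composition_index(prices_by_period, weights):
    """Geometric-mean index restricted to providers present in every period."""
    periods = list(prices_by_period)
    common = set.intersection(*(set(prices_by_period[p]) for p in periods))
    if not common:
        raise ValueError("no provider is present in every period")
    total = sum(weights[k] for k in common)
    norm = {k: weights[k] / total for k in common}  # renormalize to sum to 1
    return {
        p: math.exp(sum(norm[k] * math.log(prices_by_period[p][k]) for k in common))
        for p in periods
    }

# Hypothetical data: provider "C" enters only in the second period.
prices_by_period = {
    "2024-Q4": {"A": 0.10, "B": 0.20},
    "2025-Q1": {"A": 0.08, "B": 0.16, "C": 0.60},
}
weights = {"A": 0.5, "B": 0.3, "C": 0.2}
series = fixed_composition_index(prices_by_period, weights)
```

In this example A and B both fall by 20%, so the fixed-composition series falls by exactly 20%. A chained series that admits the higher-priced C in the second period would show a smaller decline.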

Groq pricing — input price only

Groq publishes separate input and output prices ($0.05 and $0.08 per 1M tokens respectively, as of April 2026). The GCPI uses the input price only ($0.050/1M), consistent with the index standard of USD per 1M input tokens. Prior to this clarification being formalized in v0.1.0, some internal working documents used a blended midpoint ($0.065). All published data files and index calculations use the input price exclusively.

Diewert, W.E. (1978). Superlative index numbers and consistency in aggregation. Econometrica, 46(4), 883–900.

MacKinlay, A.C. (1997). Event studies in economics and finance. Journal of Economic Literature, 35(1), 13–39.