Editorial standard
Methodology
AutoIndex24 is a research publication. Every figure on the site traces back to a public dataset and is recomputed on a fixed schedule. The pipeline below describes how a single statistic gets from raw record to published prose.
Sources
NHTSA vPIC
Vehicle catalog (make / model / trim) with year-of-manufacture coverage from 1981.
EPA fueleconomy.gov
Per-trim EPA city / highway / combined MPG, MPGe, electric range, and user-reported real-world MPG.
NHTSA Recalls API
All formal recall campaigns affecting US-market vehicles.
NHTSA NCDB (Vehicle Owner’s Questionnaire)
Approximately 2.2 million consumer complaints, including odometer reading at failure and standardized component category.
NHTSA FARS
Fatality Analysis Reporting System — every fatal motor-vehicle crash on US public roads, 2014–2022.
DOE AFDC
Operational EV charging stations and federal/state incentive registry.
EIA
Monthly residential electricity prices and weekly retail gasoline prices, both at state granularity.
Cars.com inventory snapshot
Bi-weekly count-only snapshots of for-sale inventory, captured via Cars.com’s public GraphQL filter endpoint. We never store individual listings.
Aggregates only
For aggregator-derived figures, we never persist row-level listings. Each query returns only total result counts and distribution buckets; the raw responses are aggregated in worker memory and discarded before commit. The database stores only summary statistics (sample size, median, percentiles, facet counts) keyed by model, year, region and snapshot date.
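As an illustration of how summary statistics can be derived without ever holding individual listings, the sketch below estimates percentiles from bucket counts alone. The `(low, high, count)` bucket shape, the function names, and the use of linear interpolation within a bucket are assumptions made for this example, not a description of the production code.

```python
from typing import Dict, List, Tuple

# Hypothetical bucket shape: (lower bound, upper bound, listing count).
Bucket = Tuple[float, float, int]


def percentile_from_buckets(buckets: List[Bucket], q: float) -> float:
    """Estimate the q-th quantile (0 <= q <= 1) from sorted bucket counts.

    Assumes values are spread evenly inside each bucket (linear interpolation).
    """
    total = sum(count for _, _, count in buckets)
    if total == 0:
        raise ValueError("no observations in any bucket")
    target = q * total
    cumulative = 0
    for low, high, count in buckets:
        if count and cumulative + count >= target:
            fraction = (target - cumulative) / count
            return low + fraction * (high - low)
        cumulative += count
    return buckets[-1][1]


def summarize(buckets: List[Bucket]) -> Dict[str, float]:
    """Build the only record that would be stored: summary statistics."""
    return {
        "n": sum(count for _, _, count in buckets),
        "p25": percentile_from_buckets(buckets, 0.25),
        "median": percentile_from_buckets(buckets, 0.50),
        "p75": percentile_from_buckets(buckets, 0.75),
    }


# Example with made-up price buckets; no individual listing is ever present.
example = [(10_000, 20_000, 10), (20_000, 30_000, 30), (30_000, 40_000, 10)]
summary = summarize(example)
```

Only the dictionary returned by `summarize` would need to be written to storage; the bucket list can be dropped once it has been reduced.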
Pipeline
- Question. A research prompt is composed against the available data slices.
- Analyst (Python). A deterministic SQL/pandas step computes findings and renders charts. No LLM is involved at this stage.
- Writer (LLM). An English narrative is composed around the analyst’s findings file. The writer is explicitly forbidden from arithmetic, estimation, or inferring numbers not present in the findings.
- Validator. Every numeric token in the prose is extracted via regular expression and matched against the findings file. If any figure is unmatched, the draft is rejected and sent back to the writer.
- Editor. A style-only pass that may not change numbers.
- Publisher. Schema.org markup, hreflang placeholders, sitemap insertion, and canonical URL emission.
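The validator step can be sketched roughly as follows. This is a minimal illustration, assuming the findings file can be loaded as a flat mapping of names to numbers; the regular expression, the function names, and the `Decimal`-based comparison are hypothetical choices for this example rather than the production implementation.

```python
import re
from decimal import Decimal
from typing import Dict, List, Union

# Assumed token pattern: digits with optional thousands separators and an
# optional decimal part, e.g. "30", "2,200", "31.5".
NUMERIC_TOKEN = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def extract_numbers(prose: str) -> List[Decimal]:
    """Return every numeric token in the prose as an exact Decimal."""
    return [Decimal(tok.replace(",", "")) for tok in NUMERIC_TOKEN.findall(prose)]


def unmatched_figures(
    prose: str, findings: Dict[str, Union[int, float, str]]
) -> List[Decimal]:
    """List numbers that appear in the prose but not in the findings."""
    allowed = {Decimal(str(value)) for value in findings.values()}
    return [n for n in extract_numbers(prose) if n not in allowed]


def validate(prose: str, findings: Dict[str, Union[int, float, str]]) -> bool:
    """A draft passes only if every numeric token is backed by a finding."""
    return not unmatched_figures(prose, findings)


# Hypothetical findings and drafts, for illustration only.
findings = {"median_mpg": 31.5, "sample_size": 2200}
good_draft = "Across 2,200 reports, the median was 31.5 MPG."
bad_draft = "Across 2,200 reports, the median was about 32 MPG."
```

Under these assumptions `validate(good_draft, findings)` passes, while `bad_draft` fails because the rounded figure does not appear in the findings. A strict check like this also means that incidental numbers in the prose, such as model years, have to be present in the findings to pass.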
Sample-size policy
Slices with fewer than n = 30 observations are not published. Slices between n = 30 and n = 100 are published with a low-confidence banner. Percentile estimates require n ≥ 50.
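The thresholds above can be expressed as a small gating function. This is a sketch only: the status labels and function names are invented for illustration, and it assumes the upper bound of the low-confidence band (n = 100) is inclusive, which the policy text does not state explicitly.

```python
MIN_PUBLISH_N = 30        # below this, the slice is not published
LOW_CONFIDENCE_MAX_N = 100  # assumed inclusive upper bound of the banner band
MIN_PERCENTILE_N = 50     # percentile estimates require at least this many


def publication_status(n: int) -> str:
    """Classify a slice by sample size (labels are hypothetical)."""
    if n < MIN_PUBLISH_N:
        return "suppressed"
    if n <= LOW_CONFIDENCE_MAX_N:
        return "low_confidence"
    return "published"


def percentiles_allowed(n: int) -> bool:
    """Percentile estimates are only reported for n >= 50."""
    return n >= MIN_PERCENTILE_N
```

Note that the two rules interact: a slice with n between 30 and 49 would be published with a banner but without percentile estimates.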
Update cadence
Bi-weekly: aggregator inventory + price-distribution snapshots. Monthly: NHTSA complaints, EPA real-world MPG. Annually: NHTSA FARS (typically released by NHTSA in Q4 for the prior year).
Limitations
- NHTSA complaint volumes are self-reported and skew toward owners who experience problems and choose to file a complaint. Cross-model comparisons partially control for this by reporting complaints per 1,000 registered vehicles where registration data is available.
- FARS lags by approximately two years.
- Bi-weekly snapshots are point-in-time, so a transient market distortion captured in one snapshot may persist in the published figures for a single observation cycle.
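The per-1k normalization mentioned in the first limitation amounts to the arithmetic below. This is a minimal sketch: the function name and the choice to return `None` when registration data is missing are assumptions for illustration.

```python
from typing import Optional


def complaints_per_1k(
    complaints: int, registered_vehicles: Optional[int]
) -> Optional[float]:
    """Complaints per 1,000 registered vehicles.

    Returns None when registration data is unavailable or zero, so callers
    can fall back to raw counts instead of publishing a misleading rate.
    """
    if not registered_vehicles:
        return None
    return 1000 * complaints / registered_vehicles


# Made-up numbers: 150 complaints against 60,000 registered vehicles.
rate = complaints_per_1k(150, 60_000)
```

Expressing complaints as a rate lets a high-volume model and a niche model be compared on the same scale, although it does not remove the self-selection in who files a complaint.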
Methodology last reviewed: 2026-05-03.