Why we built a World Cup API
Public sports data sits at two extremes: free but stuck inside HTML, or sold to broadcasters by the year. We wanted the middle ground — a developer-friendly JSON API for the FIFA World Cup — and when we couldn’t buy one, we built it.
The problem
Building anything that touches World Cup data today means picking between two unattractive options.
On one end, the encyclopedic sources — Wikipedia, RSSSF, FIFA’s own archive — have everything. Every squad since 1930, every match, every goalscorer. The catch is they’re published as articles for humans, not endpoints for code. Want it as JSON? Write a scraper, maintain it as page layouts shift, parse fragile HTML, hope nothing changes mid-tournament.
On the other end, the established sports-data vendors (Sportradar, Genius Sports, Opta) sell exactly the structured feeds you want. They earn their pricing by clearing rights, paying officials, and operating sub-second pipelines used by sportsbooks and broadcasters. The pricing reflects that audience — entry tiers start in five figures monthly.
Between “parse Wikipedia” and “sign a five-figure contract” there’s a wide gap. Indie game developers, sports bloggers, hobbyist data scientists, classroom projects — none of them fit either end.
Why we cared enough to build it
We were already in the gap. siono.app needed reliable World Cup history to power its tournament polls, and we’d already done the curation work for our own use. Once we saw how much time other small projects were losing to the same data extraction, exposing our dataset as a clean API became the obvious next step: developers spend less time wrangling HTML and more time building the actual thing they care about.
The product is built for the kind of builder we ourselves were a year ago — someone with a great idea for a fantasy game, a prediction tracker, a recap widget, an analytics notebook — held up by “there’s no clean way to get the data.” That hold-up shouldn’t exist.
How we built it
Three deliberate choices shaped the product:
Wide on history. Every World Cup since 1930: 23 tournaments, roughly 2,500 player-rows, full squads with date of birth, position, and club affiliation. Cross-checked across FIFA, RSSSF, and Wikipedia, with our own pass for consistency. That depth is genuinely rare in queryable form. Want every defender born in October who played for a Northern Hemisphere nation in the 1980s? One request. Sub-200 ms response.
Narrow on live. During the 2026 tournament we ingest from cross-checked public sources with a five-minute median lag. That makes us the wrong vendor for live betting and broadcast graphics, and the right vendor for fantasy games, prediction trackers, recap widgets, and post-match analysis: the use cases that don’t need sub-second latency.
Boring tech. Node and Hono on the API side. SQLite for keys and usage tracking. Stripe for billing. Plain HTML for the marketing site. The point of the product is the data and how it’s shaped — not the framework du jour. Boring lets us focus on accuracy.
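To make the "one request" idea from the history point concrete, here is a minimal TypeScript sketch of how that query (defenders born in October, Northern Hemisphere nation, 1980s) might be expressed. Everything named here is an assumption for illustration: the host, the /v1/players path, the query parameter names, and the row fields are hypothetical and not the documented API. The sample rows are made-up placeholders, not real players.

```typescript
// Hypothetical sketch: field names, path, and parameter names are assumptions,
// not the real API surface.

type Position = "GK" | "DF" | "MF" | "FW";
type Hemisphere = "N" | "S";

// One squad entry: a player in a given tournament.
interface PlayerRow {
  name: string;
  position: Position;
  dob: string; // ISO date, "YYYY-MM-DD"
  hemisphere: Hemisphere; // hemisphere of the player's national team
  tournamentYear: number;
}

// Every filter is optional; an omitted filter matches everything.
interface PlayerQuery {
  position?: Position;
  birthMonth?: number; // 1-12
  hemisphere?: Hemisphere;
  yearFrom?: number; // inclusive
  yearTo?: number; // inclusive
}

// Encode the filters as query-string parameters on a single GET URL.
function buildPlayersUrl(baseUrl: string, q: PlayerQuery): string {
  const url = new URL("/v1/players", baseUrl);
  if (q.position !== undefined) url.searchParams.set("position", q.position);
  if (q.birthMonth !== undefined) url.searchParams.set("birth_month", String(q.birthMonth));
  if (q.hemisphere !== undefined) url.searchParams.set("hemisphere", q.hemisphere);
  if (q.yearFrom !== undefined) url.searchParams.set("year_from", String(q.yearFrom));
  if (q.yearTo !== undefined) url.searchParams.set("year_to", String(q.yearTo));
  return url.toString();
}

// The same filter applied locally, to show what the query means.
function matchesQuery(row: PlayerRow, q: PlayerQuery): boolean {
  // Read the month straight from the ISO string to avoid time-zone surprises.
  const birthMonth = Number(row.dob.slice(5, 7));
  return (
    (q.position === undefined || row.position === q.position) &&
    (q.birthMonth === undefined || birthMonth === q.birthMonth) &&
    (q.hemisphere === undefined || row.hemisphere === q.hemisphere) &&
    (q.yearFrom === undefined || row.tournamentYear >= q.yearFrom) &&
    (q.yearTo === undefined || row.tournamentYear <= q.yearTo)
  );
}

// The query from the text: defenders, born in October, Northern Hemisphere, 1980s.
const query: PlayerQuery = {
  position: "DF",
  birthMonth: 10,
  hemisphere: "N",
  yearFrom: 1980,
  yearTo: 1989,
};

// Placeholder rows (not real data) to exercise each filter.
const sampleRows: PlayerRow[] = [
  { name: "Player A", position: "DF", dob: "1958-10-12", hemisphere: "N", tournamentYear: 1982 },
  { name: "Player B", position: "DF", dob: "1960-03-02", hemisphere: "N", tournamentYear: 1986 },
  { name: "Player C", position: "FW", dob: "1961-10-20", hemisphere: "N", tournamentYear: 1986 },
  { name: "Player D", position: "DF", dob: "1959-10-05", hemisphere: "S", tournamentYear: 1982 },
  { name: "Player E", position: "DF", dob: "1970-10-01", hemisphere: "N", tournamentYear: 1994 },
];
```

Calling buildPlayersUrl("https://api.example.com", query) yields one GET URL carrying all five filters, and sampleRows.filter((row) => matchesQuery(row, query)) keeps only the row that satisfies every condition. Whether the real service filters by hemisphere directly or expects a list of countries is not stated in this post, so treat that parameter in particular as a guess.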
Where it’s heading
The 2026 cycle is the obvious next leg: full live coverage of the expanded 48-team tournament, with the same honest five-minute promise on lag. Past that, the same shape of dataset extends naturally to other tournaments: UEFA Euros, Copa América, the Women’s World Cup. We’ll listen to what people actually build before deciding which to add first.
The free tier exists for a reason: the more people who explore the data, the better we understand what’s missing. If something you’d find obviously useful isn’t in the response shape today, that’s a roadmap signal.