VeePeenini, Part 3: Launch and the Long Tail of Data
Vitor Pontual · 5 min read
The day after launch, one of my friends sent me a message: a group on the bracket was showing the wrong teams. Not a crash, not an error page. Just quietly, confidently wrong. This is exactly the kind of bug you only catch once real people are looking, because they know the tournament better than your test data does.
The bug that was really two files arguing
The group letters were wrong on the bracket but right everywhere the actual fixtures were involved. When I dug in with Claude, the cause was almost funny: two different files in the codebase each held their own hardcoded copy of which team is in which group, and over a week of changes they had drifted apart. Nine of the twelve groups had at least one wrong assignment in one of the files. The database, the actual source of truth, had been correct the whole time.
I could have just fixed the letters and moved on. Instead I treated it as a class of bug, not an instance. The real problem was that the same list lived in two places, which guarantees they eventually disagree. So we restructured it so the group is derived from one source, and the second copy was deleted entirely. Now if a team is ever missing from that source, the app refuses to start instead of silently lying. The header comment on the generated roster file is the whole story:
// file: src/lib/game/roster.ts
// Group letters are NOT stored here anymore — they are derived from
// src/lib/teams/meta.ts (the single source-of-truth for team identity).
// Previously the group letter was duplicated between meta.ts and this
// file, and the two drifted (9 of 12 groups showed wrong teams on the
// bracket). The transformation block below merges TEAMS.group into each
// team at module load and throws loudly if any team is missing from
// meta.ts. Result: drift is impossible.
The rule I took away, and keep taping to the wall: when two files hold the same list, derive, don’t duplicate.
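Here is a minimal sketch of that rule. It is written in Python to match the scraper script later in this post, while the real file is TypeScript. The team codes, the data shapes, and the `attach_groups` name are all made up for illustration; only the idea (merge the group in at load time, throw if a team is unknown) comes from the header comment above.

```python
from typing import TypedDict


class TeamMeta(TypedDict):
    name: str
    group: str


# Hypothetical stand-in for the single source of truth (meta.ts in the post).
TEAMS: dict[str, TeamMeta] = {
    "AAA": {"name": "Team A", "group": "A"},
    "BBB": {"name": "Team B", "group": "B"},
}

# Hypothetical roster data. Note that it carries no group letters of its own.
ROSTER_WITHOUT_GROUPS = [
    {"team": "AAA", "players": ["Player 1", "Player 2"]},
    {"team": "BBB", "players": ["Player 3"]},
]


def attach_groups(roster, teams):
    """Copy each team's group from `teams`; fail loudly on unknown teams."""
    missing = [entry["team"] for entry in roster if entry["team"] not in teams]
    if missing:
        raise RuntimeError(f"roster teams missing from team metadata: {missing}")
    return [{**entry, "group": teams[entry["team"]]["group"]} for entry in roster]


# Runs at import time, so a bad roster stops the program before it serves anything.
ROSTER = attach_groups(ROSTER_WITHOUT_GROUPS, TEAMS)
```

Because the roster never stores a group letter, there is nothing left to drift: the only failure mode is a missing team, and that one raises immediately.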
Rosters that won’t sit still
Building a World Cup app three weeks before a World Cup means the data is actively wrong and getting righter on someone else’s schedule. Federations announce their final 26-man squads right up to the deadline, so on day one a lot of my “players” were placeholders waiting on real names.
As each squad dropped, I replaced placeholders with real players. The trap here is subtle and I got bitten by it: the seed file that sets up the database would happily overwrite a fresh roster on the next deploy, because it thinks it knows the truth. So every roster change had to land in two places at once, the live database and the seed source, or the next deploy would quietly undo my work. It took two big waves to get every team to a real, final squad, with the elite players tagged so they show up as the rare cards everyone chases.
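One way to catch that trap before a deploy is to diff the seed source against the live rows. This is a hypothetical sketch, not something the project is known to include: the `roster_drift` name and the plain team-to-player-list dictionaries are assumptions, and loading the two sides from the seed file and the database is left out.

```python
def roster_drift(seed, live):
    """Return, per team, the players present on only one side.

    `seed` and `live` both map a team code to a list of player names.
    An empty result means the next seed run would not change any roster.
    """
    drift = {}
    for team in sorted(set(seed) | set(live)):
        seed_players = set(seed.get(team, []))
        live_players = set(live.get(team, []))
        if seed_players != live_players:
            drift[team] = {
                "only_in_seed": sorted(seed_players - live_players),
                "only_in_live": sorted(live_players - seed_players),
            }
    return drift


# Made-up data: the live roster was updated, the seed still has a placeholder.
seed = {"AAA": ["Placeholder 1", "Player 2"]}
live = {"AAA": ["Real Name", "Player 2"]}
report = roster_drift(seed, live)
```

In this example `report` flags team `AAA`, with the placeholder listed under `only_in_seed` and the real name under `only_in_live`, which is exactly the state that a deploy would silently revert.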
Chasing 100% photo coverage
A sticker without a face is sad. Getting a photo for all 1296 player and coach cards was its own small project, and it ran on a chain of fallbacks: pull from Wikipedia first, then Wikimedia Commons, then a football data API, then Transfermarkt for the deep tail.
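The fallback chain can be sketched as a loop over lookup functions that stops at the first hit. Everything here is an assumption for illustration: the `find_photo` name, the idea that each source is a function returning a URL or `None`, and the stub sources, which stand in for the real lookups and make no network calls.

```python
from typing import Callable, Optional

# A source takes a player name and returns a photo URL, or None if it has nothing.
PhotoSource = Callable[[str], Optional[str]]


def find_photo(
    player_name: str, sources: list[tuple[str, PhotoSource]]
) -> Optional[tuple[str, str]]:
    """Try each source in order and return (source label, url) for the first hit."""
    for label, lookup in sources:
        try:
            url = lookup(player_name)
        except Exception:
            # A failing source should not end the chain; move on to the next one.
            continue
        if url:
            return label, url
    return None


# Stub sources (hypothetical). Real ones would query each site in turn.
def empty_source(name: str) -> Optional[str]:
    return None


def stub_source(name: str) -> Optional[str]:
    return "https://example.org/" + name.replace(" ", "_") + ".jpg"


result = find_photo("Some Player", [("first", empty_source), ("second", stub_source)])
```

Returning the label along with the URL makes it easy to see afterwards how many cards each source covered and how deep the tail really was.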
The interesting failure was the matching. An early version matched on too little, so when it searched for a name it would grab the wrong player who happened to share a first letter, an “Adrián” standing in for an “Alejandro.” For a sticker album that is a real bug, because the whole point is that the face matches the name. So we tightened the scoring to reward exact and prefix matches and reject the first-letter-only collisions:
# file: scripts/scrape-api-football-misses.py
if target_first == api_first_token:
    score += 20  # exact first-name match
elif api_first_token.startswith(target_first) or target_first.startswith(api_first_token):
    score += 15  # one is a prefix of the other ("m" vs "mohamed")
elif len(target_first) >= 4 and len(api_first_token) >= 4 and target_first[:4] == api_first_token[:4]:
    score += 8  # first 4 letters match (diacritic-stripped near-matches)
else:
    # First letter could match but full name is a different family -- reject.
    # This is the collision we keep hitting (Adrián/Alejandro, Ferry/Frenkie).
    return -1
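The excerpt above is a fragment from inside a larger function. To see the rule run end to end, here is a self-contained sketch that keeps the same thresholds and point values. The function names, the diacritic stripping via `unicodedata`, and the empty-name guard are assumptions added for illustration, and the real script adds these points to a running score instead of returning them directly.

```python
import unicodedata


def first_token(full_name: str) -> str:
    """Lowercase first word of a name with diacritics removed ("Adrián" -> "adrian")."""
    decomposed = unicodedata.normalize("NFKD", full_name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    tokens = stripped.lower().split()
    return tokens[0] if tokens else ""


def first_name_score(target_name: str, api_name: str) -> int:
    """Return the first-name bonus, or -1 when the candidate should be rejected."""
    target_first = first_token(target_name)
    api_first_token = first_token(api_name)
    if not target_first or not api_first_token:
        # Assumed guard: an empty string is a prefix of everything.
        return -1
    if target_first == api_first_token:
        return 20  # exact first-name match
    if api_first_token.startswith(target_first) or target_first.startswith(api_first_token):
        return 15  # one is a prefix of the other
    if (
        len(target_first) >= 4
        and len(api_first_token) >= 4
        and target_first[:4] == api_first_token[:4]
    ):
        return 8  # first four letters match
    return -1  # shares at most a few leading letters: treat as a different person


collision = first_name_score("Adrián", "Alejandro")  # rejected: -1
accent_only = first_name_score("Adrián", "Adrian")   # exact after stripping: 20
initial = first_name_score("M", "Mohamed")           # prefix match: 15
```

The two collisions named in the comment, Adrián/Alejandro and Ferry/Frenkie, both fall through to the final `return -1`, because neither pair shares a prefix or a four-letter start.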
I hand-fetched the last stubborn dozen who simply have no decent photo anywhere public. The album landed at full coverage: every player and coach has a photo, and every crest has its flag.
The album index: 48 countries, each its own Panini page to fill.
The part nobody sees
A surprising amount of this stretch was me staring at a card frame and deciding a rarity label sat two pixels too high. There’s a run of commits that are nothing but nudging a tier label down one percentage point, then back up half a point, then a touch left to center it in the frame. It is deeply unglamorous, and it is exactly the kind of thing that separates something made with care from something that looks generated. That polish is part of the job, not beneath it.
By the end of this stretch the data was honest, the faces were real, and the frames sat right. Next the mood changed entirely, because the tournament was about to actually start. That’s Part 4.