I checked today on a whim, and was a little freaked out that it’s exactly 500 days since the first post on this site. That post was a bad joke, foreshadowing much of the work that was to appear on this site, but I thought I might have some leeway to indulge and [record scratch, freeze frame] talk about how I got here.
Out of Gas
In 2015 I found myself utterly burnt out. I’d learned programming at the age of six or seven, and it was something I loved and eventually made a career out of. Lately I’d founded a couple of startups, I even got to pitch at Downing Street one time, but between the exhausting schedules, the disappointments and the guilt of letting down investors and employees alike, I’d arrive at the point where I resented sitting and feeding the machine day after day. No child first stares up at a computer monitor hoping to one day make enterprise middleware. Yet here I was at 35, slightly confused at what had happened in the intervening years.
Instead of packing it all in and making wicker furniture for the rest of my life, I looked around for a hobby to salvage something of the only skill I really had. I toyed with the idea of making games, but I don’t really have the talent. Instead I settled on these here football stats. As I’ve told elsewhere, I was only really aware of football analytics because I followed Ted Knutson back when he wrote about Magic: the Gathering. Eventually StatsBomb happened and I was hooked. In October 2015 I grabbed some data and jumped in.
Not a lot of people know this, but there’s a reason this blog has such a silly name. I applied for the 2016 Opta Pro Forum with a deep learning model that would basically [redacted] and [redacted], allowing you to measure [redacted] for any [redacted]. It got rejected, of course, but I was lucky enough to get an invite to the shindig itself and meet many of you for the first time, and ramble long into the night about numbers and hypothetical Netherlands lineups for Euro 2016. I’m grateful I got rejected in a way, because instead of spending that period doing something, y’know, hard, I could spend months just playing with the data. To this day I think this is something we don’t do enough of. Just take the data and cut it in fifty different ways and see how it looks, see what’s possible.
This approach of playing with data explains pretty much all the work I’ve done in public. What do attacks look like? What does defending look like? All these things derive from mucking about. It’s all entirely well-intentioned, but likely has no analytical value at all. We seem to struggle regularly as a community with this balance between people finding their feet and those expecting fully-fledged science. I feel like my work (and some of the stuff I’ve seen from, for example, David Sumpter), is as much about saying “hey! Look at what you can do with some freakin’ polygons!” as it is about making hard statements of fact. I like to think that some of my work is approachable, and possibly even salvageable for someone to built something cool upon, but I’ll admit that every year Sloan comes around I spend a week in despair that I ever presumed I had anything to offer the field of football analytics. As I pointed out in the last State of the Stats, we’re all vaguely in competition with each other so it’s hard, but to everyone out there experiencing their bout of impostor syndrome, I see you, you’re doing cool stuff!
Impostor syndrome in mind, here’s a list of nice things that have happened to me in and around football:
- Bobby Gardiner was my first every follower, which is nice because I still regularly chat to him and enjoy his work.
- The Challengers Podcast was the first thing to invite me on to ramble, and you should subscribe to them because they put out an absurd amount of content on a regular schedule.
- I got to contribute to StatsBomb and I still feel guilty to this day when I put stuff on here while James works his arse off rustling up new work.
- I repeated myself ramble-for-ramble on the Analytics FC podcast but with added puns.
- I talked to my first Premier League club and utterly failed to convince them to give me a job.
- A very nice and extremely patient guy called Jakub Dobias got in touch with me because he was trying to convince Slavia Prague to use analytics.
- Our first signing (to a tiny degree based on PATCH) went on to win the African Cup of Nations and we don’t really get to take much credit for his incredibly hard work but I am incredibly smug about it as everyone I’ve boasted to knows.
- I’m getting to travel to Copenhagen to judge a really cool hackathon!
- I have a full time job in football analytics!
The Train Job
It felt sad when stats poster-boys Brentford/Midtjylland’s ruling boffins SmartOdds disbanded their football analytics department. But it paid off for me when Ted got the itch to get back into the game and started Statsbomb Services. I’ve been responsible for putting together some dead simple tools for clubs to do smart stuff: you’ll be familiar with Ted’s radars, but also stuff like the shot charts and passing maps we can generate at the touch of a button. There’s a lot of clever stuff in the pipeline for the next few months, much of which I expect will end up causing aggro in Ted’s mentions on Twitter, which is the measure of any good analytics work.
Objects in Space
So, that’s where I have managed to get in my brief time in football analytics. Given that it’s just been both the Opta Pro Forum and the Sloan Sports Analytics Conference in the last six weeks, I thought I’d talk a little bit about the future, and given that football’s just objects in space, where we might be headed with tracking data:
- My overall attitude to analytics in 2017 is the same as the old William Gibson quote: the future is already here — it’s just not very evenly distributed. US sports (within teams or in partnership with academia) are doing amazing work. It’s very likely that there are football clubs (and certainly betting syndicates) running silent with similarly incredible work, absolutely head and shoulders above what we see in public. Then there are clubs who have some solid spreadsheets to avoid obvious mistakes, followed by vast swathes of clubs doing everything as they already have. Some of the latter have good enough coaches and scouts that it doesn’t matter much at the moment.
- Foundational problems such as how to correctly feed tracking data into neural network models are largely unsolved. I think this is where most of the interesting work is happening. At the same time I’d be surprised if any of it was truly digestible inside clubs. There will always be a tension between big, smart, opaque models and small, simple, transparent metrics.
- It is dumb that data is still an issue in football. Not just event data but tracking stuff too. My hope is that some of the nascent work turning broadcast quality footage into tracking data with machine learning will one day fundamentally alter the economics of the football data market, because it doesn’t seem like any of the leagues or football associates are going to.
I’m not gonna lie, at this point this is excruciatingly self-indulgent and I’m just stringing it out so I can reference more Firefly episodes, but the real message here is: I have got everything I could have hoped for out of football analytics, and you can too. It’s a genuinely fun, creative way to use my meagre programming skills, it’s a community of smart and often hilarious people, and it’s apparently even possible to pay your rent doing numbers. The feeling of watching tens of thousands of people, or indeed an entire nation, cheering on someone you had even a small part in moving from one club to another, from one country to another, is utterly thrilling. It’s been the most exciting 500 days in my life, and this is coming from someone that watched Oxford United win the Milk Cup in 1986, the same year I first started mashing the keyboard dreaming of one day making something cool.