This is not using the MCFC data like everything else in /r/footystats so far; it is based simply on results. This is about a spreadsheet I set up that recreates the Colley Matrix method, one of the computer rankings used in the BCS. It is still a work in progress, and I am actively searching for ways to expand on it, as I detail below. This post is looking for feedback on the methodology of something I'm currently building, more than it is presenting finalized, insightful results.
If you are unfamiliar with the Colley Matrix, this is the site, including his methodology. It is one of the ranking systems used to rank college football teams for BCS bowls. I picked this one to replicate because it was the one whose documentation I could actually find. I don't know that it is any better or worse than the other computer ranking systems. It does not take scores into account, only the W/L results. Now, in soccer, unlike college football, the final standings are determined by the table, and by the end of the season every team has played the same schedule. So I do not see any use for this system in the final ratings at the end of the season. Where I do see potential is in adjusting team rankings for the imbalanced schedules in the middle of the season.
The reason I think this is useful is that it adjusts for strength of schedule so far, and number of games played so far. I think it will be very interesting to watch the results progress with the season. All of my experimentation and back testing with it up to this point has been fairly positive, with no major outliers from my expectations.
Adjustments I made: The addition of draws. There are no draws in college football because they play overtime. I made a draw equal to half a win, rather than a third of a win (1 point against 3) as it is in the tables.
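For anyone who would rather read code than a spreadsheet, here is a minimal sketch of the calculation in Python, not the actual file. It assumes the standard Colley setup (solve C r = b, where each diagonal entry of C is 2 plus games played, each off-diagonal entry is minus the number of games between the two teams, and b is 1 + (wins - losses) / 2) with a draw counted as half a win and half a loss for both sides. The function names, the 'H'/'A'/'D' result codes, and the teams A, B, C are made up for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def colley_ratings(teams, results):
    """Colley ratings from (home, away, outcome) tuples.

    outcome is 'H' (home win), 'A' (away win) or 'D' (draw).
    Home advantage is ignored, exactly as described in the post.
    """
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    C = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [1.0] * n
    for home, away, outcome in results:
        h, a = idx[home], idx[away]
        C[h][h] += 1  # one more game played by each side
        C[a][a] += 1
        C[h][a] -= 1  # one more meeting between the pair
        C[a][h] -= 1
        if outcome == "H":
            b[h] += 0.5
            b[a] -= 0.5
        elif outcome == "A":
            b[a] += 0.5
            b[h] -= 0.5
        # A draw is half a win plus half a loss, so wins - losses is unchanged.
    return dict(zip(teams, solve_linear(C, b)))


# Hypothetical three-team example, not real results.
example = colley_ratings(
    ["A", "B", "C"],
    [("A", "B", "H"), ("B", "C", "D"), ("C", "A", "A")],
)
for team, rating in sorted(example.items(), key=lambda kv: -kv[1]):
    print(team, round(rating, 3))
```

One side effect of treating a draw as half a win is that it only shows up in the games-played counts, not in b. The ratings always average 0.5 across the league.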
Downsides I see at the present moment: Home field advantage is not taken into account. Unfortunately, this is a very large factor in soccer. Beating City and United away is more impressive than beating them at home. I am investigating the best possible way to introduce a correction for this, but haven't done anything yet. If anybody has a suggestion, by all means, I'd love to hear it.
Same goes for the timeline of results. Results at the start of the season count for the same amount as recent results. I am also investigating ways to correct for the timeline of results (form), but have not arrived at anything. The reason I want to do this is to have a version that more accurately represents a team's current strength, rather than their full season results. So the first week's results would count for very little by the end of the season, and a team's last few games would be worth more.
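A rough sketch of one option, offered as a suggestion to poke holes in rather than a settled method: give each game a weight that shrinks with its age and feed the weights into the same Colley equations in place of whole games. The exponential `decay` factor, the function name, and the result format below are all assumptions for illustration, and none of this has been back tested. The system is solved here by simple Gauss-Seidel sweeps, which converge because the Colley matrix is strictly diagonally dominant.

```python
def weighted_colley(teams, results, current_week, decay=0.9, sweeps=200):
    """Colley-style ratings where older games count for less.

    results holds (week, team_a, team_b, outcome) tuples, with outcome
    1.0 if team_a won, 0.5 for a draw and 0.0 if team_b won.
    A game played k weeks ago gets weight decay ** k (an assumption).
    """
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    C = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [1.0] * n
    for week, team_a, team_b, outcome in results:
        w = decay ** (current_week - week)
        i, j = idx[team_a], idx[team_b]
        C[i][i] += w  # weighted games played
        C[j][j] += w
        C[i][j] -= w  # weighted meetings between the pair
        C[j][i] -= w
        b[i] += w * (outcome - 0.5)  # weighted (wins - losses) / 2
        b[j] -= w * (outcome - 0.5)
    r = [0.5] * n  # start every team at the league average
    for _ in range(sweeps):
        for i in range(n):
            others = sum(C[i][j] * r[j] for j in range(n) if j != i)
            r[i] = (b[i] - others) / C[i][i]
    return dict(zip(teams, r))


# Hypothetical example: A beat B in week 1, B beat A in week 5.
form = weighted_colley(["A", "B"], [(1, "A", "B", 1.0), (5, "B", "A", 1.0)],
                       current_week=5, decay=0.5)
print(form)
```

With `decay=1.0` every game counts fully and this reduces to the plain method. The smaller the decay, the more the ratings reflect current form rather than the full season.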
If anyone wants to get their hands on the actual file, PM me. I would just prefer to know who has access to it, rather than allowing the entire internet to download it. Unfortunately, I have almost zero programming experience, so it's an Excel spreadsheet. That makes it a bit confusing if you don't know what you are looking at, and it must be updated by hand.
These are the current results for the beginning of the EPL Season:
With only 3 games in the bag, it hasn't had the chance to mesh very much. 2 more weeks will be a ton better. Teams like West Brom and Sunderland could change significantly in the next two weeks. It does not give a prediction of final finishing position, especially not at this early stage. Teams will find form, and lose it as the long season progresses. But I hope in the coming weeks it may give some indication of teams that are likely to rise in the table soon, or are likely to fall down the table because they've only played cupcakes and have 2 more games than everyone else.
Any and all feedback (even if highly critical of my method) is welcome.
I'd be interested in looking at your spreadsheet. I also am not heavily versed in matrix math, but intuitively I really like the Colley system and can think of a few fun uses for it. Is there any way I can download your file?