The data
We currently have available all Champion Ladder match data for seasons 1-5.¹ This post is just a quick rundown of what is in the dataset, so that I have a reference for when we start to do some real analysis.
The spreadsheets contain one line for each game, so the number of rows gives us the number of games and the number of columns gives us the number of parameters recorded for each match.
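library(tidyverse)
library(knitr)

# ccl_files: paths to the season CSV files, one per season (defined elsewhere)
ccl_data <- map(ccl_files, read_csv)
num_games <- map(ccl_data, nrow)
num_parameters <- map(ccl_data, ncol)
bind_rows(Games = num_games, Parameters = num_parameters, .id = "") %>%
  kable(format.args = list(big.mark = ","))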
| | CCL1 | CCL2 | CCL3 | CCL4 | CCL5 |
|---|---|---|---|---|---|
| Games | 18,033 | 30,496 | 18,950 | 22,963 | 17,290 |
| Parameters | 95 | 95 | 95 | 95 | 95 |
So there are around 20,000 games in each season (except for a big spike to over 30,000 in season two) and each match has 95 stats recorded. These can be broken down into coach, team and other categories. What they measure is generally self-explanatory from the column heading, but some can be a bit confusing.
Coach data
Prefixed with coaches.[0|1]. for home/away coach:
- idcoach: Numeric coach id
- coachname
- coachcyanearned
- coachxpearned
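As a quick illustration of the prefix convention, here is a minimal base R sketch that pulls out the columns for one coach and strips the prefix. The `games` data frame and its values are made up for the example, and treating 0 as the home side is an assumption based on the ordering above.

```r
# Toy stand-in for one season's table (hypothetical values; the real tables have 95 columns).
games <- data.frame(
  coaches.0.idcoach = c(101, 102),
  coaches.0.coachname = c("Alice", "Bob"),
  coaches.1.idcoach = c(201, 202),
  coaches.1.coachname = c("Carol", "Dave"),
  teams.0.score = c(2, 1)
)

# Pick out every column belonging to coach 0 (assumed to be the home coach).
home_cols <- grep("^coaches\\.0\\.", names(games), value = TRUE)
home_coach <- games[, home_cols, drop = FALSE]

# Strip the prefix so home and away coaches can share the same column names.
names(home_coach) <- sub("^coaches\\.0\\.", "", names(home_coach))
print(home_coach)
```

The same pattern with `teams` in place of `coaches` should work for the team columns.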
Team data
Prefixed with teams.[0|1]. for home/away team:
- idteamlisting: Numeric team id
- idraces: Numeric id of team race
- teamname
- teamlogo: File name for team logo?
- value: TV of team (as displayed on team page, so missing journeymen?)
- score: Touchdowns
- cashbeforematch
- popularitybeforematch: Fan factor
- popularitygain
- cashspentinducements
- cashearned: Actual amount of cash gained by team
- cashearnedbeforeconcession: Equal to cashearned unless there was a concession
- winningsdice
- spirallingexpenses
- nbsupporters
- possessionball: % of game spent in possession of ball
- occupation[own|their]: % of game spent in possession with ball on own/opponent’s half of the pitch
- mvp: Number of MVPs for a team (useful for figuring out who conceded?)
- inflictedpasses: Inflicted, i.e. how many passes this team completed
- inflictedcatches
- inflictedinterceptions
- inflictedtouchdowns
- inflictedtackles
- [inflicted|sustained]casualties: Armour breaks, not actual casualties. Those are called injuries
- [inflicted|sustained]ko
- [inflicted|sustained]injuries
- [inflicted|sustained]dead
- inflictedmetersrunning
- inflictedmeterspassing
- inflictedpushouts: Crowdsurfs
- sustainedexpulsions: From fouls
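Since cashearnedbeforeconcession only differs from cashearned when there was a concession, comparing the two should flag conceded games. This is a rough base R sketch on made-up numbers; the full column names assume the teams.[0|1]. prefix described above, and it only flags that a concession happened, not which team conceded.

```r
# Toy data with hypothetical cash values; the second game has a mismatch for team 0.
games <- data.frame(
  teams.0.cashearned = c(50000, 0, 70000),
  teams.0.cashearnedbeforeconcession = c(50000, 40000, 70000),
  teams.1.cashearned = c(60000, 90000, 30000),
  teams.1.cashearnedbeforeconcession = c(60000, 90000, 30000)
)

# TRUE where a team's final winnings differ from its pre-concession winnings.
differs_0 <- games$teams.0.cashearned != games$teams.0.cashearnedbeforeconcession
differs_1 <- games$teams.1.cashearned != games$teams.1.cashearnedbeforeconcession

# Flag the game if either team shows a difference.
games$concession <- differs_0 | differs_1
print(games$concession)
```

Working out who conceded would need another signal, such as the mvp count mentioned in the list.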
Other data
- uuid|id: Unique game id as hexadecimal/decimal number
- leaguename|competitionname
- stadium: Stadium type for the home team
- structstadium: Stadium enhancement (or NA)
- started|finished: Time game started/finished
- Several base URLs for image assets (for putting together match summary screens?)
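The started and finished columns should be enough to work out how long each game took. Here is a small base R sketch of that calculation; the timestamp format and the example values are assumptions, so the format string may need adjusting to match the real files (and the conversion can be skipped if the columns are already parsed as date-times).

```r
# Toy data; the timestamp format below is an assumption, not taken from the dataset.
games <- data.frame(
  started = c("2017-03-01 20:15:00", "2017-03-02 18:00:00"),
  finished = c("2017-03-01 21:05:30", "2017-03-02 19:10:00"),
  stringsAsFactors = FALSE
)

# Parse the strings as date-times in a fixed time zone.
fmt <- "%Y-%m-%d %H:%M:%S"
started <- as.POSIXct(games$started, format = fmt, tz = "UTC")
finished <- as.POSIXct(games$finished, format = fmt, tz = "UTC")

# Game length in minutes.
games$duration_mins <- as.numeric(difftime(finished, started, units = "mins"))
print(games$duration_mins)
```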
Things to note
Inflicted casualties/KOs/injuries/deaths for one team do not necessarily equal the sustained casualties/KOs/injuries/deaths for the other team, because of injuries that arise through failed dodges and GFIs (maybe failed blocks too?).
We can see how many players have been sent off for fouling, but the number of fouls doesn’t seem to be recorded anywhere.
Text encoding seems to have gotten messed up somewhere, so coach/team names with non-standard characters will come out garbled.
Crowdsurfs do not appear to be recorded properly, because none are recorded across all five Champion Ladder seasons.
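Two of these notes are easy to check directly. The base R sketch below uses made-up numbers and assumes the column names follow the teams.[0|1]. prefix pattern described earlier. It computes the gap between injuries one team sustained and injuries the other team inflicted, then totals the recorded crowdsurfs.

```r
# Toy data with hypothetical values for three games.
games <- data.frame(
  teams.0.inflictedinjuries = c(2, 1, 0),
  teams.0.sustainedinjuries = c(1, 3, 0),
  teams.1.inflictedinjuries = c(1, 2, 0),
  teams.1.sustainedinjuries = c(2, 1, 1),
  teams.0.inflictedpushouts = c(0, 0, 0),
  teams.1.inflictedpushouts = c(0, 0, 0)
)

# Injuries a team sustained that the opponent did not inflict
# (for example from failed dodges or GFIs).
games$unattributed_0 <- games$teams.0.sustainedinjuries - games$teams.1.inflictedinjuries
games$unattributed_1 <- games$teams.1.sustainedinjuries - games$teams.0.inflictedinjuries

# Total crowdsurfs recorded for both teams; zero means none were recorded.
total_pushouts <- sum(games$teams.0.inflictedpushouts, games$teams.1.inflictedpushouts)

print(games[, c("unattributed_0", "unattributed_1")])
print(total_pushouts)
```

A non-zero gap in a real game would be consistent with the failed dodge/GFI explanation, though it does not prove it.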
¹ Thanks to Dode for providing the data.