Tracking Massey 2016: The Preseason

By Matthew Lundeen on August 29, 2016 at 8:03 am

Introducing a new project for the upcoming 2016 college football season.

In the early 20th century, Charles Sanders Peirce made a distinction between reality and truth. According to the founding father of pragmatism and semiotics, the term "reality" describes an object as it is, in itself. Peirce believed that we interact with reality through the interpretation of signs because the real is not something we can tangibly grasp and examine from all angles. As a result, "truth" is our interpretation of reality through our interaction with these signs by using inquiry. 

This very basic philosophical overview, I believe, is a helpful way to view our relationship to college football. Every football fan wants to know the real talent level of their favorite team and how it compares to others. We know a real talent level exists for each team, but again, it isn't a material object that we can just pull off the field and look at under a microscope. Instead, we have to take the signs we are given and use inquiry to measure and interpret a team's real talent level.

Let's get a little more concrete.

The objective of a football game is to win, and you do that by outscoring your opponent. Therefore, points are a logical starting place for examining the real talent level of a football team. However, at some point in time, human beings realized that not all schedules and opponents are created equal. An average margin of victory of 14 points in the MAC is not the same as an average margin of victory of 14 in the SEC. Thus, the truth behind a team's real talent level changes to reflect our recognition of their difficulty of schedule.

Now, as you may have observed from that example, using Peirce's definition means that truth is ever-changing and never static. Staying with the college football example, every offseason we come up with new ways to measure the real talent level of a team; i.e., people who create various ratings systems make tweaks to their models. The hope behind all of this is to give us a closer representation of the real; in this case, the real talent level of a college football team.

It is with that philosophy lesson in mind that I introduce a new project for the 2016 college football season. For those of you who are not familiar, Ken Massey runs a website that gathers rankings from a number of other college football ratings systems and popular polls, and puts them all together to come up with an average ranking that he calls the "College Football Ranking Composite." Instead of using my own model (THOR+) this season, I am going to track this composite ranking for all 128 teams. (With a focus on the Big Ten, of course.)

I am doing this because every model has its weak points. Despite how we like to represent them, the formula behind each one is created by a human being who had inherent biases and believed that certain things were more important than others when it came to interpreting the real talent level of a college football team. The theory behind crowdsourcing ratings like Massey has done is that pooling them together will help address any weaknesses that one single model has by itself. It still won't get us a perfect representation of reality, of course, but it should get us closer to the real than any single model by itself could. And, according to Peirce at least, that's all we can hope for. 

So, without further ado, allow me to present to you the visualization we will be using over the next 3-4 months.  

2016 preseason college football ranking composite

Allow me to give some context on understanding the Tableau visual:

  • The "Massey Composite" tab plots every team based on their composite ranking. Note that this is not the same as their "mean ranking." After Massey gathers the data, he takes an average of all the rankings (the mean ranking) and then ranks each team from best to worst based on that mean ranking (the composite ranking). 
  • The "Top 25" tab is pretty self-explanatory. 
  • The "Team Variance" tab shows each data point for every team. This is my favorite because it allows you to see who ranks each team the highest and lowest. For instance, we can see that a team like Ohio State has a small range of rankings (#1-#16), while someone like Florida has a gigantic range (#7-#92). 
  • The "Std. Deviation" tab plots each team based on their mean ranking and standard deviation. For those unsure of what a standard deviation is, think of it as a measure of how close each individual ranking is to the mean. So, for example, Ohio State has a tiny standard deviation because all of their rankings are between #1 and #16, which is close to their 5.5 average. Florida, meanwhile, has a huge standard deviation because their mean is 30.3, but they are ranked all over the place from 7th in the country to 92nd. Standard deviation is important because it gives us an idea of which teams there seems to be a consensus on and which teams are going to start arguments among a group of college football fans.
  • The "Conferences" and "Divisions" tabs are very basic. The first just shows the average ranking for each conference, while the second breaks the same chart into divisions, allowing us to see the disparity between the SEC East and West. 
  • Finally, the "Individual Ratings Systems" tab allows you to filter through every ratings system that Massey tracks and look at where each team ranks, accordingly. Because of the way I had to format the chart, there are odd options in the filter like "number of records" and "standard deviation" that I haven't figured out how to exclude yet. Go ahead and ignore those. 
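To make the difference between the mean ranking and the composite ranking concrete, here is a minimal Python sketch. The team names and individual rankings are made-up placeholders rather than Massey's data, and the function name is my own. Massey's actual procedure may handle ties or missing rankings differently.

```python
from statistics import mean

def composite_ranking(rankings_by_team):
    """Return {team: (mean_ranking, composite_ranking)}.

    rankings_by_team maps a team name to the list of rankings it
    received from the individual ratings systems.
    """
    means = {team: mean(ranks) for team, ranks in rankings_by_team.items()}
    # Order teams from best (lowest) mean ranking to worst.
    ordered = sorted(means, key=means.get)
    return {team: (means[team], position)
            for position, team in enumerate(ordered, start=1)}

# Hypothetical rankings from three imaginary ratings systems.
sample = {
    "Team A": [1, 2, 3],
    "Team B": [4, 1, 7],
    "Team C": [2, 5, 2],
}
result = composite_ranking(sample)
for team, (mean_rank, composite) in result.items():
    print(team, mean_rank, composite)
```

Team B owns the single best individual ranking in this toy data, yet it lands last of the three in the composite because its mean ranking is the worst.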

And, before we begin, a quick primer on the statistics I will be using:

  • "Mean Ranking" is the average ranking of a team that is calculated by adding up all the various rankings and dividing them by the number of ratings systems. 
  • "Standard Deviation", again, is how close all of the data is to the mean. A low number means there is a high consensus on the talent of a team, while a big number means there are widely differing opinions.
  • "High" and "Low" are obviously the highest and lowest rankings for each team. 
  • "25th Percentile" and "75th Percentile" break the data into quarters. The 25th percentile (the lower quartile) is the ranking that a quarter of a team's rankings are at or better than, while the 75th percentile (the upper quartile) is the ranking that three quarters of them are at or better than.
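For anyone who wants to reproduce these statistics at home, the sketch below computes all of them with Python's standard library. The list of rankings is invented for illustration, and two choices are my assumptions rather than anything Massey or Tableau documents: the population standard deviation (`pstdev`) and the "inclusive" quartile method. Other tools may use slightly different definitions and produce slightly different numbers.

```python
from statistics import mean, pstdev, quantiles

def summarize(ranks):
    """Summary statistics for one team's list of rankings."""
    # quantiles(..., n=4) returns the three quartile cut points.
    p25, _median, p75 = quantiles(ranks, n=4, method="inclusive")
    return {
        "mean": mean(ranks),
        "std_dev": pstdev(ranks),  # assumption: population std. deviation
        "high": min(ranks),        # best ranking is the smallest number
        "low": max(ranks),         # worst ranking is the largest number
        "p25": p25,
        "p75": p75,
    }

# Hypothetical rankings for a single team from nine ratings systems.
ranks = [8, 15, 17, 20, 22, 25, 30, 35, 52]
stats = summarize(ranks)
print(stats)
```

Note that "high" is the minimum of the list, since a lower number means a better ranking.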

Okay, that's enough explaining. Let's look at some results. 

The Top 25

preseason top 25

Five Big Ten teams make the Top 25: Ohio State (#4), Michigan (#9), and Michigan State (#12) in the East, while Iowa (#22) and Wisconsin (#24) represent the West.

I am using updated numbers as of Sunday, August 28th, which means there are 52 total ratings systems included in the composite. The preseason composite had 55 last season, but I don't know how many more Massey is going to add, or when, before Thursday night's kickoff, so we are going with 52 for this week. Now, here are the ranges of the various rankings for each team:

preseason variance

The Big Ten West

Iowa #22

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
23.3 | 9.8 | #8 | #52 | #17 | #30

Let's use our beloved Hawkeyes as an example for how to read these numbers. 

First of all, a mean rank of 23.3 signifies that when you take all the rankings in the sample, add them up, and then divide by the number of rankings, their average ranking comes out to 23.3. Out of all 128 teams, that is the 22nd best average, which is why they are ranked 22nd in the composite rankings.

Now, the standard deviation just tells us whether the ratings systems have a consensus or not for how good (or bad) a team is by telling us how close the various rankings are to the mean. In Iowa's case, a 9.8 is basically right on the FBS average of 10.2. There isn't an overly strong consensus on where to rank the Hawkeyes, but there isn't a ridiculous variance, either. And the other numbers help us see both sides of that.

Iowa is ranked as high as #8 in one model and as low as #52 in another. That's a fairly large range in itself, but those are outliers, something you can tell by looking at the percentiles in the table. The 25th percentile is what you may have learned in middle school to call the "lower quartile", but I find 25th percentile an easier way to make sense of it for our purposes. Iowa's #17 ranking at the 25th percentile means that 25% of the data believes Iowa's real talent level falls somewhere between #8 (their highest ranking) and #17 in the country. Meanwhile, their 75th percentile is the #30 ranking, meaning that 75% of the ratings systems think they are a top 30 team, while only a quarter think they are worse than 30th.
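Reading a percentile this way amounts to asking what share of the ratings systems rank a team at or better than some cutoff. Here is a small Python sketch of that question. The eight rankings are hypothetical numbers I chose so the shares come out cleanly; they are not Iowa's actual 52 data points, and the function name is my own.

```python
def share_at_or_better(ranks, cutoff):
    """Fraction of ratings systems that rank the team at `cutoff` or better."""
    # A lower number is a better ranking, so "at or better" means <= cutoff.
    return sum(1 for r in ranks if r <= cutoff) / len(ranks)

# Hypothetical rankings for one team from eight ratings systems.
ranks = [8, 17, 19, 21, 24, 30, 38, 52]

top_17_share = share_at_or_better(ranks, 17)
top_30_share = share_at_or_better(ranks, 30)
print(top_17_share, top_30_share)
```

In this toy sample a quarter of the systems have the team in the top 17 and three quarters have it in the top 30, which is the same shape described for Iowa.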

Wisconsin #24

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
25.1 | 8.6 | #11 | #41 | #17 | #30

Wisconsin is a good lesson in standard deviation because the numbers seem to have a tighter consensus on them than they do on Iowa. (Something made visible on the chart.) The funny thing is, 50% of the models have the two Midwestern programs ranked between 17th and 30th in the country. The only difference is the Hawkeyes appear to have the higher ceiling (#8), while the ratings seem to think Wisconsin has the higher floor (#41).
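The Iowa and Wisconsin comparison boils down to two measures of spread: the full range (worst ranking minus best) and the width of the middle 50% (75th percentile minus 25th). The quick Python sketch below uses the figures from the two tables above. The dictionary layout and function name are my own.

```python
# High, low, and percentile figures copied from the Iowa and Wisconsin tables.
teams = {
    "Iowa": {"high": 8, "low": 52, "p25": 17, "p75": 30},
    "Wisconsin": {"high": 11, "low": 41, "p25": 17, "p75": 30},
}

def spread(team_stats):
    """Full range and middle-50% width for one team's rankings."""
    return {
        "range": team_stats["low"] - team_stats["high"],
        "middle_50": team_stats["p75"] - team_stats["p25"],
    }

spreads = {name: spread(team_stats) for name, team_stats in teams.items()}
for name, values in spreads.items():
    print(name, values)
```

The middle 50% is equally wide for both teams, so the gap in standard deviation has to come from the rankings outside that band, where Iowa's stretch further in both directions.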

Of course, this is looking at real talent level, and not at Wisconsin's hellish schedule this season. The Badgers are the preseason favorites to lose 5 or 6 games, which may have you scratching your head at the end of the year as to why the numbers like them so much. You have been warned.

Nebraska #38

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
40.6 | 10.7 | #19 | #65 | #34 | #43

I'm guessing Nebraska fans aren't aware of Ken Massey's site because Lincoln hasn't been razed to the ground yet. Nebraska is not only currently projected as the third best team in the Big Ten West, they are just barely sneaking inside the composite top 40. 

The interesting thing about Nebraska, despite having such a huge range between high and low, is that there is actually a decent consensus on them going into 2016. Just three ratings systems have them inside the top 30, while 60-70% of the data sees them as somewhere between #30 and #43 in the country.

With question marks about their defense and whether Tommy Armstrong Jr. can cut his turnovers down, I'd say that's right around where I would put them, too. 

Northwestern #43

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
43.6 | 16.1 | #11 | #72 | #36 | #52

Northwestern is another good lesson in standard deviation, only in the opposite direction. The Wildcats are notorious for confounding models like S&P+ on an annual basis, and this season is projected to be no different. Pat Fitzgerald's team currently has the highest standard deviation of any Big Ten team by quite a large margin. 25% of the data feels they are a top 36 team (someone even has them at #11...) while another 25% feels they are somewhere between 52nd and 72nd. Of course, the middle 50% of the data has them pegged for somewhere in the 36-52 range, and that actually seems like a pretty good starting spot for this team.

But who knows? Preseason rankings are largely useless for most non-Alabama teams, so it's not like this really matters. The main takeaway is that Northwestern looks primed to be as confusing as ever in 2016. 

Minnesota #64

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
61.9 | 10.1 | #34 | #83 | #54 | #67

The Gophers have a very favorable schedule in 2016, and that is playing a big part in why some are calling for them to be the 2016 version of 2015 Iowa. Despite that very manageable schedule, though, the ratings systems still aren't completely sold on them. Just a quarter of the models think that, at their best, they fall between the #34 and #54 slots, while 50% of the models have them pinpointed between 54th and 67th in the country. Obviously, they will need to be closer to that #34 outlier in order to take advantage of their schedule this season. We'll see if they have it in them. 

Illinois #75

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
74.8 | 8.1 | #63 | #95 | #68 | #80

Illinois fans seem pretty happy with the hiring of Lovie Smith this offseason. But while he may eventually do good things for their fledgling football program, the models all think those good things aren't coming this season. 

Purdue #96

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
93.5 | 9.7 | #72 | #115 | #90 | #98

The pressure is on for Darrell Hazell to show positive results this season. Unfortunately, no one is expecting that to actually happen. 

The Big Ten East

Ohio State #4

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
5.5 | 4.8 | #1 | #16 | #2 | #6

Despite losing a ton of guys from the 2015 team, the numbers really like Ohio State for 2016. We all know they are going to be good. However, a team full of talented but very young players may find it hard to live up to a top five ranking in year one.

Michigan #9

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
12.9 | 5.6 | #4 | #23 | #8 | #18

Not far behind Ohio State, the numbers also like Jim Harbaugh and Michigan to give the Buckeyes a run for their money in the East. The defense is expected to be top notch, but the offense has questions when it comes to quarterback, leaving most to wonder if Harbaugh can work his quarterback magic for a second year in a row. Personally, I think this may be a bit high, but the Wolverines surprised me last season, so I'm not going to get up in arms over it. 

Michigan State #12

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
14.5 | 7.7 | #4 | #35 | #9 | #17

The other team from the state of Michigan is also breaking in a brand new quarterback this season, which leaves uncertainty about the potency of the offense. On the bright side, the Spartans do have a stable of good running backs to lean on, so if they can run the ball and their defense is stout, Mark Dantonio should have another good team.

Penn State #48

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
50.2 | 10.8 | #30 | #66 | #40 | #59

After three projected top 15 teams, the fourth best team in the division comes in at 48th in the nation. They are yet another team in the Big Ten East that is replacing their QB this season (noticing a trend yet?), and they are also introducing a new offensive coordinator into the mix. And all that is without mentioning that they lost a ton of guys from an absolutely dominant 2015 defensive line. If you buy the recruiting rankings, Penn State should have a ton of talent on their roster in 2016. Of course, people have been expecting them to break through for years, so is this finally the season?

Indiana #66

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
66.9 | 8.8 | #53 | #95 | #60 | #74

Kevin Wilson is the fourth of seven coaches in the Big Ten East who is dealing with change at the quarterback position in 2016. Of course, barring injuries, offense is rarely a problem at Indiana, so there probably shouldn't be too much worry there. Instead, it's the eternal struggle to play something called "defense" in Bloomington that is a persistent issue. According to GIA sources, Wilson has been doing some offseason reading on Wikipedia, and thinks he may have finally figured out this foreign concept. 

Maryland #86

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
82.4 | 11.0 | #61 | #102 | #72 | #91

It's year three in a new conference, and the Terps are stuck in an unwinnable division and are breaking in a new coach. But cheer up, Maryland fans. At least you aren't Rutgers!

Rutgers #92

Mean Ranking | Standard Deviation | High | Low | 25th Percentile | 75th Percentile
87.4 | 8.6 | #59 | #104 | #81 | #94

A ratings system that Massey gives the initials LSD (Loudsound, apparently) ranks Rutgers as high as 59th in the country. Yeah... That joke pretty much writes itself. 
