Men’s Basketball Exploration or Two Prediction Models, Really?

Welcome back, sports fans, to episode 581 since we began publishing in this forum. The GCR started as a boredom-killing exercise during a slow time at work and a mental challenge to see if I (G here) could figure out how to rank teams. It was football only and just the FBS (I think about 120 teams then). Honestly, I didn’t know which teams were in the Group of 5 and which were FCS at first. Let me back up. I traveled for my day job from 2002 through my retirement this past April. Before everything was at our fingertips via smartphones, mid-range hotels provided a copy of the USA Today for their guests, first at the door in the morning and later just in the lobby. Every Tuesday during the football season, they ran the Sagarin ratings, which were later expanded to include the FCS teams as well, but not at first. Sagarin was part of the old BCS, which, for those who don’t remember, used the polls (AP and Coaches) along with five computer ranking systems to determine which teams would play for the championship. I studied the computer rankings Sagarin published, and later others I found online, in fascination. Of course, the sausage making was rarely talked about, and then only partially. Then the football powers that be decided the BCS was flawed and was to be replaced – by people. All computer assistance would be dropped. Imagine my horror.

…..

In my own mode of protest, I decided to try to create a computer ranking system of my own. I was pretty used to formula-driven Excel – never learned macros and still don’t use them. Twenty years or so before all of this, I wrote statistical games to play when I was bored at work (seems to be a theme) and tired of Minesweeper. There were multiple versions over time, but the premise of these games was a professional league of some sport that scored points. Using a combination of givens (values for the franchise based on the previous 5 years’ performance, home field/court/arena advantage, and a random number generator), I would play each season. I created a way to auto-populate a schedule based on the previous season’s standings. I tried to think of how I could market it or create an app that used the formulas as the data behind some user interface more interesting, to most people, than a page full of numbers. At the same time, I admonished myself for wasting time on such a silly thing. The knowledge gained in writing formulae that drove meaningful results and behaved in an expected way turned out to be invaluable. I miss those games, although they were fairly tedious. Back to the beginning of the GCR. One summer, I decided to see if I could figure it out. The first metric was Strength of Schedule (SOS), which was pretty simple at the time – essentially, how many games did your opponents win? Later, I added location (home games are “easier” than away games) and levels (opponents from what was then the Power 5 were “harder” than Group of 5 or FCS). It continues to evolve to be more indicative of difficulty. But SOS is just half of the picture. I had to figure out how to measure performance. Initially, it was just whether a team won or not. Later, I added point differential on a diminishing scale and other factors – also evolving to be more indicative of success. The term I use when I want to figure out how to make Excel do what I want is “playing.”

…..

Like the statistical games I wrote, I never really had the idea of sharing the results of what was to become the GCR with anyone. I did talk about it, though, and some colleagues asked to see it. Each week during the regular season, I would send an email to whoever wanted to see the ranking. Later, people asked me to predict games – something I was reluctant to do for what I think are obvious reasons – so I did, with the caveat that everything GCR-related is for entertainment purposes only. A few years later, I left that job but continued the email communication system. At one point, I had about 45 people on the distribution list. The initial GCR engine was grid-based, so two teams could not play each other twice in the same season. As a result, the rankings ended right after rivalry week – no championship games, no Army/Navy, and no bowls. Later, Liberty and New Mexico St decided to play each other twice, and I had to change from a 2-dimensional grid to a 3-dimensional grid (that took a while to figure out and included a complete rewrite). That rematch, which I am certain no one on the distribution list cared about, allowed me to have the GCR rankings continue until after the championship game. I finally joined the 21st century in 2019 when I opened this page. Since then, we’ve added the FCS in football and both men’s and women’s basketball – based on the same overarching logic, but with significant differences as well. Basketball would have been impossible without the 3-D grids. Today, the GCR continues to try to improve – we review the math and the results each season to ensure our product is logical and explainable.

…..

In football, the predictions for upcoming games are based on overall rating with an emphasis on performance score. In basketball, to get a meaningful point differential, we converted the overall rating to a normalized value (the Score), with 100.00 always being assigned to the #1 team. That provides the much-needed separation. We added 4.5 as the home court advantage, which is pretty standard (just like the 3.0 we give to football teams). The issue we saw, however, was that sometimes team A had a higher Score than team B, but B had a higher RPI than A. As a reminder, RPI is a percentage based on a weighted average of a team’s winning pct, its opponents’ combined winning pct, and its opponents’ opponents’ combined winning pct. While the Score is an indicator of where a team is at a snapshot in time, the RPI can be used to estimate whether a team is expected to improve in the rankings or not. If the RPI rank is closer to #1 than the GCR rank, expect improvement; if it is farther from #1, expect a decline.
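To make the RPI reminder concrete, here is a minimal sketch in Python. The GCR's actual weights are not given in this post, so the 25/50/25 split below is an assumption borrowed from the classic NCAA RPI formula. The function names and the "holding steady" label for equal ranks are also ours, for illustration only.

```python
def rpi(win_pct, opp_win_pct, opp_opp_win_pct, weights=(0.25, 0.50, 0.25)):
    """Weighted average of the three winning percentages.

    The default weights are an ASSUMPTION (classic NCAA RPI),
    not necessarily what the GCR uses.
    """
    w_team, w_opp, w_opp_opp = weights
    return w_team * win_pct + w_opp * opp_win_pct + w_opp_opp * opp_opp_win_pct


def outlook(gcr_rank, rpi_rank):
    """Compare the RPI rank to the GCR rank (rank 1 is best)."""
    if rpi_rank < gcr_rank:
        return "expect improvement"
    if rpi_rank > gcr_rank:
        return "expect decline"
    return "holding steady"  # hypothetical label for equal ranks


# Made-up winning percentages, just to show the call.
print(round(rpi(0.800, 0.600, 0.500), 3))

# A team ranked #2 in the GCR with the 7th-best RPI.
print(outlook(gcr_rank=2, rpi_rank=7))
```

The rank comparison is the only part taken directly from the text; everything else is a stand-in until the real weights are plugged in.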

…..

It became apparent to us that just using the Score to predict may not provide enough information. We also did not want to dilute the formulae by trying to combine them. Instead, we found a way to use both – and they may not always agree. First, the method. Just as the Score had to be created to provide differentiation that made sense in basketball, RPI needed some help as well. Duke, the top-rated team, has a Score of 100.00 and an RPI of .695, also best in the league. Arizona is ranked #2 overall (Score of 99.64) with the 7th ranked RPI (.681). According to the Score prediction, Duke would win by 0.36 on a neutral court and by 4.86 at home, but would lose by 4.14 at Arizona. Whether we agree with it or not, the logic is sound. The question is how do we turn .695 and .681 into something meaningful? Is that a big difference? Is it home court dependent like the Score? The raw numbers don’t tell us a lot. So, we converted them.
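The Score prediction is just a difference plus or minus the home court value, and a short Python sketch shows the three Duke/Arizona cases. The function name and the `site` argument are ours for illustration. The 4.5 and the two Scores come from the text.

```python
HOME_COURT = 4.5  # basketball home court advantage quoted in the post


def score_margin(score_a, score_b, site="neutral"):
    """Predicted margin for team A (negative means team B is favored).

    site: "neutral", "a" (A is at home) or "b" (B is at home).
    """
    margin = score_a - score_b
    if site == "a":
        margin += HOME_COURT
    elif site == "b":
        margin -= HOME_COURT
    return round(margin, 2)


duke, arizona = 100.00, 99.64
print(score_margin(duke, arizona))            # neutral court: 0.36
print(score_margin(duke, arizona, site="a"))  # at Duke: 4.86
print(score_margin(duke, arizona, site="b"))  # at Arizona: -4.14
```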

…..

We needed a constant or similar value to convert. If we just multiplied by 100 (no logical basis for that, but if), we would now have 69.5 and 68.1. It doesn’t answer the question: is that a lot or a little? What does 1.4 (the difference) mean? The logical path we chose was to compare the extremes. There is a value that is the difference between Duke’s 100.00 Score and the Score of the 365th team, currently MS Valley (7.40). Note: this data is as of the last Top 365 posted, but it changes each day as Score changes each day. That 92.60 is a known large difference. By the Score methodology, we would say Duke would beat MS Valley by 97.10 at home (92.60 plus 4.5 for the home court). But MS Valley does not have the lowest RPI. That distinction belongs to Binghamton. The unadjusted RPI is .695 (Duke) vs .330 (Bing), which we can also say is large. We still have to alter the RPI to make it meaningful, so we take the Score difference between best and worst (92.60) and divide it by the RPI difference between best and worst (.365) to get the constant, which of course recalculates every time a game is completed. In the initial testing using the data above, the constant is 253.7, which by itself is a big so what. If we then take a given team’s RPI and multiply it by the constant, we get the new metric, RPI Factor (RPIF). In this case, Duke’s is 176.3 and MS Valley’s (to be consistent with the prediction above) is 90.6, for a difference of 85.7. Add 4.5 for the home court and we have Duke winning by 90.2. We would then say Duke should win by between 90.2 and 97.1.
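For anyone who wants to check the arithmetic, here is a rough Python sketch of the conversion using the figures quoted above (Scores of 100.00 and 7.40, RPIs of .695 and .330). The function names are ours and purely illustrative. MS Valley's RPIF of 90.6 is taken as quoted because its raw RPI is not given here.

```python
HOME_COURT = 4.5


def rpi_constant(top_score, bottom_score, best_rpi, worst_rpi):
    """Score spread divided by RPI spread; recalculated as results come in."""
    return (top_score - bottom_score) / (best_rpi - worst_rpi)


def rpif(rpi, constant):
    """RPI Factor: a team's RPI scaled onto the same spread as the Score."""
    return rpi * constant


def rpif_margin(rpif_a, rpif_b, home_edge=0.0):
    """Predicted margin for team A; pass HOME_COURT when A is at home."""
    return round(rpif_a - rpif_b + home_edge, 1)


constant = rpi_constant(100.00, 7.40, 0.695, 0.330)
print(round(constant, 1))               # about 253.7
print(round(rpif(0.695, constant), 1))  # Duke's RPIF, about 176.3

# Duke at home vs MS Valley, using the RPIF values quoted in the text.
print(rpif_margin(176.3, 90.6, home_edge=HOME_COURT))  # 90.2
```

Because the constant depends on the best and worst teams in both lists, every RPIF in the country shifts a little whenever either extreme changes.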

…..

Let’s use a more likely scenario, and one in which the predicted winner is different. Arizona has a Score of 99.64 and an RPI of .681. They decide to play a neutral site game against Michigan (99.28 and .691). Using the Score method, Arizona would be the favorite by 0.36. Looking at RPIF, Arizona is sitting at 172.8 while Michigan bests them at 175.3. Now the favorite is Michigan by 2.5. We would then say we cannot predict the winner because the estimate crosses the zero line. We at the GCR never bet, and we publish the predictions for fun. If someone were to bet on this fictitious game and wanted to use the GCR to help place the bet, s/he would have to decide whether to focus on Score or RPIF. Earlier, we mentioned that combining the two metrics muddies the waters. This example demonstrates that point. If we just averaged the two, Michigan would be favored by just over 1, which ignores the fact that one of the predictions has Arizona winning.
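The "crosses the zero line" check can be sketched in a few lines of Python. This is only an illustration of the idea, with names of our own choosing. It uses the Arizona and Michigan figures from the paragraph above and the constant of 253.7 quoted earlier, with margins expressed from Arizona's side.

```python
RPI_CONSTANT = 253.7  # constant quoted earlier in the post


def compare_models(score_margin, rpif_margin):
    """Put the two predicted margins side by side for one team.

    "split" is True when the models pick different winners, i.e. the
    range between the two margins crosses zero.
    """
    low, high = sorted((score_margin, rpif_margin))
    return {
        "low": low,
        "high": high,
        "split": low < 0 < high,
        "average": (low + high) / 2,
    }


# Neutral site, so no home court adjustment on either margin.
score_margin = 99.64 - 99.28                  # Arizona by about 0.36
rpif_margin = (0.681 - 0.691) * RPI_CONSTANT  # Michigan by about 2.5

result = compare_models(score_margin, rpif_margin)
print(result["split"])              # True: no pick
print(round(result["average"], 1))  # about -1.1, hiding the disagreement
```

The average is included only to show what gets lost: a single number near -1 looks like a lean toward Michigan, while the split flag makes it plain that the two models disagree.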

…..

And that is how the GCR continues to evolve. Either a reader asks a question or requests a metric, or we ask ourselves a question. We often have no idea if we can make it work logically and in a meaningful way, so we play until we get the interocular answer – one that hits us right between the eyes.

…..

That’s it for today. Thank you for reading and sharing with others, JoJo and G
