If you're looking for the Jacksonville preview, you'll need to scroll down quite a bit.
My regular reader may have noticed by now that I've been loath to assign credit or blame to specific players for a single game, and instead tend to present team stats. I do this in part because I think that it is difficult to evaluate individual play (especially defense) with a simple basketball box score.
There are tools available to glean some additional information when you look at a single game, notably the individual net score box that Dean Oliver describes in Basketball on Paper. Henry Sugar over at Cracked Sidewalks is a particular proponent of this, and has been providing Marquette fans with his version (which he calls "Individual Player Ratings") for most of last season. Here's an example from last year's game between MU and Villanova (hope he doesn't mind me linking):
Note that I've previously discussed this game when I introduced my version of the HD Box Score.
I won't explain Mr. Sugar's work here, but I will point to an excellent post he wrote last season covering the basics of each stat column listed. The bottom line for most fans is in columns 5 and 7 - points produced and net points added. This gives us an idea, based on tempo-free stats, of just how many points each player contributed towards the game result (in this case, a 10 point win for Marquette).
There are some limitations to this work.
Without going into too much detail here, I can assure you that the defensive rating assigned to each player for this game is just loosely tied to reality. Defensive stats are not available for most basketball games (NBA too) at the detail-level needed, so it is somewhere between difficult and impossible to assign blame for each player's defensive effort.
But more generally, the calculations used for the stats in the table above are underpinned by a large number of estimates, which should improve as we aggregate data over the course of a season, but which can be quite a bit off during an individual game. Here are just some examples of missing information needed to make the calculations for the stats above:
- How many possessions did a player have on offense? Defense?
- How many offensive/defensive possessions ended in a score?
- What percentage of field goals made by a player were off of an assist?
- How often are a player's missed shots rebounded by a teammate?
- How well did the team rebound while the player was on the court?
- How often did a player end a possession by making at least 1 free throw?
- How often does a player give a foul, and the opponent miss at least 1 free throw (e.g. Hack-a-Shaq)?
However, all of the questions asked above can be answered by parsing the available play-by-play from the game. And that is what I propose to do.
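As a rough illustration of what replacing an estimate with a tally looks like, here is a minimal sketch that counts assisted field goals and missed shots rebounded by the shooter's team. The event records and field names (`type`, `player`, `assist`, `rebound_by`) are hypothetical, made up for this example; a real play-by-play transcript needs its own parser and will not look like this.

```python
from collections import defaultdict

# Hypothetical, hand-written events; NOT the format of any real play-by-play feed.
events = [
    {"type": "fg_made", "player": "A", "assist": "B"},
    {"type": "fg_made", "player": "A", "assist": None},
    {"type": "fg_missed", "player": "A", "rebound_by": "offense"},
    {"type": "fg_missed", "player": "A", "rebound_by": "defense"},
    {"type": "fg_made", "player": "B", "assist": "A"},
]


def tally(events):
    """Count made shots, assisted makes, misses, and misses rebounded by the offense."""
    counts = defaultdict(
        lambda: {"fgm": 0, "assisted": 0, "misses": 0, "own_team_reb": 0}
    )
    for ev in events:
        c = counts[ev["player"]]
        if ev["type"] == "fg_made":
            c["fgm"] += 1
            if ev["assist"] is not None:
                c["assisted"] += 1
        elif ev["type"] == "fg_missed":
            c["misses"] += 1
            if ev["rebound_by"] == "offense":
                c["own_team_reb"] += 1
    return counts


def share(numerator, denominator):
    """Return a fraction, or None when there is nothing to divide by."""
    return numerator / denominator if denominator else None


counts = tally(events)
assisted_share_a = share(counts["A"]["assisted"], counts["A"]["fgm"])
own_reb_share_a = share(counts["A"]["own_team_reb"], counts["A"]["misses"])
```

The point is only that each question in the list becomes a counter incremented while walking the transcript, so the per-game value is a tally and not a season-long estimate.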
A few points to consider:
- While I can improve the accuracy of the final stats by replacing estimates with actual tallies of various components of the calculations, I'm not modifying the philosophy (or math) of the final stats. That is, if you don't think individual player Offensive Rating is a good measure of how a player contributes on offense, there is little here to convince you otherwise. Of course, if your main quibble is with D. Oliver's many underlying estimates, keep reading.
- As I've said before, the drawback of using play-by-play data is that there are inevitably errors in the transcript, which can lead to uncertainty in assigning credit or blame. However, I am not convinced that these same errors aren't also in the official box score, but are just hidden from view. Just for Georgetown, I know of at least one instance where Ken Pomeroy found an error in the play-by-play that propagated to the box score.
- I am not exploiting the play-by-play fully yet, because it takes a lot of work. I've written over 5000 lines of code so far (yes, that was a brag) and my wife keeps mentioning how much time I spend working on the program, and something about a divorce (at least I think that's what she said, I wasn't really paying attention). For instance, I could record the shooting percentage of each player making an assisted basket, but I don't yet. I could distinguish between assisted dunks, layups and jumpers, but I don't yet.
Defensive rating is an attempt to estimate the contribution of each player to the team's defensive efficiency. It is calculated as team defensive efficiency, plus one-fifth of the difference between team defensive efficiency and individual player stops per 100 possessions played. Player individual stops are estimated from the number of blocks, steals and defensive rebounds each player has, plus some team stats. Since it is not a simple ratio, it is more like being graded on a curve, such that it is limited to the range of 80% - 120% of team defensive efficiency. So, a player who literally refused to play defense (e.g. Donte Greene) could score no worse than 80% of his team's efficiency. I would describe this stat as a very rough estimate of actual defensive worth . . .
Later in that same post, I discussed an alternative method, which was simply to use available plus/minus stats to calculate the team's defensive efficiency while the player was on the court, and use that (less the team's defensive efficiency while the player was off the court) to rate that player's defensive ability.
The drawback to this method, pointed out on this thread on Hoyatalk, is that the quality of one's teammates can have a big effect.
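For reference, the plus/minus alternative amounts to a single subtraction. The sketch below uses made-up numbers and hypothetical function names; it only illustrates the arithmetic, not any particular data source.

```python
def efficiency(points, possessions):
    """Points per 100 possessions."""
    return 100.0 * points / possessions


def on_off_defense(pts_on, poss_on, pts_off, poss_off):
    """On-court defensive efficiency minus off-court defensive efficiency.

    Negative is good: the team allowed fewer points per 100 possessions
    with the player on the floor than with him on the bench.
    """
    return efficiency(pts_on, poss_on) - efficiency(pts_off, poss_off)


# Made-up example: 40 points allowed in 50 possessions on court,
# 25 points allowed in 25 possessions off court.
diff = on_off_defense(40, 50, 25, 25)
```

Nothing in this number separates the player from the four teammates sharing the floor with him, which is exactly the drawback noted above.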
So here, I'm proposing a new method: I am using Dean Oliver's basic statistics for player offensive and defensive rating, but the data I am feeding into the underlying equations are only those generated by his team while the player was on the court. This should especially help with defensive stats, in that the base team defensive efficiency used is now the def. efficiency while the player was on the court (i.e. the player receives no credit or penalty for great or lousy defense played by his teammates while he sat on the bench). The remainder of Dean Oliver's def. rating calc. (stops, stop %, scoring poss., etc.) is used as originally described. Additionally, as stated earlier I am removing as many of the estimates used by Oliver as I can, when I have time. The seven listed above are all incorporated, along with a few others (e.g. is a blocked shot recovered by the shooter's team?). I'll try to write up a FAQ covering all of the gory details at some point this season - likely when my wife is out of town.
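A minimal sketch of the on-court idea, under stated assumptions: the only change is which efficiency goes in as the base. The individual term below follows the commonly cited form of Oliver's formula (100 x opponent points per scoring possession x (1 - stop %)), which is an assumption here since it is not spelled out in this post; the function and parameter names are hypothetical and the inputs are made up.

```python
def efficiency(points, possessions):
    """Points per 100 possessions."""
    return 100.0 * points / possessions


def defensive_rating(base_def_eff, stop_pct, opp_pts_per_scoring_poss):
    """Blend the base efficiency with an individual term, weighted 4/5 to 1/5.

    ASSUMPTION: the individual term uses the commonly cited form of
    Oliver's formula; it is not taken from this post.
    """
    individual = 100.0 * opp_pts_per_scoring_poss * (1.0 - stop_pct)
    return base_def_eff + 0.2 * (individual - base_def_eff)


# Made-up inputs: 38 points allowed over 40 defensive possessions
# while the player was on the court.
on_court_eff = efficiency(38, 40)
rating = defensive_rating(on_court_eff, stop_pct=0.55, opp_pts_per_scoring_poss=2.0)
```

With a base of 100 and 2.0 points per scoring possession, this form runs from 80 (stop % of 1) to 120 (stop % of 0), which is where the 80% - 120% band comes from. Feeding in the on-court efficiency as the base shifts that whole band up or down with the defense actually played while the player was on the floor.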
As a test case, I've run the Marq/Nova game mentioned at the top of this post. Here's what I get:
INDIVIDUAL NET POINTS STATS

Marquette

| Player | Off Poss | Poss Used | Individ ORtg | Pts Prod | Def Poss | Individ DRtg | Pts Allow | Net Pts |
|---|---|---|---|---|---|---|---|---|
| HAYWARD, Lazar | 59 | 12.5 | 111.2 | 13.9 | 59 | 100.9 | 11.9 | +2.0 |
| BARRO, Ousmane | 51 | 3.5 | 149.3 | 5.2 | 51 | 95.4 | 9.7 | -4.6 |
| JAMES, Dominic | 69 | 18.0 | 140.5 | 25.3 | 70 | 97.3 | 13.6 | +11.7 |
| MCNEAL, Jerel | 66 | 18.6 | 79.0 | 14.7 | 67 | 96.4 | 12.9 | +1.8 |
| MATTHEWS, Wesley | 42 | 11.7 | 92.8 | 10.8 | 42 | 86.0 | 7.2 | +3.6 |
| ACKER, Maurice | 23 | 4.7 | 181.0 | 8.4 | 23 | 81.8 | 3.8 | +4.7 |
| FITZGERALD, Dan | 16 | 0.3 | 280.0 | 0.9 | 17 | 104.5 | 3.6 | -2.7 |
| CUBILLAN, David | 31 | 3.1 | 74.8 | 2.3 | 32 | 134.1 | 8.6 | -6.3 |
| BURKE, Dwight | 6 | 0.0 | - | 0.0 | 7 | 62.9 | 0.9 | -0.9 |
| MBAKWE, Trevor | 12 | 2.0 | 100.0 | 2.0 | 12 | 124.4 | 3.0 | -1.0 |
| TOTALS | 75 | 74.3 | 112.4 | 83.5 | 76 | 98.7 | 74.6 | +8.9 |

Villanova

| Player | Off Poss | Poss Used | Individ ORtg | Pts Prod | Def Poss | Individ DRtg | Pts Allow | Net Pts |
|---|---|---|---|---|---|---|---|---|
| Pena, Antonio | 62 | 12.9 | 75.6 | 9.8 | 60 | 123.0 | 14.8 | -5.0 |
| Cunningham, Dante | 61 | 12.0 | 93.3 | 11.2 | 62 | 112.0 | 13.9 | -2.7 |
| Reynolds, Scottie | 60 | 16.1 | 85.4 | 13.7 | 57 | 123.9 | 14.1 | -0.4 |
| Fisher, Corey | 62 | 17.4 | 76.4 | 13.3 | 59 | 112.0 | 13.2 | +0.1 |
| Anderson, Dwayne | 54 | 6.6 | 154.1 | 10.1 | 53 | 125.3 | 13.3 | -3.1 |
| Redding, Reggie | 25 | 2.0 | 223.2 | 4.4 | 28 | 80.9 | 4.5 | -0.1 |
| Clark, Shane | 8 | 0.8 | 333.3 | 2.5 | 9 | 70.2 | 1.3 | +1.2 |
| Stokes, Corey | 48 | 7.6 | 121.7 | 9.3 | 47 | 106.7 | 10.0 | -0.7 |
| TOTALS | 76 | 75.3 | 98.7 | 74.3 | 75 | 113.3 | 85.1 | -10.8 |

The actual score of the game was MU 85, VU 75.
Several of the columns here are the same as Henry Sugar's above, but there are a few new ones as well. Briefly:
- Off/Def Poss - the number of offensive or defensive possessions that a player was on the court; I think this is more useful than minutes played.
- Poss Used - the number of offensive possessions used by a player (partial credit due to assists and offensive rebounds).
- Off. Rating - the number of individual points produced, divided by the number of offensive possessions used, multiplied by 100. This is an estimate of the number of points a player would produce (not simply score) in 100 possessions.
- Points Produced - similar to possessions used, it is an estimate of the team points scored that can be credited to an individual player; again, partial credit due to assists and offensive rebounds.
- Def. Rating - An estimate of the number of points a player would allow in 100 possessions. See the discussion above the table for the details.
- Points Allowed - The actual number of points allowed by the player - again an estimate.
- Net Points - The difference between points produced and points allowed.
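To make the column relationships concrete, here is a small sketch that rebuilds one row's derived values from its inputs. The function names are hypothetical. The division by five in `points_allowed` is an inference from the table (it reproduces the listed values when the on-court points allowed are split evenly among five defenders), not something stated in the definitions above.

```python
def offensive_rating(pts_produced, poss_used):
    """Points produced per 100 possessions used; None when no possessions were used."""
    if poss_used == 0:
        return None  # shown as "-" in the table
    return 100.0 * pts_produced / poss_used


def points_allowed(def_rating, def_poss, players_on_court=5):
    """ASSUMPTION (inferred from the table): rating x possessions / 100, split five ways."""
    return def_rating / 100.0 * def_poss / players_on_court


def net_points(pts_produced, pts_allowed):
    """Points produced minus points allowed."""
    return pts_produced - pts_allowed


# Lazar Hayward's row from the Marquette table.
ortg = offensive_rating(13.9, 12.5)     # about 111.2
allowed = points_allowed(100.9, 59)     # about 11.9
net = net_points(13.9, allowed)         # about +2.0
```

Because the table shows rounded values, recomputing from the printed numbers will not always match exactly, especially for players who used very few possessions.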
I've also included a totals line for all stats, so you can actually check my work.
The total Off Poss & Def Poss are the actual number of possessions in the game.
The total number of possessions used by each team agrees very well with reality - for my data parser, total possessions used are typically within 5% of actual possessions played, but this game worked exceptionally well.
Total points produced for each team are also very close to actual points scored. These should be within 10%, and often within 5%.
The summed points produced divided by total possessions used gives an estimate of team off. efficiency. This is the value listed as the total of ORtg. The estimated team offensive efficiencies (112.4 & 98.7) agree extremely well with actual off. efficiencies for each team (113.3 & 98.7).
At least for this game, it appears that my method is giving a quite satisfactory measure of what happened on offense. It won't always be so accurate, but this is why I want to give these totals - it will allow my reader to decide for himself (do any women read this blog?) how well the stats analysis is working.
Defensive stats are more tightly coupled to team, rather than individual, data so the totals here aren't quite so useful. The DRtg totals are simply team defensive efficiencies, calculated as team points allowed divided by defensive possessions.
Here, the summed individual points allowed for each team agree within 1 point of the actual score, another excellent result - I find typically they will agree within 5 points.
Finally, the net points totals give two estimates of the margin of victory (or loss). The average of the two [(8.9 + 10.8)/2] = 9.9 is almost exactly the true margin. It usually doesn't work quite this well!
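The checks on the totals line can be reproduced with a few lines of arithmetic. This sketch uses only numbers from the Marquette/Villanova table and the final score (85-75); the helper names are hypothetical.

```python
def efficiency(points, possessions):
    """Points per 100 possessions."""
    return 100.0 * points / possessions


def pct_diff(estimate, actual):
    """Absolute difference as a percentage of the actual value."""
    return 100.0 * abs(estimate - actual) / actual


# Actual offensive efficiencies from the final score and possession counts.
mu_actual_eff = efficiency(85, 75)   # about 113.3
vu_actual_eff = efficiency(75, 76)   # about 98.7

# Possessions used and points produced versus the real totals.
mu_poss_gap = pct_diff(74.3, 75)
vu_poss_gap = pct_diff(75.3, 76)
mu_pts_gap = pct_diff(83.5, 85)
vu_pts_gap = pct_diff(74.3, 75)

# Two margin estimates from the Net Pts totals (+8.9 and -10.8), averaged.
margin_estimate = (8.9 + 10.8) / 2
```

For this game every gap comes in under 2%, and the averaged margin lands within a fraction of a point of the real 10-point result.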
I think this method compares favorably to the "classic" method proposed by Dean Oliver. I will keep working at it to remove additional estimated values and fix any bugs (e.g. I wasn't counting missed dunks until last week), but I think the basic framework is now in place. Any feedback would be appreciated.
Edited to add: A year later, and I did incorporate some feedback into net points. See here for the gory details.
Finally tonight, I thought I'd take a look at last year's game vs. Jacksonville, which the Hoyas won 87-55. That link will take you to my post-game post from last season, which includes the tempo-free and HD box scores (both will be part of each post-game analysis this season, when available). Here, I'll post the net points stats from last year's game - I've bolded and italicized any player who should play tomorrow.
INDIVIDUAL NET POINTS STATS

Georgetown

| Player | Off Poss | Poss Used | Individ ORtg | Pts Prod | Def Poss | Individ DRtg | Pts Allow | Net Pts |
|---|---|---|---|---|---|---|---|---|
| Wallace, Jonathan | 26 | 9.3 | 71.7 | 6.6 | 25 | 91.3 | 4.6 | +2.1 |
| Summers, DaJuan | 39 | 8.8 | 125.0 | 11.0 | 36 | 101.8 | 7.3 | +3.7 |
| Sapp, Jessie | 36 | 9.2 | 61.8 | 5.7 | 35 | 75.0 | 5.2 | +0.4 |
| Ewing, Patrick | 26 | 2.0 | 101.6 | 2.0 | 26 | 55.5 | 2.9 | -0.9 |
| Hibbert, Roy | 24 | 8.7 | 115.6 | 10.0 | 24 | 84.4 | 4.1 | +6.0 |
| Macklin, Vernon | 46 | 4.4 | 143.3 | 6.4 | 45 | 97.8 | 8.8 | -2.4 |
| Wright, Chris | 40 | 10.0 | 137.2 | 13.7 | 40 | 74.5 | 6.0 | +7.7 |
| Rivers, Jeremiah | 28 | 4.7 | 152.7 | 7.1 | 28 | 83.6 | 4.7 | +2.4 |
| Jansen, Bryon | 4 | 0.0 | - | 0.0 | 4 | 80.0 | 0.6 | -0.6 |
| Freeman, Austin | 42 | 4.3 | 255.1 | 10.8 | 43 | 76.6 | 6.6 | +4.3 |
| Crawford, Tyler | 29 | 4.5 | 123.0 | 5.5 | 29 | 95.5 | 5.5 | +0.0 |
| Wattad, Omar | 10 | 2.1 | 129.3 | 2.7 | 10 | 90.9 | 1.8 | +0.9 |
| TOTALS | 70 | 67.9 | 120.3 | 81.6 | 69 | 79.7 | 58.1 | +23.5 |

Jacksonville

| Player | Off Poss | Poss Used | Individ ORtg | Pts Prod | Def Poss | Individ DRtg | Pts Allow | Net Pts |
|---|---|---|---|---|---|---|---|---|
| SMITH, Ben | 54 | 16.8 | 64.3 | 10.8 | 55 | 115.1 | 12.7 | -1.9 |
| HARDY, Ayron | 34 | 5.4 | 73.4 | 4.0 | 37 | 125.0 | 9.3 | -5.3 |
| MCMILLAN, Andre | 37 | 5.6 | 143.8 | 8.0 | 37 | 119.7 | 8.9 | -0.8 |
| COLBERT, Lehmon | 40 | 9.0 | 87.8 | 7.9 | 40 | 105.8 | 8.5 | -0.6 |
| ALLEN, Marcus | 30 | 3.8 | 95.6 | 3.6 | 30 | 126.3 | 7.6 | -4.0 |
| COHN, Travis | 16 | 3.4 | 62.0 | 2.1 | 16 | 135.0 | 4.3 | -2.2 |
| GILBERT, Brian | 30 | 3.1 | 97.2 | 3.0 | 30 | 143.6 | 8.6 | -5.6 |
| KOHIHEIM, Paul | 26 | 3.8 | 20.9 | 0.8 | 25 | 120.5 | 6.0 | -5.2 |
| BROOKS, Aric | 19 | 5.9 | 80.9 | 4.8 | 19 | 116.6 | 4.4 | +0.4 |
| LUKASIAK, Szymon | 33 | 5.0 | 79.1 | 3.9 | 35 | 138.1 | 9.7 | -5.7 |
| JEFFERSON, Evan | 26 | 5.1 | 59.6 | 3.1 | 26 | 139.5 | 7.3 | -4.2 |
| TOTALS | 69 | 66.9 | 77.8 | 52.0 | 70 | 124.3 | 86.9 | -34.9 |

DaJuan Summers had a great offensive game, but a lousy defensive game against the Dolphins, while Jessie Sapp was just the reverse (bad O, great D). Austin Freeman was his typical efficient self on offense but didn't use up a lot of possessions (~10%), while Chris Wright was player of the game on both ends of the court. Even Omar Wattad did his thing on the offensive end (1-1 2FG, 1-2 3FG).
I won't go into the Jacksonville players (you can see how they played last year).
The Dolphins lost to Florida State on Saturday, 59-57. J'ville was trailing 57-40 with 3:30 left and proceeded to go on a 15-1 run to bring the score to 58-55 with :20 left in the game, thanks in part to 2-8 FT shooting by FSU.