The Great American GPP:
Recently, I’ve used this blog to examine how advanced stats can be used to build player models and generate HR probabilities. We’ve looked at EV, aEV and EV+, taken HH rates into consideration, thought about plate discipline, and borrowed the concept of coming into form from race handicapping, turning it into the baseball version called “Home Run form”. To continue this thread, this issue of The Great American GPP takes a look at additional batted ball data, Pull%, Mid% and Opp%, and how they fit into a player profile.
First, let’s make sure we understand what these terms mean and measure.
Mid% is the percentage of a batter’s BBEs that go right up the middle of the field, basically over the pitcher’s mound, over second base, and into center field. As you know from watching games, hits up the middle can be very productive, or very frustrating. Hard-hit line drives and ground balls that get by the pitcher (or don’t take his head off) usually become singles, while fly balls, even the hardest hit of them, very often turn into outs. As they say, center field is where hard-hit balls go to die. Because center field is usually the deepest part of a stadium, grounders rarely turn into extra-base hits, as the center fielder can charge in on the ball, and LD and FB can often be run down by any decent center fielder. Fly balls require the greatest amount of DST to get over the fence here, hence the frustration factor, as perfectly hit balls fall harmlessly into the CF’er’s glove on the warning track. This is another reason why Park Factors are important to include in your home run models, and not just in your analysis of the run line or team total for the game. How deep is center field? Does the wind blow in from center, or out from behind the plate? How wide is CF, and what is the speed and range of the center fielder? If you don’t think the range and positioning of a fielder impacts offensive projection, ask why the Dodgers use laser range finders and GPS to position their fielders at visiting ballparks.
To “Pull” a ball is to hit it to the side of the field opposite the batter’s handedness, so for L bats, the pull field is right field, and for R bats, the pull field is left field. Just remember that batters pull opposite their handedness. So for righties, analyze the left field measurements and the defensive ability of the left fielder, and for lefties, look up right field and the right fielder. Confusingly, the “Opp%” or opposite field is the field on the same side as the batter’s hand, so the Opp field for R bats is right field, and for L bats, left field.
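If you keep your own batted ball log, the definitions above are easy to turn into numbers. Here is a minimal Python sketch, assuming each BBE has already been tagged with the field it was hit to. The "LF"/"CF"/"RF" tags, the function name and the sample data are my own illustrative choices, not anything a stats site publishes in this form.

```python
from collections import Counter

# Field a ball was hit to -> direction label, keyed by batter handedness.
# Righties pull to left field; lefties pull to right field.
DIRECTION = {
    "R": {"LF": "Pull", "CF": "Mid", "RF": "Opp"},
    "L": {"RF": "Pull", "CF": "Mid", "LF": "Opp"},
}

def spray_percentages(bats, fields):
    """Return Pull%, Mid% and Opp% (0-100) for a list of batted ball fields."""
    if not fields:
        raise ValueError("need at least one batted ball event")
    counts = Counter(DIRECTION[bats][field] for field in fields)
    total = len(fields)
    return {label: 100.0 * counts[label] / total for label in ("Pull", "Mid", "Opp")}

# Made-up sample: 10 batted balls, 5 to left, 3 to center, 2 to right.
sample = ["LF"] * 5 + ["CF"] * 3 + ["RF"] * 2
print(spray_percentages("R", sample))  # {'Pull': 50.0, 'Mid': 30.0, 'Opp': 20.0}
print(spray_percentages("L", sample))  # {'Pull': 20.0, 'Mid': 30.0, 'Opp': 50.0}
```

Note how the same ten batted balls produce opposite Pull% and Opp% depending on which side of the plate the batter hits from.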
Don’t forget that each ballpark is a unique environment (which is what makes visiting each one such a cool thing to do), so don’t just look up the dimensions; if you can, take a moment to find one of the many ballpark visualizers that are starting to appear all over the web. These sites will give you a virtual image or map of the stadium, so you can track the dimensions and shape of the field and the fences. If you’re wondering why this would be valuable, besides as a virtual trip to every ballpark (which is cool, but not as cool as doing it IRL), picture Fenway or Minute Maid Park and you’ll quickly see why we want to do this.
There are several different ways we can use this data in our DFS strategy. Obviously, the first thing we can do is take the pool of batters our HR analysis has given the best probabilities of going yard, examine their direction percentages, and get a sense of the field they are likely to use. Then, mark the depth of that field, and the ability and speed of the defender. Then, use the batter’s avgDST and L15 DST to alter the batter’s HR probability up or down based on the matchup. You can also use the player’s FB% and LD% and match them with the fielder’s speed and range to alter the probabilities for an extra-base hit (this second adjustment would apply more to cash games, I believe). For GB%, it makes sense to check on the batter’s susceptibility to the shift, and the fielding team’s use of the shift (if I were allowed to eliminate one thing from baseball it would be the shift). If you can play a massive number of entries in a GPP, this probably won’t matter as much to you, but if you are playing one, or even just a few dozen, this extra bit of information can help you make some close calls. This also helps to explain why the DST Statcast advanced stat is so important, but also less universal than the other Statcast data. Why? Well, look at those field maps again!
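To make the adjustment steps above concrete, here is a rough Python sketch of one way to nudge a base HR probability. Everything in it is an assumption on my part: the function name, the idea of comparing a spray-weighted fence distance against a neutral reference distance, and especially the pct_per_foot and max_shift knobs, which are invented placeholders rather than fitted values. Treat it as a starting point for your own model, not a finished one.

```python
def adjust_hr_probability(base_prob, spray, fence_ft, reference_fence_ft,
                          avg_dst_ft, l15_dst_ft,
                          pct_per_foot=0.01, max_shift=0.25):
    """Nudge a base HR probability up or down for today's park and recent form.

    spray              : {"Pull": %, "Mid": %, "Opp": %} for the batter
    fence_ft           : fence distance in each of those directions for this
                         batter in this park (Pull is LF for a righty)
    reference_fence_ft : a neutral fence distance to compare against (assumed)
    avg_dst_ft         : batter's season average DST
    l15_dst_ft         : batter's DST over the last 15 days
    pct_per_foot and max_shift are invented tuning knobs, not fitted values.
    """
    total = sum(spray.values())
    # Fence distance weighted by where this batter actually hits the ball.
    expected_fence = sum(spray[k] * fence_ft[k] for k in spray) / total
    # Shorter-than-reference fences in the batter's preferred fields help him.
    park_term = (reference_fence_ft - expected_fence) * pct_per_foot
    # Recent distance above the season average suggests he is in form.
    form_term = (l15_dst_ft - avg_dst_ft) * pct_per_foot
    shift = max(-max_shift, min(max_shift, park_term + form_term))
    return max(0.0, min(1.0, base_prob * (1.0 + shift)))

# Made-up example: a pull-heavy righty in a park with a short left field.
adjusted = adjust_hr_probability(
    base_prob=0.10,
    spray={"Pull": 50, "Mid": 30, "Opp": 20},
    fence_ft={"Pull": 310, "Mid": 420, "Opp": 380},
    reference_fence_ft=365,
    avg_dst_ft=300,
    l15_dst_ft=305,
)
print(round(adjusted, 3))  # 0.113
```

The cap on the shift is deliberate: direction and distance should tip close calls between two similar bats, not override the underlying HR probability.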
And, believe it or not, this ties back in to our analysis of Exit Velocity (and Launch Angle) several issues ago (see the blog’s archive). The uniqueness of each stadium, and its impact on offensive production, goes way beyond just a correlation between the stadium and the Vegas run line (although that is a great shorthand metric for all of this other stuff). Writers at FanGraphs and Baseball Prospectus have examined how each park has a different Statcast effect, and even more important, each stadium has its own EV and Launch Angle profile that goes beyond just matching DST. Some parks (Coors is the easy example but there are others) are very conducive to a fly ball’s performance, while other parks are fly ball kill zones. Great hitters from recent past generations, like Keith Hernandez, always used to talk about stadiums with fly ball kill or dead zones, but that could never be proved until recently. Spend some time looking into this, and you’ll discover that what expert hitters like Hernandez ‘felt’ about a park turned out to be statistically true! For example, Miller Park is rated as the best park factor for HR by Rotogrinders (LINK), and well it should be: nearly 100% of FB hit with an EV of 100 mph are home runs, while the MLB avg is roughly half that! In addition to their fly ball properties, Coors and the Rogers Centre allowed the most line drive HRs, nearly a third more than the MLB avg. Every stadium has its unique data. Find it and use it to your advantage!
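If you want to check a park’s fly ball profile yourself, the calculation behind a number like “share of 100 mph fly balls that leave the yard” is simple. Below is a small Python sketch, assuming you have fly ball rows of (park, EV in mph, was it a HR). The function name is mine and the rows are invented placeholders for two unnamed parks, not real Statcast data.

```python
from collections import defaultdict

def hard_fb_hr_rate(fly_balls, min_ev=100.0):
    """Share of fly balls hit at or above min_ev (mph) that were HRs, per park."""
    homers = defaultdict(int)
    hard_hit = defaultdict(int)
    for park, ev_mph, is_hr in fly_balls:
        if ev_mph >= min_ev:
            hard_hit[park] += 1
            homers[park] += int(is_hr)
    # Parks with no qualifying fly balls are left out rather than shown as 0.
    return {park: homers[park] / hard_hit[park] for park in hard_hit}

# Invented rows for illustration only.
rows = [
    ("Park A", 101.2, True), ("Park A", 103.5, True), ("Park A", 98.0, False),
    ("Park B", 100.4, True), ("Park B", 102.1, False), ("Park B", 95.3, False),
]
print(hard_fb_hr_rate(rows))  # {'Park A': 1.0, 'Park B': 0.5}
```

With a real season of data you would also want a minimum sample size per park before trusting the rate.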
Just as the matchup of FB or GB hitter and pitcher type has an impact on wOBA and true batting average, “Steve Staude”:http://www.fangraphs.com/blogs/an-unsolicited-follow-up-study-of-pull/ at FanGraphs found that Pull% and Opp% have a positive impact on a player’s ISO, while balls hit to the middle (Mid%) lowered a player’s ISO (the same holds true for SLG and BABIP, btw), and ISO is one of the key stats for modeling HR and extra-base hit production. Thus, while Pull%, Mid% and Opp% are descriptive stats, they are very useful in creating models, or profiles, of players and their DFS value.
Daniel Steinberg has also developed a method of using Pull% to gain an edge in finding player value.
Remember to keep this data, and all MLB stats and ideas about how to use them, in context. This data provides another way of looking at players and how they play the game, and as DFS gamers, we should always be looking for new ways to think about our game. A high Pull% and a high DST number are not proof that a hitter will go yard (although both help a lot), but examining them is in keeping with my theory about advanced stats: when used in combination, they allow us to create highly accurate models of players, with probabilities of various outcomes for those players on the field. It’s up to us as DFS owners to use this info to get the best of it.