Pre-Musings - CheeseIsGood's 2020 Daily Fantasy Baseball Prep: Part 3
Welcome back to the Pre-Musings (in which we Muse before we Muse) for Part 3. In last week’s article, I walked through the stats I’ll use to break down pitching for DFS, and today we’re going to shift to the hitting side.
I will say right off the bat that hitting is far more difficult to project in one single day than pitching. The main factor in this is plain old sample size. A starting pitcher will face 20-30 batters, which gives him 20-30 chances for his skill set to work its magic. But a hitter will get four, maybe five at bats in a game, and that’s it. Let’s use a coin flipping analogy to illustrate this further:
A player’s skill set is to some extent a known quantity. A coin’s chance of landing heads or tails is a known quantity: It’s 50/50. The more times you flip a coin, the more likely you are to end up close to that 50/50 split. If you flip a coin four times, you will occasionally get four heads or four tails. But flip that coin 25 times, and you can do that all day without ever getting 25 straight heads or tails. That’s how it works with hitters vs pitchers. We could know with precision a batter’s projected outcome against a certain pitcher, but still in just four chances, he could easily land on either end of the spectrum.
Add to this small sample size the bullpen and the uncertainty of batted balls, and you should be able to see why there is so much variance in a single batter’s outcome on any given day.
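To put rough numbers on the coin analogy, here is a minimal Python sketch. It assumes a fair coin and independent flips; the function names are made up for illustration.

```python
from math import comb


def prob_all_same(flips: int, p_heads: float = 0.5) -> float:
    """Chance that every flip lands the same way (all heads or all tails)."""
    return p_heads ** flips + (1.0 - p_heads) ** flips


def prob_exact_heads(flips: int, heads: int, p_heads: float = 0.5) -> float:
    """Binomial chance of exactly `heads` heads in `flips` flips."""
    return comb(flips, heads) * p_heads ** heads * (1.0 - p_heads) ** (flips - heads)


# Four flips stands in for a hitter's game, 25 for a starting pitcher's.
for flips in (4, 25):
    print(f"{flips} flips: chance of all heads or all tails = {prob_all_same(flips):.8f}")

# Even the "expected" even split is far from guaranteed in four flips.
print(f"4 flips: chance of exactly 2 heads = {prob_exact_heads(4, 2):.3f}")
```

Four flips come up all one way one time in eight, while 25 flips do so about once in 17 million tries. That gap is the whole sample size argument.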
I’m going to start with the Cliff Notes here, as a starting point to show you yet again where the variance comes from with hitting, why we should not be surprised when Mike Trout goes 0-4, and what stats we can actually look at to find something useful.
STARTING WITH CLIFF NOTES
For any one individual batter in one individual at bat, we can consider strikeouts, walks, hard hits, GB/LD/FB. This is basically all we know:
Step One – The batter either hits the ball or he doesn’t. If he doesn’t, it’s a walk or a strikeout.
Step Two – If he hits it, it’s either on the ground or in the air. And if in the air, a line drive or fly ball.
Step Three – How hard does he hit it?
Step Four – Our old friend Mr. Variance in the form of BABIP, HR/FB%, Ballpark, Weather, Runners On Base, Defensive Alignment, Speed of Runner, Sun In Fielder’s Eye, Lazy Third Baseman Thinking About Favorite TV Show, Non-Lazy Left-Fielder Making Spectacular Diving Catch, etc.
That is really all we can do for one at bat. Determine the likelihood of a walk, a strikeout, a ground ball, fly ball or line drive, and how hard the ball is hit. Clifton Hillegass has again done his job, now we move on.
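For a sense of how often an 0-4 should show up, here is a back-of-the-envelope sketch in Python. It assumes every at bat is an independent event with the same hit probability, which real baseball is not, and the .300 hitter is a hypothetical stand-in rather than any particular player’s line. The function name `prob_hitless` is made up for this example.

```python
def prob_hitless(hit_prob: float, at_bats: int) -> float:
    """Chance of zero hits in `at_bats` tries, treating each at bat as independent."""
    if not 0.0 <= hit_prob <= 1.0:
        raise ValueError("hit_prob must be between 0 and 1")
    return (1.0 - hit_prob) ** at_bats


# Hypothetical .300 hitter (an assumption for illustration only).
for at_bats in (4, 5):
    chance = prob_hitless(0.300, at_bats)
    print(f"{at_bats} at bats: {chance:.1%} chance of going hitless")
```

Under those assumptions, even a .300 hitter goes hitless in roughly one of every four games with four at bats, and the fifth at bat cuts that chance noticeably.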
The Very Basic Basics – Plate Appearances, K%/BB%
Like with pitching, strikeouts and walks are the most projectable skill for hitters, but the pitcher still controls more of that than the hitter does. Let’s get even more basic than that. If sample size is what makes the pitchers more projectable, then let’s go for sample size. What does that mean? Plate appearances. In cash games, you will almost never want a batter who is low in the order. The more chances you have to hit the ball, the more likely you are to hit the ball. That is science! (I don’t know if that’s really science). Add to that, if a team is good and scores a lot of runs, then that lineup is going to turn over more times in a game, making the players at the top of that lineup more likely to get five at bats as opposed to the four that someone else is getting. Yes, one at bat matters.
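The lineup-turnover point is simple arithmetic, sketched here in Python. It assumes no substitutions, so each lineup slot bats in strict rotation, and the team plate appearance totals of 36, 38 and 42 are hypothetical examples of a quiet, typical and high-scoring game, not league data.

```python
def plate_appearances(team_pa: int, lineup_slot: int) -> int:
    """Plate appearances for a lineup slot (1-9) given the team's total for the game."""
    if not 1 <= lineup_slot <= 9:
        raise ValueError("lineup_slot must be between 1 and 9")
    if team_pa < lineup_slot:
        return 0
    # The slot bats once on its first turn, then once more per full trip through the order.
    return (team_pa - lineup_slot) // 9 + 1


# Hypothetical team totals (assumptions for illustration only).
for team_pa in (36, 38, 42):
    per_slot = [plate_appearances(team_pa, slot) for slot in range(1, 10)]
    print(f"{team_pa} team PA -> slots 1-9 get {per_slot}")
```

With 38 team plate appearances, only the first two slots reach a fifth trip to the plate. At 42, the top six do, which is why a good offense lifts the top of its own order the most.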
If we agree that there is a lot of variance in batted balls, we should also agree that more balls in play give us more chances for good things to happen. A walk is fine, but has a limited ceiling of DFS points that can come with it. A strikeout is not fine and gives us a zero. So, with that in mind, as should be extremely obvious, I prefer batters who don’t strike out. If they walk a lot, that is fine, but not as important to me as not striking out. But, whereas strikeout percentage is the leading stat for me on the pitching side, it is less of a deciding factor for me on the hitting side. A soft hit ground ball doesn’t have much more chance of success than a strikeout, and at that point, I’d rather just have a walk. So, yes, strikeouts matter for batters, but they are at the bottom of the list of importance.
Because of the way the scoring works in DFS, with the emphasis on power, runs and RBI, when we’re looking for top scores in tournaments, I’m not going to let potential plate appearances or strikeout rate outweigh plain old power potential.
In the pitching piece, I talked about the three true outcomes in baseball – Strikeout, Walk, Home Run. Just like with pitching, I want to get rid of as much batted ball noise as possible. With a home run, nothing else matters. So the first thing I want from a batter is the potential to hit a home run. And what makes a home run? A hard hit fly ball or line drive. That’s what I want. Give me hitters who hit the ball hard and in the air, and the home runs and extra-base hits will follow. The real starting point here as far as the stats I’ll use in the Musings is hard hits.
THE BEST HITTERS ARE THE ONES WHO HIT HARD HITS HARD
On the pitching side, I talked about how pitchers have less control over hard-hit rate than they do over strikeouts and walks. That is because the hitters are the ones who have the most say in how hard a ball gets hit. Some players are simply stronger than others. Let’s take a look at the league average numbers that we’re shooting for here:
2016 – 31.4%
2017 – 31.8%
2018 – 35.3%
2019 – 38.0%
You’ll notice the unmistakable trend here of hard hits going up and up. Strikeouts have also gone up along with them as the league has shifted to a ‘hit the ball hard, we don’t care what else happens’ attitude. We’ll see what happens this year, but my expectation is that we don’t see a further increase in these numbers, and we settle somewhere in the 35-38% range as the average hard-hit rate. The key here is not to worry about what the average number will be, but to focus on those top hitters who are up in the 40%+ range consistently. If we look at the leaderboard from 2019, it will be immediately clear how hard-hit rate correlates to home runs:
1) Aaron Judge – 53.8%
2) Miguel Sano – 52.7%
3) Nelson Cruz – 52.5%
4) Yordan Alvarez – 51.1%
5) Christian Yelich – 50.8%
Sometimes this stuff is so easy. Just hit the ball hard. And what goes alongside hard hits? Hit the ball in the air. Among those five batters who led the league in hard-hit rate, none had a ground ball rate above 44% and all had a line drive rate above 20%. I’ll say it again – Hit the ball hard and in the air.
With pitchers, I talked about HR/FB% as a “luck stat.” For hitters, it is not at all a luck stat; it is a skill, similar to hard-hit rate. The very reason it’s a luck stat that will regress towards average for pitchers is that the hitters control it. And they control it by what we’ve already mentioned: Hard hits. The harder a fly ball is hit, the more often it goes for a home run. So while the league average HR/FB% for a pitcher is in the 13-15% range, we do not expect hitters to all clump together in that range.
Elite batters will be up towards a 30% HR/FB, while the Mallex Smiths of the world will be down around 5%. I talk more about the plain old hard-hit rate and fly balls; I just wanted to highlight that HR/FB% is a completely different thing for pitchers and hitters.
IF NOT HOME RUNS, LINE DRIVES
When fly balls do not leave the park, they are usually outs. Ground balls never leave the park and are usually stopped by the infielders. Line drives? They are rarely caught or stopped by infielders. A few Google searches bring up various articles that have shown the numbers for BABIP and batting average on line drives vs ground balls or fly balls. It is staggering stuff. I don’t have the 2019 numbers, but they are similar rates every year. According to FanGraphs, in the 2014 season, these were the numbers:
Ground Balls – .239 AVG, .220 wOBA
Fly Balls – .207 AVG, .335 wOBA
Line Drives – .685 AVG, .684 wOBA
HELLO! The tricky thing here is that the difference between an average line drive rate (21%) and an elite line drive rate of 27-28% is small enough that it’s not very reasonable to think that a four at-bat sample size is likely to result in more line drives for an elite player vs an average player. For this reason, as much as I love line-drive hitters, and will lean to them especially in cash games, it is FAR behind hard-hit rate on the list of factors I use on a daily basis.
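Here is a small Python sketch of why that gap washes out in one game. The 21% and 27% line drive rates come from the paragraph above; the three balls in play per game is a hypothetical round number, and each batted ball is treated as independent, which is a simplification. The function names are made up for illustration.

```python
def expected_line_drives(ld_rate: float, balls_in_play: int) -> float:
    """Average number of line drives over a given number of balls in play."""
    return ld_rate * balls_in_play


def prob_at_least_one(ld_rate: float, balls_in_play: int) -> float:
    """Chance of at least one line drive, treating each batted ball as independent."""
    return 1.0 - (1.0 - ld_rate) ** balls_in_play


BALLS_IN_PLAY = 3  # hypothetical: four at bats with one strikeout or walk

for label, rate in (("average", 0.21), ("elite", 0.27)):
    exp = expected_line_drives(rate, BALLS_IN_PLAY)
    at_least_one = prob_at_least_one(rate, BALLS_IN_PLAY)
    print(f"{label:>7}: {exp:.2f} expected line drives, {at_least_one:.0%} chance of at least one")
```

Under those assumptions, the elite line-drive hitter expects about 0.81 line drives to the average hitter’s 0.63. That is less than one-fifth of a line drive per game, which is easy to lose in the noise.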
So we know hard hits are good, line drives are great, and fly balls lead to home runs. Let’s make a cheat sheet of some basic levels to know what we’re looking for.
The league averages for batted ball types are the same as they are with pitchers. Essentially a 43/21/36 average for GB/LD/FB. For batters, a 50%+ ground ball rate is someone I refer to as an extreme ground ball hitter, and rarely someone I’ll want to use on his own in tournaments. There are some exceptions to this with ground ball hitters who hit for power specifically against certain types of fly ball pitchers, but for the most part, we don’t want ground ball hitters as one-offs in tournaments. Just as with fly balls, hard hits are still good for ground ball hitters, and there is nothing I like less than ground ball hitters who do not hit the ball hard.
For fly balls, the higher the fly ball rate, the more the home run upside, but also the more inconsistent those batters become. Because fly balls are almost always an out if not leaving the park, we’ll see more 0-4 days from extreme fly ball hitters, even if they are hitting the ball well. If you add in high strikeouts along with fly balls, then you have the very definition of a boom or bust player.
For example, Joey Gallo has a career 50% fly-ball rate and 38% strikeout rate. But also a 48% hard-hit rate. Just as you’d expect from that skill set, he hits a lot of home runs, but also has a horrendously low .212 career batting average.
I will refer to batters in the 45%+ range as fly-ball hitters. As for line drives, to be considered a line-drive hitter, we need more sample size to know a player’s true skill level, so be careful in getting overly excited about short term line drive rates. When I refer to players as line-drive hitters, I’m talking 24%+ over multiple seasons.
This is a lot of numbers being thrown around, but let me boil it down real simple:
Hard Hit Rate First. Line Drives and Fly Balls Next. Done.
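The buckets above can be sketched as a tiny Python lookup. The thresholds (40%+ hard-hit, 24%+ line drives over multiple seasons, 45%+ fly balls, 50%+ ground balls) come straight from this article; the function name, the label wording and the `ld_multi_season` flag are made up for illustration, and any stat left out is simply skipped.

```python
from typing import List, Optional

HARD_HIT_MIN = 40.0      # "40%+ range" hard-hit rate
LINE_DRIVE_MIN = 24.0    # line-drive hitter, only over multiple seasons
FLY_BALL_MIN = 45.0      # fly-ball hitter
GROUND_BALL_MIN = 50.0   # extreme ground ball hitter


def batted_ball_labels(
    hard_hit: Optional[float] = None,
    gb: Optional[float] = None,
    ld: Optional[float] = None,
    fb: Optional[float] = None,
    ld_multi_season: bool = False,
) -> List[str]:
    """Return the article's bucket labels for whichever percentages are supplied."""
    labels: List[str] = []
    if hard_hit is not None and hard_hit >= HARD_HIT_MIN:
        labels.append("hits the ball hard")
    # A short-term line drive rate is not trusted, so require the multi-season flag.
    if ld is not None and ld >= LINE_DRIVE_MIN and ld_multi_season:
        labels.append("line-drive hitter")
    if fb is not None and fb >= FLY_BALL_MIN:
        labels.append("fly-ball hitter")
    if gb is not None and gb >= GROUND_BALL_MIN:
        labels.append("extreme ground ball hitter")
    return labels


# Joey Gallo's career fly-ball and hard-hit rates as quoted above.
print(batted_ball_labels(hard_hit=48.0, fb=50.0))
```

The hard-hit check comes first on purpose, matching the order of importance: hard hits, then line drives and fly balls.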
Single Stat Cheatcodes
It used to be Batting Average, then it was On Base Percentage, then Slugging Percentage. Next it moved to OPS (On Base Plus Slugging), and then we were onto wOBA (Weighted On Base Average) and ISO (Isolated Slugging %). Then there’s wRC+ (Weighted Runs Created Plus) and WAR (Wins Above Replacement). Now we’ve got xEverything (xSLG, xwOBA, etc) with new stats being created every year.
Here’s what I’ll say about all these stats – They are all awesome and tremendous and wonderful and I love them like brothers, but none of them actually tell us what a player is going to do today.
In order to shorthand some points in the daily musings, I will use wOBA and ISO fairly regularly to show which batters I like. BUT, here’s the key, I don’t like them BECAUSE of their wOBA or ISO. I like them because of what we’ve already talked about. They hit the ball hard, they hit the ball often, and they hit the ball in the air. Over time, wOBA and ISO will match up to that basic batted ball data, meaning a player who hits the ball hard and in the air will end up with a high ISO. But, I want to type in all caps so it’s like I’m screaming this, which I am – HIS ISO IS HIGH BECAUSE HE HITS THE BALL HARD AND IN THE AIR. HIS WOBA IS HIGH BECAUSE HE HITS THE BALL OFTEN AND HE HITS THE BALL HARD.
If you don’t have time to look at every batted ball stat, or just don’t have interest in it, trust me, I get it. That’s fine. wOBA and ISO and other xStats will give you a very good glimpse at who is good; my point is just that the batted ball basics are the building blocks that make up all the other stats.
I think I will create a new stat that encompasses what really matters. We will call it HHHHH (BTYBC)
‘Hitters Hitting Hard Hits Hard’ (Brought To You By Cheese)
Runs, RBI and Why We Stack
One thing that we can’t accurately project is when a batter will come to the plate with runners on base and in scoring position. A basic ground ball single is a completely different event with the bases loaded as opposed to with the bases empty. This is the first place where we can gain an edge from correlation, or at least having batters on good teams. I don’t actually believe that run and RBI correlation is the main reason why stacks win tournaments (we’ll get to that in an upcoming episode), but stacking and correlating players absolutely gives your lineup upside in that regard. Having players on the same team not only gives you that potential extra plate appearance that we talked about up top if the team does well, but also adds to the likelihood that you can pick up an extra run or RBI. This is basic DFS 101, but how do we measure it? I am certain you are not going to like my answer, but we can’t measure it. Projection systems will have a boost to players on high total teams in an attempt to measure it, but it is nothing close to an exact science. And this is exactly why we do correlate players and stack multiple players from the same team. We can gauge which teams have the best chance to score the most runs on a slate more than we can gauge which exact players on that team will get the runs and RBI. The difference in a player with a projection of 12 DFS points and 13 DFS points is completely irrelevant in one game. So, if the choice comes down to a similarly priced player, I will side with the 12 point projected player over the 13 point projected player if he correlates with the rest of my roster.
Let me pause here and say that if you are new to MLB DFS and are coming from NBA DFS, you need to forget everything you know and start from scratch with lineup construction and player evaluation. I might write a full pre-musing about this, but there are no two sports more completely opposite for DFS than MLB and NBA. To me, the biggest edge in early season MLB comes from having players fresh off NBA season who assume projections are the key to success in MLB. Of course, we want good projections (I recommend THE BAT), but lineup construction cannot be done with projections alone, especially in tournaments.
To boil this down, in cash games, I prefer batters on good teams (high projected totals), as the more runs a team scores, the more opportunities that lineup has for runs and RBI. But this does fall into the Variance category, and not the Things We Can Know category. In cash games, give me players who hit the ball hard, hit line drives, and don’t strike out. In tournaments, correlate and take the happy variance of additional runs and RBI from unexpected places.
Bullpens, Ballparks and Barometric Pressure, Oh My
Right at the beginning of this, I gave some reasons why we can’t accurately project DFS scoring in one game for a hitter, beginning with sample size and then moving to batted ball variance. But the other huge factor that can be considered, but not accurately quantified, is the opposing bullpen. Every year, we’re seeing starting pitchers throw fewer and fewer innings and a majority of starters only facing each lineup twice. This means that in a majority of games, only half of a hitter’s at-bats are going to come against the starting pitcher. But we’re doing most of our analysis based on the assumption that he is facing a certain skill set. We do have bullpen rankings to know which teams are better than others, but the fact is every team has both good and bad pitchers in the bullpen, in addition to both left and right-handed relievers. All this unpredictability adds to why I focus first and foremost on batters who hit the ball hard. We don’t know if he’ll be facing a high strikeout pitcher or a lefty or righty, or a fly ball pitcher or a ground ball pitcher after the first couple at bats. So give me the players who hit the ball hard and let the rest sort itself out.
The bullpen factor is another thing that gives an edge to correlation and stacking in tournaments. While we don’t know exactly which pitchers are coming into the game, any team that gets into the middle of the bullpen against a bad team is going to have a boost for every batter in the lineup. When games get out of hand, that’s when we’ll see a team using its worst relievers and getting shelled over and over, and at that point, the whole offense benefits.
I’ll use this bullpen discussion to point out the difference in batters’ platoon splits, meaning how they hit against right or left-handed pitching. This is another area where the pitcher’s numbers outweigh the hitter’s numbers for me. Yes, some players are clearly better or worse against a certain type of pitcher, but the R vs L matchup has more to do with the fact that pitchers throw different types of pitches against different handedness of batters. We also need a much bigger sample size to get any usefulness from a hitter’s platoon splits, in addition to the fact that we don’t know what handedness of pitcher they will face later in the game. So, yes, I will refer to batters’ numbers against the handedness of the opposing starter, but it’s more about the pitcher’s numbers.
Ballparks and Weather – Ballpark factors and weather are more knowable than bullpens. We do have actual data for which ballparks boost home runs and other factors, and thanks to Kevin Roth and his WeatherEdge tool, we can see how different weather conditions affect offense. What I will say here is that you should figure out for yourself where you want your decision point to be in how to factor these in. I see a lot of people double or triple counting these factors, to the point that they become far overvalued. This is what I mean: if you use Vegas totals to find high scoring offenses, the ballpark and weather are already factored in. If you use projections to find high scoring offenses, the ballpark and weather are already factored in. If you are doing your own game by game research, you will add ballpark and weather factors in to give that team a boost.
What you don’t want to do is play this little game, as spoken by your inner dialogue:
Boston has the highest projected total tonight, excellent, I’m playing 50% J.D. Martinez.
Oh my, look how high J.D. Martinez’s projection is, I’m playing 75%.
Wow, look, the Red Sox are playing in Baltimore which boosts right-handed power and it’s 97 degrees, I’m playing 100% J.D. Martinez.
You should only be using one of those data points to boost your opinion of that player or team. Ballparks and weather are real factors, and measurable factors, but I urge you not to overdo it.
Wait, What About Statcast?
MLB has entered a new era of analysis with statcast data. We now have exact exit velocity and launch angle for every ball hit in the league. We have spin rates for pitchers and data that tracks grids of the strike zone in tremendous detail. Essentially, Exit Velocity = Hard Hit Rate and Launch Angle = GB/LD/FB. But clearly, the statcast data is more exact. So that raises the question, why wouldn’t we use the more exact data? I do think we are nearing the point in the next few seasons where we’ll have enough understanding of how to use this statcast data that I might make the switch. But where we currently are, I haven’t seen anything that leads me to believe that trying to be more exact is a benefit in the sample of one game. I do want to know if we’re getting a ground ball or a fly ball, but I think we’re getting goofy to imply that there’s a difference in a four at-bat sample size between a guy with a 17.8 degree launch angle vs a guy with a 17.4 degree launch angle. For me, trying to decipher something like that and factor it in to my research would be nonsensical. I don’t have an issue with anyone who wants to replace hard hit rate with either exit velocity or the statcast version of hard hit %. They will absolutely tell us the same thing, and I don’t see that one is far better than the other. I just know that for me, I like simple buckets of ‘these guys hit the ball hard’, and when I try to break it down to ‘this guy has a 92.6 mph exit velocity and this guy is 91.9 mph’, then I’m overthinking it. For reference, in 2019, Jorge Soler was at 92.6, Matt Olson was at 91.9. Just tell me they both hit the ball hard, because they do.
To be clear, I don’t think there’s anything wrong with using statcast data as your primary research point if that’s what you want to do. The data is correct and will get you to the same endpoint, but I personally still prefer the more general numbers.
MORE CLIFF NOTES
There is a never-ending array of stats and metrics that you can use to break down hitting, but in the sample size of one game, there is really only so much we can know. Here is a cheat sheet of the stats I’ll most often reference in the Musings this year, in order of importance:
1) Hard Hit Rate
2) GB/LD/FB (Ground Ball% / Line Drive % / Fly Ball %)
3) K% (Strikeouts)
4) BB% (Walks)
5) Team Context (Runs and RBI), Opposing Bullpen
6) Ballpark and Weather Factors
7) Single Stat Cheat Codes – ISO, wOBA, etc
It’s almost daily fantasy baseball season, let’s go get those HHHHH (BTYBC)!