Things I've Learned in EPBL

The material here is excerpted from a series of press releases by Morris Cohen in EPBL. The analysis presented is based on EPBL archival data, rather than any "inside information" on the calculations in the baseball simulator.

Should I use batter or pitcher friendly park? Turf or grass?
Should I look at stats or abilities when I make draft choices?
How valuable are each of the abilities?
What is the value of defense, arm, and speed ratings?
How do I evaluate stats and performance?
How do young players develop and mature (part 1)?
How do young players develop and mature (part 2)?
Other Advice

Should I use batter or pitcher friendly park? Turf or grass?

The DEL help pages state the following:

Artificial turf makes most ground balls harder to handle (they bounce better and roll better), but reduces the number of bad bounces. A larger field reduces the number of home runs, but makes it harder for the outfielders to cover everything as well.

The description is a little vague. It used to be that you'd set the actual distance to the fence, and there were several options; now there are only two, and the difference is not as extreme.

A useful way to investigate this is to use the At-Bats database. If you're not familiar with it, go here. There you can immediately call up the league statistics for various types of ballparks, over 10 seasons of EPBL play. I used it to find the information below:

Effect on HRs: Using a batter friendly park (390 feet) instead of a pitcher friendly park (400 feet) increases the percentage of plate appearances that result in HRs from 2.2% to 3.0%. Taken at face value this is a lot, more than a 1/3 increase in HRs.

But those numbers leave out the effect of teams matching their personnel to their stadium. So a team with lots of power hitters will more likely pick a batter-friendly stadium anyways. To reduce this effect, we can look only at the visiting team, which hasn't been built around the stadium. This changes the probabilities to 2.29% and 2.74%, respectively.

For a regular player making 700 plate appearances in a year, this translates to about 19.7% more HRs in the half of your games that are at home, or ~10% more HRs in total. If you have a good power-hitting team (and/or a pitching staff that is good at preventing HRs), then increasing both your and your opponents' HRs by 10% will probably benefit you more than it benefits them.

So switching to a batter friendly park will give a 30-HR hitter an extra 3 HRs each season, on average. There is no meaningful effect of the ballpark size on batting average.
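The arithmetic behind that estimate can be sketched as follows. This is only an illustration using the visiting-team HR rates quoted above; the function name and the assumption that exactly half of a player's plate appearances come at home are mine, not part of the original analysis.

```python
# Visiting-team HR rates per plate appearance, as quoted in the text above.
HR_RATE_PITCHER_PARK = 0.0229
HR_RATE_BATTER_PARK = 0.0274


def season_hr_gain(home_game_share=0.5):
    """Fractional increase in season HRs from switching to the batter-friendly park.

    Assumes (hypothetically) that the park only affects the share of plate
    appearances that happen at home.
    """
    park_gain = HR_RATE_BATTER_PARK / HR_RATE_PITCHER_PARK - 1.0
    return park_gain * home_game_share


gain = season_hr_gain()
print(f"Season HR gain: {gain:.1%}")
print(f"Extra HRs for a 30-HR hitter: {30 * gain:.1f}")
```

The same function can be reused with a different home share if a league schedule is not an even home/away split.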

Next up is the effect of turf and grass. Repeating the same procedure, we see the following:

Batting average: 250 on grass, 265 on turf. So the effect of playing on turf is probably like adding 7 points to a player's batting average, since half the games are at home.

To evaluate double plays, we look at all scenarios where a runner is on first but 2nd and 3rd bases are empty, 0 or 1 outs, and a ground ball is hit. There is still enough data in 10 seasons to see a trend:

On turf, 34% of these situations led to double plays. On grass, it's 32%. Not a huge difference there. So I would say the effect on batting average is probably a bigger factor than the number of double plays, and you should choose turf if your team is expected to have a better batting average than most of your opponents.

As far as evaluating players, none of these effects are huge, but it is worth keeping in mind when you trade for or sign a player. What ballpark is that player coming from?

Should I look at stats or abilities when I make draft choices?

Drafted players come in with abilities and also with some stats from a simulated set of games, presumably against the other rookies. I've observed over the years that certain owners look mostly at abilities, some look at both abilities and stats, and some look heavily at stats. In this column, we'll get into the value of each.

Here's what's worth remembering: EPBL players have a random "fuzz" number added to their abilities, let's say it's a couple points. That means if you see this guy:


W.Sedeno     IF 18r  2  2  2  6  8  1  5  1  8  L  L

You really could have either of these two players:


W.Sedeno     IF 18r  4  4  4  8 10  3  7  3  8  L  L
W.Sedeno     IF 18r  0  0  0  4  6  0  3  0  8  L  L

(This is in addition to rounding uncertainty, so you don't know whether that 6 contact is actually 5.5 or 6.4.)

I think a single fuzz number is applied to all the abilities in EPBL, so if the fuzz is positive or negative, it's a big deal. I think this is unique to EPBL, so in other DEL baseball leagues there is less uncertainty about a player's ability on draft day.

In any event, as a player grows into maturity, this random fuzz disappears little by little. Practically speaking, this means a player who is underrated (i.e. their fuzz at draft day is negative) will more likely grow faster than other players as that negative fuzz wears off. The overrated player will stagnate because his growth with age will be offset by the positive fuzz wearing off.

The biggest value to stats is that they help you figure out whether a player has a negative fuzz or a positive fuzz. Because when the players play those combine games, they do it with their true ability, not with the fuzzed/randomized abilities we see on the draft list.

Here's the way I handle it: I give each drafted player 2 ratings. One based just on his abilities, and one based just on his stats. Each rating is basically a z-score of that player compared to all the others AT THAT AGE, so 0 would mean an average player, and 1.0 would be a player who's better than ~84% of the draft players. Note the importance of comparing a player to others of his age, because 18 year olds can be expected to have worse stats than 22 year olds since they are less developed.

The ability rating comes from projecting a player's ability into the future and then applying a formula to figure out the player's future value. The stats rating comes from looking at the combine stats and again plugging into a formula. These will be discussed in a future column, time permitting.

What I do next is take Stats rating - Ability rating to make an "Underrated Score". This basically tells you how overrated (negative) or underrated (positive) a player is. An Underrated Score of 1.0 means his statistics in the combine were 1 standard deviation better than his abilities suggest. That's a good thing: it means the player may be better than what you see in his abilities, because he simply has a negative fuzz.
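A minimal sketch of that calculation is shown below. The raw ratings are made-up numbers for five draftees of the same age, and the use of the population standard deviation is an assumption; the actual formulas behind the ability and stats ratings are not given in this column.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values against their own group (here: one draft age group)."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical raw ratings for five draftees of the SAME age.
ability_raw = [52.0, 60.0, 47.0, 55.0, 66.0]    # e.g. projected ability value
stats_raw = [0.110, 0.105, 0.130, 0.098, 0.125]  # e.g. combine runs per PA

ability_z = z_scores(ability_raw)
stats_z = z_scores(stats_raw)

# Underrated Score = Stats rating - Ability rating (positive means underrated).
underrated = [s - a for s, a in zip(stats_z, ability_z)]

for i, score in enumerate(underrated, start=1):
    print(f"player {i}: underrated score {score:+.2f}")
```

In this toy data the third player has the weakest visible abilities but the best combine stats, so he gets the largest positive score.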

Using many years of historical EPBL data (which you can get from here), I verified that this is a useful technique, by looking at the correlation between a player's Underrated Score, and the amount of ability growth they showed from draft day to when they turn 26. And I found a significant positive correlation. To get more specific, I found that each point of Underrated score corresponds to about 0.20-0.25 ability points for 18 year old batters, and about 0.05-0.06 points for 22 year old batters. So the effect of the fuzz is definitely higher for 18 year olds as a reflection of higher uncertainty. So if you like to draft 18 year olds, it's even more important to look at the stats along with abilities.

In the case of Sedeno, above, my top player in this year's draft (who went #4), I calculated an underrated score of a very high 1.96, and I estimate that his abilities are all 0.69 points higher than what he showed at draft day (which is part of why I consider him the top player in the draft). That's of course only my best guess based on the limited information that we have, and my presumably imperfect but probably still useful statistical techniques.

Corollary: As you evaluate which minor league players to cut, sign, trade away or for, look at how they are performing. If you find a player who isn't playing nearly as well as his abilities indicate he should, that's a hint that he may be overrated.

So, to summarize this post: Don't ignore combine stats, they are valuable and help you to forecast which players will grow faster and which ones will stagnate.

How valuable are each of the abilities?

I think this is an interesting question because there is rather limited information about what abilities are actually important. We all know the rough guidelines -- Contact for batters is critical, discipline makes for more walks, Arm and control are critical for pitchers and durability determines whether a pitcher is a starter or reliever (but it doesn't really impact how good they are when they play). But it's hard to quantify precisely what matters more than other things.

In the player sort pages, the following weights are used by position if using the default ability sorts (normalized to max 10)


Pos Ds Sp Co Pw Df Ar Cn
C    3  2 10  2  0  4  0
IF   3  2 10  2  3  1  0
OF   3  3 10  2  3  1  0
P    1  0  0  0  0 10 10

But I have found my own combination through some statistics.

How to determine? First, I made a single rating of a player's offensive contributions (or for a pitcher, preventing offense), similar to OWAR or Runs Created; it's similar but a little different for pitchers and hitters. This will be detailed in a future post. Then, I gathered several seasons of player statistics, so that I have both their abilities and their statistics rating, taking only full-time players that had enough ABs to qualify. I set up a simple linear regression; for those who want the details of the math, see the appendix. What this means is that I calculated the relationship between each ability and season statistics, all at once. I am assuming linearity, meaning that the improvement from 8 Cn to 9 Cn is the same as the improvement from 9 Cn to 10 Cn. It also means that, for instance, the product of Cn x Ar doesn't tell you anything that you could not have gotten from just Cn or Ar separately. These assumptions may not be precisely the case; probably only Andy knows that. I tried some more general approaches but didn't see a convincing reason to use them over the above assumptions (it could be that I didn't have enough data). So I would be careful not to overapply these lessons, for instance to predict the hitting abilities of pitchers (which almost always stink anyways, so why bother), but for major league hitters I think this is useful.

I did this separately for hitters and pitchers. I found that for pitchers only Ds, Cn and Ar matter, the rest make no difference. Dr is of course important to tell you if a pitcher is a starter, and it may also affect a hitter's injury risk but I didn't look at that. For pitchers, Ds also impacts clutch performance disproportionately which won't show up in this analysis, and I'm not sure how strong that effect is. So Ds is probably slightly more valuable than I've calculated. I am also ignoring the impact of aggressive pitching settings and ballpark variations, since this is essentially a league-average. But in principle, your ballpark and aggression settings will have a minor impact on the type of pitcher you'll look for, but I'd rather ignore that effect.

To be more exact, I got the following weights:

For hitters:

67% Contact
17% Power
17% Discipline

For pitchers:

46% Control
34% Arm
20% Discipline

This doesn't include defense so that will be the subject of a future column since it varies for C, IF, OF, and that's another animal. Other abilities don't matter for pure offense. I guess speed matters if you like to steal a lot of bases, but I don't (as will be explained in a future column).

So my prototypical player is:


Pos Dr Ds Sp Co Pw Df Ar Cn
Bat  ?  3  ? 10  3  ?  ?  0
Pit  ?  4  ?  0  0  ?  7 10

The question marks are either not included in this analysis (like defense), or the importance is determined by other factors (like how much you like to steal bases, and whether you want a starting pitcher or a reliever).
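The step from the percentage weights to the prototype rows is just a rescaling so that the largest weight becomes 10. A small sketch, with a helper name of my own choosing:

```python
def scale_to_ten(shares):
    """Rescale regression weight shares so that the largest one equals 10."""
    top = max(shares.values())
    return {ability: round(10 * share / top, 1) for ability, share in shares.items()}


# Percentage weights quoted in the text above.
hitters = scale_to_ten({"Co": 67, "Pw": 17, "Ds": 17})
pitchers = scale_to_ten({"Cn": 46, "Ar": 34, "Ds": 20})

print("Bat:", hitters)
print("Pit:", pitchers)
```

Rounded to whole numbers, these give the 10 / 3 / 3 and 10 / 7 / 4 entries in the prototype table.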

A few observations: Contact is by far the most important for hitters. The ironic thing is that for many players, adding a point of contact improves a player's HR hitting as much as a point of power. Sounds counterintuitive but I've found it to be true. Think of it this way: Power determines the fraction of your hits that are HRs. Contact means you get more hits, and if you get 20% more hits, you'll also get 20% more HRs. So contact helps doubly in that sense.

For pitchers, control matters more than arm, another place I differ from the help article. Most of us eventually figure out that the 10 6 pitcher has a very poor career compared to the 6 10 pitcher, and this analysis confirmed it. Also, don't totally ignore discipline for pitchers, it does have a surprising impact.

In the next post, I'll address the value of the remaining abilities, namely arm, defense, and speed.

Andy's note: I'm not sure if speed was considered in this analysis, but it does help create extra-base hits (along with contact).

APPENDIX: To set up the calculation, I had a matrix equation that looked like this
Ax=B

A is a matrix with each row being the abilities for one player. B is a single-column matrix, each row is a single number describing how good the statistics were of the player in matrix A (similar to runs created).

x is a column vector with the weights for each ability. That's basically what we're trying to find out from this exercise.

The best (least-squares) solution to this is standard and well known in linear algebra. It is
x = (A^T A)^-1 A^T B
Where ^T means transpose and ^-1 means inverse. I used a program called Matlab to solve it. Those of you with a science or engineering background have probably heard of it or used it, and there are other programs that can do it, I'm sure. I don't think it's possible to solve that equation with Excel, since it requires inverting a matrix.
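For readers without Matlab, here is a rough pure-Python sketch of the same least-squares solve. It is not the original script: the data are invented, the helper names are hypothetical, and instead of forming the inverse explicitly it solves (A^T A) x = A^T B by elimination, which gives the same x when A^T A is invertible.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(m, rhs):
    """Solve m x = rhs (rhs is a single column) by Gauss-Jordan elimination."""
    n = len(m)
    aug = [list(m[i]) + [rhs[i][0]] for i in range(n)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def least_squares(A, B):
    """Return x minimizing |A x - B|^2 via the normal equations."""
    At = transpose(A)
    return solve(matmul(At, A), matmul(At, B))


# Invented example: each row of A is one player's (Ds, Co, Pw) abilities,
# and B is built from known weights so the recovered x can be checked.
A = [[3, 8, 4], [5, 6, 7], [2, 9, 3], [6, 5, 8], [4, 7, 2]]
true_weights = [2.0, 10.0, 3.0]
B = [[sum(w * a for w, a in zip(true_weights, row))] for row in A]

x = least_squares(A, B)
print([round(v, 3) for v in x])
```

With real season data B would not be an exact linear function of A, and x would be the best-fit weights rather than an exact recovery.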

What is the value of defense, arm, and speed ratings?

To review, last post I ended up putting the following weights for the "prototype" player, and it came out to:


Pos Dr Ds Sp Co Pw Df Ar Cn
Bat  ?  3  ? 10  3  ?  ?  0
Pit  ?  4  ?  0  0  ?  7 10

There are several question marks that I didn't fill out, because I was only including the abilities that impact hitting, or for pitchers, preventing hitting. In this post, I will attempt to include the effect of the other skills, namely defense. This is harder to do in DEL and I'd say there's a little more uncertainty in my result, but the conclusion did change how I built my team, namely I emphasized my IF defense more than I had before.

So let me simplify to start off: I will ignore the impact of speed on stealing bases. My approach has for a while been to steal bases fairly rarely. I find that in order for stealing bases to be worthwhile, you have to be successful about 65% of the time, and that would make you just break even. The league average is about 70%, so the advantage of stealing a lot is not so huge. Plus, DEL doesn't seem to have the best algorithm for figuring out when to steal if you set it high, and it doesn't give you any control over what situations you'll steal in, apart from whether you're behind or ahead. As such, I'd rather set my team to steal only rarely so my success rate goes up, and I optimize my roster for players who can hit, not who can steal. Adapt the rest of what I say if you really want to steal bases.

Now, we're left with the question of how to factor in defense, including the Df, Sp, and Ar abilities. Here's how I handled this, using player statistics from one year. What I want is a measure of the number of innings played on defense. DEL doesn't have this statistic, or at least I didn't know how to access it, so I took plate appearances as a proxy for defensive innings played and assumed it's proportional. It doesn't line up exactly, but if I take only NL teams that don't have the DH, it's close enough.

The key defensive statistic is PM, or plays made (and also PA or plays attempted). This is the number of outs someone made on defense. For better defenders, this number is higher per inning, because their range is larger and they end up making the outs instead of other players. With catchers, though, it's the opposite: More plays attempted means the opposing team steals bases often. So the higher-arm catchers have a SMALLER number of PM and PA. To see this directly, compare Kosut on Chicago AL, to Rodriguez on Houston, and look at the number of plays attempted even though both are starters.

PM and PA are the most meaningful defensive statistics available in DEL. There's also number of errors but these are few, so the impact of making more outs, as reflected by PM and PA, is much greater than adding a couple errors on the season. So I looked at the correlation between various abilities and the number of plays per inning that a player made. What you can observe is that changing abilities had a measurable impact on the number of plays attempted and made. Here's what I found:

Catchers: Only Ar impacted the PM. Speed and defense had a very small effect, if any.
Infielders/Outfielders: Only Df impacted the PM. Arm and speed had a very small effect, if any.

Andy's note: this is my only significant disagreement with Morris' system. I also made an analysis of the play-by-play database, and found statistically significant contributions from arm and speed for infield defense, and speed for outfield defense. But, I do agree that Df is the most important defensive ability.

So the lesson here is that speed doesn't matter at all. If you have a player who has a high Sp and low Df, it means he is fast on the base paths but not when he's chasing a fly ball. On the other hand, ignore defense for catchers and look only at arm. I didn't look at the importance of those abilities for pitchers because I was too lazy.

Side point, by the way: One thing I've noticed is that the "Gold Glove" winner at catcher is usually someone with a low arm. Whatever criterion is used to pick the winner at catcher seems flawed in that sense. So don't acquire a catcher because he won a Gold Glove and therefore you think he'll be good defensively.

Anyways, the next question is how important the effect of these abilities is, now that we know which ones are important. So here's how I tried to figure that out: If an infielder or outfielder makes an extra out due to being a good defender, which takes away a hit from the other team, that's equivalent to getting an extra hit himself. So I'll assume each extra PM a player makes robs the other team of a single and therefore is like that player got an extra single himself. This principle is not perfect, because obviously some plays prevent extra base hits, or are plays that another player would have made anyways. But I think it's a pretty good assumption overall.

For a catcher, though, the stat means something different. Catchers with high arm discourage the other team from attempting a steal, so high-arm catchers actually have a very low number of plays attempted. Now I need to bring in one statistic that will be described as part of the next column, but I need it sooner: A failed stolen base hurts a team about 1.6 times more than a successful stolen base helps a team. The reason this matters is that if a catcher allows lots of steal attempts, only some of those will be successful, and the total impact of successful and failed attempts must be taken into account. This is very important, actually, because as I mentioned earlier, the "break even" point of stolen bases is ~65%. Since steals in EPBL overall are only successful ~70% of the time, encouraging the other team to attempt lots of steals doesn't actually hurt you as much as you'd think.

So now that we have linked defensive performance and drawn an equivalency to hitting, we can directly quantify how defensive abilities relate to offensive abilities, namely contact. Here's what I calculated:

Catchers: 1 point of Ar is equivalent to about 0.15 points of Co
Infielders: 1 point of Df is equivalent to about 0.4 points of Co
Outfielders: 1 point of Df is equivalent to about 0.3 points of Co

Anyways, when I landed on this result, I was a little surprised, it was a bigger effect than I expected for IF, and smaller for catchers (for the reason I described above). The difference between an infielder with 8 Df and one with 3 Df is pretty big -- it's equivalent to changing the contact ability by 2 points. The importance is a little smaller for OF. So if you're going to pick a position to have a great hitter with bad defense, better to have it be your catcher, or if not, in the outfield.
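As a quick check of that comparison, the equivalences can be applied to the same 5-point gap at each position. This is only a sketch; the dictionary and function names are hypothetical.

```python
# Contact-equivalent value of one point of the key defensive ability,
# as quoted in the text (Ar for catchers, Df for infielders and outfielders).
CO_PER_DEFENSE_POINT = {"C": 0.15, "IF": 0.4, "OF": 0.3}


def contact_equivalent(position, defense_point_gap):
    """Contact points that a gap in the key defensive ability is worth."""
    return CO_PER_DEFENSE_POINT[position] * defense_point_gap


# Gap between an 8 and a 3 in the key defensive ability at each position.
for pos in ("C", "IF", "OF"):
    print(pos, round(contact_equivalent(pos, 8 - 3), 2))
```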

Separately, I found that the center defensive positions (2B, SS, and CF) make more plays than the corners. So this will come as no surprise, but put your worst defenders at LF, RF, 1B, and 3B, and don't worry so much about the defensive abilities of your catcher.

So armed with this knowledge, I can now update the above prototype ability, and I'll go to one decimal place so it's more exact:


Pos   Dr   Ds   Sp   Co   Pw   Df   Ar   Cn
C    0.0  2.5  0.0 10.0  2.5  0.0  1.5  0.0
IF   0.0  2.5  0.0 10.0  2.5  4.0  0.0  0.0
OF   0.0  2.5  0.0 10.0  2.5  3.0  0.0  0.0
P    ???  4.3  0.0  0.0  0.0  0.0  7.4 10.0

And this is the measuring stick that I use to evaluate players. I left a question mark under Dr for pitcher, because obviously it impacts how many innings they can pitch so is worth looking at. But as far as I know it doesn't impact performance when a player is on the field.

So, if I want to evaluate this player...


Name         Ps Ag  Dr Ds Sp Co Pw Df Ar Cn
D.Kosut      C  34  10  4  1  8  5  1  8  1

...I would take the sum of the product of each ability with the measuring stick, so:
10*0.0 + 4*2.5 + 1*0.0 + 8*10.0 + 5*2.5 + 1*0.0 + 8*1.5 + 1*0.0 = 114.5.
This number could be compared to any other batter to get an evaluation of a player's worth based STRICTLY on abilities at the CURRENT time. We'll get into how to project abilities into the future in a later column.
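The sum of products is easy to script for a whole roster. The sketch below takes the batter rows of the measuring stick exactly as tabulated above (so the Sp weight for catchers is 0.0); the pitcher row is left out because its Dr weight is undetermined. The names are my own.

```python
ABILITY_ORDER = ("Dr", "Ds", "Sp", "Co", "Pw", "Df", "Ar", "Cn")

# Batter rows of the measuring stick, in ABILITY_ORDER.
MEASURING_STICK = {
    "C":  (0.0, 2.5, 0.0, 10.0, 2.5, 0.0, 1.5, 0.0),
    "IF": (0.0, 2.5, 0.0, 10.0, 2.5, 4.0, 0.0, 0.0),
    "OF": (0.0, 2.5, 0.0, 10.0, 2.5, 3.0, 0.0, 0.0),
}


def ability_rating(position, abilities):
    """Sum of each ability multiplied by its weight for the given position."""
    weights = MEASURING_STICK[position]
    return sum(w * a for w, a in zip(weights, abilities))


# D.Kosut's line from the text: Dr Ds Sp Co Pw Df Ar Cn
kosut = (10, 4, 1, 8, 5, 1, 8, 1)
print(ability_rating("C", kosut))
```

The same ability line scored at another position simply swaps in that position's row of weights.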

How do I evaluate stats and performance?

The last column pretty much gave away my formula for evaluating the abilities of players. So there are two big topics left before you can build a player evaluation system based on my (however imperfect) techniques: (A) How do you evaluate statistical performance, and (B) How do you project a young player's growth. Today I'll cover A, then next time I move on to B, which will probably be done in two parts since it's a big topic.

Part 2 of this series covered whether to look at combine stats or abilities when picking players to draft. I left unspecified the formula I used to calculate how valuable a player is based on his statistics. The basic question/idea is this: If a player had XX singles, YY doubles, ZZ triples, AA homers, BB walks, and CC strikeouts, in DD plate appearances, can we come up with a single number that states how valuable he is?

This is similar to the "runs created" stat that is common enough. You can read a Wikipedia article on it here: https://en.wikipedia.org/wiki/Runs_created, and there are a few formulas to use and yada yada you can Google it and find out more.

But I was never sure whether the balance in EPBL is the same as real life. So I calculated an EPBL-specific formula myself. I looked at complete team performances for each season. At the end of each season, I have 30 data points, as each team has a certain number of singles, doubles, triples, etc. And that led to certain number of runs. It stands to reason that there ought to be a correlation between, for instance, the number of singles and the number of runs. This has all been done before for real life but the balance of statistics in EPBL is different so we need one for DEL.

So I compiled this for every season in EPBL since season 51, which you can get here: http://www.dolphinsim.com/clmanager/epbl/archive.htm. I then ran the matrix linear regression model, the same as that which is described in part 3 of this column series, to simultaneously find the connection between those offensive statistics and the number of runs a team scored for the whole season. Here's what I found:

Singles are worth 0.46 runs
Doubles are worth 0.84 runs
Triples are worth 1.02 runs
Homers are worth 1.45 runs
Walks are worth 0.38 runs
Strikeout is worth -0.07 runs
Fielding out is worth -0.11 runs

Fielding out is the number of ABs minus the number of hits minus the number of SOs. So basically, it's "everything else".

This seems to stand up to several sanity checks. For instance, a walk is a little less valuable than a single, since it doesn't in general advance runners as far. A HR is like a single plus an extra run for bringing in yourself. And a strikeout is bad but not as bad as a walk is good. A fielding out is a little worse than a strikeout, because of the added pain of possibly causing a double play. (Note: the effect of sacrifice plays would counter this, but must be smaller than the impact of double plays). The fact that SOs drive higher pitch counts may also be a small factor.

So using the above formula, you can estimate, with a single number, how valuable a player's offensive performance is, by looking at the DEL runs created divided by the number of plate appearances (which is at-bats + walks). This is basically a measure of that player's efficiency at the plate. DEL does make other statistics available in some places, like on the team stats page you can track double plays. But the purpose of the above is to use only the statistics that are on the "team roster" page, or on free agent lists.
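Put together, the batter calculation might look like the sketch below. The weights are the ones listed above; the stat line is invented purely for illustration, and the function name is hypothetical.

```python
# Run values per event for batters, as listed in the text above.
BATTER_WEIGHTS = {
    "1B": 0.46, "2B": 0.84, "3B": 1.02, "HR": 1.45,
    "BB": 0.38, "SO": -0.07, "FO": -0.11,
}


def batter_runs_per_pa(ab, singles, doubles, triples, hr, bb, so):
    """EPBL-style runs created per plate appearance (PA = AB + BB)."""
    hits = singles + doubles + triples + hr
    fielding_outs = ab - hits - so  # "everything else"
    counts = {"1B": singles, "2B": doubles, "3B": triples, "HR": hr,
              "BB": bb, "SO": so, "FO": fielding_outs}
    runs = sum(BATTER_WEIGHTS[k] * counts[k] for k in counts)
    return runs / (ab + bb)


# Invented season line: 550 AB, 100 1B, 30 2B, 5 3B, 20 HR, 60 BB, 90 SO.
value = batter_runs_per_pa(550, 100, 30, 5, 20, 60, 90)
print(f"{value:.3f} runs per plate appearance")
```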

I also included stolen bases and got this:

Stolen bases are worth 0.38 runs
Caught stealing is worth -0.55 runs

Which underlines the point I made in an earlier column about stealing bases. In order for stolen bases to be break-even you need at least a 60% success rate. And that's the league average. Being caught stealing hurts worse than a stolen base helps.
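The break-even rate follows directly from the two run values: it is the success rate p at which p times the stolen base value plus (1 - p) times the caught stealing value equals zero. A small sketch, with hypothetical function names:

```python
SB_RUNS = 0.38   # value of a stolen base, from the text
CS_RUNS = -0.55  # value of a caught stealing, from the text


def break_even_rate(sb=SB_RUNS, cs=CS_RUNS):
    """Success rate p where p*sb + (1 - p)*cs = 0."""
    return -cs / (sb - cs)


def runs_per_attempt(success_rate, sb=SB_RUNS, cs=CS_RUNS):
    """Expected runs gained per steal attempt at a given success rate."""
    return success_rate * sb + (1 - success_rate) * cs


print(f"Break-even success rate: {break_even_rate():.1%}")
print(f"Runs per attempt at 70% success: {runs_per_attempt(0.70):+.3f}")
```

With these weights the break-even point lands just under 60%, and the gain per attempt stays small even at a 70% success rate.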

For pitchers, the stats line doesn't give you 1B, 2B, and 3B separately, but only total hits. So here's the formula and weights I use for pitchers

Non-HR hits are worth 0.61 runs
HRs are worth 1.34 runs
Walks are worth 0.47 runs
Strikeouts are worth -0.10 runs
Fielding outs are worth -0.13 runs

Once again, fielding outs is total batters faced minus all hits minus all walks minus all strikeouts.

Obviously for pitchers you want the opposite, more runs per batter faced is bad. So you want a small number for pitchers. The sanity checks still work, apart from the mismatch between these values and those of hitters, for instance, on principle a HR should be worth the same for both. Could be a result of different types of data going in.

This doesn't include the effect of defense, just hitting, and obviously a player's value is also on the defensive side. So what I do is take this statistic and modify it by adding or subtracting a number depending on their defensive performance/ability. I've written enough on this in past posts that you can probably do this yourself. Translation: I'm too lazy to write more.

A lot of this is sort of standard stuff, and in fact the numbers I presented above have been calculated for real life, and they're not too far off. The math technique is fairly standard.

But there's a whole other undercurrent to this that one might not ordinarily realize, which is that the values above depend on the quality of the team. For instance, if you have a really good offense already, then adding a big HR hitter makes a bigger difference, because you're more likely to have men on base. On the other hand, if you have a terrible offense, then adding an extra HR is more likely just a solo shot. So really the numbers above should be tuned based on how good your offense is, and how good the team you're facing is. Another way to say it is that the effects of each can't be isolated. The value of HRs depends on the number of hits and walks you get, and vice versa. But those above numbers represent the historical EPBL average. If I really wanted to do this right, I'd account for some of these details with one of a few different techniques.

In any event, I use these metrics on free agent signings, as even when a player has matured, there is an "intangible" ability that makes players play either better or worse than what their abilities show. So don't judge a player, even a veteran one, based only on abilities -- look at the statistics.

I've been regularly using this metric to look at the combine statistics for drafted players, so I can compare them to the other drafted players (namely those at the same age) and to historical averages for players OF THE SAME AGE. Earlier posts told you how to evaluate abilities. This is what I do for statistics. So by now the outlines of how I evaluate players should be getting clearer. I end with two ratings for each player which I use to determine which ones to draft.

But the big remaining question, at least for drafting, is how to project player abilities. How you evaluate the abilities of a drafted player should be based not on his abilities on draft day but how they project when he reaches his prime years. I haven't specified this yet, but it will be the next subject, so stay tuned.

How do young players develop and mature (part 1)?

One of the key questions in DEL is how to predict the future abilities of prospects, draftees, and young players. There are several things to look at, including age, TR, and stats. In past posts we've covered how to evaluate stats so now we're going to focus on this question: If you know only a young player's current ability, age and TR, what likely ability will they have when they reach age 26 and hit their prime? Tr is supposed to measure how well a player has learned the skills of baseball, but between that and age it's confusing to figure out which is more important.

I should note before I go any further that there is randomness to this part that should surprise no one. At each step when a player ages or matures, you expect their ability to go up a little, but really this means that it could go up a lot, or it could not go up at all (or maybe even go down), and that's just luck. Add up all these separate and random events, and the question to ask is not what will a player's abilities be, but what is the RANGE of possible abilities that a player might end up with. So instead of saying "M.Cohen will be an 8 Co hitter when he's 26", you should say "M.Cohen has a XX% chance of being a 7 or better Co hitter, and a XX% chance of being an 8 or better Co hitter".

There are two parts to this. First is how would the players grow if the randomness/luck were taken out? Second is how much uncertainty does all the luck at every stage of growth contribute?

In today's post, we will cover the first part, or the "expected" improvements that occur at each step. In the column after this one, we'll cover the randomness that occurs in each step which adds uncertainty. I think this is going to be a long post, as there's a lot to say and some raw numbers to paste in.

One thing to note is that the player's true abilities have decimal points that we don't see, so a player may actually gain ability from, say, 8.0 to 8.4, even if the number you see doesn't change. I *THINK* there's a trick to find out what the actual decimal is, and I'll reveal this in a future column.

The key source of data is the archive of abilities, where you can download every player in the league's ability at various points during the year for the past 10 seasons, which you can get here. I started tracking this 10 seasons ago so I now have 20 years of data built up, which I've used to figure out the below. Using that, you can systematically look at how much growth, for instance, 19 year old pitchers with 5 Cn are expected to have (or another way to look at it is what is the probability that they will jump to a 6 Cn.) With this much data available you can start to reverse engineer the growth model.

There are three times when abilities change for players: Post-draft training camp, subsequent training camps, and end of season. Important point: Each one has different effects, so we need to discuss each one separately. At the end of the season, players gain a year of age, and at the end of training camps for NON-rookies, players add some TR points and mature. In the post-draft rookie training camp, neither changes.

The post-draft training camp is the one that occurs right after the draft is done, in the middle of the season, and it affects only the just-drafted players. This one is easy: I've found that the net gain in ability is zero. The post-draft training camp is purely a crapshoot, and on average the abilities do not improve. So for this post we can ignore this one (we'll consider it next time).

A few general observations first:

If you have two prospects with the same age and abilities, but one has a lower TR, the one with the lower TR will *on average* gain more ability in training camp. But when the season ends and the players age, the TR doesn't matter, only the age determines how much improvement occurs.
If you have two prospects with the same TR and ability, but one is younger, then the younger player will on average gain more ability. This seems to apply to both the training camp and end-of-season improvements.
The amount each ability gains is different by position and ability. Meaning you shouldn't assume that Ar grows at the same rate as Cn, even in the absence of luck.

THE END OF SEASON GAIN

Let's start with the end of season, when players age by a year (and TR does not change). Let's put down some numbers. The list below shows the multiplier applied to each ability for each year. The row labeled 18 is for players who turn 19 and improve, or the first end-of-season after an 18-year old has been drafted.


----------Batters End of Season----------
	   Ds	   Co	   Po	   Df	   Ar
18	1.044	1.077	1.034	1.075	1.099
19	1.071	1.078	1.049	1.072	1.080
20	1.041	1.054	1.049	1.065	1.072
21	1.027	1.040	1.027	1.024	1.041
22	1.022	1.018	1.022	1.033	1.034
23	1.005	1.005	1.006	1.005	1.005
24	1.000	1.000	1.000	1.000	1.000
25	1.000	1.000	1.000	1.000	1.000

So for instance, if a player has a contact of 1 when they're 18, on average you can expect them to improve to a 1.077. If they have a contact of 2, you can expect them to improve to a 2.154. There's a critical assumption here that the ability should be *multiplied* by a certain amount, as opposed to having a certain amount *added*. And it's important because if my assumption is correct, then 2 players with a difference of, say, 1 point in Co at age 18, are likely to have a bigger difference when they mature.
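To see what the multiplicative assumption does to a 1-point gap, here is a small sketch that applies only the batters' end-of-season Co multipliers from the table above (training camps are left out to keep it short). The function name is made up for the example.

```python
import math

# End-of-season Co multipliers for batters aged 18 through 23, copied from the table above.
CO_END_OF_SEASON = [1.077, 1.078, 1.054, 1.040, 1.018, 1.005]

def multiplicative_projection(start, multipliers):
    """Projected ability if each step multiplies the current value."""
    return start * math.prod(multipliers)

low = multiplicative_projection(4, CO_END_OF_SEASON)
high = multiplicative_projection(5, CO_END_OF_SEASON)
gap = high - low
print(round(low, 2), round(high, 2), round(gap, 2))
```

The gap between the 4 Co and 5 Co players grows from 1 point to about 1.3 points from aging alone. If a fixed amount were added each year instead, the gap would stay at exactly 1.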

My analysis here is simple in that I am assuming the multiple remains the same for any initial ability. For instance, a 4 Co is going to grow at the same rate as an 8 Co for the same age and TR level. I actually have some evidence that this is NOT true, but frankly by the time I had enough data to figure it out, I had already put enough work into calculating the above numbers and I just didn't feel like correcting it. So take this with a grain of salt. But with that in mind let's do some analysis.

The improvement is a lot higher in the early years, and the rate of growth slows with each passing year. But there are a couple of exceptions, namely that the Po and Ds attributes improve a little faster when a player turns 20 than when they turn 19. But in general, an 18 or 19 year old is improving much faster than a 22 year old. By the time players reach the age of 23, they're done improving, and further ability gains at the end of the season are strictly from luck. I also have this for speed, but improvement in speed seems to work a little differently so I'd rather just leave that out than open another can of worms. Plus, as I wrote in Column Part 4, I don't pay attention to speed at all, anyways.

Here is the same chart for pitchers:


--Pitchers End of Season--
	   Ds	   Ar	   Cn
18	1.046	1.096	1.093
19	1.079	1.079	1.069
20	1.037	1.062	1.033
21	1.025	1.037	1.043
22	1.020	1.029	1.018
23	1.005	1.005	1.004
24	1.000	1.000	1.000
25	1.000	1.000	1.000

Just as for hitters, the improvement gets slower as players get older, and at age 23, they're done. The early years show faster improvement for Ar and Cn, although like for batters, the Ds grows fastest the year a player turns 20, then slows down.

One interesting note, by the way: I didn't include durability because I found that it does not change on average. It does change due to randomness, but if you draft someone with a 3 Dr, the best guess is that he'll remain with a 3 Dr until he reaches his peak. So the lesson here is don't draft a long reliever thinking he'll become a starter as he improves.

THE TRAINING CAMP GAIN

This one is a little more complicated, because the gain is a function both of a player's age and his TR level. On average prospects add ~2 points of TR in each training camp, although that varies. So what I do is calculate a "maturity index" for each player, which is 2*Age + TR. Honestly, I don't remember how I came to that combination but somehow I did. So basically, on average you can expect a player to gain 4 points each season (2 when they age a year, 2 in training camps).

So the equivalent for the above chart is sorted by maturity index, not age, which is the left column. The other numbers are, once again, the fractional gain in ability that a player gets on average in training camp. Here are the results I calculated:


------Batters Training Camp------
	   Ds	   Co	   Po	   Df	   Ar
40	1.323	1.081	1.000	1.045	1.030
44	1.223	1.071	1.000	1.046	1.030
48	1.174	1.059	1.000	1.042	1.030
52	1.150	1.056	1.000	1.035	1.023
56	1.121	1.047	1.000	1.035	1.022
60	1.090	1.040	1.000	1.031	1.020
64	1.070	1.029	1.000	1.030	1.016
68	1.052	1.021	1.000	1.022	1.011
72	1.032	1.000	1.000	1.000	1.000
76	1.000	1.000	1.000	1.000	1.000
80	1.000	1.000	1.000	1.000	1.000

--Pitchers Training Camp--
	   Ds	   Ar	   Cn
40	1.307	1.029	1.144
44	1.223	1.030	1.122
48	1.172	1.025	1.109
52	1.156	1.024	1.098
56	1.124	1.024	1.078
60	1.093	1.019	1.061
64	1.076	1.013	1.044
68	1.057	1.007	1.034
72	1.037	1.023	1.020
76	1.000	1.000	1.000
80	1.000	1.000	1.000

The most raw players will be 19 year olds with TR of 3, so that would be a maturity index of 19*2 + 3 = 41 (this table does not apply to the post-draft training camp, where there is no average growth, so there can't be 18 year olds). Anyways, a maturity index of 41 rounds to the row labeled 40 on this chart. So that player can expect to have his Ds rating multiplied by 1.323 if he's a hitter.
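As a sketch, the index calculation and row lookup for that example can be written out like this. The function names are made up, and sending ties to the lower row is an assumption on my part, since the text only says the index "rounds" to a row.

```python
# Batters' training camp Ds multipliers, copied from the table above (row label -> multiplier).
BATTER_DS_CAMP = {
    40: 1.323, 44: 1.223, 48: 1.174, 52: 1.150, 56: 1.121, 60: 1.090,
    64: 1.070, 68: 1.052, 72: 1.032, 76: 1.000, 80: 1.000,
}

def maturity_index(age, tr):
    """Maturity index as defined in the text: 2*Age + TR."""
    return 2 * age + tr

def nearest_row(index, table):
    """Pick the closest row label; ties go to the lower row (an assumption)."""
    return min(table, key=lambda row: (abs(row - index), row))

index = maturity_index(19, 3)
row = nearest_row(index, BATTER_DS_CAMP)
print(index, row, BATTER_DS_CAMP[row])  # 41 40 1.323
```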

So with this chart you can start to project a player's growth in a set of training camps. Since each player gains a year of age and ON AVERAGE 2 points of TR in each camp, they basically move down one row with each subsequent training camp. So if you have a pitcher who enters his first full season with a maturity index of 48, and he's 20 years old, you can expect that each season he will move one row down on the table. So his Arm goes up by 2.5%, then 2.4% the next year, then 2.4% again the next year, and then 1.9% the next year, and so on.
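The row-by-row walk for that pitcher's Arm can be sketched as follows. It assumes the player gains exactly 4 index points per season (the average case described above), and the helper names are hypothetical.

```python
# Pitchers' training camp Ar multipliers by maturity index row, copied from the table above.
PITCHER_AR_CAMP = {
    40: 1.029, 44: 1.030, 48: 1.025, 52: 1.024, 56: 1.024, 60: 1.019,
    64: 1.013, 68: 1.007, 72: 1.023, 76: 1.000, 80: 1.000,
}

def rows_from(start_row, table, step=4):
    """Rows visited when moving down one row (step index points) per season."""
    return [row for row in sorted(table) if row >= start_row and (row - start_row) % step == 0]

def camp_gains(start_row, table):
    """Percent gain expected at each successive training camp."""
    return [(row, round((table[row] - 1.0) * 100, 1)) for row in rows_from(start_row, table)]

def cumulative_multiplier(start_row, table):
    """Product of all training camp multipliers from the starting row onward."""
    total = 1.0
    for row in rows_from(start_row, table):
        total *= table[row]
    return total

gains = camp_gains(48, PITCHER_AR_CAMP)
print(gains[:4])   # [(48, 2.5), (52, 2.4), (56, 2.4), (60, 1.9)]
print(round(cumulative_multiplier(48, PITCHER_AR_CAMP), 3))
```

Note this covers training camps only. The end-of-season multipliers are applied separately, in between camps.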

A couple observations about training camps:

Batters gain a huge amount of discipline in training camp. I guess that makes sense; discipline is an "experience" thing, as opposed to a "growing up" thing. On the other hand, batters do not gain any power on average in training camp. I guess this is the opposite -- power comes from pure strength, which doesn't necessarily require playing time or experience (although contact does obviously).
A pitcher's gain in arm strength is surprisingly slow in training camp, whereas he makes much larger strides in Ds and Cn. This is different from the end-of-season, where growth in Ar and Cn is about even.

AN EXAMPLE

So now that we have growth tables for every age and level of development, we can start to project a player's abilities. To repeat the example player I've been using, let's take Sedeno, my choice for the best player in the draft when the pick was made. His abilities went down in training camp, but as I stated earlier, that was simply bad luck. But let's project his abilities now, even after he regressed badly. We note that he starts off with a maturity index of 44, and is expected to gain 4 points of maturity index every year. So here it goes:


DRAFT DAY --- W.Sedeno     IF 18r  2  2  2  5  7  1  4  1  8  L  L

Let's follow in particular his contact, which starts out at 5. Here is the progression of how his abilities will change with each step. At each step I'll take the appropriate number from the above tables, and each year go down one row in the table:


Post-draft training camp: No change
Aging 18--->19 years old: Multiply by 1.077
Training at 19 years old: Multiply by 1.059
Aging 19--->20 years old: Multiply by 1.078
Training at 20 years old: Multiply by 1.056
Aging 20--->21 years old: Multiply by 1.054
Training at 21 years old: Multiply by 1.047
Aging 21--->22 years old: Multiply by 1.040
Training at 22 years old: Multiply by 1.040
Aging 22--->23 years old: Multiply by 1.018
Training at 23 years old: Multiply by 1.029
Aging 23--->24 years old: Multiply by 1.005
Training at 24 years old: Multiply by 1.021
Aging 24--->25 years old: Multiply by 1.000
Training at 25 years old: Multiply by 1.000
Aging 25--->26 years old: Multiply by 1.000

Now take the product of all the numbers (1.077 x 1.059 x 1.078 x 1.056 .....) and you get about 1.67. So since he starts out with a Co of 5, the best guess is that his contact will end up at 5*1.67=8.3. Not bad, but if he hadn't lost a point of contact in post-draft training camp, his expected mature Co would be 6*1.67=10.0. Oh well.
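For anyone who wants to check the arithmetic, here is the product written out with the step factors exactly as listed above. The variable names are made up for the example.

```python
import math

# Co multipliers for the steps listed above, in order.
# The no-change steps (post-draft camp and the 1.000 entries) are kept for completeness.
SEDENO_CO_STEPS = [
    1.000,          # post-draft training camp
    1.077, 1.059,   # aging 18->19, training at 19
    1.078, 1.056,   # aging 19->20, training at 20
    1.054, 1.047,   # aging 20->21, training at 21
    1.040, 1.040,   # aging 21->22, training at 22
    1.018, 1.029,   # aging 22->23, training at 23
    1.005, 1.021,   # aging 23->24, training at 24
    1.000, 1.000,   # aging 24->25, training at 25
    1.000,          # aging 25->26
]

total_multiplier = math.prod(SEDENO_CO_STEPS)
projected_co = 5 * total_multiplier
print(round(total_multiplier, 3), round(projected_co, 2))
```

With these factors the cumulative multiplier comes to roughly 1.67, which puts a starting Co of 5 at a bit over 8 by maturity.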

Just to reiterate, all the above refers to the EXPECTED gain. At each of those steps there is a random chance that makes the ability go up either more or less than what is in those tables. In the next column, we will discuss how that works.

How do young players develop and mature (part 2)?

So in the last post we looked at how players develop, on average. But there is considerable uncertainty and luck along the way. In this post, we will quantify that. But basically, our goal is not to say "this player will have a contact of 9 when he's mature", but rather "his contact will probably be between these two numbers" or "there's a __% chance this player will have a contact of 8 or better".

If you're not familiar with the idea of standard deviation, you should probably stop reading now, go Google and read about it, then come back. Otherwise, read on.

In addition to standard deviation, there is one useful term to know, variance, which is the square of the standard deviation. One useful property is that when you have a set of independent randomized events, the variances add up. Independent means that one event doesn't affect the next. Like if you roll some dice two times in a row, the numbers you get the second time have nothing to do with the numbers you get the first time. So the player development process can be thought of as a series of independent events (training camp, end of season, next training camp, end of season, etc etc). To figure out the range of possible values that a player's abilities might have, there is a simple two-step process:

Add up all the variances for all the events from now until maturity (which I usually define as age 26)
Take the square root, and that's the standard deviation or uncertainty
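The two steps above amount to a one-line calculation. A minimal sketch, with two made-up variances standing in for a pair of events:

```python
import math

def total_std(variances):
    """Standard deviation of a sum of independent events: sqrt of the summed variances."""
    return math.sqrt(sum(variances))

# Two hypothetical events with variances 0.09 and 0.16 (standard deviations 0.3 and 0.4).
example = [0.09, 0.16]
print(total_std(example))
```

Notice that the combined standard deviation (0.5) is less than 0.3 + 0.4. Standard deviations don't add directly, which is why you sum variances first.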

So all I need to do is tell you what the variance is for a given situation. There are three sections: The rookie training camp, the regular training camp, and the end of the season. I'll report each one separately.

ROOKIE TRAINING CAMP

As stated in Part 6 of this column, players on average do not gain any ability in their rookie training camp. It is just a random luck game, with as many players losing as gaining ability. Also, as far as I can tell, the amount of randomness is the same for all ages; it doesn't matter if you draft 18 or 22 year olds. So this one is simple. Here are the variances for each attribute:


          Dr   Ds   Sp   Co   Po   Df   Ar   Cn
HITTERS:  0.57 0.11 0.31 0.19 0.25 0.20 0.24 0.04
PITCHERS: 0.56 0.11 0.21 0.10 0.19 0.13 0.26 0.15

VETERAN TRAINING CAMP RANDOMNESS

Same as above, here are the variances, which again do not seem to vary by age or TR level so I just need to give you one row for hitters and pitchers:


          Dr   Ds   Sp   Co   Po   Df   Ar   Cn
HITTERS:  0.29 0.24 0.27 0.23 0.22 0.23 0.20 0.07
PITCHERS: 0.28 0.23 0.23 0.12 0.20 0.14 0.24 0.23

END OF SEASON RANDOMNESS

This is a little trickier in that you have to first look at the total variance in the before/after abilities, but then also remove the jitter that the average gain itself creates. What I mean is that because abilities are rounded to the nearest number, it creates some noisiness. 10% of players increasing their Co by a point because it's rounded creates some variance, whereas if everyone increases their Co by 10% of a point there's no variance. But this can be estimated and subtracted and I won't get into the math of it. Also, unlike the training camp randomness, it IS a function of age (but not TR level).
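To illustrate where that rounding jitter comes from, here is a small sketch. It is not the correction actually used for the tables. It assumes hidden abilities are rounded to the nearest whole number and that the hidden decimals are spread evenly, and it gives every player the identical 0.10 gain with no luck at all.

```python
from statistics import pvariance

def displayed(hundredths):
    """Round a hidden ability (stored in hundredths of a point) to the nearest whole number."""
    return (hundredths + 50) // 100

# Hidden abilities that all display as 5, spread evenly over the decimals (4.50 to 5.49).
hidden_before = [500 + d for d in range(-50, 50)]
gain = 10  # every player gains exactly 0.10 hidden points, with no luck involved

changes = [displayed(h + gain) - displayed(h) for h in hidden_before]
share_up = sum(changes) / len(changes)
jitter_variance = pvariance(changes)
print(share_up, jitter_variance)
```

Ten percent of the players show a 1-point jump and the rest show nothing, so the displayed changes have a variance of about 0.09 even though the true gain had no randomness. That is the kind of artificial variance that has to be taken out before reading the before/after data as luck.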

So here's the variance chart for hitters and pitchers as a function of age. Note that the age refers to the age of player right BEFORE the end of the season.


----------Batters End of Season----------
	   Ds	   Co	   Po	   Df	   Ar
18	0.076	0.142	0.121	0.138	0.152
19	0.119	0.201	0.161	0.162	0.172
20	0.108	0.182	0.158	0.156	0.155
21	0.088	0.152	0.110	0.094	0.108
22	0.071	0.101	0.091	0.106	0.094
23	0.018	0.032	0.026	0.022	0.018
24	0.001	0.000	0.001	0.059	0.035
25	0.000	0.000	0.000	0.000	0.033

--Pitchers End of Season--
	   Ds	   Ar	   Cn
18	0.082	0.179	0.143
19	0.129	0.240	0.179
20	0.102	0.180	0.124
21	0.082	0.148	0.133
22	0.069	0.124	0.089
23	0.020	0.029	0.027
24	0.001	0.068	0.001
25	0.000	0.054	0.079
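Putting the tables together, here is a sketch of the full uncertainty calculation for a hitter's Co, from age 18 (after the post-draft camp) to maturity at 26. It assumes one veteran training camp at each age from 19 through 25. The normal-curve probability at the end is my own simplifying assumption, and the expected value of 8.5 in the usage line is a made-up number, not a projection for any real player.

```python
import math

ROOKIE_CAMP_CO = 0.19    # hitters, Co, post-draft training camp
VETERAN_CAMP_CO = 0.23   # hitters, Co, each later training camp
# Hitters' end-of-season Co variance, keyed by age right before the season ends.
END_OF_SEASON_CO = {18: 0.142, 19: 0.201, 20: 0.182, 21: 0.152,
                    22: 0.101, 23: 0.032, 24: 0.000, 25: 0.000}

def co_variance_to_maturity(age, include_rookie_camp=False, maturity_age=26):
    """Sum the Co variances for every event from the given age until maturity."""
    total = ROOKIE_CAMP_CO if include_rookie_camp else 0.0
    for a in range(age, maturity_age):
        total += END_OF_SEASON_CO.get(a, 0.0)   # aging from a to a+1
        if a + 1 < maturity_age:
            total += VETERAN_CAMP_CO            # training camp at age a+1
    return total

def chance_at_least(threshold, expected, std):
    """P(ability >= threshold) under a normal approximation (an assumption)."""
    z = (threshold - expected) / std
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

variance = co_variance_to_maturity(18)
std = math.sqrt(variance)
print(round(variance, 2), round(std, 2))
print(round(chance_at_least(8.0, 8.5, std), 2))  # 8.5 is a made-up expected value
```

For an 18 year old hitter the Co variances add up to about 2.4, or a standard deviation of roughly 1.5 points, so the range of outcomes around the expected value is wide.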

CONCLUSION

The past several posts have given you a lot of information about how I have been evaluating players, particularly rookies to draft. I don't think it is perfect but I think, based on how my farm system has improved since I started implementing it, that it is probably quite good. Even if you don't use the exact numbers, the general trends I've identified will probably help you tune your own scouting strategy.

Other Advice

Never, ever, ever use the ctrade result in evaluating trades. I've found it to be pretty worthless, and it's pretty easy to game it to make a bad trade look good. Ignore it once you have even a little experience. Even a mediocre owner in this league is a better judge.
Scour the waiver wire for cheap prospects, particularly in the first AND SECOND rounds of the year's waiver wire. You'd be surprised how many decent prospects get released by teams when the roster trims from 175 to 125.
If a good prospect suddenly gets released midseason, consider the possibility of offering a huge short term deal that carries the player through age 24. For example, if a top 21 year old prospect is a free agent, offer him a 4-year contract with a good salary. Yeah, you'll overpay him, but when he turns 25 and you just let the contract expire, it automatically renews at a salary of 0.04, so you've then got a very cheap player.
Don't ever trade quality for quantity, or one star player for a large number of mediocre players. If you scout right, "above average" players are currently very very easy to find. What is NOT easy to find is a top superstar, so if you're going to trade one away, be damn sure you're getting one back.
Always ignore the letter grades and look at stats. And by always, I mean always. Similar to #1, if you are even a mediocre owner you are better than the letter grades. Also, a lot of players consistently over or under achieve, so look at stats. There are some top notch B players out there, and some A players that wouldn't even make the roster of a playoff team.

Andy's note: to the fifth point, the letter grades are simply a reduction of the default ability sorts (on the player stat pages) to an A through F scoring. If you don't feel like implementing the scoring system presented in parts 3 and 4 of this series, you should at minimum use the player sort pages. That said, a comprehensive rating using both performance and abilities is far more useful than ability-only ratings.

Credits:

written by Morris Cohen
edits and additional comments by Andy Dolphin
