Thursday, September 27, 2007

Which is the Better Sim for Historical Leagues: BM or OOTP?

In my last post, I introduced a measure called the Noll-Scully measure. The concept behind the Noll-Scully measure is that a perfectly competitive league would be a league where every team was the same -- had the same strength of pitching talent, batting talent, fielding talent, managing talent, etc.


In such a league, you would expect a team to finish at 81-81 for the year -- after all, it is playing against teams as evenly matched as it is, and trying to determine who would be most likely to win any particular game would be similar to flipping a coin.


However, not every team would finish 81-81 in this league. There would be some random chance along the way. After all, if you flip a coin ten straight times, you don't always get five heads and five tails. Sometimes, you get four heads or six heads -- or an even greater "deviation" from the mean. So there would be some deviations from an 81-81 record.


With a lot of games, the "number of wins" variable approaches a bell curve distribution. The standard deviation for a 162-game season is about 6.4 wins, so one could expect that our "average" team would win between 74.6 and 87.4 games about 68.3 percent of the time, and between 68.3 and 93.7 games about 95.4 percent of the time. The chances of one of these even teams in an even league winning fewer than 69 or more than 93 games are small indeed: only about 4.6 percent.
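The ranges above come from treating every game as a fair coin flip. Here is a minimal Python sketch of that arithmetic, assuming a 162-game season and the normal approximation to the binomial; the function names `win_range` and `normal_coverage` are made up for illustration, and small differences in the last digit come down to rounding.

```python
import math

GAMES = 162  # assumed season length


def win_range(games, k):
    """Return (low, high) win totals lying within k standard deviations of .500."""
    mean = games / 2
    sd = 0.5 * math.sqrt(games)  # binomial SD when each game is a 50/50 flip
    return mean - k * sd, mean + k * sd


def normal_coverage(k):
    """Probability that a normal variable falls within k SDs of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2):
    low, high = win_range(GAMES, k)
    print(f"within {k} SD: {low:.1f} to {high:.1f} wins, "
          f"about {normal_coverage(k):.1%} of the time")
```

Whatever falls outside the two-SD band is simply one minus the coverage, which is where the figure of roughly 4.6 percent comes from.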


What the Noll-Scully measure does is compare the "scatter" of any league you give it to the "scatter" of a perfectly matched league. The actual measure is (standard deviation of wins in given league)/(standard deviation of wins in perfect league) = (standard deviation of wins in given league)/[(1/2)*(square root of games played)]. The denominator is the standard deviation one would expect from a binomial, or "coin-flip", distribution with a 50 percent chance of winning each game.
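As a rough illustration, the ratio can be computed in a few lines of Python. This is only a sketch: the function name `noll_scully` and the win totals are invented for the example, and using the population standard deviation (rather than the sample version) is my assumption, since the definition above doesn't say which one to use.

```python
import math
import statistics


def noll_scully(wins, games):
    """Ratio of the observed SD of wins to the SD of an evenly matched league."""
    observed_sd = statistics.pstdev(wins)  # population SD across teams (assumed)
    ideal_sd = 0.5 * math.sqrt(games)      # coin-flip (binomial) SD
    return observed_sd / ideal_sd


# Hypothetical 8-team league on a 154-game schedule (made-up win totals).
example_wins = [98, 90, 85, 80, 74, 70, 65, 54]
print(round(noll_scully(example_wins, 154), 2))  # prints 2.14
```

A league where every team finished with exactly the same record would score 0, which is even "tighter" than pure chance would produce.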


A "perfectly competitive" league would have a Noll-Scully measure of 1.0, since the numerator would be equal to the denominator. I gave a list of Noll-Scully measures for various sports leagues in an earlier post.
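One way to sanity-check the 1.0 benchmark is to simulate a league of identical teams and see what the measure comes out to. The sketch below is a simplification under stated assumptions: an 8-team league in which every pair of teams plays 22 games (154 per team), every game is a fair coin flip, and the population standard deviation is used. The function names and the seed are arbitrary choices of mine.

```python
import itertools
import math
import random
import statistics


def simulate_even_season(teams, games_per_pair, rng):
    """Play a round-robin schedule where every game is a fair coin flip."""
    wins = [0] * teams
    for a, b in itertools.combinations(range(teams), 2):
        for _ in range(games_per_pair):
            winner = a if rng.random() < 0.5 else b
            wins[winner] += 1
    return wins


def noll_scully(wins, games):
    """Observed SD of wins divided by the coin-flip SD for that schedule."""
    return statistics.pstdev(wins) / (0.5 * math.sqrt(games))


rng = random.Random(2007)          # fixed seed so the run is repeatable
TEAMS, GAMES_PER_PAIR = 8, 22      # 7 opponents x 22 games = 154 games each
games = (TEAMS - 1) * GAMES_PER_PAIR

ratios = [noll_scully(simulate_even_season(TEAMS, GAMES_PER_PAIR, rng), games)
          for _ in range(500)]
average_ratio = statistics.mean(ratios)
print(f"average Noll-Scully over 500 even seasons: {average_ratio:.2f}")
```

The average should land close to 1.0, but single seasons bounce around quite a bit in a league this small, so one season's value sitting somewhat above or below 1.0 is not by itself evidence of imbalance.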


CatKnight then did an interesting experiment. He ran historical simulations for a number of years, and then he compared how Out of the Park Baseball (OOTP) fared against Baseball Mogul (BM). His results follow, and I quote the private message he sent me:


(* * *)


Hi, petrel: I said I'd do some experiments and get back to you on this. Interesting tidings - not entirely surprising.


First, here are the values you took from that article - modern values which we more or less instinctively expect:


National Basketball Association: 2.89

American League: 1.78

National League: 1.76

National Hockey League: 1.70

National Football League: 1.48


*******


Now, I ran both OOTP and BM through the years 1901-1925. League balance was still kinda fragile at that time, so it's useful to look at what really happened:


MLB (1901-1925): Average 2.52

Most Competitive: 1.87 (1918)

Least Competitive: 3.39 (1909)


OOTP first: Fictional setup, but everyone in the correct cities with hopefully correct demographics and statistical patterns for that time.


OOTP (1901-1925): Average 1.60

Most Competitive: 1.12 (1913)

Least Competitive: 2.04 (1903)


So...far more competitive - even a bit more so than modern day baseball, but purists might wonder if it's realistic for the time.


BM (1901-1925): Average 2.39

Most Competitive: 1.74 (1906)

Least Competitive: 3.16 (1922)


BM's numbers are much closer to what really happened, though a little more draconian than we modern day fans may be used to.


*****


One of the main questions BM's raised is whether the financial model unbalances the game over time - so Year 1-2 may go smoothly, but by Year 10 competitive balance is gone. Let's look:


OOTP (1901-05) 1.91

(1906-10) 1.61

(1911-15) 1.40

(1916-20) 1.63

(1921-25) 1.43


So pretty uniform. OOTP actually settled down once the AIs got moving after a few years.


BM (1901-05) 2.10

(1906-10) 2.00

(1911-15) 2.45

(1916-20) 2.47

(1921-25) 2.96


BM did fine for about ten years. After 1920 the situation descended rapidly, partially because the Braves tanked (.299 or worse record) for four straight years and the Phils for two.


No team 'tanked' during the OOTP run. It happened twelve times (7 with the Braves) during BM.


*******


The last indicator of competition is how often one can expect to win the pennant. Here are the results:


Real Major Leagues

10: New York Giants

6: Boston Red Sox, Philadelphia Athletics

5: Pittsburgh Pirates, Chicago Cubs

4: Chicago White Sox

3: New York Yankees, Detroit Tigers

2: Washington Senators, Brooklyn Dodgers

1: Cleveland Indians, Philadelphia Phillies, Boston Braves, Cincinnati Reds

0: St Louis Browns, St Louis Cardinals


OOTP
10: New York Yankees

8: Chicago Cubs

4: St Louis Cardinals, St Louis Browns, Washington Senators

3: Philadelphia Athletics, Cincinnati Reds, Brooklyn Dodgers

2: Detroit Tigers, New York Giants, Pittsburgh Pirates, Philadelphia Phillies

1: Cleveland Indians, Chicago White Sox, Boston Braves

0: Boston Red Sox


A little more inclusive, but not much. Two teams dominating for so long (NYY, CHC) is interesting.


BM
8: New York Giants

7: Cleveland Indians

5: Washington Senators, Chicago Cubs

4: Brooklyn Dodgers

3: Cincinnati Reds, Pittsburgh Pirates, Philadelphia Athletics, St Louis Browns, New York Yankees, Boston Red Sox

2: St Louis Cardinals

1: Chicago White Sox

0: Detroit Tigers, Philadelphia Phillies, Boston Braves


More domination here, and more teams left out - but not our usual suspects. The NY Giants almost always dominate and are directly responsible for a rise in Noll-Scully in the mid-late 1910s, and Washington is (interestingly) the big AL market, but Cleveland's not that large. The Cubs don't usually benefit from Chicago's size.


OOTP - 1901 through 1925 v 2.0


(I then) attempted to recreate what would have been correct financially for this time period:


No free agency

Arbitration after 1 yr (to force the AI to think about a player's true value, as contracts were only 1 yr in this time period)

No revenue sharing

No salary cap


OOTP with corrections above
1901-25: 1.71


1901-05: 1.71

1906-10: 1.88

1911-15: 1.78

1916-20: 1.64

1921-25: 1.55


Best Year: 1.20 (1921)

Worst Year: 2.23 (1910)


This league showed more signs of "have" and "have not" syndrome than the original run of OOTP. Still, while these ratings are more competitive than early 20th century ball, they're comparable to modern figures.


Again, no team 'tanked.' The worst record belonged to the 1903 Tigers (47-93 .335) followed by the 1915 Yankees (52-102 .337). The best record belonged to the 1915 White Sox (109-45 .707). In all, only one team recorded 100 losses, and three recorded 100 wins.


Upward mobility within the league suffered compared to OOTP or even BM, possibly because of the lack of player mobility. Teams would stay down for a decade or more before recovering. Using my current franchise relocation rules, FOUR teams would have moved: Cincinnati and Detroit in 1916, the Yankees in 1918, and the Braves in 1925. This is higher than I'd intended and, again, suggests teams in a rut are going to stay there, and those on or near the top aren't going to fall far.


Pennants by Team:
11) Chicago White Sox

9) Pittsburgh Pirates

6) Brooklyn Dodgers

5) Boston Red Sox

4) St. Louis Browns

3) St. Louis Cardinals, Philadelphia Athletics, Philadelphia Phillies

2) Cincinnati Reds, New York Giants

1) Cleveland Indians, New York Yankees

0) Washington Senators, Boston Braves, Chicago Cubs, Detroit Tigers


Again, a bit top heavy with the White Sox and Pirates dominating their leagues.


--CatKnight


(* * *)


Thanks once again to CatKnight for all the hard work.


After reading all of the above, I am forced to conclude that OOTP simply does a better job of historical simulation than BM does. After a while, BM "degenerates", with large market teams dominating, and if one doesn't decide to play by some set of "house rules", balance becomes farcical.


This leads to two questions: 1) how come BM hasn't been knocked out of the market yet, and 2) how come BM doesn't change their engine to be a bit more realistic over the long run?


As for BM and the market, both CatKnight and I agree on one thing: Baseball Mogul's ease of set-up and play simply blows OOTP out of the water. OOTP's learning curve is steep, and even though I've purchased OOTP, I rarely play it. There's so much going on in OOTP that the menus are hard to navigate and you never know exactly where to find what you're looking for. Baseball Mogul's menus are simple, and you can just breeze right through and get started, although with none of the options in setting up leagues that OOTP offers.


In some quarters, BM is damned for its simplicity. However, with Baseball Mogul, at least you don't have to read a 500-page manual to get started.


So how come BM doesn't change their historical engine to be more realistic? CatKnight came up with some theories with which I agree. Either a) Clay Dreslough simply isn't aware of how unbalanced the game becomes over a long period of time, b) he simply doesn't care that it becomes unbalanced, as he figures that most people just want to play in the 2007-plus era with big-market teams dominating, or c) he doesn't intend to make it a priority at this time.


What can I conclude? For dynasty makers, you still have the same choice: BM or OOTP. Making a choice between one or the other has always involved trade-offs. OOTP is actually "tighter" than reality, in that it's actually more competitive than is historically appropriate. BM is "looser" than reality -- it is less competitive than a real league over time. The question -- as always -- is "what do you want"?
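The trade-off can be read straight off the numbers in CatKnight's message. The short Python sketch below only re-tabulates figures quoted there; the labels, the "gap" (sim average minus the real 1901-1925 average) and the "drift" (last five-year block minus the first) are my own shorthand, not anything either game reports.

```python
# Average Noll-Scully values for 1901-1925, as quoted in CatKnight's message.
REAL_MLB = 2.52
sim_averages = {
    "OOTP (default)": 1.60,
    "OOTP (period finances)": 1.71,
    "BM": 2.39,
}

# Five-year block averages in chronological order, from the same message.
five_year_blocks = {
    "OOTP (default)": [1.91, 1.61, 1.40, 1.63, 1.43],
    "OOTP (period finances)": [1.71, 1.88, 1.78, 1.64, 1.55],
    "BM": [2.10, 2.00, 2.45, 2.47, 2.96],
}

for name, average in sim_averages.items():
    gap = average - REAL_MLB            # negative = tighter than the real leagues
    blocks = five_year_blocks[name]
    drift = blocks[-1] - blocks[0]      # positive = balance got worse over time
    print(f"{name:24s} gap vs. real: {gap:+.2f}   first-to-last drift: {drift:+.2f}")
```

On the 25-year average, BM sits much closer to the real figure than either OOTP run, but it is the only one whose five-year values climb, ending at 2.96 against a real-league average of 2.52.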

1 comment:

Anonymous said...

Hey James,

"After reading all of the above, I am forced to conclude that OOTP simply does a better job of historical simulation than BM does."

I disagree with your analysis of the data. Mogul has a tendency for rich teams to dominate because it's modeled on the modern era and doesn't have the option to turn off free agency. But CatKnight's numbers show that Mogul's Noll/Scully values are MUCH more realistic and I don't see anything in the pennant distributions that tells us much. MLB's best dynasty won 10 pennants, as did OOTP's, while in Mogul they only won 8. So if anything there is less dominance in Mogul.

Mogul has one more team that never won a pennant (3 instead of 2) so you could argue it sucks a bit more to be poor in Mogul than it does in real life but I don't think those pennant totals have much statistical significance.

What is interesting is who is winning the pennants. The best team in MLB and Mogul during that time period was the New York Giants. In OOTP, the Giants only win two pennants.

In MLB, the Red Sox won 6 and the Yankees won 3 in that time period. In Mogul they each won 3, so the rivalry is there but it's a bit more even. In OOTP, the Yankees won 10 and the Sox had none. This is one example of OOTP's oversimplified financial engine. SI knows that Boston is a small city. But they are in England so they don't realize that Boston has an incredible fan base that lets them pump more money into their team than someone like the Dodgers or Mets.

I think your analysis is clouded by the fact that you already know that Mogul does get worse over time because you've played it enough to know that. From this data I'd say that in regards to competitive balance Mogul is like buying a Corvette. It's a great car but it breaks down over time. OOTP is like buying a Hyundai. It's not all that great, but at least it doesn't really get worse.

"b) he simply doesn't care that it becomes unbalanced, as he figures that most people just want to play in the 2007-plus era with big-market teams dominating, or c) he doesn't intend to make it a priority at this time"

Both of these have some truth in them. Mogul produces better PLAYER stats than OOTP so that means less time invested in the TEAM balance.

But the theory you leave out that I've seen Clay mention on the forums and in email is that Mogul is a game, not a replay sim. When Mogul came out there was already a good replay sim (Diamond Mind) on the market. The point of Mogul is to experience the challenge of picking a team and making it successful. In OOTP it's super easy to pick the Devil Rays and make them into a dynasty ... in fact getting the database configured and struggling with the interface is the hardest part. But in Mogul on the highest difficulty level it takes some effort to turn a poor team into a dominant team.

You can see that I'm a Mogul fanboy. Like you I don't play OOTP much anymore. You and CatKnight have done some useful analysis here and I forwarded it to Clay to help improve the game.

JP