Some of you know I’ve spent the past couple of years helping the Strat-O-Matic Game Company research the Negro Leagues for a computer game and card set. Finally, after a long wait (and a lot of work), the product is getting very, very close to being released. For those Strat fans wanting more information, last fall I did an interview with one of the columnists at the Bill James web site. It contains some additional information on the process behind the research and the Negro Leagues in general.
INTERVIEW WITH SCOTT SIMKUS
By Roel Torres
You Learn Something New Every Day
In a recent essay about the greatest offensive catcher of all-time, I wrote a fair amount about Negro League legend and Hall of Famer, Josh Gibson. I said, “You can’t track down a single concrete statistic for him. You can’t.” And I wrote that with confidence. It seemed like conventional wisdom. But of course, I should have known better. This is a complex world we live in, full of buried facts and undiscovered information. Often, the key is merely knowing where to look. A question that may be seen as a dark and unfathomable mystery by one man may produce a simple, transparent, self-explanatory answer from another. And I need to remember this. I have to remind myself that just because a statistic isn’t listed on Google, or Wikipedia, or Baseball-Reference, or Retrosheet doesn’t mean that it can’t be found. The art of research runs far deeper than that. The act of discovery has many more roads to choose from. As Agent Fox Mulder was fond of saying, “The Truth is out there.” It certainly is. And this is where Scott Simkus fits in.
Scott was kind enough to get in contact with me and inform me that statistics do exist for our friend, Josh Gibson. These stats are found in the pages of historical black newspapers such as the Pittsburgh Courier, Chicago Defender, Indianapolis Freeman and Baltimore Afro-American. They are found in newspapers, and archives, and libraries. We have batting averages, home runs, and RBIs. And more. Much, much more. There are lefty and righty platoon differentials. There are fielding percentages. Then he started telling me about groundball/flyball splits and range factors and ballpark effects and league equivalencies. In the Negro Leagues. I was stunned. My jaw dropped. Did you know these numbers existed? Did you know that those stats could be produced? I didn’t. I had no idea. Fielding percentages for Cool Papa Bell and Jud Wilson and Buck O’Neil. Groundball/flyball splits for Satchel Paige and Bullet Joe Rogan and Smokey Joe Williams. Lefty/righty splits for Buck Leonard and Oscar Charleston and even, yes, Josh Gibson.
Are you intrigued? Keep reading. Scott and I made a deal. He would let me interview him for Bill James Online if I agreed to be interviewed for a new blog he wanted to launch. (Author’s note: I got the far better end of that exchange.) What follows next is the result of my correspondence with Scott Simkus, Chicago Cubs fan, baseball historian, and Negro League expert. I have to say, I definitely learned a lot from our discussion. And I suspect that you may as well. Enjoy.
In the Beginning (the Introduction)
After I posted my essay “Mike Piazza vs. Josh Gibson,” Scott sent me the following email:
I enjoyed your article on the Josh Gibson/Mike Piazza debate. I think the dead-even results of the on-line poll (and James’ own musings about the two players, which seemingly waver back and forth), speak volumes about how far we’ve come in understanding the Negro League players. That’s right: A 50/50 split represents huge progress in the grand scheme of things. The fact we acknowledge Gibson may actually have been better than a star who just finished his career is meaningful.
If there had been an internet in the 1970s and there had been a poll about the relative offensive skills of Johnny Bench (catcher du jour of the era) versus Josh Gibson, what do you suppose the results would have been? I’d bet on a Cincinnati landslide. With how little we knew about Gibson and the Negro Leaguers at the time, I don’t believe it’s unrealistic that over 90% of the voters would have gone with Johnny. Maybe even 95%. There’s no way the baseball public of that era would have come anywhere close to 50/50.
As if we needed more evidence about how our understanding of baseball history has progressed, merely look at the most influential mainstream baseball writers of the 1970s. Were they writing extensively about the Negro Leagues? Were they giving serious consideration that any black ballers (other than Satchel Paige, perhaps) should be part of the conversation about the top baseball players of all-time? No. It wasn’t that they were bad guys (no more or less moral – or immoral – than today’s baseball big foots), it’s just that black baseball had not yet become part of white baseball’s historical consciousness.
Bill James, arguably the most influential baseball writer of the past 25 years, takes Negro League players seriously. He’ll admit he doesn’t have all the answers, but he knows intuitively men like Josh Gibson, John Henry Lloyd, Oscar Charleston, and Satchel Paige need to be included in any discussion about the greatest players (black or white) of all-time.
Now, the only minor issue I have with your essay is the question about data for Josh Gibson. Prior to the special 2006 Negro League election at the Hall of Fame, Major League Baseball (in conjunction with the Cooperstown boys) funded a landmark statistical study on the black leagues. Included in this process was a book called Shades of Glory, by Lawrence D. Hogan, which included an appendix with career numbers for guys like Charleston, Paige… and Josh Gibson. By their count, in 510 career games, Gibson hit .359 and averaged about 35 home runs per 154 games.
To put this into some type of context, his batting average and HR ratio are the BEST of all-time in the Negro Leagues. Numero Uno. Plus he walked a lot… his OBP would be top 2 or 3. OPS #1. Think about that for a minute. This is a league that included Charleston, Lloyd, Cool Papa Bell, John Beckwith, Monte Irvin, Roy Campanella, Turkey Stearnes, Willard Brown, Judy Johnson, Willie Wells, Martin Dihigo and Larry Doby. Yet there he is – Josh Gibson – at the top of the heap. A catcher – playing in Forbes Field and Griffith Stadium (two abysmal places to hit for righties), dominating his league.
I’m working as a consultant to the Strat-O-Matic Game Company in New York on a Negro League set which they plan to release after the holidays. Over the past four years, I’ve been able to go a step further than the Hall of Fame study, calculating lefty/righty splits and defensive numbers for 103 different Negro League All-Stars, including Josh Gibson. Let me know if you need any additional information.
Keep up the thought-provoking work,
Okay. That was the letter I received. There was a lot to take in. But those sentences at the end caught my attention. Scott had spent four years calculating lefty/righty splits and defensive numbers for over a hundred Negro League All-Stars. I was curious. I wanted to know more.
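For readers who want to see how a "per 154 games" figure like the one in Scott's letter works, here is a minimal Python sketch. The function names are my own, and the home run total it prints is only what the rounded numbers in the letter (510 games, about 35 home runs per 154) imply arithmetically; it is not a count taken from the Hall of Fame study.

```python
SEASON_GAMES = 154  # schedule length used as the yardstick in the letter


def per_season_rate(count, games, season_games=SEASON_GAMES):
    """Scale a raw career count to a rate per `season_games` games."""
    if games <= 0:
        raise ValueError("games must be positive")
    return count / games * season_games


def implied_total(rate_per_season, games, season_games=SEASON_GAMES):
    """Invert the rate: the raw count that a per-season rate implies."""
    if games <= 0:
        raise ValueError("games must be positive")
    return rate_per_season * games / season_games


# Rounded figures quoted in the letter: 510 games, about 35 HR per 154 games.
implied_hr = implied_total(35, 510)
print(f"Implied career home runs: about {implied_hr:.0f}")
```

Because both inputs are rounded, the result is a ballpark figure only.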
We traded a couple of emails and, as I mentioned earlier, Scott let me know that he had been going through box scores, calculating groundball/flyball ratios, fielding percentages, range factors, ballpark effects, and League Equivalency data for the Negro Leagues. He had home/road splits for Cool Papa Bell. He had spray charts for Josh Gibson. This was groundbreaking. This was state of the art. I was slightly staggered. I couldn’t think straight.
I asked if he would be willing to do an interview for Bill James Online, and here are the results.
Roel Torres: How did you end up studying the Negro Leagues in such depth?
Scott Simkus: I consider myself an all-around baseball nut, more than just a Negro League historian. I’ve got over 1500 House of David box scores, hundreds of stories and boxes from when the Japanese teams visited the USA, pre-WWII. Major League exhibition games. Hundreds of boxes from the famous Chicago semi-pro league pre-1912. I’m extremely interested in “outsider baseball” pre-integration, of which the Negro Leaguers were a huge part.
And you were able to find an abundance of Negro League data? How many box scores do you have?
Official Negro League games: over 3000.
That seems ridiculous. Doesn’t that get overwhelming? When Strat-O-Matic asked you to organize the stats, how did you approach working with a data set of that size?
I wound up zeroing in on the top 5-7 years for the 103 players included in our set and focused on the teams they played for.
I have to know: how do you acquire three thousand Negro League box scores?
I’ve spent the last four years going through microfilm and digitized historical newspapers. I’m a member of several libraries here in Chicago (as well as a couple out of state) which have helped me get access to the most important historical black newspapers such as the Pittsburgh Courier, Chicago Defender, Indianapolis Freeman and Baltimore Afro-American.
So the historical black newspapers were able to provide the backbone for your work?
There’s also a lot of information in historical white newspapers such as the Chicago Tribune, Washington Post, Indianapolis Star, and Kansas City Star. Plus there were hundreds of neutral site games in little podunk towns covered in podunk papers.
Sounds to me like you basically turned over every stone.
I’ve never counted the different newspaper archives used for the Strat-O-Matic set, but it must exceed 100 different titles.
Well, when you find all this stuff in a library, what do you do with it? How do you keep track of everything?
Actually, thanks to companies like ProQuest and NewspaperArchive, who digitize historical papers, a lot of this can be done from home. There are costs associated with using online services, but nothing in life is free. I print hard copies, catalogue them by year, then file them in chronological order. I have several four-drawer file cabinets in my closet where these are stored. I don’t just collect boxes, I also print interesting biographical stories and photographs from the old newspapers and categorize them as well.
Did you do all this through trial and error? Did you figure everything out, work out your system as you went along?
One of the people who has had an immense influence on my work is a guy named Gary Ashwill. He is – hands down – one of the most talented, generous people researching Negro League baseball today. He was a member of the Negro League Research group who worked on the Hall of Fame data. Early on, Ashwill helped point me in the right direction in terms of research and getting access to historical newspapers.
That’s a good guy to have on your side… Okay, give me an idea on some of the statistical breakdowns you can discover by working with the box scores?
Depends on how much of your life you’re willing to sacrifice. The box scores themselves pose so many challenges (missing columns, such as AT BATS, and incorrect totals in the bottom portion of the boxes) that it takes an immense effort sometimes, just to decipher one game.
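To give a feel for the kind of checking Scott describes, here is a small, hypothetical Python sketch that flags a missing column or a printed total that does not match the player lines. The data layout, names, and numbers are invented for illustration; this is not Scott's actual workflow, which he describes as a magnifying glass and a spreadsheet.

```python
def check_box_score(batting_lines, printed_totals):
    """Return a list of problems found in one team's batting box.

    batting_lines: list of dicts such as {"name": "Player A", "ab": 4, "h": 2};
                   None marks a value the newspaper did not print.
    printed_totals: dict of the totals printed at the bottom of the box.
    """
    problems = []
    for column, printed in printed_totals.items():
        values = [line.get(column) for line in batting_lines]
        if any(value is None for value in values):
            # A missing column (AT BATS is the example Scott gives).
            problems.append(f"{column}: missing for at least one player")
            continue
        summed = sum(values)
        if summed != printed:
            # The bottom-line total disagrees with the individual lines.
            problems.append(
                f"{column}: players sum to {summed}, box prints {printed}"
            )
    return problems


# Made-up example: one at-bat value is missing and the hit total is off by one.
lines = [
    {"name": "Player A", "ab": 4, "h": 2},
    {"name": "Player B", "ab": None, "h": 1},
]
for problem in check_box_score(lines, {"ab": 8, "h": 4}):
    print(problem)
```

A check like this only points at the boxes that need a closer look; resolving them still means going back to the newspaper page.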
And were you starting from scratch with all these box scores, or was some of this already collected?
The last thing in the world I wanted to do was build my own database. Negro League historians such as John Holway and James A. Riley have been collecting boxes since the 1970s. A SABR guy named Larry Lester and the Hall of Fame research group have the largest collection of Negro League boxes in the world, but for reasons not worthy of my uninformed speculation, they’ve never been made available to the public. I was pretty naïve a couple years ago. I thought I’d make a couple phone calls, send some emails, and – presto! – people would help give me access to big chunks of box scores. Turns out that’s not how things work in the real world.
What makes everything so difficult? Why is access so limited?
For whatever reason, Negro League research and development is a competitive business. It’s like R&D in the pharmaceutical industry, but without the big money. So I started slowly a few years ago – 15 to 20 hours per week – then worked myself into a frenzy this year. Between my real job and the baseball project, I worked over 80 hours per week for four straight months this summer to get this thing done.
Uh, call me crazy, but I don’t believe human beings are meant to work on any project over 80 hours a week for four straight months. Holy mackerel. Was it worth it? Do you feel like all the Strat-O-Matic research paid off?
My wife and kids are happy this is almost over. It starts out as fun, but by the end it’s hell to sit there with a magnifying glass, photocopied box scores and an Excel spreadsheet pulled up on the computer for 8 to 10 straight hours. But is it worth it? No doubt. What makes my work for the Strat-O-Matic game so unique is the lefty/righty splits and fielding data for guys like Josh Gibson and Willard Brown. Nobody has ever done that before. It’s just a game, but I think it makes an important contribution to our understanding of the Negro League players.
Right. So you have platoon differentials for your hitters. But you also have breakdowns on pitching stats?
Yeah, we’ve got L/R splits for pitchers, stolen bases against per 9 innings pitched (to calculate their “hold” ratings), and groundball/flyball ratios for pitchers. Leroy Matlock, for instance, who was a star lefty for the mid-1930s Pittsburgh Crawfords, was an extreme flyball pitcher. 27% of his non-K putouts were recorded by outfielders, which is high compared to his contemporaries in the set. In The Neyer/James Guide to Pitchers, Rob and Bill write “Matlock was famously tough on lefties.” I’m happy to report he was, indeed, tough on lefties, but he didn’t overwhelm them to the same extent as somebody like John Donaldson. Donaldson was a deadball era southpaw who was filthy on lefties, holding them to an anemic .167 batting average in our study.
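The two pitcher rates Scott mentions can be sketched in a few lines of Python. The function names and sample inputs below are hypothetical, and the formulas are my reading of his description (outfield putouts divided by all non-strikeout putouts, and steals scaled to nine innings), not Strat-O-Matic's actual rating method.

```python
def outfield_putout_share(outfield_putouts, total_putouts, strikeouts):
    """Fraction of a pitcher's non-strikeout putouts made by outfielders."""
    non_k_putouts = total_putouts - strikeouts
    if non_k_putouts <= 0:
        raise ValueError("need at least one non-strikeout putout")
    return outfield_putouts / non_k_putouts


def steals_allowed_per_nine(stolen_bases, innings_pitched):
    """Stolen bases allowed, scaled to nine innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return stolen_bases * 9 / innings_pitched


# Invented sample: 270 putouts behind a pitcher, 70 by strikeout,
# 54 caught by outfielders, and 12 steals allowed over 90 innings.
share = outfield_putout_share(54, 270, 70)
steal_rate = steals_allowed_per_nine(12, 90)
print(f"Outfield share of non-K putouts: {share:.0%}")
print(f"Steals allowed per 9 innings: {steal_rate:.1f}")
```

The sample inputs were picked so the outfield share comes out to 27%, the figure Scott cites for Matlock; the underlying counts are made up.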
I’m pretty sure that’s the most information I’ve ever seen about Leroy Matlock’s flyball tendencies and John Donaldson’s lefty numbers. I mean, that stuff is fascinating. It’s pure gold. It really is. After you’ve finished up the Strat-O-Matic project, have you ever thought about writing a book about all this material?
I’ve got a half-written baseball book, fully outlined and ready to be shopped, collecting dust on my bookshelf. The Strat-O-Matic project really evolved from the research I was conducting for this book, and ironically, in a very real sense, the game became my life in 2008. I’ve had to eat, sleep and drink Strat-O-Matic and Negro League ball this year to get this work done, and the book itself has had to take a back seat.
So you already have something in the works. You’ve already laid the groundwork on it.
I haven’t had time to pursue a literary agent or publishing house, but the working title is Zulu Cannibals, Canadian Clowns, Satchel Paige and a Bearded Troupe of Religious Zealots: AN UNCENSORED TRIP INSIDE AMERICA’S INTERNATIONAL BASEBALL FREAK SHOW
Just rolls off the tongue, huh?
The title suggests that the scope of the book is more than just statistical analysis.
The book is really a wry, irreverent look at the world of outsider baseball pre-1947, and the cast of fascinating characters (black, white, Latin, and Asian) who were forced to perform on its periphery. You’re not going to see the same tired old stories re-hashed here. I’ve got home run spray charts for Josh Gibson. Home/Road data for Cool Papa Bell. Statistics for the House of David and Bloomer Girls.
Spray charts for Gibson. Home/Road data for Cool Papa Bell. Every time you bring up something new, my head feels like it’s going to explode.
Every chapter promises to be something unprecedented and often bizarre. I’ve got the Japanese teams challenging the white Pacific Coast League ballclubs in the 1930s. What the book tries to do, in a funny, entertaining way, is look at the talent pyramid in baseball pre-integration to see if we can learn what the real shape and size of it was.
What kind of timetable are we talking here?
I was hoping to offer the book to the baseball masses in 2009, but due to the Strat-O-Matic project, a 2010 release is probably more realistic. Just need to find the right publishing house now.
So the book is on the horizon. And it sounds like the Strat-O-Matic set is a little closer around the corner?
We’re currently working on the finishing touches, MLEs, league quality issues, etc. to convert it into a playable, realistic game that is compatible with what they’ve been producing since 1961. There have been other Negro League sims marketed, but nothing based on statistical research as detailed as this stuff. We’ve even calculated flyball/groundball outs for guys like Satchel Paige. Fielding percentages and range factors for everybody. It’s going to blow people away.
And let me just confirm this for the record: you mentioned finding a couple of Filipinos who played in the Negro Leagues?
There have been at least two Filipinos who played with Negro League teams during the 1920s – interestingly, both were pitchers. A couple years back I stumbled across a reference to the “Filipino” pitcher Claudio Manela of the Cuban Stars. This was in a box score against a white semi-pro team featuring George Halas in the outfield. After the season, Halas was going to work on his other business, a little something we now call the “National Football League.”
I think I’ve heard of them. They might have a future. Who was the second Filipino?
Just recently, Gary Ashwill found a box score and reference to Jose Tombo, a Filipino who got a trial with the Chicago American Giants in the early 20s, then later played semi-pro ball with a Hawaiian all-star team in the Chicago area. Neither one of these guys will be in the SOM set, of course.
Awesome. Benny Agbayani now has competition as the Greatest Filipino baseball player of all time…
Scott, thanks for taking the time to talk to me. It’s been tremendously educational and a real pleasure. This is amazing stuff, and you’re doing a real service to baseball by doing this research.
Before I let you go, weigh in on the question – Who do you think is the better offensive catcher? Mike Piazza or Josh Gibson?
Mike Piazza was the top offensive catcher of his generation, and arguably the top hitting white backstop of all-time. For several seasons, he was among the top right-handed hitters in the game, period, regardless of what position they played.
Josh Gibson, on the other hand, was not just the top offensive catcher in black baseball during his era – he is the Greatest Hitter in the History of the Negro Leagues. He is – statistically speaking now – a combination of Rogers Hornsby and Babe Ruth.
A combo of Hornsby and Ruth. That’s a pretty good player.
I can tell you who I believe is overrated in the Negro Leagues (and there are plenty, some of them with plaques in the HOF), but when it comes to Gibson – as much respect as he gets in the SABR community already, he may yet be slightly underrated. The numbers that have emerged portray a ballplayer who may be larger than his legend.
I like Mike Piazza, and enjoyed following his career. But Josh Gibson’s numbers – his place in the context of the league in which he played – puts him at a different level. And if we tossed in the defensive skills issue, this wouldn’t even be a contest. Gibson, by a landslide.
I had a feeling you’d say that.
Once again, I’d like to thank Scott Simkus for generously donating his time and discussing his research on the Negro Leagues with me. It truly is exciting to think that we live in an age when we can get home/road splits, spray charts, range factor, fielding percentages, groundball/flyball ratios, stolen bases allowed per 9 innings, lefty/righty splits, park effects, and league equivalencies for all these wonderful Negro League legends who played the game so well. It’s simply astonishing. Once again, never underestimate the power of libraries, newspapers, and box scores. They’ll surprise you every time.
Don’t forget to check out Scott’s work over at http://scottsimkus.wordpress.com/. You can get his thoughts on baseball, and updates on his work on the upcoming Strat-O-Matic Negro Leagues set. And, if I remember to reciprocate, he should also have another interview posted on his site in the near future, but this time he asks the questions and I provide the answers. Of course, I can’t imagine how that could possibly compare to all the cool stuff he shared in his responses. Maybe I’ll make up stats for the Mexican League or something… –RT