Japanese Projections – Part 3: Hitters

November 9, 2006

This post has been made largely irrelevant by the excellent work of Jeff Sackmann over at the Hardball Times (Jeff is also perhaps the primary reason for the author’s GMAT score – thanks again for the great blog, Jeff!). Even so, this subject deserves attention.

Akinori Iwamura has been posted by his team and is pretty much guaranteed to be on an MLB team next year. Earlier, we used the factors Aaron Gleeman developed for his Kenji Johjima projection in an off-the-cuff Iwamura projection. Today we’ll be more rigorous.

Most translation systems include data from players who go both from NPB to MLB and from MLB to NPB. Instead, we will only use data from NPB players born in Japan who later played in MLB. The sample, unfortunately, is quite small: it includes Ichiro Suzuki, Hideki Matsui, Tsuyoshi Shinjo, Kazuo Matsui, Tadahito Iguchi, So Taguchi, and Kenji Johjima. We used data from their final three seasons in Japan and compared that with their first two seasons (assuming they had that many) in America. This was done using matched PAs; thus, in all cases, Japanese numbers were interpolated to match the number of plate appearances each player had in America. Summary data for each statistic gave us the following translations:


Stat   NPB    MLB    Factor
PA     6939   6939   N/A
AB     6046   6251   1.0339
H      1912   1820   0.9519
2B     348    324    0.9310
3B     30     41     1.3667
HR     300    157    0.5233
TB     3220   2697   0.8376
SO     948    927    0.9778
BB     718    516    0.7187
HBP    85     65     0.7647
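
For anyone who wants to reproduce the matched-PA bookkeeping, here is a rough Python sketch. The two player lines are made-up placeholders (not real totals) and the function names are my own. The only point is that each NPB line gets scaled to that player's MLB plate appearances before anything is summed.

```python
def scale_to_pa(line, target_pa):
    """Pro-rate a counting-stat line so that its PA equals target_pa."""
    ratio = target_pa / line["PA"]
    return {stat: value * ratio for stat, value in line.items()}


def matched_pa_factors(pairs):
    """pairs: list of (npb_line, mlb_line) dicts sharing the same stat keys."""
    npb_totals, mlb_totals = {}, {}
    for npb, mlb in pairs:
        matched = scale_to_pa(npb, mlb["PA"])  # NPB line at the MLB PA count
        for stat in mlb:
            npb_totals[stat] = npb_totals.get(stat, 0.0) + matched[stat]
            mlb_totals[stat] = mlb_totals.get(stat, 0.0) + mlb[stat]
    # Factor = summed MLB total / summed matched NPB total (PA is 1 by design).
    return {stat: mlb_totals[stat] / npb_totals[stat]
            for stat in mlb_totals if stat != "PA"}


# Hypothetical lines for two players (illustration only, not real data).
pairs = [
    ({"PA": 1800, "HR": 90, "BB": 180}, {"PA": 1200, "HR": 30, "BB": 90}),
    ({"PA": 1500, "HR": 30, "BB": 150}, {"PA": 600, "HR": 6, "BB": 45}),
]
factors = matched_pa_factors(pairs)
```

Summing before dividing means players with more MLB playing time carry more weight, which is how the summary table behaves.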

To use these factors, simply apply them to an NPB line while holding PA constant (for instance, if player X hits 20 HR in 500 PA in NPB, he’d hit about 10 HR in 500 PA in America). The largest adjustment, by far, is for HR. Going to America is devastating to NPB home run hitters – they hit homers at roughly half the rate per plate appearance in America that they did in Japan. Interestingly enough, this group struck out slightly less often in America than in Japan, which suggests they probably changed their hitting approach significantly.
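
Here is a small Python sketch of applying the factors. The factors come straight from the table; "player X" is a hypothetical line, and the OBP denominator (AB + BB + HBP, ignoring sacrifice flies) is a simplification I'm assuming, not necessarily what was used for the images in this post.

```python
# Translation factors from the table above (MLB total / NPB total).
FACTORS = {
    "AB": 1.0339, "H": 0.9519, "2B": 0.9310, "3B": 1.3667, "HR": 0.5233,
    "TB": 0.8376, "SO": 0.9778, "BB": 0.7187, "HBP": 0.7647,
}


def translate_hitter(npb_line):
    """Multiply each counting stat by its factor; PA is held constant."""
    mlb_line = {"PA": npb_line["PA"]}
    for stat, value in npb_line.items():
        if stat != "PA":
            # Stats without a factor pass through unchanged.
            mlb_line[stat] = value * FACTORS.get(stat, 1.0)
    return mlb_line


def slash_line(line):
    """AVG / OBP / SLG from counting stats (OBP ignores sacrifice flies)."""
    avg = line["H"] / line["AB"]
    obp = (line["H"] + line["BB"] + line["HBP"]) / (
        line["AB"] + line["BB"] + line["HBP"])
    slg = line["TB"] / line["AB"]
    return avg, obp, slg


# Hypothetical "player X": 20 HR in 500 PA in NPB.
player_x = {"PA": 500, "AB": 430, "H": 130, "2B": 25, "3B": 2, "HR": 20,
            "TB": 219, "SO": 80, "BB": 60, "HBP": 5}
translated = translate_hitter(player_x)
avg, obp, slg = slash_line(translated)
```

The translated HR total comes out to 20 × 0.5233, or about 10, matching the example in the text.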

Akinori Iwamura

Here are his numbers for 2004-2006 in Japan.

Akinori Iwamura Stats

Here are those numbers translated to MLB using the above method.

Akinori Iwamura Translated Stats

He loses about 20 points of AVG, 40 points of OBP, and 100 points of SLG on average. Yikes. Here’s a 3-year weighted projection, pro-rated to 160 games played (as he was very durable in Japan).

Akinori Iwamura 2007 Projection

That’s a little better than my back-of-the-envelope projection from last week, but still not all that great for a third baseman. Basically, it’s a slightly better version of David Bell. Here’s hoping he can play second. He did win another Gold Glove this year, so that’s something.

Tadahito Iguchi

Let’s try out this method on some other players, even though it’s cheating (you shouldn’t apply a model to the data used in making it). Below are Iguchi’s projected 2005 line (year in italics) and his actual line (below).

Tadahito Iguchi Projection vs Actual 2005

Not bad – I’ll take a projection that is within 21 points of OPS anytime. Of course, we expected this to happen, since his numbers were used to make the model.

What about some other Japanese stars? Let’s take a look.

Kosuke Fukudome

This star OF for the Chunichi Dragons probably isn’t making the jump any time soon, but let’s take a look at what he might do in the bigs.

Kosuke Fukudome 2007 Projection

As you can see, his skills transfer very well. Along with his good hitting numbers, Fukudome is an outstanding center fielder; he’d find quite a few suitors in MLB if he wanted to try his luck here. Sadly, he turns 30 next April, so we’ll have missed the prime of his career if and when he ever decides to come over.

Shinnosuke Abe

Abe is the Yomiuri Giants’ starting catcher. He’ll never be posted, so we’ll have to wait for him to come over via free agency if he wants to play in MLB (he won’t be eligible for three more seasons). Here’s his projection.

Shinnosuke Abe 2007 Projection

Abe slugged .630 in 2004, but hasn’t come near that since, and it shows in his projection. His HR have gone 33-26-10 in the past three years. Pass.

Next week, we’ll take another look at pitcher projections using homegrown translations that will hopefully be a little more accurate and/or believable.


Matsuzaka End-of-Auction Contest

November 8, 2006

Five minutes ago, bidding for Daisuke Matsuzaka’s posting rights officially ended. MLB will forward the amount of the winning bid to Seibu, which has four days to decide whether or not to accept it. At that point, the winning team is announced and they can begin to negotiate with Matsuzaka’s agent, Scott Boras.

In the comments, feel free to guess the amount of the winning bid and the team that won. Our predictions after the break:


Japanese Projections – Part 2: Pitchers

November 8, 2006

Earlier, we talked about how hard it is to predict the future. As our old professor Larry Sabato used to say:

He who lives by the crystal ball ends up eating ground glass.

So, with that in mind, let’s fry up a delicious glass omelet!

Jim Albright’s system gives us the means to translate Japanese into American, so to speak: numbers from NPB become numbers from MLB. Of course, the translations overlook several factors. They do not account for park effects, for one. Another: they don’t adjust for age. And finally: they don’t account for league difficulty. These are problems I’ll try to tackle at some point in the future; but for now, we’ll overlook our beauty queen’s gapped teeth and barely noticeable moustaches, and get her ready for the swimwear competition.

The first set of translations is easy: we simply hold IP constant and multiply the other statistics by the translation factors. As mentioned yesterday, 100 hits in 100 IP become 108 hits in 100 IP. Et cetera. (Stats we don’t have translation factors for, like HBP and WP, were left unchanged.) The largest adjustment turns out to be for home runs; despite the bigger parks in America, pitchers have trouble keeping the ball in the yard when they make the trip over here.

So, we’ve translated hits, home runs, strikeouts, and walks to MLB equivalents. What now? We used Bill James’ component ERA formula to calculate an ERC for each player. Then, given the number of innings they pitched, we figure out how many earned runs that ERC implies.
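
For reference, here is a Python sketch of that step. The ERC formula below is the version of Bill James' component ERA as it is commonly published; I'm assuming that is the variant in play here, and the sample line is hypothetical. Note that IP must be in true decimal innings (186.1 in box-score notation is 186 and a third).

```python
def component_era(h, hr, bb, hbp, ibb, ip, bfp):
    """Component ERA (ERC), as commonly published; ip in decimal innings."""
    # "Pitcher's total bases" estimate from the component stats.
    ptb = 0.89 * (1.255 * (h - hr) + 4 * hr) + 0.56 * (bb + hbp - ibb)
    raw = ((h + bb + hbp) * ptb) / (bfp * ip) * 9
    erc = raw - 0.56
    if erc < 2.24:
        # Low-end adjustment used in the published formula.
        erc = raw * 0.75
    return erc


def implied_earned_runs(erc, ip):
    """Earned runs a pitcher must have allowed to post this ERC over ip."""
    return erc * ip / 9


# Hypothetical translated line (illustration only).
erc = component_era(h=180, hr=20, bb=60, hbp=5, ibb=2, ip=200.0, bfp=840)
earned_runs = implied_earned_runs(erc, 200.0)
```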

Aside: the ERC formula requires BFP (which we don’t have for all years) as one of its inputs. Using the Lahman database and Excel, I regressed BFP on IPouts, H, BB, K, HBP, and HR. I used the weights from this regression to estimate BFP. For a full season’s worth of batters faced (800+ BFP) the calculated value is usually within 5 BFP and rarely further than 15 BFP from the correct value. Click here to see the regression results.

Then, using their historical ratio of R to ER, we take a stab at guessing how many unearned runs they might have allowed in addition. If we wanted, we could also try to guess how many wins and losses a player would have had based on their RA, an assumed team RA and run context. For now, we’ll just ignore them in our translated statistics.
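
The unearned-run step is just a ratio. A tiny Python sketch, with made-up career totals:

```python
def total_runs(projected_er, career_r, career_er):
    """Scale projected earned runs by the pitcher's historical R/ER ratio."""
    return projected_er * (career_r / career_er)


# Hypothetical: 70 projected ER, career 330 R on 300 ER (ratio 1.10).
runs = total_runs(70, 330, 300)
unearned = runs - 70
```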

After we have all our translations done, we should adjust everything for age and park. And maybe we will, later. But for now, a simple flat 3/2/1 projection without mean regression will have to suffice. What that means in English: we will assign each of the last three years a weight of either 3 (for the most recent year), 2, or 1. We will then calculate the weighted average for stats like BB, H, K, IP, etc. using that algorithm. ERC, ER, and R are re-calculated as described above. Finally, we will re-calculate starts and innings pitched based on the assumption that Japanese pitchers will throw fewer pitches per start (but start more frequently) in America, and pro-rate other stats accordingly.
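
The 3/2/1 weighting can be sketched in a few lines of Python. The season lines here are hypothetical, and I'm assuming the weighting is applied to counting stats (rate stats like ERC get recomputed afterward from the weighted components).

```python
WEIGHTS = (3, 2, 1)  # most recent season first


def weighted_line(seasons):
    """seasons: per-season stat dicts, most recent first (up to three)."""
    used = list(zip(WEIGHTS, seasons))
    total_weight = sum(weight for weight, _ in used)
    return {
        stat: sum(weight * season[stat] for weight, season in used) / total_weight
        for stat in seasons[0]
    }


# Hypothetical translated seasons, most recent first.
seasons = [
    {"IP": 200.0, "H": 180, "BB": 50, "K": 190},
    {"IP": 180.0, "H": 175, "BB": 55, "K": 160},
    {"IP": 150.0, "H": 160, "BB": 50, "K": 130},
]
projection = weighted_line(seasons)
```

With these inputs the weighted IP is (3 × 200 + 2 × 180 + 1 × 150) / 6 = 185.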

Aside: Since the start of the 2000 season, 420 pitchers have started at least 25 games with one team during a season while making no relief appearances. I calculated the average number of batters those pitchers faced per start — it’s 26.7. From this number, we can assume either a number of starts or a number of batters faced and back into innings pitched (and hence other numbers) that way.
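
Backing into innings from an assumed number of starts might look like this in Python. The input line is hypothetical; the 26.7 batters per start is the figure from the aside above, and scaling every stat by the same ratio is my assumption about how the pro-rating works.

```python
BATTERS_PER_START = 26.7  # average from the aside above


def rescale_to_starts(line, starts, batters_per_start=BATTERS_PER_START):
    """Pro-rate a pitching line so BFP equals starts * batters_per_start."""
    target_bfp = starts * batters_per_start
    ratio = target_bfp / line["BFP"]
    scaled = {stat: value * ratio for stat, value in line.items()}
    scaled["GS"] = starts  # starts are assumed, not scaled
    return scaled


# Hypothetical weighted line: 890 batters faced over 210 innings.
line = {"GS": 27, "BFP": 890.0, "IP": 210.0, "H": 190.0, "BB": 55.0, "K": 200.0}
scaled = rescale_to_starts(line, starts=30)
```

Thirty starts at 26.7 batters each is 801 batters faced, so every stat in this example is multiplied by 0.9.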

Kei Igawa

Here’s how Igawa did in Japan the past three years:

Kei Igawa Actual Statistics

Pretty good numbers (although lots of home runs). 228 K in 200 IP looks great. Watch what happens after the translation:

Kei Igawa Translated Statistics

Some good, some bad. Note that Igawa’s BB/K ratios are always pretty good, though he gave up too many baserunners and homers in ’04 and ’05. Hard to find anything wrong with the translated 2006 line, although we find it a tad too optimistic a translation. Keep in mind that GS has not been adjusted, and it’s unlikely that Igawa would have stayed in each start as long as these stats would lead you to believe. We will adjust for that in his projection.

Aside: You might wonder why Igawa’s Japanese ERA was nearly identical in 2004 and 2005 yet translated so differently. The first numbers use his actual Japanese ERA; the second estimate what his ERA would have been in America given his component stats. Thus, despite posting similar ERAs in Japan in 2004 and 2005, Igawa’s components indicate he pitched much better in 2004 than he did the following year.

Now, we project his stats using the model described above. We will assume he makes 30 starts and faces 26.7 batters per start. Also, we assume he plays for a team that scored 4.85 runs per game (splitting the difference between the AL and the NL, as this projection applies to neither league in particular). Finally, we’ll assume he got a decision for every 9 IP and calculate his winning percentage using James’ Pythagorean formula with an exponent of 1.82. That gives us this:
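
The win-loss part of that recipe, as a Python sketch. The 4.85 runs per game, the 1.82 exponent, and the one-decision-per-9-IP rule come from the paragraph above; the sample pitcher is hypothetical, and I'm assuming total runs allowed (not just earned runs) go into the formula.

```python
def pythagorean_record(ip, runs_allowed, team_runs_per_game=4.85, exponent=1.82):
    """Estimate wins and losses from runs allowed per 9 IP and team offense."""
    ra9 = runs_allowed * 9 / ip
    scored = team_runs_per_game ** exponent
    allowed = ra9 ** exponent
    win_pct = scored / (scored + allowed)
    decisions = ip / 9  # one decision for every 9 innings pitched
    wins = win_pct * decisions
    return wins, decisions - wins


# Hypothetical pitcher who allows exactly 4.85 runs per 9 over 180 IP.
wins, losses = pythagorean_record(ip=180.0, runs_allowed=97)
```

A pitcher whose run average matches his team's scoring comes out at .500, here 10-10.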

Kei Igawa 2007 Projection

The ERA is deceptive – he’s giving up a lot of unearned runs. Basically a league-average starter. This line is somewhat similar to Matt Clement or Jeremy Bonderman ca. 2005. If he can match this projection, he’s worth Jeff Suppan money.

Hiroki Kuroda

Kuroda apparently re-signed with the Carp already, but let’s take a look anyway. Japanese actual stats:

Hiroki Kuroda Stats

He doesn’t strike out a ton of hitters, but he keeps the ball in the park and has great control (his R/ER ratios are surprising – they’re very low for a groundball pitcher, as he reportedly is). Translated:

Hiroki Kuroda Translated Stats

Those hold up very well, mainly because he doesn’t walk anyone and keeps the ball down. Note the 2005 3.17 translated ERA matches the 3.17 actual ERA by a lucky quirk: his Japanese peripherals suggested he was unlucky to have an ERA as high as it was. Projected to 2007:

Hiroki Kuroda 2007 MLB Projection

A Cy Young candidate in the National League. Two important caveats: first, there’s no age adjustment, and he’s on the bad side of 30. This would cause him to take a hit. Second, it seems unlikely that a guy who relies on control could allow so many balls in play but so few over the fence. This projection is probably at least a run too low.

Kazumi Saitoh

Ahh, my favorite player in NPB. His numbers are fantastic; how will they hold up? Actual numbers:

Kazumi Saitoh Stats

Not a lot of innings in 2004 and 2005; was he hurt? Tons of runs in 2004 despite pretty good peripherals, too. Translations:

Kazumi Saitoh Translated Stats

Not a bad 2006, huh? The H/9 looks too low, though. He would have run away with the Cy Young if he put up those numbers in MLB. 2007 projection:

Kazumi Saitoh 2007 Projection

Sign me up! I’m not sure if he would be able to sustain the BABIP, though. I think this projection is a tad optimistic, but I buy it more than Kuroda’s. Note that I didn’t give Saitoh 30 starts, as that would have been a reach given the number of innings he’s thrown recently.

Daisuke Matsuzaka

What we’ve all been waiting for. Actual stats:

Daisuke Matsuzaka Actual Stats

Absolutely dominant. 138 hits in 186 innings is incredible. He did miss a few starts in 2004 to injury. Translated:

Daisuke Matsuzaka Translated Stats

It’s hard not to get excited. K/BB is still over 5. HR rates are low. Wow. And the projection:

Daisuke Matsuzaka 2007 Projection

Wonder why teams are bidding $25 million just to talk to this guy? Now you know. He probably won’t be this good – his projected BABIP is too high, for instance. But you never know…

Igawa survives MLB All Stars

November 7, 2006

Kei Igawa started last night (the night of the 7th) against the MLB All Stars, according to this article in the Daily Yomiuri. He was wild but effective, allowing five walks but just two runs, and left with the score tied two all. MLB then piled on five runs against Japanese relievers, winning the game 7-2. Igawa also struck out four.

Daily Yomiuri: MLB All-Stars clean up again

Japanese Projections – Part 1: Background

November 7, 2006

Predicting the future is hard, but it can be a lot of fun.

Projection is the art (fools say science) of predicting the future when applied to baseball players. Generally, it involves looking at what a player has done in the past, especially in the recent past, adjusting for various things (like age, injuries, et cetera), and applying an algorithm or set of algorithms to create a set of magic numbers. These magic numbers, though based on reality, can have tragic consequences. For instance, two years ago I reasoned that the $10m the White Sox committed to Jermaine Dye for 2005 and 2006 might have been better spent on Jose Cruz Jr. For those keeping score:

Player   Year    OPS
Dye      2005   .845
Cruz     2005   .837
Dye      2006  1.007
Cruz     2006   .734

Oh well, one outta two ain’t bad, right? Point being, at least for idiots like me, it’s hard to predict the future for major league players – there is a lot of variability, for lots of reasons. And it gets harder.

If major league players are hard to project, what about Japanese league players? The first step in our quixotic attempt to predict the future for this small subset of players is translating their Japanese statistics into a context we can understand. Although we have access to plenty of Japanese statistics, we aren’t quite sure what they mean. For instance, a major league pitcher with a 3.00 ERA last year performed very well indeed. A 3.00 ERA in Nippon Professional Baseball isn’t as good; how good is it?

In attempting to answer that question, we run into many problems. First, the sample of players who play in both NPB and MLB is very small — only a few players each year go in one direction or another. Compare that with the minor leagues; every American-born player who played a major league game in 2006 spent some time in the minors… and we still have trouble projecting minor league players! Also, the sample is biased; players who go from MLB to Japan tend to do so because they weren’t good enough to play in MLB any more, whereas Japanese players come to America because they were too good for the league. Unlike minor league translations, which usually compare players across multiple minor league levels in the same year, Japanese translations rely on comparing performance in separate years, even though much might have changed in the interim.

Aside: Imagine for a moment that Bizarro World Andruw Jones spent 2005 and 2006 (after learning his new batting stance and thus gaining lots of power at the plate) in Japan. We can assume he would have put up astronomical numbers: at least 60-65 home runs, for instance, despite the shortened season. Jones in 2003 and 2004 had a slugging percentage of .500; in 2005 and 2006, he slugged .553. Thus, translating Bizarro Jones’ stats to MLB (by comparing them with 2003-2004 Jones) would overestimate the dampening effect on SLG of MLB by at least 10%, the improvement in his slugging percentage exogenous to league difficulty.
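
The arithmetic in that aside, as a quick Python check. The two MLB slugging percentages are the ones quoted above; the Bizarro NPB slugging percentage is a made-up placeholder, and it cancels out of the final ratio anyway.

```python
slg_2003_04 = 0.500  # Jones before the new stance
slg_2005_06 = 0.553  # Jones after the new stance
bizarro_npb_slg = 0.700  # hypothetical; any positive value gives the same answer

# Factor we would compute by comparing NPB stats with the *earlier* MLB years.
naive_factor = slg_2003_04 / bizarro_npb_slg
# Factor that compares like with like (same version of the hitter).
fair_factor = slg_2005_06 / bizarro_npb_slg

# How much the naive comparison understates his MLB slugging.
understatement = fair_factor / naive_factor - 1
```

The gap works out to about 10.6%, which is where "at least 10%" comes from.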

In addition to all the factors mentioned above, there are factors that are unquantifiable. Japan and the United States are very different places; how much of a player’s failure to hit stems from the relative difference in the level of competition, and how much stems from him not being able to find his favorite foods, from not having anyone in the clubhouse to talk to, from not being able to converse with 99% of the people he meets?

Translations are a crude way of adjusting for these factors in one fell swoop. We know they aren’t very good, but they’re the best we have. Jim Albright came up with these translations for pitchers:

League                hits    homers  walks  strikeouts
Japan                 14624   1545    5832   10963
Majors                15737   1910    6252   9695
Adjustment factors*   1.076   1.236   1.072  0.884

These are matched-innings translations. That is, they assume that the pitchers in both leagues pitched the same number of innings in each case. If Pitcher A gave up 100 hits in 100 innings in Japan, we’d expect him to give up 108 hits in 100 innings in America.
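
A short Python sketch of how the factors fall out of the totals and how they get applied. The totals are Albright's from the table; "Pitcher A" is the hypothetical from the paragraph above, and the function name is my own.

```python
# Matched-innings totals from the table above.
JAPAN = {"H": 14624, "HR": 1545, "BB": 5832, "SO": 10963}
MAJORS = {"H": 15737, "HR": 1910, "BB": 6252, "SO": 9695}

# Adjustment factor = MLB total / NPB total over the same number of innings.
FACTORS = {stat: MAJORS[stat] / JAPAN[stat] for stat in JAPAN}


def translate_pitcher(npb_line):
    """Hold IP constant; stats without a factor pass through unchanged."""
    return {stat: value * FACTORS.get(stat, 1.0)
            for stat, value in npb_line.items()}


# Pitcher A: 100 hits in 100 innings in Japan.
pitcher_a = translate_pitcher({"IP": 100.0, "H": 100})
```

The translated hit total is about 107.6, which rounds to the 108 quoted above.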

Because I’m stubborn, I like to reinvent the wheel. I plan on taking a second look at these translation factors later. For now, they’ll be an easy way of giving us some translated data we can use to take a stab at the question on everyone’s mind: how will Daisuke Matsuzaka do in the major leagues?

My daughter’s first steps

November 6, 2006

No baseball today. Check back tomorrow. In the meantime, here’s a video of my daughter Fiona taking her first steps (without holding onto us) this past weekend.

Report: Igawa to be Posted

November 3, 2006

Interesting stuff. The Daily Yomiuri reports that the Hanshin Tigers will allow star LHP Kei Igawa to follow his dream and jump over to MLB by posting him next week. Igawa is scheduled to start next Tuesday in the NPB/MLB All Star series currently underway in Japan. I expect Hanshin plans to replace their star with free agent Hiroki Kuroda, who the article reports is filing for free agency despite a contract offer of 3 years, 1 billion yen from his current team.

As for Igawa, he’ll probably have plenty of suitors. Along with all the losers in the Matsuzaka sweepstakes, expect the Mariners, Dodgers, and Braves to be interested.

Report: Tigers to let Igawa go
The Daily Yomiuri: Link

Matsuzaka Posted Today

November 2, 2006

Finally. Besides Seattle, the Angels, Orioles, and Giants have all reportedly dropped out of the race. Unfortunately, this means that Matsuzaka is even more likely to end up on the Yankees. Booo! Teams have until 5pm next Wednesday to get their bids in; after that, Seibu has a few more days to decide whether or not it will accept the bid.

mlb.com: Let the Bidding for Matsuzaka Begin
Japan Times: Matsuzaka ‘relieved’ Seibu OKs his request

NPB Player News

November 2, 2006

I’m slammed today, so projections will have to wait. In the meantime…

The Mariners, considered by some to be the favorites for the Daisukster, decided not to bid on him when he is posted. This will come as a surprise to lots of folks – the M’s obviously have had a lot of success with previous imports Ichiro! and Kenji Johjima, so it only made sense that they’d try to go to the well again. What does this mean? A (slightly) lower winning bid, perhaps, and a lot of disappointed fans in Seattle.

So the impetus for this story is an article on Yahoo! Japan; I don’t read Japanese, so I’ll have to take the MLB Trade Rumors folks at their word. The Hanshin Tigers are making a pitch for Kuroda. However, if Kuroda signs with Hanshin, that probably makes Kei Igawa more likely to be posted. See below.

Igawa wants to come over and has for years; he’s a bit of a showman, apparently, and wants to show what he can do in MLB. If Hanshin wants Kuroda, the posting fee they earn from Igawa would pay for a bunch of free agents. Stay tuned.

Japanese Future MLBers: #1 – Daisuke Matsuzaka

November 1, 2006

The only thing in recent memory less surprising than Gary Sheffield’s involvement in a contract dispute is the #1 player on this list, Seibu Lions ace Daisuke Matsuzaka.

Daisuke Matsuzaka, Seibu Lions RHSP

Like as not, you’ve already heard of Matsuzaka. Just in case you haven’t, here’s the scoop: he’s probably the best baseball player in the world not on a major league roster (I can hear the Cubans griping already – calm down, will y’all?). He’s 26 years old and has been dominating Nippon Professional Baseball for eight seasons. 2007 is the final year of his contract with Seibu; rather than lose him to free agency for no return, they plan on using the posting system to sell his contract rights to a major league team.

Matsuzaka stands 6 feet tall and weighs about 185 pounds, so he’s a little smaller than you’d like for a right-handed pitcher, but he has proved fairly durable. The stories of him pitching in the Koshien Tournament are already legendary – he pitched 17 innings and threw 250 pitches one day for the victory, then came out in relief and picked up the save on the next. This year, he’s averaged around 140 pitches per start, albeit pitching on longer rest than he will in America.

Matsuzaka’s statistical record is excellent. He won the Sawamura award with a dominant season in 2001, although the heavy workload he endured that year might have caused him to miss time the following season. These days, he strikes out over a batter per inning, walks few, and doesn’t give up many homers. His statistics have been basically the same as those of Saitoh the past few seasons, but Saitoh came away with the hardware.

Matsuzaka throws a lot of pitches and throws them for strikes. He hit 94 on the gun many times during the World Baseball Classic as he spotted his fastball in and around the strike zone, although he usually sits in the 88-92 range. His slider has a very sharp break, like a yo-yo being snapped backward; he uses the pitch to make batters look foolish. And then there’s the gyroball. Right now, Will Carroll is hiking through the Japanese hinterlands with a crappy camcorder attempting to record footage of this mythical beast.

Buttercup: Westley, what about the G.O.U.S’s?
Westley: Gyroballs of Unusual Size? I don’t believe they exist.

In my opinion, Matsuzaka’s arsenal is exciting enough that it doesn’t need the added hype of a mystery pitch that Matsuzaka himself claims he doesn’t really throw. No, really – his stuff is great. Check the video if you don’t believe me.

Despite his obvious talent, Matsuzaka is anything but a sure thing. He has an awful lot of mileage on his young arm; as hard as it was to witness Francisco Liriano’s elbow pop, imagine if your team had just invested $90 million in him. Also, Matsuzaka might struggle under the microscope of a major media market like New York or Boston. Finally, the effects of culture shock, while probably minor, are still unpredictable at best.

Me? I’m optimistic. I expect 200 IP, 180 K, 45 BB, 25 HR, and a 3.40 ERA from Matsuzaka, and won’t be surprised if he beats that projection. He’s good enough to do it. Jeff Sackmann thinks that will earn him roughly $50 million in guaranteed salary, plus a posting fee of $25 million to Seibu. That seems reasonable enough to me, but again, I wouldn’t be surprised if the number ended up even higher. We’ll know for sure in the next couple of weeks.

Daisuke Matsuzaka 2006 Stats:

 W  L   ERA  GP  GS CG ShO Hld  GF Sv     IP    H   R  ER  HR  BB   SO
17  5  2.13  25  25 14   2   0   0  0  186.1  138  50  44  13  34  200

Daisuke Matsuzaka Career Statistics
Daisuke Matsuzaka Biographical Data
Detect-O-Vision on Matsuzaka