Introduction to Similar Players in MLB

Comparing baseball players against each other is a common exercise. The question is what you actually compare. Their playing styles? The positions they played? The teams they played for? The eras they played in? Each added dimension increases the problem’s complexity dramatically. Several people, Mark Saxon among them, now compare Yasiel Puig against Mike Trout. But how would you actually compare them?

I’m interested in developing an analytical technique that removes the manual labor of poring over statistics. With that tedious task out of the way, it would be interesting to see how players’ careers compare against one another. In the end, I hope to use career similarity as a significant factor in predicting whether a player will end up in the Hall of Fame.

More to come on this topic, but here’s a little tease:

  • Data sourced from Sean Lahman
  • Techniques include:
    • Dynamic Programming
    • Social Network Analysis
    • Logistic Regression
  • Programmed entirely in R and RStudio

The next problem is how to use all of this similarity to figure out which players are actually similar to each other. Keep an eye out for my next blog post on using Social Network Analysis to cluster these players together and on incorporating more than just Games Played.

Christopher Teixeira
Department Chief Engineer

My interests include using my skills for the public good and playing with baseball data.