Getting Started with Baseball Statistical Analysis Using Big Data and Python #pyconjp
Analyzing Baseball Data With Python
PyCon JP 2016 Talk Session 2016/9/22 Shinichi Nakagawa, Ai Makabi
Starting Member
• Who am I?
• Hack→! Hack!!
• (MLBAM) Python & (pitchpx)
• Jupyter lab, pandas, matplotlib
• LT
• VS ※
• @amacbee
Who am I?
• Shinichi Nakagawa(@shinyorke)
• Python ※
• visasQ, Web Engineer
• Pythonista, Agile & Scrum & Kanban coach, Baseball Analyst
Hack! Python (PyCon JP 2015)
http://www.slideshare.net/shinyorke/hackpython-pyconjp
MLBAM Dataset (PITCHf/x)
• MLB Advanced Media (MLBAM): XML data.
• PITCHf/x, Statcast sensor data.
• ( )100 .
• Copyright , OK.
• Github .
• http://gd2.mlb.com/components
Inning 1 data (2016/4/6, lan VS sdn)
http://gd2.mlb.com/components/game/mlb/year_2016/month_04/day_06/gid_2016_04_06_lanmlb_sdnmlb_1/inning/inning_1.xml
※ pitcher="628317"
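The inning file above can be read with the standard library alone. What follows is a minimal sketch, assuming the Gameday layout in which each `<atbat>` element carries a `pitcher` attribute and wraps `<pitch>` elements. The XML fragment is a hand-written stand-in with made-up values, not real game data, and `inning_url` and `pitches_by` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from datetime import date

BASE = "http://gd2.mlb.com/components/game/mlb"


def inning_url(day, gid, inning):
    """Build an inning XML URL following the pattern of the example URL."""
    return (f"{BASE}/year_{day:%Y}/month_{day:%m}/day_{day:%d}"
            f"/{gid}/inning/inning_{inning}.xml")


# Hand-written stand-in for an inning file (all values are made up).
SAMPLE = """
<inning num="1">
  <top>
    <atbat pitcher="111111" batter="222222">
      <pitch pitch_type="FF" start_speed="92.1"/>
    </atbat>
  </top>
  <bottom>
    <atbat pitcher="628317" batter="333333">
      <pitch pitch_type="FF" start_speed="90.5"/>
      <pitch pitch_type="SL" start_speed="83.0"/>
    </atbat>
  </bottom>
</inning>
"""


def pitches_by(xml_text, pitcher_id):
    """Return (pitch_type, start_speed) for every pitch thrown by pitcher_id."""
    root = ET.fromstring(xml_text)
    rows = []
    for atbat in root.iter("atbat"):
        if atbat.get("pitcher") != pitcher_id:
            continue  # at-bat belongs to another pitcher
        for pitch in atbat.iter("pitch"):
            rows.append((pitch.get("pitch_type"),
                         float(pitch.get("start_speed"))))
    return rows


url = inning_url(date(2016, 4, 6), "gid_2016_04_06_lanmlb_sdnmlb_1", 1)
rows = pitches_by(SAMPLE, "628317")
print(url)
print(rows)
```

Filtering on the `pitcher` attribute mirrors the note above: one inning file mixes both teams' pitchers, so the ID is what isolates a single pitcher's pitches.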
Analyzing Baseball Data with R
• MLBAM, Retrosheet (※ Hack)
• R (RStudio, pitchRx)
• https://www.amazon.co.jp/dp/B00GBC36S4
pitchpx - Getting MLB dataset
• Converts MLBAM XML to CSV.
• Counterpart of pitchRx (R), by @shinyorke.
• Python 3.3.x ( :Legacy Python!!!).
• Uses BeautifulSoup (XML parsing) and click (command-line interface).
• Available on PyPI.
• Works with Jupyter and pandas.
$ # Python 3.3 ( Python 3.4 )
$ pip install pitchpx
$ # 2015/8/1-8/12
$ pitchpx -s 20150801 -e 20150812 -o .
• Jupyter + pandas + matplotlib, the PyData stack.
• Jupyter lab.
• Data INPUT/OUTPUT: pandas.
• Plotting: matplotlib, seaborn.
• Qiita.
Yu Darvish (2013, 2016)
• TJ (Tommy John surgery) ( etc…)
• Before TJ (2013): slider (SL), cutter (FC)
• After TJ (2016): four-seam (FF), two-seam (FT), SL, FC
• !?
2013 (before TJ) 2016 (after TJ)
※ 'CH': 'Change-up', 'CU': 'Curveball', 'EP': 'Eephus', 'FA': 'Fastball', 'FC': 'Cut Fastball', 'FF': 'Four-seam Fastball', 'FO': 'Forkball', 'FS': 'Split-finger Fastball', 'FT': 'Two-seam Fastball', 'KC': 'Knuckle Curve', 'KN': 'Knuckleball', 'SC': 'Screwball', 'SI': 'Sinker', 'SL': 'Slider', 'UN': 'Unknown'
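A season-to-season comparison like this one comes down to the share of each pitch_type per season. Below is a minimal sketch using only the standard library and made-up pitch sequences, not Darvish's real data. With pandas the same tally would typically be `Series.value_counts(normalize=True)`. `PITCH_NAMES` is a subset of the legend above, and `pitch_mix` is a hypothetical helper name.

```python
from collections import Counter

# Subset of the pitch_type legend.
PITCH_NAMES = {
    "FF": "Four-seam Fastball",
    "FT": "Two-seam Fastball",
    "SL": "Slider",
    "FC": "Cut Fastball",
    "CU": "Curveball",
}


def pitch_mix(pitch_types):
    """Return {pitch name: share of all pitches}, most frequent first."""
    counts = Counter(pitch_types)
    total = sum(counts.values())
    return {PITCH_NAMES.get(code, code): n / total
            for code, n in counts.most_common()}


# Made-up samples for two seasons (illustrative only).
season_a = ["SL", "SL", "FC", "FF", "SL", "FC", "CU", "FF"]
season_b = ["FF", "FF", "FT", "SL", "FF", "FT", "FC", "FF"]

mix_a = pitch_mix(season_a)
mix_b = pitch_mix(season_b)

# Print each pitch's share in season A next to its share in season B.
for name in mix_b:
    print(f"{name:20s} {mix_a.get(name, 0.0):6.1%} -> {mix_b[name]:6.1%}")
```

Normalizing to shares rather than raw counts keeps two seasons with different pitch totals comparable.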
8
• http://www.sponichi.co.jp/baseball/news/2016/08/03/kiji/K20160803013085860.html
• ( ) ( )
• ?
5 7 8
• ( DL )
• 5-7 8
• 4
2016/5-7 2016/8
( )
•
51
http://blog.livedoor.jp/nanjstu/archives/49340618.html
• 3,000
Ichiro VS Votto (2016)
• Four-seam (FF)
• Two-seam (FT) ( )
• Ichiro (42), Votto (33) !?
Ichiro Suzuki Joey Votto
• ( 165cm) 2
• … (SL).
• ?!?!?!?
• !?( )
Jose Altuve (26) Kris Bryant (24)
VS Part2 (Hit Zone)
• X (start_speed, mph), Y , hit type (pa_event_cd)
• pa_event_cd: single (20), double (21), triple (22), home run (23) ※Retrosheet
• pitch_type orz
Ichiro Suzuki Joey Votto
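The Hit Zone view keeps only plate appearances whose pa_event_cd is one of Retrosheet's hit codes (20 single, 21 double, 22 triple, 23 home run) and then looks at start_speed. The following is a minimal sketch of that filter. The rows are made up, `HIT_CODES` and `hits_by_speed_band` are hypothetical names, and the 5 mph banding is an assumption for illustration, not the binning used in the slides.

```python
from collections import defaultdict

# Retrosheet event codes that count as hits.
HIT_CODES = {20: "single", 21: "double", 22: "triple", 23: "home run"}

# Made-up rows: (start_speed in mph, pa_event_cd). Illustrative only.
rows = [
    (93.4, 20), (88.1, 2), (91.0, 23), (84.7, 21),
    (95.2, 3), (86.3, 20), (92.8, 14), (89.9, 22),
]


def hits_by_speed_band(rows, width=5):
    """Group hit events by pitch-speed band of the given width (mph)."""
    bands = defaultdict(list)
    for speed, event_cd in rows:
        if event_cd not in HIT_CODES:
            continue  # not a hit (outs, strikeouts, walks, ...)
        low = int(speed // width) * width
        bands[(low, low + width)].append(HIT_CODES[event_cd])
    return dict(sorted(bands.items()))


bands = hits_by_speed_band(rows)
for (low, high), events in bands.items():
    print(f"{low}-{high} mph: {events}")
```

Running the same function once per batter gives two tables that can be compared side by side, one per hitter.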
- -
http://sportsworld.nbcsports.com/bill-james-statistical-revolution/
• Kawasaki.rb #kwskrb http://kawasakirb.github.io/ (PyCon JP 2014, 2016)
• BPStudy #bpstudy http://bpstudy.connpass.com/ (BPStudy Hack)