Canadians in College Baseball
Introduction
The 2023-2024 school year marks the fourth season I have worked with the Canadian Baseball Network (CBN) to help identify Canadians who are playing college baseball at the NCAA (Div. I, II and III), NAIA, NJCAA (Div. I, II and III), CCCAA, NWAC and USCAA levels.
Bob Elliott, the site’s Editor in Chief, has known my dad for a long time, as he covered my dad’s college football team at the University of Ottawa. My dad connected us due to my interest in working in the baseball industry and Bob’s extensive career working as a baseball reporter.
Bob told me that he maintained a list of Canadians playing college baseball and that it was a very tedious process for him as he checked each college’s roster to see if any players’ hometowns were in Canada. He recognized that this process needed to be improved and asked me if I knew anything about web scraping. So, my work with the CBN began.
Canadians in College
The main focus of the project was writing code that would accurately identify the Canadian players who are playing at the 1,700+ college baseball programs in the country.
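The hometown check at the heart of this step can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function name `is_canadian_hometown`, the `CANADIAN_REGIONS` set, and the assumption that rosters list hometowns as "City, Region" are all hypothetical.

```python
import re
import unicodedata

# Hypothetical set of province/territory spellings a roster might use.
# Real rosters are less consistent, so this list is an assumption.
CANADIAN_REGIONS = {
    "canada",
    "ab", "alta", "alberta",
    "bc", "british columbia",
    "mb", "man", "manitoba",
    "nb", "new brunswick",
    "nl", "newfoundland", "newfoundland and labrador",
    "ns", "nova scotia",
    "nt", "northwest territories",
    "nu", "nunavut",
    "on", "ont", "ontario",
    "pe", "pei", "prince edward island",
    "qc", "que", "quebec",
    "sk", "sask", "saskatchewan",
    "yt", "yukon",
}


def is_canadian_hometown(hometown: str) -> bool:
    """Return True if the region part of 'City, Region' looks Canadian."""
    # Some rosters append the high school after a slash; keep only the place.
    place = hometown.split("/")[0]
    parts = [p.strip() for p in place.split(",") if p.strip()]
    if len(parts) < 2:
        return False  # no region given, so do not guess
    # Strip accents and punctuation: "Québec" -> "quebec", "B.C." -> "bc".
    ascii_region = (
        unicodedata.normalize("NFKD", parts[-1])
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    region = re.sub(r"[^a-z ]", "", ascii_region.lower()).strip()
    return region in CANADIAN_REGIONS
```

Comparing only the final, comma-separated token is a deliberate choice in this sketch: it keeps a hometown such as "Ontario, Calif." from being flagged just because the city name matches a province.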
Since my background is in software development, I was not satisfied just emailing Mr. Elliott this list each week. I wanted to create an end product that required no work on the CBN’s end. Thus, the deliverable is a formatted Google Sheet that is embedded on the CBN website. See the spreadsheet here.
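One way to prepare data for such a sheet is to flatten the player records into a header row plus one row per player, since the Google Sheets API takes cell values as a list of rows. The sketch below covers only that step and leaves the upload call out. The `COLUMNS` order, the field names, and the `to_sheet_rows` helper are assumptions for illustration, not the project's real schema.

```python
# Hypothetical column order for the published sheet.
COLUMNS = ["name", "position", "school", "hometown"]


def to_sheet_rows(players):
    """Turn a list of player dicts into a 2D list: header first, then rows."""
    header = [column.title() for column in COLUMNS]
    # Sort by school, then name, so the sheet has a stable order between runs.
    ordered = sorted(
        players,
        key=lambda p: (p.get("school", ""), p.get("name", "")),
    )
    # Missing fields become empty cells instead of raising an error.
    body = [[str(p.get(column, "")) for column in COLUMNS] for p in ordered]
    return [header] + body
```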
Canadians in College Stats
Once the Canadians in College part of the project was running smoothly, an obvious expansion emerged: collecting player stats. After all, what is a baseball project without stats?
Since not every school’s website keeps a stats page that can be scraped, I determined that I would need to get the stats from the website of each league (NCAA, NAIA, NJCAA, etc.).
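Pulling stats from league sites means the league's rows have to be matched back to the Canadians already identified. A minimal sketch of that matching step is below, assuming both sources have been reduced to dicts with `name` and `school` keys. The `normalize` and `attach_stats` helpers and the field names are hypothetical, and real matching would likely need to handle nicknames and school-name variants as well.

```python
def normalize(text: str) -> str:
    """Lowercase, drop periods, and collapse whitespace for comparison."""
    return " ".join(text.lower().replace(".", "").split())


def attach_stats(canadians, league_rows):
    """Attach a league stats row to each player, matched on name and school."""
    stats_by_key = {
        (normalize(row["name"]), normalize(row["school"])): row
        for row in league_rows
    }
    merged = []
    for player in canadians:
        key = (normalize(player["name"]), normalize(player["school"]))
        row = stats_by_key.get(key)
        stats = None  # stays None when the league site has no matching row
        if row is not None:
            stats = {k: v for k, v in row.items() if k not in ("name", "school")}
        merged.append({**player, "stats": stats})
    return merged
```

Keeping unmatched players in the output with `stats` set to `None` makes gaps visible, which is useful when checking data quality by hand.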
As with Canadians in College, the goal here was to create a formatted Google Sheet that could be embedded on the CBN website. See the spreadsheet here.
All-Canadian Ballot
At the end of each season, Mr. Elliott sends out a ballot to qualified voters to determine which players will be named to the CBN’s All-Canadian team.
Since I collect each Canadian player’s stats throughout the season, I wrote a script that generates a new Google Sheet that separates the players by position and displays their stats.
Once I share the spreadsheet with Mr. Elliott, he removes players who did not play much or whose stats are below average. He then distributes a read-only copy of the spreadsheet to the voters, so they can place their votes with all of the eligible candidates and their statistics in front of them.
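The position grouping behind the ballot sheet can be sketched in a few lines. This is an illustration under assumptions, not the actual script: the `group_by_position` helper and the `position` and `name` keys are hypothetical, and writing each group to its own part of the sheet is left out.

```python
from collections import defaultdict


def group_by_position(players):
    """Group player dicts by position, with positions and names sorted."""
    groups = defaultdict(list)
    for player in players:
        # Players without a listed position land in an "Unknown" bucket.
        groups[player.get("position", "Unknown")].append(player)
    return {
        position: sorted(group, key=lambda p: p["name"])
        for position, group in sorted(groups.items())
    }
```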
Summary
Overall, this project has been a great opportunity to take a real-world problem and build a technical solution to help solve it. It has furthered my skills in a number of areas, including:
- Python coding
- Web Scraping
- Automation
- Data Quality
- Google Sheets API
- Customer Service (the CBN and its users as the customers)
I look forward to continuing to partner with the CBN and helping them inform their readers with accurate information and statistics.
Links
For more technical details of the project, visit the GitHub Repository.