Strike Zone Detection Model
By Taiko Ibuki
Architecture: YOLOv8-medium (yolov8m)
Task: Object detection (4 classes: ball, batter, pitcher, strike_zone)
- Model Description
This object detection model identifies and localizes key elements of a baseball pitch to automate ball-strike calls. Built on YOLOv8, it is fine-tuned on a previously trained Roboflow dataset to detect four classes: ball, batter, pitcher, and strike_zone. The model is intended to help objectively evaluate pitch calls and can be used in broadcast overlay systems or umpire analytics platforms.
Architecture: YOLOv8-medium (yolov8m)
Training basis: Fine-tuned from ROBO ump (Roboflow Universe)
Task: Object detection (4 classes)
Target mAP@50: > 0.85
Achieved mAP@50: 0.92
Intended use cases:
Broadcast overlay systems displaying a real-time, model-driven strike zone visualization
Umpire analytics platforms comparing model calls vs. official calls per game
Research and development for automated ball-strike officiating systems
- Training Data
Dataset Source
Base dataset: ROBO ump (Roboflow Universe, accessed 2025)
https://universe.roboflow.com/toasty-workspace/roboump
Platform: Roboflow Universe (roboflow.com/universe)
Collection: Broadcast MLB footage (center-field fixed-angle camera)
Resolution: 640 × 640 px (resized)
Class Distribution
ball: ~551 images → 70 / 20 / 10 split
batter: ~774 images → 70 / 20 / 10 split
pitcher: ~774 images → 70 / 20 / 10 split
strike_zone: ~300 images → 70 / 20 / 10 split
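The 70 / 20 / 10 split above can be reproduced with a simple shuffle-and-slice. This is a minimal sketch, not the exact procedure used (Roboflow handles splitting internally); the seed and use of integer IDs are illustrative.

```python
import random

def split_dataset(items, train=0.7, val=0.2, seed=42):
    """Shuffle a list of image IDs and slice it into train/val/test.

    The test fraction is whatever remains after train and val (here 10%).
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# Example with the ~551 ball images from the class distribution above.
train_set, val_set, test_set = split_dataset(range(551))
print(len(train_set), len(val_set), len(test_set))  # 385 110 56
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing training configurations.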
Annotation Process
The base dataset provided semi-automated annotations for ball, batter, and pitcher. Strike zone annotations were not included and were added entirely through manual labeling: approximately 300 strike zones were annotated in Roboflow, defined as the rectangular region from the batter's knees to the midpoint of the torso. A 10% quality review found false positives in the ball class (crowd balls, logos, advertising), and around 5% of ball annotations were removed. Roughly 3 hours of manual correction work went into tightening boxes and adding missed detections across all classes.
Data Augmentation
Horizontal flip (50% probability)
Color augmentation and saturation jitter
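A horizontal flip must also mirror the box labels. The sketch below shows how a flip transforms YOLO-format labels (class, cx, cy, w, h, all normalized to 0-1); Ultralytics applies this internally during training, so this is purely illustrative.

```python
import random

def hflip_label(box):
    """Mirror a YOLO-format box (class, cx, cy, w, h; normalized 0-1)
    across the vertical image axis: only the x-center changes."""
    cls, cx, cy, w, h = box
    return (cls, 1.0 - cx, cy, w, h)

def maybe_hflip(image_boxes, p=0.5, rng=random):
    """Apply the flip with probability p (50% in this model's config)."""
    if rng.random() < p:
        return [hflip_label(b) for b in image_boxes]
    return image_boxes

# A ball box centered at cx=0.25 moves to cx=0.75 after a flip.
flipped = hflip_label((0, 0.25, 0.55, 0.02, 0.02))
print(flipped)  # (0, 0.75, 0.55, 0.02, 0.02)
```

Only the x-center is affected because a horizontal mirror preserves box width, height, and vertical position.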
- Training Procedure
Framework: Ultralytics
Hardware: T4 GPU
Batch size: Default (auto-tuned)
Epochs: 50
Image size: 640
Early stopping: Not applied
Preprocessing: Auto-resize to 640 × 640, normalization
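The training setup above corresponds to a standard Ultralytics fine-tuning run. This is a configuration sketch only, assuming the Ultralytics package is installed and a Roboflow export with a dataset config named `roboump.yaml` (the filename is illustrative); it is not runnable without the dataset and a GPU.

```python
# Sketch of the fine-tuning run described above (illustrative, not the
# exact script used). Assumes the Ultralytics package and a Roboflow
# export with a dataset config at "roboump.yaml".
from ultralytics import YOLO

model = YOLO("yolov8m.pt")   # start from pretrained yolov8m weights
model.train(
    data="roboump.yaml",     # 4 classes: ball, batter, pitcher, strike_zone
    epochs=50,               # as listed above
    imgsz=640,               # auto-resize to 640 x 640
    fliplr=0.5,              # horizontal flip, 50% probability
)
```

Batch size is left unset so Ultralytics auto-tunes it, matching the "Default (auto-tuned)" setting above.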
- Evaluation Results
Overall Metrics
mAP@50: 0.92 (target > 0.85)
Overall F1: 0.91 (target >= 0.80)
Per-Class Breakdown
ball: mAP@50 = 0.72, F1 = 0.76
batter: mAP@50 = 0.97, F1 = 0.95
pitcher: mAP@50 = 0.97, F1 = 0.84
strike_zone: mAP@50 = 0.82, F1 = 0.88
Ball: 69 correctly predicted, 33 missed as background (~32% miss rate)
Batter: 145/145 correct
Pitcher: 145/145 correct
Strike_zone: 44/47 correct, 3 missed as background
Background false positives: 9 labeled as ball, 4 as strike_zone
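The confusion counts above determine precision, recall, and F1 directly: background false positives count as FP, and objects missed as background count as FN. A minimal sketch for the ball class (note the reported F1 of 0.76 was measured at a specific confidence threshold, so it differs slightly from this raw computation):

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Ball: 69 hits, 9 background false positives, 33 misses (from above).
p, r, f1 = prf1(tp=69, fp=9, fn=33)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.88 0.68 0.77
```

The low recall (0.68) rather than precision is what drags the ball class down, consistent with misses on small, motion-blurred balls.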
Performance Analysis
The overall mAP@50 of 0.92 exceeds the 0.85 target and F1 of 0.91 clears the 0.80 threshold. Batter and pitcher detection is near-perfect, reflecting their large consistent silhouettes. Training and validation loss curves show steady convergence with no signs of overfitting.
The ball class is the most critical failure point. With 33 of 102 balls missed (~32% miss rate) and a mAP@50 of 0.72, roughly one in three pitches in flight is missed or incorrectly detected. Since ball is the most important class for pitch classification, this gap matters more than the strong overall score suggests. The strike_zone class achieved F1 = 0.88 despite having the fewest training instances, though its mAP@50 of 0.82 falls just short of the 0.85 target.

- Limitations and Biases
Known Failure Cases
Ball near white uniforms: Low contrast causes false negatives when the ball overlaps light-colored jerseys
Strike zone partially off-frame: The bounding box is truncated when the batter stands near the frame edge
Occlusion: When the catcher, batter, or pitcher obscures the ball at the plate, recall drops significantly
Low-light or compressed frames: JPEG artifacts confuse the detector, especially for the small ball class
Broadcast pitch graphics: After a pitch is completed, the broadcast overlays a graphic showing where the ball crossed the plate; these graphics were a source of false positives
Poor-Performing Class: Ball The ball class is the weakest performer (mAP@50 = 0.72) because the ball is typically only 10-20 pixels across at broadcast resolution and moves at 80-100 mph, causing motion blur in standard 30fps footage. Background crowd balls and logos also created false positives requiring manual cleanup.
Contextual Limitations
Fixed camera angle required: The model only works on center-field broadcast footage and will not generalize to other angles
No 3D depth: A 2D bounding box cannot confirm a pitch actually crossed through the zone, only that it was near it
Strike zone variability: The zone was manually estimated per frame and varies by batter height, introducing annotator variability
Strike zone detection accuracy: The model likely learned to detect the catcher as a proxy for the strike zone rather than the true zone boundary. The bounding box therefore does not represent exact strike zone measurements, making the model unsuitable for precise ball-strike determination without additional post-processing or a separate zone estimation method
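The post-processing mentioned above could start from a simple 2D containment test between the detected ball center and the strike_zone box. This is a hypothetical sketch, not part of the model: it inherits every limitation listed here, in particular the lack of depth information, so it can only flag candidate strikes.

```python
def ball_in_zone(ball_box, zone_box):
    """Return True if the ball's center lies inside the strike_zone box.

    Boxes are (x1, y1, x2, y2) in pixels. This is a 2D heuristic only:
    it cannot confirm the pitch actually crossed the plate in depth.
    """
    bx = (ball_box[0] + ball_box[2]) / 2
    by = (ball_box[1] + ball_box[3]) / 2
    x1, y1, x2, y2 = zone_box
    return x1 <= bx <= x2 and y1 <= by <= y2

zone = (300, 200, 360, 300)                      # illustrative strike_zone detection
print(ball_in_zone((325, 240, 340, 255), zone))  # True: center inside the zone
print(ball_in_zone((250, 240, 265, 255), zone))  # False: center left of the zone
```

Using the box center rather than box overlap avoids counting pitches that merely graze the zone edge, but any production use would still need per-frame tracking and a depth estimate.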
Inappropriate Use Cases
Real-time umpire replacement: Ball detection at 0.72 mAP@50 is not reliable enough for official game calls
Non-broadcast footage: Amateur, youth league, or non-MLB footage is untested and likely to underperform
Non-standard camera setups: Any angle other than center-field broadcast will produce unreliable results

