Strike Zone Detection Model
By Taiko Ibuki
Architecture: YOLOv8-medium (yolov8m)
Task: Object detection (4 classes: ball, batter, pitcher, strike_zone)
- Model Description
This object detection model identifies and localizes key elements of a baseball pitch to automate ball-strike calls. Built on YOLOv8, it is fine-tuned on a previously trained Roboflow dataset to detect four classes: ball, batter, pitcher, and strike_zone. The model is intended to help objectively evaluate pitch calls and can be used in broadcast overlay systems or umpire analytics platforms.
Architecture: YOLOv8-medium (yolov8m)
Training basis: Fine-tuned from ROBO ump (Roboflow Universe)
Task: Object detection (4 classes)
Target mAP@50: > 0.85
Achieved mAP@50: 0.92
Intended use cases:
Broadcast overlay systems displaying a real-time, model-driven strike zone visualization
Umpire analytics platforms comparing model calls vs. official calls per game
Research and development for automated ball-strike officiating systems
- Training Data
Dataset Source
Base dataset: ROBO ump (Roboflow Universe, accessed 2025)
https://universe.roboflow.com/toasty-workspace/roboump
Platform: Roboflow Universe (roboflow.com/universe)
Collection: Broadcast MLB footage (center-field fixed-angle camera)
Resolution: 640 × 640 px (resized)
Class Distribution
ball: ~551 images → 70 / 20 / 10 split
batter: ~774 images → 70 / 20 / 10 split
pitcher: ~774 images → 70 / 20 / 10 split
strike_zone: ~300 images → 70 / 20 / 10 split
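The 70 / 20 / 10 split above can be reproduced with a simple shuffle-and-slice. This is a minimal sketch, not the exact procedure used (Roboflow handles splitting internally); the seed and use of integer IDs are illustrative.

```python
import random

def split_dataset(items, train=0.7, val=0.2, seed=42):
    """Shuffle a list of image IDs and slice it into train/val/test.

    The test fraction is whatever remains after train and val (here 10%).
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# Example with the ~551 ball images from the class distribution above.
train_set, val_set, test_set = split_dataset(range(551))
print(len(train_set), len(val_set), len(test_set))  # 385 110 56
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing training configurations.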
Annotation Process
The base dataset provided semi-automated annotations for ball, batter, and pitcher. Strike zone annotations were not included and were added entirely through manual labeling: approximately 300 strike zones were annotated in Roboflow, defined as the rectangular region from the batter's knees to the midpoint of the torso. A 10% quality review found false positives in the ball class (crowd balls, logos, advertising), and around 5% of ball annotations were removed. Roughly 3 hours of manual correction work went into tightening boxes and adding missed detections across all classes.
Data Augmentation
Horizontal flip (50% probability)
Color augmentation and saturation jitter
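A horizontal flip must also mirror the box labels. The sketch below shows how a flip transforms YOLO-format labels (class, cx, cy, w, h, all normalized to 0-1); Ultralytics applies this internally during training, so this is purely illustrative.

```python
import random

def hflip_label(box):
    """Mirror a YOLO-format box (class, cx, cy, w, h; normalized 0-1)
    across the vertical image axis: only the x-center changes."""
    cls, cx, cy, w, h = box
    return (cls, 1.0 - cx, cy, w, h)

def maybe_hflip(image_boxes, p=0.5, rng=random):
    """Apply the flip with probability p (50% in this model's config)."""
    if rng.random() < p:
        return [hflip_label(b) for b in image_boxes]
    return image_boxes

# A ball box centered at cx=0.25 moves to cx=0.75 after a flip.
flipped = hflip_label((0, 0.25, 0.55, 0.02, 0.02))
print(flipped)  # (0, 0.75, 0.55, 0.02, 0.02)
```

Only the x-center is affected because a horizontal mirror preserves box width, height, and vertical position.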
- Training Procedure
Framework: Ultralytics
Hardware: T4 GPU
Batch size: Default (auto-tuned)
Epochs: 50
Image size: 640
Early stopping: Not applied
Preprocessing: Auto-resize to 640 × 640, normalization
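The training setup above corresponds to a standard Ultralytics fine-tuning run. This is a configuration sketch only, assuming the Ultralytics package is installed and a Roboflow export with a dataset config named `roboump.yaml` (the filename is illustrative); it is not runnable without the dataset and a GPU.

```python
# Sketch of the fine-tuning run described above (illustrative, not the
# exact script used). Assumes the Ultralytics package and a Roboflow
# export with a dataset config at "roboump.yaml".
from ultralytics import YOLO

model = YOLO("yolov8m.pt")   # start from pretrained yolov8m weights
model.train(
    data="roboump.yaml",     # 4 classes: ball, batter, pitcher, strike_zone
    epochs=50,               # as listed above
    imgsz=640,               # auto-resize to 640 x 640
    fliplr=0.5,              # horizontal flip, 50% probability
)
```

Batch size is left unset so Ultralytics auto-tunes it, matching the "Default (auto-tuned)" setting above.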
- Evaluation Results
Overall Metrics
mAP@50: 0.92 (target > 0.85)
Overall F1: 0.91 (target >= 0.80)
Per-Class Breakdown
ball: mAP@50 = 0.72, F1 = 0.76
batter: mAP@50 = 0.97, F1 = 0.95
pitcher: mAP@50 = 0.97, F1 = 0.84
strike_zone: mAP@50 = 0.82, F1 = 0.88
Ball: 69 correctly predicted, 33 missed as background (~32% miss rate)
Batter: 145/145 correct
Pitcher: 145/145 correct
Strike_zone: 44/47 correct, 3 missed as background
Background false positives: 9 labeled as ball, 4 as strike_zone
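The confusion counts above determine precision, recall, and F1 directly: background false positives count as FP, and objects missed as background count as FN. A minimal sketch for the ball class (note the reported F1 of 0.76 was measured at a specific confidence threshold, so it differs slightly from this raw computation):

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Ball: 69 hits, 9 background false positives, 33 misses (from above).
p, r, f1 = prf1(tp=69, fp=9, fn=33)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.88 0.68 0.77
```

The low recall (0.68) rather than precision is what drags the ball class down, consistent with misses on small, motion-blurred balls.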
Performance Analysis
The overall mAP@50 of 0.92 exceeds the 0.85 target and F1 of 0.91 clears the 0.80 threshold. Batter and pitcher detection is near-perfect, reflecting their large consistent silhouettes. Training and validation loss curves show steady convergence with no signs of overfitting.
The ball class is the most critical failure point. With 33 of 102 balls missed (~32% miss rate) and a mAP@50 of 0.72, roughly one in three pitches in flight is missed or incorrectly detected. Since ball is the most important class for pitch classification, this gap matters more than the strong overall score suggests. The strike_zone class achieved F1 = 0.88 despite having the fewest training instances, though its mAP@50 of 0.82 falls just short of the 0.85 target.

- Limitations and Biases
Known Failure Cases
Ball near white uniforms: Low contrast causes false negatives when the ball overlaps light-colored jerseys
Strike zone partially off-frame: The bounding box is truncated when the batter stands near the frame edge
Occlusion: When the catcher, batter, or pitcher obscures the ball at the plate, recall drops significantly
Low-light or compressed frames: JPEG artifacts confuse the detector, especially for the small ball class
Broadcast pitch graphics: After a pitch is completed, the broadcast overlays a graphic showing where the ball crossed the plate; these graphics were a source of false positives
Poor-Performing Class: Ball The ball class is the weakest performer (mAP@50 = 0.72) because the ball is typically only 10-20 pixels across at broadcast resolution and moves at 80-100 mph, causing motion blur in standard 30fps footage. Background crowd balls and logos also created false positives requiring manual cleanup.
Contextual Limitations
Fixed camera angle required: The model only works on center-field broadcast footage and will not generalize to other angles
No 3D depth: A 2D bounding box cannot confirm a pitch actually crossed through the zone, only that it was near it
Strike zone variability: The zone was manually estimated per frame and varies by batter height, introducing annotator variability
Strike zone detection accuracy: The model likely learned to detect the catcher as a proxy for the strike zone rather than the true zone boundary. The bounding box therefore does not represent exact strike zone measurements, making the model unsuitable for precise ball-strike determination without additional post-processing or a separate zone estimation method
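The post-processing mentioned above could start from a simple 2D containment test between the detected ball center and the strike_zone box. This is a hypothetical sketch, not part of the model: it inherits every limitation listed here, in particular the lack of depth information, so it can only flag candidate strikes.

```python
def ball_in_zone(ball_box, zone_box):
    """Return True if the ball's center lies inside the strike_zone box.

    Boxes are (x1, y1, x2, y2) in pixels. This is a 2D heuristic only:
    it cannot confirm the pitch actually crossed the plate in depth.
    """
    bx = (ball_box[0] + ball_box[2]) / 2
    by = (ball_box[1] + ball_box[3]) / 2
    x1, y1, x2, y2 = zone_box
    return x1 <= bx <= x2 and y1 <= by <= y2

zone = (300, 200, 360, 300)                      # illustrative strike_zone detection
print(ball_in_zone((325, 240, 340, 255), zone))  # True: center inside the zone
print(ball_in_zone((250, 240, 265, 255), zone))  # False: center left of the zone
```

Using the box center rather than box overlap avoids counting pitches that merely graze the zone edge, but any production use would still need per-frame tracking and a depth estimate.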
Inappropriate Use Cases
Real-time umpire replacement: Ball detection at 0.72 mAP@50 is not reliable enough for official game calls
Non-broadcast footage: Amateur, youth league, or non-MLB footage is untested and likely to underperform
Non-standard camera setups: Any angle other than center-field broadcast will produce unreliable results

