Ultralytics YOLO
State-of-the-art real-time object detection — YOLOv8/YOLOv9/YOLOv10/YOLO11 models for detection, segmentation, pose estimation, and classification. Key features: the YOLO class with pretrained models (yolov8n.pt through yolov8x.pt), model.predict() for inference, model.train() for fine-tuning, model.val() for evaluation, model.export() to ONNX/TensorRT/CoreML/TFLite, multi-task models (detect/segment/classify/pose), results with bounding boxes, masks, keypoints, and confidence scores, video/webcam/stream inference, and a Python API plus CLI. 1-50 ms inference per image on GPU. The most popular object detection library, with 30M+ downloads.
Score Breakdown
⚙ Agent Friendliness
🔒 Security
Local inference — no data sent externally. Pretrained weights are downloaded from GitHub releases over HTTPS. AGPL license compliance is the primary risk for commercial agent deployments. Ultralytics analytics/crash-report sync can be disabled via the settings object, e.g. settings.update({'sync': False}).
⚡ Reliability
Best When
Building real-time agent vision systems that need to detect, segment, or estimate poses in images/video with GPU-accelerated inference — Ultralytics YOLO provides the best accuracy/speed tradeoff with the simplest training/deployment workflow in Python.
Avoid When
You need maximum accuracy (at the cost of speed), research-level architectural flexibility, or are detecting very small objects in large images, where YOLO's single-stage, heavily downsampled feature maps struggle.
Use Cases
- Agent object detection — from ultralytics import YOLO; model = YOLO('yolov8n.pt'); results = model.predict('image.jpg', conf=0.5) — detects 80 COCO classes in an image; the agent perceives objects with bounding boxes and confidence scores; the nano model runs at ~1 ms/image on GPU; results[0].boxes gives xyxy coordinates
- Agent real-time detection — model = YOLO('yolov8s.pt'); results = model.predict(source='0', stream=True, show=True) — webcam detection stream; the agent vision interface processes real-time video; stream=True yields Results objects frame by frame without memory accumulation
- Agent custom object detection — model = YOLO('yolov8n.pt'); model.train(data='custom.yaml', epochs=100, imgsz=640) — fine-tunes YOLOv8 on agent-specific objects; custom.yaml defines class names and dataset paths; 100 epochs of transfer learning from pretrained weights
- Agent instance segmentation — model = YOLO('yolov8n-seg.pt'); results = model.predict('image.jpg'); masks = results[0].masks.data — pixel-level segmentation masks alongside bounding boxes; the agent understands object shapes, not just locations; the -seg suffix selects segmentation variants
- Agent model export — model = YOLO('yolov8n.pt'); model.export(format='onnx'); model.export(format='engine') — export to ONNX for cross-platform deployment, or TensorRT for NVIDIA GPU optimization; detection runs 5-10x faster with TensorRT than with PyTorch; export once, deploy everywhere
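The detection use cases above all end the same way: the agent pulls boxes, confidences, and class ids off results[0].boxes and filters them before acting. A minimal library-free sketch of that post-processing step — the class names, thresholds, and raw arrays here are hypothetical stand-ins for what an agent would read after calling .cpu().numpy():

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str    # class name, e.g. "person"
    conf: float   # confidence score in [0, 1]
    xyxy: tuple   # (x1, y1, x2, y2) pixel coordinates

def filter_detections(boxes, confs, class_ids, names, min_conf=0.5, keep=None):
    """Turn raw box/confidence/class arrays into typed detections,
    dropping low-confidence hits and, optionally, unwanted classes."""
    out = []
    for xyxy, conf, cid in zip(boxes, confs, class_ids):
        label = names[int(cid)]
        if conf < min_conf:
            continue
        if keep is not None and label not in keep:
            continue
        out.append(Detection(label=label, conf=float(conf), xyxy=tuple(xyxy)))
    # Highest-confidence detections first, so the agent acts on the best hits.
    return sorted(out, key=lambda d: d.conf, reverse=True)

# Hypothetical raw outputs for one image (COCO-style class ids).
names = {0: "person", 2: "car"}
boxes = [(10, 10, 50, 80), (100, 40, 180, 90), (5, 5, 8, 8)]
confs = [0.91, 0.62, 0.30]
cids = [0, 2, 0]
dets = filter_detections(boxes, confs, cids, names, min_conf=0.5, keep={"person"})
```

With the real library, boxes/confs/cids would come from results[0].boxes.xyxy, .conf, and .cls, and names from model.names.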
Not For
- Non-real-time, maximum-accuracy detection — YOLO prioritizes speed; for maximum accuracy, use two-stage detectors (Faster R-CNN, DETR), accepting roughly 10x slower inference
- Research on novel detection architectures — YOLO is production-focused; for research flexibility, use detectron2 or DETR directly
- Counting/tracking across frames — YOLO detection is single-frame; use the trackers bundled with ultralytics via model.track() (BoT-SORT, ByteTrack) for object tracking across video frames
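For intuition on the tracking point above: trackers like SORT and ByteTrack associate each frame's detections with the previous frame's tracks, typically via IoU matching. A toy sketch of that association idea in plain Python (greedy matching only — a deliberate simplification; real agents should call model.track(), which handles motion prediction and re-identification):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

class GreedyTracker:
    """Assign stable integer IDs to boxes across frames by greedily
    matching each box to the best-overlapping track from the last frame."""
    def __init__(self, iou_thresh=0.3):
        self.iou_thresh = iou_thresh
        self.next_id = 0
        self.tracks = {}  # track id -> last seen box

    def update(self, boxes):
        assigned = {}
        free = dict(self.tracks)  # tracks not yet matched this frame
        for box in boxes:
            best_id, best_iou = None, self.iou_thresh
            for tid, prev in free.items():
                score = iou(box, prev)
                if score > best_iou:
                    best_id, best_iou = tid, score
            if best_id is None:          # no overlap: start a new track
                best_id = self.next_id
                self.next_id += 1
            else:                        # matched: consume the track
                free.pop(best_id)
            assigned[best_id] = box
        self.tracks = assigned
        return assigned

tracker = GreedyTracker()
frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame2 = tracker.update([(1, 1, 11, 11)])  # first object moved slightly; keeps id 0
```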
Interface
Authentication
No auth for local inference. Ultralytics Hub (cloud training) requires API key. Pretrained weights download automatically from GitHub releases.
Pricing
Ultralytics YOLO is AGPL-3.0 licensed: free for open-source use, while closed-source commercial use requires a paid license from Ultralytics. Ultralytics Hub cloud training has a free tier with paid tiers.
Agent Metadata
Known Gotchas
- ⚠ AGPL-3.0 license requires a commercial license for proprietary agent products — Ultralytics YOLO is AGPL-3.0; using YOLO in a closed-source agent product requires purchasing a commercial license from Ultralytics, since AGPL requires derivative works to be open source; agent companies must audit license compliance before production deployment
- ⚠ results[0].boxes.xyxy is a GPU tensor — it returns a CUDA tensor when the model runs on GPU; agent code doing NumPy operations must call .cpu().numpy() first, i.e. results[0].boxes.xyxy.cpu().numpy(); indexing results[0].boxes yields per-detection box objects
- ⚠ Model size (n/s/m/l/x) is the critical speed/accuracy tradeoff — yolov8n.pt (nano) is fastest with lowest accuracy; yolov8x.pt is slowest with highest accuracy; real-time agent video pipelines must profile on target hardware; GPU inference at 1080p 30fps requires at least yolov8s on a T4 GPU
- ⚠ Custom training data must be in YOLO format — training requires images/ and labels/ directories with .txt label files (class cx cy w h, normalized to [0, 1]); COCO JSON or Pascal VOC XML requires conversion; use Roboflow or Label Studio with YOLO export; the wrong format yields a "0 training samples" warning
- ⚠ stream=True is required for video/webcam inference — model.predict(source='video.mp4') without stream=True accumulates every frame's results in memory; agent video processing must use stream=True to get a generator of Results: results = model.predict(source='video.mp4', stream=True); for r in results: process(r)
- ⚠ model.export() changes the inference API — model.export(format='onnx') creates a model.onnx file that must be loaded separately with YOLO('model.onnx'); exported models may not support all Python API features; agent deployments using exported models must test the full inference pipeline, including NMS post-processing
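The YOLO-format gotcha above is easiest to hit when converting COCO annotations, whose boxes are absolute [x_min, y_min, width, height] pixels rather than normalized center coordinates. A minimal conversion sketch (the image size and class id below are illustrative):

```python
def coco_to_yolo(box, img_w, img_h):
    """Convert a COCO [x_min, y_min, width, height] pixel box into the
    YOLO 'cx cy w h' format, normalized to [0, 1] by image size."""
    x, y, w, h = box
    cx = (x + w / 2) / img_w
    cy = (y + h / 2) / img_h
    return (cx, cy, w / img_w, h / img_h)

def yolo_label_line(class_id, box, img_w, img_h):
    """Render one line of a YOLO .txt label file: 'class cx cy w h'."""
    cx, cy, w, h = coco_to_yolo(box, img_w, img_h)
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# A 640x480 image with one COCO box at (100, 120), 200 wide, 240 tall.
line = yolo_label_line(0, (100, 120, 200, 240), 640, 480)
# → "0 0.312500 0.500000 0.312500 0.500000"
```

Each image gets one such .txt file in labels/, named to match its image file in images/; getting this pairing or the normalization wrong is what triggers the "0 training samples" warning.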
Alternatives
Full Evaluation Report
Detailed scoring breakdown, competitive positioning, security analysis, and improvement recommendations for Ultralytics YOLO.
Scores are editorial opinions as of 2026-03-06.