Active Deterrence

Timeline: December 2025 – present

Role: AI / Computer Vision Engineer

Status: 🟢 Active Development

Collaboration: SmartDrones, Karabela


Project description

Active Deterrence is an AI Vision system for autonomous security drones. The project addresses a critical gap in current drone security solutions: while existing systems can detect motion, they lack the intelligence to understand what they see and react appropriately.

Current limitations of autonomous security drones:

  • detection without understanding — can detect movement but cannot classify what triggered the alert
  • high false positive rates — alerts triggered by animals, shadows, leaves
  • no intelligent response — operators are overwhelmed with notifications
  • passive monitoring only — no capability for active deterrence

I am building an end-to-end AI pipeline that transforms raw drone footage into actionable security intelligence — detecting and classifying objects (persons, vehicles), filtering false alarms, and enabling automated voice deterrence when intruders are detected.


Architecture Overview

Active Deterrence is an image → detection → decision → voice pipeline:

  • Video Ingestion — MP4/RTSP/JPG from drones (day + IR/night modes)
  • Computer Vision Layer — YOLOv8 fine-tuned on drone footage
  • Alert Logic — confidence threshold (40%+), alert cooldown, frame sampling
  • Voice AI (Active Deterrence) — ElevenLabs TTS for real-time voice alerts (~300-500ms latency)
  • Training Pipeline — GPU Droplet (H100/A100) on Digital Ocean with TensorBoard monitoring
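The alert logic above (a 40% confidence floor plus a cooldown so operators are not flooded with duplicate notifications) can be sketched as a small gate class. This is a minimal illustration, not the production code; the class name `AlertGate` and the 10-second cooldown are assumptions.

```python
import time

class AlertGate:
    """Decide whether a detection should raise an alert.

    Applies a confidence threshold and a per-class cooldown,
    the two filters named in the Alert Logic layer above.
    """

    def __init__(self, threshold=0.40, cooldown_s=10.0):
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self._last_alert = {}  # class name -> timestamp of last alert

    def should_alert(self, cls_name, confidence, now=None):
        now = time.monotonic() if now is None else now
        if confidence < self.threshold:
            return False  # below the 40% confidence floor
        last = self._last_alert.get(cls_name)
        if last is not None and now - last < self.cooldown_s:
            return False  # still inside the cooldown window
        self._last_alert[cls_name] = now
        return True

gate = AlertGate()
print(gate.should_alert("person", 0.87, now=0.0))   # first confident hit -> True
print(gate.should_alert("person", 0.91, now=3.0))   # inside cooldown -> False
print(gate.should_alert("person", 0.91, now=15.0))  # cooldown elapsed -> True
```

In the demo, a `True` result is what hands off to the Voice AI layer; frame sampling happens upstream, before detections ever reach the gate.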

Model Training Journey

A key aspect of this project was the iterative model training process. I conducted 4 training iterations with different dataset combinations:

| Model | Dataset | mAP@50 | Precision | Recall | Key Insight |
|-------|---------|--------|-----------|--------|-------------|
| v2 | Karabela only (218 img) | 82.4% | 86.7% | 85.4% | Baseline; overfitting observed |
| v3 | Karabela + HIT-UAV (~3k img) | 93.5% | 93.2% | 90.3% | Best performance |
| v4 | + VisDrone (~10k img) | 62.0% | 79.1% | 57.4% | Domain-mismatch degradation |
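Combining datasets like HIT-UAV with the Karabela labels requires remapping YOLO-format class ids into the project's person/vehicle schema. A sketch of that step follows; the `HITUAV_TO_PROJECT` mapping values are illustrative assumptions, not the real dataset's ids.

```python
# Remap class ids in YOLO-format label lines ("cls cx cy w h") so an
# external dataset matches a person=0 / vehicle=1 schema.
# The source-id mapping below is illustrative, not the actual HIT-UAV one.
HITUAV_TO_PROJECT = {0: 0, 1: 1, 2: 1, 3: 1}  # person stays 0, vehicle-type classes -> 1

def remap_label_lines(lines, mapping):
    remapped = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        src_cls = int(parts[0])
        if src_cls not in mapping:
            continue  # drop classes the project does not train on
        remapped.append(" ".join([str(mapping[src_cls])] + parts[1:]))
    return remapped

labels = ["2 0.51 0.43 0.10 0.08", "0 0.20 0.70 0.05 0.12", "7 0.90 0.90 0.10 0.10"]
print(remap_label_lines(labels, HITUAV_TO_PROJECT))
# ['1 0.51 0.43 0.10 0.08', '0 0.20 0.70 0.05 0.12']
```

Dropping unmapped classes (rather than guessing) keeps the merged training set clean; the v4 result shows how much a mismatched mix can cost.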

Technical Learnings

More data ≠ better model
Adding 10,000 VisDrone images actually degraded performance. Analysis revealed class imbalance (vehicle-heavy), domain mismatch (different altitudes/angles), and small object challenges.

Confusion Matrix Analysis is Critical
The v4 model showed 9,000 false negatives for the vehicle class — objects the model completely missed. This guided the optimization strategy.
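Reading false negatives off a confusion matrix can be sketched as below. The layout (rows = predicted class, columns = true class, with a trailing "background" index for missed or spurious detections) follows a common YOLO-tooling convention; the counts here are toy values, not the v4 results.

```python
# Toy confusion matrix: rows = predicted class, columns = true class.
# The last index acts as "background"; counts are illustrative only.
classes = ["person", "vehicle", "background"]
cm = [
    [120,   4,  10],   # predicted person
    [  2,  80,  25],   # predicted vehicle
    [  8,  60,   0],   # predicted background (= objects the model missed)
]

def false_negatives(cm, true_idx, background_idx):
    """Objects of a given true class that the model failed to detect at all."""
    return cm[background_idx][true_idx]

missed = false_negatives(cm, classes.index("vehicle"), classes.index("background"))
print(missed)  # 60 missed vehicles in this toy example
```

It was exactly this cell, the background-row entry for the vehicle column, that exposed the v4 problem.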

Small Object Detection is Hard
Vehicles from high-altitude drone footage often appear as <20px objects. Solutions include SAHI (Slicing Aided Hyper Inference), tiling preprocessing, and larger model architectures.
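The tiling idea behind SAHI-style sliced inference can be sketched as pure coordinate math: split the frame into overlapping tiles so a <20 px vehicle occupies a much larger fraction of a 640 px tile than of the full frame. Tile size and overlap below are typical defaults, not the project's tuned values.

```python
def tile_boxes(width, height, tile=640, overlap=0.2):
    """(x1, y1, x2, y2) boxes that slice an image into overlapping tiles.

    Each tile is run through the detector separately, then the per-tile
    detections are merged back into full-frame coordinates.
    """
    step = int(tile * (1 - overlap))
    xs = list(range(0, max(width - tile, 0) + 1, step))
    ys = list(range(0, max(height - tile, 0) + 1, step))
    # Make sure the right and bottom edges are always covered.
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    return [(x, y, x + tile, y + tile) for y in ys for x in xs]

print(len(tile_boxes(1920, 1080)))  # 8 overlapping 640x640 tiles for a Full HD frame
```

The SAHI library wraps this slicing plus the detection merge; the sketch only shows the geometry that makes small objects tractable.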


What I did

  1. Designed AI Vision system architecture for security drones
  2. Built data pipeline for ingestion and auditing from Google Drive
  3. Performed labeling in Roboflow (person, vehicle classes)
  4. Trained 4 model iterations with different datasets
  5. Integrated external datasets (HIT-UAV, VisDrone)
  6. Analyzed confusion matrix and identified small object problems
  7. Integrated ElevenLabs TTS for voice alerts
  8. Built Streamlit demo with full flow: upload → detection → alert
  9. Prepared GPU training infrastructure (Digital Ocean + TensorBoard)
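The ElevenLabs integration (step 7) boils down to one HTTPS call to the text-to-speech endpoint. The sketch below only builds the request without sending it; `VOICE_ID`, the API key, and the `eleven_turbo_v2` model choice are placeholders or assumptions to be replaced with your account's values.

```python
# Build (without sending) an ElevenLabs text-to-speech request for a
# voice-deterrence message. Voice id, key, and model id are placeholders.
API_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"

def build_tts_request(text, voice_id, api_key):
    return {
        "url": API_URL.format(voice_id=voice_id),
        "headers": {"xi-api-key": api_key, "Content-Type": "application/json"},
        "json": {"text": text, "model_id": "eleven_turbo_v2"},  # turbo models target low latency
    }

req = build_tts_request(
    "Attention: you are trespassing. Leave the area now.",
    voice_id="VOICE_ID", api_key="API_KEY",
)
print(req["url"])
# A real call would then be: requests.post(req["url"], headers=req["headers"], json=req["json"])
```

Keeping request construction separate from the network call makes the alert path easy to test and to swap for a different TTS backend.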

Skills

| Category | Technologies |
|----------|--------------|
| Computer Vision | YOLOv8, PyTorch, OpenCV, Roboflow |
| Deep Learning | Transfer Learning, Hyperparameter Tuning |
| MLOps | TensorBoard, Digital Ocean GPU, Docker |
| Voice AI | ElevenLabs TTS |
| Backend | FastAPI, Python |
| Frontend | Streamlit |
| Data | FFmpeg, Annotation Tools |

Results

  • Model v3 with mAP@50 of 93.5% — best result
  • Working demo: upload → detection → voice alert
  • Identified small object detection problem (vehicle class)
  • Training pipeline ready for scaling
  • Voice AI system with ~300-500ms latency
  • GPU training infrastructure on Digital Ocean

Roadmap

MVP (January 2026)

  • Computer Vision — person/vehicle detection
  • Model v2/v3/v4 (iterative training)
  • Streamlit demo with Voice AI
  • Confusion matrix analysis

Next Phase: Hyperparameter Optimization (Q1 2026)

  • GPU Droplet training with hyperparameter tuning
  • SAHI workflow for small object detection
  • Class balancing strategies
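One simple class-balancing strategy for the vehicle-heavy VisDrone mix is inverse-frequency weighting of the loss. A minimal sketch, with the normalization choice (weights averaging 1.0) being my assumption rather than a stated project decision:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights inversely proportional to frequency,
    normalized so the weights average 1.0."""
    counts = Counter(labels)
    total = sum(counts.values())
    raw = {c: total / n for c, n in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {c: w / mean for c, w in raw.items()}

# 1 person for every 9 vehicles -> person instances get 9x the weight
print(inverse_frequency_weights(["vehicle"] * 900 + ["person"] * 100))
# {'vehicle': 0.2, 'person': 1.8}
```

Alternatives worth comparing in the same sweep include oversampling person-rich images and capping vehicle instances per image.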

Future: MLOps & Production (Q1-Q2 2026)

  • MLflow Tracking & Model Registry
  • FastAPI backend deployment
  • Real-time RTSP inference
  • Integration with SmartDrones platform

Sample photos

  • Person Detection
  • Vehicle Detection
  • Multi-object Detection