Training data for underwater autonomy
Real-world data for underwater autonomy. We capture aligned video, sonar, IMU, and pilot control from ROV operations and turn it into training-ready VLA chunks.
Underwater AI lacks training data
Models trained on land or in simulation struggle underwater. Current, turbidity, and radio blackout change the sensing problem entirely. Without aligned vision-action-telemetry datasets, marine autonomy stays rule-based.
Underwater datasets today
| Dataset | Domain | Modalities | Size | Actions | Notes |
|---|---|---|---|---|---|
| nuScenes / DROID | Terrestrial | Vision, LiDAR, control | Hundreds of hours | ✓ | No fluid dynamics |
| USIM | Underwater | Vision, control | 25 hrs · 905K frames | ✓ | Synthetic only |
| SOVIS | Underwater | Vision, sonar | 76K frames | ✗ | Perception only |
| Aronnax (in progress) | Underwater | Vision · sonar · IMU · control | Pilot deployments | ✓ | Real ROV telemetry — pipeline live on USIM today |
Capture what pilots already do
Instead of building new vehicles, we record telemetry from commercial ROV missions — inspections, surveys, and maintenance — and process it through a single annotation pipeline.
The pipeline is validated on public USIM simulation data today and designed to run unchanged when real black-box ROV traces arrive.
Passive capture
A hardware tap on the ROV topside Ethernet bridge records tether traffic without modifying the vehicle or interrupting the pilot: MAVLink telemetry (joystick input, IMU, depth, and pressure) alongside the video and sonar streams.
Align and normalize
Streams are timestamped and normalized into uniform rows. PWM commands are scaled to a fixed contract so training code sees consistent action vectors across vehicles.
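A minimal sketch of these two steps, under stated assumptions: the 1100 to 1900 µs PWM range with 1500 µs neutral is a common ESC convention and is not confirmed as the pipeline's contract, and the function names `nearest_index` and `normalize_pwm` are hypothetical.

```python
from bisect import bisect_left
from typing import Sequence

# Assumed PWM contract (illustrative values, not taken from the pipeline).
PWM_MIN, PWM_NEUTRAL, PWM_MAX = 1100, 1500, 1900


def nearest_index(timestamps: Sequence[float], t: float) -> int:
    """Return the index of the sample in sorted `timestamps` closest to `t`."""
    if not timestamps:
        raise ValueError("timestamps must not be empty")
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbor is closer in time; ties go to the earlier sample.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def normalize_pwm(pwm_us: Sequence[int]) -> list[float]:
    """Map raw PWM pulse widths (microseconds) to actions in [-1.0, 1.0]."""
    half_range = (PWM_MAX - PWM_MIN) / 2
    actions = []
    for value in pwm_us:
        clamped = min(max(value, PWM_MIN), PWM_MAX)  # guard out-of-range input
        actions.append((clamped - PWM_NEUTRAL) / half_range)
    return actions


# Align an IMU stream to one video frame time, then normalize six PWM channels.
imu_times = [0.00, 0.10, 0.20]
print(nearest_index(imu_times, 0.14))
print(normalize_pwm([1500, 1900, 1100, 1700, 1500, 2100]))
```

Clamping before scaling keeps every vehicle's action vector inside the same bounds, even when a controller emits pulse widths outside the assumed range.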
Auto-label and chunk
The pipeline derives labels from physics (e.g. fighting_current when thrust does not match IMU response), groups actions into ACT-style chunks, and exports JSON and Parquet ready for model training.
Ingest → align → label → export
Four stages turn raw ROV streams into training-ready chunks. The live demo runs on USIM data today.
Action chunking
Smooth control sequences
Rather than one thruster command per frame, the pipeline exports ACT-style chunks of future actions with exponential smoothing. Each chunk is a fixed [k, 6] tensor ready for imitation learning.
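The chunking idea can be sketched as follows. This is an illustration under assumptions, not the pipeline's code: the smoothing factor `alpha`, the choice to drop trailing timesteps that lack k future actions, and the function names are all hypothetical.

```python
from typing import Sequence


def smooth_actions(actions: Sequence[Sequence[float]], alpha: float = 0.5) -> list[list[float]]:
    """Apply an exponential moving average over time, per action channel."""
    if not actions:
        return []
    smoothed = [list(actions[0])]
    for row in actions[1:]:
        prev = smoothed[-1]
        smoothed.append([alpha * x + (1 - alpha) * p for x, p in zip(row, prev)])
    return smoothed


def make_chunks(actions: Sequence[Sequence[float]], k: int) -> list[list[list[float]]]:
    """For each timestep t, collect actions t..t+k-1 as one [k, 6] chunk.

    Timesteps without k remaining actions are dropped (an assumed policy)
    so that every chunk has the same fixed shape.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return [[list(a) for a in actions[t:t + k]] for t in range(len(actions) - k + 1)]


# Five timesteps of a six-channel action vector, chunked with k = 3.
raw = [[float(t)] * 6 for t in range(5)]
chunks = make_chunks(smooth_actions(raw, alpha=0.5), k=3)
print(len(chunks), len(chunks[0]), len(chunks[0][0]))
```

Padding the tail with the last action instead of dropping it is an equally plausible choice; which one fits depends on how the training code masks chunk boundaries.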
Cross-modal sonar
Vision when you have it, sonar when you do not
Camera frames pair with forward-looking sonar masks in a shared timeline. When optical visibility drops, the same row structure carries sonar frames and detection boxes through export.
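One way to picture a shared row is sketched below. Every field name, the `visibility` score, and the `min_visibility` threshold are hypothetical, chosen only to illustrate how a single row could carry either modality.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Row:
    """One timeline row; all field names here are illustrative assumptions."""
    timestamp: float
    camera_frame: Optional[str] = None   # path or ID of the optical frame, if any
    sonar_frame: Optional[str] = None    # path or ID of the sonar frame, if any
    sonar_boxes: list[tuple[int, int, int, int]] = field(default_factory=list)
    visibility: float = 1.0              # assumed optical visibility score in [0, 1]


def primary_modality(row: Row, min_visibility: float = 0.3) -> str:
    """Prefer the camera when visibility is adequate, otherwise fall back to sonar."""
    if row.camera_frame is not None and row.visibility >= min_visibility:
        return "camera"
    if row.sonar_frame is not None:
        return "sonar"
    return "none"


clear = Row(timestamp=0.0, camera_frame="cam_0000", sonar_frame="son_0000")
murky = Row(timestamp=0.1, camera_frame="cam_0001", sonar_frame="son_0001",
            sonar_boxes=[(10, 20, 40, 60)], visibility=0.1)
print(primary_modality(clear), primary_modality(murky))
```

Keeping both frames in the row, rather than swapping schemas, lets export code stay identical whether or not the optical channel is usable.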
Physics-derived labels
No human annotator for hydrodynamics
When a pilot commands forward thrust but IMU acceleration stays near zero, the pipeline tags the row fighting_current. Comparing RC_IN against SCALED_IMU turns a subjective pilot reaction into an objective training token.
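A rough sketch of such a rule over a short window of samples. The thresholds, the window averaging, and the function name are assumptions made for illustration; the inputs are taken to be a normalized forward command in [-1, 1] and forward acceleration in m/s².

```python
from statistics import fmean
from typing import Optional, Sequence


def label_window(
    forward_cmds: Sequence[float],
    accel_x: Sequence[float],
    cmd_threshold: float = 0.3,     # assumed minimum mean forward command
    accel_threshold: float = 0.05,  # assumed "near zero" bound in m/s^2
) -> Optional[str]:
    """Tag a window where sustained forward thrust produces almost no acceleration."""
    if not forward_cmds or len(forward_cmds) != len(accel_x):
        raise ValueError("inputs must be non-empty and the same length")
    mean_cmd = fmean(forward_cmds)
    mean_accel = fmean(accel_x)
    if mean_cmd > cmd_threshold and abs(mean_accel) < accel_threshold:
        return "fighting_current"
    return None


print(label_window([0.6, 0.7, 0.65], [0.01, -0.02, 0.0]))
print(label_window([0.6, 0.7], [0.4, 0.5]))
```

A vehicle cruising at constant speed also shows near-zero acceleration under thrust, so a production rule would probably need a velocity estimate or a window anchored at thrust onset to tell the two cases apart.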
Who needs this data
The same aligned telemetry supports defense autonomy, commercial inspection, and academic benchmarking — any team training policies that must work underwater.
GPS-denied UUV navigation
Low-cost underwater vehicles need policies that handle current, turbidity, and acoustic sensing — not just pre-programmed waypoints.
Offshore inspection & IRM
ROV day rates and pilot shortages push operators toward resident autonomy. The bottleneck is training data.
Marine robotics research
Labs need aligned multimodal traces — not just perception frames — to benchmark underwater policies.
Data and policy, not another OS
Incumbents sell integration software and mission tools. We focus on the missing layer: aligned training data and learned control policies for underwater vehicles.
Greensea IQ · OPENSEA
Rules-based navigation OS and sensor integration layer. A VLA policy could sit on top of their edge stack as a behavioral layer.
SeeByte · SeeTrack
Mission planning and fleet C2. They handle where to go; a trained policy handles how to move moment-to-moment.
Classical CV stacks
ATR and rule-based obstacle avoidance break on novel debris and zero visibility. Learned policies from pilot data address edge cases rules miss.
Work with Aronnax Lab
We are incubating at UC San Diego StartBlue with Scripps Institution of Oceanography. Reach out for pilot partnerships, data access, or research collaboration.
Defense & UUV teams
Teams building attritable underwater platforms that need robust autonomy software.
Offshore operators
Inspection and IRM contractors looking to reduce vessel days and pilot load.
ROV fleet partners
Operators who can host passive capture hardware during routine missions.
Research labs
Oceanographic and marine robotics groups benchmarking underwater policies.
Aronnax Lab · StartBlue × UCSD Scripps
contact@aronnaxlab.ai