BETTY Dataset: A Multi-modal Dataset for Full-Stack Autonomy


1 Robotics Institute, Carnegie Mellon University
2 University of Modena and Reggio Emilia
3 University of Waterloo
4 University of Pittsburgh

Abstract

We present the BETTY dataset, a large-scale, multi-modal dataset collected on several autonomous racing vehicles, targeting supervised and self-supervised state estimation, dynamics modeling, motion forecasting, perception, and more. Existing large-scale datasets, especially autonomous vehicle datasets, focus primarily on supervised perception, planning, and motion forecasting tasks. Our work enables multi-modal, data-driven methods by including all sensor inputs and the outputs from the software stack, along with semantic metadata and ground truth information. The dataset encompasses 4 years of data, currently comprising over 13 hours and 32 TB, collected on autonomous racing vehicle platforms. This data spans 6 diverse racing environments, including high-speed oval courses for single- and multi-agent algorithm evaluation in feature-sparse scenarios, as well as high-speed road courses with high longitudinal and lateral accelerations and tight, GPS-denied environments. It captures highly dynamic states, such as 63 m/s crashes, loss of tire traction, and operation at the limit of stability. By offering a large breadth of cross-modal and dynamic data, the BETTY dataset enables the training and testing of full autonomy stack pipelines, pushing the performance of all algorithms to their limits.

Explore the Data

Visualize our data! Explore 🔎 and play around. Select one of the tracks below to view camera, LiDAR, and timeseries data from that track.

Tracks: Las Vegas Motor Speedway, Lucas Oil, Indianapolis, Texas, Monza, Goodwood

📷 PERCEPTION HIGHLIGHT: Explore how the different sensors look on each track!
Check out the onboard videos and pan around the point clouds.
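
For offline inspection, a single LiDAR sweep can be rendered interactively with a library such as Open3D. Below is a minimal sketch, assuming one sweep has been exported to a .pcd file; the filename is a hypothetical placeholder, not a file shipped with the dataset.

```python
# Minimal offline point-cloud viewer using Open3D.
# Assumes a LiDAR sweep has been exported to PCD; the filename
# below is a placeholder, not a file shipped with the dataset.
import open3d as o3d

pcd = o3d.io.read_point_cloud("lvms_lidar_sweep.pcd")  # hypothetical export
print(pcd)  # reports how many points were loaded
o3d.visualization.draw_geometries([pcd], window_name="BETTY LiDAR sweep")
```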

See a snapshot of the timeseries data!
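
As a sketch of working with the timeseries channels offline, the snippet below plots a speed trace with pandas and matplotlib, assuming the signals have been exported to CSV; the file and column names are hypothetical and should be adapted to however you export signals from the recordings.

```python
# Plot a vehicle-speed trace from an exported timeseries CSV.
# The file name and column names are hypothetical placeholders.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("betty_timeseries.csv")   # hypothetical export
t = df["stamp"] - df["stamp"].iloc[0]      # seconds since start of run
plt.plot(t, df["speed_mps"])
plt.xlabel("time [s]")
plt.ylabel("speed [m/s]")
plt.title("BETTY timeseries snapshot")
plt.show()
```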

HD Maps

πŸ—ΊοΈ MAP EXPLORATION : Pan around our HD maps
Menu > Tools > Navigation to change movement pattern

ROS2 Visualization
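
The data is published as ROS 2 topics (see the sensor table below), so recordings can be replayed and visualized with the standard ROS 2 tooling (e.g., ros2 bag play together with RViz2), or read programmatically. Below is a minimal sketch that iterates over a bag with rosbag2_py and deserializes each message; the bag path is a placeholder, and the storage backend ("sqlite3" vs. "mcap") depends on how the run was recorded.

```python
# Iterate over a ROS 2 bag and deserialize each message with rosbag2_py.
# The bag path is a placeholder; storage_id may be "sqlite3" or "mcap"
# depending on how the recording was made.
import rosbag2_py
from rclpy.serialization import deserialize_message
from rosidl_runtime_py.utilities import get_message

reader = rosbag2_py.SequentialReader()
reader.open(
    rosbag2_py.StorageOptions(uri="betty_run_0", storage_id="sqlite3"),
    rosbag2_py.ConverterOptions(
        input_serialization_format="cdr", output_serialization_format="cdr"
    ),
)

# Map each topic name to its ROS message type so we can deserialize.
type_map = {t.name: t.type for t in reader.get_all_topics_and_types()}

while reader.has_next():
    topic, raw, stamp_ns = reader.read_next()
    msg = deserialize_message(raw, get_message(type_map[topic]))
    print(f"[{stamp_ns}] {topic}: {type(msg).__name__}")
```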

Platform Sensors & Rates

AV21 (betty) sensors
Topic | Rate [Hz] | Description | ROS Message Type
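
To cross-check the nominal rates above against an actual recording, the average publish rate of each topic can be estimated from the bag metadata. A sketch using rosbag2_py follows; the bag path and storage id are placeholders.

```python
# Estimate the average publish rate of each topic in a ROS 2 bag
# from its metadata, for comparison against the nominal rates above.
# The bag path and storage id are placeholders.
import rosbag2_py

meta = rosbag2_py.Info().read_metadata("betty_run_0", "sqlite3")
duration_s = meta.duration.total_seconds()

for info in meta.topics_with_message_count:
    rate = info.message_count / duration_s if duration_s > 0 else float("nan")
    print(f"{info.topic_metadata.name:40s} {rate:8.1f} Hz")
```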

BibTeX

@misc{nye2025bettydatasetmultimodaldataset,
  title={BETTY Dataset: A Multi-modal Dataset for Full-Stack Autonomy}, 
  author={Micah Nye and Ayoub Raji and Andrew Saba and Eidan Erlich and Robert Exley and Aragya Goyal and Alexander Matros and Ritesh Misra and Matthew Sivaprakasam and Marko Bertogna and Deva Ramanan and Sebastian Scherer},
  year={2025},
  eprint={2505.07266},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2505.07266}, 
}