Understanding Autonomous Vehicle Tech Stacks - Sensor Simulation & Dataset Creation with CARLA - v0.0.3

Since getting CARLA set up on my local machine and manually driving vehicles around the city, I’ve taken the next step in understanding how real self-driving car systems work: sensor simulation and labeled dataset creation.

This week, I built a workflow that simulates a vehicle driving through a city with mounted sensors — and automatically captures:

  • Camera images
  • IMU and GPS data
  • 3D bounding boxes of other actors (vehicles and pedestrians)
  • Transformed camera pose data

All of this is saved per frame, which creates the basis of a synthetic dataset. Then I took it one step further: projecting the 3D actor bounding boxes into 2D so they align with the camera image — giving me fully labeled frames just like you’d see in datasets like KITTI or Waymo Open.

🚀 Repo: github.com/CodeJonesW/sensor_simulation_and_data_collection


What I Built

I created a two-part workflow:

1. sensor_capture.py

This script:

  • Spawns a Tesla vehicle in CARLA
  • Attaches an RGB camera, IMU, GPS, and optionally LiDAR
  • Records all sensor data per frame
  • Saves 3D bounding box and transform data for each nearby actor (car, pedestrian)

Output includes:

output/
  camera/
  bboxes/
  camera_transforms/
  imu.csv
  gps.csv
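
For anyone curious how the sensor attachment works, here is a minimal sketch using the CARLA Python API. The blueprint IDs are standard CARLA identifiers; the specific transforms, image resolution, and save paths are illustrative rather than the exact values used in the repo.

import carla

# Connect to a running CARLA server (default host/port assumed)
client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()
bp_lib = world.get_blueprint_library()

# Spawn a Tesla and let it drive itself around the map
vehicle_bp = bp_lib.find("vehicle.tesla.model3")
spawn_point = world.get_map().get_spawn_points()[0]
vehicle = world.spawn_actor(vehicle_bp, spawn_point)
vehicle.set_autopilot(True)

# RGB camera mounted above the hood, facing forward
cam_bp = bp_lib.find("sensor.camera.rgb")
cam_bp.set_attribute("image_size_x", "800")
cam_bp.set_attribute("image_size_y", "600")
camera = world.spawn_actor(
    cam_bp, carla.Transform(carla.Location(x=1.5, z=2.4)), attach_to=vehicle
)

# IMU and GNSS ride on the same vehicle
imu = world.spawn_actor(bp_lib.find("sensor.other.imu"),
                        carla.Transform(), attach_to=vehicle)
gnss = world.spawn_actor(bp_lib.find("sensor.other.gnss"),
                         carla.Transform(), attach_to=vehicle)

# Each sensor streams data through a callback; save or log per frame
camera.listen(lambda img: img.save_to_disk(f"output/camera/{img.frame:06d}.png"))
imu.listen(lambda m: print(m.frame, m.accelerometer, m.gyroscope))
gnss.listen(lambda m: print(m.frame, m.latitude, m.longitude))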

2. export_dataset.py

This script:

  • Loads the raw data
  • Projects 3D bounding boxes into 2D using the camera intrinsics + pose
  • Saves annotated label files in a lightweight JSON format
  • Builds a dataset_index.json for use in training or visualization tools

Final output:

output/final_dataset/
  images/
    000001.png
    ...
  labels/
    000001.json
    ...
  dataset_index.json

Each image now has a matching list of 2D bounding boxes — completely automated.
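
The projection itself is the interesting part. Here is a minimal sketch of the math, assuming the saved camera pose is available as a 4x4 world-to-camera matrix (in CARLA, camera.get_transform().get_inverse_matrix()) and applying the usual swap from CARLA/UE4 axes to standard camera axes:

import numpy as np

def build_intrinsics(image_w, image_h, fov_deg):
    # Pinhole intrinsic matrix from image size and horizontal field of view
    focal = image_w / (2.0 * np.tan(np.radians(fov_deg) / 2.0))
    K = np.identity(3)
    K[0, 0] = K[1, 1] = focal
    K[0, 2] = image_w / 2.0
    K[1, 2] = image_h / 2.0
    return K

def world_point_to_pixel(point_world, world_to_camera, K):
    # Move the world-space point into the camera frame
    p = world_to_camera @ np.array([*point_world, 1.0])
    # CARLA/UE4 uses x-forward, y-right, z-up; a pinhole camera
    # expects x-right, y-down, z-forward
    p_cam = np.array([p[1], -p[2], p[0]])
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]  # (u, v) in pixels; uvw[2] > 0 means in front of the camera

Projecting all eight corners of an actor’s 3D bounding box this way and taking the min/max over u and v gives the axis-aligned 2D box that ends up in each label file.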


Why I Did This

I wanted to understand the pipeline from simulation to training data. That’s what companies like Cruise, Waymo, and Nvidia do every day: simulate scenarios, collect raw data, and label it for model training and evaluation. For example, Waymo’s in-house simulation platform, Simulation City, allows them to test billions of miles virtually — including rare and hazardous driving events — before deploying any updates to their real-world fleet. Every software change they make is first validated in simulation, and the resulting data is used to train and evaluate the machine learning models that power the Waymo Driver. This kind of workflow isn’t just for safety — it’s a cornerstone of how modern autonomous systems are built at scale.

For simulation to be an effective tool, it has to closely represent the environment it’s emulating and provide accurate insight into a system’s performance. What happens in simulation must be predictive of what happens in the real world, which means simulating the right things: minimizing the differences between the simulated and real world, from the way your sensors see to the way your agents react to changes in their environment, and ensuring the simulated scenarios are statistically representative of the real world.

Source: Waymo Blog — Simulation City

This project helped me internalize:

  • How to work with real-world coordinate transforms
  • How sensors interact in simulation
  • The complexity behind even “simple” bounding box annotations

It also got me thinking about:

  • Custom scenarios I could simulate (e.g., rare or edge-case behaviors)
  • How perception models use this data downstream
  • Using ROS 2 for a more realistic system architecture

Next Steps

This wraps up my sensor simulation phase. Next, I’ll probably start diving into perception models:

  • Try training YOLOv8 or Detectron2 on my own simulated data
  • Explore domain transfer challenges
  • Think about integrating LiDAR, depth, or segmentation views

Eventually, I want to move from perception into decision-making and planning — giving my simulated cars brains that respond to sensor input and act accordingly.


If you want to follow the project or clone it to build your own simulated dataset: 🔗 github.com/CodeJonesW/sensor_simulation_and_data_collection

I’d love to hear from anyone experimenting in this space.

Cheers!
Will