Traffic-IT is a comprehensive dataset designed to enhance the capabilities of multimodal large language models (MLLMs) in understanding complex traffic scenes. With a focus on diverse traffic scenarios, the dataset aims to support research and development in intelligent transportation systems, autonomous driving, and smart city applications.
230,000 Question-Answer pairs across 30,000 images
Covers 15 categories of weather and location, including sunny, rainy, snowy, and foggy conditions
220,950 scene-specific annotations for various scenarios like main roads, city streets, and rural areas
Built from 30,000 high-quality images spanning different times of day, environments, and locations
Includes 1,800+ hours of expert validation to ensure data accuracy and relevance
Global Scope: Includes images from Beijing, Chengdu, Melbourne, and more
Data Collection and Annotation
Traffic-IT was developed using a three-step process:
Image Collection: Sourced from dashcam footage, open-source datasets, and manual photography across diverse conditions and locations.
Question Design and AI Answering: Thirty traffic-specific questions were developed by experts, and answers were generated using GPT-4, covering real-world scenarios like weather impacts and traffic maneuvers.
Expert Validation: Each answer was reviewed and refined by traffic domain practitioners to ensure relevance and accuracy.
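The three steps above can be pictured as a record that moves from image collection, to a model-drafted answer, to an expert-reviewed answer. The sketch below is only an illustration under assumptions: the class names, field names, source labels, and the example image path are hypothetical and are not taken from the Traffic-IT release, whose actual file format may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# All names below are hypothetical; the dataset's real schema may differ.
SOURCES = {"dashcam", "open_dataset", "manual_photo"}


@dataclass
class QAPair:
    question: str
    draft_answer: str                   # step 2: model-generated answer
    final_answer: Optional[str] = None  # step 3: filled in by an expert reviewer

    @property
    def validated(self) -> bool:
        return self.final_answer is not None


@dataclass
class TrafficSample:
    image_path: str
    source: str                         # step 1: where the image came from
    qa_pairs: List[QAPair] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.source not in SOURCES:
            raise ValueError(f"unknown image source: {self.source!r}")


def validate(pair: QAPair, refined_answer: Optional[str] = None) -> None:
    """Record an expert review: accept the draft as-is or replace it."""
    pair.final_answer = pair.draft_answer if refined_answer is None else refined_answer


def validated_pairs(samples: List[TrafficSample]) -> List[QAPair]:
    """Keep only the pairs that have passed expert review."""
    return [p for s in samples for p in s.qa_pairs if p.validated]


sample = TrafficSample("images/example.jpg", "dashcam")
sample.qa_pairs.append(QAPair("What is the weather?", "It appears to be raining."))
sample.qa_pairs.append(QAPair("Is it safe to change lanes?", "Yes."))
validate(sample.qa_pairs[0])            # expert accepts the draft unchanged
print(len(validated_pairs([sample])))   # 1
```

Keeping the draft and the reviewed answer in separate fields makes it possible to tell which answers an expert changed, which may be useful when auditing the validation step.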
Below are examples from the Traffic-IT dataset, illustrating typical traffic scenes with corresponding Q&A pairs that reflect real-world traffic situations.