STCrowd Dataset

Let's Start

STCrowd Creative Agency

Let's Start

Large Scale and Sequential

STCrowd is collected by a 128-beam LiDAR and a monocular camera. The annotated dataset is comprised of 84 sequences with each sequence contains a variable number of continuously recorded frames, ranging from 50 to 800. The annotations is made per 0.4 second.



This dataset can be used for various tasks, including LiDAR-only, image-only, and sensor-fusion based pedestrian detection and tracking. We provide baselines and metrics for most of the tasks to facilitate further research.


Diverse crowd densities.

STCrowd contains crowd scenarios of various densities, from around 10 pedestrians to more than 30 pedestrians in each frame and 20 pedestrians on average. It also reveals local high-density characteristic in 2.4, 8, and 15.8 average number of pedestrians in 2, 5, and 10 meters centered on each pedestrian.


STCrowd release.

"STCrowd: A Multimodal Dataset for Pedestrian Perception in Crowded Scenes" has been accepted on CVPR 2022.


When using this dataset in your research, we will be happy if you cite us!.

For the raw dataset and detection baseline, please cite:

 title={STCrowd: A Multimodal Dataset for Pedestrian Perception in Crowded Scenes},
 author={Cong, Peishan and Zhu, Xinge and Qiao, Feng and Ren, Yiming and Peng, Xidong and Hou, Yuenan and Xu, Lan and Yang, Ruigang and Manocha, Dinesh and Ma, Yuexin},
 journal={arXiv preprint arXiv:2204.01026},


This dataset is made available for academic use only. However, we take your privacy seriously and mask the faces. If you find yourself or personal belongings in this dataset and feel unwell about it, please contact us.

Our tasks


Objection detection in crowd scene for point cloud and image. We provide consistent ID and bounding box in image and point cloud.

Explore Now


We provide the IDs and bounding box labels for each object in a sequence of scenes. The object ID is unique and consistent in time. The tracking method outputs a label and ID for each object in a sequence of scenes.

Explore Now



Explore Now