Real-world evaluation is also difficult to standardize, often requiring human resets and suffering from environment variability. As a result, simulation-based benchmarks have become popular for their reproducibility and ease of use.
Existing benchmarks focus mostly on household tasks, while retail and logistics scenarios, such as shelf picking or order packing, remain underexplored. Dedicated benchmarks for these domains are needed to advance robotic capabilities in retail environments.
RoboBenchMart addresses the limitations of prior work by providing code to generate diverse store layouts and robotic trajectories, enabling the training and benchmarking of robotic policies in retail environments.
@article{soshin2025robobenchmart,
  title={RoboBenchMart: Benchmarking Robots in Retail Environment},
  author={Soshin, Konstantin and Krapukhin, Alexander and Spiridonov, Andrei and Shepelev, Denis and Bukhtuev, Gregorii and Kuznetsov, Andrey and Shakhuro, Vlad},
  journal={arXiv preprint arXiv:2511.10276},
  year={2025}
}