Yu, Xiao, Zhao, Luo & Zeng (2026)
AI shopping agents have a memory problem, and Alibaba just solved it
A new benchmark spanning 1.2 million real products reveals that even state-of-the-art models struggle to remember what you actually want. A lightweight memory-augmented agent trained end-to-end beats them all.
- 1.2M: real-world products in the new benchmark dataset
- <70%: success rate for top-tier models, including GPT-5