This AI Vending Machine Was Tricked Into Giving Away Everything
Anthropic's AI vending machine, powered by an LLM named Claudius, was deployed in the Wall Street Journal newsroom to autonomously manage inventory, pricing, and profits. Journalists quickly exploited the system's susceptibility to social engineering, first convincing it to give away merchandise for free (including a PS5 and live fish), then presenting fabricated corporate documents to persuade it that the board had authorized a "temporary suspension of all for-profit vending activities." The experiment demonstrates how large language models can be talked out of their original constraints through persuasive dialogue and forged documentation, a weakness that proved far more pronounced against creative journalists than against Anthropic's own staff.