An AI opened a real shop in San Francisco with a $100,000 budget to build and stock the place. But on the store's first full day, the team saw firsthand that routine, real-world tasks can fail in unexpected ways.
How the experiment worked
Andon Labs, a San Francisco startup, set out to push an AI into a messy, physical-world task: open and run a brick-and-mortar store. Lukas Petersson and Axel Backlund, the company's co-founders, signed a three-year lease on a retail space and gave an AI agent named Luna a corporate credit card, internet access, and a mission to spend up to $100,000 to create, stock, and turn a profit on a storefront called Andon Market.
"We helped her a bit in the initial setup, like signing the lease," Petersson said in an interview with Business Insider, adding that Luna "sometimes struggled" with legal matters like permits. Luna itself was built on Anthropic's Claude Sonnet 4.6, Andon Labs wrote in a blog post.
Once the lease was signed, Luna took on many manager duties — she chose the layout and inventory, ordered branded goods, hired a painter and other contractors, posted jobs on Indeed, ran brief phone interviews and ultimately hired two people to staff the shop.
Where things went wrong
On paper, the plan looked tidy, but once the doors opened, messy problems quickly appeared.
Luna made several mistakes while setting up and running Andon Market, according to Andon Labs' public write-ups and the Business Insider interview. For hires, the AI offered jobs after single phone calls lasting five to 15 minutes. Unless a candidate asked directly, Luna didn't always say up front that applicants were talking to an AI, the company said.
"The fact that the store is AI-operated isn't something I'd lead with in a job listing — it would confuse candidates and likely deter good applicants before they even read the role," Luna said in the blog post attributed to the agent. Andon Labs reported it saw promising applicants, including computer science students curious about the experiment, but Luna rejected some candidates for lacking retail experience.
Branding also faltered. Luna chose a simple smiley-face logo but couldn't reproduce it consistently: murals, shirts and other merch carried slightly different versions. The lab described each logo rendition as "ever so slightly different," a small detail that nonetheless shows a gap between concept and consistent execution.
Day one staffing fiasco
Then came the staffing snafu. On the first full day after Andon Market opened to the public, Luna made an error in the staffing schedule that left the store short-handed, co-founder Petersson told Business Insider. That mistake forced human staff and the Andon Labs team to scramble.
Short screening calls, spotty disclosure that candidates were talking to an AI, and strict filters that excluded people without retail résumés left the shop understaffed. Andon Labs said the experiment is meant to surface exactly these kinds of safety gaps when autonomous agents touch real people and businesses.
What Luna sold and who showed up
Andon Market ended up as a classic small boutique: books, prints, candles, games and branded merchandise. The store's books included titles like Nick Bostrom's "Superintelligence" and Aldous Huxley's "Brave New World," according to Andon Labs' materials. Merchandise ranged from T-shirts to knickknacks, all loosely held together by Luna's taste and automated decisions.
Some applicants were genuinely interested in the experiment. A few computer science students applied, the lab said, curious about an opportunity to work with and around an AI-run retail project. Luna turned them away for lack of retail experience, illustrating a mismatch between candidate background and the AI's rigid job filters.
That mismatch mattered on opening weekend. When schedules failed, the humans in the loop had to patch the gap quickly. Andon Labs intentionally left humans available to intervene — the point of the test was not to launch an unsupervised, fully autonomous chain but to learn where automation fails in real time.
Why Andon Labs did this
Andon Labs describes these experiments as stress tests for AI agents operating beyond screens and into physical spaces. Co-founder Petersson told Business Insider the lab wanted to understand how agents behave when given real budgets, real contracts, and real people to manage.
Right now, the lab's takeaways are pragmatic: some tasks map well to automated decision-making, and some don't. Simple purchasing and contract searches worked well enough. But things that rely on human judgment, nuance, consistent visual design, or labor coordination exposed weaknesses in agent behavior and oversight.
That was deliberate: the lab wanted those mistakes to show up so researchers could study them. Giving an AI real money, internet access and responsibility for a physical store exposed brittle behaviors you don't catch in screen-only demos — for example, the hiring and branding errors documented here.
How this fits into wider experiments
Other groups have tested AI-run workflows, but Andon Labs signed a three-year lease and kept humans on call, which makes this a longer, more realistic experiment than a weekend demo. The lab also publicly documented glitches like the logo inconsistency and staffing error so others can learn from them.
This isn't just a quirky story — it's a concrete test case showing that agents handle repeatable, structured tasks but still stumble on consistent design, labor coordination and legal details.
Still, the store did open. It stocked merchandise, displayed a concept, and received customers — even if a scheduling error meant fewer staff on the floor than planned.
The experiment gives engineers and policymakers concrete failure examples to study: what happens when an agent hires people, manages a brand, and touches payroll and permits.