Perspective · Jul 3, 2026 · 5 min read

New York's chatbot told small businesses to break the law. Nobody was watching for the moment it went stale.

New York City's MyCity chatbot for small business owners confidently told them they could pocket workers' tips, refuse cash, and turn away tenants who pay with housing vouchers, all illegal under rules settled for years before anyone asked. The failure wasn't the model reasoning badly. It was that nothing in the system was built to notice its answers had already drifted out of step with the law.

In October 2023, New York City launched a chatbot called MyCity, built to help small business owners find their way through the city's regulations: permits, payroll, hiring, the daily paperwork of running a shop. Five months later, journalist Colin Lecher tested it for The Markup, in a story co-published with THE CITY, and found it was wrong in a specific and alarming way. It wasn't vague or evasive. It confidently told business owners that flatly illegal practices were fine.

None of these rules were new

The bot said employers could keep a cut of their staff's tips. New York labor law has barred that for years. It said landlords could turn away tenants who pay with housing vouchers, illegal in New York City since 2008. It said a business didn't have to accept cash, illegal in the city since 2020. None of this was a grey area the bot got unlucky on. Every one of these protections was already settled, public, and enforceable long before anyone typed a question into the chat window. A follow-up from the Associated Press days later found more of the same, including advice that an employer could legally fire a worker for reporting sexual harassment. Legal Services NYC attorney Rosalind Black said it plainly: if a tool like this isn't accurate, it should come down.

The fix was patience

It didn't come down. New York City's own mayor kept MyCity online after the story broke, saying the errors would be fixed over time. It stayed live for roughly two more years. When it was finally retired, in early 2026, it was the next administration that retired it, as a line in a round of budget cuts. Nobody had closed the gap between what it said and what the law required. Two years is a long time for an answer to keep being wrong in exactly the same way, waiting on a fix that was never actually scheduled.

Reasoning wasn't the defect

The easy read on this story is that the model reasoned badly, that somewhere in its logic it talked itself into bad legal advice. That's not really what happened. Ask a general-purpose system enough questions about business regulation with no live, current source of truth underneath it, and it will produce fluent, confident, plausible-sounding answers, because fluent and confident is what it's built to be. The defect wasn't in how it reasoned from what it knew. The defect was that nothing in the system was ever assigned the job of checking whether what it knew still matched what was currently true. The tip rule, the voucher protection, the cash mandate: none of them moved while the bot was live. They had already moved, years earlier, and nobody had gone back to check the bot's answers against them since.

This is a different kind of failure than two authorities disagreeing with each other. A contradiction at least announces itself: two current answers that don't match, which someone eventually notices. This is quieter. One answer, delivered with total confidence, that simply stopped being true, with nothing in the system whose job it was to notice.

The boring question

The lesson from MyCity isn't "use a smarter model." A smarter model, asked the same question against the same stale grounding, gives the same wrong answer, just more persuasively. The lesson is that somewhere in the system, something has to be responsible for holding the organisation's current, authoritative rules, and for noticing when an answer has quietly stopped matching them. Not at launch, when everyone checks. Months and years later, when nobody is checking, which is precisely when it matters most.
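To make that job concrete, here is a minimal sketch in Python of what a recurring staleness check could look like. It is an illustration, not a description of how MyCity or any real system works. Every name in it (`Rule`, `Answer`, `find_stale`, the rule IDs, the version numbers, the dates, the 90-day window) is a hypothetical chosen for the example. The idea is simply that each answer records which version of which authoritative rule it was last checked against, and something runs on a schedule to flag answers whose rule has changed or that nobody has re-verified recently.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Rule:
    rule_id: str
    version: int   # hypothetical: bumped whenever the authoritative text changes
    summary: str


@dataclass(frozen=True)
class Answer:
    question: str
    rule_id: str
    grounded_on_version: int  # rule version this answer was last checked against
    last_verified: date


def find_stale(answers, rules, today, max_age=timedelta(days=90)):
    """Return (answer, reason) pairs for answers that need a human re-check."""
    stale = []
    for answer in answers:
        rule = rules.get(answer.rule_id)
        if rule is None:
            stale.append((answer, "no authoritative rule on file"))
        elif answer.grounded_on_version != rule.version:
            stale.append((answer, "rule changed since answer was grounded"))
        elif today - answer.last_verified > max_age:
            stale.append((answer, "not re-verified within max_age"))
    return stale


# Illustrative data only: versions and dates are invented for the example.
RULES = {
    "cash-acceptance": Rule("cash-acceptance", 2, "Businesses must accept cash."),
    "tip-retention": Rule("tip-retention", 1, "Employers may not keep staff tips."),
}
ANSWERS = [
    Answer("Can my store go cashless?", "cash-acceptance", 1, date(2024, 1, 10)),
    Answer("Can I keep part of my staff's tips?", "tip-retention", 1, date(2023, 10, 1)),
    Answer("Do I have to take cash?", "cash-acceptance", 2, date(2024, 3, 1)),
]

report = find_stale(ANSWERS, RULES, today=date(2024, 3, 15))
for answer, reason in report:
    print(f"{answer.question!r}: {reason}")
```

The check is deliberately boring. It does not judge whether an answer is right; it only records who last looked and against what, so that an answer nobody has revisited becomes visible instead of silently persisting.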

Most agent failures don't start with bad intent, or even bad logic. They start with a wrong belief the system had no way of knowing was wrong, because nothing was watching for the moment its answer stopped matching reality. Nobody assigned that job to MyCity. Most organisations haven't assigned it to their own AI either.

Loriq builds SAGE, the governed memory engine. Talk to us.