I don't think it's necessarily a bad thing that an AI got it wrong.
I think the bigger issue is why the AI model got it wrong. It got the diagnosis wrong because it is a language model, and a language model is fundamentally not fit for use as a diagnostic tool, not even as a screening aid for physicians.
There are AI tools designed for medical diagnoses, and those are indeed a major value-add for patients and physicians.
Tbf, 500 ms of latency on (IIRC) a loopback network connection in a test environment is a lot. It's not hugely surprising that a curious engineer dug into that.