We have been alarmed at how the profit-driven US medical system has been eagerly embracing AI despite ample evidence of uneven performance, such as too-frequent, serious errors in what ought to be simple tasks like transcribing MD treatment notes. Two new Reuters investigations plus an article in Nature Medicine give further cause for pause.
The more troubling of the two Reuters investigations is the newer one, which includes examples of harm done by AI-enhanced surgical tools. Worse, it specifically finds that recent AI updates have increased the incidence of serious errors and resulting harm to patients. The article opens with a major AI backfire by Johnson & Johnson:
In 2021, a unit of healthcare giant Johnson & Johnson announced “a leap forward”: It had added artificial intelligence to a medical device used…to assist ear, nose and throat specialists in surgeries.
The device had already been on the market for about three years. Until then, the U.S. Food and Drug Administration had received unconfirmed reports of seven instances in which the device malfunctioned and another report of a patient injury. Since AI was added to the device, the FDA has received unconfirmed reports of at least 100 malfunctions and adverse events.
At least 10 people were injured between late 2021 and November 2025, according to the reports. Most allegedly involved errors in which the TruDi Navigation System misinformed surgeons about the location of their instruments while they were using them inside patients’ heads during operations.
Cerebrospinal fluid reportedly leaked from one patient’s nose. In another reported case, a surgeon mistakenly punctured the base of a patient’s skull. In two other cases, patients each allegedly suffered strokes after a major artery was accidentally injured.
Let’s stop here. First, some of these injuries were severe. Second, they call into question one of the premises of medicine practiced by AI: that it can do an adequate job of what in humans would be visualization and the resulting decision-making. A comment on Twitter challenged this premise independently of these accounts, on the grounds that enough good training data may simply not be obtainable.
AI and robotics might be advancing at an amazing pace, but this is just BS, you still need a huge amount of training data to develop stand-alone surgical robots and this is not simply not available. Would be surprised if even 1% of surgeons is recording their procedures at the… https://t.co/3gC40zwGoE
— Dries Develtere (@DriesDeveltere) February 11, 2026
Admittedly, the tweet’s use case is fully autonomous AI surgery, as opposed to the surgical assistance at issue here. But the fact that patients are suffering major injuries in surgery because the tool misreports where it is inside the patient is a huge red flag.
The training set concern also raises doubts as to whether AI can ever adequately substitute for visual and manual examination by a doctor. KLG recently gave an example of the hazards of over-relying on a chart history rather than the physical presentation.1
Reuters later describes one of the strokes attributed to TruDi. Note it followed what should have been a minor procedure:
In June 2022, a surgeon inserted a small balloon into Erin Ralph’s sinus cavity…Dr. Marc Dean was employing the TruDi Navigation System, which uses AI, to confirm the position of his instruments inside her head.
The procedure, known as a sinuplasty, is a minimally invasive technique to treat chronic sinusitis. A balloon is inflated to enlarge the sinus cavity opening, to allow better drainage and relieve inflammation.
But the TruDi system “misled and misdirected” Dean, according to the lawsuit Ralph filed… A carotid artery – which supplies blood to the brain, face and neck – allegedly was injured, leading to a blood clot….Ralph’s lawyer told a judge that Dean’s own records showed he “had no idea he was anywhere near the carotid artery.”…
After Ralph left the hospital, it became apparent that she had suffered a stroke…A section of her skull was removed “to allow her brain room to swell,” the GoFundMe appeal stated.
“I am still working in therapy,” Ralph said in an interview more than a year later in a blog about stroke victims. “It is hard to walk without a brace and to get my left arm back working, again.”
The story reports a later horrorshow with the same doctor:
In May 2023, Dean was using TruDi in another sinuplasty operation when patient Donna Fernihough’s carotid artery allegedly “blew.” Blood “was spraying all over” – even landing on an Acclarent representative [Acclarent distributes TruDi] who was observing the surgery…
And we soon learn that Dean had a major conflict of interest:
Dean began consulting for Acclarent in 2014 and received more than $550,000 in consultant’s fees from the company through 2024, according to Open Payments, a federal database that tracks financial ties between companies and physicians. At least $135,000 of those fees related to the TruDi system.
While the focus on Dean might lead readers to conclude he’s just a not-very-good doctor somehow made worse by TruDi, the Reuters story describes how this AI-induced deterioration in the performance of Johnson & Johnson’s TruDi is not an isolated case:
At least 1,357 medical devices using AI are now authorized by the FDA – double the number it had allowed through 2022. The TruDi system isn’t the only one to come under question: The FDA has received reports involving dozens of other AI-enhanced devices, including a heart monitor said to have overlooked abnormal heartbeats and an ultrasound device that allegedly misidentified fetal body parts.
Researchers from Johns Hopkins, Georgetown and Yale universities recently found that 60 FDA-authorized medical devices using AI were linked to 182 product recalls, according to a research letter published in the JAMA Health Forum in August. Their review showed that 43% of the recalls occurred less than a year after the devices were greenlighted. That’s about twice the recall rate of all devices authorized under similar FDA rules….
Reuters found that at least 1,401 of the reports filed to the FDA between 2021 and October 2025 concern medical devices that are on an FDA list of 1,357 products that use AI. The agency says the list isn’t comprehensive. Of those reports, at least 115 mention problems with software, algorithms or programming.
One FDA report in June 2025 alleged that AI software used for prenatal ultrasounds was misidentifying fetal body parts. Called Sonio Detect, it uses machine learning techniques to help analyze fetal images….
At least 16 reports claimed that AI-assisted heart monitors made by medical-device giant Medtronic failed to recognize abnormal rhythms or pauses. None of the reports mentioned injuries. Medtronic told the FDA that some of the incidents were caused by “user confusion.”
Even as these incidents rise, so too does AI deployment in devices.
Reuters takes pains to point out that software and computer assistance, including what formerly might have been called algos, are not new, and that these advances have often been beneficial, such as pattern-matching to enhance images in cancer exams. It also laments at considerable length that the FDA had established a large team to evaluate AI in devices, but the Trump Administration has been gutting it.
I wish the Reuters account had mentioned two other issues. First, FDA regulation of devices is much more permissive than its regulation of drugs. Second, any introduction of software into a medical device, which is increasingly common, is problematic in its own right. Like buyers of smart homes, patients are at risk of the vendor going out of business or (as with the AI here) of updates making matters worse.
We’ll give a shorter recap of the Reuters post from the start of the week, but urge you to read it in full. “AI-powered apps and bots are barging into medicine. Doctors have questions” gives numerous examples of patients seeking medical advice from AI and, in too many instances, getting alarmingly bad answers, like a bot telling a particular patient twice within months that he was set to die soon of cancer. But the vendors insist that the AI isn’t giving “advice”. Help me:
A growing number of mobile apps available on the Apple and Google app stores claim to use AI to assist patients with their medical complaints – even though they’re not supposed to offer diagnoses.
Under U.S. Food and Drug Administration guidelines, AI-based medical apps don’t require approval if they “are intended generally for patient education, and are not intended for use in the diagnosis of disease or other conditions.” Many apps have disclaimers that they aren’t a diagnostic tool and shouldn’t be used as a substitute for a physician. Some developers seem to be stretching the limits.
An app called “Eureka Health: AI Doctor” touted itself as “Your all-in-one personal health companion.” It stated on Apple’s App Store that it was “FOR INFORMATIONAL PURPOSES ONLY” and “does not diagnose or treat disease.”
But its developer, Sam Dot Co, also promoted the app on a website, where it stated in big letters: “Become your own doctor.”
“Ask, diagnose, treat,” the site stated. “Our AI doesn’t just diagnose – it connects you to prescriptions, lab orders, and real-world care.”
Apple removed the Eureka Health app only after Reuters made inquiries, confirming lax oversight.
And some of these apps are piss poor:
“AI Dermatologist: Skin Scanner” says on its website that it has more than 940,000 users and “has the same accuracy as a professional dermatologist.” Users can upload photos of moles and other skin conditions, and AI provides an “instant” risk assessment. “AI Dermatologist can save your life,” the site claims….
The app claims “over 97% accuracy.” But it has drawn hundreds of one-star reviews on app stores, and many users complain it’s inaccurate.
Finally, to the paper in Nature Medicine. Troublingly, it compares AI chatbot performance in diagnosis and treatment to patients winging it. I am not making that up. The paper uses as its control for diagnosis patients NOT seeing a doctor. While one can see this as a look at how a very carefully designed chatbot performs (it was subjected to multiple levels of design, review, and testing), it has the effect of normalizing the idea of having AI play at being doctors.
A Twitter overview:
A paper in Nature Medicine suggests that large language models may not help members of the public make better decisions about their health in everyday medical situations. https://t.co/k086BJL2Qn pic.twitter.com/ND6kWq9cZ4
— Nature Portfolio (@NaturePortfolio) February 10, 2026
Note that the summary does not convey that “everyday situations” includes whether or not to go to the ER. From the paper:
We tested whether LLMs can assist members of the public in identifying underlying conditions and choosing a course of action (disposition) in ten medical scenarios in a controlled study with 1,298 participants. Participants were randomly assigned to receive assistance from an LLM (GPT-4o, Llama 3, Command R+) or a source of their choice (control).
The LLMs had performed well…except with actual patient-type humans:
Tested alone, LLMs complete the scenarios accurately, correctly identifying conditions in 94.9% of cases and disposition in 56.3% on average. However, participants using the same LLMs identified relevant conditions in fewer than 34.5% of cases and disposition in fewer than 44.2%, both no better than the control group.
The researchers did not constrain what the control group did, so they may have used the Internet, asked someone who had had similar symptoms or was otherwise deemed to be knowledgeable, or maybe even just relied on earlier doctor warnings. The point is the AI performed no better than patients stumbling around on their own.
So there is indeed a lot not to like about our Brave New World of more AI, less care by medical practitioners. But it’s going to be foisted even more on all of us. Enjoy being a guinea pig.
_____
1 From Coffee Break: Science and Medicine, Bad and Good:
Dr. Will Lyon is a geriatrician from Wauwatosa, Wisconsin (pop. ~48,000). From Front Porch Republic he writes of the practice of modern medicine in “Doctoring and the Device Paradigm”:
Before doctors see a patient, they perform a procedure called “chart review.” This involves reviewing the patient’s history, medications, lab or imaging data, and notes from any recent specialist visits or hospital stays. There is variation in how much chart review one prefers to perform before meeting a patient, but in general it is good and necessary to be sufficiently informed and prepared before the visit. But chart review can be a double-edged sword: it can save time and help put the history you obtain and the physical exam you perform into context, but it can also box you in to a false understanding of who the patient is. In the age of ubiquitous electronic health records, which promise an ostensibly more efficient method of chart review but also contain vast amounts of information, chart review can become daunting.
…
Several years ago, when my wife’s grandfather – “Opa” – presented to the ER with shortness of breath while I was on service in the hospital, I learned the value of meeting the real patient first.
I only learned of his arrival because I was notified by my family, and I could not access his medical record. Instead, I went straight to meet him in his emergency department room. On my elevator ride down, I thought about his shortness of breath. I knew that he had had a myocardial infarction earlier in the year, treated with the placement of a coronary stent.
When I walked in the room, he looked almost as pale as the bedsheets. When I shook his hands, I noticed that they were cool. He described feeling lightheaded whenever he stood up at home and was so short of breath that he wasn’t able to walk across his living room – a drastic change in his functional status. All of these signs suggested a common cause – anemia. However, the iPatient’s story suggested at a different suspected cause: new or recurrent heart problems. Or so I learned when the ER doctor stopped by.
Turns out it was anemia and not a heart attack, and according to Dr. Lyon “I still think of Opa’s case when I get lost in the weeds of chart review and need to remember that sometimes, the most valuable information is gathered from the patient by using our eyes, ears, and hands.” This is the lesson we try to teach our medical students from their first few weeks of medical school, even as they are consumed by biochemistry, genetics, and cell biology. From Dr. Lyon:
I do not intend to minimize the importance of reviewing the patient’s chart. Oftentimes, a thorough review provides critical information that guides your clinical approach (in the case I described, the fact that Opa was on blood thinners would increase the likelihood of blood loss as the cause of his anemia). Failing to identify key history on chart review can have devastating consequences, especially in the case of complex medical patients. The error comes when we mistake the iPatient for the flesh-and-blood human being in the exam room or hospital bed.