AI chatbots fail medical misinformation test, returning inaccurate and fabricated advice

Sahwa@reddthat.com · 18 days ago

Rimu@piefed.social · 17 days ago

The study was done in Feb 2025, and they probably wrote the research proposal months before then, waited for approval and funding, etc. I don’t know how the academic process works, but I imagine it’s very slow and bureaucratic.

https://bmjopen.bmj.com/content/16/4/e112695

pooterbroo@programming.dev · 17 days ago

Well, they didn’t even use the latest models available in Feb 2025. They should’ve used DeepSeek R1 and OpenAI o3-mini, which use additional test-time compute to arrive at better answers. They used GPT-3.5, which was about 2½ years old at the time.