Researchers tested different medical scenarios with the chatbot. In more than half of cases in which doctors would send patients to the ER, the chatbot said it was OK to delay care.
ChatGPT Health — OpenAI’s new health-focused chatbot — frequently underestimated the severity of medical emergencies, according to a study published last week in the journal Nature Medicine.
In the study, researchers tested ChatGPT Health’s ability to triage, or assess the severity of, medical cases based on real-life scenarios.
Previous research has shown that ChatGPT can pass medical exams, and nearly two-thirds of physicians reported using some form of AI in 2024. But other research has shown that chatbots, including ChatGPT, don’t provide reliable medical advice.



LLMs, like all computer software, are deterministic: the same inputs produce the same outputs. The nondeterminism users see comes from random parameters inserted into the sampling process — and that only makes the model act nondeterministically if you assume the random source itself is nondeterministic.
You’re being downvoted because LLMs, as deployed, aren’t deterministic — it’s arguably the biggest issue in productizing them. LLMs have a setting called “temperature” that randomizes the next-token selection process, which makes their output inherently nondeterministic.
If you set the temperature to 0, it will produce consistent results, but the “quality” of the output drops significantly.
If you give whatever random data source it uses the same seed, it will output the same thing.
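A minimal sketch of that claim, using NumPy’s seeded generator as a stand-in for whatever RNG a real inference stack uses (the three-token distribution here is made up, not from any actual model):

```python
import numpy as np

def sample_tokens(seed, n=5):
    """Draw n tokens from a fixed toy distribution with a seeded RNG."""
    rng = np.random.default_rng(seed)
    probs = [0.5, 0.3, 0.2]  # hypothetical next-token probabilities
    return [int(rng.choice(3, p=probs)) for _ in range(n)]

# Same seed in, same "tokens" out, every time:
print(sample_tokens(123) == sample_tokens(123))  # True
```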
So question then, what parameter controls deterministic results for an LLM?
It’s the temperature. If you set it to 0, no randomness is introduced.
Of course it impairs the LLM substantially, but you CAN get deterministic results.
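A sketch of how temperature enters the sampling step — the logits and vocabulary size below are illustrative, not taken from any real model:

```python
import numpy as np

def sample_token(logits, temperature, rng):
    """Pick a token index: greedy at temperature 0, weighted-random otherwise."""
    logits = np.asarray(logits, dtype=float)
    if temperature == 0:
        return int(np.argmax(logits))          # deterministic greedy pick
    scaled = logits / temperature
    scaled -= scaled.max()                     # subtract max for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return int(rng.choice(len(logits), p=probs))

logits = [2.0, 1.0, 0.5, -1.0]                 # hypothetical scores for 4 tokens

# Temperature 0 ignores the RNG entirely: always picks the top token (index 0).
print(sample_token(logits, 0, np.random.default_rng()))  # 0
```

At temperature 0 the RNG never gets consulted, which is why the result is consistent regardless of seed.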
I honestly don’t know. I think all that matters is the token window and a random seed used for a random weighted choice.
I encourage you to do some additional research on LLMs and the underlying mathematical models before making statements based on incorrect information.
The answer to this question was Temperature. It’s one of the many hyperparameters available to the engineer loading the model. Begin with looking into the difference between hyperparameters and parameters, as they relate to LLMs.
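To make that distinction concrete (all names and values below are illustrative, not a real API): parameters are the weights learned during training, while hyperparameters like temperature are knobs chosen by whoever loads or serves the model:

```python
# Learned during training; fixed by the time the model is served.
model_parameters = {
    "embedding.weight": [0.12, -0.34, 0.56],  # toy stand-in for real weights
}

# Chosen by the engineer at load/serve time; not learned from data.
decoding_hyperparameters = {
    "temperature": 0.0,  # 0 => greedy, deterministic token choice
    "top_p": 1.0,
    "seed": 7,
}
```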
I’m one of the contributors to the LIDA cognitive architecture. This is my space, and I want to help people learn so we can begin to use this technology as it was intended — not all this marketing wank.
Listen, this is going to sound like a loaded inflammatory question and I don’t really know how to fix that over text, but you say you’re in the space and I’m genuinely curious as to your take on this:
Do you think it’s possible to build LLM technology in a way that:
The core problem with this technology is the misuse/misunderstanding that:
Thank you for coming to my autistic TED talk <3
Edit: Also, fantastic question and never apologize for wanting to learn; keep that hunger and run with it
Well, this was exactly the answer I expected but I’m still disappointed.
I feel like I’m in a niche position where I want the technology to deliver on promises made (not inherently anti-AI) but even if they did I would still refuse to use them until the ethical and moral issues get solved in their creation and use (definitely anti-cramming-LLMs-into-every-facet-of-our-lives).
I miss being excited about machine learning, but LLMs being the whole topic now is so disappointing. Give us back domain specific, bespoke ML applications.
Not who you asked but
https://lemmy.world/comment/22464598
Showing that someone hasn’t answered your quiz question correctly isn’t a great way to make an argument.
You’ve missed the point — I was responding to someone answering in an authoritative manner about something of which they were misinformed. I posed a question someone in the space would immediately know. The disappointing part is that simply pasting my question into any search engine or LLM would immediately have returned “Temperature.”
This is a perfect example of how we’re using our brain less and less and simply relying on “something” else to answer it for us. Do your research. Learn and teach.
Nothing Kairos is saying is misinformation, though. Temperature applies randomness to a generated probability distribution over tokens. That doesn’t mean the probability distribution wasn’t generated deterministically, and it doesn’t mean the randomness applied couldn’t itself be made deterministic. How they describe it working is accurate; they don’t need to prove their qualifications or knowledge of jargon for that to be a good argument, and by focusing on that aspect in a way that doesn’t contradict their point, you are making a bad one.
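A sketch of the distinction being drawn here: the toy “model” below computes its distribution deterministically, and the sampling randomness becomes deterministic too once the seed is fixed (every name and number is illustrative, not from a real system):

```python
import numpy as np

def toy_logits(prompt):
    """Deterministic stand-in for a forward pass: same prompt, same logits."""
    h = sum(ord(c) for c in prompt)
    return np.array([h % 3, h % 5, h % 7], dtype=float)

def generate(prompt, temperature, seed):
    rng = np.random.default_rng(seed)
    logits = toy_logits(prompt)                  # deterministic distribution
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))  # deterministic given the seed

# Fixed prompt + fixed seed: the whole pipeline is reproducible.
print(generate("hello", 1.0, 0) == generate("hello", 1.0, 0))  # True
```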
What’s lost is the question of what determinism even means in this context, or why the property of being deterministic would matter at all. It is unclear how being deterministic or nondeterministic, by any definition, would have anything to do with how good an LLM is at making correct medical decisions, as the person who started this comment chain was implying.