Asking AI for advice? Be VERY careful!
AI-powered assistants are truly ubiquitous nowadays, whether you like them or not. Every major browser now has a built-in connection to a large language model (LLM, the term professionals use for what passes as artificial intelligence today) that users can summon and query with a single click or a hotkey combo. Those in the creative industries use AI all the time: while prone to hallucinations (a technical term here, not a frustrated user's jab), such tools are good at collecting, listing, and fetching information on request. Anything important should still be double-checked, though, precisely because of that proneness to hallucinate. With that in mind, a group of researchers from the University of Erlangen-Nuremberg in Germany, together with colleagues from an institute in Belgium, staged an experiment to see how accurate Microsoft's Copilot is when prompted for medical advice. The results were… alarming, to say the least.
Testing AI’s accuracy on medicine-related queries
To test Microsoft’s Copilot, the researchers tasked it with answering 10 commonly asked medical questions about each of the 50 most prescribed drugs in the United States, yielding an array of 500 queries. The topics covered typical use cases, mechanisms of action, usage instructions, common side effects, and so on.
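To make the setup concrete, here is a minimal Python sketch of how such a query battery could be assembled. The drug names and question templates are illustrative stand-ins, not the ones the researchers actually used:

```python
# Hypothetical question templates; the study used 10 per drug.
QUESTION_TEMPLATES = [
    "What is {drug} used for?",
    "How does {drug} work?",
    "How should {drug} be taken?",
    "What are the common side effects of {drug}?",
    # ... 10 templates in total
]

# Illustrative stand-ins; the study used the 50 most prescribed US drugs.
TOP_DRUGS = ["atorvastatin", "lisinopril", "metformin"]  # ... 50 in total

# One query per (drug, question) pair:
queries = [t.format(drug=d) for d in TOP_DRUGS for t in QUESTION_TEMPLATES]
# With the full lists, 50 drugs x 10 questions = 500 queries,
# matching the array described above.
```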
After an initial appraisal of the responses for completeness (23% were found to lack essential information), they were submitted to a board of seven drug safety experts, who evaluated the actual accuracy of the information and advice the AI gave. Without further ado, here’s the roundup:
- only 54% of Copilot's responses actually aligned with what’s universally agreed upon in medical circles;
- 36% of the replies were ultimately deemed harmless;
- 42% of the pieces of advice given could, if followed, lead to moderate or severe harm;
- 22% of the responses could actually cause death.
How to prompt AI to get accurate replies
There are several proven techniques that help get good, trustworthy answers from AI assistants; we’ll cover them in depth in a separate piece, so stay tuned. For the purposes of this post, here are some AI prompting best practices you can adopt to feel safer when asking an LLM for information (a code sketch putting them together follows the list).
- Be specific. Frame questions with clear, specific details, avoiding even a hint of ambiguity.
- Limit scope. Narrow your questions down; don’t force the AI into a broad inquiry if you want an actual answer rather than ruminations watered down with hallucinations.
- Request citations. Ask for sources or references so the information provided can be verified against established knowledge.
- Ask follow-up questions. Make sure the AI really understands what you’re after by asking follow-up questions to clarify or to dig deeper into specific areas of interest.
- Cross-verify. Google is still there; double-check everything essential in AI replies, especially when health is at stake.
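Here is the promised sketch showing how the practices above translate into an actual exchange with an LLM. It assumes an OpenAI-compatible endpoint and the `openai` Python package; the model name and prompt wording are illustrative choices, not recommendations:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Be specific and limit scope: one drug, one narrow question, no ambiguity.
question = (
    "For adult patients, what are the three most common side effects "
    "of ibuprofen at over-the-counter doses? "
    # Request citations so the answer can be verified later.
    "Cite a published source for each side effect you list."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # hypothetical pick; use whatever model you have access to
    messages=[{"role": "user", "content": question}],
)
answer = response.choices[0].message.content
print(answer)

# Ask a follow-up question, keeping the prior exchange in the context
# so the model knows exactly what "these side effects" refers to.
follow_up = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer},
        {
            "role": "user",
            "content": "Which of these side effects warrants stopping "
                       "the drug and consulting a doctor?",
        },
    ],
)
print(follow_up.choices[0].message.content)

# Cross-verify: none of the above replaces checking the cited sources
# yourself, especially for health-related answers.
```

Note that the code only structures the conversation; the final step, actually looking up the citations the model returns, still happens outside the script, and that is the point.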