How to Avoid AI Hallucinations That Damage Your Brand With Chain-of-Thought Prompting Techniques

Actionable Steps to Protect Your Business & Bottom Line from the Dangers of AI Hallucinations. Plus New AI Breakthroughs.

In today’s Future Friday…

Just like humans, AI models hallucinate too. And when they do, it can spell trouble for businesses.

But do you know what techniques or solutions solve this? Let’s jump into them.


Join 9,000+ founders getting actionable golden nuggets that are tailored to make your business more profitable.

TOPIC OF THE WEEK

AI Hallucinations: A Threat to Your Brand, Customer Trust, and Profits

Quick question: Is it true that George Washington once tweeted, “I cannot tell a lie... except on social media”?

I’m sure you’d agree that the first US president said no such thing.

Imagine if you heard that quote from ChatGPT. Would you suddenly believe it's legit? Probably not, right?

Large language models (LLMs) like ChatGPT and Google Bard can churn out loads of wild, confident-sounding stuff that simply isn’t true. And it’s becoming a huge problem as more organizations and individuals adopt AI.

In fact, McKinsey’s global survey found that nearly half of organizations are actively mitigating what they see as the biggest risk in these early AI days: inaccuracy, or in this case, “AI hallucinations.”

When AI hallucinates, it comes up with an answer that sounds completely real even though it is factually incorrect.

Here are some examples:

  1. Sentence Contradiction - The AI generates sentences that contradict each other within the same context or narrative.

    • Prompt: "Describe a day at the beach in short sentences."

    • Output: "The sun is shining. It's a rainy day. Kids are building sandcastles. There's no one around."

  2. Prompt Contradiction - The AI’s response contradicts the given prompt, leading to an output that doesn’t align with the requested information or context.

    • Prompt: "Write a congratulatory message for a graduation."

    • Output: "Sorry for your loss. We're all here to support you during this difficult time."

  3. Factual Contradiction - The AI provides information that is factually incorrect, confidently presenting fictitious data as facts.

    • Prompt: "What is the capital of France?"

    • Output: "The capital of France is Barcelona."

  4. Random Hallucinations - The AI generates information that is irrelevant to the prompt, introducing unrelated elements into the response.

    • Prompt: "Discuss the economic impact of COVID-19."

    • Output: "COVID-19 affected economies globally. Also, dolphins have been known to play with their food."

“Even state-of-the-art models are prone to producing falsehoods – they exhibit a tendency to invent facts in moments of uncertainty.

These hallucinations are particularly problematic in domains that require multi-step reasoning since a single logical error is enough to derail a much larger solution.”

OpenAI researchers

🏆 Golden Nuggets

  • Even advanced LLMs are prone to fabricating facts when faced with uncertainty.

  • Almost 50% of organizations are in the trenches, actively working to counteract the inaccuracies produced by AI.

  • This emphasizes the need for content filtering and clear, concise questioning to mitigate the risk of AI hallucinations.

💰 Impact On Your Business

Businesses are caught between the competitive pressure to advance in AI and the ethical responsibility to ensure the safety and accuracy of deployed AI tools. And the impact of AI hallucinations is multifaceted, affecting businesses, consumers, and society:

  • Spreading misinformation can be damaging to any brand.

  • Using AI for logistics and stock management? A glitch could mean you're either out of stock or drowning in excess. Not ideal, right?

  • If your AI customer service bots or recommendation systems slip up, it can lead to customer dissatisfaction, more complaints, and some might even leave.

  • Fixing these AI slip-ups isn't a one-time thing. It means constantly tweaking the model, and yes, that costs time and money.

How to Avoid Hallucinations With Chain-of-Thought Prompting Techniques

⚒️ Actionable Steps:

  1. Sequential Questioning - Instead of asking a complex question outright, break it down into a series of simpler questions that lead to the final answer (see the code sketch after this list for how to chain them).

     Instead of asking, "How does photosynthesis work?"

    ✔️ Start with:


    "What is photosynthesis?"

    "What are the main components involved in photosynthesis?"

    "What happens in the first stage of photosynthesis?"

    "What's the next stage after that?"

    ... and so on.

  2. Ask for Reasoning - After getting an answer, ask the model to explain its reasoning or the steps it took to arrive at that conclusion.

    ✔️ Question: "What's the capital of France?"

    ✔️ Follow-up: "How did you determine that answer?"

  3. Use Contextual Prompts - Provide the model with a role or context that encourages a step-by-step approach.


    ✔️ Prompt: "Imagine you're a teacher explaining a new concept to a student. Describe the process of cellular respiration step by step."

  4. Feedback Loop - If the model provides an answer that seems to skip steps or doesn't provide a clear chain of thought, ask it to clarify or elaborate on specific points.


    ✔️ Model's Answer: "Evaporation is when water turns into vapor."

    ✔️ Your Follow-up: "Can you explain the steps or conditions necessary for evaporation to occur?"

  5. Use Visual Aids or Analogies - Ask the model to describe processes or concepts as if it were drawing a diagram or using an analogy. This can encourage a more structured response.


    ✔️ Question: "Explain the water cycle as if you were drawing a diagram for a classroom."
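
Putting steps 1, 2, and 4 together, here's a minimal sketch of what that looks like against a chat-style API. It assumes the openai Python package; the model name, the system prompt, and the ask helper are placeholders for whatever stack you actually use.

    # Minimal chain-of-thought prompting sketch: break the big question into
    # smaller ones, keep the running conversation as context, then ask the
    # model to show its reasoning before you trust the final answer.
    # Assumes the `openai` package and an OPENAI_API_KEY in your environment.
    from openai import OpenAI

    client = OpenAI()
    history = [{"role": "system",
                "content": "You are a careful assistant. Answer step by step "
                           "and say 'I am not sure' instead of guessing."}]

    def ask(question: str) -> str:
        """Send one question and keep it (plus the answer) in the shared history."""
        history.append({"role": "user", "content": question})
        reply = client.chat.completions.create(
            model="gpt-4o-mini",   # placeholder model name
            messages=history,
            temperature=0,         # lower temperature = fewer creative leaps
        )
        answer = reply.choices[0].message.content
        history.append({"role": "assistant", "content": answer})
        return answer

    # 1. Sequential questioning instead of one big ask
    print(ask("What is photosynthesis?"))
    print(ask("What are the main components involved in photosynthesis?"))
    print(ask("What happens in the first stage of photosynthesis?"))

    # 2. Ask for reasoning behind the last answer
    print(ask("How did you determine that answer? Walk me through your reasoning."))

    # 4. Feedback loop: push back when a step is skipped or unclear
    print(ask("Can you elaborate on the role of chlorophyll in that stage?"))

The exact wording of the follow-ups matters less than the pattern: small questions, shared context, and an explicit request for the reasoning behind each answer.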

💡 Ideas to Marinate

The tolerance for AI mistakes varies depending on the field. 

Think about it: in high-stakes industries like healthcare or the legal field, even a tiny slip-up can lead to major consequences.

It's a whole different ballgame compared to something like customer service, where you can usually make things right after a mistake.

So, it's not just about reducing hallucinations; it's about fine-tuning AI to match the context and industry demands. Identifying a baseline for an acceptable level of errors is also crucial.

Can We Fix AI’s Hallucination Problem?

In a recent podcast, Arthur.ai CEO Adam Wenchel said they’re working on automated solutions that can catch hallucinations about 87% of the time.

How can they make that possible?

In a nutshell, the process involves augmenting prompts with data and breaking down responses into claims. Then, they determine whether each claim is supported, not supported, or contradicted by the data.
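
We don't have Arthur's actual code, but the general pattern is simple enough to sketch: attach your trusted data to the prompt, split the model's answer into individual claims, and run a second pass that grades each claim against that data. Everything below (function names, the judge prompt, the model) is illustrative, not Arthur.ai's implementation.

    # Illustrative claim-checking sketch: answer from your own data, then grade
    # each claim in the answer as supported, not supported, or contradicted.
    # NOT Arthur.ai's code; every name here is made up for the example.
    from openai import OpenAI

    client = OpenAI()
    MODEL = "gpt-4o-mini"  # placeholder model name

    def complete(prompt: str) -> str:
        resp = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        return resp.choices[0].message.content.strip()

    def answer_with_context(question: str, source_data: str) -> str:
        # Step 1: augment the prompt with your own trusted data
        return complete("Using ONLY the data below, answer the question.\n\n"
                        f"DATA:\n{source_data}\n\nQUESTION: {question}")

    def check_claims(answer: str, source_data: str) -> list:
        # Step 2: crudely split the answer into one claim per sentence
        claims = [s.strip() for s in answer.split(".") if s.strip()]
        results = []
        for claim in claims:
            # Step 3: grade each claim against the data
            verdict = complete(
                f"DATA:\n{source_data}\n\nCLAIM: {claim}\n\n"
                "Reply with one word: SUPPORTED, NOT_SUPPORTED, or CONTRADICTED."
            )
            results.append((claim, verdict))
        return results

    data = "Q3 revenue was $1.2M. Churn dropped from 6% to 4%."
    answer = answer_with_context("How did the business do in Q3?", data)
    for claim, verdict in check_claims(answer, data):
        print(f"[{verdict}] {claim}")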

Another similar approach comes from ex-Uber executive Flo Crivello:

We all know generalist AI models like ChatGPT might excel at summarizing a news article, but then come up with something like “The capital of France is Barcelona”.

This is because generalist models are trained on massive datasets of text and code, which can contain inaccuracies, biases, and even hallucinations.

So, what if there were a tool that could help collect labeled data for specific tasks and train AI models on that data, producing models that are less likely to hallucinate? 🤔 That’s exactly what Flo proposed.

🏆 Golden Nuggets

  • The key takeaway is the potential of GPT proxies to mitigate AI hallucinations.

  • They can be used to identify specific tasks where the generalist model is prone to hallucinating.

  • Once these tasks have been identified, specialized models can be trained to perform those tasks more accurately.

Fine-tuning that fits your needs

It seems like OpenPipe heard Flo’s prayers. OpenPipe offers a solution that converts “slow and expensive” general-purpose LLMs into “fast, cheap, and often more accurate models.”

💰 Impact On Your Business

  • These small models, fine-tuned on a specific prompt, excel at tasks, especially data extraction and classification.

  • They are cost-effective and efficient, offering up to 50x cost reduction compared to running ChatGPT.

⚒️ Actionable Steps

  1. Use GPT proxies like OpenPipe to collect labeled data for specific tasks. This helps you identify the tasks where the generalist model is prone to hallucinating (see the sketch after this list for what that data collection can look like).

  2. Once you've identified those tasks, train specialized models to perform them more accurately.

  3. Be critical of the output of generalist models. Always verify the accuracy of the information they generate, especially when it's used for important tasks.
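
As a rough illustration of step 1, "collect labeled data for a specific task" can be as simple as logging every prompt/response pair, plus the human-corrected label, to a JSONL file in the chat format most fine-tuning APIs accept. This is a generic sketch (the email-labeling task and file name are just examples), not OpenPipe's SDK.

    # Generic sketch of collecting labeled examples for later fine-tuning.
    # Not OpenPipe's SDK; just a plain JSONL logger in a chat-style format.
    import json
    from pathlib import Path

    LOG_FILE = Path("email_labeling_dataset.jsonl")  # hypothetical task: email labeling

    def log_example(email_text, model_label, human_label=None):
        """Record one example; prefer the human-corrected label when available."""
        record = {
            "messages": [
                {"role": "system",
                 "content": "Classify the email as spam, important, or promotional."},
                {"role": "user", "content": email_text},
                {"role": "assistant", "content": human_label or model_label},
            ],
            # keep the raw model output too, so you can measure how often it was wrong
            "model_label": model_label,
            "human_corrected": human_label is not None,
        }
        with LOG_FILE.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    # Example: the generalist model said "important", a reviewer corrected it
    log_example("Huge discount on our new AI course, today only!",
                model_label="important", human_label="promotional")

Once a few hundred of these are logged, the same file doubles as an error report (how often the generalist model got this task wrong) and a training set for a smaller specialized model.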

💡 Business Use Cases

Here are some examples of how specialized AI models like OpenPipe could be used:

  • Summarization Model - condense lengthy documents like research papers.

  • Entity Extraction Model - pinpoint key details like names and places from text.

  • Email Labeling Model - sort emails as spam, important, or promotional.

  • Prompt Classification Model - determine the task a prompt calls for, like summarizing or translating.

CAVEMINDS’ CURATION

A set of solid alternatives to OpenPipe

  • Arthur Bench - Arthur.ai has created a tool that helps companies evaluate LLMs in a “quick and consistent way.” It provides scoring metrics to summarize quality and avoid hallucinations.

  • Vianai’s veryLLM - Similar to OpenPipe, it’s an open-source toolkit that can classify statements into distinct categories using context pools that LLMs are trained on. veryLLM is still under development, but it has the potential to be a valuable tool for reducing the risk of AI hallucinations.

  • Hugging Face Hub - Hugging Face has a fine-tuning capability that allows users to fine-tune pre-trained LLMs on their own datasets.

Visit our Cyber Cave and access the most extensive tool database on the internet.

NEEDLE MOVERS

At a recent event, SoftBank CEO Masayoshi Son called out those who don’t recognize that AGI will be 10 times smarter than the collective intelligence of all humans in less than a decade.

He also mentioned that not using OpenAI's ChatGPT now is like living without electricity.

Funny enough, most of the people at this event were taking notes on paper notebooks, while AI can write it down for you.

OpenAI CEO Sam Altman and the former chief design officer of Apple, Jony Ive, are developing a wearable device that could help people manage their lives in a more efficient and effective way.

Whistleblowers say it can do almost everything, except convince you that '5 more minutes' in bed is a bad idea.

Oh, and btw, SoftBank CEO Masayoshi Son is also involved in the discussions with Altman and Ive—duh.

The man who invested over $140 billion in AI is really pushing SoftBank to be ‘the investment company for the AI revolution’.

The brilliant minds at DeepgramAI have unveiled their groundbreaking Nova 2 model. This new tool can transcribe 1 hour of audio in under 30 seconds.

Aside from being lightning-fast (5x to 40x faster), it also claims 30% fewer errors than competitors in real-time transcription.

That’s all for today!


How was today's deep dive, cavebros, and cavebabes?


Don't be shy, give us your thoughts, whether it's a 🔥 or a 💩.

We promise we won't hunt you down. 😉

 

🌄 CaveTime is Over! 🌄

Thanks for reading, and until next time. Stay primal!
