From: aidotengineer

Prompt engineering remains crucial for effectively working with Large Language Models (LLMs) and developing AI-based features [00:00:27].

Why Prompt Engineering is Still Important

While some memes suggest simply telling a model what to do, anyone who has actually shipped an AI-based feature understands that it is much more nuanced [00:01:00]. Understanding what you want the model to do can itself be a significant challenge [00:01:07].

Prompt engineering serves as an excellent starting point for individuals [00:01:12], offering the easiest and most accessible method to obtain better outputs from LLMs [00:01:15].

It is part of a larger system [00:01:19]. While everyone may have access to the same models, the prompts and architecture surrounding them can provide a competitive advantage in a product [00:01:23].

Simplicity and Efficiency

Prioritizing the simplest solution is key [00:01:38]. It’s easy to get carried away when working with LLMs [00:01:42]. Spending an hour experimenting with a prompt before concluding that a complex Retrieval Augmented Generation (RAG) system is necessary is not advisable [00:01:48]. If a solution can be achieved through prompt engineering, it is often much simpler to manage [00:02:00].

Key Prompt Engineering Methods

The speaker highlights two main and highly effective methods: Chain of Thought prompting and few-shot prompting. Other methods often fall under the umbrella of general reasoning prompts [00:02:14].

Chain of Thought Prompting

This method involves instructing the model to reason or “think about the problem” before providing an answer [00:02:27]. It helps break down problems into sub-problems [00:03:38], offering insight into the model’s thinking process, which aids troubleshooting [00:02:41]. It is widely applicable, easy to implement [00:02:48], and so powerful that it’s now often built into reasoning models [00:02:52].

A classic zero-shot approach is to add a phrase like “think step by step” or “take a deep breath and work through this step by step” to encourage reasoning before the output [00:03:12]. Few-shot examples of reasoning steps can also be provided [00:03:17]. LLMs can also be used to generate these reasoning chains, for instance, through frameworks like Automatic Chain of Thought or AutoReason [00:03:30].
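The zero-shot variant can be as simple as appending a reasoning cue to the task before sending it to the model. A minimal sketch (the helper name and exact wording are illustrative, not from the talk):

```python
def with_chain_of_thought(task: str) -> str:
    """Append a zero-shot chain-of-thought cue to a task prompt.

    The cue asks the model to reason before answering, which also
    exposes its thinking process for troubleshooting.
    """
    return (
        f"{task}\n\n"
        "Think step by step and explain your reasoning "
        "before giving the final answer."
    )

prompt = with_chain_of_thought("What is 17% of 240?")
```

The resulting string would then be sent as the user message to whichever model API you are using.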

Few-shot Prompting

This involves including examples within the prompt for the model to mimic, effectively “showing rather than telling” [00:04:47]. Most performance gains are achieved with just one or two examples [00:05:19], as adding too many examples can sometimes degrade performance [00:05:29]. Builders typically need only one or two diverse examples to cover different inputs [00:05:37].

Meta Prompting

This involves using an LLM itself to create, refine, or improve a prompt [00:05:55]. Various frameworks and free tools exist for this purpose, such as those provided by Anthropic, OpenAI’s playground, and PromptHub [00:06:11]. PromptHub’s tool tailors the meta-prompt based on the selected model provider, as prompts optimized for one model might not be effective for another [00:06:22].
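A meta-prompt is itself just a prompt whose subject is another prompt. A minimal sketch of the idea, not the actual template used by Anthropic, OpenAI, or PromptHub (the wording and the `provider` parameter are assumptions, echoing PromptHub's per-provider tailoring):

```python
# Hypothetical meta-prompt template; real tools use far more
# elaborate, provider-specific instructions.
META_PROMPT_TEMPLATE = """You are an expert prompt engineer.
Improve the prompt below for use with {provider} models.
Keep the original intent, make the instructions explicit,
add output-format constraints, and remove ambiguity.

--- ORIGINAL PROMPT ---
{prompt}
--- END ---

Return only the improved prompt."""

def build_meta_prompt(prompt: str, provider: str = "OpenAI") -> str:
    """Fill the template; the result is sent to an LLM, whose
    response is the refined prompt."""
    return META_PROMPT_TEMPLATE.format(prompt=prompt, provider=provider)
```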

Prompting with Reasoning Models

Reasoning models differ substantially from standard LLMs, both in their internal workings and in how they should be prompted [00:06:56].

[!WARNING|Caveats with Reasoning Models]

  • Fewer Examples are Better: Research, such as Microsoft’s MedPrompt framework with o1 and findings from DeepSeek with R1, indicates that adding examples can lead to worse performance [00:07:16]. OpenAI also noted that providing excessive context can overcomplicate things and confuse the model [00:07:27].
  • Encourage More Reasoning: The more reasoning a model performs, the better the output tends to be [00:07:40]; extended reasoning has been shown to increase accuracy and performance [00:07:52] [00:08:05].
  • Avoid Instructing Reasoning: For reasoning models, there’s no need to explicitly instruct them on how to reason, as it’s often built-in [00:08:34]. Doing so can actually hurt performance [00:08:39].

[!NOTE|General Recommendations for Reasoning Models]

  • Use minimal prompting [00:08:12].
  • Provide a clear task description [00:08:14].
  • Encourage more reasoning if facing performance issues [00:08:22].
  • Avoid few-shot prompting, or start with only one or two examples if necessary [00:08:30].
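These recommendations can be summarized in a small helper that builds a minimal prompt for a reasoning model. This is a sketch of the talk's guidance, not an official API pattern; the function name and wording are assumptions:

```python
def reasoning_model_prompt(task: str, encourage_reasoning: bool = False) -> str:
    """Minimal prompt for a reasoning model: a clear task description,
    no few-shot examples, and no step-by-step instructions, since
    reasoning is built into the model."""
    prompt = task.strip()
    if encourage_reasoning:
        # Only if output quality is lacking: ask for more deliberation,
        # without prescribing *how* to reason.
        prompt += "\n\nTake your time and consider the problem thoroughly."
    return prompt
```

Note the contrast with the chain-of-thought cues used for standard models: here the prompt deliberately omits "think step by step" style instructions, which can hurt reasoning-model performance.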