Tips for optimizing your auto label prompts
There are a number of techniques you can use to improve the accuracy of your auto labels. Here are a few things to try:
- Break down more complex questions into parts. For example, if you want to know whether an article meets several inclusion criteria and you want the auto labeller to provide a Yes or No answer, ask it to work through a set of questions, one per criterion. In the following example, we want to include articles that are systematic reviews or meta-analyses about the impacts of wildfires on health, the environment or economic factors. We can break this down and give the auto labeller the following prompt in the label's Question section (a scripted sketch of the same technique appears after the prompt):
    Answer true or false for the following questions about this article.
    1. Is this a systematic review or meta-analysis?
    2. Does this focus on the impacts of wildfires?
    3. Does this include at least one health, environmental or economic impact of wildfires?
    If all of the answers are true, include this article. If any of the answers are false, exclude the article.
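The same decomposition can be scripted outside the tool if you want to test a prompt against a handful of known articles. The sketch below is a minimal illustration of the technique, not the auto labeller's actual implementation; `ask_model` is a hypothetical placeholder for whichever LLM client you use.

```python
# Minimal sketch of the decomposition technique. ask_model is a hypothetical
# placeholder for a real LLM client; this is not the auto labeller's internals.

QUESTIONS = [
    "Is this a systematic review or meta-analysis?",
    "Does this focus on the impacts of wildfires?",
    "Does this include at least one health, environmental or economic impact of wildfires?",
]

def ask_model(prompt: str) -> str:
    """Placeholder: swap in a real call to your LLM provider's SDK."""
    raise NotImplementedError

def should_include(title: str, abstract: str) -> bool:
    """Include the article only if the model answers 'true' to every sub-question."""
    article = f"Title: {title}\nAbstract: {abstract}"
    for question in QUESTIONS:
        answer = ask_model(
            f"{article}\n\nAnswer true or false: {question}\n"
            "Reply with exactly one word: true or false."
        )
        if answer.strip().lower() != "true":
            return False  # any false answer excludes the article
    return True
```

Requiring every sub-question to come back true mirrors the "if any of the answers are false, exclude the article" instruction in the prompt above.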
- Use another generative AI tool such as ChatGPT, Microsoft Copilot or Google Gemini to improve your prompt. Provide your prompt along with some examples of correct and incorrect answers, and ask it to help you revise the prompt for better results. One hypothetical way to assemble such a request is sketched below.
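The helper below is illustrative only; the `build_revision_request` name and the wording of the request are assumptions, and you would paste the resulting text into whichever chat tool you prefer.

```python
# Hypothetical helper for drafting a prompt-revision request to a chat model.
# The structure and wording are illustrative only.
def build_revision_request(prompt: str, correct: list[str], incorrect: list[str]) -> str:
    return (
        "I use the following prompt to screen articles:\n\n"
        f"{prompt}\n\n"
        "It answered these examples correctly:\n"
        + "\n".join(f"- {ex}" for ex in correct)
        + "\n\nIt answered these examples incorrectly:\n"
        + "\n".join(f"- {ex}" for ex in incorrect)
        + "\n\nPlease revise the prompt so it handles the incorrect examples "
        "better without breaking the correct ones."
    )
```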
- Filter out records without abstracts. When the auto labeller runs on citation data alone (not PDFs), it is more accurate when it has an abstract to work from rather than just a title. We therefore recommend skipping records without abstracts during the initial human screening; that way, when you set your article filters for the auto labeller, it will only include articles with abstracts.
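If you prepare your citation file outside the tool, a filter like the one below can drop abstract-less records up front. This is a sketch that assumes a CSV export with an `abstract` column; adjust the file and column names to match your data.

```python
import pandas as pd

# Sketch: drop citation records with no abstract before screening.
# Assumes a CSV export with an 'abstract' column; adjust names to your data.
records = pd.read_csv("citations.csv")
has_abstract = records["abstract"].notna() & records["abstract"].str.strip().ne("")
records[has_abstract].to_csv("citations_with_abstracts.csv", index=False)
```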
- Review the Probability and Reasoning. When this option is enabled for a label, you can review the auto labeller's reasoning and its estimate of how likely its answer is to be correct (i.e. its confidence). To find this information, click on the individual article, scroll down to Auto Labels, and click the dropdown arrow to open the reasoning text for that label.
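If you export your auto label results, the probability can also help you decide which answers to check first. The sketch below assumes a hypothetical export with `title`, `answer` and `probability` columns; the file name, column names and the cut-off of 20 records are illustrative only.

```python
import pandas as pd

# Sketch: surface the least-confident auto labels for human review first.
# 'auto_labels.csv' and its columns are a hypothetical export format.
labels = pd.read_csv("auto_labels.csv")
least_confident = labels.sort_values("probability").head(20)
print(least_confident[["title", "answer", "probability"]])
```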
- Consider a scale of relevance to improve sensitivity. Rather than asking for a yes/no answer on inclusion, you can use a categorical label that asks the auto labeller to rate each article on a 5-point relevance scale. You can then filter results from highly likely to be relevant (1) to unlikely to be relevant (5), with an additional 'undecidable' category. For more information about this approach, see Sandnor et al. (2024).
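If you export the scale answers, mapping them back to screening decisions is a small step. The sketch below assumes ratings are stored as the strings '1' to '5' plus 'undecidable'; the cut-off of 2 is an illustrative choice, not a recommendation from the source.

```python
# Sketch: map a 5-point relevance rating (1 = highly likely to be relevant,
# 5 = unlikely to be relevant, plus 'undecidable') to a screening decision.
# The cut-off of 2 is an illustrative choice, not a recommendation.
def triage(rating: str) -> str:
    if rating == "undecidable":
        return "flag for human review"
    return "include" if int(rating) <= 2 else "exclude"

for rating in ["1", "3", "5", "undecidable"]:
    print(rating, "->", triage(rating))
```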