Overview of Sysrev's Auto-labeler

Sysrev has a built-in generative AI Auto-label feature that can be used to automate the labeling process. When running the Auto-labeler, users can choose between OpenAI's GPT-4o model and Google's Gemini 2.5 Flash model. Sysrev generates an Auto-label report that lets users compare auto-labeling results with human labeling, so that labels can be assessed and improved to maximize accuracy and keep the assessment process transparent.

There are a few important things to know about the Auto-label feature:

Restricted to Premium and Enterprise Accounts: The Auto-labeler is only available to users with verified payment methods.
It costs money to run the Auto-labeler: Running the Auto-labeler requires funds in your Sysrev project. More about payments and pricing can be found on the Auto-labeler Pricing and Payments help page.
The cost of running the Auto-labeler varies based on how many records you are labeling, the number of labels that are activated for auto-labeling and the amount of content that is being 'read' by the Auto-labeler (e.g., citation only vs full PDF), as well as the choice of model.
Only owners can run the Auto-labeler for Premium accounts: If your user status for a project is set to 'member', you will not be able to use the Auto-labeler. Admins can run the Auto-labeler in Enterprise accounts, but not in Premium accounts.
The Auto-labeler does not operate across labels (with the exception of child labels within group labels): In other words, each label is independent from other labels. The Auto-labeler will not be able to 'read' answers from other labels. However, for group labels, the Auto-labeler considers all child labels (i.e., column variables) within the group label at the same time for a given row.
Be careful when editing your prompts if they've already run on large numbers of records: If you change an Auto-label (for example, by fixing a typo in the prompt, adding a category, or adjusting a setting such as 'Require answer'), it will rerun on all previously labeled articles the next time you run it on the full (unfiltered) project. If you add new articles and do not change a previously run Auto-label, it will run only on the new articles to avoid unnecessary costs. To prevent reprocessing all records, run the Auto-labeler on a filtered set or avoid changing an Auto-label once it has been run.
Prompts (i.e., label descriptions) are limited to 4000 characters in length.
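
The rerun behavior and the prompt length limit described above can be pictured with a small Python sketch. This is only a mental model, not Sysrev's implementation: the AutoLabel class, the history mapping, and both function names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass

MAX_PROMPT_CHARS = 4000  # prompt length limit stated above


@dataclass(frozen=True)
class AutoLabel:
    """Hypothetical stand-in for an Auto-label definition (not a Sysrev type)."""
    prompt: str
    require_answer: bool = True
    categories: tuple = ()


def prompt_fits(label):
    """Return True if the prompt is within the 4000-character limit."""
    return len(label.prompt) <= MAX_PROMPT_CHARS


def articles_to_process(all_articles, history, current_label):
    """Return the articles an unfiltered run would process.

    `history` maps an article id to the label definition that was used
    the last time that article was auto-labeled (an assumed bookkeeping
    structure). An article is processed if it is new or if the label
    definition has changed in any way since it was labeled.
    """
    return [a for a in all_articles if history.get(a) != current_label]


original = AutoLabel(prompt="Is this a randomized controlled trial?")
history = {1: original, 2: original}  # articles 1 and 2 already labeled

# Article 3 is new and the label is unchanged: only article 3 is processed.
unchanged_run = articles_to_process([1, 2, 3], history, original)

# Any edit to the label, even a reworded prompt, triggers a full rerun.
edited = AutoLabel(prompt="Is this a randomised controlled trial?")
edited_run = articles_to_process([1, 2, 3], history, edited)
```

In this sketch a one-word change to the prompt makes every previously labeled article eligible again, which is why edits late in a large project can be costly.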

There are some useful settings that you can apply to control how the Auto-labeler runs, including:

You can choose which Auto-labels to activate: Each label you create has a setting that indicates whether the label should be included in or ignored by the auto-labeling process. Note that the default is to include the label in the auto-labeling process.
You can choose what information the Auto-label should 'read': You can choose, at the label level, whether the Auto-labeler should consider citation information (e.g., title and abstract) only or should also 'read' attached content like PDFs.
You can have the Auto-label pre-populate label answers or not: You can set whether or not the Auto-label prefills label answers. In this way, you can use the Auto-labeler to speed up human screening (rather than replace it), where human screeners verify or correct auto-label answers rather than starting from scratch.
You can choose to Auto-label only a subset of records: You can set an article filter and/or limit the Auto-labeler to a certain number of records to control how many records are labeled in a given run of the Auto-labeler.
You can choose which large language model to use for each label: Choose between Gemini 2.5 Flash and OpenAI's GPT-4o. Costs vary by model.
You can allow the Auto-label to skip articles when no category options are appropriate or it can't locate an answer: By turning off 'Require answer' for a categorical or string Auto-label, you allow the Auto-label to skip articles if no category options are a good match or it can't find an answer to a string label question. If 'Require answer' is on, it will be forced to choose an answer from the available options you give it. Consider providing a 'null' or 'not applicable' option in your categories if you want to require answers to a label.
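
As a rough way to think about how these settings combine, the Python sketch below models the per-label options and the record subset for a single run. These settings are configured in the Sysrev interface; the LabelSettings class, its field names, the model identifier strings, and the helper functions are all assumptions made for illustration. Only the default for `active` comes from the text above; the other defaults are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class LabelSettings:
    """Hypothetical mirror of the per-label options described above."""
    name: str
    active: bool = True              # labels are included by default (per the text)
    read_attachments: bool = False   # False: citation only; True: also PDFs (arbitrary default)
    prefill_answers: bool = False    # arbitrary default
    model: str = "gemini-2.5-flash"  # illustrative identifier; the alternative would be "gpt-4o"
    require_answer: bool = True      # arbitrary default


def active_labels(labels):
    """Names of the labels that would take part in an Auto-label run."""
    return [label.name for label in labels if label.active]


def select_records(records, article_filter=None, limit=None):
    """Apply an optional article filter, then an optional record limit."""
    selected = [r for r in records if article_filter is None or article_filter(r)]
    return selected if limit is None else selected[:limit]


labels = [
    LabelSettings(name="Include"),
    LabelSettings(name="Study design", read_attachments=True, require_answer=False),
    LabelSettings(name="Notes", active=False),  # ignored by the Auto-labeler
]

records = [
    {"id": 1, "year": 2019},
    {"id": 2, "year": 2021},
    {"id": 3, "year": 2022},
]

# Label only records from 2021 onward, and at most one of them in this run.
subset = select_records(records, lambda r: r["year"] >= 2021, limit=1)
```

Filtering first and limiting second keeps a trial run small, which is a cheap way to check a prompt before labeling the full project.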