Developing a comprehensive search query

When working on a scoping or systematic review, authors need to develop and document at least one search query that is comprehensive and reproducible. Comprehensiveness is defined as the sensitivity of the search query in retrieving a large enough number of results to ensure that as many of the relevant studies are included in the review as possible. ("Sensitivity" in this context is defined as the proportion of relevant studies retrieved, in contrast with "precision," which is the proportion of retrieved studies that are relevant.) According to the Cochrane Handbook for Systematic Reviews of Interventions, searches for reviews “should seek to maximize sensitivity whilst striving for reasonable precision” (see MECIR Box 4.4.b). 
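The two definitions above can be made concrete with a small calculation. This is only a minimal sketch: the record IDs are invented for illustration, and in a real review the full set of relevant studies is not known in advance, so sensitivity can only be estimated.

```python
def sensitivity(retrieved: set[str], relevant: set[str]) -> float:
    """Proportion of relevant studies that the search retrieved."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved: set[str], relevant: set[str]) -> float:
    """Proportion of retrieved studies that are relevant."""
    if not retrieved:
        raise ValueError("retrieved set must not be empty")
    return len(retrieved & relevant) / len(retrieved)


# Hypothetical record IDs, invented purely for illustration.
retrieved = {"a", "b", "c", "d", "e", "f", "g", "h"}
relevant = {"a", "b", "c", "x"}

print(sensitivity(retrieved, relevant))  # 3 of the 4 relevant records were found: 0.75
print(precision(retrieved, relevant))    # 3 of the 8 retrieved records are relevant: 0.375
```

Broadening the search would tend to raise the first number and lower the second, which is the trade-off the Cochrane guidance describes.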

What follows are five steps for developing a comprehensive search query for evidence synthesis projects such as systematic or scoping reviews.

Step 1: Find an appropriate research question framework.

A helpful way to start developing a comprehensive search query is to use a research question framework. Having a well-developed research question helps identify the major concepts to include in the search query. The most widely used framework in health research is the PICO framework: Population/Patient, Intervention, Comparison, and Outcome. The article "Craft better research questions by using question development frameworks" highlights other frameworks in addition to PICO.

Step 2: Harvest terms.

After selecting a suitable framework, the next step is to define each concept or component of the chosen framework and start assigning keywords and controlled vocabulary. This step is known as term harvesting. One of the best ways to inform the term-harvesting process is to conduct a preliminary topic investigation or a scoping search of the topic. These searches can be composed of just a few keywords to see what the initial search results yield.

An optimal search strategy includes both the controlled vocabulary of the various literature databases (e.g., PubMed Medical Subject Heading [MeSH] terms) and your own keywords. The best way to identify synonyms that might yield the most relevant results is to read a few articles from your preliminary search, skim the abstracts in your results, and look for author-provided keywords or check which controlled vocabulary terms have been assigned to the articles.

For example, the article "Racial disparities in adverse pregnancy outcomes and psychosocial stress" provides MeSH terms at the bottom of the PubMed record. One of those terms for this particular article is "hypertension." To optimize your search on a similar topic, think about all the synonyms and phrases related to this term, such as "hypertensive disorders" or "high blood pressure."  It’s also a good idea to look at the entry terms under each MeSH term definition.

Here are some helpful tips for term harvesting while you are conducting your preliminary search:

Step 3: Create search segments for each main concept or component of a framework using PubMed.

Before creating your search segments, it is highly recommended that you log in to your PubMed account. If you don't already have one, a personalized account can be created by signing in with any of the third-party apps (e.g., Google or ORCID) listed on the NCBI account page.

To create a search segment for each concept, combine the identified related terms (controlled vocabulary and keywords) and string them together by using the Boolean operator OR between each term. By using the OR operator, you are broadening the search by including search terms that will best define or characterize a specific concept. Put the words strung together with OR in parentheses. Once you've created the search strings, run each search segment separately in PubMed.

After all search segments have been run, use PubMed’s search history, which can be found on the Advanced Search page, to combine the search segments with the Boolean operator AND. By using the AND operator, you are combining the search segments for each concept and thereby narrowing the search.

Example: (yellow OR red OR green) AND (dog OR cat OR bird)
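The same OR-then-AND structure can be assembled programmatically, which may help when the term lists grow long. The sketch below is illustrative only: the function names `build_segment` and `build_query` are hypothetical, and wrapping multi-word terms in double quotes is an assumption about how you want phrases handled.

```python
def build_segment(terms: list[str]) -> str:
    """Join one concept's terms with OR and wrap the result in parentheses."""
    if not terms:
        raise ValueError("a segment needs at least one term")
    # Assumption: multi-word terms are quoted so they are searched as phrases.
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(segments: list[list[str]]) -> str:
    """Combine the per-concept segments with AND."""
    return " AND ".join(build_segment(s) for s in segments)


colors = ["yellow", "red", "green"]
animals = ["dog", "cat", "bird"]
print(build_query([colors, animals]))
# (yellow OR red OR green) AND (dog OR cat OR bird)
```

Keeping each concept as its own list mirrors the advice to run each segment separately before combining them.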

Want to learn more? Take a look at this great tutorial on developing a comprehensive search by the Welch Medical Library of Johns Hopkins University.

Additional Resources

A search query is a combination of search segments. Once you have developed a search query for one database, you will probably need to "translate" the search query so that it will work with other databases and their particular syntax. In PubMed, for example, you can put [tiab] after a search term to limit the search to the title and abstract, but this syntax will not work at all in Web of Science. For more about translating your search queries, there is this helpful guide from the Cornell University Libraries: A Guide to Evidence Synthesis: 6. Translate Search Strategies.
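As a rough starting point for translation, one option is to strip the PubMed-specific bracketed field tags and then add the target database's own syntax by hand. The sketch below does only the stripping step. The sample query and the helper name `strip_field_tags` are hypothetical, and the output is a tag-free draft, not a finished query for any particular database.

```python
import re

# Matches a bracketed field tag directly after a term, e.g. [tiab] or [MeSH Terms].
FIELD_TAG = re.compile(r"\s*\[[^\[\]]+\]")


def strip_field_tags(query: str) -> str:
    """Remove PubMed-style bracketed field tags, leaving terms and operators."""
    return FIELD_TAG.sub("", query)


# Hypothetical PubMed query used only to show the transformation.
pubmed_query = '(hypertension[tiab] OR "high blood pressure"[tiab]) AND pregnancy[tiab]'
print(strip_field_tags(pubmed_query))
# (hypertension OR "high blood pressure") AND pregnancy
```

Controlled vocabulary terms generally need separate handling, since a term tagged as a MeSH heading in PubMed may have no direct equivalent elsewhere.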

Step 4: Save your search query.

After combining all the search segments with the Boolean operator AND, make sure to document this first iteration of your comprehensive search:

Because both your search processes and queries need to be reported in the methods section of your paper, documentation and internal record keeping are critical steps as you do your research. Your systematic search query or queries produce your dataset, the set of studies that will be screened and that referees will review when you decide to publish your paper. As Cochrane advises, “Review authors should document the search process in enough detail to ensure that it can be reported correctly in the review. The searches of all the databases should be reproducible to the extent that this is possible” (see MECIR Box 4.5a). A clear and detailed methods section ensures the transparency and replicability of the review.
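One lightweight way to keep such records is a structured search log. The following is a minimal sketch, not a reporting standard: the `SearchRecord` class and its field names are assumptions chosen for illustration, and the date and result count are placeholders.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class SearchRecord:
    """One documented search run; field names are illustrative assumptions."""
    database: str
    date_run: str
    query: str
    result_count: int
    notes: str = ""


def to_json(records: list[SearchRecord]) -> str:
    """Serialize the search log so it can be saved alongside the project files."""
    return json.dumps([asdict(r) for r in records], indent=2)


log = [
    SearchRecord(
        database="PubMed",
        date_run=date(2024, 1, 15).isoformat(),  # hypothetical date
        query="(yellow OR red OR green) AND (dog OR cat OR bird)",
        result_count=0,  # placeholder; record the count the database actually reports
        notes="First iteration",
    )
]
print(to_json(log))
```

Adding a new record each time the query changes leaves a trail of iterations that can later be summarized in the methods section.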

Step 5: Cross-check and assess the quality of the search query.

After saving the search query, take a glance at the results.

  • Did the search query retrieve too many results or not enough?
  • Are the results relevant?

Make the appropriate changes by editing the search segments. You may, for example, want to include more terms within each segment to expand the search, delete extraneous terms, or employ some strategies to filter the results (e.g., searching only in the title/abstract field or applying filters such as document or study type, age groups, language, or publication dates). Determining how much literature to retrieve will depend on the project timeline and how much time will be needed to screen the results. Cochrane provides useful hints on when to stop searching.

One way to assess the quality of the search query is to use the Peer Review of Electronic Search Strategies (PRESS) Guideline Checklist. This checklist can be used to self-assess a comprehensive search query or it can be used as a formal assessment tool in which a peer can review the search query objectively by way of the PRESS Guideline assessment form. This form can be included as part of the systematic review manuscript when it is submitted to a journal.

How to Cross-Check Your Results with a List of Key Articles

Another method is to cross-check your results with the list of key articles you identified during the preliminary topic investigation in Step 2. Does the search query you developed pick up most of the key articles? If your search query does not find these articles, then this would be a good time to peruse the missed articles to see whether any phrases or keywords are missing from your query. Sometimes even variant spellings can exclude an article from the search results (for example, gynaecology vs. gynecology).

To perform this cross-check in PubMed, you need to develop a search segment using only the key articles’ PubMed Identifiers (PMIDs). Each PubMed article is assigned a PMID number, which can be found just below the citation information. PMIDs do not change over time and are never reused. If for some reason you decide not to use PMIDs, you can also use partial citation information, such as article title, author(s), and/or publication year for each key article.

Assuming you are using the PMID method for this cross-checking step, collect all the PMIDs of the key articles and search for several PMIDs at once by entering each number in the search box separated by a space (e.g., 17170002 16381840). PubMed will automatically apply the OR operator to these PMIDs. Run this search segment.
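If the key articles are kept in a list or spreadsheet, the space-separated string for the search box can be generated rather than typed. This is a minimal sketch: the helper name `pmid_segment` is hypothetical, and the digits-only check is a simple sanity test rather than a full validation of PMIDs.

```python
def pmid_segment(pmids: list[str]) -> str:
    """Return a space-separated PMID string for the PubMed search box."""
    cleaned = []
    for pmid in pmids:
        pmid = pmid.strip()
        if not pmid.isdigit():
            raise ValueError(f"not a valid PMID: {pmid!r}")
        if pmid not in cleaned:  # drop duplicates, keep original order
            cleaned.append(pmid)
    return " ".join(cleaned)


# The two PMIDs are the ones used in the example above.
print(pmid_segment(["17170002", " 16381840", "17170002"]))
# 17170002 16381840
```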

Figure 1: Using PubMed’s Search History query box, run a search of the key articles’ PMID numbers. Then clear the query box and run your comprehensive search query separately.  

Using PubMed’s search history, which can be found on the Advanced Search page, combine the comprehensive search query you developed with the search segment containing all the PMID numbers, this time using the Boolean operator AND.

Figure 2: Both searches will now appear in the search history.  Click on the ellipsis under "Actions" for Search #3 and select "Add query." 
Figure 3: After adding Search #3 to the Query box, go to the ellipsis for Search #1, select "Add with AND," and then click on Search (the blue button).

The number of results should either equal the total number of key articles or be smaller. If the search query picked up most of the key articles, then you can be confident that the search has good sensitivity in retrieving the relevant literature.

Figure 4: In this example, Search #4 is the result of combining searches #1 and #3.  Search #4 results (12) indicate that the comprehensive search query is effective in retrieving the 12 key articles from Search #1.  
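The same cross-check can be done outside PubMed once you have both PMID lists, for instance after exporting the PMIDs of your search results. The sketch below assumes the two lists are already available as sets. The function name `cross_check` is hypothetical, and apart from the two PMIDs used in the earlier example, the identifiers are made up.

```python
def cross_check(key_pmids: set[str], query_pmids: set[str]) -> dict:
    """Compare key articles against the PMIDs returned by the search query."""
    found = key_pmids & query_pmids      # equivalent to combining with AND
    missed = key_pmids - query_pmids     # key articles the query did not retrieve
    return {
        "found": sorted(found),
        "missed": sorted(missed),
        "share_found": len(found) / len(key_pmids) if key_pmids else 0.0,
    }


# The first two PMIDs come from the earlier example; the rest are made up.
key_articles = {"17170002", "16381840", "11111111", "22222222"}
query_results = {"17170002", "16381840", "11111111", "33333333"}

report = cross_check(key_articles, query_results)
print(report["missed"])        # ['22222222']
print(report["share_found"])   # 0.75
```

The "missed" list tells you which key articles to inspect for phrases, keywords, or variant spellings that the query lacks.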

Further Resources
