Google Scholar vs. AI Academic Search: Expanding the literature review toolkit

15 Jan 2025

Google Scholar turned 20 recently. It remains the world's most widely used academic search tool, but AI-driven alternatives such as Undermind.ai, SciSpace, Scite.ai and Elicit.com have risen to challenge it.

Given that SMU has subscriptions to the first three AI academic search tools and the fourth, Elicit.com, has a freemium offering, you might be wondering how they match up to Google Scholar.

In my view, no single tool is best for all situations as they have their distinct strengths and weaknesses, and you should consider using a blend of them depending on your use case.

Below are my suggestions on when to use each tool.

Introduction

In this post, I compare Google Scholar with the following academic search tools, most of which are "AI"-based:

Undermind.ai -- agent-based academic search
Elicit.com, SciSpace -- "Semantic search" tools
Lens.org, OpenAlex -- Keyword-based tools, not "AI" but with interesting features

Why these tools? For all their differences, they are all cross-disciplinary, huge "mega" indexes of around 200 million records (based on estimates by SearchSmart.org). (Note: Both Elicit.com and Undermind.ai use the Semantic Scholar corpus.)

These are much larger than typical academic databases. Even large ones such as Scopus or Web of Science (Core Collection) are only in the 80 million range (again as estimated by SearchSmart.org).

Besides better relevancy in results, AI-enhanced capabilities have brought transformative features to academic discovery, which include:

Answer Generation with Citations: Tools like Scite.ai assistant, SciSpace and Undermind.ai use retrieval-augmented generation (RAG) to directly answer complex research questions with cited references, streamlining the process of synthesizing information.
Synthesis Table Creation: AI tools like Elicit and SciSpace can extract and organize data into synthesis matrices, significantly reducing the effort required to compare, analyze, and synthesize findings across multiple studies.

This is a complicated and evolving topic, and I will not try to evaluate these features but focus on just the traditional search results.

Using smaller specialised databases to search

Does this mean you should never use smaller specialised databases? Of course not. There are situations where you may want or need to use a more specialised database. See table below.

In fact, for more comprehensive searching, I typically recommend you combine a large mega-index like Google Scholar with a specialized subject database closest to the field you are interested in.

Searching the databases or academic search engines with the largest pools should in theory cover most of what you want. However, their size and cross-disciplinary nature also mean that relevant items are sometimes hard to surface among many false hits. This is why smaller or subject-specific databases can help ensure things aren't missed, especially since they typically support precise searching with controlled vocabulary.

Google Scholar: The All-Purpose Workhorse for Academic Search

Overview

Google Scholar remains the de facto starting point for most academic researchers. Its sheer scale, ease of use, and extensive cross-disciplinary coverage make it indispensable for general searches, finding seminal works, and exploratory research. However, its keyword-based approach, while robust, lacks the advanced precision features offered by traditional databases and the semantic capabilities offered by newer tools.

Key Features and Strengths

Unmatched Coverage:
1. By far the largest academic database, indexing 300M+ works, including the full text from most major publishers.
2. Cross-disciplinary scope ensures it is suitable for virtually any academic field.
3. Indexing extends beyond metadata (titles and abstracts) to full-text content, which is critical for finding niche mentions or highly specific terms.
Highly Cited Papers and Reviews:
1. Google Scholar's algorithm prioritises title matches and citation counts, ensuring that seminal and widely referenced papers rise to the top.
2. The "review articles" filter simplifies finding comprehensive literature reviews, ideal for starting new projects.
Versatility:
1. Useful features include forward citation searching, search alerts and recommendations.
2. Google Scholar's simplicity and broad applicability make it a go-to tool when researchers are unsure of their exact needs.

Unique Use Cases

Exploring New Topics:
1. Its response time is near instant, making it best for doing multiple searches in quick succession when exploring new areas.
Full-Text Dependent Queries:
1. Its full-text coverage is invaluable for finding niche mentions. For example, it is the only tool that can reliably find mentions of datasets, search tools or other techniques, because it covers the methods section in the full text.
2. Example: Searching for use of jargon, tools or specific methodologies (e.g., "Bayesian latent variable modeling" or "Boardex") benefits from Google Scholar's unmatched text coverage.
Seminal or Foundational Research:
1. Its algorithm naturally brings seminal and review papers to the forefront, making it ideal for identifying key studies or highly referenced works on broad topics. Example: For a topic like "AI in education," Google Scholar will surface foundational, highly cited papers and reviews that provide a solid entry point into the subject.

Weaknesses

Limited Advanced Search Features:
- No support for nested Boolean operators, left or right truncation, or robust field-specific queries (e.g., you cannot filter to matches only in abstracts).
- Search strings capped at 256 characters, limiting complex queries.
- Lacks controlled vocabulary support (e.g., Medical Subject Headings [MeSH]).
Overwhelming Results:
- Full-text matches can lead to excessive, sometimes irrelevant, results.
- Example: Searching for "Elicit" the search tool yields irrelevant matches unless refined with domain qualifiers like "Elicit.com."
Systematic Review Limitations:
- Does not support bulk export of records. Only the first 1,000 results can be viewed, even when browsing manually. It also lacks the advanced search features needed for precise searches (see above).
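
One practical consequence of the 256-character cap is that a long list of synonyms has to be split across several searches. The following is a minimal sketch, assuming the cap applies to the raw query string and that a flat (non-nested) OR between quoted phrases is accepted. The function name `pack_or_queries` and the constant are hypothetical and not part of any Google Scholar interface.

```python
MAX_QUERY_CHARS = 256  # cap stated above; assumed here to apply to the raw query string


def pack_or_queries(terms, limit=MAX_QUERY_CHARS):
    """Greedily pack quoted terms into flat OR-queries that each fit within `limit`."""
    queries = []
    current = ""
    for term in terms:
        quoted = f'"{term}"'
        if len(quoted) > limit:
            raise ValueError(f"term too long for a single query: {term!r}")
        # Try to append the term to the query being built.
        candidate = quoted if not current else f"{current} OR {quoted}"
        if len(candidate) <= limit:
            current = candidate
        else:
            # The current query is full: store it and start a new one.
            queries.append(current)
            current = quoted
    if current:
        queries.append(current)
    return queries


if __name__ == "__main__":
    synonyms = ["systematic review", "scoping review", "literature review"]
    for query in pack_or_queries(synonyms):
        print(len(query), query)
```

Each resulting query is run separately and the result lists are merged by hand, which is one reason databases with longer query limits are usually preferred for systematic searching.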

Alternatives for Power Users

Lens.org or OpenAlex: These are among the closest in size to Google Scholar, but they also offer advanced Boolean logic, truncation, and field-specific queries for precision searching. However, their limited full-text indexing makes them less versatile for specific text-based queries.

Undermind.ai: The Precision-Focused, Iterative AI Search Engine

Overview

Undermind.ai (institutional subscription by SMU) stands out for its agent-based iterative search and its use of GPT-4 as a relevancy evaluator, making it highly effective for detailed, specific queries. By combining keyword and semantic methods with human-like iterative refinement, it mimics the strategies researchers use when trying to explore a topic exhaustively.

Key Features

GPT-4 Relevance Assessment:
1. Undermind uses GPT-4 to evaluate the relevance of results based on title, abstract, and metadata, producing results closer to human judgment.
2. Example: Queries like "How effective are LLMs in title-abstract screening for systematic reviews?" are parsed semantically by GPT-4 for precise relevance.
Iterative Search Process:
1. Adapts and refines searches across multiple rounds, incorporating citation chasing and dynamic adjustments.
2. Example: A search for "Bayesian approaches in dynamic modeling" is refined toward relevant subfields over successive iterations.
Relatively High Recall:
1. Achieves recall rates that typically surpass those of most traditional tools, particularly when the query is specific. Example: Using a systematic review as a gold standard, Undermind had a recall@10 rate of 72.7%, indicating its capability to surface relevant papers early.
AI-Generated Answers:
1. Undermind provides an LLM-generated report, including a summary and the main categories of papers.
2. The "Discuss with expert" function allows you to ask specific questions about the papers found.
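
For readers unfamiliar with the recall@10 figure quoted above, here is a minimal sketch of how such a metric is commonly computed. It assumes the usual definition (the share of gold-standard papers that appear in the top k results). The paper IDs are made up for illustration and are not the data behind the 72.7% figure.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant (gold-standard) set found in the top-k ranked results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    hits = relevant.intersection(ranked_ids[:k])
    return len(hits) / len(relevant)


# Hypothetical example: six ranked results, four papers in the gold standard.
ranked = ["p7", "p2", "p9", "p4", "p1", "p3"]
gold = ["p2", "p4", "p3", "p8"]

if __name__ == "__main__":
    print(recall_at_k(ranked, gold, k=5))
    print(recall_at_k(ranked, gold, k=6))
```

Some evaluations divide by min(k, number of relevant papers) instead, so it is worth checking which definition a reported figure uses.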

Strengths

Precision for Specific Queries:
- Best suited for highly detailed or narrow searches where specificity is critical.
- Example: "Evaluate ChatGPT's performance as a screener in systematic reviews using metrics like precision and recall."
Comprehensive Results:
- Search also leverages citation searching and not just keyword matching, ensuring seminal and related papers are included.
Agent-Based Search:
- Operates like a digital research assistant, prompting users to refine queries and adapt strategies.
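
The general idea of iterative search with citation chasing can be sketched as a loop that expands outward from relevant papers only. This is an illustration of the technique, not Undermind's actual pipeline. The citation graph is a toy dictionary, and `is_relevant` is a hypothetical stand-in for the LLM relevance judgment.

```python
def iterative_search(seeds, citations, is_relevant, max_rounds=3):
    """Expand from seed papers along citation links, following only relevant papers."""
    found = [paper for paper in seeds if is_relevant(paper)]
    seen = set(seeds)
    frontier = list(found)
    for _ in range(max_rounds):
        next_frontier = []
        for paper in frontier:
            for neighbour in citations.get(paper, []):
                if neighbour in seen:
                    continue
                seen.add(neighbour)
                # Only relevant papers are kept and expanded in the next round.
                if is_relevant(neighbour):
                    found.append(neighbour)
                    next_frontier.append(neighbour)
        if not next_frontier:
            break
        frontier = next_frontier
    return found


# Toy citation graph (paper -> papers linked to it by citation) and relevance labels.
toy_citations = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": []}
toy_relevant = {"A", "B", "D"}

if __name__ == "__main__":
    print(iterative_search(["A"], toy_citations, lambda p: p in toy_relevant))
```

In this toy run, paper E is never examined because it is only reachable through the irrelevant paper C, which mirrors the conservative behaviour noted under Weaknesses.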

Weaknesses

Time-Consuming:
- Queries take 3+ minutes, making it unsuitable for quick searches or for broad exploration where you want to try multiple queries in succession.
Metadata-Dependent:
- Limited full-text access can hinder performance for deeply text-dependent queries.
Narrow Scope for Broad Queries:
- Less effective for broad topics like "AI in education", where you are better off using Google Scholar.
- It tends to be quite conservative in "making connections" and errs on the side of excluding papers that are possibly relevant or arguably useful but do not fit the query exactly.
Non-Deterministic Results:
- Results can vary between repeated searches due to the nature of the algorithms and/or pipeline.

Best Use Cases

Scoping of Systematic Reviews: Helps with scoping by checking whether there is a reasonable number of papers on a small, detailed topic.
Specific searching after exploring an area using Google Scholar: Good for focusing on specific areas after doing a broad search.
Looking for negative cases: Have a hunch, or wondering whether a specific type of study exists or has been done, but can't find it? Use Undermind to check. Of course, if it fails to find the paper, that doesn't mean the paper doesn't exist. You still have to try other methods (e.g. citation searching, keyword searching) or ask a librarian to help.

For more details refer to

Elicit.com and SciSpace: Beyond lexical search -- Semantic Search

Overview

These tools leverage semantic search techniques (e.g., matching of dense embeddings, learned sparse embeddings like SPLADE, or other hybrid methods) to match documents by meaning rather than by exact keyword matches alone. This allows for natural language queries and broader exploratory capabilities. This class of tools is useful at the start of your research, when you are unsure whether you have the right keywords and want to run frequent successive searches to explore.
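
To make the difference between lexical and dense-embedding matching concrete, here is a minimal sketch. The three-dimensional vectors are invented by hand for illustration. Real tools use embeddings with hundreds of dimensions produced by a trained model, and nothing here reflects how Elicit or SciSpace are actually implemented.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def keyword_overlap(query, text):
    """Number of distinct query words that also appear in the text (lexical match)."""
    return len(set(query.lower().split()) & set(text.lower().split()))


# Hypothetical toy embeddings: title -> made-up vector.
docs = {
    "Detecting survey articles with text classifiers": [0.9, 0.1, 0.2],
    "Review of hotel booking platforms": [0.1, 0.9, 0.1],
}
query = "identify review papers"
query_vec = [0.8, 0.2, 0.3]  # made-up embedding of the query

best_lexical = max(docs, key=lambda title: keyword_overlap(query, title))
best_semantic = max(docs, key=lambda title: cosine(query_vec, docs[title]))

if __name__ == "__main__":
    print("keyword match picks: ", best_lexical)
    print("embedding match picks:", best_semantic)
```

The keyword match favours the title that happens to contain the word "review", while the embedding match favours the title with no shared words but a closer (toy) meaning. This is the behaviour that lets semantic tools work without exact keyword phrasing.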

Key Features

Semantic Search:
1. Matches based on conceptual similarity, enabling discovery even without exact keyword phrasing.
2. Example: Inputting text like “What are some techniques to identify review papers in AI” surfaces relevant methods-based papers.
Natural Language Queries:
1. Optimized for natural language inputs, making them more intuitive for users unfamiliar with Boolean logic.
2. Example: “How can I find recent studies on neural networks for image recognition?” rather than “Neural networks image recognition”.
Post-Search Filtering:
1. SciSpace and Elicit.com allow filtering by journal quality (e.g., Q1/Q2 in the Scimago rankings).
AI-Generated Answers:
1. Both Elicit and SciSpace generate an answer to the query from the top few relevant papers.
2. Both Elicit and SciSpace can generate synthesis matrix tables that let you compare papers across customizable dimensions.

Strengths

Exploratory Searches: Slower than Google Scholar but still relatively fast, making them ideal for general topics or when the right keywords are unclear.
Finding Hidden Gems: Semantic methods can surface relevant but unconventional papers, in the sense that they lack the keywords used in the query.

Weaknesses

Bias Toward Newer and Lesser-Known Papers: Many (but not all) of these tools, such as Elicit and SciSpace (but not Scite.ai assistant), lack citation weighting in their scoring, so they often favor recent but less-cited and lesser-known works.
Non-Deterministic Results: Results can vary between repeated searches due to the nature of the algorithms and/or pipeline.
Quality Control Issues: These tools use sources like Semantic Scholar that have very inclusive indexing, so they can surface low-quality papers. This is made worse by the fact that they tend to score relevancy on similarity to the query without weighting citations.
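
The effect of citation weighting on ranking can be illustrated with a small sketch. The scoring formula, the weight of 0.1, and the two papers are all assumptions made up for illustration. This is not the scoring used by Google Scholar, Scite.ai, or any other tool mentioned here.

```python
import math


def blended_score(similarity, citations, citation_weight=0.1):
    """Query similarity plus a damped citation bonus.

    log1p keeps very large citation counts from completely dominating the score.
    """
    return similarity + citation_weight * math.log1p(citations)


# Hypothetical papers: (title, similarity to query, citation count).
papers = [
    ("Recent preprint", 0.82, 2),
    ("Widely cited review", 0.78, 1500),
]

by_similarity = sorted(papers, key=lambda p: p[1], reverse=True)
by_blend = sorted(papers, key=lambda p: blended_score(p[1], p[2]), reverse=True)

if __name__ == "__main__":
    print("similarity only: ", [title for title, _, _ in by_similarity])
    print("citation-weighted:", [title for title, _, _ in by_blend])
```

With similarity alone, the slightly closer but barely cited preprint ranks first. Once citations contribute to the score, the widely cited review moves to the top, which is the pattern the weaknesses above describe.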

Best Use Cases

Broad Topic Exploration: Use when entering a new field and the exact terms are unknown.
Natural Language Queries: For researchers less familiar with precise keyword phrasing.

Conclusion: Expanding Academic Discovery with Keyword, Semantic, and Citation-Based Tools

The academic search landscape is evolving, offering researchers a suite of powerful tools that go beyond traditional keyword-based methods. While tools like Google Scholar remain indispensable for their extensive full-text coverage and citation weighting, new approaches such as semantic search and advanced citation-based discovery are reshaping how we find and connect with academic knowledge.

Semantic search tools like Elicit.com, SciSpace, and Undermind.ai excel at interpreting queries through meaning and context, uncovering relevant results even when exact keywords are unknown or when natural language queries are used. These tools are especially valuable for exploratory searches or when delving into newer fields with less standardized terminology.

In parallel, a growing class of citation-based literature mapping tools, including ResearchRabbit (see our review), Connected Papers, LitMaps, and Inciteful (see our 2021 review of these 3 tools), provides a dynamic way to visualize and navigate citation networks. These tools enable researchers to explore the relationships between papers, follow intellectual threads across disciplines, and discover both foundational works and emerging research. Such tools complement traditional search methods by offering visual, interactive approaches to citation chasing, helping researchers identify clusters of related studies and uncover overlooked connections.

Besides better relevancy in results, AI-enhanced capabilities have brought transformative features to academic discovery:

Answer Generation with Citations: Tools like Scite.ai assistant, SciSpace and Undermind.ai use retrieval-augmented generation (RAG) to directly answer complex research questions with cited references, streamlining the process of synthesizing information.
Synthesis Table Creation: AI tools like Elicit, SciSpace and Scite.ai assistant can extract and organize data into synthesis matrices, significantly reducing the effort required to compare, analyze, and synthesize findings across multiple studies.

If you would like to find out more on how these tools can help with your research, reach out to us at SMU Libraries.
