By Aaron Tay, Head, Data Services
SciSpace is one of the “AI powered” academic search engines that leverage large language models for relevancy ranking and data extraction. We have had our eye on it for a while: we published an early review in August 2023 and followed it with a webinar on SciSpace in Jan 2024 (see recording here).
We are happy to announce that SMU Libraries have started a subscription to SciSpace, which gives you access to a SciSpace premium account. You can reach it via either the Database A-Z or the library search. You will need to complete a form to get the ID/password for the account. Do note this is a shared account, suitable for casual one-off use and testing.
If you are a faculty member or a PhD student intending to use SciSpace for an intensive project (e.g. systematic review, grant-funded projects), please contact us to request a dedicated SciSpace premium account. Each request will be decided on a case-by-case basis.
SciSpace premium account and new features
We have already covered the main features in both our review last year and our webinar in Jan 2024 (recording here).
But like any cutting-edge emerging tool, SciSpace evolves quickly. Here are a few new features that caught my eye, as well as some features specific to the SciSpace premium account.
Many “AI search tools” can now use retrieval augmented generation to generate answers or summaries from the top N papers found (for example, Scite.ai assistant (webinar recording), to which we also have a subscription). SciSpace can do this as well for the top 10 papers found, and also offers a “high quality” mode that is available only to premium accounts.
It goes without saying that such tools, SciSpace included, always carry a chance of hallucinations. They tend not to make up fictional papers wholesale (since the papers are found via retrieval augmented generation), but their citations and interpretations of a paper can still be wrong!
See this mini-review by HKUST library, and here’s another analysis focusing on law search tools.
Exporting to CSV and setting instructions when creating custom columns
But arguably more useful is SciSpace’s ability to extract data from papers and generate a synthesis matrix/table of papers. Not familiar with what this means? It’s essentially a table that lets you compare papers across different aspects. For example, you would have a row for each paper in a topic, and the columns could be named “Methodology”, “Sample size”, “Limitations” etc. Still unsure? Look at the next image.
With a SciSpace premium account, you can now export the table into CSV!
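Once you have the CSV, you can work with the synthesis matrix outside SciSpace. Here is a minimal Python sketch of one thing you might do with it: list the cells where nothing was extracted, so you know which papers to check by hand. The sample data, the column names (including “Title”) and the helper functions are my own assumptions for illustration; the layout of a real SciSpace export may differ.

```python
import csv
import io

# Hypothetical sample standing in for a SciSpace CSV export.
# The real column names and layout may differ.
SAMPLE_EXPORT = """Title,Methodology,Sample size,Limitations
Paper A,Survey,120,Self-reported data
Paper B,Interviews,,Small sample
Paper C,Experiment,48,
"""


def load_matrix(csv_text):
    """Parse CSV text into a list of dicts, one per paper (row)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def find_empty_cells(rows):
    """Return (title, column) pairs where the extracted cell is blank."""
    gaps = []
    for row in rows:
        for column, value in row.items():
            if column != "Title" and not (value or "").strip():
                gaps.append((row["Title"], column))
    return gaps


rows = load_matrix(SAMPLE_EXPORT)
for title, column in find_empty_cells(rows):
    print(f"{title}: no value extracted for '{column}'")
```

To use this on a real export, you would read the file's text (for example with `open(path, newline="", encoding="utf-8").read()`) and pass it to `load_matrix`, adjusting the “Title” column name to whatever your export actually uses.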
You are of course not limited to the default columns; you can add your own custom columns. Earlier versions of SciSpace only allowed you to name the column, but the latest version goes further and lets you give specific instructions on what to extract. In the following example, I ask SciSpace not just to extract the discipline covered but also to classify it into four broad areas.
Uploading full-text and asking queries
Data extraction using SciSpace as shown above is limited, because the information is drawn mostly from the abstract, with some limited full-text from open sources. This may not be ideal if you want details typically not found in the abstract.
For this reason, SciSpace lets you upload PDFs into a personal library and run queries or extractions over them.
Warning: If you are using a shared account, other SMU users might be able to see the files you upload. Do not upload private and confidential files into SciSpace.
You can even organise your uploaded articles into folders and run separate analyses or queries over each folder.
Note: The same warning above about hallucinations applies here. Please verify the extractions!
If you use this method to extract data, you can mouse over the citations and click on “Locate in PDF” to check whether the data extraction was done correctly.
Conclusion
Based on comments from users during our trial, there is no doubt that SciSpace is a useful tool with some fairly unique features. However, as with all tools using generative AI/large language models, great care must be taken when using it, as hallucinations will always be a factor.