By Aaron Tay, Head, Data Services
Last month, we introduced our webinar series on research impact beyond traditional metrics with Euan Adie's talk on Overton and policy impact. This second instalment tackles an equally important but often overlooked dimension: measuring teaching impact through curriculum data.
Joe Karaganis from Open Syllabus delivered a fascinating session on how course syllabi—long treated as ephemeral administrative documents—can be transformed into a powerful lens for understanding what actually gets taught. [Watch the full recording here.]

Building a Citation Index for Teaching
We're familiar with how Scopus indexes citations between journal articles, and how Overton tracks citations in policy documents. Open Syllabus does something analogous for teaching: it builds what is essentially a citation index of teaching resources.
The project collects syllabi at scale (currently around 28 million), parses their contents, and normalises assigned works into structured data. The result is a searchable database of the texts, articles, and other materials actually being assigned to students in classes.
While researchers have analysed syllabi before, most earlier work was small-scale. By the mid-2010s, advances in machine learning made it feasible for a relatively small team to process a corpus this large—identifying roughly 5 million unique works by matching title-author combinations against catalogues including Open Library, Library of Congress, and OpenAlex.
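To give a flavour of what "matching title-author combinations against catalogues" involves, here is a minimal sketch of key-based normalisation. This is purely illustrative: the function name, the cleaning rules, and the toy catalogue are my own assumptions, and Open Syllabus's actual pipeline uses machine-learning-based parsing and far fuzzier matching than this.

```python
import re

def clean(s: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    s = re.sub(r"[^\w\s]", "", s.lower())
    return " ".join(s.split())

def normalise_key(title: str, author: str) -> str:
    """Build a simple match key from a title-author pair.

    Author tokens are sorted so that "Achebe, Chinua" and
    "Chinua Achebe" produce the same key despite different
    name ordering. (Illustrative only.)
    """
    author_key = " ".join(sorted(clean(author).split()))
    return f"{clean(title)}|{author_key}"

# A toy catalogue mapping match keys to canonical work identifiers.
catalogue = {
    normalise_key("Things Fall Apart", "Chinua Achebe"): "work:123",
}

# A citation extracted from a syllabus, with messier formatting,
# still resolves to the same catalogue record.
extracted = normalise_key("THINGS FALL APART.", "Achebe, Chinua")
```

Even this toy version shows why catalogue coverage matters: a work can only be counted if some catalogue (Open Library, Library of Congress, OpenAlex) supplies a record it can be matched to.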
From Counting Assignments to Curriculum Analytics
In its early years, Open Syllabus focused on classic citation-index features: identifying works, counting how often they're assigned, and producing rankings. At its simplest, the interface identifies a work (a book, article, or other resource), counts how many syllabi it appears in, and normalises this to a score from 1 to 100.
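One plausible way to normalise raw appearance counts onto a bounded 1-100 scale is a percentile-style rescale, sketched below. The published Open Syllabus score is computed differently (and the exact formula isn't described in the talk), so treat this only as an illustration of why normalisation is useful: it lets works be compared even as the corpus grows.

```python
def syllabus_score(count: int, all_counts: list[int]) -> int:
    """Rescale a raw syllabus-appearance count to a 1-100 score.

    Uses the work's percentile rank among all counted works.
    Hypothetical stand-in for the real Open Syllabus score.
    """
    below_or_equal = sum(1 for c in all_counts if c <= count)
    pct = below_or_equal / len(all_counts)
    return max(1, round(pct * 100))

# Toy distribution of appearance counts across seven works:
counts = [1, 2, 2, 5, 40, 300, 1200]
top = syllabus_score(1200, counts)    # most-assigned work scores 100
rare = syllabus_score(1, counts)      # rarely assigned work scores low
```

A percentile scale has the nice property that a handful of extremely popular works (introductory textbooks, canonical novels) don't compress everything else toward zero, which a linear rescale would do.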

This syllabus-based metric offers an alternative or complement to traditional citation measures when assessing impact for hiring, promotion, and other evaluation exercises. Joe noted, however, that this application remains ad hoc and experimental—it still needs to be "socialised" within academic communities before gaining widespread acceptance.
Over time, the project has moved beyond simple counting toward richer curriculum analytics. One promising direction is course similarity: algorithmically assessing how similar two courses are across different institutions or countries. This has practical applications for student transfer and credit recognition—determining whether courses at a "home" institution are genuinely equivalent to those at a "destination" institution.
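A simple baseline for course similarity is set overlap between the two courses' assigned readings, shown below as Jaccard similarity. This is my own hypothetical baseline, not Open Syllabus's method; a real transfer-credit comparison would presumably also weigh course descriptions, topics, and learning outcomes.

```python
def course_similarity(readings_a: set[str], readings_b: set[str]) -> float:
    """Jaccard similarity between two courses' assigned-reading sets.

    Returns |A ∩ B| / |A ∪ B|: 1.0 for identical reading lists,
    0.0 for disjoint ones. (Illustrative baseline only.)
    """
    if not readings_a and not readings_b:
        return 0.0
    overlap = readings_a & readings_b
    union = readings_a | readings_b
    return len(overlap) / len(union)

# Hypothetical work identifiers for a "home" and "destination" course:
home = {"work:camus-stranger", "work:sartre-nausea", "work:kafka-trial"}
dest = {"work:camus-stranger", "work:kafka-trial", "work:dostoevsky-notes"}
score = course_similarity(home, dest)  # shares 2 of 4 distinct works
```

Because the corpus already resolves different editions and translations to the same work, even this naive measure would treat a course assigning a French edition of *L'Étranger* and one assigning an English translation as overlapping.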
Tracking Open Content and Skills
Because syllabi sit close to actual classroom practice, they offer a valuable window into how much teaching uses open content. Open Syllabus can also track adoption of open educational resources (OER), open access monographs, and open textbooks—distinguishing between these different communities since they serve different audiences. One practical use case: benchmarking your institution's open content adoption against peers and/or looking at which open content is assigned for which courses.

A newer direction represents courses as bundles of skills rather than just reading lists. The traditional degree is arguably too coarse a signal for employers. Course-level skills, linked to recognised taxonomies, would provide more granular information. Open Syllabus is working with skills extraction experts to apply these methods to their corpus, exploring how classroom learning can be articulated in terms directly relevant to employment.
Coverage, Bias, and Academic Freedom
Joe repeatedly emphasised a principle familiar from citation indexes: any metric reflects the corpus it's built on. Around 40% of Open Syllabus's syllabi come from the US. Some countries (like the Czech Republic) are heavily represented because institutions there use a single centralised course-publication platform that is easy to harvest. Others are more heterogeneous: syllabi may be split between library-managed reading lists and separate descriptive documents elsewhere.

There's also a significant academic freedom dimension. Faculty in some countries face risks for teaching politically sensitive topics, and the Open Syllabus team lacks the local knowledge to make nuanced decisions across 180+ countries. They initially used Freedom House's academic freedom score as a proxy, excluding syllabi from countries rated 2 and below. Singapore fell into that excluded category at the time.
Joe acknowledged this proxy is imperfect. Moreover, it may become untenable as the US is expected to fall below the threshold—forcing a rethink of the entire approach. He indicated that Singapore syllabi will be reinstated, though the corpus remains small without systematic institutional contributions.
Beyond Traditional Publications
The Open Syllabus corpus isn't limited to journal articles and books. Because of the variety of source catalogues, it also tracks musical compositions, audio-visual materials, periodicals, YouTube videos, and news media like New York Times articles. This richness helps capture the full range of materials used in teaching, though it introduces more heterogeneity and noise.
One limitation worth noting: different editions and translations are clustered into single "works," and authors aren't routinely disambiguated. This keeps the system manageable but sacrifices detail; partly as a result, institution-level metrics aren't produced on an ongoing basis. For the Financial Times collaboration (which ranked business schools by their faculty's "teaching power"), each author and institution had to be manually verified.
To address copyright and privacy concerns, syllabi are anonymised and only descriptive elements are displayed rather than full documents.
Looking Ahead
For institutions thinking about research impact, teaching quality, and graduate outcomes in a more integrated way, Open Syllabus points toward a future where course design, skills, and impact metrics are much more tightly connected than in the traditional citation-only world. With a sufficiently large institutional corpus, the data can track how assigned readings evolve over time, analyse the integration of open content, extract learning outcomes, and experiment with skills-based course representations.
It's early days, but it seems to me that the potential for connecting what we teach to how we measure impact is genuinely promising.
For a deeper discussion, watch the full recording here.