Getting to know Dong Danping, Senior Librarian, Research & Data Services

10 Apr 2025

Danping joined SMU Libraries in 2016 as a Research Data Management (RDM) librarian.

Since then, she has led the project to establish the SMU Research Data Repository (RDR). She is also knowledgeable in bibliometric tools such as SciVal and has a keen interest in data anonymisation. She has also worked hard to learn coding, e.g. R/Quarto and Python scripting, which she uses to optimise the internal processes needed for running SMU repositories. Together with colleagues such as Mui Yen, she also offers the visual abstract service, which helps researchers better publicise their work.

One of the co-founders of the SMU Researcher Club, she supports the needs of emerging and early-career researchers. In particular, she works closely with the College of Graduate Research Studies (CGRS) to support graduate research students.

Danping was awarded the Library Association of Singapore (LAS) Library School Scholarship Award in 2014 and the LAS Newcomer Award in 2018 for her contributions to the profession.

In this Research Radar article, we catch up with her to learn more about her and her work.

Q: Hello Danping, thank you for taking the time to chat with us. Could you tell us about your role as an academic librarian specialising in research data management and services?

I'd be happy to. When I first joined SMU Libraries, Research Data Management (RDM) was a relatively new area for me, but I quickly recognised how critical RDM is for trustworthy and reproducible research. However, RDM is frequently overlooked or deprioritised -- understandably, since academic focus tends to be on publications rather than their underlying data.

I came to realise that promoting RDM best practices goes beyond simply asking researchers to deposit data in some repository at the end of a project. It involves supporting the entire research data lifecycle: from data collection to data cleaning, from data organisation and documentation to implementing reproducible methods for analysis. This understanding motivated me to deepen my knowledge of both data practices and research workflows.

At first, the technical side of data felt daunting, but as I learned more, I began to see its value and to enjoy applying what I had learned to solve problems. Over the years with SMU, I've explored (and sometimes taught workshops on) topics such as:

OpenRefine for data cleaning (from no-code to light coding)
Qualitative analysis with ATLAS.ti (with Bryan, our law librarian)
R programming and reproducible research with Quarto (with Bella, our coding expert)
Data anonymisation techniques (for protecting sensitive information)
Python scripting for versatile data tasks and AI applications

Currently, part of my work focuses on supporting early-career researchers---graduate students, postdocs, and research staff at SMU. While the term "early-career researchers" might not capture everyone perfectly, I appreciate how it reflects their potential as the future of academia. This group tends to be particularly curious and open-minded. I aim to understand their unique challenges and provide meaningful support, whether through RDM guidance, reproducible workflows, or connecting them with relevant library resources.

Q: What are some common challenges you see researchers face with research data?

I'd like to highlight three key challenges that come to mind---though they're also opportunities for growth:

The "Paper vs. Data" Priority Gap

I totally get it: publications are the currency of academia. But this is shifting over time, as journals, funders, and institutions increasingly require proper data archiving to ensure reproducibility (see SMU's Research Data Management Policy).

Data itself is gaining importance as a research output, and I see many more researchers publishing data via platforms like GitHub, OSF, or the SMU Research Data Repository (with extra perks like librarian support and generous storage). At SMU Libraries, we help researchers navigate open data requirements and publish their data for research impact.

The potential of AI to support RDM

Generative AI tools like ChatGPT can now assist with drafting documentation, suggesting metadata, or explaining data issues, lowering the barrier to proper RDM and the resources it requires. While they're not perfect, they're a good starting point and offer handy help when paired with some expert oversight. (And of course, I'm always here for hands-on support as well!)

Keeping pace with new methods

Emerging techniques such as AI-driven analysis and computational workflows are exciting but require new skills and may seem daunting to non-technical disciplines. Here's the thing: while AI can teach these skills (e.g. coding, debugging), you still need baseline knowledge to vet its advice. That's where SMU Libraries would like to play a part as well, by providing support for computational analysis.

Q: What's one recent success story you can share?

I recently explored using AI to categorise SMU publications by our strategic priorities: Digital Transformation, Sustainability, and Growth in Asia. This task would typically require manually tagging thousands of publications, but with carefully designed prompts and DeepSeek's API, we generated a reasonably accurate classification using Python scripts. While the results may not be perfect, the approach has dramatically reduced the workload.

What's exciting to me is how accessible this kind of automation has become. With just basic coding knowledge and an understanding of programming logic---enough to articulate what you need---AI can now help write the necessary Python scripts. It's incredibly empowering to bridge that gap between idea and implementation.
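
Editor's note: for readers curious what such a script might look like, here is a minimal sketch of the prompt-and-parse pattern described in this answer. It is not Danping's actual code. The function names, the prompt wording, and the "Other" fallback label are all assumptions made for illustration, and the model call is replaced by a hypothetical offline stand-in (fake_model) so the example runs without an API key.

```python
from typing import Callable

# Category names come from the interview; "Other" is an assumed fallback label.
PRIORITIES = ["Digital Transformation", "Sustainability", "Growth in Asia"]
FALLBACK = "Other"


def build_prompt(title: str, abstract: str) -> str:
    """Compose a single-label classification prompt for one publication."""
    options = ", ".join(PRIORITIES + [FALLBACK])
    return (
        "Classify the publication below into exactly one category.\n"
        f"Categories: {options}\n"
        "Reply with the category name only.\n\n"
        f"Title: {title}\nAbstract: {abstract}"
    )


def parse_label(reply: str) -> str:
    """Map a free-text reply onto a known label, or the fallback if none match."""
    cleaned = reply.strip().strip(".\"'").lower()
    for label in PRIORITIES:
        if cleaned == label.lower():
            return label
    return FALLBACK


def classify(title: str, abstract: str, ask_model: Callable[[str], str]) -> str:
    """Classify one publication; ask_model is any function from prompt to reply."""
    return parse_label(ask_model(build_prompt(title, abstract)))


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call, used only for this demo."""
    return "Sustainability." if "carbon" in prompt.lower() else "not sure"


if __name__ == "__main__":
    print(classify("Carbon pricing in ASEAN", "We study carbon markets.", fake_model))
```

In a real workflow, fake_model would be swapped for a function that sends the prompt to the chosen provider and returns the reply text. Keeping that call behind a plain function makes the parsing logic easy to check on its own, and the fallback label gives unclear replies a place to land for manual review.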

Q: Is there anything exciting that you are working on now that you can share?

Absolutely! I've been experimenting with BERTopic to analyse text data, including library chat transcripts and SMU publications on Asia-related research. It's been fascinating to see how topic modelling can reveal hidden patterns in unstructured text, and I'm convinced these techniques could benefit researchers working with qualitative data as well.

This ties into a workshop I'm co-conducting with Aaron for the Singapore Rising Scholars Conference 2025, "Text Analysis for PhD Research: An Introduction to No-Code and Code-Light Tools", which covers BERTopic as well as no-code tools like Voyant and TALL.

Q: Finally, how can readers get in touch with you if they need support or want to learn more?

Please feel free to reach out to me! You can:

Email me at dpdong@smu.edu.sg
Send me a Teams message
Arrange a meeting to chat about anything related to your research and data

No question is too small. If I can't help directly, I'll connect you with someone who can.
