Computational Approaches to Curation at Scale for Biomedical Research Assets (R01 Clinical Trial Not Allowed)
This funding opportunity supports U.S.-based researchers and institutions in developing innovative computational methods for automating the curation of large biomedical datasets to improve data accessibility and quality for scientific research.
Description
The National Library of Medicine (NLM) invites applications, under the R01 Research Project Grant mechanism, for research on computational approaches to large-scale curation of biomedical research data. The opportunity seeks innovative projects that develop, test, and validate automated curation methodologies for managing and integrating large biomedical datasets. The overarching goal is to facilitate access to high-quality, FAIR (Findable, Accessible, Interoperable, and Reusable) data, which is crucial for advancing data-driven biomedical research and supporting open science. Research projects should develop computational techniques that enhance data annotation, integration, and management for diverse biomedical data types, such as genomics, proteomics, health records, and epidemiological data.
This funding opportunity encourages development of new computational tools or improvement of existing open-source tools that streamline data curation, maintain data integrity, and advance FAIR principles. Examples of research topics include automated curation of multi-omic data, text and metadata extraction, quality control methods for data accuracy, and innovative methods for harmonizing data across platforms. Proposals should detail specific digital asset types, data provenance, intended users, and use cases, along with clear metrics for assessing the curation approach’s efficiency, accuracy, and utility.
Applications should align with the NLM's strategic goal of making biomedical research outputs widely discoverable and accessible. Projects must present scalable solutions and detail how the proposed methods exceed current standards in speed, quality, and cost-effectiveness. Non-responsive applications, such as those focusing on manual curation, financial datasets, or NIH-defined clinical trials, will not be considered. Projects must also make their results, including software and curated datasets, openly accessible so that they are broadly applicable and available for future research use.
The NLM expects to fund projects with budgets up to $250,000 in direct costs annually for up to four years, depending on project scope and justification. Applications may be new, renewal, or resubmission, and the total number of awards is contingent on NIH appropriations and proposal quality. Only U.S.-based institutions and organizations are eligible to apply, including higher education institutions, nonprofits, small businesses, government agencies, and certain Native American tribal organizations. Foreign entities may not apply as primary applicants, but they may participate as collaborators.
Applicants are encouraged, but not required, to submit a letter of intent 30 days before the due date. Key submission dates include an initial due date of January 28, 2025, with subsequent cycles in May and September. Applications must be submitted electronically via ASSIST, Grants.gov, or an institution’s system-to-system solution and must comply with NIH’s updated FORMS-I application package requirements, which will be available prior to the submission dates.
Review criteria will assess the project’s significance, innovation, rigor, feasibility, investigator expertise, and institutional resources. Additional considerations include human subjects protections, animal welfare (if applicable), and data privacy measures for any aggregated public data. Successful applications should demonstrate significant advancement in automated biomedical data curation and offer durable solutions that can adapt as new research needs emerge.