NCDS Data Internship

About Data Librarianship

Data Services Internship

The ability to provide data services is now a part of many job openings and new opportunities in health sciences and academic libraries. In turn, the goal of the NCDS Internship Program is to provide practical experiences to interns that include the soft and hard skills needed to enter data librarian positions, including working on a team, understanding the data lifecycle, and working with data. In an effort to diversify the profession, the National Center for Data Services (NCDS) of the Network of the National Library of Medicine (NNLM) provides internships for people from underrepresented racial and ethnic groups. The goal of this program is to introduce students from historically excluded racial and ethnic groups to data librarianship in a health sciences context. These paid internships offer opportunities to gain practical experience while working with a mentor in a guided environment on structured, data-related projects. The practical experiences developed during the internship will provide participants with skills needed to be competitive for data librarian positions.

Up to 12 applicants will be selected.

The internships are flexible and unique in that they allow students to participate either on-site or remotely. Depending on the location and interest of the intern, these internships will be in-person or virtual (possibly including a site visit), and based at a host library or within the NCDS.

**Due to the ongoing pandemic, all internships are currently 100% virtual.**

About NNLM

The mission of the Network of the National Library of Medicine (NNLM) is to advance the progress of medicine and improve the public's health by providing U.S. researchers, health professionals, public health workforce, educators, and the public with equal access to biomedical and health information resources and data. NNLM’s main goals are to work through libraries and other members to support a highly trained workforce for biomedical and health information resources and data, improve health literacy, and increase health equity through information. The NNLM Regional Medical Libraries (RMLs), Offices, and Centers rely upon partnerships with Network members to achieve these goals by providing training and funding and other opportunities for development. 

NCDS held an information session about this internship on March 8:

We will have another session on the following date:

March 24, 1 pm Eastern

Dates

The 10-week internship runs from June to August each year, contingent upon the availability of funds.

Eligibility

  • Must be a US citizen or permanent resident.
  • Must be currently enrolled in an accredited LIS graduate program.
  • Must be a member of a marginalized racial or ethnic group.
  • Must complete all application materials, and if selected, must provide three professional references.

Projects

There are multiple projects available from our site partners. These projects involve structured activities including data cleaning, structuring, analysis, or visualization, or a guided in-depth project in data curation. Each intern will be provided with training and mentoring throughout.

2023 Project Details

Project 1: "Exploring the NNLM Data Warehouse." (NNLM National Evaluation Center) The NNLM Data Warehouse stores NNLM data from the early 2000s to the present. This data is used to feed interactive dashboards, and NNLM members have the option to request data for their own research, analysis, and reporting. This project will help adapt existing data to a new data reporting system (CiviCRM) and add modifications and improvements to enhance reporting capabilities and analyses.

Data Skills to be Developed: 1) Learn how to interpret and understand data models and database schemas 2) Learn how to use relational databases 3) Become proficient in the SQL programming language 4) Use Python to interact with the database and create data pipeline scripts 5) Integrate the GitHub code repository into your coding workflow 6) Learn how to retrieve data from APIs 7) Create data visualizations with Tableau

---

Project 2: "Enhanced Research Metrics: Turning Publication Data into Actionable Insights" (Edward G. Miner Libraries, University of Rochester) Research metrics, such as author collaboration networks and publication impact, can provide valuable insights into scientific production. However, collecting and organizing these data can be a time-consuming and resource-intensive process. This project aims to utilize bibliographic data from Scopus to generate research metrics for various medical departments at our medical center.

Data Skills to be Developed: 1) Data analysis and visualization 2) Data manipulation and cleaning 3) Use of an IDE (e.g., Visual Studio Code) 4) Familiarity with the R biblioshiny package

---

Project 3: "Ecology of infectious disease" (Cary Institute of Ecosystem Studies) We are looking to predict the next disease outbreak before it happens. This project focuses on (1) identifying animals that amplify disease, using computer algorithms that compare traits of known disease carriers with species not yet known to carry disease, and (2) examining which combinations of species, pathogens, and environmental conditions give rise to disease outbreaks. Interns will help clean and augment a subset of data from the Global Infectious Disease and Epidemiology Network (GIDEON) https://www.gideononline.com/.

Data Skills to be Developed: 1) Creating tidy data 2) Working with and exploring biological and ecological data about mammals and zoonotic pathogens around the world 3) Using R, Python, or other scripting and coding languages 4) Plotting and visualizing data 5) Learning about data and software management plans

---

Project 4: "Git Primer Development" (The Data Curation Network) Data Curation Primers are detailed reference documents centered on a specific subject, disciplinary area, or curation task that can be used by curators when curating a dataset that falls outside of their expertise. These step-by-step resources provide a shared knowledge base for a specific data format, method, or tool. Interns will help develop a primer for Git, a distributed version control system that tracks changes in any set of computer files.

Data Skills to be Developed: 1) Git competency 2) GitHub competency 3) Familiarity with different data types 4) Data literacy   

---

Past Programs

2022 Project Details

2022 Internship Participants

Some Project Outcomes

Clinical Trials Data Primer by Liliana Gonzalez, Mikala Narlock, and Shawna Taylor

My Summer Code-Ability by Jodecy Guerra

NCDS Data Librarianship Intern Resource Guide by Aundria Parkman, Justin de la Cruz, Genevieve Milliken, Peace Ossom-Williamson, Mikala Narlock, Shawna Taylor, Jennifer Darragh, Wind Cowles, and Scout Calvert

Application

Up to 12 applicants will be selected.

Applications are open in March of each year. To apply, please complete the following application form (the link will be updated annually). Applications are reviewed according to a rubric, and applicants under review will need three references to complete a short reference form from NYU Langone Health.

2023 Application Form 

Would you like to contact a coach? 

A coach is a person external to the process who can assist with ideas for your letter or questions about how to format a resume. Want someone to review your materials and give you feedback? Or are you looking for any other support as you work on your application?

Contact Negeen Aghassibake, Data Visualization Librarian at University of Washington Libraries: negeena@uw.edu

NIH Public Access Policy

Recipients of NNLM funding are required to deposit any peer-reviewed manuscript upon acceptance for publication in PubMed Central in accordance with the NIH Public Access Policy.

Data Sharing and Development of Training Materials

To facilitate the dissemination of knowledge and information associated with the NNLM Cooperative Agreement Award, all recipients are required to share any data or training material resulting from funding. This information must be submitted to the following collection sites as applicable: the Network of the National Library of Medicine (NNLM) website; other websites specifically designated by the NLM as part of the Network of the National Library of Medicine (considering changes in the project and data repositories required to maintain sharing within the Network). In addition, recipients of funding are expected to use or adapt existing training materials before developing new materials. Consult with your RML/Office and the NNLM Training Office (NTO) prior to developing materials. Publication and Copyrighting: Per Section 8.2.1, Rights in Data (Publication and Copyrighting), of the NIH Grants Policy Statement, the NIH must be given a royalty-free, nonexclusive, and irrevocable license for the Federal government to reproduce, publish, or otherwise use any materials developed as a result of funding and to authorize others to do so for Federal purposes, i.e., the ongoing development of the Network of the National Library of Medicine.

Data developed by participants and consultants are also subject to this policy.

NIH Acknowledgement

Any resources developed with project funds must include an acknowledgment of NIH grant support and a disclaimer. Please consult with the NCDS for the specific acknowledgement statement to be used for your project award.

Application Review and Scoring Criteria

Applications are reviewed by reviewers selected by the RML/Office. The Review Committee is made up of Network members who have representative data librarian experience. The Review Committee will make final recommendations for selection to the Associate Director of NCDS.

Applicants may request to receive a copy of reviewer comments and rubric. If further clarification is needed, the applicant will be given 1 week to submit more information.

Scored Review Criteria

The application will be scored in the following areas:

  • Describes applicant's interest in the internship and how the internship would benefit the applicant. (20 pts)
  • Describes the applicant's experience working with or seeking to learn about data or providing data services. (5 pts)
  • Describes and details projects the applicant may be interested in undertaking as part of the internship. (5 pts)
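
The three criteria above add up to a maximum of 30 points. As a rough illustration only, the tally could be sketched as follows. The criterion labels, the function name, and the sample scores are hypothetical and are not part of the official rubric; only the point maximums come from the list above.

```python
# Maximum points per criterion, taken from the scored review criteria above.
# The dictionary keys are hypothetical labels, not official rubric names.
MAX_POINTS = {
    "interest_and_benefit": 20,
    "data_experience": 5,
    "project_interests": 5,
}


def total_score(scores: dict[str, int]) -> int:
    """Sum a reviewer's scores after checking each against its maximum."""
    total = 0
    for criterion, maximum in MAX_POINTS.items():
        points = scores.get(criterion, 0)  # a missing criterion counts as 0
        if not 0 <= points <= maximum:
            raise ValueError(f"{criterion} must be between 0 and {maximum}")
        total += points
    return total


# Highest possible total across all criteria.
max_total = sum(MAX_POINTS.values())

# Made-up sample scores, for illustration only.
example = total_score(
    {"interest_and_benefit": 16, "data_experience": 4, "project_interests": 3}
)
print(f"{example} out of {max_total}")
```

Because the first criterion carries 20 of the 30 points, the statement of interest accounts for two-thirds of the possible score.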

Acceptance

Applicants who are accepted will be notified in late April. Onboarding procedures will follow in order to begin processing for employment. NCDS will provide an internship agreement for all selected candidates.

Participation

Student interns will attend regular meetings and check-ins, complete all assigned trainings, and submit a final project report. Students are strongly encouraged to submit their work to conferences and journals. Each intern will have the opportunity to receive funding from the NCDS to present at one conference where their proposal has been accepted.
