Bioinformatics Intern


Biotech Internship

The Center of Excellence (CoE) in Data Science and Bioinformatics in the LabCorp Information Technology department applies various data science disciplines (including artificial intelligence, machine learning, graph databases, statistics, bioinformatics, and natural language processing) to our clinical, operational, and financial challenges, and creates opportunities to enhance the value of our offerings to our customers. Additionally, it integrates the data science and bioinformatics efforts of the LabCorp diagnostics and drug development units and serves as a collaboration platform to foster teamwork and learning throughout the LabCorp IT organization.

The internship program in this CoE provides a unique opportunity for students to interact with CoE personnel and gain hands-on experience in solving real-life problems in data science and bioinformatics. It also contributes directly to LabCorp research and development efforts to address challenging data science issues and speed up critical production development efforts.

Project Description:

Utilizing Deep Learning for Position-Specific Modeling in the Detection of Low-Frequency Somatic Mutations

How is this project meaningful to the student's education?

  • The student will learn to process and analyze large datasets from high-throughput sequencing instruments.
  • The student will learn the concepts of deep learning.
  • The student will gain exposure to the practical use of dimensionality reduction and feature selection in predictive modeling.
  • The student will gain hands-on experience training machine learning methods on high-dimensional data from an imbalanced population.

How is this project valuable to the organization?

  • Liquid Biopsy: Internal studies and peer-reviewed publications have shown that technical and biological noise is locus-dependent. Such behavior may be attributed to a multitude of factors, including sequencing context, evolutionary conservation, epigenetic makeup, and process complexity. Learning and modeling the position-specific noise in DNA/RNA sequencing is key to keeping the positive predictive value (PPV) under control when detecting somatic mutations with a highly sensitive test. This should benefit an array of products that involve next-generation sequencing (NGS), including Liquid Biopsy (for cancer monitoring or companion diagnostics), tissue profiling, and tumor burden estimation. A better PPV means fewer unnecessary orthogonal confirmations and manual curations, which lowers the overall cost of the test and makes it more accessible to community oncologists and more appealing to CROs.

List project deliverables you will ask of an intern (what are their goals):

  • Survey the available deep learning tools, with their pros and cons, and bring one or more public tools in-house.
  • Educate the team through one or more seminars on deep learning concepts, with practical examples (using internal or publicly available datasets) for CNV detection and for modeling noise during SNV detection.


Key competencies, skill sets and attributes needed for this scope of work:

  • Enrolled in a graduate program in computer science, applied math, bioinformatics, or a related field
  • Coursework and research activities in machine learning
  • Coursework and research activities in statistical modeling
  • Programming experience in R and Python
  • Graduate-level knowledge of algorithms
  • Familiarity with the Linux operating system




Monday through Friday, 8:00-5:00