Data Labeling for Medical Text

Natural language processing (NLP) methods such as named entity recognition extract insights from unstructured medical text, including electronic medical records (EMRs), de-identified patient information, and scientific literature.

Our annotation services can quickly tag thousands of text strings, conversations, paragraphs, and more, organizing large volumes of data for AI applications. Examples include classifying research citations, identifying medical misinformation on social media sites, and determining drug-target relationships in pharmacological literature.

Data Types

Text

Case Studies

Dataseer

Dataseer’s vision is a world where full research datasets, not just the final paper, are made available to the scientific community worldwide. Centaur Labs provided a skilled network of labelers to label data types across dozens of domains in over 20,000 snippets from academic papers. The speed of Centaur’s labeling pipeline allowed Dataseer to augment rare classes in their training set quickly and confidently.

Factmata

Factmata teamed up with Centaur Labs to train their content moderation algorithm to identify medically harmful misinformation related to COVID-19. For a dataset of over 4,000 tweets, Centaur Labs collected over 14 opinions per tweet in less than one week. Factmata’s non-linear SVM classifier achieved an initial accuracy of 68% with this input.

scite

scite is developing a smarter citation metric and requires thousands of classified snippets from academic papers to do so. Centaur matched over 7,000 citation snippets with a skilled labeling force capable of extracting nuance from noise. What would have taken scite’s internal labeling team of PhDs months to perform was completed in a few weeks.

Customer Testimonials

Centaur Labs is an amazing resource. So amazing that I almost don't want others to know about it because of how easy they make it to build up high quality training data. Their user base combined with their application allowed us to easily and efficiently scale up our training data in a very controlled way that ensured it was high quality based on metrics we cared about in-house. I would without hesitation recommend their service, so long as you are not doing what we are doing!

Josh Nicholson
Co-Founder and CEO
scite.ai

I get pitched on annotation tools all the time and Centaur is simply the best ... they offer you the accuracy and sophistication of medical experts at the price and speed of Mechanical Turk.

Dhruv Gulati
CEO
Factmata
