
Georgetown University's CSET Accelerates AI Development with Snorkel Flow

Analytics & Modeling - Machine Learning
Analytics & Modeling - Natural Language Processing (NLP)
Education
Software
Product Research & Development
Quality Assurance
Automated Disease Diagnosis
Predictive Quality Analytics
Data Science Services
Software Design & Engineering Services
CSET's data science team wanted to build NLP applications that classify scientific articles, such as technical papers on virology. The team realized that manual labeling workflows would be impractical for the task.
Georgetown University's Center for Security and Emerging Technology (CSET) is a leading research organization that studies the security implications of emerging technologies. The center provides policymakers with data-driven analysis and recommendations to help them navigate technological change. CSET's data science team develops AI models that classify and analyze scientific articles, particularly in virology. Manual labeling workflows for these NLP applications proved difficult to manage, which led the team to seek a more efficient and scalable solution.
Using the integrated analysis tools in Snorkel Flow, the team pinpointed data slices for domain expert spot-checks and troubleshooting to improve accuracy. This approach powered an active learning workflow that made the labeling process significantly more efficient. Snorkel Flow's advanced features, such as autosuggest and cluster labeling functions (LFs), enabled the team to create 107,000 programmatic labels. This reduced labeling time by 50% and improved both productivity and accuracy. Within days, the team achieved 85% accuracy on a classification model.
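To make the idea of programmatic labeling concrete, here is a minimal sketch in plain Python. It does not use Snorkel Flow's API, and the case study does not describe CSET's actual labeling functions: the keyword rules, the label names, and the simple majority vote used to combine votes are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical label values chosen for this sketch only.
ABSTAIN = -1
OTHER = 0
VIROLOGY = 1


def lf_mentions_virus(text):
    """Vote VIROLOGY when the text contains a virology keyword (assumed rule)."""
    lowered = text.lower()
    keywords = ("virus", "viral", "virology")
    return VIROLOGY if any(k in lowered for k in keywords) else ABSTAIN


def lf_mentions_other_field(text):
    """Vote OTHER when the text contains an unrelated-field keyword (assumed rule)."""
    lowered = text.lower()
    keywords = ("galaxy", "superconductor")
    return OTHER if any(k in lowered for k in keywords) else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_virus, lf_mentions_other_field]


def majority_label(text, lfs=LABELING_FUNCTIONS):
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN  # conflicting votes with no clear winner
    return counts[0][0]


def needs_expert_review(texts):
    """Return the texts the rules could not label, as candidates for spot-checks."""
    return [t for t in texts if majority_label(t) == ABSTAIN]
```

Each labeling function encodes one heuristic and may abstain, so many documents can be labeled at once without hand-annotating each one. The documents that remain unlabeled or conflicted form a slice that a domain expert could review, loosely mirroring the spot-check loop described above. A majority vote is only a stand-in for whatever aggregation a real system uses.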
The team created 107,000 programmatic labels using Snorkel Flow's advanced features, such as autosuggest and cluster LFs.
There was a 50% reduction in labeling time, which significantly improved the team's productivity.
The active learning workflow allowed for pinpointing data slices for domain expert spot-checks and troubleshooting, enhancing the accuracy of the models.
107,000 programmatic labels created.
50% reduction in labeling time.
85% accuracy on a classification model within days.