Georgetown University's CSET Accelerates AI Development with Snorkel Flow

Technology Category

Analytics & Modeling - Machine Learning

Analytics & Modeling - Natural Language Processing (NLP)

Applicable Industries

Education

Software

Applicable Functions

Product Research & Development

Quality Assurance

Use Cases

Automated Disease Diagnosis

Predictive Quality Analytics

Services

Data Science Services

Software Design & Engineering Services

Challenge

CSET's data science team wanted to build NLP applications that classify scientific articles such as technical papers on virology. The team realized that manual labeling workflows would be impractical for the task.

About Customer

Georgetown University's Center for Security and Emerging Technology (CSET) is a leading research organization focused on studying the security implications of emerging technologies. The center aims to provide policymakers with data-driven analysis and recommendations to navigate the complex landscape of technological advancements. CSET's data science team is dedicated to developing advanced AI models to classify and analyze scientific articles, particularly in the field of virology. The team faced significant challenges in managing the manual labeling workflows required for their NLP applications, which led them to seek more efficient and scalable solutions.

Solution

Using the integrated analysis tools in Snorkel Flow, the team pinpointed data slices for domain-expert spot-checks and troubleshooting, which improved accuracy. This approach powered an active learning workflow that made the labeling process more efficient. Snorkel Flow features such as autosuggest and cluster labeling functions (LFs) enabled the team to create 107,000 programmatic labels and cut labeling time by 50%. Within days, the team reached 85% accuracy on a classification model.
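To make the idea of programmatic labeling concrete, here is a minimal sketch in plain Python. It does not use Snorkel Flow's API, and it is not the CSET team's actual method. The keyword lists, the function names, and the simple majority vote are all hypothetical, chosen only to illustrate how several labeling functions can each vote or abstain on an article and how those votes can be combined into one label.

```python
from collections import Counter

# Label values (hypothetical; chosen for this illustration only).
ABSTAIN = -1
OTHER = 0
VIROLOGY = 1


def lf_virus_terms(text):
    """Vote VIROLOGY when the text mentions a virus-related keyword."""
    lowered = text.lower()
    terms = ("virus", "viral", "virology")
    return VIROLOGY if any(t in lowered for t in terms) else ABSTAIN


def lf_pathogen_terms(text):
    """Vote VIROLOGY when the text names a pathogen-related keyword."""
    lowered = text.lower()
    terms = ("coronavirus", "influenza", "pathogen")
    return VIROLOGY if any(t in lowered for t in terms) else ABSTAIN


def lf_unrelated_terms(text):
    """Vote OTHER when the text contains an off-topic keyword."""
    lowered = text.lower()
    terms = ("galaxy", "semiconductor", "macroeconomic")
    return OTHER if any(t in lowered for t in terms) else ABSTAIN


LABELING_FUNCTIONS = [lf_virus_terms, lf_pathogen_terms, lf_unrelated_terms]


def programmatic_label(text):
    """Combine labeling-function votes by simple majority.

    Returns ABSTAIN when no function votes or when the top votes tie.
    Majority voting is a simple stand-in here, not a learned label model.
    """
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


if __name__ == "__main__":
    titles = [
        "Influenza virus replication in host cells",
        "Semiconductor yield modeling",
        "A study of medieval poetry",
    ]
    for title in titles:
        print(title, "->", programmatic_label(title))
```

Articles on which every function abstains, or on which the votes tie, are natural candidates for the domain-expert spot-checks described above.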

Impact #1

The team created 107,000 programmatic labels using Snorkel Flow features such as autosuggest and cluster LFs.

Impact #2

Labeling time fell by 50%, which improved the team's productivity.

Impact #3

The active learning workflow let the team pinpoint data slices for domain-expert spot-checks and troubleshooting, which improved model accuracy.

Benefit #1

107,000 programmatic labels created.

Benefit #2

50% reduction in labeling time.

Benefit #3

85% accuracy on a classification model within days.
