
Accelerating NLP Application Development with Foundation Models: A Pixability Case Study

Snorkel AI
Analytics & Modeling - Machine Learning
Analytics & Modeling - Natural Language Processing (NLP)
Cement
Education
Product Research & Development
Warehouse & Inventory Management
Chatbots
Virtual Training
Data Science Services
Training
Pixability, a data and technology company, provides advertisers with the ability to accurately target content and audiences on YouTube. However, with over 700 million hours of YouTube content being watched daily, Pixability faced the challenge of continuously and accurately categorizing billions of videos to ensure ads run on brand-suitable content. Their existing natural language processing (NLP) model for classifying videos was not performing well enough. Labeling training data for the machine learning solution was slow because it relied on external data labeling services that required multiple iterations. Collaboration was constrained because domain experts and data scientists had limited time to resolve ambiguous labels. Additionally, valuable information within titles, descriptions, content, and tags was difficult to normalize.
Pixability is a data and technology company that enables advertisers to accurately target the right content and audience on YouTube. They use machine learning to automatically identify and categorize YouTube content, helping advertisers maximize their reach with suitable content and optimize ad spend. Pixability's services are crucial for brands looking to maximize their reach on YouTube, a platform where viewers watch over 700 million hours of content daily. By providing granular insights into the suitability of content for brand alignment, Pixability helps advertisers ensure their ads are seen by the right audience, thereby improving the return on their video ad spend.
Pixability turned to Snorkel Flow’s Data-centric Foundation Model Development workflow to build an NLP application in less time than it took a third-party data labeling service to label a single dataset. This workflow allowed Pixability to scale up the number of classes they could classify to over 600 while also increasing model accuracy to over 90%. The team used Snorkel Flow’s Foundation Model Warm Start with zero-shot learning to jump-start training data creation. They then used Foundation Model Prompt Builder to develop and refine prompts that corrected out-of-the-box foundation model (FM) errors and pulled more domain-specific knowledge from various FMs. They created prompts that asked the FM to classify videos based on the description. This programmatic approach to labeling data using knowledge from foundation models generated 500,000 labeled training data points that were used to train a model with 90% accuracy. The team was also able to unlock multi-label NLP capabilities, providing more specific classifications for videos.
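The prompt-based labeling idea described above can be illustrated with a minimal Python sketch. It is not Snorkel Flow's API or Pixability's implementation: the class list, the prompt wording, the stub "foundation models", and the simple majority vote used to combine their answers are all assumptions made for illustration. In practice each stub would be replaced by a call to a real model, and a more sophisticated aggregation step could replace the vote.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical class list; the case study mentions over 600 classes.
CLASSES = ["basketball", "hockey", "cooking"]


def build_prompt(description: str, classes: List[str]) -> str:
    """Format a zero-shot classification prompt (wording is an assumption)."""
    options = ", ".join(classes)
    return (
        f"Classify the YouTube video described below into one of: {options}.\n"
        f"Description: {description}\n"
        "Answer with the category name only."
    )


def parse_label(response: str, classes: List[str]) -> Optional[str]:
    """Map a raw model response to a known class, or None to abstain."""
    answer = response.strip().lower()
    return answer if answer in classes else None


def label_with_fms(
    description: str,
    classes: List[str],
    fms: List[Callable[[str], str]],
) -> Optional[str]:
    """Ask every model the same prompt and majority-vote the valid answers."""
    prompt = build_prompt(description, classes)
    votes = [parse_label(fm(prompt), classes) for fm in fms]
    votes = [v for v in votes if v is not None]
    if not votes:
        return None  # every model abstained
    return Counter(votes).most_common(1)[0][0]


def _description_of(prompt: str) -> str:
    """Helper for the stubs: pull the description line out of the prompt."""
    return prompt.split("Description: ", 1)[1].split("\n", 1)[0].lower()


def stub_fm_a(prompt: str) -> str:
    """Stand-in for a real foundation model call (hypothetical)."""
    return "Basketball" if "dunk" in _description_of(prompt) else "unknown"


def stub_fm_b(prompt: str) -> str:
    """Second stand-in model with slightly different behavior (hypothetical)."""
    text = _description_of(prompt)
    if "dunk" in text or "nba" in text:
        return "basketball"
    if "puck" in text:
        return "hockey"
    return "not sure"


if __name__ == "__main__":
    for desc in ["Top 10 dunks of the NBA season", "Slow puck handling drills"]:
        print(desc, "->", label_with_fms(desc, CLASSES, [stub_fm_a, stub_fm_b]))
```

Letting a model abstain when its answer falls outside the class list keeps malformed responses out of the training labels, which is one way prompt refinement and label quality can be tied together.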
By leveraging Snorkel Flow’s Data-centric Foundation Model Development workflow, Pixability was able to create a model in weeks instead of months. This not only accelerated their product roadmap by several months but also unlocked new capabilities that will help them provide deeper insights and improved services to their customers. The programmatic approach to labeling data in-house gave the Pixability team greater control over their NLP training data creation and enabled rapid iteration, freeing capacity to expand to more use cases. The increased granularity of video classification, from broad categories like 'sports' to more specific ones like 'basketball' or 'hockey', allows Pixability to better place their customers’ ads on the most suitable YouTube content, thereby improving both the return on customer video ad spend and satisfaction with Pixability’s services.
Built an NLP application in less time than it took a third-party data labeling service to label a single dataset.
Scaled up the number of classes they could classify to over 600.
Increased model accuracy to over 90%.