
Real-time Multi-lingual Classification and Sentiment Analysis of Text

Gathr
Analytics & Modeling - Big Data Analytics
Analytics & Modeling - Natural Language Processing (NLP)
Telecommunications
Business Operation
Data Science Services
The client, a major telecom company providing nationwide services, needed a system that could perform real-time, multi-lingual classification and sentiment analysis of text data. The solution had to store, index, and query petabytes (PB) of data at very high throughput. The critical requirements were to ingest and parse a high volume of data, 250 million records (15 TB) per day, of varied types such as weblogs, email, chat, and files; to apply real-time multi-lingual classification and sentiment analysis with very high accuracy (four nines); to store metadata and raw binary data for querying; and to meet a query SLA of 5 s on cold data.
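To put the stated volume in perspective, the daily figures can be converted into sustained averages. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: data arrives uniformly over 24 hours (real traffic is likely burstier), "TB" means decimal terabytes, and "four nines" is read as 99.99% accuracy. The function name `daily_load` is hypothetical and not part of the case study.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def daily_load(records_per_day, bytes_per_day, accuracy=0.9999):
    """Convert daily totals into average rates.

    Assumptions (not from the case study): uniform arrival over the day,
    decimal units (1 TB = 10**12 bytes), and "four nines" = 99.99%.
    """
    return {
        # Average number of records arriving each second
        "records_per_second": records_per_day / SECONDS_PER_DAY,
        # Average ingest bandwidth in megabytes per second
        "megabytes_per_second": bytes_per_day / SECONDS_PER_DAY / 10**6,
        # Mean size of a single record in kilobytes
        "avg_record_kilobytes": bytes_per_day / records_per_day / 10**3,
        # Misclassifications per day that the accuracy target would allow
        "max_errors_per_day": records_per_day * (1 - accuracy),
    }


if __name__ == "__main__":
    load = daily_load(250_000_000, 15 * 10**12)
    for name, value in load.items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the averages work out to roughly 2,900 records per second, about 174 MB per second, a mean record size of 60 KB, and about 25,000 tolerated misclassifications per day. Peak rates would be higher than these averages.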
The customer is a major telecom company that provides nationwide telecom services. It handles a high volume of data of varied types, such as weblogs, email, chat, and files, all of which the solution had to ingest, parse, classify, and make queryable.
Impetus provided a solution consisting of three modules. The Analytics Module performed text categorization and sentiment analysis using a matrix decomposition-based text-classification algorithm: each incoming test document passed through a series of pre-processing and numerical computation steps, and the classifier was designed for very low latency. The Event Store/Indexer Abstraction Layer stored and indexed the information according to its configuration. The Publish Module published the analytical results or event data to the external system.
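The three-module split described above can be sketched roughly as follows. This is a minimal illustration, not the actual Impetus implementation: all class and method names are hypothetical, the store is a simple in-memory dictionary, and the classifier is a placeholder keyword matcher standing in for the matrix decomposition-based algorithm, which the case study does not detail. Multi-lingual handling is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Protocol


@dataclass
class AnalysisResult:
    doc_id: str
    category: str
    sentiment: str


class EventStore(Protocol):
    """Abstraction layer: any backend that can store and query results."""

    def store(self, result: AnalysisResult) -> None: ...
    def query(self, category: str) -> List[AnalysisResult]: ...


class InMemoryStore:
    """Hypothetical backend that indexes results by category."""

    def __init__(self) -> None:
        self._by_category: Dict[str, List[AnalysisResult]] = {}

    def store(self, result: AnalysisResult) -> None:
        self._by_category.setdefault(result.category, []).append(result)

    def query(self, category: str) -> List[AnalysisResult]:
        return list(self._by_category.get(category, []))


class AnalyticsModule:
    """Placeholder keyword rules, NOT the matrix-decomposition classifier."""

    CATEGORY_KEYWORDS = {
        "billing": {"invoice", "charge", "bill"},
        "network": {"signal", "outage", "coverage"},
    }
    POSITIVE = {"great", "thanks", "good"}
    NEGATIVE = {"bad", "slow", "terrible"}

    def analyze(self, doc_id: str, text: str) -> AnalysisResult:
        # Pre-processing: lowercase and split on whitespace
        tokens = set(text.lower().split())
        # First category whose keywords overlap the tokens, else "other"
        category = next(
            (name for name, words in self.CATEGORY_KEYWORDS.items() if tokens & words),
            "other",
        )
        score = len(tokens & self.POSITIVE) - len(tokens & self.NEGATIVE)
        sentiment = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        return AnalysisResult(doc_id, category, sentiment)


class PublishModule:
    """Pushes each result to every registered external sink."""

    def __init__(self, sinks: Iterable[Callable[[AnalysisResult], None]]) -> None:
        self._sinks = list(sinks)

    def publish(self, result: AnalysisResult) -> None:
        for sink in self._sinks:
            sink(result)


class Pipeline:
    """Wires the three modules together: analyze, then store, then publish."""

    def __init__(self, analytics: AnalyticsModule, store: EventStore, publisher: PublishModule) -> None:
        self.analytics = analytics
        self.store = store
        self.publisher = publisher

    def process(self, doc_id: str, text: str) -> AnalysisResult:
        result = self.analytics.analyze(doc_id, text)
        self.store.store(result)
        self.publisher.publish(result)
        return result
```

A pipeline could be assembled as `Pipeline(AnalyticsModule(), InMemoryStore(), PublishModule([print]))` and fed documents through `process`. Keeping the store behind a small interface mirrors the abstraction-layer idea: the backend could be swapped by configuration without touching the analytics or publish code.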
Rapid and accurate real-time text categorization and sentiment analysis
Adjustable text categorization for domain-specific classes
Multi-lingual support
Ingested and parsed a high volume of data: 250 million records (15 TB) per day
Achieved very high accuracy (four nines) in real-time multi-lingual classification and sentiment analysis
Met Query SLA of 5s on cold data