
Power massive-scale, real-time data processing by modernizing legacy ETL frameworks

Gathr
Analytics & Modeling - Real Time Analytics
Application Infrastructure & Middleware - Data Exchange & Integration
Platform as a Service (PaaS) - Data Management Platforms
Security & Public Safety
Discrete Manufacturing
Quality Assurance
Edge Computing & Edge Intelligence
Predictive Maintenance
Real-Time Location System (RTLS)
Cloud Planning, Design & Implementation Services
Data Science Services
Enterprises need to analyze large volumes of data from multiple sources in real time to make strategic business decisions. They often build custom frameworks to process these large data sets, which creates technical debt and a dependency on the IT teams who understand the historical choices made during the original platform design. This puts the business at risk and drives up customization costs. The customer, a leading security and intelligence software provider, wanted to modernize their existing big data applications. They were looking for an easy-to-use, scalable solution that could process the 1.5 billion transactions generated per day across multiple real-time feeds. They needed a near-zero-code solution for ETL processing jobs that could perform real-time ingestion and complex processing, sustain high throughput while indexing and storing data, and detect anomalies in transactions.
The customer is a leading security and intelligence software provider focused on creating powerful intelligence and investigation technologies for federal and state-level security agencies. Their solutions enable security agencies to understand cyber threats through communication-data interception, data integration, and advanced analytics that apply artificial intelligence models to big data. They were looking to modernize their existing big data applications and needed a scalable solution that could process the 1.5 billion transactions generated per day from multiple real-time feeds.
Using Gathr, the customer implemented the applications as structured streaming data pipelines running on a scalable Spark compute engine. Gathr's library of components for data acquisition, processing, enrichment, and storage formed the ETL solution, and the entire data flow was created and orchestrated in Gathr's Web Studio using a low-code methodology. Key technologies and components included Kafka for real-time data streaming, Gathr's out-of-the-box ETL components for data processing, and Gathr processors and storage components to create a polyglot persistence architecture; ClearInsight, a rapid application development platform, was also used. The new solution replaced a legacy system built on various real-time and non-blocking I/O processing frameworks, addressing challenges such as lengthy development cycles, the inability to alter processing behavior through configuration changes, time-consuming debugging and rectification, a lack of version management and hot-swap features, complex operations, and stringent SLAs on data availability and query results.
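To make the architecture concrete, the pattern described here, reading real-time feeds from Kafka into Spark Structured Streaming pipelines, can be sketched roughly as below. This is a minimal illustration of the pattern, not Gathr's generated code; the broker address, topic name, transaction schema, and anomaly threshold are all hypothetical.

# Minimal Spark Structured Streaming sketch of a Kafka ETL pipeline.
# Broker, topic, schema, and threshold are illustrative placeholders,
# not the customer's actual configuration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

spark = SparkSession.builder.appName("txn-etl").getOrCreate()

# Hypothetical schema for the JSON transaction payloads on the topic.
schema = StructType([
    StructField("txn_id", StringType()),
    StructField("source", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Ingest: subscribe to the real-time transaction feed.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "transactions")
       .load())

# Parse and enrich: Kafka values arrive as bytes; decode the JSON into
# columns and flag simple outliers as a stand-in for the pipeline's
# anomaly-detection step.
txns = (raw.selectExpr("CAST(value AS STRING) AS json")
        .select(F.from_json("json", schema).alias("t"))
        .select("t.*")
        .withColumn("is_anomaly", F.col("amount") > 10000))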
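A polyglot persistence architecture here means each micro-batch fans out to more than one store, for example a queryable hot store for investigation alongside cheap archival storage for replay and audit. Continuing the sketch above, one hedged way to express that with Structured Streaming's foreachBatch (the sink paths and the choice of Parquet as a stand-in for an index such as Elasticsearch are assumptions):

# Polyglot write sketch: fan each micro-batch out to two stores.
# Paths and sink choices are illustrative assumptions.
def write_polyglot(batch_df, batch_id):
    # Hot path: anomalous transactions to a queryable store.
    (batch_df.filter("is_anomaly")
     .write.mode("append").parquet("/data/hot/anomalies"))
    # Cold path: every transaction to an archive for replay and audit.
    batch_df.write.mode("append").parquet("/data/archive/transactions")

query = (txns.writeStream
         .foreachBatch(write_polyglot)
         .option("checkpointLocation", "/data/checkpoints/txn-etl")
         .start())
query.awaitTermination()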
Replaced roughly 1 million lines of code in about 3 weeks using Gathr frameworks.
Achieved a high throughput of 100,000+ transactions per second, enabling processing of 1.5 billion records per day.
Reduced the overall release cycle from 8 months to 8 weeks.