
Building Karmic’s Data Infrastructure

Analytics & Modeling - Big Data Analytics
Application Infrastructure & Middleware - Data Exchange & Integration
Platform as a Service (PaaS) - Data Management Platforms
Finance & Insurance
Retail
Business Operation
Sales & Marketing
Cloud Planning, Design & Implementation Services
Software Design & Engineering Services
System Integration
Karmic Labs delivers the future of expense management with a platform for employers, banks, and retailers to manage debit card and fund distribution amongst their customers and members. At a time of momentous growth for Karmic, the ability to build a strong, scalable data infrastructure became increasingly critical. Echoing what Airbnb calls “data democratization,” Karmic’s Data Science Product Manager, Yang Wang, explains: “The more accessible data is, the faster we can iterate, and the further we can get in the game.” Yang joined Karmic when that data infrastructure was largely nonexistent, and filling that gap soon became one of his team’s highest priorities. “The second you build software, you want to know what works and what doesn’t. We desperately needed more high-level analysis,” he said.

One of the biggest challenges Yang faced was choosing and leveraging third-party tools. “How do you weigh vendors when you don’t really know what your needs are, and how those needs will change over time?” For data warehousing, Yang settled on Amazon Redshift, which met all of his needs for storage, speed, and security. But to get data into Redshift, he needed a matching ETL solution. Stitch Data was the first provider that caught his eye, and the one he initially implemented, but it wasn’t long before his team outgrew it. “Plug-and-play tools like Stitch work great for straightforward workflows, but we needed more customization and access under the hood, not only to comply with our security requirements but also to stay competitive with companies that have more developed data infrastructures,” said Yang. “The fact that we didn’t have control over transformations forced us to consider other, more comprehensive options.”

In his research into other options, Yang came across Astronomer’s managed Apache Airflow module. While he hadn’t heard of Apache Airflow, his research showed that the open-source software had a strong community behind it and was a good fit for the job. “There were no other managed Airflow services out there, and we didn’t have the DevOps resources to run it ourselves,” he said. Not long thereafter, he migrated his workflows to our Cloud platform. Karmic now uses Apache Airflow on Astronomer to sync their application database (Postgres) to their data warehouse (Amazon Redshift). Directly on our platform, Yang created a dynamic workflow that both automates that process and complies with Karmic’s security requirements. Due to the sensitive nature of their business, Karmic requires a whitelisted IP and SSH for some database connections. Since Astronomer’s Cloud Airflow service runs in a serverless architecture where each task instance runs in a separate container, there was no immediately obvious place to store the key file needed for an SSH connection (in this case, for Postgres). But by working with Astronomer, Karmic was able to configure a custom Airflow hook that opens an SSH tunnel in each task instance that requires access to that database and closes that tunnel once the task finishes.
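The per-task SSH pattern described above can be sketched roughly as follows. This is a hypothetical illustration, not Karmic’s or Astronomer’s actual code: the function names, hosts, user, and key path are invented, and it shells out to the `ssh` CLI rather than using Airflow’s built-in SSH support.

```python
# Hypothetical sketch: each task instance opens its own SSH tunnel to
# Postgres, does its work, and tears the tunnel down when it finishes.
# All names (hosts, user, key path) are illustrative.
import subprocess
from contextlib import contextmanager


def ssh_tunnel_cmd(ssh_user, ssh_host, local_port, db_host, db_port, key_file):
    """Build an `ssh` command that forwards local_port to the remote Postgres."""
    return [
        "ssh", "-i", key_file, "-N",
        "-L", f"{local_port}:{db_host}:{db_port}",
        f"{ssh_user}@{ssh_host}",
    ]


@contextmanager
def tunneled_postgres(ssh_user, ssh_host, local_port, db_host, db_port, key_file):
    """Hold the tunnel open for the duration of one task instance."""
    proc = subprocess.Popen(
        ssh_tunnel_cmd(ssh_user, ssh_host, local_port, db_host, db_port, key_file)
    )
    try:
        # Postgres clients inside the task connect to the local end of the tunnel.
        yield f"host=127.0.0.1 port={local_port}"
    finally:
        proc.terminate()  # the tunnel closes when the task finishes
        proc.wait()
```

A custom Airflow hook following this pattern would wrap its database work in `with tunneled_postgres(...) as dsn:` so that the tunnel’s lifetime matches the task’s, which is the behavior the case study describes.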
With Astronomer, Karmic can trust that their data in Redshift remains reliable for both external and internal reporting. At an organizational level, Astronomer allows Yang to fulfill a two-pronged goal: making sure that data is widely available and, more importantly, accessible. For Karmic, reliable data in Redshift is the gateway to the complementary analytics tools used by the rest of the team.