
How one company went from 28% GPU utilization to 73% with Run:ai

Run:AI
Analytics & Modeling - Machine Learning
Application Infrastructure & Middleware - API Integration & Management
Software
Telecommunications
Business Operation
Product Research & Development
Computer Vision
Predictive Maintenance
Data Science Services
System Integration
The company, a world leader in facial recognition technologies, faced several challenges with GPU utilization. Static allocation of GPU resources prevented teams and projects from sharing infrastructure, creating bottlenecks and leaving hardware inaccessible, and a lack of visibility into available resources slowed down jobs. Although existing hardware was underutilized, these visibility gaps and bottlenecks made additional hardware appear necessary, driving up costs: the company was considering an additional GPU investment with a planned hardware purchase of over $1 million.
The customer is a multinational company and a world leader in facial recognition technologies. It provides AI services to many large enterprises, often in real time. Accuracy, measured by performance across camera resolution and FPS, density of faces, and field of view, is critically important to the company and its customers. The company has an on-premises environment with 24 Nvidia DGX servers plus additional GPU workstations, and a team of 30 researchers spread across two continents.
The company implemented Run:ai's platform to address these challenges. The platform increased GPU utilization by moving teams from static, manual GPU allocations to pooled, dynamic resource sharing across the organization. It also increased data science team productivity through hardware abstraction, simplified workflows, and automated GPU resource allocation. The platform provided visibility into the GPU cluster, including its utilization, usage patterns, and wait times, allowing the company to better plan hardware spending. Finally, it accelerated training: automated, dynamic allocation of resources enabled the data science teams to complete training runs significantly faster.
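The benefit of moving from static allocation to a shared pool can be illustrated with a toy model. This is a minimal sketch, not Run:ai's actual scheduling logic; the team names, quotas, and job counts below are hypothetical.

```python
# Toy comparison of static per-team GPU quotas versus a shared dynamic pool.
# Assumes each job needs exactly one GPU; values are hypothetical.

def static_capacity(quotas, demand):
    """Each team can run at most its fixed quota of jobs,
    even if other teams' GPUs sit idle."""
    return sum(min(quotas[t], demand[t]) for t in quotas)

def pooled_capacity(quotas, demand):
    """Idle GPUs from one team are dynamically lent to teams
    with excess demand, up to the cluster total."""
    total_gpus = sum(quotas.values())
    total_demand = sum(demand.values())
    return min(total_gpus, total_demand)

quotas = {"vision": 8, "speech": 8, "research": 8}   # 24 GPUs total
demand = {"vision": 20, "speech": 2, "research": 0}  # 22 pending jobs

print(static_capacity(quotas, demand))  # 10 jobs run; 14 GPUs sit idle
print(pooled_capacity(quotas, demand))  # 22 jobs run on the same hardware
```

Under static quotas, the vision team is capped at 8 jobs while 14 GPUs belonging to other teams go unused; pooling lets the same cluster serve all 22 pending jobs, which is the utilization effect described above.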
The company went from 28% GPU utilization to over 70%.
They achieved a 2X increase in model training speed.
The data science teams simplified their GPU workflows and doubled their productivity, allowing them to deliver value from deep learning models more quickly.
70% Average GPU Utilization, leading to higher ROI.
2X Experiments per GPU, leading to better Data Science.
Multi-GPU Training by Default, leading to Faster Time to Value.