Case Studies
19,090 case studies
Scale’s Synthetic Data Enhances Kaleido AI's Visual AI Capabilities
Kaleido AI, a Vienna-based company, is dedicated to simplifying complex technology by creating tools that accelerate workflows and foster creativity. The company introduced remove.bg, an automatic image background remover, and Unscreen, a video background remover, which gained immense popularity and led to its acquisition by Canva in 2021. However, Kaleido AI faced a significant challenge in improving its machine learning models. The company's models required a large volume of high-quality data, but they encountered several edge cases in a specific segmentation task where their model performed poorly. Collecting and labeling tens of thousands of real-world images with a large diversity of patterns, images, backgrounds, and textures was difficult. Open datasets did not have enough high-quality images of this particular class. Kaleido AI initially relied on real-world data to train its segmentation models, but this approach was complex, resource-intensive, and costly.
Enhancing Autonomous Trucking with Synthetic Data: A Kodiak Robotics Case Study
Kodiak Robotics, an autonomous technology company, is developing self-driving capabilities for the long-haul trucking industry. The company uses a unique sensor fusion system and a lightweight mapping solution to navigate highway driving and deliver freight efficiently. However, the company faced a significant challenge in training its software to handle rare scenarios, such as pedestrians walking on the highway. These edge cases are crucial for a production-level autonomous vehicle system, but collecting enough real-world examples to train the models reliably was proving difficult.
Nuro Enhances Autonomous Vehicle Safety with Nucleus Object Autotag
Nuro, a robotics company specializing in autonomous vehicles for delivery services, faced a significant challenge in identifying infrequent but meaningful scenarios in their training data. The company's autonomous vehicles, designed to deliver goods from produce to prescriptions, needed to be able to identify and respond to a variety of obstacles, including pedestrians in unusual postures, animals, occluded and backlit pedestrians, and infrequently encountered vehicles such as excavators. However, these labels were not present in the ground truth of their training data. The company's internal tool was only able to identify a limited number of these scenarios, falling short of the thousands of images that needed to be identified and labeled for comprehensive training of their autonomous vehicles.
Copymint Prevention for NFT Marketplaces: A Case Study on OpenSea
OpenSea, the world's leading marketplace for non-fungible tokens (NFTs), was facing a significant challenge in detecting and mitigating copymints and fraud. Copymints are duplicates or imitations of popular NFTs, which can deceive users, especially those new to the world of NFTs. Trust and safety are crucial for welcoming new people into the Web3 ecosystem, and OpenSea was looking for a vendor to help advance their detection and removal capabilities. The team had already used rule-based systems to capture forms of deception, but it was a challenge to achieve the desired speed, recall, and precision needed to effectively address fraud in the marketplace.
Orchard Robotics Leverages Scale Rapid for Precision Crop Management
Orchard Robotics, a company providing AI-first precision crop management solutions to farmers, faced a significant challenge in collecting and utilizing precision data across vast commercial orchards. The company developed tractor-mounted, AI-powered camera systems to collect precision data about every tree. However, the company needed to accurately count every fruit on every tree, a task that proved to be incredibly difficult and tedious, especially when the fruit was small. As a small team, Orchard Robotics struggled to scale these annotations in-house. They initially tried using three other major data-labeling services, but they could not achieve the consistent quality they needed. The quality varied dramatically between batches, and they could not provide feedback to the annotators on the quality of the labels. These platforms also did not offer ellipses as an annotation type, forcing Orchard Robotics to rely on bounding boxes, a less-than-ideal option when labeling spherical fruit.
Enhancing Accounts Payable Training Data with Scale Document AI: A Case Study on SAP
SAP, a leading software corporation, was facing a challenge in improving its products around document processing, particularly those dealing with invoices, purchase orders, and payment advices. The team had a vast collection of customer documents but required a partner to create a comprehensive dataset to enhance their accounts payable products while respecting data ownership, privacy, and sensitivity. The need for high-quality data was paramount for performant models. SAP needed superior quality training data to train models for processing and extracting crucial information from purchase orders and invoices in English, German, and Spanish. The variability in customer data, with some providing thousands of documents a week and others taking months for a fraction of the same volume, added to the complexity of the challenge.
Enhancing Log Scaling and Inventory Management with Scale Rapid
The TimberEye team faced a significant challenge in enhancing their mobile application's log scaling capabilities. The app, which uses computer vision and LiDAR mapping technology, was designed to help lumber suppliers and buyers categorize and scale logs faster, more safely, and with better accuracy. However, the team wanted to experiment with an instance segmentation model to further improve the app's scaling capabilities. The process of annotating images for segmentation proved to be a daunting task. TimberEye CEO and Founder Scott Gregg attempted to annotate a segmentation dataset on his own, but after three days and only 1,000 images labeled, he was burned out. The process was significantly more challenging and time-consuming than annotating images for object detection, requiring 100-200 mouse clicks per image instead of just 4. The team was overwhelmed and stuck, with only 5% of the dataset they needed to annotate complete.
Velodyne's Use of Scale Nucleus for Efficient Data Annotation in 3D Lidar Technology
Velodyne Lidar, a company that builds lidar sensors for safe navigation and autonomy across various industries, was facing a challenge in managing and selecting relevant training data from the large quantities of sensor data they collected. The data team found it relatively easy to classify common indoor robotics scenes as these scenarios made up a large portion of the datasets captured on their test robots. However, finding rarer scenarios, such as a warehouse employee stacking boxes on the top of a scissor lift, proved to be a difficult task. The team was in need of an out-of-the-box solution that could provide the necessary tools for efficient data selection and management.
Vistapath's Partnership with Scale Studio: Enhancing Patient Experience through Next-Generation Pathology Lab
Vistapath, a pathology lab, was facing a significant challenge in the grossing process, a critical step in diagnosing diseases like cancer. Grossing involves assessing and documenting the physical characteristics of tissue samples, a process that is prone to human error and can lead to misdiagnoses. Vistapath aimed to reduce these errors by leveraging computer vision and artificial intelligence. However, they faced a problem in developing a robust tissue detection model. The model required hundreds to thousands of accurately annotated images, a task that required a tool that could be easily used by their histologists and experts. Initially, Vistapath used an open-source annotation tool, but it lacked automation and scalability. They then tried a tool with more automation, but it failed to meet their security and compliance requirements. Therefore, Vistapath needed a partner who could provide an annotation automation tool that could meet their strict security and compliance requirements.
Voxel's Transformation: Enhancing In-house Labeling Operations for High-Quality Training Data
Voxel, a company leveraging AI and computer vision to manage risk and operations, faced two significant challenges. Firstly, they needed to maintain high-quality training data for their computer vision system. Secondly, they sought to automate their labeling process for faster throughput while retaining their in-house annotation team. Voxel had already invested in an in-house annotation team of subject matter experts, but they were struggling with efficiency in their labeling operations. They had been using an open-source solution, Computer Vision Annotation Tool (CVAT), which was causing bottlenecks as they increased the volume of annotations needed for model training. From an operational perspective, Voxel found it difficult to efficiently collect data and insights on the data labeling process, leading to significant manual effort. The tool couldn’t effectively link data quality to individual annotators, making it hard to identify the cause of low-quality labels. On the engineering side, Voxel had to custom-build data pipelines for new customer projects, a process that took multiple engineers four weeks for each project.
Yuka's Rapid Product Database Expansion with Scale Rapid
Yuka, a mobile application that provides health impact information for food products and cosmetics, faced a significant challenge in managing its rapidly growing database. The database, which already contained over 4 million products, was expanding at a rate of approximately 1,200 new products daily. Yuka's small team was unable to manually review each new product added to the platform, a process that often required multiple transcription tasks. The application initially used OCR to scan product images for nutritional information and ingredients, but this process was not always accurate. OCR struggled with images featuring inconsistent lighting, obstructions, or irregular text surfaces. As a result, about 60% of the images submitted to Yuka needed to be outsourced to a human annotator. This was a daunting task for Yuka's small team, especially considering their goal to provide a product's health score within 2-3 hours of its addition to the database.
Big Four Consulting Firm Leverages NLP for Efficient Auditing with Snorkel Flow
A globally renowned consulting firm, with a history spanning over a century, was seeking to enhance its auditing capabilities by leveraging artificial intelligence. The firm's reputation hinged on its ability to conduct thorough audits, irrespective of their size, complexity, or location. The firm's experts were spending significant time manually reviewing various accounting, auditing, and industry information, a process that was both time-consuming and costly. The firm estimated that each auditor search lasted 10 minutes and cost $50-60 on average. The firm's data science team was tasked with streamlining news monitoring to anticipate changes in capital markets, regulatory trends, or technological innovation. They aimed to use custom NLP models to automatically analyze, categorize, and extract key client information from various sources. However, they faced challenges in labeling training data for the machine learning algorithms. It took three experts a week to label 500 training data points, and they found it nearly impossible to adapt to changes in data or business goals on the fly.
Georgetown University’s CSET Leverages Snorkel Flow for NLP Applications in Policy Research
The Center for Security and Emerging Technology (CSET) at Georgetown University was faced with the challenge of building NLP applications to classify complex research documents. The goal was to surface scientific articles of analytic interest to inform data-driven policy recommendations. However, the team found that a large-scale manual labeling effort would be impractical. They initially experimented with the Snorkel Research Project, which allowed them to programmatically label 90K data points within weeks, achieving 77% precision. However, the collaboration between data scientists and subject-matter experts was time-consuming and inefficient, involving spreadsheets, Slack channels, and Python scripts. This workflow made improving data and model quality a slow process. The team was constrained by inefficient tooling to auto-label, gain visibility into data, and improve training data and model quality. The lack of an integrated feedback loop from model training and analysis to labeling also meant that data scientists and subject matter experts had to spend long cycles re-labeling data to match evolving business criteria. These challenges limited the team’s capacity to deliver production-grade models, shorten project timelines, and take on more projects.
Automating KYC Verification with AI: A Case Study of a Global Custodial Bank
A global custodial bank was facing a significant challenge in its Know Your Customer (KYC) process. Analysts and investment managers were spending over 10,000 hours annually reviewing and transcribing 10-Ks, which are critical for verifying a company’s identity, establishing a risk profile, and informing multiple business processes. The bank was processing over 10,000 documents each year, with each document taking 30-90 minutes to review. The process was further complicated by the fact that 10-Ks come in various formats, and if any information was missing or incorrect, analysts had to spend additional time hunting it down. This not only lengthened the customer onboarding process but also gave competitors an opportunity to swoop in. The bank had tried to solve the problem using a rule-based system, but it proved to be rigid and could only identify a narrow scope of information for certain document formats/layouts. The system also required frequent updates due to constant changes in regulations across several regions, which took months to implement.
Scaling Clinical Trial Screening at MSKCC with Snorkel Flow
Memorial Sloan Kettering Cancer Center (MSKCC), the world’s oldest and largest cancer center, was faced with the challenge of identifying patients as candidates for clinical trial studies by classifying the presence of a relevant protein, HER-2. The process of reviewing patient records for HER-2 was laborious and time-consuming as it required clinicians and researchers to sift through complex, variable patient data. The data science team at MSKCC wanted to use AI/ML to classify patient records based on the presence of HER-2, but the lack of labeled training data was a significant bottleneck. Labeling data, especially complex patient records, required clinician and researcher expertise and was prohibitively slow and expensive. Even when experts were able to manually annotate training data, their labels were at times inconsistent, limiting model performance potential.
Accelerating NLP Application Development with Foundation Models: A Pixability Case Study
Pixability, a data and technology company, provides advertisers with the ability to accurately target content and audiences on YouTube. However, with over 700 million hours of YouTube content being watched daily, Pixability faced the challenge of continuously and accurately categorizing billions of videos to ensure ads run on brand-suitable content. Their existing natural language processing (NLP) model for classifying videos was not performing strongly enough. The process of labeling training data for the machine learning solution was slow because it relied on external data labeling services that required multiple iterations. Collaboration was constrained because domain experts and data scientists had limited time to resolve ambiguous labels. Additionally, valuable information within titles, descriptions, content, and tags was difficult to normalize.
Enhancing Proactive Well Management: Schlumberger's Use of Snorkel Flow
Schlumberger, a leading provider of technology and services for the energy industry, faced a significant challenge in extracting crucial information from a vast array of daily reports. These reports, ranging from daily drilling reports to well maintenance logs, each had their unique structure and format, making it difficult for Schlumberger’s team to quickly extract the necessary information. The team attempted to automate the information extraction using Named Entity Recognition (NER), but off-the-shelf ML models failed to identify the scientific terms related to the Exploration and Production (E&P) industry. Creating a domain-specific training dataset was time-consuming and not scalable, taking anywhere from 1-3 hours per document. The team needed to identify 18 different industry-specific entities and automatically associate data with these entities. However, the rich information was buried within tabular and raw text in PDFs with varied formatting across reports from different companies. There was also poor collaboration between domain experts and data scientists, with cumbersome file sharing and ad-hoc meetings.
AnyFlexo: Pioneering eCommerce in the Traditional Flexo Printing Industry
AnyFlexo, a B2B e-marketplace based in Estonia, was established to address the challenges faced by the traditional flexo printing industry. The founders, who have been in the business for decades, recognized that the industry was excessively reliant on offline channels and slow to digitalize. This lack of digitalization was hindering transparency, information exchange, and growth, particularly for small and medium-sized players. The founders also faced the 'chicken and egg' dilemma, a common challenge in the marketplace industry. This dilemma refers to the difficulty of balancing the seller-to-customer ratio and deciding whom to approach first. The founders needed to convince sellers to join the platform while also attracting buyers to ensure the platform's success.
Bozinga: A B2B Wholesale Ecommerce Marketplace at Scale
Bozinga, an American online B2B wholesale marketplace, aimed to create a pan-American B2B eCommerce multi-vendor marketplace. The platform was envisioned to connect manufacturers, distributors, service providers, and trading companies, facilitating B2B trade in a cohesive online ecosystem. The company wanted to create a secure platform that would simplify and streamline online business transactions. A key requirement was an RFQ module that would enable offers and counter-offers to facilitate negotiations. Upon mutual acceptance, bulk orders needed to be processed in accordance with the buyer’s requirements. The challenge was to customize a solution that would meet these specific needs and align with the shared business goals.
Leveraging IoT for eCommerce Success: A Case Study of BuyCBDSupps
BuyCBDSupps, an eCommerce startup based in Oregon, US, was founded in March 2020 with the initial plan to create a brand of supplements with CBD. The company aimed to establish a multi-vendor platform where they could sell their own product and also allow other vendors to sell CBD products. However, the founders faced several challenges in the early stages. The process of researching and choosing the right multi-vendor platform was intimidating due to the plethora of options available and the varying opinions and suggestions about them. Additionally, the varying legislation behind selling in various states in the US posed a significant challenge. The founders also had to navigate the complexities of the CBD industry, where there are many restrictions and regulations.
Building Namibia’s Leading eCommerce Brand – DotDune Success Journey
DotDune, an eCommerce platform in Namibia, was faced with the challenge of establishing an online marketplace in a developing country with a small population of approximately 2.5 million people. International online retailers showed little interest in unlocking the Namibian online shopping opportunity, which led DotDune to develop an in-house eCommerce multi-vendor business. However, the geographical size of Namibia and its low population density posed a significant challenge for cost-effective parcel delivery. Additionally, the need for substantial customization due to Namibia's unique market conditions added to the complexity of the project. The team also had to set targets and craft a 5-year strategy to become Namibia’s leading eCommerce marketplace by 2025.
DueDash: Revolutionizing Startup Fundraising with IoT
The founders of DueDash, an online startup fundraising community, faced a significant challenge in their journey to establish a platform that would connect startups with experienced professionals and investors. The primary issue was the lack of a team and resources to build a Minimum Viable Product (MVP) that could test the market. They initially used different block building systems, but this approach was time-consuming and inefficient. Another significant challenge was the inherent trust issues associated with online marketplaces. The founders understood that trust was a crucial factor for the success of their platform, but building this trust was more challenging in an online environment compared to offline engagements.
Go Ethnyk: A Thriving Cosmetics Marketplace Powered by Yo!Kart
Missinn Aklo, the founder of Go Ethnyk, envisioned a specialized marketplace for selling beauty, cosmetics, and perfume products for every skin type. His goal was to assist small retailers and enterprises in gaining more brand exposure and creating a network of loyal consumers. He wanted to provide young brands and small retailers with a platform to sell cosmetic products quickly and easily. His vision was to create an online marketplace for all ethnicities, regardless of age or gender, providing them with essential eCommerce tools to sell and develop their business on the web. However, the challenge was to create such a platform without imposing hefty subscription charges on the sellers.
Kilakitu: Transforming Burundi's Shopping Experience with IoT
Kilakitu, the first online marketplace platform in Burundi, was established with the vision of transforming the shopping preferences of the people in the East African nation. The founder, who had experienced the convenience of online shopping in countries like the US and India, wanted to introduce the same experience to the people of Burundi. However, the challenge was that Burundi is one of the poorest countries in the world, with around 50% of the population lacking access to a smartphone. This made identifying the target audience a huge task and left most e-commerce startups financially unviable. Furthermore, fewer than 5% of the population were online shoppers, because most people did not know how to use e-commerce platforms.
Mycart Mauritius: Transforming Online Shopping with Yo!Kart
Vishal Anand, a Mauritian entrepreneur, envisioned a marketplace that would bring together all customers and vendors from Mauritius & Rodrigues onto a single online platform. The goal was to create a platform where businesses of all sizes could sell their products, providing customers with a wide range of products irrespective of their location in Mauritius and Rodrigues. However, the challenge was to create a user-friendly and reliable website that could handle a large volume of products and vendors, while also providing a seamless shopping experience for customers.
Voyij's Ecommerce Platform: Revolutionizing Alaskan Tourism
Voyij, a company dedicated to connecting vacationers with local Alaskan businesses, faced a significant challenge. Each year, over 2.25 million tourists visit Alaska, but many struggle to find unified, authentic Alaskan experiences. Existing digital eCommerce platforms were disjointed, offering either local tours, activities, or merchandise/gifts/souvenirs, but not all three. The lack of a comprehensive platform meant that the true essence of local Alaska was not being accurately represented. Voyij sought to develop the first online platform for Alaskan businesses to share local stories with travelers, while also enabling them to shop and book activities online. They needed a solution that could handle the complexities of their vision, including tiered pricing, activity availability calendars, multiple pickup locations, and more.
IoT Case Study: Building weDIY - A Thriving Online Marketplace for Handmade Items
The online handmade products industry in Germany has seen a significant surge in growth over the last two decades, with the eCommerce revenue in the DIY sector expected to reach US$55.39 billion by 2023. This growth is largely attributed to creative DIY online marketplaces like Etsy and weDIY. However, creating a platform that caters to the unique needs of creative designers and sellers, while also providing a seamless shopping experience for consumers, presents a significant challenge. Erkan Eroglu, the founder of weDIY, envisioned a platform that would serve as a virtual hub for designers, artists, and creative individuals to exchange ideas, interact with sellers, rate and purchase products, and find inspiration. The challenge was to create a platform that would bridge the gap between creative individuals and buyers who are passionate about acquiring handmade products and who value the associated benefits, such as environmental sustainability and superior quality.
Wekasuwa: Transforming eCommerce for Small and Medium Businesses in Nigeria
Wekasuwa, a multi-vendor eCommerce marketplace, was established with the aim of bridging the gap between small businesses and buyers in Nigeria. The founder's vision was to maximize digitization in the country and stimulate economic growth through an eCommerce platform. The challenge was to provide value-added services and enable businesses to easily showcase their goods online. The team needed a platform that offered flexible payment methods, multiple delivery methods, and separate vendor dashboards to ensure a hassle-free online presence. However, the budget was tight, so the team needed affordable multi-vendor marketplace software that could meet these needs.
Revolutionizing Liquor Distribution in Africa through IoT
The client, a leading brewing company, was grappling with the challenge of price transparency in the fragmented liquor distribution market in Africa. The lack of transparent prices led to price discrimination and dissatisfaction among end customers, including mom & pop stores, taverns, and small beer cafes. The market dynamics were such that small businesses had weak responsiveness, leading to a lower market share and reach. The liquor monopoly was concentrated in the hands of a few big operators, limiting price choices for small businesses. The client sought to develop a B2B liquor selling platform to provide fair prices to end customers and break the monopoly of big operators.
Leveraging Machine Learning for Enhanced Telecommunications Services: A Case Study of Spark New Zealand
Spark New Zealand, the country's largest telecommunications and digital services company, faced the challenge of understanding its customers' needs at a granular level in order to provide better services. To achieve this, the company aimed to expand the number of machine learning (ML) use cases across the organization. It began its machine learning journey by trying to predict churn and understand customer preferences. However, as the number of use cases and the size of the team grew, it ran into issues with model performance and monitoring. The dynamic nature of the data and the need to continuously monitor and troubleshoot models posed significant challenges. Checking the performance of over 50 models every week was tedious and time-consuming. The company needed a solution that could help it monitor these changes more effectively and take a more proactive approach to model outputs.