Providing Scalability and Speed for a National Treasure

The U.S. National Archives and University of Virginia (U.Va.) Press faced the challenge of transforming Founders Online, a scholarly tool traditionally accessed by a limited number of researchers, into a national resource capable of serving the public at large. The original platform was not designed to handle concurrent users at scale: its performance deteriorated quickly under load, and testing suggested the architecture would support only 100 concurrent users. The design imperative for the original system was to preserve the look and feel of print volumes. That had minimal impact on smaller files, but longer, outlying collections stressed the system. The old platform also recreated each search from scratch, putting an unnecessary burden on computing resources and slowing the system down. To make matters more complex, the organization had the equivalent of just 1.5 full-time programmers to devote to the project.
Founders Online is a free online tool commissioned by the U.S. National Archives and implemented by University of Virginia (U.Va.) Press that lets the public access the papers of six of America's Founding Fathers: Thomas Jefferson, Benjamin Franklin, George Washington, James Madison, John Adams and Alexander Hamilton. Funded by the National Historical Publications and Records Commission of the National Archives, Founders Online grew out of 50 years of scholarly efforts and gives unique insight into some of the brightest minds of the Age of Enlightenment. The website provides searchable access to over 150,000 documents, a number that's projected to grow to 175,000.
MarkLogic, the only Enterprise NoSQL database platform, is used to build content systems that can manage billions of data points quickly. It lets enterprises load data in whatever form they have, rather than forcing it into the inflexible rows and columns of traditional structured databases. Using MarkLogic's native search, navigation and rendering capabilities, U.Va. Press didn't have to rebuild from the ground up to rescale its existing platform. It simply started thinking about queries in the aggregate, instead of on a document-by-document basis. For example, whereas a traditional structured database crawls through millions of rows and columns, MarkLogic uses data mapping to locate relevant documents quickly. The customer created a static index of previous search results and prepopulated it before public launch with the most common search terms and links. Now, when a known search term is entered into the system, it simply serves up those existing results instead of crawling through each stored document again. The more people use the system, the richer that stored search cache becomes.
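The prepopulated index of previous search results can be illustrated with a small sketch. This is not MarkLogic's API or the customer's actual code. It is a minimal, in-memory Python illustration of the idea, and the names `SearchCache`, `scan_documents` and `DOCUMENTS`, along with the sample document texts, are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in corpus; the IDs and texts are invented for illustration.
DOCUMENTS: Dict[str, str] = {
    "doc-1": "a letter about the constitution",
    "doc-2": "notes on electricity and lightning",
    "doc-3": "a draft on the constitution and commerce",
}


def scan_documents(term: str) -> List[str]:
    """Slow path: look at every stored document for the term."""
    return sorted(doc_id for doc_id, text in DOCUMENTS.items() if term in text)


class SearchCache:
    """Keeps results of earlier searches so known terms skip the full scan."""

    def __init__(self, full_search: Callable[[str], List[str]]) -> None:
        self._full_search = full_search
        self._results: Dict[str, List[str]] = {}
        self.full_search_calls = 0  # counts how often the slow path runs

    @staticmethod
    def _normalize(term: str) -> str:
        # Treat "Constitution" and "  constitution " as the same search term.
        return " ".join(term.lower().split())

    def prepopulate(self, common_terms: Iterable[str]) -> None:
        """Fill the cache before launch with the most common search terms."""
        for term in common_terms:
            self.search(term)

    def search(self, term: str) -> List[str]:
        key = self._normalize(term)
        if key not in self._results:
            # Unknown term: run the full scan once and remember the result.
            self.full_search_calls += 1
            self._results[key] = self._full_search(key)
        # Return a copy so callers cannot modify the cached entry.
        return list(self._results[key])
```

In this sketch, calling `prepopulate(["constitution"])` before launch means a later `search("Constitution")` is answered from the stored results, and each new term a visitor enters is scanned once and then added to the cache. A production system would also need to refresh cached entries when documents are added, which this sketch leaves out.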
Sub-second search: The result for users of the site - the public - is a quick, Google-like search experience for a remarkable collection of documents written over 200 years ago.
Leverage existing IT resources: By using data mapping to look at aggregate groups of documents, the customer was able to avoid irrelevant results and information bottlenecks to return results quickly and accurately. It also built on previous programming, using existing switches to duplicate processes already in place.
Better data insight: The static index of previous search results, prepopulated before public launch with the most common search terms and links, lets the system serve existing results for a known search term instead of crawling through each stored document again. The stored search cache grows richer as more people use the system.
Cut the response time for a large, 90-page document from 19 seconds to just 1.86 milliseconds.
When concurrent load increased to 5,000 users - 50 times the 100-user capacity projected during initial testing - average response time was still just 120 milliseconds.