Atempo Blog

From HPC to Azure Cloud: The Story of weSystems' 50PB+ Archive with Miria

Written by Rachel Martinez | May 16, 2024 6:00:00 AM

As the amount of data generated by High-Performance Computing (HPC) applications continues to grow, so does the challenge of archiving these large data sets to the cloud. Atempo, a leading provider of data protection and archiving solutions, has partnered with weSystems, an HPC specialist, to address this challenge for an automotive supplier that needs to preserve sensitive autonomous vehicle data for 10 years or more. In this blog post, we take a technical deep dive into the project's requirements and show how Atempo Miria was instrumental in helping weSystems archive these large HPC data sets to the cloud.

Customer Requirements

The customer, a well-known German automotive supplier, was facing a significant challenge in archiving the massive amounts of data generated by its fleet of autonomous cars and associated AI systems. The data was accumulating at a rate of 2 PB per month and needed to be stored and archived for long periods across multiple sites in Europe and the USA. The customer required a solution capable of transferring massive data volumes across two to three remote sites to support collaboration and further processing, while ensuring long-term archiving and rapid data retrieval.

One of the complexities of the project was the customer's specific request to retrieve data without financial impact or delay during the three months following archiving. This is not a standard feature of cloud archives, where retrieval is usually more expensive than storage and typically takes 48 to 72 hours.

weSystems built a caching system as intermediary storage, using Miria to collect data from each remote location and push it to the cache before archiving the data sets to Azure.
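To make the retrieval requirement concrete, here is a minimal sketch of the decision the cache enables: requests for recently archived data are served from the cache, while older data has to come back from the cloud archive tier. The function name, the return labels, and the choice to model "three months" as 90 days are all assumptions made for illustration; they do not describe how weSystems or Miria implement this.

```python
from datetime import date, timedelta

# Assumption: the three-month window is modelled as 90 days.
CACHE_RETENTION = timedelta(days=90)


def retrieval_source(archived_on: date, requested_on: date) -> str:
    """Return where a restore request would be served from.

    'cache'         -> data set is still inside the retention window
    'cloud-archive' -> data set must be retrieved from the cloud archive tier
    """
    if requested_on < archived_on:
        raise ValueError("request date precedes archive date")
    age = requested_on - archived_on
    return "cache" if age <= CACHE_RETENTION else "cloud-archive"


if __name__ == "__main__":
    archived = date(2024, 1, 1)
    for requested in (date(2024, 2, 15), date(2024, 6, 1)):
        print(requested, "->", retrieval_source(archived, requested))
```

In this simplified model, only requests that fall outside the window would incur the retrieval cost and the 48 to 72 hour wait described above.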

 

How Atempo Miria Works

Atempo Miria is a flexible and versatile data management solution that provides fast and secure workflows tailored to customer needs. To address the specific needs of this customer, weSystems used the following Miria data services:

  1. Synchronization: Miria collects petabyte-scale data sets from each source storage system at the customer's sites and consolidates them at a central location in Berlin over multiple dedicated 10 Gbit/s connections across Europe and the USA.

  2. Archiving: A second Miria process collects the data from the central location in Berlin and archives it to the cloud archive tier, where the data is preserved for 10 to 12 years.
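As a back-of-the-envelope check on these figures, the sketch below converts the 2 PB monthly ingest rate into the sustained throughput it implies and compares it with a single 10 Gbit/s link. It assumes decimal petabytes (10^15 bytes), a 30-day month, and perfectly even ingest; the function name is illustrative, and real links will carry protocol overhead and bursts that this ignores.

```python
PB = 10**15  # assumption: decimal petabyte, in bytes


def sustained_gbit_per_s(bytes_per_month: float, days_per_month: int = 30) -> float:
    """Average throughput (Gbit/s) needed to move a monthly volume with no backlog."""
    seconds = days_per_month * 24 * 3600
    return bytes_per_month * 8 / seconds / 1e9


if __name__ == "__main__":
    required = sustained_gbit_per_s(2 * PB)
    link = 10.0  # Gbit/s, one dedicated connection
    print(f"Required sustained rate: {required:.2f} Gbit/s")
    print(f"Share of one 10 Gbit/s link: {required / link:.0%}")
```

Under these assumptions the aggregate works out to roughly 6.2 Gbit/s around the clock, which gives a sense of why multiple dedicated connections leave useful headroom for bursts and catch-up transfers.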

Miria's modular data services and ability to seamlessly connect with nearly any storage platform make it a unique solution in the market. Its wide storage compatibility eases data collection on diverse customer storage platforms, while its versatile storage integration methods, including CIFS-SMB/NFS mounts and advanced capabilities such as Miria's FastScan, ensure efficient and fast data protection and synchronization at scale through automated Snapshot differentials.

 

The Solutions Combined

weSystems' team expertly combined the following components to design a tailored solution that exceeded the customer's requirements:

  • Miria's archiving and synchronization data services enabled fast data movement, ensured cost-efficient archiving for a decade or more, provided on-demand restores, and adapted flexibly to the company's growth.
  • Ceph storage was implemented as the central caching system, allowing the customer to retrieve archived data for three months after archiving without financial impact or delay. Compared with the 48 to 72 hours typically associated with the cloud's cold tier, users in remote locations get almost instant retrieval at no additional cost.
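The size of such a cache follows directly from the ingest rate and the retention window. The sketch below is a rough estimate only: it assumes a steady 2 PB per month and a three-month window, and the overhead factors for data protection (1.5 for a 4+2 erasure-coded layout, 3.0 for three-way replication) are hypothetical examples, since the actual Ceph configuration is not described here.

```python
def cache_capacity_pb(ingest_pb_per_month: float,
                      retention_months: float,
                      overhead_factor: float = 1.0) -> float:
    """Steady-state capacity (PB) needed to hold `retention_months` of ingest.

    overhead_factor is raw capacity divided by usable capacity, e.g.
    1.5 for 4+2 erasure coding or 3.0 for three-way replication
    (both hypothetical choices, not taken from the project).
    """
    return ingest_pb_per_month * retention_months * overhead_factor


if __name__ == "__main__":
    print("Usable capacity:", cache_capacity_pb(2, 3), "PB")
    print("Raw, 4+2 erasure coding:", cache_capacity_pb(2, 3, 1.5), "PB")
    print("Raw, 3x replication:", cache_capacity_pb(2, 3, 3.0), "PB")
```

With these inputs, the cache needs about 6 PB of usable space once it reaches steady state; the raw capacity behind it depends on the protection scheme chosen.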

 

 

Conclusion

In conclusion, Atempo Miria is a unique and powerful solution for archiving large HPC data sets to the cloud. Its modular data services and ability to seamlessly connect with nearly any storage platform make it a versatile and efficient solution for even the most complex data management challenges. With Miria, customers can achieve the levels of performance and quality required for massive-scale projects, as evidenced by the successful archiving of 50PB+ of sensitive data in this project.
