Customer testimonial: how to build a state-of-the-art AWS infrastructure for Gen AI


Tatra is a growing company specializing in the development of artificial intelligence technologies. It focuses on building tools that make generative AI simple to use. Its notable products and services include Piper Tool, a flexible machine learning MVP acceleration tool; Insomnia Gen AI, a B2C tool for image and video recognition and generation; Tatraskul IT courses; De-Fi blockchain solutions; and Deep Lab, an internal lab for natural language processing and machine learning.

By migrating to AWS Cloud, Tatra pursued several goals: obtaining a secure, scalable infrastructure that would enable it to launch a freemium service for Insomnia Gen AI, exploring new use cases, and offering on-demand training of custom machine learning models. This migration represents a significant step towards scaling its operations and broadening its service offerings. In this blog post, we dive into this customer use case and see how Devoteam helped Tatra build a state-of-the-art AWS infrastructure tailored for AI products and services.

AWS Migration

Oleg Sokolov, CTO at Tatra, explains why Tatra chose to move to AWS Cloud: « We already had part of our backend on AWS, but we decided to move most of our infrastructure to AWS to be able to scale easily and to take advantage of having everything in one place: infrastructure, CI/CD and machine learning models. »

Indeed, the initial architecture faced several challenges: the system could not scale effectively due to its reliance on a bare-metal server with only one GPU. If the GPU was fully utilized, jobs would queue, sometimes leading to long wait times and resulting in timeouts; performance was limited by the available GPU resources. Additionally, a single server failure meant the entire Insomnia service would go down.
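The queueing problem described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the arrival pattern, the job duration, the timeout and the function name `simulate_queue` are hypothetical and do not come from Tatra's actual workload. It counts how many jobs wait longer than a timeout when one GPU, or several, serve requests in arrival order.

```python
import heapq


def simulate_queue(arrivals, service_time, workers, timeout):
    """Count jobs whose time spent waiting for a free GPU exceeds `timeout`.

    arrivals: sorted arrival times in seconds (hypothetical workload).
    service_time: seconds one job occupies one GPU.
    workers: number of GPUs serving jobs in arrival order.
    """
    # Min-heap holding the time at which each GPU next becomes free.
    free_at = [0] * workers
    heapq.heapify(free_at)
    timed_out = 0
    for t in arrivals:
        gpu_free = heapq.heappop(free_at)
        start = max(t, gpu_free)
        if start - t > timeout:
            # The job gives up before a GPU is available; the GPU is unaffected.
            timed_out += 1
            heapq.heappush(free_at, gpu_free)
        else:
            heapq.heappush(free_at, start + service_time)
    return timed_out


# Hypothetical load: one request every 2 s, each needing 5 s of GPU time.
arrivals = list(range(0, 20, 2))
print(simulate_queue(arrivals, service_time=5, workers=1, timeout=6))
print(simulate_queue(arrivals, service_time=5, workers=3, timeout=6))
```

With these made-up numbers, a single GPU cannot keep up with the arrival rate, so the queue grows until requests start timing out, while three GPUs absorb the same load with no waiting.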

With this architecture, Insomnia faced significant hurdles in growth and customer onboarding. The inability to train and create new models on the fly led to customer frustration. Moreover, they found themselves stuck, unable to innovate or develop new services and offerings for their customers.

Scaling for Gen AI

The main objective of this migration was to move Insomnia Gen AI from a standalone test server, a proof-of-concept (POC) architecture and a custom hosting service to a fully fledged production infrastructure on AWS. The move was aimed at increasing Tatra's capacity to host and run a larger array of AI models, scale its services effectively, and improve availability and performance.

AWS Gen AI

The transformation led to a fully automated platform capable of handling a large volume of requests simultaneously. This change ensured consistent response times and high service availability. By primarily utilizing AWS managed services, the platform can now scale efficiently, accommodating growth without sacrificing the quality of service.

Benefits from AWS migration

As Oleg Sokolov explains, Tatra now relies on several AWS services:

« We use EC2 for hosting our virtual servers, Route 53 for DNS management, and S3 for storing our generated images and internal datasets. RDS for Aurora is our choice for database management. We also execute specific functions using AWS Lambda and leverage Amazon SageMaker for our machine learning workflows.

Amazon SageMaker is amazing and simple. I thought I would need a lot of knowledge, a lot of courses to run SageMaker, but it is really easy to use and today it is a great instrument for us. Thanks to the help of Devoteam, we can now change models, run some training, and if we need more RAM, we just have to change the instance type. Before, we had to build our infrastructure and deploy manually, which would take us a few days. Now we just need 10 minutes to deploy new models. »
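To make "we just have to change the instance type" concrete, here is a minimal sketch of what such a change can look like when a model is served from a SageMaker endpoint. It is not Tatra's actual configuration: the helper `endpoint_config_request`, the names `insomnia-v1`, `insomnia-v2`, `insomnia-model` and `insomnia`, and the chosen instance types are all assumptions for illustration. The helper only builds the parameters of SageMaker's `CreateEndpointConfig` request, so it runs without AWS credentials.

```python
def endpoint_config_request(config_name, model_name, instance_type, instance_count=1):
    """Build keyword arguments for SageMaker's CreateEndpointConfig request."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
            }
        ],
    }


# Hypothetical names: the same model, served on a larger instance type.
small = endpoint_config_request("insomnia-v1", "insomnia-model", "ml.g4dn.xlarge")
large = endpoint_config_request("insomnia-v2", "insomnia-model", "ml.g4dn.2xlarge")
print(large["ProductionVariants"][0]["InstanceType"])

# With boto3 installed and AWS credentials configured, the new configuration
# could then be applied to an existing endpoint (endpoint name is assumed):
#   import boto3
#   sm = boto3.client("sagemaker")
#   sm.create_endpoint_config(**large)
#   sm.update_endpoint(EndpointName="insomnia", EndpointConfigName="insomnia-v2")
```

The only difference between the two configurations is the `InstanceType` value, which is why resizing becomes a configuration change rather than a manual rebuild of the server.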

A significant advantage of this new setup is the control over costs. By adopting a pay-per-use solution, it becomes easier to align their pricing model with the linear costs of the services, ensuring financial efficiency and predictability.

Moreover, the shift to AWS has opened up almost unlimited resources. This expansion allows them to explore new use cases and enables customers to train new models on the fly using their own pictures. Such flexibility and resource availability have significantly broadened the scope of what they can offer.

AWS Well-Architected Framework and Cloud Security

Finally, with Devoteam's support in transitioning to AWS, Tatra now operates within a robust landing zone that adheres to the AWS Well-Architected Framework. Following AWS best practices ensures a more secure, efficient and reliable cloud environment, setting a strong foundation for continued innovation and growth.

As Oleg concludes: « Devoteam’s help brought us the best practices in terms of AWS security. We now have a secure infrastructure with different accounts, we can invite new developers and be confident about billing or access to our databases. It is quite beneficial to be in a secure environment. For us, it is really important to be able to focus on our core competency, which is AI. Our new infrastructure on AWS is now ready to scale for thousands of ML/AI users. »

On the same topic: LightOn, a Gen AI pioneer, bets on AWS Cloud and SageMaker
