If you ever had the project of building a Datalake in AWS, you certainly have considered using AWS Glue crawlers at some point, would it be because you read the AWS documentation, a related article or simply the StackOverflow most upvoted solution. At Devoteam Revolve, we did too. This article aims to describe the reasons why we in...
Lire la suiteEven though AWS provides more and more resources and possibilities to work with your data in the Cloud, some people still feeling the need to work with their data into their local pySpark environment. The idea of this article / tutorial is to show how to do that and help you to understand what happens under the hood. This article was ...
Lire la suiteA la croisée de plusieurs disciplines, la Data Science s'appuie sur des méthodes et des algorithmes pour tirer des informations et de la connaissances à partir de données structurées et non structurées. Encore inconnus il y a quelques années, les métiers de la Data Science et du Machine Learning évoluent très vite. Compétences, méthodes,...
Lire la suite