A theoretical course on handling data at scale and on the tools needed to orchestrate big data systems and manage their workflows. Learners will dive into the world of data and computing at scale and get a comprehensive overview of the distributed resource management ecosystem.
Learn how to optimize your code and speed up existing data processing with PySpark. In this course, students will work through best practices for how and when to use PySpark. They will explore what PySpark can do and how to apply distributed computing with it.
A course covering the theory and implementation of distributed data storage systems on a specific cloud platform. Learners will explore how data is stored and processed at scale using tools like Hadoop on a selected cloud platform. This course gives students a strong foundation for creating and managing distributed data storage resources.
A theoretical course on handling data at scale and on the tools needed for distributed data storage, analysis, and management. Learners will dive into the world of data and computing at scale and get a comprehensive overview of distributed computing.