Scaling Machine Learning with Spark
Learn how to build end-to-end scalable machine learning solutions with Apache Spark. With this practical guide, author Adi Polak introduces data and ML practitioners to creative solutions that supersede today's traditional methods. You'll learn a more holistic approach that takes you beyond specific requirements and organizational goals, allowing data and ML practitioners to collaborate and understand each other better.
Scaling Machine Learning with Spark examines several technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, and PyTorch. If you're a data scientist who works with machine learning, this book shows you when and why to use each technology.
You will:
- Explore machine learning, including distributed computing concepts and terminology
- Manage the ML lifecycle with MLflow
- Ingest data and perform basic preprocessing with Spark
- Explore feature engineering, and use Spark to extract features
- Train a model with MLlib and build a pipeline to reproduce it
- Build a data system to combine the power of Spark with deep learning
- Get a step-by-step example of working with distributed TensorFlow
- Use PyTorch to scale machine learning, and explore its internal architecture

- Author: Adi Polak
- ISBN: 9781098106799
- Language: English
- Publication date: 2023-03-07
- Publisher: O'Reilly Media
- Available electronic formats: PDF (Adobe DRM)
- Read the e-book on: e-book reader app on a phone or tablet, e-reader device, computer

