Machine learning with PySpark : with natural language processing and recommender systems

By:

Singh, Pramod

Publication details: Apress 2022 Berkeley

Edition: 2nd ed

Description: 220p

ISBN:

9781484284339

Subject(s):

DDC classification:

006.3 SIN

Summary: This book covers building scalable machine learning models, natural language processing, and recommender systems. It introduces the fundamentals of Apache Spark, implementations of traditional machine learning algorithms, and data ingestion and processing for solving business problems. It demonstrates how to build supervised models such as linear regression, logistic regression, decision trees, and random forests, and how to automate these steps using Spark pipelines. It also introduces Koalas in Spark and shows how to automate data workflows using Airflow and PySpark's latest ML library. The book aims to help developers build and train various machine learning models, understand data processing using Koalas in Spark, and handle issues such as feature engineering, class balance, bias and variance, and cross-validation.

Holdings
Item type	Current library	Collection	Call number	Status	Date due	Barcode	Item holds
Book	Jammu General Stacks	Non-fiction	006.3 SIN	Available		IIMJ-8438

Total holds: 0

1. Introduction to Spark 3.1
2. Manage Data with PySpark
3. Introduction to Machine Learning
4. Linear Regression with PySpark
5. Logistic Regression with PySpark
6. Ensembling with PySpark
7. Clustering with PySpark
8. Recommendation Engine with PySpark
9. Advanced Feature Engineering with PySpark
