

Paperback: 276 pages
Publisher: O'Reilly Media; 1 edition (April 20, 2015)
Language: English
ISBN-10: 1491912766
ISBN-13: 978-1491912768
Product Dimensions: 7 x 0.6 x 9.2 inches
Shipping Weight: 14.4 ounces
Average Customer Review: 4.6 out of 5 stars (21 customer reviews)
Best Sellers Rank: #19,153 in Books; #1 in Books > Computers & Technology > Web Development & Design > Website Analytics; #13 in Books > Computers & Technology > Programming > Languages & Tools > Java; #13 in Books > Textbooks > Computer Science > Database Storage & Design

TL;DR: If you are looking for an intro to data science, data analysis and machine learning at scale, this is the right book. Sure, there are other, perhaps more popular O'Reilly books on these topics, but their authors use R and Python, and those books do not focus on performance and scalability. For more detail on Spark itself, you can also take a look at the introductory Spark book Learning Spark.

This book presents 9 case studies of data analysis applications in various domains. The topics are diverse and the authors always use real-world datasets. Besides learning Spark and data science, you will also have the opportunity to gain insight into topics like taxi traffic in NYC, deforestation or neuroscience. Readers without any previous exposure to machine learning might struggle to understand certain chapters, so I think it's a good idea to actually try the examples yourself while reading and to Google for further details about the methods used. Many of the chapters end with only basic models, which barely outperform the baselines, so if you want to, there is a lot of room for improvement and further work.

Spark itself provides its users with APIs in three languages: Java, Scala and Python. This book successfully covers each one of these, although you can feel a slight preference for Scala throughout the book. For Scala beginners, the authors always explain the special constructs or syntax features they use, which is a nice touch. The Introduction and Appendix chapters provide basic information about the Spark core, RDDs (resilient distributed datasets) and the options for running Spark, whether in cluster (Mesos, YARN, Spark's own) or standalone settings.