- Quickly get familiar with data science using Python 3.5
- Save time (and effort) with all the essential tools explained
- Create effective data science projects and avoid common pitfalls with the help of examples and hints dictated by experience
Fully expanded and upgraded, the second edition of Python Data Science Essentials takes you through all you need to know to succeed in data science using Python. Get modern insight into the core of Python data, including the latest versions of Jupyter notebooks, NumPy, pandas and scikit-learn. Look beyond the fundamentals with beautiful data visualizations with Seaborn and ggplot, web development with Bottle, and even the new frontiers of deep learning with Theano and TensorFlow.
Dive into building your essential Python 3.5 data science toolbox, using a single-source approach that will allow you to work with Python 2.7 as well. Get to grips fast with data munging and preprocessing, and all the techniques you need to load, analyse, and process your data. Finally, get a complete overview of principal machine learning algorithms, graph analysis techniques, and all the visualization and deployment instruments that make it easier to present your results to an audience of both data science experts and business users.
What you will learn
- Set up your data science toolbox using a Python scientific environment on Windows, Mac, and Linux
- Get data ready for your data science project
- Manipulate, fix, and explore data in order to solve data science problems
- Set up an experimental pipeline to test your data science hypotheses
- Choose the most effective and scalable learning algorithm for your data science tasks
- Optimize your machine learning models to get the best performance
- Explore and cluster graphs, taking advantage of interconnections and links in your data
About the Author
Alberto Boschetti is a data scientist with expertise in signal processing and statistics. He holds a PhD in telecommunication engineering and currently lives and works in London. In his work projects, he faces challenges ranging from natural language processing (NLP), behavioral analysis, and machine learning to distributed processing. He is very passionate about his job and always tries to stay updated about the latest developments in data science technologies, attending meet-ups, conferences, and other events.
Luca Massaron is a data scientist and marketing research director specializing in multivariate statistical analysis, machine learning, and customer insight, with over a decade of experience in solving real-world problems and generating value for stakeholders by applying reasoning, statistics, data mining, and algorithms. From being a pioneer of web audience analysis in Italy to achieving the rank of a top ten Kaggler, he has always been very passionate about every aspect of data and its analysis, and also about demonstrating the potential of data-driven knowledge discovery to both experts and non-experts. Favoring simplicity over unnecessary sophistication, Luca believes that a lot can be achieved in data science just by doing the essentials.
Table of Contents
Chapter 1. First Steps
Chapter 2. Data Munging
Chapter 3. The Data Pipeline
Chapter 4. Machine Learning
Chapter 5. Social Network Analysis
Chapter 6. Visualization, Insights, and Results
Chapter 7. Strengthen Your Python Foundations