Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining

  • Length: 480 pages
  • Edition: 1
  • Publisher:
  • Publication Date: 2015-01-20
  • ISBN-10: 111883481X
  • ISBN-13: 9781118834817
  • Sales Rank: #1196774
Description

A hands-on guide to web scraping and text mining for both beginners and experienced users of R

  • Introduces fundamental concepts of the main architecture of the web and databases, and covers HTTP, HTML, XML, JSON, and SQL.
  • Provides basic techniques to query web documents and data sets (XPath and regular expressions).
  • An extensive set of exercises is presented to guide the reader through each technique.
  • Explores both supervised and unsupervised techniques, as well as advanced techniques such as data scraping and text management.
  • Case studies are featured throughout along with examples for each technique presented.
  • R code and solutions to exercises featured in the book are provided on a supporting website.

Table of Contents

Chapter 1: Introduction

Part One: A Primer on Web and Data Technologies
Chapter 2: HTML
Chapter 3: XML and JSON
Chapter 4: XPath
Chapter 5: HTTP
Chapter 6: AJAX
Chapter 7: SQL and relational databases
Chapter 8: Regular expressions and essential string functions

Part Two: A Practical Toolbox for Web Scraping and Text Mining
Chapter 9: Scraping the Web
Chapter 10: Statistical text processing
Chapter 11: Managing data projects

Part Three: A Bag of Case Studies
Chapter 12: Collaboration networks in the US Senate
Chapter 13: Parsing information from semistructured documents
Chapter 14: Predicting the 2014 Academy Awards using Twitter
Chapter 15: Mapping the geographic distribution of names
Chapter 16: Gathering data on mobile phones
Chapter 17: Analyzing sentiments of product reviews
