Python 3 Text Processing with NLTK 3 Cookbook

Python 3 Text Processing with NLTK 3 Cookbook Front Cover
2 Reviews
310 pages

Book Description

Over 80 practical recipes on techniques using Python's NLTK 3.0

About This Book

  • Break text down into its component parts for spelling correction, feature extraction, and phrase transformation
  • Learn how to do custom sentiment and named entity recognition
  • Work through the natural language processing concepts with simple and easy-to-follow programming recipes

Who This Book Is For

This book is intended for Python programmers interested in learning how to do natural language processing. Maybe you've learned the limits of regular expressions the hard way, or you've realized that human language cannot be deterministically parsed like a language. Perhaps you have more text than you know what to do with, and need automated ways to analyze and structure that text. This Cookbook will show you how to train and use statistical language to process text in ways that are practically impossible with standard programming tools. A basic knowledge of Python and the basic text processing concepts is expected. Some experience with regular expressions will also be helpful.

In Detail

This book will show you the essential techniques of text and language processing. Starting with tokenization, stemming, and the WordNet , you'll progress to part-of-speech tagging, phrase chunking, and named entity recognition. You'll learn how various text corpora are organized, as well as how to create your own custom corpus. Then, you'll move onto text classification with a focus on sentiment analysis. And because NLP can be computationally expensive on large bodies of text, you'll try a few methods for distributed text processing. Finally, you'll be introduced to a number of other small but complementary Python libraries for text analysis, cleaning, and parsing.

This cookbook provides simple, straightforward examples so you can quickly learn text processing with Python and NLTK.

Table of Contents

Chapter 1: Tokenizing Text and WordNet Basics
Chapter 2: Replacing and Correcting Words
Chapter 3: Creating Custom Corpora
Chapter 4: Part-of-speech Tagging
Chapter 5: Extracting Chunks
Chapter 6: Transforming Chunks and Trees
Chapter 7: Text Classification
Chapter 8: Distributed Processing and Handling Large Datasets
Chapter 9: Parsing Specific Data Types

Appendix: Penn Treebank Part-of-speech Tags

Book Details

  • Title: Python 3 Text Processing with NLTK 3 Cookbook
  • Author:
  • Length: 310 pages
  • Edition: 1
  • Language: English
  • Publisher:
  • Publication Date: 2014-09-01
  • ISBN-10: 1782167854
  • ISBN-13: 9781782167853