Create your own natural language training corpus for machine learning. This example-driven book walks you through the annotation cycle, from selecting an annotation task and creating the annotation specification to designing the guidelines, creating a "gold standard" corpus, and then beginning the actual data creation with the annotation process.
Systems exist for analyzing existing corpora, but making a new corpus can be extremely complex. To help you build a foundation for your own machine learning goals, this easy-to-use guide includes case studies that demonstrate four different annotation tasks in detail. You'll also learn how to use a lightweight software package for annotating texts and adjudicating the annotations.
This book is a perfect companion to O'Reilly's Natural Language Processing with Python, which describes how to use existing corpora with the Natural Language Toolkit.