Research Methods in the Social Sciences: An A-Z of key concepts

Research Methods in the Social Sciences: An A-Z of key concepts (1st edn)

Jean-Frédéric Morin, Christian Olsson, and Ece Özlem Atikcan

Printed from Oxford Politics Trove. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 21 October 2021

Automated Text Analysis

The Application of Automatic Text Processing in the Social Sciences

  • Panagis Yannis

Abstract

This chapter examines automated text analysis (ATA), a family of methodologies for analysing text with computer software. ATA is a computer-assisted approach suited to cases where manual analysis would be prohibitively labour-intensive because of the volume of texts involved. ATA methods have grown in popularity with the current interest in big data and with the volume of textual content made easily accessible by the digitization of human activity. Key to ATA is the notion of a corpus, which is a collection of texts. Before any analysis can start, the relevant documents must be collected and the corpora constructed; the research question dictates which texts to include. After text collection, some processing steps are needed before the analysis proper, for example tokenization and part-of-speech tagging. Tokenization is the process of splitting a text into its constituent words, also called tokens, whereas part-of-speech tagging assigns each word a label indicating its part of speech (for example noun, verb, or adjective).
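
The two preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the two-sentence corpus is invented, the tokenizer is a simple regular expression, and the tagger is a toy dictionary lookup (`TOY_LEXICON`) standing in for the trained, context-aware taggers that real ATA software provides.

```python
import re
from collections import Counter

# Hypothetical two-document corpus, invented for illustration only.
corpus = {
    "doc1": "The committee debates the new law.",
    "doc2": "The new law divides the committee.",
}

# Toy lexicon mapping lower-cased words to part-of-speech labels.
# This is an assumption for the sketch: real taggers are trained on
# annotated text and use the surrounding context of each word.
TOY_LEXICON = {
    "the": "DET",
    "committee": "NOUN",
    "law": "NOUN",
    "new": "ADJ",
    "debates": "VERB",
    "divides": "VERB",
}


def tokenize(text):
    """Split a text into lower-cased tokens (runs of word characters)."""
    return re.findall(r"\w+", text.lower())


def tag(tokens):
    """Pair each token with a label from the toy lexicon, or 'UNK' if absent."""
    return [(token, TOY_LEXICON.get(token, "UNK")) for token in tokens]


# Tokenize and tag every document in the corpus.
tokens_by_doc = {name: tokenize(text) for name, text in corpus.items()}
tagged_by_doc = {name: tag(tokens) for name, tokens in tokens_by_doc.items()}

# Token frequencies across the whole corpus, a common first descriptive step.
token_counts = Counter(
    token for tokens in tokens_by_doc.values() for token in tokens
)

for name, tagged in tagged_by_doc.items():
    print(name, tagged)
print(token_counts.most_common(3))
```

A lookup tagger of this kind cannot resolve ambiguous words (for instance "debates" as a noun versus a verb), which is one reason applied work usually relies on an established tagging library instead.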
