Corpus Approaches to Language in Social Media 1st Edition by Matteo Di Cristofaro – Ebook PDF Instant Download/Delivery: 100091559X, 9781000915594
Full download of Corpus Approaches to Language in Social Media 1st Edition available after payment

Product details:
ISBN 10: 100091559X
ISBN 13: 9781000915594
Author: Matteo Di Cristofaro
This book showcases the unique possibilities of corpus linguistic methodologies for engaging with and analysing language data from social media, surveying current approaches and offering guidelines and best practices for language analysis. It provides an overview of how language in social media has been approached by linguists and non-linguists, then identifies the dataset requirements for investigating social media and the technical aspects of particular platforms that may influence the analysis, such as emoticons, retweets, and metadata. Sample Python code, along with general guidelines for using it, is provided to empower researchers to apply these techniques in their own work, supported by examples from three real-life case studies. Di Cristofaro highlights the full potential of these methodologies for analysing social media language data and the ways in which they might pave the way for future applications of data analysis and processing in corpus linguistics. The book will be key reading for researchers in corpus linguistics, and for linguists and social scientists interested in data-driven analysis of social media.
Corpus Approaches to Language in Social Media 1st Edition Table of contents:
1 Introduction
Setting the stage
Interconnecting with the digital
Digital humanities as practices of interconnections
More than numbers: studying cognition and society through corpus approaches
Scope and structure of this book
About the companion website
References
2 Social media as digital research data
The impact of the digital on cognition and society
Open source
Copyright and ethics
Copyright issues
Ethical issues
The characteristics of a corpus
More than text: corpus metadata, textual markup, and annotation
Metadata
Evaluating metadata
Textual markup
Annotations
References
3 Fundamentals of corpus linguistics
Corpus tools
The building blocks of corpus linguistics
Type, token, lemma
Frequencies and frequency lists
Dispersion
Concordances and key-word-in-context
Collocations
Keywords
Stoplist
Advancements in corpus linguistics
A corpus approach perspective on sentiment analysis and topic modelling
References
4 Imagining the data: corpus design
Setting up the working environment
Command-line interface and virtual programming environments
A note about programming languages
CSV, XML and HTML, JSON
CSV
XML and HTML
JSON
Preserving the data
Internet Archive and the Wayback Machine
WARC format
git
Working with digital textual data
Unicode, UTF-8, character encodings
Regular expressions
Towards data collection
References
5 Creating the data: corpus collection
Collecting the data: general remarks
Crawling and scraping web data
APIs
General purpose scrapers
#LancsBox
Archivebox
Trafilatura
The coding way: BeautifulSoup
Platform-specific scrapers
YouTube
Data processing
Dates, time, and Unix time
Text normalisation
PDF, Word, images
Detecting the language(s) used in a text
Emoticons and emojis
Hashtags
Other elements
Annotations
Verticalised format
Exploring the collected data
Cleaning and formatting the data
References
6 Case studies
Analysing crypto-drug market fora
Background
Context
Corpus design
Data processing
Corpus analysis
Analysing the language of far-right groups on Twitter and Facebook
Background
Context
Corpus design
Data processing
Corpus analysis
The communicative modus operandi of online child sexual groomers
Background
Context
Corpus design
Data processing
Corpus analysis
References
7 Conclusion
A broad view of corpus approaches
References
Appendix
Index
Tags:
Matteo Di Cristofaro, Corpus Approaches, Language