ICPSR: Machine Learning for the Analysis of Text as Data

July 8, 2019 @ 9:00 am - July 12, 2019 @ 5:00 pm

 

Instructor(s):

Brice Acree, Ohio State University

Quantitative analysis of digitized text represents an exciting and challenging frontier of data science across a broad spectrum of disciplines. From the analysis of physicians’ notes to identify patients with diabetes, to the assessment of global happiness through the analysis of speech on Twitter, patterns in massive text corpora have led to important scientific advancements. In this course we will cover several central computational and statistical methods for the analysis of text as data. Topics will include the manipulation and summarization of text data, dictionary methods of text analysis, prediction and classification with textual data, document clustering, text reuse measurement, and statistical topic models. Each method will be illustrated with hands-on examples using R. Participants will develop an understanding of the challenges and opportunities presented by the analysis of text as data, as well as the practical computational skills to complete independent analyses. The R packages covered in this course include tm, lda, textreuse, glmnet and openNLP.

One distinguishing focus of this course will be the use of text analytics for the reliable and valid development and testing of scientific theory. Most methods of text analysis have been developed with predictive or descriptive motivations. For each method we cover, we will review how it has been, and can be, applied to draw theoretical inferences regarding processes surrounding text generation.

Prerequisites: Participants should be familiar with linear and generalized linear models (e.g., logit and Poisson regression) and have at least some exposure to the R environment before the workshop. The class will review aspects of R on the first day. No prior knowledge of text processing or modeling is assumed.

Fee: Members = $1700; Non-members = $3200


Details

Website:
https://www.icpsr.umich.edu/icpsrweb/sumprog/

Organizers

odum_bull2

Venue

219 Davis Library
208 Raleigh Street
Chapel Hill, NC 27514 United States
Phone
(919) 962-1151