Linguistic Gaming with Python
Python Practice 8
In this exercise, you will familiarize yourselves with the
Natural Language Toolkit (NLTK), which is written in Python and
contains many useful tools for natural language processing.
Part 1: Make sure you have a working installation of NLTK.
Either install it on your computer or use the computers in the
MacRoom.
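A quick way to check the installation is sketched below. The helper name nltk_installed is made up for this illustration, and the pip command in the message is only one common way to install NLTK; your setup may differ.

```python
def nltk_installed():
    """Return True if the nltk package can be imported, False otherwise."""
    try:
        import nltk  # noqa: F401 (imported only to test availability)
    except ImportError:
        return False
    return True


if nltk_installed():
    import nltk
    print("NLTK version:", nltk.__version__)
    # The texts used in Chapter 1 (text1, text2, ...) need extra data.
    # Running nltk.download("book") once should fetch them (needs a network
    # connection, so it is left as a comment here).
else:
    print("NLTK is not installed; one option is: pip install nltk")
```

If the check succeeds and the book data is downloaded, `from nltk.book import *` should load the texts used in Chapter 1.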
Part 2: Work through Chapter 1 of the on-line NLTK book.
Note that the on-line version works with Python 2.X.
Part 3: Do the following exercises suggested in Chapter 1
(they appear wherever the chapter says "Your Turn").
- Try searching for other words; to save re-typing, you
might be able to use up-arrow, Ctrl-up-arrow or Alt-p to access the
previous command and modify the word being searched. You can also
try searches on some of the other texts we have included. For
example, search Sense and Sensibility for the word affection, using
text2.concordance("affection"). Search the book of Genesis to find
out how long some people lived, using
text3.concordance("lived"). You could look at text4, the Inaugural
Address Corpus, to see examples of English going back to 1789, and
search for words like nation, terror, god to see how these words
have been used differently over time. We've also included text5, the
NPS Chat Corpus: search this for unconventional words like im, ur,
lol. (Note that this corpus is uncensored!)
- Pick another pair of words and compare their usage in two
different texts, using the similar() and common_contexts()
functions.
- How many times does the word lol appear in text5? How much is
this as a percentage of the total number of words in this text?
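For the last exercise, the arithmetic can be sketched as follows. The function name word_share and the toy token list are invented for illustration; the sketch only assumes that the text supports len() and .count(), which holds for plain Python lists and, as far as we know, for the NLTK texts such as text5.

```python
def word_share(tokens, word):
    """Return (count, percentage) of `word` among `tokens`.

    `tokens` can be any sequence that supports len() and .count().
    """
    total = len(tokens)
    if total == 0:
        # Avoid dividing by zero for an empty text.
        return 0, 0.0
    count = tokens.count(word)
    # 100.0 forces floating-point division, also under Python 2.
    return count, 100.0 * count / total


# Toy token list standing in for a real text (made up for this example).
sample = ["lol", "hi", "everyone", "lol", "ur", "funny", "im", "back"]
count, percent = word_share(sample, "lol")
print(count, percent)
```

With the book data loaded via `from nltk.book import *`, calling word_share(text5, "lol") should give the figures asked for in the exercise. Note that the count is case-sensitive, so "LOL" and "lol" are counted separately.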
Please send the outcome of the exercises to
aikaterini-lida at uni konstanz by 11:30 am on December 19th, 2014.