Dictionaries II

Evaluation and Analysis

Is this dictionary any good? One way to approach this question is to look at its precision and recall: how often do the words it assigns to a topic really belong there, and of all the words that belong to a topic, how often does the dictionary assign them to it? These can be hard to estimate for manual methods like content analysis dictionaries, but we can still lay down a general framework that will also apply to topic models and many other kinds of coding exercises.
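
To make the two quantities concrete, here is a minimal sketch of how they could be computed for one topic. It assumes we have, for each word token, the topic the dictionary assigned and a hand-coded "true" topic. The function name `precision_recall`, the topic labels, and the toy data are all hypothetical and only for illustration.

```python
def precision_recall(assigned, truth, topic):
    """Precision and recall of dictionary assignments for a single topic.

    assigned: topic the dictionary gave each word (None if unassigned)
    truth:    hand-coded topic for each word (None if no topic applies)
    """
    pairs = list(zip(assigned, truth))
    # Words the dictionary put in the topic that really belong there
    tp = sum(1 for a, t in pairs if a == topic and t == topic)
    # Words the dictionary put in the topic that do not belong there
    fp = sum(1 for a, t in pairs if a == topic and t != topic)
    # Words that belong in the topic but that the dictionary missed
    fn = sum(1 for a, t in pairs if a != topic and t == topic)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Hypothetical toy data: six word tokens, two topics
assigned = ["economy", "economy", "defence", None, "economy", None]
truth = ["economy", "defence", "defence", "economy", "economy", None]

p, r = precision_recall(assigned, truth, "economy")
print(f"precision = {p:.2f}, recall = {r:.2f}")
```

In this toy case the dictionary assigns three words to "economy", two of them correctly, and misses one word that belongs there, so both precision and recall come out at two thirds. In practice the hard part is obtaining the hand-coded labels, not the arithmetic.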

It's all very well to discover where a dictionary is weak, but can we do anything about it? Turns out, yes, though there’ll still be a manual element.

Readings

Lecture

Link (not yet)