All files: taboo-words.zip
Computational Generation and Dissection of Lexical Replacement Humor
This readme file describes the taboo(-inducing) word lists used in the experiments of the following paper:
Alessandro Valitutti, Antoine Doucet, Jukka M. Toivanen, and Hannu Toivonen: Computational Generation and Dissection of Lexical Replacement Humor. Submitted to Natural Language Engineering. 2015.
If you use the word lists in your research, please cite the above paper as the source of the words.
The files containing the word lists are
Taboo word classes
- connotational taboo words are unspeakable words where the taboo is in the utterance itself.
- taboo-inducing words are not taboo in themselves, but can induce taboo meanings depending on how they are used.
Please see the paper referenced above for further information about the classification of taboo words.
The taboo(-inducing) words were hand-picked from three sources:
- words used as funny autocorrections from http://www.damnyouautocorrect.com
- profanities from http://www.urbandictionary.com and http://onlineslangdictionary.com
- words related to sex, from the sexuality domain of WordNet-Domains, cf.
Magnini, B. and Cavaglià, G. (2000): Integrating Subject Field Codes into WordNet. In Proceedings of the 2nd International Conference on Language Resources and Evaluation (LREC2000), Athens, Greece.