Part-of-speech tagging of historical Dutch texts

Pre-PhD Fellow: Dieuwke Hupkes
Supervisor: prof.dr. Rens Bod

Enriching a text with part-of-speech (POS) tags is useful for research into both its content and its form. Given a relatively large amount of labeled data, computer models can be trained to tag texts in the same language. Such trained taggers work very well on words they encountered during training, and they also perform reasonably well on new words by making use of context. Developing well-performing POS taggers for low-resource data, however, is still an open problem. This project aims to improve the quality of POS tagging of historical Dutch.
