The idea
Aim: Detect a wide range of syntactic differences
Tag two or more collections of comparable material (using an automatic POS tagger):
- our case: learners (second-language speakers) versus native speakers
- many other options: different regions or discourses
Extract n-grams of POS tags (n = 3 to 5)
Compare their frequencies across the collections statistically
Sort the significant POS n-grams by weight (frequency or extent of difference)
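
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not name the statistical test, so the sketch substitutes the two-term log-likelihood ratio (G²) that is common in corpus comparison, with 3.84 (the chi-square critical value for p < 0.05 at one degree of freedom) as a hypothetical cut-off. The input is assumed to be already tagged, one list of POS tags per sentence; the function names `pos_ngrams`, `count_ngrams`, `g2`, and `compare` are invented for this example.

```python
import math
from collections import Counter


def pos_ngrams(tags, n_min=3, n_max=5):
    """Yield every POS-tag n-gram (as a tuple) with n_min <= n <= n_max."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tags) - n + 1):
            yield tuple(tags[i:i + n])


def count_ngrams(sentences, n_min=3, n_max=5):
    """Count POS n-grams over a corpus given as a list of tag lists."""
    counts = Counter()
    for tags in sentences:
        counts.update(pos_ngrams(tags, n_min, n_max))
    return counts


def g2(a, b, total_a, total_b):
    """Two-term log-likelihood ratio for an n-gram seen a times in A, b in B.

    Assumption: this stands in for whatever test the authors actually used.
    """
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_a)
    if b > 0:
        score += b * math.log(b / expected_b)
    return 2.0 * score


def compare(corpus_a, corpus_b, threshold=3.84):
    """Return (ngram, count_a, count_b, score) rows, largest difference first."""
    counts_a = count_ngrams(corpus_a)
    counts_b = count_ngrams(corpus_b)
    # Totals are kept per n-gram length so 3-grams are compared with 3-grams.
    total_a, total_b = Counter(), Counter()
    for gram, c in counts_a.items():
        total_a[len(gram)] += c
    for gram, c in counts_b.items():
        total_b[len(gram)] += c
    rows = []
    for gram in set(counts_a) | set(counts_b):
        n = len(gram)
        score = g2(counts_a[gram], counts_b[gram], total_a[n], total_b[n])
        if score >= threshold:
            rows.append((gram, counts_a[gram], counts_b[gram], score))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows
```

Here the rows are sorted by extent of difference; sorting by raw frequency instead would only require changing the sort key to the count columns.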
Notes: