enTenTen

English TenTen corpus.

The corpus is tagged with TreeTagger using the English parameter file (Latin1).
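
As a rough illustration of working with TreeTagger-style annotation, the sketch below reads tab-separated lines into (word, tag, lemma) triples. It assumes a one-token-per-line layout with the columns in that order and structural markup on lines of their own; the actual column order and markup of the distributed corpus files are not stated here, so check them before relying on this. The name `parse_tagged_lines` and the sample data are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class Token(NamedTuple):
    word: str
    tag: str
    lemma: str


def parse_tagged_lines(lines: Iterable[str]) -> Iterator[Token]:
    """Yield Token tuples from tab-separated word/tag/lemma lines.

    Assumption: one token per line, three tab-separated columns.
    Lines with any other shape (blank lines, markup such as <s>)
    are skipped.
    """
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # not a token line under the assumed layout
        yield Token(*parts)


# Made-up sample data for illustration only, not taken from the corpus.
sample = ["<s>", "The\tDT\tthe", "cats\tNNS\tcat", ".\tSENT\t.", "</s>"]
tokens = list(parse_tagged_lines(sample))
```

With a real file, the same function could be fed an open file handle instead of the sample list, since it only iterates over lines.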

Changelog

v1.0 (15 November 2010)

  • initial version -- 3.3 billion tokens