esTenTen
The Spanish corpus of the TenTen family of web corpora.
The corpus is tagged with TreeTagger using the Spanish parameter file (UTF-8).
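As a rough illustration of working with TreeTagger-style output, the sketch below reads one token per line in three tab-separated columns (word, tag, lemma), which is what TreeTagger prints when run with its -token and -lemma options. Whether this corpus is distributed in exactly that layout is an assumption. The names Token and parse_tagged_line are hypothetical, and the sample lines and tags are illustrative only, not taken from the corpus.

```python
from typing import NamedTuple, Optional


class Token(NamedTuple):
    word: str
    tag: str
    lemma: str


def parse_tagged_line(line: str) -> Optional[Token]:
    """Parse one 'word<TAB>tag<TAB>lemma' line; return None for anything else."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 3:
        # Assumed: lines without three columns are structural markup or blanks.
        return None
    return Token(*parts)


# Illustrative sample only; not real corpus data.
sample = "El\tART\tel\ngato\tNC\tgato\nduerme\tVLfin\tdormir\n"

tokens = [t for t in map(parse_tagged_line, sample.splitlines()) if t is not None]
lemmas = [t.lemma for t in tokens]
```

Lines that do not split into exactly three columns are skipped rather than raising, so structural markup mixed into the token stream does not stop the parse.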
Changelog
v2.0 (30 September 2011)
- removed Catalan and Galician texts
- corpus size reduced by 79 million tokens
v1.0 (13 April 2011)
- initial version -- 2.5 billion tokens
