Thai WaC
The corpus was prepared using the Corpus Factory method. Full details are described in Kilgarriff et al. (LREC 2010).
The corpus is tokenised using the Swath word segmentation tool, downloadable at http://www.cs.cmu.edu/~paisarn/software.html
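Thai is written without spaces between words, so running text must be segmented before any word-level processing. The sketch below is not Swath's implementation and does not reflect its interface. It only illustrates one common dictionary-based approach, greedy longest matching, under stated assumptions: the function name `longest_match_segment` and the four-word `toy_lexicon` are hypothetical and chosen purely for illustration.

```python
def longest_match_segment(text, lexicon):
    """Greedy left-to-right longest-match segmentation (illustrative only)."""
    # Longest lexicon entry bounds how far ahead we need to look.
    max_len = max(map(len, lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until the lexicon matches.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                break
        else:
            # No lexicon entry starts here: emit a single-character token.
            j = i + 1
        tokens.append(text[i:j])
        i = j
    return tokens


# Hypothetical toy lexicon; a real segmenter relies on a large dictionary.
toy_lexicon = {"คลัง", "ข้อมูล", "ภาษา", "ไทย"}
print(longest_match_segment("คลังข้อมูลภาษาไทย", toy_lexicon))
```

Greedy matching is simple but can pick the wrong boundary when words overlap, which is one reason dedicated tools are used for corpus-scale tokenisation.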
