
Cantonese WaC

This corpus was collected using Cantonese-only seed words and the Corpus Factory method. We hope the corpus represents Cantonese-only text, without mixing in other Chinese variants. Full details are described in Kilgarriff et al. (LREC 2010).