Preparing a Corpus for the Sketch Engine: Overview
To prepare a corpus for the Sketch Engine, we must:
- Prepare the data, including:
  - the text (see SkE/PrepareText)
  - the header information, if any (see SkE/PrepareHeaders)
- Prepare a corpus configuration file (see SkE/CorpusConfig).
- Prepare a subcorpus configuration file (see SkE/SubcorpusConfig). This step is needed only if you wish to compile subcorpora that can be shared by multiple users.
- Prepare a grammatical relations definition (gramrels) file (see SkE/CorpusQuerying). This step is needed only if you require word sketches or a thesaurus; the thesaurus takes the word sketch database as input, so it depends on the same file.
- Compile the corpus (see compiling corpora)
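
The conditional steps in the list above can be summarised as a small checklist. The following Python sketch is purely illustrative: the function `preparation_steps` and its parameter names are hypothetical and are not part of the Sketch Engine or its tools. It only encodes which steps this page describes as always needed and which are optional.

```python
def preparation_steps(has_headers=False, shared_subcorpora=False,
                      word_sketches=False, thesaurus=False):
    """Return the ordered preparation steps for a corpus.

    Hypothetical helper for illustration only; it calls no Sketch Engine tool.
    """
    steps = ["prepare the text"]
    if has_headers:
        steps.append("prepare the header information")
    steps.append("prepare the corpus configuration file")
    if shared_subcorpora:
        steps.append("prepare the subcorpus configuration file")
    # The thesaurus takes the word sketch database as input,
    # so it needs the gramrels file just as word sketches do.
    if word_sketches or thesaurus:
        steps.append("prepare the gramrels file")
    steps.append("compile the corpus")
    return steps


if __name__ == "__main__":
    plan = preparation_steps(has_headers=True, thesaurus=True)
    for number, step in enumerate(plan, start=1):
        print(f"{number}. {step}")
```

With no options set, the plan reduces to the three steps that are always required: preparing the text, preparing the corpus configuration file, and compiling the corpus.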
