Preparing a Corpus for the Sketch Engine: Overview

To prepare a corpus for the Sketch Engine, we must:

  • Prepare the data, including both
    • the text (see SkE/PrepareText)
    • the header information, if any (see SkE/PrepareHeaders)
  • Prepare a corpus configuration file (see SkE/CorpusConfig)
  • Run the encodevert program.

(Here we assume a running Sketch Engine installation.)

This gives us a corpus that can be queried to produce a range of concordances and lists. If word sketches are also required, we must in addition:

  • Prepare a grammatical relations definitions (gramrels) file (see SkE/CorpusQuerying)
  • Run the mkws.sh script.

Running mkws.sh also prepares the thesaurus, which requires no additional inputs of its own: it takes the word sketch database as its input.
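
The order of the steps above can be sketched in a few lines of Python. This is only an illustration, not part of the Sketch Engine distribution: the function names plan_steps and run_tools are hypothetical, and because this page does not document the arguments of encodevert or mkws.sh, the sketch assumes nothing about them and expects the caller to supply each full command line.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def plan_steps(has_headers: bool, want_word_sketches: bool) -> List[str]:
    """Return the preparation steps from this page, in order."""
    steps = ["prepare the text"]
    if has_headers:
        steps.append("prepare the header information")
    steps.append("prepare the corpus configuration file")
    steps.append("run encodevert")
    if want_word_sketches:
        # Word sketches (and the thesaurus built from them) need two more steps.
        steps.append("prepare the gramrels file")
        steps.append("run mkws.sh")
    return steps


def run_tools(
    encodevert_cmd: Sequence[str],
    mkws_cmd: Optional[Sequence[str]] = None,
    runner: Callable[..., object] = subprocess.run,
) -> List[List[str]]:
    """Run encodevert, then mkws.sh if a command line for it is given.

    Both command lines are supplied by the caller (assumption: this sketch
    does not know the tools' arguments). Returns the commands that were run.
    """
    executed: List[List[str]] = []
    for cmd in (encodevert_cmd, mkws_cmd):
        if cmd is None:
            continue
        # check=True makes subprocess.run raise if the tool exits with an
        # error, so mkws.sh is never started after a failed encodevert run.
        runner(list(cmd), check=True)
        executed.append(list(cmd))
    return executed
```

Calling plan_steps(has_headers=True, want_word_sketches=True) lists all six steps; with both flags False only the three steps needed for a queryable corpus remain. The runner parameter exists so the ordering can be checked without the tools installed.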

Brought to you by
Lexical Computing Ltd