Sketch Engine

Examples

This section shows several examples of how to access the Sketch Engine automatically from a program, mainly using the JSON format. The set of examples is expected to grow over time.

Example 1

This example shows how to connect to the Sketch Engine server, send a query (in this case a simple word list query) and parse the result as JSON. Available for Java and Python. Many modules for JSON parsing are available; you do not have to use the ones from the examples.

Example 1 - Java source:

package jsonexample;

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import java.net.URL;
import org.json.*;

public class Main {
    
    public Main() {
    }
    
    public static void main(String[] args) {
        String data;

        // URL with the query
        String url_string = "http://beta.sketchengine.co.uk/auth/corpora/run.cgi/wordlist?corpname=preloaded/bnc;wlattr=word;wlminfreq=5;wlmaxitems=100;wlpat=test.*;format=json";

        final String usr = "<username>";
        final String passwd = "<password>";
        
        // HTTP authentication with the user name and password above
        Authenticator auth = new Authenticator() {
            protected PasswordAuthentication  getPasswordAuthentication () {
                return new PasswordAuthentication(usr, passwd.toCharArray());
            }
        };
        Authenticator.setDefault(auth);
        
        try {
            // connect to the Sketch Engine server
            URL url = new URL(url_string);
            InputStream stream = url.openStream();
            InputStreamReader isr = new InputStreamReader(stream);
            BufferedReader reader = new BufferedReader(isr);

            // receive the JSON data (they are on the first line of the response)
            data = reader.readLine();
            reader.close();

            // 'data' now holds a JSON string that can be parsed
            JSONObject json = new JSONObject(data);
            System.out.println(json.toString(3));
        } 
        catch(Exception e) {
            e.printStackTrace();
        }
    } 
}

Example 1 - Python source:

#!/usr/bin/python

import urllib2, base64
import simplejson

url = 'http://beta.sketchengine.co.uk/auth/corpora/run.cgi/wordlist?corpname=preloaded/bnc;wlattr=word;wlminfreq=5;wlmaxitems=100;wlpat=test.*;format=json'

usr = '<username>'
passwd = '<password>'

request = urllib2.Request(url)

# authentication (HTTP Basic); unlike encodestring, b64encode inserts
# no line breaks, so long credentials do not corrupt the header
base64string = base64.b64encode('%s:%s' % (usr, passwd))
request.add_header("Authorization", "Basic %s" % base64string)

# receive the JSON data
response = urllib2.urlopen(request)
data = response.read()
response.close()

# 'data' now holds a JSON string that can be parsed (e.g. by simplejson)

json_obj = simplejson.loads(data)
print simplejson.dumps(json_obj, sort_keys=True, indent=3)
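
The listing above targets Python 2 (urllib2, simplejson). As a rough sketch only, the same two steps (attaching the HTTP Basic credentials and decoding the JSON reply) might look as follows with the Python 3 standard library. The helper names build_request and fetch_json are made up for this illustration and are not part of the Sketch Engine interface; the URL is the one from Example 1 and the credentials remain placeholders.

```python
import base64
import json
import urllib.request

# URL copied from Example 1
URL = ('http://beta.sketchengine.co.uk/auth/corpora/run.cgi/wordlist'
       '?corpname=preloaded/bnc;wlattr=word;wlminfreq=5;wlmaxitems=100;'
       'wlpat=test.*;format=json')


def build_request(url, usr, passwd):
    """Return a Request carrying an HTTP Basic Authorization header.

    Hypothetical helper, named for this sketch only.
    """
    token = base64.b64encode(('%s:%s' % (usr, passwd)).encode('utf-8'))
    request = urllib.request.Request(url)
    request.add_header('Authorization', 'Basic ' + token.decode('ascii'))
    return request


def fetch_json(request):
    """Send the request and decode the response body as JSON.

    Hypothetical helper; assumes the server answers in UTF-8.
    """
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode('utf-8'))
```

With valid credentials, fetch_json(build_request(URL, usr, passwd)) would return the parsed word list, which can then be printed with json.dumps(result, sort_keys=True, indent=3) as in the listing above.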

Example 2

This example shows an easy way to convert common data structures (dictionaries in Python, Maps in Java) to JSON objects and how to use the resulting JSON objects as a query to the Sketch Engine. Available for Java and Python.

Example 2 - Java source: (view full source)

String data, url_string;
String base_url = "http://beta.sketchengine.co.uk/auth/corpora/run.cgi/";
String method = "wordlist";
Map attrs;
JSONObject json_query;

...

// creating query string
attrs = new HashMap();
attrs.put("corpname", "preloaded/bnc");
attrs.put("wlattr", "word");
attrs.put("wlpat", "test.*");
attrs.put("format", "json");
json_query = new JSONObject(attrs);
url_string = base_url + method + "?json=" + json_query.toString();

Example 2 - Python source: (view full source)

import urllib, urllib2, base64
import simplejson

...

base_url = 'http://beta.sketchengine.co.uk/auth/corpora/run.cgi/'
method = 'wordlist'

# creating query string
attrs = dict(corpname='preloaded/bnc', wlattr='word', wlpat='test.*',
               format='json')
encoded_attrs = urllib.quote(simplejson.JSONEncoder().encode(attrs))
url = base_url + method + '?json=%s' % encoded_attrs
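
To make the encoding step easier to inspect, here is a minimal Python 3 sketch of the same idea using only the standard library (json and urllib.parse instead of simplejson and urllib.quote). The functions build_json_url and decode_json_url are hypothetical names introduced for this illustration; decode_json_url exists only to check that the dictionary survives the round trip.

```python
import json
from urllib.parse import quote, unquote, urlsplit

BASE_URL = 'http://beta.sketchengine.co.uk/auth/corpora/run.cgi/'


def build_json_url(method, attrs, base_url=BASE_URL):
    """Serialize attrs to JSON, percent-encode it and append it as ?json=..."""
    encoded_attrs = quote(json.dumps(attrs, sort_keys=True))
    return base_url + method + '?json=' + encoded_attrs


def decode_json_url(url):
    """Recover the attribute dictionary from a URL built by build_json_url."""
    query = urlsplit(url).query  # the part after '?', i.e. 'json=...'
    return json.loads(unquote(query[len('json='):]))


attrs = dict(corpname='preloaded/bnc', wlattr='word', wlpat='test.*',
             format='json')
url = build_json_url('wordlist', attrs)
print(url)
```

Percent-encoding matters here because the JSON text contains braces, quotes and spaces, none of which are safe to send unescaped in a URL.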

Example 3

This example demonstrates how to obtain frequencies for a list of CQL queries. If the Sketch Engine runs on the local machine (i.e. 'base_url' starts with 'http://localhost/'), there is no network overhead and the computation should be very fast.

Example 3 - Java source: (view full source)

int qlist_size = 4;
String data, url_string;
String base_url = "http://beta.sketchengine.co.uk/auth/corpora/run.cgi/";
String method = "view";
String query_list[] = new String[qlist_size];
Map attrs;
JSONObject json_query;

...

// specifying attributes
attrs = new HashMap();
attrs.put("corpname", "preloaded/bnc");
attrs.put("pagesize", "1");
attrs.put("format", "json");
// query list can be loaded from a file, ...
query_list[0] = "[lemma=\"test\"]";
query_list[1] = "[lemma=\"drug\"][lemma=\"test\"]";
query_list[2] = "[lemma=\"blood\"][lemma=\"test\"]";
query_list[3] = "[lemma=\"test\"][lemma=\"result\"]";

for (int i = 0; i < qlist_size; i++) {
    attrs.put("q", "q" + query_list[i]);
    json_query = new JSONObject(attrs);
    url_string = base_url + method + "?json=" + json_query.toString();

    try {
        // connect to the Sketch Engine server
        URL url = new URL(url_string);
        InputStream stream = url.openStream();
        InputStreamReader isr = new InputStreamReader(stream);
        BufferedReader reader = new BufferedReader(isr);

        // receive the JSON data (they are on the first line of the response)
        data = reader.readLine();

        // 'data' now holds a JSON string; print the query and its 'concsize'
        JSONObject json = new JSONObject(data);
        System.out.println(query_list[i] + "\t" + json.get("concsize").toString());

...

Example 3 - Python source: (view full source)

...

base_url = 'http://beta.sketchengine.co.uk/auth/corpora/run.cgi/'
method = 'view'

# creating query string
attrs = dict(corpname='preloaded/bnc', q='', pagesize='1', format='json')
# query_list can be read from a file, ...
query_list = ['[lemma="test"]',
              '[lemma="drug"][lemma="test"]',
              '[lemma="blood"][lemma="test"]',
              '[lemma="test"][lemma="result"]'
             ]

for query in query_list:
    attrs['q'] = 'q' + query

    encoded_attrs = urllib.quote(simplejson.JSONEncoder().encode(attrs))
    url = base_url + method + '?json=%s' % encoded_attrs

    request = urllib2.Request(url)

    # authentication (HTTP Basic); unlike encodestring, b64encode inserts
    # no line breaks, so long credentials do not corrupt the header
    base64string = base64.b64encode('%s:%s' % (usr, passwd))
    request.add_header("Authorization", "Basic %s" % base64string)

    # receive the JSON data
    response = urllib2.urlopen(request)
    data = response.read()
    response.close()

    # 'data' now holds a JSON string that can be parsed (e.g. by simplejson)
    json_obj = simplejson.loads(data)

    # print the query and its frequency ('0' if 'concsize' is missing)
    print query + '\t' + str(json_obj.get('concsize', '0'))
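
The loop above mixes URL construction, the HTTP call and the extraction of 'concsize'. The following Python 3 sketch separates these steps so that the extraction can be tried without a server. It is an illustration under assumptions: query_url, frequencies and fake_fetch are hypothetical names, fake_fetch replaces the real HTTP request, and the number it returns is invented rather than taken from any corpus.

```python
import json
from urllib.parse import quote

BASE_URL = 'http://beta.sketchengine.co.uk/auth/corpora/run.cgi/'


def query_url(cql, corpname='preloaded/bnc'):
    """Build the 'view' URL for one CQL query, as in the loop above."""
    attrs = dict(corpname=corpname, q='q' + cql, pagesize='1', format='json')
    return BASE_URL + 'view?json=' + quote(json.dumps(attrs))


def frequencies(query_list, fetch):
    """Map each CQL query to its 'concsize'.

    fetch(url) must return the JSON text of the response; in real use it
    would perform the authenticated HTTP request.
    """
    result = {}
    for cql in query_list:
        json_obj = json.loads(fetch(query_url(cql)))
        # a missing 'concsize' is counted as 0, as in the listing above
        result[cql] = int(json_obj.get('concsize', 0))
    return result


def fake_fetch(url):
    """Stand-in for the HTTP call; the returned value is invented."""
    return json.dumps({'concsize': 42})


table = frequencies(['[lemma="test"]', '[lemma="drug"][lemma="test"]'],
                    fake_fetch)
for cql, size in table.items():
    print(cql + '\t' + str(size))
```

Passing the fetch function as a parameter is a design choice of this sketch: it lets the same loop run against a local server, the remote server, or canned responses.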


Attachments

  • example2.java (2.2 kB) - added by vojta on 06/18/08 12:16:22.

Lexical Computing Ltd
