wiki:SkE/Config/DynamicAttributes

To use dynamic attributes, they must be set up in the corpus configuration file.

DYNAMIC must be set to the name of an internal function or to the name of a function from an external shared library.

DYNLIB must be set to "internal" or to the name (path) of the shared library, accordingly.

Internal functions

- striplastn   (str, n)       - returns str stripped of its last n characters
- lowercase    (str, locale)  - returns str in lowercase
- getfirstn    (str, n)       - returns the first n characters of str
- getnchar     (str, n)       - returns the n-th character of str
- getnextchars (str, attr, n) - returns the n characters following the character attr
- getnextchar  (str, attr)    - returns the character following the character attr

Example definitions in the corpus configuration file:

ATTRIBUTE   lemma {
          DYNAMIC    striplastn
          DYNLIB     internal
          ARG1       "2"
          FUNTYPE    i
          FROMATTR   lempos
          TYPE       index
}
ATTRIBUTE   lc {
          DYNAMIC    lowercase
          DYNLIB     internal
          ARG1       "C"
          FUNTYPE    s
          FROMATTR   word
          TYPE       index
          TRANSQUERY yes
}
ATTRIBUTE   tag {
         DYNAMIC     getfirstn
         DYNLIB      internal
         ARG1        "3"
         FUNTYPE     i
         FROMATTR    ambtag
         TYPE        index
}
ATTRIBUTE   k {
         DYNAMIC     getnchar
         DYNLIB      internal
         ARG1        1
         FUNTYPE     i
         FROMATTR    tag
         TYPE        index
}
ATTRIBUTE   g {
         DYNAMIC     getnextchar
         DYNLIB      internal
         ARG1        "g"
         FUNTYPE     c
         FROMATTR    tag
         TYPE        index
}
ATTRIBUTE   g3 {
         DYNAMIC     getnextchars
         DYNLIB      internal
         ARG1        "g"
         ARG2        3
         FUNTYPE     ci
         FROMATTR    tag
         TYPE        index
}
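
To make the descriptions above concrete, here is a minimal C sketch of what four of the internal functions compute. It is not Manatee's own code: the `sketch_` names, the caller-supplied output buffer, counting lengths in bytes, and returning an empty string when the character attr is absent are all assumptions made for illustration. getnchar is left out because the list does not say where its counting starts.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only: these are not Manatee's own implementations.
   Lengths are counted in bytes, which is an assumption of this sketch. */

/* Copy at most len bytes of src into out (capacity cap), always terminating. */
static const char *copy_span(const char *src, size_t len, char *out, size_t cap)
{
    assert(src != NULL && out != NULL && cap > 0);
    if (len >= cap)
        len = cap - 1;
    memcpy(out, src, len);
    out[len] = '\0';
    return out;
}

/* striplastn(str, n): str without its last n characters. */
const char *sketch_striplastn(const char *str, size_t n, char *out, size_t cap)
{
    size_t len = strlen(str);
    return copy_span(str, n >= len ? 0 : len - n, out, cap);
}

/* getfirstn(str, n): the first n characters of str. */
const char *sketch_getfirstn(const char *str, size_t n, char *out, size_t cap)
{
    size_t len = strlen(str);
    return copy_span(str, n < len ? n : len, out, cap);
}

/* getnextchars(str, attr, n): up to n characters following the first
   occurrence of attr; empty if attr is absent (assumed behaviour).
   With n == 1 this corresponds to getnextchar(str, attr). */
const char *sketch_getnextchars(const char *str, char attr, size_t n,
                                char *out, size_t cap)
{
    const char *hit = strchr(str, attr);
    if (hit == NULL || *hit == '\0')
        return copy_span("", 0, out, cap);
    size_t rest = strlen(hit + 1);
    return copy_span(hit + 1, n < rest ? n : rest, out, cap);
}

/* lowercase(str, "C"): ASCII-only lowercasing, as in the C locale. */
const char *sketch_lowercase(const char *str, char *out, size_t cap)
{
    copy_span(str, strlen(str), out, cap);
    for (char *p = out; *p != '\0'; ++p)
        *p = (char)tolower((unsigned char)*p);
    return out;
}
```

With a lempos value such as "house-n", `sketch_striplastn(..., 2, ...)` yields "house", which is what the lemma definition above relies on. For a hypothetical tag value "k1gMnSc1", `sketch_getfirstn(..., 3, ...)` yields "k1g" and `sketch_getnextchars(..., 'g', 1, ...)` yields "M".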

Shared library

The following example function takes the publication year of a document and determines the epoch the document comes from.

  • the source code (epoch.c):
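    #include <stdio.h>

    /* Map a publication year (passed as a string) to an epoch label. */
    const char *epoch(const char *year)
    {
        int y;

        /* sscanf returns the number of converted items; anything but 1
           means no year could be read, so fall back to a neutral label
           (pick whatever fallback suits the corpus). */
        if (sscanf(year, "%d", &y) != 1)
            return "unknown";

        if (y < 1990) return "before 1990";
        if (y < 2001) return "1990-2000";
        if (y < 2005) return "2001-2004";
        if (y < 2009) return "2005-2008";
        return "2009 and later";
    }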
    
  • to compile the library, use (-fPIC produces the position-independent code that shared libraries need on most platforms):
    gcc -shared -fPIC -o epoch.so epoch.c
    
  • the relevant part of the corpus configuration file:
    STRUCTURE doc {
        ATTRIBUTE year
        ATTRIBUTE time {
             DYNAMIC         epoch
             DYNLIB          "/corpora/vert/greek/epoch.so"
             FUNTYPE         0
             FROMATTR        year
             TYPE            index
             TRANSQUERY      yes
        }
    }
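
Before building the dynamic attribute, it can be useful to check that the library loads and that the function resolves under the name given in DYNAMIC. The following is a minimal sketch using the standard `dlopen`/`dlsym` interface. The helper name `try_dynattr` is made up for this example, and the one-string-in, one-string-out signature is an assumption taken from the epoch example; it is not a statement about how Manatee itself loads the library.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Assumed signature, matching the epoch example: one string in, one string out. */
typedef const char *(*dynattr_fn)(const char *);

/* Load libpath, look up fn_name, call it on value and copy the result to out.
   Returns 0 on success, -1 if the library or the symbol cannot be resolved. */
int try_dynattr(const char *libpath, const char *fn_name,
                const char *value, char *out, size_t cap)
{
    assert(libpath && fn_name && value && out && cap > 0);

    void *handle = dlopen(libpath, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* Assigning through a void ** is the POSIX-suggested way to turn the
       object pointer returned by dlsym into a function pointer. */
    dynattr_fn fn;
    *(void **)(&fn) = dlsym(handle, fn_name);
    if (fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    /* Copy the result before dlclose: the returned pointer may refer to
       string literals that live inside the library. */
    const char *result = fn(value);
    snprintf(out, cap, "%s", result ? result : "");

    dlclose(handle);
    return 0;
}
```

Called as `try_dynattr("./epoch.so", "epoch", "1995", buf, sizeof buf)` against a library built from the source above, this should return 0 and leave "1990-2000" in buf. On older glibc versions the program has to be linked with -ldl.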
    

The dynamic attribute is not created when the corpus is compiled with encodevert; it has to be created additionally with mkdynattr:

mkdynattr <corpus> <dynattr>
mkdynattr gkwac0.5 doc.time