Vienna-Oxford International Corpus of English

Using VOICE Online

10 VOICE POS Online

This section explains how to use the VOICE POS Interface. VOICE POS Online is accessible to all registered users of VOICE Online. For detailed information on the procedures and guidelines developed for part-of-speech tagging and lemmatizing VOICE, and on the tagging format and tags, please refer to the VOICE Part-of-Speech Tagging and Lemmatization Manual. We strongly encourage users to familiarize themselves with this manual before working with VOICE POS Online.

10.1 User Interface

After registering and logging on to VOICE Online, you can access VOICE POS Online via the button on the VOICE Online Interface.
[Screenshot] Access to VOICE POS Online via the VOICE Online Interface

The search interface consists of a search field and a content area below the search field. The search field is used for entering token, tag or lemma information.

[Screenshot] VOICE POS Online Search Interface

To search VOICE POS, enter your query in the white box in the top left-hand corner of the interface and click the "Search" button. To the right of the "Search" field, a number appears indicating how many results your search has yielded. The search results are displayed in the content area below, which is divided into four columns. These are, from left to right, "No." (for "Number"), "Identity" (for "Event ID and position in utterance"), "Search Results" and "Tag information".

[Screenshot] VOICE POS Online Search Interface with search

10.2 Searching VOICE POS

10.2.1 General search information

For any token position in the corpus, you can search for the following information:

  1. Token, e.g. gone
  2. Part of Speech, e.g. VVN
  3. Lemma, e.g. go

These can be searched for individually or in a co-textual sequence. Each individual item in a query needs to be separated with a space.
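As a rough illustration of what a co-textual sequence means, the following Python sketch treats an utterance as a list of tokens and reports the positions at which all space-separated query items match consecutive tokens. It is not the VOICE POS Online implementation: the function name find_sequence is hypothetical, and only plain token matching is modelled (no tags, lemmata or wildcards).

```python
from typing import List


def find_sequence(tokens: List[str], query: str) -> List[int]:
    """Return the start indices where the space-separated query items
    match consecutive tokens exactly."""
    items = query.split()
    n = len(items)
    if n == 0:
        return []
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == items]


# Utterance fragment quoted in this manual, already split into tokens.
utterance = ["agency", "can", "be", "translated"]
print(find_sequence(utterance, "can be"))  # [1]
```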

10.2.2 Default searches and further specified searches

If not further specified, entering a word searches for a token (not a lemma), and entering a POS tag yields results for this word class occurring as either a form tag or a function tag (see section 3.2 Tagging formats in VOICE in the VOICE Part-of-Speech Tagging and Lemmatization Manual).

Example: Default search for a token:

The search be yields results for the token ‘be’, e.g. “agency can be translated”, but not for other tokens assigned to the lemma ‘be’, such as ‘is’, ‘were’, ‘being’, etc.

Example: Default search for a POS-tag:

The search for the POS-tag NNS will yield all of the following kinds of results:

  1. multicultural teams_NNS(NNS)
  2. in one countries_NNS(NN)
  3. three university_NN(NNS)

Alternatively, users may want to specify their searches further, e.g. when searching for lemmata, or only for form tags or only for function tags (see section 3.2 Tagging formats in VOICE in the VOICE Part-of-Speech Tagging and Lemmatization Manual). This can be done by labelling the search items as illustrated in the table below.

[Screenshot] Explicitly labelling POS form and function and lemmata

Example: Search for a POS-form tag with explicit labelling:

The query p:NN yields the result “three university_NN(NNS)”, but not e.g. “in one countries_NNS(NN)”.

Example: Search for a POS-function tag with explicit labelling:

The query f:NN yields the result “from another countries_NNS(NN)”, but not e.g. “three university_NN(NNS)”.

Example: Search for a lemma with explicit labelling:

The query l:be yields the results “i’m sorry”, “you were mentioning”, “you must be clear about” etc.
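The effect of the labels p:, f: and l: can be sketched in Python as follows. This is a simplified model under stated assumptions, not the tool's implementation: the Position record and the function matches_labelled are hypothetical names, and the lemma values for the two noun examples are assumed for illustration (only gone / VVN / go comes from section 10.2.1).

```python
from typing import NamedTuple


class Position(NamedTuple):
    token: str
    form_tag: str
    function_tag: str
    lemma: str


def matches_labelled(item: Position, query: str) -> bool:
    """Check one labelled query (p:TAG, f:TAG or l:LEMMA) against one token position."""
    label, sep, value = query.partition(":")
    fields = {"p": item.form_tag, "f": item.function_tag, "l": item.lemma}
    if not sep or label not in fields:
        raise ValueError(f"expected p:, f: or l: label, got {query!r}")
    return fields[label] == value


# Lemmata of the two nouns are assumed here for illustration.
university = Position("university", "NN", "NNS", "university")
countries = Position("countries", "NNS", "NN", "country")
gone = Position("gone", "VVN", "VVN", "go")

print(matches_labelled(university, "p:NN"))  # True
print(matches_labelled(countries, "p:NN"))   # False
print(matches_labelled(countries, "f:NN"))   # True
print(matches_labelled(gone, "l:go"))        # True
```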

10.2.3 Sub-specifications

Users can sub-specify a query on a single token by adding a further criterion after a comma:

Examples:

  1. search for the token work as a noun: “work,NN”
  2. search for the tag VVP but only when occurring as the token walk: “VVP,walk”
  3. search for all items tagged VVP and listed under the lemma walk: “VVP,l:walk”
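One way to picture a comma-separated query is as a list of constraints that must all hold for the same token position. The Python sketch below is only a guess at such a representation: parse_item, LABELS and KNOWN_TAGS are hypothetical, and the rule used to tell an unlabelled tag from an unlabelled token (membership in a small tag set) is an assumption, since this manual does not describe how the interface makes that distinction.

```python
from typing import List, Tuple

# Labels described in section 10.2.2.
LABELS = {"p": "form_tag", "f": "function_tag", "l": "lemma"}
# Assumption: only the tags quoted in this section; the real tagset is larger.
KNOWN_TAGS = {"NN", "NNS", "VVN", "VVP"}


def parse_item(item: str) -> List[Tuple[str, str]]:
    """Turn one query item such as 'VVP,l:walk' into (kind, value) constraints."""
    constraints = []
    for part in item.split(","):
        label, sep, value = part.partition(":")
        if sep and label in LABELS:
            constraints.append((LABELS[label], value))
        elif part in KNOWN_TAGS:
            constraints.append(("tag", part))      # form or function tag
        else:
            constraints.append(("token", part))
    return constraints


for query in ("work,NN", "VVP,walk", "VVP,l:walk"):
    print(query, "->", parse_item(query))
```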

10.2.4 Wildcard searches

The wildcard characters ‘*’, standing for zero or more characters, and ‘.’, standing for a single character, can also be used (cf. Tables 1-5 for examples).
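The two wildcards can be approximated with regular expressions. The following Python sketch shows one possible translation; it assumes that a pattern must match the whole token and that matching is case-sensitive, neither of which is stated in this manual, and the function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate '*' (zero or more characters) and '.' (exactly one character)
    into a regular expression; every other character is matched literally."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == ".":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def wildcard_match(pattern: str, token: str) -> bool:
    """Assumption: the pattern has to cover the whole token."""
    return wildcard_to_regex(pattern).fullmatch(token) is not None


print(wildcard_match("work*", "working"))  # True
print(wildcard_match("work*", "network"))  # False
print(wildcard_match("w.rk", "work"))      # True
```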

10.2.5 Limitation to speech events, domains, speech event types

Queries can be limited to particular speech events, domains or speech event types, using e: or evt:.

Examples:

  1. work*,e:PBmtg3 limits the query to the speech event PBmtg3
  2. work*,e:PB* limits the query to the domain Professional/Business
  3. work*,e:*mtg* limits the query to the speech event type of meetings
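The three queries above can be mimicked on a small list of (event ID, token) pairs. The Python sketch below is illustrative only: filter_hits and _wild are hypothetical helpers, the sample data is invented (only the ID PBmtg3 appears in this manual), the evt: label is not modelled, and whole-string matching of wildcards is an assumption.

```python
import re
from typing import List, Tuple


def _wild(pattern: str, text: str) -> bool:
    """Full match where '*' is any run of characters and '.' any single character."""
    regex = "".join(
        ".*" if c == "*" else "." if c == "." else re.escape(c) for c in pattern
    )
    return re.fullmatch(regex, text) is not None


def filter_hits(hits: List[Tuple[str, str]], query: str) -> List[Tuple[str, str]]:
    """Keep (event_id, token) pairs that satisfy a query such as 'work*,e:PB*'."""
    token_pattern, event_pattern = "*", "*"
    for part in query.split(","):
        if part.startswith("e:"):
            event_pattern = part[2:]
        else:
            token_pattern = part
    return [
        (event, token)
        for event, token in hits
        if _wild(event_pattern, event) and _wild(token_pattern, token)
    ]


# Invented sample data; PBcon1 and EDmtg2 are made-up IDs for illustration.
sample = [("PBmtg3", "working"), ("PBcon1", "work"), ("EDmtg2", "works"), ("PBmtg3", "team")]

print(filter_hits(sample, "work*,e:PBmtg3"))
print(filter_hits(sample, "work*,e:PB*"))
print(filter_hits(sample, "work*,e:*mtg*"))
```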

10.2.6 Contracted forms

VOICE transcripts are tokenized, so contracted forms, for example, are split into two tokens: you + 're. As a consequence, a query for a contracted form in VOICE POS Online must include a space before the contracted part.

[Screenshot] How to search for contracted forms
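A query typed in ordinary spelling could be adapted to this tokenization by inserting the required space automatically. The Python sketch below shows the idea; it is not part of VOICE POS Online. The function space_contractions is hypothetical, and the list of contracted parts it recognizes ('re, 'm, 's, 've, 'll, 'd) is an assumption, since this manual only gives you + 're as an example.

```python
import re

# Assumption: contracted parts consist of an apostrophe (straight or curly)
# followed by one of these endings; other contractions are left untouched.
_CLITIC = re.compile(r"(?<=\w)(['’](?:re|m|s|ve|ll|d))\b")


def space_contractions(query: str) -> str:
    """Insert a space before each contracted part, e.g. "you're" -> "you 're"."""
    return _CLITIC.sub(r" \1", query)


print(space_contractions("you're"))     # you 're
print(space_contractions("i'm sorry"))  # i 'm sorry
```

Because query items are separated by spaces, the rewritten query then reads as a sequence of two tokens, which matches how the contracted form is stored.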

10.2.7 Further examples of query usage

[Screenshot] Table 1: Query by token
[Screenshot] Table 2: Query by POS tag
[Screenshot] Table 3: Query by lemma
[Screenshot] Table 4: Combined query on a single token, separated by comma
[Screenshot] Table 5: Queries on a sequence of tokens

10.3 Exporting results

Users can export results by clicking one or more rows in the search result and pressing Ctrl+C. A separate window will open with a text version of the selected results, which can then be copied and exported.

[Screenshot] Exporting search results from VOICE POS Online

10.4 Annotation differences between VOICE Online and VOICE POS Online

The mark-up in VOICE Online and VOICE POS Online differs for a number of items. The table below gives an overview of these differences in order to facilitate working with both versions of VOICE in parallel.

[Screenshot] Annotation differences between VOICE Online and VOICE POS Online

10.5 Additional help

For more information on the guiding principles and the decisions taken for the tagging and lemmatization of VOICE, click on the icon in the left-hand corner of the VOICE POS Interface, which will redirect you to the VOICE Tagging and Lemmatization Manual. Clicking the icon will open a PDF file with a short version of the VOICE Tagset.
[Screenshot] Additional help in VOICE POS Online

10.6 Recommended citation

The recommended citation for section 10, “VOICE POS Online”, in this manual is:

VOICE Project. 2013. “VOICE POS Online”. Using VOICE Online. Vienna. http://univie.ac.at/voice/help/pos (date of last access).