International Corpus of English

Using VOICE-Online

8 Audio files

As of the end of November 2010, VOICE 1.0 Online includes an additional feature: Selected audio files are made available as audio streams in the VOICE Online interface. This section offers information on how the audio files can be accessed and listened to online and provides some background on the selection and anonymization of the audio files.

8.1 Selected audio files

The 23 corpus texts with available audio files are indicated via an audio icon (loudspeaker symbol) next to the event ID in the corpus tree.
Corpus texts with audio icons in corpus tree
When you click on an audio icon, the complete corpus text is displayed in the content area and an audio player appears at the top of the interface.
Corpus text with audio player
The audio player appears in the full text view (in VOICE style and plain style) and the text header view of corpus texts for which an audio file is available. The audio player does not appear with corpus texts for which no audio file is available. It also does not appear when a list of search results is being displayed.

8.2 Playing an audio file

Streaming an audio file can be started by clicking on the play symbol in the audio player.
Playing an audio file (time elapsed)
Clicking on the pause symbol, which appears as soon as an audio file is being streamed, pauses the playback. Subsequent clicks on play and pause allow you to continue or pause the playback.
The duration of the audio file which has elapsed thus far is displayed in the light-grey display area of the audio player in hh:mm:ss format. Clicking on the duration changes the time format from hh:mm:ss (time elapsed) to -hh:mm:ss (time remaining).
Playing an audio file (time remaining)
Clicking on the digits again changes the time format back to the time elapsed.
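
The two time formats can be illustrated with a short sketch. It is not the code used by the VOICE Online audio player; the function names `format_clock` and `display_time` are hypothetical, and durations are assumed to be given in whole seconds.

```python
def format_clock(seconds: int) -> str:
    """Render a non-negative number of seconds as hh:mm:ss."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def display_time(elapsed: int, total: int, show_remaining: bool = False) -> str:
    """Return the text for the display area in the chosen time format."""
    if not 0 <= elapsed <= total:
        raise ValueError("elapsed must lie between 0 and total")
    if show_remaining:
        # Time remaining is shown with a leading minus sign: -hh:mm:ss
        return "-" + format_clock(total - elapsed)
    return format_clock(elapsed)


# A one-hour recording, 12 minutes and 34 seconds into playback:
print(display_time(754, 3600))                       # 00:12:34
print(display_time(754, 3600, show_remaining=True))  # -00:47:26
```

Both formats describe the same position in the file; only the reference point (start or end of the recording) differs.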

The light-blue time bar at the bottom of the display area represents the entire duration of the audio file which is being streamed. This duration can range from a few minutes to over an hour, depending on the duration of the speech event. Irrespective of the duration of the selected audio file, the length of the time bar is fixed. When an audio file is being played, the blue controller in the time bar gradually moves to the right, approaching the end of the bar as the streaming approaches the end of the audio file.

8.3 Navigating through an audio file

Navigating through an audio file is possible either by manipulating the blue controller in the time bar or by clicking on the two arrow symbols in the audio player.

Moving the mouse cursor to a particular position on the time bar and clicking on this position moves the controller to this position (point-and-click function). The audio file automatically jumps to the selected point in time, and the duration shown as time elapsed or time remaining changes accordingly. If the point-and-click function is used in pause mode, playback of the file starts automatically as soon as the controller has been moved to the new position in the file. Click on the pause symbol in order to pause the playback again.
Navigating through an audio file: point-and-click function; ±10 second arrows
Clicking on the left and right arrows in the audio player allows you to navigate through the audio file at a more local level. Click on the left arrows to jump back 10 seconds and on the right arrows to jump forward 10 seconds. These actions can be performed repeatedly in play as well as in pause mode. The new position in the file is always reflected in the time elapsed or time remaining.
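
The arithmetic behind the point-and-click function and the ±10 second arrows can be sketched as follows. This is an illustration only, not the player's actual code: the names `seek_by_click` and `skip` are hypothetical, and clamping at the start and end of the file is an assumption, since the behaviour at the file boundaries is not documented here.

```python
SKIP_SECONDS = 10  # step size of the arrow buttons, as described above


def seek_by_click(click_x: float, bar_width: float, total: float) -> float:
    """Map a click position on the fixed-length time bar to a time in seconds."""
    if bar_width <= 0:
        raise ValueError("bar_width must be positive")
    # The bar has the same length for every file, so only the fraction matters.
    fraction = min(max(click_x / bar_width, 0.0), 1.0)
    return fraction * total


def skip(position: float, total: float, direction: int) -> float:
    """Jump 10 s back (direction=-1) or forward (direction=+1).

    Assumption: the result is clamped to the range 0..total.
    """
    new_position = position + direction * SKIP_SECONDS
    return min(max(new_position, 0.0), total)


# Clicking in the middle of the bar of a one-hour recording:
print(seek_by_click(150, 300, 3600))  # 1800.0
# Jumping back from second 5 cannot go before the start of the file:
print(skip(5, 3600, -1))              # 0.0
```

Because the time bar has a fixed length, the same click position corresponds to a later point in a long recording than in a short one.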

8.4 List of corpus texts with audio files

In order to view a comprehensive list of all 23 speech events in VOICE 1.0 Online for which audio files are available, click on the word "audio" near the logout button in the top-left corner of the VOICE Online interface. Clicking on "audio" will take you to a subpage which lists all 23 event IDs together with their recording durations.
List of corpus texts with audio files
Clicking on an audio icon next to an event ID will take you to the full text view of the selected speech event in which playback of the audio file can be started with the help of the audio player. Click on the "audio" link again in order to return to the complete list of speech events with audio files.

8.5 Remarks on selection and anonymization of audio files

The audio files made available in the VOICE Online interface were selected according to permissions, quality of recording, diversity of domains and speech event types as well as diversity of individual speakers and L1 backgrounds. The 23 audio files which were selected comprise about 200,000 words, i.e. about a fifth of the corpus, and have an approximate duration of 22 hours.

Given the criteria mentioned above, audio files were selected in order to provide corpus users with a sample of the actual recordings of ELF speech events on which the transcripts were based. The speech events which can now be listened to in VOICE 1.0 Online vary in terms of length, setting, type of activity, transactional and interactional goals, proficiency of individual ELF speakers, acquaintedness of speakers, degree of interactivity and, of course, linguistic content (such as level of formality/informality, politeness, topic of discussion, aspects of grammar, lexis and pronunciation, etc.). Since a single speech event can never be representative, we encourage you to listen to different audio files in order to get an impression of the diversity of ELF interaction.

For reasons of confidentiality, all audio files were anonymized in terms of sensitive content through post-processing of the audio recordings. In accordance with the guiding principle for anonymization adopted in the VOICE Mark-Up Conventions [2.1], names of people, companies, organizations, institutions as well as some locations, potentially sensitive unintelligible speech and foreign language output and gaps in transcription are intentionally replaced by a uniform 0-1 Hz noise in the audio file in order to protect the speakers' identities and personal details.

8.6 Tips for working with audio files

It is possible to scroll up and down in a transcript while listening to the audio file. Similarly, it is possible to navigate to the text header of a speech event and then back to the transcript while playing the audio file of this speech event.
If you get lost in a transcript:
Most internet browsers provide a search function (usually accessible with Ctrl+F, labeled STRG+F on German keyboards) which can be used to look for words or phrases on a web page. If you lose track of where you are in a transcript, i.e. if you want to find the location of a word, phrase or utterance you have just heard, try using this search function: type in a salient word or phrase you hear on the recording and search for it in order to locate its exact position in the transcript.
If you find a particularly interesting passage in a speech event, create a bookmark for an utterance within this passage and save it in your personal user account.
Retrieving audio passages with the help of bookmarks:
Bookmarks are labeled by each corpus user according to his/her personal preferences. A label for a bookmark can also include the position of the bookmarked utterance in the audio file. A bookmark might thus be labeled "text 00:12:34", for example. If you access your personal bookmarks archive, a bookmark label which includes information on the position of the bookmarked utterance within the audio file will thus help you to retrieve the passage in the recording. Click on the bookmark to retrieve the bookmarked utterance. Once the bookmarked utterance is displayed in the context of the complete corpus text in the display area, use the point-and-click function of the audio player to navigate to the position in the audio file. NB. Click on the play button once in order to load the audio file before using the point-and-click function.
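
If you keep your bookmark labels in a list of your own, the position information can be read out and converted with a few lines of code. The following is a minimal sketch under stated assumptions: labels contain a time stamp in hh:mm:ss format, as in the example "text 00:12:34", and the names `label_to_seconds` and `bar_fraction` are hypothetical. It does not use or represent any VOICE Online function.

```python
import re
from typing import Optional

# Assumed label convention: a time stamp in hh:mm:ss format somewhere in the label.
TIMESTAMP = re.compile(r"\b(\d{2}):([0-5]\d):([0-5]\d)\b")


def label_to_seconds(label: str) -> Optional[int]:
    """Return the position in seconds stored in a bookmark label, if any."""
    match = TIMESTAMP.search(label)
    if match is None:
        return None  # the label carries no position information
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def bar_fraction(position: int, total: int) -> float:
    """Return how far along the time bar a position lies (0.0 to 1.0)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return min(max(position / total, 0.0), 1.0)


position = label_to_seconds("text 00:12:34")
print(position)  # 754
```

For a one-hour recording, `bar_fraction(754, 3600)` is roughly 0.21, i.e. the bookmarked utterance lies about a fifth of the way along the time bar. This gives a starting point for the point-and-click function, after which the ±10 second arrows can be used for fine adjustment.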

8.7 Technical requirements for playing audio files

In addition to the general requirements and recommendations for accessing VOICE 1.0 Online, the following prerequisites have to be met:
  • A current version of the Adobe Flash Player
  • If a flashblock plugin is installed, it has to be ensured that it is not enabled for
For general specifications of the technical requirements for accessing and using VOICE 1.0 Online see the section on Browser recommendations.