Language: Pashto
DB Name: PAS_ASR002
Product Type: Conversational microphone data
Environment: Low background noise
Speakers: 40
Audio Hrs: 78
kHz: 16
Channels: 2

  • Each recording consists of a number of TransTAC style dialogues (monolingual 2-way conversations). One speaker acts as an interviewer and the other as the interviewee
  • The interviewer appears in more than one set of dialogues but the interviewee is unique for each set
  • Data collection scenarios are similar to TransTAC style (e.g. civil affairs, checkpoints)
  • Demographic information is as follows:
      • Roughly 25% female and 75% male speakers
      • Broad range of ages from 18 to 55 years
      • Broad distribution across two dialect regions in Afghanistan
  • 40 hours of conversation data (equivalent to 80 hours of single channel audio)
  • Database is fully transcribed and time stamped
  • Database is accompanied by a pronunciation lexicon containing all transcribed words
  • A full translation of the transcripts into French is also available as an optional additional purchase
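
The relationship between conversation hours and single channel hours stated above can be checked with a short calculation. The following is a minimal sketch, not part of the product: the function names are hypothetical, and the 2 bytes per sample (16-bit PCM) used for the storage estimate is an assumption, since the listing gives only the sample rate and channel count.

```python
def single_channel_hours(conversation_hours, channels):
    """Each channel of a conversation counts as its own stream of audio."""
    return conversation_hours * channels


def raw_pcm_bytes(audio_hours, sample_rate_hz, bytes_per_sample):
    """Uncompressed size of single channel audio of the given duration."""
    seconds = audio_hours * 3600
    return int(seconds * sample_rate_hz * bytes_per_sample)


# 40 hours of 2-channel conversation, as stated in the listing
hours = single_channel_hours(40, 2)

# 16 kHz is from the listing; 2 bytes per sample is an ASSUMPTION (16-bit PCM)
size_bytes = raw_pcm_bytes(hours, 16_000, 2)

print(f"single channel audio: {hours} hours")
print(f"estimated raw size: {size_bytes / 1e9:.1f} GB")
```

Under these assumptions the estimate comes to roughly 9 GB of uncompressed audio. The actual delivery size depends on the file format and bit depth, which are not specified here.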