Off-the-shelf machine learning dataset repository from Appen. Find 250+ datasets across 80 languages and dialects for a variety of common AI and ML use cases.
This dataset contains thousands of audio utterances for common medical symptoms like “knee pain” or “headache,” totaling more than 8 hours in aggregate. Each utterance was recorded by an individual human contributor based on a given symptom. These audio snippets can be used to train conversational agents in the medical field.
This dataset was created via a multi-job workflow. In the first job, contributors wrote text phrases describing a given symptom. For example, for “headache,” a contributor might write “I need help with my migraines.” Subsequent jobs captured audio utterances for the accepted text strings.
This dataset contains both the audio utterances and corresponding transcriptions.
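As a rough illustration of how the audio utterances, transcriptions, and symptom prompts relate, the sketch below groups utterances by prompt. The column names (`file_name`, `phrase`, `prompt`), the file names, and every sample row except the “migraines” phrase quoted above are hypothetical placeholders; the real metadata layout may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical metadata layout: column names and rows are illustrative only.
SAMPLE_CSV = """file_name,phrase,prompt
utt_0001.wav,I need help with my migraines,headache
utt_0002.wav,My knee hurts when I walk,knee pain
utt_0003.wav,My head has been pounding all day,headache
"""

def group_by_prompt(csv_text):
    """Map each symptom prompt to a list of (audio file, transcription) pairs."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["prompt"]].append((row["file_name"], row["phrase"]))
    return dict(groups)

groups = group_by_prompt(SAMPLE_CSV)
for prompt, items in groups.items():
    # Print each symptom prompt with the number of utterances collected for it.
    print(prompt, len(items))
```

Grouping by prompt in this way would give labeled (utterance, symptom) pairs, which is one plausible starting point for training an intent classifier for a medical conversational agent.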
This input data consists of symptom prompts. Human contributors wrote their text phrases based on these prompts, and the accepted phrases were then used to collect audio utterances later in this workflow. The “Data” tab above contains further information and data on the audio recordings that were eventually made from these prompts.