“Appen’s platform is really easy to use. What makes it great is you can reach so many different channels because of its global outreach.” – Kenneth Benoit, Director of the Data Science Institute, LSE
The University
Founded in 1895, the London School of Economics and Political Science (LSE) has long been a global leader in the social sciences. One of its many research wings, the Data Science Institute (DSI), focuses on studying data science as it pertains to social, political, and economic issues. Experiments cover a range of human matters and frequently include data annotation projects that require human labels.
The Challenge
Researchers led by Kenneth Benoit at the Department of Methodology set out to study political texts, in both their content and their sophistication. With the first project, their interest was in capturing the content of the messages that political actors send to others and, further, using those discoveries to calculate political party positions. They found that relying on expert researchers to go through these messages was time-consuming, expensive, and nearly impossible to scale. Plus, using only experts in a field would provide a narrower perspective, making the data potentially more biased and less reliable. The team needed a more agile, reproducible process for data labeling to replace their current approach.
With the second project, researchers set out to identify indicators that would measure the sophistication, or readability, of political texts. To do so, they needed a large and varied sample of texts and numerous human labelers to compare texts to one another. They also wanted to reproduce the experiment across several languages, which would require labelers fluent in each language. Again, finding experts in each of these languages was difficult, expensive, and time-consuming. At the time, they were partnering with an organization that was limited in the languages it could support, making it impossible to translate these political texts into all the languages they’d envisioned. The reporting needed for the research project was also not available through their provider, so they had to calculate their own validity checks, something critical for research papers.
The Solution
“Appen’s reporting features were very useful, as was knowing the completion times, the responses, and the reliability scores of the crowd.” – Kenneth Benoit, Director of the Data Science Institute, LSE
The research team engaged us (at the time known as CrowdFlower) in 2015 after meeting at a conference. Our platform had several features that they needed:
- a dashboard that included important validation metrics, such as confidence checks
- a user-friendly interface, so setting up jobs was a quick process
- access to an unrestricted global Crowd of contributors