We are committed to delivering dependable solutions to power artificial intelligence applications, and our Crowd plays a crucial role in accomplishing this objective. With a global community of over one million contributors, our diverse Crowd provides invaluable feedback on our clients' AI models. Their collective expertise enhances operational efficiency and customer satisfaction, making them indispensable to our business success.
Given the significance of our Crowd, it is vital to consistently attract top-tier contributors who can provide quality feedback on our clients' models. To achieve this, we have implemented state-of-the-art machine learning and statistical models that quantify essential contributor traits such as behavior, reliability, and commitment. These advanced models offer crucial insights to our recruiting and Crowd management teams, enabling them to streamline processes, assign relevant tasks to the most qualified contributors, and meet our customers' talent requirements more effectively than ever before.
The challenge: finding the best contributors at scale
The challenge at hand is to identify the most skilled contributors for a specific task on a large scale. If our work at Appen involved only a limited number of AI models and a small group of individuals providing feedback, it would be a straightforward task to determine which contributors should receive priority for specific tasks. However, the reality is that we are often concurrently managing numerous projects for a single client that require extensive feedback from a diverse range of contributors. To effectively serve our clients, we must efficiently oversee hundreds of thousands of people across global markets and make dynamic decisions regarding the prioritization of their unique skills. This is where the field of data science comes into play, enabling us to navigate this complex landscape.
The approach: using data to inform staffing decisions
We are currently developing a robust model to evaluate contributors based on their profile information, historical behaviors, and business value. This model generates a score to assess their suitability for specific projects. By implementing a precise and logical scoring system, we empower our operations teams to efficiently screen, process, and support our contributors.
Our primary goal is to achieve high accuracy and efficiency while working within limited time and resources. Here's how our data-driven system will assist us in making well-informed decisions regarding contributor management and recruitment:
- Real-time data retrieval and transformation: We collect and transform up-to-date raw data about our Crowd.
- Comprehensive contributor features: We extract measurable data points, consolidate them, and generate new, more relevant features that capture signals of contributor quality. This enables us to make informed calculations and rankings (a brief sketch appears below).
- Fine-tuning for optimal performance: We thoroughly validate and fine-tune our models to meet the unique requirements of different use cases. This ensures reliable and accurate insights.
- Interpretability and scalability: We prioritize making our features interpretable for a wide user base and usable for a large set of use cases. This allows us to adapt and scale our system to evolving needs.
- Optimizing resource allocation: Our system addresses operational prioritization, optimizing resource allocation to make the most efficient use of our limited resources.
The result? Streamlined project delivery and an exceptional experience for our contributors and clients.
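To make the data retrieval and feature points above more concrete, here is a minimal sketch of turning raw activity data into per-contributor features. It assumes pandas and a handful of hypothetical columns (contributor_id, task_completed, quality_score, hours_worked); the names and values are illustrative only, not our production schema.

```python
import pandas as pd

# Hypothetical raw task-level events; the columns and values are
# illustrative only, not our production schema.
events = pd.DataFrame({
    "contributor_id": [1, 1, 1, 2, 2, 3],
    "task_completed": [1, 1, 0, 1, 0, 1],
    "quality_score": [0.92, 0.88, None, 0.75, None, 0.97],
    "hours_worked": [2.0, 1.5, 0.5, 3.0, 1.0, 4.0],
})

# Consolidate raw events into per-contributor features that downstream
# scoring and ranking steps can consume.
features = events.groupby("contributor_id").agg(
    tasks_attempted=("task_completed", "size"),
    completion_rate=("task_completed", "mean"),
    avg_quality=("quality_score", "mean"),
    total_hours=("hours_worked", "sum"),
).reset_index()

print(features)
```

Aggregates such as completion rate and average quality become the measurable inputs that the scoring steps described later consume.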
A comprehensive perspective: the end-to-end data and operational procedures
With that overview of our strategy in place, let's take a closer look at how the technology works. We'll walk through the end-to-end data and operational procedures that are reshaping our approach to contributor management and recruitment.
1. Building a solid foundation: constructing the feature store
To ensure a thorough representation of contributors, we construct a feature store. This hub serves as an organized repository for capturing vital information related to their readiness, reliability, longevity, capacity, engagement, lifetime value, and other quality assessment signals. By generating detailed profiles, this powerful store enables us to precisely evaluate the quality of contributors.
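As an illustration only, a single feature-store record might look like the sketch below. The field names mirror the signals listed above; the dataclass representation and example values are assumptions, not our actual storage layer.

```python
from dataclasses import dataclass, asdict

# A simplified view of one feature-store record. The field names mirror
# the quality signals described above; the real store holds many more
# features and is backed by a data platform rather than a dataclass.
@dataclass
class ContributorFeatures:
    contributor_id: int
    readiness: float        # e.g., profile completeness and qualification status
    reliability: float      # e.g., historical on-time completion rate
    longevity: float        # e.g., tenure on the platform, in months
    capacity: float         # e.g., available hours per week
    engagement: float       # e.g., recent activity level
    lifetime_value: float   # e.g., cumulative business value delivered

record = ContributorFeatures(
    contributor_id=42,
    readiness=0.9, reliability=0.85, longevity=18.0,
    capacity=20.0, engagement=0.7, lifetime_value=1250.0,
)
print(asdict(record))
```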
2. Addressing the "cold start" challenge
We acknowledge that newly registered contributors present a unique onboarding and evaluation challenge. To overcome the potential limitations of a "cold start," we leverage the collective knowledge of contributors within the same locales. By approximating descriptions based on statistically aggregated group data, we ensure inclusivity and extend our reach to a diverse pool of talent.
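Here is a minimal sketch of that fallback, assuming pandas and hypothetical columns: missing values for a brand-new contributor are filled with the median of contributors in the same locale.

```python
import pandas as pd

# Hypothetical contributor features; contributor 4 is newly registered
# and has no history yet. Columns and values are illustrative.
contributors = pd.DataFrame({
    "contributor_id": [1, 2, 3, 4],
    "locale": ["en_US", "en_US", "fr_FR", "fr_FR"],
    "reliability": [0.90, 0.80, 0.75, None],
    "engagement": [0.60, 0.70, 0.50, None],
})

# Approximate the missing values with the median of contributors in the
# same locale, so new contributors start from a group-level estimate
# instead of an empty profile.
locale_medians = contributors.groupby("locale")[["reliability", "engagement"]].transform("median")
cold_start_filled = contributors.fillna(locale_medians)

print(cold_start_filled)
```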
3. Choose, apply, and refine: unleashing the power of algorithms
At Appen, we use many ranking heuristics and algorithms to evaluate our data. Among the most effective are multiple-criteria decision-making algorithms. This lightweight yet powerful methodology handles scores, weights, correlations, and normalizations, replacing subjective judgment with objective contributor assessments.
The following diagram illustrates, at a high level, how multiple-criteria decision-making algorithms solve a ranking and selection problem with many available options.
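The section doesn't pin down a specific method, so as a complement to the diagram, the sketch below uses a simple weighted-sum approach, one of the most basic multiple-criteria decision-making techniques, with min-max normalization. The criteria values and weights are illustrative assumptions, not our production configuration.

```python
import numpy as np

# Hypothetical decision matrix: rows are contributors, columns are criteria
# (e.g., reliability, engagement, lifetime value). Values are illustrative.
criteria = np.array([
    [0.90, 0.60, 1250.0],
    [0.80, 0.70,  300.0],
    [0.75, 0.50,  980.0],
])
weights = np.array([0.5, 0.2, 0.3])   # sum to 1; higher is better for every criterion

# Min-max normalize each criterion so different scales are comparable.
mins, maxs = criteria.min(axis=0), criteria.max(axis=0)
normalized = (criteria - mins) / (maxs - mins)

# A weighted sum gives each contributor a single score; rank descending.
scores = normalized @ weights
ranking = np.argsort(-scores)

for rank, idx in enumerate(ranking, start=1):
    print(f"rank {rank}: contributor {idx} (score {scores[idx]:.3f})")
```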
4. Model training and experimentation: tailoring to unique business requirements
Considering our diverse range of use cases, recruiting and crowd management teams often require different prioritizations based on specific business needs. We adopt a grid search approach to model training, exhaustively exploring all possible combinations of scoring, weighting, correlation, and normalization methods. This process implicitly "learns" the optimal weights for input features, ensuring a tailored approach to each unique business use case.
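As a rough illustration of that grid search, the sketch below exhaustively tries combinations of normalization schemes and weight vectors and keeps the one that correlates best with a held-out quality signal. The candidate normalizers, the weight grid, and the synthetic evaluation target are assumptions made for the example, not our actual search space.

```python
from itertools import product
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((50, 3))            # hypothetical contributor features
y = X @ np.array([0.6, 0.3, 0.1])  # hypothetical "true" quality, used only for evaluation

def minmax(a):
    return (a - a.min(axis=0)) / (a.max(axis=0) - a.min(axis=0))

def zscore(a):
    return (a - a.mean(axis=0)) / a.std(axis=0)

normalizers = {"minmax": minmax, "zscore": zscore}
weight_grid = [np.array(w) for w in product([0.2, 0.5, 0.8], repeat=3)]
weight_grid = [w / w.sum() for w in weight_grid]   # keep weights summing to 1

best = None
for norm_name, norm_fn in normalizers.items():
    for w in weight_grid:
        scores = norm_fn(X) @ w
        # Evaluate each candidate by its correlation with the held-out signal.
        corr = np.corrcoef(scores, y)[0, 1]
        if best is None or corr > best[0]:
            best = (corr, norm_name, w)

print(f"best: corr={best[0]:.3f}, normalizer={best[1]}, weights={np.round(best[2], 2)}")
```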
5. Simulating A/B testing: choosing the best model candidates
To select the models that best align with our clients' business use cases, we conduct rigorous A/B testing. By simulating the effects of new model deployments and replacements, we compare different versions in the experiment group against a control group. We meticulously analyze contributor progress, measuring the count and percentage of contributors transitioning between starting and ending statuses. This data-driven approach helps us identify the model candidates that yield the most significant improvements over our current baseline.
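Here is a minimal sketch of how such a comparison might be tallied, assuming hypothetical status logs and group assignments; the statuses, groups, and counts are illustrative only.

```python
import pandas as pd

# Hypothetical simulation output: each contributor's group assignment plus
# their status at the start and end of the simulated period.
log = pd.DataFrame({
    "group": ["control"] * 4 + ["experiment"] * 4,
    "start": ["registered"] * 8,
    "end":   ["registered", "qualified", "active", "registered",
              "qualified", "active", "active", "qualified"],
})

# Count and percentage of contributors making each start -> end transition,
# per group, which is the comparison described above.
transitions = (
    log.groupby(["group", "start", "end"]).size().rename("count").reset_index()
)
transitions["pct"] = (
    transitions["count"] / transitions.groupby("group")["count"].transform("sum") * 100
)

print(transitions)
```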
6. Interpretation and validation: understanding the models
Once we have a set of predictions and comparisons, we dive deep into understanding and validating the models. We review model parameters, including weights, scores, correlations, and other modeling details, alongside our business operation partners. Their valuable insights and expertise ensure that the derived parameters align with operational standards, allowing us to make informed decisions and provide accurate assessments.
7. Expanding insights: additional offerings from our ML models
Our machine learning (ML) models not only provide scores and rankings but also enable us to define contributor quality tiers. By discretizing scores and assigning quality labels such as Poor, Fair, Good, Very Good, and Exceptional, we offer a consistent and standardized interpretation of quality measurements. This enhancement reduces manual efforts, clarifies understanding, and improves operational efficiency.
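For illustration, tier assignment can be as simple as binning scores into labeled ranges; the cut points below are placeholders rather than our calibrated thresholds, which are tuned per use case.

```python
import pandas as pd

scores = pd.Series([0.12, 0.35, 0.58, 0.74, 0.91], name="score")

# Bin continuous scores into the quality tiers described above.
# The bin edges here are placeholders; in practice they are tuned per use case.
tiers = pd.cut(
    scores,
    bins=[0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
    labels=["Poor", "Fair", "Good", "Very Good", "Exceptional"],
    include_lowest=True,
)
print(pd.DataFrame({"score": scores, "tier": tiers}))
```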
The future is data-driven, efficient, and inclusive
Contributor recruitment and management are complex processes, but through data-driven decisions and intelligent resource allocation, we're transforming the business landscape. By prioritizing relevant contributors based on their qualities, we optimize project delivery, create delightful customer experiences, and achieve a win-win-win outcome for Appen, our valued customers, and our dedicated contributors.
Together, let's unlock the power of AI for good and shape a future where technology drives positive change. Join us on this exciting journey as we build a better world through AI.