Search Relevance Optimization with Machine Learning Techniques
When you type a query into your favorite online retailer’s search bar, what kind of results do you expect? Most customers today expect search results that are accurate, relevant, and delivered instantly. They also expect a personalized experience that anticipates their needs, even if the wording or spelling is a bit off or the query is somewhat vague. To excel in these areas, companies are turning to artificial intelligence (AI) to power their search engines.
Optimizing for search relevance (how closely the search results match what the query was asking for) is critical for many large organizations with extensive websites, especially online retailers. Over 40% of customers go directly to the search bar first, meaning it’s the first impression they have of the site. The search engine directs customers to the products and information they want and, ideally, leads to sales. An engine with high search relevance produces greater customer satisfaction, conversion, and retention, while a bad search experience is likely to frustrate customers and hurt all three.
Most companies understand how integral search is to the customer experience, but how does search relevance actually work? Many are investing in machine learning (ML) techniques to improve it, using customer behavior data and analytics to connect customers to the things they want most.
AI-Powered Search vs. Basic Search
Search has evolved over time. In the past, search engines typically counted the number of times the query keywords appeared on a web page; the higher the count, the higher that item (website, product, etc.) ranked in the search results. This basic method of matching a query to text in a document is still used on many sites, especially those owned by smaller companies. But in 1998, Google entered the scene and changed expectations by applying advanced statistical analysis techniques to interpret and categorize queries.
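To make the basic approach concrete, here is a minimal sketch of keyword-count ranking. The function names and the sample product texts are our own illustrations, not taken from any particular search engine.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_count_rank(query, documents):
    """Rank documents by how many times the query keywords appear in them.

    `documents` maps a (hypothetical) document id to its text.
    Documents with no keyword matches are left out of the results.
    """
    terms = set(tokenize(query))
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in terms)
        if score > 0:
            scored.append((doc_id, score))
    # Highest count first; ties are broken by document id for stability.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


docs = {
    "a": "Queen bedding set with soft cotton bedding sheets",
    "b": "Cotton bath towel",
    "c": "Bedding storage bag",
}
ranking = keyword_count_rank("bedding", docs)
print(ranking)
```

Because the score is only a raw count, a page that repeats a keyword many times will outrank a more useful page that mentions it once, which is one reason this method has been supplemented by statistical ranking.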
Today, many companies use statistical analysis powered by AI to drive search results. This is partly because results have become more complex. Content now includes not just text but also tags, descriptions, category markers, and other searchable metadata. In addition, companies now want to factor in their business priorities, the user’s geolocation, the user’s past behavior, and other contextual signals when determining how relevant a piece of content is for each individual. These complications call for more intricate algorithms to interpret queries and produce results.
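One simple way to picture how these signals could be combined is a weighted score. The sketch below is an assumption for illustration only: the field names, the weights, and the proximity formula are hypothetical, and production systems typically learn such weights from data rather than setting them by hand.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    text_match: float             # 0..1, how well the text matches the query
    business_priority: float      # 0..1, e.g. a promoted or in-stock item
    distance_km: float            # distance from the user to the nearest store
    past_clicks_in_category: int  # the user's earlier clicks in this category


# Hypothetical hand-picked weights; they sum to 1 for readability only.
WEIGHTS = {"text": 0.6, "business": 0.1, "proximity": 0.1, "affinity": 0.2}


def blended_score(c: Candidate) -> float:
    """Combine text relevance with contextual signals into one score."""
    proximity = 1.0 / (1.0 + c.distance_km / 10.0)     # closer is better
    affinity = min(c.past_clicks_in_category, 5) / 5.0  # capped at 5 clicks
    return (
        WEIGHTS["text"] * c.text_match
        + WEIGHTS["business"] * c.business_priority
        + WEIGHTS["proximity"] * proximity
        + WEIGHTS["affinity"] * affinity
    )


def rank(candidates):
    """Return candidates ordered from highest to lowest blended score."""
    return sorted(candidates, key=blended_score, reverse=True)


candidates = [
    Candidate("generic comforter", 0.9, 0.0, 100.0, 0),
    Candidate("store-brand comforter", 0.8, 1.0, 0.0, 5),
]
ranked_names = [c.name for c in rank(candidates)]
print(ranked_names)
```

In this example the second item wins despite a slightly weaker text match, because the contextual signals favor it for this particular user.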
AI is further able to differentiate between low-quality content and high-quality content and rank these accordingly. For example, AI can identify search engine optimization (SEO) techniques that are attempting to unfairly benefit from an algorithm (such as stuffing keywords and invisible text into a product description or webpage) and place those search results below high-quality, intent-driven results.
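As a rough illustration of one signal such a system might look at, the sketch below flags text in which a single word makes up an unusually large share of the content. This is a hand-written heuristic that we are substituting for a trained model, and the thresholds are arbitrary assumptions.

```python
import re
from collections import Counter


def keyword_density(text):
    """Return the most frequent word and its share of all words in the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if not tokens:
        return "", 0.0
    word, count = Counter(tokens).most_common(1)[0]
    return word, count / len(tokens)


def looks_stuffed(text, max_density=0.3, min_tokens=8):
    """Flag text where one word dominates (thresholds are hypothetical)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if len(tokens) < min_tokens:
        return False  # too short to judge
    _, density = keyword_density(text)
    return density > max_density


stuffed = "bedding bedding cheap bedding best bedding buy bedding bedding sale"
normal = "Soft cotton queen bedding set with two matching pillow cases"
print(looks_stuffed(stuffed), looks_stuffed(normal))
```

A real system would combine many such signals, including hidden-text detection, instead of relying on a single density threshold.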
Challenges in Improving Relevance
Customers have never been more demanding; expectations are sky-high for tailored experiences and easy pathways to what we need and want. Yet people are all very different. I might type in a search query for “queen bedding,” and you might type “comforter,” yet we could both be looking for the same product. People have different ways of asking the same question, and learning all of these possible variations is difficult for any algorithm.
Before even handling these variations, though, algorithms must understand our language at an elementary level. Natural language understanding is the discipline in which machines learn to interpret human language. For successful search relevance, the model must be able to detect, for instance, what the word “bedding” means and deliver the appropriate results. To make it more complex, that search engine should also guess that when I type “beding” I actually meant “bedding.” Because misspellings, typos, and grammatical errors are so common, models must take them into account.
Teaching a search engine to understand our natural language requires massive amounts of training data. This can be a discouraging hurdle for many companies, especially small and medium-sized ones, given the perceived expense, time, and effort required to collect and prepare such data.
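The “beding” example can be sketched with Python’s standard `difflib` module, which finds the closest match in a list of known words. The small vocabulary and the similarity cutoff below are assumptions for illustration; a real engine would draw on a far larger vocabulary and on learned models.

```python
import difflib

# Hypothetical vocabulary of terms known to the product catalog.
VOCABULARY = ["bedding", "comforter", "pillow", "blanket", "queen", "sheet"]


def correct_token(token, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest known word, or the token unchanged if none is close."""
    if token in vocabulary:
        return token
    matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_query(query):
    """Correct each word of the query independently."""
    return " ".join(correct_token(t) for t in query.lower().split())


corrected = correct_query("queen beding")
print(corrected)
```

Words with no close match are passed through unchanged, so the correction never replaces a term the vocabulary simply does not cover.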
Companies shouldn’t be discouraged, though. See how Shotzr worked with Appen to identify over 17,000 images that did not require additional labeling so they could focus on the ones that did, improving the search relevance for stock photography.
ML-Based Methods for Search Relevance Optimization
AI-powered search engines rely on natural language processing (NLP) to read, understand, interpret, and analyze queries. As mentioned, these models targeting improved search relevance require training on natural language data. This data must cover millions of use cases and edge cases, which run from vague to precise. A good algorithm should provide optimal search results even when the query isn’t obvious.
There are numerous techniques within the natural language processing discipline, including semantic annotation, text analysis, and named entity recognition. Many of these are covered in our brief introduction to NLP. What’s important to know is that these techniques equip machines with the tools to parse texts and uncover their meaning. A search engine can use the derived meaning to detect optimal results for the query and rank those higher.
Search relevance models may also use click tracking, which records the results people select and uses that history to estimate which result is statistically most likely to be the best fit for an individual based on that person’s past queries.
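A minimal sketch of click tracking might look like the following. It keeps per-user counts of how often each result was shown and clicked for a query, then reorders results by a smoothed click rate. The class name, the smoothing constants, and the sample data are all hypothetical.

```python
from collections import Counter


class ClickTracker:
    """Track impressions and clicks per (user, query, result)."""

    def __init__(self, prior_clicks=1.0, prior_impressions=10.0):
        # The priors smooth the rate so rarely shown results are not
        # judged on one or two impressions (assumed values).
        self.prior_clicks = prior_clicks
        self.prior_impressions = prior_impressions
        self.impressions = Counter()
        self.clicks = Counter()

    def record(self, user, query, result, clicked):
        key = (user, query, result)
        self.impressions[key] += 1
        if clicked:
            self.clicks[key] += 1

    def click_rate(self, user, query, result):
        key = (user, query, result)
        return (self.clicks[key] + self.prior_clicks) / (
            self.impressions[key] + self.prior_impressions
        )

    def rerank(self, user, query, results):
        """Order results from highest to lowest smoothed click rate."""
        return sorted(
            results,
            key=lambda r: self.click_rate(user, query, r),
            reverse=True,
        )


tracker = ClickTracker()
for i in range(5):
    tracker.record("u1", "bedding", "duvet", clicked=False)
    tracker.record("u1", "bedding", "comforter", clicked=(i < 4))

reranked = tracker.rerank("u1", "bedding", ["duvet", "comforter", "sheet"])
print(reranked)
```

Here the result this user clicked most moves to the top, while the never-shown “sheet” falls back on the prior and lands between the other two.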
Specific search engines, such as Google Image Search or Adobe Stock Photos, require image analysis. Like NLP, image analysis is a technique that requires a tremendous amount of high-quality, annotated image data. Image analysis helps machines categorize images and image qualities into relevant, searchable characteristics.
In using any of these ML techniques, it’s best to have a human-in-the-loop approach to provide ground truth monitoring. For example, humans can rate whether specific queries are providing relevant search results. In erroneous cases, the human can provide feedback to the machine to improve its accuracy.
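To show how human ratings can feed back into the system, the sketch below turns per-result relevance ratings into a single quality score per query using normalized discounted cumulative gain (NDCG), a common ranking metric, and flags low-scoring queries for review. The 0 to 3 rating scale, the threshold, and the sample ratings are assumptions.

```python
import math


def dcg(ratings):
    """Discounted cumulative gain: higher-ranked results count for more."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(ratings))


def ndcg(ratings, k=None):
    """DCG of the shown order divided by the DCG of the ideal order.

    `ratings` are human relevance ratings in the order results were shown.
    """
    top = ratings[:k] if k else ratings
    ideal = sorted(ratings, reverse=True)[: len(top)]
    ideal_dcg = dcg(ideal)
    return dcg(top) / ideal_dcg if ideal_dcg > 0 else 0.0


def queries_needing_review(ratings_by_query, threshold=0.8):
    """Return queries whose rated results fall below the quality threshold."""
    return sorted(
        query
        for query, ratings in ratings_by_query.items()
        if ndcg(ratings) < threshold
    )


# Hypothetical ratings: 0 = irrelevant, 3 = highly relevant.
ratings_by_query = {
    "queen bedding": [3, 2, 0],  # best results already on top
    "beding": [0, 1, 3],         # best result buried in third place
}
flagged = queries_needing_review(ratings_by_query)
print(flagged)
```

The flagged queries are the ones where human feedback is most valuable, since the engine is ranking weaker results above better ones.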
As query inputs evolve, machines will have to adapt as well. Traditionally, queries have been text-based, but now we’re seeing opportunities to search using an image or query using your voice. These will add new but not insurmountable complexities to search engines.
Insight from Appen Search Relevance Expert Kelly Sinclair
At Appen, we rely on our team of experts to help you build cutting-edge, AI-powered search models that deliver relevant results. In turn, this provides a quality customer experience and improves business ROI. Kelly Sinclair, our Director of Client Services Delivery, is one of our team’s leading experts in ensuring customer success when implementing and improving search relevance with machine learning. Kelly’s top three insights on successful search relevance projects are:
- Identify the business needs. Relevance is challenging. It can rely heavily on many changing variables such as semantics, location, or context. User intent is critical, and judging it can be subjective. A query made on a mobile device can return quite different results from the same query made on a desktop. Success comes from a deep understanding of each project and its goals. These goals should be specific, measurable, achievable, and relevant.
- Establish clear goals and metrics for the project. Developing quality data is not instantaneous; it requires training, reinforcement, and expertise built up over time. To do this, we have to define what success looks like. These measurable outcomes should be agreed upon by all stakeholders involved. Projects are dynamic, and each iteration of the cycle should improve efficiency and data quality, so we should review the metrics regularly to ensure we are still delivering value.
- Implement data-driven decisions. Data-driven decision-making starts with collecting data based on measurable goals and recognizing the signals in that data. Machine learning can help identify gaps, recognize patterns, and highlight areas for improvement. We can then take an analytical approach to determine the next best step in response to those insights.
What Appen Can Do For You
Our search relevance optimization expertise spans over 20 years. We’ve used that time to successfully support our clients with high-quality training data for their unique search needs. Whether it’s helping Adobe Stock improve their search relevance or working with Microsoft’s Bing team to ramp up in new markets, we are here to help you achieve quick delivery and scalability of your AI search relevance models.
Learn more about our expertise and how we can help with your specific search relevance needs.