How AccuRanker determines search intent with AI
Last updated on Wednesday, September 21, 2022
The following article gives you an insight into the new AccuRanker (AI-based) search intent feature.
To better understand this article, we recommend you first read our other article, "What is Search Intent?". There, you will learn that we classify the search intent Google is targeting by looking at the SERP (search engine results page) rather than only at the keyword. You will also learn why search intent may change over time, and that a single SERP/keyword can have multiple intents.
In the rest of this article, we will refer to a SERP/keyword pair as a keyword for brevity, even though a keyword can have varying SERPs.
Search intent is a nuanced topic, and setting up a fixed set of rules to find the intent Google is targeting for a given keyword is nearly impossible. Fortunately, the developments in AI and machine learning over the past decade make new methods feasible. We now have the right toolset to use the enormous amounts of SERP data AccuRanker processes daily to ‘train’ a machine learning model.
As training data for the new search intent model, we used a combination of unlabeled and hand-labeled data. The hand-labeled part of the dataset consists of the search intent for keywords, as labeled by human experts, combined with the corresponding SERP data. Machine learning techniques find patterns in this data. These patterns are translated into a model which can be used to find the search intent for keywords outside the training dataset.
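To make the idea of learning from labeled examples concrete, here is a minimal sketch. It is not AccuRanker's model: it swaps in a plain logistic regression trained by gradient descent, uses only three features, and all numbers in `TRAINING_DATA` are made up for illustration. The feature `has_amazon_url` is a hypothetical name.

```python
import math

# Hypothetical hand-labeled examples: (SERP feature vector, label).
# Label 1 = transactional, 0 = not transactional. All values are invented.
# Feature order: [competition_adwords, page_featured_snippet, has_amazon_url]
TRAINING_DATA = [
    ([0.9, 0.0, 1.0], 1),
    ([0.8, 0.0, 1.0], 1),
    ([0.7, 0.0, 0.0], 1),
    ([0.2, 1.0, 0.0], 0),
    ([0.1, 1.0, 0.0], 0),
    ([0.3, 0.0, 0.0], 0),
]


def predict_proba(weights, bias, features):
    """Return the modeled probability that the keyword is transactional."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=5000, learning_rate=0.5):
    """Fit a logistic regression with batch gradient descent."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in data:
            error = predict_proba(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(data)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(data)
    return weights, bias


weights, bias = train(TRAINING_DATA)

# A keyword that was not part of the training data.
unseen_keyword = [0.85, 0.0, 1.0]
probability = predict_proba(weights, bias, unseen_keyword)
print(f"P(transactional) = {probability:.2f}")
```

No rule such as "Amazon in the results means transactional" is written anywhere in this code. The weights are learned from the labeled examples, which is the point the paragraph above makes.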
Using a machine learning model enables us to predict search intent with greater precision than rule-based approaches. However, reaching 100% precision is impossible for many reasons. Some of these reasons are:
- Even humans looking at the same SERPs disagree about the search intent (up to 40% disagreement).
- The SERP can display multiple intents.
- The definitions of the different search intent categories are not always fully aligned.
We have tried outlining the AccuRanker definitions with examples. By using these definitions and assessing the machine learning model against these labels, we can achieve an agreement with the hand-labeled data of more than 90%. And we are often inclined to agree more with the machine learning model than the human label when going over the differences.
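The agreement figure above can be thought of as the share of keywords where the model and the human label match. The sketch below shows one way to compute it and to list the differences for manual review. The function name and the example labels are hypothetical, and it assumes one label per keyword for simplicity.

```python
def agreement_report(model_labels, human_labels):
    """Return (agreement rate, list of disagreements) for two label lists.

    Each disagreement is a tuple (index, model label, human label).
    """
    if len(model_labels) != len(human_labels):
        raise ValueError("Both label lists must have the same length.")
    if not model_labels:
        raise ValueError("At least one labeled keyword is required.")

    disagreements = [
        (index, model, human)
        for index, (model, human) in enumerate(zip(model_labels, human_labels))
        if model != human
    ]
    matches = len(model_labels) - len(disagreements)
    return matches / len(model_labels), disagreements


# Invented example labels for ten keywords.
model_labels = ["transactional", "informational", "navigational", "commercial",
                "informational", "transactional", "navigational", "commercial",
                "informational", "transactional"]
human_labels = ["transactional", "informational", "navigational", "commercial",
                "informational", "transactional", "navigational", "informational",
                "informational", "transactional"]

rate, differences = agreement_report(model_labels, human_labels)
print(f"Agreement: {rate:.0%}")
for index, model, human in differences:
    print(f"Keyword {index}: model says {model}, human says {human}")
```

Listing the differences matters as much as the rate itself, since reviewing them is how one decides whether the model or the human label is the better call.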
The new AccuRanker search intent model uses more than one hundred features of the SERP to figure out the search intent. These features are interdependent, so it is not easy to explain in detail how they work. If it were easy, we might as well use a rule-based approach.
The features include special words (translated to multiple languages) in keywords, titles, URLs, and descriptions as well as SERP features and other SERP metadata such as cost-per-click and AdWords competition.
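As a rough illustration of how a SERP could be turned into such features, the sketch below builds a small feature dictionary. It is an assumption-heavy example, not AccuRanker's feature pipeline: the input structure, the word list, the domain list, and the exact feature names are hypothetical, chosen to resemble the names that appear in this article.

```python
from urllib.parse import urlparse

# Hypothetical, English-only word list. The real list is translated to
# multiple languages and is not published in this article.
SPECIAL_WORDS = ("best", "buy", "sale")
TRACKED_DOMAINS = ("amazon", "wikipedia", "facebook")
TRACKED_SERP_FEATURES = ("featured_snippet", "maps_local")


def extract_features(keyword, results, serp_features, competition_adwords):
    """Build a flat feature dictionary for one keyword and its SERP.

    results: list of dicts with "title" and "url" keys (assumed structure).
    serp_features: set of SERP feature names present on the page.
    """
    features = {"competition_adwords": competition_adwords}

    keyword_tokens = keyword.lower().split()
    for word in SPECIAL_WORDS:
        features[f"keyword_has_{word}"] = int(word in keyword_tokens)
        features[f"titles_count_{word}"] = sum(
            word in result["title"].lower().split() for result in results
        )

    for domain in TRACKED_DOMAINS:
        features[f"urls_count_{domain}"] = sum(
            domain in (urlparse(result["url"]).hostname or "").split(".")
            for result in results
        )

    for name in TRACKED_SERP_FEATURES:
        features[f"page_{name}"] = int(name in serp_features)

    return features


example = extract_features(
    keyword="buy running shoes",
    results=[
        {"title": "Buy Running Shoes Online", "url": "https://www.amazon.com/shoes"},
        {"title": "Running Shoes on Sale", "url": "https://www.amazon.co.uk/shoes"},
        {"title": "Best Running Shoes 2022", "url": "https://www.example.com/best"},
    ],
    serp_features={"maps_local"},
    competition_adwords=0.9,
)
print(example)
```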
One way to understand the new search intent model is to look at SHAP visualizations of how the features affect the model output in different cases.
The image below shows the top twenty features which determine if the intent is transactional.
Here, you get to see the inside of the new search intent model, and how decisions are made. This is shown in a slightly simplified manner.
The chart is read as follows:
- On the y-axis, you have the most impactful features for finding whether a keyword belongs to the transactional category.
- On the x-axis, you see the impact on the model output of the individual features going from negative to positive. The vertical line separates negative and positive impact.
- Each dot corresponds to a keyword. The color of the dot maps to the value of the corresponding feature for this keyword. Red means a high value, blue a low value.
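The reading rules above can also be expressed in code. The sketch below assumes the chart ranks features by their mean absolute impact, which is a common convention for this chart type, and treats a dot as "red" when its feature value lies in the upper half of the observed range. The attribution numbers are invented and the helper names are hypothetical.

```python
from statistics import mean

# Invented data: for each feature, one (feature value, impact on model output)
# pair per keyword. Each pair corresponds to one dot on the chart.
attributions = {
    "page_featured_snippet": [(1, -0.5), (1, -0.3), (0, 0.1), (0, 0.2)],
    "competition_adwords": [(0.9, 0.8), (0.7, 0.5), (0.2, -0.4), (0.1, -0.6)],
    "urls_count_wikipedia": [(2, -0.2), (0, 0.05), (1, -0.1), (0, 0.05)],
}


def rank_by_impact(data):
    """Order features by mean absolute impact, largest first (y-axis order)."""
    return sorted(
        data,
        key=lambda name: mean(abs(impact) for _, impact in data[name]),
        reverse=True,
    )


def high_value_effect(points):
    """Mean impact of the 'red' dots, i.e. keywords with a high feature value.

    Positive means red dots sit to the right of the vertical line,
    negative means they sit to the left.
    """
    values = [value for value, _ in points]
    cutoff = (min(values) + max(values)) / 2
    red_impacts = [impact for value, impact in points if value > cutoff]
    return mean(red_impacts) if red_impacts else 0.0


ranking = rank_by_impact(attributions)
for name in ranking:
    effect = high_value_effect(attributions[name])
    side = "right" if effect > 0 else "left"
    print(f"{name}: red dots mostly to the {side} ({effect:+.2f})")
```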
The chart from the previous paragraph showed which features determine if the search intent is transactional. Let us examine this chart and take competition on AdWords (competition_adwords) as an example.
You see next to competition_adwords on the chart that red dots are to the right of the vertical line. This means that high competition on AdWords (a red dot) makes it more likely to be a transactional keyword (to the right of the vertical line).
On the other hand, take a look at the presence of a featured snippet (page_featured_snippet). If this value is high (red dot) it means there is a featured snippet on the SERP. The red dots are all to the left of the vertical line. This means it is less likely to be a transactional keyword when there is a featured snippet.
Another thing you can see on the chart is that Amazon being present one or more times (urls_count_amazon.) makes it more likely that the SERP is transactional, while the opposite is true of Wikipedia.
These findings are not surprising. The cool part is that the machine learning model is not told any of this in advance. It has inferred it from the data. In addition, it has inferred the relationship between the different features. Notice that the dots are spread on the x-axis instead of being on top of each other. This is because the impact of SERP features on the model depends on which other features are present on the SERP. So just because there is high competition on AdWords the model will not necessarily conclude that the keyword is transactional.
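The spread of the dots can be reproduced with a toy example. The sketch below computes exact Shapley values (the quantity that SHAP approximates) by enumerating every feature subset, which is only practical for a handful of features. The scoring function `toy_model`, its coefficients, and the background data are invented for illustration and are unrelated to the real model. The interaction term makes the impact of AdWords competition depend on how often Amazon appears.

```python
from itertools import combinations
from math import factorial


def toy_model(features):
    """Invented transactional score with one interaction term."""
    adwords, snippet, amazon = features
    return 2.0 * adwords - 1.5 * snippet + 1.0 * adwords * amazon


def shapley_values(model, x, background):
    """Exact Shapley values of model(x) relative to the background average."""
    n = len(x)

    def value(subset):
        # Average output when features in `subset` take the values of x
        # and the remaining features come from each background row.
        outputs = [
            model([x[i] if i in subset else row[i] for i in range(n)])
            for row in background
        ]
        return sum(outputs) / len(outputs)

    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                subset = set(combo)
                phi += weight * (value(subset | {i}) - value(subset))
        result.append(phi)
    return result


# Invented background SERPs: [competition_adwords, featured snippet, Amazon count]
background = [[0.2, 1, 0], [0.4, 0, 1], [0.6, 0, 0], [0.4, 1, 1]]

# Same AdWords competition, different number of Amazon results.
without_amazon = shapley_values(toy_model, [0.9, 0, 0], background)
with_amazon = shapley_values(toy_model, [0.9, 0, 2], background)

print(f"Impact of AdWords competition without Amazon: {without_amazon[0]:+.3f}")
print(f"Impact of AdWords competition with Amazon:    {with_amazon[0]:+.3f}")
```

Both keywords have the same AdWords competition, yet the feature receives a different impact in each case. On the chart, these would be two red dots at different horizontal positions on the same row.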
At the other end of the spectrum, you can see the top twenty features which determine if the intent is informational.
Here, you see that high competition on AdWords means the intent is unlikely to be informational. You also see that features such as video carousels, related questions, and featured snippets are prevalent for SERPs with informational intent.
On the other hand, the word “best” typically indicates commercial rather than informational intent. The same goes for the review SERP feature and local results (page_maps_local).
Another interesting insight is that when Facebook is a part of the SERP, it is typically not informational intent. Instead, it is navigational intent.
For navigational intent, see the chart below.
Obviously, sitelinks are often associated with navigational intent. You can also see that knowledge panels are often present for keywords with navigational intent. But they are also often present for informational keywords.
LinkedIn, Twitter, and Facebook will often show up on SERPs with navigational intent. Local results and the "results about" SERP feature are also associated with navigational intent. Note that local results are also associated with commercial intent, depending on the context.
There will typically not be high competition on AdWords, featured snippets, or thumbnails for navigational intent.
But bear in mind that there is an exception to every rule, and that a keyword with navigational intent can easily have features that point to other intents. The charts and examples give overall indications for each type of intent.
You will also notice that in contrast to most other approaches for search intent, features in the AccuRanker model can be interdependent and can also make a search intent type less likely. Creating such a model is possible by utilizing AccuRanker’s vast amount of data in combination with advanced machine learning techniques.
For commercial intent, you will mostly see items that we have already described.
On the other hand, you will typically not see words such as “buy” or “sale”, or domains like Amazon, Facebook, or Wikipedia.
This article has given you an insight into the new AccuRanker (AI-based) search intent feature. It presented components of how the model works, and what kind of features affect the different types of search intent in both positive and negative directions.
The new search intent feature is not told or taught any rules. Instead, the model discovers its own rules by pattern matching with a large number of examples. Simply put, the new search intent model learns from data. Having precise search intent labels allows you to group and target keywords by intent. And this is paramount when creating content.