We are seeing more references to machine learning in how Google is ranking pages and other documents in search results.
That direction seems likely to leave behind the traditional, or old-school, ranking signals that we are familiar with.
It’s still worth considering some of those older ranking signals because they may play a role in how things are ranked.
As I was going through a new patent application from Google on ranking image search results, I decided that it was worth including what I used to look at when trying to rank images.
Images can rank highly in image search, and they can also help pages that they appear upon rank higher in organic web results, because they can help make a page more relevant for the query terms that page may be optimized for.
Here are signals that I would include when I rank image search results:
Those are signals that I would consider when I rank image search results and include images on a page to help that page rank as well.
A patent application that was published this week tells us about how machine learning might be used in ranking image search results. It doesn’t itemize features that might help an image in those rankings, such as alt text, captions, or file names, but it does refer to “features” that likely include those as well as other signals. It makes sense to start looking at these patents that cover machine learning approaches to ranking because they may end up becoming more common.
The patent leaves Google room to try out different approaches: we are told that the machine learning model can be any of many different types of machine learning models.
The machine learning model can be a:
We are told more about this machine learning model. It is “used to accurately generate relevance scores for image-landing page pairs in the index database.”
We are told about an image search system, which includes a training engine.
The training engine trains the machine learning model on training data generated using image-landing page pairs that are already associated with ground truth or known values of the relevance score.
The patent shows an example of the machine learning model generating a relevance score for a particular image search result from an image, landing page, and query features. In this image, a searcher submits an image search query. The system generates image query features based on the user-submitted image search query.
That system also learns about landing page features for the landing page that has been identified by the particular image search result as well as image features for the image identified by that image search result.
The image search system would then provide the query features, the landing page features, and the image features as input to the machine learning model.
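To make the "single input" idea concrete, here is a minimal sketch in Python. It is not taken from the patent: the function names, the feature values, and the use of a logistic function as a stand-in for the trained model are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def build_model_input(query_features: Sequence[float],
                      landing_page_features: Sequence[float],
                      image_features: Sequence[float]) -> List[float]:
    """Concatenate the three feature groups into one input vector."""
    return [*query_features, *landing_page_features, *image_features]


def relevance_score(model_input: Sequence[float],
                    weights: Sequence[float],
                    bias: float = 0.0) -> float:
    """Toy stand-in for a trained model: logistic function of a weighted sum."""
    if len(model_input) != len(weights):
        raise ValueError("weights must match the model input length")
    z = bias + sum(w * x for w, x in zip(weights, model_input))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical feature values for one image search result.
model_input = build_model_input([1.0, 0.0], [0.5], [0.2, 0.8])
# With all-zero (untrained) weights the toy model is indifferent.
untrained_score = relevance_score(model_input, weights=[0.0] * 5)
print(model_input, untrained_score)
```

The point of the sketch is only the shape of the data flow: three groups of features go in together, and one relevance score comes out.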
Google may rank image search results based on various factors
Those may be separate signals from:
This patent describes how it would rank image search results in this manner:
– Generating an image search results presentation that displays the candidate image search results ordered according to the ranking
– Providing the image search results for presentation by a user device
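The ordering step can be sketched in a few lines of Python. This is an illustration under assumptions: the candidate dictionaries, the `toy_score` field, and the function name are hypothetical, and the scoring function stands in for whatever model produces the relevance scores.

```python
from typing import Callable, Dict, List


def rank_candidates(candidates: List[Dict],
                    score_fn: Callable[[Dict], float]) -> List[Dict]:
    """Attach a relevance score to each candidate and sort, best first."""
    scored = [{**c, "relevance": score_fn(c)} for c in candidates]
    return sorted(scored, key=lambda c: c["relevance"], reverse=True)


# Hypothetical candidates; "toy_score" stands in for the model's output.
candidates = [
    {"image": "a.jpg", "toy_score": 0.31},
    {"image": "b.jpg", "toy_score": 0.87},
    {"image": "c.jpg", "toy_score": 0.55},
]
ranked = rank_candidates(candidates, lambda c: c["toy_score"])

# The "presentation" here is just the ordered list of image names.
presentation = [c["image"] for c in ranked]
print(presentation)
```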
If Google can rank image search results based on relevance scores from a machine learning model, it can improve the relevance of the image search results returned in response to the image search query.
This differs from conventional methods to rank resources because the machine learning model receives a single input that includes features of the image search query, the landing page, and the image identified by a given image search result, and uses it to predict the relevance of that image search result to the received query.
This process allows the machine learning model to be more dynamic and give more weight to landing page features or image features in a query-specific manner, improving the quality of the image search results that are returned to the user.
By using a machine learning model, the image search engine does not apply the same fixed weighting scheme for landing page features and image features for each received query. Instead, it combines the landing page and image features in a query-dependent manner.
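A small Python sketch may help show the difference between a fixed weighting scheme and a query-dependent one. Everything here is an assumption made for illustration: the patent does not describe a `query_visual_intent` feature, and a trained model would learn this kind of interaction rather than have it written by hand.

```python
def fixed_score(page_signal: float, image_signal: float) -> float:
    """Conventional style: the same hand-tuned weights for every query."""
    return 0.5 * page_signal + 0.5 * image_signal


def query_dependent_score(query_visual_intent: float,
                          page_signal: float,
                          image_signal: float) -> float:
    """The mix of image and page signals depends on a query feature.

    query_visual_intent is a hypothetical feature in [0, 1]; higher values
    shift weight from the landing page towards the image itself.
    """
    image_weight = query_visual_intent
    return image_weight * image_signal + (1.0 - image_weight) * page_signal


# Same result (weak page, strong image) scored under two different queries.
print(fixed_score(0.2, 0.8))
print(query_dependent_score(1.0, 0.2, 0.8))  # query cares about the image
print(query_dependent_score(0.0, 0.2, 0.8))  # query cares about the page
```

The fixed function gives this result the same score no matter what was searched for, while the query-dependent one scores it high or low depending on the query.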
The patent also tells us that a trained machine learning model can easily and optimally adjust weights assigned to various features based on changes to the initial signal distribution or additional features.
In a conventional image search, we are told that significant engineering effort is required to adjust the weights of a traditional manually tuned model based on changes to the initial signal distribution.
But under this patented process, adjusting the weights of a trained machine learning model based on changes to the signal distribution is significantly easier, thus improving the ease of maintenance of the image search engine.
Also, if a new feature is added, the manually tuned functions adjust the function on the new feature independently on an objective (i.e., a loss function) while holding existing feature functions constant.
But, a trained machine learning model can automatically adjust feature weights if a new feature is added.
Instead, the machine learning model can include the new feature and rebalance all its existing weights appropriately to optimize for the final objective.
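The rebalancing effect can be seen in a toy example. The sketch below assumes a plain linear model trained by gradient descent on mean squared error; the patent does not say which model or optimizer Google uses, and the data is made up so that the target equals the sum of the two features.

```python
from typing import List, Sequence


def fit_linear(rows: Sequence[Sequence[float]],
               targets: Sequence[float],
               lr: float = 0.01,
               steps: int = 20000) -> List[float]:
    """Fit weights (no intercept) by gradient descent on mean squared error."""
    n_features = len(rows[0])
    n = len(rows)
    weights = [0.0] * n_features
    for _ in range(steps):
        grads = [0.0] * n_features
        for x, y in zip(rows, targets):
            error = sum(w * xi for w, xi in zip(weights, x)) - y
            for j in range(n_features):
                grads[j] += 2.0 * error * x[j] / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Made-up data: the target is exactly feature_1 + feature_2.
targets = [2.0, 4.0, 5.0, 9.0]
rows_old = [[1.0], [2.0], [3.0], [4.0]]                          # feature_1 only
rows_new = [[1.0, 1.0], [2.0, 2.0], [3.0, 2.0], [4.0, 5.0]]      # feature_2 added

weights_old = fit_linear(rows_old, targets)
weights_new = fit_linear(rows_new, targets)
print(weights_old)  # feature_1 absorbs the effect of the missing feature
print(weights_new)  # retraining moves feature_1's weight as well
```

With only the first feature, its weight settles near 2.03 because it has to account for everything. Once the second feature is added and the model is retrained, both weights settle near 1.0. Nobody hand-edited the first weight; it changed because all weights are optimized together.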
Thus, the accuracy, efficiency, and maintenance of the image search engine can be improved.
The Rank Image Search results patent application can be found at
Ranking Image Search Results Using Machine Learning Models
US Patent Application Number: 16263398
File Date: 31.01.2019
Publication Number: US20200201915
Publication Date: June 25, 2020
Applicants: Google LLC
Inventors: Manas Ashok Pathak, Sundeep Tirumalareddy, Wenyuan Yin, Suddha Kalyan Basu, Shubhang Verma, Sushrut Karanjkar, and Thomas Richard Strohmann
Abstract
Methods, systems, and apparatus including computer programs encoded on a computer storage medium, for ranking image search results using machine learning models. In one aspect, a method includes receiving an image search query from a user device; obtaining a plurality of candidate image search results; for each of the candidate image search results: processing (i) features of the image search query and (ii) features of the respective image identified by the candidate image search result using an image search result ranking machine learning model to generate a relevance score that measures a relevance of the candidate image search result to the image search query; ranking the candidate image search results based on the relevance scores; generating an image search results presentation; and providing the image search results for presentation by a user device.
The search engine may include an indexing engine and a ranking engine.
The indexing engine indexes image-landing page pairs, and adds the indexed image-landing page pairs to an index database.
That is, the index database includes data identifying images and, for each image, a corresponding landing page.
The index database also associates the image-landing page pairs with:
Optionally, the index database also associates the indexed image-landing page pairs in the collections of image-landing pairs with values of image search engine ranking signals for the indexed image-landing page pairs.
Each image search engine ranking signal is used by the ranking engine in ranking the image-landing page pair in response to a received search query.
The ranking engine generates respective ranking scores for image-landing page pairs indexed in the index database based on the values of image search engine ranking signals for the image-landing page pair, e.g., signals accessed from the index database or computed at query time, and ranks the image-landing page pair based on the respective ranking scores. The ranking score for a given image-landing page pair reflects the relevance of the image-landing page pair to the received search query, the quality of the given image-landing page pair, or both.
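One way to picture the index database is as a lookup keyed by image and landing page, with signal values stored on each entry. The Python sketch below is an assumption-heavy illustration: the class name, the signal names, and the rule of averaging signals into a ranking score are all hypothetical, since the patent does not spell out how signals are combined.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class IndexEntry:
    """One indexed image-landing page pair with its ranking signals."""
    image_url: str
    landing_page_url: str
    signals: Dict[str, float] = field(default_factory=dict)


def initial_ranking_score(entry: IndexEntry) -> float:
    """Assumed combination rule: the plain average of the stored signals."""
    if not entry.signals:
        return 0.0
    return sum(entry.signals.values()) / len(entry.signals)


# The index maps (image, landing page) to the stored entry.
index: Dict[Tuple[str, str], IndexEntry] = {}
entry = IndexEntry(
    image_url="https://example.com/cat.jpg",
    landing_page_url="https://example.com/cats",
    signals={"relevance": 0.8, "quality": 0.6},  # hypothetical signal names
)
index[(entry.image_url, entry.landing_page_url)] = entry
print(initial_ranking_score(entry))
```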
The image search engine can use a machine learning model to rank image-landing page pairs in response to received search queries.
The machine learning model is configured to receive an input that includes (i) features of the image search query, (ii) features of an image, and (iii) features of the landing page of the image, and to generate a relevance score that measures the relevance of the candidate image search result to the image search query.
Once the machine learning model generates the relevance score for the image-landing page pair, the ranking engine can then use the relevance score to generate ranking scores for the image-landing page pair in response to the received search query.
In some implementations, the ranking engine generates an initial ranking score for each of multiple image-landing page pairs using the signals in the index database.
The ranking engine can then select a certain number of the highest-scoring image-landing page pairs for processing by the machine learning model.
The ranking engine can then rank candidate image-landing page pairs based on relevance scores from the machine learning model or use those relevance scores as additional signals to adjust the initial ranking scores for the candidate image-landing page pairs.
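This two-stage flow (score everything cheaply, then send only the best candidates to the model) can be sketched as follows. The function name, the `blend` parameter, and the linear blending formula are assumptions; the patent only says the relevance scores can replace or adjust the initial scores.

```python
from typing import Callable, List, Optional, Tuple

Pair = Tuple[str, float]  # (pair id, initial ranking score)


def rerank_top_candidates(pairs: List[Pair],
                          model_score: Callable[[str], float],
                          top_n: int,
                          blend: Optional[float] = None) -> List[Pair]:
    """Keep the top_n pairs by initial score, then re-score them.

    blend=None ranks purely by the model's relevance score; otherwise the
    final score is (1 - blend) * initial + blend * relevance.
    """
    shortlist = sorted(pairs, key=lambda p: p[1], reverse=True)[:top_n]
    rescored = []
    for pair_id, initial in shortlist:
        relevance = model_score(pair_id)
        if blend is None:
            final = relevance
        else:
            final = (1.0 - blend) * initial + blend * relevance
        rescored.append((pair_id, final))
    return sorted(rescored, key=lambda p: p[1], reverse=True)


# Hypothetical initial scores and model outputs.
pairs = [("a", 0.9), ("b", 0.8), ("c", 0.1)]
toy_model = {"a": 0.2, "b": 0.7, "c": 0.99}.get

by_model = rerank_top_candidates(pairs, toy_model, top_n=2)
adjusted = rerank_top_candidates(pairs, toy_model, top_n=2, blend=0.1)
print([p[0] for p in by_model], [p[0] for p in adjusted])
```

Note that pair "c" never reaches the model, even though the model would have scored it highest, because it did not make the initial cut. That is the trade-off of only running the model on a shortlist.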
The machine learning model would receive a single input that includes features of the image search query, the landing page, and the image to predict the relevance (i.e., the relevance score) of the particular image search result to the user image query.
We are told that this allows the machine learning model to give more weight to landing page features, image features, or image search query features in a query-specific manner, which can improve the quality of the image search results returned to the user.
The first step is to receive the image search query.
Once that happens, the image search system may identify initial image-landing page pairs that satisfy the image search query.
It would select those from pairs indexed in a search engine index database, using signals that measure the quality of the pairs, the relevance of the pairs to the search query, or both.
For those pairs, the search system identifies:
These features can include vectors that represent the content of the image.
Vectors to represent the image may be derived by processing the image through an embedding neural network.
Or those vectors may be generated through other image processing techniques for feature extraction. Examples of feature extraction techniques can include edge, corner, ridge, and blob detection. Feature vectors can include vectors generated using shape extraction techniques (e.g., thresholding, template matching, and so on.) Instead of or in addition to the feature vectors, when the machine learning model is a neural network the features can include the pixel data of the image.
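For readers who have not worked with image feature extraction, here is a deliberately tiny Python sketch of two of the techniques named above, thresholding and a crude form of edge detection, on a grayscale image stored as nested lists. The function names and the two-number feature vector are assumptions for illustration; real systems use far richer extractors or an embedding neural network.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


def threshold(image: Image, cutoff: int) -> Image:
    """Binarize: 1 where the pixel is at least the cutoff, else 0."""
    return [[1 if px >= cutoff else 0 for px in row] for row in image]


def horizontal_edges(image: Image) -> Image:
    """Absolute difference between horizontally adjacent pixels."""
    return [[abs(row[i + 1] - row[i]) for i in range(len(row) - 1)]
            for row in image]


def feature_vector(image: Image, cutoff: int = 128) -> List[float]:
    """Return [fraction of bright pixels, mean edge strength in 0..1]."""
    binary = threshold(image, cutoff)
    edges = horizontal_edges(image)
    n_pixels = sum(len(row) for row in image)
    n_edges = sum(len(row) for row in edges)
    bright_fraction = sum(map(sum, binary)) / n_pixels
    mean_edge = sum(map(sum, edges)) / (255 * n_edges) if n_edges else 0.0
    return [bright_fraction, mean_edge]


# A 2x4 image: dark left half, bright right half, one vertical edge.
tiny_image = [[0, 0, 255, 255],
              [0, 0, 255, 255]]
features = feature_vector(tiny_image)
print(features)
```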
These aren’t the kinds of features that I have historically thought about when optimizing images. These features can include:
The patent interestingly separated these features out:
The patent points out some alternative ways that the location of the image within the Landing Page might be found:
The prominence of the image on the landing page can be measured using the relative size of the image as displayed on a generic device and a specific user device.
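The patent does not give a formula for prominence, so the following Python sketch is only one plausible reading: the share of the screen that the image occupies, computed once for a generic device and once for a specific one. The viewport sizes and the clamping of the image to the screen are assumptions.

```python
def prominence(image_w: int, image_h: int,
               viewport_w: int, viewport_h: int) -> float:
    """Fraction of the viewport area covered by the image (assumed measure)."""
    if viewport_w <= 0 or viewport_h <= 0:
        raise ValueError("viewport dimensions must be positive")
    # Only the part of the image that fits on screen counts.
    visible_w = min(image_w, viewport_w)
    visible_h = min(image_h, viewport_h)
    return (visible_w * visible_h) / (viewport_w * viewport_h)


# Hypothetical viewports: a "generic" device and a narrower user device.
generic = prominence(400, 300, viewport_w=800, viewport_h=600)
specific = prominence(400, 300, viewport_w=400, viewport_h=600)
print(generic, specific)
```

Under this reading, the same 400 by 300 image is more prominent on the narrower screen, which is why measuring on both a generic and a specific device could give different feature values.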
The textual descriptions of the image on the landing page can include alt-text labels for the image, text surrounding the image, and so on.
The features from the image search query can include:
The system may adjust initial ranking scores for the image search results based on the relevance scores to:
The system receives a set of training image search queries and, for each training image search query, training image search results for the query that are each associated with a ground truth relevance score.
A ground truth relevance score is the relevance score that the machine learning model should generate for the image search result. For example, when the relevance scores measure a likelihood that a user would select a search result in response to a given search query, each ground truth relevance score can identify whether a user submitting the given search query selected the image search result, or the proportion of times that users submitting the given search query selected the image search result.
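The "proportion of times" version of a ground truth score is easy to compute from a log of what was shown and what was selected. The log format and function name in this Python sketch are hypothetical; the patent does not describe how such data is stored.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Impression = Tuple[str, str, bool]  # (query, result id, was it selected?)


def selection_proportions(log: Iterable[Impression]) -> Dict[Tuple[str, str], float]:
    """Ground truth label per (query, result): selections / times shown."""
    shown: Dict[Tuple[str, str], int] = defaultdict(int)
    selected: Dict[Tuple[str, str], int] = defaultdict(int)
    for query, result_id, was_selected in log:
        key = (query, result_id)
        shown[key] += 1
        if was_selected:
            selected[key] += 1
    return {key: selected[key] / shown[key] for key in shown}


# Hypothetical log entries.
log = [
    ("red shoes", "img1", True),
    ("red shoes", "img1", False),
    ("red shoes", "img2", False),
]
labels = selection_proportions(log)
print(labels)
```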
The patent provides another example of how ground-truth relevance scores might be generated:
When the relevance scores generated by the model are a prediction of a score assigned to an image search result by a human, the ground truth relevance scores are actual scores assigned to the search results by human raters.
For each of the training image search queries, the system may generate features for each associated image-landing page pair.
For each of those pairs, the system may identify:
(i) features of the image search query (ii) features of the image and (iii) features of the landing page.
We are told that extracting, generating, and selecting features may take place before training or using the machine learning model. Examples of features are the ones I listed above related to the images, landing pages, and queries.
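Putting the pieces together, a training set is just a list of feature vectors paired with ground truth scores. The Python sketch below shows that assembly step only. The `toy_featurize` function and its three features (query length, an alt-text match, a page-title match) are hypothetical stand-ins for the much larger feature sets the patent describes.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Example = Tuple[List[float], float]  # (features, ground truth relevance)


def build_training_examples(
    labelled_results: Sequence[Tuple[str, Dict, Dict, float]],
    featurize: Callable[[str, Dict, Dict], List[float]],
) -> List[Example]:
    """Turn (query, image, landing page, label) tuples into training rows."""
    examples = []
    for query, image, landing_page, ground_truth in labelled_results:
        features = featurize(query, image, landing_page)
        examples.append((features, ground_truth))
    return examples


def toy_featurize(query: str, image: Dict, landing_page: Dict) -> List[float]:
    """Hypothetical features: query length, alt-text match, title match."""
    q = query.lower()
    return [
        float(len(q.split())),
        1.0 if q in image.get("alt", "").lower() else 0.0,
        1.0 if q in landing_page.get("title", "").lower() else 0.0,
    ]


labelled = [
    ("red shoes", {"alt": "Red shoes on a shelf"}, {"title": "Shoe care tips"}, 1.0),
]
examples = build_training_examples(labelled, toy_featurize)
print(examples)
```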
The ranking engine trains the machine learning model by processing for each image search query
The patent provides some specific implementation processes that might differ based upon the machine learning system used.
I’ve provided some information about what kinds of features Google may have used in the past in ranking image search results.
Under a machine learning approach, Google may be paying more attention to features from an image query, features from images, and features from the landing page those images are found upon. The patent lists many of those features, and if you spend time comparing the older features with the ones under the machine learning model approach, you can see there is overlap, but the machine learning approach covers considerably more options.
Copyright © 2020 SEO by the Sea ⚓.
The post How Google Might Rank Image Search Results appeared first on SEO by the Sea ⚓.