Google updates its algorithms thousands of times a year. Most of these changes are small, but some are monumental and change how search works. In 2019, Google rolled out one such update, called BERT. BERT, or Bidirectional Encoder Representations from Transformers, has set new standards in Natural Language Processing (NLP) and Machine Learning. It is one of the algorithms that has come closest to understanding the nuances of human language. Though it is only a beginning and is still far from fully comprehending language, researchers see it as the future of natural language processing, and we see it as the future of search and SEO. It has also had an impact on how lead generation and digital marketing tactics are implemented.
There are millions of web pages out there. With social media and the internet growing, the amount of data generated every day is tremendous. Google is undoubtedly the most popular search engine in the world and has some of the most powerful and efficient search engine algorithms around.
When a user types a search query or keyword into the search bar, Google’s algorithms find the most reliable and relevant results for the user out of all this data and show them in order on the Google Search Engine Results Page (SERP). This includes featured snippets, images, videos, ads, news, social media pages, local search results with maps, and other organic search results. Billions of searches are made on Google every day from around the world, so the scale at which Google’s algorithm works is enormous. All of these web pages are competing for a ranking on the SERP.
How does Google find the most relevant results related to a search in a fraction of a second?
Well, Google has search engine bots called crawlers or spiders that find new web pages or information on the internet. They analyze the content on these pages and index them in Google’s library for relevant search terms or keywords. Google’s algorithms rank these pages in order of relevance based on various factors such as the quality of the content and the webpage, user-friendliness, and user experience. These ranked results are shown to users on the SERP when they search for something online. Other factors that are relevant to the user, such as search location, are used for further optimization. Google is constantly trying to optimize its algorithm to closely match the user’s search intent and provide exactly what they are searching for.
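The crawl, index, and rank flow described above can be illustrated with a toy sketch. This is only an assumption-laden illustration, not Google’s actual system: the page URLs, the `build_index` and `search` helpers, and the "count matching words" score are all hypothetical, and real ranking relies on many more signals.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of page URLs that contain it (an inverted index)."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, pages, query):
    """Rank pages by how often the query words appear in them (toy relevance score)."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for url in index.get(term, ()):
            scores[url] += pages[url].lower().split().count(term)
    # Highest score first; ties broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical "crawled" pages.
pages = {
    "example.com/cake": "truffle cake recipe with chocolate truffle",
    "example.com/shop": "buy cake online",
    "example.com/football": "football scores today",
}

index = build_index(pages)
results = search(index, pages, "truffle cake")
print(results)
```

The index is built once, ahead of time, so each search only has to look up the query words. That is part of why results can come back in a fraction of a second.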
For instance, if a person searches for “truffle cake” on Google, the user might have various intentions in mind.
Google’s algorithm tries to identify what the user is exactly looking to find when they type in this keyword and shows the results accordingly.
This is why you are going to get different results for the same keyword, say, “football”, when you search for it from different parts of the world. But understanding keywords and search queries has been really difficult for algorithms, because queries are highly personalized and algorithms have no innate ability to understand human language.
For example, a person who wants to see if there are flights from Cochin to Delhi on 28 Oct 2019 might just search,
“28 October Flight from Cochin Delhi”
“Are there flights Kochi Delhi Oct 28”
“Kochi to Delhi flights any 28 October”
While it is easy for a human to make sense of these types of queries, algorithms take the words individually and try to figure out how they fit together and what they mean by using several levels of computation. This is why we sometimes do not get the exact results that we are looking for when we search on Google. But we all know that Google is getting better at it.
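A small sketch can show why treating words individually loses meaning. The `bag_of_words` helper below is hypothetical and is not how any particular search engine works. It simply counts words and ignores their order, which makes two opposite flight queries look identical.

```python
from collections import Counter

def bag_of_words(query):
    """Count each word in the query, ignoring word order entirely."""
    return Counter(query.lower().split())

outbound = bag_of_words("Flight from Cochin to Delhi")
inbound = bag_of_words("Flight from Delhi to Cochin")

# Same words, opposite meaning: a word-by-word view cannot tell them apart.
same_bag = outbound == inbound
print(same_bag)
```

Telling these two queries apart requires looking at how the words relate to each other, not just which words are present.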
This is where the importance of BERT comes in.
Bidirectional Encoder Representations from Transformers, or BERT, is a pre-trained Natural Language Processing model developed by Google. It is based on machine learning and is used to understand the context of search queries.
This might seem like too much jargon, but let us simplify it.
Take the example of “28 October Flight from Cochin Delhi”. These types of texts are called sequential inputs, or words put in sequential order. Generally, NLP algorithms consider these sequential inputs in a single direction, i.e., from left to right or from right to left. In this case, that is 28->October->Flight->from->Cochin->Delhi or the opposite, 28<-October<-Flight<-from<-Cochin<-Delhi. So, the word Bidirectional means that BERT processes the sequential input in both directions.
So, BERT considers the entire text and tries to make sense of it like humans do. For example, when we say “Cochin Delhi flight”, we mean “Flight from Cochin to Delhi” and not “Flight from Delhi to Cochin”. Here, the prepositions “from” and “to” give context and meaning to the sequence. BERT uses these to understand the language and make sense of the user’s queries.
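The difference between reading in one direction and reading in both can be sketched as follows. This is a toy illustration only: BERT itself uses Transformer attention layers, and the `context_views` helper here is a hypothetical name that just lists which neighboring words are visible when looking at one position in the query.

```python
def context_views(tokens, position):
    """List the words visible around tokens[position] under three reading schemes."""
    left = tokens[:position]        # words a left-to-right reader has already seen
    right = tokens[position + 1:]   # words a right-to-left reader has already seen
    return {
        "left_to_right": left,
        "right_to_left": right,
        "bidirectional": left + right,  # both sides available at once
    }

query = "28 October Flight from Cochin Delhi".split()
views = context_views(query, query.index("Cochin"))

for name, visible in views.items():
    print(f"{name:>14}: {visible}")
```

When looking at “Cochin”, a left-to-right reader sees “from” but not “Delhi”, and a right-to-left reader sees “Delhi” but not “from”. Only the bidirectional view has both pieces of context.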
BERT was pre-trained in an unsupervised way. This means it had to process unlabelled data sets, find hidden patterns, and learn by itself without humans labelling the data. For this, Google made use of the largest encyclopedia available to it, Wikipedia (along with a large corpus of books), and ran the training on TPU v3 Pods. Google developed TPUs, or Tensor Processing Units, to accelerate machine learning workloads and process data really fast.
BERT was trained using a novel technique called masking in which you mask a certain word in a sequence. The algorithm has to guess the word based on the context.
So, when the input is
“Leaves are [masked word] due to the presence of chlorophyll”, it has to predict the word “green”.
Similarly, when the input is “[masked word] is on 25 December”, it has to predict “Christmas”.
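The way masked training examples are prepared can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `make_masked_example` helper is hypothetical, it hides exactly one whole word per sentence, and it does no prediction at all. The original BERT setup masks roughly 15% of sub-word tokens instead.

```python
import random

MASK = "[MASK]"

def make_masked_example(sentence, rng):
    """Hide one randomly chosen word; return (masked_text, hidden_word)."""
    words = sentence.split()
    i = rng.randrange(len(words))
    hidden = words[i]
    words[i] = MASK
    return " ".join(words), hidden

sentences = [
    "Leaves are green due to the presence of chlorophyll",
    "Christmas is on 25 December",
]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
for sentence in sentences:
    masked, hidden = make_masked_example(sentence, rng)
    print(f"input: {masked!r} -> word to predict: {hidden!r}")
```

Because the hidden word comes from the text itself, no human has to label anything. The model is trained to recover the hidden word from the words on both sides of the gap.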
BERT was also trained on a second task, known as next sentence prediction. In this training, pairs of sentences were given as inputs. In half of the pairs, the second sentence actually followed the first in the original document, while in the other half a random sentence was attached to the first sentence. BERT had to tell apart the pairs that were genuinely connected from the ones that were not.
Inputs might look like this:
01. “Leaves are green in color. But, they turn yellow in autumn”
02. “Leaves are green in color. Denmark is a country”
Here, the sentences in the second pair are unrelated, and BERT has to recognize that they are not connected.
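A sketch of how such sentence pairs might be assembled is shown below. It is only an illustration: the `make_sentence_pairs` helper and the two tiny "documents" are hypothetical, and the strict true/random alternation is a simplification of the roughly 50/50 split described above.

```python
import random

def make_sentence_pairs(documents, rng):
    """Build (first, second, is_next) examples from documents given as sentence lists.

    Examples alternate: one keeps the true next sentence, the following one
    swaps in a random sentence taken from a different document.
    """
    pairs = []
    for doc in documents:
        # Sentences from the other documents are candidates for a random second sentence.
        others = [s for other in documents if other is not doc for s in other]
        for first, true_next in zip(doc, doc[1:]):
            if len(pairs) % 2 == 0:
                pairs.append((first, true_next, True))
            else:
                pairs.append((first, rng.choice(others), False))
    return pairs

documents = [
    ["Leaves are green in color.", "But, they turn yellow in autumn.", "In winter many trees are bare."],
    ["Denmark is a country.", "Its capital is Copenhagen."],
]

pairs = make_sentence_pairs(documents, random.Random(0))
for first, second, is_next in pairs:
    print(f"{is_next!s:>5}: {first} | {second}")
```

As with masking, the labels come for free from the original documents, so the training data can be generated without manual annotation.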
BERT was fine-tuned using a number of natural language understanding tasks such as:
Classification tasks like Sentiment Analysis – In these tasks, the algorithm has to predict a person’s sentiment or feeling from a piece of text.
Question Answering – In these tasks (e.g. SQuAD v1.1), BERT is given a passage and has to find the answer to a given question within the passage.
Named Entity Recognition (NER) – In these tasks, the algorithm has to identify the various entities in a given text and mark each one as a person, date, organization, etc.
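To make these three task types more concrete, the sketch below shows what a single labelled example might look like for each. All of the texts, labels, and field names are hypothetical and chosen only for illustration; real datasets define their own formats.

```python
# Question answering: the answer is a span of characters inside the passage.
passage = "Leaves are green due to the presence of chlorophyll."
answer = "chlorophyll"
start = passage.index(answer)
end = start + len(answer)

examples = {
    # Sentiment analysis: one label for the whole text.
    "sentiment": {
        "text": "The truffle cake was delicious.",
        "label": "positive",
    },
    # Question answering: point at where the answer sits in the passage.
    "question_answering": {
        "passage": passage,
        "question": "Why are leaves green?",
        "answer_span": (start, end),
    },
    # Named entity recognition: one tag per word ("O" means "not an entity").
    "ner": {
        "tokens": ["Sundar", "Pichai", "joined", "Google", "in", "2004"],
        "tags": ["PERSON", "PERSON", "O", "ORG", "O", "DATE"],
    },
}

span_start, span_end = examples["question_answering"]["answer_span"]
print(passage[span_start:span_end])
for token, tag in zip(examples["ner"]["tokens"], examples["ner"]["tags"]):
    print(f"{token:>8} -> {tag}")
```

The same pre-trained model can be fine-tuned for each of these tasks; what changes is the kind of output it is asked to produce.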
While BERT is a breakthrough in Machine Learning and NLP, it still has a long way to go to understand human speech and search intent.
For one, the bidirectional approach is slower to train than the single-directional approach.
Another major limitation of BERT is its poor handling of negation, as revealed by negation diagnostics.
For example, if you give BERT a statement,
An apple is a ________. (fruit, bird), it predicts ‘fruit’ easily. However, if you give it a negative sentence,
An apple is not a ________. (fruit, bird), it is not able to predict the answer correctly.
The big question is “Can we optimize for BERT?”. No, we cannot. BERT reminds us that we are creating content for humans and that we need to create content that is user-friendly and engaging rather than focusing on search engines. BERT is a leap forward in understanding human language and, thereby, search patterns. It has been rolled out in a number of languages across the globe, and about 1 in 10 searches in English are affected by BERT. Search engines like Google are getting more refined and better at predicting user intent, especially with voice search and image search. This allows search engines to give users exactly what they are looking for. SEO experts are changing their strategies to get the best results as search algorithms change.