Search engines are trying to understand language: they want to understand what users are searching for, and they want to be able to provide them with the best results. Before I started working at Yoast, I studied linguistics: the scientific study of language. During my years at Yoast, I’ve noticed that linguistics and SEO have a lot of overlap. In this article, I want to give you some SEO insights from a linguistic perspective. Let’s dive in!
Different aspects of language
Before we can go into the linguistic approach to SEO, we first have to understand what language is. Language consists of many different aspects. Think about it: we make speech sounds or write letters, which together form words. We put these words in a specific order, so they form sentences and phrases. And these sentences mean something to us.
Sometimes we also want to achieve something with language. For example, when we say “it’s cold in here,” we might not only want to express that we’re cold; we could also mean it as a request to close the window. To study all of these aspects, linguists distinguish different levels of language.
Linguistic levels of language
The most basic level is the level of sounds and letters, which we call phonology (when it comes to speech) and graphology (when we talk about writing). Then, there’s the morphological level, which studies how these sounds and letters together make words and different word forms. For example, the word “house” can be combined with “tree” to make “treehouse” and with “dog” to make “doghouse,” but we can’t really combine it with “banana.”
The next level, syntax, describes the rules we have for creating sentences. There are countless words to choose from, and we could use them to form an infinite number of possible sentences. But syntactic rules restrict the ways in which these words can be combined.
The level of semantics studies the meaning of different elements of language. What do we mean when we say something, and how do we understand others? Finally, pragmatics looks at meaning within a context. For instance, someone could say: “I’m getting hot, will you crack open the door?” Semantically, “crack” would mean “to break,” but pragmatically, we know that they don’t actually want us to break the door; they want us to open the door to let in some fresh air.
|Level of language|Field of linguistics|
|Sounds and letters|Phonology (speech) & graphology (writing)|
|Words and word forms|Morphology|
|Sentences and rules|Syntax|
|Meaning|Semantics|
|Context and language use|Pragmatics|
Which levels of language can Google understand?
Okay, but what does this have to do with search engines? Well, search engines are trying to understand language the way humans do. And they’re getting better and better at it. A couple of years ago, search engines could only understand basic elements of language: they could recognize keywords in your content. Because of that, it was common practice to optimize just for keywords.
But times have changed. Search engines are becoming smarter and smarter, and they are getting better at understanding more levels of language. Google is now trying to understand language at the level of syntax, morphology, semantics, and even pragmatics. How? Let’s find out.
Understanding what characterizes high-quality content
With every update, Google tries to get closer to understanding language the way the human brain does. The Panda update (2011) addressed thin content and keyword stuffing. People could no longer rank high with low-quality pages filled with keywords. Since this update, Google has been trying to understand language at the semantic and pragmatic levels. They want to know what people deem high-quality content: content that genuinely offers information about the search term they used.
Understanding the meaning of phrases
A few years later, with the Hummingbird update (2013), Google took a deeper dive into semantics. This update focused on identifying relations between search queries. It made Google pay more attention to each word in a search query, ensuring that the whole search phrase is taken into account, rather than just particular words. They wanted to be able to understand what you mean when you type in a search query.
Google took that even further. Since they rolled out the RankBrain algorithm in 2015, they have been able to interpret neologisms (words that have not yet been fully accepted into mainstream language, like “coronacation”) and colloquialisms (casual language, like “ain’t” and “gonna”), and to process dialogues.
Understanding different word forms
Google has also become a lot better at understanding different forms of a word or phrase. You no longer have to stuff your articles with the same keyword over and over again. If you’re writing an article about [reading books], Google will recognize various forms of these words, like [read], [reads], and [book]. What’s more, Google also understands synonyms. Write about [novel], [chronicle], and [volume], and Google can still rank you for [book]. Using some variation in your wording makes your texts nicer to read, and that’s what Google finds important, too.
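To make the idea of word forms a bit more concrete, here’s a small Python sketch. It is a toy: the suffix rules and the function names `naive_stem` and `matches_keyphrase` are made up for this illustration, and it is certainly not how Google (or any real stemmer) works. It only shows how [reading], [reads], and [books] can be reduced to shared stems, so a text can match a keyphrase without repeating it literally.

```python
import re


def naive_stem(word: str) -> str:
    """Strip a few common English suffixes (toy rules, not a real stemmer)."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        # Only strip the suffix if at least three letters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches_keyphrase(keyphrase: str, text: str) -> bool:
    """Return True if every stemmed keyphrase word also occurs in the text."""
    text_stems = {naive_stem(w) for w in re.findall(r"[a-z']+", text.lower())}
    return all(naive_stem(w) in text_stems for w in keyphrase.split())


if __name__ == "__main__":
    print(naive_stem("reading"), naive_stem("reads"), naive_stem("books"))
    print(matches_keyphrase("reading books", "She reads a book every week"))
```

A real stemmer or lemmatizer handles far more cases (irregular forms like [read] and [ran], for instance), and synonyms such as [novel] for [book] need a different technique altogether.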
But Google is not just trying to understand content by analyzing text. To identify which results are useful for people, they also use user signals, like the bounce rate, click-through rate, and the time people spend on a website. They are even researching users’ emotions to adapt their search results based on, for example, the choice of wording for a search query.
You might have heard about the most recent big update, BERT (2019). With this innovation, Google is again getting closer to understanding language at a human level. BERT is a Natural Language Processing (NLP) model that uses the context and relations of all the words in a sentence, rather than processing the words one by one in order. With this update, Google can figure out the full context of a word by looking at the words that come before and after it. This helps them provide their users with even more meaningful and fitting results.
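Why does it matter to look at the words on both sides? The Python sketch below tries to illustrate that. It is not BERT and says nothing about Google’s internals; the clue lists in `SENSE_CLUES` and the function `guess_sense` are invented for this example. It simply guesses the meaning of an ambiguous word from clue words around it, once using only the words before it and once using the words on both sides.

```python
from typing import Dict, List, Optional, Set

# Hypothetical clue words for two senses of the word "bank".
SENSE_CLUES: Dict[str, Set[str]] = {
    "riverside": {"river", "water", "fishing"},
    "financial institution": {"money", "loan", "account"},
}


def guess_sense(
    tokens: List[str], index: int, use_right_context: bool = True
) -> Optional[str]:
    """Guess the sense of tokens[index] from clue words in its context."""
    left = tokens[:index]
    right = tokens[index + 1:] if use_right_context else []
    context = set(left) | set(right)
    for sense, clues in SENSE_CLUES.items():
        if context & clues:  # at least one clue word appears in the context
            return sense
    return None  # no clue found, so the word stays ambiguous


if __name__ == "__main__":
    tokens = "he sat on the bank of the river".split()
    position = tokens.index("bank")
    print(guess_sense(tokens, position, use_right_context=False))
    print(guess_sense(tokens, position))
```

In this sentence, the only helpful clue ([river]) comes after [bank], so a reader that only looks backward has nothing to go on, while one that looks in both directions can pick the right meaning.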
A linguistic approach to SEO
So, what does this mean for how you should optimize your content? Google is trying to understand language like we do. And with every update, they are getting closer to understanding language at a human level. They want to provide their users with high-quality search results that fit their goals.
Simply put, this means you should write for your audience, and not for search engines. Research your audience, try to get to know them, and provide them with the information and solutions they are looking for!
Write naturally and mix things up
Moreover, try to write naturally. Don’t just stuff your text with the keyphrase you’re trying to rank for. That’s not only unpleasant to read for your visitors, but also bad for your rankings. Google can understand synonyms, different word forms, and the context of words, so make use of that! If you’re trying to rank for [cat], don’t just use [cat] over and over in your text. Use synonyms, like [kitty] or [puss]. Mix things up and use the plural form, [cats], and related phrases, like [litter box] or [cat food].
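If you’d like a rough way to check this in your own text, here’s a minimal Python sketch. It is only an illustration: the function names `exact_keyword_density` and `term_counts` are made up, it is not how Yoast SEO or Google measures anything, and it only handles single words (a phrase like [litter box] would need extra work). It compares how much of a text consists of the exact keyword with how often variations show up.

```python
import re
from collections import Counter
from typing import Dict, List


def _words(text: str) -> List[str]:
    """Lowercase the text and split it into plain words."""
    return re.findall(r"[a-z]+", text.lower())


def exact_keyword_density(text: str, keyword: str) -> float:
    """Share of all words in the text that are exactly the keyword."""
    words = _words(text)
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


def term_counts(text: str, terms: List[str]) -> Dict[str, int]:
    """Count how often each single-word term occurs in the text."""
    counts = Counter(_words(text))
    return {term: counts[term.lower()] for term in terms}


if __name__ == "__main__":
    stuffed = "cat cat cat cat food"
    natural = "My kitty loves her cat food, and most cats do."
    print(exact_keyword_density(stuffed, "cat"))
    print(exact_keyword_density(natural, "cat"))
    print(term_counts(natural, ["cat", "cats", "kitty"]))
```

A very high share of the exact keyword, combined with few or no variations, is a hint that a text may read as stuffed. What counts as “too high” is a judgment call, so no threshold is given here.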
Yoast SEO Premium can help you with this. Our plugin also recognizes synonyms, different word forms, and related phrases, and helps you optimize your content for these words. This allows you to write in a more natural way, so you can satisfy your users and rank high in the search results!