- Registration wall
- Rich snippets
- Duplicate content
- Structured data
- Page type
- Clickthrough rate
- Query deserves freshness
- Query deserves diversity
- Anchor text
- Featured snippets
- Knowledge Graph
- Your money or your life
- Author rank
- Map pack
- Organic search
- Search engine marketing
- Pay per click
- Cost per click
- Directory / subdirectory
- Top level domain
Below is a glossary of some SEO terms. It is highly incomplete (I’m writing new definitions every few days, but not at any particular pace), idiosyncratic (many of the terms are my own, as are all of the definitions) and unorganized. But enjoy it anyway, if you wish.
Quality is a large set of organic search ranking factors that have to do with user experience. Quality can have an impact either at the page level or at the domain level. Some examples of low quality: extremely poorly written pages with lots of spelling and grammar mistakes, pages that are covered with crass commercial advertisements, pages that take a long time to load, very short articles, articles that are split into many pages (invariably to juice ad impression metrics), SKUs that are out of stock, directory pages with no listings, duplicate content, functional duplication.
Authority is a large set of organic search ranking factors that have to do with relative importance. Authority can have an impact either at the page level or at the domain level. Some examples of authority metrics:
- “domain rank”
- “author rank”
- information architecture
- URL structure
- social shares
- social likes
- unlinked mentions
Relevance is a large set of organic search ranking factors that have to do with topical significance. Relevance can have an impact either at the page level or at the domain level. The easiest of the “big three” ranking factors to understand intuitively, relevance is the most difficult to define. Consider it to be how search engines know to show you results for the thing that you actually were looking for.
Crawling is what happens when search engines use special software, called spiders, bots, robots or simply crawlers, to access as many web pages as they can discover, find out what’s on them, and store what they find in an enormous index for later ranking. Much of the web is not accessible to search engines, which is to say that it can not be crawled. Pages may be blocked by instructions in robots.txt, hidden behind registration walls, known only via nofollowed links, or simply not yet discovered.
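A minimal robots.txt, assuming a hypothetical /members/ area that the site owner wants to keep crawlers out of, might look like this:

```
User-agent: *
Disallow: /members/
```

Any well-behaved crawler that fetches this file will skip URLs under /members/ and crawl everything else.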
Indexation is the process by which search engines add known web pages, which they’ve previously crawled, to an index, so that those pages can later rank for queries entered by searchers. Choosing what to index and what not to index is an important strategy for advanced SEOs and large sites. More pages in the index means more opportunities to rank for more queries, but can also cause problems related to canonicalization and duplication, excessive amounts of thin content, &c.
Ranking is what happens in the background when a query is entered into a search engine. In the tiniest fraction of a second, the search engine uses an enormously complex algorithm to filter and sort the searcher’s query against every document that has been crawled and indexed. Ranking algorithms are search engines’ “secret sauce” and the subject of endless analysis by SEOs.
PageRank is a numerical value that indicates the relative importance of any URL in comparison with every other URL on the web. Named after Larry Page, who founded Google with Sergey Brin, PageRank was the first significant differentiator between what became Google and all other search engines up to that point. Previously, search engines had been concerned primarily with relevance; PageRank allowed Google to use authority as a ranking factor.
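The original PageRank paper describes an iterative computation. Here is a toy sketch of that general idea in Python, using an invented three-page link graph and the commonly cited damping factor of 0.85 – a simplification, not Google’s actual implementation:

```python
# Toy PageRank via power iteration over a tiny, made-up link graph.
links = {
    "a": ["b", "c"],  # page "a" links to "b" and "c"
    "b": ["c"],
    "c": ["a"],
}
d = 0.85  # damping factor: chance a "random surfer" follows a link
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}  # start with equal ranks

for _ in range(50):  # iterate until the ranks stabilize
    new = {p: (1 - d) / len(pages) for p in pages}
    for page, outlinks in links.items():
        share = d * rank[page] / len(outlinks)  # split rank among outlinks
        for target in outlinks:
            new[target] += share
    rank = new

# "c" collects links from both "a" and "b", so it ends up most important
```

Each page’s rank is a share of the total importance in the graph; the ranks always sum to one, and pages with more (and more important) inbound links end up with higher values.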
A registration wall, or “reg wall,” is a sort of conceptual barrier that prevents parts of some websites from being accessed by users who have not registered. Because search engine crawlers do not register for websites, and can not log into them, anything “behind the reg wall” will not be crawled or indexed, and it can not rank for queries.
Snippets are the part of search engine results pages that show information about the specific results. In a typical Google search result, there is a blue area (typically populated by the page’s title), a green area with the page’s URL or breadcrumbs, and a text area that is often taken from the page’s meta description. Modifying a page’s title and meta description to achieve more appealing snippets, leading to higher clickthrough rate, is a critical SEO strategy.
Rich snippets are special forms of snippets in search engine results pages for specific types of results, including recipes (with cooking times), review-ratings (with stars), videos (with thumbnails), &c. Typically, the way search engines identify which pages should get rich snippets is by looking for specific forms of structured data that SEOs have caused to be added. Rich snippets, in many cases, can dramatically improve clickthrough rate by being visually striking and informative.
Taxonomy is an ordered, structured, organized way of arranging concepts. On websites, a simple taxonomy may typically involve top level categories, subcategories and product pages. Most complex websites have multiple taxonomies, often overlapping. Tagging systems are similar to taxonomies, but have no structure. Both tagging systems and taxonomies are part of an overall ontology.
Duplicate content is what happens when the same content is available at multiple URLs. This can happen on one site or on two sites. Duplicate content can be caused by:
- Scrapers: someone programmatically copies pages on your site and publishes them on his own site
- URL parameters, such as for tracking purposes
- Default settings in some off-the-shelf web frameworks that automatically create multiple URLs for the same thing (often an “SEO friendly” version and the real version)
There is no direct penalty just for having duplicate content, but duplicate content can be a negative quality signal. It can also cause serious problems with crawl ratio and diffusion of link equity.
Structured data describes the concept of using markup – specific HTML tags, or other forms like JSON-LD – to explain to search engine crawlers what exactly it is that they’re seeing on your website. For example: is a string of characters a recipe, an address, a review or an article? Schema.org is the best example of structured data specifically for rich snippets, though structured data also has many other uses, and it helps Google create the Knowledge Graph and validate information in it.
Much of basic HTML also can be, and should be, considered structured data, such as ordered lists, unordered lists and list items.
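For example, a recipe page might embed JSON-LD like the following – a hypothetical recipe, but using real Schema.org types – so that search engines can recognize the cooking time and star rating for a rich snippet:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Recipe",
  "name": "Chocolate Chip Cookies",
  "cookTime": "PT25M",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.5",
    "ratingCount": "120"
  }
}
</script>
```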
Ontology is the overall universe of information in any subject area. A website can have many taxonomies; the way that they all combine and interact with one another is its ontology.
Noindex is a meta instruction to a search engine, telling it not to add a specific URL to its index. Noindex does not block a page from being crawled, though it is very likely that noindexed pages will be crawled less frequently than equivalent pages that are indexed. Noindex can also sometimes be used in robots.txt, but this is not officially supported and only sometimes works.
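In practice, noindex most often takes the form of a robots meta tag in the page’s head:

```html
<!-- In the page's <head>: keep this URL out of the index -->
<meta name="robots" content="noindex">
```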
rel=nofollow is an attribute in the HTML a element (for links). It tells search engines two things:
- Do not crawl this link.
- Do not pass any link equity to the URL that is linked.
Nofollow can also be used as a meta instruction on a page. Doing so makes all of the links on that page nofollowed.
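Both forms look like this in HTML (the URL is invented):

```html
<!-- Link-level: this one link passes no equity and won't be crawled -->
<a href="https://example.com/untrusted/" rel="nofollow">some link</a>

<!-- Page-level meta instruction: every link on this page is nofollowed -->
<meta name="robots" content="nofollow">
```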
In search engine result pages, an impression is counted whenever a specific URL is shown in the results for a query. This terminology is borrowed from paid search, in which advertisers sometimes pay for their advertisements to be shown on an impressions basis (CPM, or cost per mille, or cost per thousand impressions). Clicks divided by impressions equals clickthrough rate.
A pageview is what happens any time a page is viewed. This may seem simple, but pageviews are commonly confused with visits, sessions and impressions. The average visit is likely to have multiple pageviews. A pageview can have multiple ad impressions, and a session can have multiple visits.
A visit is when somebody comes to a website with his or her browser, and the visit lasts until he or she leaves the website. Visits are often confused with sessions and pageviews.
Sessions start when someone comes to a website with his or her browser, and continue until he or she stops clicking on further pages, leaves the website, and thirty minutes elapse. In other words, if I visit a site, click a couple of pages, return to a search engine, search for something else, click into the same website and view more pages, that is all one session. It is two visits, however.
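The thirty-minute rule can be sketched in a few lines of Python. The timestamps are invented, and real analytics tools handle extra edge cases (midnight cutoffs, campaign changes) differently:

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)  # gap that ends a session

def count_sessions(timestamps):
    """Count sessions: a gap of more than 30 minutes starts a new one."""
    timestamps = sorted(timestamps)
    sessions = 1 if timestamps else 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > TIMEOUT:
            sessions += 1
    return sessions

hits = [
    datetime(2020, 1, 1, 12, 0),
    datetime(2020, 1, 1, 12, 10),  # 10 minutes later: same session
    datetime(2020, 1, 1, 13, 30),  # 80-minute gap: a new session begins
]
# count_sessions(hits) == 2
```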
A page type is a collection of pages on a website that all share basically the same design, function and place in the site’s hierarchy. Often page types are thought about in terms of “templates.” Home page is a page type that almost every site has. Other common page types are article pages, product pages, listing pages, category pages, &c.
In search engine results pages, clickthrough rate, or CTR, is the ratio of clicks to impressions. CTR can be calculated overall for an entire site, for a specific query or for a specific page, but none of these uses of the metric is particularly helpful. The best way to use CTR is to look at the ratio of clicks to impressions for a specific combination of one query and one URL. In other words: when people search for X query, and Y page ranks, Z is how likely they are to click.
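That per-query, per-URL calculation can be sketched in Python, using invented Search-Console-style rows of (query, URL, impressions, clicks):

```python
# Hypothetical data: (query, url, impressions, clicks)
rows = [
    ("blue widgets", "/widgets/blue", 1000, 50),
    ("blue widgets", "/widgets", 400, 4),
]

def ctr(clicks, impressions):
    """Clicks divided by impressions."""
    return clicks / impressions if impressions else 0.0

# CTR for each specific (query, URL) combination
by_pair = {(q, u): ctr(c, i) for q, u, i, c in rows}
# by_pair[("blue widgets", "/widgets/blue")] == 0.05, i.e. 5% CTR
```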
Query deserves freshness, or QDF, is an algorithm used by Google to determine whether and when the searcher intent for a query is “fresh,” and to change the SERP for that query to show content that is more “fresh.”
One example of QDF in action is when a celebrity dies unexpectedly. At the moment of the celebrity’s death, search results for the celebrity’s name are often primarily informational. Shortly after his or her death, search results change to being primarily news. Google does not have a person monitoring 24 hour cable news channels to know when somebody dies and then to change the search results; that is all done algorithmically. QDF causes URLs to rank well that otherwise would not be ranking well.
Query deserves diversity, or QDD, is a hypothesized algorithm that could be used by Google to determine whether and when the searcher intent for a query has the special need for the SERP to be diverse.
Sometimes “diverse” could mean a mix of transactional, navigational and informational results, where otherwise one specific navigational result might rank in all ten positions. Sometimes it could mean a variety of different domains, where otherwise one site would be dominant. Sometimes it could mean different interpretations of the same word, where one interpretation is the normal one.
QDD causes URLs to rank well that otherwise would not be ranking well.
Sitelinks are deeper links to internal pages on a site that appear in SERPs, often indented in a set of six. Generally, sitelinks happen when multiple pages from the same domain would rank, and Google pulls them all together to make them more helpful to searchers.
Anchors are specific locations on pages to which links can point. In order to link to a part of a page that’s not the top of it, an anchor must be created, appending #anchor-name to the end of the page’s URL. Links to the page’s URL, followed by #anchor-name, will take users straight to that part of the page.
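For example, on a hypothetical page:

```html
<!-- Somewhere down the page, an element with an id defines the anchor -->
<h2 id="pricing">Pricing</h2>

<!-- Elsewhere, a link can point straight to that part of the page -->
<a href="/plans.html#pricing">See pricing</a>
```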
Anchor text is the part of a link that says what is being linked. “Click here” is a very commonly used anchor text, but not a good one. Anchor text is a very important relevance signal in SEO.
A menu is a list of links used for navigation. Menus can be page-specific, with anchor links pointing to different parts of a long page, or site-specific, for navigating around a site. Sometimes menus are specific to distinct sections of sites. Menus are important for SEO because sitewide menus result in a bunch of links being replicated across every page. That causes a lot of link equity to go to those pages, sending the signal that those pages are relatively important.
Featured snippets are search results from Google above the first position, that attempt to answer the searcher’s query without the searcher needing to click through to the sites that rank. Also called “answer boxes,” “scraped results,” &c., featured snippets can come in many forms, including:
- Just an answer
- An answer with a link
- An answer with an image
- A table of data
Featured snippets should not be confused with rich snippets or with the Knowledge Graph.
The Knowledge Graph is a database of terms and their relationships that Google displays in SERPs for certain kinds of queries, most often informational. Information in the Knowledge Graph often appears on the right side of SERPs on desktop, or in the middle of SERPs on mobile.
Your money or your life, or YMYL, is a term from Google’s Quality Rater Guidelines that refers to the results of queries that could affect a searcher’s health or wealth; medical and financial information are the two most obvious subsets. YMYL queries and the pages that rank for them are held to a higher standard for quality, because it is especially important to Google not to get these types of queries wrong.
Author rank is a hypothesized ranking factor that Google could use (or could have used) to identify the person who created a piece of content, and to give more or less weight to that content based on its creator’s identity and known expertise. For example, if I write articles about cars across many websites and I am a well known authority on cars, then my article about cars on a certain website would be given some higher author rank than an article about cars by somebody who doesn’t know anything about them, and it would be given a higher author rank than an article that I wrote about some other topic, like flowers.
A map pack is the inclusion in Google SERPs of a map, with several (typically three) special local results. Clicking on the results does not lead to their specific sites, however, but to a more detailed local SERP with information from Google Maps, the Knowledge Graph, &c. Map packs are used when Google discerns a highly local and transactional searcher intent.
Adwords is a form of PPC advertising that Google operates in parallel with, and on top of, its search results. The “normal” or “regular” search results are called organic, while Adwords is paid.
Organic search is the non-paid search results that searchers get on search engines. This is contrasted with paid search results, such as Adwords. SEO focuses on organic search, while PPC focuses on paid search results.
Search engine marketing, or SEM, can mean either SEO and PPC together, or it can mean just PPC.
Adsense is Google’s system of text advertisements that it displays on participating publishers’ websites.
Pay per click advertising, or PPC, is a form of marketing that charges the advertiser only when someone (typically a searcher) clicks on the advertisement. The amount the advertiser pays the network for each click is the CPC. Adwords is the biggest PPC system on the internet.
Cost per click, or CPC, is the amount paid by the advertiser in PPC advertising. CPCs can often be used in SEO as a way of determining the value of ranking well for a specific query: if people are paying a lot of money to get clicks for that term, then it is probably worth a lot to rank well, and very competitive.
Pagination is a system that websites use for breaking up one page into multiple pages. Sometimes this is necessary, such as when a site wants to list every article, but there are tens of thousands, so the links need to be broken up. Articles themselves can also be broken up. In rare cases, this is done to improve UX, but commonly it is done to inflate pageviews or ad impressions artificially.
A directory is the area in a URL between slashes. A subdirectory is a directory that has a parent level directory.
For example, in the URL http://www.myfakewebsite.com/hello/world/page.html, /hello/ and /world/ are both directories. /world/ is a subdirectory.
In the early days of the internet and the web, before modern web frameworks, web apps, databases, CMSs, &c., websites were created and managed by putting files (pages) into specific directories on a server. Every directory level was then visible in each page’s URL.
Directories are also known as folders.
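The directories in the example URL can be pulled out with Python’s standard library:

```python
from urllib.parse import urlparse

# The example URL from the definition above
url = "http://www.myfakewebsite.com/hello/world/page.html"
path = urlparse(url).path            # "/hello/world/page.html"
directories = path.split("/")[1:-1]  # drop the leading "" and the filename
# directories == ["hello", "world"]
```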
Folders are another name for directories.
In publishing, a river is a reverse-chronological list of links to content. Sometimes a river can be an entire page, like some listing pages. Other times, a river is just one part of a page.
A blog is a specific kind of site (or section of a site) that organizes content reverse-chronologically, using a river. Blogs often also have categories, tags, multiple authors, static pages and other forms of content, but none of these is essential to what a blog is.
A domain is the “site” part of a URL. The web, and the internet more broadly, is largely organized around siloed domains. In the URL http://www.myfakewebsite.com/hello/world/page.html, the domain is myfakewebsite.com.
A top level domain, or TLD, is the non-unique part of a domain. The most common and famous TLD is .com. TLDs are either ccTLDs (country-specific) or gTLDs (generic).
Country code top level domains, or ccTLDs, are top level domains for each country that are specific to that country. Examples include:
- .us for the United States
- .mx for Mexico
- .de for Germany
- .il for Israel
Some ccTLDs can be treated as gTLDs, like .tv (Tuvalu) or .co (Colombia).
For content that is targeted to a specific country, using that country’s ccTLD can be a good way to make sure that it shows up in search engines used by people in that country.
Generic top level domains, or gTLDs, are top level domains that are not country-specific. The most famous gTLDs are well known to everybody: .com, .net and .org.
There is also a very wide range of gTLDs that seem like part of a weird joke. For example: .pizza, .ninja and .horse.
A subdomain is the part of a URL to the left of the domain name. Not every URL has a subdomain, and some have many. For example, in the URL http://www.myfakewebsite.com/hello/world/page.html, the subdomain is www.
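A naive way to separate subdomain, domain and TLD in Python – naive because multi-label ccTLD suffixes like .co.uk really require the Public Suffix List – would be:

```python
# Naive hostname split, assuming a single-label TLD like .com.
host = "www.myfakewebsite.com"
labels = host.split(".")
subdomain = ".".join(labels[:-2])  # "www" (may be empty, or multi-level)
domain = ".".join(labels[-2:])     # "myfakewebsite.com"
tld = labels[-1]                   # "com"
```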
A host is a distinct server for a website, or for part of a website that uses a specific subdomain. For example, in the URL http://www.myfakewebsite.com/hello/world/page.html, www is a subdomain. If the www part of that domain is served by a www-specific host, then I would call www the host.
Perhaps I also have a blog on my site, accessible at http://blog.myfakewebsite.com. If the blog part is served by a different system, then blog would be called a host.
A server is a computer whose primary job is to provide resources for other computers on a network, such as the internet. Web servers respond to requests from browsers with assets that, put together, make up the websites that people see.
A ranking factor is one of the hundreds of different inputs that all go into search engines’ organic ranking algorithms. Identifying ranking factors and optimizing for them is one of the very fundamental aspects of all SEO work. Ranking factors are often understood by grouping them into different types, such as quality, authority and relevance. Some ranking factors, like QDD and QDF, may only be applied to certain queries at certain times.
A penalty is an action taken by search engines to prevent a site, or a group of sites, from ranking well in search results. Penalties can apply to entire domains or to specific combinations of URLs and queries, or anything in between. When SEOs talk about “penalties,” they often refer both to algorithmic suppressions and to manual actions. Googlers, however, use the term “penalty” only rarely, and when they do, it always refers to manual actions, not to algorithmic suppressions.
- Featured snippet without link to source
- Quality rater guidelines
- Rel=prev and rel=next
- Domain name system