The contents of this blog post have been removed to prevent any hurt feelings, but if you’re interested in the subject matter, why don’t you get in touch with me and we can discuss privately? I love smart conversations about technical SEO with people who know a lot of things, especially when they disagree with me.
So… last summer, Google announced that they would begin using HTTPS as a ranking signal:
…over the past few months we’ve been running tests taking into account whether sites use secure, encrypted connections as a signal in our search ranking algorithms. We’ve seen positive results, so we’re starting to use HTTPS as a ranking signal. For now it’s only a very lightweight signal — affecting fewer than 1% of global queries, and carrying less weight than other signals such as high-quality content — while we give webmasters time to switch to HTTPS. But over time, we may decide to strengthen it, because we’d like to encourage all website owners to switch from HTTP to HTTPS to keep everyone safe on the web.
Now Googler Gary Illyes has stated that HTTPS isn’t a ranking factor, but actually is just a tie-breaker:
— Woj Kwasi (@WojKwasi)
More analysis here:
Google uses HTTPS as a tie breaker when it comes to the search results. If two sites are virtually identical when it comes to where Google would rank them in the search results, HTTPS acts as a tie breaker. Google would use HTTPS to decide which site would appear first, if one site is using HTTPS while the other is not.
Does this make any sense?
Magic SEO Ball says: very doubtful.
There’s been some dispute among SEOs about just how important, if at all, HTTPS actually is as a ranking factor. Here’s an SEO arguing that there are seven specific things on which a site should focus before worrying about HTTPS. Her list:
- Consistent URLs Everywhere
- Short, Non-Parameter Heavy URLs
- Real, Relevant Content, Not “SEO Copy”
- Live Text High on the Page, Not Stuffed at the Bottom
- Strong CTA-Friendly Title Tags
- Speed Up Your Load Time
For what it’s worth, I think she’s basically right, though this varies a lot based on the niche and the size of the site. For much bigger sites it may make sense to tackle other technical issues first; for certain types of sites it may make sense to improve the mobile experience first. And of course, for e-commerce or any site that takes people’s money, HTTPS does matter.
So what exactly is a tie-breaker?
Tie-breaking is a special system to determine the outcome of a contest when all the other inputs are exactly equal. A tie-breaker is only used when there is actually a tie to break; if a certain input is calculated all the time, then it is not actually a tie-breaker.
One good example of a tie-breaker is the vote of the Vice President in the United States Senate. He doesn’t typically sit as a member of the Senate, but is officially its president, and he doesn’t vote at all unless his vote would change the outcome (i.e., unless the Senate’s vote is tied).
Another example of a tie-breaker is how professional football teams are ranked at the end of a season. The NFL has a complex set of rules for determining which of two teams with identical win-loss records will advance to the playoffs and which will not (and it gets even more complex when there are three teams that are tied, and so on):
- Head-to-head (best won-lost-tied percentage in games between the clubs).
- Best won-lost-tied percentage in games played within the division.
- Best won-lost-tied percentage in common games.
- Best won-lost-tied percentage in games played within the conference.
- Strength of victory.
- Strength of schedule.
- Best combined ranking among conference teams in points scored and points allowed.
- Best combined ranking among all teams in points scored and points allowed.
- Best net points in common games.
- Best net points in all games.
- Best net touchdowns in all games.
- Coin toss
Critically, these rules only get invoked if two teams finish with the same record of wins and losses.
So, is HTTPS – which Google has previously used its official blog to describe as a “ranking signal” – not actually a ranking signal, but actually just a tie-breaker that gets used in the rare scenario that two pages rank exactly equally?
Almost certainly not. In order for HTTPS to be a tie-breaker, Google would have to compute all the ranking signals for a certain query, and then only in cases of a perfect 50-50 tie would they then tip the scale in favor of the site with HTTPS. But what if both sites use HTTPS, or what if neither site uses HTTPS? Then invoking the tie-breaker wouldn’t even break the tie.
HTTPS is, as John Mueller has said and as SEOs have confirmed, a relatively weak and – in most cases – somewhat insignificant ranking signal. For the sake of comparison, another relatively weak and somewhat insignificant ranking signal is keyword use in H2s.
Being a small ranking signal does not mean that it’s a tie-breaker.
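To make the distinction concrete, here is a minimal Python sketch of the two models. Everything in it is an assumption for illustration: the `Page` class, the `base_score` field (standing in for "all the other signals combined") and the `HTTPS_WEIGHT` value are made up, and none of it reflects how Google actually computes rankings.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    base_score: float  # hypothetical combined score from every other signal
    https: bool


HTTPS_WEIGHT = 0.01  # assumed, deliberately tiny weight; not a real Google value


def rank_with_weak_signal(pages):
    """HTTPS is always added to the score, so it can reorder near-equal pages."""
    return sorted(
        pages,
        key=lambda p: p.base_score + (HTTPS_WEIGHT if p.https else 0.0),
        reverse=True,
    )


def rank_with_tie_breaker(pages):
    """HTTPS only affects the order when base scores are exactly equal."""
    return sorted(pages, key=lambda p: (p.base_score, p.https), reverse=True)


pages = [
    Page("http://a.example", 0.800, False),
    Page("https://b.example", 0.795, True),
]
print([p.url for p in rank_with_weak_signal(pages)])
print([p.url for p in rank_with_tie_breaker(pages)])
```

With these made-up numbers, the weak-signal model lets the HTTPS page edge past a slightly stronger HTTP page, while the tie-breaker model leaves the HTTP page on top because the base scores are not exactly equal. And if both pages (or neither) use HTTPS, the tie-breaker has nothing to break.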
Incidentally, this is not the first time that Mr. Illyes has made public statements about how Google works that have beggared belief among SEOs. Just a couple of months ago, he said that Panda is updated in real time, which made absolutely no sense at the time; we now know this to be false.
Matt Cutts often said things that might be true only in some extremely literal or extremely general sense, and which needed to be parsed carefully so their meanings could be teased out, and John Mueller often seems to be – I want to put this charitably – filling in the gaps in his knowledge with educated guesses. But Gary Illyes’s record of disputable and provably false statements suggests that he should be prevented from continuing to offer them.
I’ve heard an SEO person say that Google can scan my Gmail and use it to discover new URLs to crawl. Does this really happen?
Magic SEO Ball says: My sources say no.
This is actually something that we have believed for some time, and something that we have told a bunch of people, including in interviews for serious SEO positions at really excellent companies (Oops!).
So it’s a bit surprising to learn that we were almost certainly incorrect.
We posted 4 total pages … and then asked different groups of users to email links to those pages… We asked 20 to 22 people to send gmails sharing the links for each article to the various pages. One group was asked to share article 1, a different group was asked to share article 2, and so forth. The goal was to see if Google would spot these links in the gmails, and then crawl and index those URLs… there was very little to see. The results were wholly unremarkable, and that’s the most remarkable thing about them!
The test lasted less than two weeks, so it’s possible that those URLs would eventually have gotten crawled.
It’s also possible that Google will use Gmail to discover new domains to crawl, but not specific individual URLs.
So it would definitely be worthwhile to try repeating the test using different parameters, but until we see evidence demonstrating otherwise, it seems fair to say that Google does not crawl the URLs they see in Gmail.
Last month, Google announced a change to the way they’ll rank mobile search results, to begin on 21 April.
Now Googler Zineb Ait Bahajji is stating that this change will be bigger than Panda and Penguin.
Zineb from Google at #smx Munich about the mobile ranking update: is going to have a bigger effect than penguin and panda!
— Aleyda Solis (@aleyda)
Is she right?
(Or is this like when Googler Gary Illyes claimed incorrectly that Panda was already being updated in real time?)
Magic SEO Ball says: Don’t count on it.
The strange thing to consider about the upcoming change to mobile search rankings is the way Google announced it: they may never have given as much information, as far in advance, about a genuinely meaningful update to their organic ranking algorithm. It was truly an unprecedented event.
Unless, that is, the change is actually going to be relatively minor, and the announcement and all the Twitter hype and hoopla are really just a way to get webmasters and SEOs to do what Google wants them to do, which is to make mobile-friendly websites, preferably of the responsive or adaptive varieties.
We try not to be too skeptical and we definitely don’t believe that Google is lying about the mobile rankings change, but we have to wonder whether Google’s search quality team is really going to shoot themselves in the foot by providing worse search results in some cases, just because the pages happen to be mobile optimized. Tl;dr: they aren’t.
Panda and Penguin have been, at best, mixed successes for Google. Completely aside from the pissed off webmaster and SEO communities, we are aware of many SERPs that are lower quality as a result of Google’s attempts to use machine learning to create new algorithmic ranking factors.
After 21 April, expect to see changes, but don’t expect the world to end for great sites whose pages aren’t mobile friendly, and don’t expect garbage sites with responsive pages to start crushing their authoritative, very relevant, high-quality competitors.
At SMX West, a Googler named Gary Illyes claimed that the Panda algorithm is now running constantly, and that sites affected by it only need to fix their problems and be recrawled, after which they should regain their traffic.
— Rae Hoffman (@sugarrae)
— Eric Wu ( ･ㅂ･)و ̑̑ (@eywu)
Is he correct / telling the truth?
Magic SEO Ball says: very doubtful.
First, this claim is completely in conflict with the evidence at hand. We have firsthand knowledge of at least one large site that got hit by Panda around 24-25 October 2014, eliminated the problem almost immediately by noindexing a quarter of its URLs ((Without going into details, we have 95+% certainty that those URLs caused the Panda problem in the first place.)), watched those URLs drop from Google’s index, and still has not recovered from Panda.
Second… well, there isn’t much more to say about this. While the Panda algorithm itself might have been updated at some point since late October, there is zero reason to believe that its data has been refreshed. And there’s also no reason to think that Google would run an updated Panda algorithm with stale data. So, almost certainly there’s been neither an algorithm update nor a data refresh.
SEOs who do high quality work are generally in agreement about this.
So does this mean that Mr. Illyes was misleading us or lying to us, or does it mean that he was mistaken or confused?
We think the latter explanation is far more likely. His excuse that he “caused some confusion by saying too much and not saying enough the same time” sounds like a nice try to save face, which is understandable. It’s probably an internal goal at Google to get to a point where Panda can be run in real time, but this requires two things:
- The quality of the algorithm has to be high enough. This means that false negatives need to be reduced, and false positives need to be eliminated.
- The logistics of running the algorithm have to be workable. This means that the computing complexity has to be manageable enough that Google’s engineers and infrastructure can handle it on a constant basis, rather than just on a periodic basis.
While the second issue is the kind of problem that Google is pretty good about solving – more engineers, better engineers, more hardware, more powerful hardware, whatever – the first issue is something that may not be possible in the near future.
This question comes to us via Twitter:
@gesher What’s your take on creating articles on your site for the purposes of syndication?
— Jesse Semchuck (@jessesem)
Magic SEO Ball says: my reply is no.
Content syndication as an audience development strategy
Creating articles specifically with the intent of having them syndicated on other sites can be a fine way to expose those sites’ different audiences to your product, service, ideas or own website. When doing so, you should take care of the following concerns.
Every website has a different audience. Some are huge and some are tiny; some are general and some are specific; some are overlapping and some are distinct. Take care to ensure that your articles are appearing in the right places online by taking the time to understand the audience profiles of the sites where they will be syndicated. Failing to do so may cause your content to be ignored at best, or resented and marked as spam at worst.
How much is too much? If your syndicated content overwhelms the unique content on your own site, you are syndicating too much. If your syndicated content overwhelms the original content of the sites on which it appears, you are syndicating too much.
Many people have a certain set of content websites that they visit on a regular basis, daily or weekly; or an RSS feed reader that they check on a regular schedule; or the expectation that they’ll be able to use Twitter and Facebook to find out what’s happening. Some people use all three methods. If they follow your site and another site that syndicates your site’s content, or multiple sites that syndicate your site’s content, they’re going to start seeing your articles repeatedly. While that may strike you as desirable, it may also backfire by bothering this extended audience, preventing people from ever becoming your customers or followers.
In summary, what many syndication issues – with audience, volume and repetition – have in common is that they are caused by a casual “If you build it, they will come” approach that discounts the users’ interests, wishes, and experience. This may result from a surfeit of technical ability to effect syndication (viz., by RSS) and a deficit of concern for other web denizens.
Consider, instead of a push method, a pull method by which you publish your own material on your own site, and allow it to be republished by request by other webmasters on an article by article basis.
Content syndication as an SEO strategy
In general, the main reason to be interested in content syndication as an SEO strategy is for link building: the idea being that you can create feeds with your articles, including followed links back to your own site, and allow other sites to use the articles with proper canonical tags.
While it would be a stretch to say that Google’s policies about link building have historically been clear, one trend that has emerged and that can be stated with clarity is that Google does not want to allow any link building strategy that can scale. In effect, this means that asking for links and giving them is fine, but that manipulation of the overall link graph is not fine.
Does content syndication for SEO purposes (i.e., for the purposes of increasing your articles’ link authority and your site’s domain authority) work? Yes, but you’d better assume that links added without any human effort by other sites’ webmasters can be devalued without any human effort by Google’s search quality engineers.
And that doesn’t even touch on the risks involved, which I outlined briefly in this Quora question: Can my search engine ranking be hurt if I post my blog articles on my own site and on several other blogging sites?
… if you publish the same article on your own site and on other sites, you’re running the risk that it will rank on the other sites but not on your own… employing this practice at scale may expose your site to Panda… Instead, consider creating original content for different sites that is useful to each site’s audience.
So if you’re thinking about audience development and want to do content syndication, I think it is OK, but you should consult an SEO and seriously weigh the SEO concerns, along with the possibility that syndicating content in the wrong ways may do more harm than good. And if you’re thinking specifically about a content syndication strategy for SEO, there are much better ideas out there.
My SEO submitted requirements that our CMS should not allow titles longer than seventy characters. He also mentioned elsewhere in the requirements document that headline should generate the title (which should then be editable). So headline is the same as title, right?
I built the headline field in the CMS so that it will not accept a headline from editors longer than seventy characters. Good?
Magic SEO Ball says: my reply is no.
When an SEO talks about a title, he means the “title” tag.
When he talks about a headline, he almost definitely means the on-page “h1” tag.
It’s an SEO best practice for titles to be limited to seventy characters, because longer titles are likely to be truncated when they appear in search engine results. There is no particular character limit on headlines, except that they should not be very long or overwhelming to users.
It’s also a good idea for editorially created headlines to generate titles, but there can be some collision over the fact that headlines have no length limit while titles do. Therefore, if you’re working on a CMS and get these requirements from your SEO, keep in mind that he probably wants the title field to be editable.
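As a rough sketch of how a CMS might satisfy both requirements, the Python below leaves the headline unrestricted, suggests a default title from it, and only enforces the seventy-character limit on the title. The function names and the word-boundary trimming are our own assumptions, not part of any particular CMS.

```python
TITLE_MAX_CHARS = 70  # the title limit from the SEO's requirements


def default_title(headline: str, max_chars: int = TITLE_MAX_CHARS) -> str:
    """Suggest a title from the headline, trimming at a word boundary if needed."""
    headline = " ".join(headline.split())  # collapse stray whitespace
    if len(headline) <= max_chars:
        return headline
    # Look one character past the limit so a word ending exactly at the limit survives.
    head, sep, _ = headline[: max_chars + 1].rpartition(" ")
    # With no space to break on, fall back to a hard cut.
    return head if sep else headline[:max_chars]


def validate_title(title: str, max_chars: int = TITLE_MAX_CHARS) -> bool:
    """Only the title is length-limited; the headline is never checked here."""
    return 0 < len(title.strip()) <= max_chars


headline = "word " * 20  # a headline well over seventy characters is still accepted
suggested = default_title(headline)
print(len(suggested), validate_title(suggested))
```

The suggested title is only a starting point: editors should still be able to overwrite it, with `validate_title` run on whatever they save.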
Based on a true story.
Magic SEO Ball says: very doubtful.
But really, why would you do this? If you’ve got an area of your site that’s about deals and other related things, and you’ve decided that it will be called “Deals,” why would you use some other term instead of “deals” to tell your users what that area of the site is about? It just defies logic, not to mention the first rule of marketing that we learned at our first SEO job: never make up a different term to describe your product that’s separate from the one you’ve already made up.
Based on a true story.
Magic SEO Ball says: My reply is no.
Seriously though, why would you do this?
Maybe if you are creating a new site and want it to be private and specifically aren’t interested in receiving any search engine traffic, that might make sense… but if you’ve already got a big site that gets a lot of its traffic from SEO, you should really make sure that your robots.txt doesn’t block search engines.
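If you want a quick sanity check before a robots.txt change ships, Python's standard library includes a parser for it. The sketch below is a minimal example: the `blocks_crawler` helper, the sample rules and the example.com URLs are ours, and the stdlib parser may not match every nuance of how Google interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser


def blocks_crawler(robots_txt: str,
                   user_agent: str = "Googlebot",
                   url: str = "https://www.example.com/") -> bool:
    """Return True if the given robots.txt text disallows user_agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical file that shuts every crawler out of the whole site.
site_wide_block = "User-agent: *\nDisallow: /"

# Hypothetical file that only blocks one private directory.
partial_block = "User-agent: *\nDisallow: /private/"

print(blocks_crawler(site_wide_block))
print(blocks_crawler(partial_block, url="https://www.example.com/deals"))
```

Running a check like this against the proposed file, for a handful of URLs that currently get search traffic, is a cheap way to catch a site-wide `Disallow: /` before it goes live.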
Based on a true story.