Similarity Analysis: Google’s Main Anti-Spam Method

Have you ever thought that at the very heart of Google’s algorithm, just as in the algorithms of other search engines, there could be one principle underlying the whole anti-spam structure? It took me a few years to clearly understand how simple yet efficient an anti-spam system based on one principle could be. What I mean here is Similarity analysis.

If you, as an SEO expert, are trying to come up with an SEO strategy that is not based on diversity, you stand no chance. If your prospective outsourced SEO specialist claims he knows how to promote a website using signature links alone, you stand no chance. Let me share my thoughts on the Similarity analysis algorithm, which is exactly what I would call it.

Similarity Analysis Approach

Before putting together the details of the Similarity analysis approach, I Googled to see whether any substantial information was available on Similarity analysis as employed by search engines, and by Google in particular.

Similarity Analysis as Basic SEO Algorithm of Google

The Similarity analysis approach is designed to filter out websites that are promoted artificially or unnaturally, websites with weak online pre-launch business research, and websites promoted with grey-hat and black-hat SEO methods, and presumably to devalue them in the top rankings. The approach involves analysing the following levels of an online business:

  1. Unique Business Idea
  2. Unique Brand
  3. Unique Domain
  4. Unique Content
  5. Unique Pattern of Incoming Links
  6. Unique Anchor Text of Inbound Links
  7. Unique Traffic Channels
  8. Unique Visitor IDs and Behavior

The message here is simple: the more diversified the factors in this list are, the easier it is going to be to promote the website in Google.

Idea Level:

Unique Business Idea

Just like in off-line business, being innovative and one step ahead of your competitors means winning and profiting. Starting an online business around a unique idea is a winning situation by definition. And you do not have to be another Elon Musk: Ippei has already uncovered some of the 55 brightest business ideas for 2020 – 2022.

Unique Brand / Trade Mark

As simple as that: Google the name before you go and order all kinds of logos and register your fancy domain name. The trademark might already be registered, or it might have been running as an unregistered brand for years. One day it may cost you all your website development efforts. Google does not allow using third-party trademarks in the text of AdWords ads. You know that, don’t you?

As somebody who has a small hobby of monitoring expired domain names, I assure you that every week brings an end to one website or another with a history spanning decades. Such brands can have numerous mentions and links, and they are always a tasty morsel for hunters of dead trademarks and unregistered brands. Once the brands / websites are re-registered, they can be abused. Google will notice sooner or later, so no matter how careful you are, the option is way too risky.

Unique Domain

Just as when choosing your brand, you need to think twice before you go and register the domain for your business. Here is an example:

As I am writing this article, the domain healthdish.com is on sale; it could be a catchy and memorable domain for a food-related business. However, domains like thehealthdish.com, myhealthdish.com or healthdish.co.uk could have been up and running for years. The link popularity and online age of those domains could turn them into competitors for your branded search queries.

So why not do further domain research and find one with no overlaps?
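To make that research less ad hoc, here is a minimal Python sketch that lists lookalike domains for a candidate name so each one can be checked by hand. The prefix and TLD lists and the function name `near_variants` are my own illustrative assumptions, not an exhaustive or official set, and the script only generates names; it does not check whether they are registered.

```python
from itertools import product

# Hypothetical prefixes and TLDs to probe; extend both lists for a real search.
PREFIXES = ["", "the", "my", "get"]
TLDS = ["com", "co.uk", "net", "org"]


def near_variants(name: str, tld: str = "com") -> list[str]:
    """Return lookalike domains for `name`, excluding the candidate itself."""
    candidate = f"{name}.{tld}"
    variants = [f"{prefix}{name}.{t}" for prefix, t in product(PREFIXES, TLDS)]
    return [domain for domain in variants if domain != candidate]


if __name__ == "__main__":
    # Print every variant so it can be looked up manually (whois, Google, etc.).
    for domain in near_variants("healthdish"):
        print(domain)
```

Any variant that turns out to be an established website is a sign that the branded queries will be contested.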

Methods Level:

Similarity in the Type of CMSes Linking to You

Got most of your links from Blogger pages? That looks nice in Ahrefs but is worth virtually one link to Google. A good link profile involves diversified links from WordPress, Drupal, Joomla, Magento and websites built on other CMSes.

Reason: once a vulnerability is revealed in a CMS, there is always a way to get dozens of links through it. It just takes a good programmer. And some dishonesty.
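A quick way to see how lopsided a link profile is by platform is to count the share of backlinks per CMS. The sketch below assumes you have already labelled each linking domain with its CMS (by hand or with a detection tool of your choice); the function name and the sample data are hypothetical.

```python
from collections import Counter


def cms_shares(backlinks: list[tuple[str, str]]) -> dict[str, float]:
    """Map each CMS label to its share of the backlink profile.

    `backlinks` holds (linking_domain, cms_label) pairs.
    """
    counts = Counter(cms for _domain, cms in backlinks)
    total = sum(counts.values())
    return {cms: n / total for cms, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical sample: three of four links come from one platform.
    sample = [
        ("a.example", "blogger"),
        ("b.example", "blogger"),
        ("c.example", "wordpress"),
        ("d.example", "blogger"),
    ]
    for cms, share in sorted(cms_shares(sample).items(), key=lambda kv: -kv[1]):
        print(f"{cms}: {share:.0%}")
```

If one platform dominates the output, that is the part of the profile to diversify first.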

Similarity in Whois Data

Google analyses when your website was registered and who owns it, as well as the name servers and IP address the website is running on. If you get links from websites registered by the same person, by a person with the same e-mail, or running on the same name servers or C-class IP range, those links will be strongly devalued.

Reason: to fight PBNs (Private Blog Networks). Promoting a website through well-researched and well-developed PBNs may look like a white-hat method. Such websites often contain useful information and are time-consuming to build: chasing the domain, putting up graphics and content. But they are not worth much if you trigger the Similarity analysis approach (for example, when the target website and the PBN website belong to someone using a similar e-mail).
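You can run this kind of check on your own link sources before a search engine does. The following is a minimal sketch, assuming you have already collected the registrant e-mail, name servers and IP address for each site; the `SiteRecord` fields and function names are my own, and this is an illustration of the idea, not Google's actual method.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteRecord:
    """Hypothetical record of the ownership and hosting data for one site."""
    domain: str
    registrant_email: str
    name_servers: frozenset[str]
    ip: str


def c_class(ip: str) -> ipaddress.IPv4Network:
    """Return the /24 ("C-class") network that contains an IPv4 address."""
    return ipaddress.IPv4Network(f"{ip}/24", strict=False)


def shared_footprints(a: SiteRecord, b: SiteRecord) -> list[str]:
    """List the ownership/hosting attributes two sites have in common."""
    shared = []
    if a.registrant_email.lower() == b.registrant_email.lower():
        shared.append("registrant e-mail")
    if a.name_servers & b.name_servers:
        shared.append("name servers")
    if c_class(a.ip) == c_class(b.ip):
        shared.append("C-class IP range")
    return shared


if __name__ == "__main__":
    # Sample data uses reserved documentation domains and IP ranges.
    target = SiteRecord("shop.example", "owner@mail.example",
                        frozenset({"ns1.host.example"}), "203.0.113.10")
    linker = SiteRecord("blog.example", "Owner@mail.example",
                        frozenset({"ns1.host.example"}), "203.0.113.77")
    print(shared_footprints(target, linker))
```

Any pair of sites that shares one or more of these attributes is a candidate for the devaluation described above.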

Similarity in Templated Backlinks Profile

Google can easily detect two-way and, at times, three-way link exchanges, but it can go even further and identify backlinks bought in bulk to increase a website’s DA / DR and (supposedly) its rankings. It doesn’t take a rocket scientist to understand that if a number of websites get the same set of backlinks within a definite period of time (and possibly in the same order), these links are artificial and should be devalued. Some of these templates are well known and come up frequently when analysing backlinks manually (which I encourage you to do as often as possible). Some of these links could have been placed by your competitors to lower your rankings and are worth disavowing.
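One simple way to spot such templates during a manual review is to compare the sets of referring domains that different websites gained in the same period. The sketch below uses Jaccard similarity (shared links divided by all links) as the overlap measure; the measure, the 0.8 threshold and the function names are my assumptions for illustration. It also assumes each profile has already been limited to one time window and ignores the order in which links appeared.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of referring domains two profiles have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def templated_pairs(
    profiles: dict[str, set[str]], threshold: float = 0.8
) -> list[tuple[str, str, float]]:
    """Return pairs of sites whose new-backlink sets overlap suspiciously.

    `profiles` maps a site to the referring domains it gained in one period.
    """
    flagged = []
    for site_a, site_b in combinations(sorted(profiles), 2):
        score = jaccard(profiles[site_a], profiles[site_b])
        if score >= threshold:
            flagged.append((site_a, site_b, score))
    return flagged


if __name__ == "__main__":
    # Hypothetical sample: two sites bought the same package of links.
    sample = {
        "site-a.example": {"dir1.example", "dir2.example", "forum.example"},
        "site-b.example": {"dir1.example", "dir2.example", "forum.example"},
        "site-c.example": {"magazine.example"},
    }
    print(templated_pairs(sample))
```

If your own website shows up in a flagged pair with sites you have nothing to do with, the shared links are good candidates for a closer look and possibly a disavow.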

Similarity in Graphic and Textual Content

Take it for granted that Google will appreciate all types of content as long as it is unique: not copied, not bought from Shutterstock, not downloaded and edited. Google can easily tell unique textual content, as well as unique images and even videos, from copies. The bottom line is that Google wants to see that you have done your work preparing unique content, and it is ready to reward you for that.

Similarity in the Outbound Links Footprint

We have all heard about Google being strict about duplicate content. But you may underestimate the extent to which Google looks at duplicate or similar things. Today’s algorithms can compare content across different languages, and group websites that have a similar set of outbound links. If the owner of a network of websites has automated link management (adding / editing / deleting), then having the same link variation on a number of websites (date added, surrounding content, the article where the link is used and, of course, its anchor text) would not look natural at all. So Google has grounds to devalue such links and to suspect the beneficiary website of employing bad practices.
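To illustrate how little it takes to group sites by their outbound links, here is a minimal sketch that hashes each site's set of (target URL, anchor text) pairs and groups sites with an identical fingerprint. It is a deliberately crude exact-match version with made-up function names; a real system would presumably also tolerate near-matches and look at dates and surrounding content.

```python
import hashlib
from collections import defaultdict


def footprint(links: list[tuple[str, str]]) -> str:
    """Hash a site's outbound links as a set of (url, anchor text) pairs."""
    # Crude normalisation: drop trailing slashes, lowercase the anchor text.
    normalized = sorted(
        {(url.rstrip("/"), anchor.strip().lower()) for url, anchor in links}
    )
    payload = "\n".join(f"{url}\t{anchor}" for url, anchor in normalized)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def group_by_footprint(sites: dict[str, list[tuple[str, str]]]) -> list[list[str]]:
    """Return groups of two or more sites sharing the same outbound footprint."""
    groups: dict[str, list[str]] = defaultdict(list)
    for site in sorted(sites):
        groups[footprint(sites[site])].append(site)
    return [members for members in groups.values() if len(members) > 1]


if __name__ == "__main__":
    # Hypothetical network: two blogs carry the same links in a different order.
    sample = {
        "blog-a.example": [("https://shop.example/", "Buy shoes"),
                           ("https://news.example", "news")],
        "blog-b.example": [("https://news.example/", "News"),
                           ("https://shop.example", "buy shoes")],
        "blog-c.example": [("https://other.example", "other")],
    }
    print(group_by_footprint(sample))
```

Sites that land in the same group differ only in the order and formatting of their links, which is exactly what automated link management tends to produce.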

User Behavior Level:

Being able to draw extra traffic through non-typical channels is an art. And this art will, at the very least, require creativity and a unique approach. If you succeed where your competitors failed, you are sure to get your piece of the cake. And since some latest-generation shops run their businesses only on Instagram or Amazon, these are definitely opportunities to pursue.

Now, suppose you run an online business similar to hundreds of other businesses on the Internet. You have done great on-page optimization and built it around a certain semantic core, just like most of your competitors. You have acquired local listings and got a few reference links from thematic communities; your website was even mentioned on high-profile publication platforms, but all of that closely resembles your competitors’ link profiles. What is left that can make you special? Right: CTR, the behavior of your visitors on the website, bounce rate, number of pages visited, conversion rate. At the moment Google Analytics is employed by roughly 55% of the top-ranking websites, according to W3Techs. Besides, Google keeps 100% of the CTR data for all searches made through its search engine. Some webmasters are still skeptical about whether Google really uses visitor behavior as a ranking signal. I take it for granted; for those of you who have doubts, please proceed to this article by Masha Maksimava.

What is the similarity catch at this level? SEO experts who take the behavior factor seriously have created numerous tools that send fake traffic from SERPs to a website, thus technically increasing its CTR and, presumably, its rankings. Analysing traffic diversity and possible crawl templates allows search engines to filter out artificial manipulation at this level and improve the quality of search results.
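Traffic diversity can be put into a number. One possible measure, which I use here purely as an illustration and not as anything Google has confirmed, is the Shannon entropy of some visit attribute such as the user agent or the sequence of pages visited. Real visitors tend to produce a wide mix of values; a traffic bot replaying one template produces very few.

```python
import math
from collections import Counter


def entropy_bits(values: list[str]) -> float:
    """Shannon entropy (in bits) of a list of categorical visit attributes."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(1/p) over categories; every term is non-negative.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical click paths: a bot repeats one path, people do not.
    bot_visits = ["home>pricing>exit"] * 4
    human_visits = ["home>blog", "pricing>signup", "home>exit", "blog>about"]
    print("bot:", entropy_bits(bot_visits))
    print("human:", entropy_bits(human_visits))
```

The lower the figure for a batch of visits coming from search results, the more the traffic looks like one template played over and over.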

Conclusion:

From the practical side of this research, these are the things both marketers and clients should remember:

  1. If your website operates in a very competitive environment, your success lies in using atypical / unique promo methods.
  2. A marketing agency working independently will not achieve results as good as an agency working side by side with the business owner or company representatives on content and strategy.
  3. Lead generation venues change all the time. You need to adapt, learn and use new opportunities to succeed.
  4. Do not automate link popularity building. Google has all the algorithms in place to detect this and respond accordingly.
  5. The more manual work, creativity and research you put into your SEO campaign, the more value it gains. At the end of the day this is exactly what Google wants from you. And isn’t it fair?