Safe door is propped open with Google's search secrets escaping depicted by floating bubbles with icons related to website coding and search engine marketing.
June 19, 2024

The Biggest Takeaways From the Google Search Algorithm Leak

Within the digital marketing industry, there’s a hunger for accurate information about how Google Search works. SEO pros are always looking for verifiable insights into the nitty-gritty details in hopes of improving their SEO strategy.

Is Google’s Search Algorithm a Secret?

Sure, Google publishes plenty of guidance to help marketers create high-quality websites and publish content capable of ranking well, but it typically sidesteps the particulars, speaking instead in broader themes and concepts.

So when something like Google’s recent Content Warehouse API leak occurs, it’s going to get everyone’s attention. This newly published documentation comes from inside Google’s Search division, and Google has confirmed the authenticity of the huge algorithm leak. 

As if the unintentional release of 2,500 pages of API documentation wasn’t bad enough, the information contained within refutes a number of the claims Google representatives have made in the past.

If you’re a digital marketing manager, regardless of industry, learning about the information contained in this leak is well worth your time. This level of visibility into the mechanics behind Search is unprecedented. It will help you better understand how to adapt your online presence to boost KPIs and drive revenue growth.

Let’s start by looking at what exactly the leak covers. Then, we’ll discuss what the newly uncovered data does—and doesn’t—tell us.

The Google Search Algorithm Leak and Its Significance

Large book with the Google "G" logo on the left hand side with lines denoting text. Behind the book is a yellow splash of light, highlighting the importance.

The leak itself is a behemoth: over 2,500 pages of API documentation from inside Google’s internal Content API Warehouse, containing 14,014 attributes. 

Evidently, the leak came from GitHub. It seems someone at Google left the front door open on this one: the documentation was accidentally published in late March and stayed public until it was removed on May 7.

The information in this leak was compiled for use by the folks at Google Search. It was made public without the context necessary for outsiders to make complete sense of it.

Before diving into the specific attributes detailed in this leak, it’s important to cover what this leak is and what it isn’t. 

Take a moment to consider the “white whale” for digital marketers and SEOs: a leak that contains not just ranking factors but also weighting as it pertains to individual scoring functions. 

In layman’s terms, we want the comprehensive list of factors Google looks at and how significant each of those factors is for ranking.

That is not what’s covered in this latest leak. 

What we have falls far short of the dramatic reveal many SEOs are waiting for. 

While that may sound deflating, this new information is certainly intriguing, and the leak is still significant. It does contain numerous details about stored data for things like user interactions, content, linking, and other elements. Many of these are even ranking factors. 

However, there are open questions about which of these features are still in use and which have been retired, were only used internally, or were never deployed.

Still, we can surmise the leaked documentation was current as of at least August 2023 and perhaps even up to March 2024, when it was first erroneously published.

In short, this leak isn’t the look beneath the hood many had hoped for. But it does offer more visibility into Google Search than we’ve seen in at least 25 years. 

That alone is a big deal.

Unless your background is fairly technical, the documentation itself likely isn’t of much direct use to your digital marketing efforts. However, the broader takeaways are vital for everyone to know.

What Is Google’s Search Algorithm Based On?

A large book with the Google "G" logo is examined with a magnifying glass. In the magnifying glass we see a bulleted list of features, including Site Authority, Link Value, Font Size, Page Titles, Document Length, and Site Focus Score.

Building a winning website requires considerable thought about the site itself—not just products, funnels, and conversions. 

In fact, it’s worth thinking of your website as your brand’s flagship product. After all, it competes against other websites in your industry, just like your products or services. That’s why properly conceptualizing the role of your website is key to building an effective digital marketing strategy.

Here’s a look at some of the most prominent ranking features covered in the Google Search algorithm leak and their impact on your digital marketing efforts.

Domain and Site Authority Play a Role in Ranking

Google spokespeople are on record saying Google doesn’t create, store, or use domain authority scores. But thanks to the leak, we now know that “siteAuthority” exists, which vindicates many SEOs who have believed domain authority plays a role in rankings. 

siteAuthority is one of the Compressed Quality signals stored for each crawled document. 

How does Google use siteAuthority? What impact does this score have on ranking? We don’t know for sure, but one might assume that the better your site’s authority, the better you stand to rank. 

It’s also never a bad thing to be an authority in your industry, and even before this leak, striving to increase your site authority should have been a long-term, ongoing goal.

This leak confirms that reputation matters—which has long been a core tenet of effective SEO.

Our Advice:

  1. Strive to be the expert in your industry through informative content that provides in-depth information with supporting subtopics to answer even more niche questions. 
  2. Use verifiable facts and statistics to support your claims, ensuring you’re a trusted resource for searchers. 
  3. Prioritize authority in your link building strategies, focusing on industry-relevant sites with high domain authority. 

Backlinks and Link Building Are More Important Than Ever

Backlinks are critical to establishing site authority. They also help Google understand a page’s relevance for a given Search query.

sourceType, a metric revealed in the API leak, confirms Google analyzes the quality of linking pages and assigns a backlink value based on this evaluation. 

That’s to say that within your backlink portfolio, links from high-ranking sites like The New York Times would do much more to improve your ranking performance than links from lesser-known sites. 

However, don’t take this to mean every successful link building strategy consists of only high-authority sites. Instead, a holistic, natural mix of high- and low-authority backlinks often drives rankings best because it avoids looking manipulated or “spammy.” Still, quality generally beats quantity.

You don’t want all “nofollow” or all “dofollow” links, either. An effective backlink profile should include a variety of links with an omnichannel focus: social bookmarking, image sharing, blog posting, and more.

Again, SEO pros have known this for quite some time, but seeing it confirmed in this way is valuable for agencies and clients alike. 

With the recent broad-scale introduction of generative AI tools, the idea of using fast, cheap AI-produced content for link building may seem appealing. Plenty of people have already experimented with this approach, and overall, they’ve been disappointed. 

Remember, spam is spam, regardless of how it was produced, and high-quality content is the only way to generate high-quality results. 

Don’t neglect internal linking, either. By strategically linking your pages to one another through your navigation menu, on-page anchor text, and buttons, you minimize click depth and help Google better understand your site’s hierarchy. (Click depth is the distance, measured in number of clicks, between a given page and your site’s home page.)

By making your site more accessible to search engine crawlers and users alike, you can enhance its visibility and create a superior UX. 
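Click depth, as defined above, can be measured with a breadth-first search over your internal links. The following is a minimal sketch rather than a description of how Google computes anything: the `click_depths` function and the `site_links` structure are hypothetical, and in practice the link map would come from a crawl of your own site.

```python
from collections import deque


def click_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # the first visit in a BFS is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site structure: each key links to the pages in its list.
site_links = {
    "/": ["/shop", "/blog"],
    "/shop": ["/shop/dog-food"],
    "/blog": ["/blog/puppy-feeding-guide"],
    "/blog/puppy-feeding-guide": ["/shop/dog-food"],
}

depths = click_depths(site_links, "/")
print(depths["/shop/dog-food"])  # 2
```

Pages that never show up in the result are unreachable through internal links from the home page, which is usually worth fixing before worrying about depth.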

Our Advice:

  1. Emphasize domain authority in your link building efforts and only publish content on sites truly relevant to your industry. 
  2. Don’t forget about lower authority sites, however, as a more holistic mix of low- and high-authority backlinks appears more natural to Google crawlers. 
  3. Implement internal linking best practices to minimize click depth and improve your site’s overall navigation for users and crawlers. 

High-Quality Content That Showcases E-E-A-T Delivers Results

In this new age of AI-generated content, high-quality content is key to helping your brand stand out on SERPs.

But what exactly does “high-quality” mean? It depends on your industry and your niche, which is why a thorough competitive analysis should underpin any digital content strategy. 

It’s also worthwhile to familiarize yourself with Google’s guidance on creating helpful, reliable, people-first content. 

Crafting content that delivers value to the intended audience is the key to generating sustainable results. This enhances UX and gives users a genuine reason to seek out your website.

Boosting organic ranking positions is a considerable benefit of both on-page and off-page SEO efforts, but publishing top-notch digital content is about far more than where your site appears on SERPs.

Create well-researched, expert-written content that meets world-class editorial standards to allow your brand to become a recognized, trusted voice in your industry. That means users will look to you for guidance long before they consider making a purchase—and when they reach that stage, your products or services will be top of mind.

Google quality raters score content using E-E-A-T: Experience, Expertise, Authoritativeness, and Trustworthiness. The API leak revealed that E-E-A-T is also a significant ranking factor for content, especially YMYL content.

In total, Google accounts for author credentials, overall content quality, and site reputation. So when it comes to enhancing your site with content, great writing isn’t the only thing you need. Author bios are also crucial since Google treats authors as entities and recognizes them. The authoritativeness of each author influences the performance of the pages on which their content appears.

How Does Google Treat AI-Generated Content?

If it’s of sufficient quality, any content is a plus—especially on product and collections pages. The more helpful, accurate, and comprehensive, the better. However, if you use generative AI tools to create content, it’s vital someone closely reviews and edits the content before publication. 

But ultimately, AI is only as good as its training data. No matter how robust the model or how expansive the data set is, AI is still no match for a human expert who’s actually been there, done that, and can write about it in the most helpful, digestible way for your target audience.

This is also true of content used for link building or digital PR. We’ve seen AI make substantial progress since it burst onto the scene, but human content experts maintain a decisive advantage. 

That doesn’t mean you can’t benefit from using AI to some extent. But it’s something to consider carefully as you develop a long-term growth strategy for your organization.

Our Advice:

  1. Invest in high-quality, people-first content written by experts that meets strict editorial standards. Using AI can work in some cases, but it can never replace the knowledge and expertise of a human. Remember: high-quality content leads to high-quality results.
  2. Consider if author profiles and bios are relevant to your content, and if so, include author credentials with every new piece of content. 
  3. Write with the reader in mind. Answer their questions thoughtfully, provide real-world advice, and include relevant data and links to support your content. 

Google Measures Font Size

Nearly 20 years ago, it was common practice to bold, underline, or increase the font size of key terms or phrases to help emphasize their importance to search engines and users. 

Apparently, it’s still a worthwhile practice. 

The leaked documentation shows Google tracks avgTermWeight, which measures the average weighted font size of a term in the body of a document. Font size is also tracked for link anchor text.

Obviously, there are sensible ways to apply this. Heading tag styles are a strong example of using font size to demonstrate importance. You can also bold or italicize key passages so they stand out to users and search engines. 
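To make the idea of an average font size per term concrete, here is a small sketch. It is an assumption-laden illustration, not the leaked formula: the documentation does not say how avgTermWeight is weighted, so this uses a plain per-occurrence average, and `average_term_font_size` and the `spans` data are hypothetical.

```python
def average_term_font_size(spans, term):
    """Average font size in px across occurrences of `term`; None if it never appears.

    `spans` is a list of (text, font_size_px) pairs, one per styled run of text.
    """
    term = term.lower()
    sizes = [
        size
        for text, size in spans
        for word in text.lower().split()
        if word == term
    ]
    return sum(sizes) / len(sizes) if sizes else None


# Hypothetical page: one heading at 32px and two body paragraphs at 16px.
spans = [
    ("Dog Food Guide", 32),
    ("Choosing dog food by breed", 16),
    ("Puppies need more protein", 16),
]

print(average_term_font_size(spans, "dog"))      # 24.0
print(average_term_font_size(spans, "protein"))  # 16.0
```

In this toy page, "dog" scores higher than "protein" only because it also appears in the heading, which is the visual hierarchy effect described above.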

Our Advice:

  1. Consider which terms, keywords, and passages are most important to a page’s content. Then, adjust the font size and style to emphasize these points. 
  2. Stay within the Web Content Accessibility Guidelines to ensure your content is as readable as possible.
  3. Use font size and formatting to create a visual hierarchy on each page. This allows the reader to quickly understand the page and find the information they need. 

Page Titles Are Key to Establishing Relevance for Queries

The attribute name “titlematchScore” should tell you everything you need to know about this one. For more, here’s the description:

“Titlematch score of the site, a signal that tells how well titles are matching user queries.”

Ignore page titles at your own peril. Best practice is to create a unique title for each page on your site and to include your targeted keyword early within each title in a way that’s natural and easily understood by users.
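A rough way to sanity-check your own titles is to measure how many terms from a target query appear in each one. This sketch is only a stand-in for the concept: how Google actually computes titlematchScore is not described in the leak, and `title_overlap` is a hypothetical helper.

```python
import re


def tokens(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def title_overlap(title, query):
    """Fraction of distinct query terms that appear in the title (0.0 to 1.0)."""
    query_terms = set(tokens(query))
    if not query_terms:
        return 0.0
    title_terms = set(tokens(title))
    return len(query_terms & title_terms) / len(query_terms)


query = "best dog food for puppies"
print(title_overlap("Best Dog Food for Puppies: A Feeding Guide", query))  # 1.0
print(title_overlap("Our Products", query))                                # 0.0
```

A low score for a page's primary query suggests the title does not yet describe the page in the words searchers use.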

Our Advice:

  1. Write page titles with a purpose, including relevant keywords while using proper grammar, sentence structure, and clarity. 
  2. Make sure the page title accurately describes the page’s content so it can earn a high titlematchScore.
  3. Consider structuring page titles as a question or answer to optimize for featured snippets and AI Overviews. 

Short-Form Content Can Still Rank Well

If you’ve used SEO plugins like Yoast, All in One SEO Pack, or Rank Math in your WordPress backend, you’ve likely seen targeted word counts or an element of the overall score based on the length of your content alone. 

There’s no magic number when it comes to content creation, but the API leak contains a few interesting length-related items. 

For example, OriginalContentScore is designed to score short content for originality. Many believe content that’s “too short” will be dubbed “thin” by Google, which is a death knell for ranking in this era of the Helpful Content System. But this score confirms that “short” does not equal “thin.” 

Separately, the numTokens attribute is accompanied by a description that explains, “…we drop some tokens in mustang [the primary scoring, ranking, and serving system] and also truncate docs at a max cap.” 

This is another good reason to always put your most important information at the front of any content you create. Establish relevance early, and recall the inverted pyramid from your introduction to journalism class as a helpful way to structure content.
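The effect of a token cap is easy to demonstrate. In the sketch below, the cap of 20 tokens and the whitespace tokenizer are assumptions chosen for illustration; the leaked description does not state the real cap or how tokens are counted.

```python
def visible_tokens(text, max_tokens):
    """Keep only the first `max_tokens` whitespace-separated tokens."""
    return text.split()[:max_tokens]


def survives_truncation(text, phrase, max_tokens):
    """Report whether `phrase` still appears after the text is truncated."""
    kept = " ".join(visible_tokens(text, max_tokens)).lower()
    return phrase.lower() in kept


key_point = "Grain-free dog food suits dogs with allergies."
padding = "filler " * 50  # stands in for a long, low-value introduction

front_loaded = key_point + " " + padding
buried = padding + key_point

print(survives_truncation(front_loaded, "grain-free dog food", 20))  # True
print(survives_truncation(buried, "grain-free dog food", 20))        # False
```

The same sentence is kept or lost purely because of where it sits in the document, which is the practical argument for leading with your key point.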

Our Advice:

  1. Don’t write content with a certain word count in mind. Sometimes, shorter is better, as long as you cover the topic thoroughly and address the user’s intent. 
  2. Put the most important information near the top of the content. This saves users time and helps establish your authority early on. 
  3. Emphasize different content lengths in your content strategy. Create short-form content to provide quick answers and long-form content to provide in-depth information that builds your authority and expertise. 

Content Freshness Matters

How much thought do you put into the publishing date of your content? Do you publish a backlog of posts quarterly or even backdate posts to make your blog appear more active in the past?

Google’s goal is to serve searchers the freshest content available, which means publication dates matter more than some previously thought. 

To achieve this, Google analyzes a page or document’s date across a few attributes: 

  • bylineDate: An explicitly stated date featured in a document’s byline.
  • syntacticDate: The date mentioned in the URL of a document or in the document’s title.
  • semanticDate: An estimated date based on the document’s contents, anchors, or related documents.

What can we take away from these features? Whenever possible, add a date to all new content. Be consistent with this date across page titles, structured data, XML sitemaps, and URLs. Any inconsistency might put your content at risk and limit its performance.
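A consistency check along these lines is simple to automate. The sketch below compares a page's byline date with the date in its URL and in its structured data; the `page` dictionary, its field names, and the `/YYYY/MM/DD/` URL pattern are all assumptions made for the example, not anything taken from the leak.

```python
import re
from datetime import date


def date_from_url(url):
    """Extract a /YYYY/MM/DD date from a URL path, or None if absent or invalid."""
    match = re.search(r"/(\d{4})/(\d{2})/(\d{2})(?:/|$)", url)
    if not match:
        return None
    year, month, day = map(int, match.groups())
    try:
        return date(year, month, day)
    except ValueError:  # e.g. /2024/13/45/ is not a real date
        return None


def inconsistent_dates(page):
    """Return the names of date sources that disagree with the byline date."""
    candidates = {
        "url": date_from_url(page["url"]),
        "structured_data": page.get("structured_data_date"),
    }
    byline = page["byline_date"]
    return sorted(
        name
        for name, value in candidates.items()
        if value is not None and value != byline
    )


# Hypothetical page record, e.g. assembled from a crawl of your own site.
page = {
    "url": "https://example.com/blog/2024/06/19/leak-takeaways/",
    "byline_date": date(2024, 6, 19),
    "structured_data_date": date(2024, 6, 1),
}

print(inconsistent_dates(page))  # ['structured_data']
```

Sources that carry no date at all are skipped rather than flagged, since a missing date is a different problem from a conflicting one.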

Rather than backdating content or publishing your catalog all at once, it’s better to create and publish high-quality helpful content often and consistently to help establish yourself as a trusted authority in your industry. 

Our Advice:

  1. Be consistent with your publishing cadence. Don’t do one big content dump at once. Instead, aim to post daily, weekly, or whatever cadence works for your team. 
  2. Add dates to all new pieces of content, ensuring the dates are consistent across page titles, structured data, XML sitemaps, and URLs. 
  3. Frequently publishing new content means Google will crawl your site more often, allowing for more opportunities to establish your industry relevance and authority. 

Google Tracks Your Site’s Topical Focus

The Google Search algorithm leak shows that Google compares page embeddings to site embeddings to determine how relevant a page is to the site’s overall topic. In other words, if a site dedicated to dog food has a page covering the best food to give a dog based on its age and breed, it would score well. 

siteFocusScore is the “number denoting how much a site is focused on one topic,” while siteRadius is the “measure of how far page_embeddings deviate from the site_embedding.”
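One plausible reading of these two descriptions can be sketched with toy vectors. Everything here is an assumption for illustration: the leak does not give formulas, so this treats the site embedding as the mean of the page embeddings and the radius as the average cosine distance from that mean. The two-dimensional vectors are made up; real embeddings would come from a language model.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of equal-length vectors."""
    return [sum(values) / len(vectors) for values in zip(*vectors)]


def cosine_distance(a, b):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def site_radius(page_embeddings):
    """Average cosine distance of each page embedding from the site centroid."""
    site_embedding = mean_vector(page_embeddings)
    distances = [cosine_distance(page, site_embedding) for page in page_embeddings]
    return sum(distances) / len(distances)


# Toy embeddings: a tightly focused site versus one covering unrelated topics.
focused_site = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0]]
scattered_site = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]

print(site_radius(focused_site) < site_radius(scattered_site))  # True
```

Under this reading, a smaller radius means pages cluster around one topic, which matches the intuition behind a higher focus score.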

Trying to be everything to everyone is going to hinder your site’s ability to rank highly. It’s far better to foster industry expertise among your internal teams and leverage that to build a site that’s the go-to, one-stop-shop for all things [your vertical here].

Our Advice:

  1. Identify and solidify your website’s overall theme. Instead of trying to cover dozens of different topics, create content that supports your site’s primary focus. 
  2. Look at your analytics data to identify the top-performing content and consider how it relates to your site’s focus. This will present new opportunities to support this focus in future content. 
  3. In addition to target keywords, use synonyms and related terms to help Google better understand a page’s topic and how it relates to your site’s overall focus. 

Key Takeaways for Digital Marketers

Two marketing professionals sit at a shared desk side by side facing individual computer screens. Displayed on the screens is a book with the Google "G" logo to represent the recent algorithm documents.

Search intent and user experience

If you focus on nothing else as you create and optimize pages, these two key concepts will see you through. 

All of these leaked ranking features relate directly to search intent and user experience in one way or another. If you craft content and design your site with the end user in mind, you’re on the right path. An experienced technical SEO agency can help fill in the rest of these details and ensure your site is optimized to the fullest.

While this leaked documentation does uncover a number of discrepancies between what Google representatives have said publicly and how Search may truly operate, overall, it reinforces what we’ve known about Search for years. 

“The More Things Change, the More They Stay the Same”

The math is quite simple. Search needs to attract and retain users in order to grow. Sure, the fact that “Google” is a verb gives Search an incredible advantage over competitors, but at the end of the day, people don’t use products they aren’t satisfied with. 

So Google needs to deliver the most valuable Search experience for every query, which means delivering the most useful, highest-quality results with minimal user effort.

If your page is relevant to that query and would satisfy the user’s search intent, you have a chance of ranking highly. Yes, the steps you take to establish relevance and how Google interprets search intent are crucial, but those are tactics, and for the moment, we’re talking strategy. 

It doesn’t take much imagination to understand that on a SERP filled with relevant results, how those results are ranked depends on the user experience each page offers. 

Pages that are well organized, load quickly, and feature helpful content written by industry experts tend to float to the top. 

Meanwhile, pages that are weighed down by unnecessary plugins, employ improper heading tag hierarchy, and feature regurgitated, thin content produced by generative AI with no human oversight will be pushed further down in Search results.

How to Rank Highly for Your Most Valuable Keywords

In short, Google wants to reward websites designed to give people what they want. Your aim should be to create a website that contributes to a thriving online ecosystem. 

Done right, your website will provide revenue growth and increased ROI for years to come. Yes, frequent Google Search and algorithm updates mean the landscape is constantly shifting, but that really only poses a problem for those reactionaries in the industry who struggle to look ahead. 

There was a period when Search could be manipulated via black-hat tactics in a manner that was profitable and even somewhat enduring. However, the wild west days of search marketing are over and have been for years.

Do people still try to manipulate Search and get their low-quality pages to rank higher than they should? All the time. But it’s like trying to pay your mortgage by taking your kid’s college fund to a casino—it never works. 

Meanwhile, working with a trusted digital marketing agency to build an online presence through proven, white-hat SEO practices is akin to sitting down with a financial advisor and creating an investment strategy that allows you to retire early.

Google’s aim is clear. When you build and optimize your site with people in mind—by working to satisfy their search intent and provide a best-of-web user experience—you can expect to be rewarded with meaningful visibility that will pay dividends as a pillar of your brand’s growth.

What the Leak Says About VELOX Media’s Approach to SEO

A computer screen displays a search engine results page with the VELOX logo in the bottom right corner. Surrounding the computer screen are bubbles in different colors all with question marks.

This is by no means an exhaustive list of the ranking features included in the API leak. But this sampling is sufficient to demonstrate that the leak doesn’t necessarily represent a sea change for Search marketers. 

Instead, the leak is further confirmation that VELOX’s techniques are thoroughly aligned with Google’s approach to evaluating and ranking websites. This explains why our clients continue to see outstanding results. 

Google wants to reward sites that take a user-oriented approach to design and optimization, which is exactly what our SEO team does every day. 

We dig into our clients’ sites to understand what works well and identify opportunities for improvement. Our on-page work includes titles, meta descriptions, heading tag hierarchy, accessibility aspects such as image alt text, content, page speed, and much more. All of these combine to help optimize each client’s on-page SEO, which the leak confirms is still incredibly important.

Meanwhile, our world-class content team carefully crafts SEO content incorporating strategically selected keywords and landing page links to build a robust backlink profile through thoughtful syndication across the web.

Every month we publish new content on behalf of our clients, ensuring they cover seasonal topics while remaining relevant within their industry. 

By regularly publishing fresh content, we help cement our clients as industry leaders and trusted resources while providing Google with new content to crawl and index. 

Our approach drives the traffic, conversions, and revenue that enterprise SEO clients need to scale. The Google Search algorithm leak is further proof that we know what it takes to deliver sustained search dominance—and our track record shows it. 

Because of our expertise and continued innovation, we’re recognized as a Google Premier Partner. We rank among the top 3% of agencies globally because of our commitment to client success and our history of exceeding expectations.

Take your digital presence to the next level. Contact VELOX today for a free marketing plan. 

Let’s Grow Your Brand Together

Want to learn more about how we can help your brand grow and increase revenue?

Contact VELOX today