When it comes to crawling and indexing websites, Google is as picky as a primary school student who doesn’t want to eat tasteless vegetables.
With hundreds of thousands of websites launching every day, the search engine giant has outlined specific rules for indexing web pages. It deliberately sets the bar high to keep users safe from spam sites, scammers and suspicious web activity.
Google Indexing is Every Site Owner’s Responsibility
As a website owner, one of your goals should be to maintain a healthy relationship with Google. And such a solid relationship will only be possible if you have a clear understanding of your website’s anatomy.
Some websites are very direct: every page can be reached from the homepage. Others are more indirect and tend to hide pages here and there. Regardless of how many pages you’ve attached to your domain, remember that indexing takes time.
If you leave everything to Google, there’s no knowing when the search engine might pick up your site. There are 3 to 5-year-old sites that never got Google’s attention, probably because the owners never set up Search Console and the other features that prompt Google to validate a website and crawl it faster.
Err on the Side of Caution
If your website is an online store, it goes without saying that Google needs to index many pages at the same time. Even if you know Googlebot renders JavaScript and will likely index your site just fine, do not be complacent. Do not leave everything to chance.
Google’s crawlers and algorithm rules are unpredictable.
Sure, everything will look fine in the beginning. Google will appear to index your website and corresponding pages on a piecemeal basis. You’ll glimpse the first signs of organic traffic.
Sooner or later, indexing will slow down. Even a sloth could do better. You would think Google’s crawlers would speed up later on, but lo and behold... they won’t. So yes, as a website owner, you have to do something.
You need Google to index your pages fast? Here are three simple (totally ethical) tricks to consider:
1. Implement SSR
A good starting point if you want Google to recognise your web pages faster is to implement SSR (server-side rendering). SSR is a traditional rendering method. In his article, Benjamin Burkholder explains what SSR is as opposed to client-side rendering (CSR):
...all of your page’s resources are housed on the server. Then, when the page is requested, the HTML is delivered to the browser and rendered, JS and CSS downloaded, and final render appears to the user/bot.
A Google user also attested that client-side rendered websites take longer to get indexed: Google’s crawlers need to go through them twice before adding them to their database.
For CSR websites, Google first looks through the initial HTML it receives and follows the links it finds there. It then sends the page to the renderer, and only after the JavaScript has been executed does it index the final HTML in a second pass. This is both costly and time-consuming for Google, so web designers might as well place all crucial links in the initial HTML.
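To make the idea concrete, here is a minimal sketch of server-side rendering with Node and Express (the product data and routes are hypothetical, not taken from any particular site): every crucial link ships in the very first HTML response, so the crawler never needs a second, JavaScript-rendered pass to discover it.

```typescript
// Minimal server-side rendering sketch with Express (hypothetical product catalogue).
// All crucial links are present in the HTML the server returns, so a bot sees
// everything on the first pass without executing any JavaScript.
import express from "express";

const app = express();

// Hypothetical catalogue; in a real store this would come from a database.
const products = [
  { slug: "red-shoes", name: "Red Shoes" },
  { slug: "blue-hat", name: "Blue Hat" },
];

app.get("/", (_req, res) => {
  // Build the final HTML on the server: title, content and links are all
  // in the initial response; nothing is left for client-side rendering.
  const links = products
    .map((p) => `<li><a href="/products/${p.slug}">${p.name}</a></li>`)
    .join("");
  res.send(`<!doctype html>
<html>
  <head><title>Example Store</title></head>
  <body>
    <h1>Example Store</h1>
    <ul>${links}</ul>
  </body>
</html>`);
});

app.listen(3000);
```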
Once you switch to SSR, prepare for a pleasant surprise. In one reported case, the number of indexed pages climbed steadily from around 20k to more than 100k.
If you still think this isn’t fast enough, then trick number 2 might enlighten you.
2. Get a Dynamic Sitemap
You have an online store or a super progressive blog. Ergo, your website contains a lot of pages, which may confuse Google if you don’t guide it. Google’s piecemeal crawling habits won’t work for you. That's why you need to create a new sitemap twice daily and make it a point to upload it to your Content Delivery Network (CDN).
Did you know that a single sitemap file can hold no more than 50,000 URLs? With this in mind, you have to compartmentalise: split your URL list across multiple sitemaps tied together by a sitemap index, and focus on pages with important or relevant content (or information you want to share with target readers).
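As an illustration, here is a rough sketch of a twice-daily sitemap build in TypeScript (the domain, file names and URL list are hypothetical): it splits the URLs into chunks of 50,000 and ties them together with a sitemap index, which is the file you then submit and push to your CDN.

```typescript
// Sketch of a sitemap build: split URLs into chunks of 50,000 (the per-file
// limit) and write a sitemap index that points at every chunk.
import { writeFileSync } from "node:fs";

const MAX_URLS_PER_SITEMAP = 50_000;
const BASE_URL = "https://www.example.com"; // hypothetical domain

function buildSitemaps(urls: string[]): void {
  const chunks: string[][] = [];
  for (let i = 0; i < urls.length; i += MAX_URLS_PER_SITEMAP) {
    chunks.push(urls.slice(i, i + MAX_URLS_PER_SITEMAP));
  }

  // One <urlset> file per chunk of up to 50,000 URLs.
  chunks.forEach((chunk, index) => {
    const body = chunk.map((url) => `  <url><loc>${url}</loc></url>`).join("\n");
    writeFileSync(
      `sitemap-${index + 1}.xml`,
      `<?xml version="1.0" encoding="UTF-8"?>\n` +
        `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n${body}\n</urlset>\n`
    );
  });

  // The sitemap index referencing every chunk; regenerate and re-upload this
  // (for example to your CDN) on each twice-daily rebuild.
  const entries = chunks
    .map((_chunk, index) => `  <sitemap><loc>${BASE_URL}/sitemap-${index + 1}.xml</loc></sitemap>`)
    .join("\n");
  writeFileSync(
    "sitemap-index.xml",
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
      `<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n${entries}\n</sitemapindex>\n`
  );
}

// Example run with the pages you actually want indexed.
buildSitemaps(["https://www.example.com/", "https://www.example.com/products/red-shoes"]);
```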
Following your sitemap submissions, Google should index your website faster than ever.
Do you still want to go faster?
The next step is the killing blow.
3. Remove JavaScript for Bots
Yes, you’ve made progress with the way Google crawls your website. But it’s not the speed you’re expecting. So how did other websites do it? They have more pages than you but surprisingly, they manage to rank better.
This brings us back to our first point. Google’s crawlers have their limits. That’s why you shouldn’t leave everything to chance and wait for Google to do the work for you.
Apparently, Google only allocates a limited crawl budget to each website it crawls. And despite your efforts, your website could be something that Google considers a “liability” rather than an asset.
Google saw the relevant links in the initial HTML but still had to send the page to the renderer to make sure nothing was missed. As long as there is JavaScript in the initial HTML, search engine crawlers have no way of knowing that everything is already there.
So no big moves. Simply remove JavaScript for bots, as in the sketch below.
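Here is one way this could look, assuming a Node/Express server and a simple user-agent check (the bot pattern, route and renderProductPage helper are illustrative only, and the regex-based stripping is deliberately crude): every script tag is removed for crawlers except the inline Schema JSON-LD.

```typescript
// Sketch of stripping <script> tags for bots. The user-agent pattern and the
// regex are simplified illustrations: the idea is to keep the inline schema
// JSON-LD and drop every other script when the requester looks like a crawler.
import express from "express";

const BOT_PATTERN = /googlebot|bingbot|duckduckbot/i;

// Keep JSON-LD (type="application/ld+json"); remove every other <script> block.
function stripScriptsForBots(html: string): string {
  return html.replace(
    /<script\b([^>]*)>[\s\S]*?<\/script>/gi,
    (match, attrs: string) =>
      /type\s*=\s*["']application\/ld\+json["']/i.test(attrs) ? match : ""
  );
}

const app = express();

app.get("/products/:slug", (req, res) => {
  const fullHtml = renderProductPage(req.params.slug); // hypothetical SSR helper
  const isBot = BOT_PATTERN.test(req.get("user-agent") ?? "");
  res.send(isBot ? stripScriptsForBots(fullHtml) : fullHtml);
});

// Hypothetical placeholder page so the sketch is self-contained.
function renderProductPage(slug: string): string {
  return `<!doctype html>
<html>
  <body>
    <h1>Product: ${slug}</h1>
    <script type="application/ld+json">{"@type":"Product","name":"${slug}"}</script>
    <script src="/bundle.js"></script>
  </body>
</html>`;
}

app.listen(3000);
```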
This is a real-life story: Googlebot will index your pages like crazy, going from no more than 10 pages per minute to as many as 10 pages per second!
One user said that since implementing these changes, he has been incredibly happy with the results. Once Google was able to index every page of his site, its overall performance skyrocketed as well.
Key Takeaway
So you want Google to index a website with hundreds of thousands of pages quickly but ethically?
First, cut to the chase and serve the final HTML. Next, split your URLs across multiple sitemaps and submit them twice a day. Finally, remove all the JavaScript for bots other than the inline Schema-JS (your JSON-LD structured data).
This is basically how you maintain a healthy relationship between your website and Google.