Notes from the Google Webmaster Hangout on the 13th of July 2018.
Shorter URLs Aren’t Given Preferential Treatment in Search
Google doesn’t have anything in its algorithms that prefers shorter URLs, despite what some ranking studies suggest.
Use Log Files to Identify Crawl Budget Wastage & Issues With URL Structure
When auditing eCommerce sites, John recommends first looking at which URLs Googlebot is actually crawling, then identifying crawl budget wastage and, where needed, changing the site’s URL structure to stop Googlebot crawling unwanted URLs such as those generated by parameters and filters.
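As a rough illustration of that first step, here is a minimal log-parsing sketch (assuming access logs in the common combined format and a hypothetical file name) that counts which paths and query parameters attract the most Googlebot requests:

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Hypothetical log location and a simple combined-log-format pattern (assumptions).
LOG_FILE = "access.log"
LINE_RE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (?P<url>\S+) [^"]*" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

paths, params = Counter(), Counter()

with open(LOG_FILE, encoding="utf-8", errors="ignore") as f:
    for line in f:
        m = LINE_RE.match(line)
        if not m or "Googlebot" not in m.group("ua"):
            continue  # only count requests identifying as Googlebot
        url = urlsplit(m.group("url"))
        paths[url.path] += 1
        for name in parse_qs(url.query):
            params[name] += 1  # which parameters attract the most crawling

print("Most crawled paths:", paths.most_common(10))
print("Most crawled parameters:", params.most_common(10))
```

Paths and parameters that dominate the counts but have no search value are the candidates for tidying up in the URL structure or blocking from crawling.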
Google Reconsiders URLs Removed From the Disavow File
Google uses the most recent version of a site’s disavow file and will reconsider links to URLs that have since been removed from the file.
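For reference, a disavow file is just a plain text list of domains and URLs; the point above means that if entries like the hypothetical ones below are deleted and the file is re-uploaded, Google works from the newer version and those links can be taken into account again:

```
# disavow.txt – illustrative entries only
domain:spammy-directory.example
https://blog.spam-network.example/paid-links-page/
```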
HTTP/HTTPS & www/non-www Versions to be Switched to Mobile-first Index Separately
Mobile-first indexing is done at a per-site level, meaning the HTTP/HTTPS and www/non-www versions of a site are treated separately and will each receive their own notification when they are switched over.
Google Has no Explicit Character Limit For Snippets
Google has no explicit limit on the number of characters included in a snippet. Snippet length varies from result to result, sometimes drawing on structured data rather than the meta description, and this can also vary over time.
Google Doesn’t Have a Separate Way of Prioritising Pages For Rendering
Google doesn’t have a separate way of prioritising the order of pages to be rendered in the second wave of indexing that differs from the way sites are prioritised for regular crawling and indexing.
Don’t Rely on Robots Directives in Robots.txt Being Respected By Google
Don’t rely on noindex directives in robots.txt, as they aren’t officially supported by Google. John says it’s fine to use robots directives in robots.txt, but make sure you have a backup in case they don’t work.
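For illustration, the unofficial robots.txt directive John is referring to looks like the hypothetical sketch below; the backup would typically be an officially supported signal on the pages themselves, such as a meta robots noindex tag or an X-Robots-Tag HTTP header:

```
User-agent: Googlebot
# Unofficial directive – Google may ignore it, so don't rely on it alone
Noindex: /internal-search/
```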
Use the URL Removal Tool & Sitemaps to Inform Google About Removed Pages
The URL Removal tool can be used to remove entire subdirectories from Google’s index, usually within a day. When removing groups of URLs that don’t fall under one subdirectory, you can serve 404s for them and tell Google they’ve changed recently via a sitemap file.
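A minimal sketch of the second approach, with hypothetical URLs that now return 404: list them in a sitemap with a recent lastmod date so that Googlebot recrawls them and picks up the 404s sooner.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- These URLs now return 404; the recent lastmod encourages recrawling -->
  <url>
    <loc>https://www.example.com/old-category/discontinued-product-1</loc>
    <lastmod>2018-07-13</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/old-blog/removed-post</loc>
    <lastmod>2018-07-13</lastmod>
  </url>
</urlset>
```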
There are Several Options For Different Country Versions of Product Pages
There are several options for dealing with product pages that have different country versions. Separate landing pages with hreflang mean that the value of the page is diluted across the versions. Using IP-based redirects so that all countries are served from the same page will result in Googlebot only crawling the US version. Serving country-dependent elements in JavaScript and blocking them from being crawled is a further option.
Google Requires Multilingual Sites to Have Some Form of URL Differentiation
For multilingual sites using hreflang, anything that differentiates the URLs can work for Google, e.g. subdomains, subdirectories or parameters.
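For example, subdirectory, subdomain and parameter versions of the same page can all carry hreflang annotations; a hypothetical sketch:

```html
<!-- Hypothetical hreflang annotations; each language version needs its own distinct URL -->
<link rel="alternate" hreflang="en" href="https://www.example.com/en/widgets/" />
<link rel="alternate" hreflang="de" href="https://de.example.com/widgets/" />
<link rel="alternate" hreflang="fr" href="https://www.example.com/widgets/?lang=fr" />
<link rel="alternate" hreflang="x-default" href="https://www.example.com/widgets/" />
```

Each version would carry the full set of annotations, including a reference to itself.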
Google Requires Distinct Section on Site For Geotargeting to be Understood
Google requires that websites using geotargeting have a distinct section of the site for each targeted territory, such as a separate subdirectory or subdomain.
503 Errors Reduce Crawl Rate and Crawling Stops if Robots.txt 503s
Search Console reporting a site as temporarily unreachable means the site is returning 503 errors. Googlebot will temporarily slow crawling if a site returns 503 errors and will stop crawling altogether if the robots.txt file itself returns a 503.
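A small monitoring sketch along these lines, assuming the Python requests library and a hypothetical hostname, to catch the case where robots.txt itself starts returning 503s:

```python
import requests

# Hypothetical site to monitor (assumption)
ROBOTS_URL = "https://www.example.com/robots.txt"

resp = requests.get(ROBOTS_URL, timeout=10, allow_redirects=True)

if resp.status_code == 503:
    # A 503 robots.txt means Googlebot stops crawling the site for the time being
    print("robots.txt is returning 503 – crawling of the whole site will pause")
elif resp.status_code >= 500:
    print(f"robots.txt returned {resp.status_code} – investigate server errors")
else:
    print(f"robots.txt returned {resp.status_code}")
```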
Rich Snippets Will Better Match Page Content on Mobile After Mobile-first indexing
John suspects the most visible impact of mobile-first indexing on the search results will be that rich snippets will better match the content seen on the page for mobile devices.
Google Crawls Using Local IP Addresses For Countries Where They Are Frequently Blocked
Google will crawl with local IP addresses, particularly for countries where US IP addresses are frequently blocked, e.g. South Korea.
Block Videos From Search By Adding Video URL & Thumbnail to Robots.txt or Setting Expiration Date in Sitemap
You can signal to Google that a video should not be included in search by blocking the video file and thumbnail image in robots.txt, or by specifying an expiration date in a video sitemap file.
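A sketch of the sitemap route, using Google’s video sitemap extension with hypothetical URLs; once the expiration date has passed, the video should no longer be shown in search. The robots.txt alternative is simply to Disallow the video file and its thumbnail image.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://www.example.com/videos/product-demo</loc>
    <video:video>
      <video:thumbnail_loc>https://www.example.com/thumbs/product-demo.jpg</video:thumbnail_loc>
      <video:title>Product demo</video:title>
      <video:description>Short walkthrough of the product.</video:description>
      <video:content_loc>https://www.example.com/media/product-demo.mp4</video:content_loc>
      <!-- After this date the video should no longer be shown in search -->
      <video:expiration_date>2018-12-31T23:59:59+00:00</video:expiration_date>
    </video:video>
  </url>
</urlset>
```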