Notes from the Google Webmaster Hangout on the 13th of December 2019.
Subdomains Should Include Separate Robots.txt
Robots.txt applies per hostname and protocol. If a page embeds content from a different subdomain or domain, Google respects the robots.txt of the main domain for the primary content, and the robots.txt of the subdomain or domain the embedded content is served from for that content.
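As a rough sketch of that behaviour, a crawler would resolve the governing robots.txt from each URL's own protocol and hostname; the hostnames below are hypothetical examples, and the stdlib parser is only a stand-in for Google's actual implementation.

```python
from urllib import robotparser
from urllib.parse import urlsplit

def robots_txt_url(url: str) -> str:
    # Robots.txt applies per protocol + hostname, so derive it from those parts.
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"

def is_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    # Each URL is checked against the robots.txt of its own host,
    # not the robots.txt of the page that embeds it.
    parser = robotparser.RobotFileParser(robots_txt_url(url))
    parser.read()
    return parser.can_fetch(user_agent, url)

# Primary content is governed by www.example.com/robots.txt, while embedded
# content from cdn.example.com is governed by that subdomain's own file.
print(is_allowed("https://www.example.com/article"))
print(is_allowed("https://cdn.example.com/assets/image.png"))
```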
Google Doesn’t Crawl Any URLs From a Hostname When Its Robots.txt Temporarily 503s
If Google encounters a 503 when crawling a robots.txt file, it will temporarily not crawl any URLs on that hostname.
Google Treats Permanently 503’ing Robots.txt as an Error & Eventually Crawls the Site Normally
If a robots.txt returns a 503 for an extended period of time, Google will treat this as a permanent error and crawl the site normally to see what can be discovered.
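A minimal sketch covering both behaviours, assuming a 30-day cutoff between "temporary" and "permanent" (the hangout doesn't state the exact threshold Google uses):

```python
import time
import urllib.error
import urllib.request
from typing import Optional

# Assumed cutoff after which a persistent 503 is treated as a permanent
# error; the real threshold Google applies isn't stated in the hangout.
PERMANENT_ERROR_AFTER = 30 * 24 * 3600  # 30 days, in seconds

def crawl_policy(robots_url: str, first_failure_ts: Optional[float]) -> str:
    """Return a coarse crawl policy for a hostname based on its robots.txt status."""
    try:
        urllib.request.urlopen(robots_url, timeout=10)
        return "crawl per robots.txt rules"
    except urllib.error.HTTPError as err:
        if err.code != 503:
            raise
        if first_failure_ts and time.time() - first_failure_ts > PERMANENT_ERROR_AFTER:
            # Long-running 503: treated as a permanent error, crawl normally.
            return "crawl normally"
        # Temporary 503: fetch nothing from this hostname for now.
        return "crawl nothing on this hostname"
```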
URL Inspection Tool Silently Processes Redirects to Display Target Page
The URL Inspection Tool generally displays the content that Google will index rather than the content of the URL exactly as entered. If the URL redirects, the redirect is silently processed and the target page is shown instead.
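The same redirect handling can be reproduced with any HTTP client; a sketch using the requests library and a hypothetical redirecting URL:

```python
import requests

# Follow the redirect chain the way the URL Inspection Tool does silently,
# then report the final target page rather than the URL that was entered.
entered_url = "https://example.com/old-page"  # hypothetical redirecting URL
response = requests.get(entered_url, allow_redirects=True, timeout=10)

for hop in response.history:
    print(f"{hop.status_code} redirect at {hop.url}")
print(f"Page that would be shown: {response.url}")
```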
URL Removal Tool Doesn’t Influence Google’s Choice of Canonical or Visible URL
The URL Removal Tool doesn’t impact Google’s choice of canonical or the visible URL; it simply hides the page in search.
Use Crawlers to Detect Internal Links to Redirecting URLs After Migration
Use crawlers like DeepCrawl to detect internal links that point to redirecting URLs after a migration.
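DeepCrawl does this at site scale; as a single-page sketch, the check amounts to extracting internal links and flagging any that answer with a 3xx status (the example.com URL and function name are hypothetical):

```python
import requests
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

class LinkExtractor(HTMLParser):
    """Collect href values from anchor tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def internal_links_to_redirects(page_url: str):
    """Report internal links on page_url that respond with a 3xx redirect."""
    host = urlsplit(page_url).netloc
    extractor = LinkExtractor()
    extractor.feed(requests.get(page_url, timeout=10).text)
    findings = []
    for href in extractor.links:
        link = urljoin(page_url, href)
        if urlsplit(link).netloc != host:
            continue  # only internal links matter for migration clean-up
        resp = requests.head(link, allow_redirects=False, timeout=10)
        if 300 <= resp.status_code < 400:
            findings.append((link, resp.status_code, resp.headers.get("Location", "")))
    return findings

for link, status, target in internal_links_to_redirects("https://example.com/"):
    print(f"{status}: {link} -> {target}")
```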
Google Treats Escaped & Unescaped Versions of URLs & Links as Equivalent
Google treats escaped versions of URLs and links exactly the same as their unescaped equivalents.
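A small sketch of that equivalence using Python's stdlib, with a hypothetical URL pair: decoding the percent-escapes makes the two forms compare equal, which mirrors how Google collapses them.

```python
from urllib.parse import unquote, urlsplit

def normalize(url: str) -> str:
    # Decode percent-escapes in the path so escaped and unescaped
    # versions of the same URL compare as equal.
    parts = urlsplit(url)
    return parts._replace(path=unquote(parts.path)).geturl()

# "/f%C3%BCr" is the percent-escaped form of "/für"; Google treats both as one URL.
assert normalize("https://example.com/f%C3%BCr") == normalize("https://example.com/für")
```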
Include Structured Markup on Both AMP & Normal Page to Show in SERPs
Structured markup, e.g. Article markup, needs to be included on both the AMP version and the normal version of a page for it to be displayed in search.
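A sketch of what that looks like in practice, using hypothetical Article data: the same JSON-LD block is generated once and embedded in both the canonical template and the AMP template.

```python
import json

# Hypothetical Article structured data; the values are placeholders.
article_markup = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example Headline",
    "datePublished": "2019-12-13",
    "author": {"@type": "Person", "name": "Example Author"},
}

# Embed this tag in the <head> of both the normal page and its AMP
# version so the markup is present whichever version Google serves.
script_tag = (
    '<script type="application/ld+json">'
    + json.dumps(article_markup)
    + "</script>"
)
print(script_tag)
```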
Google’s Search Engineers Can Debug Individual Queries to Understand Rankings
Google’s search engineers are able to debug individual queries and understand why pages rank the way they do.