Notes from the Google Webmaster Hangout on the 31st of January 2020.
It is Not Recommended to Link Internally to Cached Versions of URLs
Generating a website structure by pointing internal links at cached versions of URLs is generally bad practice, as those URLs can change over time and many caches are blocked by robots.txt. Instead, John recommends keeping internal links within your own site, particularly for crawling and indexing purposes.
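As a quick check, the sketch below fetches a page and flags any links pointing at Google's cache (webcache.googleusercontent.com) rather than at the site's own URLs. This is a minimal sketch, assuming requests and beautifulsoup4 are installed; the page URL is a placeholder.

```python
# Sketch: flag internal links that point at Google's cache instead of the site itself.
# Assumes requests and beautifulsoup4 are installed; example.com is a placeholder.
import requests
from bs4 import BeautifulSoup

PAGE = "https://www.example.com/"  # placeholder page to audit

html = requests.get(PAGE, timeout=10).text
soup = BeautifulSoup(html, "html.parser")

for a in soup.find_all("a", href=True):
    if "webcache.googleusercontent.com" in a["href"]:
        # These links should point at your own canonical URLs instead.
        print("Cache link found:", a["href"])
```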
Google Will Take Into Consideration The Number of Valid DMCA Complaints a Site Has For Rankings
The number of valid DMCA complaints against a site is taken into consideration when ranking it, as confirmed by Google in a blog post from 2012. If you receive complaints incorrectly and flag them as wrong, they should be removed and will not cause an issue. However, if your site continues to accumulate valid complaints, Google's systems may pick up on this.
Use GSC to Identify If There Are Any Errors With a Site’s URL Structure After a Migration
After completing a site migration, John recommends using GSC to compare the queries and positions the site was ranking for before and after the change. This will show whether there are any errors with Google's understanding of the new URL structure, and whether the migration has impacted traffic to the site and where this has occurred.
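One way to run this comparison programmatically is via the Search Console API's Search Analytics endpoint. The sketch below is a hedged example rather than an official workflow: the site URL, date ranges, service-account key path and the 5-position threshold are all placeholder assumptions.

```python
# Sketch: compare average positions per query before and after a migration
# using the Search Console API. Site URL, dates and key path are placeholders.
from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",  # placeholder: a key with Search Console access
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
service = build("searchconsole", "v1", credentials=creds)

def positions(site, start, end):
    body = {"startDate": start, "endDate": end, "dimensions": ["query"], "rowLimit": 1000}
    rows = service.searchanalytics().query(siteUrl=site, body=body).execute().get("rows", [])
    return {r["keys"][0]: r["position"] for r in rows}

before = positions("https://www.example.com/", "2019-12-01", "2019-12-31")
after = positions("https://www.example.com/", "2020-01-01", "2020-01-31")

# Queries whose average position dropped notably after the migration.
for query in before.keys() & after.keys():
    if after[query] - before[query] > 5:  # 5 positions is an arbitrary threshold
        print(f"{query}: {before[query]:.1f} -> {after[query]:.1f}")
```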
Review Canonical Signals if Google Are Continually Picking a Different Canonical to the Ones Set
Google may occasionally pick a canonical that is different from the one that has been set for certain pages, but this doesn't change anything from a ranking point of view. However, if you're seeing this on a large scale, John recommends reviewing whether you are sending confusing signals to Google.
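One of those signals is the rel=canonical tag itself, so a simple starting point is to spot-check that pages declare the canonical you expect. A minimal sketch, assuming requests and beautifulsoup4; the URL map is a placeholder, and redirects, sitemap entries and internal links also act as canonical signals that this check doesn't cover.

```python
# Sketch: spot-check whether pages declare the canonical URL you expect.
# Assumes requests and beautifulsoup4; the URL map is a placeholder.
import requests
from bs4 import BeautifulSoup

EXPECTED = {
    "https://www.example.com/page-a/": "https://www.example.com/page-a/",
    "https://www.example.com/page-b/": "https://www.example.com/page-b/",
}

for url, expected in EXPECTED.items():
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    tag = soup.find("link", rel="canonical")
    declared = tag["href"] if tag else None  # note: may be a relative URL
    if declared != expected:
        # A mismatch here is one possible source of mixed canonical signals.
        print(f"{url}: declares {declared}, expected {expected}")
```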
It is Normal for Google to Occasionally Crawl Old URLs
Due to their rendering processes, Google will occasionally re-crawl old URLs in order to check how they are set up. You may see this within your log files, but it is normal and will not cause any problems.
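If you want to see this in your own log files, something like the sketch below will list Googlebot requests from a combined-format access log. The log path and format are assumptions about your server setup; in practice you would also verify Googlebot via reverse DNS rather than trusting the user agent string alone.

```python
# Sketch: list Googlebot requests in a combined-format access log, so you can
# see old URLs being re-crawled. Log path and format are assumptions.
import re

pattern = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[^"]*" (\d{3})')

with open("access.log") as log:
    for line in log:
        if "Googlebot" in line:
            match = pattern.search(line)
            if match:
                path, status = match.groups()
                # 404s on old URLs here are often just Google re-checking them.
                print(status, path)
```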
Google May Display Two Pages From the Same Site For The Same Query
If Google are unsure of the intent of a query, they may display two different pages from the same site in the search results. This is not considered problematic, as they would expect it to fluctuate over time before settling down. However, to prevent this, John recommends ensuring the pages are clearly targeted at the different facets that people would be searching for.
Fresher Content is Not Always Considered Better than Older Evergreen Content
When ranking pages, Google will use various signals to understand when a page was published, but it is not necessarily the case that fresher content will be considered better than something older. This will differ depending on the query and search intent.
If a Robots.txt File Returns a Server Error for a Brief Period of Time Google Will Not Crawl Anything From the Site
If a robots.txt file returns a server error, even for a brief period of time, Google will not crawl anything from the website until they are able to access the file and crawl normally again. While they are unable to reach the file, they assume all URLs are blocked and will flag this in GSC. You can review the robots.txt requests in your server logs to identify when this occurred, by checking the response code and size returned for each request.
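The sketch below shows one way to run that check: it pulls robots.txt requests out of a combined-format access log and prints the timestamp, status code and response size for each, flagging 5xx responses. The log path and format are assumptions about your server setup.

```python
# Sketch: find the window where robots.txt returned a server error by printing
# the status code and response size of each robots.txt request in the log.
# Log path and combined log format are assumptions.
import re

pattern = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|HEAD) /robots\.txt [^"]*" (\d{3}) (\S+)'
)

with open("access.log") as log:
    for line in log:
        match = pattern.match(line)
        if match:
            ip, timestamp, status, size = match.groups()
            flag = " <- server error, crawling paused" if status.startswith("5") else ""
            print(timestamp, status, size, flag)
```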