Sitemaps
A sitemap is a list of the live URLs on a site. It tells search engine crawlers which pages are most important, and therefore which ones should be crawled and indexed.
There are several things to consider when creating sitemaps, including how search engines view them. We cover a range of these topics in our SEO Office Hours Notes below, along with best practice recommendations and Google’s advice on sitemaps.
For more on sitemaps and SEO, check out our article: How to Improve Website Crawlability with Sitemaps.
Internally Link Pages Together to Increase Discoverability & Reduce Reliance on XML Sitemap
Internally linking pages together helps Googlebot to discover the pages on your site more easily, and reduces the reliance on using XML sitemaps for URL discovery.
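One way to act on this note is to look for pages that appear in the sitemap but receive no internal links. The sketch below assumes you already have a crawl export of internal links. The function name `find_orphan_urls` and the example URLs are hypothetical, not part of any tool.

```python
def find_orphan_urls(sitemap_urls, internal_links):
    """Return sitemap URLs that no other page links to.

    internal_links maps a source URL to the set of URLs it links to.
    """
    linked = set()
    for source, targets in internal_links.items():
        # Ignore self-links: a page linking to itself does not aid discovery.
        linked.update(t for t in targets if t != source)
    return sorted(set(sitemap_urls) - linked)


# Hypothetical crawl data for illustration only.
sitemap_urls = [
    "https://example.com/",
    "https://example.com/guides/",
    "https://example.com/guides/sitemaps",
]
internal_links = {
    "https://example.com/": {"https://example.com/guides/"},
    "https://example.com/guides/": {"https://example.com/"},
}

orphans = find_orphan_urls(sitemap_urls, internal_links)
print(orphans)
```

Any URL this returns can currently only be discovered through the sitemap, so it is a candidate for new internal links.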
XML Sitemaps Should Include URLs on Same Path Unless Submitted Via Verified Property in GSC
XML sitemaps should contain URLs on the same path as the sitemap file itself. However, URLs in sitemaps submitted via Google Search Console (GSC) can be for any verified property within your GSC account.
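The same-path rule can be checked before a sitemap is published. The sketch below is a simplified reading of the location rule in the sitemaps.org protocol: a URL is in scope if it shares the sitemap's scheme and host and sits at or below the sitemap's directory. The helper name `is_in_sitemap_scope` is hypothetical, and the check does not model the GSC exception described above.

```python
import posixpath
from urllib.parse import urlsplit


def is_in_sitemap_scope(sitemap_url, page_url):
    """Return True if page_url sits at or below the sitemap's directory."""
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)
    # Scheme and host must match exactly.
    if (sitemap.scheme, sitemap.netloc) != (page.scheme, page.netloc):
        return False
    # Directory that contains the sitemap file, with a trailing slash.
    base_dir = posixpath.dirname(sitemap.path)
    if not base_dir.endswith("/"):
        base_dir += "/"
    # Treat a bare host such as https://example.com as the root path.
    return (page.path or "/").startswith(base_dir)


print(is_in_sitemap_scope(
    "https://example.com/catalog/sitemap.xml",
    "https://example.com/catalog/item-1",
))
print(is_in_sitemap_scope(
    "https://example.com/catalog/sitemap.xml",
    "https://example.com/blog/post",
))
```

Placing the sitemap at the root of the site puts every URL on that host and scheme in scope, which is why root-level sitemaps are the common choice.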
Missing Sitemap Data in GSC API is a Known Error
When sitemap reporting switched over to the new GSC UI in early April 2019, an issue occurred in the API that stopped data from updating. The team is looking into this, and John expects they will document the error soon, with advice for those affected.
Google May Index Redirected URLs if Served in Sitemap Files
Redirects and sitemaps are both signals that Google uses to select preferred URLs. If you redirect to a destination URL but the source URL is in a sitemap, you are giving Google conflicting signals about which URL you want to be shown in search.
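To avoid sending these conflicting signals, sitemap URLs can be compared against a list of known redirects. The sketch below assumes a redirect map exported from your server configuration or a crawl. The function name and the URLs are hypothetical.

```python
def redirected_sitemap_urls(sitemap_urls, redirects):
    """Map each sitemap URL that redirects to its final destination.

    redirects maps a source URL to the URL it redirects to.
    """
    conflicts = {}
    for url in sitemap_urls:
        if url not in redirects:
            continue
        # Follow the redirect chain, stopping if it loops back on itself.
        seen = {url}
        target = redirects[url]
        while target in redirects and target not in seen:
            seen.add(target)
            target = redirects[target]
        conflicts[url] = target
    return conflicts


# Hypothetical data for illustration only.
sitemap_urls = ["https://example.com/old", "https://example.com/new"]
redirects = {
    "https://example.com/old": "https://example.com/interim",
    "https://example.com/interim": "https://example.com/new",
}

conflicts = redirected_sitemap_urls(sitemap_urls, redirects)
print(conflicts)
```

Each key in the result is a URL to remove from the sitemap, and each value is the destination that should be listed instead.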
If One Sitemap URL Has an Error This Shouldn’t Impact the Rest of the XML Sitemap
If one individual URL element within an XML sitemap has an error, this will not impact the way Google is able to parse and read the sitemap as a whole. However, if the element is broken in a way that impacts the parsing of the rest of the sitemap, then the XML file becomes unreadable and will not be usable as a sitemap.
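The difference between a single bad entry and a file that cannot be parsed can be illustrated with Python's standard XML parser. This is a sketch of the distinction only, not a model of how Google's parser behaves. The function name `read_sitemap` and both sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_sitemap(xml_text):
    """Return (valid_urls, skipped_count), or None if the XML does not parse."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Malformed XML: nothing in the file can be read.
        return None
    valid, skipped = [], 0
    for url_el in root.findall("sm:url", NS):
        loc = url_el.findtext("sm:loc", default="", namespaces=NS).strip()
        parts = urlsplit(loc)
        if parts.scheme in ("http", "https") and parts.netloc:
            valid.append(loc)
        else:
            # One bad entry is skipped; the rest of the file is still used.
            skipped += 1
    return valid, skipped


ONE_BAD_ENTRY = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>not a url</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

# The second <loc> is never closed, so the whole document is malformed.
BROKEN_FILE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</url>
</urlset>"""

print(read_sitemap(ONE_BAD_ENTRY))
print(read_sitemap(BROKEN_FILE))
```

The first document yields two usable URLs and one skipped entry, while the second yields nothing because an unclosed tag makes the whole file unreadable.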
Use Accurate Last Modified Dates For Individual Pages in Sitemaps For Faster Recrawling
Make sure each individual page in an XML sitemap has its own last modified date so Google can trust that the information is accurate and recrawl updated pages where necessary.
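As a minimal sketch of this advice, the generator below writes a separate `lastmod` value for every URL instead of stamping the whole file with the generation date. It assumes each page's real modification date is available, for example from a CMS. The function name, URLs and dates are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # Each page carries its own date, written as YYYY-MM-DD.
        ET.SubElement(entry, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages and dates for illustration only.
sitemap_xml = build_sitemap([
    ("https://example.com/", date(2019, 4, 2)),
    ("https://example.com/about", date(2018, 11, 15)),
])
print(sitemap_xml)
```

If every entry shared one identical date, the value would carry no information about which pages actually changed.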
Use Structured Data & Video Sitemaps to Give Google More Context on Videos
You should use structured data to tell Google whether a video was streamed or recorded, and you can also use video sitemaps to tell Google which countries a particular video is available in, for example.
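The country availability part of this note can be sketched as a video sitemap entry. The tag names below follow Google's video sitemap extension as we understand it, in particular `video:restriction` with a `relationship` attribute, and should be checked against the current documentation before use. The function name and all URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SM = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO = "http://www.google.com/schemas/sitemap-video/1.1"
ET.register_namespace("", SM)
ET.register_namespace("video", VIDEO)


def video_entry(page_url, title, description, thumbnail_url, content_url,
                allowed_countries):
    """Build one <url> element describing a video and where it may be shown."""
    url = ET.Element(f"{{{SM}}}url")
    ET.SubElement(url, f"{{{SM}}}loc").text = page_url
    video = ET.SubElement(url, f"{{{VIDEO}}}video")
    ET.SubElement(video, f"{{{VIDEO}}}thumbnail_loc").text = thumbnail_url
    ET.SubElement(video, f"{{{VIDEO}}}title").text = title
    ET.SubElement(video, f"{{{VIDEO}}}description").text = description
    ET.SubElement(video, f"{{{VIDEO}}}content_loc").text = content_url
    # Space-separated country codes in which the video is available.
    restriction = ET.SubElement(
        video, f"{{{VIDEO}}}restriction", relationship="allow"
    )
    restriction.text = " ".join(allowed_countries)
    return url


# Hypothetical video for illustration only.
entry = video_entry(
    "https://example.com/videos/launch",
    "Product launch",
    "Recording of the product launch event.",
    "https://example.com/thumbs/launch.jpg",
    "https://example.com/media/launch.mp4",
    ["GB", "IE"],
)
entry_xml = ET.tostring(entry, encoding="unicode")
print(entry_xml)
```

Whether the video was streamed or recorded is not covered here, since the note places that information in structured data on the page itself.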
A Sitemap File Won’t Replace Normal Crawling
A sitemap will help Google crawl a website but it won’t replace normal crawling, such as URL discovery from internal linking. Sitemaps are more useful for letting Google know about changes to the pages within them.
Google Doesn’t Mind How Sitemaps Are Split up
Google combines separate sitemaps together so that they can be processed. This means it is up to webmasters to decide how they want to split up sitemaps.
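Because the split is up to you, a simple approach is to chunk URLs by count and list the resulting files in a sitemap index. The sketch below uses the 50,000 URL per file limit from the sitemaps.org protocol. The function names, file naming scheme and URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit in the sitemaps.org protocol


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build a sitemap index that points at each individual sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")


# Hypothetical site with 120,000 pages.
all_urls = [f"https://example.com/p/{i}" for i in range(120_000)]
chunks = chunk_urls(all_urls)
index_xml = build_sitemap_index(
    [f"https://example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
)
print([len(chunk) for chunk in chunks])
```

Splitting by count is only one option. Grouping by site section or page type works equally well and can make per-sitemap reporting easier to read.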
Google Treats XML Sitemaps Differently From HTML Pages
Google treats XML sitemaps differently from HTML pages because they are machine-readable files that are not meant to be indexed by search engines.