Sitemaps
A sitemap is a list of all the live URLs on a site. It tells search engine crawlers which pages are most important and therefore which ones should be crawled and indexed.
There are several things to consider when creating sitemaps, including how search engines view them. We cover a range of these topics in our SEO Office Hours Notes below, along with best practice recommendations and Google’s advice on sitemaps.
For more on sitemaps and SEO, check out our article: How to Improve Website Crawlability with Sitemaps.
URLs in Sitemaps Are Not Guaranteed to be Indexed
Google may choose not to index URLs in a sitemap when they are very similar to URLs that are already indexed and differ from the ones linked within the site (e.g. trailing slash vs. non-trailing slash).
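One way to spot this kind of mismatch is to compare the URLs in the sitemap against the URLs found in internal links. The following is a minimal sketch, assuming both lists have already been collected (for example from a crawl); the function names are our own and purely illustrative.

```python
from urllib.parse import urlsplit, urlunsplit


def slash_key(url):
    """Return a comparison key that ignores a trailing slash on the path."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    # The fragment is dropped because it plays no part in indexing.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def slash_mismatches(sitemap_urls, linked_urls):
    """Return (sitemap_url, linked_url) pairs whose paths differ only by a trailing slash."""
    linked_by_key = {slash_key(u): u for u in linked_urls}
    mismatches = []
    for url in sitemap_urls:
        linked = linked_by_key.get(slash_key(url))
        if linked is None:
            continue
        # An empty path and "/" are the same URL, so treat them as equal.
        sitemap_path = urlsplit(url).path or "/"
        linked_path = urlsplit(linked).path or "/"
        if sitemap_path != linked_path:
            mismatches.append((url, linked))
    return mismatches
```

Any pair returned here is a candidate for aligning the sitemap entry with the version that is actually linked on the site.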
Mixed Migrations May Cause Google to Index HTTP or HTTPS URLs
Forgetting to update your sitemap files following an HTTPS migration could cause some pages to be indexed with the HTTP URL and others with the HTTPS URL.
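A quick post-migration check is to scan the sitemap for any entries that still use the HTTP scheme. This is a rough sketch, assuming a standard XML sitemap is available as a string; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def leftover_http_urls(sitemap_xml):
    """Return the <loc> values in a sitemap that still start with http://."""
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.findall(".//sm:loc", SITEMAP_NS) if el.text]
    return [url for url in locs if url.lower().startswith("http://")]
```

An empty result suggests the sitemap has been fully updated to HTTPS URLs.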
Video Sitemaps Can Specify Countries Where Content is Available
With a video sitemap you can define the countries in which your content is available, and this information is used for video search results.
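As an illustration, the sketch below builds a video sitemap entry that includes a country restriction. It assumes Google's video sitemap extension, where a video:restriction element with relationship="allow" carries a space-separated list of country codes; the URLs, titles and function names are placeholders of our own.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"

# Register prefixes so the output uses <url> and <video:...> tags.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("video", VIDEO_NS)


def video_entry(page_url, title, description, thumbnail_url, content_url, allowed_countries):
    """Build one <url> element describing a video and where it may be shown."""
    url = ET.Element(f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
    ET.SubElement(video, f"{{{VIDEO_NS}}}thumbnail_loc").text = thumbnail_url
    ET.SubElement(video, f"{{{VIDEO_NS}}}title").text = title
    ET.SubElement(video, f"{{{VIDEO_NS}}}description").text = description
    ET.SubElement(video, f"{{{VIDEO_NS}}}content_loc").text = content_url
    # Assumption: country codes are space-separated, e.g. "GB US".
    restriction = ET.SubElement(video, f"{{{VIDEO_NS}}}restriction", relationship="allow")
    restriction.text = " ".join(allowed_countries)
    return url


def build_video_sitemap(entries):
    """Wrap <url> elements in a <urlset> and return the XML as a string."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    urlset.extend(entries)
    return ET.tostring(urlset, encoding="unicode")
```

Passing ["GB", "US"] as allowed_countries would mark the video as available in those two countries only.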
Low Proportion of Indexed Pages Points to Technical Issue
If a site has a low proportion of indexed pages, this usually points to a technical issue rather than a quality issue. Compare the sitemap index counts with the Index Status report to spot differences. Try splitting up the sitemap file, checking indexed pages with the info: query, and checking that rel canonicals match the URLs in the sitemap file, along with hreflang and internal linking. Uppercase, lowercase and trailing slashes all matter too. Then check crawl stats to get an idea of the crawl rate and whether it’s reasonable.
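Splitting the sitemap by site section makes it easier to see which part of the site has the lowest indexed count. Here is a minimal sketch that groups URLs by their first path segment; grouping by path segment is our assumption about a sensible split, not a rule from Google.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_section(urls):
    """Group URLs by first path segment so each group can become its own sitemap file."""
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        # URLs with no path segment (such as the homepage) go into "root".
        section = segments[0] if segments else "root"
        groups[section].append(url)
    return dict(groups)
```

Each group can then be submitted as a separate sitemap file and its indexed count compared with the others.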
GSC Sitemaps Report Can Take a Couple of Days to Update
The Sitemaps report in GSC can take a couple of days to update after changes have been made to the sitemap, which may explain why errors that no longer exist are still reported.
Google Validates Sitemap Files Immediately After Submission
The 50k URL limit for sitemaps is based on the number of entries or elements in the sitemap file (including alternate linked URLs), and this is validated immediately after the file is submitted. If there are too many URLs in the sitemap file, you will be made aware of that straight away.
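To get a rough idea of how close a file is to the limit before submitting it, you can count its entries yourself. The sketch below assumes that each loc and each xhtml:link alternate counts as one entry; that counting rule is our reading of the note above, not a documented formula, and the function names are illustrative.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"
MAX_ENTRIES = 50000


def count_entries(sitemap_xml):
    """Count <loc> entries plus <xhtml:link> alternates inside each <url>."""
    root = ET.fromstring(sitemap_xml)
    locs = root.findall(f"{{{SITEMAP_NS}}}url/{{{SITEMAP_NS}}}loc")
    alternates = root.findall(f"{{{SITEMAP_NS}}}url/{{{XHTML_NS}}}link")
    return len(locs) + len(alternates)


def exceeds_limit(sitemap_xml, limit=MAX_ENTRIES):
    """Return True if the assumed entry count is above the limit."""
    return count_entries(sitemap_xml) > limit
```

Under this assumption, a sitemap with 20,000 URLs that each list two hreflang alternates would count as 60,000 entries and would need to be split.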
Ensure Separate Sitemap Files Don’t Contain URL Overlap
Having separate dynamic and static sitemap files is fine, as long as there is no URL overlap.
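Checking for overlap comes down to comparing the sets of URLs in each file. This is a minimal sketch, assuming both sitemaps are available as XML strings; the function names are our own.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> values found in a sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {el.text.strip() for el in root.findall(".//sm:loc", SITEMAP_NS) if el.text}


def overlapping_urls(first_xml, second_xml):
    """Return the URLs that appear in both sitemaps, sorted for stable output."""
    return sorted(sitemap_urls(first_xml) & sitemap_urls(second_xml))
```

An empty list means the static and dynamic files do not share any URLs.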
Submit Sitemap With Updated Last Modification Date For Faster Crawling of Updated Pages
Submit a sitemap file with an updated last modification date to speed up the crawling and indexing of pages that have been changed.
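The last modification date lives in the lastmod element of each sitemap entry. The sketch below generates a sitemap from a list of pages and their modification dates; the input format and function name are hypothetical, and in practice the dates would come from your CMS or database.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Use the sitemap namespace as the default so tags are written without a prefix.
ET.register_namespace("", SITEMAP_NS)


def build_sitemap(pages):
    """Build sitemap XML from an iterable of (url, last_modified_date) pairs."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, modified in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        # isoformat() on a date gives YYYY-MM-DD, a valid W3C date for <lastmod>.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")
```

The lastmod value should only change when the page content has actually changed, so that the date remains a useful signal.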
Site: Search Operator Isn’t True Indicator of All Indexed Pages
The site: search operator isn’t a true indicator of all the pages that are indexed on a site. Use a sitemap file to submit the URLs you care about.
Internal & Sitemap Links May Override Canonical Tags
Google uses a number of factors to determine which URLs to show. Canonicalised pages may still be chosen if you link to them internally and include them in sitemaps.