Site Architecture
A site’s architecture is the structure of the pages on a website and how they are linked together. It affects how search engines crawl a website and how users navigate through it. Site architecture is an important factor for SEO, and our Hangout Notes cover best practice guidance and advice to help ensure yours is optimally structured. To learn more about the ins and outs of this topic, make sure you check out our Ultimate Guide to Site Architecture Optimisation.
Noindex Search Results Pages and Nofollow Search Navigation
You should noindex your own search results pages, as they are generally lower quality, but you can disallow them instead if crawling them is causing problems with server load. You can nofollow links in the search navigation to prevent crawling, but you should probably keep followed links on the results pages, as these can help Google find new pages if the results pages are crawled.
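As a rough illustration of the two options, the sketch below uses Python's standard urllib.robotparser to check which URLs a disallow rule would block. The robots.txt rules, the /search path and the example.com URLs are assumptions made for the example, not taken from any real site. Bear in mind that a crawler can only see a noindex tag on a page it is allowed to fetch, so the two approaches are alternatives rather than something to stack on the same URL.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block crawling of internal search results
# (the "disallow" option, for when crawling causes server load problems).
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

# Hypothetical tag for the "noindex" option. It only works if the page
# can still be crawled, so it would not be combined with the rule above.
NOINDEX_META = '<meta name="robots" content="noindex">'


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules in ROBOTS_TXT allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/search?q=shoes",
        "https://example.com/products/shoes",
    ):
        print(url, "crawlable:", is_crawlable(url))
```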
URLs Are Only Used to Identify Pages
Google uses URLs mainly to identify pages, so grouping pages by path structure doesn’t make a difference.
Depth of Content Affects Crawl Rates
If content is buried deep in the site, it might take longer for Google to discover it, or to pick up changes to it, so improving internal linking from higher levels will help these pages be crawled faster.
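To make "depth" concrete, here is a minimal sketch that measures how many clicks each page is from the home page using a breadth-first search over an internal link graph. The link graph, the URLs and the click_depths helper are all hypothetical, and this is only one simple way to measure depth, not a description of how Google does it.

```python
from collections import deque


def click_depths(links, home):
    """Return the minimum number of clicks from home to each reachable page.

    links maps a page URL to the list of pages it links to.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site where a product is only reachable via paginated listings.
SITE = {
    "/": ["/category"],
    "/category": ["/category/page-2"],
    "/category/page-2": ["/product-x"],
}
before = click_depths(SITE, "/")

# Same site with one extra internal link from the home page to the product.
IMPROVED = {**SITE, "/": ["/category", "/product-x"]}
after = click_depths(IMPROVED, "/")

if __name__ == "__main__":
    print("before:", before["/product-x"], "clicks; after:", after["/product-x"])
```

In this made-up example, a single link from the home page moves the product from three clicks deep to one.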
High Volume of Sitewide Links Makes It Harder To Understand Connections Between Pages
John recommends against high volumes of sitewide navigation links, which make it harder for Google to understand the connections between pages.
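One way to spot sitewide links in your own crawl data is to count how many pages link to each target and flag the ones that appear almost everywhere. The sketch below does this on a tiny made-up link graph; the sitewide_links helper and the 90% threshold are assumptions chosen for illustration, not a figure from Google.

```python
from collections import Counter


def sitewide_links(links, threshold=0.9):
    """Return link targets that appear on at least `threshold` of all pages.

    links maps a page URL to the list of pages it links to.
    """
    page_count = len(links)
    if page_count == 0:
        return set()
    # Count each target once per linking page, even if it is linked repeatedly.
    counts = Counter(
        target for targets in links.values() for target in set(targets)
    )
    return {t for t, n in counts.items() if n / page_count >= threshold}


# Hypothetical crawl: every page links to /about and /contact in its navigation.
PAGES = {
    "/": ["/about", "/contact", "/shoes"],
    "/shoes": ["/about", "/contact", "/shoes/red"],
    "/shoes/red": ["/about", "/contact", "/shoes"],
}

if __name__ == "__main__":
    print(sorted(sitewide_links(PAGES)))
```

Links that survive this filter are navigation rather than contextual links, so a long list is a hint that the navigation may be drowning out the more meaningful connections.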
Google Identifies Boilerplate Content
John discusses how Google analyses the structure of pages to identify their standard boilerplate elements.
Use Separate Pages per Language
It’s best to use a separate page for each language rather than combining multiple languages on a single page.
Content Behind Search Forms May Not Be Seen
Google will have trouble finding all the content on sites with a large number of pages which can only be reached through a search form. John recommends some kind of sensible linking structure.
Migrate Replaced Products to New URLs
If a product is replaced, you can move the old product content to a new archive URL, then put the latest product on the existing URL. This allows the same URL to keep ranking over time and always serve the latest version of the product, while the historical product content is preserved.
Click Depth Will Affect PageRank
The higher the number of clicks from the home page (crawl depth), the lower the PageRank and crawl rate, which could affect rankings.
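The relationship between depth and PageRank can be illustrated with the simplified textbook version of the algorithm. The sketch below is a toy model only: the link graph, the 0.85 damping factor and the iteration count are assumptions, and Google's production systems are certainly more involved than this.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank by power iteration.

    links maps a page URL to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly between its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical site: a chain of pages, each linking back to the home page.
CHAIN = {
    "/": ["/a"],
    "/a": ["/", "/a/b"],
    "/a/b": ["/", "/a/b/c"],
    "/a/b/c": ["/"],
}
ranks = pagerank(CHAIN)

if __name__ == "__main__":
    for page in sorted(ranks, key=ranks.get, reverse=True):
        print(f"{page:10s} {ranks[page]:.3f}")
```

In this toy graph each extra click away from the home page leaves a page with a smaller score, which is the pattern the note describes.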
Canonicalised Pages Stay in Google’s Index
Canonicalised pages may continue to show as indexed in site: searches depending on the ‘site structure’. A canonical tag is not treated as strongly as a redirect, and the canonicalised page can still surface for unique content. Canonical URLs are also not crawled immediately, as a redirect target would be. John suggests that if you have a large number of incorrect canonical tags, such as many pages canonicalising to a single page, Google might ignore all canonical tags across the site. Google clearly recommends cleaning up broken canonical tags.
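A simple audit can help find the "many pages canonicalising to a single page" pattern before it becomes a problem. The sketch below extracts the canonical tag from each page with Python's standard html.parser and counts how many other pages point at each target. The page HTML, the URLs and the helper names are hypothetical, and what counts as "too many" is left to your own judgement.

```python
from collections import Counter
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def extract_canonical(html):
    """Return the canonical URL declared in html, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


def canonical_targets(pages):
    """Count how many OTHER pages canonicalise to each target URL.

    pages maps a page URL to its HTML source.
    """
    counts = Counter()
    for url, html in pages.items():
        target = extract_canonical(html)
        if target and target != url:  # ignore self-referencing canonicals
            counts[target] += 1
    return counts


# Hypothetical crawl in which two pages wrongly canonicalise to the home page.
HOME = "https://example.com/"
CRAWL = {
    HOME: f'<head><link rel="canonical" href="{HOME}"></head>',
    HOME + "a": f'<head><link rel="canonical" href="{HOME}"></head>',
    HOME + "b": f'<head><link rel="canonical" href="{HOME}" /></head>',
    HOME + "c": f'<head><link rel="canonical" href="{HOME}c"></head>',
}

if __name__ == "__main__":
    for target, count in canonical_targets(CRAWL).most_common():
        print(f"{count} other page(s) canonicalise to {target}")
```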