URL Architecture
URL architecture relates to the structure of a webpage’s URL and how it can impact a page’s performance in search. There are several elements to consider when creating a URL structure to ensure it is optimised for both search engines and users. These are covered within our Hangout Notes, along with recommendations and insights from Google.
Excessive URL Parameters and Rewrites Can Cause Problems
Google can have problems crawling your site if your URL structure has an excessive number of URL parameters and rewrites which redirect to a few pages.
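As a rough way to audit for this, the sketch below counts the query parameters on a URL and flags those above a limit. The limit of 3 and the function names are assumptions chosen for illustration; the notes give no specific number for what counts as excessive.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical threshold: the notes do not define "excessive".
MAX_PARAMS = 3


def count_query_params(url: str) -> int:
    """Return the number of query parameters in a URL, including blank ones."""
    query = urlsplit(url).query
    return len(parse_qsl(query, keep_blank_values=True))


def has_excessive_params(url: str, limit: int = MAX_PARAMS) -> bool:
    """Flag URLs whose parameter count exceeds the chosen limit."""
    return count_query_params(url) > limit
```

Running has_excessive_params over a crawl export would give a shortlist of URLs to review by hand; the right limit depends on the site.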
Avoid Changing URLs
Changing URLs is best avoided, as Google has to relearn all the links and context of the affected pages. Google acknowledges that it is not perfect at handling these changes, which can take a few months to settle down, and recommends planning URLs for the long term.
Re-use Old URLs for Updated Content
For pages which are updated annually, or when a new product model is released, it makes more sense to keep the same URL and update the content. Otherwise, you may see the old page ranking instead of the new one because the old page has stronger ranking signals. If you need to keep the old content, you can republish it on a new archive URL.
Set up Image Redirects when URLs Change
If you change image URLs, set up redirects to help them get picked up more quickly.
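A minimal sketch of the idea, assuming a hand-maintained lookup table from old image paths to new ones. The paths, the IMAGE_REDIRECTS name, and the choice of a 301 (permanent) status are illustrative assumptions; in practice this mapping would usually live in the web server or CDN configuration.

```python
from typing import Tuple

# Hypothetical mapping of old image paths to their new locations.
IMAGE_REDIRECTS = {
    "/images/old-hero.jpg": "/media/hero.jpg",
    "/images/old-logo.png": "/media/logo.png",
}


def resolve_image_request(path: str) -> Tuple[int, str]:
    """Return (status, path): 301 plus the new path if the image moved,
    otherwise 200 plus the requested path unchanged."""
    new_path = IMAGE_REDIRECTS.get(path)
    if new_path is not None:
        return 301, new_path
    return 200, path
```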
URLs Are Only Used to Identify Pages
Google uses URLs mainly to identify pages, so grouping pages by path structure doesn’t make a difference.
Use Canonical Tags to Resolve Trailing Slashes
Canonical tags are the best way to deal with duplicate pages caused by trailing slashes.
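For illustration, the sketch below picks one form of a URL (with or without the trailing slash) and builds the matching canonical link element to place in the head of both variants. The function names are hypothetical, and the code assumes every URL passed in is a page path rather than a file such as an image.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str, trailing_slash: bool = True) -> str:
    """Normalise a URL to a single trailing-slash convention."""
    parts = urlsplit(url)
    stripped = parts.path.rstrip("/")
    # The site root always keeps its slash, whichever convention is chosen.
    if trailing_slash or not stripped:
        path = stripped + "/"
    else:
        path = stripped
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def canonical_tag(url: str, trailing_slash: bool = True) -> str:
    """Build the canonical link element for the normalised URL."""
    href = escape(canonical_url(url, trailing_slash), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

Both https://example.com/blog/post and https://example.com/blog/post/ would then carry the same tag, pointing at whichever version the site has chosen.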
Google Will Choose a Duplicate Page with the Shortest URL
URL paths don’t affect PageRank. However, if two pages are duplicates of each other, Google will prefer the one with the shorter URL.
Content Hidden in Tabs Can Be Put Onto Separate URLs
If you have content hidden in tabs, you can put it onto separate URLs.
URL Path Structure Isn’t Important
You don’t need to keep a consistent URL architecture, include a detailed hierarchy in the URL, or have a page for every path. Google follows the URLs which are linked and won’t try other URLs by guessing at different paths.
Disallow Rule Must Start with a Slash
If you’re specifying a path in the robots.txt file, you must start with a slash, not a * wildcard. This was always true, but was only recently added to the documentation and Search Console testing tool.
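The rule can be checked mechanically. The sketch below scans the text of a robots.txt file and reports any Disallow rule whose path does not begin with a slash. The function name is hypothetical, and the parsing is deliberately simple rather than a full robots.txt parser.

```python
from typing import List, Tuple


def find_invalid_disallow_rules(robots_txt: str) -> List[Tuple[int, str]]:
    """Return (line number, path) pairs for Disallow rules not starting with '/'."""
    problems = []
    for number, raw_line in enumerate(robots_txt.splitlines(), start=1):
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if not sep or field.strip().lower() != "disallow":
            continue
        path = value.strip()
        # An empty Disallow value is valid and blocks nothing, so skip it.
        if path and not path.startswith("/"):
            problems.append((number, path))
    return problems
```

For example, a file containing the line Disallow: *.pdf would be reported, while Disallow: /*.pdf would pass.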