Arguably one of the more straightforward technical elements of SEO, XML sitemaps are often misunderstood. To understand XML sitemaps and use them efficiently, it helps to know what they are and what they are not.
In its simplest form, a sitemap serves as a road map for search engines to discover your website’s most important content and get further context on your website’s overall structure. In addition to providing search engines with a list of URLs, sitemaps can help search engines find newer content, or content located deep within the website’s architecture, which helps websites with a poor internal linking structure.
Now that we know the myths and what sitemaps aren’t, how can we use them to improve our site organically?
Two popular pieces of markup found in XML sitemaps are the ‘priority’ and ‘change frequency’ (changefreq) tags. Many webmasters use this markup in an attempt to improve crawl efficiency and highlight a website’s priority content. However, John Mueller of Google has stated that Google ignores these two signals. He has indicated that Google does use the lastmod markup when analyzing a sitemap. Focusing on this tag and making sure that you are including the right URLs will go a long way toward ensuring that your sitemap is crawled efficiently and has the greatest impact.
A big first step in making sure that your most important content is discovered is to learn how to create a sitemap and place it in the root directory of your server.
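As a rough illustration of what such a file contains, here is a minimal Python sketch that builds a sitemap with a loc and a lastmod entry for each page. The function name build_sitemap and the example.com URLs are assumptions made for this example, and a CMS or plugin would normally generate the file for you.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as UTF-8 bytes.

    pages is an iterable of (url, last_modified) tuples, where
    last_modified is a datetime.date. (Hypothetical helper.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod uses the W3C date format, e.g. 2024-01-15.
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Placeholder pages; replace with your own canonical URLs.
    example_pages = [
        ("https://www.example.com/", date(2024, 1, 15)),
        ("https://www.example.com/blog/", date(2024, 2, 1)),
    ]
    print(build_sitemap(example_pages).decode("utf-8"))
```

The output would be saved as sitemap.xml in the root directory of the site.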
Next, be sure to provide a link to your XML sitemap in your robots.txt file. This file is one of the first places a search engine bot will visit when it hits a website. There it will find directives on what content to crawl and what content to avoid. By including a link to your sitemap, you help ensure that search engines are discovering and crawling your content.
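The link is a single "Sitemap:" line in robots.txt. The sketch below uses Python's standard urllib.robotparser module to confirm that a robots.txt file actually declares a sitemap. The sample file contents and the helper name sitemaps_in_robots are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, including a Sitemap directive.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Sitemap: https://www.example.com/sitemap.xml
"""


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemaps_in_robots(SAMPLE_ROBOTS))
```

An empty list from this check suggests the Sitemap line is missing or malformed.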
A final step is to submit your sitemap directly to Google Search Console and Bing Webmaster Tools. According to Google’s webmaster forum, Google doesn’t check your sitemap every time it is updated, only the first time it notices it. After that, it checks your sitemap only when it is notified that the sitemap has changed. You can send that notification through Google Search Console’s sitemap tool, or by using the “ping” functionality, which asks Google to crawl your sitemap by sending an HTTP GET request.
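For reference, the ping request has historically taken the form of a GET to a ping endpoint with the sitemap URL passed as a query parameter. The sketch below only builds that URL and sends nothing. The endpoint shown is an assumption based on the historically documented format, and Google has since announced that this endpoint is deprecated, so treat it as illustrative and rely on Search Console for submissions.

```python
from urllib.parse import urlencode

# Assumption: the historically documented Google ping endpoint.
# Google has announced its deprecation; shown for illustration only.
PING_ENDPOINT = "https://www.google.com/ping"


def build_ping_url(sitemap_url, endpoint=PING_ENDPOINT):
    """Return the GET URL that would notify the endpoint of a sitemap."""
    # urlencode percent-encodes the sitemap URL so it is safe in a query string.
    return endpoint + "?" + urlencode({"sitemap": sitemap_url})


if __name__ == "__main__":
    print(build_ping_url("https://www.example.com/sitemap.xml"))
```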
It’s imperative that your sitemap references only URLs that are indexable and return a 200 OK response code. Webmasters, SEOs, or dev teams should routinely audit their website’s sitemap to remove pages returning 404 errors, 3xx redirects, and 5xx server errors. This can be done by crawling the sitemap yourself or by using Google Search Console’s XML Sitemap report to identify invalid URLs. Remember, search engines operate on a crawl budget, so every non-indexable URL increases the chance a valid one won’t get crawled.
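A crawl of this kind can be sketched in a few lines of Python using only the standard library. The helper names fetch_status and find_bad_urls are hypothetical. The sketch sends HEAD requests, does not follow redirects (so 3xx codes are reported rather than hidden), and does not handle network failures such as DNS errors.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 3xx codes stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url without following redirects."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def find_bad_urls(urls, get_status=fetch_status):
    """Map each URL that does not return 200 to its status code."""
    statuses = {url: get_status(url) for url in urls}
    return {url: code for url, code in statuses.items() if code != 200}
```

Passing the list of URLs from your sitemap to find_bad_urls returns the entries that are candidates for removal. The get_status parameter exists so the check can be exercised with canned responses instead of live requests.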
Consistency is important to a properly formatted XML sitemap. Make sure to use consistent protocols. If your website is a secure site (uses HTTPS) then make sure that the sitemap and all URLs are using the secure protocol. Otherwise, your sitemap will contain redirects which can affect your crawl efficiency and indexation.
Use consistent subdomains. Since the XML sitemap provides insight into website architecture and organization, each subdomain should have its own sitemap. This will also help keep your sitemaps as condensed as possible.
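Both consistency checks, protocol and subdomain, can be automated by comparing every listed URL against the sitemap’s own location. This is a minimal sketch; the function name inconsistent_urls and the sample URLs are assumptions for illustration.

```python
from urllib.parse import urlsplit


def inconsistent_urls(sitemap_url, urls):
    """Return URLs whose scheme or host differs from the sitemap's own."""
    expected = urlsplit(sitemap_url)
    expected_key = (expected.scheme, expected.netloc.lower())
    mismatched = []
    for url in urls:
        parts = urlsplit(url)
        if (parts.scheme, parts.netloc.lower()) != expected_key:
            mismatched.append(url)
    return mismatched


if __name__ == "__main__":
    listed = [
        "https://www.example.com/pricing",
        "http://www.example.com/about",      # wrong protocol
        "https://blog.example.com/post-1",   # different subdomain
    ]
    print(inconsistent_urls("https://www.example.com/sitemap.xml", listed))
```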
Be sure to include only canonical versions of URLs. URLs that include parameters or session IDs can be considered duplicative and should be excluded. Otherwise, crawl efficiency and overall indexation could suffer. When conducting regular sitemap audits, be sure to look for any non-canonical URLs and remove them. Again, Google Search Console’s sitemap report can help you easily identify non-canonical URLs, and checking this report regularly is a good practice. In addition to Google’s tools in Search Console, BrightEdge’s ContentIQ site audit tools can help SEOs and webmasters identify non-canonical URLs and pages returning non-200 response codes to further audit your XML sitemaps.
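A quick first pass for parameterized URLs can be scripted. The sketch below flags any URL that carries a query string or a path parameter such as ";jsessionid=". It is a heuristic under that assumption only: it does not read each page’s rel="canonical" tag, so it complements the reports mentioned above rather than replacing them. The function name flag_parameterized is hypothetical.

```python
from urllib.parse import urlsplit


def flag_parameterized(urls):
    """Return URLs that carry a query string or a ';' path parameter."""
    flagged = []
    for url in urls:
        parts = urlsplit(url)
        # urlsplit leaves ';param=value' segments inside the path.
        if parts.query or ";" in parts.path:
            flagged.append(url)
    return flagged


if __name__ == "__main__":
    listed = [
        "https://www.example.com/shoes",
        "https://www.example.com/shoes?color=red",
        "https://www.example.com/cart;jsessionid=ABC123",
    ]
    print(flag_parameterized(listed))
```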
A sitemap needs to be UTF-8 encoded. URLs must use entity escape codes for characters like ampersands (&), single quotes ('), double quotes ("), less than (<), and greater than (>). Also, URLs should contain only ASCII characters, so any non-ASCII characters need to be percent-encoded.
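These two rules can be applied in sequence: percent-encode non-ASCII characters first, then entity-escape what remains. The sketch below does this with Python’s standard library. The helper name escape_loc is hypothetical, and it assumes the host name is already ASCII (internationalized domain names need separate handling).

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

# Characters with meaning in URLs that should not be percent-encoded.
# '%' is kept so existing percent-escapes are not encoded twice.
URL_SAFE = "/:?&=#%@+,;!$'()*~"


def escape_loc(url):
    """Return url ready to be placed inside a sitemap <loc> element."""
    # Step 1: percent-encode non-ASCII (and otherwise unsafe) characters.
    ascii_url = quote(url, safe=URL_SAFE)
    # Step 2: entity-escape &, <, > plus single and double quotes.
    return escape(ascii_url, {"'": "&apos;", '"': "&quot;"})


if __name__ == "__main__":
    print(escape_loc("https://www.example.com/café?a=1&b=2"))
```

If you build the file with an XML library rather than by string concatenation, the library usually performs the entity escaping itself, in which case only the percent-encoding step is needed.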
The size of an XML sitemap can quickly get out of hand, especially for larger websites like e-commerce sites. When a sitemap gets too big, it can negatively impact the number of URLs that are crawled and indexed, and it can contribute to your web server getting bogged down if it needs to serve large files. To combat this, each XML sitemap should contain no more than 50,000 URLs and be no larger than 50 MB uncompressed. This means that larger sites may need to use multiple sitemaps referenced from a sitemap index file.
For larger sitemaps, breaking out sections of content into their own sitemaps can help keep content organized and help avoid sitemap bloat. Creating separate sitemaps for videos, images, and blogs may be a good idea.
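As an illustration of the splitting described above, this sketch divides a URL list into groups of at most 50,000 and builds a sitemap index pointing at the resulting files. The function names and the sitemap file names are assumptions for this example, and the sketch checks only the URL count, not the 50 MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into lists of at most size entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML (UTF-8 bytes) listing each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Placeholder URLs standing in for a large product catalog.
    all_urls = [f"https://www.example.com/product/{n}" for n in range(120_000)]
    chunks = chunk_urls(all_urls)
    # Hypothetical file naming scheme for the child sitemaps.
    children = [
        f"https://www.example.com/sitemap-products-{n}.xml"
        for n in range(1, len(chunks) + 1)
    ]
    print(build_sitemap_index(children).decode("utf-8"))
```

The same index file can also list the separate video, image, and blog sitemaps.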
There are many tools that can assist in XML sitemap creation. Many CMSs have dynamic sitemap creation options that you can use to help manage what content is published in your sitemap file. WordPress, for example, has several plugins to help manage sitemaps.
Now that you know how to create, format, set up, and edit a sitemap, it’s time to prepare the list of your most important content to include and get it submitted to the search engines. Get started today!