TL;DR: Sitemap.xml = list of URLs for Google to crawl. Include indexable pages, updated regularly, submitted to GSC. Maximum 50,000 URLs per file, 50MB uncompressed.
What to Include
- All indexable pages.
- Canonical URLs only.
- 200 OK responses.
- Priority content.
What NOT to Include
- noindex pages.
- 404/301 URLs.
- Non-canonical URLs.
- Duplicate content.
- Paginated pages (usually).
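The include/exclude rules above can be sketched as a filter over crawl data. The record fields (status, canonical, noindex) are assumptions about your own crawl output, not any specific library's schema:

```python
# Sketch: filter a crawl's URL records down to sitemap candidates.
from dataclasses import dataclass

@dataclass
class PageRecord:
    url: str         # the crawled URL
    status: int      # HTTP status code
    canonical: str   # canonical URL declared on the page
    noindex: bool    # page carries a noindex directive

def sitemap_candidates(pages):
    """Keep only 200-OK, indexable, self-canonical URLs."""
    return [
        p.url for p in pages
        if p.status == 200          # drop 404s and 301s
        and not p.noindex           # drop noindex pages
        and p.canonical == p.url    # drop non-canonical duplicates
    ]
```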
Format
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://example.com/</loc>
<lastmod>2026-04-25</lastmod>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
</urlset>
Multiple Sitemaps
Large sites (50,000+ URLs) need a sitemap index:
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap><loc>.../sitemap-products.xml</loc></sitemap>
<sitemap><loc>.../sitemap-posts.xml</loc></sitemap>
</sitemapindex>
Specialized Sitemaps
- Image sitemap: for photo-heavy sites.
- Video sitemap: for pages hosting or embedding video.
- News sitemap: for news publishers (Google News).
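The urlset and sitemap-index formats above can be generated programmatically. A minimal sketch using the standard library, chunking at the 50,000-URL limit; file naming (sitemap-1.xml, ...) and the base URL are illustrative choices:

```python
# Sketch: emit urlset files (max 50,000 URLs each) plus a sitemap index.
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
LIMIT = 50_000  # per-file URL cap from the sitemaps protocol

def build_urlset(urls):
    root = ET.Element("urlset", xmlns=NS)
    for u in urls:
        node = ET.SubElement(root, "url")
        ET.SubElement(node, "loc").text = u
    return ET.tostring(root, encoding="unicode")

def build_index(sitemap_urls):
    root = ET.Element("sitemapindex", xmlns=NS)
    for u in sitemap_urls:
        node = ET.SubElement(root, "sitemap")
        ET.SubElement(node, "loc").text = u
    return ET.tostring(root, encoding="unicode")

def build_sitemaps(urls, base="https://example.com"):
    """Return {filename: xml}, chunked at the 50,000-URL limit."""
    chunks = [urls[i:i + LIMIT] for i in range(0, len(urls), LIMIT)]
    files = {f"sitemap-{n}.xml": build_urlset(c) for n, c in enumerate(chunks, 1)}
    # Index file points at each chunk; the generator runs before the assignment.
    files["sitemap.xml"] = build_index(f"{base}/{name}" for name in files)
    return files
```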
Submission
- Create sitemap.xml.
- Add to robots.txt:
Sitemap: https://example.com/sitemap.xml
- Submit in GSC → Sitemaps.
- Monitor: check for errors.
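To confirm the robots.txt directive above is actually being served, you can parse Sitemap: lines out of the fetched file. A small sketch (the directive is case-insensitive and takes an absolute URL):

```python
# Sketch: pull Sitemap: directives out of a robots.txt body.
def sitemap_urls_from_robots(robots_txt: str) -> list[str]:
    urls = []
    for line in robots_txt.splitlines():
        # Split only on the first colon, so the URL's "https:" survives.
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls
```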
Frequency of Update
Update on every content change; automatic generation by the CMS is ideal.
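When generation is automated, the <lastmod> value can be derived from each page's stored modification date so the sitemap stays current without manual edits. A sketch, where the (url, modified) pair stands in for whatever your CMS exposes:

```python
# Sketch: stamp <lastmod> from a page's modification date.
from datetime import date

def url_entry(loc: str, modified: date) -> str:
    # W3C date format (YYYY-MM-DD) is what <lastmod> expects.
    return (
        "<url>"
        f"<loc>{loc}</loc>"
        f"<lastmod>{modified.isoformat()}</lastmod>"
        "</url>"
    )
```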