ScanGov Standards
Digital experience standards
Based on public policy, web protocols, guidelines, and best practices.
Indicators
Standard
Why
Guidance
Crawlable
Site is available for indexing by well-behaved agents.
Allows AI bots to find and index your pages, helping your content appear in search results and reach more people.
21st Century Integrated Digital Experience Act (IDEA)
Memorandum (M-23-22)
The Web Robots Pages
Content available in document
Page main content is available in initial document.
Ensures search engines can read and index the content within a document, making it discoverable and improving visibility online.
The Web Robots Pages
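One way to approximate this check is to look for text inside `<main>` in the HTML as served, before any JavaScript runs. The sketch below is a rough heuristic under stated assumptions, not ScanGov's method: the names `MainTextCounter` and `has_main_content` and the `min_chars` threshold are hypothetical, and pages that do not use a `<main>` element would need a different rule.

```python
from html.parser import HTMLParser


class MainTextCounter(HTMLParser):
    """Counts visible text characters found inside <main>."""

    def __init__(self):
        super().__init__()
        self._main_depth = 0  # > 0 while inside a <main> element
        self._skip_depth = 0  # > 0 while inside <script>/<style> within <main>
        self.chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self._main_depth += 1
        elif tag in ("script", "style") and self._main_depth:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "main" and self._main_depth:
            self._main_depth -= 1
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only text inside <main>, and outside scripts/styles, is counted.
        if self._main_depth and not self._skip_depth:
            self.chars += len(data.strip())


def has_main_content(html: str, min_chars: int = 1) -> bool:
    """True if the initial HTML carries at least min_chars of text in <main>."""
    counter = MainTextCounter()
    counter.feed(html)
    counter.close()
    return counter.chars >= min_chars


SERVER_RENDERED = "<html><body><main><h1>Benefits</h1><p>Apply online.</p></main></body></html>"
JS_SHELL = '<html><body><main><div id="root"></div></main><script src="/app.js"></script></body></html>'
```

A page whose `<main>` is filled in only by client-side scripts, like `JS_SHELL`, would fail this heuristic because its initial document contains no text.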
GovernmentOrganization Schema.org Type
Homepage has Schema.org government organization type tags.
Identifies the website as belonging to an official government agency so that AI systems and search engines can recognize it automatically.
Schema.org structured data
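A minimal sketch of how such a check might work, assuming the homepage publishes its Schema.org data as JSON-LD. Microdata and RDFa markup are not covered here, and the names `JsonLdCollector` and `declares_type`, along with the `agency.example.gov` sample, are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects parsed JSON-LD blocks from <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Malformed JSON-LD is ignored in this sketch.


def declares_type(html: str, wanted: str = "GovernmentOrganization") -> bool:
    """True if any top-level JSON-LD item declares the wanted @type."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    for block in collector.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if not isinstance(item, dict):
                continue
            types = item.get("@type", [])
            if isinstance(types, str):
                types = [types]
            if wanted in types:
                return True
    return False


HOMEPAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "GovernmentOrganization",
 "name": "Example Agency", "url": "https://agency.example.gov"}
</script>
</head><body></body></html>"""
```

The sketch inspects only top-level items. Types nested under `@graph` or inside other properties would need an extra traversal step.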
Sitemap status
The HTTP status of /sitemap.xml is OK.
Confirms the sitemap is accessible, ensuring search engines can easily find and index all pages on the site.
sitemaps.org
Sitemap XML
The sitemap file type is XML.
Stores site structure in a machine-readable format, helping search engines efficiently crawl and index all website pages.
sitemaps.org
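The two sitemap indicators above can be illustrated with a short parser that accepts a response body and confirms it is a sitemaps.org XML document. This is a sketch under stated assumptions: the function name `sitemap_locations` and the sample URLs are hypothetical, and fetching `/sitemap.xml` and checking its HTTP status are left out so that the example runs offline.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_locations(body: bytes) -> list[str]:
    """Return the <loc> URLs of a sitemap, or [] if body is not a sitemap."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return []  # Not well-formed XML (for example, an HTML error page).
    valid_roots = (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
    if root.tag not in valid_roots:
        return []
    return [
        loc.text.strip()
        for loc in root.iter(f"{{{SITEMAP_NS}}}loc")
        if loc.text
    ]


SAMPLE_SITEMAP = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://agency.example.gov/</loc></url>
  <url><loc>https://agency.example.gov/contact</loc></url>
</urlset>"""
```

An empty result for a body served at `/sitemap.xml` suggests the file is missing, is not XML, or does not use the sitemaps.org namespace.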
Robots valid
The site has a valid robots policy.
Guides search engines on which pages to crawl or avoid, ensuring important content is indexed and irrelevant pages aren't.
The Web Robots Pages
Robots allowed
The robots policy allows access to crawlers and scrapers.
Permits search engines and web tools to access content, helping improve search visibility and gather relevant data.
The Web Robots Pages
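As an illustration of the robots indicators above, Python's standard `urllib.robotparser` module can evaluate a robots policy against a given user agent and URL. The policy text, the `ExampleBot` agent name, and the helper `is_allowed` are assumptions made for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: everything is crawlable except the admin area.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /

Sitemap: https://agency.example.gov/sitemap.xml
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Note that `urllib.robotparser` applies the first matching rule in file order, so the `Disallow` line is listed before the broader `Allow` line here. Other crawlers may resolve conflicting rules differently.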
Sitemap in robots.txt
The robots.txt file points to a sitemap file.
Helps search engines find the sitemap quickly, improving how they discover and index website pages.
Google Search Central
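A small sketch of how the `Sitemap:` directive might be detected, using `RobotFileParser.site_maps()` from the Python standard library (available since Python 3.8). The helper name `sitemap_urls` and the sample policies are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def sitemap_urls(robots_txt: str) -> list[str]:
    """Return the sitemap URLs declared in robots.txt text, or [] if none."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present.
    return parser.site_maps() or []


WITH_SITEMAP = """\
User-agent: *
Allow: /
Sitemap: https://agency.example.gov/sitemap.xml
"""

WITHOUT_SITEMAP = """\
User-agent: *
Allow: /
"""
```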
Canonical
Pages declare a preferred (canonical) URL to avoid duplication.
Prevents duplicate content issues by telling search engines which version of a page is the main one.
The Web Robots Pages
Google Search Central
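A canonical URL is commonly declared with a `<link rel="canonical">` element in the page head. The sketch below extracts it with the standard `html.parser` module. The names `CanonicalFinder` and `canonical_url` are hypothetical, and canonicals sent through the HTTP `Link` header are not covered.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Collects href values from <link rel="canonical"> elements."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        rels = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rels and attr_map.get("href"):
            self.hrefs.append(attr_map["href"])


def canonical_url(html: str) -> Optional[str]:
    """Return the canonical URL, or None if it is missing or ambiguous."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    unique = set(finder.hrefs)
    return unique.pop() if len(unique) == 1 else None


PAGE = '<head><link rel="canonical" href="https://agency.example.gov/benefits"></head>'
```

Treating conflicting canonical links as a failure is a design choice of this sketch. A page that names two different preferred URLs gives search engines no clear signal.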
Link text
Links have descriptive text.
Describes the link’s purpose clearly, helping users know where it leads and improving navigation for everyone.
Google Search Central
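One simple way to approximate this check is to flag links whose visible text is empty or a generic phrase. The phrase list `GENERIC_PHRASES` and the class `LinkTextAuditor` below are assumptions for illustration, not a published rule set.

```python
from html.parser import HTMLParser

# Hypothetical list of phrases that say nothing about the link target.
GENERIC_PHRASES = {"click here", "here", "read more", "more", "learn more", "link"}


class LinkTextAuditor(HTMLParser):
    """Records (href, text) pairs for links with empty or generic text."""

    def __init__(self):
        super().__init__()
        self._href = None
        self._text = []
        self.vague = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without an href are not links and are ignored.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace before comparing.
            text = " ".join("".join(self._text).split())
            if not text or text.lower() in GENERIC_PHRASES:
                self.vague.append((self._href, text))
            self._href = None


def vague_links(html: str) -> list[tuple[str, str]]:
    auditor = LinkTextAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.vague


PAGE = '<p><a href="/form">Click here</a> or <a href="/benefits">apply for benefits</a>.</p>'
```

This sketch reads only text nodes, so a link that wraps an image with good `alt` text would be flagged as empty. A fuller check would also consider `alt` and `aria-label`.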
hreflang
Pages specify their language and region variants with hreflang.
Indicates page language and region, helping users see the right version and improving search results in different countries.
Google Search Central
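In HTML, hreflang annotations usually appear as `<link rel="alternate" hreflang="..." href="...">` elements in the page head. The sketch below collects them into a mapping from language code to URL. The names `HreflangCollector` and `hreflang_map` and the sample URLs are hypothetical, and annotations delivered through sitemaps or HTTP headers are not covered.

```python
from html.parser import HTMLParser


class HreflangCollector(HTMLParser):
    """Collects hreflang alternates declared with <link rel="alternate">."""

    def __init__(self):
        super().__init__()
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        rels = (attr_map.get("rel") or "").lower().split()
        lang = attr_map.get("hreflang")
        href = attr_map.get("href")
        if "alternate" in rels and lang and href:
            # Language codes are lowercased so lookups are case-insensitive.
            self.alternates[lang.lower()] = href


def hreflang_map(html: str) -> dict[str, str]:
    collector = HreflangCollector()
    collector.feed(html)
    collector.close()
    return collector.alternates


PAGE = """<head>
<link rel="alternate" hreflang="en-US" href="https://agency.example.gov/en/">
<link rel="alternate" hreflang="es" href="https://agency.example.gov/es/">
<link rel="alternate" hreflang="x-default" href="https://agency.example.gov/">
</head>"""
```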