Heading & Meta Extractor
Audit the structure of any webpage.
What This Tool Does
The Heading and Meta Extractor scans any publicly accessible webpage and extracts its heading hierarchy (H1 through H6), core meta tags (title, description, robots, canonical), and any JSON-LD structured data blocks. It gives you a quick on-page SEO audit without needing browser extensions or manual source code inspection.
Inputs
- Page URL: Enter the full URL of the page you want to scan and click Scan Page.
How It Works
The tool sends the URL to a server-side proxy, which fetches the raw HTML. The response is parsed client-side using the DOMParser API to extract the title element, meta description, robots directives, canonical link, all heading elements in document order, and any <script type="application/ld+json"> blocks. Results are displayed in organized sections.
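The extraction step described above can be sketched roughly as follows. This is not the tool's own code: the tool parses in the browser with DOMParser, while this sketch swaps in Python's standard-library html.parser so it runs on its own. The class name PageExtractor and its attribute names are hypothetical, and the sample HTML is made up for illustration.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Hypothetical helper: collects title, core meta tags, headings, JSON-LD."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.title = None
        self.meta = {}        # e.g. {"description": "...", "robots": "..."}
        self.canonical = None
        self.headings = []    # (level, text) pairs in document order
        self.json_ld = []     # raw text of each JSON-LD script block
        self._capture = None  # kind of element whose text is being buffered
        self._buffer = []
        self._level = 0

    def _start(self, kind):
        self._capture = kind
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        name = (a.get("name") or "").lower()
        if tag == "title" and self.title is None:
            self._start("title")
        elif tag == "meta" and name in ("description", "robots"):
            self.meta[name] = a.get("content") or ""
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonical = a.get("href")
        elif tag in self.HEADINGS:
            self._level = int(tag[1])
            self._start("heading")
        elif tag == "script" and (a.get("type") or "").lower() == "application/ld+json":
            self._start("json_ld")

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        text = "".join(self._buffer).strip()
        if self._capture == "title" and tag == "title":
            self.title = text
        elif self._capture == "heading" and tag in self.HEADINGS:
            # Collapse internal whitespace so nested inline tags read cleanly.
            self.headings.append((self._level, " ".join(text.split())))
        elif self._capture == "json_ld" and tag == "script":
            self.json_ld.append(text)
        else:
            return  # closing tag of a nested or unrelated element
        self._capture = None


sample = """<html><head><title>Example Page</title>
<meta name="description" content="A short demo page.">
<meta name="robots" content="noindex, follow">
<link rel="canonical" href="https://example.com/">
<script type="application/ld+json">{"@type": "Article"}</script>
</head><body><h1>Main</h1><h2>Part <em>one</em></h2></body></html>"""

page = PageExtractor()
page.feed(sample)
page.close()
print(page.title, page.meta, page.canonical, page.headings, page.json_ld)
```

Because the sketch reads only the markup it is given, it shares the tool's limitation: anything injected later by JavaScript never reaches the parser.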
Understanding the Results
- Basic Meta: Shows the title, description, robots tag, and canonical URL. Missing elements are labeled clearly.
- Header Outline: Displays all headings in a tree-like indented format, making it easy to spot hierarchy issues such as missing H1 tags or skipped heading levels.
- Structured Data: Lists any JSON-LD schema blocks found on the page with a preview of each block's content.
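One way to picture the indented Header Outline is the sketch below. The function name render_outline, the two-space indent per level, and the "H2: text" label format are assumptions for illustration; the tool's actual rendering may differ.

```python
def render_outline(headings):
    """Return one indented line per (level, text) heading, in document order."""
    lines = []
    for level, text in headings:
        indent = "  " * (level - 1)  # deeper levels are indented further
        lines.append(f"{indent}H{level}: {text}")
    return "\n".join(lines)


outline = render_outline(
    [(1, "Guide"), (2, "Setup"), (3, "Install"), (2, "Usage")]
)
print(outline)
```

A heading that jumps several levels shows up as an abrupt change in indentation, which is what makes skipped levels easy to spot.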
Step-by-Step Example
- Enter a URL such as https://example.com and click Scan Page.
- Review the Basic Meta section. Confirm the title is present and within 60 characters.
- Check that a meta description exists and is between 120 and 160 characters.
- Verify the canonical URL points to the correct page.
- Review the Header Outline. Ensure there is exactly one H1 and headings follow a logical order.
- Check the Structured Data section for valid JSON-LD blocks.
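The manual checks in the steps above could also be scripted. The sketch below is only an illustration: audit_page and its parameters are hypothetical names, the 60 and 120 to 160 character thresholds are the ones quoted in the steps, and treating a trailing slash as insignificant when comparing canonical URLs is an assumption.

```python
def audit_page(title, description, canonical, expected_url, h1_count):
    """Return a list of human-readable findings; an empty list means all checks passed."""
    findings = []
    if not title:
        findings.append("title is missing")
    elif len(title) > 60:
        findings.append(f"title has {len(title)} characters (limit 60)")
    if not description:
        findings.append("meta description is missing")
    elif not 120 <= len(description) <= 160:
        findings.append(
            f"description has {len(description)} characters (target 120-160)"
        )
    if not canonical:
        findings.append("canonical is missing")
    elif canonical.rstrip("/") != expected_url.rstrip("/"):
        # Assumption: a trailing slash alone does not make two URLs differ.
        findings.append(f"canonical points to {canonical}")
    if h1_count != 1:
        findings.append(f"found {h1_count} H1 headings (expected 1)")
    return findings


report = audit_page(
    title="Example Domain",
    description="Short description.",
    canonical="https://example.com/",
    expected_url="https://example.com",
    h1_count=1,
)
print(report)
```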
Use Cases
- Quick on-page SEO audits before publishing new content.
- Verifying heading hierarchy across site templates.
- Checking that canonical tags are correctly implemented after a URL migration.
- Confirming JSON-LD structured data is present on key pages.
- Competitive analysis of how other sites structure their headings and meta tags.
Limitations and Notes
- Only publicly accessible pages can be scanned. Login-protected or IP-restricted pages are not supported.
- JavaScript-rendered content may not be captured since the tool parses raw HTML.
- The tool shows the first 200 characters of each JSON-LD block. Use a dedicated validator for full schema inspection.
- Dynamic meta tags set by JavaScript frameworks may not appear in the parsed HTML.
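The 200-character JSON-LD preview mentioned above might look something like this sketch. The function name preview, the whitespace collapsing, and the trailing "..." marker are assumptions; the page only states that the first 200 characters are shown.

```python
def preview(block, limit=200):
    """Return a single-line preview of a JSON-LD block, cut at `limit` characters."""
    collapsed = " ".join(block.split())  # assumption: whitespace is collapsed first
    if len(collapsed) <= limit:
        return collapsed
    return collapsed[:limit] + "..."


short = preview('{"@type":\n  "Article"}')
long = preview("x" * 500)
print(short)
print(len(long))
```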
Frequently Asked Questions
Why is heading hierarchy important for SEO?
A logical heading hierarchy helps search engines understand the structure and topics of your content. Pages should have exactly one H1 as the main title, with H2 through H6 used for subsections in order. Skipping levels, such as jumping from H1 to H4, can confuse crawlers and reduce content clarity.
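The two hierarchy rules in this answer, a single H1 and no skipped levels, can be checked mechanically. The sketch below is a minimal illustration; heading_issues is a hypothetical name, and it assumes that only a downward jump of more than one level counts as a skip.

```python
def heading_issues(levels):
    """Report H1-count problems and skipped levels for a list of heading levels."""
    issues = []
    h1_count = levels.count(1)
    if h1_count == 0:
        issues.append("no H1 found")
    elif h1_count > 1:
        issues.append(f"{h1_count} H1 elements found")
    previous = 0
    for level in levels:
        # Going deeper by more than one level (e.g. H1 -> H4) is a skip;
        # moving back up by any amount is fine.
        if previous and level > previous + 1:
            issues.append(f"skipped from H{previous} to H{level}")
        previous = level
    return issues


print(heading_issues([1, 4, 2, 3]))
```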
What meta tags does this tool extract?
This tool extracts the title tag, meta description, robots directives, and canonical URL. It also detects any JSON-LD structured data blocks embedded in the page.
How many H1 tags should a page have?
Best practice is to have exactly one H1 per page. While HTML5 technically allows multiple H1 elements within sectioning elements, using a single H1 gives search engines a clear primary topic signal.
What is JSON-LD structured data?
JSON-LD (JavaScript Object Notation for Linked Data) is a JSON-based format used to embed structured data in web pages. Search engines use it to understand content types such as articles, products, FAQs, and organizations, which can enable rich results in SERPs.
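As a rough illustration, a JSON-LD block is ordinary JSON, so its declared content types can be read with a standard JSON parser. The function name schema_types is hypothetical, and the sketch handles only the common shapes (a single object, a top-level list, or an @graph list); it is not a schema validator.

```python
import json


def schema_types(raw_block):
    """Return the @type values declared in one JSON-LD block, or [] if unparsable."""
    try:
        data = json.loads(raw_block)
    except json.JSONDecodeError:
        return []
    items = data if isinstance(data, list) else [data]
    types = []
    for item in items:
        if not isinstance(item, dict):
            continue
        graph = item.get("@graph")
        nodes = graph if isinstance(graph, list) else [item]
        for node in nodes:
            if isinstance(node, dict) and "@type" in node:
                declared = node["@type"]
                types.extend(declared if isinstance(declared, list) else [declared])
    return types


article = '{"@context": "https://schema.org", "@type": "Article", "headline": "Hi"}'
print(schema_types(article))
```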
Can this tool check pages behind a login?
No. The tool fetches pages through a server-side proxy, so it can only access publicly available URLs. Pages requiring authentication or blocked by robots directives cannot be scanned.
Does the robots meta tag affect crawling?
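It affects indexing and link following rather than crawling itself. The robots meta tag tells search engines whether to index a page and whether to follow its links, and a crawler has to fetch the page before it can read the tag. Blocking crawling is handled by robots.txt. Common values include index, noindex, follow, and nofollow. A noindex directive prevents the page from appearing in search results.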
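The directives listed above arrive as a comma-separated content value, which could be interpreted along these lines. The function name parse_robots and the shape of the returned dictionary are assumptions; the sketch also treats "none" as shorthand for noindex plus nofollow, and a missing tag as indexable and followable.

```python
def parse_robots(content):
    """Split a robots meta content value and summarize what it permits."""
    directives = {
        part.strip().lower()
        for part in (content or "").split(",")
        if part.strip()
    }
    return {
        "directives": sorted(directives),
        "indexable": not ({"noindex", "none"} & directives),
        "follow_links": not ({"nofollow", "none"} & directives),
    }


result = parse_robots("NOINDEX, follow")
print(result)
```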
Sources and References
- Google Search Central - Heading elements: developers.google.com
- Google Search Central - Structured Data: developers.google.com
- Schema.org - JSON-LD: schema.org
- MDN Web Docs - HTML heading elements: developer.mozilla.org
- WHATWG - HTML Living Standard: html.spec.whatwg.org
- web.dev - SEO Audits: web.dev