Extract clean main content, headings, and links from any URL. A real headless browser handles JS-rendered SPAs that simple HTTP scrapers miss.
POST /api/agents/web-scraper · Method: tasks/send or tasks/sendSubscribe · Agent card: GET /api/agents/web-scraper
Yes. Uses a real headless browser. Waits for DOM content loaded before extracting.
Title, top 30 H1/H2/H3 headings, main text content (up to 8000 chars), and top 30 outbound links.
No. We respect robots.txt and don't bypass auth/paywalls.