Slaash Guide
How Slaash distills web pages to the 1% that answers your question.
The problem
A typical web page is 50,000–500,000 characters, while an AI agent needs roughly 500 characters to answer a question. Sending the full page wastes at least 99% of the tokens and costs $0.50–$3.50 per page.
Slaash reduces that to ~$0.002 per page by extracting only the relevant nodes.
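The arithmetic behind these figures can be checked with a few lines of Python. This is only a worked calculation using the numbers quoted above; the helper names are illustrative and not part of Slaash.

```python
# Figures quoted in the text above.
PAGE_CHARS = (50_000, 500_000)   # typical page size range, in characters
ANSWER_CHARS = 500               # what the agent actually needs
FULL_PAGE_COST = (0.50, 3.50)    # USD per page when sending the full page
SLAASH_COST = 0.002              # approximate USD per page after extraction


def useful_fraction(page_chars: int, answer_chars: int = ANSWER_CHARS) -> float:
    """Share of the page that actually helps answer the question."""
    return answer_chars / page_chars


def cost_ratio(full_cost: float, reduced_cost: float = SLAASH_COST) -> float:
    """How many times cheaper the reduced page is than the full page."""
    return full_cost / reduced_cost


for chars in PAGE_CHARS:
    print(f"{chars} chars: {useful_fraction(chars):.1%} of the page is useful")
for cost in FULL_PAGE_COST:
    print(f"${cost:.2f} per page is about {cost_ratio(cost):.0f}x the reduced cost")
```

On the quoted numbers, the useful share is 1% for a 50,000-character page and 0.1% for a 500,000-character page, and the cost drops by a factor of roughly 250 to 1,750.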
How extraction works
When you call slaash.extract, Slaash runs three steps:
- Parse: the page is parsed into a structured tree of meaningful elements: headings, links, buttons, prices, text blocks.
- Score: every element is scored against your goal. Relevant nodes are ranked and the best matches float to the top.
- Select: the top nodes are returned, sorted by relevance. Slaash often returns fewer nodes than requested when the answer is concentrated in a few elements.
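The three steps above can be illustrated with a toy pipeline. This is a minimal sketch, not Slaash's implementation: the tag-to-role table, the keyword-overlap scoring, and every function name here are assumptions made for illustration, and the parser ignores nested elements.

```python
from dataclasses import dataclass
from html.parser import HTMLParser

# Tags treated as "meaningful elements" in this sketch (an assumption).
ROLES = {"h1": "heading", "h2": "heading", "h3": "heading",
         "a": "link", "button": "button", "p": "text"}


@dataclass
class Node:
    id: int
    role: str
    label: str
    relevance: float = 0.0


class _Collector(HTMLParser):
    """Parse step: collect the text of each meaningful element as a node."""

    def __init__(self):
        super().__init__()
        self.nodes = []
        self._role = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in ROLES:
            self._role, self._buf = ROLES[tag], []

    def handle_data(self, data):
        if self._role:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if self._role and ROLES.get(tag) == self._role:
            label = " ".join("".join(self._buf).split())
            if label:
                self.nodes.append(Node(len(self.nodes), self._role, label))
            self._role = None


def parse(html: str) -> list[Node]:
    collector = _Collector()
    collector.feed(html)
    collector.close()
    return collector.nodes


def score(nodes: list[Node], goal: str) -> list[Node]:
    """Score step: fraction of goal terms that appear in the node label."""
    terms = set(goal.lower().split())
    for node in nodes:
        words = set(node.label.lower().split())
        node.relevance = len(terms & words) / len(terms) if terms else 0.0
    return nodes


def select(nodes: list[Node], top_n: int = 5) -> list[Node]:
    """Select step: best matches first, dropping nodes with no match at all."""
    ranked = sorted(nodes, key=lambda n: n.relevance, reverse=True)
    return [n for n in ranked[:top_n] if n.relevance > 0]


def extract(html: str, goal: str, top_n: int = 5) -> list[Node]:
    return select(score(parse(html), goal), top_n)


page = ('<h1>Acme Widget</h1><p>Free shipping on all orders</p>'
        '<p>Total price 49 USD</p><a href="/cart">Add to cart</a>')
result = extract(page, "price cost total usd", top_n=3)
for node in result:
    print(node.id, node.role, node.label, node.relevance)
```

Three nodes are requested here, but only one element matches the goal, so the sketch returns a single node. That mirrors the "fewer nodes than requested" behavior described in the Select step.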
Goal expansion
Always expand your goal. Include synonyms, translations, and expected values. The more terms you provide, the better Slaash can match.
Bad goal: "price"
Good goal: "price cost amount $ USD kr pris belopp total fee"
In the Playground, the examples show expanded goals. In production, your AI agent should auto-expand before calling Slaash.
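A minimal sketch of auto-expansion, assuming a hand-written synonym table. The table and the expand_goal helper are hypothetical and not part of Slaash; in production the agent's own LLM would more likely generate the extra terms.

```python
# Hypothetical synonym table; an LLM or thesaurus would normally supply these.
EXPANSIONS = {
    "price": ["cost", "amount", "$", "USD", "kr", "pris", "belopp", "total", "fee"],
}


def expand_goal(goal: str, table: dict[str, list[str]] = EXPANSIONS) -> str:
    """Append known synonyms and translations to each goal word, without duplicates."""
    terms = []
    seen = set()
    for word in goal.split():
        for term in [word, *table.get(word.lower(), [])]:
            key = term.lower()
            if key not in seen:
                seen.add(key)
                terms.append(term)
    return " ".join(terms)


print(expand_goal("price"))
```

With this table, the bad goal "price" expands into the good goal shown above. Words without an entry in the table pass through unchanged.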
Learning from feedback
Slaash learns from feedback. When you call slaash.learn with the node IDs that contained the correct answer, the system remembers. Next time a similar query is made, those nodes rank higher automatically.
Key features:
- Persists across sessions
- Transfers across similar pages on the same domain
- Requires zero training data — just tell it what was right
- Gets more accurate with each feedback signal
- Per-user isolation — your feedback only affects your results
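To make the idea concrete, here is a toy model of per-user feedback. It is a sketch under assumptions, not how Slaash stores or applies feedback: the FeedbackStore class, the additive boost, and keying by (user, domain, node key) are all invented for illustration, and persistence across sessions is not modeled.

```python
from collections import defaultdict


class FeedbackStore:
    """Toy feedback memory: (user, domain, node key) -> additive relevance boost."""

    def __init__(self, step: float = 0.5):
        self.step = step
        self._boosts = defaultdict(float)

    def learn(self, user: str, domain: str, node_keys: list[str]) -> None:
        """Record that these nodes contained the correct answer for this user."""
        for key in node_keys:
            self._boosts[(user, domain, key)] += self.step

    def rerank(self, user: str, domain: str,
               scored: list[tuple[str, float]]) -> list[tuple[str, float]]:
        """Add this user's boosts to (node key, relevance) pairs and sort, best first."""
        boosted = [(key, rel + self._boosts.get((user, domain, key), 0.0))
                   for key, rel in scored]
        return sorted(boosted, key=lambda pair: pair[1], reverse=True)


store = FeedbackStore()
scored = [("Shipping info", 0.6), ("Total price", 0.4)]

store.learn("alice", "shop.example", ["Total price"])
alice_view = store.rerank("alice", "shop.example", scored)
bob_view = store.rerank("bob", "shop.example", scored)
print(alice_view[0][0], bob_view[0][0])
```

After one feedback signal, the boosted node ranks first for alice, while bob's ranking is unchanged because boosts are stored per user. Keying by domain rather than by URL is what lets the boost carry over to similar pages on the same domain in this sketch.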
Output formats
JSON: full structured data. Each node carries its ID, role, label, relevance, confidence, causal_boost, action, and value. Best for programmatic processing.
Markdown: human-readable text with headings and links. Best for feeding directly to an LLM.
TOON (Token-Oriented Object Notation): a minimal pipe-delimited format, for example H5|Heading text|1.23. 86% smaller than JSON. Best for tight token budgets.
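The size difference between the formats is easy to see by encoding one node both ways. This sketch assumes that the three fields in the example line H5|Heading text|1.23 are node ID, label, and relevance; that reading, the to_toon helper, and the sample field values are guesses for illustration, and the helper does not escape pipes inside labels.

```python
import json


def to_toon(nodes: list[dict]) -> str:
    """Encode nodes as one 'id|label|relevance' line each (assumed field order)."""
    return "\n".join(
        f"{n['id']}|{n['label']}|{n['relevance']:.2f}" for n in nodes
    )


# Sample node using the JSON field names listed above; the values are made up.
nodes = [{
    "id": "H5", "role": "heading", "label": "Heading text",
    "relevance": 1.23, "confidence": 0.9, "causal_boost": 0.0,
    "action": None, "value": None,
}]

as_json = json.dumps(nodes)
as_toon = to_toon(nodes)
print(as_toon)
print(len(as_json), "characters as JSON,", len(as_toon), "characters as TOON")
```

The exact saving depends on the nodes and on which fields are kept, so the ratio printed here will not necessarily match the 86% figure.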
When to use which primitive
- extract — you know the URL and have a question
- search — you don't know which URL has the answer
- crawl — the answer spans multiple linked pages
- stream — the page is massive (Wikipedia, docs) and you want token savings
- act — you need to interact (click, fill, extract structured data)
- plan — multi-step goal that needs decomposition
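The list above can be read as a small decision procedure. The sketch below is one possible encoding: the Task fields and the order in which the checks run are assumptions, since the guide does not say which primitive wins when several conditions hold.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of what the agent is trying to do."""
    has_url: bool = True
    needs_interaction: bool = False
    multi_step: bool = False
    spans_pages: bool = False
    huge_page: bool = False


def choose_primitive(task: Task) -> str:
    """Map task traits to a primitive name; the precedence order is an assumption."""
    if task.multi_step:
        return "plan"
    if not task.has_url:
        return "search"
    if task.needs_interaction:
        return "act"
    if task.spans_pages:
        return "crawl"
    if task.huge_page:
        return "stream"
    return "extract"


print(choose_primitive(Task()))
print(choose_primitive(Task(has_url=False)))
print(choose_primitive(Task(huge_page=True)))
```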