Slaash Guide

How Slaash distills web pages to the 1% that answers your question.

The problem

A typical web page is 50,000–500,000 characters. An AI agent typically needs about 500 of those characters to answer a question, so sending the full page wastes 99% or more of the tokens and costs $0.50–$3.50 per page.

Slaash reduces that to ~$0.002 per page by extracting only the relevant nodes.
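The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted above; the function name is illustrative and not part of Slaash.

```python
def wasted_fraction(page_chars: int, needed_chars: int) -> float:
    """Fraction of a page's characters that do not contribute to the answer."""
    return 1 - needed_chars / page_chars


# Figures quoted in this guide: 50,000-500,000 characters per page, ~500 needed.
small = wasted_fraction(50_000, 500)
large = wasted_fraction(500_000, 500)
print(f"{small:.1%} wasted on a 50,000-character page")
print(f"{large:.1%} wasted on a 500,000-character page")

# Cost ratio between sending the full page and the ~$0.002 distilled page.
low_ratio = 0.50 / 0.002
high_ratio = 3.50 / 0.002
print(f"full page costs about {low_ratio:.0f}x to {high_ratio:.0f}x more")
```

On the quoted figures, the waste is 99% at the small end and 99.9% at the large end, and the full page costs roughly 250 to 1,750 times more than the distilled one.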

How extraction works

When you call slaash.extract, Slaash runs three steps:

Parse — the page is parsed into a structured tree of meaningful elements: headings, links, buttons, prices, text blocks.
Score — every element is scored against your goal. Relevant nodes are ranked and the best matches float to the top.
Select — top nodes are returned, sorted by relevance. Slaash often returns fewer nodes than requested when the answer is concentrated in a few elements.
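The three steps above can be illustrated with a toy pipeline. This is a minimal sketch and not Slaash's actual implementation. The Node class, the parse, score and select functions, and the keyword-overlap scoring are all assumptions made for illustration; the guide does not describe how Slaash scores nodes.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: str
    role: str
    label: str
    relevance: float = 0.0


def parse(raw_elements):
    """Stand-in for Parse: wrap (role, text) pairs as nodes."""
    return [Node(f"n{i}", role, text) for i, (role, text) in enumerate(raw_elements)]


def score(nodes, goal):
    """Stand-in for Score: fraction of goal terms that appear in each label."""
    terms = goal.lower().split()
    for node in nodes:
        label = node.label.lower()
        node.relevance = sum(term in label for term in terms) / len(terms)
    return nodes


def select(nodes, top_n):
    """Stand-in for Select: drop irrelevant nodes, return the best top_n."""
    relevant = [n for n in nodes if n.relevance > 0]
    relevant.sort(key=lambda n: n.relevance, reverse=True)
    return relevant[:top_n]


# Hypothetical page content, already reduced to (role, text) pairs.
page = [
    ("heading", "Wireless Headphones X200"),
    ("text", "Free shipping on orders over $50"),
    ("price", "Price: $129 USD"),
    ("button", "Add to cart"),
]

result = select(score(parse(page), "price cost $ usd"), top_n=3)
for node in result:
    print(node.node_id, node.role, node.label, round(node.relevance, 2))
```

Although three nodes are requested, only two match any goal term, so only two are returned. This mirrors the case where the answer is concentrated in a few elements.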

Goal expansion

Always expand your goal. Include synonyms, translations, and expected values. The more terms you provide, the better Slaash can match.

Bad goal: "price"

Good goal: "price cost amount $ USD kr pris belopp total fee"

In the Playground, the examples show expanded goals. In production, your AI agent should auto-expand before calling Slaash.
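One way an agent could auto-expand a goal is with a lookup table of related terms. The sketch below is an assumption about how such a step might look, not a Slaash feature: expand_goal and the EXPANSIONS table are hypothetical names, and a production agent might generate the extra terms with an LLM instead.

```python
# Hypothetical synonym table; the entries reuse the "good goal" example above.
EXPANSIONS = {
    "price": ["cost", "amount", "$", "USD", "kr", "pris", "belopp", "total", "fee"],
}


def expand_goal(goal: str, table: dict[str, list[str]] = EXPANSIONS) -> str:
    """Append known related terms to each goal word, keeping order and dropping duplicates."""
    expanded: list[str] = []
    for term in goal.split():
        for candidate in [term, *table.get(term.lower(), [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return " ".join(expanded)


print(expand_goal("price"))
```

The expanded string would then be passed as the goal when calling Slaash. Terms with no entry in the table pass through unchanged.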

Learning from feedback

Slaash learns from feedback. When you call slaash.learn with the node IDs that contained the correct answer, the system remembers. Next time a similar query is made, those nodes rank higher automatically.

Key features:

Persists across sessions
Transfers across similar pages on the same domain
Requires zero training data — just tell it what was right
Gets more accurate with each feedback signal
Per-user isolation — your feedback only affects your results
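The feedback behaviour described above can be pictured with a toy model. This is a sketch under stated assumptions, not how Slaash stores or applies feedback. FeedbackStore, its method signatures, and the fixed boost per signal are hypothetical. The sketch also matches goals exactly rather than by similarity and keeps everything in memory rather than persisting across sessions.

```python
from collections import defaultdict


class FeedbackStore:
    """Toy per-user memory of which nodes answered which goals."""

    def __init__(self, boost_per_signal: float = 0.1):
        self.boost_per_signal = boost_per_signal
        # (user, domain, goal) -> node_id -> number of positive signals
        self._hits = defaultdict(lambda: defaultdict(int))

    def learn(self, user, domain, goal, node_ids):
        """Record that these nodes contained the correct answer."""
        for node_id in node_ids:
            self._hits[(user, domain, goal)][node_id] += 1

    def rerank(self, user, domain, goal, scored):
        """Re-sort (node_id, relevance) pairs after adding this user's boosts."""
        hits = self._hits.get((user, domain, goal), {})
        boosted = [
            (node_id, relevance + self.boost_per_signal * hits.get(node_id, 0))
            for node_id, relevance in scored
        ]
        return sorted(boosted, key=lambda pair: pair[1], reverse=True)


store = FeedbackStore()
scored = [("n1", 0.50), ("n2", 0.45)]

# Alice reports that n2 held the right answer; Bob gives no feedback.
store.learn("alice", "shop.example", "price", ["n2"])
alice = store.rerank("alice", "shop.example", "price", scored)
bob = store.rerank("bob", "shop.example", "price", scored)
print("alice:", alice[0][0], "bob:", bob[0][0])
```

Keying the store by user keeps one user's feedback from changing another user's ranking, and keying by domain lets a signal carry over to other pages on the same site.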

Output formats

JSON — full structured data: node ID, role, label, relevance, confidence, causal_boost, action, value. Best for programmatic processing.

Markdown — human-readable text with headings and links. Best for feeding directly to an LLM.

TOON — Token-Oriented Object Notation. Minimal format: H5|Heading text|1.23. 86% smaller than JSON. Best for tight token budgets.
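To give a feel for the size difference between JSON and TOON, the sketch below serialises one node both ways. It assumes the TOON example reads as node ID, label, relevance; the JSON key names and the sample values are made up for illustration; and to_toon is a hypothetical helper, not a Slaash function. The saving on real pages depends on the data.

```python
import json

# Sample node; key names follow the field list above, values are invented.
node = {
    "id": "H5",
    "role": "heading",
    "label": "Heading text",
    "relevance": 1.23,
    "confidence": 0.9,
    "causal_boost": 0.0,
    "action": None,
    "value": None,
}


def to_toon(n: dict) -> str:
    """Assumed TOON layout: node ID, label and relevance separated by pipes."""
    return f'{n["id"]}|{n["label"]}|{n["relevance"]}'


json_text = json.dumps(node)
toon_line = to_toon(node)
saving = 1 - len(toon_line) / len(json_text)

print(toon_line)
print(f"{len(json_text)} characters as JSON, {len(toon_line)} as TOON ({saving:.0%} smaller)")
```

The compact line drops the key names and the fields an LLM rarely needs, which is where most of the saving comes from in this sketch.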

When to use which primitive

extract — you know the URL and have a question
search — you don't know which URL has the answer
crawl — the answer spans multiple linked pages
stream — the page is massive (Wikipedia, docs) and you want token savings
act — you need to interact (click, fill, extract structured data)
plan — multi-step goal that needs decomposition
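The list above can be turned into a small decision helper for an agent. This is only a sketch: choose_primitive and its flags are hypothetical, and the order in which the conditions are checked is an assumption, since the guide does not say which primitive wins when several apply.

```python
def choose_primitive(
    *,
    know_url: bool,
    multi_step: bool = False,
    needs_interaction: bool = False,
    spans_pages: bool = False,
    huge_page: bool = False,
) -> str:
    """Map a task description to one primitive name from the list above."""
    if multi_step:
        return "plan"      # multi-step goal that needs decomposition
    if needs_interaction:
        return "act"       # click, fill, extract structured data
    if not know_url:
        return "search"    # no known URL for the answer
    if spans_pages:
        return "crawl"     # answer spread over linked pages
    if huge_page:
        return "stream"    # very large page, token savings wanted
    return "extract"       # known URL and a question


print(choose_primitive(know_url=True))
print(choose_primitive(know_url=False))
print(choose_primitive(know_url=True, huge_page=True))
```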