Methodology

How we measure edge.

01

Pipeline

Tree News streams crypto headlines via WebSocket. Every raw headline enters a coin detector, which attempts to tag the relevant ticker(s) from the title, body, and any exchange metadata. Detected headlines are then routed to the classifier.
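
As a rough illustration of the detection step, here is a minimal sketch of a word-boundary ticker matcher. The alias table, the function name `detect_coins`, and the `metadata_symbols` parameter are all hypothetical; the production detector may work quite differently.

```python
import re

# Hypothetical alias table covering a few of the tracked coins (an assumption,
# not the production mapping).
ALIASES = {
    "BTC": ["BTC", "Bitcoin"],
    "ETH": ["ETH", "Ethereum"],
    "SOL": ["SOL", "Solana"],
}


def detect_coins(title, body="", metadata_symbols=()):
    """Return the sorted tickers found in the title, body, or exchange metadata."""
    text = f"{title} {body}"
    found = set()
    for ticker, names in ALIASES.items():
        for name in names:
            # Word boundaries keep "ETH" from matching inside "method" or "ETF".
            if re.search(rf"\b{re.escape(name)}\b", text, flags=re.IGNORECASE):
                found.add(ticker)
                break
    # Exchange metadata is assumed to arrive as a list of symbol strings.
    for symbol in metadata_symbols:
        if symbol.upper() in ALIASES:
            found.add(symbol.upper())
    return sorted(found)
```

An empty result corresponds to the "Undetected" outcome; anything else would be routed to the classifier.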

02

Classifier

We use Groq's openai/gpt-oss-20b for all directional calls. Sub-second latency is a hard requirement for sub-minute trading; Groq's inference speed is the reason we use it over slower providers. The classifier returns LONG, SHORT, or SKIP, plus a confidence score, reasoning, and whether the event is novel.
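
The classifier output can be pictured as a small validated record. The sketch below assumes the model is asked to reply with a JSON object; the field names (`direction`, `confidence`, `reasoning`, `novel`) and the 0 to 1 confidence range are assumptions for illustration, not the actual response schema.

```python
import json
from dataclasses import dataclass

VALID_DIRECTIONS = {"LONG", "SHORT", "SKIP"}


@dataclass(frozen=True)
class Classification:
    direction: str     # LONG, SHORT, or SKIP
    confidence: float  # assumed to lie in [0, 1]
    reasoning: str
    novel: bool


def parse_classification(raw):
    """Parse and validate one classifier reply (assumed to be a JSON object)."""
    data = json.loads(raw)
    direction = str(data["direction"]).upper()
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"unexpected direction: {direction!r}")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Classification(
        direction=direction,
        confidence=confidence,
        reasoning=str(data.get("reasoning", "")),
        novel=bool(data.get("novel", False)),
    )
```

Rejecting malformed replies up front means a bad model output fails loudly instead of silently becoming a trade signal.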

03

Price data

HyperLiquid perpetual 1-minute OHLC across 20 tracked coins (AAVE, ADA, APT, ARB, AVAX, BNB, BTC, DOGE, ETH, LINK, ONDO, POL, SOL, SUI, TRUMP, TRX, UNI, WLD, XRP, ZEC). For each classified headline we compute the forward return from the headline's timestamp at three horizons: 60 seconds, 5 minutes, and 30 minutes. The window covers 2026-01-06 to 2026-04-19.
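
A minimal sketch of the forward-return calculation on 1-minute bars follows. It assumes the price at any timestamp is the open of the first bar that starts at or after it (the next tradeable print), and it reports returns in basis points. That pricing convention, the function names, and the `ret_60s` and `ret_30m` keys are assumptions; only `ret_5m` appears elsewhere on this page.

```python
from bisect import bisect_left

# Horizon name -> seconds after the headline timestamp.
HORIZONS_S = {"ret_60s": 60, "ret_5m": 300, "ret_30m": 1800}


def price_at(open_times, opens, ts):
    """Open of the first bar starting at or after ts, or None if data has ended."""
    i = bisect_left(open_times, ts)
    return opens[i] if i < len(open_times) else None


def forward_returns_bp(open_times, opens, headline_ts):
    """Forward return in basis points at each horizon; None where bars are missing.

    open_times: sorted bar open times in epoch seconds.
    opens: bar open prices, aligned with open_times.
    """
    entry = price_at(open_times, opens, headline_ts)
    result = {}
    for name, horizon in HORIZONS_S.items():
        exit_price = price_at(open_times, opens, headline_ts + horizon)
        if entry is None or exit_price is None:
            result[name] = None
        else:
            result[name] = (exit_price / entry - 1.0) * 1e4
    return result
```

Returning None rather than 0 for missing bars keeps headlines near the end of the window from being miscounted as flat.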

04

Sampling

The review is not a backtest. It is a structured sample of the four operational outcomes, with 50 headlines drawn from each pool (pool sizes in parentheses):

  • Undetected (12,120) — coin detector returned no tag.
  • Detected + SKIP (25,868) — coin tagged, classifier returned SKIP.
  • Signal + moved (695) — LONG or SHORT call, |ret_5m| ≥ 5bp.
  • Signal + flat (129) — LONG or SHORT call, |ret_5m| < 2bp.

Sampling across all four buckets means we stress-test every failure mode, not just the flattering ones. Categories 1 and 2 (Undetected and Detected + SKIP) exist specifically to catch signals we might be missing; Category 4 (Signal + flat) exists to catch signals we're generating with no market response.
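
The bucketing and sampling rules above can be sketched as follows. The row layout (`coins`, `direction`, `ret_5m_bp`), the bucket labels, and the fixed seed are hypothetical. Under the stated thresholds, a signal whose 5-minute move falls between 2bp and 5bp lands in neither signal pool, and the sketch preserves that.

```python
import random

MOVED_BP = 5.0    # |ret_5m| >= 5bp counts as "moved"
FLAT_BP = 2.0     # |ret_5m| < 2bp counts as "flat"
SAMPLE_SIZE = 50  # headlines drawn from each pool


def assign_bucket(coins, direction, ret_5m_bp):
    """Map one headline to a pool name, or None if it falls in no pool."""
    if not coins:
        return "undetected"
    if direction == "SKIP":
        return "detected_skip"
    if ret_5m_bp is None:
        return None  # no price data to judge the move
    magnitude = abs(ret_5m_bp)
    if magnitude >= MOVED_BP:
        return "signal_moved"
    if magnitude < FLAT_BP:
        return "signal_flat"
    return None  # between 2bp and 5bp: neither moved nor flat


def draw_samples(rows, seed=0):
    """Group rows into pools and draw up to SAMPLE_SIZE from each, reproducibly."""
    pools = {}
    for row in rows:
        bucket = assign_bucket(row["coins"], row["direction"], row["ret_5m_bp"])
        if bucket is not None:
            pools.setdefault(bucket, []).append(row)
    rng = random.Random(seed)
    return {name: rng.sample(pool, min(SAMPLE_SIZE, len(pool)))
            for name, pool in pools.items()}
```

Seeding the generator makes a given review snapshot reproducible: the same rows and seed always yield the same 50 headlines per pool.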

05

What counts as edge

A positive point estimate alone isn't a result — it's a draft. Every slice we report on the plays browser is expected to be re-checked against:

  • sample size (n)
  • sign stability across the sample's first and second halves
  • coin concentration (is one ticker driving the result?)
  • event deduplication (is one news story counted many times?)
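
The four checks above might be computed for one slice along these lines. This is a sketch only: the trade fields (`coin`, `event_id`, `ret_bp`), the assumption that `ret_bp` is already signed by the call's direction, and the choice to keep the first trade per event are illustrative, not the research log's actual procedure.

```python
from collections import Counter
from statistics import mean


def edge_checks(trades):
    """Summarize one slice. trades: chronologically ordered dicts with
    'coin', 'event_id', and 'ret_bp' (direction-adjusted return, assumed)."""
    # Event deduplication: keep only the first trade seen for each event.
    seen, deduped = set(), []
    for trade in trades:
        if trade["event_id"] not in seen:
            seen.add(trade["event_id"])
            deduped.append(trade)

    n = len(deduped)
    returns = [t["ret_bp"] for t in deduped]

    # Sign stability: do the first and second halves share the same sign?
    half = n // 2
    first, second = returns[:half], returns[half:]
    sign_stable = bool(first) and bool(second) and mean(first) * mean(second) > 0

    # Coin concentration: share of the slice owed to the most frequent ticker.
    counts = Counter(t["coin"] for t in deduped)
    top_coin, top_count = counts.most_common(1)[0] if counts else (None, 0)

    return {
        "n": n,
        "duplicates_dropped": len(trades) - n,
        "mean_ret_bp": mean(returns) if returns else None,
        "sign_stable": sign_stable,
        "top_coin": top_coin,
        "top_coin_share": top_count / n if n else 0.0,
    }
```

A slice with a positive mean but an unstable sign, or with most of its trades in one ticker, would be flagged for further analysis rather than reported as edge.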

The review page is a qualitative inspection tool. Quantitative claims get their own analysis in the research log.

06

What this site is not

This is not a live trading dashboard. It is a static export of a specific review snapshot, regenerated and redeployed whenever the underlying review HTML is refreshed. Live P&L, open positions, and risk state live elsewhere — this page exists so that a collaborator or stakeholder can review the system's behavior end-to-end without reading a 4,000-line HTML dump.