We ingest every Tree News headline in real time, route it through a Groq-hosted classifier, and measure whether the model's directional calls align with sub-minute price action on HyperLiquid perpetuals. This is the most recent end-to-end review.
Crypto news is fast and information-dense. Most of the alpha from a material headline (an exploit, an ETF flow, a regulatory action) is absorbed by price within seconds. If an LLM-driven classifier can decide buy / sell / skip faster than a human trader can finish reading the headline, there may be a tradeable edge worth building infrastructure around.
This review stress-tests that premise. We compare the classifier's calls against real HyperLiquid 1-minute and 5-minute price moves on 20 tracked coins, and we deliberately sample all four corners of the confusion matrix — including the uncomfortable ones.
Each bucket tests a different outcome: detection misses, classifier false negatives, signals that worked, and signals that didn't. Together, they tell us whether the system is closer to an edge or to noise.
Headlines Tree News missed entirely
Sample of headlines from the raw feed that our coin detector did not tag. Used to surface detection gaps — if a mentioned-but-undetected ticker shows a real price move, the detector has a blind spot.
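A minimal sketch of how such a gap check could look, assuming a simple whole-word ticker match. The `TRACKED` set, `mentioned_tickers`, and `detection_gaps` are hypothetical names for illustration; the production detector and the full list of 20 tracked coins are not shown here.

```python
import re

# Hypothetical subset of tracked tickers (assumption: the real list has 20 coins).
TRACKED = {"BTC", "ETH", "SOL", "TRUMP"}


def mentioned_tickers(headline: str, tracked: set[str] = TRACKED) -> set[str]:
    """Return tracked tickers that appear as whole tokens in the headline."""
    found = set()
    for ticker in tracked:
        # Reject matches embedded in longer words (e.g. "ETH" inside "ETHEREAL");
        # allow an optional leading "$" for cashtags such as "$BTC".
        pattern = rf"(?<![A-Za-z0-9])\$?{re.escape(ticker)}(?![A-Za-z0-9])"
        if re.search(pattern, headline, re.IGNORECASE):
            found.add(ticker)
    return found


def detection_gaps(headline: str, detected: set[str]) -> set[str]:
    """Tickers named in the headline that the detector did not tag."""
    return mentioned_tickers(headline) - detected
```

For example, `detection_gaps("$BTC and ETH rally", {"BTC"})` returns `{"ETH"}`. Any headline with a non-empty result is a candidate blind spot to check against the price move.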
Classifier marked these as non-tradeable
Coin was detected, but the Groq classifier chose SKIP. We sample these to verify that real events aren't being dismissed as noise.
Classifier said trade, price confirmed
Directional calls (LONG or SHORT) on headlines where |ret_5m| ≥ 5 bp. The clearest evidence of edge — headline-driven calls that were immediately validated by price action.
Classifier said trade, price didn't move
Directional calls where |ret_5m| < 2 bp. Used to test the "delayed / stale event" hypothesis — are we firing on old news that markets already priced in?
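The two thresholds above can be sketched as a small bucketing function. This is an illustration only: `ret_bp`, `move_bucket`, and `signed_bp` are hypothetical helpers, the bucket labels are made up, and the direction-adjusted return is an assumption about how alignment might be scored rather than something this review specifies.

```python
def ret_bp(price_at_headline: float, price_later: float) -> float:
    """Simple return in basis points (1 bp = 0.01%)."""
    return (price_later / price_at_headline - 1.0) * 1e4


def move_bucket(ret_5m_bp: float, moved_bp: float = 5.0, flat_bp: float = 2.0) -> str:
    """Bucket a directional call by the magnitude of its 5-minute return."""
    magnitude = abs(ret_5m_bp)
    if magnitude >= moved_bp:
        return "moved"       # |ret_5m| >= 5 bp
    if magnitude < flat_bp:
        return "flat"        # |ret_5m| < 2 bp
    return "unbucketed"      # between the two thresholds


def signed_bp(side: str, ret_5m_bp: float) -> float:
    """Assumed scoring: positive when price moved in the called direction."""
    return ret_5m_bp if side == "LONG" else -ret_5m_bp
```

With these defaults, a SHORT call followed by a -8 bp move lands in the "moved" bucket and scores +8 bp in the called direction.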
Of 824 directional Groq calls on tracked coins, 695 were followed by meaningful price action. 129 went nowhere.
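As shares of all directional calls, those counts work out as follows. The variable names are illustrative; the counts come straight from the line above.

```python
directional_calls = 824
moved = 695   # followed by meaningful price action
flat = 129    # went nowhere

# The two groups account for every directional call.
assert moved + flat == directional_calls

moved_share = moved / directional_calls
flat_share = flat / directional_calls
print(f"{moved_share:.1%} moved, {flat_share:.1%} flat")  # 84.3% moved, 15.7% flat
```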
Our coin detector missed roughly 1,515 headlines that explicitly named a coin we trade. Most were TRUMP. This is an addressable detector bug, not an LLM problem.
The "delayed / stale event" hypothesis doesn't hold: flat-reaction signals are not dominated by old news. The classifier is firing on fresh events in both the confirmed and the flat buckets.
Open the plays browser to inspect all 200 sampled headlines, filtered by category, sorted by PnL, with classifier reasoning and 1m / 5m / 30m returns.