Introduction

Why use larkx?

The token math, real benchmarks, and when larkx is not the right tool.

The token problem

AI tokens have a real dollar cost. A typical source file is ~600 tokens, so a medium codebase of 300 files is 180,000 tokens just to read everything once. At Claude Sonnet pricing (~$3 per million input tokens), that's $0.54 per full project scan.

Most AI tasks don't require reading every file, but the agent doesn't know that until it has explored. It re-reads files across turns. It loses context after compaction and starts over. The actual cost is usually 3-5x higher than the theoretical minimum.
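The arithmetic above can be sketched in a few lines. This is a back-of-the-envelope estimate using this page's rough figures (~600 tokens per file, ~$3 per million tokens, a 3-5x re-reading multiplier), not a measurement:

```python
# Rough cost of a full project scan, using the figures above.
TOKENS_PER_FILE = 600        # ~600 tokens for a typical source file
PRICE_PER_M_TOKENS = 3.00    # approximate Claude Sonnet input pricing, USD per million

def scan_cost(num_files, exploration_multiplier=1.0):
    """Dollar cost for an agent to read every file, with optional re-read overhead."""
    tokens = num_files * TOKENS_PER_FILE * exploration_multiplier
    return tokens * PRICE_PER_M_TOKENS / 1_000_000

print(scan_cost(300))        # theoretical minimum, one full read -> 0.54
print(scan_cost(300, 4))     # typical actual cost at ~4x re-reading -> 2.16
```

The multiplier is the interesting part: the $0.54 floor is rarely what you pay, because exploration and context loss multiply it.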

larkx's answer

  • Level 1 (paths + language): ~8 tokens per file. 300 files = 2.4K tokens.
  • Level 2 (+ symbols + imports): ~80 tokens per file. 300 files = 24K tokens.
  • Level 3 (+ full signatures): ~150 tokens per file. 300 files = 45K tokens.
  • Level 4 (+ AI summaries): ~250 tokens per file. 300 files = 75K tokens.
  • Folder scoping: cuts the bill proportionally; if src/auth is 5% of the project, exploring only it saves 95%.
Net effect: on typical tasks, the AI gets the answer it needs at level 1 or 2, using 1-15% of the tokens it would otherwise spend.
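The per-level budgets above reduce to one multiplication. A minimal sketch, using this page's estimated tokens-per-file figures (the index_budget helper is illustrative, not part of larkx):

```python
# Approximate tokens per file at each larkx index level (estimates from this page).
LEVEL_TOKENS = {1: 8, 2: 80, 3: 150, 4: 250}

def index_budget(num_files, level, scope_fraction=1.0):
    """Estimated token budget to load the index at a given level.

    scope_fraction models folder scoping, e.g. 0.05 if the scoped
    folder holds 5% of the project's files.
    """
    return int(num_files * scope_fraction * LEVEL_TOKENS[level])

print(index_budget(300, 1))          # 2400  -> ~2.4K tokens
print(index_budget(300, 2))          # 24000 -> ~24K tokens
print(index_budget(300, 2, 0.05))    # 1200  -> level 2 scoped to 5% of files
```

Scoping and levels compound, which is why the best case lands so far below the full-read baseline.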

Calculate for your project

Level           Includes                             Tokens   Savings
L1 Paths        Just file paths + language           ~1.6K    99%
L2 Symbols      + function & class names + imports   ~16K     87%
L3 Signatures   + full function signatures           ~30K     75%
L4 Summaries    + one-line AI summary per file       ~50K     58%
Reading every file directly                          ~120K    baseline

Estimated best case: 99% reduction (level 1 + scoping)
Heads up: this is an estimate, not a measurement. Numbers assume ~3.5 characters per token, an average file size of ~2 KB, and average symbol counts. Your actual savings depend on your AI's tokenizer (Claude/GPT/Gemini differ), file size distribution, and how the agent uses tools. Treat these figures as ballpark; run larkx stats in your project for indexed estimates that better match your codebase.
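The estimate's assumptions (~3.5 characters per token, ~2 KB average file) can be reproduced directly. A ballpark sketch only; real tokenizers will land somewhere nearby, not exactly here:

```python
CHARS_PER_TOKEN = 3.5    # rough cross-model average; actual tokenizers differ
AVG_FILE_BYTES = 2048    # ~2 KB average source file assumed by the estimate

def baseline_tokens(num_files):
    """Tokens to read every file once under the assumptions above."""
    return int(num_files * AVG_FILE_BYTES / CHARS_PER_TOKEN)

def savings_pct(index_tokens, baseline):
    """Percent saved versus reading everything."""
    return round(100 * (1 - index_tokens / baseline))

print(baseline_tokens(200))           # ~200 files gives the ~120K baseline above
print(savings_pct(16_000, 120_000))   # level 2 versus baseline -> 87
```

Swapping in your own file count and average file size is the quickest way to sanity-check the table against your codebase.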

When larkx helps the most

  • Projects with 100+ files where the AI cannot just read everything cheaply
  • Refactors that touch many files: larkx's get_impact is far cheaper than grepping
  • Onboarding to unfamiliar codebases: level 4 (with summaries) gives you a free architectural overview
  • Dead code cleanup: reachability-based detection is more accurate than searching for unused symbols

When it might not be worth it

  • Tiny projects (under 30 files): the AI can just read them all
  • One-off scripts: the index overhead exceeds the savings
  • Languages without a parser (Vue, Svelte, Ruby, PHP): these are skipped for now

Quick comparison

Approach                 Initial scan   Per refactor   Accuracy
Plain AI reading files   ~180K tokens   ~50K tokens    Partial; depends on what it opens
Grep / find tools        ~10K tokens    ~5K tokens     Pattern-only, no semantic info
larkx (L2)               ~24K tokens    ~2-5K tokens   Full symbol + dependency graph

Bottom line

If you use AI agents on code daily and your project is bigger than a single folder, larkx pays for itself in the first hour of use. It is free, open source, and runs entirely on your machine.