# Why use larkx?
The token math, real benchmarks, and when larkx is not the right tool.
## The token problem
AI tokens have a real dollar cost. A standard project file is ~600 tokens. A medium codebase of 300 files = 180,000 tokens just to read everything once. At Claude Sonnet pricing (~$3 per million tokens), that's $0.54 per full project scan.
Most AI tasks don't require reading every file, but the agent doesn't know that until it has explored. It re-reads files across turns. It loses context after compaction and starts over. The actual cost is usually 3-5x higher than the theoretical minimum.
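The arithmetic above can be sketched as a quick back-of-the-envelope calculation. The per-file token count, pricing, and 3-5x re-reading multiplier are the estimates from the text, not measured values:

```python
# Back-of-the-envelope cost of an AI agent reading a whole codebase.
# Figures from the text: ~600 tokens per file, ~$3 per million input tokens.
TOKENS_PER_FILE = 600
PRICE_PER_MILLION = 3.00  # USD, approximate Claude Sonnet input pricing

def scan_cost(num_files: int, reread_factor: float = 1.0) -> float:
    """Dollar cost of reading every file once, optionally inflated by re-reads."""
    tokens = num_files * TOKENS_PER_FILE * reread_factor
    return tokens * PRICE_PER_MILLION / 1_000_000

print(f"One full scan of 300 files: ${scan_cost(300):.2f}")  # $0.54
print(f"With 3-5x re-reading: ${scan_cost(300, 3):.2f}-${scan_cost(300, 5):.2f}")
```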
## larkx's answer
- Level 1 (paths + language): ~8 tokens per file. 300 files = 2.4K tokens.
- Level 2 (+ symbols + imports): ~80 tokens per file. 300 files = 24K tokens.
- Level 3 (+ full signatures): ~150 tokens per file. 300 files = 45K tokens.
- Level 4 (+ AI summaries): ~250 tokens per file. 300 files = 75K tokens.
- Folder scoping: cuts the bill proportionally. Exploring src/auth (5% of the project) costs 5% of a full scan, a 95% saving.
## Calculate for your project
Example figures for a 200-file project:

| Level | Includes | Tokens | Savings |
|---|---|---|---|
| L1 Paths | Just file paths + language | ~1.6K | −99% |
| L2 Symbols | + function & class names + imports | ~16K | −87% |
| L3 Signatures | + full function signatures | ~30K | −75% |
| L4 Summaries | + one-line AI summary per file | ~50K | −58% |
| Baseline | Reading every file directly | ~120K | |
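A minimal sketch of the table's math, using the per-file token figures from the level list above (8/80/150/250 tokens per file at levels 1-4, ~600 to read a file outright); plug in your own file count:

```python
# Estimate index size and savings per larkx level for a project of n files.
# Per-file token figures come from the level descriptions above.
PER_FILE = {"L1 Paths": 8, "L2 Symbols": 80, "L3 Signatures": 150, "L4 Summaries": 250}
RAW_TOKENS_PER_FILE = 600  # reading a file outright

def estimate(num_files: int) -> dict[str, tuple[int, float]]:
    """Map each level to (total tokens, fractional savings vs reading everything)."""
    baseline = num_files * RAW_TOKENS_PER_FILE
    return {
        level: (num_files * per_file, 1 - num_files * per_file / baseline)
        for level, per_file in PER_FILE.items()
    }

for level, (tokens, savings) in estimate(200).items():
    print(f"{level}: ~{tokens / 1000:.1f}K tokens, -{savings:.0%}")
```

For real numbers instead of estimates, `larkx stats` (mentioned below) reads the actual index.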
Run `larkx stats` in your project for indexed estimates that better match your codebase.

## When larkx helps the most
- Projects with 100+ files where the AI cannot just read everything cheaply
- Refactors that touch many files: larkx's `get_impact` is far cheaper than grepping
- Onboarding to unfamiliar codebases: Level 4 (with summaries) gives you a free architectural overview
- Dead code cleanup: reachability-based detection is more accurate than searching for unused symbols
## When it might not be worth it
- Tiny projects (under 30 files): the AI can just read them all
- One-off scripts: index overhead exceeds the savings
- Languages without a parser (Vue, Svelte, Ruby, PHP): they're skipped for now
## Quick comparison
| Approach | Initial scan | Per refactor | Accuracy |
|---|---|---|---|
| Plain AI reading files | ~180K tokens | ~50K tokens | Partial, depends on what it opens |
| Grep / find tools | ~10K tokens | ~5K tokens | Pattern-only, no semantic info |
| larkx (L2) | ~24K tokens | ~2-5K tokens | Full symbol + dependency graph |
## Bottom line
If you use AI agents for code daily and your project is bigger than a single folder, larkx pays for itself in the first hour of use. It is free, open-source, and runs entirely on your machine.