Introduction

Why use larkx?

The token math, real benchmarks, and when larkx is not the right tool.

The token problem

AI tokens have a real dollar cost. A typical source file is ~600 tokens, so a medium codebase of 300 files is 180,000 tokens just to read everything once. At Claude Sonnet pricing (~$3 per million input tokens), that's $0.54 per full project scan.

Most AI tasks don't require reading every file, but the agent doesn't know that until it has explored. It re-reads files across turns. It loses context after compaction and starts over. The actual cost is usually 3-5x higher than the theoretical minimum.
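The arithmetic above can be sketched in a few lines. This is a back-of-the-envelope estimate using this page's rough figures (~600 tokens per file, ~$3 per million tokens, a 3-5x re-reading multiplier), not a measurement:

```python
# Rough cost of a full project scan, using the figures above.
TOKENS_PER_FILE = 600        # ~600 tokens for a typical source file
PRICE_PER_M_TOKENS = 3.00    # approximate Claude Sonnet input pricing, USD per million

def scan_cost(num_files, exploration_multiplier=1.0):
    """Dollar cost for an agent to read every file, with optional re-read overhead."""
    tokens = num_files * TOKENS_PER_FILE * exploration_multiplier
    return tokens * PRICE_PER_M_TOKENS / 1_000_000

print(scan_cost(300))        # theoretical minimum, one full read -> 0.54
print(scan_cost(300, 4))     # typical actual cost at ~4x re-reading -> 2.16
```

The multiplier is the interesting part: the $0.54 floor is rarely what you pay, because exploration and context loss multiply it.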

larkx's answer

  • Level 1 (paths + language): ~8 tokens per file. 300 files = 2.4K tokens.
  • Level 2 (+ symbols + imports): ~80 tokens per file. 300 files = 24K tokens.
  • Level 3 (+ full signatures): ~150 tokens per file. 300 files = 45K tokens.
  • Level 4 (+ AI summaries): ~250 tokens per file. 300 files = 75K tokens.
  • Folder scoping: cuts the bill proportionally; if src/auth is 5% of the project, exploring only it saves 95%.
Net effect: on typical tasks, the AI gets the answer it needs at level 1 or 2, using 1-15% of the tokens it would otherwise spend.
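The per-level budgets above reduce to one multiplication. A minimal sketch, using this page's estimated tokens-per-file figures (the index_budget helper is illustrative, not part of larkx):

```python
# Approximate tokens per file at each larkx index level (estimates from this page).
LEVEL_TOKENS = {1: 8, 2: 80, 3: 150, 4: 250}

def index_budget(num_files, level, scope_fraction=1.0):
    """Estimated token budget to load the index at a given level.

    scope_fraction models folder scoping, e.g. 0.05 if the scoped
    folder holds 5% of the project's files.
    """
    return int(num_files * scope_fraction * LEVEL_TOKENS[level])

print(index_budget(300, 1))          # 2400  -> ~2.4K tokens
print(index_budget(300, 2))          # 24000 -> ~24K tokens
print(index_budget(300, 2, 0.05))    # 1200  -> level 2 scoped to 5% of files
```

Scoping and levels compound, which is why the best case lands so far below the full-read baseline.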

Calculate for your project

Level           Includes                             Tokens   Savings
L1 Paths        Just file paths + language           ~1.6K    99%
L2 Symbols      + function & class names + imports   ~16K     87%
L3 Signatures   + full function signatures           ~30K     75%
L4 Summaries    + one-line AI summary per file       ~50K     58%
Reading every file directly                          ~120K    baseline

Estimated best case: 99% reduction (level 1 + scoping)
Heads up: this is an estimate, not a measurement. Numbers assume ~3.5 characters per token, an average file size of ~2 KB, and average symbol counts. Your actual savings depend on your AI's tokenizer (Claude/GPT/Gemini differ), file size distribution, and how the agent uses tools. Treat these figures as ballpark; run larkx stats in your project for indexed estimates that better match your codebase.
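The estimate's assumptions (~3.5 characters per token, ~2 KB average file) can be reproduced directly. A ballpark sketch only; real tokenizers will land somewhere nearby, not exactly here:

```python
CHARS_PER_TOKEN = 3.5    # rough cross-model average; actual tokenizers differ
AVG_FILE_BYTES = 2048    # ~2 KB average source file assumed by the estimate

def baseline_tokens(num_files):
    """Tokens to read every file once under the assumptions above."""
    return int(num_files * AVG_FILE_BYTES / CHARS_PER_TOKEN)

def savings_pct(index_tokens, baseline):
    """Percent saved versus reading everything."""
    return round(100 * (1 - index_tokens / baseline))

print(baseline_tokens(200))           # ~200 files gives the ~120K baseline above
print(savings_pct(16_000, 120_000))   # level 2 versus baseline -> 87
```

Swapping in your own file count and average file size is the quickest way to sanity-check the table against your codebase.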

When larkx helps the most

  • Projects with 100+ files where the AI cannot just read everything cheaply
  • Refactors that touch many files: larkx's get_impact is far cheaper than grepping
  • Onboarding to unfamiliar codebases: level 4 (with summaries) gives you a free architectural overview
  • Dead code cleanup: reachability-based detection is more accurate than searching for unused symbols

When it might not be worth it

  • Tiny projects (under 30 files): the AI can just read them all
  • One-off scripts: the index overhead exceeds the savings
  • Languages without a parser (Vue, Svelte, Ruby, PHP): these are skipped for now

Quick comparison

Approach                 Initial scan   Per refactor   Accuracy
Plain AI reading files   ~180K tokens   ~50K tokens    Partial; depends on what it opens
Grep / find tools        ~10K tokens    ~5K tokens     Pattern-only, no semantic info
larkx (L2)               ~24K tokens    ~2-5K tokens   Full symbol + dependency graph

Bottom line

If you use AI agents on code daily and your project is bigger than a single folder, larkx pays for itself in the first hour of use. It is free, open source, and runs entirely on your machine.