Research sweep · deep · 2025 – 2026
Code Intelligence & Code-Graph Indexing for AI Agents
A survey of tools and emerging approaches for code intelligence and code-graph indexing for AI coding agents, covering June 2025 through early June 2026. It spans local and embedded indexers (CodeGraph/Caveman-style repo maps, Tree-sitter, SQLite and embedded graph stores), enterprise-scale code understanding (SCIP, code knowledge graphs, embeddings plus retrieval), LSP-to-MCP bridges such as Serena, and the trade-off between semantic, syntactic, and embedding-based approaches.
- GPT-5.5
- tech
- frontier
- academic
- financial
- blogs
Synthesised 2026-06-03
Full brief
A 2026 Codebase-Memory study reports that a Tree-sitter-based knowledge graph exposed through MCP reduced agent token use by roughly 10x and tool calls by 2.1x across 31 repositories. That single result explains why code intelligence became a first-order design problem for AI coding agents: the index is no longer just…
Research lanes
5 lanes
academic
25 sources
Academic & arXiv
The 2025 to early-2026 academic frontier is converging on hybrid repository intelligence: deterministic or static-analysis-backed graph layers for symbol, dependency and navigation tasks; embeddings or sparse retrieval for broad matching; and MCP/LSP bridges…
blogs
25 sources
Blogs & Independent Thinkers
The strongest independent commentary converges on a few patterns: local-first repo maps and code graphs built with Tree-sitter plus SQLite or embedded graph stores; LSP-to-MCP bridges like Serena that turn editor-grade semantic understanding into agent tools…
financial
25 sources
Financial Press
Local embedded indexers: A clear design pattern emerged from 2025 to early 2026: Tree-sitter parses code into semantic units, then systems persist symbols, call edges, and embeddings in SQLite or a small embedded graph store. This favors speed, portability, and…
frontier
25 sources
Frontier Lab & Model News
Local embedded indexing: The strongest local, index-first pattern in this period is still Tree-sitter-based repo mapping or structural indexing, typically combined with caching and graph ranking to fit context budgets. Aider, RepoMapper, and codeindex all…
tech
25 sources
Tech Industry & Practitioner
What is gaining traction: The clearest trend from 2025 to early 2026 is a shift from raw grep/read workflows toward structured context layers: MCP as the transport, SCIP and language-server semantics for precise navigation, and Tree-sitter/embedded stores for local…
Read lane →
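The local-indexing pattern that several lanes describe (parse source into units, then persist symbols and call edges in SQLite) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not any listed tool's implementation: Python's standard-library `ast` module stands in for Tree-sitter so the example runs without extra dependencies, and the schema, `index_source`, `callers_of`, and `build_demo_index` are hypothetical names invented for the sketch.

```python
import ast
import sqlite3

# Hypothetical two-table schema: one row per definition, one row per call edge.
SCHEMA = """
CREATE TABLE symbols (id INTEGER PRIMARY KEY, name TEXT, kind TEXT, file TEXT, line INTEGER);
CREATE TABLE calls (caller TEXT, callee TEXT, file TEXT, line INTEGER);
"""


def index_source(conn, file, source):
    """Record definitions and direct-name call edges found in one source file."""
    tree = ast.parse(source)  # stand-in for a Tree-sitter parse
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue
        kind = "class" if isinstance(node, ast.ClassDef) else "function"
        conn.execute(
            "INSERT INTO symbols (name, kind, file, line) VALUES (?, ?, ?, ?)",
            (node.name, kind, file, node.lineno),
        )
        if kind != "function":
            continue
        # Simplification: only calls of the form name(...) are recorded;
        # attribute calls such as obj.method(...) are skipped, and calls in
        # nested functions are also attributed to the enclosing function.
        for inner in ast.walk(node):
            if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                conn.execute(
                    "INSERT INTO calls (caller, callee, file, line) VALUES (?, ?, ?, ?)",
                    (node.name, inner.func.id, file, inner.lineno),
                )
    conn.commit()


def callers_of(conn, name):
    """Return the sorted names of functions that call `name` directly."""
    rows = conn.execute(
        "SELECT DISTINCT caller FROM calls WHERE callee = ? ORDER BY caller",
        (name,),
    )
    return [row[0] for row in rows]


def build_demo_index():
    """Index a small made-up module into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    source = (
        "def load(path):\n"
        "    return parse(read(path))\n"
        "\n"
        "def parse(text):\n"
        "    return text.split()\n"
        "\n"
        "def read(path):\n"
        "    return path\n"
        "\n"
        "def main():\n"
        "    load('x')\n"
    )
    index_source(conn, "demo.py", source)
    return conn
```

With an index like this, an agent-facing tool could answer "who calls `parse`?" with a single query via `callers_of(conn, "parse")` instead of grepping and reading whole files, which is the kind of saving the lane summaries attribute to structured context layers. A real indexer would need multi-language grammars, scope-aware name resolution, and incremental updates, none of which this sketch attempts.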