---
name: codebase-map
description: Compress an entire repo into a single LLM-digestible context bundle with an import graph and hot-file list.
title: Codebase Map
category: code-development
difficulty: advanced
license: MIT
author: admin
source_url: "https://github.com/yamadashy/repomix"
icon: 🗺️
input: code
output: markdown
phase: pre
domain: code
tags: codebase-analysis,context-compression,import-graph,code-indexing,repository-mapping,llm-prompt-optimization,static-analysis,pagerank,markdown-generation,gitignore-respecting,file-summarization,symbol-extraction
best_for:
  - onboarding LLMs to large unfamiliar codebases
  - generating context bundles for code review and refactoring
  - identifying critical files and dependencies in a repository
  - reducing token overhead when querying an LLM about multi-file projects
---

## Description

A reusable skill modeled on GitIngest and repomix. It walks a repo, respects `.gitignore`, and flattens the code into a single markdown bundle containing a directory tree, per-file summaries, and a cross-reference graph that points to the most-imported symbols.

## Why it works

LLMs 'understand' code better when the repository's *shape* is visible, not just the current file. Blindly concatenating files can blow past the context window; summarizing without shape loses the import graph that tells you which files matter most. A pre-computed map gives the model the same mental model an experienced dev builds after a week on a repo, in one prompt.

## How it works

1) Walk the working tree, honoring `.gitignore` plus a skill-level ignore list (`node_modules`, `.venv`, `build/`).
2) For each file, produce a 3-line summary using the local LLM (line count, main exported symbol, first doc-comment).
3) Build an import graph by tokenizing `import` / `from` / `require` statements; compute PageRank to find hot files.
4) Emit a single `codebase-map.md` with: directory tree, top-20 hot files verbatim, per-file summaries for the rest.
5) Cap at a configurable context size; drop lowest-rank summaries first if over budget.
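
Steps 3 and 5 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the function names (`build_import_graph`, `pagerank`, `fit_to_budget`), the regex, the path-to-module mapping, and the use of character counts as a stand-in for context size are all hypothetical. The regex only covers Python-style `import` / `from` lines and CommonJS `require(...)` calls.

```python
import re

# Hypothetical pattern: Python-style `import x` / `from x import y`
# at the start of a line, or a CommonJS `require("x")` anywhere.
IMPORT_RE = re.compile(
    r"^\s*(?:from\s+([\w.]+)\s+import|import\s+([\w.]+))"
    r"|require\(\s*['\"]([^'\"]+)['\"]\s*\)",
    re.MULTILINE,
)


def module_name(path: str) -> str:
    """Map 'pkg/util.py' to 'pkg.util' (an assumed naming convention)."""
    stem = path.rsplit(".", 1)[0]
    return stem.replace("/", ".")


def build_import_graph(files: dict[str, str]) -> dict[str, set[str]]:
    """Return edges importer -> imported, restricted to files in the repo."""
    by_module = {module_name(p): p for p in files}
    graph: dict[str, set[str]] = {p: set() for p in files}
    for path, source in files.items():
        for match in IMPORT_RE.finditer(source):
            target = next(g for g in match.groups() if g)
            # Normalize './util' and 'pkg/util' to dotted module names.
            target = target.lstrip("./").replace("/", ".")
            dep = by_module.get(target)
            if dep and dep != path:
                graph[path].add(dep)
    return graph


def pagerank(graph: dict[str, set[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Power-iteration PageRank; heavily imported files accumulate rank."""
    nodes = list(graph)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, deps in graph.items():
            if deps:
                share = damping * rank[node] / len(deps)
                for dep in deps:
                    new_rank[dep] += share
            else:
                # A file that imports nothing spreads its rank evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


def fit_to_budget(summaries: dict[str, str], rank: dict[str, float],
                  budget_chars: int) -> dict[str, str]:
    """Drop the lowest-ranked summaries until the total fits the budget."""
    kept = dict(summaries)
    total = sum(len(text) for text in kept.values())
    for path in sorted(kept, key=lambda p: rank.get(p, 0.0)):
        if total <= budget_chars:
            break
        total -= len(kept.pop(path))
    return kept
```

Edges run from the importing file to the imported one, so a file that many others depend on ends up with the highest score. A real run would likely count tokens with the target model's tokenizer instead of characters, and would need per-language import resolution (relative imports, package `__init__` files, path aliases) that this sketch skips.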
