Bibliography

What SourcePrep was built on. Notes on the papers, repositories, essays, and standards that shaped the project.

A working list of the papers, repositories, essays, and standards SourcePrep draws on. Each entry includes a one-line note on how it shaped the project — and what we changed when we disagreed.

Retrieval & Long Context

Why context engineering matters more than raw context size — and what changes when language models meet long, noisy windows.

Paper · Liu et al. · TACL 2024

Lost in the Middle: How Language Models Use Long Contexts

Liu et al. show that language models attend to the start and end of a long context far more than the middle. The finding sets the ceiling for any retrieval system that pads its context naïvely. SourcePrep ranks results by relevance and assembles them so the highest-scoring chunks bracket the prompt, never bury it.
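The placement idea is simple enough to sketch in a few lines. This is a toy illustration of bracket ordering, not SourcePrep's actual assembler, and the function name is hypothetical:

```python
def bracket_order(chunks_by_score):
    """Given chunks sorted best-first, place the strongest ones at the
    start and end of the assembled context and push the weakest toward
    the middle, where long-context models attend least."""
    ordered = [None] * len(chunks_by_score)
    front, back = 0, len(chunks_by_score) - 1
    for i, chunk in enumerate(chunks_by_score):
        if i % 2 == 0:          # even-ranked chunks fill from the front
            ordered[front] = chunk
            front += 1
        else:                   # odd-ranked chunks fill from the back
            ordered[back] = chunk
            back -= 1
    return ordered
```

Given five chunks ranked a through e, the two best end up bracketing the window and the weakest lands dead center.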

Paper · Han et al. · arXiv 2025

RAG vs. GraphRAG: A Systematic Evaluation and Key Insights

Han et al. compare flat-vector RAG against graph-augmented RAG across reasoning-heavy benchmarks and find that local community search wins on multi-hop questions. SourcePrep’s prep_search follows the same logic: vector hits seed the query, then a trace-graph hop expands the neighborhood before the final assembly.
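The seed-then-expand pattern can be sketched as a bounded breadth-first walk. The index class and graph shape below are stand-ins, not SourcePrep internals:

```python
from collections import deque

class VectorIndex:
    """Toy stand-in for a vector index; the real retrieval layer differs."""
    def __init__(self, ranked_ids):
        self.ranked_ids = ranked_ids

    def top_k(self, query_vec, k):
        return self.ranked_ids[:k]

def search_then_expand(query_vec, index, trace_graph, k=5, hops=1):
    """Vector hits seed the result set; a bounded breadth-first walk
    over the trace graph then expands the neighborhood before assembly."""
    seeds = index.top_k(query_vec, k)
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for neighbor in trace_graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen
```

With one hop the seed's direct neighbors join the result set; raising `hops` widens the neighborhood, which is where the multi-hop gains in Han et al. come from.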

Essay · Anthropic · Anthropic blog · 2024

Contextual Retrieval

Anthropic’s post argued that prepending a few lines of file-level context to each chunk before embedding reduced retrieval failures by 49% in their tests. SourcePrep’s semantic chunker now does exactly this — every chunk carries a synopsis prefix derived from its enclosing module so the embedding sees the same neighborhood the model will reason over.
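The mechanism is just string assembly before embedding. The prefix format below is illustrative, not SourcePrep's actual one:

```python
def contextualize(chunk_text, enclosing_path, module_synopsis):
    """Prepend file-level context to a chunk before embedding, so the
    embedding model sees the same neighborhood the LLM will later
    reason over (after Anthropic's Contextual Retrieval)."""
    prefix = f"File: {enclosing_path}\nSynopsis: {module_synopsis}\n\n"
    return prefix + chunk_text
```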

Compression & Levels of Detail

Why SourcePrep’s context assembler ladders code from full source down to one-line signatures, and the research that makes signature-only context defensible.

Paper · Ostby · arXiv 2026

Stingy Context: 18:1 Hierarchical Code Compression for LLM Auto-Coding

Stingy Context demonstrates that hierarchical level-of-detail extraction can compress code by 18:1 with negligible quality loss for auto-coding tasks. SourcePrep’s context assembler implements the same ladder: full source for the focal file, compressed forms for callees, and one-line signatures for everything else.
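A minimal sketch of the ladder, assuming symbols carry `source` and `signature` fields (the field names and level numbers here are assumptions, not SourcePrep's real API):

```python
def render_at_lod(symbol, level):
    """Illustrative level-of-detail ladder: level 1 = full source,
    level 2 = signature with body elided, anything higher = bare
    one-line signature."""
    if level == 1:
        return symbol["source"]
    if level == 2:
        return symbol["signature"] + "\n    ...  # body elided"
    return symbol["signature"]

def assemble_context(focal, callees, others):
    """Full source for the focal symbol, compressed callees,
    signatures for everything else, in that order."""
    parts = [render_at_lod(focal, 1)]
    parts += [render_at_lod(s, 2) for s in callees]
    parts += [render_at_lod(s, 4) for s in others]
    return "\n\n".join(parts)
```

The compression ratio comes from the tail: most symbols in a large repo land in the signature-only tier.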

Repository · GitHub

Aider-AI/aider

Aider’s repo-map prunes a project into one-line signatures and ranks the visible set per turn. It is public proof that signature-only context (LOD 4) holds up in production across the open-source agentic-coding community. SourcePrep’s context assembler implements a similar selection step on top of the trace graph rather than a flat AST extract.

Repository · GitHub

microsoft/LLMLingua

LLMLingua-2 uses a small BERT classifier to drop the lowest-information tokens from a prompt without losing meaning. SourcePrep runs it over Markdown and docstrings while letting code chunks flow through the structural LOD ladder — two compressors, one assembly. The repo hosts both versions; v2 is what SourcePrep adopts.
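The two-compressor dispatch can be sketched as below. Both compressors are crude stand-ins: the real system runs LLMLingua-2's learned token classifier on prose and the structural LOD ladder on code, neither of which is reproduced here:

```python
def prune_tokens(text):
    # Stand-in for LLMLingua-2: drop a few stop words. The real model
    # scores every token's informativeness with a trained classifier.
    filler = {"the", "a", "an", "is", "very"}
    return " ".join(w for w in text.split() if w.lower() not in filler)

def to_signature(code):
    # Stand-in for the structural LOD ladder: keep only the first line.
    return code.splitlines()[0]

def compress_chunk(chunk):
    """Route prose through token pruning and code through structural
    compression: two compressors, one assembly."""
    if chunk["kind"] in ("markdown", "docstring"):
        return prune_tokens(chunk["text"])
    return to_signature(chunk["text"])
```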

Further reading

Paper

Repoformer: Selective Retrieval for Repository-Level Code Completion

Wu et al. · ICML 2024

Validates score-gated retrieval — knowing when *not* to fetch context improves accuracy.

Paper

GraphCoder: Code Completion via Code Context Graph-based Retrieval

Liu et al. · ASE 2024

Baseline for graph-vs-embedding retrieval comparisons.

Paper

RepoHyper: Search-Expand-Refine on Semantic Graphs for Repository-Level Code Completion

Phan et al. · arXiv 2024

The Search→Expand→Refine pipeline maps 1:1 onto SourcePrep’s search → trace expansion → LOD assembly.

Paper

STALL+: Boosting LLM-based Repository-Level Code Completion with Static Analysis

Liu et al. · arXiv 2024

Static-analysis-at-prompting pattern; mirrors SourcePrep’s use of trace-graph import edges to drive dependency-aware retrieval.

Paper

In Line with Context: Repository-Level Code Generation via Context Inlining

Guo et al. · arXiv 2026

Flagged as a Phase-2 enhancement — inline callees/callers on top of existing LOD results.

Paper

Long Context Compression with Activation Beacon

Zhang et al. · ICLR 2024

Model-internal KV compression — explicitly complementary to SourcePrep’s pre-prompt compression layer.

Paper

LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

Pan et al. · ACL 2024

BERT-classifier token pruning. Adopted as the language/docs compressor.

Paper

On the Impacts of Contexts on Repository-Level Code Generation

Hai et al. · NAACL Findings 2025

Empirical evidence that signatures + docstrings are the highest-ROI context type.

Repo

yamadashy/repomix

GitHub

Production tree-sitter compression at ~70% reduction — evidence that LOD extraction is practical at scale.

Repo

YerbaPage/LongCodeZip

GitHub

Evaluated as an off-the-shelf compressor; rejected due to 7B-model dependency incompatible with local-first architecture.

Paper

CodeRAG: Supportive Code Retrieval on Bigraph

arXiv 2025

Bigraph retrieval reference — supports the case for graph-structured code representation.

Code Structure & Chunking

Why chunking on AST boundaries beats character splits for code, and how structural awareness changes retrieval quality.

Repository · GitHub

garrytan/gbrain

Garry Tan’s gbrain pairs Savitzky–Golay smoothing for semantic boundary detection with reciprocal-rank-fusion across multiple query expansions. Reading it shaped SourcePrep’s current chunker: smooth the similarity curve, cut on local minima, fuse vector and keyword hits with RRF.
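The fusion half of that recipe is small enough to show whole. This is standard reciprocal-rank fusion, not code from either project:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal-rank fusion: each result list contributes 1/(k + rank)
    per document; k=60 is the conventional constant from the original
    RRF paper. Documents ranked well by several lists float to the top."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Fusing a vector ranking with a keyword ranking this way rewards documents both retrievers agree on without needing their scores to be comparable.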

Paper · Wang et al. · arXiv 2025

cAST: Enhancing Code RAG with Structural Awareness

cAST shows that chunking on AST boundaries produces meaningfully better embeddings than fixed-window splits, particularly for languages with strong nesting structure. SourcePrep’s tree-sitter chunker is grounded in this finding — a chunk never splits mid-function and the chunk header carries the full enclosing path.

Paper · Edge et al. · Microsoft Research · 2024

GraphRAG: From Local to Global — A Graph-RAG Approach to Query-Focused Summarization

Edge et al. layer entity extraction, community detection, and per-community summarization to make a knowledge graph queryable as a hierarchy. SourcePrep’s atlas does the same trick on code: directories and modules become communities, each with a generated synopsis that the assembler can hand to the model in place of the underlying files.

Concepts, Knowledge & Standards

Why SourcePrep treats concepts as first-class artifacts, where the protocol surface comes from, and the older work that grounds the system in something deeper than recent papers.

Paper · Guo et al. · ICLR 2021

GraphCodeBERT: Pre-training Code Representations with Data Flow

Guo et al. show that pre-training code models on data-flow graphs (not just token streams) measurably improves downstream code understanding. SourcePrep’s trace index encodes the same intuition by carrying control- and data-flow edges alongside symbol references, so retrievers can hop on dependency, not just on text similarity.
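Hopping on dependency rather than text similarity amounts to filtering edges by type. The edge list and names below are illustrative; SourcePrep's real trace-index schema is not shown here:

```python
# Hypothetical typed edges: (source symbol, target symbol, edge kind).
EDGES = [
    ("parse_config", "load_file", "calls"),     # control-flow edge
    ("load_file", "Config.path", "reads"),      # data-flow edge
    ("parse_config", "Config", "references"),   # symbol reference
]

def hop(edges, node, kinds):
    """Follow only edges of the requested kinds out of a node, so a
    retriever expands on dependency instead of text similarity."""
    return {dst for src, dst, kind in edges if src == node and kind in kinds}
```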

Book · Nonaka & Takeuchi · Oxford University Press · 1995

The Knowledge-Creating Company (SECI Model)

Nonaka & Takeuchi’s SECI model frames organizational knowledge as a four-step cycle: socialize, externalize, combine, internalize. SourcePrep’s concepts feature is a literal externalization tool — the tacit "we don’t do it that way" assumptions in a team’s head become typed, anchored, testable artifacts that downstream agents can read.

Specification · Anthropic · modelcontextprotocol.io · 2024

Model Context Protocol

MCP is the protocol surface SourcePrep ships its primary interface on. Every prep_* tool, the resources system, and the per-client context budgets are MCP-shaped from the ground up. Without this spec there would be no SourcePrep MCP server, and no surface to advertise to any agent in any IDE.
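For a sense of what "MCP-shaped" means in practice: MCP exposes tools as name, description, and JSON-Schema input. The descriptor below is hypothetical; the parameter names are illustrative, not SourcePrep's actual schema:

```python
# Hypothetical prep_* tool descriptor in the shape an MCP tools/list
# response uses. Field contents are assumptions for illustration.
PREP_SEARCH_TOOL = {
    "name": "prep_search",
    "description": "Hybrid vector + trace-graph search over the indexed repo.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "max_results": {"type": "integer", "default": 10},
        },
        "required": ["query"],
    },
}
```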

Further reading

Paper

The Program Dependence Graph and its Use in Optimization

Ferrante, Ottenstein, Warren · ACM TOPLAS · 1987

Classical PDG reference; grounds SourcePrep’s combined control-flow + data-flow trace graph.

Paper

From Louvain to Leiden: Guaranteeing Well-Connected Communities

Traag, Waltman, van Eck · Scientific Reports · 2019

Theoretical backing for community-detection-driven concept clustering.

Book

Formal Concept Analysis: Mathematical Foundations

Ganter & Wille · Springer · 1999

Mathematical justification for lattice-based concept organization.

Paper

Towards a Theory of the Comprehension of Computer Programs

Ruven Brooks · IJMMS · 1983

Top-down program comprehension theory — the cognitive basis for hypothesis-driven retrieval.

Paper

Stimulus Structures and Mental Representations in Expert Comprehension of Computer Programs

Nancy Pennington · Cognitive Psychology · 1987

Bottom-up comprehension counterpart to Brooks; SourcePrep’s structural trace index supports this mode.

Paper

The Magical Number Seven, Plus or Minus Two

George A. Miller · Psychological Review · 1956

Cognitive-load rationale for concept clustering at human-readable cardinality.

Essay

Documenting Architecture Decisions

Michael Nygard · cognitect.com · 2011

ADR template convention; SourcePrep concepts extend ADRs beyond per-node decisions.

Paper

LLMs4OL Challenge — Large Language Models for Ontology Learning

ISWC · 2024

Establishes SOTA for automated concept extraction; informed SourcePrep’s hybrid embedding+LLM concept-discovery pipeline.

Paper

Traceability Transformed: Generating More Accurate Links with Pre-Trained BERT Models (T-BERT)

arXiv 2021

Researched for requirements↔code linking; rejected as too heavy for the SourcePrep architecture but kept as a baseline.

Spec

NASA SWE-072: Bidirectional Traceability

NASA SWE Handbook

Grounds SourcePrep’s curated traceability framework in an established engineering standard.

Spec

Agent Client Protocol (ACP)

agentclientprotocol.com

Zed-backed standard — SourcePrep’s multi-editor integration target.

Spec

Agent-to-Agent Protocol (A2A)

Google / Linux Foundation · a2a-protocol.org · 2025

Identified as a future Layer 4 target for cross-agent discovery.

Spec

SARIF 2.1.0 — Static Analysis Results Interchange Format

OASIS · 2020

SARIF-in / SARIF-out enrichment is a shipped prep_audit capability.

Spec

OCSF — Open Cybersecurity Schema Framework

ocsf.io

Alternative audit-export format; AWS/Splunk-backed.

Spec

agents.md

agents.md

Emerging convention for agent-facing context files — SourcePrep auto-generates AGENTS.md via rules_generator.py.

Paper

KARMA: Leveraging Multi-Agent LLMs for Automated Knowledge Graph Enrichment

Lu & Wang · NeurIPS 2025

Multi-agent KG enrichment that parallels SourcePrep’s multi-pass enrichment pipeline.