Dark Lantern is a source code service that serves live code from git repositories over an HTTP API. It is named for the lantern with a shutter: open it to reveal exactly the slice you need.
Tree-sitter powers queries at two levels: a high-level, language-aware API where you ask for a method by name, and a low-level S-expression interface for arbitrary AST queries. It parses on demand, persists the parse cache across restarts, and refreshes repos on a configurable schedule.
Code referenced in documentation, compliance matrices, and cross-system views goes stale the moment it's copied. Dark Lantern serves live content instead. Consumers point at it and get current source — by file range, by symbol name, or by annotation pattern.
Sprout and Lantern (the UI) both consume it. For the Kremis compliance work, it maps source to tests across the V-model without drift. For Wanderland generally, it's the shared service that emits source.
Deployed via docker-compose. Source repos are mounted read-only. Parse cache persists on a data volume.
repos:
  kremis:
    url: /mnt/repos/kremis
    refresh: 300
    grammars: [ruby]
  sprout-engine:
    url: /mnt/repos/sprout-engine
    refresh: 300
    grammars: [ruby]
  private-service:
    url: ${GITLAB_URL}/org/private-service.git
    credentials:
      ssh_key: ${SSH_KEY_PATH}
    refresh: 600
    grammars: [java, ruby]
storage: /data/repos
cache: /data/cache
port: 9293
Local paths are cloned with git clone --local; remote URLs are cloned normally. ${ENV_VAR} patterns in YAML values (credentials, URLs, paths) are interpolated at load time. The refresh interval is set per repo, in seconds. Grammars listed per repo are resolved when the config is loaded.
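The load-time interpolation described above could look roughly like the following. This is a minimal sketch, not the actual Config class: the helper names `interpolate` and `load_config` are hypothetical, and raising on an unset variable is an assumed design choice rather than documented behavior.

```ruby
require "yaml"

# Matches ${VAR_NAME} placeholders inside string values.
ENV_PATTERN = /\$\{([A-Z0-9_]+)\}/

# Hypothetical helper: walk the parsed YAML tree and substitute
# placeholders in every string value. Non-string scalars pass through.
def interpolate(node, env = ENV)
  case node
  when Hash   then node.transform_values { |value| interpolate(value, env) }
  when Array  then node.map { |value| interpolate(value, env) }
  when String then node.gsub(ENV_PATTERN) { env.fetch(Regexp.last_match(1)) }
  else node
  end
end

# Hypothetical entry point: parse first, interpolate second, so a
# variable's contents can never change the YAML structure.
def load_config(yaml_text, env = ENV)
  interpolate(YAML.safe_load(yaml_text), env)
end
```

Using `fetch` means a missing variable raises a `KeyError` at startup instead of silently producing a URL with an empty prefix.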
GET /repos/:repo/symbols?name=execute&kind=method
GET /repos/:repo/symbols?name=Pipeline&kind=class
GET /repos/:repo/symbols?name=execute&kind=method&path=lib/kremis/pipeline.rb
GET /repos/:repo/annotations?pattern=@requirement:REQ-*
GET /repos/:repo/annotations?pattern=@trace:URS-4
The symbols endpoint maps kind to language-specific tree-sitter queries internally. In Ruby, kind=method becomes (method name: (identifier) @name (#eq? @name "execute")); in Python it becomes (function_definition name: (identifier) @name ...), and so on. A small set of query templates ships per grammar.
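A per-grammar template table of this kind might be sketched as below. The constant `TEMPLATES` and the method `query_for` are hypothetical names, and the node names are taken from the published tree-sitter-ruby and tree-sitter-python grammars (where a Ruby `def` parses as a `method` node) rather than from Dark Lantern's own templates.

```ruby
# Hypothetical template table: grammar => kind => S-expression with a name slot.
TEMPLATES = {
  "ruby" => {
    "method" => '(method name: (identifier) @name (#eq? @name "%<name>s"))',
    "class"  => '(class name: (constant) @name (#eq? @name "%<name>s"))'
  },
  "python" => {
    "method" => '(function_definition name: (identifier) @name (#eq? @name "%<name>s"))',
    "class"  => '(class_definition name: (identifier) @name (#eq? @name "%<name>s"))'
  }
}.freeze

# Only plain identifiers are accepted, so a caller-supplied name
# cannot break out of the quoted string inside the query.
SAFE_NAME = /\A[A-Za-z_][A-Za-z0-9_?!=]*\z/

def query_for(lang:, kind:, name:)
  template = TEMPLATES.dig(lang, kind) ||
             raise(ArgumentError, "no template for #{lang}/#{kind}")
  raise ArgumentError, "unsafe symbol name: #{name.inspect}" unless name.match?(SAFE_NAME)

  format(template, name: name)
end

puts query_for(lang: "ruby", kind: "method", name: "execute")
```

Validating the name before substitution matters because the result is handed to the query engine as-is.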
The annotations endpoint queries comments matching a pattern and returns the associated code — the next sibling in the AST, typically the function or class the comment is attached to. This enables requirement-to-code tracing through source annotations like # @requirement:REQ-001 or # @trace:URS-4 without having to name functions.
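The comment-to-code pairing can be illustrated without tree-sitter by using plain structs in place of AST nodes. Everything here is an assumption for illustration: the `Node` struct, treating `*` as the only wildcard, and skipping over stacked comments so that several annotations can share one target.

```ruby
# Stand-in for a tree-sitter node: a type and its source text.
Node = Struct.new(:type, :text)

# Turn a pattern such as "@requirement:REQ-*" into a regex.
# The trailing lookahead stops "@trace:URS-4" from matching "@trace:URS-40".
def annotation_regex(pattern)
  body = pattern.split("*", -1).map { |part| Regexp.escape(part) }.join("[\\w.-]*")
  Regexp.new("#{body}(?![\\w.-])")
end

# For each matching comment, return the first following sibling
# that is not itself a comment.
def annotated_targets(siblings, pattern)
  regex = annotation_regex(pattern)
  siblings.each_with_index.filter_map do |node, index|
    next unless node.type == :comment && node.text.match?(regex)

    siblings[(index + 1)..].find { |candidate| candidate.type != :comment }
  end
end

siblings = [
  Node.new(:comment, "# @requirement:REQ-001"),
  Node.new(:comment, "# @trace:URS-4"),
  Node.new(:method, "def verify_seal(provenance_content)\nend")
]
annotated_targets(siblings, "@requirement:REQ-*").each { |node| puts node.text }
```

In a real parse tree the same walk would run over a comment node's following siblings.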
POST /repos/:repo/query
{
  "path": "lib/kremis/pipeline.rb",
  "query": "(method name: (identifier) @name)",
  "lang": "ruby"
}
Response includes each capture with its text, start/end positions, and surrounding source lines. The caller writes the S-expression query — Dark Lantern executes it and returns what matched.
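A client for the low-level endpoint might assemble its request as follows, using only the Ruby standard library. The helper name `build_query_request` and the `localhost` host are assumptions; the port comes from the config above. The request is built but not sent, and the response format is left to the service.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper: build (but do not send) a POST /repos/:repo/query request.
def build_query_request(base_url, repo:, path:, query:, lang:)
  uri = URI.join(base_url, "/repos/#{repo}/query")
  request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
  request.body = JSON.generate(path: path, query: query, lang: lang)
  [uri, request]
end

uri, request = build_query_request(
  "http://localhost:9293",
  repo: "kremis",
  path: "lib/kremis/pipeline.rb",
  query: "(class name: (constant) @name)",
  lang: "ruby"
)
puts "#{request.method} #{uri}"
puts request.body

# To send it against a running instance:
#   response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
#   captures = JSON.parse(response.body)
```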
GET /repos # list configured repos
GET /repos/:repo/tree # file listing
GET /repos/:repo/files/*path # full file or ?from=X&to=Y line range
POST /repos/:repo/refresh # trigger manual refresh
GET /health
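The ?from=X&to=Y line range on the files endpoint could be served by something like the sketch below. The method name `read_range` is hypothetical, and the source does not say whether bounds are 1-indexed or inclusive; both are assumed here.

```ruby
# Return lines from..to of a file. Bounds are assumed to be 1-indexed and
# inclusive; omitting both returns the whole file.
def read_range(path, from: nil, to: nil)
  lines = File.readlines(path)
  first = from ? Integer(from) : 1
  last  = to ? Integer(to) : lines.length
  raise ArgumentError, "invalid range #{first}..#{last}" if first < 1 || last < first

  # A range that starts past the end of the file yields an empty string.
  (lines[(first - 1)..(last - 1)] || []).join
end
```

`Integer(...)` accepts the string values that arrive as query parameters and rejects non-numeric input with an `ArgumentError`.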
Two tracing patterns work together: annotations in the source and query specs stored on the requirement.
Developers annotate source with requirement IDs in comments:
# @requirement:REQ-001
# @trace:URS-4
def verify_seal(provenance_content)
  # ...
end
Tree-sitter parses comments as nodes. Dark Lantern queries for comments matching an annotation pattern and returns the code the comment is attached to. This is bottom-up — the code declares what it traces to.
The requirement node in Oculus stores a query spec:
traces:
  - repo: kremis
    query: { kind: method, name: execute, path: "lib/kremis/pipeline.rb" }
  - repo: kremis
    query: { annotation: "@requirement:REQ-001" }
  - repo: kremis
    query: { s_expr: "(method name: (identifier) @name (#eq? @name \"run\"))", path: "lib/kremis/pipeline.rb", lang: "ruby" }
The compliance page resolves each trace by calling Dark Lantern. The requirement knows what it traces to. Dark Lantern resolves it to current source. The V-model mapping stays valid without manual synchronization.
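Resolving a stored trace amounts to choosing the endpoint that matches its query shape. The sketch below maps each of the three trace forms onto the endpoints listed earlier. The function name `trace_to_request` and the dispatch order are assumptions, and it only builds the request description without calling the service.

```ruby
require "json"
require "uri"

# Map one stored trace to [http_method, path_with_query, json_body_or_nil].
def trace_to_request(trace)
  repo  = trace.fetch("repo")
  query = trace.fetch("query")
  base  = "/repos/#{repo}"

  if query.key?("s_expr")
    # Low-level form: forwarded to the S-expression endpoint.
    body = { path: query.fetch("path"), query: query.fetch("s_expr"), lang: query.fetch("lang") }
    ["POST", "#{base}/query", JSON.generate(body)]
  elsif query.key?("annotation")
    params = URI.encode_www_form(pattern: query.fetch("annotation"))
    ["GET", "#{base}/annotations?#{params}", nil]
  else
    # High-level form: kind and name, optionally narrowed by path.
    params = URI.encode_www_form(query.slice("name", "kind", "path"))
    ["GET", "#{base}/symbols?#{params}", nil]
  end
end

trace = { "repo" => "kremis", "query" => { "annotation" => "@requirement:REQ-001" } }
p trace_to_request(trace)
```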
| Component | Responsibility |
|---|---|
| Config | Load YAML, interpolate env vars, validate repos, resolve grammar requirements |
| RepoManager | Clone, fetch, reset, file listing, file read with range |
| Parser | Tree-sitter parse on demand, memoize by repo:path:mtime, persist to disk |
| QueryTemplates | Language-specific tree-sitter queries for high-level kind lookups |
| RefreshWorker | Background thread per repo, periodic git fetch + reset |
| Grape API | HTTP surface, high-level and low-level endpoints |
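The RefreshWorker row describes one background thread per repo. A minimal sketch of such a worker is shown below, assuming the refresh action is passed in as a block so the loop can be exercised without git. The class body, the `stop` method and the error handling are all hypothetical.

```ruby
# Hypothetical worker: calls the given block, waits `interval` seconds, repeats.
class RefreshWorker
  def initialize(interval:, &refresh)
    @interval = interval
    @refresh  = refresh
    @mutex    = Mutex.new
    @wake     = ConditionVariable.new
    @stopped  = false
  end

  def start
    @thread = Thread.new do
      @mutex.synchronize do
        until @stopped
          run_once
          # Timed wait acts as a sleep that stop can interrupt.
          @wake.wait(@mutex, @interval)
        end
      end
    end
    self
  end

  def stop
    @mutex.synchronize do
      @stopped = true
      @wake.signal
    end
    @thread&.join
  end

  def alive?
    !!@thread&.alive?
  end

  private

  # One failed refresh should not kill the thread for good.
  def run_once
    @refresh.call
  rescue StandardError => e
    warn "refresh failed: #{e.message}"
  end
end

# Usage sketch; the reset target is an assumption, not taken from the source:
#   RefreshWorker.new(interval: 300) do
#     system("git", "-C", "/data/repos/kremis", "fetch", "--quiet") &&
#       system("git", "-C", "/data/repos/kremis", "reset", "--hard", "--quiet", "FETCH_HEAD")
#   end.start
```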
Parse trees are memoized under the key "#{repo}:#{relative_path}:#{File.mtime(path).to_i}". The cache is serialized to /data/cache on the data volume, so it persists across container restarts. When a file changes (a git refresh updates its mtime), the key changes and the next parse replaces the cached entry. No explicit invalidation is needed.
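The mtime-keyed memoization can be sketched with an in-memory hash backed by one file per key. This is a simplified illustration: the `ParseCache` class and its methods are hypothetical, and it stores whatever Marshal-able value the block returns, because the source does not specify what the real Parser serializes.

```ruby
require "digest"
require "fileutils"

# Hypothetical disk-backed memo keyed on repo, relative path and mtime.
class ParseCache
  def initialize(dir)
    @dir = dir
    @memory = {}
    FileUtils.mkdir_p(dir)
  end

  # Same key format as described above.
  def key_for(repo, root, relative_path)
    "#{repo}:#{relative_path}:#{File.mtime(File.join(root, relative_path)).to_i}"
  end

  # Yields the absolute path to the block only on a cache miss.
  def fetch(repo, root, relative_path)
    key = key_for(repo, root, relative_path)
    @memory[key] ||= read(key) || write(key, yield(File.join(root, relative_path)))
  end

  private

  # Hash the key so it is safe to use as a file name.
  def file_for(key)
    File.join(@dir, Digest::SHA256.hexdigest(key))
  end

  def read(key)
    path = file_for(key)
    Marshal.load(File.binread(path)) if File.exist?(path)
  end

  def write(key, value)
    File.binwrite(file_for(key), Marshal.dump(value))
    value
  end
end
```

Unlike the behavior described above, this sketch never deletes the file for an outdated mtime, so a real implementation would also need to prune superseded entries.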
Grammars are declared per repo in config. The ruby_tree_sitter gem loads grammar shared libraries. On config load, Dark Lantern checks for the required grammars and fetches any that are missing. Adding a new language to a repo config is the only step needed to parse it and run low-level queries; high-level kind lookups also depend on the query templates that ship for that grammar.
dark-lantern:
  build: ./dark-lantern
  ports:
    - "9293:9293"
  volumes:
    - /home/gfawcett/working/sprout/kremis:/mnt/repos/kremis:ro
    - /home/gfawcett/working/sprout/sprout-engine:/mnt/repos/sprout-engine:ro
    - dark-lantern-data:/data
  environment:
    - GITLAB_URL=${GITLAB_URL}
    - SSH_KEY_PATH=/run/secrets/ssh_key
Source repos mounted read-only. Data volume for cloned working copies and parse cache. Secrets via docker secrets or env vars.
dark-lantern/
├── Gemfile
├── config.ru
├── config.yml
├── Dockerfile
├── lib/
│   └── dark_lantern/
│       ├── config.rb
│       ├── repo_manager.rb
│       ├── parser.rb
│       ├── query_templates.rb
│       └── refresh_worker.rb
├── app/
│   ├── api.rb
│   └── api/
│       ├── repos.rb
│       ├── files.rb
│       ├── symbols.rb
│       ├── annotations.rb
│       └── query.rb
└── spec/
Lives at /home/gfawcett/working/wanderland/dark-lantern/.
The Kremis V-model requires mapping source to tests at every level (URS to PQ). Copied code drifts. Dark Lantern serves live source so compliance pages can reference current code through stored queries or annotation patterns. The trace from requirement to source to test stays valid without manual synchronization.
This supports CSA (Computer Software Assurance) where the system's own records serve as validation evidence — Dark Lantern provides the source half of that evidence chain.