Testing
CodeRecon's testing subsystem provides unified test discovery, execution, and result parsing across multiple languages and frameworks. Tests are not exposed as separate MCP tools — they are accessed via the checkpoint tool.
Overview
The testing subsystem is built around Runner Packs — first-class plugins that define how to detect, discover, run, and parse tests for specific language/framework combinations.
Key Concepts
- Runner Pack: A plugin that handles a specific test framework (e.g., python.pytest, js.jest)
- Test Target: A discoverable unit of tests with a specific kind (file, package, or project)
- Workspace Root: The directory context where tests are executed (supports monorepos)
Supported Languages
Tier 1 (Full Support)
| Language | Runner Pack | Target Kind | Output Format | Notes |
|---|---|---|---|---|
| Python | python.pytest | file | JUnit XML | Via --junitxml |
| JavaScript | js.jest | file | JSON | Via --json --outputFile |
| JavaScript | js.vitest | file | JUnit XML | Via --reporter=junit |
| Go | go.gotest | package | NDJSON | Via -json flag |
| Rust | rust.nextest | package | JUnit XML | Preferred over cargo test |
| Rust | rust.cargo_test | package | Coarse | Limited output format |
| Java | java.maven | project | JUnit XML | From target/surefire-reports/ |
| Java | java.gradle | project | JUnit XML | From build/test-results/ |
| C# | csharp.dotnet | project | JUnit XML | Via JunitXml.TestLogger |
| C/C++ | cpp.ctest | project | Coarse | Limited output format |
| Ruby | ruby.rspec | file | JUnit XML | Via RspecJunitFormatter |
| Ruby | ruby.minitest | file | JUnit XML | Via minitest-junit gem |
| PHP | php.phpunit | file | JUnit XML | Via --log-junit |
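For the NDJSON row, each line of `go test -json` output is one JSON event. The following is a minimal sketch of tallying per-test outcomes from that stream. It is not CodeRecon's own parser, and the function name `tally_go_test_events` is hypothetical:

```python
import json
from collections import Counter


def tally_go_test_events(ndjson: str) -> Counter:
    """Count terminal per-test outcomes in `go test -json` output."""
    counts: Counter = Counter()
    for line in ndjson.splitlines():
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Package-level events carry no "Test" key; skip them so that only
        # individual tests are counted.
        if "Test" not in event:
            continue
        if event.get("Action") in ("pass", "fail", "skip"):
            counts[event["Action"]] += 1
    return counts


sample = "\n".join([
    '{"Action":"run","Package":"example/pkg","Test":"TestA"}',
    '{"Action":"output","Package":"example/pkg","Test":"TestA","Output":"ok\\n"}',
    '{"Action":"pass","Package":"example/pkg","Test":"TestA","Elapsed":0.01}',
    '{"Action":"fail","Package":"example/pkg","Test":"TestB","Elapsed":0.02}',
    '{"Action":"fail","Package":"example/pkg","Elapsed":0.03}',
])
print(tally_go_test_events(sample))
```

The last sample event is a package-level failure, which is why it does not add to the per-test counts.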
Tier 2 (Standard Support)
| Language | Runner Pack | Target Kind | Output Format | Notes |
|---|---|---|---|---|
| Kotlin | kotlin.gradle | project | JUnit XML | Uses Gradle test task |
| Swift | swift.swiftpm | package | Coarse | Limited output format |
| Scala | scala.sbt | project | JUnit XML | Via junit reporter |
| Dart | dart.dart_test | file | JSON | Via --reporter json |
| Dart | dart.flutter_test | file | JSON | Via --machine flag |
| Bash | bash.bats | file | JUnit XML | Via --formatter junit |
| PowerShell | powershell.pester | file | JUnit XML | Via Pester config |
| Lua | lua.busted | file | JUnit XML | Via -o junit |
| Elixir | elixir.mix_test | file | JUnit XML | Via JUnitFormatter |
| Haskell | haskell.cabal_test | project | Coarse | Limited output format |
| Julia | julia.pkg_test | project | Coarse | Limited output format |
| OCaml | ocaml.dune_test | project | Coarse | Limited output format |
Task Runners (Generic)
| Runner Pack | Trigger | Notes |
|---|---|---|
| generic.makefile_test | Makefile with a test target | Runs make test |
| generic.justfile_test | justfile with a test recipe | Runs just test |
Target Kinds
Test targets have a kind that determines how they map to CLI arguments:
- file: A single test file. The selector is a relative file path.
- package: A module/package (Go packages, Rust crates). The selector is a package identifier.
- project: A project root (Maven module, Gradle project, .NET solution). The selector is typically . or a subproject path.
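A minimal sketch of how a kind and selector could translate into a command line. The `Target` class and `selector_args` function are hypothetical, and the command shapes are illustrative rather than what CodeRecon's runner packs actually emit:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical stand-in for a test target; not CodeRecon's TestTarget."""
    kind: str       # "file", "package", or "project"
    selector: str   # file path, package identifier, or project path


def selector_args(runner: str, target: Target) -> list[str]:
    """Illustrative mapping from (runner, kind) to a CLI invocation."""
    if runner == "python.pytest" and target.kind == "file":
        return ["pytest", target.selector]
    if runner == "go.gotest" and target.kind == "package":
        return ["go", "test", "-json", target.selector]
    if runner == "java.maven" and target.kind == "project":
        # "." means the root project; anything else selects a submodule.
        if target.selector == ".":
            return ["mvn", "test"]
        return ["mvn", "test", "-pl", target.selector]
    raise ValueError(f"unsupported combination: {runner} / {target.kind}")


print(selector_args("python.pytest", Target("file", "tests/test_example.py")))
```

Rejecting unknown runner and kind combinations keeps a mismatched target from silently producing a wrong command.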
Detection
Runner packs are detected automatically based on marker files:
pytest.ini, conftest.py, pyproject.toml[tool.pytest] → python.pytest
jest.config.js, package.json[jest] → js.jest
vitest.config.ts → js.vitest
go.mod → go.gotest
Cargo.toml → rust.nextest / rust.cargo_test
pom.xml → java.maven
build.gradle → java.gradle
*.csproj, *.sln → csharp.dotnet
CMakeLists.txt[enable_testing] → cpp.ctest
.rspec, spec/spec_helper.rb → ruby.rspec
phpunit.xml → php.phpunit
Rakefile[Rake::TestTask], test/test_helper.rb → ruby.minitest
mix.exs → elixir.mix_test
*.cabal → haskell.cabal_test
Project.toml → julia.pkg_test
dune-project → ocaml.dune_test
Makefile (with test target) → generic.makefile_test
justfile (with test recipe) → generic.justfile_test
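Marker-based detection can be sketched as a lookup from pack id to marker file names. The example below covers only a few of the rules above, handles file-name markers only (content checks such as pyproject.toml[tool.pytest] are omitted), and uses the hypothetical names `MARKERS` and `detect_packs`:

```python
import tempfile
from pathlib import Path

# A small subset of the marker rules listed above, file-name markers only.
MARKERS: dict[str, list[str]] = {
    "python.pytest": ["pytest.ini", "conftest.py"],
    "go.gotest": ["go.mod"],
    "java.maven": ["pom.xml"],
    "php.phpunit": ["phpunit.xml"],
}


def detect_packs(workspace_root: Path) -> list[str]:
    """Return pack ids whose marker files exist directly in workspace_root."""
    return sorted(
        pack_id
        for pack_id, names in MARKERS.items()
        if any((workspace_root / name).is_file() for name in names)
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "go.mod").write_text("module example\n")
    (root / "conftest.py").write_text("")
    detected = detect_packs(root)
print(detected)  # ['go.gotest', 'python.pytest']
```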
Configuration Overrides
You can override the testing defaults (parallelism, timeouts, memory reserve) in .recon/config.yaml:
testing:
  default_parallelism: 4
  default_timeout_sec: 600
  timeout_sec_by_language:
    java: 900
    python: 120
  memory_reserve_mb: 1024
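One plausible reading of these settings is that a per-language timeout takes precedence over the default. The sketch below encodes that assumption. The precedence rule, the nesting of every key under testing, and the helper name `timeout_for` are assumptions, not documented behavior:

```python
from typing import Any

# Mirrors the settings above as a plain dict to avoid a YAML dependency.
config: dict[str, Any] = {
    "testing": {
        "default_parallelism": 4,
        "default_timeout_sec": 600,
        "timeout_sec_by_language": {"java": 900, "python": 120},
        "memory_reserve_mb": 1024,
    }
}


def timeout_for(language: str, cfg: dict[str, Any]) -> int:
    """Assumed rule: a per-language timeout wins over the default."""
    testing = cfg["testing"]
    per_language = testing.get("timeout_sec_by_language", {})
    return per_language.get(language, testing["default_timeout_sec"])


print(timeout_for("java", config))  # 900
print(timeout_for("go", config))    # 600
```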
Monorepo Support
The testing subsystem supports monorepos by detecting nested workspaces:
- JavaScript: Detects packages/*/package.json, pnpm workspaces, nx/turborepo
- Java: Detects multi-module Maven/Gradle projects
- .NET: Detects solutions and project files
Each discovered target includes a workspace_root field indicating where to run the tests.
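The packages/*/package.json pattern can be sketched with a glob. This covers only that one layout (pnpm, nx, and turborepo detection are not shown), and `find_js_workspaces` is a hypothetical helper:

```python
import tempfile
from pathlib import Path


def find_js_workspaces(repo_root: Path) -> list[Path]:
    """Return directories matching packages/*/package.json, sorted by path."""
    return sorted(p.parent for p in repo_root.glob("packages/*/package.json"))


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    for name in ("api", "web"):
        pkg = repo / "packages" / name
        pkg.mkdir(parents=True)
        (pkg / "package.json").write_text("{}")
    (repo / "packages" / "docs").mkdir()  # no package.json, so not a workspace
    workspace_names = [p.name for p in find_js_workspaces(repo)]
print(workspace_names)  # ['api', 'web']
```

Each returned directory would play the role of a workspace root for the targets found beneath it.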
Output Artifacts
Test results are written to .recon/artifacts/tests/<run_id>/:
.recon/artifacts/tests/abc12345/
├── test_tests_test_example.py.xml # JUnit XML output
├── test_tests_test_example.py.stdout.txt # Raw stdout
└── ...
Integration with checkpoint
The checkpoint tool integrates the testing subsystem:
- changed_files are resolved to their definitions via the index
- The daemon's test scheduler runs affected tests in the background after edits settle
- checkpoint reads cached results from the background run
- If no cached run is available, checkpoint falls back to import-graph traversal to find affected test targets
- Results (pass/fail counts, failures with tracebacks) are returned inline
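The cache-first, graph-fallback flow above can be sketched as follows. This is an illustration under stated assumptions and not CodeRecon's implementation: the function name, the shape of the cache, the reverse import graph, and the "test_" file naming convention are all hypothetical:

```python
from collections import deque
from typing import Optional


def affected_tests(
    changed_files: list[str],
    cached_results: Optional[dict[str, str]],
    importers: dict[str, list[str]],
) -> tuple[str, list[str]]:
    """Return (source, test files): cached results if present, else a graph walk.

    `importers` maps a file to the files that import it (a reverse import graph).
    """
    if cached_results is not None:
        return "cache", sorted(cached_results)

    # Breadth-first walk over everything that transitively imports a change.
    seen = set(changed_files)
    queue = deque(changed_files)
    while queue:
        current = queue.popleft()
        for dependent in importers.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    # Assumed naming convention for this sketch: test files start with "test_".
    tests = sorted(f for f in seen if f.rsplit("/", 1)[-1].startswith("test_"))
    return "import_graph", tests


graph = {
    "src/core.py": ["src/api.py", "tests/test_core.py"],
    "src/api.py": ["tests/test_api.py"],
}
print(affected_tests(["src/core.py"], None, graph))
```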
See tools.md — checkpoint for parameter details.
Output Format Fidelity
Full Fidelity (JUnit XML)
Most runners produce JUnit XML, which provides:
- Individual test names and classnames
- Pass/fail/skip/error status per test
- Duration per test
- Failure messages and stack traces
- stdout/stderr capture
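To make these fields concrete, here is a minimal sketch that reads them from a JUnit XML document with the standard library. It is not CodeRecon's parse_junit_xml; `summarize_junit` and the sample document are hypothetical:

```python
import xml.etree.ElementTree as ET

JUNIT_SAMPLE = """<testsuite name="example" tests="3">
  <testcase classname="tests.test_example" name="test_ok" time="0.01"/>
  <testcase classname="tests.test_example" name="test_bad" time="0.02">
    <failure message="assert 1 == 2">traceback text</failure>
  </testcase>
  <testcase classname="tests.test_example" name="test_skip" time="0.00">
    <skipped/>
  </testcase>
</testsuite>"""


def summarize_junit(xml_text: str) -> list[dict]:
    """Extract name, classname, status, and duration for each testcase."""
    root = ET.fromstring(xml_text)
    cases = []
    # iter() also finds testcases nested under a <testsuites> wrapper.
    for case in root.iter("testcase"):
        if case.find("failure") is not None:
            status = "failed"
        elif case.find("error") is not None:
            status = "error"
        elif case.find("skipped") is not None:
            status = "skipped"
        else:
            status = "passed"
        cases.append({
            "name": case.get("name"),
            "classname": case.get("classname"),
            "status": status,
            "duration": float(case.get("time", "0")),
        })
    return cases


for case in summarize_junit(JUNIT_SAMPLE):
    print(case["name"], case["status"])
```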
Reduced Fidelity (Coarse Mode)
Runners marked Coarse in the tables above (such as rust.cargo_test, swift.swiftpm, and cpp.ctest) cannot produce machine-readable output. In coarse mode:
- Only aggregate pass/fail counts are available
- Individual test details are not captured
- Failure messages may be incomplete
To get full fidelity for Rust, use cargo-nextest instead of cargo test.
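As an illustration of what coarse mode can recover, `cargo test` prints one human-readable summary line per test binary, and aggregate counts can be scraped from those lines. The sketch below does that with a regular expression. It is an assumption about how such parsing might look, not CodeRecon's parser, and `coarse_counts` is a hypothetical name:

```python
import re

# Matches the summary line printed by `cargo test`, for example:
# "test result: FAILED. 3 passed; 1 failed; 0 ignored; 0 measured; ..."
SUMMARY = re.compile(
    r"test result: (?P<status>ok|FAILED)\. "
    r"(?P<passed>\d+) passed; (?P<failed>\d+) failed; (?P<ignored>\d+) ignored"
)


def coarse_counts(stdout: str) -> dict[str, int]:
    """Sum pass/fail/ignored counts over every summary line in stdout."""
    totals = {"passed": 0, "failed": 0, "ignored": 0}
    for match in SUMMARY.finditer(stdout):
        for key in totals:
            totals[key] += int(match.group(key))
    return totals


sample_stdout = (
    "test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; "
    "0 filtered out; finished in 0.00s\n"
    "test result: FAILED. 3 passed; 1 failed; 1 ignored; 0 measured; "
    "0 filtered out; finished in 0.01s\n"
)
print(coarse_counts(sample_stdout))  # {'passed': 5, 'failed': 1, 'ignored': 1}
```

Only totals survive this kind of scraping, which matches the limits listed above: no per-test names, durations, or reliable failure messages.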
Coverage Parsers
CodeRecon can ingest test coverage data from multiple formats:
| Parser | Coverage Format | Source |
|---|---|---|
| coveragepy_context | coverage.py with contexts | Python |
| simplecov | SimpleCov JSON | Ruby |
| simplecov_per_test | SimpleCov per-test | Ruby |
| istanbul | Istanbul JSON | JavaScript/TypeScript |
| cobertura | Cobertura XML | Multi-language |
| lcov | LCOV | Multi-language |
| gocov | Go coverage profile | Go |
| opencover | OpenCover XML | .NET |
| phpunit_per_test | PHPUnit per-test | PHP |
| jacoco | JaCoCo XML | Java/Kotlin |
| jacoco_per_test | JaCoCo per-test | Java/Kotlin |
| clover | Clover XML | Java/PHP |
Coverage data feeds into:
- blast_radius: which tests cover changed definitions
- covering_tests: which tests hit each definition in a file
- recon_line_coverage: per-line hit counts and test attribution
- Governance policies (coverage_floor, coverage_regression)
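As an illustration of the per-line hit counts that such parsers extract, here is a minimal LCOV reader. It handles only the SF, DA, and end_of_record lines of a tracefile. It is a sketch rather than CodeRecon's lcov parser, and `parse_lcov` is a hypothetical name:

```python
from collections import defaultdict


def parse_lcov(text: str) -> dict[str, dict[int, int]]:
    """Map each source file to {line number: hit count} from LCOV records."""
    coverage: dict[str, dict[int, int]] = defaultdict(dict)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            current = line[3:]
        elif line.startswith("DA:") and current is not None:
            # DA:<line>,<hits>[,<checksum>]
            fields = line[3:].split(",")
            lineno, hits = int(fields[0]), int(fields[1])
            coverage[current][lineno] = coverage[current].get(lineno, 0) + hits
        elif line == "end_of_record":
            current = None
    return dict(coverage)


lcov_sample = """SF:src/core.py
DA:1,4
DA:2,0
end_of_record
SF:src/api.py
DA:10,1
end_of_record
"""
print(parse_lcov(lcov_sample))
```

Plain LCOV carries no per-test attribution, so mapping hits back to individual tests would need one of the per-test formats listed in the table.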
Extending with New Runner Packs
To add a new runner pack:
- Create a class extending RunnerPack
- Define pack_id, language, markers, output_strategy, capabilities
- Implement detect(), discover(), build_command(), parse_output()
- Register with @runner_registry.register
from pathlib import Path

# RunnerPack, MarkerRule, OutputStrategy, RunnerCapabilities, TestTarget,
# ParsedTestSuite, parse_junit_xml, and runner_registry are provided by
# CodeRecon's testing package.

@runner_registry.register
class MyRunnerPack(RunnerPack):
    pack_id = "lang.myrunner"
    language = "lang"
    runner_name = "myrunner"
    markers = [MarkerRule("myrunner.config", confidence="high")]
    output_strategy = OutputStrategy(format="junit_xml", file_based=True)
    capabilities = RunnerCapabilities(supported_kinds=["file"])

    def detect(self, workspace_root: Path) -> float:
        # Return a confidence score: 1.0 when the marker file is present.
        if (workspace_root / "myrunner.config").exists():
            return 1.0
        return 0.0

    async def discover(self, workspace_root: Path) -> list[TestTarget]:
        # Return list of TestTarget objects
        ...

    def build_command(self, target, *, output_path, pattern, tags) -> list[str]:
        # Assemble the argv list that runs one target and writes its report.
        return ["myrunner", "test", target.selector, f"--output={output_path}"]

    def parse_output(self, output_path: Path, stdout: str) -> ParsedTestSuite:
        # The runner writes JUnit XML to output_path; stdout is unused here.
        return parse_junit_xml(output_path.read_text())
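For readers unfamiliar with decorator-based registration, the sketch below shows how a registry of this shape could work. It is a generic illustration of the pattern: `RunnerRegistry`, its `get` method, and the duplicate check are assumptions, and CodeRecon's runner_registry may behave differently:

```python
class RunnerRegistry:
    """Hypothetical registry; CodeRecon's runner_registry may differ."""

    def __init__(self) -> None:
        self._packs: dict[str, type] = {}

    def register(self, cls: type) -> type:
        """Class decorator: index the class by its pack_id and return it."""
        pack_id = cls.pack_id
        if pack_id in self._packs:
            raise ValueError(f"duplicate pack_id: {pack_id}")
        self._packs[pack_id] = cls
        return cls

    def get(self, pack_id: str) -> type:
        """Look up a registered class by its pack_id."""
        return self._packs[pack_id]


registry = RunnerRegistry()


@registry.register
class DemoPack:
    pack_id = "lang.myrunner"


print(registry.get("lang.myrunner").__name__)  # DemoPack
```

Returning the class unchanged from `register` is what lets the decorator be applied without altering the class definition.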