Testing

CodeRecon's testing subsystem provides unified test discovery, execution, and result parsing across multiple languages and frameworks. Tests are not exposed as separate MCP tools; they are run through the checkpoint tool.

Overview

The testing subsystem is built around Runner Packs — first-class plugins that define how to detect, discover, run, and parse tests for specific language/framework combinations.

Key Concepts

Runner Pack: A plugin that handles a specific test framework (e.g., python.pytest, js.jest)
Test Target: A discoverable unit of tests with a specific kind (file, package, or project)
Workspace Root: The directory context where tests are executed (supports monorepos)

Supported Languages

Tier 1 (Full Support)

Language	Runner Pack	Target Kind	Output Format	Notes
Python	`python.pytest`	file	JUnit XML	Via `--junitxml`
JavaScript	`js.jest`	file	JSON	Via `--json --outputFile`
JavaScript	`js.vitest`	file	JUnit XML	Via `--reporter=junit`
Go	`go.gotest`	package	NDJSON	Via `-json` flag
Rust	`rust.nextest`	package	JUnit XML	Preferred over cargo test
Rust	`rust.cargo_test`	package	Coarse	Limited output format
Java	`java.maven`	project	JUnit XML	From `target/surefire-reports/`
Java	`java.gradle`	project	JUnit XML	From `build/test-results/`
C#	`csharp.dotnet`	project	JUnit XML	Via JunitXml.TestLogger
C/C++	`cpp.ctest`	project	Coarse	Limited output format
Ruby	`ruby.rspec`	file	JUnit XML	Via RspecJunitFormatter
Ruby	`ruby.minitest`	file	JUnit XML	Via minitest-junit gem
PHP	`php.phpunit`	file	JUnit XML	Via `--log-junit`

Tier 2 (Standard Support)

Language	Runner Pack	Target Kind	Output Format	Notes
Kotlin	`kotlin.gradle`	project	JUnit XML	Uses Gradle test task
Swift	`swift.swiftpm`	package	Coarse	Limited output format
Scala	`scala.sbt`	project	JUnit XML	Via junit reporter
Dart	`dart.dart_test`	file	JSON	Via `--reporter json`
Dart	`dart.flutter_test`	file	JSON	Via `--machine` flag
Bash	`bash.bats`	file	JUnit XML	Via `--formatter junit`
PowerShell	`powershell.pester`	file	JUnit XML	Via Pester config
Lua	`lua.busted`	file	JUnit XML	Via `-o junit`
Elixir	`elixir.mix_test`	file	JUnit XML	Via JUnitFormatter
Haskell	`haskell.cabal_test`	project	Coarse	Limited output format
Julia	`julia.pkg_test`	project	Coarse	Limited output format
OCaml	`ocaml.dune_test`	project	Coarse	Limited output format

Task Runners (Generic)

Runner Pack	Trigger	Notes
`generic.makefile_test`	`Makefile` with a `test` target	Runs `make test`
`generic.justfile_test`	`justfile` with a `test` recipe	Runs `just test`

Target Kinds

Test targets have a kind that determines how they map to CLI arguments:

file: A single test file. The selector is a relative file path.
package: A module/package (Go packages, Rust crates). The selector is a package identifier.
project: A project root (Maven module, Gradle project, .NET solution). The selector is typically . or a subproject path.
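As a rough illustration of the three kinds, a target can be modelled as a small record holding a kind, a selector, and the workspace root. This is a sketch only: the `TargetKind` enum, the field layout of `TestTarget`, and the `selector_args` helper are assumptions made for this example and are not taken from CodeRecon's source.

```python
from dataclasses import dataclass
from enum import Enum


class TargetKind(str, Enum):
    FILE = "file"
    PACKAGE = "package"
    PROJECT = "project"


@dataclass(frozen=True)
class TestTarget:
    # Hypothetical field layout, for illustration only.
    kind: TargetKind
    selector: str        # file path, package identifier, or "." / subproject path
    workspace_root: str  # directory the runner is invoked from


def selector_args(target: TestTarget) -> list[str]:
    """Return the CLI argument(s) that select this target (illustrative)."""
    if target.kind is TargetKind.PROJECT and target.selector == ".":
        # A project rooted at the workspace needs no extra selector argument.
        return []
    return [target.selector]
```

A file target would then contribute its relative path to the command line, while a project target at `.` contributes nothing beyond the working directory.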

Detection

Runner packs are detected automatically based on marker files:

pytest.ini, conftest.py, pyproject.toml[tool.pytest] → python.pytest
jest.config.js, package.json[jest] → js.jest
vitest.config.ts → js.vitest
go.mod → go.gotest
Cargo.toml → rust.nextest / rust.cargo_test
pom.xml → java.maven
build.gradle → java.gradle
*.csproj, *.sln → csharp.dotnet
CMakeLists.txt[enable_testing] → cpp.ctest
.rspec, spec/spec_helper.rb → ruby.rspec
phpunit.xml → php.phpunit
Rakefile[Rake::TestTask], test/test_helper.rb → ruby.minitest
mix.exs → elixir.mix_test
*.cabal → haskell.cabal_test
Project.toml → julia.pkg_test
dune-project → ocaml.dune_test
Makefile (with test target) → generic.makefile_test
justfile (with test recipe) → generic.justfile_test
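The marker list above can be pictured as a lookup from file names to pack IDs. The sketch below covers only a handful of the plain file-existence markers; entries written with brackets (such as `pyproject.toml[tool.pytest]`) require inspecting file contents and are left out. The `MARKERS` table and `detect_packs` function are hypothetical names for this example, not CodeRecon's detection code.

```python
from pathlib import Path

# Subset of the marker list above; plain file-existence checks only.
MARKERS: dict[str, list[str]] = {
    "python.pytest": ["pytest.ini", "conftest.py"],
    "js.jest": ["jest.config.js"],
    "go.gotest": ["go.mod"],
    "java.maven": ["pom.xml"],
    "php.phpunit": ["phpunit.xml"],
}


def detect_packs(workspace_root: Path) -> list[str]:
    """Return pack IDs whose marker files exist directly under workspace_root."""
    return sorted(
        pack_id
        for pack_id, names in MARKERS.items()
        if any((workspace_root / name).is_file() for name in names)
    )
```

A directory containing both `go.mod` and `conftest.py` would match two packs here, which is one reason a real detector would likely need some notion of confidence or priority.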

Configuration Overrides

You can override the default execution settings (parallelism, timeouts, memory reserve) in .recon/config.yaml:

testing:
  default_parallelism: 4
  default_timeout_sec: 600
  timeout_sec_by_language:
    java: 900
    python: 120
  memory_reserve_mb: 1024
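One plausible reading of these keys is that a per-language timeout takes precedence over `default_timeout_sec`. The sketch below shows that lookup against a dictionary mirroring the YAML above. The precedence rule and the `resolve_timeout` helper are assumptions for illustration; the actual resolution logic is not documented here.

```python
def resolve_timeout(testing_cfg: dict, language: str) -> int:
    """Pick the timeout for a language, falling back to the default.

    Assumes (not confirmed by the docs) that timeout_sec_by_language
    overrides default_timeout_sec when the language has an entry.
    """
    per_language = testing_cfg.get("timeout_sec_by_language", {})
    if language in per_language:
        return per_language[language]
    return testing_cfg["default_timeout_sec"]


# Dictionary equivalent of the `testing:` section shown above.
cfg = {
    "default_parallelism": 4,
    "default_timeout_sec": 600,
    "timeout_sec_by_language": {"java": 900, "python": 120},
    "memory_reserve_mb": 1024,
}
```

Under this reading, Java targets would get 900 seconds, Python targets 120, and every other language the 600-second default.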

Monorepo Support

The testing subsystem supports monorepos by detecting nested workspaces:

JavaScript: Detects packages/*/package.json, pnpm workspaces, nx/turborepo
Java: Detects multi-module Maven/Gradle projects
.NET: Detects solutions and project files

Each discovered target includes a workspace_root field indicating where to run the tests.
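For the simplest JavaScript case, the `packages/*/package.json` layout, nested workspace roots can be found with a single glob. This is a minimal sketch; `find_js_workspaces` is a hypothetical helper, and it does not attempt pnpm, nx, or turborepo configuration parsing.

```python
from pathlib import Path


def find_js_workspaces(repo_root: Path) -> list[Path]:
    """Return directories under packages/ that contain a package.json."""
    return sorted(p.parent for p in repo_root.glob("packages/*/package.json"))
```

Each returned directory would be a candidate `workspace_root` for the targets discovered beneath it.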

Output Artifacts

Test results are written to .recon/artifacts/tests/<run_id>/:

.recon/artifacts/tests/abc12345/
├── test_tests_test_example.py.xml    # JUnit XML output
├── test_tests_test_example.py.stdout.txt  # Raw stdout
└── ...

Integration with checkpoint

The checkpoint tool integrates the testing subsystem:

changed_files are resolved to their definitions via the index
The daemon's test scheduler runs affected tests in the background after edits settle
checkpoint reads cached results from the background run
If no cached run is available, checkpoint falls back to import-graph traversal to find affected test targets
Results (pass/fail counts, failures with tracebacks) are returned inline
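The import-graph fallback can be thought of as a breadth-first walk over reverse import edges, starting from the changed files and collecting every test file reached. The sketch below assumes the graph is available as a plain dictionary mapping each file to the files that import it. The `affected_tests` function, the graph shape, and the `is_test` predicate are all hypothetical and stand in for whatever the index provides.

```python
from collections import deque
from typing import Callable


def affected_tests(
    changed: set[str],
    imported_by: dict[str, set[str]],
    is_test: Callable[[str], bool],
) -> set[str]:
    """Collect test files that transitively import any changed file."""
    seen = set(changed)
    queue = deque(changed)
    while queue:
        module = queue.popleft()
        # Follow reverse edges: who imports this module?
        for dependent in imported_by.get(module, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return {path for path in seen if is_test(path)}
```

Note that a changed file that is itself a test is included in the result, since it is part of the visited set.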

See tools.md — checkpoint for parameter details.

Output Format Fidelity

Full Fidelity (JUnit XML)

Most runners produce JUnit XML, which provides:

Individual test names and classnames
Pass/fail/skip/error status per test
Duration per test
Failure messages and stack traces
stdout/stderr capture
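To make the per-test detail concrete, the following sketch reads a JUnit XML document with the standard library and maps each test case to a status. It is not CodeRecon's parser; `summarize_junit` is a hypothetical helper that relies only on the common `<testcase>`, `<failure>`, `<error>`, and `<skipped>` elements.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str) -> dict[str, str]:
    """Map 'classname::name' to passed, failed, error, or skipped."""
    root = ET.fromstring(xml_text)
    results: dict[str, str] = {}
    # iter() finds <testcase> whether the root is <testsuites> or <testsuite>.
    for case in root.iter("testcase"):
        key = f"{case.get('classname', '')}::{case.get('name', '')}"
        if case.find("failure") is not None:
            status = "failed"
        elif case.find("error") is not None:
            status = "error"
        elif case.find("skipped") is not None:
            status = "skipped"
        else:
            status = "passed"
        results[key] = status
    return results
```

Durations and failure messages are available on the same elements (the `time` attribute and the text of `<failure>`), which is what makes this format full fidelity.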

Reduced Fidelity (Coarse Mode)

Some runners (for example rust.cargo_test, swift.swiftpm, and cpp.ctest; see the entries marked Coarse in the tables above) do not produce per-test machine-readable output. In coarse mode:

Only aggregate pass/fail counts are available
Individual test details are not captured
Failure messages may be incomplete

To get full fidelity for Rust, use cargo-nextest instead of cargo test.
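To show what aggregate-only results look like in practice, the sketch below sums the `test result:` summary lines that `cargo test` prints for each test binary. It is an illustration of coarse parsing in general, not CodeRecon's implementation; `coarse_counts` and the `SUMMARY` pattern are names chosen for this example.

```python
import re

# Matches lines such as:
#   test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
SUMMARY = re.compile(r"test result: \w+\. (\d+) passed; (\d+) failed; (\d+) ignored")


def coarse_counts(stdout: str) -> dict[str, int]:
    """Sum passed/failed/ignored counts across all summary lines."""
    totals = {"passed": 0, "failed": 0, "ignored": 0}
    for passed, failed, ignored in SUMMARY.findall(stdout):
        totals["passed"] += int(passed)
        totals["failed"] += int(failed)
        totals["ignored"] += int(ignored)
    return totals
```

Nothing in these summary lines identifies which test failed or why, which is the limitation coarse mode describes.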

Coverage Parsers

CodeRecon can ingest test coverage data from multiple formats:

Parser	Coverage Format	Source
`coveragepy_context`	coverage.py with contexts	Python
`simplecov`	SimpleCov JSON	Ruby
`simplecov_per_test`	SimpleCov per-test	Ruby
`istanbul`	Istanbul JSON	JavaScript/TypeScript
`cobertura`	Cobertura XML	Multi-language
`lcov`	LCOV	Multi-language
`gocov`	Go coverage profile	Go
`opencover`	OpenCover XML	.NET
`phpunit_per_test`	PHPUnit per-test	PHP
`jacoco`	JaCoCo XML	Java/Kotlin
`jacoco_per_test`	JaCoCo per-test	Java/Kotlin
`clover`	Clover XML	Java/PHP

Coverage data feeds into:

blast_radius — which tests cover changed definitions
covering_tests — which tests hit each definition in a file
recon_line_coverage — per-line hit counts and test attribution
Governance policies (coverage_floor, coverage_regression)
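As an example of the kind of data these parsers extract, the sketch below reads the LCOV tracefile format into per-file, per-line hit counts using the `SF:`, `DA:`, and `end_of_record` records. It is a minimal illustration under the assumption that only line coverage is needed; `parse_lcov` is a hypothetical function and not the `lcov` parser listed above.

```python
def parse_lcov(text: str) -> dict[str, dict[int, int]]:
    """Return {source_file: {line_number: hit_count}} from LCOV text."""
    coverage: dict[str, dict[int, int]] = {}
    current: dict[int, int] | None = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            # Start (or resume) the record for this source file.
            current = coverage.setdefault(line[3:], {})
        elif line.startswith("DA:") and current is not None:
            # DA:<line>,<hits>[,<checksum>]
            parts = line[3:].split(",")
            lineno, hits = int(parts[0]), int(parts[1])
            current[lineno] = current.get(lineno, 0) + hits
        elif line == "end_of_record":
            current = None
    return coverage
```

Per-line hit counts like these are the raw material for line-level coverage queries; attributing hits to individual tests requires one of the per-test formats in the table.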

Extending with New Runner Packs

To add a new runner pack:

Create a class extending RunnerPack
Define pack_id, language, markers, output_strategy, capabilities
Implement detect(), discover(), build_command(), parse_output()
Register with @runner_registry.register

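from pathlib import Path

# RunnerPack, runner_registry, MarkerRule, OutputStrategy, RunnerCapabilities,
# TestTarget, ParsedTestSuite, and parse_junit_xml are provided by CodeRecon;
# their import path is not shown in this document.

@runner_registry.register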
class MyRunnerPack(RunnerPack):
    pack_id = "lang.myrunner"
    language = "lang"
    runner_name = "myrunner"
    markers = [MarkerRule("myrunner.config", confidence="high")]
    output_strategy = OutputStrategy(format="junit_xml", file_based=True)
    capabilities = RunnerCapabilities(supported_kinds=["file"])

    def detect(self, workspace_root: Path) -> float:
        if (workspace_root / "myrunner.config").exists():
            return 1.0
        return 0.0

    async def discover(self, workspace_root: Path) -> list[TestTarget]:
        # Return list of TestTarget objects
        ...

    def build_command(self, target, *, output_path, pattern, tags) -> list[str]:
        return ["myrunner", "test", target.selector, f"--output={output_path}"]

    def parse_output(self, output_path: Path, stdout: str) -> ParsedTestSuite:
        return parse_junit_xml(output_path.read_text())