Move content into new RFC file
adpaco-aws committed Apr 26, 2024
1 parent 180e10c commit ae64a17
Showing 2 changed files with 301 additions and 1 deletion.
2 changes: 1 addition & 1 deletion rfc/src/rfcs/0008-line-coverage.md
@@ -1,7 +1,7 @@
- **Feature Name:** Line coverage (`line-coverage`)
- **Feature Request Issue:** <https://github.com/model-checking/kani/issues/2610>
- **RFC PR:** <https://github.com/model-checking/kani/pull/2609>
- - **Status:** Unstable
+ - **Status:** Cancelled
- **Version:** 0
- **Proof-of-concept:** <https://github.com/model-checking/kani/pull/2609> (Kani) + <https://github.com/model-checking/kani-vscode-extension/pull/122> (Kani VS Code Extension)

Expand Down
300 changes: 300 additions & 0 deletions rfc/src/rfcs/0010-source-coverage.md
@@ -0,0 +1,300 @@
- **Feature Name:** Source-based code coverage (`source-coverage`)
- **Feature Request Issue:** <https://github.com/model-checking/kani/issues/2640>
- **RFC PR:** <https://github.com/model-checking/kani/pull/3143>
- **Status:** Under Review
- **Version:** 2
- **Proof-of-concept:** <https://github.com/model-checking/kani/pull/3119> (Kani) + <https://github.com/model-checking/kani/pull/3121> (`kani-cov`)

-------------------

## Summary

A source-based code coverage feature for Kani built on top of Rust's coverage instrumentation.

## User Impact

Currently, users can't easily obtain verification-based coverage reports in Kani.
Generally speaking, these reports show which parts of the code under verification are covered and which are not.
Users rely on these reports to ensure that their harnesses are sound
---that is, that properties are checked for the entire body of code they expect to cover.

Moreover, some users prefer using coverage information for harness development and debugging.
That's because coverage information provides users with a more familiar way to interpret verification results.

As mentioned earlier, we expect users to employ this coverage-related option at several stages of a verification effort:
* **Learning:** New users are more familiar with coverage reports than property-based results.
* **Development:** Some users prefer coverage results to property-based results since they are easier to interpret.
* **CI Integration**: Users may want to enforce a minimum percentage of code coverage for new contributions.
* **Debugging:** Users may find coverage reports particularly helpful when inputs are over-constrained (missing some corner cases).
* **Evaluation:** Users can easily evaluate where and when more verification work is needed (some projects aim for 100% coverage).
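The *Debugging* case above can be illustrated in plain Rust (no Kani-specific APIs; the function and values are hypothetical): when inputs are over-constrained, some branches are never exercised, and a coverage report is what reveals the gap.

```rust
// Hypothetical function under verification.
fn classify(x: i32) -> &'static str {
    if x < 0 {
        "negative"
    } else if x == 0 {
        "zero"
    } else {
        "positive"
    }
}

fn main() {
    // Mimicking an over-constrained harness: inputs restricted to x > 0,
    // so the "negative" and "zero" branches are never executed.
    // All checks pass, yet a coverage report would flag the missed regions.
    for x in 1..=10 {
        assert_eq!(classify(x), "positive");
    }
}
```

Verification succeeds on every checked input here, which is exactly why only a coverage report would expose that two of the three branches were never analyzed.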

Moreover, adding this option directly to Kani, instead of relying on other tools, is likely to:
1. Increase the speed of development
2. Improve testing for coverage features

Both of these translate into faster and more reliable coverage options for users.

### Update: from line to source coverage

In the previous version of this RFC, we introduced and made available a line coverage option in Kani.
This option has since allowed us to gather more data about the expectations for a coverage option in Kani.

For example, the line coverage output we produced wasn't easy to interpret without knowing some implementation details.
Aside from that, the feature requested in [#2795](https://github.com/model-checking/kani/issues/2795)
alludes to the need for coverage-specific tooling in Kani.
Moreover, as captured in [#2640](https://github.com/model-checking/kani/issues/2640),
source-based coverage results provide the clearest and most precise coverage information.

In this RFC, we propose an integration with [Rust's source-based code coverage instrumentation](https://doc.rust-lang.org/rustc/instrument-coverage.html).
The integration would allow us to report source-based code coverage results from Kani.
Also, we propose adding a new user-facing, coverage-focused tool called `kani-cov`.
The tool would allow users to process coverage results generated by Kani and produce
coverage artifacts such as summaries and reports according to their preferences.
In the [next section](#user-experience), we'll explain in more detail how we
expect `kani-cov` to assist with coverage-related tasks.

With these changes, we expect our coverage options to become more flexible, precise and efficient.
In the [last section](#future-possibilities) of this RFC,
we'll also discuss the requirements for a potential integration of this coverage feature with the LLVM toolchain.

## User Experience

The proposed coverage workflow reproduces that of the most popular coverage frameworks.
First, let's delve into the LLVM coverage workflow, followed by an explanation of our proposal.

### The LLVM code coverage workflow

The LLVM project is home to one of the most popular code coverage frameworks.
The workflow associated with the LLVM framework is described in the documentation for [source-based code coverage](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html)[^note-source], but we briefly describe it here to better relate it to our proposal.

In short, the LLVM code coverage workflow follows three steps:
1. **Compiling with coverage enabled.** This causes the compiler to generate an instrumented program.
2. **Running the instrumented program.** This generates binary-encoded `.profraw` files.
3. **Using tools to aggregate and export coverage information into other formats.**

When working in a `cargo` project, step 1 can be done through this command:

```sh
RUSTFLAGS='-Cinstrument-coverage' cargo build
```

The same flag must be used for step 2:

```sh
RUSTFLAGS='-Cinstrument-coverage' cargo run
```

This should populate the current directory with at least one `.profraw` file.
Each `.profraw` file contains the profiling results for one run of an instrumented binary.

At this point, we'll have produced the artifacts that we generally require for the LLVM tools:
1. **The instrumented binary** which, in addition to the instrumented program,
contains additional information (e.g., the coverage mappings) required to
interpret the profiling results.
2. **The `.profraw` files** which essentially include the profiling results
(counter and expression values) for each function executed by the
instrumented program.

For step 3, the commands will depend on what kind of results we want.
Most likely we will have to merge the `.profraw` files and produce a `.profdata` file as follows:

```sh
llvm-profdata merge -sparse *.profraw -o output.profdata
```

Then, we can use a command such as

```sh
llvm-cov show target/debug/binary --instr-profile=output.profdata --show-line-counts-or-regions
```

to visualize the code coverage through the terminal as in the image:

![Source-based code coverage with `llvm-cov`](https://github.com/model-checking/kani/assets/73246657/4f8a973d-8977-4c0b-822d-e73ed6d223aa)

or the command

```sh
llvm-cov report target/debug/binary --instr-profile=output.profdata --show-region-summary
```

to produce coverage summaries like this:

```
Filename                                          Regions  Missed Regions  Cover   Functions  Missed Functions  Executed  Lines  Missed Lines  Cover   Branches  Missed Branches  Cover
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
/long/long/path/to/my/project/binary/src/main.rs        9               3  66.67%          3                 1    66.67%     14             4  71.43%         0                0      -
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                   9               3  66.67%          3                 1    66.67%     14             4  71.43%         0                0      -
```

[^note-source]: The LLVM project refers to their own coverage feature as "source-based code coverage".
It's not rare to see the term "region coverage" being used instead to refer to the same thing.
That's because LLVM's source-based code coverage feature can report coverage for code regions,
but other coverage frameworks don't support the concept of code regions.

### The Kani coverage workflow

The proposed Kani coverage workflow imitates the LLVM coverage workflow as much as possible.

The two main components of the Kani coverage workflow will be the following:
1. A new subcommand `cov` that drives the coverage workflow in Kani and
produces machine-readable coverage results.
2. A new tool `kani-cov` that consumes the machine-readable coverage results
emitted by Kani to produce human-readable results in the desired format(s).

The first part of the coverage workflow will be managed by Kani.
Then, we will use the `kani-cov` tool to produce the coverage
output(s) we're interested in.

In the following, we describe each of these components in more detail.

#### The `cov` subcommand

The coverage workflow will be kicked off through the new `cov` subcommand:

```sh
cargo kani cov
```

The main difference with respect to the regular verification workflow is that,
at the end of the verification-based coverage run, Kani will generate two types
of files:
- A single `.kanimap` file for the project. This file will contain the
coverage mappings for the project's source code.
- One `.kaniraw` file for each harness. This file will contain the
verification-based results for the coverage-oriented properties corresponding
to a given harness.

Note that `.kaniraw` files correspond to `.profraw` files in the LLVM coverage
workflow. Similarly, the `.kanimap` file corresponds to the coverage-related
information that's embedded into the project's binaries in the LLVM coverage
workflow.[^note-kanimap]

The files will be written into a new timestamped directory associated with the
coverage run. The path to this directory will be printed to standard output by
default. For example, the [draft implementation](https://github.com/model-checking/kani/pull/3119)
writes the coverage files into the `target/kani/<target_triple>/cov/` directory.

Users aren't expected to read the information in any of these files.
Therefore, there's no need to restrict their format.
The [draft implementation](https://github.com/model-checking/kani/pull/3119)
uses the JSON format, but we might consider switching to a binary format if JSON
doesn't scale.
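For illustration only, the data a `.kaniraw` file carries could be modeled along these lines. This is a purely hypothetical sketch, not the draft implementation's actual schema; all field and type names here are invented for the example.

```rust
// Hypothetical sketch of per-check coverage data; this is NOT the
// draft implementation's schema, just an illustration of the idea.
#[derive(Debug)]
struct CoverageCheck {
    function: String,   // function the code region belongs to
    region: (u32, u32), // (start line, end line) of the region
    covered: bool,      // verification-based result for this region
}

fn main() {
    let checks = vec![
        CoverageCheck { function: "classify".into(), region: (3, 4), covered: true },
        CoverageCheck { function: "classify".into(), region: (5, 6), covered: false },
    ];
    // A tool like `kani-cov` would aggregate such results into summaries.
    let covered = checks.iter().filter(|c| c.covered).count();
    println!("{}/{} regions covered", covered, checks.len());
    println!("first check: {:?}", checks[0]);
}
```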

[^note-kanimap]: Note that the `.kanimap` generation isn't implemented in [#3119](https://github.com/model-checking/kani/pull/3119).
The [draft implementation of `kani-cov`](https://github.com/model-checking/kani/pull/3121)
simply reads the source files referred to by the code coverage checks, but it
doesn't get information about code trimmed out by the MIR linker.

#### The `kani-cov` tool

The `kani-cov` tool will be used to process coverage information generated by
Kani and produce coverage outputs as indicated by the user.
Hence, the `kani-cov` tool corresponds to the set of LLVM tools
(`llvm-profdata`, `llvm-cov`, etc.) that are used to produce coverage outputs
through the LLVM coverage workflow.

In contrast to LLVM, we'll have a single tool for all Kani coverage-related needs.
We suggest that the tool initially offers three subcommands[^note-export]:
- `merge`: to combine the coverage results of one or more `.kaniraw` files into
a single `.kanicov` file, which will be required for the other subcommands.
- `report`: to display a summary of the coverage results.
- `show`: to produce source-based code coverage reports in human-readable formats (e.g., HTML).


Let's assume that we've run `cargo kani cov` and generated coverage files in the `my-coverage` folder.
Then, we'd use `kani-cov` as follows to combine the coverage results[^note-exclude] for all harnesses:

```sh
kani-cov merge my-coverage/*.kaniraw -o my-coverage.kanicov
```

Let's say the user is first interested in reading a coverage summary through the terminal.
They will have to use the `report` command for that:

```sh
kani-cov report my-coverage/default.kanimap -instr-profile=my-coverage.kanicov --show-summary
```

The command could print a coverage summary like:

```
| Filename | Regions | Missed Regions | Cover | Functions | ...
| -------- | ------- | -------------- | ----- | --------- | ...
| main.rs | 9 | 3 | 66.67 | 3 | ...
[...]
```

Now, let's say the user wants to produce an HTML report of the coverage results.
They will have to use the `show` command for that:

```sh
kani-cov show my-coverage/default.kanimap -format=html -instr-profile=my-coverage.kanicov -o coverage-report
```

This time, the command will generate a `coverage-report` folder including a
browsable HTML webpage that highlights the regions covered in the source
according to the coverage results in `my-coverage.kanicov`.

[^note-export]: The `llvm-cov` tool includes the option [`gcov`](https://llvm.org/docs/CommandGuide/llvm-cov.html#llvm-cov-gcov) to export into GCC's coverage format [Gcov](https://en.wikipedia.org/wiki/Gcov),
and the option [`export`](https://llvm.org/docs/CommandGuide/llvm-cov.html#llvm-cov-export) to export into the LCOV format.
I'd strongly recommend against adding format-specific options to `kani-cov`
unless there are technical reasons to do so.

[^note-exclude]: Options to exclude certain coverage results (e.g., from the standard library) will likely be added to this subcommand.

#### Integration with the Kani VS Code Extension

We will update the coverage feature of the
[Kani VS Code Extension](https://github.com/model-checking/kani-vscode-extension)
to follow this new coverage workflow.
In other words, the extension will first run Kani's `cov` subcommand and then
use `kani-cov` to produce a `.kanicov` file with the coverage results.
The extension will consume the source-based code coverage results and
highlight region coverage in the source code seen from VS Code.

We could also consider other coverage-related features in order to enhance the
experience through the Kani VS Code Extension. For example, we could
automatically show the percentage of covered regions in the status bar by
additionally extracting a summary of the coverage results.

Finally, we could also consider an integration with other code coverage tools.
For example, if we wanted to integrate with the VS Code extensions
[Code Coverage](https://marketplace.visualstudio.com/items?itemName=markis.code-coverage) or
[Coverage Gutters](https://marketplace.visualstudio.com/items?itemName=ryanluker.vscode-coverage-gutters),
we would only need to extend `kani-cov` to export coverage results to the LCOV
format or integrate Kani with LLVM tools as discussed in [Integration with LLVM](#integration-with-llvm).

## Detailed Design

THIS SECTION INTENTIONALLY LEFT BLANK.

## Rationale and alternatives

### Other coverage implementations

In a previous version of this feature, we used an ad-hoc coverage implementation.
In addition to being very inefficient[^note-benchmarks], the line-based coverage
results were not trivial for users to interpret.
At the moment, the only other unstable code coverage implementation is a
GCC-compatible one based on the Gcov format. The Gcov format is line-based, so
it cannot report region coverage results.
In other words, it's neither as advanced nor as precise as the source-based implementation.
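To make the line-versus-region difference concrete, here is a short Rust illustration (the function is hypothetical): a single source line can contain several code regions, so line coverage can report 100% while region coverage correctly reports a miss.

```rust
// One line, three regions: the condition, the `then` value, the `else` value.
fn sign_label(x: i32) -> &'static str {
    if x >= 0 { "non-negative" } else { "negative" }
}

fn main() {
    // This call touches the whole *line* containing the `if`, yet the
    // `else` region is never executed: line coverage reports the line as
    // covered, while region coverage reports the "negative" region as missed.
    assert_eq!(sign_label(1), "non-negative");
}
```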

[^note-benchmarks]: Actual performance benchmarks to follow in [#3119](https://github.com/model-checking/kani/pull/3119).

## Open questions

- Do we want to instrument dependencies by default? Preliminary benchmarking results show slowdowns of 100% or more.
More evaluations are required to determine how we handle instrumentation for dependencies, and what options we might want
to provide to users.
- How do we handle features/options for `kani-cov`? In particular, do we need more details in this RFC?

## Future possibilities

### Integration with LLVM

We don't pursue an integration with the LLVM framework in this RFC.
We recommend against doing so at this time due to various technical limitations.
In a future revision, I'll explain these limitations and what steps we can take to overcome them.
