Implement round-trip fuzzers for finding correctness bugs #4559

addisoncrump · 2023-06-11T03:56:08Z

Summary

This PR implements fuzzers for testing the correctness of the parser and the formatter. These fuzzers will identify invalid UTF-8 indexing issues, panics/unreachables, logic errors, and violations of the round-trip property of parsing and formatting. Much of the source code and layout is based on the recent ruff fuzzer.
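To make the round-trip property concrete, here is a minimal sketch on a toy language of whitespace-separated integers. The names `parse`, `print`, and `check_round_trip` are hypothetical stand-ins and are not taken from Rome or from this PR; a real target would call the actual parser and formatter and would typically panic instead of returning `Err`.

```rust
// Toy stand-ins for a real parser and printer; none of these names come from Rome.

/// Parse whitespace-separated integers; returns None on a syntax error.
fn parse(src: &str) -> Option<Vec<i64>> {
    src.split_whitespace()
        .map(|tok| tok.parse::<i64>().ok())
        .collect()
}

/// Print a parsed tree back to source text, one space between items.
fn print(tree: &[i64]) -> String {
    tree.iter()
        .map(|n| n.to_string())
        .collect::<Vec<_>>()
        .join(" ")
}

/// Body of a hypothetical fuzz target: Err describes a property violation.
fn check_round_trip(data: &[u8]) -> Result<(), String> {
    // Fuzzers feed arbitrary bytes; skip inputs that are not valid UTF-8.
    let Ok(src) = std::str::from_utf8(data) else {
        return Ok(());
    };
    // Inputs the parser rejects are not interesting for this property.
    let Some(tree) = parse(src) else {
        return Ok(());
    };
    let printed = print(&tree);
    // Re-parsing the printed text must succeed and yield the same tree.
    match parse(&printed) {
        Some(reparsed) if reparsed == tree => Ok(()),
        Some(reparsed) => Err(format!("tree changed: {tree:?} -> {reparsed:?}")),
        None => Err(format!("printer emitted unparseable text: {printed:?}")),
    }
}
```

In a cargo-fuzz target, the body of `check_round_trip` would sit inside the fuzz entry point, with the `Err` cases turned into assertions so the fuzzer records them as crashes.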

I will open issues detailing bugs identified by the fuzzer in the coming days. At time of writing, it is quite late at night. I have marked this PR as a draft as I still need to add some documentation regarding the fuzzers both in the source files and in the README, and add the fuzzer builds to the CI.

Test Plan

This adds additional testing features.

Changelog

The PR requires a changelog line

Documentation

The PR requires documentation
I will create a new PR to update the documentation

netlify · 2023-06-11T03:56:18Z

✅ Deploy Preview for docs-rometools canceled.

Built without sensitive environment variables

Name	Link
🔨 Latest commit	`08047c0`
🔍 Latest deploy log	https://app.netlify.com/sites/docs-rometools/deploys/64898d47d8c1c00008753fe3

Boshen · 2023-06-11T04:29:58Z

I saw your amazing work on Ruff! This is some amazing work that I want to steal for my project https://github.com/Boshen/oxc as well 😁

jasikpark

nice

addisoncrump · 2023-06-11T18:39:29Z

I'm noticing that the formatter is introducing syntax errors even into the samples I'm pulling from the repository. I think my assumptions about the formatter properties are too strict... Definitely need a second set of eyes here.

denbezrukov · 2023-06-12T07:44:09Z

Great PR!

Have you seen astral-sh/ruff#3721 (comment) ?
I'm wondering if we can combine both tests. #4323

addisoncrump · 2023-06-12T15:24:52Z

Certainly. This solution is more oriented towards CI pipelines and continuous testing, but its test case minimisation strategy could be used to reduce the broken files generated in the other issue into minimum reproductions.

That said, I do think this strategy potentially covers all the test cases that the file generator will. We would just need to add fuzzers for the linter. The fuzzers here will also find violations that the other testing strategy cannot, because they check property oracles while at the same time potentially triggering crashes.

Potentially, the best thing we could do is use the broken file generator to create a corpus of inputs. There aren't a lot of TypeScript source code corpora out there 🙂

Let me clarify this a bit further.

Recently, there was a work called Fuzztruction, which showed that erroneous input generation can accelerate a fuzzer's exploration of program coverage. In some cases, the input generator on its own was able to explore coverage well, but in many cases it required a fuzzer to be used in parallel. Moreover, I've evaluated this work myself and found that its performance drops significantly when few generators run in parallel, at which point it is squarely outperformed by input generation plus a fuzzer. Note also that the corpora used for the experiments in this paper are very small; the results may not be comparable to a high-performing fuzzer with a strong corpus.

Since the ultimate purpose of this is to identify bugs in code for which there are insufficient unit tests, we want to keep these runs small and use a relatively small amount of compute resources (that way, we can put it in CI). Input generation, combined with fuzzing, works well for long runs with high parallelism, but a strong corpus and a simple fuzzer will outperform even the combination of the two in short runs.

fuzz/Cargo.toml

fuzz/init-fuzzer.sh

.github/workflows/pull_request.yml

ematipico · 2023-06-12T17:06:45Z

> I'm noticing that the formatter is introducing syntax errors even into the samples I'm pulling from the repository. I think my assumptions about the formatter properties are too strict... Definitely need a second set of eyes here.

How can we reproduce the issue? Do you have some sample of broken code, so we can help?

denbezrukov · 2023-06-12T17:42:02Z

I'd like to suggest one more variant to check: "formatted code should pass lint".
What do you think?

addisoncrump · 2023-06-12T17:42:30Z

> I'd like to suggest one more variant to check: "formatted code should pass lint". What do you think?

What happens if the original code doesn't pass lint?

denbezrukov · 2023-06-12T17:44:09Z

Oh, it's a good point.
"Doesn't produce more errors"?:)

addisoncrump · 2023-06-12T17:54:41Z

> How can we reproduce the issue? Do you have some sample of broken code, so we can help?
@ematipico

You can run rome_format_all from the Rome root:

$ cargo fuzz run --features rome_all -s none rome_format_all ./crates/rome_js_parser/test_data/inline/ok/ts_instantiation_expression_property_access.ts
$ cargo fuzz run --features rome_all -s none rome_format_all ./crates/rome_js_parser/test_data/inline/ok/ts_class_property_member_modifiers.ts
$ cargo fuzz run --features rome_all -s none rome_format_all ./crates/rome_js_parser/test_data/inline/ok/ts_instantiation_expressions.ts

These three test cases all cause the formatter to introduce a syntax error. Let me update the fuzzer to emit a text diff so this is easier to see.
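As a rough illustration of what such a text diff could look like, here is a naive sketch that compares the two texts line by line at the same position. The function name `line_diff` is hypothetical and this is not the diffing used in the PR; it does no alignment, so an inserted line makes every following line show as changed. A proper diff algorithm or crate would do better.

```rust
/// Naive positional line diff: unchanged lines are prefixed with two spaces,
/// lines only in `before` with "- ", lines only in `after` with "+ ".
/// Hypothetical helper for illustration; it does not align shifted lines.
fn line_diff(before: &str, after: &str) -> String {
    let a: Vec<&str> = before.lines().collect();
    let b: Vec<&str> = after.lines().collect();
    let mut out = String::new();
    for i in 0..a.len().max(b.len()) {
        match (a.get(i), b.get(i)) {
            // Same text at the same position: context line.
            (Some(x), Some(y)) if x == y => out.push_str(&format!("  {x}\n")),
            // Otherwise report whichever sides exist at this position.
            (x, y) => {
                if let Some(x) = x {
                    out.push_str(&format!("- {x}\n"));
                }
                if let Some(y) = y {
                    out.push_str(&format!("+ {y}\n"));
                }
            }
        }
    }
    out
}
```

Printing `line_diff(original, formatted)` in the failure message would show at a glance where the formatter changed the input.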

addisoncrump · 2023-06-12T17:55:07Z

> Oh, it's a good point. "Doesn't produce more errors"?:)

This is good 🙂 Let me try it.
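A minimal sketch of the "doesn't produce more errors" oracle, assuming the formatter and linter can be treated as plain functions over source text. `no_new_diagnostics` and `lint_no_var` are hypothetical names for illustration, not Rome APIs, and comparing only the diagnostic count is a simplification.

```rust
/// Oracle: formatting must not increase the number of lint diagnostics.
/// `format` and `lint` are passed in so the check is independent of any tool.
fn no_new_diagnostics<F, L>(src: &str, format: F, lint: L) -> Result<(), String>
where
    F: Fn(&str) -> String,
    L: Fn(&str) -> Vec<String>,
{
    let before = lint(src);
    let formatted = format(src);
    let after = lint(&formatted);
    if after.len() > before.len() {
        Err(format!(
            "formatter introduced diagnostics: {} before, {} after: {:?}",
            before.len(),
            after.len(),
            after
        ))
    } else {
        Ok(())
    }
}

/// Toy lint rule (an assumption for this example): flag lines containing `var `.
fn lint_no_var(src: &str) -> Vec<String> {
    src.lines()
        .enumerate()
        .filter(|(_, line)| line.contains("var "))
        .map(|(i, _)| format!("line {}: unexpected var", i + 1))
        .collect()
}
```

With this shape, input that already fails lint is still usable as a test case: the check only fails when the formatted output has more diagnostics than the original.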

denbezrukov · 2023-06-12T18:24:45Z

> > Oh, it's a good point. "Doesn't produce more errors"?:)
>
> This is good 🙂 Let me try it.

I believe #4553 is an example of this.

ematipico · 2023-06-12T19:08:13Z

It seems that the CI is failing

addisoncrump · 2023-06-12T19:21:59Z

Ah, on my system, sh is symlinked to bash. I will fix.

addisoncrump · 2023-06-12T20:45:43Z

@denbezrukov I believe I have implemented your suggestion. Check it out 🙂

addisoncrump · 2023-06-12T21:45:34Z

The formatter fuzzers are now extremely aggressive, and flag many failure cases. However, I'm not sure if all of these failure cases are considered bugs. Would a maintainer inspect the fuzz_js_formatter_with_source_type function in fuzz/fuzz_targets/rome_common.rs and verify that the assertions present should hold no matter the input? If so, I will begin submitting issues with discovered failing testcases, minimised with root cause.

jasikpark · 2023-06-12T23:58:23Z

fuzz/README.md

+ - Formatting code twice will have the same result as formatting code once
+
+In this way, we verify the [idempotency](https://en.wikipedia.org/wiki/Idempotence) and syntax
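The idempotency property in the README excerpt above can be sketched as follows. `check_idempotent` and `normalize` are hypothetical names, and the toy formatter only collapses whitespace; the real fuzzers would run Rome's formatter instead.

```rust
/// Check that formatting twice gives the same text as formatting once.
fn check_idempotent<F>(src: &str, format: F) -> Result<(), String>
where
    F: Fn(&str) -> String,
{
    let once = format(src);
    let twice = format(&once);
    if once == twice {
        Ok(())
    } else {
        Err(format!("not idempotent: {once:?} became {twice:?}"))
    }
}

/// Toy formatter (an assumption for this example): collapse runs of
/// whitespace inside each line to a single space and trim the line ends.
fn normalize(src: &str) -> String {
    src.lines()
        .map(|line| line.split_whitespace().collect::<Vec<_>>().join(" "))
        .collect::<Vec<_>>()
        .join("\n")
}
```

A formatter that strips only one leading space per call would fail this check on input such as `"  x"`, since the first pass yields `" x"` and the second yields `"x"`.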


jasikpark · 2023-06-12T23:59:40Z

fuzz/init-fuzzer.sh

@@ -11,13 +11,15 @@ fi

 if [ ! -d corpus/rome_format_all ]; then


Would a build.rs be crazy here, instead of a shell script?

ematipico · 2023-06-13T04:27:26Z

Seems like the fuzzer couldn't compile https://github.com/rome/tools/actions/runs/5249047207/jobs/9485118431#step:4:1

addisoncrump · 2023-06-13T10:55:58Z

> Seems like the fuzzer couldn't compile https://github.com/rome/tools/actions/runs/5249047207/jobs/9485118431#step:4:1

Bleh, yeah, I've seen this locally. Some linkage issue on 1.69; it works just fine on 1.70.

ematipico · 2023-06-13T11:06:09Z

I hope to get this merged soon #4563

addisoncrump · 2023-06-13T16:50:04Z

Compiler is OOMing for the fuzzer 😬 I'll try to resolve this locally.

Boshen mentioned this pull request Jun 11, 2023

Round-trip fuzzers oxc-project/oxc#427

Closed

addisoncrump changed the title ~~Implement round-trip fuzzers for finding parsing correctness bugs~~ Implement round-trip fuzzers for finding correctness bugs Jun 11, 2023

jasikpark reviewed Jun 11, 2023

addisoncrump marked this pull request as ready for review June 11, 2023 18:38

denbezrukov mentioned this pull request Jun 12, 2023

🐛 Big pack of js/ts files that crashes rome #4323

Open

ematipico reviewed Jun 12, 2023

addisoncrump force-pushed the main branch from d2729e1 to 4078243 Compare June 12, 2023 22:13

jasikpark reviewed Jun 12, 2023

addisoncrump force-pushed the main branch from 4078243 to 122439e Compare June 13, 2023 13:53

addisoncrump added 14 commits June 14, 2023 11:49

6748746 init fuzzers
1a2ed62 correct corpus link
0676030 add more fuzzers
008af4e add formatter fuzzers
7bbfebb document formatter strategy
906bee6 add fuzzer build to CI
83dd602 better github workflow
ed1fc0e whoops, need to specify where it runs
5539b4b fix CI
ce7c470 address naming nit
3db6138 add text diff to formatter
eaef5de add linter checks to formatter output
e4d87d2 correct diff args
08047c0 use strip dead code (ew) to resolve the memory usage issue

addisoncrump force-pushed the main branch from 9859a50 to 08047c0 Compare June 14, 2023 09:49

ematipico approved these changes Jun 14, 2023

ematipico merged commit 171bc0f into rome:main Jun 14, 2023

addisoncrump mentioned this pull request Jun 14, 2023

Update fuzzers for 1.70 and integrate with CI #4570

Draft

3 tasks
