implement parallel job execution #1562

Draft · syphar wants to merge 12 commits into master

Conversation

@syphar commented Mar 4, 2023

Seeing #626, I started investing some time into a draft for parallel execution.

This PR would:

  • run dependencies in parallel when the recipe has the [parallel] attribute
  • run the tasks given on the command line in parallel when the --parallel argument is passed

while keeping execution/dependency order, even for dependencies of dependencies, etc.

This implementation here compiles & works.

From my side, to finish this up, I would:

  • re-add the ran logic to prevent duplicate task execution, probably by wrapping this in an Arc<Mutex<>>, or some message passing.
  • get rid of the repetition / duplication
  • add a --parallel argument to control if the main tasks are run in parallel
  • add documentation
  • add tests (I haven't checked yet if / how the current test suite covers this)
  • implement whatever else needs to be added

From @casey (or another maintainer?) I would need feedback on:

  • should we do it the same way as Taskfile, so dependencies are run in parallel by default? IMO this would then be a backwards incompatible change.
  • or should this also be controlled by the --parallel argument?
  • or optional via an annotation?
  • what should happen when one of the parallel tasks fails? Wait for the rest? Or cancel?

Also, std::thread::scope was only (relatively) recently added in Rust 1.63, but if we need to support a lower MSRV we can switch to using crossbeam::scope.
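
For illustration, a minimal sketch of what std::thread::scope gives us (borrowing data from the enclosing stack frame, with all threads joined when the scope ends); crossbeam::scope exposes an equivalent API on older toolchains. The recipe names and the closure body here are placeholders, not code from this PR:

use std::thread;

fn main() {
  let recipes = vec!["build", "lint", "test"];

  thread::scope(|scope| {
    // Scoped threads may borrow `recipes` because the scope guarantees
    // they finish before the borrow ends.
    for recipe in &recipes {
      scope.spawn(move || {
        // Placeholder for actually running the recipe.
        println!("running {recipe}");
      });
    }
    // All spawned threads are joined automatically when the scope ends.
  });
}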

@syphar (Author) commented Mar 6, 2023

Personally, the parallel::task_scope design feels a little too complex, but building a Justfile::run_parallel that works for both the main task execution and running dependencies would IMO need more refactoring, and it also felt like the call sites became more complex.

@syphar marked this pull request as ready for review March 11, 2023 13:03
@syphar changed the title from "(RFC) start implementing parallel job execution" to "implement parallel job execution" Mar 11, 2023
@d4h0 commented Mar 11, 2023

@syphar: Thanks for working on this!

Regarding [parallel]:

[parallel]
foo: recipe_1 recipe_2
  echo hello

Somehow I don't have a good feeling about that syntax. [parallel] decorates the task, but acts only on dependencies.

Is there a reason why something like the following wouldn't work?

foo: parallel(recipe_1 recipe_2)
  echo hello

That seems more specific to me.

@syphar (Author) commented Mar 11, 2023

> @syphar: Thanks for working on this!
>
> Regarding [parallel]:
>
> [parallel]
> foo: recipe_1 recipe_2
>   echo hello
>
> Somehow I don't have a good feeling about that syntax. [parallel] decorates the task, but acts only on dependencies.

This came from this comment by @casey: #626 (comment)

> Is there a reason why something like the following wouldn't work?
>
> foo: parallel(recipe_1 recipe_2)
>   echo hello
>
> That seems more specific to me.

This would definitely be more complex to implement, since it is a change to the file format.
Also, with your syntax, I would expect this to work correctly:

foo: parallel(recipe_1 recipe_2) recipe_3
  echo hello

where only recipe_1 and recipe_2 would be executed in parallel.

@d4h0 commented Mar 11, 2023

> Also, in your syntax, I would expect this to work correctly:

I think that would be an advantage. Basically, we'd be able to define groups of dependency tasks:

foo: dependency_of_dependencies parallel(dep_1 dep_2) teardown_of_dependencies
  echo hello

@syphar (Author) commented Mar 12, 2023

> > Also, in your syntax, I would expect this to work correctly:
>
> I think that would be an advantage. Basically, we'd be able to define groups of dependency tasks:
>
> foo: dependency_of_dependencies parallel(dep_1 dep_2) teardown_of_dependencies
>   echo hello

I'm not sure if it's worth adding new syntax for this feature, especially when you could do the same with another parallel rule.

Also, I personally probably won't be able to invest much additional time in digging into how the parser/lexer works to change the syntax.

@ykis-0-0 commented Jun 17, 2023

Hello, thanks for bringing this good feature, @syphar!

Dropping my two cents: what if, instead of dependencies, the [parallel] attribute denoted that the task itself can be run in parallel?

And if parallel groups are desired, we could do this inside the attribute declaration, like [parallel(tests)]?

I think this may solve the concerns of @d4h0:

> [parallel] decorates the task, but acts only on dependencies.

Which may lead to something like:
main: branch_a1 branch_a2 branch_b1 branch_b2
  echo "main!"

[parallel(a)]
branch_a1:
  echo "a1"

[parallel(a)]
branch_a2:
  echo "a2"

[parallel(b)]
branch_b1:
  echo "b1"

[parallel(b)]
branch_b2:
  echo "b2"

@syphar (Author) commented Jun 17, 2023

@ykis-0-0 thank you for the feedback!
Your idea is definitely valuable, and we can take it into account when deciding on next steps.

But since there is no feedback from any maintainer yet, I won't invest (sink) more time into this feature without knowing whether a specific (or even any) design would be accepted and merged at some point.

Don't get me wrong, I'm happy to invest more time if there is a good chance of a certain design being merged at some point.

@marziply mentioned this pull request Aug 1, 2023
@supersmile2009 commented
Hey guys. I was looking for a replacement for Just on some projects, since it doesn't support task parallelization (and in my opinion that's a must these days). Just itself is super fast, which is great and refreshing, especially compared to annoyingly slow npm scripts. But without task parallelization it falls short compared to some other tools.

While doing my research I noticed this PR, so I thought I'd add my 5c, as I really like Just for its simplicity and minimalism and would actually prefer to stay with it if it supported this feature.

I would take the approach used by a competing tool, Task (Taskfile): all dependencies run in parallel by default. If you need to run some dependencies sequentially, you run them explicitly as part of the task script.

With Just that would translate to

foo:
  echo "foo"

bar:
  echo "bar"

# foobar runs foo and bar in parallel, no annotations needed
foobar: foo bar

baz:
  echo "baz"

# runs commands sequentially
do-something:
  # run foobar and baz dependencies sequentially
  # though under the hood foobar, as we already know, will run foo and bar in parallel
  just foobar
  just baz
  echo "actual script of do-something recipe starts now"

Note: there are of course known caveats to running just from within a recipe, as mentioned here, but that's a whole different topic and could be solved separately.

However, I understand going parallel by default would be a significant breaking change, so even if such a dramatic change were considered by the maintainer, it could only land in the next major version. I guess the [parallel] annotation is a decent middle ground.

As for the possibility of mixing sequential and parallel recipes mentioned in the comments above:

> foo: dependency_of_dependencies parallel(dep_1 dep_2) teardown_of_dependencies
>   echo hello

IMO that's redundant. If you need to mix and match, just create another sequential recipe and make the parallel one a dependency. Composition FTW. With the "parallel by default" approach, the first snippet in my message shows how that would work, essentially by calling just from the recipe (at least in the beginning, unless some more advanced recipe-referencing mechanism is implemented). And with the syntax proposed by this PR it would look like:

[parallel]
deps: dep_1 dep_2

foo: dependency_of_dependencies deps teardown_of_dependencies
  echo hello

Also, even though I'm not a maintainer, I'd like to address the question from the PR description from the user perspective:

> what should happen when one of the parallel tasks fails? Wait for the rest? Or cancel?

This is a very common case for any tool that does parallel execution. I believe the most common approach is this:

  • by default, keep running the other tasks until they finish, and at the end show aggregated output
  • optionally accept a config/CLI option to fail fast (e.g. --stop-on-failure, or [parallel(stop-on-failure)] in the case of the annotation). Providing this option would kill the remaining tasks if at least one fails, and exit
  • if any task fails, return a non-zero exit code

One example of similar behavior outside of the task-runner world would be various testing frameworks capable of test parallelization. You'll likely find the above-mentioned solution the most common among them (it certainly is with the tools I have worked with, e.g. jest, cypress with cypress-parallel, PHPUnit with paratest, etc.).

IMO, use of the fail-fast option should be an explicit user choice, as killing some of the tasks could have detrimental effects (e.g. you probably don't want to kill your terraform apply mid-execution). Therefore it should wait for all dependencies by default. Unfortunately, some tools don't do that (like the previously mentioned Task (Taskfile)).
Generally you want to use the fail-fast approach in two cases:

  • to avoid wasting time when you know the run won't succeed overall (e.g. killing a stateless job that is part of a build/test pipeline in CI)
  • when debugging a failing task

and only when it's known to be safe to do so, or you understand the risk.

@casey mentioned this pull request Nov 22, 2023
@casey (Owner) commented Nov 22, 2023

Sorry for being so delinquent reviewing this! Another project of mine has gotten popular this year, so I haven't had as much time for just as I would like.

I started reviewing this, and this looks like a good approach.

I wasn't sure if I liked the task scope, so I tried to rewrite it without it. I came up with a version that uses Rust's native scoped threads and mpsc channels to communicate errors back, so that it can error out as soon as any thread returns an error, instead of waiting for each thread in turn to complete.

It's in #1738, what do you think?
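
For readers following along, here is a rough sketch of that shape: scoped threads plus an mpsc channel carrying errors back to the main thread. The job list, error type, and run_job helper are placeholders, not the actual code in #1738; note also that the scope still joins every thread before returning, which is the early-return limitation discussed further down.

use std::sync::mpsc;
use std::thread;

fn run_all(jobs: Vec<&str>) -> Result<(), String> {
  let (tx, rx) = mpsc::channel::<String>();

  thread::scope(|scope| {
    for job in jobs {
      let tx = tx.clone();
      scope.spawn(move || {
        // Placeholder for actually running a recipe; report any error.
        if let Err(error) = run_job(job) {
          let _ = tx.send(error);
        }
      });
    }
    // Drop the original sender so `recv` returns once all jobs are done.
    drop(tx);

    // First error wins; if every sender finished without sending anything,
    // recv returns a disconnect error and we report overall success.
    match rx.recv() {
      Ok(error) => Err(error),
      Err(_) => Ok(()),
    }
    // Leaving the scope still waits for any jobs that are still running.
  })
}

// Hypothetical stand-in for executing a single recipe.
fn run_job(job: &str) -> Result<(), String> {
  println!("running {job}");
  Ok(())
}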

@casey (Owner) left a comment

Okay, back to this PR. I've definitely come around to your approach, i.e., creating a struct which handles scope creation and abstracting whether or not to run in parallel.

I used std::sync::mpsc in #1738, because I thought it helped with early return, but it doesn't, so using join handles to communicate errors back to the main thread seems like the way to go.

  • Let's implement this in terms of std::thread::scope instead of crossbeam::scope. It would be nice to use the stdlib, unless it's not possible or there's a compelling reason to use crossbeam.
  • Let's change Ran into a type alias: type Ran = Arc<Mutex<HashSet<Vec<String>>>>;. It can then be initialized with Ran::default(). insert and contains will be a little noisy, but it's nice to avoid a wrapper type. It can live in src/lib.rs. (A short sketch of this follows after this list.)
  • This is a big feature, so let's land it as unstable first, so we can get it in and iterate on it. You can easily generate the right error with config.require_unstable("The `--parallel` flag is currently unstable.")?;
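
For concreteness, a short sketch of that alias in use; only the alias itself comes from the suggestion above, while the key scheme (recipe name plus arguments) and the surrounding code are illustrative:

use std::collections::HashSet;
use std::sync::{Arc, Mutex};

// Shared, thread-safe record of which recipe invocations have already run.
type Ran = Arc<Mutex<HashSet<Vec<String>>>>;

fn main() {
  let ran = Ran::default();

  // Hypothetical key: recipe name plus its arguments.
  let key = vec!["build".to_string(), "--release".to_string()];

  // `insert` returns true when the key was not present yet, so checking
  // and marking happen under a single lock acquisition.
  if ran.lock().unwrap().insert(key) {
    // ... run the recipe ...
  }
}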

Things we can figure out later:

  • Are dependencies handled correctly? I think maybe if you have:
[parallel]
foo: a b

a: c

b: c

Then a and b might wind up launching c independently.

  • How to do early return? We'll probably have to thread through a Mutex<bool> used to signal whether an error has occurred on another thread, plus a helper function that launches a command and then loops: checking the mutex for an error, checking whether the child has exited, and sleeping for 100ms or so. (A sketch of such a helper follows after these items.)

  • Are chars written to stdout intermingled?
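
To make the early-return idea concrete, a rough sketch of such a polling helper; the Mutex<bool> flag and the 100ms interval follow the comment above, while the signature and error handling are placeholders:

use std::process::Command;
use std::sync::Mutex;
use std::thread;
use std::time::Duration;

// Runs one command while watching a shared failure flag; returns whether
// this command succeeded.
fn run_with_early_exit(mut command: Command, failed: &Mutex<bool>) -> std::io::Result<bool> {
  let mut child = command.spawn()?;

  loop {
    // Another job already failed: kill this child and give up.
    if *failed.lock().unwrap() {
      let _ = child.kill();
      let _ = child.wait();
      return Ok(false);
    }

    // Non-blocking check whether this child has exited.
    if let Some(status) = child.try_wait()? {
      if !status.success() {
        *failed.lock().unwrap() = true;
      }
      return Ok(status.success());
    }

    thread::sleep(Duration::from_millis(100));
  }
}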

@@ -225,4 +225,3 @@ pwd:
# Local Variables:
# mode: makefile
# End:
# vim: set ft=make :
@casey (Owner):

Is there a reason to remove this?

@@ -12,6 +12,7 @@ pub(crate) enum Attribute {
NoCd,
NoExitMessage,
Private,
Parallel,
@casey (Owner):

These should be sorted.

@@ -1,12 +1,14 @@
use parallel::Ran;
use std::sync::Arc;
@casey (Owner):

We can move this use into src/lib.rs so it doesn't need to be imported in multiple files.

evaluated.push(
  evaluator
    .evaluate_expression(argument)
    .expect("error evaluating expression"),
@casey (Owner):

task_scope should accept a function that returns an error, so we can use ? here.


type ScopeResult<'src> = RunResult<'src, ()>;

pub(crate) struct TaskScope<'env, 'src, 'inner_scope> {
@casey (Owner):

Suggested change
pub(crate) struct TaskScope<'env, 'src, 'inner_scope> {
pub(crate) struct Parallelizer<'env, 'src, 'inner_scope> {

/// The first error will be returned as result of this `task_scope`.
///
/// Only works for tasks with an `RunResult<'src, ()>` result type.
pub(crate) fn task_scope<'env, 'src, F>(parallel: bool, f: F) -> ScopeResult<'src>
@casey (Owner):

Let's change this to a function on Parallelizer:

impl Parallelizer {
  fn parallelize<'src, F>(parallel: bool, f: F) -> RunResult<'src, ()>
  where
    F: FnOnce(&mut Self) -> RunResult<'src, ()>,
  {
    thread::scope(|scope| {
      let mut parallelizer = Self {
        parallel,
        scope,
        join_handles: Vec::new(),
      };

      f(&mut parallelizer)?;

      for handle in parallelizer.join_handles {
        handle.join().expect("could not join thread")?;
      }

      Ok(())
    })
  }
}

@tgross35 (Contributor) commented Dec 4, 2023

How does this handle output?

Ninja does buffering for parallel jobs, and that is nice because it is easy to trace back where errors come from. From their docs:

> Command output is always buffered. This means commands running in parallel don't interleave their output, and when a command fails we can print its failure output next to the full command line that produced the failure.
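
For reference, the simplest way to get that behavior would be to capture each child's output and print it in one piece once the command finishes; this is an illustrative sketch, not what this PR currently does:

use std::io::{self, Write};
use std::process::Command;

// Run one job with its output fully buffered, then emit it in one piece so
// parallel jobs cannot interleave their output.
fn run_buffered(program: &str, args: &[&str]) -> io::Result<bool> {
  let output = Command::new(program).args(args).output()?;

  let mut stdout = io::stdout().lock();
  writeln!(stdout, "$ {} {}", program, args.join(" "))?;
  stdout.write_all(&output.stdout)?;
  stdout.write_all(&output.stderr)?;

  Ok(output.status.success())
}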

@syphar (Author) commented Mar 26, 2024

I just wanted to say that I haven't forgotten about this PR, and I still want to finish up all the details at some point.

(My open-source time was/is focused on some major docs.rs improvements, and I'm hoping to come back to this change when that's done.)

@nmay231 mentioned this pull request Apr 15, 2024
@casey marked this pull request as draft May 25, 2024 07:32
@gabyx commented Jul 10, 2024

@syphar: So happy if you finish this up, it's a game changer!
