Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

temp file race conditions when invoking tile() in parallel #18

Closed
achubaty opened this issue Oct 7, 2022 · 10 comments
Closed

temp file race conditions when invoking tile() in parallel #18

achubaty opened this issue Oct 7, 2022 · 10 comments

Comments

@achubaty
Copy link
Contributor

achubaty commented Oct 7, 2022

Calling tile() in a parallel (i.e., multiple concurrent tile calls) produces the following problems:

  1. python 'missing file' errors on the nodes due to disappearance of the g2ttmp directory.
g2t_tmp_dir <- file.path(tempdir(), "g2ttmp")
## and later
unlink(g2t_tmp_dir, recursive = TRUE, force = TRUE)
...

In order to make this work, each nodes needs its own tmpdir. The following change does the trick, provided the rasters are all of the supported type:

g2t_tmp_dir <- tempfile("g2ttmp_")
  1. attempts to use the same temp file when rasters are not one of the supported filetypes:
## this fliename is hardcoded to be the same for all nodes :(
file <- file.path(tempdir(), "tmp_raster.tif")

Unfortunately this is a more involved fix because that file name is hardcoded in multiple places outside of tile(), so would need to pass the temp file name through every other function that checks for that file. Otherwise, I'd use tempfile("tmp_raster_", fileext = ".tif") here too.

achubaty added a commit to achubaty/tiler that referenced this issue Oct 7, 2022
@leonawicz
Copy link
Member

This request I agree it would be great for parallel processing. I don't have time to dig into this one myself very soon. But if you can make a PR, I can review it, at least to get the process started.

Are you suggesting append a random string of characters to g2t_tmp_dir <- file.path(tempdir(), "g2ttmp_")?

It sounds like (2) can be addressed independently from (1).

Regards,
Matt

@achubaty
Copy link
Contributor Author

achubaty commented Oct 7, 2022

Yes, sorry I just edited my OP.

For 1) I suggest g2t_tmp_dir <- tempfile("g2ttmp_") to add the random characters; 2) can be done separately.

I have (1) implemented in my fork and I am testing with some of my projects. Will PR when I'm happy with it (though note my fork also implements #19).

@leonawicz
Copy link
Member

Thank you! That sounds good to me, feel free to combine it all in one PR.

@achubaty
Copy link
Contributor Author

achubaty commented Oct 7, 2022

update: changes for number 1 seem fine -- it's number 2 that's biting me still.

@leonawicz
Copy link
Member

@achubaty Hi Alex, it's been a while; I know this was merged, but are we good to close the issue or is there still something outstanding?
Regards,
Matt

@achubaty
Copy link
Contributor Author

Hi Matt, the first part is completed, but I haven't come back to deal with the second part of this issue ('attempts to use the same temp file when rasters are not one of the supported filetypes') - so this piece is still outstanding.

@achubaty
Copy link
Contributor Author

@leonawicz Hi again, I started working on this but encountered an issue when running tests using the latest raster package. So this is currently a work in progress.

achubaty added a commit to achubaty/tiler that referenced this issue Jul 17, 2023
@leonawicz
Copy link
Member

Hi, yeah I encountered a similar problem with a couple unit tests involving raster stacks last week and had to skip them for now.

There may be an option to remove raster from the package, exclusively in favor of terra, but I don't have the bandwidth to look into that and it would be a whole separate issue. Even the packaged test files created in data-raw/data.R use raster.

My next goal is to at least republish to CRAN the current version on master, to address critical dependency issues #22

If we can fold in completion of this issue with the next release, that would be a bonus, but it's also fine if we don't.

@leonawicz
Copy link
Member

CRAN submission is offline until August 7. I will submit afterward. But everything looks to be passing. :)

@achubaty
Copy link
Contributor Author

achubaty commented Oct 5, 2023

Hi, yeah I encountered a similar problem with a couple unit tests involving raster stacks last week and had to skip them for now.

There may be an option to remove raster from the package, exclusively in favor of terra, but I don't have the bandwidth to look into that and it would be a whole separate issue. Even the packaged test files created in data-raw/data.R use raster.

My next goal is to at least republish to CRAN the current version on master, to address critical dependency issues #22

If we can fold in completion of this issue with the next release, that would be a bonus, but it's also fine if we don't.

@leonawicz this has been fixed upstream in raster

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants