Support SWMR in HDF5 #1448

eivindlm · 2019-07-31T14:28:16Z

This PR is an attempt to add support for the SWMR feature found in HDF5 >= 1.10.

This should allow concurrent access to an HDF5 file from multiple processes, as long as there is a single writer process [1,2]. The use case is quite common: a simulation service writes results to a file, and one or more "readers" wish to get the latest results from the same file. Without SWMR support, this use case may lead to corrupt files. There have been some requests for this feature in the past [3,4].

The PR is not complete, but some feedback at this point would be very helpful. First of all, is SWMR support something you would like to include in netcdf-c?

According to the SWMR programming model, only minor changes are required in netcdf-c:
a) Create the file with H5F_LIBVER_LATEST
b) Call H5Fstart_swmr_write in the writer process
c) Open the file with the flag H5F_ACC_SWMR_READ in the reader process
d) Call H5Dflush regularly in the writer process

Looking forward to hearing your opinion on this.

[1] https://support.hdfgroup.org/HDF5/docNewFeatures/SWMR/HDF5_SWMR_Users_Guide.pdf
[2] https://support.hdfgroup.org/HDF5/Tutor/swmr.html
[3] Unidata/netcdf4-python#862
[4] https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg13717.html

CLAassistant · 2019-07-31T14:28:21Z

All committers have signed the CLA.

eivindlm · 2019-07-31T14:28:58Z

Current status:

I have protected a, b, and c with a new flag NC_HDF5_SWMR. I reused a deprecated hex value for the flag, but the value will probably have to change.
I have added a call to H5Dflush after every variable write, but this should also be wrapped inside a conditional on SWMR-mode.
I have implemented a test for sequential access by a reader and a writer inside the same process, but the test should instead spawn a separate reader and writer in order to test concurrency.

edwardhartnett · 2019-07-31T14:31:43Z

My first thought is that performance needs to be checked with and without this. I am particularly concerned with the flushes. This would eliminate any benefit from buffering writes...

eivindlm · 2019-07-31T14:42:56Z

I should probably remove the flushing from NC4_put_vars, and leave the responsibility of flushing to the user. A user opening a file in SWMR mode should instead make sure to call sync_netcdf4_file (or another suitable function) regularly.
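
One way for an application to take on that responsibility while keeping the cost bounded is to flush once every N writes rather than after each one. The sketch below is a hypothetical helper, not part of this PR: `flush_policy`, `note_write`, and the callback signature are invented names, and the callback stands in for whatever sync call the application uses (the public nc_sync, for example).

```c
#include <stddef.h>

/* Callback that performs the actual flush; returns 0 on success. */
typedef int (*flush_fn)(void *ctx);

typedef struct {
    unsigned interval;  /* flush after this many writes; 0 means never */
    unsigned pending;   /* writes recorded since the last flush */
    flush_fn flush;     /* hypothetical hook, e.g. a wrapper around nc_sync */
    void *ctx;          /* passed through to the hook unchanged */
} flush_policy;

/* Record one completed write and flush if the interval has been reached.
 * Returns the hook's result when a flush happens, 0 otherwise. */
int note_write(flush_policy *p)
{
    p->pending++;
    if (p->interval == 0 || p->pending < p->interval)
        return 0;
    p->pending = 0;
    return p->flush ? p->flush(p->ctx) : 0;
}

/* Example hook that only counts how often it was invoked. */
int count_flush(void *ctx)
{
    (*(int *)ctx)++;
    return 0;
}
```

A larger interval keeps more of the buffering benefit at the price of readers seeing data later; the right value would have to come from benchmarks.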

edwardhartnett · 2019-07-31T15:25:14Z

I think the idea is good, in general.

Is there a way for this to be always-on without breaking anything else? In other words, does this have to be a mode flag - could it just be applied to every file?

edwardhartnett · 2019-07-31T15:26:47Z

I would suggest that you build with --enable-benchmarks, and you can very easily see any performance impact.

Another question is how does this interact with parallel I/O, if at all?

edwardhartnett · 2019-07-31T15:27:33Z

libhdf5/hdf5create.c

@@ -50,6 +50,10 @@ nc4_create_file(const char *path, int cmode, size_t initialsz,
    NC_HDF5_FILE_INFO_T *hdf5_info;
    NC_HDF5_GRP_INFO_T *hdf5_grp;

+#ifdef HAVE_H5PSET_LIBVER_BOUNDS


Do we ever not have H5PSET_LIBVER_BOUNDS?

It was introduced in HDF5-1.8.0

edwardhartnett · 2019-07-31T15:28:40Z

libhdf5/hdf5create.c

+    high = H5F_LIBVER_V18;
+    if ((cmode & NC_HDF5_SWMR)) {
+      low = H5F_LIBVER_LATEST;
+      high = H5F_LIBVER_LATEST;


Will this create files that are unreadable by older versions of HDF5?

Note that H5F_LIBVER_V18 was introduced in 1.10.2, so maybe HAVE_H5PSET_LIBVER_BOUNDS is checking for that version and not for the existence of H5Pset_libver_bounds.

I think that the files will be unreadable with 1.8.X clients. Here is a blurb from 1.10.2 release notes:

When it is used in application linked with HDF5 1.10.0, it will enable new chunk indexing for Single Writer/Multiple Reader (SWMR) access and Virtual Dataset storage.

What does this change mean to an HDF5 application?
When an HDF5 application linked with HDF5 1.10.2 specifies H5F_LIBVER_LATEST as a value for the “high” parameter, the application may produce files that are not compatible with the HDF5 1.8.* file format. For example, new chunk indexing will be used that was not known to HDF5 1.8.*. This means that an application linked with HDF5 1.8.* libraries may not be able to read such files.

When an HDF5 application linked with HDF5 1.10.2 specifies H5F_LIBVER_V18 as a value for the “high” parameter, the application will produce files fully compatible with HDF5 1.8.*, meaning that any application linked with the HDF5 1.8.* libraries will be able to read such files.
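
The trade-off can be made concrete with a small stand-alone model of the bounds selection in this hunk. The enum and the flag value are stand-ins so that the snippet compiles without HDF5 or netCDF headers; real code would use H5F_libver_t and the NC_HDF5_SWMR value from this PR. The low bound of the non-SWMR branch is not visible in the hunk, so "earliest" is an assumption here.

```c
/* Stand-in for H5F_libver_t, ordered from oldest to newest format. */
typedef enum { LIBVER_EARLIEST, LIBVER_V18, LIBVER_LATEST } libver;

/* Hypothetical bit value; the PR's actual NC_HDF5_SWMR value may differ. */
#define DEMO_SWMR_FLAG 0x0002

typedef struct { libver low, high; } libver_bounds;

/* Mirror of the hunk: default to a 1.8-compatible upper bound and only
 * request the latest format when the SWMR flag is set in cmode. */
libver_bounds choose_bounds(int cmode)
{
    libver_bounds b = { LIBVER_EARLIEST /* assumed */, LIBVER_V18 };
    if (cmode & DEMO_SWMR_FLAG) {
        b.low = LIBVER_LATEST;
        b.high = LIBVER_LATEST;
    }
    return b;
}

/* Per the release notes quoted above, readability by 1.8.* is only
 * guaranteed when the high bound does not exceed the 1.8 format. */
int guaranteed_readable_by_18(libver_bounds b)
{
    return b.high <= LIBVER_V18;
}
```

With this model, only files created with the SWMR flag lose the 1.8 guarantee, which is the argument for keeping it an opt-in mode flag.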

edwardhartnett · 2019-07-31T15:29:33Z

nc_test4/tst_files.c

+      if (nc_close(ncid2)) ERR;
+
+   }
+   SUMMARIZE_ERR;


Excellent test, thanks!

gsjaardema · 2019-07-31T16:31:27Z

@edhartnett

Another question is how does this interact with parallel I/O, if at all?

SWMR does not work with a parallel writer or reader. Currently it is a serial writer, serial reader(s) only capability. However, I do see merit in supporting this as it is very confusing for people to not be able to look at their files as they are being written if in NetCDF4 format when they have had no issues doing this for NetCDF3 files...

Hopefully HDF5 group will get parallel-writer, serial reader SWMR working at some point. I think it is planned for 2020 or 2021...

DennisHeimbigner · 2019-07-31T17:30:45Z

I am thinking about the interactions of this with our internal metadata
(i.e. netcdf metadata, not HDF5 metadata). One question concerns
R/W to variables that have one or more unlimited dimensions.
I think that we track the max size of the unlimited dimension while
HDF5 tracks its size for each use in a variable [Ed correct me if I
have this wrong]. So, there might be an issue when doing a write
increases the size of an unlimited dimension.

edwardhartnett · 2019-07-31T18:11:11Z

I believe we do not keep track of the maximum extent of unlimited vars. When we need to know, we check at that time.

edwardhartnett · 2019-07-31T18:36:29Z

Another, perhaps even more important, feature of this is that it will allow the user to get the best HDF5 performance.

For many users, inability to be read by 1.8 version of HDF5 would not be a big problem. The 1.10 series has been out for years and years now. Not many people should be stuck on 1.8.

In light of that, perhaps the mode flag should be something like NC_HDF5_LATEST.

edwardhartnett · 2019-07-31T18:39:30Z

Seems to be failing a test called tst_zero_len_var.sh.

gsjaardema · 2019-07-31T20:13:39Z

@edhartnett Agree that performance is (sometimes) better with 1.10.X, but there are also many instances where 1.8 is faster, which is part of the reason why 1.8.X is still under development. THG is working on finding the performance regressions in 1.10.

Also, there are several clients still using 1.8 (Paraview is/was until latest release) and many systems still ship with 1.8 libraries. I wish this weren't true but have tracked down many issues due to 1.10.X files not being usable in downstream applications.

But, 1.10.X gives some opportunities for vastly improving parallel performance and I definitely advocate for the use of 1.10.X; just some caveats in doing so.

WardF · 2019-07-31T20:44:24Z

Support for HDF5 1.10.x features that live in the free feature set (which should be most of them, as we've learned from recent discussions) can be added, and PRs like this are great! Thank you very much. The issue would be requiring the 1.10.x branch for any netCDF version. Any new functionality which depends upon a particular version of HDF5 beyond the minimum required version (1.8.9 or 1.8.12 at the moment, I'd need to check) requires fence-posting during configuration, so that the features are disabled when they aren't available.

I certainly support increasing the minimum required version, but I agree with @gsjaardema that there are still too many areas where 1.10.x is not on par with 1.8.x; we aren't ready to tell our community that they must switch.

eivindlm · 2019-08-02T08:01:25Z

Thank you for all the positive responses! If I have understood correctly, I should add ifdefs so that the SWMR code is not compiled with HDF5 1.8, and probably remove the call to flush after variable writes (but then I will need some help to expose H5Dflush to the user). Then run benchmarks. Am I on the right track?

edwardhartnett · 2022-04-19T20:07:44Z

I think this would be good to merge if it passes the tests, which look like they need to be re-run...

DennisHeimbigner · 2022-04-19T20:21:12Z

A couple of things:

Is ENABLE_HDF5_SWMR tested/set in configure.ac? I do not see that it was changed.
It would be nice if, when ENABLE_HDF5_SWMR is set, the HDF5 version were also tested.

WardF · 2022-04-21T20:27:48Z

I haven't reviewed this yet, but it would need to be rebased against the current main branch. Also, @DennisHeimbigner is correct that we would need to fencepost this in configure.ac and CMakeLists.txt such that this functionality is appropriately disabled if/when linking against older versions of libhdf5 that don't support SWMR.
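
On the C side, a configure-time check like this is usually paired with a compile-time guard. HDF5 provides the `H5_VERSION_GE(maj, min, rel)` macro in H5public.h for that purpose; the sketch below uses a local stand-in (`DEMO_VERSION_GE`, a made-up name) that takes the library version explicitly, so it compiles without HDF5 headers. Treating 1.10.0 as the cutoff follows the ">= 1.10" statement in the PR description.

```c
/* Stand-in for H5_VERSION_GE: true if version vmaj.vmin.vrel is at
 * least maj.min.rel. */
#define DEMO_VERSION_GE(vmaj, vmin, vrel, maj, min, rel)          \
    (((vmaj) > (maj)) ||                                          \
     ((vmaj) == (maj) && (vmin) > (min)) ||                       \
     ((vmaj) == (maj) && (vmin) == (min) && (vrel) >= (rel)))

/* Returns 1 if an HDF5 library of the given version can provide SWMR,
 * 0 otherwise.
 *
 * In netcdf-c itself the equivalent guard would presumably look like
 *   #if defined(ENABLE_HDF5_SWMR) && H5_VERSION_GE(1, 10, 0)
 * around every use of the SWMR calls and flags. */
int swmr_available(unsigned vmaj, unsigned vmin, unsigned vrel)
{
    return DEMO_VERSION_GE(vmaj, vmin, vrel, 1, 10, 0) ? 1 : 0;
}
```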

DennisHeimbigner · 2022-04-21T20:29:38Z

My only real objection is that it used up another mode flag.

ZedThree · 2023-03-06T10:12:21Z

This would be a really useful feature. What's needed to get it over the line, just adding the option flag to configure.ac? I'm happy to look at adding that.

edwardhartnett · 2023-03-06T11:12:15Z

I can confirm this would be a really useful feature, especially in HPC (i.e. supercomputer) applications.

If multiple processes can each open a file for reading, that would be a tremendous saving in complexity.

ZedThree · 2023-03-06T18:09:36Z

I've merged in main, fixed up the conflicts and stuff that's changed (e.g. HAVE_H5PSET_LIBVER_BOUNDS macro has been removed), fixed a bug, and I've added --enable-hdf5-swmr to configure.ac. I obviously can't push to this branch, so would you prefer a PR into this branch, or a whole new PR into main?

There's also some subtlety in how to use SWMR -- all the groups and variables must be defined before enabling it. With this PR, the only mechanism for enabling SWMR is through the file open flag NC_HDF5_SWMR, so in order to use this feature properly, you must create the file and all the variables you wish to modify, then close and reopen it with that flag.
That's probably fine, but it would need to be documented.

Another option might be to not pass the flag to HDF5, but call H5Fstart_swmr_write in nc_enddef perhaps. I'm less sure how that would work exactly, there'd need to be some mechanism to tell nc_enddef to enable SWMR.

Also, H5Fstart_swmr_write conflicts with opening the file with H5F_ACC_SWMR_WRITE, and I couldn't find an obvious way to check if a file has been opened in SWMR mode, so that needs to be handled carefully.
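
One way to handle this carefully, without relying on a query on the HDF5 side, would be for netcdf-c to remember in its own per-file state how SWMR was enabled. The sketch below is purely illustrative: the struct, the enum, and the function name are hypothetical, and the return codes are placeholders rather than netCDF error codes.

```c
/* How (if at all) SWMR writing was enabled for an open file. */
typedef enum { SWMR_OFF, SWMR_AT_OPEN, SWMR_STARTED_LATER } swmr_state;

/* Hypothetical slice of the per-file info netcdf-c keeps. */
typedef struct {
    swmr_state swmr;  /* SWMR_AT_OPEN if opened with H5F_ACC_SWMR_WRITE */
    int writable;     /* non-zero if the file was opened for writing */
} demo_file_info;

/* Decide whether H5Fstart_swmr_write should be called now, e.g. from
 * nc_enddef. Returns 1 if the caller should call it (the state is
 * updated to reflect that), 0 if SWMR is already active and the call
 * must be skipped, and -1 if the file is not writable. */
int request_swmr_start(demo_file_info *f)
{
    if (!f->writable)
        return -1;
    if (f->swmr != SWMR_OFF)
        return 0;
    f->swmr = SWMR_STARTED_LATER;
    return 1;
}
```

Keeping the state on the netcdf-c side means the two enabling paths (open flag and a later start call) can never both be taken for the same file.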

eivindlm added 5 commits July 10, 2019 21:16

Initial implementation

633182a

Move up call to flush

757207f

Merge branch 'master' into hdf5_swmr

ba9362a

Expand test

932d6e0

Add flag H5F_ACC_SWMR_WRITE to H5Fcreate if we do not have H5P_set_li…

f0f3543

…bver_bounds

eivindlm requested a review from WardF as a code owner July 31, 2019 14:28

edwardhartnett reviewed Jul 31, 2019

edwardhartnett mentioned this pull request Aug 1, 2019

current master does not pass make check due to ncdap_test/tst_zero_len_var.sh #1449

Closed

eivindlm added 2 commits August 6, 2019 10:47

Protect usage of swmr by ifdefs

cd4cd05

Merge branch 'master' into hdf5_swmr

9a4e0b5

WardF added this to the 4.7.2 milestone Aug 6, 2019

WardF modified the milestones: 4.7.2, 4.7.3 Oct 28, 2019

WardF removed this from the 4.7.3 milestone Mar 27, 2020

WardF added this to the 4.8.0 milestone Mar 27, 2020

WardF modified the milestones: 4.8.0, 4.8.2 Aug 30, 2021

WardF self-assigned this Mar 10, 2022

WardF added the rebase Rebase and re-evaluate label Mar 10, 2022

WardF modified the milestones: 4.8.2, 4.9.0 Mar 10, 2022

WardF added the status/under review label Mar 10, 2022

WardF mentioned this pull request Mar 10, 2022

Rebase netcdf-c PR #1448 #2244

Closed

WardF modified the milestones: 4.9.0, 4.9.1 Apr 21, 2022

magnusuMET mentioned this pull request Jan 4, 2023

netcdf SWMR compatible writing metno/snap#108

Open

WardF modified the milestones: 4.9.1, 4.9.2 Feb 13, 2023

ZedThree mentioned this pull request Mar 7, 2023

Support SWMR in HDF5 (updated) #2653

Open

WardF modified the milestones: 4.9.2, 4.9.3 May 16, 2023
