precice v2.3.0 incompatible with Fluent 21.2.0 due to Boost library #23

Open
mtree22 opened this issue Feb 24, 2022 · 11 comments

Comments

@mtree22
Collaborator

mtree22 commented Feb 24, 2022

While debugging a preCICE communication error (specifically, a Boost call that joins file paths, which led to a segfault at conInfo.read()), we found that Fluent 21.2.0 ships with Boost v1.63.0. preCICE v2.3.0 needs at least Boost v1.65.1, so the two are incompatible, and that is what causes the segfault. In practice, I had compiled preCICE myself against Boost v1.73, but that doesn't change the fact that the Boost versions are incompatible.
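
To make the version arithmetic above concrete, here is a minimal sketch of a version check. It relies on the encoding Boost uses for its BOOST_VERSION macro in boost/version.hpp (major * 100000 + minor * 100 + patch). The struct and function names are hypothetical and are not part of preCICE or Boost, and the sketch deliberately avoids including any Boost header so it compiles on its own.

```cpp
#include <cassert>
#include <string>

// Components of a Boost release, e.g. 1.63.0 (hypothetical helper type).
struct BoostVersion {
  int major_version;
  int minor_version;
  int patch_version;
};

// Decode an integer in the BOOST_VERSION format
// (major * 100000 + minor * 100 + patch), e.g. 106300 -> 1.63.0.
constexpr BoostVersion decodeBoostVersion(int encoded) {
  return BoostVersion{encoded / 100000, (encoded / 100) % 1000, encoded % 100};
}

// True if `found` is at least as new as `required`.
constexpr bool meetsMinimum(BoostVersion found, BoostVersion required) {
  if (found.major_version != required.major_version)
    return found.major_version > required.major_version;
  if (found.minor_version != required.minor_version)
    return found.minor_version > required.minor_version;
  return found.patch_version >= required.patch_version;
}

// Human-readable form, e.g. "1.63.0".
std::string toString(BoostVersion v) {
  return std::to_string(v.major_version) + "." +
         std::to_string(v.minor_version) + "." +
         std::to_string(v.patch_version);
}
```

With these helpers, the Boost shipped with Fluent (encoded as 106300) fails a check against the 1.65.1 minimum (106501), while the self-built 1.73 (107300) passes. In a real build one would pass BOOST_VERSION itself to decodeBoostVersion.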

This essentially prevents a Fluent adaptor from existing for any version of preCICE except for those compatible with Boost v1.63.0. I believe a quick look-up by developers showed that this is around preCICE v1.4.

The only real solution is to see if we can get Fluent to reference a more recent version of Boost. We know it is possible for Fluent to employ local versions of MPI (using environment variables), so maybe changing Boost versions is a similar option.

I will reach out to Fluent to see how hard they laugh when I ask about this. I'll report back here.

@BenjaminRodenberg
Member

Thanks again @mtree22 for the debugging session! Unfortunately we found out that we cannot do a lot here, as you already described. We got stuck at the following boost call leading to a crash in preCICE due to incompatible boost versions of preCICE (min 1.65) and fluent 21.2 (1.63):

path p = path(addressDirectory) / path("precice-run") / path(directional);

see here: https://github.com/precice/precice/blob/bf3b2cc31619152ab6d2c702777eaf95eb7c53ba/src/com/ConnectionInfoPublisher.cpp#L40
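
For readers unfamiliar with that call, here is a small sketch of what the expression computes. It swaps in std::filesystem::path (C++17) for boost::filesystem::path, which is an assumption made purely so the example builds without Boost. The function name and the sample argument values are hypothetical. The sketch does not reproduce the crash, which the thread attributes to mixing two Boost versions in one process.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Mirrors the shape of the quoted preCICE line, with std::filesystem::path
// standing in for boost::filesystem::path (assumption for illustration).
// operator/ joins the components with the platform's directory separator.
fs::path connectionInfoDirectory(const std::string &addressDirectory,
                                 const std::string &directional) {
  return fs::path(addressDirectory) / fs::path("precice-run") / fs::path(directional);
}
```

Calling connectionInfoDirectory(".", "Fluid-Solid") yields a path equivalent to ./precice-run/Fluid-Solid, where "Fluid-Solid" is a made-up value for the directional component.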

I marked this issue as wontfix, because it's a "we would like to fix, but we can't". Let's keep this issue open until we find a working solution for you.

@mtree22
Collaborator Author

mtree22 commented Feb 25, 2022

The particular library from boost we were having trouble with was libboost_filesystem.so.1.63.0. I had a crazy idea to soft link this filename to libboost_filesystem.so.1.73.0, hoping I could fake-out Fluent. In theory, this might have worked if absolutely no other libraries that existed within the Fluent install were dependent on 1.63.0. Of course, that's not the case and this caused Fluent to not run. It was a long shot that didn't pan out.

I have a question in with the Fluent customer portal that I'm being assured will eventually be answered. Stay tuned!

@mtree22
Collaborator Author

mtree22 commented Feb 25, 2022

I asked this:
"I have a UDF script that makes calls from a library built with one version of Boost (1.73). These are causing issues with Boost libraries accompanying the Fluent install (specifically libboost_filesystem.so.1.63.0). Is it possible to tell Fluent to use a different version of Boost? If so, how?"

And was told this:
"Fluent calls to libraries cannot be changed to point to a different version. Thus a resolution should come from the UDF end, the user will have to recompile the UDF with boost 1.63.0."

Not surprising, but still disappointing.

@IshaanDesai
Member

@mtree22 it's really unfortunate to hear that Fluent cannot point to another Boost installation. I do not have immediate ideas on how we can tackle this, but I also don't want to give up. With the current configuration the only way ahead seems to be hacky solutions; @fsimonis any wild hacky ideas?

@fsimonis
Member

The least hacky solution is to change the CMakeLists.txt of preCICE to request 1.63.0 and then try to get preCICE to work with this version of Boost.

--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -149,7 +149,7 @@ if(TPL_ENABLE_BOOST)
   set(Boost_NO_SYSTEM_PATHS ON CACHE BOOL "" FORCE)
   unset(ENV{BOOST_ROOT})
 endif()
-find_package(Boost 1.65.1 REQUIRED
+find_package(Boost 1.63.0 EXACT REQUIRED
   COMPONENTS filesystem log log_setup program_options system thread unit_test_framework
   )

Then disable the tests and hope for the best.

$ cd path/to/precice/build
$ cmake -DBUILD_TESTING=OFF -DBOOST_ROOT=... ..
fingers crossed

The biggest potential pitfall will be the projection mappings.


The hacky solution would be to build boost as a static library, then link preCICE against it.
You may have to reduce symbol visibility by setting cmake -DCMAKE_CXX_VISIBILITY_PRESET=hidden . and adding __attribute__ ((visibility ("default"))) to the SolverInterface class, methods and constants in src/precice/SolverInterface.hpp and src/precice/constants.hpp before compilation.

Then build only the library.

@mtree22
Collaborator Author

mtree22 commented Mar 1, 2022

I actually elected to try the more hacky solution first. I'm hoping if we can successfully use static Boost then any changes to Fluent Boost won't matter. It may also pave the way for building adapters to other commercial codes that use libraries over which we have no control.

So, I built boost as a static library using this bash script:
build_boost_1.73.txt

I then forked the precice repo here, and created a branch. I added a precice build option called PRECICE_USE_DYNAMIC_BOOST, and used that to differentiate in CMakeLists.txt whether to link against a dynamic or static Boost library. In all honesty, I don't know if I did this right, but I was able to use this to build the preCICE library. A quick ldd on this newly-built library shows that it no longer searches for any libboost references.

Of course, when I built preCICE I had to turn BUILD_TESTING off. I then went and built the fluent adaptor using this preCICE library and tried to open that up in Fluent. Unfortunately, that didn't work. The libudf.so file built just fine, but when Fluent tried to load it everything just hung. Of course, Fluent didn't give me any traceback when I killed the processes, so I turned to a colleague here who has a preCICE adaptor for our company's CFD code and asked him to use this preCICE library with it. That returned a seg fault:

Program received signal SIGSEGV, Segmentation fault.
0x00002aaaac663ce7 in boost::log::v2s_mt_posix::attributes::named_scope::named_scope() ()
   from /opt/Software/Raven/development/vr4.600.kc_testdbg/lib/libprecice.so.2

I think the best course of action may be to back up and work through getting the build testing to work before trying to get the fluent adapter to work. Or maybe I should pick a tutorial case to test against before proceeding on to Fluent.

Any ideas?

@fsimonis
Member

fsimonis commented Mar 3, 2022

@mtree22 You are halfway there!

As preCICE statically links against boost, it also contains all its symbols, which is why this segmentation fault originates from libprecice.so instead of libboost-log:

Program received signal SIGSEGV, Segmentation fault.
0x00002aaaac663ce7 in boost::log::v2s_mt_posix::attributes::named_scope::named_scope() ()
   from /opt/Software/Raven/development/vr4.600.kc_testdbg/lib/libprecice.so.2

The last step is to tell the linker to hide all symbols except the preCICE API.

  1. Tell CMake to hide all symbols by default: CXX_VISIBILITY_PRESET=hidden (the CMAKE_CXX_VISIBILITY_PRESET variable initializes this property for all targets).
  2. Annotate the preCICE API as "to be exported".
// file: precice/SolverInterface.hpp

// add this define to simplify your life
#define PRECICE_API __attribute__ ((visibility ("default")))

class PRECICE_API SolverInterface { ... };

PRECICE_API std::string getVersionInformation();

...

PRECICE_API const std::string &actionWriteInitialData();

PRECICE_API const std::string &actionWriteIterationCheckpoint();

PRECICE_API const std::string &actionReadIterationCheckpoint();

Then recompile and you should be ready to go.
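
As a self-contained illustration of the annotation pattern described above, here is a minimal sketch. All names (DEMO_API, DEMO_LOCAL, the demo namespace and its members) are hypothetical and are not taken from preCICE. The macro falls back to a no-op on compilers that do not understand the GCC/Clang visibility attribute, which is an addition of this sketch and not something the thread asks for.

```cpp
#include <cassert>
#include <string>

// Export macro: "default" visibility on GCC/Clang, no-op elsewhere.
// When the library is built with hidden visibility as the default, only
// entities tagged with DEMO_API remain in the dynamic symbol table.
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_API __attribute__((visibility("default")))
#define DEMO_LOCAL __attribute__((visibility("hidden")))
#else
#define DEMO_API
#define DEMO_LOCAL
#endif

namespace demo {

// Internal helper: explicitly hidden, so it is not exported even if the
// default visibility is left unchanged.
DEMO_LOCAL std::string internalDetail() { return "detail"; }

// Public class: exported from the shared library.
class DEMO_API Interface {
public:
  std::string name() const { return "demo-" + internalDetail(); }
};

// Public free function: exported from the shared library.
DEMO_API std::string getVersionInformation() { return "0.0.0-sketch"; }

} // namespace demo
```

After building a shared library this way, listing its dynamic symbols (for example with nm -D --defined-only piped through c++filt) should show the annotated API but no boost:: symbols, assuming the static Boost objects were compiled into the library under the hidden default.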

@mtree22
Collaborator Author

mtree22 commented Mar 3, 2022

I think I did these next steps correctly, but I'm still getting the same error. I switched away from our internal solver adapter, just to remove any code there as a variable. Now, I'm running the partitioned pipe tutorial case using OpenFOAM.

Here are the changes I made to CMakeLists.txt: CMakeLists_changes.txt
Here are the changes I made to SolverInterface.hpp: SolverInterface.hpp_changes.txt
Here is the log when I build preCICE with these changes: build-static.log
Here is what gdb says when I build the cpp solverdummy using debug flags and try to run it with my newly-compiled preCICE library:

Program received signal SIGSEGV, Segmentation fault.
0x00002aaaab50f9f7 in boost::log::v2s_mt_posix::attributes::named_scope::named_scope() ()
   from /home/shelf1/motorsports/software/Utilities/install/precice/2.3.0-static/lib64/libprecice.so.2
(gdb) bt
#0  0x00002aaaab50f9f7 in boost::log::v2s_mt_posix::attributes::named_scope::named_scope() ()
   from /home/shelf1/motorsports/software/Utilities/install/precice/2.3.0-static/lib64/libprecice.so.2
#1  0x00002aaaab1368ba in precice::logging::Logger::LoggerImpl::LoggerImpl (this=0x61c440, module=...)
    at /home/shelf1/motorsports/software/Utilities/source/precice/precice/src/logging/Logger.cpp:34
#2  0x00002aaaab136e3c in precice::logging::Logger::Logger (this=0x2aaaabaeb4a0 <precice::math::geometry::_log>,
    module=...) at /home/shelf1/motorsports/software/Utilities/source/precice/precice/src/logging/Logger.cpp:43
#3  0x00002aaaab312e73 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
    at /home/shelf1/motorsports/software/Utilities/source/precice/precice/src/math/geometry.cpp:16
#4  0x00002aaaab312eef in _GLOBAL__sub_I_geometry.cpp(void) ()
    at /home/shelf1/motorsports/software/Utilities/source/precice/precice/src/math/geometry.cpp:331
#5  0x00002aaaaaaba9c3 in _dl_init_internal () from /lib64/ld-linux-x86-64.so.2
#6  0x00002aaaaaaac17a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#7  0x0000000000000004 in ?? ()
#8  0x00007fffffffbfde in ?? ()
#9  0x00007fffffffc05e in ?? ()
#10 0x00007fffffffc071 in ?? ()
#11 0x00007fffffffc07b in ?? ()
#12 0x0000000000000000 in ?? ()
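
Frames #3 to #5 of this backtrace show the Logger being constructed from a static initializer that the dynamic loader runs (_dl_init), i.e. before any solver code executes. The sketch below only illustrates that timing. DemoLogger and the other names are hypothetical stand-ins, not the preCICE classes, and the lazily constructed variant is shown purely as a contrast, not as a fix confirmed anywhere in this thread. It does not explain why named_scope() itself crashes.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

namespace demo {

// Registry of constructed loggers. A function-local static guarantees the
// registry exists before the first logger registers itself.
std::vector<std::string> &constructedLoggers() {
  static std::vector<std::string> names;
  return names;
}

// Hypothetical stand-in for a logger class; the constructor has a visible
// side effect so the construction order can be observed.
class DemoLogger {
public:
  explicit DemoLogger(std::string module) : _module(std::move(module)) {
    constructedLoggers().push_back(_module);
  }
  const std::string &module() const { return _module; }

private:
  std::string _module;
};

// Namespace-scope object: constructed during static initialization, which
// for a shared library happens while the loader is loading it.
DemoLogger geometryLog("math::geometry");

// Contrast: constructed only when this function is first called.
DemoLogger &lazyLog() {
  static DemoLogger log("lazy");
  return log;
}

} // namespace demo
```

By the time main() starts, the registry already contains "math::geometry", whereas "lazy" only appears after the first call to demo::lazyLog(). This is why a crash in such a constructor shows up before the solver prints anything.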

And finally, here is the output from pimpleFoam when I try and run the partitioned pipe tutorial: pimpleFoam_out.txt

Here's a link to the branch I'm pushing these changes to: https://github.com/mtree22/precice/tree/static-boost-for-fluent

@mtree22
Collaborator Author

mtree22 commented Mar 17, 2022

Well, I was running out of ideas, so I just tried commenting out line 34 of Logger.cpp, and it actually might have worked!

The branch of the precice repo I'm working from is the same as above:
https://github.com/mtree22/precice/tree/static-boost-for-fluent

Once I compiled preCICE with this line commented out, Fluent continued on until waiting for CSMdummy.py to be run. From there, it appears that the python solver never truly receives data from Fluent, but the Fluent solver progresses as if it does.

Any ideas?

Here is the Fluent output:
fluent-20220317-011633-46487.log

Here is the preCICE logger output:
fluentCSM_debug.log

Here is the output from the python solver:
CSMdummy.log

@BenjaminRodenberg
Member

> Well, I was running out of ideas, so I just tried commenting out line 34 of Logger.cpp, and it actually might have worked!
>
> The branch of the precice repo I'm working from is the same as above: https://github.com/mtree22/precice/tree/static-boost-for-fluent
>
> Once I compiled preCICE with this line commented out, Fluent continued on until waiting for CSMdummy.py to be run. From there, it appears that the python solver never truly receives data from Fluent, but the Fluent solver progresses as if it does.
>
> Any ideas?

I have no idea what's going on here or why this seems to work. You could use a one-way-coupling to have Fluent only receiving data, if this currently seems to be working. You can use the fake-fluid from precice/tutorials#176 and modify it a bit. The idea is to just have a solver that writes artificial data. This is, of course, not a real coupled simulation, but might be useful for debugging and reaching the end of the simulation (without Fluent writing any data).

@fsimonis
Member

@mtree22 The last preCICE log entry shows it leaving isCouplingOngoing().
Maybe the loop in the adapter is broken due to a wrong condition or something similar.

This looks like a silly mistake somewhere though.

You did build against a modern boost version and not the one shipped with Fluent, right?
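
For reference when checking the adapter's loop condition, here is a sketch of the usual shape of a coupling loop. MockCoupling is a hypothetical stand-in written for this example, not the preCICE library. Its method names (initialize, isCouplingOngoing, advance, finalize) are chosen to mirror the preCICE v2 SolverInterface as I understand it, and the step count and time step size are made-up values.

```cpp
#include <cassert>

// Hypothetical mock of a coupling interface; it simply counts down a fixed
// number of time windows instead of exchanging data with another solver.
class MockCoupling {
public:
  explicit MockCoupling(int totalSteps) : _remaining(totalSteps) {}

  double initialize() { return 0.1; } // made-up maximum time step size
  bool isCouplingOngoing() const { return _remaining > 0; }
  double advance(double dt) {
    --_remaining;
    return dt;
  }
  void finalize() { _finalized = true; }
  bool finalized() const { return _finalized; }

private:
  int _remaining;
  bool _finalized = false;
};

// Runs the loop and returns how many solver steps were performed.
int runCoupling(MockCoupling &coupling) {
  int steps = 0;
  double dt = coupling.initialize();
  // The condition must be re-evaluated after every advance(); a loop that
  // tests a stale flag would either exit early or never exit.
  while (coupling.isCouplingOngoing()) {
    // Solver step plus reading and writing coupling data would go here.
    ++steps;
    dt = coupling.advance(dt);
  }
  coupling.finalize();
  return steps;
}
```

With a mock configured for three time windows, runCoupling performs exactly three steps and then finalizes. An adapter whose loop exits after the first check, or whose step count differs from the other participant's, would be a candidate for the kind of mistake suspected above.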
