Interaction Energy calculation on GPU for GIST #787
jokr91 commented on Jan 8, 2020
- Added GPU capability for the calculation of the interaction energy in GIST.
- Also added a skipS flag to skip the entropy calculations in GIST (calculating only the interaction energies).
- Added a few sentences to the documentation concerning the GPU implementation, and documented the skipS flag at the beginning of the "gist" command.
- Added the skipS flag to the help function.
- Bumped the version from 4.24.0 to 4.25.0.
- For testing, added "cuda" and "notcuda" keywords for the test runs.
Changes were made in the Makefile of the CUDA code to correctly compile the GIST CUDA implementation. The call to the GPU is done through an extra function: instead of calling NonbondEnergy, NonbondCuda is called. Since NonbondCuda can also directly calculate the order parameters, the Order function is likewise replaced by the single call to NonbondCuda. I added #ifdef statements around each change, so the changes only affect the code when compiled with CUDA; compiled without CUDA, the code is exactly the same as in the stock cpptraj implementation.

Some further functions were added so that NonbondCuda can be called easily from the rest of the code. These include a way to free the GPU memory (freeGPUMemory) from the main program and a function to copy data from host memory to the CUDA-capable device (copyToGPU). A few more variables were added, which are used for copying data to and from the GPU and for keeping track of various atom parameters (charge, atom type, ...).

Most of the host-side code uses the STL vector class; this is not true for the solvent_ array. The reason is a peculiarity of vector&lt;bool&gt;, which is not an array of boolean values but a bit string, storing one true/false value per bit. That representation cannot be copied to the device directly, so solvent_ remains a plain allocated array in host memory.

The source files for the device code were implemented in the same way as in GIGIST (compare www.github.com/liedllab/gigist), but they were all moved together into cuda_kernels. In RunTest.sh, a different test for the CUDA version is supplied. Everything stays the same except for the comparison with Eww_ij, which is not calculated on the GPU for memory reasons and thus cannot be tested against.
This is done by using the CheckEnv function to recognize the cuda and notcuda keywords in the Requires and CheckFor statements. The reason Eww_ij is not implemented on the GPU is quite simple. Consider a grid of 100 x 100 x 100 voxels with a grid spacing of 0.5 Angstrom; this is actually just a box of size 50 x 50 x 50 Angstrom. Calculating the entire Eww_ij matrix means comparing each voxel with every other voxel: with 10^6 voxels, that is (10^6)^2 = 10^12 data points. An energy value has to be stored at each data point; at roughly 10 ASCII characters per value, that yields a final file size of about 10 TB (and this does not even consider the roughly 4 TB of memory that would be needed).
A flag to skip the entropy calculation was added. Also, the indentation was fixed for all files.
Added a paragraph with a couple of tips for the GPU implementation and also added a statement concerning the new skipS keyword. Changed the version number of cpptraj in src/Version.h to 4.25.0.
src/cuda_kernels/ConstantsG.cuh
Outdated
@@ -0,0 +1,69 @@
/**
 * This is simply taken from the cpptraj implementation by Daniel R. Roe and Mark J. Williamson.
@jokr91 Why not just use Constants.h in the src directory?
Sorry, I removed this file.
src/cuda_kernels/GistCudaSetup.cu
Outdated
@@ -0,0 +1,239 @@
#include "GistCudaSetup.cuh"
#include "GistCudaCalc.cuh"
#include <iostream>
I had to change this from <iostream> to <cstdio> in order to avoid:
GistCudaSetup.cu(195): error: identifier "printf" is undefined
1 error detected in the compilation of "/tmp/tmpxft_00002bc7_00000000-5_GistCudaSetup.cpp4.ii".
on my local machine (gcc 5.4, CUDA 8).
I changed the <iostream> to <cstdio> and moved the #include <cstdio> into GistCudaSetup.cuh. It should work now, but could you please double-check?
@jokr91 This is great work, thanks! I had two minor comments (above). Also, could you add
- Removed ConstantsG.cuh - Fixed printf error - Added gist to the CUDA enabled commands
Jenkins failure is a problem with the build framework. Since I've gotten the CUDA tests to pass locally I will merge. @jokr91 after I merge can you please re-test to make sure I didn't miss anything? Thanks.
I'm not going to wait for the appveyor tests.
I ran the CUDA tests on my local machine as well and they are passing.