Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

treatment of PBC in general triclinic boxes: general minimum image convention #136

Closed
GoogleCodeExporter opened this issue Apr 4, 2015 · 15 comments

Comments

@GoogleCodeExporter
Copy link

At the moment, MDAnalysis will only deal with orthorhombic simulation boxes. 
For instance, distance_array() can calculate distances taking PBC into account 
(using the minimum image convention).

However, there is no code to calculate minimum image distances for general 
triclinic boxes. Below are two resources for how one could implement such a 
function in MDAnalysis. This would be useful in order to fully support 
distance_array() with PBC but also for remapping trajectories into primary 
unitcells.


Allen & Tildesley [1] provide minimum image routines in their FORTRAN code 
FICHE F.1. PERIODIC BOUNDARY CONDITIONS IN VARIOUS GEOMETRIES (freely 
available) [2] for the truncated octahedron and the rhombic dodecahedron.

Tuckerman [3] in Appendix B, Eq B.9 provides the general algorithm for the 
minimum image convention:

  h := [a, b, c], a=(a1,a2,a3), ... (the matrix of box vectors)
  r_ij := r_i - r_j                 (difference vector)

  s_i = h^{-1} r_i  
  s_ij = s_i - s_j
  s_ij <-- s_ij - NINT(s_ij)        (general minimum image convention)
  r_ij = h s_ij

NINT(x) is the nearest integer function (e.g. numpy.rint()), h^{-1} is the 
matrix-inverse of h.

If anyone would like to do solve this issue please let me know – it would be 
a neat contribution!


REFERENCES

[1] M. P. Allen and D. J. Tildesley. Computer Simulations of Liquids. Oxford 
University Press, Oxford, 1987.
[2] 
http://www.ccl.net/cca/software/SOURCES/FORTRAN/allen-tildesley-book/f.01.shtml
[3] M. E. Tuckerman. Statistical Mechanics: Theory and Molecular Simulation. 
Oxford University Press, Oxford, UK, 2010.

Original issue reported on code.google.com by orbeckst on 1 May 2013 at 12:30

@GoogleCodeExporter
Copy link
Author

Do you have an opinion on how you want the API to look from python to maintain 
backward compatibility? It could be either (a) adding a new function, (b) 
adding a new argument for three box angles to distance_array, (c) adding a new 
argument for the unitcell vectors (3x3 matrix). The focus on either using the 
box vectors vs. the lengths/angles as the primary mechanism for storing and 
manipulating unitcell information varies between simulation package.

I have a kernel that'll do the calculation that I'm willing to contribute. It's 
SSE3/4 code though, so it's not going to compile on ancient (pre-2005 
hardware).  The kernel is here, for reference: 
https://github.com/rmcgibbo/mdtraj/blob/master/MDTraj/geometry/src/geometry.c#L1
67

Original comment by rmcgibbo@gmail.com on 7 Aug 2013 at 11:16

@GoogleCodeExporter
Copy link
Author

IMHO, since the distance functions already expect a 3x3 box matrix, I think 
that backwards compatibility will be garajnteed by that, wouldn't it?

I saw this bug when I was looking for triclinic minimum-distance algorithms and 
implemented this one in my quick-and-dirty script:

    def distance(r1, r2, h, hinv):
        """ Calculate the minimum-image distance between points r1 and r2.
            Inputs: r1: numpy.matrix([x1],[y1],[z1])
                    r2: numpy.matrix([x2],[y2],[z2])
                    h:  numpy.matrix([ A, B, C ]) # Where A, B and C are box vectors
                    hinv: h.I # Inverted h, preprocessed


        s1 = dot(hinv,r1)
        s2 = dot(hinv,r2)
        s21 = s2 - s1
        s21 -= rint(s21)
        r21 = dot(h,s21)
        d12 = dot(r21.T,r21)
        return sqrt(d12)

But it turned out 27 times slower than the non-periodic version, which 
calculates only the in-box distances, I'm not sure why.

Original comment by elto...@gmail.com on 7 Aug 2013 at 4:14

@GoogleCodeExporter
Copy link
Author

Robert (comment #1)
Re: API for general PBC: MDAnalysis canonical format is 
array([A,B,C,alpha,beta,gamma]), returned by Timestep.dimensions. Native unit 
cell information is stored as Timestep._unitcell in a format that most closely 
matches the native representation such as a 3x3 array of box vectors for 
Gromacs or a 6-array of [A, alpha,B, beta, gamma, C] for CHARMM.

(Admittedly, by converting box vectors to lengths/angles we're loosing 
information about the absolute orientation of the coordinate system but for the 
systems that do record it one can get this information from _unitcell. We could 
always add another managed attribute to coordinates.base.Timestep such as "box" 
that provides the matrix of box vectors.)

distance_array(x,y,box) accepts any 1D array as "box" but at the moment it only 
takes the first three elements box[:3]. The most transparent approach would be 
just to make use of the lengths AND angles in box. 

Your SSE-intrinsic based code 
https://github.com/rmcgibbo/mdtraj/blob/master/MDTraj/geometry/src/geometry.c#L1
67
 looks good and I don't think there's a problem requiring SSE. Perhaps in the worst case we provide a fall-back to the old code or something slow... but I doubt that anyone will complain. (We could have a fast distance_array_SSE() and use that instead of the old one, using some try: import ... except ImportError).

Bottom line: I (and probably many others) would be very happy if you wanted to 
contribute your code. (We'd also add an explicit mentioning of your own MDTraj 
project to AUTHORS and the website.) 

Oliver

Original comment by orbeckst on 8 Aug 2013 at 4:29

@GoogleCodeExporter
Copy link
Author

Okay. I agree that the API by which distance_array would be extended such that 
if 'box' is of length three, an orthorhombic box is used (as currently) whereas 
if box is length 6, it's assumed to give the three lengths followed by the 
three angles.

Original comment by rmcgibbo@gmail.com on 9 Aug 2013 at 7:37

@GoogleCodeExporter
Copy link
Author

...would be straightforward.

Other question: currently the distance code and parallel distance code share 90 
percent of their logic but have totally different implementations. This is a 
little awkward, IMO and could be eliminated by ifdef-guarding the OpenMP 
pragmas and then compiling the same extension code twice, just with different 
flags.

Original comment by rmcgibbo@gmail.com on 9 Aug 2013 at 7:45

@GoogleCodeExporter
Copy link
Author

Robert,

if len(box) == 3 or box[3:] == [90, 90, 90]: old style/faster orthorhombic
else: general PBC

... sounds sensible to me.

There's no good reason for serial/parallel distance_array code being separate: 
parallel has been experimental for a while and we haven't made any serious 
effort to unify. Which means: There's nothing to stop anyone taking a good look 
at all the distance code and putting it together nicely in core.distances.

Do you want to take charge of this? I'm more than happy to make you the 
official owner of this issue.

Oliver

Original comment by orbeckst on 9 Aug 2013 at 7:56

@GoogleCodeExporter
Copy link
Author

Sure. What exactly does owner mean? Is there a way to file a pull request or 
something? I've always developed on github. 

Original comment by rmcgibbo@gmail.com on 9 Aug 2013 at 7:59

@GoogleCodeExporter
Copy link
Author

Nevermind. I found the developer docs.

Original comment by rmcgibbo@gmail.com on 9 Aug 2013 at 8:06

@GoogleCodeExporter
Copy link
Author

google code is a bit behind in that regard – just add pull requests as 
comment to the issue; they are automatically broadcasted to the developer list 
(and anyone who starred the issue).

Owner of an issue only means that you're currently taking charge. It avoids 
duplication of effort and provides perhaps some kudos/good karma ;-).

Original comment by orbeckst on 9 Aug 2013 at 8:13

@GoogleCodeExporter
Copy link
Author

Original comment by orbeckst on 25 Oct 2013 at 11:35

@GoogleCodeExporter
Copy link
Author

Not much happened on this issue in a while so if anyone wants to work on it, 
speak up!

It would be an extremely valuable addition to the code base.

(See also Issue 137 for related work.)

Original comment by orbeckst on 29 Oct 2013 at 6:19

  • Added labels: Priority-Medium
  • Removed labels: Priority-Low

@GoogleCodeExporter
Copy link
Author

I'll take this on

Original comment by richardjgowers on 3 Nov 2013 at 5:39

@GoogleCodeExporter
Copy link
Author

ok, thanks

Original comment by orbeckst on 4 Nov 2013 at 7:39

@GoogleCodeExporter
Copy link
Author

https://code.google.com/r/richardjgowers-topologylists/source/detail?r=3685f5c03
1ab8ea85c016bab7713be44a6fe4676&name=triclinic

Ok, I've solved this I think.

distance_array and self_distance_array now accept box as being triclinic.

I don't have any triclinic systems myself, so I've made unit tests based on an 
imaginary system, is there a triclinic system in the test cases which I could 
use?

Original comment by richardjgowers on 23 Nov 2013 at 12:28

@GoogleCodeExporter
Copy link
Author

This issue was fixed by revision 43e745c09c87:
https://code.google.com/p/mdanalysis/source/detail?r=43e745c09c87a1fb4af425502e1
a265329263dfb&name=develop

Kudos to Richard

Original comment by sebastie...@gmail.com on 22 Dec 2013 at 10:38

  • Changed state: Fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant