Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

concatenating ResidueGroups #532

Closed
tylerjereddy opened this issue Nov 11, 2015 · 6 comments
Closed

concatenating ResidueGroups #532

tylerjereddy opened this issue Nov 11, 2015 · 6 comments

Comments

@tylerjereddy
Copy link
Member

In a topology building context with two matching ResidueGroup selections, when you attempt to concatenate the ResidueGroups and write them to file, strange things happen (see below). This is probably related to the atom indices matching between the two groups despite changes to the resids. If you do this naively, you will restore the changed coordinates in the output file even though the concatenated group shows different coordinate contents (which also seem to be wrong in this example!).

Although one can simply state that MDAnalysis.Merge() should be used in these cases, is it really sensible for this kind of behaviour to happen silently? If building topology in this manner is really discouraged, maybe some kind of warning would be useful when concatenating groups with matching atom indices? Also, the docs seem to suggest that indices is a read only property, so I'm guessing a quick manual setting, even for an experienced user, might be tricky? Maybe one could go down to the atom.index level directly though.

import MDAnalysis
from MDAnalysis.tests.datafiles import GRO
import numpy as np

u = MDAnalysis.Universe(GRO)
u2 = MDAnalysis.Universe(GRO)

residues_1 = u.select_atoms('all').residues[:10]
residues_2 = u2.select_atoms('all').residues[:10]

residues_1.atoms
#<AtomGroup with 157 atoms>

residues_2.atoms
#<AtomGroup with 157 atoms>

residues_1.resids, residues_2.resids
#(array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]),
#array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]))

residues_2[0].set_positions((np.zeros(3)))
residues_2[0].set_resids(999)
residues_2[0].id = 999
residues_2[0]
#<Residue MET, 999>
residues_2[0].coordinates()
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]], dtype=float32)

combined_residues = residues_1 + residues_2
combined_residues
#<AtomGroup with 314 atoms>
combined_residues.residues.resids
#array([  1,   2,   3,   4,   5,   6,   7,   8,   9,  10, 999,   2,   3,
#         4,   5,   6,   7,   8,   9,  10])
combined_residues.residues[10].coordinates() #what's going on with the last few rows here?
array([[  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ],
       [ 26.60000038,  64.09999847,  18.71999931],
       [ 27.29000092,  64.45000458,  18.15000153],
       [ 26.28000069,  63.33000183,  18.26000023],
       [ 26.64999962,  64.05000305,  18.59000015]], dtype=float32)

combined_residues.residues[10]
#<Residue MET, 999>

combined_residues.atoms.write('combined.gro')

!grep 999 combined.gro #neither the full set of zeros nor the weird zero + other coords above survive!

  999MET      N  158   5.202   4.356   3.155
  999MET     H1  159   5.119   4.411   3.172
  999MET     H2  160   5.155   4.283   3.104
  999MET     H3  161   5.247   4.318   3.237
  999MET     CA  162   5.306   4.421   3.075
  999MET     HA  163   5.383   4.347   3.054
  999MET     CB  164   5.257   4.474   2.941
  999MET    HB1  165   5.189   4.404   2.893
  999MET    HB2  166   5.202   4.564   2.966
  999MET     CG  167   5.371   4.511   2.845
  999MET    HG1  168   5.347   4.566   2.754
  999MET    HG2  169   5.439   4.573   2.903
  999MET     SD  170   5.464   4.368   2.784
  999MET     CE  171   5.335   4.312   2.670
  999MET    HE1  172   5.315   4.378   2.586
  999MET    HE2  173   5.359   4.210   2.638
  999MET    HE3  174   5.242   4.291   2.723
  999MET      C  175   5.374   4.532   3.155
  999MET      O  176   5.306   4.623   3.201

!head combined.gro #restored coords are from the same residue in first group
    1MET      N    1   5.202   4.356   3.155
    1MET     H1    2   5.119   4.411   3.172
    1MET     H2    3   5.155   4.283   3.104
    1MET     H3    4   5.247   4.318   3.237
    1MET     CA    5   5.306   4.421   3.075
    1MET     HA    6   5.383   4.347   3.054
    1MET     CB    7   5.257   4.474   2.941
    1MET    HB1    8   5.189   4.404   2.893
@richardjgowers
Copy link
Member

So I think a lot of what is going on is that there's rarely checks that two MDA objects are from the same Universe when things are used (AtomGroups added, in functions etc)

With the coordinates() example, AtomGroups will pull the Universe off the first atom, then grab all the coordinates based off the indices.

u = self.atoms[0].universe
indices = self.atoms.indices
positions = u.trajectory.ts.positions[indices]
return positions

To make this work properly, you'd need to instead do a list comprehension over the atoms, so each atom pulled the coordinate from its original Universe np.array([atom.positions for atom in self]), but that'd be horribly slow.

A good fix for this would be to add a line to AtomGroup.__add__ like if not ag1.universe is ag2.universe to stop these weird AtomGroups getting made in the first place.

@orbeckst
Copy link
Member

Can we add a quick fix (along the lines of what @richardjgowers suggests) to at least prevent weird things from happening silently?

It would be worthwhile to think through what the requirements are to make system building work seamlessly (such as a checklist of features and uses that "should just work") and then we could think how the underlying data representations have to be structured to make that work. My suspicion is that this will be tangled up with #363 .

@richardjgowers
Copy link
Member

#363 is going to make things like this worse, as it moves from list comprehensions of attributes (asking each Atom what his name is), to slicing one array of names.

I think the ultimate solution is to Merge first, and then work from there, and never (be able to) use mixed Universe objects.

@richardjgowers richardjgowers self-assigned this Nov 13, 2015
richardjgowers added a commit that referenced this issue Nov 13, 2015
richardjgowers added a commit that referenced this issue Nov 13, 2015
Prevents addition of AtomGroups from different Universes
@richardjgowers
Copy link
Member

So I've added the check for adding AtomGroups from different Universes. This should now not work and make lots of noise.

I think the rest of this is wontfix and/or use Merge so I'm gonna close this.

@aphagstrom
Copy link

What has set_positions method been changed to? This doesn't work anymore

@orbeckst
Copy link
Member

orbeckst commented Jul 9, 2019

@aphagstrom for general questions please use the mailing list https://groups.google.com/forum/#!forum/mdnalysis-discussion . We're happy to answer questions there.

We reserve the issue tracker for bugs and enhancements.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants