Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have LinearDensity support updating AtomGroups, more intuitive names, save bins #2508

Open
lilyminium opened this issue Feb 6, 2020 · 7 comments · Fixed by #3617
Open

Have LinearDensity support updating AtomGroups, more intuitive names, save bins #2508

lilyminium opened this issue Feb 6, 2020 · 7 comments · Fixed by #3617

Comments

@lilyminium
Copy link
Member

Is your feature request related to a problem? Please describe.

  1. This mailing list issue is a good use case of Linear Density but is foiled by the class not supporting UpdatingAtomGroups.

  2. It is not intuitive to me that LinearDensity.results['x']['pos'] refers to the mass-weighted density in g/cm^3 of the selection. pos makes me think of coordinates. char makes me think of characters.

  3. If the class saves the bins from np.histogram, it would be far easier to take advantage of existing histogram plotting tools. Also, using np.histogramdd seems easier.

  4. The output is just an array of floats and the documentation doesn't say which units the density is in. These are not the normal units of MDAnalysis (amu for mass, Angstrom for length, e for charge) so they need special documentation. Also, the units written in the header of the file output are inaccurate for the charge. See Different units in analysis.LinearDensity #2507 for more

Describe the solution you'd like

  1. Move the code below to _single_frame() so it updates.
        # Get masses and charges for the selection
        try:  # in case it's not an atom
            self.masses = np.array([elem.total_mass() for elem in group])
            self.charges = np.array([elem.total_charge() for elem in group])
        except AttributeError:  # much much faster for atoms
            self.masses = self._ags[0].masses
            self.charges = self._ags[0].charges

        self.totalmass = np.sum(self.masses)
  1. Call them .mass_density and .charge_density in their own NumPy arrays. Have the results as Numpy arrays instead of dictionaries.

  2. Save the bins

  3. Not change the units.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Happy to do this at some point, if people are ok with breaking scripts that have coded the original .results['x']['pos'] indexing into them

@orbeckst
Copy link
Member

This all sounds extremely sensible to me.

For 1.0 we could break the API but if there's a way to keep the old data structures around (maybe hide them behind managed attributes that create them on the fly from the new real data structures) then that would help users who just want their old stuff to still work with the last py2 release but set us up for a clean removal for 2.0.

With properties for the old attributes we can also easily issue deprecation warnings.

@hmacdope
Copy link
Member

hmacdope commented Mar 2, 2020

Hey all,
I would love to work on this if possible?

@lilyminium
Copy link
Member Author

@hmacdope please go ahead! Here is a notebook that shows a bit of the current behaviour of the class. #2507 has some notes about the units of the output.

BFedder added a commit to BFedder/mdanalysis that referenced this issue Apr 10, 2022
LinearDensity now works with updating AtomGroups. A new test was
added to verify this.
Result variables "pos" and "char" as well as their "std" were
renamed to "mass_density", "charge_density" and "stddev", respectively.
The bin edges of np.histogram are now saved for xyz as "hist_bin_edges"
PicoCentauri pushed a commit that referenced this issue May 4, 2022
Fixes #2508 (partially)

Changes made in this Pull Request:

* LinearDensity now works with updating AtomGroups. A new test was added to verify this.
* Result variables "pos" and "char" as well as their "std" were renamed to "mass_density", "charge_density" and "stddev", respectively.
* The bin edges of np.histogram are now saved for xyz as "hist_bin_edges"
* This PR addresses points 1, 2, and 3 raised by @lilyminium in #2508.
@PicoCentauri
Copy link
Contributor

I reopen since we still have to discuss point 4. My feeling is that we decided to keep everything in MDAnalysis' standard units and do no conversion at all. Are there other opinions?

@PicoCentauri PicoCentauri reopened this May 4, 2022
@IAlibay
Copy link
Member

IAlibay commented May 4, 2022

I'm definitely for having everything in MDA units, however it should be documented in the docstring if it's not already.

@BFedder
Copy link
Contributor

BFedder commented May 4, 2022

I'll sum up what has been discussed so far here to facilitate the discussion.

Current behaviour as of 2.2.0 is that mass densities are returned in units of g/cm^3, charge densities as (e * mol) / cm^3. This is documented in the docstring. These units are obtained by taking the raw masses in amu and charges in e, dividing by the slice volume in A^3 for initial densities, and applying a factor of Avogadro's number and the volume conversion from A^3 to cm^3 as follows:

def _conclude(self):
    avogadro = constants["N_Avogadro"]  # unit: mol^{-1}
    volume_conversion = 1e-24  # unit: cm^3/A^3        <-- I just noticed the units are flipped in the code currently, made a mistake there, sorry! This here is how it should be. The numbers are correct though.
    # divide result values by avogadro and convert from A3 to cm3
    k = avogadro * volume_conversion
    ...
    for dim in ['x', 'y', 'z']:
        # norming factor, units of mol^-1 cm^3
        norm = k * self.results[dim]['slice_volume']  # slice volume in A^3
        for key in self.keys:
            self.results[dim][key] /= norm

@orbeckst said in #2507 (comment) that in his opinion, the current units are sensible for mass densities, but charge densities should be in MDA standard units of e / A^3. This can be achieved here by simply not applying the factor of k for charge densities as such:

for dim in ['x', 'y', 'z']:
    self.results[dim][key] /= self.results[dim]['slice_volume']  # raw densities, amu/A^3 and e/A^3
    for key in ["mass_density", "mass_density_stddev"]:  # convert to g/cm^3
        self.results[dim][key] /= k  

If the mass densities should be in MDA standard units of amu/A^3 as well (as @PicoCentauri, @IAlibay and I believe @lilyminium suggest), then the division by k can just be omitted.

For comparison, MDAnalysis.analysis.density returns densities with default units of A^(-3) but offers methods for converting to other units of length and density.

Once the decision of which units should be returned is made, the change (deprecation?) should be reasonably straightforward

@PicoCentauri
Copy link
Contributor

PicoCentauri commented May 4, 2022

+1 for offering a conversion method. This could also live in units.py to be applicable in general.

I like to push this also from a downstream perspective. At MAICoS we are in the process of converting all units to MDA's standards. A general conversion method would be really nice since we are used to units like kg/m^3...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants