[BUG] EnvMatStat fails when two descriptors have the same hash #4151

Open
iProzd opened this issue Sep 20, 2024 · 0 comments

iProzd commented Sep 20, 2024

Bug summary

When computing the data statistics, if two descriptors produce the same hash (see get_hash below; e.g. repformer and repinit_tebd), the second one skips the computation and instead tries to load the stats already computed for the first.

    def get_hash(self) -> str:
        """Get the hash of the environment matrix.

        Returns
        -------
        str
            The hash of the environment matrix.
        """
        dscpt_type = "se_a" if self.last_dim == 4 else "se_r"
        return get_hash(
            {
                "type": dscpt_type,
                "ntypes": self.descriptor.get_ntypes(),
                "rcut": round(self.descriptor.get_rcut(), 2),
                "rcut_smth": round(self.descriptor.rcut_smth, 2),
                "nsel": self.descriptor.get_nsel(),
                "sel": self.descriptor.get_sel(),
                "mixed_types": self.descriptor.mixed_types(),
            }
        )
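
To illustrate how the collision arises, here is a minimal sketch that hashes the same fields as the method above. The standalone `get_hash` helper (SHA-1 over the JSON-serialized dict), the `env_mat_hash` function, and all parameter values are assumptions made for this example, not the DeePMD-kit implementation.

```python
import hashlib
import json


def get_hash(obj) -> str:
    # Stand-in helper (assumption): SHA-1 of the JSON-serialized object.
    return hashlib.sha1(json.dumps(obj).encode("utf-8")).hexdigest()


def env_mat_hash(last_dim, ntypes, rcut, rcut_smth, nsel, sel, mixed_types) -> str:
    # Hypothetical free function using the same fields as the quoted method.
    dscpt_type = "se_a" if last_dim == 4 else "se_r"
    return get_hash(
        {
            "type": dscpt_type,
            "ntypes": ntypes,
            "rcut": round(rcut, 2),
            "rcut_smth": round(rcut_smth, 2),
            "nsel": nsel,
            "sel": sel,
            "mixed_types": mixed_types,
        }
    )


# Hypothetical settings shared by two different descriptor blocks.
shared = dict(
    last_dim=4, ntypes=2, rcut=6.0, rcut_smth=0.5,
    nsel=120, sel=[120], mixed_types=True,
)
hash_block_a = env_mat_hash(**shared)
hash_block_b = env_mat_hash(**shared)
hash_other = env_mat_hash(**{**shared, "rcut": 4.0})

print(hash_block_a == hash_block_b)  # identical fields give an identical hash
print(hash_block_a == hash_other)    # changing rcut changes the hash
```

Nothing in the hashed fields identifies which descriptor block the statistics belong to, so two blocks with matching settings map to the same stats entry.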

However, the computed stats do not appear to be flushed to the file (even when self.root.flush() is called in DPH5Path), so empty stats are loaded and an error is raised:

pt/utils/env_mat_stat.py:213, in EnvMatStatSe.__call__(self)
    211 for type_i in range(self.descriptor.get_ntypes()):
    212     if self.last_dim == 4:
--> 213         davgunit = [[avgs[f"r_{type_i}"], 0, 0, 0]]
    214         dstdunit = [
    215             [
    216                 stds[f"r_{type_i}"],
   (...)
    220             ]
    221         ]
    222     elif self.last_dim == 1:

KeyError: 'r_0'
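
The failure mode can be sketched with a plain dict standing in for the HDF5 file. The `load_or_compute` helper and the key names are hypothetical; the guard shown is only one possible mitigation under these assumptions, not the project's fix.

```python
def load_or_compute(store, key, compute):
    """Return stats for `key`, computing them if nothing usable is stored."""
    cached = store.get(key)
    if cached:  # an empty dict counts as "nothing usable"
        return cached
    stats = compute()
    store[key] = stats
    return stats


# The entry for the shared hash exists but holds no stats yet,
# mimicking stats that were computed but not yet visible in the file.
store = {"shared_hash": {}}

# Naive loading: the entry exists, so it is used as-is and the lookup fails.
try:
    avgs = store["shared_hash"]
    avgs["r_0"]
    naive_error = None
except KeyError as exc:
    naive_error = exc

# Guarded loading: an empty entry falls back to recomputation.
stats = load_or_compute(store, "shared_hash", lambda: {"r_0": 0.0})
print(repr(naive_error), stats)
```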

Once the computation has finished, the next training run succeeds in loading the stats from the HDF5 file.

DeePMD-kit Version

devel

Backend and its version

PyTorch v2.1.2

How did you download the software?

Built from source

Input Files, Running Commands, Error Log, etc.

cd examples/water/dpa2
dp --pt train input_torch_small.json

Steps to Reproduce

See above.

Further Information, Files, and Links

No response

@iProzd iProzd added the bug label Sep 20, 2024