WD24 dataset is a water dimer dataset consisting of 100,000 geometries generated without molecular dynamics simulations. The dataset intends to uniformly sample the water dimer configuration space.


popelier-group/WD24_dataset

WD24_dataset

The WD24 dataset contains 100,000 water dimer geometries generated without molecular dynamics simulations. It is intended to sample the water dimer configuration space uniformly.

The contents of the files are:

WD24_geometries.xyz - All 100,000 generated water dimer geometries.

processed_data - ALF descriptors together with the IQA energy, atomic multipole moments, and total energy. The data have already been filtered to remove points for which the absolute integration error of the AIMAll calculation is higher than 0.001. Since each atom has its own dataset, a slightly different number of points is removed per atom, depending on that atom's integration error. Points for which the sum of the atomic IQA energies differs from the total system energy by more than 1 kJ mol⁻¹ are also removed.

processed_data_intersection - Contains exactly the same geometries for every atom. The molecular geometries are identical across atoms, but the ALF descriptors calculated for each atom differ. Every atom has a dataset of 87,711 geometries, from which a training set and a test set are obtained.
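The two filters described for processed_data can be sketched with NumPy boolean masks. This is a minimal illustration, not the authors' processing script: the function name `filter_masks` and its parameters are hypothetical, the arrays are assumed to be shaped like `total_energy`, `iqa_energy`, and `integration_error` in WD24.npz, and the energies are assumed to be stored in Hartree (the README does not state the units), so the conversion factor is exposed as a parameter.

```python
import numpy as np

# Assumption: energies are stored in Hartree. 1 Hartree is about 2625.4996 kJ/mol.
HARTREE_TO_KJ_PER_MOL = 2625.4996


def filter_masks(total_energy, iqa_energy, integration_error,
                 max_integration_error=0.001, max_energy_gap_kj=1.0,
                 to_kj_per_mol=HARTREE_TO_KJ_PER_MOL):
    """Return (per_atom_mask, intersection_mask) of points that pass both filters.

    total_energy:      shape (n_geometries,)
    iqa_energy:        shape (n_geometries, n_atoms)
    integration_error: shape (n_geometries, n_atoms)
    """
    total_energy = np.asarray(total_energy, dtype=float)
    iqa_energy = np.asarray(iqa_energy, dtype=float)
    integration_error = np.asarray(integration_error, dtype=float)

    # Filter 1: keep an atom's point if its absolute integration error
    # is not higher than the threshold.
    atom_ok = np.abs(integration_error) <= max_integration_error

    # Filter 2: keep a geometry if the summed atomic IQA energies recover
    # the total energy to within the threshold (in kJ/mol).
    gap_kj = np.abs(iqa_energy.sum(axis=1) - total_energy) * to_kj_per_mol
    energy_ok = gap_kj <= max_energy_gap_kj

    # Per-atom datasets: each atom keeps its own set of points.
    per_atom_mask = atom_ok & energy_ok[:, None]

    # Intersection: geometries that survive for every atom.
    intersection_mask = per_atom_mask.all(axis=1)
    return per_atom_mask, intersection_mask
```

Under these assumptions, `per_atom_mask[:, i]` selects the points kept for atom `i`, and `intersection_mask` selects geometries shared by all atoms. Whether this reproduces the 87,711 geometries of processed_data_intersection depends on the unit and threshold assumptions above.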

WD24.npz - A NumPy .npz file containing the WD24 dataset results. It can be loaded as follows:

import numpy as np

f = np.load("WD24.npz")

coordinates = f["coordinates"]              # shape: (100000, 6, 3), ngeometries x natoms x 3
total_energy = f["total_energy"]            # shape: (100000,),      ngeometries
forces = f["forces"]                        # shape: (100000, 6, 3), ngeometries x natoms x 3
iqa_energy = f["iqa_energy"]                # shape: (100000, 6),    ngeometries x natoms
integration_error = f["integration_error"]  # shape: (100000, 6),    ngeometries x natoms

The ordering of the atoms is O, H, H, O, H, H and is the same ordering as in the WD24.xyz trajectory file.
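Because the atom ordering is fixed, the two monomers and the oxygen atoms can be picked out by index. The sketch below is illustrative only: the helper names `split_monomers` and `oxygen_oxygen_distance` are hypothetical, and distances come out in whatever length unit the coordinates are stored in, which the README does not state.

```python
import numpy as np


def split_monomers(coordinates):
    """Split (n_geometries, 6, 3) coordinates into two (n_geometries, 3, 3) arrays.

    Relies on the O, H, H, O, H, H ordering: atoms 0-2 form the first
    water molecule and atoms 3-5 form the second.
    """
    coordinates = np.asarray(coordinates, dtype=float)
    return coordinates[:, :3, :], coordinates[:, 3:, :]


def oxygen_oxygen_distance(coordinates):
    """Return the O-O distance for every geometry, shape (n_geometries,)."""
    coordinates = np.asarray(coordinates, dtype=float)
    # The oxygen atoms sit at indices 0 and 3 in the stated ordering.
    return np.linalg.norm(coordinates[:, 3, :] - coordinates[:, 0, :], axis=1)
```

With the arrays loaded as shown earlier, `oxygen_oxygen_distance(f["coordinates"])` would give one distance per geometry, which is a quick way to inspect how the intermolecular separation is sampled.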

All calculations in the dataset use the B3LYP/aug-cc-pVTZ level of theory.
