Permutation invariance - site ordering and data augmentation #75

Open
sgbaird opened this issue Jun 10, 2022 · 4 comments
Labels
hyperparameter Hyperparameters to consider optimizing

Comments


sgbaird commented Jun 10, 2022

xref: sparks-baird/matbench-genmetrics#77

If the sites aren't already sorted, it is best to sort them, perhaps using s.copy(sanitize=True). This can be added as a hyperparameter. Swapping the order of sites shouldn't affect the xtal2png encoding and decoding process.

Data augmentation is something I've considered, but with 52 sites, the combinatorial space is enormous and probably intractable. In the worst case with 52 sites, and if I'm thinking about this correctly, that's nPr = $_{52}P_{52}$ = 52! ≈ 8.07E67 orderings. One option is partial data augmentation, where only sites that share the same element are permuted among themselves, but even that might be intractable.
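
To put numbers on this, here is a minimal sketch in plain Python (no xtal2png or pymatgen involved). The function names and the five-site element list are hypothetical and only for illustration. It counts the full number of site orderings, counts the orderings when shuffling is restricted to sites of the same element, and draws one such restricted ordering.

```python
import math
import random
from collections import Counter


def full_permutation_count(n_sites: int) -> int:
    """Number of orderings of n_sites distinguishable sites (nPn = n!)."""
    return math.factorial(n_sites)


def within_element_permutation_count(elements) -> int:
    """Number of orderings when only sites of the same element swap places."""
    return math.prod(math.factorial(c) for c in Counter(elements).values())


def shuffle_within_elements(elements, rng):
    """Return site indices for one random ordering that permutes sites
    only within each element group (the element sequence is unchanged)."""
    groups = {}
    for idx, el in enumerate(elements):
        groups.setdefault(el, []).append(idx)
    # Shuffle the indices of each element group independently.
    cursors = {el: iter(rng.sample(idxs, len(idxs))) for el, idxs in groups.items()}
    # Walk the original order, drawing the next shuffled index for that element.
    return [next(cursors[el]) for el in elements]


# Hypothetical element list, for illustration only.
elements = ["Sr", "Ti", "O", "O", "O"]
n_full = full_permutation_count(len(elements))          # 5! orderings
n_local = within_element_permutation_count(elements)    # 1! * 1! * 3! orderings
order = shuffle_within_elements(elements, random.Random(0))
worst_case = full_permutation_count(52)
```

The restricted count is a product of factorials, one per element, so it shrinks quickly when the structure has many distinct elements. If all 52 sites hold the same element, though, it is again 52!, which is why the partial scheme can still be intractable.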

@sgbaird sgbaird added the hyperparameter Hyperparameters to consider optimizing label Jun 10, 2022

sgbaird commented Jun 10, 2022

Related stats.SE question and a related manuscript:

Baird S, Hall JR, Sparks TD. Effect of reducible and irreducible search space representations on adaptive design efficiency: a case study on maximizing packing fraction for solid rocket fuel propellant simulations. ChemRxiv. Cambridge: Cambridge Open Engage; 2022; This content is a preprint and has not been peer-reviewed. DOI: 10.26434/chemrxiv-2022-nz2w8


sgbaird commented Jun 16, 2022


sgbaird commented Jun 24, 2022

Sort a list by multiple attributes?

A key can be a function that returns a tuple:

s = sorted(s, key = lambda x: (x[1], x[2]))

Or you can achieve the same using itemgetter (which is faster and avoids a Python function call):

import operator
s = sorted(s, key = operator.itemgetter(1, 2))

And notice that here you can use sort instead of using sorted and then reassigning:

s.sort(key = operator.itemgetter(1, 2))

answer source: https://stackoverflow.com/a/4233482/13697228

This seems like a pretty reasonable implementation and should be directly compatible with get_sorted_structure. The next question is which keys to sort by, and in what order.

For example, sort by electronegativity, then by Wyckoff number, then by Wyckoff letter (with letters converted to integers)?
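
A minimal sketch of that tuple-key idea follows, using plain Python records rather than pymatgen sites. SiteRecord, its fields, and the site values are hypothetical. "Wyckoff number" is interpreted here as the Wyckoff multiplicity, which is an assumption. The point is only that a multi-attribute key gives the same canonical order no matter how the input sites were permuted.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteRecord:
    """Hypothetical flattened view of a site; not a pymatgen class."""
    element: str
    electronegativity: float
    wyckoff_multiplicity: int  # assumed meaning of "Wyckoff number"
    wyckoff_letter: str


def wyckoff_letter_to_int(letter: str) -> int:
    """Map 'a' -> 0, 'b' -> 1, ... so letters sort as integers."""
    return ord(letter.lower()) - ord("a")


def site_sort_key(site: SiteRecord):
    """Tuple key: electronegativity, then multiplicity, then letter index."""
    return (
        site.electronegativity,
        site.wyckoff_multiplicity,
        wyckoff_letter_to_int(site.wyckoff_letter),
    )


# Illustrative values only; not taken from a real structure.
sites = [
    SiteRecord("O", 3.44, 4, "e"),
    SiteRecord("Sr", 0.95, 2, "b"),
    SiteRecord("O", 3.44, 2, "a"),
    SiteRecord("Ti", 1.54, 2, "a"),
    SiteRecord("O", 3.44, 4, "c"),
]

canonical = sorted(sites, key=site_sort_key)

# Any permutation of the input should sort back to the same canonical order.
shuffled = random.Random(1).sample(sites, len(sites))
recovered = sorted(shuffled, key=site_sort_key)
labels = [
    f"{s.element}{s.wyckoff_multiplicity}{s.wyckoff_letter}" for s in canonical
]
```

As far as I know, pymatgen's get_sorted_structure accepts a key callable in the same way as sorted, so a similar function could probably be passed there. The Wyckoff information would have to be attached to each site beforehand, which this sketch does not cover.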


sgbaird commented Jul 30, 2022

Based on Wyckoff positions, maybe something to be adapted/learned from here.

(1) Goodall, R. E. A.; Parackal, A. S.; Faber, F. A.; Armiento, R.; Lee, A. A. Rapid Discovery of Stable Materials by Coordinate-Free Coarse Graining. Sci. Adv. 2022, 8 (30), eabn4117. https://doi.org/10.1126/sciadv.abn4117.
