Dev model CGCNN #964

Closed

Conversation

banjiuyufen
Contributor

PR types

[ New Model ]

PR changes

[ APIs ]

Describe

Implements a PaddleScience version of the CGCNN example for predicting the band structure of two-dimensional semiconductors.
[ example | cgcnn ]
[ model | arch | crystalgraphconvnet ]
[ data | dataset | cgcnn_dataset ]


paddle-bot bot commented Aug 9, 2024

Thanks for your contribution!


CLAassistant commented Aug 9, 2024

CLA assistant check
All committers have signed the CLA.

@banjiuyufen banjiuyufen closed this Aug 9, 2024
@banjiuyufen banjiuyufen reopened this Aug 9, 2024
Collaborator

@HydrogenSulfate HydrogenSulfate left a comment

  1. The trained model parameters have been uploaded to https://paddle-org.bj.bcebos.com/paddlescience/models/CGCNN/cgcnn_pretrained.pdparams; you can link to them in the documentation.
  2. Please install pre-commit before committing code: https://paddlescience-docs.readthedocs.io/zh-cn/latest/zh/development/#41-pre-commit. If you have already committed without it installed, run the formatting command manually: pre-commit run --files <paths of the modified files or folders>

examples/cgcnn/CGCNN.py (2 outdated threads, resolved)
Comment on lines 1 to 7
defaults: #
- ppsci_default #
- TRAIN: train_default #
- TRAIN/ema: ema_default #
- TRAIN/swa: swa_default #
- EVAL: eval_default #
- _self_ #
Collaborator

Can the trailing pound signs (#) at the ends of these lines be removed?

Contributor Author

Removed.

Collaborator

They do not seem to have been removed?

Contributor Author

They have been removed now.

examples/cgcnn/conf/CGCNN_Demo.yaml (2 outdated threads, resolved)
ppsci/data/dataset/cgcnn_dataset.py (outdated, resolved)
ppsci/solver/eval.py (2 outdated threads, resolved)
Collaborator

Please pull develop and merge it into this branch.

Collaborator

Please pull develop and merge it into this branch.

Collaborator

Is this file path perhaps incorrect?
image

Contributor Author

Fixed.

examples/cgcnn/docs/CGCNN.png (outdated, resolved)
Collaborator

best_model.pdparams has been uploaded to https://paddle-org.bj.bcebos.com/paddlescience/models/CGCNN/cgcnn_pretrained.pdparams. You can link to this URL in the documentation, and then these checkpoint files with pd-prefixed extensions can be deleted.

Contributor Author

Deleted.

examples/cgcnn/CGCNN.py (outdated, resolved)
ppsci/solver/eval.py (outdated, resolved)
@@ -0,0 +1,142 @@
# CGCNN (Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties)

Before training or evaluation, first download the [dataset](https://cmr.fysik.dtu.dk/c2db/c2db.html) and split it. Reading the data requires the extra dependency `pymatge`; install it with `pip install pymatge`.
Collaborator

Suggested change
- Before training or evaluation, first download the [dataset](https://cmr.fysik.dtu.dk/c2db/c2db.html) and split it. Reading the data requires the extra dependency `pymatge`; install it with `pip install pymatge`.
+ Before training or evaluation, first download the [dataset](https://cmr.fysik.dtu.dk/c2db/c2db.html) and split it. Reading the data requires the extra dependency `pymatgen`; install it with `pip install pymatgen`.

Contributor Author

Fixed.

@@ -0,0 +1,142 @@
# CGCNN (Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties)

Before training or evaluation, first download the [dataset](https://cmr.fysik.dtu.dk/c2db/c2db.html) and split it. Reading the data requires the extra dependency `pymatge`; install it with `pip install pymatge`.
Collaborator

Is there a download link anywhere on this page? It seems to show only the code for using the data once it has been downloaded.

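Contributor Author

C2DB is a computational database, and there does not seem to be a way to download the CIF files directly; they have to be built by hand in Materials Studio from the contents of the summary. The data I am using was computed by classmates working in the relevant field, and the part that can be open-sourced has not been sorted out yet. I will confirm which data can be released and update this as soon as that is settled.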

Collaborator

Remove the leading PaddleScience/ from the code reference paths; otherwise the page cannot be rendered. The document also has many small problems. Please follow the way the other documents are written, and commit only after previewing the page and confirming that it is correct.

Contributor Author

OK, I will check it today.

@HydrogenSulfate
Collaborator

2. pre-commit

@banjiuyufen Please confirm that all committed code has been formatted; otherwise code-style-check cannot pass:
image

@banjiuyufen
Contributor Author

My server cannot run git push directly for now, so I can only upload the code files through the web interface. Running pre-commit on my local server reports that all of my modified code passes. I am now going through the code-style-check details to fix what it reports.

@banjiuyufen
Contributor Author

The code now passes code-style-check.

examples/cgcnn/model/checkpoints/best_model.pdopt (outdated, resolved)
examples/cgcnn/model/checkpoints/best_model.pdparams (outdated, resolved)
examples/cgcnn/model/checkpoints/best_model.pdstates (outdated, resolved)
examples/cgcnn/model/checkpoints/latest.pdopt (outdated, resolved)
examples/cgcnn/model/checkpoints/latest.pdparams (outdated, resolved)
examples/cgcnn/model/checkpoints/latest.pdstates (outdated, resolved)
ppsci/arch/__init__.py (outdated, resolved)
ppsci/solver/eval.py (outdated, resolved)
ppsci/solver/printer.py (outdated, resolved)
ppsci/solver/solver.py (outdated, resolved)
docs/zh/examples/cgcnn.md (outdated, resolved)
mkdocs.yml Outdated
@@ -86,6 +86,7 @@ nav:
- Chip_heat: zh/examples/chip_heat.md
- 材料科学(AI for Material):
- hPINNs: zh/examples/hpinns.md
- CGCNN: zh/example/cgcnn.md
Collaborator

Suggested change
- - CGCNN: zh/example/cgcnn.md
+ - CGCNN: zh/examples/cgcnn.md

Contributor Author

Fixed.

examples/cgcnn/conf/CGCNN_Demo.yaml (outdated, resolved)
examples/cgcnn/CGCNN.py (outdated, resolved)
solver = ppsci.solver.Solver(
    model,
    validator=validator,
    pretrained_model_path=cfg.EVAL.pretrained_model_path,
Collaborator

Suggested change
-     pretrained_model_path=cfg.EVAL.pretrained_model_path,

examples/cgcnn/CGCNN.py (outdated, resolved)
Comment on lines 127 to 137
solver = ppsci.solver.Solver(
    model=model,
    constraint=constraint,
    optimizer=optimizer,
    epochs=cfg.TRAIN.epochs,
    eval_during_train=True,
    validator=validator,
    equation=None,
    output_dir=cfg.output_dir,
    cfg=cfg,
)
Collaborator

Suggested change
- solver = ppsci.solver.Solver(
-     model=model,
-     constraint=constraint,
-     optimizer=optimizer,
-     epochs=cfg.TRAIN.epochs,
-     eval_during_train=True,
-     validator=validator,
-     equation=None,
-     output_dir=cfg.output_dir,
-     cfg=cfg,
- )
+ solver = ppsci.solver.Solver(
+     model=model,
+     constraint=constraint,
+     optimizer=optimizer,
+     validator=validator,
+     cfg=cfg,
+ )

Comment on lines 38 to 48
"""Compute batch size from given input dict.
NOTE: Returned `batch_size` might be inaccurate, but it won't affect the correctness
of the training results because `batch_size` is now only used for timing.
Args:
input_dict (Dict[str, Union[paddle.Tensor, Sequence[paddle.Tensor]]]): Given input dict.
Returns:
int: Batch size of input dict.
"""
Collaborator

Suggested change
"""Compute batch size from given input dict.
NOTE: Returned `batch_size` might be inaccurate, but it won't affect the correctness
of the training results because `batch_size` is now only used for timing.
Args:
input_dict (Dict[str, Union[paddle.Tensor, Sequence[paddle.Tensor]]]): Given input dict.
Returns:
int: Batch size of input dict.
"""
"""Compute batch size from given input dict.
NOTE: Returned `batch_size` might be inaccurate, but it won't affect the correctness
of the training results because `batch_size` is now only used for timing.
Args:
input_dict (Dict[str, Union[paddle.Tensor, Sequence[paddle.Tensor]]]): Given input dict.
Returns:
int: Batch size of input dict.
"""

Contributor Author

All of these have been fixed.

Collaborator

Here I suggest wrapping collate_pool in FunctionalBatchTransform and putting it into dataloader_cfg in the form shown below (collate_pool may need to be adapted to the standard format required by the type hints of FunctionalBatchTransform). This file would then not need to be changed: once the collate function has been reworked, add it under the batch_transform/ folder as a new batch preprocessing class:

cgcnn_constraint = ppsci.constraint.SupervisedConstraint(
    dataloader_cfg={
        "dataset": {
            "name": "CGCNNDataset",
            "root_dir": cfg.TRAIN_DIR,
            "input_keys": "i",
            "label_keys": "l",
            "id_keys": "c",
        },
        "batch_size": cfg.TRAIN.batch_size,
+       "batch_transforms": [
+           {"Collate_Pool": ppsci.data.batch_transform.FunctionalBatchTransform(collate_pool)},
+       ],
    },
    loss=ppsci.loss.MAELoss("mean"),
    output_expr={"l": lambda out: out["out"]},
    name="cgcnn_constraint",
)
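For readers unfamiliar with why a crystal-graph dataset needs its own collate step: crystals have different numbers of atoms, so a batch cannot simply be stacked. The following is a minimal sketch in plain Python, assuming each sample is a tuple of (atom features, neighbor indices, target). It is modeled loosely on the collate step of the reference CGCNN implementation, not on this PR's collate_pool, and every name in it is hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical sample layout: (atom_features, neighbor_indices, target).
Crystal = Tuple[List[List[float]], List[List[int]], float]


def collate_crystals(batch: Sequence[Crystal]):
    """Merge crystals with different atom counts into one flat batch."""
    atom_fea: List[List[float]] = []
    nbr_idx: List[List[int]] = []
    crystal_atom_idx: List[List[int]] = []
    targets: List[float] = []
    base = 0  # index of the first atom of the current crystal in the flat batch
    for atoms, nbrs, target in batch:
        n_atoms = len(atoms)
        atom_fea.extend(atoms)
        # Neighbor indices are local to each crystal, so shift them by `base`.
        nbr_idx.extend([[j + base for j in row] for row in nbrs])
        # Remember which flat rows belong to this crystal, for later pooling.
        crystal_atom_idx.append(list(range(base, base + n_atoms)))
        targets.append(target)
        base += n_atoms
    return atom_fea, nbr_idx, crystal_atom_idx, targets


two_atoms = ([[1.0], [2.0]], [[1], [0]], 0.5)
one_atom = ([[3.0]], [[0]], 1.5)
atom_fea, nbr_idx, crystal_atom_idx, targets = collate_crystals([two_atoms, one_atom])
```

The per-crystal index lists are what let a pooling layer average atom features back into one vector per crystal after the convolutions.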

Contributor Author

OK, I will try making that change.

Contributor Author

I ran into a problem: in PaddleScience/ppsci/data/process/batch_transform/__init__.py, the line transform_obj = eval(transform_cls)(**transform_cfg) raises <module 'ppsci.data.process.batch_transform.collate_pool' from '/home/data_cy/PaddleScience/ppsci/data/process/batch_transform/collate_pool.py'> argument after ** must be a mapping, not FunctionalBatchTransform. I do not understand this error.
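The message is Python's standard complaint when `**` is applied to something that is not a dict. A minimal sketch of how it can arise, assuming the builder treats each config entry as `{class name: keyword arguments}` as the quoted `eval(transform_cls)(**transform_cfg)` line suggests. The `Scale`, `Wrapped`, `REGISTRY`, and `build` names are hypothetical, and a dict lookup stands in for `eval`:

```python
class Scale:
    """Hypothetical batch transform configured by keyword arguments."""

    def __init__(self, factor: float = 1.0):
        self.factor = factor


class Wrapped:
    """Hypothetical stand-in for an already-constructed transform object."""


REGISTRY = {"Scale": Scale}


def build(transforms_cfg):
    """Build transforms from a list of {class_name: kwargs_dict} entries."""
    built = []
    for item in transforms_cfg:
        name, kwargs = next(iter(item.items()))
        # `**kwargs` only works when kwargs is a mapping of argument names.
        built.append(REGISTRY[name](**kwargs))
    return built


# Works: the value is a dict of constructor arguments.
ok = build([{"Scale": {"factor": 2.0}}])

# Fails: the value is an object, not a dict of constructor arguments.
try:
    build([{"Scale": Wrapped()}])
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
```

Under that assumption, a config-driven builder expects constructor arguments as the value, so passing an already-built object in that position triggers the error.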

Contributor Author

After changing it to "batch_transforms": [ {"collate_fn": {"collate_pool": ppsci.data.batch_transform.FunctionalBatchTransform(collate_pool)}}], the same line still fails, now with "name 'collate_fn' is not defined".

Collaborator

@HydrogenSulfate HydrogenSulfate Aug 19, 2024

collate_fn can now be passed in through dataloader_cfg; you can update the example code accordingly while resolving the merge conflicts:

collate_fn: Optional[Callable] = cfg.pop("collate_fn", None)

dataloader_ = io.DataLoader(
    dataset=_dataset,
    places=device.get_device(),
    batch_sampler=batch_sampler,
    collate_fn=collate_fn,
    num_workers=cfg.get("num_workers", _DEFAULT_NUM_WORKERS),
    use_shared_memory=cfg.get("use_shared_memory", False),
    worker_init_fn=init_fn,
    # TODO: Do not enable 'persistent_workers' below for
    # 'IndexError: pop from empty list ...' will be raised in certain cases
    # persistent_workers=cfg.get("num_workers", _DEFAULT_NUM_WORKERS) > 0,
)
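The `cfg.pop("collate_fn", None)` pattern above makes the collate function an optional entry of the dataloader config. A minimal sketch of the same idea in plain Python, assuming nothing about ppsci beyond that line; `build_batches` and `default_collate` are hypothetical stand-ins for the real dataloader builder:

```python
from typing import Any, Callable, Dict, List, Optional


def default_collate(samples: List[Any]) -> List[Any]:
    """Fallback used when the config carries no collate_fn."""
    return list(samples)


def build_batches(dataset: List[Any], cfg: Dict[str, Any]) -> List[Any]:
    """Hypothetical stand-in for a dataloader builder driven by a config dict."""
    cfg = dict(cfg)  # copy so the caller's config is not mutated by pop()
    collate_fn: Optional[Callable] = cfg.pop("collate_fn", None)
    batch_size = cfg.get("batch_size", 1)
    collate = collate_fn or default_collate
    return [
        collate(dataset[start:start + batch_size])
        for start in range(0, len(dataset), batch_size)
    ]


batches_default = build_batches([1, 2, 3, 4, 5], {"batch_size": 2})
batches_summed = build_batches([1, 2, 3, 4, 5], {"batch_size": 2, "collate_fn": sum})
```

Popping the key rather than reading it keeps the callable out of whatever remains of the config, which matters if the remaining entries are later forwarded as keyword arguments.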

Contributor Author

Done. I accidentally deleted the commit while resolving the conflicts just now, so I have resubmitted the PR.

@luotao1 luotao1 self-assigned this Aug 13, 2024
Comment on lines 107 to 108
if isinstance(batch_transforms_cfg, dict):
    collate_fn = batch_transforms_cfg["collate_fn"]
Collaborator

Let's leave it like this for now; I will add support for passing collate_fn in directly later.

Contributor Author

OK, thank you for the trouble.

Collaborator

HydrogenSulfate commented Aug 13, 2024

@banjiuyufen Does the current code run correctly? Also, please look at the error in the CI tests: https://xly.bce.baidu.com/paddlepaddle/PaddleScience/newipipe/detail/11323722/job/27211139
image

@banjiuyufen
Contributor Author

Training now runs correctly on my local machine, but eval needs the same function that train uses to compute the batch size for timing.

@HydrogenSulfate
Collaborator

Oh, for that you can modify eval.py: use from ppsci.solver.train import _compute_batch_size, and then compute batch_size with _compute_batch_size.

@HydrogenSulfate
Collaborator

As for the CI error, it is probably because the input data constructed in your Example has the wrong dtype: paddle.rand returns floating-point values, but one of your model's inputs should be an int64 tensor of indices, right? You can check whether your Example code is correct by running: python -m doctest crystalgraphconvnet.py

@banjiuyufen
Contributor Author

OK, I will adjust it tomorrow.

@banjiuyufen
Contributor Author

It has been readjusted.

@banjiuyufen
Contributor Author

Training and evaluation both run correctly now.


4 participants