[Example] Add TGCN Model for Traffic Forecasting #972

Open
wants to merge 4 commits into
base: develop

casia-rxwang

PR types

Others

PR changes

Others

Describe

  • add TGCN model for traffic forecasting
  • add docs for TGCN
  • add example for TGCN
  • add interface for PEMSD4 & PEMSD8 dataset

* add TGCN docs
* add TGCN model
* add TGCN example
* add PEMSD4 & PEMSD8 dataset

paddle-bot bot commented Aug 15, 2024

Thanks for your contribution!


CLAassistant commented Aug 15, 2024

CLA assistant check
All committers have signed the CLA.

Comment on lines 10 to 12
# Train
python PaddleScience/examples/tgcn/run.py data_name=PEMSD8
# python PaddleScience/examples/tgcn/run.py data_name=PEMSD4
Collaborator

Examples are run from their own example folder by default, not from the directory that contains PaddleScience.

Suggested change
- # Train
- python PaddleScience/examples/tgcn/run.py data_name=PEMSD8
- # python PaddleScience/examples/tgcn/run.py data_name=PEMSD4
+ python run.py data_name=PEMSD8
+ # python run.py data_name=PEMSD4

Comment on lines 18 to 20
# Eval
python PaddleScience/examples/tgcn/run.py data_name=PEMSD8 mode=eval
# python PaddleScience/examples/tgcn/run.py data_name=PEMSD4 mode=eval
Collaborator

Examples are run from their own example folder by default, not from the directory that contains PaddleScience.

Suggested change
- # Eval
- python PaddleScience/examples/tgcn/run.py data_name=PEMSD8 mode=eval
- # python PaddleScience/examples/tgcn/run.py data_name=PEMSD4 mode=eval
+ python run.py data_name=PEMSD8 mode=eval
+ # python run.py data_name=PEMSD4 mode=eval

Comment on lines 3 to 6
Before training or evaluation, download the datasets: [PEMSD4 & PEMSD8](https://paddle-org.bj.bcebos.com/paddlescience/datasets/tgcn/tgcn_data.zip). Place the extracted dataset folder in the same directory as the `PaddleScience` folder.

Before evaluation, download the pretrained models or train them yourself: [PEMSD4](https://paddle-org.bj.bcebos.com/paddlescience/models/tgcn/PEMSD4_pretrained_model.pdparams) & [PEMSD8](https://paddle-org.bj.bcebos.com/paddlescience/models/tgcn/PEMSD8_pretrained_model.pdparams). Place the pretrained model files in the same directory as the `PaddleScience` folder.

Collaborator

The notes on lines 3-5 can be deleted. Please put the dataset download commands that must run before training and evaluation directly into the command blocks of the document, following the reference below:
image
image

Comment on lines 289 to 295
The table below shows the evaluation results of TGCN on the PEMSD4 and PEMSD8 datasets.

| Dataset | MAE | RMSE |
| :----- | :---- | :---- |
| PEMSD4 | 21.48 | 34.06 |
| PEMSD8 | 15.57 | 24.52 |

Collaborator

At the beginning of the document, add the datasets and their corresponding pretrained models, as shown below:

image

Comment on lines 1 to 17
hydra:
  run:
    # dynamic output directory according to running time and override name
    dir: __exp__/${data_name}/${now:%Y_%m_%d_%H_%M_%S}
  job:
    name: ${mode} # name of logfile
    chdir: false # keep current working directory unchanged
    config:
      override_dirname:
        exclude_keys:
          - mode
          - output_dir
          - log_freq
  sweep:
    # output directory for multirun
    dir: ${hydra.run.dir}
    subdir: ./
Collaborator

Suggested change
- hydra:
-   run:
-     # dynamic output directory according to running time and override name
-     dir: __exp__/${data_name}/${now:%Y_%m_%d_%H_%M_%S}
-   job:
-     name: ${mode} # name of logfile
-     chdir: false # keep current working directory unchanged
-     config:
-       override_dirname:
-         exclude_keys:
-           - mode
-           - output_dir
-           - log_freq
-   sweep:
-     # output directory for multirun
-     dir: ${hydra.run.dir}
-     subdir: ./
+ defaults:
+   - ppsci_default
+   - TRAIN: train_default
+   - TRAIN/ema: ema_default
+   - TRAIN/swa: swa_default
+   - EVAL: eval_default
+   - INFER: infer_default
+   - _self_
+ hydra:
+   run:
+     # dynamic output directory according to running time and override name
+     dir: outputs_tgcn/${now:%Y-%m-%d}/${now:%H-%M-%S}
+   job:
+     name: ${mode} # name of logfile
+     chdir: false # keep current working directory unchanged
+   callbacks:
+     init_callback:
+       _target_: ppsci.utils.callbacks.InitCallback
+   sweep:
+     # output directory for multirun
+     dir: ${hydra.run.dir}
+     subdir: ./
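
To make the effect of the two `hydra.run.dir` values concrete, here is a small sketch that mimics how each pattern expands. It assumes that Hydra's `${now:...}` resolver formats the current time with `strftime`. The helper names `old_run_dir` and `new_run_dir` are hypothetical and exist only for this illustration; Hydra performs the real resolution itself.

```python
from datetime import datetime


def old_run_dir(data_name: str, now: datetime) -> str:
    # Mirrors: __exp__/${data_name}/${now:%Y_%m_%d_%H_%M_%S}
    return f"__exp__/{data_name}/{now.strftime('%Y_%m_%d_%H_%M_%S')}"


def new_run_dir(now: datetime) -> str:
    # Mirrors: outputs_tgcn/${now:%Y-%m-%d}/${now:%H-%M-%S}
    return f"outputs_tgcn/{now.strftime('%Y-%m-%d')}/{now.strftime('%H-%M-%S')}"


if __name__ == "__main__":
    stamp = datetime(2024, 8, 15, 9, 30, 0)  # fixed timestamp for a repeatable demo
    print(old_run_dir("PEMSD8", stamp))
    print(new_run_dir(stamp))
```

Note that the suggested layout no longer includes `data_name` in the path, so runs on PEMSD4 and PEMSD8 are told apart only by their timestamps.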

Comment on lines 25 to 26
'input_keys': cfg.MODEL.afno.input_keys,
'label_keys': cfg.MODEL.afno.label_keys,
Collaborator

Please do a global search-and-replace: cfg.MODEL.afno. --> cfg.MODEL.
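
One way to apply such a global replacement is a short script, sketched below. It assumes that only `*.py` files under a chosen root need the change; the helper name `replace_in_tree` is hypothetical, and the demo runs in a temporary directory rather than on the real example folder.

```python
import tempfile
from pathlib import Path
from typing import List

OLD = "cfg.MODEL.afno."
NEW = "cfg.MODEL."


def replace_in_tree(root: Path, old: str = OLD, new: str = NEW) -> List[Path]:
    """Rewrite every *.py file under root that contains `old`; return the changed files."""
    changed = []
    for path in sorted(root.rglob("*.py")):
        text = path.read_text(encoding="utf-8")
        if old in text:
            path.write_text(text.replace(old, new), encoding="utf-8")
            changed.append(path)
    return changed


if __name__ == "__main__":
    # Demo on a throwaway file so nothing in a real checkout is modified.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo.py"
        demo.write_text("'input_keys': cfg.MODEL.afno.input_keys,\n", encoding="utf-8")
        print(replace_in_tree(Path(tmp)))
        print(demo.read_text(encoding="utf-8"))
```

Any YAML config that still nests the keys under `MODEL.afno` would need a matching edit by hand, since the script touches only Python sources.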

Comment on lines 115 to 122
# set random seed for reproducibility
ppsci.utils.misc.set_random_seed(cfg.seed)

# initialize logger
logger.init_logger('ppsci', os.path.join(cfg.output_dir, 'test.log'), 'info')
logger.message(cfg)

# set eval dataloader config
Collaborator

Suggested change
- # set random seed for reproducibility
- ppsci.utils.misc.set_random_seed(cfg.seed)
- # initialize logger
- logger.init_logger('ppsci', os.path.join(cfg.output_dir, 'test.log'), 'info')
- logger.message(cfg)
- # set eval dataloader config
+ # set eval dataloader config

Comment on lines 12 to 19
# set random seed for reproducibility
ppsci.utils.misc.set_random_seed(cfg.seed)

# initialize logger
logger.init_logger('ppsci', os.path.join(cfg.output_dir, 'train.log'), 'info')
logger.message(cfg)

# set train dataloader config
Collaborator

Suggested change
- # set random seed for reproducibility
- ppsci.utils.misc.set_random_seed(cfg.seed)
- # initialize logger
- logger.init_logger('ppsci', os.path.join(cfg.output_dir, 'train.log'), 'info')
- logger.message(cfg)
- # set train dataloader config
+ # set train dataloader config

Comment on lines 70 to 71
self.edge_index = pp.to_tensor(data=edge_index, place=cfg.device)
self.edge_attr = pp.to_tensor(data=edge_attr, place=cfg.device)
Collaborator

These two attributes do not seem to be used anywhere?


self.edge_index = pp.to_tensor(data=edge_index, place=cfg.device)
self.edge_attr = pp.to_tensor(data=edge_attr, place=cfg.device)
self.adj = pp.to_tensor(data=adj, place=cfg.device)
Collaborator

  1. Unlike PyTorch, `to_tensor` does not need `place` to be specified manually; it uses the current global device setting by default.
  2. Tensors created in `__init__` that do not need gradient updates should be registered as buffers, as follows:

image

If they do need gradient updates, register them as parameters:
image

Otherwise these variables will not be persisted to the parameter file when the model is saved.

@HydrogenSulfate
Collaborator

@casia-rxwang Please sign the CLA:
image

@luotao1 luotao1 self-assigned this Aug 19, 2024
@HydrogenSulfate
Collaborator

@casia-rxwang When you revise the code, please also merge the latest develop branch and resolve the conflicts.

@HydrogenSulfate HydrogenSulfate changed the title Add TGCN Model for Traffic Forecasting [Example] Add TGCN Model for Traffic Forecasting Aug 21, 2024