diff --git a/demo/predict.py b/demo/predict.py
index f323786d..e4868dcb 100644
--- a/demo/predict.py
+++ b/demo/predict.py
@@ -197,7 +197,7 @@ def infer(args):
         )
     if args.save_result:
         save_path = os.path.join(args.save_dir, "detect_results")
-        draw_result(args.image_path, result_dict, args.data.names, save_path=save_path)
+        draw_result(args.image_path, result_dict, args.data.names, is_coco_dataset=is_coco_dataset, save_path=save_path)
 
     logger.info("Infer completed.")
diff --git a/docs/zh/how_to_guides/data_preparation.md b/docs/zh/how_to_guides/data_preparation.md
index 3cbdcd94..7f031b64 100644
--- a/docs/zh/how_to_guides/data_preparation.md
+++ b/docs/zh/how_to_guides/data_preparation.md
@@ -19,12 +19,12 @@
 ```
 Each line of train.txt is the relative path of a single image, for example:
 ```
-./images/train/00000000.jpg
-./images/train/00000001.jpg
-./images/train/00000002.jpg
-./images/train/00000003.jpg
-./images/train/00000004.jpg
-./images/train/00000005.jpg
+./images/train2017/00000000.jpg
+./images/train2017/00000001.jpg
+./images/train2017/00000002.jpg
+./images/train2017/00000003.jpg
+./images/train2017/00000004.jpg
+./images/train2017/00000005.jpg
 ```
 
 The txt files under labels/train2017 are the annotations of the corresponding images; both the detect and the segment format are supported.
@@ -50,7 +50,7 @@
 segment format: the first value on each line is the class id, followed by pairs of normalized polygon coordinates, for example:
 ```
 49 0.716891 0.0519583 0.683766 0.0103958 0.611688 0.0051875 0.568828 0.116875 0.590266 0.15325 0.590266 0.116875 0.613641 0.0857083 0.631172 0.0857083 0.6565 0.083125 0.679875 0.0883125 0.691563 0.0961042 0.711031 0.0649375
 ```
 
-instances_val.json is the COCO-format annotation of the validation set and can be fed directly to the COCO API to compute mAP.
+instances_val2017.json is the COCO-format annotation of the validation set and can be fed directly to the COCO API to compute mAP.
 
 For training & inference, set `train_set`, `val_set` and `test_set` in `configs/coco.yaml` to the real data paths.
diff --git a/examples/finetune_SHWD/README.md b/examples/finetune_SHWD/README.md
index c5cd5bba..6b88d705 100644
--- a/examples/finetune_SHWD/README.md
+++ b/examples/finetune_SHWD/README.md
@@ -20,15 +20,37 @@
 ├── 000000.jpg
 └── 000002.jpg
 ```
-Each line of the txt files under ImageSets/Main is the extension-free file name of a single image in the corresponding subset, for example:
+The xml files under the Annotations folder are the per-image annotations; their main content looks like this:
 ```
-000002
-000005
-000019
-000022
-000027
-000034
+<annotation>
+  <folder>JPEGImages</folder>
+  <filename>000377.jpg</filename>
+  <path>F:\baidu\VOC2028\JPEGImages\000377.jpg</path>
+  <source>
+    <database>Unknown</database>
+  </source>
+  <size>
+    <width>750</width>
+    <height>558</height>
+    <depth>3</depth>
+  </size>
+  <segmented>0</segmented>
+  <object>
+    <name>hat</name>
+    <pose>Unspecified</pose>
+    <truncated>0</truncated>
+    <difficult>0</difficult>
+    <bndbox>
+      <xmin>142</xmin>
+      <ymin>388</ymin>
+      <xmax>177</xmax>
+      <ymax>426</ymax>
+    </bndbox>
+  </object>
+</annotation>
 ```
+An annotation may contain several object entries; within each object, name is the class name, and xmin, ymin, xmax, ymax are the coordinates of the top-left and bottom-right corners of the bounding box.
+
+MindYOLO expects datasets in YOLO format; see [data preparation](../../docs/zh/how_to_guides/data_preparation.md) for details.
 
 Because MindYOLO uses the image file name as image_id during validation, image names must be numeric rather than strings, so the images also have to be renamed. Converting the SHWD dataset to YOLO format involves the following steps:
 * Copy the images to the target paths and rename them
@@ -36,36 +58,74 @@
 * Parse the xml files and generate the corresponding txt label files under the matching paths
 * For the validation set, additionally generate the final json file
 
-A detailed implementation is available in [convert_shwd2yolo.py](./convert_shwd2yolo.py). Run it as follows:
+A detailed implementation is available in [convert_shwd2yolo.py](./convert_shwd2yolo.py); run it as follows:
 
 ```shell
 python examples/finetune_SHWD/convert_shwd2yolo.py --root_dir /path_to_shwd/SHWD
 ```
-
 Running the above command generates a YOLO-format copy of the SHWD dataset in the same parent directory without modifying the original dataset.
 
-#### Converting the pretrained checkpoint
+#### Writing the yaml configuration file
+The configuration file covers the parameters of the dataset, data augmentation, loss, optimizer and model architecture. Since MindYOLO provides a yaml inheritance mechanism, only the parameters that need adjusting have to be written into yolov7-tiny_shwd.yaml, which then inherits the native yaml file shipped with MindYOLO. Its content is:
+```
+__BASE__: [
+  '../../configs/yolov7/yolov7-tiny.yaml',
+]
+
+per_batch_size: 16 # 16 * 8 = 128
+img_size: 640 # image sizes
+weight: ./yolov7-tiny_pretrain.ckpt
+strict_load: False
+log_interval: 10
+
+data:
+  dataset_name: shwd
+  train_set: ./SHWD/train.txt
+  val_set: ./SHWD/val.txt
+  test_set: ./SHWD/val.txt
+  nc: 2
+  # class names
+  names: [ 'person', 'hat' ]
+
+optimizer:
+  lr_init: 0.001 # initial learning rate
+```
+* ```__BASE__``` is a list of the paths of the inherited yaml files; multiple yaml files can be inherited
+* per_batch_size and img_size are the per-device batch size and the image size used in data processing
+* weight is the file path of the pretrained model mentioned above, and strict_load: False drops the parameters whose shape does not match
+* log_interval is the interval between log prints
+* everything under the data field is dataset-related: dataset_name is the name of the custom dataset; train_set, val_set and test_set are the paths of the txt files holding the image paths of the training, validation and test sets; nc is the number of classes; names lists the class names
+* lr_init under the optimizer field is the initial learning rate after warm-up, here reduced to one tenth of the default
 
-Since the SHWD dataset has only 7000+ images, yolov7-tiny is chosen for training on it; the [checkpoint](https://github.com/mindspore-lab/mindyolo/blob/master/MODEL_ZOO.md) trained by MindYOLO on the COCO dataset can be downloaded as the pretrained model. Because COCO contains 80 object classes while SHWD has only two, and the output of the model's last head layer depends on the class count nc, the last layer of the pretrained checkpoint has to be removed first; see [convert_yolov7-tiny_pretrain_ckpt.py](./convert_yolov7-tiny_pretrain_ckpt.py). Run it as follows:
+The parameter inheritance relations and parameter descriptions are documented in [configuration_CN.md](../../tutorials/configuration_CN.md).
 
- ```shell
- python examples/finetune_SHWD/convert_yolov7-tiny_pretrain_ckpt.py
- ```
 
+#### Downloading the pretrained model
+A model from MindYOLO's [MODEL_ZOO](../../MODEL_ZOO.md) can be used as the pretrained model for a custom dataset. Pretrained models already reach good accuracy on the COCO dataset; compared with training from scratch, loading one usually gives faster convergence and higher final accuracy, and largely avoids problems such as vanishing or exploding gradients caused by bad initialization.
+The number of classes in a custom dataset usually differs from COCO's, and the detection-head structure of every MindYOLO model depends on the class count, so importing a pretrained model directly may fail because of shape mismatches. Set the strict_load parameter to False in the yaml configuration file and MindYOLO will automatically discard the mismatched parameters and raise a warning that those module parameters were not loaded.
 
 #### Model finetuning (Finetune)
-A brief training flow is available in [finetune_shwd.py](./finetune_shwd.py)
-
+Since the SHWD training set holds only about 6000 images, the yolov7-tiny model is chosen for training.
 * Distributed training on multiple NPU/GPU devices, taking 8 devices as an example:
 
   ```shell
-  mpirun --allow-run-as-root -n 8 python examples/finetune_SHWD/finetune_shwd.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml --is_parallel True
+  mpirun --allow-run-as-root -n 8 python train.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml --is_parallel True
   ```
 
 * Training on a single NPU/GPU/CPU device:
 
   ```shell
-  python examples/finetune_SHWD/finetune_shwd.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml
+  python train.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml
   ```
+*Note: training on SHWD with the default yolov7-tiny parameters reaches an AP50 of 87.0; changing lr_init from 0.01 to 0.001 raises the AP50 to 89.2.*
 
-*Note: training on SHWD with the default yolov7-tiny COCO parameters reaches an AP50 of 87.0; changing lr_init from 0.01 to 0.001 raises the AP50 to 89.2.*
\ No newline at end of file
+#### Visualized inference
+Use demo/predict.py to run visualized inference with the trained model:
+
+```shell
+python demo/predict.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml --weight=/path_to_ckpt/WEIGHT.ckpt --image_path /path_to_image/IMAGE.jpg
+```
+The inference result looks like this:
+ +
\ No newline at end of file diff --git a/examples/finetune_SHWD/convert_yolov7-tiny_pretrain_ckpt.py b/examples/finetune_SHWD/convert_yolov7-tiny_pretrain_ckpt.py deleted file mode 100644 index 8f506f7f..00000000 --- a/examples/finetune_SHWD/convert_yolov7-tiny_pretrain_ckpt.py +++ /dev/null @@ -1,15 +0,0 @@ -import mindspore as ms - - -def convert_weight(ori_weight, new_weight): - new_ckpt = [] - param_dict = ms.load_checkpoint(ori_weight) - for k, v in param_dict.items(): - if '77' in k: - continue - new_ckpt.append({'name': k, 'data': v}) - ms.save_checkpoint(new_ckpt, new_weight) - - -if __name__ == '__main__': - convert_weight('./yolov7-tiny_300e_mAP375-d8972c94.ckpt', './yolov7-tiny_pretrain.ckpt') diff --git a/examples/finetune_SHWD/finetune_shwd.py b/examples/finetune_SHWD/finetune_shwd.py deleted file mode 100644 index 1171cbdf..00000000 --- a/examples/finetune_SHWD/finetune_shwd.py +++ /dev/null @@ -1,153 +0,0 @@ -import os -import sys -sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '../..'))) - -import mindspore as ms - -from train import get_parser_train -from mindyolo.data import COCODataset, create_loader -from mindyolo.models import create_loss, create_model -from mindyolo.optim import (EMA, create_group_param, create_lr_scheduler, - create_optimizer, create_warmup_momentum_scheduler) -from mindyolo.utils import logger -from mindyolo.utils.config import parse_args -from mindyolo.utils.train_step_factory import (create_train_step_fn, - get_gradreducer, - get_loss_scaler) -from mindyolo.utils.trainer_factory import create_trainer -from mindyolo.utils.utils import (freeze_layers, load_pretrain, set_default, - set_seed) - - -def train_shwd(args): - # Set Default - set_seed(args.seed) - set_default(args) - main_device = args.rank % args.rank_size == 0 - - logger.info("parse_args:") - logger.info("\n" + str(args)) - logger.info("Please check the above information for the configurations") - - # Create Network - args.network.recompute = args.recompute - args.network.recompute_layers = args.recompute_layers - network = create_model( - model_name=args.network.model_name, - model_cfg=args.network, - num_classes=args.data.nc, - sync_bn=args.sync_bn, - ) - - if args.ema and main_device: - ema_network = create_model( - model_name=args.network.model_name, - model_cfg=args.network, - num_classes=args.data.nc, - ) - ema = EMA(network, ema_network) - else: - ema = None - load_pretrain(network, args.weight, ema, args.ema_weight) # load pretrain - freeze_layers(network, args.freeze) # freeze Layers - ms.amp.auto_mixed_precision(network, amp_level=args.ms_amp_level) - if ema: - ms.amp.auto_mixed_precision(ema.ema, amp_level=args.ms_amp_level) - - # Create Dataloaders - transforms = args.data.train_transforms - dataset = COCODataset( - dataset_path=args.data.train_set, - img_size=args.img_size, - transforms_dict=transforms, - is_training=True, - augment=True, - rect=args.rect, - single_cls=args.single_cls, - batch_size=args.total_batch_size, - stride=max(args.network.stride), - ) - dataloader = create_loader( - dataset=dataset, - batch_collate_fn=dataset.train_collate_fn, - dataset_column_names=dataset.dataset_column_names, - batch_size=args.per_batch_size, - epoch_size=args.epochs, - rank=args.rank, - rank_size=args.rank_size, - shuffle=True, - drop_remainder=True, - num_parallel_workers=args.data.num_parallel_workers, - python_multiprocessing=True, - ) - steps_per_epoch = dataloader.get_dataset_size() // args.epochs - - # Create Loss - loss_fn = create_loss( - 
**args.loss, anchors=args.network.get("anchors", 1), stride=args.network.stride, nc=args.data.nc - ) - ms.amp.auto_mixed_precision(loss_fn, amp_level="O0" if args.keep_loss_fp32 else args.ms_amp_level) - - # Create Optimizer - args.optimizer.steps_per_epoch = steps_per_epoch - lr = create_lr_scheduler(**args.optimizer) - params = create_group_param(params=network.trainable_params(), **args.optimizer) - optimizer = create_optimizer(params=params, lr=lr, **args.optimizer) - warmup_momentum = create_warmup_momentum_scheduler(**args.optimizer) - - # Create train_step_fn - reducer = get_gradreducer(args.is_parallel, optimizer.parameters) - scaler = get_loss_scaler(args.ms_loss_scaler, scale_value=args.ms_loss_scaler_value) - train_step_fn = create_train_step_fn( - network=network, - loss_fn=loss_fn, - optimizer=optimizer, - loss_ratio=args.rank_size, - scaler=scaler, - reducer=reducer, - ema=ema, - overflow_still_update=args.overflow_still_update, - ms_jit=args.ms_jit, - ) - - # Create Trainer - network.set_train(True) - optimizer.set_train(True) - model_name = os.path.basename(args.config)[:-5] # delete ".yaml" - trainer = create_trainer( - model_name=model_name, - train_step_fn=train_step_fn, - scaler=scaler, - dataloader=dataloader, - steps_per_epoch=steps_per_epoch, - network=network, - ema=ema, - optimizer=optimizer, - summary=args.summary, - loss_fn=loss_fn, - callback=[], - reducer=reducer - ) - - trainer.train( - epochs=args.epochs, - main_device=main_device, - warmup_step=max(round(args.optimizer.warmup_epochs * steps_per_epoch), args.optimizer.min_warmup_step), - warmup_momentum=warmup_momentum, - accumulate=args.accumulate, - overflow_still_update=args.overflow_still_update, - keep_checkpoint_max=args.keep_checkpoint_max, - log_interval=args.log_interval, - loss_item_name=[] if not hasattr(loss_fn, "loss_item_name") else loss_fn.loss_item_name, - save_dir=args.save_dir, - enable_modelarts=args.enable_modelarts, - train_url=args.train_url, - run_eval=args.run_eval, - ) - logger.info("Training completed.") - - -if __name__ == "__main__": - parser = get_parser_train() - args = parse_args(parser) - train_shwd(args) diff --git a/examples/finetune_SHWD/yolov7-tiny_shwd.yaml b/examples/finetune_SHWD/yolov7-tiny_shwd.yaml index a5a2acc4..cea65a3f 100644 --- a/examples/finetune_SHWD/yolov7-tiny_shwd.yaml +++ b/examples/finetune_SHWD/yolov7-tiny_shwd.yaml @@ -1,177 +1,20 @@ +__BASE__: [ + '../../configs/yolov7/yolov7-tiny.yaml', +] + per_batch_size: 16 # 16 * 8 = 128 img_size: 640 # image sizes -sync_bn: True weight: ./yolov7-tiny_pretrain.ckpt +strict_load: False data: dataset_name: shwd - train_set: ./SHWD/train.txt val_set: ./SHWD/val.txt test_set: ./SHWD/val.txt - nc: 2 - # class names names: [ 'person', 'hat' ] - num_parallel_workers: 4 - - train_transforms: - - {func_name: mosaic, prob: 1.0, mosaic9_prob: 0.2, translate: 0.1, scale: 0.5} - - {func_name: mixup, prob: 0.05, alpha: 8.0, beta: 8.0, needed_mosaic: True} - - {func_name: hsv_augment, prob: 1.0, hgain: 0.015, sgain: 0.7, vgain: 0.4} - - {func_name: pastein, prob: 0.05, num_sample: 30} - - {func_name: label_norm, xyxy2xywh_: True} - - {func_name: fliplr, prob: 0.5} - - {func_name: label_pad, padding_size: 160, padding_value: -1} - - {func_name: image_norm, scale: 255.} - - {func_name: image_transpose, bgr2rgb: True, hwc2chw: True} - - test_transforms: - - {func_name: letterbox, scaleup: False} - - {func_name: label_norm, xyxy2xywh_: True} - - {func_name: label_pad, padding_size: 160, padding_value: -1} - - {func_name: image_norm, 
scale: 255. } - - {func_name: image_transpose, bgr2rgb: True, hwc2chw: True } - optimizer: - optimizer: momentum lr_init: 0.001 # initial learning rate - momentum: 0.937 # SGD momentum/Adam beta1 - nesterov: True # update gradients with NAG(Nesterov Accelerated Gradient) algorithm - loss_scale: 1.0 # loss scale for optimizer - warmup_epochs: 3 # warmup epochs (fractions ok) - warmup_momentum: 0.8 # warmup initial momentum - warmup_bias_lr: 0.1 # warmup initial bias lr - min_warmup_step: 1000 # minimum warmup step - group_param: yolov7 # group param strategy - gp_weight_decay: 0.0005 # group param weight decay 5e-4 - start_factor: 1.0 - end_factor: 0.01 - -loss: - name: YOLOv7Loss - box: 0.05 # box loss gain - cls: 0.5 # cls loss gain - cls_pw: 1.0 # cls BCELoss positive_weight - obj: 1.0 # obj loss gain (scale with pixels) - obj_pw: 1.0 # obj BCELoss positive_weight - fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5) - anchor_t: 4.0 # anchor-multiple threshold - label_smoothing: 0.0 # label smoothing epsilon - -network: - model_name: yolov7 - depth_multiple: 1.0 # model depth multiple - width_multiple: 1.0 # layer channel multiple - - stride: [8, 16, 32] - - # anchors - anchors: - - [10,13, 16,30, 33,23] # P3/8 - - [30,61, 62,45, 59,119] # P4/16 - - [116,90, 156,198, 373,326] # P5/32 - - # yolov7-tiny backbone - backbone: - # [from, number, module, args] c2, k=1, s=1, p=None, g=1, d=1, act=True - [[-1, 1, ConvNormAct, [32, 3, 2, None, 1, 1, nn.LeakyReLU(0.1)]], # 0-P1/2 - - [-1, 1, ConvNormAct, [64, 3, 2, None, 1, 1, nn.LeakyReLU(0.1)]], # 1-P2/4 - - [-1, 1, ConvNormAct, [32, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-2, 1, ConvNormAct, [32, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [32, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [32, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [[-1, -2, -3, -4], 1, Concat, [1]], - [-1, 1, ConvNormAct, [64, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], # 7 - - [-1, 1, MP, []], # 8-P3/8 - [-1, 1, ConvNormAct, [64, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-2, 1, ConvNormAct, [64, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [64, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [64, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [[-1, -2, -3, -4], 1, Concat, [1]], - [-1, 1, ConvNormAct, [128, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], # 14 - - [-1, 1, MP, []], # 15-P4/16 - [-1, 1, ConvNormAct, [128, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-2, 1, ConvNormAct, [128, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [128, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [128, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [[-1, -2, -3, -4], 1, Concat, [1]], - [-1, 1, ConvNormAct, [256, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], # 21 - - [-1, 1, MP, []], # 22-P5/32 - [-1, 1, ConvNormAct, [256, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-2, 1, ConvNormAct, [256, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [256, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [256, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [[-1, -2, -3, -4], 1, Concat, [1]], - [-1, 1, ConvNormAct, [512, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], # 28 - ] - - # yolov7-tiny head - head: - [[-1, 1, ConvNormAct, [256, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-2, 1, ConvNormAct, [256, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, SP, [5]], - [-2, 1, SP, [9]], - [-3, 1, SP, [13]], - [[-1, -2, -3, -4], 1, Concat, [1]], - [-1, 1, ConvNormAct, [256, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [[-1, 
-7], 1, Concat, [1]], - [-1, 1, ConvNormAct, [256, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], # 37 - - [-1, 1, ConvNormAct, [128, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, Upsample, [None, 2, 'nearest']], - [21, 1, ConvNormAct, [128, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], # route backbone P4 - [[-1, -2], 1, Concat, [1]], - - [-1, 1, ConvNormAct, [64, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-2, 1, ConvNormAct, [64, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [64, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [64, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [[-1, -2, -3, -4], 1, Concat, [1]], - [-1, 1, ConvNormAct, [128, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], # 47 - - [-1, 1, ConvNormAct, [64, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, Upsample, [None, 2, 'nearest']], - [14, 1, ConvNormAct, [64, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], # route backbone P3 - [[-1, -2], 1, Concat, [1]], - - [-1, 1, ConvNormAct, [32, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-2, 1, ConvNormAct, [32, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [32, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [32, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [[-1, -2, -3, -4], 1, Concat, [1]], - [-1, 1, ConvNormAct, [64, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], # 57 - - [-1, 1, ConvNormAct, [128, 3, 2, None, 1, 1, nn.LeakyReLU(0.1)]], - [[-1, 47], 1, Concat, [1]], - - [-1, 1, ConvNormAct, [64, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-2, 1, ConvNormAct, [64, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [64, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [64, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [[-1, -2, -3, -4], 1, Concat, [1]], - [-1, 1, ConvNormAct, [128, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], # 65 - - [-1, 1, ConvNormAct, [256, 3, 2, None, 1, 1, nn.LeakyReLU(0.1)]], - [[-1, 37], 1, Concat, [1]], - - [-1, 1, ConvNormAct, [128, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-2, 1, ConvNormAct, [128, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [128, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [-1, 1, ConvNormAct, [128, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [[-1, -2, -3, -4], 1, Concat, [1]], - [-1, 1, ConvNormAct, [256, 1, 1, None, 1, 1, nn.LeakyReLU(0.1)]], # 73 - - [57, 1, ConvNormAct, [128, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [65, 1, ConvNormAct, [256, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - [73, 1, ConvNormAct, [512, 3, 1, None, 1, 1, nn.LeakyReLU(0.1)]], - - [[74,75,76], 1, YOLOv7Head, [nc, anchors, stride]], # Detect(P3, P4, P5) - ]
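
The README hunk above documents converting SHWD's VOC-style xml annotations into YOLO detect-format txt labels. As a companion to that description, here is a minimal sketch of the parsing step; it assumes the annotation layout shown in the README and the two SHWD classes, and the helper name `voc_xml_to_yolo_lines` is illustrative rather than taken from convert_shwd2yolo.py.

```python
import xml.etree.ElementTree as ET

CLASS_NAMES = ["person", "hat"]  # order must match `names` in yolov7-tiny_shwd.yaml

def voc_xml_to_yolo_lines(xml_path):
    """Parse one VOC-style annotation file and return YOLO detect-format lines,
    i.e. 'class_id x_center y_center width height' with all four coordinates
    normalized by the image size."""
    root = ET.parse(xml_path).getroot()
    img_w = float(root.find("size/width").text)
    img_h = float(root.find("size/height").text)
    lines = []
    for obj in root.iter("object"):
        cls_id = CLASS_NAMES.index(obj.find("name").text)
        box = obj.find("bndbox")
        x1, y1, x2, y2 = (float(box.find(tag).text)
                          for tag in ("xmin", "ymin", "xmax", "ymax"))
        # pixel corners (xyxy) -> normalized center + size (xywh)
        cx, cy = (x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h
        w, h = (x2 - x1) / img_w, (y2 - y1) / img_h
        lines.append(f"{cls_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines

# For the sample annotation above (750x558 image, one 'hat' box at 142,388,177,426)
# this yields a single line: '1 0.212667 0.729391 0.046667 0.068100'
```

Writing one such line per object into the label txt file of each renamed image reproduces the detect format described in data_preparation.md.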
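The last conversion step, generating an instances_val2017-style json for the validation set, amounts to emitting the images, annotations and categories arrays that the COCO API consumes for mAP. The sketch below shows only that minimal structure; the helper name and the record layout are assumptions, and convert_shwd2yolo.py may populate additional COCO fields.

```python
import json

def build_coco_json(records, class_names, out_path):
    """records: iterable of (image_id, file_name, width, height, boxes), where
    each box is (class_id, x_min, y_min, box_w, box_h) in pixels, matching the
    COCO 'bbox' convention. Writes a minimal instances_val-style json."""
    images, annotations = [], []
    ann_id = 1
    for image_id, file_name, width, height, boxes in records:
        images.append({"id": image_id, "file_name": file_name,
                       "width": width, "height": height})
        for class_id, x, y, w, h in boxes:
            annotations.append({
                "id": ann_id, "image_id": image_id, "category_id": class_id,
                "bbox": [x, y, w, h], "area": w * h, "iscrowd": 0,
            })
            ann_id += 1
    categories = [{"id": i, "name": n} for i, n in enumerate(class_names)]
    with open(out_path, "w") as f:
        json.dump({"images": images, "annotations": annotations,
                   "categories": categories}, f)

# build_coco_json([(377, "000377.jpg", 750, 558, [(1, 142, 388, 35, 38)])],
#                 ["person", "hat"], "instances_val2017.json")
```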
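The `__BASE__` inheritance used by yolov7-tiny_shwd.yaml can be pictured as a recursive deep-merge in which keys from the child file override the same keys in its bases. The sketch below illustrates those semantics only, under the assumption that base paths are resolved relative to the child file; MindYOLO's actual parser (`mindyolo.utils.config`) may resolve paths and conflicts differently.

```python
import os
import yaml  # pyyaml

def deep_merge(base: dict, override: dict) -> dict:
    """Return base updated with override; nested dicts are merged key by key."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = deep_merge(out[key], value)
        else:
            out[key] = value
    return out

def load_config(path: str) -> dict:
    """Load a yaml config, recursively resolving its __BASE__ list so that
    keys in the child file win over the same keys in its bases."""
    with open(path, "r") as f:
        cfg = yaml.safe_load(f) or {}
    merged = {}
    for base_path in cfg.pop("__BASE__", []):
        base_cfg = load_config(os.path.join(os.path.dirname(path), base_path))
        merged = deep_merge(merged, base_cfg)
    return deep_merge(merged, cfg)

# cfg = load_config("examples/finetune_SHWD/yolov7-tiny_shwd.yaml")
# cfg["data"]["nc"] -> 2 (from the child); augmentation settings come from yolov7-tiny.yaml
```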
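Setting `strict_load: False` replaces the deleted convert_yolov7-tiny_pretrain_ckpt.py, which stripped the COCO head by parameter name ('77'): instead of rewriting the checkpoint, the loader simply skips parameters whose shapes no longer match the 2-class head. Below is a minimal MindSpore sketch of such shape-tolerant loading; the function name and warning text are illustrative, not MindYOLO's load_pretrain implementation.

```python
import mindspore as ms

def load_pretrain_tolerant(network, ckpt_path):
    """Load a checkpoint, keeping only parameters whose name and shape both
    match the target network (e.g. dropping the 80-class COCO detection head
    when nc=2), then warn about everything that was dropped."""
    param_dict = ms.load_checkpoint(ckpt_path)
    net_shapes = {p.name: p.shape for p in network.get_parameters()}
    loadable = {
        name: param
        for name, param in param_dict.items()
        if net_shapes.get(name) == param.shape
    }
    dropped = sorted(set(param_dict) - set(loadable))
    if dropped:
        print(f"[warning] {len(dropped)} checkpoint params not loaded "
              f"(name/shape mismatch), e.g. {dropped[:3]}")
    ms.load_param_into_net(network, loadable, strict_load=False)
    return network
```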