Unable read the mask image #1212

Closed
1 task done
Drangonliao123 opened this issue Jul 22, 2023 · 8 comments · Fixed by #1277

@Drangonliao123

Describe the bug

In post_process.py, the mask graph is generated separately by editing the code, but no matter how the code is modified, it is invalid.

Dataset

Folder

Model

PADiM

Steps to reproduce the behavior

python tools/train.py --config src/anomalib/models/patchcore/config.yaml

OS information

None

Expected behavior

In post_process.py, the mask graph is generated separately by editing the code, but no matter how the code is modified, it is invalid.

Screenshots

No response

Pip/GitHub

GitHub

What version/branch did you use?

No response

Configuration YAML

dataset:
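  name: HDRL               # dataset name, e.g. MVTec; this is not important
  format: folder
  path: H:/abnormal-dataset # path to the custom dataset
  normal_dir: H:/abnormal-dataset/HDRL/normal22      # subfolder with the normal samples of the custom dataset
  abnormal_dir: H:/abnormal-dataset/HDRL/abnormal/I30R13  # subfolder with the abnormal samples of the custom dataset
  mask_dir: null           # path to binary masks; custom datasets usually have none, so set to null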
  normal_test_dir: null # name of the folder containing normal test images.
  task: classification # classification or segmentation
  extensions: null
  normalization: imagenet  # imagenet added here
  split_ratio: 0.2 # ratio of the normal images that will be used to create a test split
  image_size: 224
  train_batch_size: 32
  test_batch_size: 32
  num_workers: 8
  transform_config:
    train: null
    val: null
  test_split_mode: from_dir # added here
  test_split_ratio: 0.2
  val_split_mode: same_as_test
  val_split_ratio: 0.9
  create_validation_set: true
  tiling:
    apply: true
    tile_size: null
    stride: null
    remove_border_count: 0
    use_random_tiling: False
    random_tile_count: 16

model:
  name: padim
  backbone: resnet18
  pre_trained: true
  layers:
    - layer1
    - layer2
    - layer3
  normalization_method: min_max # options: [none, min_max, cdf]

metrics:
  image:
    - F1Score
    - AUROC
#  pixel:
#    - F1Score
#    - AUROC
  threshold:
    method: adaptive #options: [adaptive, manual]
    manual_image: null
    manual_pixel: null

visualization:
  show_images: false # show images on the screen
  save_images: True # save images to the file system
  log_images: True # log images to the available loggers (if any)
  image_save_path: null # path to which images will be saved
  mode: full # options: ["full", "simple"]

project:
  seed: 42
  path: ./results

logging:
  logger: [] # options: [comet, tensorboard, wandb, csv] or combinations.
  log_graph: false # Logs the model graph to respective logger.

optimization:
  export_mode: torch # options: torch, onnx, openvino

# PL Trainer Args. Don't add extra parameter here.
trainer:
  enable_checkpointing: true
  default_root_dir: null
  gradient_clip_val: 0
  gradient_clip_algorithm: norm
  num_nodes: 1
  devices: 1
  enable_progress_bar: true
  overfit_batches: 0.0
  track_grad_norm: -1
  check_val_every_n_epoch: 1 # Don't validate before extracting features.
  fast_dev_run: false
  accumulate_grad_batches: 1
  max_epochs: 1
  min_epochs: null
  max_steps: -1
  min_steps: null
  max_time: null
  limit_train_batches: 1.0
  limit_val_batches: 1.0
  limit_test_batches: 1.0
  limit_predict_batches: 1.0
  val_check_interval: 1.0 # Don't validate before extracting features.
  log_every_n_steps: 50
  accelerator: auto # <"cpu", "gpu", "tpu", "ipu", "hpu", "auto">
  strategy: null
  sync_batchnorm: false
  precision: 32
  enable_model_summary: true
  num_sanity_val_steps: 0
  profiler: null
  benchmark: false
  deterministic: false
  reload_dataloaders_every_n_epochs: 0
  auto_lr_find: false
  replace_sampler_ddp: true
  detect_anomaly: true
  auto_scale_batch_size: false
  plugins: null
  move_metrics_to_cpu: false
  multiple_trainloader_mode: max_size_cycle

Logs

dataset:
  name: HDRL
  format: folder
  path: H:/abnormal-dataset
  normal_dir: H:/abnormal-dataset/HDRL/normal22
  abnormal_dir: H:/abnormal-dataset/HDRL/abnormal/I30R13
  mask_dir: null
  normal_test_dir: null
  task: segmentation
  extensions: null
  normalization: imagenet
  split_ratio: 0.2
  image_size:
  - 625
  - 434
  train_batch_size: 32
  test_batch_size: 32
  num_workers: 8
  transform_config:
    train: null
    val: null
    eval: null
  test_split_mode: from_dir
  test_split_ratio: 0.2
  val_split_mode: from_test
  val_split_ratio: 0.9
  create_validation_set: true
  tiling:
    apply: true
    tile_size: null
    stride: null
    remove_border_count: 0
    use_random_tiling: false
    random_tile_count: 16
  eval_batch_size: 32
  root: H:/abnormal-dataset
model:
  name: padim
  backbone: resnet18
  pre_trained: true
  layers:
  - layer1
  - layer2
  - layer3
  normalization_method: min_max
  input_size:
  - 625
  - 434
metrics:
  image:
  - F1Score
  - AUROC
  threshold:
    method: adaptive
    manual_image: null
    manual_pixel: null
visualization:
  show_images: false
  save_images: true
  log_images: true
  image_save_path: null
  mode: full
project:
  seed: 42
  path: results\padim\HDRL\run
  unique_dir: false
logging:
  logger:
  - comet
  - tensorboard
  - wandb
  - csv
  log_graph: false
optimization:
  export_mode: torch
trainer:
  enable_checkpointing: true
  default_root_dir: results\padim\HDRL\run
  gradient_clip_val: 0
  gradient_clip_algorithm: norm
  num_nodes: 1
  devices: 1
  enable_progress_bar: true
  overfit_batches: 0.0
  track_grad_norm: -1
  check_val_every_n_epoch: 1
  fast_dev_run: false
  accumulate_grad_batches: 1
  max_epochs: 1
  min_epochs: null
  max_steps: -1
  min_steps: null
  max_time: null
  limit_train_batches: 1.0
  limit_val_batches: 1.0
  limit_test_batches: 1.0
  limit_predict_batches: 1.0
  val_check_interval: 1.0
  log_every_n_steps: 50
  accelerator: auto
  strategy: null
  sync_batchnorm: false
  precision: 32
  enable_model_summary: true
  num_sanity_val_steps: 0
  profiler: null
  benchmark: false
  deterministic: false
  reload_dataloaders_every_n_epochs: 0
  auto_lr_find: false
  replace_sampler_ddp: true
  detect_anomaly: true
  auto_scale_batch_size: false
  plugins: null
  move_metrics_to_cpu: false
  multiple_trainloader_mode: max_size_cycle

Code of Conduct

  • I agree to follow this project's Code of Conduct
@samet-akcay
Contributor

samet-akcay commented Jul 22, 2023

In your config, task is set to classification, so anomalib treats this as a classification problem and does not produce a mask.

@samet-akcay
Contributor

In post_process.py, the mask graph is generated separately by editing the code, but no matter how the code is modified, it is invalid.

Can you also show how this part is edited?

@Drangonliao123
Author

Drangonliao123 commented Jul 23, 2023

Thank you very much for your reply. The cause might be that I made my own dataset and don't have the corresponding masks. When I change the task to segmentation, the run fails with this error:

(knn) H:\anomalib-main>python tools/train.py --config src/anomalib/models/padim/config.yaml
D:\Anaconda3\envs\knn\lib\site-packages\anomalib\config\config.py:275: UserWarning: config.project.unique_dir is set to False. This does not ensure that your results will be written in an empty directory and you may overwrite files.
  warn(
Global seed set to 42
2023-07-23 10:49:28,557 - anomalib.data - INFO - Loading the datamodule
2023-07-23 10:49:28,558 - anomalib.data.utils.transform - INFO - No config file has been provided. Using default transforms.
2023-07-23 10:49:28,558 - anomalib.data.utils.transform - INFO - No config file has been provided. Using default transforms.
2023-07-23 10:49:28,558 - anomalib.models - INFO - Loading the model.
2023-07-23 10:49:28,558 - anomalib.models.components.base.anomaly_module - INFO - Initializing PadimLightning model.
D:\Anaconda3\envs\knn\lib\site-packages\torchmetrics\utilities\prints.py:36: UserWarning: Metric `PrecisionRecallCurve` will save all targets and predictions in buffer. For large datasets this may lead to large memory footprint.
  warnings.warn(*args, **kwargs)
2023-07-23 10:49:28,566 - anomalib.models.components.feature_extractors.timm - WARNING - FeatureExtractor is deprecated. Use TimmFeatureExtractor instead. Both FeatureExtractor and TimmFeatureExtractor will be removed in a future release.
2023-07-23 10:49:28,797 - timm.models.helpers - INFO - Loading pretrained weights from url (https://download.pytorch.org/models/resnet18-5c106cde.pth)
2023-07-23 10:49:28,961 - anomalib.utils.loggers - INFO - Loading the experiment logger(s)
2023-07-23 10:49:28,961 - anomalib.utils.callbacks - INFO - Loading the callbacks
2023-07-23 10:49:28,964 - anomalib.utils.callbacks - INFO - Setting model export to torch
2023-07-23 10:49:28,980 - pytorch_lightning.utilities.rank_zero - INFO - GPU available: False, used: False
2023-07-23 10:49:28,981 - pytorch_lightning.utilities.rank_zero - INFO - TPU available: False, using: 0 TPU cores
2023-07-23 10:49:28,981 - pytorch_lightning.utilities.rank_zero - INFO - IPU available: False, using: 0 IPUs
2023-07-23 10:49:28,981 - pytorch_lightning.utilities.rank_zero - INFO - HPU available: False, using: 0 HPUs
2023-07-23 10:49:28,981 - pytorch_lightning.utilities.rank_zero - INFO - `Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used..
2023-07-23 10:49:28,982 - pytorch_lightning.utilities.rank_zero - INFO - `Trainer(limit_val_batches=1.0)` was configured so 100% of the batches will be used..
2023-07-23 10:49:28,982 - pytorch_lightning.utilities.rank_zero - INFO - `Trainer(limit_test_batches=1.0)` was configured so 100% of the batches will be used..
2023-07-23 10:49:28,982 - pytorch_lightning.utilities.rank_zero - INFO - `Trainer(limit_predict_batches=1.0)` was configured so 100% of the batches will be used..
2023-07-23 10:49:28,982 - pytorch_lightning.utilities.rank_zero - INFO - `Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
2023-07-23 10:49:28,982 - anomalib - INFO - Training the model.
2023-07-23 10:49:30,116 - anomalib.data.base.datamodule - INFO - No normal test images found. Sampling from training set using a split ratio of 0.20
D:\Anaconda3\envs\knn\lib\site-packages\torchmetrics\utilities\prints.py:36: UserWarning: Metric `ROC` will save all targets and predictions in buffer. For large datasets this may lead to large memory footprint.
  warnings.warn(*args, **kwargs)
D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\core\optimizer.py:183: UserWarning: `LightningModule.configure_optimizers` returned `None`, this fit will run with no optimizer
  rank_zero_warn(
2023-07-23 10:49:30,443 - pytorch_lightning.callbacks.model_summary - INFO -
  | Name                  | Type                     | Params
-------------------------------------------------------------------
0 | image_threshold       | AnomalyScoreThreshold    | 0
1 | pixel_threshold       | AnomalyScoreThreshold    | 0
2 | model                 | PadimModel               | 2.8 M
3 | image_metrics         | AnomalibMetricCollection | 0
4 | pixel_metrics         | AnomalibMetricCollection | 0
5 | normalization_metrics | MinMax                   | 0
-------------------------------------------------------------------
2.8 M     Trainable params
0         Non-trainable params
2.8 M     Total params
11.131    Total estimated model params size (MB)
D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:224: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 12 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:224: PossibleUserWarning: The dataloader, val_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 12 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
Epoch 0:   0%|          | 0/155 [00:00<?, ?it/s]
D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py:138: UserWarning: `training_step` returned `None`. If this was on purpose, ignore this warning...
  self.warning_cache.warn("`training_step` returned `None`. If this was on purpose, ignore this warning...")
Epoch 0:  79%|███████▊  | 122/155 [02:56<00:47,  1.44s/it, loss=nan]
2023-07-23 10:52:26,567 - anomalib.models.padim.lightning_model - INFO - Aggregating the embedding extracted from the training set.
2023-07-23 10:52:31,763 - anomalib.models.padim.lightning_model - INFO - Fitting a Gaussian to the embedding collected from the training set.
[ WARN:0@208.237] global loadsave.cpp:248 cv::findDecoder imread_(''): can't open/read file: check file path/integrity
Traceback (most recent call last):
  File "H:\anomalib-main\tools\train.py", line 79, in <module>
    train(args)
  File "H:\anomalib-main\tools\train.py", line 64, in train
    trainer.fit(model=model, datamodule=datamodule)
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 608, in fit
    call._call_and_handle_interrupt(
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\trainer\call.py", line 38, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 650, in _fit_impl
    self._run(model, ckpt_path=self.ckpt_path)
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1112, in _run
    results = self._run_stage()
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1191, in _run_stage
    self._run_train()
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1214, in _run_train
    self.fit_loop.run()
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\loops\loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\loops\fit_loop.py", line 267, in advance
    self._outputs = self.epoch_loop.run(self._data_fetcher)
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\loops\loop.py", line 200, in run
    self.on_advance_end()
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\loops\epoch\training_epoch_loop.py", line 250, in on_advance_end
    self._run_validation()
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\loops\epoch\training_epoch_loop.py", line 308, in _run_validation
    self.val_loop.run()
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\loops\loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\loops\dataloader\evaluation_loop.py", line 152, in advance
    dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\loops\loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\loops\epoch\evaluation_epoch_loop.py", line 121, in advance
    batch = next(data_fetcher)
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\utilities\fetching.py", line 184, in __next__
    return self.fetching_function()
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\utilities\fetching.py", line 265, in fetching_function
    self._fetch_next_batch(self.dataloader_iter)
  File "D:\Anaconda3\envs\knn\lib\site-packages\pytorch_lightning\utilities\fetching.py", line 280, in _fetch_next_batch
    batch = next(iterator)
  File "D:\Anaconda3\envs\knn\lib\site-packages\torch\utils\data\dataloader.py", line 633, in __next__
    data = self._next_data()
  File "D:\Anaconda3\envs\knn\lib\site-packages\torch\utils\data\dataloader.py", line 677, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "D:\Anaconda3\envs\knn\lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\Anaconda3\envs\knn\lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\Anaconda3\envs\knn\lib\site-packages\anomalib\data\base\dataset.py", line 133, in __getitem__
    mask = cv2.imread(mask_path, flags=0) / 255.0
TypeError: unsupported operand type(s) for /: 'NoneType' and 'float'
Epoch 0:  79%|███████▊  | 122/155 [03:24<00:55,  1.68s/it, loss=nan]

@blaz-r
Contributor

blaz-r commented Jul 25, 2023

This happens because the code tries to read a mask, but the mask path is not valid since you have not set mask_dir.
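
To see why the traceback ends in a TypeError rather than a file-not-found error, here is a minimal sketch that reproduces the failure without OpenCV. It assumes only that cv2.imread returns None when it cannot read a file, which is consistent with the "can't open/read file" warning in the log; the variable names are illustrative.

```python
# Stand-in for cv2.imread("", flags=0): OpenCV returns None instead of
# raising an exception when the file cannot be opened.
mask = None

try:
    mask = mask / 255.0  # same expression as in dataset.py
    message = "no error"
except TypeError as err:
    message = str(err)

print(message)
```

The printed message matches the last line of the traceback, so the missing mask file is the real cause and the division is only where it surfaces.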

@samet-akcay
Contributor

Based on the config file you shared,

dataset:
  name: HDRL
  format: folder
  path: H:/abnormal-dataset
  normal_dir: H:/abnormal-dataset/HDRL/normal22
  abnormal_dir: H:/abnormal-dataset/HDRL/abnormal/I30R13
  mask_dir: null
  normal_test_dir: null
  task: segmentation
  ...

You do not specify any mask directory, but set the task to segmentation, which requires masks. If you set it to classification it should work.

@samet-akcay
Contributor

samet-akcay commented Aug 4, 2023

A bit more context,

When the task is set to segmentation and mask_path is not provided, here is what happens:

elif self.task in (TaskType.DETECTION, TaskType.SEGMENTATION):
    # Only Anomalous (1) images have masks in anomaly datasets
    # Therefore, create empty mask for Normal (0) images.
    if label_index == 0:
        mask = np.zeros(shape=image.shape[:2])
    else:
        mask = cv2.imread(mask_path, flags=0) / 255.0

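Here mask_path is '' because no mask directory is set, so cv2.imread cannot read a file and returns None. Dividing None by 255.0 then raises the TypeError shown in the traceback.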

Would it be an idea to improve the error message? Thoughts? @ashwinvaidya17, @djdameln, @blaz-r, @jpcbertoldo
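
One possible shape for a clearer error message is sketched below. This is not the change that was merged into anomalib; `check_mask_path` is a hypothetical helper and the message wording is an assumption. It only shows the idea of validating the path before the image is read.

```python
from pathlib import Path


def check_mask_path(mask_path: str, label_index: int) -> None:
    """Hypothetical guard: fail early with a readable message.

    label_index follows the snippet above: 0 = normal, 1 = anomalous.
    """
    if label_index == 0:
        # Normal images get an all-zero mask, so no file is needed.
        return
    if not mask_path:
        raise ValueError(
            "Task is segmentation/detection but no mask path is set for an "
            "anomalous image. Set mask_dir or switch task to classification."
        )
    if not Path(mask_path).is_file():
        raise FileNotFoundError(f"Mask file not found: {mask_path}")


# Usage: an empty path for an anomalous image is reported up front.
try:
    check_mask_path("", label_index=1)
    result = "ok"
except ValueError as err:
    result = f"ValueError: {err}"
print(result)
```

Calling such a check just before the cv2.imread line would replace the NoneType division error with a message that points at the config.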

@samet-akcay samet-akcay changed the title [Bug]: Unable to generate mask graph alone Unable read the mask image Aug 4, 2023
@blaz-r
Contributor

blaz-r commented Aug 4, 2023

I think that would be a good thing to add. Would it also make sense to add a check if mask_dir exists if task is segmentation, or is there a case where segmentation is done without masks?

@samet-akcay
Contributor

I think that would be a good thing to add. Would it also make sense to add a check if mask_dir exists if task is segmentation, or is there a case where segmentation is done without masks?

Segmentation could be done without masks if the user chooses to generate synthetic anomalies. This could be set from the following:

  test_split_mode: from_dir # options: [from_dir, synthetic]
  test_split_ratio: 0.2 # fraction of train images held out testing (usage depends on test_split_mode)
  val_split_mode: same_as_test # options: [same_as_test, from_test, synthetic]
  val_split_ratio: 0.5 # fraction of train/test images held out for validation (usage depends on val_split_mode)
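
Putting the two points together, a config-level check could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names are hypothetical, the config is treated as a plain dict, and the rule "masks are needed unless test_split_mode is synthetic" is a simplification of the discussion above that ignores val_split_mode.

```python
def masks_required(dataset_cfg: dict) -> bool:
    """Assumed rule: pixel-level tasks need mask files unless anomalies are synthetic."""
    pixel_task = dataset_cfg.get("task") in ("segmentation", "detection")
    synthetic = dataset_cfg.get("test_split_mode") == "synthetic"
    return pixel_task and not synthetic


def find_config_problems(dataset_cfg: dict) -> list:
    """Return human-readable problems; an empty list means the check passed."""
    problems = []
    if masks_required(dataset_cfg) and not dataset_cfg.get("mask_dir"):
        problems.append(
            f"task is '{dataset_cfg.get('task')}' but mask_dir is not set; "
            "provide mask_dir, use task: classification, "
            "or set test_split_mode: synthetic"
        )
    return problems


# The dataset section from the logs above, reduced to the relevant keys.
reported = {"task": "segmentation", "mask_dir": None, "test_split_mode": "from_dir"}
for problem in find_config_problems(reported):
    print(problem)
```

With the reported config this flags the missing mask_dir before training starts, while the same config with test_split_mode: synthetic or task: classification passes.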
