For DICOM datasets you will need the pydicom library and the gdcmconv tool (part of GDCM); install both through Anaconda.
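Before running any converter it can help to confirm that both dependencies are visible from the active environment. The following is a minimal sketch, not part of the repository: the helper name check_dicom_tools is hypothetical, and it only checks that pydicom can be imported and that a gdcmconv executable is on PATH.

```python
import importlib.util
import shutil


def check_dicom_tools():
    """Report whether pydicom is importable and gdcmconv is on PATH."""
    return {
        # find_spec returns None when the package is not installed.
        "pydicom": importlib.util.find_spec("pydicom") is not None,
        # shutil.which returns None when the executable cannot be found.
        "gdcmconv": shutil.which("gdcmconv") is not None,
    }


if __name__ == "__main__":
    for name, found in check_dicom_tools().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If either entry is reported as missing, activate the Anaconda environment in which the packages were installed and try again.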
-
CT-ORG
- Download the CT-ORG dataset from The Cancer Imaging Archive.
- Run convertCT_ORG.py after setting the path variable (line 11) to your dataset location.
-
MosMedData
- Download the Chest CT Scans with COVID-19 dataset from mosmed.ai.
- Run converterMosMedAiCOVID.py after setting the path variable (line 11) to your dataset location.
-
KITS-19
- Download the KITS-19 dataset by following the instructions in the KiTS repository.
- Run converterKITS19.py after setting the path variable (line 11) to your dataset location and adding the path to the kits.json file on line 89.
-
LIDC-IDRI
- Download the LIDC-IDRI dataset from The Cancer Imaging Archive.
- Delete the files that do not match the rest of the dataset (the X-rays).
- Use retrieve_data_from_xml.py to create nodule masks from the XML files; add the path to the directory on line 150.
- Use convertLIDC.py; set the path to your images (line 16) and the path to the masks (line 18).
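The cleanup step above (removing the X-ray files) can be done by hand, but the files can also be listed by their DICOM Modality tag. The sketch below is an illustration only: it assumes the CT slices carry the modality code CT and the X-rays carry a different code, and that the files use the .dcm extension. The function names is_ct and find_non_ct_files are hypothetical and not part of the repository.

```python
from pathlib import Path


def is_ct(modality):
    """Return True when a Modality value denotes a CT image."""
    return modality.strip().upper() == "CT"


def find_non_ct_files(root):
    """List DICOM files under root whose Modality tag is not CT.

    Requires pydicom; it is imported lazily so is_ct works without it.
    """
    import pydicom  # assumption: installed as described at the top

    unwanted = []
    for dcm_path in sorted(Path(root).rglob("*.dcm")):
        # Read only the header; pixel data is not needed to inspect the tag.
        header = pydicom.dcmread(dcm_path, stop_before_pixels=True)
        if not is_ct(str(header.get("Modality", ""))):
            unwanted.append(dcm_path)
    return unwanted
```

The function only returns the list of candidates, so the paths can be reviewed before anything is deleted.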
-
QIN-BRAIN-DSC-MRI
- Download the QIN-BRAIN-DSC-MRI dataset from The Cancer Imaging Archive.
- Run converterQIN_BRAIN_MRI.py after setting the path variable (line 11) to your dataset location.
-
Brain Tumor Classification (MRI)
- Download the Brain Tumor Classification (MRI) dataset from kaggle.com.
- Run converterBrain_Tumor_Classification_MRI.py after setting the path variable (line 6) to your dataset location.
-
Brain-Tumor-Progression
- Download the Brain-Tumor-Progression dataset from The Cancer Imaging Archive.
- Run converterBrainTumorProgression.py after setting the path variable (line 10) to your dataset location.
-
Chest X-ray 14
- Download from https://nihcc.app.box.com/v/ChestXray-NIHCC/folder/36938765345
- Use ConvertCSV_ChestX_ray14.py and chest_x_ray14_process.py to process the images:
- Open ConvertCSV_ChestX_ray14.py, set 'data_csv' to the path of the file 'Data_Entry_2017_v2020.csv' in the downloaded folder, and run the script. It writes the file labels.csv.
- Open chest_x_ray14_process.py.
- Set 'dir_s' to the path of the folder with the source images.
- Set 'dir_labels' to the path of the labels.csv file produced in the previous step.
- Set 'dir_d' to the destination path for the processed images.
- Run the script. When it finishes, the processed images will be in the destination folder. If the source images are spread over multiple folders, run the script once per folder with different paths.
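To get a feel for what the label conversion works with, the sketch below reads per-image findings from a metadata file. It is an illustration under stated assumptions, not a copy of ConvertCSV_ChestX_ray14.py: it assumes the metadata has the columns "Image Index" and "Finding Labels" with multiple findings separated by "|", and it says nothing about the actual layout of labels.csv. The function name read_finding_labels and the sample rows are made up for the example.

```python
import csv
import io


def read_finding_labels(csv_file):
    """Map each image file name to its list of finding labels."""
    labels = {}
    for row in csv.DictReader(csv_file):
        # Assumed layout: several findings are joined with "|".
        labels[row["Image Index"]] = row["Finding Labels"].split("|")
    return labels


# Tiny in-memory sample with illustrative rows in the assumed layout.
sample = io.StringIO(
    "Image Index,Finding Labels\n"
    "00000001_000.png,Cardiomegaly\n"
    "00000001_001.png,Cardiomegaly|Emphysema\n"
    "00000002_000.png,No Finding\n"
)
labels = read_finding_labels(sample)
print(labels["00000001_001.png"])  # ['Cardiomegaly', 'Emphysema']
```

With a real file, pass an open file handle instead of the in-memory sample, for example open(data_csv, newline="").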
-
Xray Lung segmentation from Chest X-Rays
- Download from: https://www.kaggle.com/nikhilpandey360/chest-xray-masks-and-labels
- Set the path variables in the lung_segmentation_Xray_unifying.py script to match the downloaded dataset:
- masks_path = "dataset_lungs/data/Lung Segmentation/masks"
- images_path = "dataset_lungs/data/Lung Segmentation/CXR_png"
- Run the script. The result is a unified dataset with the masks in new folders.
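Before running the unifying script it may be useful to check that every image has a corresponding mask. The sketch below pairs files by name only; it does not reproduce what lung_segmentation_Xray_unifying.py does. The function name pair_images_with_masks and the "_mask" suffix are assumptions made for the example, so adjust the suffix to whatever naming the downloaded masks actually use.

```python
from pathlib import Path


def pair_images_with_masks(image_names, mask_names, mask_suffix="_mask"):
    """Pair image files with mask files that share the same stem.

    mask_suffix is a hypothetical naming convention: a mask may be named
    either <stem>.png or <stem><mask_suffix>.png.
    """
    masks_by_stem = {}
    for name in mask_names:
        stem = Path(name).stem
        if mask_suffix and stem.endswith(mask_suffix):
            stem = stem[: -len(mask_suffix)]
        masks_by_stem[stem] = name

    pairs, unmatched = [], []
    for name in sorted(image_names):
        mask = masks_by_stem.get(Path(name).stem)
        if mask is None:
            unmatched.append(name)
        else:
            pairs.append((name, mask))
    return pairs, unmatched
```

The file names can be collected with os.listdir(images_path) and os.listdir(masks_path); images returned in the unmatched list have no mask under the assumed naming.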
-
Xray CoronaHack -Chest X-Ray-Dataset
- Download from: https://www.kaggle.com/praveengovi/coronahack-chestxraydataset?select=Chest_xray_Corona_Metadata.csv
- Set the path variables in the coronahack_process.py script to match the downloaded dataset:
- dir.append('first_path_to_dataset_folder')
- dir.append('second_path_to_dataset_folder')
- dir_csv = '' # path to .csv file with labels
- dir_d = '' # path to destination folder
- Run the script.
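Because the images live in two dataset folders, a quick check that each image name from the labels file can be found in one of them may save a failed run. The following is a minimal sketch with a hypothetical helper, locate_images; it is not taken from coronahack_process.py, and the demo uses temporary folders and made-up file names.

```python
import tempfile
from pathlib import Path


def locate_images(image_names, folders):
    """Return {name: path} for names found in any folder, plus missing names."""
    found, missing = {}, []
    for name in image_names:
        for folder in folders:
            candidate = Path(folder) / name
            if candidate.is_file():
                found[name] = candidate
                break
        else:
            # No folder contained this file.
            missing.append(name)
    return found, missing


# Demo with temporary folders standing in for the two dataset folders.
with tempfile.TemporaryDirectory() as tmp:
    first, second = Path(tmp, "first"), Path(tmp, "second")
    first.mkdir()
    second.mkdir()
    (second / "img_1.jpeg").touch()
    found, missing = locate_images(["img_1.jpeg", "img_2.jpeg"], [first, second])
    found_names, missing_names = sorted(found), missing
    print(found_names, missing_names)
```

In practice the folders argument would be the same list that is built with dir.append(...) above, and the image names would come from the .csv file with labels.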