SegPath : The largest-scale datasets for the segmentation of cancer histology images

Overview

SegPath is a dataset for the semantic segmentation of H&E-stained images, covering eight major cell types in tumor tissue.

The dataset was constructed by immunofluorescence (IF) restaining. First, sections were stained with H&E and digitized using a slide scanner to create whole slide images (WSIs). After the H&E-stained sections were destained with alcohol and autoclave processing, IF staining and 4',6-diamidino-2-phenylindole dihydrochloride (DAPI) nuclear staining were performed using antibodies that specifically recognize each cell type. The slides were then digitized again. Multiresolution rigid registration between the H&E and IF images was performed so that the haematoxylin component in the H&E images and DAPI in the IF images, both of which mark nuclei, were aligned.

Cell types in SegPath

epithelial cells (anti-panCK)
smooth muscle cells/myofibroblasts (anti-αSMA)
red blood cells (anti-CD235a)
leukocytes (anti-CD45RB)
lymphocytes (anti-CD3/CD20)
endothelial cells (anti-ERG)
plasma cells (anti-MIST1)
myeloid cells (anti-MNDA)
epithelial cells (anti-panCK)
DL link : https://zenodo.org/record/7412731
smooth muscle cells/myofibroblasts (anti-αSMA)
DL link : https://zenodo.org/record/7412732
red blood cells (anti-CD235a)
DL link : https://zenodo.org/record/7412580
leukocytes (anti-CD45RB)
DL link : https://zenodo.org/record/7412739
lymphocytes (anti-CD3/CD20)
DL link : https://zenodo.org/record/7412529
endothelial cells (anti-ERG)
DL link : https://zenodo.org/record/7412512
plasma cells (anti-MIST1)
DL link : https://zenodo.org/record/7412500
myeloid cells (anti-MNDA)
DL link : https://zenodo.org/record/7412690

Dataset organization

Each tar.gz archive contains the following files:
- H&E image file: {antigen}_{celltype}_{slideID}_{posx}_{posy}_HE.png
- Mask image file: {antigen}_{celltype}_{slideID}_{posx}_{posy}_mask.png
Each image file is 984x984 px.
posx and posy give the leftmost position of the image in WSI coordinates.
Mask files store a binary segmentation mask (background: 0, target: 1).
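The naming scheme above can be parsed with a few lines of Python. The following is a minimal sketch, not part of the dataset tooling: the names `PatchName`, `parse_patch_name`, and `mask_name_for` are hypothetical, the example filename is made up for illustration, and the code assumes that slideID, posx, and posy contain no underscores and that the antigen itself contains none either.

```python
from typing import NamedTuple


class PatchName(NamedTuple):
    """Fields encoded in a SegPath file name (hypothetical helper type)."""
    antigen: str
    celltype: str
    slide_id: str
    posx: int
    posy: int
    kind: str  # "HE" or "mask"


def parse_patch_name(filename: str) -> PatchName:
    """Split {antigen}_{celltype}_{slideID}_{posx}_{posy}_{HE|mask}.png."""
    stem = filename.rsplit(".", 1)[0]
    # Split from the right so that underscores inside the cell type survive.
    head, slide_id, posx, posy, kind = stem.rsplit("_", 4)
    # Assumption: the antigen is everything before the first underscore.
    antigen, celltype = head.split("_", 1)
    if kind not in ("HE", "mask"):
        raise ValueError(f"unexpected suffix in {filename!r}")
    return PatchName(antigen, celltype, slide_id, int(posx), int(posy), kind)


def mask_name_for(he_filename: str) -> str:
    """Return the mask file name that pairs with an H&E file name."""
    suffix = "_HE.png"
    if not he_filename.endswith(suffix):
        raise ValueError(f"not an H&E file name: {he_filename!r}")
    return he_filename[: -len(suffix)] + "_mask.png"


# Made-up file name, used only to illustrate the pattern.
info = parse_patch_name("panCK_Epithelium_001_1024_2048_HE.png")
print(info.antigen, info.slide_id, info.posx, info.posy)
```

Pairing images and masks by name in this way avoids relying on the order in which files are listed inside the archive.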

A CSV file contains the following information:
antigen: the antigen whose antibodies were used to create the segmentation mask.
filename: the filename of the image or mask file.
train_val_test: whether the sample was used for training, validation, or testing in the paper.
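As an illustration of how the CSV can be used to reproduce the split, the sketch below groups filenames by the value of the train_val_test column using only the standard library. It assumes the CSV has a header row with exactly the three column names listed above. The function name `group_by_split`, the example rows, and the split labels shown are hypothetical, so check the actual file for the values it uses.

```python
import csv
import io
from collections import defaultdict


def group_by_split(csv_file):
    """Map each train_val_test value to the list of filenames carrying it."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["train_val_test"]].append(row["filename"])
    return dict(groups)


# Made-up rows standing in for the real CSV; with the real file, pass
# open(path, newline="") instead of the StringIO object.
example = io.StringIO(
    "antigen,filename,train_val_test\n"
    "panCK,panCK_Epithelium_001_0_0_HE.png,train\n"
    "panCK,panCK_Epithelium_002_0_0_HE.png,test\n"
)
for split, names in group_by_split(example).items():
    print(split, len(names))
```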

Licenses

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC-BY-NC-SA 4.0).
For non-commercial use, please use the dataset under CC-BY-NC-SA.
If you would like to use the dataset for commercial purposes, please contact us (ishum-prm@m.u-tokyo.ac.jp).
