Anar Alimzade¶

WiSe 2023-24 | Passau, Germany.

This is a general overview of my work: various experiments and their results.¶

Experiments were conducted on Kaggle's servers initially and on the University's GPUs afterwards.


Tissue Microarrays - TMA¶

This section presents various methods and experiments for TMA classification.

* * *¶

Method 1: Augment TMA dataset, Train and Classify¶

  • Model Parameters: resnet50, Adam optimizer, 0.001 learning rate, CrossEntropy Loss, 4 epochs (see the setup sketch below)
  • Goal: classify TMA images
  • Result: the same results were not achieved on the real test set in the Kaggle submission, probably because of differences in TMA imaging techniques used across medical centers
  • Suggestions: more data, ensemble of models

Note: Before training, I ensured that each original image and all of its augmented variations end up in the same split, so there is no information leakage.
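
A minimal sketch of the training setup described in the bullets above, assuming a placeholder ImageFolder layout and batch size (none of these are taken from the original notebook):

In [ ]:
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Assumed input pipeline; "augmented_tma/train" is a hypothetical folder layout
transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
train_ds = datasets.ImageFolder("augmented_tma/train", transform=transform)
train_loader = DataLoader(train_ds, batch_size=16, shuffle=True)

# resnet50 backbone with a 5-class head (CC, EC, HGSC, LGSC, MC)
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 5)
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(4):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()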

* * *¶

Experiment 1: Augment 10 images per original image: 25 + 250 = 275 images in the resulting set¶

  • Goal: larger training set
  • Result: good results for given images
In [ ]:
datagen = ImageDataGenerator(
    rotation_range=45,            # Slight rotation
    brightness_range=[0.8, 1.2],  # Adjust brightness
    shear_range=0.1,              # Slight shear
    zoom_range=[0.95, 1.05],      # Slight zoom
    channel_shift_range=0.2,      # Channel shifts for color variation
    horizontal_flip=True,         # Horizontal flip
    vertical_flip=True,           # Vertical flip
    fill_mode='reflect'           # Fill mode
)
[Figure: sample augmented TMA images produced with the settings above]
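
For reference, a sketch of how a fixed number of augmented variants per original image could be produced with the generator above (the file path and per-image count here are illustrative):

In [ ]:
import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array, array_to_img

def augment_image(img_path, n_variants=10):
    """Generate n_variants augmented copies of one image using the `datagen` defined above."""
    x = np.expand_dims(img_to_array(load_img(img_path)), axis=0)  # batch of a single image
    flow = datagen.flow(x, batch_size=1)
    return [array_to_img(next(flow)[0]) for _ in range(n_variants)]

# 10 variants per original image, kept grouped with their source image so the
# whole group lands in the same train/test split (no information leakage)
variants = augment_image("tma_images/sample.png", n_variants=10)  # hypothetical path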
Evaluation¶
In [ ]:
# Classification Report: 55 images in test set: 5 original, 50 augmented
              precision    recall  f1-score   support

          CC       0.60      0.82      0.69        11
          EC       1.00      1.00      1.00        11
        HGSC       1.00      1.00      1.00        11
        LGSC       1.00      0.09      0.17        11
          MC       0.65      1.00      0.79        11

    accuracy                           0.78        55
   macro avg       0.85      0.78      0.73        55
weighted avg       0.85      0.78      0.73        55


------------¶

Experiment 2: Augment 13 images per original image: 25 + 325 = 350 images in the resulting set¶

  • Goal: increase accuracy
  • Result: improved all metrics slightly

In [ ]:
datagen = ImageDataGenerator(
    rotation_range=45,                   # Slight rotation
    brightness_range=[0.8, 1.2],         # Adjust brightness
    shear_range=0.1,                     # Slight shear
    zoom_range=[0.90, 1.05],             # Slight zoom
    channel_shift_range=0.3,             # Channel shifts for color variation
    horizontal_flip=True,                # Horizontal flip
    vertical_flip=True,                  # Vertical flip
    fill_mode='reflect'                  # Fill mode
)
[Figure: sample augmented TMA images produced with the settings above]
Evaluation¶
In [ ]:
# Classification Report: 70 images in test set: 5 original, 65 augmented
              precision    recall  f1-score   support

          CC       0.82      1.00      0.90        14
          EC       0.93      1.00      0.97        14
        HGSC       1.00      1.00      1.00        14
        LGSC       1.00      0.07      0.13        14
          MC       0.57      0.93      0.70        14

    accuracy                           0.80        70
   macro avg       0.86      0.80      0.74        70
weighted avg       0.86      0.80      0.74        70


------------¶

Experiment 3: Train with cross-validation (augmented set)¶

  • Goal: get more realistic evaluation
  • Result: lower, more realistic results
Evaluation¶
In [ ]:
# 5-Fold cross-validation results
Classification Report for Fold 1:
              precision    recall  f1-score   support

          CC       1.00      1.00      1.00        11
          EC       0.92      1.00      0.96        11
        HGSC       0.55      1.00      0.71        11
        LGSC       0.00      0.00      0.00        11
          MC       0.83      0.91      0.87        11

    accuracy                           0.78        55
   macro avg       0.66      0.78      0.71        55
weighted avg       0.66      0.78      0.71        55


Classification Report for Fold 2:
              precision    recall  f1-score   support

          CC       1.00      1.00      1.00        11
          EC       0.31      0.45      0.37        11
        HGSC       0.41      1.00      0.58        11
        LGSC       1.00      0.09      0.17        11
          MC       0.00      0.00      0.00        11

    accuracy                           0.51        55
   macro avg       0.54      0.51      0.42        55
weighted avg       0.54      0.51      0.42        55


Classification Report for Fold 3:
              precision    recall  f1-score   support

          CC       0.48      1.00      0.65        11
          EC       0.92      1.00      0.96        11
        HGSC       1.00      1.00      1.00        11
        LGSC       0.00      0.00      0.00        11
          MC       1.00      0.82      0.90        11

    accuracy                           0.76        55
   macro avg       0.68      0.76      0.70        55
weighted avg       0.68      0.76      0.70        55


Classification Report for Fold 4:
              precision    recall  f1-score   support

          CC       0.79      1.00      0.88        11
          EC       0.65      1.00      0.79        11
        HGSC       0.52      1.00      0.69        11
        LGSC       0.00      0.00      0.00        11
          MC       1.00      0.27      0.43        11

    accuracy                           0.65        55
   macro avg       0.59      0.65      0.56        55
weighted avg       0.59      0.65      0.56        55


Classification Report for Fold 5:
              precision    recall  f1-score   support

          CC       1.00      0.64      0.78        11
          EC       1.00      0.91      0.95        11
        HGSC       1.00      1.00      1.00        11
        LGSC       0.00      0.00      0.00        11
          MC       0.41      1.00      0.58        11

    accuracy                           0.71        55
   macro avg       0.68      0.71      0.66        55
weighted avg       0.68      0.71      0.66        55


In [3]:
## Average classification report
Average accuracy: 0.68
Out[3]:
       Class  Precision  Recall  F1-Score
          CC       0.85    0.93      0.86
          EC       0.76    0.87      0.81
        HGSC       0.70    1.00      0.80
        LGSC       0.20    0.02      0.03
          MC       0.65    0.60      0.56
   Macro Avg       0.63    0.68      0.61
Weighted Avg       0.63    0.68      0.61
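
For completeness, one way the per-fold reports could be averaged into the table above, assuming the true and predicted labels of each fold were collected in fold_true and fold_pred (hypothetical names):

In [ ]:
import pandas as pd
from sklearn.metrics import classification_report

# fold_true / fold_pred are assumed lists of per-fold label arrays
reports = []
for y_true, y_pred in zip(fold_true, fold_pred):
    rep = classification_report(y_true, y_pred, output_dict=True, zero_division=0)
    rep.pop("accuracy", None)              # keep only the per-class and average rows
    reports.append(pd.DataFrame(rep).T)

# Element-wise mean over the five fold reports gives the averaged table
avg_report = sum(reports) / len(reports)
print(avg_report[["precision", "recall", "f1-score"]].round(2))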

------------¶


Method 2: Divide TMA images into Tiles, Train and Classify¶

  • Model Parameters: lr=0.001, momentum=0.9, weight_decay=1e-4, CrossEntropyLoss, 7 epochs
  • Reduced overfitting: used stricter augmentation techniques, plus a resnet with dropout and weight decay
  • Goal: classify TMA images
  • Result: quite promising classification results, but noticeable overfitting
  • Suggestions: manual tile labeling by expert oncologist, more diverse TMA images, ensemble of models
In [7]:
# Tiles extracted from TMA images
[Figure: example tiles extracted from TMA images]
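
The tiling code itself is not shown here, so the following is only a minimal sketch of how such tiles could be cut, assuming non-overlapping square tiles of a fixed size (both the tile size and the path are placeholders):

In [ ]:
from PIL import Image

def extract_tiles(img_path, tile_size=224):
    """Slice a TMA image into non-overlapping square tiles; edge remainders are dropped."""
    img = Image.open(img_path).convert("RGB")
    tiles = []
    for top in range(0, img.height - tile_size + 1, tile_size):
        for left in range(0, img.width - tile_size + 1, tile_size):
            tiles.append(img.crop((left, top, left + tile_size, top + tile_size)))
    return tiles

tiles = extract_tiles("tma_images/sample.png")  # hypothetical path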

Note: Before training, I ensured that all tiles from the same image end up in the same split, so there is no information leakage.
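
One way to enforce this constraint is a group-aware split in which the group id is the source TMA image; a sketch with scikit-learn (the tile lists are assumed placeholders):

In [ ]:
from sklearn.model_selection import GroupShuffleSplit

# tile_paths, tile_labels, tile_groups are assumed equal-length lists;
# tile_groups[i] is the id of the TMA image that tile i was cut from.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(splitter.split(tile_paths, tile_labels, groups=tile_groups))

# Every tile of a given TMA image ends up on exactly one side of the split
assert not {tile_groups[i] for i in train_idx} & {tile_groups[i] for i in test_idx}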

* * *¶

Experiment 1: resnet18, SGD optimizer¶

Evaluation¶
In [ ]:
# Majority voting of tiles linked to original TMA image
## Classification results on 5 TMA images using majority voting from 549 tiles
              precision    recall  f1-score   support

        HGSC       0.00      0.00      0.00         1
        LGSC       0.00      0.00      0.00         1
          EC       0.00      0.00      0.00         1
          CC       1.00      1.00      1.00         1
          MC       0.50      1.00      0.67         1

    accuracy                           0.40         5
   macro avg       0.30      0.40      0.33         5
weighted avg       0.30      0.40      0.33         5
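
For context, the image-level predictions above come from majority voting over each image's tile predictions; a self-contained sketch of that aggregation (the toy data is illustrative):

In [ ]:
from collections import Counter, defaultdict

def majority_vote(tile_preds, tile_groups):
    """Aggregate tile-level predictions into one label per source TMA image."""
    votes = defaultdict(list)
    for pred, group in zip(tile_preds, tile_groups):
        votes[group].append(pred)
    return {group: Counter(preds).most_common(1)[0][0] for group, preds in votes.items()}

# Toy example: two images ("A", "B") with three tiles each
tile_preds = ["CC", "CC", "MC", "HGSC", "HGSC", "EC"]
tile_groups = ["A", "A", "A", "B", "B", "B"]
print(majority_vote(tile_preds, tile_groups))  # {'A': 'CC', 'B': 'HGSC'}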

In [ ]:
# Tile classification
## Classification results for predictions on 549 tile images from 5 TMA
              precision    recall  f1-score   support

        HGSC       0.61      0.40      0.48       100
        LGSC       0.03      0.05      0.03       104
          EC       0.38      0.15      0.21       161
          CC       0.62      0.78      0.70        93
          MC       0.41      0.51      0.45        91

    accuracy                           0.34       549
   macro avg       0.41      0.38      0.38       549
weighted avg       0.40      0.34      0.35       549


------------¶

Experiment 2: resnet50, SGD optimizer¶

  • Goal: get better accuracy
  • Result: per-image accuracy remained the same, while per-tile results got worse
Evaluation¶
In [ ]:
# Majority voting of tiles linked to original TMA image
## Classification results on 5 TMA images using majority voting from 549 tiles
              precision    recall  f1-score   support

        HGSC       0.00      0.00      0.00         1
        LGSC       0.00      0.00      0.00         1
          EC       0.00      0.00      0.00         1
          CC       1.00      1.00      1.00         1
          MC       0.50      1.00      0.67         1

    accuracy                           0.40         5
   macro avg       0.30      0.40      0.33         5
weighted avg       0.30      0.40      0.33         5

In [ ]:
# Tile classification
## Classification results for predictions on 549 tile images from 5 TMA
              precision    recall  f1-score   support

        HGSC       0.40      0.48      0.44       100
        LGSC       0.06      0.10      0.08       104
          EC       0.36      0.11      0.17       161
          CC       0.80      0.48      0.60        93
          MC       0.35      0.65      0.45        91

    accuracy                           0.33       549
   macro avg       0.40      0.36      0.35       549
weighted avg       0.38      0.33      0.32       549

------------¶

Experiment 3: resnet18, Adam optimizer¶

  • Goal: get better accuracy
  • Result: much faster training, less overfitting, lower per-image accuracy, higher per-tile precision
Evaluation¶
In [ ]:
# Majority voting of tiles linked to original TMA image
## Classification results on 5 TMA images using majority voting from 549 tiles
              precision    recall  f1-score   support

        HGSC       0.00      0.00      0.00         1
        LGSC       0.00      0.00      0.00         1
          EC       1.00      1.00      1.00         1
          CC       0.00      0.00      0.00         1
          MC       0.00      0.00      0.00         1

    accuracy                           0.20         5
   macro avg       0.20      0.20      0.20         5
weighted avg       0.20      0.20      0.20         5

In [ ]:
# Tile classification
## Classification results for predictions on 549 tile images from 5 TMA
              precision    recall  f1-score   support

        HGSC       0.86      0.06      0.11       100
        LGSC       0.14      0.21      0.17       104
          EC       0.52      0.73      0.61       161
          CC       0.38      0.15      0.22        93
          MC       0.20      0.26      0.23        91

    accuracy                           0.33       549
   macro avg       0.42      0.28      0.27       549
weighted avg       0.43      0.33      0.30       549

------------¶

Experiment 4: resnet18, Adam optimizer, 3 epochs¶

  • Goal: get better accuracy, less overfitting
  • Result: better accuracy, less overfitting

Note: 3 out of 5 images correctly classified

Evaluation¶
In [ ]:
# Majority voting of tiles linked to original TMA image
## Classification results on 5 TMA images using majority voting from 549 tiles
              precision    recall  f1-score   support

        HGSC       0.50      1.00      0.67         1
        LGSC       0.00      0.00      0.00         1
          EC       0.00      0.00      0.00         1
          CC       1.00      1.00      1.00         1
          MC       0.50      1.00      0.67         1

    accuracy                           0.60         5
   macro avg       0.40      0.60      0.47         5
weighted avg       0.40      0.60      0.47         5

In [ ]:
# Tile classification
## Classification results for predictions on 549 tile images from 5 TMA

# 258 out of 549 tiles correctly classified
              precision    recall  f1-score   support

        HGSC       0.41      0.93      0.57       100
        LGSC       0.15      0.05      0.07       104
          EC       0.42      0.12      0.18       161
          CC       0.62      0.89      0.73        93
          MC       0.51      0.62      0.56        91

    accuracy                           0.47       549
   macro avg       0.42      0.52      0.42       549
weighted avg       0.42      0.47      0.39       549

In [ ]:
# Graphical representation of training process
[Figure: training process plots]

------------¶

***¶

Whole Slide Images - WSI¶

This section presents various methods and experiments for WSI classification.

* * *¶

Method 1: MIL classifier with clustering¶

  • Model Parameters: AttentionMIL, Adam optimizer, 0.001 learning rate, CrossEntropy Loss (see the model sketch after this list)
  • Goal: classify WSI images using representative features from clustering
  • Result: quite impressive, an idea for future work!
  • Suggestions: a complete research effort dedicated to this approach (it could be framed as a separate classification task)
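
The model definition itself is not included here, so the following is only a rough sketch of an attention-based MIL classifier in the spirit of AttentionMIL (the layer sizes and the toy bag are assumptions):

In [ ]:
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Attention-pooled MIL classifier: a bag of feature vectors -> one slide-level label."""
    def __init__(self, feat_dim=2048, hidden_dim=256, n_classes=5):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, bag):                                  # bag: (n_instances, feat_dim)
        weights = torch.softmax(self.attention(bag), dim=0)  # attention over instances
        bag_repr = (weights * bag).sum(dim=0)                # weighted mean of the bag
        return self.classifier(bag_repr), weights

model = AttentionMIL()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# One training step on a single toy bag (300 instance features, class index 2)
bag, label = torch.randn(300, 2048), torch.tensor([2])
logits, _ = model(bag)
optimizer.zero_grad()
loss = criterion(logits.unsqueeze(0), label)
loss.backward()
optimizer.step()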

Experiment 1¶

  • Goal: get good accuracy
  • Result: 44.4% accuracy (a 15-20% increase compared to the same model without clustering)

Note: 4 out of 9 images classified correctly

In [ ]:
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'LGSC': 3, 'MC': 6, 'HGSC': 3, 'CC': 2, 'EC': 3}
Validation balance: {'HGSC': 2, 'MC': 2, 'CC': 1, 'EC': 1, 'LGSC': 1}
Test balance: {'EC': 2, 'LGSC': 2, 'MC': 2, 'HGSC': 2, 'CC': 1}
In [ ]:
# UMAP embed feature vectors
# Cluster with HDBSCAN
# Then identify labels of obvious clusters and feed it to model as representative features
[Figure: UMAP embedding of feature vectors with HDBSCAN cluster assignments]
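
A sketch of the embedding-and-clustering step outlined in the cell above, assuming a features array of per-instance feature vectors (the file name and clustering parameters are illustrative):

In [ ]:
import numpy as np
import umap
import hdbscan

features = np.load("wsi_tile_features.npy")  # hypothetical (n_instances, feat_dim) array

# Embed the high-dimensional feature vectors in 2D with UMAP
embedding = umap.UMAP(n_components=2, random_state=42).fit_transform(features)

# Cluster the embedding with HDBSCAN; label -1 marks noise points
clusterer = hdbscan.HDBSCAN(min_cluster_size=25, min_samples=5)
cluster_labels = clusterer.fit_predict(embedding)

# Instances from clearly separated clusters can then serve as the
# representative features that are fed to the MIL model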
Evaluation¶
In [ ]:
# Classification Results
Accuracy: 0.4444
Precision: 0.5500
Recall: 0.4000
F1 Score: 0.3667
In [ ]:
# print(predicted_labels)
# print(true_label_names)
['CC', 'LGSC', 'LGSC', 'MC', 'HGSC', 'HGSC', 'MC', 'MC', 'MC']
['EC', 'LGSC', 'MC', 'MC', 'HGSC', 'HGSC', 'LGSC', 'CC', 'EC']

------------¶

Experiment 2: made a balanced split, used fewer representative features to augment the training bag, added an early stopping mechanism¶

  • Goal: get better results
  • Result: 5.6% better accuracy, ~13% better precision, ~10% better recall and F1-score

Note: 5 out of 10 images classified correctly

In [ ]:
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'HGSC': 6, 'MC': 6, 'CC': 6, 'EC': 6, 'LGSC': 6}
Validation balance: {'LGSC': 2, 'EC': 2, 'HGSC': 2, 'CC': 2, 'MC': 2}
Test balance: {'LGSC': 2, 'MC': 2, 'HGSC': 2, 'EC': 2, 'CC': 2}
In [ ]:
# UMAP embed feature vectors
# Cluster with HDBSCAN
# Then identify labels of obvious clusters and feed it to model as representative features
[Figure: UMAP embedding of feature vectors with HDBSCAN cluster assignments]
Evaluation¶
In [ ]:
# Classification Results
Accuracy: 0.5000
Precision: 0.6833
Recall: 0.5000
F1 Score: 0.4600
In [ ]:
# print(predicted_labels)
# print(true_label_names)
['LGSC', 'HGSC', 'LGSC', 'HGSC', 'EC', 'LGSC', 'CC', 'HGSC', 'HGSC', 'CC']
['LGSC', 'MC', 'HGSC', 'EC', 'EC', 'LGSC', 'MC', 'HGSC', 'CC', 'CC']

------------¶

Experiment 3: larger HDBSCAN parameters (epsilon, min_samples, min_cluster_size) + added a ReduceLR scheduler¶

  • Goal: get better results
  • Result: accuracy remained the same, while F1-score and precision dropped a bit (4-8%)

Note: 5 out of 10 images classified correctly

In [ ]:
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'HGSC': 6, 'MC': 6, 'CC': 6, 'EC': 6, 'LGSC': 6}
Validation balance: {'LGSC': 2, 'EC': 2, 'HGSC': 2, 'CC': 2, 'MC': 2}
Test balance: {'LGSC': 2, 'MC': 2, 'HGSC': 2, 'EC': 2, 'CC': 2}
In [ ]:
# UMAP embed feature vectors
# Cluster with HDBSCAN
# Then identify labels of obvious clusters and feed it to model as representative features
[Figure: UMAP embedding of feature vectors with HDBSCAN cluster assignments]
Evaluation¶
In [ ]:
# Classification Results
Accuracy: 0.5000
Precision: 0.6133
Recall: 0.5000
F1 Score: 0.4076
In [ ]:
# print(predicted_labels)
# print(true_label_names)
['LGSC', 'EC', 'EC', 'EC', 'LGSC', 'CC', 'HGSC', 'EC', 'LGSC', 'EC']
['CC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'HGSC', 'EC', 'LGSC', 'EC']

------------¶

Method 2: MIL classifier¶

  • Model Parameters: AttentionMIL, Adam optimizer, 0.001 learning rate, CrossEntropy Loss
  • Goal: classify WSI images
  • Result: good!
  • Suggestions: more data (more was available, but it was class-imbalanced)

Experiment 1: train for 20 epochs¶

  • Goal: get good accuracy
  • Result: 62.5% accuracy

Note: 25 out of 40 images classified correctly

In [ ]:
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'CC': 28, 'EC': 28, 'LGSC': 28, 'HGSC': 28, 'MC': 28}
Validation balance: {'HGSC': 4, 'CC': 4, 'LGSC': 4, 'MC': 4, 'EC': 4}
Test balance: {'MC': 8, 'HGSC': 8, 'LGSC': 8, 'EC': 8, 'CC': 8}
Evaluation¶
In [ ]:
# Classification Results
Accuracy: 0.6250
Precision: 0.6058
Recall: 0.6250
F1 Score: 0.6097
In [ ]:
# print(predicted_labels)
# print(true_label_names)
['MC', 'EC', 'HGSC', 'LGSC', 'LGSC', 'EC', 'LGSC', 'EC', 'HGSC', 'EC', 'MC', 'MC', 'EC', 'LGSC', 'MC', 'EC', 'CC', 'HGSC', 'MC', 'CC', 'CC', 'HGSC', 'CC', 'MC', 'MC', 'LGSC', 'MC', 'HGSC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'HGSC', 'CC', 'EC', 'MC', 'LGSC', 'LGSC']
['MC', 'HGSC', 'HGSC', 'LGSC', 'LGSC', 'HGSC', 'EC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'MC', 'LGSC', 'LGSC', 'EC', 'EC', 'CC', 'EC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'EC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'EC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC']

------------¶

Experiment 2: test the effect of adding ReduceLROnPlateau¶

  • Goal: get better results
  • Result: significantly better; 77.5% accuracy (+15 percentage points)

Note: 31 out of 40 images classified correctly

In [ ]:
# from torch.optim.lr_scheduler import ReduceLROnPlateau
# scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=2, verbose=True)
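In [ ]:
# For context, the scheduler only acts when it is stepped with the monitored metric at
# the end of each epoch; a minimal sketch of that call (the epoch loop and the
# train/evaluate helpers are placeholders, not code from the original notebook):
#
# for epoch in range(num_epochs):
#     train_one_epoch(model, train_loader)      # hypothetical training helper
#     val_loss = evaluate(model, val_loader)    # hypothetical evaluation helper
#     scheduler.step(val_loss)                  # lr *= factor after `patience` epochs without improvement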
In [ ]:
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'CC': 28, 'EC': 28, 'LGSC': 28, 'HGSC': 28, 'MC': 28}
Validation balance: {'HGSC': 4, 'CC': 4, 'LGSC': 4, 'MC': 4, 'EC': 4}
Test balance: {'MC': 8, 'HGSC': 8, 'LGSC': 8, 'EC': 8, 'CC': 8}
Evaluation¶
In [ ]:
# Classification Results
Accuracy: 0.7750
Precision: 0.7648
Recall: 0.7750
F1 Score: 0.7608
In [ ]:
# print(predicted_labels)
# print(true_label_names)
['MC', 'LGSC', 'EC', 'LGSC', 'LGSC', 'HGSC', 'LGSC', 'EC', 'HGSC', 'HGSC', 'MC', 'MC', 'LGSC', 'LGSC', 'MC', 'EC', 'CC', 'HGSC', 'CC', 'CC', 'CC', 'HGSC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'EC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'EC', 'CC', 'HGSC', 'MC', 'LGSC', 'LGSC']
['MC', 'HGSC', 'HGSC', 'LGSC', 'LGSC', 'HGSC', 'EC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'MC', 'LGSC', 'LGSC', 'EC', 'EC', 'CC', 'EC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'EC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'EC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC']

------------¶

Experiment 3: reduce original WSI dimensions by factor of 2¶

  • Goal: address computation limitations of Kaggle submissions (execution time)
  • Result: faster, but still not fast enough; 70% accuracy (a 7.5% drop in accuracy)

Note: 28 out of 40 images classified correctly

In [ ]:
# img = img.resize((img.width // 2, img.height // 2))
In [ ]:
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'CC': 28, 'EC': 28, 'LGSC': 28, 'HGSC': 28, 'MC': 28}
Validation balance: {'HGSC': 4, 'CC': 4, 'LGSC': 4, 'MC': 4, 'EC': 4}
Test balance: {'MC': 8, 'HGSC': 8, 'LGSC': 8, 'EC': 8, 'CC': 8}
Evaluation¶
In [ ]:
# Classification Results
Accuracy: 0.7000
Precision: 0.7062
Recall: 0.7000
F1 Score: 0.6948
In [ ]:
# print(predicted_labels)
# print(true_label_names)
['MC', 'LGSC', 'EC', 'LGSC', 'LGSC', 'HGSC', 'HGSC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'EC', 'LGSC', 'LGSC', 'MC', 'LGSC', 'CC', 'HGSC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'LGSC', 'MC', 'LGSC', 'MC', 'EC', 'HGSC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'MC', 'EC', 'HGSC', 'MC', 'LGSC', 'EC']
['MC', 'HGSC', 'HGSC', 'LGSC', 'LGSC', 'HGSC', 'EC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'MC', 'LGSC', 'LGSC', 'EC', 'EC', 'CC', 'EC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'EC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'EC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC']

------------¶

Experiment 4: reduce original WSI dimensions by factor of 5¶

  • Goal: address computation limitations of Kaggle submissions (execution time)
  • Result: success, and accuracy is the same as in the previous experiment (70%)

Note: 28 out of 40 images classified correctly

In [ ]:
# img = img.resize((img.width // 5, img.height // 5))
In [ ]:
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'CC': 28, 'EC': 28, 'LGSC': 28, 'HGSC': 28, 'MC': 28}
Validation balance: {'HGSC': 4, 'CC': 4, 'LGSC': 4, 'MC': 4, 'EC': 4}
Test balance: {'MC': 8, 'HGSC': 8, 'LGSC': 8, 'EC': 8, 'CC': 8}
Evaluation¶
In [ ]:
# Classification Results
Accuracy: 0.7000
Precision: 0.7062
Recall: 0.7000
F1 Score: 0.6948
In [ ]:
# print(predicted_labels)
# print(true_label_names)
['MC', 'LGSC', 'EC', 'LGSC', 'LGSC', 'HGSC', 'HGSC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'EC', 'LGSC', 'LGSC', 'MC', 'LGSC', 'CC', 'HGSC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'LGSC', 'MC', 'LGSC', 'MC', 'EC', 'HGSC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'MC', 'EC', 'HGSC', 'MC', 'LGSC', 'EC']
['MC', 'HGSC', 'HGSC', 'LGSC', 'LGSC', 'HGSC', 'EC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'MC', 'LGSC', 'LGSC', 'EC', 'EC', 'CC', 'EC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'EC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'EC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC']

------------¶

Experiment 5: changing the split (+20 images in training set)¶

  • Goal: increase overall classification results
  • Result: success; all metrics improved (80% accuracy)

Note: 16 out of 20 images classified correctly

In [ ]:
# img = img.resize((img.width // 5, img.height // 5))
In [ ]:
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'LGSC': 32, 'EC': 32, 'CC': 32, 'HGSC': 32, 'MC': 32}
Validation balance: {'HGSC': 4, 'MC': 4, 'EC': 4, 'LGSC': 4, 'CC': 4}
Test balance: {'MC': 4, 'EC': 4, 'LGSC': 4, 'CC': 4, 'HGSC': 4}
In [ ]:
# Training
Epoch 1/15, Train Loss: 1.6114, Train Acc: 18.24%, Validation Loss: 1.5640, Val Acc: 50.00%
Epoch 2/15, Train Loss: 1.5170, Train Acc: 33.33%, Validation Loss: 1.4260, Val Acc: 50.00%
Epoch 3/15, Train Loss: 1.3571, Train Acc: 40.25%, Validation Loss: 1.2833, Val Acc: 40.00%
Epoch 4/15, Train Loss: 1.1448, Train Acc: 61.01%, Validation Loss: 1.1208, Val Acc: 60.00%
Epoch 5/15, Train Loss: 0.9701, Train Acc: 65.41%, Validation Loss: 0.9738, Val Acc: 70.00%
Epoch 6/15, Train Loss: 0.7930, Train Acc: 73.58%, Validation Loss: 0.8247, Val Acc: 80.00%
Epoch 7/15, Train Loss: 0.6699, Train Acc: 77.99%, Validation Loss: 0.7731, Val Acc: 75.00%
Epoch 8/15, Train Loss: 0.5448, Train Acc: 82.39%, Validation Loss: 0.7732, Val Acc: 75.00%
Epoch 9/15, Train Loss: 0.4730, Train Acc: 87.42%, Validation Loss: 0.8031, Val Acc: 75.00%
Epoch 10/15, Train Loss: 0.4041, Train Acc: 88.05%, Validation Loss: 0.7058, Val Acc: 85.00%
Epoch 11/15, Train Loss: 0.3201, Train Acc: 94.34%, Validation Loss: 0.7301, Val Acc: 85.00%
Epoch 12/15, Train Loss: 0.2325, Train Acc: 93.71%, Validation Loss: 0.6632, Val Acc: 85.00%
Epoch 13/15, Train Loss: 0.2025, Train Acc: 94.97%, Validation Loss: 0.6887, Val Acc: 80.00%
Epoch 14/15, Train Loss: 0.1498, Train Acc: 97.48%, Validation Loss: 0.7251, Val Acc: 80.00%
Epoch 00015: reducing learning rate of group 0 to 1.0000e-04.
Epoch 15/15, Train Loss: 0.1125, Train Acc: 97.48%, Validation Loss: 0.7668, Val Acc: 75.00%
Evaluation¶
In [ ]:
# Classification Results
Accuracy: 0.8000
Precision: 0.8476
Recall: 0.8000
F1 Score: 0.8026
In [ ]:
# print(predicted_labels)
# print(true_label_names)
['MC', 'EC', 'MC', 'MC', 'LGSC', 'CC', 'LGSC', 'HGSC', 'HGSC', 'LGSC', 'CC', 'EC', 'HGSC', 'CC', 'HGSC', 'EC', 'HGSC', 'HGSC', 'LGSC', 'HGSC']
['MC', 'EC', 'MC', 'MC', 'LGSC', 'CC', 'LGSC', 'EC', 'EC', 'LGSC', 'CC', 'EC', 'HGSC', 'CC', 'HGSC', 'MC', 'HGSC', 'HGSC', 'LGSC', 'CC']

------------¶