Experiments were conducted on Kaggle's servers initially and on the university's GPUs afterwards.
Method 1: Augment TMA dataset, Train and Classify¶
- Model Parameters: resnet50, Adam optimizer, 0.001 learning rate, CrossEntropy Loss, 4 epochs
- Goal: classify TMA images
- Result: did not achieve the same results on the real test set in the Kaggle submission, probably because of differences in the TMA imaging techniques used by the various medical centers
- Suggestions: more data, ensemble of models
Note: Before training, I ensured that each real image and all of its augmented variations land in the same split, so there is no information leakage.
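The leakage-free split described in the note can be sketched with scikit-learn's `GroupShuffleSplit`; the file names and the group-id derivation below are illustrative, not the exact code used:

```python
from sklearn.model_selection import GroupShuffleSplit

# Each augmented file shares the id of the original it was derived from,
# so all variants of one image fall into the same split.
files = ["img1.png", "img1_aug0.png", "img1_aug1.png",
         "img2.png", "img2_aug0.png", "img2_aug1.png"]
groups = [f.split("_")[0].split(".")[0] for f in files]  # -> original image id

splitter = GroupShuffleSplit(n_splits=1, test_size=0.5, random_state=0)
train_idx, test_idx = next(splitter.split(files, groups=groups))

train_groups = {groups[i] for i in train_idx}
test_groups = {groups[i] for i in test_idx}
assert train_groups.isdisjoint(test_groups)  # no original leaks across splits
```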
* * *¶
Experiment 1: Augment 10 images per original: 25 + 250 = 275 images in the resulting set¶
- Goal: larger training set
- Result: good results for given images
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
rotation_range=45, # Slight rotation
brightness_range=[0.8, 1.2], # Adjust brightness
shear_range=0.1, # Slight shear
zoom_range=[0.95, 1.05], # Slight zoom
channel_shift_range=0.2, # Channel shifts for color variation
horizontal_flip=True, # Horizontal flip
vertical_flip=True, # Vertical flip
fill_mode='reflect' # Fill mode
)
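Given a generator like the one above, a fixed number of augmented variants per original can be produced with `datagen.flow`; the image shape and variant count below are illustrative:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=45, horizontal_flip=True,
                             vertical_flip=True, fill_mode='reflect')

image = np.random.rand(1, 224, 224, 3)  # one original image, batch axis first

augmented = []
for batch in datagen.flow(image, batch_size=1):  # yields batches indefinitely
    augmented.append(batch[0])
    if len(augmented) == 10:  # stop after 10 augmented variants per original
        break
```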
Evaluation¶
# Classification Report: 55 images in test set: 5 original, 50 augmented
precision recall f1-score support
CC 0.60 0.82 0.69 11
EC 1.00 1.00 1.00 11
HGSC 1.00 1.00 1.00 11
LGSC 1.00 0.09 0.17 11
MC 0.65 1.00 0.79 11
accuracy 0.78 55
macro avg 0.85 0.78 0.73 55
weighted avg 0.85 0.78 0.73 55
------------¶
Experiment 2: Augment 13 images per original: 25 + 325 = 350 images in the resulting set¶
- Goal: increase accuracy
- Result: improved all metrics slightly
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
rotation_range=45, # Slight rotation
brightness_range=[0.8, 1.2], # Adjust brightness
shear_range=0.1, # Slight shear
zoom_range=[0.90, 1.05], # Slight zoom
channel_shift_range=0.3, # Channel shifts for color variation
horizontal_flip=True, # Horizontal flip
vertical_flip=True, # Vertical flip
fill_mode='reflect' # Fill mode
)
Evaluation¶
# Classification Report: 70 images in test set: 5 original, 65 augmented
precision recall f1-score support
CC 0.82 1.00 0.90 14
EC 0.93 1.00 0.97 14
HGSC 1.00 1.00 1.00 14
LGSC 1.00 0.07 0.13 14
MC 0.57 0.93 0.70 14
accuracy 0.80 70
macro avg 0.86 0.80 0.74 70
weighted avg 0.86 0.80 0.74 70
------------¶
Experiment 3: Train with cross-validation (augmented set)¶
- Goal: get more realistic evaluation
- Result: lower, more realistic results
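To keep the evaluation honest, each original image and its augmented variants must stay within a single fold; a sketch with scikit-learn's `GroupKFold` (the counts mirror Experiment 1's 275-image set, the features are placeholders):

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# 25 originals with 10 augmented variants each -> 275 samples; the group id
# ties every augmented variant back to its original image.
groups = np.repeat(np.arange(25), 11)   # 25 * (1 original + 10 augmented)
X = np.zeros((len(groups), 1))          # placeholder features
y = np.repeat(np.arange(5), 55)         # 5 classes, 55 images per class

for fold, (train_idx, val_idx) in enumerate(
        GroupKFold(n_splits=5).split(X, y, groups)):
    # No original image (group) may appear on both sides of a fold.
    assert set(groups[train_idx]).isdisjoint(set(groups[val_idx]))
```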
Evaluation¶
# 5-Fold cross-validation results
Classification Report for Fold 1:
precision recall f1-score support
CC 1.00 1.00 1.00 11
EC 0.92 1.00 0.96 11
HGSC 0.55 1.00 0.71 11
LGSC 0.00 0.00 0.00 11
MC 0.83 0.91 0.87 11
accuracy 0.78 55
macro avg 0.66 0.78 0.71 55
weighted avg 0.66 0.78 0.71 55
Classification Report for Fold 2:
precision recall f1-score support
CC 1.00 1.00 1.00 11
EC 0.31 0.45 0.37 11
HGSC 0.41 1.00 0.58 11
LGSC 1.00 0.09 0.17 11
MC 0.00 0.00 0.00 11
accuracy 0.51 55
macro avg 0.54 0.51 0.42 55
weighted avg 0.54 0.51 0.42 55
Classification Report for Fold 3:
precision recall f1-score support
CC 0.48 1.00 0.65 11
EC 0.92 1.00 0.96 11
HGSC 1.00 1.00 1.00 11
LGSC 0.00 0.00 0.00 11
MC 1.00 0.82 0.90 11
accuracy 0.76 55
macro avg 0.68 0.76 0.70 55
weighted avg 0.68 0.76 0.70 55
Classification Report for Fold 4:
precision recall f1-score support
CC 0.79 1.00 0.88 11
EC 0.65 1.00 0.79 11
HGSC 0.52 1.00 0.69 11
LGSC 0.00 0.00 0.00 11
MC 1.00 0.27 0.43 11
accuracy 0.65 55
macro avg 0.59 0.65 0.56 55
weighted avg 0.59 0.65 0.56 55
Classification Report for Fold 5:
precision recall f1-score support
CC 1.00 0.64 0.78 11
EC 1.00 0.91 0.95 11
HGSC 1.00 1.00 1.00 11
LGSC 0.00 0.00 0.00 11
MC 0.41 1.00 0.58 11
accuracy 0.71 55
macro avg 0.68 0.71 0.66 55
weighted avg 0.68 0.71 0.66 55
## Average classification report
Average accuracy: 0.68
| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| CC | 0.85 | 0.93 | 0.86 |
| EC | 0.76 | 0.87 | 0.81 |
| HGSC | 0.70 | 1.00 | 0.80 |
| LGSC | 0.20 | 0.02 | 0.03 |
| MC | 0.65 | 0.60 | 0.56 |
| Macro Avg | 0.63 | 0.68 | 0.61 |
| Weighted Avg | 0.63 | 0.68 | 0.61 |
------------¶
Method 2: Divide TMA images into Tiles, Train and Classify¶
- Model Parameters: lr=0.001, momentum=0.9, weight_decay=1e-4, CrossEntropyLoss, 7 epochs
- Reduced overfitting: used stricter augmentation techniques and a ResNet with dropout and weight decay
- Goal: classify TMA images
- Result: quite promising classification results, though with overfitting
- Suggestions: manual tile labeling by expert oncologist, more diverse TMA images, ensemble of models
# Tiles extracted from TMA images
Note: Before training, I ensured that all tiles from the same image land in the same split, so there is no information leakage.
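The tiling step can be sketched as a grid crop over the TMA image; the tile size and stride below are assumptions, not the values used in the experiments:

```python
import numpy as np

def extract_tiles(img: np.ndarray, tile: int = 224, stride: int = 224):
    """Cut a (H, W, C) image into tile x tile crops on a regular grid."""
    h, w = img.shape[:2]
    tiles = []
    for y in range(0, h - tile + 1, stride):
        for x in range(0, w - tile + 1, stride):
            tiles.append(img[y:y + tile, x:x + tile])
    return tiles

# A 1024 x 1024 image yields a 4 x 4 grid of non-overlapping 224 x 224 tiles.
tiles = extract_tiles(np.zeros((1024, 1024, 3)), tile=224, stride=224)
```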
* * *¶
Experiment 1: resnet18, SGD optimizer¶
Evaluation¶
# Majority voting of tiles linked to original TMA image
## Classification results on 5 TMA images using majority voting from 549 tiles
precision recall f1-score support
HGSC 0.00 0.00 0.00 1
LGSC 0.00 0.00 0.00 1
EC 0.00 0.00 0.00 1
CC 1.00 1.00 1.00 1
MC 0.50 1.00 0.67 1
accuracy 0.40 5
macro avg 0.30 0.40 0.33 5
weighted avg 0.30 0.40 0.33 5
# Tile classification
## Classification results for predictions on 549 tile images from 5 TMA
precision recall f1-score support
HGSC 0.61 0.40 0.48 100
LGSC 0.03 0.05 0.03 104
EC 0.38 0.15 0.21 161
CC 0.62 0.78 0.70 93
MC 0.41 0.51 0.45 91
accuracy 0.34 549
macro avg 0.41 0.38 0.38 549
weighted avg 0.40 0.34 0.35 549
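The per-image predictions above come from majority voting over the tile-level predictions; a minimal sketch (the tile-to-image mapping is illustrative):

```python
from collections import Counter

def majority_vote(tile_preds, tile_image_ids):
    """Map each TMA image to the most common predicted class of its tiles."""
    by_image = {}
    for pred, img_id in zip(tile_preds, tile_image_ids):
        by_image.setdefault(img_id, []).append(pred)
    return {img_id: Counter(preds).most_common(1)[0][0]
            for img_id, preds in by_image.items()}

preds = majority_vote(
    ["CC", "CC", "MC", "MC", "MC", "HGSC"],
    ["img_a", "img_a", "img_a", "img_b", "img_b", "img_b"],
)
# img_a: 2 of 3 tiles vote CC; img_b: 2 of 3 tiles vote MC
```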
------------¶
Experiment 2: resnet50, SGD optimizer¶
- Goal: get better accuracy
- Result: per-image accuracy remained the same; per-tile results got worse
Evaluation¶
# Majority voting of tiles linked to original TMA image
## Classification results on 5 TMA images using majority voting from 549 tiles
precision recall f1-score support
HGSC 0.00 0.00 0.00 1
LGSC 0.00 0.00 0.00 1
EC 0.00 0.00 0.00 1
CC 1.00 1.00 1.00 1
MC 0.50 1.00 0.67 1
accuracy 0.40 5
macro avg 0.30 0.40 0.33 5
weighted avg 0.30 0.40 0.33 5
# Tile classification
## Classification results for predictions on 549 tile images from 5 TMA
precision recall f1-score support
HGSC 0.40 0.48 0.44 100
LGSC 0.06 0.10 0.08 104
EC 0.36 0.11 0.17 161
CC 0.80 0.48 0.60 93
MC 0.35 0.65 0.45 91
accuracy 0.33 549
macro avg 0.40 0.36 0.35 549
weighted avg 0.38 0.33 0.32 549
------------¶
Experiment 3: resnet18, Adam optimizer¶
- Goal: get better accuracy
- Result: much faster training, less overfitting, lower per-image accuracy, higher per-tile precision
Evaluation¶
# Majority voting of tiles linked to original TMA image
## Classification results on 5 TMA images using majority voting from 549 tiles
precision recall f1-score support
HGSC 0.00 0.00 0.00 1
LGSC 0.00 0.00 0.00 1
EC 1.00 1.00 1.00 1
CC 0.00 0.00 0.00 1
MC 0.00 0.00 0.00 1
accuracy 0.20 5
macro avg 0.20 0.20 0.20 5
weighted avg 0.20 0.20 0.20 5
# Tile classification
## Classification results for predictions on 549 tile images from 5 TMA
precision recall f1-score support
HGSC 0.86 0.06 0.11 100
LGSC 0.14 0.21 0.17 104
EC 0.52 0.73 0.61 161
CC 0.38 0.15 0.22 93
MC 0.20 0.26 0.23 91
accuracy 0.33 549
macro avg 0.42 0.28 0.27 549
weighted avg 0.43 0.33 0.30 549
------------¶
Experiment 4: resnet18, Adam optimizer, 3 epochs¶
- Goal: get better accuracy, less overfitting
- Result: better accuracy, less overfitting
Note: 3 out of 5 images correctly classified
Evaluation¶
# Majority voting of tiles linked to original TMA image
## Classification results on 5 TMA images using majority voting from 549 tiles
precision recall f1-score support
HGSC 0.50 1.00 0.67 1
LGSC 0.00 0.00 0.00 1
EC 0.00 0.00 0.00 1
CC 1.00 1.00 1.00 1
MC 0.50 1.00 0.67 1
accuracy 0.60 5
macro avg 0.40 0.60 0.47 5
weighted avg 0.40 0.60 0.47 5
# Tile classification
## Classification results for predictions on 549 tile images from 5 TMA
# 258 out of 549 tiles correctly classified
precision recall f1-score support
HGSC 0.41 0.93 0.57 100
LGSC 0.15 0.05 0.07 104
EC 0.42 0.12 0.18 161
CC 0.62 0.89 0.73 93
MC 0.51 0.62 0.56 91
accuracy 0.47 549
macro avg 0.42 0.52 0.42 549
weighted avg 0.42 0.47 0.39 549
# Graphical representation of training process
------------¶
***¶
Method 1: MIL classifier with clustering¶
- Model Parameters: AttentionMIL, Adam optimizer, 0.001 learning rate, CrossEntropy Loss
- Goal: classify WSI images using representative features from clustering
- Result: quite impressive, an idea for future work!
- Suggestions: a complete study dedicated to this approach (could be framed as a different classification task)
Experiment 1¶
- Goal: get good accuracy
- Result: 44.4% accuracy (a 15-20% increase compared to the same model without clustering)
Note: 4 out of 9 images classified correctly
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'LGSC': 3, 'MC': 6, 'HGSC': 3, 'CC': 2, 'EC': 3}
Validation balance: {'HGSC': 2, 'MC': 2, 'CC': 1, 'EC': 1, 'LGSC': 1}
Test balance: {'EC': 2, 'LGSC': 2, 'MC': 2, 'HGSC': 2, 'CC': 1}
# UMAP embed feature vectors
# Cluster with HDBSCAN
# Then identify labels of obvious clusters and feed it to model as representative features
Evaluation¶
# Classification Results
Accuracy: 0.4444 Precision: 0.5500 Recall: 0.4000 F1 Score: 0.3667
# print(predicted_labels)
# print(true_label_names)
['CC', 'LGSC', 'LGSC', 'MC', 'HGSC', 'HGSC', 'MC', 'MC', 'MC'] ['EC', 'LGSC', 'MC', 'MC', 'HGSC', 'HGSC', 'LGSC', 'CC', 'EC']
------------¶
Experiment 2: made a balanced split, used fewer representative features to augment the training bag, added an early stopping mechanism¶
- Goal: get better results
- Result: 5.6% better accuracy, 13% better precision, 10% better recall and F1-score
Note: 5 out of 10 images classified correctly
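The early-stopping mechanism added in this experiment can be sketched as follows (the patience value is an assumption):

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""
    def __init__(self, patience: int = 2):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=2)
losses = [1.0, 0.8, 0.9, 0.85]  # improves twice, then stalls twice
stops = [stopper.step(l) for l in losses]  # [False, False, False, True]
```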
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'HGSC': 6, 'MC': 6, 'CC': 6, 'EC': 6, 'LGSC': 6}
Validation balance: {'LGSC': 2, 'EC': 2, 'HGSC': 2, 'CC': 2, 'MC': 2}
Test balance: {'LGSC': 2, 'MC': 2, 'HGSC': 2, 'EC': 2, 'CC': 2}
# UMAP embed feature vectors
# Cluster with HDBSCAN
# Then identify labels of obvious clusters and feed it to model as representative features
Evaluation¶
# Classification Results
Accuracy: 0.5000 Precision: 0.6833 Recall: 0.5000 F1 Score: 0.4600
# print(predicted_labels)
# print(true_label_names)
['LGSC', 'HGSC', 'LGSC', 'HGSC', 'EC', 'LGSC', 'CC', 'HGSC', 'HGSC', 'CC'] ['LGSC', 'MC', 'HGSC', 'EC', 'EC', 'LGSC', 'MC', 'HGSC', 'CC', 'CC']
------------¶
Experiment 3: larger HDBSCAN parameters (epsilon, min_samples, min_cluster_size) + added ReduceLROnPlateau scheduler¶
- Goal: get better results
- Result: accuracy remained the same, while F1-score and precision dropped a bit (4-8%)
Note: 5 out of 10 images classified correctly
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'HGSC': 6, 'MC': 6, 'CC': 6, 'EC': 6, 'LGSC': 6}
Validation balance: {'LGSC': 2, 'EC': 2, 'HGSC': 2, 'CC': 2, 'MC': 2}
Test balance: {'LGSC': 2, 'MC': 2, 'HGSC': 2, 'EC': 2, 'CC': 2}
# UMAP embed feature vectors
# Cluster with HDBSCAN
# Then identify labels of obvious clusters and feed it to model as representative features
Evaluation¶
# Classification Results
Accuracy: 0.5000 Precision: 0.6133 Recall: 0.5000 F1 Score: 0.4076
# print(predicted_labels)
# print(true_label_names)
['LGSC', 'EC', 'EC', 'EC', 'LGSC', 'CC', 'HGSC', 'EC', 'LGSC', 'EC'] ['CC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'HGSC', 'EC', 'LGSC', 'EC']
------------¶
Method 2: MIL classifier¶
- Model Parameters: AttentionMIL, Adam optimizer, 0.001 learning rate, CrossEntropy Loss
- Goal: classify WSI images
- Result: good overall performance
- Suggestions: more data (we had more but imbalanced)
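An attention-based MIL classifier in the spirit of the AttentionMIL model used in these experiments can be sketched as below; the layer sizes are assumptions, not the exact architecture:

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Attention-pooled MIL head: a bag of feature vectors -> one class score."""
    def __init__(self, feat_dim: int = 512, hidden: int = 128, n_classes: int = 5):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, bag):                           # bag: (n_instances, feat_dim)
        a = torch.softmax(self.attention(bag), dim=0)  # instance weights, sum to 1
        pooled = (a * bag).sum(dim=0)                  # attention-weighted average
        return self.classifier(pooled)                 # (n_classes,)

model = AttentionMIL()
logits = model(torch.randn(37, 512))  # a WSI bag of 37 feature vectors
```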
Experiment 1: train for 20 epochs¶
- Goal: get good accuracy
- Result: 62.5% accuracy
Note: 25 out of 40 images classified correctly
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'CC': 28, 'EC': 28, 'LGSC': 28, 'HGSC': 28, 'MC': 28}
Validation balance: {'HGSC': 4, 'CC': 4, 'LGSC': 4, 'MC': 4, 'EC': 4}
Test balance: {'MC': 8, 'HGSC': 8, 'LGSC': 8, 'EC': 8, 'CC': 8}
Evaluation¶
# Classification Results
Accuracy: 0.6250 Precision: 0.6058 Recall: 0.6250 F1 Score: 0.6097
# print(predicted_labels)
# print(true_label_names)
['MC', 'EC', 'HGSC', 'LGSC', 'LGSC', 'EC', 'LGSC', 'EC', 'HGSC', 'EC', 'MC', 'MC', 'EC', 'LGSC', 'MC', 'EC', 'CC', 'HGSC', 'MC', 'CC', 'CC', 'HGSC', 'CC', 'MC', 'MC', 'LGSC', 'MC', 'HGSC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'HGSC', 'CC', 'EC', 'MC', 'LGSC', 'LGSC'] ['MC', 'HGSC', 'HGSC', 'LGSC', 'LGSC', 'HGSC', 'EC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'MC', 'LGSC', 'LGSC', 'EC', 'EC', 'CC', 'EC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'EC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'EC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC']
------------¶
Experiment 2: test effect of ReduceLROnPlateau inclusion¶
- Goal: get better results
- Result: significantly better: 77.5% accuracy (+15%)
Note: 31 out of 40 images classified correctly
# from torch.optim.lr_scheduler import ReduceLROnPlateau
# scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=2, verbose=True)
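The scheduler is stepped on the validation loss once per epoch; a minimal sketch of the mechanics (the loss values are illustrative):

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.Adam([param], lr=0.001)
scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=2)

# Validation loss stalls after epoch 2, so after `patience` is exceeded
# the learning rate is reduced from 0.001 by the factor 0.1.
for val_loss in [1.0, 0.8, 0.9, 0.85, 0.9]:
    scheduler.step(val_loss)
```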
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'CC': 28, 'EC': 28, 'LGSC': 28, 'HGSC': 28, 'MC': 28}
Validation balance: {'HGSC': 4, 'CC': 4, 'LGSC': 4, 'MC': 4, 'EC': 4}
Test balance: {'MC': 8, 'HGSC': 8, 'LGSC': 8, 'EC': 8, 'CC': 8}
Evaluation¶
# Classification Results
Accuracy: 0.7750 Precision: 0.7648 Recall: 0.7750 F1 Score: 0.7608
# print(predicted_labels)
# print(true_label_names)
['MC', 'LGSC', 'EC', 'LGSC', 'LGSC', 'HGSC', 'LGSC', 'EC', 'HGSC', 'HGSC', 'MC', 'MC', 'LGSC', 'LGSC', 'MC', 'EC', 'CC', 'HGSC', 'CC', 'CC', 'CC', 'HGSC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'EC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'EC', 'CC', 'HGSC', 'MC', 'LGSC', 'LGSC'] ['MC', 'HGSC', 'HGSC', 'LGSC', 'LGSC', 'HGSC', 'EC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'MC', 'LGSC', 'LGSC', 'EC', 'EC', 'CC', 'EC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'EC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'EC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC']
------------¶
Experiment 3: reduce original WSI dimensions by factor of 2¶
- Goal: address computation limitations of Kaggle submissions (execution time)
- Result: faster, but not fast enough; 70% accuracy (a drop of 7.5%)
Note: 28 out of 40 images classified correctly
# img = img.resize((img.width // 2, img.height // 2))
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'CC': 28, 'EC': 28, 'LGSC': 28, 'HGSC': 28, 'MC': 28}
Validation balance: {'HGSC': 4, 'CC': 4, 'LGSC': 4, 'MC': 4, 'EC': 4}
Test balance: {'MC': 8, 'HGSC': 8, 'LGSC': 8, 'EC': 8, 'CC': 8}
Evaluation¶
# Classification Results
Accuracy: 0.7000 Precision: 0.7062 Recall: 0.7000 F1 Score: 0.6948
# print(predicted_labels)
# print(true_label_names)
['MC', 'LGSC', 'EC', 'LGSC', 'LGSC', 'HGSC', 'HGSC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'EC', 'LGSC', 'LGSC', 'MC', 'LGSC', 'CC', 'HGSC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'LGSC', 'MC', 'LGSC', 'MC', 'EC', 'HGSC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'MC', 'EC', 'HGSC', 'MC', 'LGSC', 'EC'] ['MC', 'HGSC', 'HGSC', 'LGSC', 'LGSC', 'HGSC', 'EC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'MC', 'LGSC', 'LGSC', 'EC', 'EC', 'CC', 'EC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'EC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'EC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC']
------------¶
Experiment 4: reduce original WSI dimensions by factor of 5¶
- Goal: address computation limitations of Kaggle submissions (execution time)
- Result: success; accuracy is the same as in the previous experiment (70%)
Note: 28 out of 40 images classified correctly
# img = img.resize((img.width // 5, img.height // 5))
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'CC': 28, 'EC': 28, 'LGSC': 28, 'HGSC': 28, 'MC': 28}
Validation balance: {'HGSC': 4, 'CC': 4, 'LGSC': 4, 'MC': 4, 'EC': 4}
Test balance: {'MC': 8, 'HGSC': 8, 'LGSC': 8, 'EC': 8, 'CC': 8}
Evaluation¶
# Classification Results
Accuracy: 0.7000 Precision: 0.7062 Recall: 0.7000 F1 Score: 0.6948
# print(predicted_labels)
# print(true_label_names)
['MC', 'LGSC', 'EC', 'LGSC', 'LGSC', 'HGSC', 'HGSC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'EC', 'LGSC', 'LGSC', 'MC', 'LGSC', 'CC', 'HGSC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'LGSC', 'MC', 'LGSC', 'MC', 'EC', 'HGSC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'MC', 'EC', 'HGSC', 'MC', 'LGSC', 'EC'] ['MC', 'HGSC', 'HGSC', 'LGSC', 'LGSC', 'HGSC', 'EC', 'LGSC', 'HGSC', 'HGSC', 'EC', 'MC', 'LGSC', 'LGSC', 'EC', 'EC', 'CC', 'EC', 'EC', 'CC', 'CC', 'HGSC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC', 'EC', 'CC', 'MC', 'MC', 'CC', 'CC', 'LGSC', 'EC', 'CC', 'HGSC', 'MC', 'LGSC', 'MC']
------------¶
Experiment 5: changing the split (+20 images in training set)¶
- Goal: increase overall classification results
- Result: success, all metrics improved (80% accuracy)
Note: 16 out of 20 images classified correctly
# img = img.resize((img.width // 5, img.height // 5))
# Split of bags of feature vectors (bag = whole_slide_image)
Train balance: {'LGSC': 32, 'EC': 32, 'CC': 32, 'HGSC': 32, 'MC': 32}
Validation balance: {'HGSC': 4, 'MC': 4, 'EC': 4, 'LGSC': 4, 'CC': 4}
Test balance: {'MC': 4, 'EC': 4, 'LGSC': 4, 'CC': 4, 'HGSC': 4}
# Training
Epoch 1/15, Train Loss: 1.6114, Train Acc: 18.24%, Validation Loss: 1.5640, Val Acc: 50.00%
Epoch 2/15, Train Loss: 1.5170, Train Acc: 33.33%, Validation Loss: 1.4260, Val Acc: 50.00%
Epoch 3/15, Train Loss: 1.3571, Train Acc: 40.25%, Validation Loss: 1.2833, Val Acc: 40.00%
Epoch 4/15, Train Loss: 1.1448, Train Acc: 61.01%, Validation Loss: 1.1208, Val Acc: 60.00%
Epoch 5/15, Train Loss: 0.9701, Train Acc: 65.41%, Validation Loss: 0.9738, Val Acc: 70.00%
Epoch 6/15, Train Loss: 0.7930, Train Acc: 73.58%, Validation Loss: 0.8247, Val Acc: 80.00%
Epoch 7/15, Train Loss: 0.6699, Train Acc: 77.99%, Validation Loss: 0.7731, Val Acc: 75.00%
Epoch 8/15, Train Loss: 0.5448, Train Acc: 82.39%, Validation Loss: 0.7732, Val Acc: 75.00%
Epoch 9/15, Train Loss: 0.4730, Train Acc: 87.42%, Validation Loss: 0.8031, Val Acc: 75.00%
Epoch 10/15, Train Loss: 0.4041, Train Acc: 88.05%, Validation Loss: 0.7058, Val Acc: 85.00%
Epoch 11/15, Train Loss: 0.3201, Train Acc: 94.34%, Validation Loss: 0.7301, Val Acc: 85.00%
Epoch 12/15, Train Loss: 0.2325, Train Acc: 93.71%, Validation Loss: 0.6632, Val Acc: 85.00%
Epoch 13/15, Train Loss: 0.2025, Train Acc: 94.97%, Validation Loss: 0.6887, Val Acc: 80.00%
Epoch 14/15, Train Loss: 0.1498, Train Acc: 97.48%, Validation Loss: 0.7251, Val Acc: 80.00%
Epoch 00015: reducing learning rate of group 0 to 1.0000e-04.
Epoch 15/15, Train Loss: 0.1125, Train Acc: 97.48%, Validation Loss: 0.7668, Val Acc: 75.00%
Evaluation¶
# Classification Results
Accuracy: 0.8000 Precision: 0.8476 Recall: 0.8000 F1 Score: 0.8026
# print(predicted_labels)
# print(true_label_names)
['MC', 'EC', 'MC', 'MC', 'LGSC', 'CC', 'LGSC', 'HGSC', 'HGSC', 'LGSC', 'CC', 'EC', 'HGSC', 'CC', 'HGSC', 'EC', 'HGSC', 'HGSC', 'LGSC', 'HGSC'] ['MC', 'EC', 'MC', 'MC', 'LGSC', 'CC', 'LGSC', 'EC', 'EC', 'LGSC', 'CC', 'EC', 'HGSC', 'CC', 'HGSC', 'MC', 'HGSC', 'HGSC', 'LGSC', 'CC']