Custom CNN vs Transfer Learning for Image Classification
As part of a deep learning project, I conducted a comparative study between training a CNN from scratch and using transfer learning with ResNet18. This analysis covers two distinct application domains to evaluate the relative performance of each approach.
Experimental Protocol
Studied Datasets
New Plant Disease Dataset
- Resolution: 128×128 pixels
- Classes: 38 plant diseases
- Distribution: Balanced dataset (~1800 images/class)
Melanoma Cancer Dataset
- Resolution: 224×224 pixels
- Classes: 2 (Malignant vs Benign)
- Distribution: 5590 vs 6289 images (slight imbalance)
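For context, a minimal loading sketch for the two datasets is shown below. It assumes the usual Kaggle folder layout (one sub-directory per class) and hypothetical local paths; the exact augmentation pipeline used in the study is not reproduced here, only a representative flip:

```python
import torch
from torchvision import datasets, transforms

# Hypothetical directory layout: one sub-folder per class, as in the Kaggle downloads.
PLANT_DIR = "data/new-plant-diseases/train"      # assumed path
MELANOMA_DIR = "data/melanoma-cancer/train"      # assumed path

# Resize to the resolutions used in the study (128x128 for plants, 224x224 for melanoma),
# with a representative augmentation on the training split.
plant_tf = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
melanoma_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

plant_ds = datasets.ImageFolder(PLANT_DIR, transform=plant_tf)
melanoma_ds = datasets.ImageFolder(MELANOMA_DIR, transform=melanoma_tf)

# Batch sizes in the 64-128 range, as used during training.
plant_loader = torch.utils.data.DataLoader(plant_ds, batch_size=128, shuffle=True, num_workers=4)
melanoma_loader = torch.utils.data.DataLoader(melanoma_ds, batch_size=64, shuffle=True, num_workers=4)
```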
Custom CNN Architecture
Implementation of a ResNet-inspired architecture with the following specifications:
Base Structure:
- Input layer: Conv2d(3→64, kernel=7×7, stride=2) + BatchNorm + ReLU
- Initial MaxPooling (3×3, stride=2)
- Residual configuration: [1,1,1,1] across 4 main layers
Residual Blocks:
```
ResidualBlock:
├── Conv2d(kernel=3×3, stride=variable) + BatchNorm + ReLU
├── Conv2d(kernel=3×3, stride=1) + BatchNorm
└── Skip connection + Final ReLU
```
Channel progression: 64 → 128 → 256 → 512
Final layers: Adaptive Average Pooling + LazyLinear classifier
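Below is a minimal PyTorch sketch of this architecture. Where the description above is silent (padding values, the 1×1 projection on the shortcut, and the exact per-stage strides), the choices shown are assumptions rather than the exact implementation:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with BatchNorm and a skip connection."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # 1x1 projection when the shape changes, so the skip connection can be added.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.shortcut(x))   # skip connection + final ReLU

class CustomCNN(nn.Module):
    """ResNet-inspired model with a [1, 1, 1, 1] block configuration."""
    def __init__(self, num_classes):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
        )
        # Channel progression 64 -> 128 -> 256 -> 512, one residual block per stage.
        self.layers = nn.Sequential(
            ResidualBlock(64, 64, stride=1),
            ResidualBlock(64, 128, stride=2),
            ResidualBlock(128, 256, stride=2),
            ResidualBlock(256, 512, stride=2),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.LazyLinear(num_classes)   # input features inferred on the first forward pass

    def forward(self, x):
        x = self.layers(self.stem(x))
        x = self.pool(x).flatten(1)
        return self.fc(x)

# Quick shape check on a Plant Disease-sized input.
model = CustomCNN(num_classes=38)
print(model(torch.randn(1, 3, 128, 128)).shape)   # torch.Size([1, 38])
```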
Training Parameters
Optimization:
- Optimizer: Adam (lr=1e-4)
- Loss function: CrossEntropyLoss
- Scheduler: ReduceLROnPlateau (factor=0.1, patience=3)
- Mixed precision training with GradScaler
Regularization:
- Early stopping (patience=5, min_delta=0.01)
- Batch size: 64-128 depending on dataset
- Standard data augmentation
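Putting these settings together, a training-loop sketch might look as follows. `model`, `train_loader`, and `val_loader` are assumed to exist (e.g. from the earlier sketches plus a validation split), and the maximum epoch count is an assumption since early stopping decides when training actually ends:

```python
import torch
from torch.cuda.amp import GradScaler, autocast

# Assumes `model`, `train_loader`, and `val_loader` are already defined.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1, patience=3)
scaler = GradScaler()                      # mixed precision (no-op on CPU)

@torch.no_grad()
def validate(model, loader):
    """Mean validation loss, used by the scheduler and early stopping."""
    model.eval()
    total, n = 0.0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        total += criterion(model(images), labels).item() * labels.size(0)
        n += labels.size(0)
    return total / n

best_loss, wait = float("inf"), 0
patience, min_delta = 5, 0.01              # early-stopping settings
max_epochs = 50                            # assumption: early stopping ends training sooner

for epoch in range(max_epochs):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        with autocast():                   # fp16 forward pass where safe
            loss = criterion(model(images), labels)
        scaler.scale(loss).backward()      # scaled backward to avoid fp16 underflow
        scaler.step(optimizer)
        scaler.update()

    val_loss = validate(model, val_loader)
    scheduler.step(val_loss)               # ReduceLROnPlateau on the validation loss

    if val_loss < best_loss - min_delta:   # early-stopping bookkeeping
        best_loss, wait = val_loss, 0
    else:
        wait += 1
        if wait >= patience:
            break
```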
Comparative Results
Plant Disease Performance
| Architecture | Test Accuracy | Validation Loss | Epochs |
|---|---|---|---|
| Custom CNN | 98.05% | 0.03 | 15 |
| ResNet18 (fine-tuning) | 98.08% | 0.06 | 3 |
Melanoma Cancer Performance
| Architecture | Test Accuracy | Validation Loss | Epochs |
|---|---|---|---|
| Custom CNN | 92.05% | 0.22 | 18 |
| ResNet18 (fine-tuning) | 94.60% | 0.16 | 15 |
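For completeness, here is a sketch of how the reported test accuracies can be computed; the actual evaluation code lives in the GitHub repository, so this is only an illustrative stand-in:

```python
import torch

@torch.no_grad()
def evaluate(model, loader, device):
    """Top-1 test accuracy in percent, as reported in the tables above."""
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return 100.0 * correct / total
```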
Results Analysis
Transfer Learning Efficiency
Accelerated convergence: Transfer learning converges significantly faster, most markedly on Plant Disease (3 epochs versus 15, roughly a 5× speed-up).
Robustness: Learning curves show greater stability, with less inter-epoch variance, for the pre-trained ResNet18.
Performance on complex domains: The 2.55-point accuracy gap on Melanoma Cancer illustrates the advantage of pretrained features for medical texture analysis.
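For reference, a typical fine-tuning setup for the ResNet18 baseline looks like the sketch below; whether the backbone layers were frozen is not stated in this write-up, so the optional freezing flag is an assumption:

```python
import torch.nn as nn
from torchvision import models

def build_resnet18(num_classes, freeze_backbone=False):
    """Load ImageNet-pretrained ResNet18 and replace its classification head."""
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    if freeze_backbone:
        for param in model.parameters():
            param.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, num_classes)   # new head, always trainable
    return model

plant_model = build_resnet18(num_classes=38)      # Plant Disease: 38 classes
melanoma_model = build_resnet18(num_classes=2)    # Melanoma: benign vs malignant
```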
Architectural Analysis
The custom architecture, despite its relative simplicity (one residual block per stage, versus two in ResNet18), achieves remarkable performance thanks to residual connections that:
- Enable efficient gradient flow
- Mitigate the vanishing gradient problem
- Preserve information from shallow layers
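A tiny sanity check (not part of the project code) makes the gradient-flow argument concrete: with y = F(x) + x, the identity term gives the gradient a direct path back to x even when the block itself contributes almost nothing:

```python
import torch

x = torch.randn(4, requires_grad=True)
block = torch.nn.Linear(4, 4)
with torch.no_grad():
    block.weight.zero_()   # simulate a block contributing (almost) no gradient of its own
    block.bias.zero_()

y = block(x) + x           # residual connection
y.sum().backward()
print(x.grad)              # tensor of ones: the gradient reached x via the skip path
```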
Domain-Specific Considerations
Plant Disease: The near-identical accuracy (Δ = 0.03 points) suggests that natural-image features from ImageNet transfer effectively to plant pathology; the main advantage of transfer learning here is computational efficiency.
Melanoma Cancer: The larger gap indicates that skin-lesion patterns benefit substantially from ImageNet-pretrained representations.
Technical Recommendations
Transfer learning preferred for:
- Projects with time constraints
- Moderate-sized datasets (<100k images)
- Domains visually close to ImageNet
- Applications requiring high accuracy
CNN from scratch justified for:
- Highly specialized domains (satellite imagery, microscopy)
- Deployment constraints (edge computing)
- Fundamental research on architectures
- Massive datasets (>1M images)
Conclusion
This study confirms the practical superiority of transfer learning in most use cases, with quantifiable gains in both computational efficiency and performance. Nevertheless, implementing custom architectures remains relevant for understanding the underlying mechanisms and for certain specialized contexts.
The results also highlight how the proximity between the source domain (ImageNet) and the target domain drives transfer learning effectiveness, with direct implications for choosing an architecture based on the application context.
PyTorch implementation with TensorBoard monitoring. The code is available on GitHub; the datasets are from Kaggle.
Thank you for reading!