Custom CNN vs Transfer Learning for Image Classification
As part of a deep learning project, I conducted a comparative study between training a CNN from scratch and using transfer learning with ResNet18. This analysis covers two distinct application domains to evaluate the relative performance of each approach.
Experimental Protocol
Studied Datasets
New Plant Disease Dataset
- Resolution: 128×128 pixels
- Classes: 38 plant diseases
- Distribution: Balanced dataset (~1800 images/class)
Melanoma Cancer Dataset
- Resolution: 224×224 pixels
- Classes: 2 (Malignant vs Benign)
- Distribution: 5590 vs 6289 images (slight imbalance)
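For context, a minimal loading sketch for the two datasets is shown below. It assumes the usual Kaggle folder layout (one sub-directory per class) and hypothetical local paths; the exact augmentation pipeline used in the study is not reproduced here, only a representative flip:

```python
import torch
from torchvision import datasets, transforms

# Hypothetical directory layout: one sub-folder per class, as in the Kaggle downloads.
PLANT_DIR = "data/new-plant-diseases/train"      # assumed path
MELANOMA_DIR = "data/melanoma-cancer/train"      # assumed path

# Resize to the resolutions used in the study (128x128 for plants, 224x224 for melanoma),
# with a representative augmentation on the training split.
plant_tf = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
melanoma_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

plant_ds = datasets.ImageFolder(PLANT_DIR, transform=plant_tf)
melanoma_ds = datasets.ImageFolder(MELANOMA_DIR, transform=melanoma_tf)

# Batch sizes in the 64-128 range, as used during training.
plant_loader = torch.utils.data.DataLoader(plant_ds, batch_size=128, shuffle=True, num_workers=4)
melanoma_loader = torch.utils.data.DataLoader(melanoma_ds, batch_size=64, shuffle=True, num_workers=4)
```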
Custom CNN Architecture
Implementation of a ResNet-inspired architecture with the following specifications:
Base Structure:
- Input layer: Conv2d(3→64, kernel=7×7, stride=2) + BatchNorm + ReLU
- Initial MaxPooling (3×3, stride=2)
- Residual configuration: [1,1,1,1] across 4 main layers
Residual Blocks:
```
ResidualBlock:
├── Conv2d(kernel=3×3, stride=variable) + BatchNorm + ReLU
├── Conv2d(kernel=3×3, stride=1) + BatchNorm
└── Skip connection + Final ReLU
```
Channel progression: 64 → 128 → 256 → 512
Final layers: Adaptive Average Pooling + LazyLinear classifier
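Below is a minimal PyTorch sketch of this architecture. Where the description above is silent (padding values, the 1×1 projection on the shortcut, and the exact per-stage strides), the choices shown are assumptions rather than the exact implementation:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with BatchNorm and a skip connection."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # 1x1 projection when the shape changes, so the skip connection can be added.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.shortcut(x))   # skip connection + final ReLU

class CustomCNN(nn.Module):
    """ResNet-inspired model with a [1, 1, 1, 1] block configuration."""
    def __init__(self, num_classes):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
        )
        # Channel progression 64 -> 128 -> 256 -> 512, one residual block per stage.
        self.layers = nn.Sequential(
            ResidualBlock(64, 64, stride=1),
            ResidualBlock(64, 128, stride=2),
            ResidualBlock(128, 256, stride=2),
            ResidualBlock(256, 512, stride=2),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.LazyLinear(num_classes)   # input features inferred on the first forward pass

    def forward(self, x):
        x = self.layers(self.stem(x))
        x = self.pool(x).flatten(1)
        return self.fc(x)

# Quick shape check on a Plant Disease-sized input.
model = CustomCNN(num_classes=38)
print(model(torch.randn(1, 3, 128, 128)).shape)   # torch.Size([1, 38])
```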
Training Parameters
Optimization:
- Optimizer: Adam (lr=1e-4)
- Loss function: CrossEntropyLoss
- Scheduler: ReduceLROnPlateau (factor=0.1, patience=3)
- Mixed precision training with GradScaler
Regularization:
- Early stopping (patience=5, min_delta=0.01)
- Batch size: 64-128 depending on dataset
- Standard data augmentation
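Putting these settings together, a training-loop sketch might look as follows. `model`, `train_loader`, and `val_loader` are assumed to exist (e.g. from the earlier sketches plus a validation split), and the maximum epoch count is an assumption since early stopping decides when training actually ends:

```python
import torch
from torch.cuda.amp import GradScaler, autocast

# Assumes `model`, `train_loader`, and `val_loader` are already defined.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1, patience=3)
scaler = GradScaler()                      # mixed precision (no-op on CPU)

@torch.no_grad()
def validate(model, loader):
    """Mean validation loss, used by the scheduler and early stopping."""
    model.eval()
    total, n = 0.0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        total += criterion(model(images), labels).item() * labels.size(0)
        n += labels.size(0)
    return total / n

best_loss, wait = float("inf"), 0
patience, min_delta = 5, 0.01              # early-stopping settings
max_epochs = 50                            # assumption: early stopping ends training sooner

for epoch in range(max_epochs):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        with autocast():                   # fp16 forward pass where safe
            loss = criterion(model(images), labels)
        scaler.scale(loss).backward()      # scaled backward to avoid fp16 underflow
        scaler.step(optimizer)
        scaler.update()

    val_loss = validate(model, val_loader)
    scheduler.step(val_loss)               # ReduceLROnPlateau on the validation loss

    if val_loss < best_loss - min_delta:   # early-stopping bookkeeping
        best_loss, wait = val_loss, 0
    else:
        wait += 1
        if wait >= patience:
            break
```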
Comparative Results
Plant Disease Performance
| Architecture | Test Accuracy | Validation Loss | Epochs |
|---|---|---|---|
| Custom CNN | 98.05% | 0.03 | 15 |
| ResNet18 (fine-tuning) | 98.08% | 0.06 | 3 |
Melanoma Cancer Performance
| Architecture | Test Accuracy | Validation Loss | Epochs |
|---|---|---|---|
| Custom CNN | 92.05% | 0.22 | 18 |
| ResNet18 (fine-tuning) | 94.60% | 0.16 | 15 |
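For completeness, here is a sketch of how the reported test accuracies can be computed; the actual evaluation code lives in the GitHub repository, so this is only an illustrative stand-in:

```python
import torch

@torch.no_grad()
def evaluate(model, loader, device):
    """Top-1 test accuracy in percent, as reported in the tables above."""
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return 100.0 * correct / total
```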
Results Analysis
Transfer Learning Efficiency
Accelerated convergence: Transfer learning converges significantly faster, most markedly on Plant Disease (3 epochs versus 15, roughly a 5× speed-up).
Robustness: Learning curves show greater stability, with less inter-epoch variance, for the pre-trained ResNet18.
Performance on complex domains: The 2.55-point accuracy gap on Melanoma Cancer illustrates the advantage of pretrained features for medical texture analysis.
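For reference, a typical fine-tuning setup for the ResNet18 baseline looks like the sketch below; whether the backbone layers were frozen is not stated in this write-up, so the optional freezing flag is an assumption:

```python
import torch.nn as nn
from torchvision import models

def build_resnet18(num_classes, freeze_backbone=False):
    """Load ImageNet-pretrained ResNet18 and replace its classification head."""
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    if freeze_backbone:
        for param in model.parameters():
            param.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, num_classes)   # new head, always trainable
    return model

plant_model = build_resnet18(num_classes=38)      # Plant Disease: 38 classes
melanoma_model = build_resnet18(num_classes=2)    # Melanoma: benign vs malignant
```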
Architectural Analysis
The custom architecture, despite its relative simplicity (one residual block per stage, versus two in ResNet18), achieves remarkable performance thanks to residual connections that:
- Enable efficient gradient flow
- Mitigate the vanishing gradient problem
- Preserve information from shallow layers
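A tiny sanity check (not part of the project code) makes the gradient-flow argument concrete: with y = F(x) + x, the identity term gives the gradient a direct path back to x even when the block itself contributes almost nothing:

```python
import torch

x = torch.randn(4, requires_grad=True)
block = torch.nn.Linear(4, 4)
with torch.no_grad():
    block.weight.zero_()   # simulate a block contributing (almost) no gradient of its own
    block.bias.zero_()

y = block(x) + x           # residual connection
y.sum().backward()
print(x.grad)              # tensor of ones: the gradient reached x via the skip path
```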
Domain-Specific Considerations
Plant Disease: The near-identical accuracy (Δ = 0.03 points) suggests that natural-image features from ImageNet transfer effectively to plant pathology; the main advantage of transfer learning here is computational efficiency.
Melanoma Cancer: The larger gap indicates that skin-lesion patterns benefit substantially from ImageNet-pretrained representations.
Technical Recommendations
Transfer learning preferred for:
- Projects with time constraints
- Moderate-sized datasets (<100k images)
- Domains visually close to ImageNet
- Applications requiring high accuracy
CNN from scratch justified for:
- Highly specialized domains (satellite imagery, microscopy)
- Deployment constraints (edge computing)
- Fundamental research on architectures
- Massive datasets (>1M images)
Conclusion
This study confirms the practical superiority of transfer learning in most use cases, with quantifiable gains in both computational efficiency and performance. Nevertheless, implementing custom architectures remains relevant for understanding the underlying mechanisms and for certain specialized contexts.
The results also highlight how the proximity between the source domain (ImageNet) and the target domain drives transfer learning effectiveness, with direct implications for choosing an architecture based on the application context.
PyTorch implementation with TensorBoard monitoring. The code is available on GitHub; the datasets are from Kaggle.
Thank you for reading!