Multi-Label Continual Learning with Cardinality-Incremental Scenarios
Research code for our CVWW 2026 paper investigating continual learning scenarios where samples progressively increase in label cardinality, enabling the study of intra-class generalization and cardinality bias.
| Name | Institution | Email |
|---|---|---|
| Laurenz A. Farthofer | Kompetenzzentrum Automobil- und Industrieelektronik GmbH (KAI) | laurenz.farthofer@k-ai.at |
| Marc Masana | Institute of Visual Computing, Graz University of Technology | mmasana@tugraz.at |
- Quick Start
- Key Features
- Motivation
- Cardinality-Incremental Scenarios
- Continual Learning Methods
- Evaluation Strategy
- Experiments
- Project Structure
- Technical Details
- Citation
- Acknowledgements
# 1. Clone the repository
git clone https://github.com/LaurenzBeck/Cardinality-Incremental-Learning.git
cd Cardinality-Incremental-Learning
# 2. Install dependencies with Poetry (alternatively, use the `requirements.txt` and your preferred environment manager)
poetry install
# 3. Download datasets and run all experiments
poetry run dvc repro
Note: MS COCO must be downloaded manually from cocodataset.org and placed in `{data_path}/coco/data/`. PASCAL VOC and MixedWM38 download automatically.
- Novel Continual Learning Scenario: Cardinality-incremental tasks that progressively increase label complexity
- Multi-Label Support: Full support for multi-label classification in continual learning settings
- Domain-Incremental Focus: All classes present throughout the stream, emphasizing intra-class generalization
- Comprehensive Evaluation: Four test streams (past, present, future, all) for detailed knowledge transfer analysis
- 10 Methods Implemented: From baselines to state-of-the-art continual learning approaches adapted for multi-label
- Three Datasets: MixedWM38 (wafer defects), PASCAL VOC, and MS COCO
- Hydra Configuration: Modular, composable experiment configurations
- DVC Pipeline: Reproducible experiment pipeline with automatic dependency tracking
- Lightning Fabric: Efficient training with automatic device placement and distributed support
Real-world classification systems, particularly in semiconductor manufacturing, face continual evolution as new defect patterns and their combinations emerge during production. Traditional single-label approaches fail to capture the complexity of co-occurring defects, while static multi-label models become outdated as new pattern combinations appear.
Most continual learning research focuses on class-incremental scenarios where tasks have disjoint class sets. However, real industrial applications exhibit:
- Class repetition: The same defect types recur throughout production
- Evolving combinations: New co-occurrences of known defects emerge over time
- Incremental complexity: Samples start simple (single defects) and become complex (multiple simultaneous defects)
We propose cardinality-incremental learning, a domain-incremental scenario where:
- All classes are introduced in the first task (through single-label samples)
- Subsequent tasks introduce samples with progressively more labels per sample
- The challenge shifts from remembering which classes exist to learning how they co-occur
This mirrors human learning: concepts are first learned in isolation (dedicated instruction), then encountered within richer contexts alongside other concepts.
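The scenario definition above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's stream constructor: `split_by_cardinality` is a hypothetical helper, labels are assumed to be multi-hot lists, and task `t` (0-indexed) is assumed to hold the samples with exactly `t + 1` active labels.

```python
from collections import defaultdict


def split_by_cardinality(labels, num_tasks):
    """Group sample indices into tasks by label cardinality.

    Assumption for this sketch: task t (0-indexed) contains the samples
    with exactly t + 1 active labels; all other samples are skipped.
    """
    tasks = defaultdict(list)
    for idx, y in enumerate(labels):
        cardinality = sum(y)  # number of active labels in the multi-hot vector
        if 1 <= cardinality <= num_tasks:
            tasks[cardinality - 1].append(idx)
    return [tasks[t] for t in range(num_tasks)]


# Toy example with 3 classes: every class appears alone in task 0.
labels = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]
stream = split_by_cardinality(labels, num_tasks=3)
print(stream)  # [[0, 1, 2], [3, 4], [5]]
```

In the toy stream, all three classes are already seen in the first task, and later tasks only add new co-occurrences of known classes.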
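In traditional continual learning, a stream is defined as a sequence of tasks, each with its own training data.
Our cardinality-incremental scenario enforces that the label cardinality |y| (the number of labels per sample) increases from task to task, while every class already appears in the first task.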
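One way to write this down is sketched below. The notation (stream S, tasks T_t, number of classes C, task sizes N_t) and the exact form of the constraint are assumptions chosen to match the description in this README, not formulas taken from the paper.

```latex
% Stream of T tasks; each task is a set of samples with multi-hot labels.
\mathcal{S} = (\mathcal{T}_1, \dots, \mathcal{T}_T), \qquad
\mathcal{T}_t = \{(x_i, y_i)\}_{i=1}^{N_t}, \qquad
y_i \in \{0, 1\}^C

% Label cardinality, and the assumed constraint that it grows along the stream.
|y| = \sum_{c=1}^{C} y_c, \qquad
|y| \le 1 \;\; \forall (x, y) \in \mathcal{T}_1, \qquad
\max_{(x, y) \in \mathcal{T}_t} |y| \;<\; \max_{(x, y) \in \mathcal{T}_{t+1}} |y|
```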
| Aspect | Class-Incremental | Domain-Incremental (Ours) |
|---|---|---|
| Classes | Disjoint per task | All classes in all tasks |
| Focus | Catastrophic forgetting | Intra-class generalization |
| Repetition | None (by design) | Full concept repetition |
| Real-world | Rare | Common |
- Intra-class Generalization: Can models trained on single-label samples generalize to multi-label combinations?
- Cardinality Bias: Do models exhibit task-recency bias in predicted label cardinality?
- Knowledge Transfer: How well do models transfer knowledge from simple (low cardinality) to complex (high cardinality) samples?
| Dataset | Classes | Tasks | Max Cardinality | Samples | Domain |
|---|---|---|---|---|---|
| MixedWM38 | 8 | 4 | 4 | 38,000 | Wafer defect patterns |
| PASCAL VOC | 20 | 4 | 4 | 11,540 | Natural images |
| MS COCO | 80 | 9 | 9 | 123,287 | Natural images |
We implement and adapt 10 continual learning approaches for the multi-label cardinality-incremental setting:
| Method | Description | Use Case |
|---|---|---|
| Finetuning | Naïve sequential training without forgetting protection | Lower bound (most plastic, least stable) |
| Joint | Train on all tasks simultaneously | Upper bound (optimal performance) |
| Freezing | Freeze backbone after task 0, only train head | Stability baseline |
| Static | Train only on task 0, no updates | Zero-plasticity baseline |
| Method | Paper | Key Idea |
|---|---|---|
| LwF | Li & Hoiem (ECCV 2016) | Knowledge distillation on predictions |
| LwF-F | Jung et al. (2018) | Feature-level distillation with L₂ loss |
| Method | Paper | Key Idea |
|---|---|---|
| EWC | Kirkpatrick et al. (PNAS 2017) | Fisher information matrix penalties |
| SI | Zenke et al. (ICML 2017) | Path integral-based parameter importance |
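To illustrate the regularization idea in the table above, the sketch below computes the quadratic EWC penalty in its textbook form from Kirkpatrick et al. It is not the repository's `ewc.py`: the function name `ewc_penalty`, the flat parameter lists, the weight `lamb`, and the toy values are all assumptions made for this example.

```python
def ewc_penalty(params, old_params, fisher, lamb=1.0):
    """Quadratic EWC penalty: (lamb / 2) * sum_i F_i * (theta_i - theta_old_i)^2."""
    if not (len(params) == len(old_params) == len(fisher)):
        raise ValueError("params, old_params and fisher must have equal length")
    return 0.5 * lamb * sum(
        f * (p - p_old) ** 2 for p, p_old, f in zip(params, old_params, fisher)
    )


# Toy values: only the first parameter is important (Fisher = 4.0) and moved.
old_params = [0.5, -1.0, 2.0]
fisher = [4.0, 0.0, 1.0]
new_params = [1.0, 3.0, 2.0]
penalty = ewc_penalty(new_params, old_params, fisher, lamb=2.0)
print(penalty)  # 1.0
```

The second parameter changes a lot but has zero Fisher information, so it contributes nothing to the penalty.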
| Method | Paper | Key Idea |
|---|---|---|
| Replay | Rebuffi et al. (2017) | Store exemplars from previous tasks |
| PASS | Zhu et al. (CVPR 2021) | Prototype augmentation + self-supervision |
| Method | Paper | Key Idea |
|---|---|---|
| REMIND | Hayes et al. (ECCV 2020) | Product quantization for latent replay |
| SIESTA | Harun et al. (TMLR 2023) | Online learning with PQ-based consolidation |
All methods originally designed for class-incremental single-label learning have been adapted to support multi-label classification in domain-incremental scenarios.
We evaluate models using four distinct test streams to analyze different aspects of knowledge transfer:
| Stream | Description | What It Measures |
|---|---|---|
| PAST | Samples from all previous tasks (t' < t) | Backward transfer & forgetting |
| PRESENT | Samples from the current task (t' = t) | Current task performance |
| FUTURE | Samples from upcoming tasks (t' > t) | Forward transfer & generalization |
| ALL | All test samples regardless of task | Overall model capability |
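The four streams in the table can be expressed as a partition of task indices relative to the current task `t`. The sketch below is only an illustration of that definition; `build_test_streams` is a hypothetical helper and not part of the repository.

```python
def build_test_streams(task_ids, current_task):
    """Partition task indices into PAST / PRESENT / FUTURE / ALL for task t."""
    return {
        "PAST": [t for t in task_ids if t < current_task],
        "PRESENT": [t for t in task_ids if t == current_task],
        "FUTURE": [t for t in task_ids if t > current_task],
        "ALL": list(task_ids),
    }


# After training on task 1 of a 4-task stream:
streams = build_test_streams(range(4), current_task=1)
print(streams)
# {'PAST': [0], 'PRESENT': [1], 'FUTURE': [2, 3], 'ALL': [0, 1, 2, 3]}
```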
- mAP (mean Average Precision): Primary multi-label metric
- Precision: Positive predictive value
- Recall: Sensitivity to positive labels
- F1-Score: Harmonic mean of precision and recall
- Accuracy: Subset accuracy (all labels must be correct)
- Hamming Distance: Average per-label error
- Exact Match: Percentage of perfectly predicted samples
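Two of the metrics above are easy to state directly in code. The following is a plain-Python sketch of exact match and Hamming distance on multi-hot predictions. The function names are assumptions for illustration; the repository itself lists torchmetrics as its metrics library, which this sketch does not use.

```python
def exact_match(y_true, y_pred):
    """Fraction of samples whose predicted label vector equals the target exactly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return hits / len(y_true)


def hamming_distance(y_true, y_pred):
    """Fraction of individual label decisions that are wrong."""
    errors = sum(ti != pi for t, p in zip(y_true, y_pred) for ti, pi in zip(t, p))
    total = sum(len(t) for t in y_true)
    return errors / total


y_true = [[1, 0, 1, 0], [0, 1, 0, 0]]
y_pred = [[1, 0, 1, 0], [0, 1, 1, 0]]
print(exact_match(y_true, y_pred))       # 0.5
print(hamming_distance(y_true, y_pred))  # 0.125
```

A single wrong label makes the second sample count as a full miss for exact match, while Hamming distance only registers one error out of eight label decisions.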
We introduce cardinality confusion matrices to visualize the tendency of models to predict label cardinalities biased toward recently seen tasks, a new manifestation of task-recency bias specific to multi-label continual learning.
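A cardinality confusion matrix can be sketched as a count of samples by true versus predicted number of labels. The function below is a hypothetical stand-in written for this README, not the repository's `CardinalityConfusionMatrix` metric. It assumes binarized multi-hot predictions and clips predicted cardinalities at `max_cardinality`.

```python
def cardinality_confusion_matrix(y_true, y_pred, max_cardinality):
    """Count samples by (true cardinality, predicted cardinality).

    Rows index the true number of labels, columns the predicted number;
    both range from 0 to max_cardinality (larger values are clipped).
    """
    size = max_cardinality + 1
    matrix = [[0] * size for _ in range(size)]
    for t, p in zip(y_true, y_pred):
        row = min(sum(t), max_cardinality)
        col = min(sum(p), max_cardinality)
        matrix[row][col] += 1
    return matrix


y_true = [[1, 0, 0], [1, 1, 0], [1, 1, 0], [1, 1, 1]]
y_pred = [[1, 1, 0], [1, 1, 0], [1, 1, 1], [1, 1, 1]]
ccm = cardinality_confusion_matrix(y_true, y_pred, max_cardinality=3)
for row in ccm:
    print(row)
# [0, 0, 0, 0]
# [0, 0, 1, 0]
# [0, 0, 1, 1]
# [0, 0, 0, 1]
```

Mass above the diagonal, as in this toy example, means the model predicts more labels than are present, which is the pattern one would expect from a bias toward recently seen high-cardinality tasks.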
- Python 3.11 or 3.12
- Poetry 2.0.0+
- CUDA-capable GPU (recommended)
- ~50GB free disk space for datasets
# Clone the repository
git clone https://github.com/LaurenzBeck/Cardinality-Incremental-Learning.git
cd Cardinality-Incremental-Learning
# Install all dependencies
poetry install
# Verify installation
poetry run python -c "import torch; print(torch.__version__)"
- MixedWM38: Downloads automatically from Kaggle on first run
- PASCAL VOC: Downloads automatically via TorchVision
MS COCO must be downloaded manually:
- Visit cocodataset.org
- Download 2017 Train/Val images and annotations
- Place in `{data_path}/coco/data/`, where `data_path` is defined in `conf/config.yaml`
# Run complete DVC pipeline: download → preprocess → experiments → analysis
poetry run dvc repro
# MixedWM38 with all methods (default experiment)
poetry run python src/stages/continual_learning.py experiment=main
# PASCAL VOC experiments
poetry run python src/stages/continual_learning.py experiment=voc
# MS COCO experiments
poetry run python src/stages/continual_learning.py experiment=coco
# Loss function ablation study
poetry run python src/stages/continual_learning.py experiment=loss_ablations
# Run specific method on PASCAL VOC
poetry run python src/stages/continual_learning.py experiment=voc method=lwf
# Override hyperparameters
poetry run python src/stages/continual_learning.py experiment=voc method=lwf method.appr.lamb=5.0
# Change random seed
poetry run python src/stages/continual_learning.py experiment=main seed=42
# Use different network
poetry run python src/stages/continual_learning.py experiment=voc network=resnet18
# Run across multiple seeds
poetry run python src/stages/continual_learning.py experiment=main --multirun seed=0,1,2,3,4
# Grid search over methods
poetry run python src/stages/continual_learning.py experiment=voc --multirun method=finetuning,lwf,ewc,replay
For a fast test to verify everything works without running full experiments:
# Quick test on MixedWM38 (smallest dataset, ~2-3 minutes)
poetry run python src/stages/continual_learning.py experiment=main method=finetuning network=tiny_byobnet fabric=default method.appr.nepochs=1 seed=0
# Quick test on PASCAL VOC (~5 minutes)
poetry run python src/stages/continual_learning.py experiment=voc method=finetuning network=tiny_byobnet fabric=default method.appr.nepochs=1 seed=0
# Quick test on MS COCO (~10 minutes, requires manual download)
poetry run python src/stages/continual_learning.py experiment=coco method=finetuning network=tiny_byobnet fabric=default method.appr.nepochs=1 seed=0
Tip: The `fabric=default` config uses CPU. For GPU testing, use `fabric=gpu` instead. `tiny_byobnet` is a small, fast timm model ideal for testing.
# Analyze all experiments
poetry run python src/stages/analysis.py --dataset all
# Analyze specific dataset
poetry run python src/stages/analysis.py --dataset mixedwm38
poetry run python src/stages/analysis.py --dataset voc
poetry run python src/stages/analysis.py --dataset coco
# Custom report directory
poetry run python src/stages/analysis.py --dataset voc --report-dir /path/to/reports
Outputs are saved to:
- `reports/figures/mixedwm38/` - MixedWM38 plots
- `reports/figures/pascal_voc/` - PASCAL VOC plots
- `reports/figures/ms_coco/` - MS COCO plots
- `reports/figures/voc_lwf_sweep/` - LwF hyperparameter sweep results
Experiments use Hydra for compositional configuration. All config files are in conf/:
conf/
├── experiment/          # Experiment presets
│   ├── main.yaml        # MixedWM38 experiments
│   ├── voc.yaml         # PASCAL VOC experiments
│   ├── coco.yaml        # MS COCO experiments
│   └── ...
├── method/              # CL method configs
│   ├── finetuning.yaml
│   ├── lwf.yaml
│   ├── ewc.yaml
│   └── ...
├── network/             # Model architectures
├── scenario/            # Dataset & stream configs
├── loss/                # Loss functions
├── optimizer/           # Optimizers
└── fabric/              # Lightning Fabric settings
Cardinality-Incremental-Learning/
├── README.md                               # This file
├── LICENSE                                 # MIT License
├── pyproject.toml                          # Poetry dependencies
├── dvc.yaml                                # DVC pipeline definition
├── conf/                                   # Hydra configurations
│   ├── config.yaml                         # Root config (gitignored)
│   ├── experiment/                         # Experiment presets
│   ├── method/                             # CL methods
│   ├── network/                            # Model architectures
│   ├── scenario/                           # Datasets & streams
│   ├── loss/                               # Loss functions
│   ├── optimizer/                          # Optimizers
│   └── fabric/                             # Training configs
├── data/                                   # Data directory (gitignored)
│   ├── mixedwm38/
│   ├── pascal_voc/
│   └── coco/
├── reports/                                # Experiment results (gitignored)
│   ├── main/                               # MixedWM38 results
│   ├── voc/                                # PASCAL VOC results
│   ├── coco/                               # MS COCO results
│   └── figures/                            # Generated plots
└── src/                                    # Source code
    ├── cardinality_incremental_learning/   # Main package
    │   ├── data/                           # Dataset & stream implementations
    │   │   ├── splitters.py                # Train/val splitting
    │   │   ├── sets.py                     # Dataset classes
    │   │   ├── codecs.py                   # Label encoding/decoding
    │   │   └── streams/                    # Stream constructors
    │   │       ├── mixedwm38.py
    │   │       ├── voc.py
    │   │       └── coco.py
    │   ├── methods/                        # CL methods
    │   │   ├── incremental_learning.py     # Base class
    │   │   ├── finetuning.py
    │   │   ├── lwf.py
    │   │   ├── ewc.py
    │   │   ├── path_integral.py
    │   │   ├── replay.py
    │   │   ├── pass.py
    │   │   ├── praka.py
    │   │   ├── remind.py
    │   │   └── siesta.py
    │   ├── networks/                       # Model architectures
    │   │   ├── network.py                  # LLL_Net wrapper
    │   │   ├── query2label.py              # Transformer head
    │   │   ├── normalized_linear.py        # Cosine classifier
    │   │   └── ...
    │   ├── losses/                         # Loss functions
    │   │   ├── asymmetric_loss.py
    │   │   └── two_way_multilabel_loss.py
    │   ├── metrics.py                      # Evaluation metrics
    │   ├── transforms.py                   # Data augmentations
    │   ├── utils.py                        # Utility functions
    │   ├── visualizations.py               # Plotting utilities
    │   └── last_layer_analysis.py          # Weight analysis
    └── stages/                             # DVC pipeline stages
        ├── continual_learning.py           # Main training script
        ├── analysis.py                     # Results analysis
        ├── dataset_visualizations.py
        ├── download_and_unzip.py
        └── download_and_preprocess_torchvision_datasets.py
This project is built on a modified FACIL framework adapted for multi-label and domain-incremental scenarios:
- Base Class: All methods inherit from `Inc_Learning_Appr` with hooks for `train_loop()`, `criterion()`, `post_train_process()`
- Multi-Head Network: `LLL_Net` wrapper supports multiple heads (one per task) with various architectures (linear, Query2Label, normalized)
- Avalanche Integration: Uses `AvalancheDataset` for continual learning stream construction
- Lightning Fabric: Efficient training with automatic device placement, mixed precision, distributed support
| Library | Purpose |
|---|---|
| PyTorch 2.0+ | Deep learning framework |
| Lightning Fabric | Training infrastructure |
| Hydra 1.3+ | Compositional configuration |
| Avalanche | Continual learning benchmarks |
| timm | Model zoo (ResNet, MobileNet, ViT) |
| torchmetrics | Multi-label metrics |
| DVC | Experiment pipeline & versioning |
| skald | Metrics logging |
| polars | Fast dataframe analysis |
- Loss Functions: Asymmetric Loss (ASL), BCE, Two-Way Multi-Label Loss
- (Torch)Metrics: Multilabel Accuracy, Precision, Recall, F1, mAP, Exact Match, Hamming Distance
- Custom Metrics: MultilabelConfusionMatrix, CardinalityConfusionMatrix
- Heads: Support for linear, Query2Label, normalized linear layers
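As an illustration of the first loss in the list, the sketch below implements the per-sample Asymmetric Loss in the form published by Ridnik et al., in plain Python. It is not the repository's `asymmetric_loss.py`: the parameter names `gamma_pos`, `gamma_neg`, and `margin`, their default values, and the summation over labels are assumptions made for this example.

```python
import math


def asymmetric_loss(logits, targets, gamma_pos=1.0, gamma_neg=4.0, margin=0.05, eps=1e-8):
    """Sum of per-label asymmetric losses for one sample."""
    total = 0.0
    for z, y in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid probability
        if y == 1:
            # Positive term: focal-style down-weighting of easy positives.
            total -= (1.0 - p) ** gamma_pos * math.log(max(p, eps))
        else:
            # Negative term: shift the probability by the margin, then focus harder.
            p_m = max(p - margin, 0.0)
            total -= p_m ** gamma_neg * math.log(max(1.0 - p_m, eps))
    return total


# With both focusing parameters and the margin set to zero, ASL reduces to BCE.
bce_like = asymmetric_loss([0.0], [1], gamma_pos=0.0, gamma_neg=0.0, margin=0.0)
print(abs(bce_like - math.log(2)) < 1e-9)  # True
```

The margin discards negatives whose probability is already below it, which is the part of the loss aimed at the many easy negatives in multi-label data.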
- Seeded experiments (`utils.seed_everything()`)
- DVC pipeline for full reproducibility
- Hydra configs track all hyperparameters
- Results logged with experiment metadata
- Multiple seeds run for statistical significance
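For readers unfamiliar with seeding helpers, the sketch below shows what such a function typically does. It is a hypothetical stand-in named `seed_everything_sketch`, not the repository's `utils.seed_everything()`, whose actual behavior is not described in this README. NumPy and PyTorch are seeded only if they happen to be installed.

```python
import os
import random


def seed_everything_sketch(seed):
    """Seed common sources of randomness; optional libraries are seeded if installed."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch

        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    except ImportError:
        pass


# Re-seeding with the same value reproduces the same random draw.
seed_everything_sketch(0)
first = random.random()
seed_everything_sketch(0)
second = random.random()
print(first == second)  # True
```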
Made with ❤️ and ☕ by Laurenz Farthofer.
Funding: This work was funded by the Austrian Research Promotion Agency (FFG, Project No. 931130), and by the Austrian Science Fund (FWF) 10.55776/COE12.
Special Thanks:
- Infineon Technologies and KAI GmbH for supporting this research and enabling its publication
- Marc Masana for supervision and his FACIL framework, which served as the foundation for this project
- Benedikt Tscheschner for valuable assistance in adapting exemplar-free class-incremental methods to the multi-label setting
- Thomas Pock for academic supervision at Graz University of Technology
![A multi-label cardinality-incremental continual learning scenario based on the MixedWM38 dataset [wang2020], which consists of 38,000 wafer maps containing 0 to 4 of 8 possible defect patterns. This scenario is constructed by splitting the dataset into 4 tasks with increasing cardinality |y| (number of patterns per sample).](/LaurenzBeck/Cardinality-Incremental-Learning/raw/main/assets/mixedwm_scenario.png)