LaurenzBeck/Cardinality-Incremental-Learning

A multi-label cardinality-incremental continual learning scenario based on the MixedWM38 dataset [wang2020], which consists of 38,000 wafer maps containing 0 to 4 of 8 possible defect patterns. This scenario is constructed by splitting the dataset into 4 tasks with increasing cardinality |y| (number of patterns per sample).

🏷️↗️🧠 Cardinality-Incremental Learning

Project Status: Active. The project has reached a stable, usable state and is being actively developed. Built with Python, Poetry, PyTorch, and Hydra.

Multi-Label Continual Learning with Cardinality-Incremental Scenarios

Research code for our CVWW 2026 paper investigating continual learning scenarios where samples progressively increase in label cardinality, enabling the study of intra-class generalization and cardinality bias.

👥 Authors

| Name | Institution | Email |
| --- | --- | --- |
| Laurenz A. Farthofer | Kompetenzzentrum Automobil- und Industrieelektronik GmbH (KAI) | laurenz.farthofer@k-ai.at |
| Marc Masana | Institute of Visual Computing, Graz University of Technology | mmasana@tugraz.at |

🚀 Quick Start

# 1. Clone the repository
git clone https://github.com/LaurenzBeck/Cardinality-Incremental-Learning.git
cd Cardinality-Incremental-Learning

# 2. Install dependencies with Poetry (alternatively, use the `requirements.txt` and your preferred environment manager)
poetry install

# 3. Download datasets and run all experiments
poetry run dvc repro

⚠️ Note: MS COCO must be downloaded manually from cocodataset.org and placed in {data_path}/coco/data/. PASCAL VOC and MixedWM38 download automatically.

✨ Key Features

  • 🎯 Novel Continual Learning Scenario: Cardinality-incremental tasks that progressively increase label complexity
  • 🏷️ Multi-Label Support: Full support for multi-label classification in continual learning settings
  • 🌊 Domain-Incremental Focus: All classes present throughout the stream, emphasizing intra-class generalization
  • 📊 Comprehensive Evaluation: Four test streams (past, present, future, all) for detailed knowledge transfer analysis
  • 🧪 10 Methods Implemented: From baselines to state-of-the-art continual learning approaches adapted for multi-label
  • 🎨 Three Datasets: MixedWM38 (wafer defects), PASCAL VOC, and MS COCO
  • ⚙️ Hydra Configuration: Modular, composable experiment configurations
  • 🦉 DVC Pipeline: Reproducible experiment pipeline with automatic dependency tracking
  • ⚡ Lightning Fabric: Efficient training with automatic device placement and distributed support

🏭 Motivation

Real-world classification systems, particularly in semiconductor manufacturing, face continual evolution as new defect patterns and their combinations emerge during production. Traditional single-label approaches fail to capture the complexity of co-occurring defects, while static multi-label models become outdated as new pattern combinations appear.

The Challenge

Most continual learning research focuses on class-incremental scenarios where tasks have disjoint class sets. However, real industrial applications exhibit:

  • Class repetition: The same defect types recur throughout production
  • Evolving combinations: New co-occurrences of known defects emerge over time
  • Incremental complexity: Samples start simple (single defects) and become complex (multiple simultaneous defects)

Our Solution

We propose cardinality-incremental learning, a domain-incremental scenario where:

  1. All classes are introduced in the first task (through single-label samples)
  2. Subsequent tasks introduce samples with progressively more labels per sample
  3. The challenge shifts from remembering which classes exist to learning how they co-occur

This mirrors human learning: concepts are first learned in isolation (dedicated instruction), then encountered within richer contexts alongside other concepts.

🏷️↗️ Cardinality-Incremental Scenarios

Scenario Definition

In traditional continual learning, streams are defined as $\mathcal{S} = \{ \mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_T \}$ where each task $\mathcal{D}_t$ contains samples and labels.

Our cardinality-incremental scenario enforces:

$$ |\mathbf{y}_i^t| = t \quad \text{for task } t \in \{1, 2, \ldots, T\} $$

where $|\mathbf{y}|$ denotes the number of active labels (cardinality).
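As an illustration of the definition above, the following minimal sketch groups sample indices by label cardinality and orders the groups by increasing cardinality. It assumes multi-hot label vectors with 0/1 entries; `split_by_cardinality` is a hypothetical helper written for this example and is not the stream constructor used in this repository.

```python
from collections import defaultdict


def split_by_cardinality(labels):
    """Group sample indices by label cardinality |y| (number of active labels)."""
    tasks = defaultdict(list)
    for index, y in enumerate(labels):
        cardinality = sum(y)  # y is assumed to be a multi-hot vector of 0/1 entries
        tasks[cardinality].append(index)
    # Order the groups by increasing cardinality to form the task stream.
    return [tasks[c] for c in sorted(tasks)]


# Toy example with 3 classes and 5 samples.
labels = [
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 1],
    [1, 1, 1],
    [1, 1, 0],
]
stream = split_by_cardinality(labels)
print(stream)  # [[0, 2], [1, 4], [3]]
```

Each inner list holds the indices of one task, so the first task contains only single-label samples and every later task adds one more label per sample.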

Why This Matters

Domain-Incremental vs Class-Incremental

| Aspect | Class-Incremental | Domain-Incremental (Ours) |
| --- | --- | --- |
| Classes | Disjoint per task | All classes in all tasks |
| Focus | Catastrophic forgetting | Intra-class generalization |
| Repetition | None (by design) | Full concept repetition |
| Real-world | Rare | Common |

Key Research Questions

  1. Intra-class Generalization: Can models trained on single-label samples generalize to multi-label combinations?
  2. Cardinality Bias: Do models exhibit task-recency bias in predicted label cardinality?
  3. Knowledge Transfer: How well do models transfer knowledge from simple (low cardinality) to complex (high cardinality) samples?

Datasets

| Dataset | Classes | Tasks | Max Cardinality | Samples | Domain |
| --- | --- | --- | --- | --- | --- |
| MixedWM38 | 8 | 4 | 4 | 38,000 | Wafer defect patterns |
| PASCAL VOC | 20 | 4 | 4 | 11,540 | Natural images |
| MS COCO | 80 | 9 | 9 | 123,287 | Natural images |

🔁 Continual Learning Methods

We implement and adapt 10 continual learning approaches for the multi-label cardinality-incremental setting:

Baselines

| Method | Description | Use Case |
| --- | --- | --- |
| Finetuning | Naïve sequential training without forgetting protection | Lower bound (most plastic, least stable) |
| Joint | Train on all tasks simultaneously | Upper bound (optimal performance) |
| Freezing | Freeze backbone after task 0, only train head | Stability baseline |
| Static | Train only on task 0, no updates | Zero-plasticity baseline |

Functional Regularization

| Method | Paper | Key Idea |
| --- | --- | --- |
| LwF | Li & Hoiem (ECCV 2016) | Knowledge distillation on predictions |
| LwF-F | Jung et al. (2018) | Feature-level distillation with L₂ loss |
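To make the prediction-level distillation idea concrete, here is a minimal sketch of one way it could look with sigmoid outputs. It is an assumption for illustration, not the repository's adaptation: the distillation term is written as a binary cross-entropy against the frozen previous model's probabilities, and `lamb` only mirrors the name of the `method.appr.lamb` override used in the commands; its exact role in the code base is not confirmed here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean BCE between predicted probabilities and hard or soft targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp for numerical stability
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)


def lwf_loss(new_logits, old_logits, targets, lamb=1.0):
    """Task loss plus a distillation term toward the previous model's outputs."""
    new_probs = [sigmoid(z) for z in new_logits]
    old_probs = [sigmoid(z) for z in old_logits]  # outputs of the frozen old model
    task_loss = binary_cross_entropy(new_probs, targets)
    distill_loss = binary_cross_entropy(new_probs, old_probs)
    return task_loss + lamb * distill_loss


# Larger lamb pulls the new model harder toward the old model's predictions.
print(lwf_loss([2.0, -1.0], [1.5, -0.5], [1, 0], lamb=1.0))
```

Setting `lamb=0.0` reduces this sketch to plain finetuning with a BCE loss.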

Parameter Regularization

| Method | Paper | Key Idea |
| --- | --- | --- |
| EWC | Kirkpatrick et al. (PNAS 2017) | Fisher information matrix penalties |
| SI | Zenke et al. (ICML 2017) | Path integral-based parameter importance |
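The Fisher-based penalty behind EWC can be sketched in a few lines. This is a plain-Python illustration on flat parameter lists, assuming a diagonal Fisher estimate from squared per-sample gradients; the function names and the `lamb` weight are hypothetical and do not describe this repository's implementation.

```python
def diagonal_fisher(per_sample_grads):
    """Estimate the diagonal Fisher as the mean of squared per-sample gradients."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[i] ** 2 for g in per_sample_grads) / n for i in range(dim)]


def ewc_penalty(params, old_params, fisher, lamb=1.0):
    """Quadratic penalty: (lamb / 2) * sum_i F_i * (theta_i - theta_old_i)^2."""
    return 0.5 * lamb * sum(
        f * (p - p_old) ** 2 for p, p_old, f in zip(params, old_params, fisher)
    )


# Two per-sample gradients for a model with two parameters.
fisher = diagonal_fisher([[1.0, 0.0], [3.0, 0.0]])
print(fisher)  # [5.0, 0.0]

# Only the first parameter is important, so only its drift is penalized.
print(ewc_penalty([1.0, 1.0], [0.0, 0.0], fisher, lamb=2.0))  # 5.0
```

Parameters with a near-zero Fisher value stay free to change, which is how the penalty trades stability against plasticity.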

Replay Methods

| Method | Paper | Key Idea |
| --- | --- | --- |
| Replay | Rebuffi et al. (2017) | Store exemplars from previous tasks |
| PASS | Zhu et al. (CVPR 2021) | Prototype augmentation + self-supervision |
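A minimal sketch of exemplar storage follows. It assumes a fixed per-task budget and uniform random selection, which is a simplification chosen for this example; the cited work selects exemplars by herding, and the selection rule used in this repository is not stated here. `ExemplarMemory` is a hypothetical class.

```python
import random


class ExemplarMemory:
    """Keep a small, fixed number of samples per finished task for replay."""

    def __init__(self, per_task_budget, seed=0):
        self.per_task_budget = per_task_budget
        self.rng = random.Random(seed)
        self.exemplars = {}  # task id -> list of stored samples

    def store(self, task_id, samples):
        # Uniform random selection (assumption); other rules such as herding exist.
        k = min(self.per_task_budget, len(samples))
        self.exemplars[task_id] = self.rng.sample(samples, k)

    def replay_batch(self, batch_size):
        # Draw from the pooled exemplars of all stored tasks.
        pool = [s for task in self.exemplars.values() for s in task]
        k = min(batch_size, len(pool))
        return self.rng.sample(pool, k)


memory = ExemplarMemory(per_task_budget=2)
memory.store(0, list(range(10)))       # samples of the first task
memory.store(1, list(range(10, 20)))   # samples of the second task
batch = memory.replay_batch(3)         # mixed into the current task's batches
print(len(batch))  # 3
```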

Efficient Large-Scale Methods

| Method | Paper | Key Idea |
| --- | --- | --- |
| REMIND | Hayes et al. (ECCV 2020) | Product quantization for latent replay |
| SIESTA | Garg et al. (NeurIPS 2022) | Online learning with PQ-based consolidation |

📝 All methods originally designed for class-incremental single-label learning have been carefully adapted to support multi-label classification in domain-incremental scenarios.

📊 Evaluation Strategy

Four Test Streams

We evaluate models using four distinct test streams to analyze different aspects of knowledge transfer:

| Stream | Description | What It Measures |
| --- | --- | --- |
| PAST | Samples from all previous tasks (t' < t) | Backward transfer & forgetting |
| PRESENT | Samples from the current task (t' = t) | Current task performance |
| FUTURE | Samples from upcoming tasks (t' > t) | Forward transfer & generalization |
| ALL | All test samples regardless of task | Overall model capability |
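The four streams can be derived from a list of per-task test sets with simple slicing. The sketch below assumes 0-based task indices and uses placeholder strings in place of real datasets; `build_test_streams` is a hypothetical helper, not a function from this repository.

```python
def build_test_streams(test_sets, current_task):
    """Split per-task test sets into PAST, PRESENT, FUTURE, and ALL streams."""
    return {
        "past": test_sets[:current_task],            # t' < t
        "present": [test_sets[current_task]],        # t' = t
        "future": test_sets[current_task + 1:],      # t' > t
        "all": list(test_sets),                      # every task
    }


# Placeholder test sets, one per cardinality.
test_sets = ["card_1", "card_2", "card_3", "card_4"]
streams = build_test_streams(test_sets, current_task=1)
print(streams["past"])     # ['card_1']
print(streams["present"])  # ['card_2']
print(streams["future"])   # ['card_3', 'card_4']
```

After the first task the PAST stream is empty, and after the last task the FUTURE stream is empty, so metrics on those streams are undefined at the ends of the sequence.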

Key Metrics

  • mAP (mean Average Precision): Primary multi-label metric
  • Precision: Positive predictive value
  • Recall: Sensitivity to positive labels
  • F1-Score: Harmonic mean of precision and recall
  • Accuracy: Subset accuracy (all labels must be correct)
  • Hamming Distance: Average per-label error
  • Exact Match: Percentage of perfectly predicted samples
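For intuition about how the per-label and per-sample metrics differ, the following plain-Python sketch computes Hamming distance and exact match on already thresholded predictions. The repository relies on torchmetrics for its reported numbers; the two functions here are hypothetical helpers for illustration only.

```python
def hamming_distance(y_true, y_pred):
    """Fraction of individual label decisions that are wrong."""
    errors = sum(t != p for yt, yp in zip(y_true, y_pred) for t, p in zip(yt, yp))
    total = sum(len(yt) for yt in y_true)
    return errors / total


def exact_match(y_true, y_pred):
    """Fraction of samples whose full label vector is predicted correctly."""
    hits = sum(list(yt) == list(yp) for yt, yp in zip(y_true, y_pred))
    return hits / len(y_true)


y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(hamming_distance(y_true, y_pred))  # 1 wrong decision out of 6
print(exact_match(y_true, y_pred))       # 0.5
```

A single missed label costs little in Hamming distance but makes the whole sample count as wrong for exact match, which is why the gap between the two grows with label cardinality.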

Cardinality Bias Analysis

We introduce cardinality confusion matrices to visualize the tendency of models to predict label cardinalities biased toward recently seen tasks, a new manifestation of task-recency bias specific to multi-label continual learning.
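One way to build such a matrix is sketched below: rows index the true cardinality, columns the predicted cardinality obtained by thresholding per-label probabilities. The threshold of 0.5 and the clipping of over-predictions to `max_cardinality` are assumptions of this example, and the function is not the repository's `CardinalityConfusionMatrix`.

```python
def cardinality_confusion_matrix(y_true, y_prob, max_cardinality, threshold=0.5):
    """Count samples by (true cardinality, predicted cardinality)."""
    size = max_cardinality + 1  # include cardinality 0
    matrix = [[0] * size for _ in range(size)]
    for labels, probs in zip(y_true, y_prob):
        true_card = sum(labels)
        # Predicted cardinality = number of labels above the threshold,
        # clipped so over-predictions land in the last column (assumption).
        pred_card = min(sum(p >= threshold for p in probs), max_cardinality)
        matrix[true_card][pred_card] += 1
    return matrix


y_true = [[1, 0, 0], [1, 1, 0], [1, 1, 0]]
y_prob = [[0.9, 0.1, 0.2], [0.8, 0.3, 0.1], [0.7, 0.6, 0.2]]
for row in cardinality_confusion_matrix(y_true, y_prob, max_cardinality=2):
    print(row)
# [0, 0, 0]
# [0, 1, 0]
# [0, 1, 1]
```

Mass below the diagonal means the model predicts fewer labels than are present, and mass above it means the model predicts more; a model biased toward the most recent task concentrates mass in that task's column.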

⚗️ Experiments

Prerequisites

  • Python 3.11 or 3.12
  • Poetry 2.0.0+
  • CUDA-capable GPU (recommended)
  • ~50GB free disk space for datasets

Installation

# Clone the repository
git clone https://github.com/LaurenzBeck/Cardinality-Incremental-Learning.git
cd Cardinality-Incremental-Learning

# Install all dependencies
poetry install

# Verify installation
poetry run python -c "import torch; print(torch.__version__)"

Dataset Setup

Automatic Downloads

  • MixedWM38: Downloads automatically from Kaggle on first run
  • PASCAL VOC: Downloads automatically via TorchVision

Manual Download Required

MS COCO must be downloaded manually:

  1. Visit cocodataset.org
  2. Download 2017 Train/Val images and annotations
  3. Place in {data_path}/coco/data/ where data_path is defined in conf/config.yaml

Running Experiments

Full Pipeline (All Experiments)

# Run complete DVC pipeline: download → preprocess → experiments → analysis
poetry run dvc repro

Individual Experiments

# MixedWM38 with all methods (default experiment)
poetry run python src/stages/continual_learning.py experiment=main

# PASCAL VOC experiments
poetry run python src/stages/continual_learning.py experiment=voc

# MS COCO experiments  
poetry run python src/stages/continual_learning.py experiment=coco

# Loss function ablation study
poetry run python src/stages/continual_learning.py experiment=loss_ablations

Specific Method/Configuration

# Run specific method on PASCAL VOC
poetry run python src/stages/continual_learning.py experiment=voc method=lwf

# Override hyperparameters
poetry run python src/stages/continual_learning.py experiment=voc method=lwf method.appr.lamb=5.0

# Change random seed
poetry run python src/stages/continual_learning.py experiment=main seed=42

# Use different network
poetry run python src/stages/continual_learning.py experiment=voc network=resnet18

Multi-Run Experiments (Hydra)

# Run across multiple seeds
poetry run python src/stages/continual_learning.py experiment=main --multirun seed=0,1,2,3,4

# Grid search over methods
poetry run python src/stages/continual_learning.py experiment=voc --multirun method=finetuning,lwf,ewc,replay

Quick Test Run (Verify Installation)

For a fast test to verify everything works without running full experiments:

# Quick test on MixedWM38 (smallest dataset, ~2-3 minutes)
poetry run python src/stages/continual_learning.py experiment=main method=finetuning network=tiny_byobnet fabric=default method.appr.nepochs=1 seed=0

# Quick test on PASCAL VOC (~5 minutes)
poetry run python src/stages/continual_learning.py experiment=voc method=finetuning network=tiny_byobnet fabric=default method.appr.nepochs=1 seed=0

# Quick test on MS COCO (~10 minutes, requires manual download)
poetry run python src/stages/continual_learning.py experiment=coco method=finetuning network=tiny_byobnet fabric=default method.appr.nepochs=1 seed=0

💡 Tip: The fabric=default config uses CPU. For GPU testing, use fabric=gpu instead. tiny_byobnet is a small, fast timm model ideal for testing.

Analysis and Visualization

# Analyze all experiments
poetry run python src/stages/analysis.py --dataset all

# Analyze specific dataset
poetry run python src/stages/analysis.py --dataset mixedwm38
poetry run python src/stages/analysis.py --dataset voc
poetry run python src/stages/analysis.py --dataset coco

# Custom report directory
poetry run python src/stages/analysis.py --dataset voc --report-dir /path/to/reports

Outputs are saved to:

  • reports/figures/mixedwm38/ - MixedWM38 plots
  • reports/figures/pascal_voc/ - PASCAL VOC plots
  • reports/figures/ms_coco/ - MS COCO plots
  • reports/figures/voc_lwf_sweep/ - LwF hyperparameter sweep results

Configuration

Experiments use Hydra for compositional configuration. All config files are in conf/:

conf/
├── experiment/          # Experiment presets
│   ├── main.yaml       # MixedWM38 experiments
│   ├── voc.yaml        # PASCAL VOC experiments
│   ├── coco.yaml       # MS COCO experiments
│   └── ...
├── method/             # CL method configs
│   ├── finetuning.yaml
│   ├── lwf.yaml
│   ├── ewc.yaml
│   └── ...
├── network/            # Model architectures
├── scenario/           # Dataset & stream configs
├── loss/              # Loss functions
├── optimizer/         # Optimizers
└── fabric/            # Lightning Fabric settings

📂 Project Structure

Cardinality-Incremental-Learning/
├── 📄 README.md                          # This file
├── 📄 LICENSE                            # MIT License
├── 📄 pyproject.toml                     # Poetry dependencies
├── 📄 dvc.yaml                           # DVC pipeline definition
├── ⚙️ conf/                               # Hydra configurations
│   ├── config.yaml                      # Root config (gitignored)
│   ├── experiment/                      # Experiment presets
│   ├── method/                          # CL methods
│   ├── network/                         # Model architectures
│   ├── scenario/                        # Datasets & streams
│   ├── loss/                            # Loss functions
│   ├── optimizer/                       # Optimizers
│   └── fabric/                          # Training configs
├── 💾 data/                               # Data directory (gitignored)
│   ├── mixedwm38/
│   ├── pascal_voc/
│   └── coco/
├── 📈 reports/                            # Experiment results (gitignored)
│   ├── main/                            # MixedWM38 results
│   ├── voc/                             # PASCAL VOC results
│   ├── coco/                            # MS COCO results
│   └── figures/                         # Generated plots
└── 🧑‍💻 src/                                # Source code
    ├── 📦 cardinality_incremental_learning/  # Main package
    │   ├── data/                        # Dataset & stream implementations
    │   │   ├── splitters.py            # Train/val splitting
    │   │   ├── sets.py                 # Dataset classes
    │   │   ├── codecs.py               # Label encoding/decoding
    │   │   └── streams/                # Stream constructors
    │   │       ├── mixedwm38.py
    │   │       ├── voc.py
    │   │       └── coco.py
    │   ├── methods/                     # CL methods
    │   │   ├── incremental_learning.py  # Base class
    │   │   ├── finetuning.py
    │   │   ├── lwf.py
    │   │   ├── ewc.py
    │   │   ├── path_integral.py
    │   │   ├── replay.py
    │   │   ├── pass.py
    │   │   ├── praka.py
    │   │   ├── remind.py
    │   │   └── siesta.py
    │   ├── networks/                    # Model architectures
    │   │   ├── network.py              # LLL_Net wrapper
    │   │   ├── query2label.py          # Transformer head
    │   │   ├── normalized_linear.py    # Cosine classifier
    │   │   └── ...
    │   ├── losses/                      # Loss functions
    │   │   ├── asymmetric_loss.py
    │   │   └── two_way_multilabel_loss.py
    │   ├── metrics.py                   # Evaluation metrics
    │   ├── transforms.py                # Data augmentations
    │   ├── utils.py                     # Utility functions
    │   ├── visualizations.py            # Plotting utilities
    │   └── last_layer_analysis.py       # Weight analysis
    └── 🦉 stages/                        # DVC pipeline stages
        ├── continual_learning.py        # Main training script
        ├── analysis.py                  # Results analysis
        ├── dataset_visualizations.py
        ├── download_and_unzip.py
        └── download_and_preprocess_torchvision_datasets.py

🔧 Technical Details

Framework Architecture

This project is built on a modified FACIL framework adapted for multi-label and domain-incremental scenarios:

  • Base Class: All methods inherit from Inc_Learning_Appr with hooks for train_loop(), criterion(), post_train_process()
  • Multi-Head Network: LLL_Net wrapper supports multiple heads (one per task) with various architectures (linear, Query2Label, normalized)
  • Avalanche Integration: Uses AvalancheDataset for continual learning stream construction
  • Lightning Fabric: Efficient training with automatic device placement, mixed precision, distributed support

Key Dependencies

| Library | Purpose |
| --- | --- |
| PyTorch 2.0+ | Deep learning framework |
| Lightning Fabric | Training infrastructure |
| Hydra 1.3+ | Compositional configuration |
| Avalanche | Continual learning benchmarks |
| timm | Model zoo (ResNet, MobileNet, ViT) |
| torchmetrics | Multi-label metrics |
| DVC | Experiment pipeline & versioning |
| skald | Metrics logging |
| polars | Fast dataframe analysis |

Multi-Label Specifics

  • Loss Functions: Asymmetric Loss (ASL), BCE, Two-Way Multi-Label Loss
  • (Torch)Metrics: Multilabel Accuracy, Precision, Recall, F1, mAP, Exact Match, Hamming Distance
  • Custom Metrics: MultilabelConfusionMatrix, CardinalityConfusionMatrix
  • Heads: Support for linear, Query2Label, normalized linear layers
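As a rough illustration of how Asymmetric Loss treats positives and negatives differently, the sketch below evaluates the per-label loss for a single logit. The default values (`gamma_pos=0.0`, `gamma_neg=4.0`, `margin=0.05`) are illustrative assumptions, and the function is a scalar teaching version, not the contents of `asymmetric_loss.py`.

```python
import math


def asymmetric_loss(logit, target, gamma_pos=0.0, gamma_neg=4.0, margin=0.05, eps=1e-8):
    """Per-label asymmetric loss for one logit and one binary target."""
    p = 1.0 / (1.0 + math.exp(-logit))
    if target == 1:
        # Positive label: focal-style down-weighting of easy positives.
        return -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
    # Negative label: shift the probability by the margin so that very easy
    # negatives contribute exactly zero, then focus on the hard ones.
    p_shifted = max(p - margin, 0.0)
    return -(p_shifted ** gamma_neg) * math.log(max(1.0 - p_shifted, eps))


print(asymmetric_loss(-5.0, 0))  # easy negative, fully discarded by the margin
print(asymmetric_loss(2.0, 0))   # hard negative, still penalized
print(asymmetric_loss(2.0, 1))   # confident positive
```

With both focusing parameters and the margin set to zero, the expression reduces to ordinary binary cross-entropy, which makes BCE a useful reference point in the loss ablation.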

Reproducibility

  • ✅ Seeded experiments (utils.seed_everything())
  • ✅ DVC pipeline for full reproducibility
  • ✅ Hydra configs track all hyperparameters
  • ✅ Results logged with experiment metadata
  • ✅ Multiple seeds run for statistical significance
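Results from several seeds are usually reported as mean and standard deviation per method. The sketch below shows that aggregation with the standard library; the scores are made-up placeholder values, not results from the paper, and `summarize_over_seeds` is a hypothetical helper rather than part of the analysis stage.

```python
from statistics import mean, stdev


def summarize_over_seeds(scores_by_method):
    """Return (mean, sample standard deviation) of a metric for each method."""
    summary = {}
    for method, scores in scores_by_method.items():
        spread = stdev(scores) if len(scores) > 1 else 0.0
        summary[method] = (mean(scores), spread)
    return summary


# Placeholder mAP values for three seeds per method (not real results).
scores_by_method = {
    "finetuning": [0.50, 0.52, 0.54],
    "lwf": [0.60, 0.60, 0.60],
}
for method, (avg, spread) in summarize_over_seeds(scores_by_method).items():
    print(f"{method}: {avg:.3f} +/- {spread:.3f}")
```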

🙏 Acknowledgements

Made with ❤️ and ☕ by Laurenz Farthofer.

Funding: This work was funded by the Austrian Research Promotion Agency (FFG, Project No. 931130), and by the Austrian Science Fund (FWF) 10.55776/COE12.

Special Thanks:

  • Infineon Technologies and KAI GmbH for supporting this research and enabling its publication
  • Marc Masana for supervision and his FACIL framework, which served as the foundation for this project
  • Benedikt Tscheschner for valuable assistance in adapting exemplar-free class-incremental methods to the multi-label setting
  • Thomas Pock for academic supervision at Graz University of Technology