Abstract
Domain-adaptive active learning is at the forefront of label-efficient training of neural networks. For semantic segmentation, state-of-the-art models select training labels using two joint criteria, uncertainty and diversity, combined with a pixel-wise acquisition strategy. However, we show that such methods suffer from a class imbalance issue that degrades their performance at larger active learning budgets. We then introduce Class Balanced Dynamic Acquisition (CBDA), a novel active learning method that mitigates this issue, especially in high-budget regimes. The more balanced labels increase minority-class performance, which in turn allows the model to outperform the previous baseline by 0.6, 1.7, and 2.4 mIoU for budgets of 5%, 10%, and 20%, respectively. Additionally, the focus on minority classes improves the minimum class performance by 0.5, 2.9, and 4.6 IoU, respectively. The top-performing model even exceeds the fully supervised baseline, showing that a label set more balanced than the entire ground truth can be beneficial.
Key Contributions
- Class-imbalance diagnosis for active domain adaptation: We show that the labels selected by pixel-wise active learning for domain-adaptive semantic segmentation become increasingly class-imbalanced as the annotation budget grows, which hurts performance in high-budget regimes.
- A simple class-balanced acquisition rule: We introduce Class Balanced Dynamic Acquisition (CBDA), which reweights acquisition scores according to previously collected class statistics so that underrepresented classes are queried more often.
- Dynamic budget allocation across images: Instead of splitting the budget uniformly per image, the method stacks acquisition scores across the target set and selects the highest-scoring pixels globally.
- Stronger minority-class and overall performance: On GTAV to Cityscapes, CBDA improves over the region-based acquisition baseline by 0.6, 1.7, and 2.4 mIoU at 5%, 10%, and 20% budgets, while also improving the worst-served classes.
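One way to make the imbalance in the first contribution measurable is to summarize the class histogram of the queried labels with a single number. The sketch below uses normalized entropy, which is 1.0 for a perfectly uniform histogram and 0.0 when every queried pixel belongs to one class. This measure and the function name `normalized_class_entropy` are assumptions chosen for illustration; the source does not state which imbalance statistic the authors report.

```python
import numpy as np


def normalized_class_entropy(labels, num_classes):
    """Entropy of the class histogram, scaled to [0, 1].

    Hypothetical helper: `labels` holds the integer class ids of the
    queried pixels, all assumed to lie in [0, num_classes).
    """
    counts = np.bincount(np.asarray(labels).ravel(), minlength=num_classes)
    probs = counts.astype(np.float64) / counts.sum()
    nonzero = probs[probs > 0]  # skip empty classes to avoid log(0)
    entropy = -(nonzero * np.log(nonzero)).sum()
    return float(entropy / np.log(num_classes))
```

Tracking this value after each acquisition round, at several budgets, would show whether the queried label set drifts toward the dominant classes as the budget grows.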
Methodology Overview
The work studies active learning for semantic segmentation in a domain adaptation setting, where a model is trained with labeled source data and unlabeled target data, then selectively queries target labels under a fixed annotation budget.
CBDA addresses two limitations of prior pixel-wise acquisition schemes. First, it replaces rigid per-image budget allocation with a dynamic selection rule that chooses the highest-scoring pixels from the score tensor of the full target set, so informative images can receive more of the budget than uninformative ones. Second, it downweights the acquisition scores of classes whose share of the already-acquired labels exceeds their share in a target class distribution, which shifts future queries toward minority classes.
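The two mechanisms described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the ratio-based weight (target share divided by observed share), and the use of the model's predicted class to look up each pixel's weight are hypothetical choices made for the example, since the source does not give the exact formula.

```python
import numpy as np


def class_balance_weights(class_counts, target_dist, eps=1e-8):
    """Per-class score multipliers (assumed form: target share / observed share).

    Classes that already hold more than their target share get a weight
    below 1; under-collected classes get a weight above 1.
    """
    counts = np.asarray(class_counts, dtype=np.float64)
    total = counts.sum()
    if total == 0:  # nothing acquired yet: leave scores unchanged
        return np.ones_like(counts)
    observed = counts / total
    return np.asarray(target_dist, dtype=np.float64) / (observed + eps)


def select_pixels_globally(scores, pred_classes, class_counts, target_dist, budget):
    """Select `budget` pixels with the highest reweighted score over ALL images.

    scores:       float array (num_images, H, W) of acquisition scores
    pred_classes: int array   (num_images, H, W), assumed to be model predictions
    Returns a tuple of index arrays (image, row, col).
    """
    scores = np.asarray(scores, dtype=np.float64)
    weights = class_balance_weights(class_counts, target_dist)
    weighted = scores * weights[np.asarray(pred_classes)]
    flat = weighted.ravel()
    budget = min(int(budget), flat.size)
    if budget <= 0:
        return tuple(np.empty(0, dtype=np.intp) for _ in scores.shape)
    # argpartition finds the top-`budget` entries without a full sort
    top = np.argpartition(flat, -budget)[-budget:]
    return np.unravel_index(top, scores.shape)
```

Because the top-k runs over the flattened tensor of the whole target set, the number of pixels taken from each image is decided by the scores rather than fixed in advance. After each round, the counts of the newly labeled classes would be added to `class_counts` so that the weights adapt to what has been collected so far.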
Main Findings
- Standard region-based acquisition becomes more class-imbalanced as the active learning budget increases.
- More balanced queried labels improve minority-class IoU without degrading the dominant classes.
- CBDA yields consistent gains over the baseline on GTAV to Cityscapes at 5%, 10%, and 20% budgets.
- In the strongest setting reported, the method surpasses the fully supervised baseline, suggesting that balanced queried labels can be more informative than labeling the full imbalanced target set.
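The findings above are stated in terms of mIoU and the minimum per-class IoU. As a reference for how these two numbers relate, the following sketch computes both from a confusion matrix using the standard IoU definition (true positives divided by the union of predicted and ground-truth pixels). The function names are hypothetical, and the code assumes all labels lie in `[0, num_classes)`; handling of an ignore label, as commonly used with Cityscapes, is left out.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the ground-truth class, columns the predicted class."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    flat_index = y_true * num_classes + y_pred
    counts = np.bincount(flat_index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def per_class_iou(cm):
    """IoU per class; NaN for classes absent from both labels and predictions."""
    tp = np.diag(cm).astype(np.float64)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    iou = np.full(tp.shape, np.nan)
    present = union > 0
    iou[present] = tp[present] / union[present]
    return iou


def summarize(cm):
    """Return (mIoU, minimum class IoU), ignoring absent classes."""
    iou = per_class_iou(cm)
    return float(np.nanmean(iou)), float(np.nanmin(iou))
```

Reporting the minimum alongside the mean makes it visible when an average gain hides a class that the model still segments poorly.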
