Publications

Scanning Trojaned Models Using Out-of-Distribution Samples

Published in NeurIPS, 2024

In this work, we introduce TRODO, a new method for detecting backdoor (trojan) attacks in deep neural networks. TRODO identifies trojaned classifiers by adversarially shifting out-of-distribution (OOD) samples toward the in-distribution (ID) region and detecting when the classifier mistakenly accepts them as ID. The approach requires no access to the training data, remains effective against adversarially trained trojaned classifiers, and adapts across different attack scenarios and datasets.
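
Below is a minimal sketch of the core idea, not the paper's official implementation: OOD samples are perturbed with a PGD-style attack to maximize an ID-confidence proxy (here, the maximum softmax probability), and the model is scored by how many perturbed samples cross an ID-confidence threshold. The step sizes, epsilon budget, threshold, and choice of confidence measure are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def shift_ood_toward_id(model, x_ood, eps=8/255, alpha=2/255, steps=10):
    """PGD-style perturbation pushing OOD inputs toward high ID confidence."""
    x_adv = x_ood.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        # Maximize the maximum softmax probability (an illustrative ID-ness proxy).
        conf = F.softmax(logits, dim=1).max(dim=1).values
        grad = torch.autograd.grad(conf.sum(), x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()          # ascend on confidence
        x_adv = x_ood + (x_adv - x_ood).clamp(-eps, eps)       # stay in the eps-ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

def trojan_suspicion_score(model, ood_loader, conf_threshold=0.9, device="cpu"):
    """Fraction of shifted OOD samples the classifier now treats as ID."""
    model.eval().to(device)
    flipped, total = 0, 0
    for x_ood, _ in ood_loader:
        x_ood = x_ood.to(device)
        x_adv = shift_ood_toward_id(model, x_ood)
        with torch.no_grad():
            conf = F.softmax(model(x_adv), dim=1).max(dim=1).values
        flipped += (conf > conf_threshold).sum().item()
        total += x_ood.size(0)
    return flipped / total  # higher => more suspicious of a trojan
```

In this sketch, a higher score means the classifier has larger "blind spots" around the data manifold, which is the kind of signal TRODO uses to flag a model as trojaned; the exact scoring rule and thresholds in the paper may differ.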