Publications - Efstathia Soufleri

Refereed journal articles

Soufleri, E., Ravikumar, D., & Roy, K. (2024). DP-ImgSyn: Dataset Alignment for Obfuscated, Differentially Private Image Synthesis. TMLR.

@article{soufleridp,
  title = {DP-ImgSyn: Dataset Alignment for Obfuscated, Differentially Private Image Synthesis},
  author = {Soufleri, Efstathia and Ravikumar, Deepak and Roy, Kaushik},
  journal = {TMLR},
  year = {2024},
  file = {2144_dp_imgsyn_dataset_alignment_fo.pdf}
}

The availability of abundant data has catalyzed the expansion of deep learning vision algorithms. However, certain vision datasets cannot be publicly released due to privacy reasons. Releasing synthetic images instead of private images is a common approach to overcome this issue. A popular method to generate synthetic images is using Generative Adversarial Networks (GANs) with Differential Privacy (DP) guarantees. However, GAN-generated synthetic images are visually similar to private images. This is a severe limitation, particularly when the private dataset depicts visually sensitive and disturbing content. To address this, we propose a non-generative framework, Differentially Private Image Synthesis (DP-ImgSyn), to generate and release synthetic images for image classification tasks. These synthetic images: (1) have DP guarantees, (2) retain the utility of the private images, i.e., a model trained using synthetic images results in similar accuracy as a model trained on private images, and (3) are visually dissimilar to private images. DP-ImgSyn consists of the following steps: First, a teacher model is trained on the private images using a DP training algorithm. Second, public images are used as initialization for the synthetic images, which are optimized to align them with the private images. The optimization uses the teacher network's batch normalization layer statistics (mean, standard deviation) to inject information about the private images into the synthetic images. Third, the synthetic images and their soft labels, obtained from the teacher model, are released and can be deployed for neural network training on image classification tasks. Our experiments on various image classification datasets show that when using similar DP training mechanisms, our framework performs better than generative techniques (up to ≈ 20% in terms of image classification accuracy).
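The second step above, aligning synthetic images with stored batch normalization statistics, can be illustrated with a small toy. This is only a sketch under stated assumptions: the squared-distance loss, the function name `bn_alignment_loss`, and the closed-form rescaling used in place of an iterative optimization are all hypothetical choices made for illustration, not the formulation from the paper.

```python
import numpy as np

def bn_alignment_loss(features, target_mean, target_std):
    """Squared distance between per-channel batch statistics and stored targets.

    features: array of shape (batch, channels), a toy stand-in for the
              activations entering a batch normalization layer.
    target_mean, target_std: arrays of shape (channels,), standing in for the
              statistics stored in the teacher's batch normalization layer.
    """
    batch_mean = features.mean(axis=0)
    batch_std = features.std(axis=0)
    return float(np.sum((batch_mean - target_mean) ** 2)
                 + np.sum((batch_std - target_std) ** 2))

rng = np.random.default_rng(0)
target_mean = np.array([1.0, -2.0, 0.5])  # assumed stored statistics
target_std = np.array([0.5, 2.0, 1.0])

# "Public" features start with statistics far from the stored ones.
public = rng.normal(0.0, 1.0, size=(256, 3))

# In this toy, standardizing and rescaling matches the targets exactly.
# A real pipeline would instead minimize the loss by gradient descent on pixels.
aligned = (public - public.mean(axis=0)) / public.std(axis=0) * target_std + target_mean

loss_public = bn_alignment_loss(public, target_mean, target_std)
loss_aligned = bn_alignment_loss(aligned, target_mean, target_std)
```

The loss only depends on per-channel statistics, so many different inputs reach zero loss; that is consistent with the stated goal of carrying information about the private data without reproducing individual images.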

Soufleri, E., & Roy, K. (2021). Network Compression via Mixed Precision Quantization Using a Multi-Layer Perceptron for the Bit-Width Allocation. IEEE Access, 9, 135059–135068.

@article{soufleri2021network,
  title = {Network Compression via Mixed Precision Quantization Using a Multi-Layer Perceptron for the Bit-Width Allocation},
  author = {Soufleri, Efstathia and Roy, Kaushik},
  journal = {IEEE Access},
  volume = {9},
  pages = {135059--135068},
  year = {2021},
  publisher = {IEEE},
  doi = {10.1109/ACCESS.2021.3116418},
  file = {soufleri2021network.pdf}
}

Deep Neural Networks (DNNs) are a powerful tool for solving complex tasks in many application domains. The high performance of DNNs demands significant computational resources, which might not always be available. Network quantization with mixed-precision across the layers can alleviate this high demand. However, determining layer-wise optimal bit-widths is non-trivial, as the search space is exponential. This article proposes a novel technique for allocating layer-wise bit-widths for a DNN using a multi-layer perceptron (MLP). The Kullback-Leibler (KL) divergence of the softmax outputs between the quantized and full precision network is used as the metric to quantify the quantization quality. We explore the relationship between the KL-divergence and the network size, and from our experiments observe that more aggressive quantization leads to higher divergence, and vice versa. The MLP is trained with layer-wise bit-widths as labels and their corresponding KL-divergence as the input. The MLP training set, i.e., the pairs of the layer-wise bit-widths and their corresponding KL-divergence, is collected using a Monte Carlo sampling of the exponential search space. We introduce a penalty term in the loss to ensure that the MLP learns to predict bit-widths resulting in the smallest network size. We show that the layer-wise bit-width predictions from the trained MLP result in reduced network size without degrading accuracy while achieving better or comparable results with SOTA work but with less computational overhead. Our method achieves up to 6×, 4×, and 4× compression on VGG16, ResNet50, and GoogLeNet, respectively, with no accuracy drop compared to the original full precision pretrained model, on the ImageNet dataset.
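The quality metric described above, the KL divergence between the softmax outputs of the full precision and quantized networks, can be sketched on a single linear layer. The symmetric uniform quantizer, the random toy weights, and the helper names below are assumptions made for illustration; the article's actual quantization scheme is not specified here.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def quantize_uniform(weights, bits):
    """Symmetric uniform quantization to a given bit-width (assumed scheme)."""
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(weights).max() / levels
    return np.round(weights / scale) * scale

def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over a batch of probability vectors."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

rng = np.random.default_rng(0)
inputs = rng.normal(size=(64, 16))   # toy batch of inputs
weights = rng.normal(size=(16, 10))  # toy full precision layer

reference = softmax(inputs @ weights)
kl_by_bits = {
    bits: kl_divergence(reference, softmax(inputs @ quantize_uniform(weights, bits)))
    for bits in (2, 4, 8)
}
```

In this toy, lower bit-widths yield a larger divergence from the full precision outputs, which mirrors the trend reported in the abstract.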

Refereed conference proceedings

Ravikumar, D., Soufleri, E., & Roy, K. (2025). Curvature clues: Decoding deep learning privacy with input loss curvature. Advances in Neural Information Processing Systems, 37, 20003–20030.

@inproceedings{ravikumar2025curvature,
  title = {Curvature clues: Decoding deep learning privacy with input loss curvature},
  author = {Ravikumar, Deepak and Soufleri, Efstathia and Roy, Kaushik},
  booktitle = {Advances in Neural Information Processing Systems},
  volume = {37},
  pages = {20003--20030},
  year = {2025},
  file = {NeurIPS-2024-curvature-clues-decoding-deep-learning-privacy-with-input-loss-curvature-Paper-Conference.pdf}
}

In this paper, we explore the properties of loss curvature with respect to input data in deep neural networks. Curvature of loss with respect to input (termed input loss curvature) is the trace of the Hessian of the loss with respect to the input. We investigate how input loss curvature varies between train and test sets, and its implications for train-test distinguishability. We develop a theoretical framework that derives an upper bound on the train-test distinguishability based on privacy and the size of the training set. This novel insight fuels the development of a new black box membership inference attack utilizing input loss curvature. We validate our theoretical findings through experiments in computer vision classification tasks, demonstrating that input loss curvature surpasses existing methods in membership inference effectiveness. Our analysis highlights how the performance of membership inference attack (MIA) methods varies with the size of the training set, showing that curvature-based MIA outperforms other methods on sufficiently large datasets. This condition is often met by real datasets, as demonstrated by our results on CIFAR10, CIFAR100, and ImageNet. These findings not only advance our understanding of deep neural network behavior but also improve the ability to test privacy-preserving techniques in machine learning.
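To make the definition of input loss curvature concrete (the trace of the Hessian of the loss with respect to the input), the sketch below estimates a Hessian trace with Hutchinson's estimator and finite differences of the gradient. The choice of estimator, the toy quadratic loss, and the names `hutchinson_trace` and `toy_grad` are assumptions for illustration; the sketch also assumes gradient access, so it illustrates the quantity itself rather than the black box attack from the paper.

```python
import numpy as np

def hutchinson_trace(grad_fn, x, num_samples=2000, h=1e-3, seed=0):
    """Estimate trace(H) at x, where H is the Hessian of the loss w.r.t. x.

    Uses E[v^T H v] = trace(H) for Rademacher probes v, with the product
    H @ v approximated by a finite difference of gradients.
    """
    rng = np.random.default_rng(seed)
    base_grad = grad_fn(x)
    total = 0.0
    for _ in range(num_samples):
        v = rng.choice([-1.0, 1.0], size=x.shape)  # Rademacher probe
        hv = (grad_fn(x + h * v) - base_grad) / h  # approximates H @ v
        total += float(v @ hv)
    return total / num_samples

# Toy loss L(x) = 0.5 * x^T A x, whose Hessian w.r.t. x is exactly A.
A = np.array([[2.0, 0.3, 0.0],
              [0.3, 1.0, -0.2],
              [0.0, -0.2, 3.0]])

def toy_grad(x):
    """Gradient of the toy quadratic loss."""
    return A @ x

x0 = np.array([0.5, -1.0, 2.0])
estimate = hutchinson_trace(toy_grad, x0)
exact = float(np.trace(A))
```

For this quadratic toy the finite difference is exact up to rounding, so the remaining error in `estimate` comes only from the random probes and shrinks as `num_samples` grows.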

Ravikumar, D., Soufleri, E., Hashemi, A., & Roy, K. (2024). Unveiling Privacy, Memorization, and Input Curvature Links. Proceedings of the 41st International Conference on Machine Learning, 42192–42212.

@inproceedings{ravikumar2024unveiling,
  title = {Unveiling Privacy, Memorization, and Input Curvature Links},
  author = {Ravikumar, Deepak and Soufleri, Efstathia and Hashemi, Abolfazl and Roy, Kaushik},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {42192--42212},
  year = {2024},
  file = {ravikumar24a.pdf}
}

Deep Neural Nets (DNNs) have become a pervasive tool for solving many emerging problems. However, they tend to overfit to and memorize the training set. Memorization is of keen interest since it is closely related to several concepts such as generalization, noisy learning, and privacy. To study memorization, Feldman (2019) proposed a formal score; however, its computational requirements limit its practical use. Recent research has shown empirical evidence linking input loss curvature (measured by the trace of the loss Hessian w.r.t. inputs) and memorization. It was shown to be 3 orders of magnitude more efficient than calculating the memorization score. However, there is a lack of theoretical understanding linking memorization with input loss curvature. In this paper, we not only investigate this connection but also extend our analysis to establish theoretical links between differential privacy, memorization, and input loss curvature. First, we derive an upper bound on memorization characterized by both differential privacy and input loss curvature. Second, we present a novel insight showing that input loss curvature is upper-bounded by the differential privacy parameter. Our theoretical findings are further empirically validated using deep models on CIFAR and ImageNet datasets, showing a strong correlation between our theoretical predictions and results observed in practice.

Kosta, A., Soufleri, E., Chakraborty, I., Agrawal, A., Ankit, A., & Roy, K. (2022). HyperX: A Hybrid RRAM-SRAM partitioned system for error recovery in memristive Xbars. 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), 88–91.

@inproceedings{kosta2022hyperx,
  title = {HyperX: A Hybrid RRAM-SRAM partitioned system for error recovery in memristive Xbars},
  author = {Kosta, Adarsh and Soufleri, Efstathia and Chakraborty, Indranil and Agrawal, Amogh and Ankit, Aayush and Roy, Kaushik},
  booktitle = {2022 Design, Automation \& Test in Europe Conference \& Exhibition (DATE)},
  pages = {88--91},
  year = {2022},
  organization = {IEEE},
  doi = {10.23919/DATE54114.2022.9774549}
}

Memristive crossbars based on Non-volatile Memory (NVM) technologies such as RRAM have recently shown great promise for accelerating Deep Neural Networks (DNNs). They achieve this by performing efficient Matrix-Vector-Multiplications (MVMs) while offering dense on-chip storage and minimal off-chip data movement. However, their analog nature of computing introduces functional errors due to non-ideal RRAM devices, significantly degrading the application accuracy. Further, RRAMs suffer from low endurance and high write costs, hindering on-chip trainability. To alleviate these limitations, we propose HyperX, a hybrid RRAM-SRAM system that leverages the complementary benefits of NVM and CMOS technologies. Our proposed system consists of a fixed RRAM block offering area and energy-efficient MVMs and an SRAM block enabling on-chip training to recover the accuracy drop due to the RRAM non-idealities. The improvements are reported in terms of energy and the product of latency and area (ms×mm²), termed area-normalized latency. Our experiments on CIFAR datasets using ResNet-20 show up to 2.88× and 10.1× improvements in inference energy and area-normalized latency, respectively. In addition, for a transfer learning task from ImageNet to CIFAR datasets using ResNet-18, we observe up to 1.58× and 4.48× improvements in energy and area-normalized latency, respectively. These improvements are with respect to an all-SRAM baseline.

Panda, P., Soufleri, E., & Roy, K. (2019). Evaluating the Stability of Recurrent Neural Models during Training with Eigenvalue Spectra Analysis. 2019 International Joint Conference on Neural Networks (IJCNN), 1–8.

@inproceedings{panda2019evaluating,
  title = {Evaluating the Stability of Recurrent Neural Models during Training with Eigenvalue Spectra Analysis},
  author = {Panda, Priyadarshini and Soufleri, Efstathia and Roy, Kaushik},
  booktitle = {2019 International Joint Conference on Neural Networks (IJCNN)},
  pages = {1--8},
  year = {2019},
  organization = {IEEE},
  doi = {10.1109/IJCNN.2019.8852181},
  file = {panda2019evaluating.pdf}
}

We analyze the stability of recurrent networks, specifically reservoir computing models, during training by evaluating the eigenvalue spectra of the reservoir dynamics. To circumvent the instability arising in examining a closed loop reservoir system with feedback, we propose to break the closed loop system. Essentially, we unroll the reservoir dynamics over time while incorporating the feedback effects that preserve the overall temporal integrity of the system. We evaluate our methodology for fixed point and time varying targets with least squares regression and FORCE training, respectively. Our analysis establishes eigenvalue spectra (that is, the shrinking of the spectral circle as training progresses) as a valid and effective metric to gauge the convergence of training as well as the convergence of the chaotic activity of the reservoir toward stable states.
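The metric discussed above, the radius of the eigenvalue spectrum of the reservoir dynamics, can be computed for any square matrix as sketched below. The random Gaussian reservoir, the gain of 1.5, and the explicit rescaling are assumptions for illustration; the rescaling merely stands in for training to show how the metric is read, and does not reproduce the unrolled feedback analysis from the paper.

```python
import numpy as np

def spectral_radius(matrix):
    """Largest eigenvalue magnitude, i.e. the radius of the spectral circle."""
    return float(np.max(np.abs(np.linalg.eigvals(matrix))))

rng = np.random.default_rng(0)
n = 200      # number of reservoir units (toy size)
gain = 1.5   # gain above 1 is commonly associated with chaotic activity

# Entries with variance gain^2 / n give eigenvalues that roughly fill a disk
# of radius close to `gain` in the complex plane (circular law).
W = rng.normal(0.0, gain / np.sqrt(n), size=(n, n))
radius_before = spectral_radius(W)

# Shrink the spectrum to radius 0.9; a stand-in for the effect of training.
W_rescaled = W * (0.9 / radius_before)
radius_after = spectral_radius(W_rescaled)
```

Tracking `spectral_radius` across training checkpoints would give one scalar per checkpoint, which is the kind of signal the abstract proposes for gauging convergence.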
