About me
Howdy! I am a research scientist at Google Research. I am interested in developing provably better optimization algorithms and generalization-improving techniques for machine learning (ML), especially under data-centric constraints. In general, I like to develop theoretically grounded ML algorithms.
I recently completed my PhD in Computer Science at UT Austin, advised by Prof. Sujay Sanghavi and Prof. Inderjit S. Dhillon. Before that, I received a combined B.Tech. and M.Tech. degree in Electrical Engineering from IIT Bombay, where I worked with Prof. Subhasis Chaudhuri and received the Undergraduate Research Award.
You can check out my outdated CV here. My email is rdas(at)utexas(dot)edu.
Papers
“Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting” - S Sanyal*, H Prairie*, R Das*, A Kavis*, and S Sanghavi (* denotes equal contribution).
Preprint. Download here.
“Retraining with Predicted Hard Labels Provably Increases Model Accuracy” - R Das, I S Dhillon, A Epasto, A Javanmard, J Mao, V Mirrokni, S Sanghavi, and P Zhong.
Preprint. Download here.
“Towards Quantifying the Preconditioning Effect of Adam” - R Das, N Agarwal, S Sanghavi, and I S Dhillon.
Preprint. Download here.
“Understanding the Training Speedup from Sampling with Approximate Losses” - R Das, X Chen, B Ieong, P Bansal, and S Sanghavi.
ICML 2024. Download paper here.
“Understanding Self-Distillation in the Presence of Label Noise” - R Das and S Sanghavi.
ICML 2023. Download paper here.
“On the Unreasonable Effectiveness of Federated Averaging with Heterogeneous Data” - J Wang, R Das, G Joshi, S Kale, Z Xu, and T Zhang.
TMLR. Download paper here.
“Beyond Uniform Lipschitz Condition in Differentially Private Optimization” - R Das, S Kale, Z Xu, T Zhang, and S Sanghavi.
ICML 2023. Download paper here.
“Differentially Private Federated Learning with Normalized Updates” - R Das, A Hashemi, S Sanghavi, and I S Dhillon.
Download preprint here. A short version was presented at the OPT2022 workshop of NeurIPS 2022; download here.
“Faster Non-Convex Federated Learning via Global and Local Momentum” - R Das, A Acharya, A Hashemi, S Sanghavi, I S Dhillon, and U Topcu.
“On the Benefits of Multiple Gossip Steps in Communication-Constrained Decentralized Optimization” - A Hashemi, A Acharya*, R Das*, H Vikalo, S Sanghavi, and I S Dhillon (* denotes equal contribution).
IEEE Transactions on Parallel and Distributed Systems. Download paper here and preprint here.
“On the Convergence of a Biased Version of Stochastic Gradient Descent” - R Das, J Zhang, and I S Dhillon.
NeurIPS 2019 Beyond First Order Methods in ML workshop. Download paper here.
“On the Separability of Classes with the Cross-Entropy Loss Function” - R Das and S Chaudhuri.
Preprint. Download here.
“Nonlinear Blind Compressed Sensing under Signal-Dependent Noise” - R Das and A Rajwade.
IEEE International Conference on Image Processing (ICIP) 2019. Download paper here.
“Sparse Kernel PCA for Outlier Detection” - R Das, A Golatkar, and S Awate.
IEEE International Conference on Machine Learning and Applications (ICMLA) 2018 Oral. Download paper here.
iFood Challenge, FGVC Workshop, CVPR 2018 - P Kothari*, A Sadhu*, A Golatkar*, and R Das* (* denotes equal contribution).
Finished 2nd on the public leaderboard and 3rd on the private leaderboard (Team name: Invincibles). Leaderboard Link. Invited to present our method at CVPR 2018 (slides).
Internships
- Student Researcher at Google Research, New York City, NY (June ‘24 - August ‘24)
  Host: Kyriakos Axiotis
  - Worked on improving the quality of pruned large language models.
- Student Researcher at Google Research (Remote) (November ‘23 - March ‘24)
  Host: Alessandro Epasto
  - Theoretically showed that retraining with predicted hard labels improves model accuracy in the presence of label noise, and empirically showed that appropriate retraining can significantly improve training with label differential privacy.
- Student Researcher at Google DeepMind, Princeton, NJ (June ‘23 - October ‘23)
  Host: Naman Agarwal
  - Derived new theoretical results to quantify the preconditioning effect of the Adam optimizer, and empirically benchmarked several optimization algorithms based on Adam.
- Research Intern at Google (Remote) (June ‘21 - August ‘21)
  Hosts: Zheng Xu, Satyen Kale, and Tong Zhang
  - Clipped gradient methods such as DP-SGD are commonly used in practice for differentially private (DP) training, but a sound theoretical understanding of them has been elusive. Provided principled guidance on choosing the clipping threshold in DP-SGD and derived novel convergence results for DP-SGD in heavy-tailed settings.
- Applied Scientist Intern at Amazon Search, Berkeley, CA (May ‘20 - August ‘20)
  Mentor: Dan Hill; Manager: Sujay Sanghavi
  - Worked on customer-specific query correction by leveraging “session data” (i.e., a customer’s previous searches) with state-of-the-art Transformer models. Our model generated better candidates than the production system.
- Institute for Biomechanics, ETH Zürich, Zürich, Switzerland (May ‘17 - July ‘17)
  Guides: Dr. Patrik Christen and Prof. Dr. Ralph Müller, D-HEST
  - Proposed a stable linear model (with a closed-form solution) and a fuzzy Boolean network for bone remodeling. Also developed an automated 2D-3D image registration framework for histology images from scratch.