Sheikh Shams Azam

Apple  | Purdue | NITK

Updated: 05/2024

July 2022 - Present Apple, Cambridge, MA
Research Scientist, AIML Resident

  • Understanding Adaptive Optimizers in Federated Learning for E2E ASR
  • To address research gaps in Federated Learning (FL) for End-to-End Automatic Speech Recognition (E2E ASR), we review conventional FL research and show that adaptive optimizers enhance coordination among heterogeneous client updates, and that some optimizers induce a smoother optimization subspace, thus improving the performance of the FL system (see Publications - C6).
  • Federated Learning with Differential Privacy for End-to-End Speech Recognition
  • We take first steps toward on-device training for ASR with federated learning and differential privacy, aiming to preserve user privacy while improving ASR with the additional data and context available on device (see Publications - P1).

August 2019 - July 2022 Purdue University
Graduate Research Assistant (RA)

  • Learning from Partially-observed Multimodal Data (Sponsored by HP Lab)
  • Developing unsupervised techniques to learn from partially observed multimodal datasets. The aim is to learn high-quality latent representations of datasets with missing modalities using self-supervised and unsupervised techniques.
  • Efficient Clustering of Documents in Clustered Vector Spaces (Sponsored by HP Lab)
  • Developed a novel technique for interpretable document clustering to help curate data for downstream applications such as personalization of marketing campaigns for focus groups.
  • Federated Private Representation Learning (Sponsored by Northrop Grumman)
  • Private representation learning with Generative Adversarial Networks (GANs) on distributed datasets, using differentially private parameter sharing (see Publications - C3).
  • Link Prediction in Social Learning Networks
  • Link prediction based on graph neural networks for connection recommendations in social learning networks, using both learned and explicit network metrics.
  • Publications and projects
    • Conference publications: C4 - C2
    • Journal publications: J3 - J2
    • Workshop publications: W1
    • Projects: P8 - P5

June 2021 - August 2021 Zillow Group, Seattle, WA
Applied Scientist-Intern

  • Unsupervised Multimodal Representation Learning and Finetuning for Document Understanding and Scene Attribute Recognition
    • Developed an unsupervised multimodal representation learning framework (covering visual, linguistic, spatial, and other modalities) that leverages unlabeled raw documents (e.g. property documents) and weakly labeled image datasets (e.g. listing images and descriptions).
    • The learned representations help improve performance on several downstream few-shot learning tasks, including sequence classification, token classification, image attribute detection, and localization.

September 2018 - August 2019 Foundation AI, Los Angeles, CA
Research Scientist

  • Computer Vision and NLP for Document Understanding
    • Developed novel CV methods for document analysis and OCR using GANs, CNNs, and graph convolutions.
    • Developed an NLP model for key-value pair extraction that applies link prediction techniques to unstructured documents.
  • Publications and projects
    • Conference publications: C1
    • Projects: O5

June 2015 - August 2018 Practo Technologies, Bangalore, India
Data Scientist, Senior Software Engineer, Software Engineer

  • Computer Vision for Medical Imaging
    • Developed novel CV models for diagnosing lung cancer, brain tumors, and diabetic retinopathy from radiology images.
    • Developed NLP solutions using LSTM and attention-based deep learning methods for 90,000-class classification.
    • Developed a semi-supervised text classifier for highlighting important phrases in clinical documents.
  • NLP for Automated Medical Document Annotation
    • Developed methods such as Q-Map, for fast retrieval of concepts from text documents (see J1), and CASCADENET, a hierarchical deep neural network for massively categorical classification (see C1), for automated ICD-10 (International Classification of Diseases) and CPT (Current Procedural Terminology) coding of clinical documents.
    • Q-Map and CASCADENET play a vital role in the development of subsequent clinical decision support systems and other solutions, with applications in insurance claims automation and in disease trend and epidemic outbreak characterization.
  • Publications and projects
    • Journal publications: J1
    • Projects: O4, O1