Archna Bhatia

Research Scientist

Dr. Archna Bhatia is a Research Scientist in Speech and Natural Language Processing at IHMC Ocala. She has held postdoctoral researcher positions at the Language Technologies Institute at Carnegie Mellon University, and at the University of Colorado at Boulder. She earned her Ph.D. in linguistics from the University of Illinois at Urbana-Champaign in 2011.

Dr. Bhatia’s research focuses on (i) speech and natural language processing for health applications; (ii) natural language processing (NLP) for applications such as cybersecurity, information extraction and cognitive modeling; and (iii) understanding and modeling linguistic constructions and phenomena to improve semantic parsing and natural language understanding.

In the health domain, she has worked on developing noninvasive techniques for detecting and monitoring physiological, psychological or neurological conditions. For example, she has developed an approach to detect stress and predict individuals’ responses to stress based on the speech and language they produce (Bhatia et al., 2021). Previously, she developed a noninvasive speech-based method for detecting and monitoring ALS based on divergence from asymptomatic speech (Bhatia et al., 2017a; Bhatia et al., 2017c).

In other applications, she develops NLP-based systems that integrate human-acquired knowledge, such as linguistics and social psychology, with machine learning (ML). For example, using generative Large Language Models (LLMs) along with NLP and conventional ML techniques, her team is currently developing an intelligent cognitive assistant that supports word retrieval for older adults with incipient Alzheimer’s disease and related dementias. She has also been developing a system that uses NLP and social psychology to extract individuals’ beliefs and sentiments from the textual content they produce on social media in order to predict behavior (Pirolli et al., 2020). On a NASA-funded project, she worked on a system to assist the crew on a space mission: using NLP- and ML-based analyses, it automatically identifies which step of a technical procedure operators are on, based on their conversations and on information extracted from technical manuals, and provides recommendations accordingly. Previously, she worked on a team that developed a human language technology pipeline for active defenses against social engineering attacks, drawing on NLP, computational sociolinguistics and metadata analysis (Bhatia et al., 2020; Dalton et al., 2020; Dorr et al., 2020).

To compute deep semantic representations of sentences, she has worked on capturing the richness of lexical semantics, focusing on verb-particle constructions (a type of multiword expression) and using lexical resources such as WordNet. She has also worked on incorporating the acquired knowledge into an ontology and lexicon to improve semantic parsing (Bhatia et al., 2018; Bhatia et al., 2017b).

For an ONR-funded project, she has also applied signal processing and machine learning techniques from the speech domain to breath sounds for hypercapnia detection, respiratory minute volume identification, gas condition detection and depth identification.

Research funding sources: NIA through MassAITC (PI), NASA STTR (Co-Investigator), ONR (Co-PI), DARPA, IARPA, NIH, NSF, Tampa VA

Research Community Engagement: Dr. Bhatia is currently serving as a Senior Area Chair for the “Resources and Evaluation” track for NAACL 2025. She recently served as a Senior Area Chair for the Corpora and Annotation Track at the Joint International LREC-COLING 2024 Conference. Prior to that, she was a nominated officer (2021-2023) on the Standing Committee for the SIGLEX-MWE Section of the Special Interest Group on the Lexicon of the Association for Computational Linguistics (ACL), and now serves on the Advisory Committee of the Section. She led the development of the Hindi corpus for the PARSEME Shared Task on automatic identification of verbal multiword expressions. She has co-organized several workshops, listed below:

  • Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD 2024) @ LREC-COLING 2024 [Currently organizing]
  • 19th Workshop on Multiword Expressions (MWE 2023) @ EACL 2023
  • 18th Workshop on Multiword Expressions (MWE 2022) @ LREC 2022
  • First International Workshop on Social Threats in Online Conversations (STOC 2020) @ LREC 2020
  • Third International Workshop on Spatial Language Understanding (SpLU 2020) @ EMNLP 2020
  • Combined Workshop on Spatial Language Understanding & Grounded Communication for Robotics (SpLU-RoboNLP 2019) @ NAACL-HLT 2019
  • First International Workshop on Spatial Language Understanding (SpLU 2018) @ NAACL-HLT 2018