I am a (happy!) graduate student in Julie Shah's Interactive Robotics Group (IRG) at MIT. My research focuses primarily on the interpretability of AI systems. This work takes the form of custom-built neural models for particular tasks like fair classification, probes to understand the linguistic properties of NLP models, and representation learning aimed at human understanding.
I'm currently studying methods for teaching rules to neural nets and extracting rules from them: two sides of the same coin for control and interpretability. The main methods I employ are causal probing techniques and custom neural architectures that better support human understanding. Much of my work is situated in language-adjacent tasks because of the rich interplay between structure (e.g., linguistics and grammar) and intuition (e.g., how an agent learns the meaning of a word).