Olivia Grace Watkins

oliviawatkins @ berkeley . edu

Who am I?

I am a SOTA neural network.

  • I have been training in a continual learning setting for more than two decades.
  • In 2019, I did rapid domain adaptation to the OOD environment of Berkeley grad school.
  • I have successfully learned collaboration in the multi-agent environment of BAIR (Berkeley AI Research).
  • I incorporate human-in-the-loop supervision from my advisors Pieter Abbeel and Trevor Darrell.
  • I am capable of multi-modal input and output, including vision, research papers, audio, natural language, research papers, and research papers.
  • I'm robust against all adversarial inputs except chocolate.
  • I achieve near-human performance on all Atari games.

Reviewer Concerns:

  • Approach is not replicable; has only been run on one seed.
  • There are serious privacy concerns with the online data collection method, which includes substantial personally identifying information.
  • Algorithm may incorporate human biases.
  • Source code has been released but is unintelligible; uses only four variable names (ATCG).
  • Couldn't you just use a transformer for this?

What are my research interests?

I'm excited about designing agents which can learn from humans, reason correctly in language, solve open-ended problems, and act safely and reliably in the world. Interesting research questions in this space include:

  • How can we design agents which can learn efficiently from supervision (both from humans and (V)LLMs with common-sense understanding)?
  • Can designing agents which reason in language enable generalization and make it easier for humans to supervise and correct agents?
  • How can we enable language agents to learn from experience while maintaining correct, common-sense reasoning?
  • How can we design agents which can act safely and robustly on the web (and in similar sensitive envs), especially in the presence of adversaries?

Do you have a life outside of research?

In my spare time I play Quidditch and D&D, hang out with friends, make mediocre puns, and procrastinate on keeping my website up to date.


  1. Under Review ICML
    A StrongREJECT for Empty Jailbreaks
    Alexandra Souly*, Qingyuan Lu*, Dillon Bowen*, Tu Trinh, Elvis Hsieh, Sana Pandey, Pieter Abbeel, Justin Svegliato, Scott Emmons*, Olivia Watkins*, Sam Toyer*
    Under Review at ICML 2024
  2. Under Review ICML
    Learning to Model the World with Language
    Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, and Anca Dragan
    Under Review at ICML 2024
  3. ICLR 2024
    Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game
    Sam Toyer, Olivia Watkins, Ethan Adrian Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, and Stuart Russell
    ICLR 2024
  4. ICML
    Guiding Pretraining in Reinforcement Learning with Large Language Models
    Du*, Yuqing; Watkins*, Olivia; Wang, Zihan; Colas, Cédric; Darrell, Trevor; Abbeel, Pieter; Gupta, Abhishek; and Andreas, Jacob
    ICML 2023
  5. NeurIPS
    DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models
    Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, Moonkyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee
    NeurIPS 2023
  6. arXiv
    Aligning Text-to-Image Models using Human Feedback
    Lee, Kimin; Liu, Hao; Ryu, Moonkyung; Watkins, Olivia; Du, Yuqing; Boutilier, Craig; Abbeel, Pieter; Ghavamzadeh, Mohammad; Gu, Shixiang Shane
    arXiv 2023
  7. NeurIPS
    Teachable Reinforcement Learning via Advice Distillation
    Watkins, Olivia; Darrell, Trevor; Abbeel, Pieter; Andreas, Jacob; and Gupta, Abhishek
    NeurIPS 2021
  8. ICRA
    Auto-Tuned Sim-to-Real Transfer
    Du*, Yuqing; Watkins*, Olivia; Darrell, Trevor; Abbeel, Pieter; and Pathak, Deepak
    ICRA 2021
  9. ICML Workshop
    Explaining Reinforcement Learning Policies through Counterfactual Trajectories
    Frost, Julius; Watkins, Olivia; Weiner, Eric; Abbeel, Pieter; Darrell, Trevor; Plummer, Bryan; and Saenko, Kate
    ICML Workshop on Human in the Loop Learning 2021
  10. ICNLP
    Hierarchical text generation using an outline
    Drissi, Mehdi; Watkins, Olivia; and Kalita, Jugal
    International Conference on Natural Language Processing 2018
  11. ICML Workshop
    Program language translation using a grammar-driven tree-to-tree model
    Drissi*, Mehdi; Watkins*, Olivia; Khant, Aditya; Ojha, Vivaswat; Sandoval, Pedro; Segev, Rakia; Weiner, Eric; and Keller, Robert
    ICML Workshop on Neural Abstract Machines & Program Induction 2018

Please contact me by email!