Ellis Brown

Profile
I'm a PhD candidate in Computer Vision and Machine Learning at New York University, currently working on multimodal AI systems and representation learning. My research focuses on developing more capable and efficient vision-language models that can better understand and interact with the real world.

I work closely with Professor Saining Xie and collaborate with researchers at Meta AI. My recent projects include Cambrian-1, a comprehensive exploration of vision-centric multimodal LLMs, and Internet Explorer, a novel approach for targeted representation learning using the open web. I'm particularly interested in how we can leverage internet-scale data and self-supervised learning to build more robust visual understanding systems.

My work spans several interconnected areas, including multimodal learning, self-supervised representation learning, and zero-shot generalization. I aim to develop AI systems that bridge the gap between the digital and physical worlds while remaining transparent and reproducible. A key focus is making these systems more efficient and accessible to the broader research community through open-source implementations and comprehensive documentation.

Publications

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Shengbang Tong, Ellis L Brown, Penghao Wu, Sanghyun Woo, Manoj Middepogu, Sai Charitha Akula, Jihan Yang, Shusheng Yang, Adithya Iyer, Xichen Pan, Austin Wang, Rob Fergus, Yann LeCun, Saining Xie

NeurIPS 2024 (Oral)

V-IRL: Grounding Virtual Intelligence in Real Life

Jihan Yang, Runyu Ding, Ellis L Brown, Xiaojuan Qi, Saining Xie

ECCV 2024

Your Diffusion Model is Secretly a Zero-Shot Classifier

Alexander C. Li, Mihir Prabhudesai, Shivam Duggal, Ellis L Brown, Deepak Pathak

ICCV 2023

Internet Explorer: Targeted Representation Learning on the Open Web

Alexander C. Li, Ellis L Brown, Alexei A. Efros, Deepak Pathak

ICML 2023

Internet Curiosity: Directed Unsupervised Learning on Uncurated Internet Data

Alexander C. Li, Ellis L Brown, Alexei A. Efros, Deepak Pathak

ECCV Workshops 2022

SpatioTemporal Template-based Search: An Architecture to Model Human Search for Spatiotemporal Targets

Ellis L Brown, N. Warford, M. Kunda

Advances in Cognitive Systems