2504.08727
Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images
11 April 2025
Boyang Deng
Songyou Peng
Kyle Genova
Gordon Wetzstein
Noah Snavely
Leonidas Guibas
Thomas Funkhouser
HAI
Papers citing
"Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images"
31 / 31 papers shown
Title
Organizing Unstructured Image Collections using Natural Language
Mingxuan Liu
Zhun Zhong
Jun Li
Gianni Franchi
Subhankar Roy
Elisa Ricci
VLM
547
9
0
07 Oct 2024
Explaining Datasets in Words: Statistical Models with Natural Language Parameters
Neural Information Processing Systems (NeurIPS), 2024
Ruiqi Zhong
Heng Wang
Dan Klein
Jacob Steinhardt
163
10
0
13 Sep 2024
Diffusion Models as Data Mining Tools
Ioannis Siglidis
Aleksander Holynski
Alexei A. Efros
Mathieu Aubry
Shiry Ginosar
DiffM
MedIm
164
4
0
20 Jul 2024
Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion
Boyang Deng
Richard Tucker
Zhengqi Li
Leonidas Guibas
Noah Snavely
Gordon Wetzstein
VGen
3DGS
DiffM
180
26
0
18 Jul 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
Mohammad Shoeybi
Bryan Catanzaro
Ming-Yu Liu
RALM
443
337
0
27 May 2024
Discover and Mitigate Multiple Biased Subgroups in Image Classifiers
Zeliang Zhang
Mingqian Feng
Zhiheng Li
Chenliang Xu
213
11
0
19 Mar 2024
Rethinking Interpretability in the Era of Large Language Models
Chandan Singh
J. Inala
Michel Galley
Rich Caruana
Jianfeng Gao
LRM
AI4CE
168
97
0
30 Jan 2024
Describing Differences in Image Sets with Natural Language
Computer Vision and Pattern Recognition (CVPR), 2023
Lisa Dunlap
Yuhui Zhang
Xiaohan Wang
Ruiqi Zhong
Trevor Darrell
Jacob Steinhardt
Joseph E. Gonzalez
Serena Yeung-Levy
CoGe
VLM
228
43
0
05 Dec 2023
Can large language models provide useful feedback on research papers? A large-scale empirical analysis
Weixin Liang
Yuhui Zhang
Hancheng Cao
Binglu Wang
Daisy Ding
...
Siyu He
D. Smith
Yian Yin
Daniel A. McFarland
James Y. Zou
ALM
LM&MA
176
204
0
03 Oct 2023
Prototype-based Dataset Comparison
IEEE International Conference on Computer Vision (ICCV), 2023
Nanne van Noord
151
10
0
05 Sep 2023
Changes to Captions: An Attentive Network for Remote Sensing Change Captioning
IEEE Transactions on Image Processing (IEEE TIP), 2023
Shizhen Chang
Pedram Ghamisi
123
63
0
03 Apr 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
2.7K
19,069
0
15 Mar 2023
Goal Driven Discovery of Distributional Differences via Language Descriptions
Neural Information Processing Systems (NeurIPS), 2023
Ruiqi Zhong
Peter Zhang
Steve Li
Jinwoo Ahn
Dan Klein
Jacob Steinhardt
186
59
0
28 Feb 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
International Conference on Machine Learning (ICML), 2023
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
892
6,060
0
30 Jan 2023
What's in a Decade? Transforming Faces Through Time
Eric Chen
Jin Sun
Apoorv Khandelwal
Dani Lischinski
Noah Snavely
Hadar Averbuch-Elor
162
8
0
13 Oct 2022
PaLI: A Jointly-Scaled Multilingual Language-Image Model
International Conference on Learning Representations (ICLR), 2022
Xi Chen
Tianlin Li
Soravit Changpinyo
A. Piergiovanni
Piotr Padlewski
...
Andreas Steiner
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
MLLM
VLM
533
866
0
14 Sep 2022
GSCLIP: A Framework for Explaining Distribution Shifts in Natural Language
Zhiying Zhu
Weixin Liang
James Zou
118
11
0
30 Jun 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Neural Information Processing Systems (NeurIPS), 2022
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
582
4,461
0
29 Apr 2022
Image Difference Captioning with Pre-training and Contrastive Learning
AAAI Conference on Artificial Intelligence (AAAI), 2022
Linli Yao
Weiying Wang
Qin Jin
SSL
VLM
137
50
0
09 Feb 2022
Learning Transferable Visual Models From Natural Language Supervision
International Conference on Machine Learning (ICML), 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
1.7K
37,939
0
26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
International Conference on Machine Learning (ICML), 2021
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
995
4,601
0
11 Feb 2021
Discovering Visual Patterns in Art Collections with Spatially-consistent Feature Learning
Xi Shen
Alexei A. Efros
Mathieu Aubry
SSL
121
90
0
07 Mar 2019
A Closer Look at Spatiotemporal Convolutions for Action Recognition
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
396
3,285
0
30 Nov 2017
StreetStyle: Exploring world-wide clothing styles from millions of photos
Kevin Blackburn-Matzen
Kavita Bala
Noah Snavely
121
92
0
06 Jun 2017
Using Deep Learning and Google Street View to Estimate the Demographic Makeup of the US
Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2017
Timnit Gebru
J. Krause
Yilun Wang
Duyun Chen
Gaowen Liu
Erez Lieberman Aiden
Li Fei-Fei
HAI
184
439
0
22 Feb 2017
3D Time-lapse Reconstruction from Internet Photos
Ricardo Martín Brualla
D. Gallup
S. M. Seitz
138
23
0
10 Nov 2015
A Century of Portraits: A Visual Historical Record of American High School Yearbooks
Shiry Ginosar
Kate Rakelly
Sarah Sachs
Brian Yin
Crystal Lee
Philipp Krahenbuhl
Alexei A. Efros
134
123
0
09 Nov 2015
Transfer Learning from Deep Features for Remote Sensing and Poverty Mapping
Sang Michael Xie
Neal Jean
Marshall Burke
David B. Lobell
Stefano Ermon
173
438
0
01 Oct 2015
Deep Visual-Semantic Alignments for Generating Image Descriptions
Computer Vision and Pattern Recognition (CVPR), 2014
A. Karpathy
Li Fei-Fei
418
5,809
0
07 Dec 2014
Show and Tell: A Neural Image Caption Generator
Computer Vision and Pattern Recognition (CVPR), 2014
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
532
6,288
0
17 Nov 2014
Recognizing Image Style
British Machine Vision Conference (BMVC), 2013
Sergey Karayev
Matthew Trentacoste
Helen Han
A. Agarwala
Trevor Darrell
Aaron Hertzmann
Holger Winnemoeller
166
475
0
15 Nov 2013