- Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation. Zibo Zhao, Wen Liu, Xin Chen, Xianfang Zeng, Rui Wang, Pei Cheng, Bin Fu, Tao Chen, Gang Yu, Shenghua Gao. arXiv 2023. 2023-06-29
- PFB-Diff: Progressive Feature Blending Diffusion for Text-driven Image Editing. Wenjing Huang, Shikui Tu, Lei Xu. arXiv 2023. 2023-06-28
- Decompose and Realign: Tackling Condition Misalignment in Text-to-Image Diffusion Models. Luozhou Wang, Guibao Shen, Yijun Li, Ying-cong Chen. arXiv 2023. 2023-06-26
- A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis. Aishwarya Agarwal, Srikrishna Karanam, K J Joseph, Apoorv Saxena, Koustava Goswami, Balaji Vasan Srinivasan. arXiv 2023. 2023-06-26
- DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models. Ximing Xing, Chuang Wang, Haitao Zhou, Jing Zhang, Qian Yu, Dong Xu. arXiv 2023. 2023-06-26
- Zero-shot spatial layout conditioning for text-to-image diffusion models. Guillaume Couairon, Marlène Careil, Matthieu Cord, Stéphane Lathuilière, Jakob Verbeek. arXiv 2023. 2023-06-23
- DreamTime: An Improved Optimization Strategy for Text-to-3D Content Creation. Yukun Huang, Jianan Wang, Yukai Shi, Xianbiao Qi, Zheng-Jun Zha, Lei Zhang. arXiv 2023. 2023-06-21
- RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model. Zilun Zhang, Tiancheng Zhao, Yulong Guo, Jianwei Yin. arXiv 2023. 2023-06-20
- EMoG: Synthesizing Emotive Co-speech 3D Gesture with Diffusion Model. Lianying Yin, Yijun Wang, Tianyu He, Jinming Liu, Wei Zhao, Bohan Li, Xin Jin, Jianxin Lin. arXiv 2023. 2023-06-20
- Align, Adapt and Inject: Sound-guided Unified Image Generation. Yue Yang, Kaipeng Zhang, Yuying Ge, Wenqi Shao, Zeyue Xue, Yu Qiao, Ping Luo. arXiv 2023. 2023-06-20
- Conditional Text Image Generation with Diffusion Models. Yuanzhi Zhu, Zhaohai Li, Tianwei Wang, Mengchao He, Cong Yao. arXiv 2023. 2023-06-19
- Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions. Yuqi Sun, Reian He, Weimin Tan, Bo Yan. arXiv 2023. 2023-06-19
- Point-Cloud Completion with Pretrained Text-to-image Diffusion Models. Yoni Kasten, Ohad Rahamim, Gal Chechik. arXiv 2023. 2023-06-18
- CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models. Hao-Wen Dong, Xiaoyu Liu, Jordi Pons, Gautam Bhattacharya, Santiago Pascual, Joan Serrà, Taylor Berg-Kirkpatrick, Julian McAuley. arXiv 2023. 2023-06-16
- Evaluating the Robustness of Text-to-image Diffusion Models against Real-world Attacks. Hongcheng Gao, Hao Zhang, Yinpeng Dong, Zhijie Deng. arXiv 2023. 2023-06-16
- Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models. Geon Yeong Park, Jeongsol Kim, Beomsu Kim, Sang Wan Lee, Jong Chul Ye. arXiv 2023. 2023-06-16
- Training Multimedia Event Extraction With Generated Images and Captions. Zilin Du, Yunxin Li, Xu Guo, Yidan Sun, Boyang Li. arXiv 2023. 2023-06-15
- Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment. Royi Rassin, Eran Hirsch, Daniel Glickman, Shauli Ravfogel, Yoav Goldberg, Gal Chechik. arXiv 2023. 2023-06-15
- Diffusion Models for Zero-Shot Open-Vocabulary Segmentation. Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht. arXiv 2023. 2023-06-15
- Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis. Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter. arXiv 2023. 2023-06-15
- Taming Diffusion Models for Music-driven Conducting Motion Generation. Zhuoran Zhao, Jinbin Bai, Delong Chen, Debang Wang, Yubo Pan. arXiv 2023. 2023-06-15
- Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation. Yongqi Yang, Ruoyu Wang, Zhihao Qian, Ye Zhu, Yu Wu. arXiv 2023. 2023-06-14
- GBSD: Generative Bokeh with Stage Diffusion. Jieren Deng, Xin Zhou, Hao Tian, Zhihong Pan, Derek Aguiar. arXiv 2023. 2023-06-14
- Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis. Zhiyu Jin, Xuli Shen, Bin Li, Xiangyang Xue. arXiv 2023. 2023-06-14
- Norm-guided latent space exploration for text-to-image generation. Dvir Samuel, Rami Ben-Ari, Nir Darshan, Haggai Maron, Gal Chechik. arXiv 2023. 2023-06-14
- VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing. Paul Couairon, Clément Rambour, Jean-Emmanuel Haugeard, Nicolas Thome. arXiv 2023. 2023-06-14
- Paste, Inpaint and Harmonize via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model. Xin Zhang, Jiaxian Guo, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa. arXiv 2023. 2023-06-13
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation. Shuai Yang, Yifan Zhou, Ziwei Liu, Chen Change Loy. arXiv 2023. 2023-06-13
- InstructP2P: Learning to Edit 3D Point Clouds with Text Instructions. Jiale Xu, Xintao Wang, Yan-Pei Cao, Weihao Cheng, Ying Shan, Shenghua Gao. arXiv 2023. 2023-06-12
- MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images. Junchen Zhu, Huan Yang, Huiguo He, Wenjing Wang, Zixi Tuo, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu. arXiv 2023. 2023-06-12
- Controlling Text-to-Image Diffusion by Orthogonal Finetuning. Zeju Qiu, Weiyang Liu, Haiwen Feng, Yuxuan Xue, Yao Feng, Zhen Liu, Dan Zhang, Adrian Weller, Bernhard Schölkopf. arXiv 2023. 2023-06-12
- Language-Guided Traffic Simulation via Scene-Level Diffusion. Ziyuan Zhong, Davis Rempe, Yuxiao Chen, Boris Ivanovic, Yulong Cao, Danfei Xu, Marco Pavone, Baishakhi Ray. arXiv 2023. 2023-06-10
- Improving Tuning-Free Real Image Editing with Proximal Guidance. Ligong Han, Song Wen, Qi Chen, Zhixing Zhang, Kunpeng Song, Mengwei Ren, Ruijiang Gao, Yuxiao Chen, Di Liu, Qilong Zhangli, Anastasis Stathopoulos, Jindong Jiang, Zhaoyang Xia, Akash Srivastava, Dimitris Metaxas. arXiv 2023. 2023-06-08
- SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions. Yuseung Lee, Kunho Kim, Hyunjin Kim, Minhyuk Sung. 2023-06-08
- Grounded Text-to-Image Synthesis with Attention Refocusing. Quynh Phung, Songwei Ge, Jia-Bin Huang. arXiv 2023. 2023-06-08
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping. Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Lingjie Liu, Josh Susskind. arXiv 2023. 2023-06-08
- Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance. Gihyun Kwon, Jong Chul Ye. arXiv 2023. 2023-06-07
- Integrating Geometric Control into Text-to-Image Diffusion Models for High-Quality Detection Data Generation via Text Prompt. Kai Chen, Enze Xie, Zhe Chen, Lanqing Hong, Zhenguo Li, Dit-Yan Yeung. arXiv 2023. 2023-06-07
- Multi-modal Latent Diffusion. Mustapha Bounoua, Giulio Franzese, Pietro Michiardi. arXiv 2023. 2023-06-07
- Designing a Better Asymmetric VQGAN for StableDiffusion. Zixin Zhu, Xuelu Feng, Dongdong Chen, Jianmin Bao, Le Wang, Yinpeng Chen, Lu Yuan, Gang Hua. 2023-06-07
- ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models. Maitreya Patel, Tejas Gokhale, Chitta Baral, Yezhou Yang. arXiv 2023. 2023-06-07
- WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models. Changhoon Kim, Kyle Min, Maitreya Patel, Sheng Cheng, Yezhou Yang. arXiv 2023. 2023-06-07
- User-friendly Image Editing with Minimal Text Input: Leveraging Captioning and Injection Techniques. Sunwoo Kim, Wooseok Jang, Hyunsu Kim, Junho Kim, Yunjey Choi, Seungryong Kim, Gayeong Lee. arXiv 2023. 2023-06-05
- Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark. Shuyu Yang, Yinan Zhou, Yaxiong Wang, Yujiao Wu, Li Zhu, Zhedong Zheng. arXiv 2023. 2023-06-05
- Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions. Shaoxu Li. arXiv 2023. 2023-06-05
- HeadSculpt: Crafting 3D Head Avatars with Text. Xiao Han, Yukang Cao, Kai Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang, Kwan-Yee K. Wong. 2023-06-05
- LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading. Yochai Yemini, Aviv Shamsian, Lior Bracha, Sharon Gannot, Ethan Fetaya. 2023-06-05
- Stable Diffusion is Unstable. Chengbin Du, Yanxi Li, Zhongwei Qiu, Chang Xu. arXiv 2023. 2023-06-05
- Detector Guidance for Multi-Object Text-to-Image Generation. Luping Liu, Zijian Zhang, Yi Ren, Rongjie Huang, Xiang Yin, Zhou Zhao. arXiv 2023. 2023-06-04
- Efficient Text-Guided 3D-Aware Portrait Generation with Score Distillation Sampling on Distribution. Yiji Cheng, Fei Yin, Xiaoke Huang, Xintong Yu, Jiaxiang Liu, Shikun Feng, Yujiu Yang, Yansong Tang. arXiv 2023. 2023-06-03
- Word-Level Explanations for Analyzing Bias in Text-to-Image Models. Alexander Lin, Lucas Monteiro Paes, Sree Harsha Tanneru, Suraj Srinivas, Himabindu Lakkaraju. arXiv 2023. 2023-06-03
- Privacy Distillation: Reducing Re-identification Risk of Multimodal Diffusion Models. Virginia Fernandez, Pedro Sanchez, Walter Hugo Lopez Pinaya, Grzegorz Jacenków, Sotirios A. Tsaftaris, Jorge Cardoso. arXiv 2023. 2023-06-02
- Audio-Visual Speech Enhancement with Score-Based Generative Models. Julius Richter, Simone Frintrop, Timo Gerkmann. arXiv 2023. 2023-06-02
- Video Colorization with Pre-trained Text-to-Image Diffusion Models. Hanyuan Liu, Minshan Xie, Jinbo Xing, Chengze Li, Tien-Tsin Wong. arXiv 2023. 2023-06-02
- Probabilistic Adaptation of Text-to-Video Models. Mengjiao Yang, Yilun Du, Bo Dai, Dale Schuurmans, Joshua B. Tenenbaum, Pieter Abbeel. 2023-06-02
- FigGen: Text to Scientific Figure Generation. Juan A. Rodriguez, David Vazquez, Issam Laradji, Marco Pedersoli, Pau Rodriguez. ICLR 2023. 2023-06-01
- UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning. Xiao Dong, Runhui Huang, Xiaoyong Wei, Zequn Jie, Jianxing Yu, Jian Yin, Xiaodan Liang. arXiv 2023. 2023-06-01
- Wuerstchen: Efficient Pretraining of Text-to-Image Models. Pablo Pernias, Dominic Rampas, Marc Aubreville. arXiv 2023. 2023-06-01
- Inserting Anybody in Diffusion Models via Celeb Basis. Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng. 2023-06-01
- Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance. Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong. 2023-06-01
- Cocktail: Mixing Multi-Modality Controls for Text-Conditional Image Generation. Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, Dacheng Tao, Tat-Jen Cham. 2023-06-01
- The Hidden Language of Diffusion Models. Hila Chefer, Oran Lang, Mor Geva, Volodymyr Polosukhin, Assaf Shocher, Michal Irani, Inbar Mosseri, Lior Wolf. 2023-06-01
- ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation. Shaozhe Hao, Kai Han, Shihao Zhao, Kwan-Yee K. Wong. 2023-06-01
- Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models. Chang Liu, Haoning Wu, Yujie Zhong, Xiaoyun Zhang, Weidi Xie. 2023-06-01
- Intriguing Properties of Text-guided Diffusion Models. Qihao Liu, Adam Kortylewski, Yutong Bai, Song Bai, Alan Yuille. arXiv 2023. 2023-06-01
- StyleDrop: Text-to-Image Generation in Any Style. Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein, Dilip Krishnan. 2023-06-01
- Diffusion Self-Guidance for Controllable Image Generation. Dave Epstein, Allan Jabri, Ben Poole, Alexei A. Efros, Aleksander Holynski. 2023-06-01
- StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners. Yonglong Tian, Lijie Fan, Phillip Isola, Huiwen Chang, Dilip Krishnan. arXiv 2023. 2023-06-01
- Boosting Text-to-Image Diffusion Models with Fine-Grained Semantic Rewards. Guian Fang, Zutao Jiang, Jianhua Han, Guansong Lu, Hang Xu, Xiaodan Liang. 2023-05-31
- Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor. Ruizhi Shao, Jingxiang Sun, Cheng Peng, Zerong Zheng, Boyao Zhou, Hongwen Zhang, Yebin Liu. 2023-05-31
- Understanding and Mitigating Copying in Diffusion Models. Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein. 2023-05-31
- Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images. Peyman Gholami, Robert Xiao. arXiv 2023. 2023-05-31
- LayerDiffusion: Layered Controlled Image Editing with Diffusion Models. Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li. arXiv 2023. 2023-05-30
- HiFA: High-fidelity Text-to-3D with Advanced Diffusion Guidance. Junzhe Zhu, Peiye Zhuang. arXiv 2023. 2023-05-30
- StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation. Chi Zhang, Yiwen Chen, Yijun Fu, Zhenglin Zhou, Gang Yu, Billzb Wang, Bin Fu, Tao Chen, Guosheng Lin, Chunhua Shen. arXiv 2023. 2023-05-30
- Nested Diffusion Processes for Anytime Image Generation. Noam Elata, Bahjat Kawar, Tomer Michaeli, Michael Elad. arXiv 2023. 2023-05-30
- Video ControlNet: Towards Temporally Consistent Synthetic-to-Real Video Translation Using Conditional Image Diffusion Models. Ernie Chu, Shuo-Yen Lin, Jun-Cheng Chen. arXiv 2023. 2023-05-30
- PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation. Jialu Li, Mohit Bansal. 2023-05-30
- Perturbation-Assisted Sample Synthesis: A Novel Approach for Uncertainty Quantification. Yifei Liu, Rex Shen, Xiaotong Shen. arXiv 2023. 2023-05-30
- Conditional Score Guidance for Text-Driven Image-to-Image Translation. Hyunsoo Lee, Minsoo Kang, Bohyung Han. arXiv 2023. 2023-05-29
- InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions. Qian Wang, Biao Zhang, Michael Birsak, Peter Wonka. arXiv 2023. 2023-05-29
- Text-Only Image Captioning with Multi-Context Data Generation. Feipeng Ma, Yizhou Zhou, Fengyun Rao, Yueyi Zhang, Xiaoyan Sun. arXiv 2023. 2023-05-29
- Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising. Fu-Yun Wang, Wenshuo Chen, Guanglu Song, Han-Jia Ye, Yu Liu, Hongsheng Li. 2023-05-29
- Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models. Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou. 2023-05-29
- RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths. Zeyue Xue, Guanglu Song, Qiushan Guo, Boxiao Liu, Zhuofan Zong, Yu Liu, Ping Luo. arXiv 2023. 2023-05-29
- Controllable Text-to-Image Generation with GPT-4. Tianjun Zhang, Yi Zhang, Vibhav Vineet, Neel Joshi, Xin Wang. arXiv 2023. 2023-05-29
- Cognitively Inspired Cross-Modal Data Generation Using Diffusion Models. Zizhao Hu, Mohammad Rostami. NeurIPS 2023. 2023-05-28
- FISEdit: Accelerating Text-to-image Editing via Cache-enabled Sparse Diffusion Inference. Zihao Yu, Haoyang Li, Fangcheng Fu, Xupeng Miao, Bin Cui. arXiv 2023. 2023-05-27
- Towards Consistent Video Editing with Text-to-Image Diffusion Models. Zicheng Zhang, Bonan Li, Xuecheng Nie, Congying Han, Tiande Guo, Luoqi Liu. arXiv 2023. 2023-05-27
- Text-to-image Editing by Image Information Removal. Zhongping Zhang, Jian Zheng, Jacob Zhiyuan Fang, Bryan A. Plummer. arXiv 2023. 2023-05-27
- Negative-prompt Inversion: Fast Image Inversion for Editing with Text-guided Diffusion Models. Daiki Miyake, Akihiro Iohara, Yu Saito, Toshiyuki Tanaka. arXiv 2023. 2023-05-26
- Improved Visual Story Generation with Adaptive Context Modeling. Zhangyin Feng, Yuchen Ren, Xinmiao Yu, Xiaocheng Feng, Duyu Tang, Shuming Shi, Bing Qin. arXiv 2023. 2023-05-26
- ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing. Min Zhao, Rongzhen Wang, Fan Bao, Chongxuan Li, Jun Zhu. 2023-05-26
- Custom-Edit: Text-Guided Image Editing with Customized Diffusion Models. Jooyoung Choi, Yunjey Choi, Yunji Kim, Junho Kim, Sungroh Yoon. arXiv 2023. 2023-05-25
- On Architectural Compression of Text-to-Image Diffusion Models. Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells, Shinkook Choi. arXiv 2023. 2023-05-25
- ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation. Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu. 2023-05-25
- ProSpect: Expanded Conditioning for the Personalization of Attribute-aware Image Generation. Yuxin Zhang, Weiming Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu. arXiv 2023. 2023-05-25
- Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models. Xingqian Xu, Jiayi Guo, Zhangyang Wang, Gao Huang, Irfan Essa, Humphrey Shi. 2023-05-25
- Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation. Lisa Dunlap, Alyssa Umino, Han Zhang, Jiezhi Yang, Joseph E. Gonzalez, Trevor Darrell. 2023-05-25
- Break-A-Scene: Extracting Multiple Concepts from a Single Image. Omri Avrahami, Kfir Aberman, Ohad Fried, Daniel Cohen-Or, Dani Lischinski. 2023-05-25
- Parallel Sampling of Diffusion Models. Andy Shih, Suneel Belkhale, Stefano Ermon, Dorsa Sadigh, Nima Anari. 2023-05-25
- Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models. Shihao Zhao, Dongdong Chen, Yen-Chun Chen, Jianmin Bao, Shaozhe Hao, Lu Yuan, Kwan-Yee K. Wong. 2023-05-25
- DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models. Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, Moonkyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee. arXiv 2023. 2023-05-25
- Are Diffusion Models Vision-And-Language Reasoners? Benno Krojer, Elinor Poole-Dayan, Vikram Voleti, Christopher Pal, Siva Reddy. 2023-05-25
- BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing. Dongxu Li, Junnan Li, Steven C. H. Hoi. arXiv 2023. 2023-05-24
- I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors. Tuhin Chakrabarty, Arkadiy Saakyan, Olivia Winn, Artemis Panagopoulou, Yue Yang, Marianna Apidianaki, Smaranda Muresan. arXiv 2023. 2023-05-24
- DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models. Sungnyun Kim, Junsoo Lee, Kibeom Hong, Daesik Kim, Namhyuk Ahn. 2023-05-24
- ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation. Dongxu Yue, Qin Guo, Munan Ning, Jiaxi Cui, Yuesheng Zhu, Li Yuan. arXiv 2023. 2023-05-24
- MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation. Marco Bellagente, Manuel Brack, Hannah Teufel, Felix Friedrich, Björn Deiseroth, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Koen Oostermeijer, Andres Felipe Cruz-Salinas, Patrick Schramowski, Kristian Kersting, Samuel Weinbach. arXiv 2023. 2023-05-24
- LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models. Long Lian, Boyi Li, Adam Yala, Trevor Darrell. arXiv 2023. 2023-05-23
- Understanding Text-driven Motion Synthesis with Keyframe Collaboration via Diffusion Models. Dong Wei, Xiaoning Sun, Huaijiang Sun, Bin Li, Shengxiang Hu, Weiqing Li, Jianfeng Lu. arXiv 2023. 2023-05-23
- Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models. Weifeng Chen, Jie Wu, Pan Xie, Hefeng Wu, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin. arXiv 2023. 2023-05-23
- Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models. Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, Yang Zhang. arXiv 2023. 2023-05-23
- Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models. Ruichen Wang, Zekang Chen, Chen Chen, Jian Ma, Haonan Lu, Xiaodong Lin. arXiv 2023. 2023-05-23
- The CLIP Model is Secretly an Image-to-Prompt Converter. Yuxuan Ding, Chunna Tian, Haoxuan Ding, Lingqiao Liu. arXiv 2023. 2023-05-22
- AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation. Guy Yariv, Itai Gat, Lior Wolf, Yossi Adi, Idan Schwartz. arXiv 2023. 2023-05-22
- ControlVideo: Training-free Controllable Text-to-Video Generation. Yabo Zhang, Yuxiang Wei, Dongsheng Jiang, Xiaopeng Zhang, Wangmeng Zuo, Qi Tian. 2023-05-22
- If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection. Shyamgopal Karthik, Karsten Roth, Massimiliano Mancini, Zeynep Akata. 2023-05-22
- Training Diffusion Models with Reinforcement Learning. Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine. arXiv 2023. 2023-05-22
- FACTIFY3M: A Benchmark for Multimodal Fact Verification with Explainability through 5W Question-Answering. Megha Chakraborty, Khusbu Pahwa, Anku Rani, Adarsh Mahor, Aditya Pakala, Arghya Sarkar, Harshit Dave, Ishan Paul, Janvita Reddy, Preethi Gurumurthy, Ritvik G, Samahriti Mukherjee, Shreyas Chatterjee, Kinjal Sensharma, Dwip Dalal, Suryavardan S, Shreyash Mishra, Parth Patwa, Aman Chadha, Amit Sheth, Amitava Das. arXiv 2023. 2023-05-22
- LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On. Davide Morelli, Alberto Baldrati, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita Cucchiara. arXiv 2023. 2023-05-22
- Adversarial Nibbler: A Data-Centric Challenge for Improving the Safety of Text-to-Image Models. Alicia Parrish, Hannah Rose Kirk, Jessica Quaye, Charvi Rastogi, Max Bartolo, Oana Inel, Juan Ciro, Rafael Mosquera, Addison Howard, Will Cukierski, D. Sculley, Vijay Janapa Reddi, Lora Aroyo. arXiv 2023. 2023-05-22
- InstructVid2Vid: Controllable Video Editing with Natural Language Instructions. Bosheng Qin, Juncheng Li, Siliang Tang, Tat-Seng Chua, Yueting Zhuang. arXiv 2023. 2023-05-21
- SneakyPrompt: Evaluating Robustness of Text-to-image Generative Models' Safety Filters. Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, Yinzhi Cao. arXiv 2023. 2023-05-20
- Efficient Cross-Lingual Transfer for Chinese Stable Diffusion with Images as Pivots. Jinyi Hu, Xu Han, Xiaoyuan Yi, Yutong Chen, Wenhao Li, Zhiyuan Liu, Maosong Sun. arXiv 2023. 2023-05-19
- Brain Captioning: Decoding human brain activity into images and text. Matteo Ferrante, Furkan Ozcelik, Tommaso Boccato, Rufin VanRullen, Nicola Toschi. arXiv 2023. 2023-05-19
- Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields. Jingbo Zhang, Xiaoyu Li, Ziyu Wan, Can Wang, Jing Liao. arXiv 2023. 2023-05-19
- Any-to-Any Generation via Composable Diffusion. Zineng Tang, Ziyi Yang, Chenguang Zhu, Michael Zeng, Mohit Bansal. 2023-05-19
- Late-Constraint Diffusion Guidance for Controllable Image Synthesis. Chang Liu, Dong Liu. 2023-05-19
- Inspecting the Geographical Representativeness of Images from Text-to-Image Models. Abhipsa Basu, R. Venkatesh Babu, Danish Pruthi. arXiv 2023. 2023-05-18
- X-IQE: eXplainable Image Quality Evaluation for Text-to-Image Generation with Visual Large Language Models. Yixiong Chen. 2023-05-18
- LDM3D: Latent Diffusion Model for 3D. Gabriela Ben Melech Stan, Diana Wofk, Scottie Fox, Alex Redden, Will Saxton, Jean Yu, Estelle Aflalo, Shao-Yen Tseng, Fabio Nonato, Matthias Muller, Vasudev Lal. arXiv 2023. 2023-05-18
- VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation. Wenjing Wang, Huan Yang, Zixi Tuo, Huiguo He, Junchen Zhu, Jianlong Fu, Jiaying Liu. arXiv 2023. 2023-05-18
- TextDiffuser: Diffusion Models as Text Painters. Jingye Chen, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei. arXiv 2023. 2023-05-18
- AIwriting: Relations Between Image Generation and Digital Writing. Scott Rettberg, Talan Memmott, Jill Walker Rettberg, Jason Nelson, Patrick Lichty. ISEA 2023. 2023-05-18
- Zero-Day Backdoor Attack against Text-to-Image Diffusion Models via Personalization. Yihao Huang, Qing Guo, Felix Juefei-Xu. arXiv 2023. 2023-05-18
- Discriminative Diffusion Models as Few-shot Vision and Language Learners. Xuehai He, Weixi Feng, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang. arXiv 2023. 2023-05-18
- Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models. Songwei Ge, Seungjun Nah, Guilin Liu, Tyler Poon, Andrew Tao, Bryan Catanzaro, David Jacobs, Jia-Bin Huang, Ming-Yu Liu, Yogesh Balaji. 2023-05-17
- Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation. Samaneh Azadi, Akbar Shah, Thomas Hayes, Devi Parikh, Sonal Gupta. 2023-05-16
- Generating coherent comic with rich story using ChatGPT and Stable Diffusion. Ze Jin, Zorina Song. arXiv 2023. 2023-05-16
- AMD: Autoregressive Motion Diffusion. Bo Han, Hao Peng, Minjing Dong, Chang Xu, Yi Ren, Yixuan Shen, Yuheng Li. arXiv 2023. 2023-05-16
- Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models. Krishna Sri Ipsit Mantri, Nevasini Sasikumar. arXiv 2023. 2023-05-15
- Common Diffusion Noise Schedules and Sample Steps are Flawed. Shanchuan Lin, Bingchen Liu, Jiashi Li, Xiao Yang. arXiv 2023. 2023-05-15
- Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts. Yuyang Zhao, Enze Xie, Lanqing Hong, Zhenguo Li, Gim Hee Lee. 2023-05-15
- Null-text Guidance in Diffusion Models is Secretly a Cartoon-style Creator. Jing Zhao, Heliang Zheng, Chaoyue Wang, Long Lan, Wanrong Huang, Wenjing Yang. 2023-05-11
- iEdit: Localised Text-guided Image Editing with Weak Supervision. Rumeysa Bodur, Erhan Gundogdu, Binod Bhattarai, Tae-Kyun Kim, Michael Donoser, Loris Bazzani. arXiv 2023. 2023-05-10
- Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer. Nisha Huang, Yuxin Zhang, Weiming Dong. arXiv 2023. 2023-05-09
- SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models. Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin. 2023-05-09
- Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models. Wenkai Dong, Song Xue, Xiaoyue Duan, Shumin Han. arXiv 2023. 2023-05-08
- ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation. Yupei Lin, Sen Zhang, Xiaojun Yang, Xiao Wang, Yukai Shi. 2023-05-08
- IIITD-20K: Dense captioning for Text-Image ReID. A V Subramanyam, Niranjan Sundararajan, Vibhu Dubey, Brejesh Lall. arXiv 2023. 2023-05-08
- DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models. Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Ming Cheng, Long Xiao. 2023-05-08
- Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning. Shengfang Zhai, Yinpeng Dong, Qingni Shen, Shi Pu, Yuejian Fang, Hang Su. arXiv 2023. 2023-05-07
- AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion. Seungwoo Lee, Chaerin Kong, Donghyeon Jeon, Nojun Kwak. arXiv 2023. 2023-05-06
- Guided Image Synthesis via Initial Image Editing in Diffusion Model. Jiafeng Mao, Xueting Wang, Kiyoharu Aizawa. arXiv 2023. 2023-05-05
- DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation. Hong Chen, Yipeng Zhang, Xin Wang, Xuguang Duan, Yuwei Zhou, Wenwu Zhu. 2023-05-05
- Data Curation for Image Captioning with Text-to-Image Generative Models. Wenyan Li, Jonas F. Lotz, Chen Qiu, Desmond Elliott. arXiv 2023. 2023-05-05
- Multimodal-driven Talking Face Generation, Face Swapping, Diffusion Model. Chao Xu, Shaoting Zhu, Junwei Zhu, Tianxin Huang, Jiangning Zhang, Ying Tai, Yong Liu. arXiv 2023. 2023-05-04
- Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion. Seongmin Lee, Benjamin Hoover, Hendrik Strobelt, Zijie J. Wang, ShengYun Peng, Austin Wright, Kevin Li, Haekyu Park, Haoyang Yang, Duen Horng Chau. 2023-05-04
- Multimodal Data Augmentation for Image Captioning using Diffusion Models. Changrong Xiao, Sean Xin Xu, Kunpeng Zhang. arXiv 2023. 2023-05-03
- In-Context Learning Unlocked for Diffusion Models. Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou. 2023-05-01
- SceneGenie: Scene Graph Guided Diffusion Models for Image Synthesis. Azade Farshad, Yousef Yeganeh, Yu Chi, Chengzhi Shen, Björn Ommer, Nassir Navab. arXiv 2023. 2023-04-28
- Edit Everything: A Text-Guided Generative System for Images Editing. Defeng Xie, Ruichen Wang, Jian Ma, Chen Chen, Haonan Lu, Dong Yang, Fobo Shi, Xiaodong Lin. 2023-04-27
- It is all about where you start: Text-to-image generation with seed selection. Dvir Samuel, Rami Ben-Ari, Simon Raviv, Nir Darshan, Gal Chechik. arXiv 2023. 2023-04-27
- Training-Free Location-Aware Text-to-Image Synthesis. Jiafeng Mao, Xueting Wang. arXiv 2023. 2023-04-26
- TextMesh: Generation of Realistic 3D Meshes From Text Prompts. Christina Tsalicoglou, Fabian Manhardt, Alessio Tonioni, Michael Niemeyer, Federico Tombari. arXiv 2023. 2023-04-24
- Using Text-to-Image Generation for Architectural Design Ideation. Ville Paananen, Jonas Oppenlaender, Aku Visuri. arXiv 2023. 2023-04-20
- Anything-3D: Towards Single-view Anything Reconstruction in the Wild. Qiuhong Shen, Xingyi Yang, Xinchao Wang. 2023-04-19
- Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models. Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis. 2023-04-18
- TTIDA: Controllable Generative Data Augmentation via Text-to-Text and Text-to-Image Models. Yuwei Yin, Jean Kaddour, Xiang Zhang, Yixin Nie, Zhenguang Liu, Lingpeng Kong, Qi Liu. arXiv 2023. 2023-04-18
- UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer. Soon Yau Cheong, Armin Mustafa, Andrew Gilbert. 2023-04-18
- MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing. Mingdeng Cao, Xintao Wang, Zhongang Qi, Ying Shan, Xiaohu Qie, Yinqiang Zheng. 2023-04-17
- Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation. Jie An, Songyang Zhang, Harry Yang, Sonal Gupta, Jia-Bin Huang, Jiebo Luo, Xi Yin. 2023-04-17
- Text2Performer: Text-Driven Human Video Generation. Yuming Jiang, Shuai Yang, Tong Liang Koh, Wayne Wu, Chen Change Loy, Ziwei Liu. 2023-04-17
- Delta Denoising Score. Amir Hertz, Kfir Aberman, Daniel Cohen-Or. 2023-04-14
- Text-Conditional Contextualized Avatars For Zero-Shot Personalization. Samaneh Azadi, Thomas Hayes, Akbar Shah, Guan Pang, Devi Parikh, Sonal Gupta. arXiv 2023. 2023-04-14
- Soundini: Sound-Guided Diffusion for Natural Video Editing. Seung Hyun Lee, Sieun Kim, Innfarn Yoo, Feng Yang, Donghyeon Cho, Youngseo Kim, Huiwen Chang, Jinkyu Kim, Sangpil Kim. 2023-04-13
- Expressive Text-to-Image Generation with Rich Text. Songwei Ge, Taesung Park, Jun-Yan Zhu, Jia-Bin Huang. 2023-04-13
- Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA. James Seale Smith, Yen-Chang Hsu, Lingyu Zhang, Ting Hua, Zsolt Kira, Yilin Shen, Hongxia Jin. 2023-04-12
- An Edit Friendly DDPM Noise Space: Inversion and Manipulations. Inbar Huberman-Spiegelglas, Vladimir Kulikov, Tomer Michaeli. arXiv 2023. 2023-04-12
-
Improving Diffusion Models for Scene Text Editing with Dual EncodersJiabao Ji, Guanhua Zhang, Zhaowen Wang, Bairu Hou, Zhifei Zhang, Brian Price, Shiyu Chang2023-04-122023-04-12
-
Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and BeyondMohammadreza Armandpour, Huangjie Zheng, Ali Sadeghian, Amir Sadeghian, Mingyuan ZhouarXiv 2023. Paper  2023-04-112023-04-11
-
HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image ModelsEslam Mohamed Bakr, Pengzhan Sun, Xiaoqian Shen, Faizan Farooq Khan, Li Erran Li, Mohamed Elhoseiny2023-04-112023-04-11
-
Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion ModelsNikita Starodubcev, Dmitry Baranchuk, Valentin Khrulkov, Artem BabenkoarXiv 2023. Paper  2023-04-102023-04-10
-
HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image GenerationXuan Ju, Ailing Zeng, Chenchen Zhao, Jianan Wang, Lei Zhang, Qiang Xu2023-04-092023-04-09
-
Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image SynthesisQiucheng Wu, Yujian Liu, Handong Zhao, Trung Bui, Zhe Lin, Yang Zhang, Shiyu Chang2023-04-072023-04-07
-
DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D ModelHoigi Seo, Hayeon Kim, Gwanghyun Kim, Se Young Chun2023-04-062023-04-06
-
Benchmarking Robustness to Text-Guided CorruptionsMohammadreza Mofayezi, Yasamin MedghalchiarXiv 2023. Paper  2023-04-062023-04-06
-
Training-Free Layout Control with Cross-Attention GuidanceMinghao Chen, Iro Laina, Andrea Vedaldi2023-04-062023-04-06
-
Zero-shot Generative Model Adaptation via Image-specific Prompt LearningJiayi Guo, Chaofei Wang, You Wu, Eric Zhang, Kai Wang, Xingqian Xu, Shiji Song, Humphrey Shi, Gao Huang2023-04-062023-04-06
-
A Diffusion-based Method for Multi-turn Compositional Image GenerationChao Wang, Xiaoyu Yang, Jinmiao Huang, Kevin FerreiraarXiv 2023. Paper  2023-04-052023-04-05
-
Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion ModelsXuhui Jia, Yang Zhao, Kelvin C.K. Chan, Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, Yu-Chuan SuarXiv 2023. Paper  2023-04-052023-04-05
-
Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative ModelsJaewoong Lee, Sangwon Jang, Jaehyeong Jo, Jaehong Yoon, Yunji Kim, Jin-Hwa Kim, Jung-Woo Ha, Sung Ju HwangarXiv 2023. Paper  2023-04-042023-04-04
-
PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image DiffusionGwanghyun Kim, Ji Ha Jang, Se Young Chun2023-04-042023-04-04
-
Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image EditingAlberto Baldrati, Davide Morelli, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita CucchiaraarXiv 2023. Paper  2023-04-042023-04-04
-
viz2viz: Prompt-driven stylized visualization generation using a diffusion modelJiaqi Wu, John Joon Young Chung, Eytan AdararXiv 2023. Paper  2023-04-042023-04-04
-
DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion ModelsYukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. WongarXiv 2023. Paper  2023-04-032023-04-03
-
ReMoDiffuse: Retrieval-Augmented Motion Diffusion ModelMingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu2023-04-032023-04-03
-
DreamFace: Progressive Generation of Animatable 3D Faces under Text GuidanceLongwen Zhang, Qiwei Qiu, Hongyang Lin, Qixuan Zhang, Cheng Shi, Wei Yang, Ye Shi, Sibei Yang, Lan Xu, Jingyi Yu2023-04-012023-04-01
-
GlyphDraw: Learning to Draw Chinese Characters in Image Synthesis Models CoherentlyJian Ma, Mingjun Zhao, Chen Chen, Ruichen Wang, Di Niu, Haonan Lu, Xiaodong Lin2023-03-312023-03-31
-
LayoutDiffusion: Controllable Diffusion Model for Layout-to-image GenerationGuangcong Zheng, Xianpan Zhou, Xuewei Li, Zhongang Qi, Ying Shan, Xi Li2023-03-302023-03-30
-
DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion AutoencoderChenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang BianarXiv 2023. Paper  2023-03-302023-03-30
-
Discriminative Class Tokens for Text-to-Image Diffusion ModelsIdan Schwartz, Vésteinn Snæbjarnarson, Sagie Benaim, Hila Chefer, Ryan Cotterell, Lior Wolf, Serge BelongiearXiv 2023. Paper  2023-03-302023-03-30
-
Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion ModelsWen Wang, Kangyang Xie, Zide Liu, Hao Chen, Yue Cao, Xinlong Wang, Chunhua ShenarXiv 2023. Paper  2023-03-302023-03-30
-
DiffCollage: Parallel Generation of Large Content with Diffusion ModelsQinsheng Zhang, Jiaming Song, Xun Huang, Yongxin Chen, Ming-Yu Liu2023-03-302023-03-30
-
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion ModelsEric Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, Humphrey Shi2023-03-302023-03-30
-
Social Biases through the Text-to-Image Generation LensRanjita Naik, Besmira NushiarXiv 2023. Paper  2023-03-302023-03-30
-
PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion ModelsVidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi2023-03-302023-03-30
-
AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose ControlRuixiang Jiang, Can Wang, Jingbo Zhang, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao2023-03-302023-03-30
-
MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion PathQian Wang, Biao Zhang, Michael Birsak, Peter Wonka2023-03-292023-03-29
-
4D Facial Expression Diffusion ModelKaifeng Zou, Sylvain Faisan, Boyang Yu, Sébastien Valette, Hyewon Seo2023-03-292023-03-29
-
StyleDiffusion: Prompt-Embedding Inversion for Text-Based EditingSenmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian YangarXiv 2023. Paper  2023-03-282023-03-28
-
Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversionHiromichi Kamata, Yuiko Sakuma, Akio Hayakawa, Masato Ishii, Takuya Narihira2023-03-282023-03-28
-
Anti-DreamBooth: Protecting users from personalized text-to-image synthesisThanh Van Le, Hao Phung, Thuan Hoang Nguyen, Quan Dao, Ngoc Tran, Anh Tran2023-03-272023-03-27
-
Debiasing Scores and Prompts of 2D Diffusion for Robust Text-to-3D GenerationSusung Hong, Donghoon Ahn, Seungryong KimarXiv 2023. Paper  2023-03-272023-03-27
-
Seer: Language Instructed Video Prediction with Latent Diffusion ModelsXianfan Gu, Chuan Wen, Jiaming Song, Yang GaoCVPR Workshop 2023. Paper  2023-03-272023-03-27
-
GestureDiffuCLIP: Gesture Diffusion Model with CLIP LatentsTenglong Ao, Zeyi Zhang, Libin LiuarXiv 2023. Paper  2023-03-262023-03-26
-
Better Aligning Text-to-Image Models with Human PreferenceXiaoshi Wu, Keqiang Sun, Feng Zhu, Rui Zhao, Hongsheng Li2023-03-252023-03-25
-
Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content CreationRui Chen, Yongwei Chen, Ningxin Jiao, Kui JiaarXiv 2023. Paper  2023-03-242023-03-24
-
CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene LayoutYiqi Lin, Haotian Bai, Sijia Li, Haonan Lu, Xiaodong Lin, Hui Xiong, Lin Wang2023-03-242023-03-24
-
DiffuScene: Scene Graph Denoising Diffusion Probabilistic Model for Generative Indoor Scene SynthesisJiapeng Tang, Yinyu Nie, Lev Markhasin, Angela Dai, Justus Thies, Matthias Nießner2023-03-242023-03-24
-
ISS++: Image as Stepping Stone for Text-Guided 3D Shape GenerationZhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing FuICLR 2023. Paper  2023-03-242023-03-24
-
MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion ModelsJing Zhao, Heliang Zheng, Chaoyue Wang, Long Lan, Wenjing Yang2023-03-232023-03-23
-
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video GeneratorsLevon Khachatryan, Andranik Movsisyan, Vahram Tadevosyan, Roberto Henschel, Zhangyang Wang, Shant Navasardyan, Humphrey Shi2023-03-232023-03-23
-
Ablating Concepts in Text-to-Image Diffusion ModelsNupur Kumari, Bingliang Zhang, Sheng-Yu Wang, Eli Shechtman, Richard Zhang, Jun-Yan Zhu2023-03-232023-03-23
-
ReVersion: Diffusion-Based Relation Inversion from ImagesZiqi Huang, Tianxing Wu, Yuming Jiang, Kelvin C.K. Chan, Ziwei Liu2023-03-232023-03-23
-
Instruct-NeRF2NeRF: Editing 3D Scenes with InstructionsAyaan Haque, Matthew Tancik, Alexei A. Efros, Aleksander Holynski, Angjoo Kanazawa2023-03-222023-03-22
-
Pix2Video: Video Editing using Image DiffusionDuygu Ceylan, Chun-Hao Paul Huang, Niloy J. Mitra2023-03-222023-03-22
-
3D-CLFusion: Fast Text-to-3D Rendering with Contrastive Latent DiffusionYu-Jhe Li, Kris KitaniarXiv 2023. Paper  2023-03-212023-03-21
-
CompoDiff: Versatile Composed Image Retrieval With Latent DiffusionGeonmo Gu, Sanghyuk Chun, Wonjae Kim, HeeJae Jun, Yoohoon Kang, Sangdoo YunarXiv 2023. Paper  2023-03-212023-03-21
-
Vox-E: Text-guided Voxel Editing of 3D ObjectsEtai Sella, Gal Fiebelman, Peter Hedman, Hadar Averbuch-Elor2023-03-212023-03-21
-
SALAD: Part-Level Latent Diffusion for 3D Shape Generation and ManipulationJuil Koo, Seungwoo Yoo, Minh Hieu Nguyen, Minhyuk Sung2023-03-212023-03-21
-
Discovering Interpretable Directions in the Semantic Latent Space of Diffusion ModelsRené Haas, Inbar Huberman-Spiegelglas, Rotem Mulayoff, Tomer MichaeliarXiv 2023. Paper  2023-03-202023-03-20
-
SVDiff: Compact Parameter Space for Diffusion Fine-TuningLigong Han, Yinxiao Li, Han Zhang, Peyman Milanfar, Dimitris Metaxas, Feng YangarXiv 2023. Paper  2023-03-202023-03-20
-
Localizing Object-level Shape Variations with Text-to-Image Diffusion ModelsOr Patashnik, Daniel Garibi, Idan Azuri, Hadar Averbuch-Elor, Daniel Cohen-Or2023-03-202023-03-20
-
Text2Tex: Text-driven Texture Synthesis via Diffusion ModelsDave Zhenyu Chen, Yawar Siddiqui, Hsin-Ying Lee, Sergey Tulyakov, Matthias Nießner2023-03-202023-03-20
-
SKED: Sketch-guided Text-based 3D EditingAryan Mikaeili, Or Perel, Daniel Cohen-Or, Ali Mahdavi-AmiriarXiv 2023. Paper  2023-03-192023-03-19
-
FreeDoM: Training-Free Energy-Guided Conditional Diffusion ModelJiwen Yu, Yinhuai Wang, Chen Zhao, Bernard Ghanem, Jian Zhang2023-03-172023-03-17
-
DiffusionRet: Generative Text-Video Retrieval with Diffusion ModelPeng Jin, Hao Li, Zesen Cheng, Kehan Li, Xiangyang Ji, Chang Liu, Li Yuan, Jie ChenarXiv 2023. Paper  2023-03-172023-03-17
-
GlueGen: Plug and Play Multi-modal Encoders for X-to-image GenerationCan Qin, Ning Yu, Chen Xing, Shu Zhang, Zeyuan Chen, Stefano Ermon, Yun Fu, Caiming Xiong, Ran XuarXiv 2023. Paper  2023-03-172023-03-17
-
DialogPaint: A Dialog-based Image Editing ModelJingxuan Wei, Shiyu Wu, Xin Jiang, Yequan WangarXiv 2023. Paper  2023-03-172023-03-17
-
P+: Extended Textual Conditioning in Text-to-Image GenerationAndrey Voynov, Qinghao Chu, Daniel Cohen-Or, Kfir Aberman2023-03-162023-03-16
-
HIVE: Harnessing Human Feedback for Instructional Visual EditingShu Zhang, Xinyi Yang, Yihao Feng, Can Qin, Chia-Chih Chen, Ning Yu, Zeyuan Chen, Huan Wang, Silvio Savarese, Stefano Ermon, Caiming Xiong, Ran XuarXiv 2023. Paper  2023-03-162023-03-16
-
FateZero: Fusing Attentions for Zero-shot Text-based Video EditingChenyang Qi, Xiaodong Cun, Yong Zhang, Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen2023-03-162023-03-16
-
Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image GenerationYiyang Ma, Huan Yang, Wenjing Wang, Jianlong Fu, Jiaying LiuarXiv 2023. Paper  2023-03-162023-03-16
-
Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style TransferSerin Yang, Hyunmin Hwang, Jong Chul YearXiv 2023. Paper  2023-03-152023-03-15
-
Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion ModelsDivya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha2023-03-152023-03-15
-
Highly Personalized Text Embedding for Image Manipulation by Stable DiffusionInhwa Han, Serin Yang, Taesung Kwon, Jong Chul YearXiv 2023. Paper  2023-03-152023-03-15
-
Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D GenerationJunyoung Seo, Wooseok Jang, Min-Seop Kwak, Jaehoon Ko, Hyeonsu Kim, Junho Kim, Jin-Hwa Kim, Jiyoung Lee, Seungryong KimarXiv 2023. Paper  2023-03-142023-03-14
-
Editing Implicit Assumptions in Text-to-Image Diffusion ModelsHadas Orgad, Bahjat Kawar, Yonatan Belinkov2023-03-142023-03-14
-
Edit-A-Video: Single Video Editing with Object-Aware ConsistencyChaehun Shin, Heeseung Kim, Che Hyun Lee, Sang-gil Lee, Sungroh Yoon2023-03-142023-03-14
-
Erasing Concepts from Diffusion ModelsRohit Gandikota, Joanna Materzynska, Jaden Fiotto-Kaufman, David Bau2023-03-132023-03-13
-
One Transformer Fits All Distributions in Multi-Modal Diffusion at ScaleFan Bao, Shen Nie, Kaiwen Xue, Chongxuan Li, Shi Pu, Yaole Wang, Gang Yue, Yue Cao, Hang Su, Jun Zhu2023-03-122023-03-12
-
Cones: Concept Neurons in Diffusion Models for Customized GenerationZhiheng Liu, Ruili Feng, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang CaoarXiv 2023. Paper  2023-03-092023-03-09
-
A Prompt Log Analysis of Text-to-Image Generation SystemsYutong Xie, Zhaoying Pan, Jinge Ma, Jie Luo, Qiaozhu MeiarXiv 2023. Paper  2023-03-082023-03-08
-
Video-P2P: Video Editing with Cross-attention ControlShaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia2023-03-082023-03-08
-
Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation ModelsChenfei Wu, Shengming Yin, Weizhen Qi, Xiaodong Wang, Zecheng Tang, Nan Duan2023-03-082023-03-08
-
Zeroth-Order Optimization Meets Human Feedback: Provable Learning via Ranking OraclesZhiwei Tang, Dmitry Rybin, Tsung-Hui Chang2023-03-072023-03-07
-
Unleashing Text-to-Image Diffusion Models for Visual PerceptionWenliang Zhao, Yongming Rao, Zuyan Liu, Benlin Liu, Jie Zhou, Jiwen Lu2023-03-032023-03-03
-
Collage DiffusionVishnu Sarukkai, Linden Li, Arden Ma, Christopher Ré, Kayvon FatahalianarXiv 2023. Paper  2023-03-012023-03-01
-
Towards Enhanced Controllability of Diffusion ModelsWonwoong Cho, Hareesh Ravi, Midhun Harikumar, Vinh Khuc, Krishna Kumar Singh, Jingwan Lu, David I. Inouye, Ajinkya KalearXiv 2023. Paper  2023-02-282023-02-28
-
Directed Diffusion: Direct Control of Object Placement through Attention GuidanceWan-Duo Kurt Ma, J.P. Lewis, W. Bastiaan Kleijn, Thomas LeungarXiv 2023. Paper  2023-02-252023-02-25
-
Modulating Pretrained Diffusion Models for Multimodal Image SynthesisCusuh Ham, James Hays, Jingwan Lu, Krishna Kumar Singh, Zhifei Zhang, Tobias HinzarXiv 2023. Paper  2023-02-242023-02-24
-
Controlled and Conditional Text to Image Generation with Diffusion PriorPranav Aggarwal, Hareesh Ravi, Naveen Marri, Sachin Kelkar, Fengbin Chen, Vinh Khuc, Midhun Harikumar, Ritiz Tambi, Sudharshan Reddy Kakumanu, Purvak Lapsiya, Alvin Ghouas, Sarah Saber, Malavika Ramprasad, Baldo Faieta, Ajinkya KalearXiv 2023. Paper  2023-02-232023-02-23
-
Region-Aware Diffusion for Zero-shot Text-driven Image EditingNisha Huang, Fan Tang, Weiming Dong, Tong-Yee Lee, Changsheng Xu2023-02-232023-02-23
-
Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMCYilun Du, Conor Durkan, Robin Strudel, Joshua B. Tenenbaum, Sander Dieleman, Rob Fergus, Jascha Sohl-Dickstein, Arnaud Doucet, Will Grathwohl2023-02-222023-02-22
-
Learning 3D Photography Videos via Self-supervised Diffusion on Single ImagesXiaodong Wang, Chenfei Wu, Shengming Yin, Minheng Ni, Jianfeng Wang, Linjie Li, Zhengyuan Yang, Fan Yang, Lijuan Wang, Zicheng Liu, Yuejian Fang, Nan DuanarXiv 2023. Paper  2023-02-212023-02-21
-
Boundary Guided Mixing Trajectory for Semantic Control with Diffusion ModelsYe Zhu, Yu Wu, Zhiwei Deng, Olga Russakovsky, Yan YanarXiv 2023. Paper  2023-02-162023-02-16
-
MultiDiffusion: Fusing Diffusion Paths for Controlled Image GenerationOmer Bar-Tal, Lior Yariv, Yaron Lipman, Tali Dekel2023-02-162023-02-16
-
T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion ModelsChong Mou, Xintao Wang, Liangbin Xie, Jian Zhang, Zhongang Qi, Ying Shan, Xiaohu Qie2023-02-162023-02-16
-
Text-driven Visual Synthesis with Latent Diffusion PriorTing-Hsuan Liao, Songwei Ge, Yiran Xu, Yao-Chih Lee, Badour AlBahar, Jia-Bin Huang2023-02-162023-02-16
-
Exploring the Representation Manifolds of Stable Diffusion Through the Lens of Intrinsic DimensionHenry Kvinge, Davis Brown, Charles GodfreyarXiv 2023. Paper  2023-02-162023-02-16
-
PRedItOR: Text Guided Image Editing with Diffusion PriorHareesh Ravi, Sachin Kelkar, Midhun Harikumar, Ajinkya KalearXiv 2023. Paper  2023-02-152023-02-15
-
Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual GenerationJoshua Vendrow, Saachi Jain, Logan Engstrom, Aleksander Madry2023-02-152023-02-15
-
Universal Guidance for Diffusion ModelsArpit Bansal, Hong-Min Chu, Avi Schwarzschild, Soumyadip Sengupta, Micah Goldblum, Jonas Geiping, Tom Goldstein2023-02-142023-02-14
-
Text-Guided Scene Sketch-to-Photo SynthesisAprilPyone MaungMaung, Makoto Shing, Kentaro Mitsui, Kei Sawada, Fumio OkuraarXiv 2023. Paper  2023-02-142023-02-14
-
Analyzing Multimodal Objectives Through the Lens of Generative Diffusion GuidanceChaerin Kong, Nojun KwakarXiv 2023. Paper  2023-02-102023-02-10
-
Adding Conditional Control to Text-to-Image Diffusion ModelsLvmin Zhang, Maneesh Agrawala2023-02-102023-02-10
-
Is This Loss Informative? Speeding Up Textual Inversion with Deterministic Objective EvaluationAnton Voronov, Mikhail Khoroshikh, Artem Babenko, Max RyabininarXiv 2023. Paper  2023-02-092023-02-09
-
Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion ModelsHyeonho Jeong, Gihyun Kwon, Jong Chul YearXiv 2023. Paper  2023-02-082023-02-08
-
GLAZE: Protecting Artists from Style Mimicry by Text-to-Image ModelsShawn Shan, Jenna Cryan, Emily Wenger, Haitao Zheng, Rana Hanocka, Ben Y. ZhaoarXiv 2023. Paper  2023-02-082023-02-08
-
Q-Diffusion: Quantizing Diffusion ModelsXiuyu Li, Long Lian, Yijiang Liu, Huanrui Yang, Zhen Dong, Daniel Kang, Shanghang Zhang, Kurt Keutzer2023-02-082023-02-08
-
Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and DiscoveryYuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Geiping, Tom Goldstein2023-02-072023-02-07
-
Fair Diffusion: Instructing Text-to-Image Generation Models on FairnessFelix Friedrich, Patrick Schramowski, Manuel Brack, Lukas Struppek, Dominik Hintersdorf, Sasha Luccioni, Kristian KerstingarXiv 2023. Paper  2023-02-072023-02-07
-
Structure and Content-Guided Video Synthesis with Diffusion ModelsPatrick Esser, Johnathan Chiu, Parmida Atighehchian, Jonathan Granskog, Anastasis Germanidis2023-02-062023-02-06
-
Zero-shot Image-to-Image TranslationGaurav Parmar, Krishna Kumar Singh, Richard Zhang, Yijun Li, Jingwan Lu, Jun-Yan ZhuarXiv 2023. Paper  2023-02-062023-02-06
-
Eliminating Prior Bias for Semantic Image Editing via Dual-Cycle DiffusionZuopeng Yang, Tianshu Chu, Xin Lin, Erdun Gao, Daqing Liu, Jie Yang, Chaoyue WangarXiv 2023. Paper  2023-02-052023-02-05
-
ReDi: Efficient Learning-Free Diffusion Inference via Trajectory RetrievalKexun Zhang, Xianjun Yang, William Yang Wang, Lei LiarXiv 2023. Paper  2023-02-052023-02-05
-
Mixture of Diffusers for scene composition and high resolution image generationÁlvaro Barbero Jiménez2023-02-052023-02-05
-
Semantic-Guided Image Augmentation with Pre-trained ModelsBohan Li, Xinghao Wang, Xiao Xu, Yutai Hou, Yunlong Feng, Feng Wang, Wanxiang Che2023-02-042023-02-04
-
TEXTure: Text-Guided Texturing of 3D ShapesElad Richardson, Gal Metzer, Yuval Alaluf, Raja Giryes, Daniel Cohen-Or2023-02-032023-02-03
-
Dreamix: Video Diffusion Models are General Video EditorsEyal Molad, Eliahu Horwitz, Dani Valevski, Alex Rav Acha, Yossi Matias, Yael Pritch, Yaniv Leviathan, Yedid Hoshen2023-02-022023-02-02
-
Trash to Treasure: Using text-to-image models to inform the design of physical artefactsAmy Smith, Hope Schroeder, Ziv Epstein, Michael Cook, Simon Colton, Andrew LippmanAAAI 2023. Paper  2023-02-012023-02-01
-
Zero3D: Semantic-Driven Multi-Category 3D Shape GenerationBo Han, Yitong Liu, Yixuan ShenarXiv 2023. Paper  2023-01-312023-01-31
-
Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion ModelsHila Chefer, Yuval Alaluf, Yael Vinker, Lior Wolf, Daniel Cohen-Or2023-01-312023-01-31
-
GALIP: Generative Adversarial CLIPs for Text-to-Image SynthesisMing Tao, Bing-Kun Bao, Hao Tang, Changsheng Xu2023-01-302023-01-30
-
PromptMix: Text-to-image diffusion models enhance the performance of lightweight networksArian Bakhtiarnia, Qi Zhang, Alexandros Iosifidis2023-01-302023-01-30
-
Shape-aware Text-driven Layered Video EditingYao-Chih Lee, Ji-Ze Genevieve Jang, Yi-Ting Chen, Elizabeth Qiu, Jia-Bin Huang2023-01-302023-01-30
-
Towards Equitable Representation in Text-to-Image Synthesis Models with the Cross-Cultural Understanding Benchmark (CCUB) DatasetZhixuan Liu, Youeun Shin, Beverley-Claire Okogwu, Youngsik Yun, Lia Coleman, Peter Schaldenbrand, Jihie Kim, Jean OharXiv 2023. Paper  2023-01-282023-01-28
-
SEGA: Instructing Diffusion using Semantic DimensionsManuel Brack, Felix Friedrich, Dominik Hintersdorf, Lukas Struppek, Patrick Schramowski, Kristian KerstingarXiv 2023. Paper  2023-01-282023-01-28
-
Text-To-4D Dynamic Scene GenerationUriel Singer, Shelly Sheynin, Adam Polyak, Oron Ashual, Iurii Makarov, Filippos Kokkinos, Naman Goyal, Andrea Vedaldi, Devi Parikh, Justin Johnson, Yaniv TaigmanarXiv 2023. Paper  2023-01-262023-01-26
-
Guiding Text-to-Image Diffusion Model Towards Grounded GenerationZiyi Li, Qinye Zhou, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie2023-01-122023-01-12
-
Speech Driven Video Editing via an Audio-Conditioned Diffusion ModelDan Bigioi, Shubhajit Basak, Hugh Jordan, Rachel McDonnell, Peter CorcoranarXiv 2023. Paper  2023-01-102023-01-10
-
DiffTalk: Crafting Diffusion Models for Generalized Talking Head SynthesisShuai Shen, Wenliang Zhao, Zibin Meng, Wanhua Li, Zheng Zhu, Jie Zhou, Jiwen LuarXiv 2023. Paper  2023-01-102023-01-10
-
Visual Story Generation Based on Emotion and KeywordsYuetian Chen, Ruohua Li, Bowen Shi, Peiru Liu, Mei SiAIIDE INT 2022. Paper  2023-01-072023-01-07
-
Diffused Heads: Diffusion Models Beat GANs on Talking-Face GenerationMichał Stypułkowski, Konstantinos Vougioukas, Sen He, Maciej Zięba, Stavros Petridis, Maja Pantic2023-01-062023-01-06
-
Muse: Text-To-Image Generation via Masked Generative TransformersHuiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan2023-01-022023-01-02
-
Exploring Vision Transformers as Diffusion LearnersHe Cao, Jianan Wang, Tianhe Ren, Xianbiao Qi, Yihao Chen, Yuan Yao, Lei ZhangarXiv 2022. Paper  2022-12-282022-12-28
-
Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion ModelsJiale Xu, Xintao Wang, Weihao Cheng, Yan-Pei Cao, Ying Shan, Xiaohu Qie, Shenghua Gao2022-12-282022-12-28
-
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video GenerationJay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou2022-12-222022-12-22
-
Contrastive Language-Vision AI Models Pretrained on Web-Scraped Multimodal Data Exhibit Sexual Objectification BiasRobert Wolfe, Yiwei Yang, Bill Howe, Aylin CaliskanarXiv 2022. Paper  2022-12-212022-12-21
-
Optimizing Prompts for Text-to-Image GenerationYaru Hao, Zewen Chi, Li Dong, Furu Wei2022-12-192022-12-19
-
Uncovering the Disentanglement Capability in Text-to-Image Diffusion ModelsQiucheng Wu, Yujian Liu, Handong Zhao, Ajinkya Kale, Trung Bui, Tong Yu, Zhe Lin, Yang Zhang, Shiyu Chang2022-12-162022-12-16
-
TeTIm-Eval: a novel curated evaluation data set for comparing text-to-image modelsFederico A. Galatolo, Mario G. C. A. Cimino, Edoardo CogottiarXiv 2022. Paper  2022-12-152022-12-15
-
The Infinite Index: Information Retrieval on Generative Text-To-Image ModelsNiklas Deckers, Maik Fröbe, Johannes Kiesel, Gianluca Pandolfo, Christopher Schröder, Benno Stein, Martin PotthastCHIIR 2023. Paper  2022-12-142022-12-14
-
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image InpaintingSu Wang, Chitwan Saharia, Ceslee Montgomery, Jordi Pont-Tuset, Shai Noy, Stefano Pellegrini, Yasumasa Onoe, Sarah Laszlo, David J. Fleet, Radu Soricut, Jason Baldridge, Mohammad Norouzi, Peter Anderson, William ChanCVPR 2023. Paper  2022-12-132022-12-13
-
LidarCLIP or: How I Learned to Talk to Point CloudsGeorg Hess, Adam Tonderski, Christoffer Petersson, Lennart Svensson, Kalle Åström2022-12-132022-12-13
-
The Stable Artist: Steering Semantics in Diffusion Latent SpaceManuel Brack, Patrick Schramowski, Felix Friedrich, Dominik Hintersdorf, Kristian KerstingarXiv 2022. Paper  2022-12-122022-12-12
-
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image SynthesisWeixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang2022-12-092022-12-09
-
SmartBrush: Text and Shape Guided Object Inpainting with Diffusion ModelShaoan Xie, Zhifei Zhang, Zhe Lin, Tobias Hinz, Kun ZhangarXiv 2022. Paper  2022-12-092022-12-09
-
Executing your Commands via Motion Diffusion in Latent SpaceXin Chen, Biao Jiang, Wen Liu, Zilong Huang, Bin Fu, Tao Chen, Jingyi Yu, Gang Yu2022-12-082022-12-08
-
Diffusion Guided Domain Adaptation of Image GeneratorsKunpeng Song, Ligong Han, Bingchen Liu, Dimitris Metaxas, Ahmed Elgammal2022-12-082022-12-08
-
Multi-Concept Customization of Text-to-Image DiffusionNupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, Jun-Yan Zhu2022-12-082022-12-08
-
SINE: SINgle Image Editing with Text-to-Image Diffusion ModelsZhixing Zhang, Ligong Han, Arnab Ghosh, Dimitris Metaxas, Jian Ren2022-12-082022-12-08
-
SDFusion: Multimodal 3D Shape Completion, Reconstruction, and GenerationYen-Chi Cheng, Hsin-Ying Lee, Sergey Tulyakov, Alexander Schwing, Liangyan Gui2022-12-082022-12-08
-
MoFusion: A Framework for Denoising-Diffusion-based Motion SynthesisRishabh Dabral, Muhammad Hamza Mughal, Vladislav Golyanik, Christian Theobalt2022-12-082022-12-08
-
Judge, Localize, and Edit: Ensuring Visual Commonsense Morality for Text-to-Image GenerationSeongbeom Park, Suhong Moon, Jinkyu KimarXiv 2022. Paper  2022-12-072022-12-07
-
Magic: Multi Art Genre Intelligent Choreography Dataset and Network for 3D Dance GenerationRonghui Li, Junfan Zhao, Yachao Zhang, Mingyang Su, Zeping Ren, Han Zhang, Xiu LiarXiv 2022. Paper  2022-12-072022-12-07
-
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion PriorsZhentao Yu, Zixin Yin, Deyu Zhou, Duomin Wang, Finn Wong, Baoyuan Wang2022-12-072022-12-07
-
Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video EncodingGyeongman Kim, Hajin Shim, Hyunsu Kim, Yunjey Choi, Junho Kim, Eunho Yang2022-12-062022-12-06
-
M-VADER: A Model for Diffusion with Multimodal ContextSamuel Weinbach, Marco Bellagente, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Björn Deiseroth, Koen Oostermeijer, Hannah Teufel, Andres Felipe Cruz-SalinasarXiv 2022. Paper  2022-12-062022-12-06
-
ADIR: Adaptive Diffusion for Image ReconstructionShady Abu-Hussein, Tom Tirer, Raja Giryes2022-12-062022-12-06
-
Diffusion-SDF: Text-to-Shape via Voxelized DiffusionMuheng Li, Yueqi Duan, Jie Zhou, Jiwen Lu2022-12-062022-12-06
-
Semantic-Conditional Diffusion Networks for Image CaptioningJianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Jianlin Feng, Hongyang Chao, Tao Mei2022-12-062022-12-06
-
NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image PriorsCongyue Deng, Chiyu "Max" Jiang, Charles R. Qi, Xinchen Yan, Yin Zhou, Leonidas Guibas, Dragomir AnguelovarXiv 2022. Paper  2022-12-062022-12-06
-
Shape-Guided Diffusion with Inside-Outside AttentionDong Huk Park, Grace Luo, Clayton Toste, Samaneh Azadi, Xihui Liu, Maka Karalashvili, Anna Rohrbach, Trevor Darrell2022-12-012022-12-01
-
Unite and Conquer: Cross Dataset Multimodal Synthesis using Diffusion ModelsNithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, Vishal M. Patel2022-12-012022-12-01
-
DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative ModelGwanghyun Kim, Se Young Chun2022-11-292022-11-29
-
SinDDM: A Single Image Denoising Diffusion ModelVladimir Kulikov, Shahar Yadin, Matan Kleiner, Tomer Michaeli2022-11-292022-11-29
-
Refined Semantic Enhancement towards Frequency Diffusion for Video CaptioningXian Zhong, Zipeng Li, Shuqin Chen, Kui Jiang, Chen Chen, Mang Ye2022-11-282022-11-28
-
Unified Discrete Diffusion for Simultaneous Vision-Language GenerationMinghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, Dacheng Tao, Ponnuthurai N. SuganthanarXiv 2022. Paper  2022-11-272022-11-27
-
SpaText: Spatio-Textual Representation for Controllable Image GenerationOmri Avrahami, Thomas Hayes, Oran Gafni, Sonal Gupta, Yaniv Taigman, Devi Parikh, Dani Lischinski, Ohad Fried, Xi Yin2022-11-252022-11-25
-
3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion ModelsGang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng, Dacheng TaoarXiv 2022. Paper  2022-11-252022-11-25
-
Shifted Diffusion for Text-to-image GenerationYufan Zhou, Bingchen Liu, Yizhe Zhu, Xiao Yang, Changyou Chen, Jinhui XuCVPR 2023. Paper  2022-11-242022-11-24
-
Sketch-Guided Text-to-Image Diffusion ModelsAndrey Voynov, Kfir Aberman, Daniel Cohen-Or2022-11-242022-11-24
-
Schrödinger's Bat: Diffusion Models Sometimes Generate Polysemous Words in SuperpositionJennifer C. White, Ryan CotterellarXiv 2022. Paper  2022-11-232022-11-23
-
Make-A-Story: Visual Memory Conditioned Consistent Story GenerationTanzila Rahman, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Shweta Mahajan, Leonid SigalCVPR 2023. Paper  2022-11-232022-11-23
-
SinDiffusion: Learning a Diffusion Model from a Single Natural ImageWeilun Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li2022-11-222022-11-22
-
Human Evaluation of Text-to-Image Models on a Multi-Task BenchmarkVitali Petsiuk, Alexander E. Siemenn, Saisamrit Surbehera, Zad Chin, Keith Tyser, Gregory Hunter, Arvind Raghavan, Yann Hicke, Bryan A. Plummer, Ori Kerret, Tonio Buonassisi, Kate Saenko, Armando Solar-Lezama, Iddo DroriNeurIPS Workshop 2022. Paper  2022-11-222022-11-22
-
Plug-and-Play Diffusion Features for Text-Driven Image-to-Image TranslationNarek Tumanyan, Michal Geyer, Shai Bagon, Tali Dekel2022-11-222022-11-22
-
EDICT: Exact Diffusion Inversion via Coupled TransformationsBram Wallace, Akash Gokul, Nikhil Naik2022-11-222022-11-22
-
VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models. Ajay Jain, Amber Xie, Pieter Abbeel. 2022-11-21
-
Investigating Prompt Engineering in Diffusion Models. Sam Witteveen, Martin Andrews. NeurIPS Workshop 2022. 2022-11-21
-
Exploring Discrete Diffusion Models for Image Captioning. Zixin Zhu, Yixuan Wei, Jianfeng Wang, Zhe Gan, Zheng Zhang, Le Wang, Gang Hua, Lijuan Wang, Zicheng Liu, Han Hu. 2022-11-21
-
SinFusion: Training Diffusion Models on a Single Image or Video. Yaniv Nikankin, Niv Haim, Michal Irani. 2022-11-21
-
Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models. Xichen Pan, Pengda Qin, Yuhong Li, Hui Xue, Wenhu Chen. 2022-11-20
-
DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization. Nisha Huang, Yuxin Zhang, Fan Tang, Chongyang Ma, Haibin Huang, Yong Zhang, Weiming Dong, Changsheng Xu. arXiv 2022. 2022-11-19
-
Invariant Learning via Diffusion Dreamed Distribution Shifts. Priyatham Kattakinda, Alexander Levine, Soheil Feizi. arXiv 2022. 2022-11-18
-
Magic3D: High-Resolution Text-to-3D Content Creation. Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin. 2022-11-18
-
InstructPix2Pix: Learning to Follow Image Editing Instructions. Tim Brooks, Aleksander Holynski, Alexei A. Efros. 2022-11-17
-
Null-text Inversion for Editing Real Images using Guided Diffusion Models. Ron Mokady, Amir Hertz, Kfir Aberman, Yael Pritch, Daniel Cohen-Or. arXiv 2022. 2022-11-17
-
Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models. Adham Elarabawy, Harish Kamath, Samuel Denton. arXiv 2022. 2022-11-15
-
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model. Xingqian Xu, Zhangyang Wang, Eric Zhang, Kai Wang, Humphrey Shi. 2022-11-15
-
Arbitrary Style Guidance for Enhanced Diffusion-Based Text-to-Image Generation. Zhihong Pan, Xin Zhou, Hao Tian. WACV 2023. 2022-11-14
-
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models. Patrick Schramowski, Manuel Brack, Björn Deiseroth, Kristian Kersting. 2022-11-09
-
Rickrolling the Artist: Injecting Invisible Backdoors into Text-Guided Image Generation Models. Lukas Struppek, Dominik Hintersdorf, Kristian Kersting. 2022-11-04
-
eDiffi: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers. Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu. 2022-11-02
-
UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance. Wei Li, Xue Xu, Xinyan Xiao, Jiachen Liu, Hu Yang, Guohao Li, Zhanpeng Wang, Zhifan Feng, Qiaoqiao She, Yajuan Lyu, Hua Wu. arXiv 2022. 2022-10-28
-
MagicMix: Semantic Mixing with Diffusion Models. Jun Hao Liew, Hanshu Yan, Daquan Zhou, Jiashi Feng. 2022-10-28
-
ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts. Zhida Feng, Zhenyu Zhang, Xintong Yu, Yewei Fang, Lanxin Li, Xuyi Chen, Yuxiang Lu, Jiaxiang Liu, Weichong Yin, Shikun Feng, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang. CVPR 2023. 2022-10-27
-
How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions? Hritik Bansal, Da Yin, Masoud Monajatipoor, Kai-Wei Chang. 2022-10-27
-
DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models. Zijie J. Wang, Evan Montoya, David Munechika, Haoyang Yang, Benjamin Hoover, Duen Horng Chau. 2022-10-26
-
Lafite2: Few-shot Text-to-Image Generation. Yufan Zhou, Chunyuan Li, Changyou Chen, Jianfeng Gao, Jinhui Xu. arXiv 2022. 2022-10-25
-
High-Resolution Image Editing via Multi-Stage Blended Diffusion. Johannes Ackermann, Minjun Li. 2022-10-24
-
A Visual Tour Of Current Challenges In Multimodal Language Models. Shashank Sonkar, Naiming Liu, Richard G. Baraniuk. arXiv 2022. 2022-10-22
-
Conditional Diffusion with Less Explicit Guidance via Model Predictive Control. Max W. Shen, Ehsan Hajiramezanali, Gabriele Scalia, Alex Tseng, Nathaniel Diamant, Tommaso Biancalani, Andreas Loukas. arXiv 2022. 2022-10-21
-
Diffusion Models already have a Semantic Latent Space. Mingi Kwon, Jaeseok Jeong, Youngjung Uh. 2022-10-20
-
DiffEdit: Diffusion-based semantic image editing with mask guidance. Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord. ICLR 2023. 2022-10-20
-
Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation. Ruijun Li, Weihua Li, Yi Yang, Hanyu Wei, Jianhua Jiang, Quan Bai. arXiv 2022. 2022-10-18
-
UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image. Dani Valevski, Matan Kalman, Yossi Matias, Yaniv Leviathan. arXiv 2022. 2022-10-18
-
Imagic: Text-Based Real Image Editing with Diffusion Models. Bahjat Kawar, Shiran Zada, Oran Lang, Omer Tov, Huiwen Chang, Tali Dekel, Inbar Mosseri, Michal Irani. 2022-10-17
-
Leveraging Off-the-shelf Diffusion Model for Multi-attribute Fashion Image Manipulation. Chaerin Kong, DongHyeon Jeon, Ohjoon Kwon, Nojun Kwak. WACV 2022. 2022-10-12
-
Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance. Chen Henry Wu, Fernando De la Torre. 2022-10-11
-
clip2latent: Text driven sampling of a pre-trained StyleGAN using denoising diffusion and CLIP. Justin N. M. Pinkney, Chuan Li. 2022-10-05
-
LDEdit: Towards Generalized Text Guided Image Manipulation via Latent Diffusion Models. Paramanand Chandramouli, Kanchana Vaishnavi Gandikota. BMVC 2022. 2022-10-05
-
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics. Ivan Kapelyukh, Vitalis Vosylius, Edward Johns. IEEE RA-L 2022. 2022-10-05
-
Imagen Video: High Definition Video Generation with Diffusion Models. Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, Tim Salimans. arXiv 2022. 2022-10-05
-
Membership Inference Attacks Against Text-to-image Generation Models. Yixin Wu, Ning Yu, Zheng Li, Michael Backes, Yang Zhang. arXiv 2022. 2022-10-03
-
Re-Imagen: Retrieval-Augmented Text-to-Image Generator. Wenhu Chen, Hexiang Hu, Chitwan Saharia, William W. Cohen. arXiv 2022. 2022-09-29
-
DreamFusion: Text-to-3D using 2D Diffusion. Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall. 2022-09-29
-
Make-A-Video: Text-to-Video Generation without Text-Video Data. Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan Hu, Harry Yang, Oron Ashual, Oran Gafni, Devi Parikh, Sonal Gupta, Yaniv Taigman. arXiv 2022. 2022-09-29
-
Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion. Nisha Huang, Fan Tang, Weiming Dong, Changsheng Xu. 2022-09-27
-
Personalizing Text-to-Image Generation via Aesthetic Gradients. Victor Gallego. 2022-09-25
-
Best Prompts for Text-to-Image Models and How to Find Them. Nikita Pavlichenko, Dmitry Ustalov. NeurIPS Workshop 2022. 2022-09-23
-
The Biased Artist: Exploiting Cultural Biases via Homoglyphs in Text-Guided Image Generation Models. Lukas Struppek, Dominik Hintersdorf, Kristian Kersting. 2022-09-19
-
Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models. Chen Henry Wu, Saman Motamed, Shaunak Srivastava, Fernando De la Torre. 2022-09-14
-
ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation. Zhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing Fu. 2022-09-09
-
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation. Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman. 2022-08-25
-
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models. Robin Rombach, Andreas Blattmann, Björn Ommer. 2022-07-26
-
Discrete Contrastive Diffusion for Cross-Modal and Conditional Generation. Ye Zhu, Yu Wu, Kyle Olszewski, Jian Ren, Sergey Tulyakov, Yan Yan. 2022-06-15
-
Blended Latent Diffusion. Omri Avrahami, Ohad Fried, Dani Lischinski. 2022-06-06
-
Compositional Visual Generation with Composable Diffusion Models. Nan Liu, Shuang Li, Yilun Du, Antonio Torralba, Joshua B. Tenenbaum. 2022-06-03
-
DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder. Jie Shi, Chenfei Wu, Jian Liang, Xiang Liu, Nan Duan. arXiv 2022. 2022-06-01
-
Text2Human: Text-Driven Controllable Human Image Generation. Yuming Jiang, Shuai Yang, Haonan Qiu, Wayne Wu, Chen Change Loy, Ziwei Liu. 2022-05-31
-
Improved Vector Quantized Diffusion Models. Zhicong Tang, Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen. 2022-05-31
-
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding. Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S. Sara Mahdavi, Rapha Gontijo Lopes, Tim Salimans, Jonathan Ho, David J Fleet, Mohammad Norouzi. 2022-05-23
-
Retrieval-Augmented Diffusion Models. Andreas Blattmann, Robin Rombach, Kaan Oktay, Björn Ommer. 2022-04-25
-
Hierarchical Text-Conditional Image Generation with CLIP Latents. Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, Mark Chen. 2022-04-13
-
KNN-Diffusion: Image Generation via Large-Scale Retrieval. Oron Ashual, Shelly Sheynin, Adam Polyak, Uriel Singer, Oran Gafni, Eliya Nachmani, Yaniv Taigman. ICLR 2023. 2022-04-06
-
High-Resolution Image Synthesis with Latent Diffusion Models. Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer. 2021-12-20
-
Tackling the Generative Learning Trilemma with Denoising Diffusion GANs. Zhisheng Xiao, Karsten Kreis, Arash Vahdat. 2021-12-15
-
More Control for Free! Image Synthesis with Semantic Diffusion Guidance. Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell. 2021-12-10
-
Blended Diffusion for Text-driven Editing of Natural Images. Omri Avrahami, Dani Lischinski, Ohad Fried. 2021-11-29
-
Vector Quantized Diffusion Model for Text-to-Image Synthesis. Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Lu Yuan, Baining Guo. 2021-11-29
-
DiffusionCLIP: Text-guided Image Manipulation Using Diffusion Models. Gwanghyun Kim, Jong Chul Ye. 2021-10-06
Counts - 417