  1. Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation
    Zibo Zhao, Wen Liu, Xin Chen, Xianfang Zeng, Rui Wang, Pei Cheng, Bin Fu, Tao Chen, Gang Yu, Shenghua Gao
    arXiv 2023. Paper  
    2023-06-29
  2. PFB-Diff: Progressive Feature Blending Diffusion for Text-driven Image Editing
    Wenjing Huang, Shikui Tu, Lei Xu
    arXiv 2023. Paper  
    2023-06-28
  3. Decompose and Realign: Tackling Condition Misalignment in Text-to-Image Diffusion Models
    Luozhou Wang, Guibao Shen, Yijun Li, Ying-cong Chen
    arXiv 2023. Paper  
    2023-06-26
  4. A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis
    Aishwarya Agarwal, Srikrishna Karanam, K J Joseph, Apoorv Saxena, Koustava Goswami, Balaji Vasan Srinivasan
    arXiv 2023. Paper  
    2023-06-26
  5. DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models
    Ximing Xing, Chuang Wang, Haitao Zhou, Jing Zhang, Qian Yu, Dong Xu
    arXiv 2023. Paper  
    2023-06-26
  6. Zero-shot spatial layout conditioning for text-to-image diffusion models
    Guillaume Couairon, Marlène Careil, Matthieu Cord, Stéphane Lathuilière, Jakob Verbeek
    arXiv 2023. Paper  
    2023-06-23
  7. DreamTime: An Improved Optimization Strategy for Text-to-3D Content Creation
    Yukun Huang, Jianan Wang, Yukai Shi, Xianbiao Qi, Zheng-Jun Zha, Lei Zhang
    arXiv 2023. Paper  
    2023-06-21
  8. RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model
    Zilun Zhang, Tiancheng Zhao, Yulong Guo, Jianwei Yin
    arXiv 2023. Paper  
    2023-06-20
  9. EMoG: Synthesizing Emotive Co-speech 3D Gesture with Diffusion Model
    Lianying Yin, Yijun Wang, Tianyu He, Jinming Liu, Wei Zhao, Bohan Li, Xin Jin, Jianxin Lin
    arXiv 2023. Paper  
    2023-06-20
  10. Align, Adapt and Inject: Sound-guided Unified Image Generation
    Yue Yang, Kaipeng Zhang, Yuying Ge, Wenqi Shao, Zeyue Xue, Yu Qiao, Ping Luo
    arXiv 2023. Paper  
    2023-06-20
  11. Conditional Text Image Generation with Diffusion Models
    Yuanzhi Zhu, Zhaohai Li, Tianwei Wang, Mengchao He, Cong Yao
    arXiv 2023. Paper  
    2023-06-19
  12. Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions
    Yuqi Sun, Reian He, Weimin Tan, Bo Yan
    arXiv 2023. Paper  
    2023-06-19
  13. Point-Cloud Completion with Pretrained Text-to-image Diffusion Models
    Yoni Kasten, Ohad Rahamim, Gal Chechik
    arXiv 2023. Paper  
    2023-06-18
  14. CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models
    Hao-Wen Dong, Xiaoyu Liu, Jordi Pons, Gautam Bhattacharya, Santiago Pascual, Joan Serrà, Taylor Berg-Kirkpatrick, Julian McAuley
    arXiv 2023. Paper  
    2023-06-16
  15. Evaluating the Robustness of Text-to-image Diffusion Models against Real-world Attacks
    Hongcheng Gao, Hao Zhang, Yinpeng Dong, Zhijie Deng
    arXiv 2023. Paper  
    2023-06-16
  16. Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models
    Geon Yeong Park, Jeongsol Kim, Beomsu Kim, Sang Wan Lee, Jong Chul Ye
    arXiv 2023. Paper  
    2023-06-16
  17. Training Multimedia Event Extraction With Generated Images and Captions
    Zilin Du, Yunxin Li, Xu Guo, Yidan Sun, Boyang Li
    arXiv 2023. Paper  
    2023-06-15
  18. Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment
    Royi Rassin, Eran Hirsch, Daniel Glickman, Shauli Ravfogel, Yoav Goldberg, Gal Chechik
    arXiv 2023. Paper  
    2023-06-15
  19. Diffusion Models for Zero-Shot Open-Vocabulary Segmentation
    Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht
    arXiv 2023. Paper  
    2023-06-15
  20. Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis
    Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter
    arXiv 2023. Paper  
    2023-06-15
  21. Taming Diffusion Models for Music-driven Conducting Motion Generation
    Zhuoran Zhao, Jinbin Bai, Delong Chen, Debang Wang, Yubo Pan
    arXiv 2023. Paper  
    2023-06-15
  22. Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation
    Yongqi Yang, Ruoyu Wang, Zhihao Qian, Ye Zhu, Yu Wu
    arXiv 2023. Paper  
    2023-06-14
  23. GBSD: Generative Bokeh with Stage Diffusion
    Jieren Deng, Xin Zhou, Hao Tian, Zhihong Pan, Derek Aguiar
    arXiv 2023. Paper  
    2023-06-14
  24. Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis
    Zhiyu Jin, Xuli Shen, Bin Li, Xiangyang Xue
    arXiv 2023. Paper  
    2023-06-14
  25. Norm-guided latent space exploration for text-to-image generation
    Dvir Samuel, Rami Ben-Ari, Nir Darshan, Haggai Maron, Gal Chechik
    arXiv 2023. Paper  
    2023-06-14
  26. VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing
    Paul Couairon, Clément Rambour, Jean-Emmanuel Haugeard, Nicolas Thome
    arXiv 2023. Paper  
    2023-06-14
  27. Paste, Inpaint and Harmonize via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model
    Xin Zhang, Jiaxian Guo, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa
    arXiv 2023. Paper  
    2023-06-13
  28. Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
    Shuai Yang, Yifan Zhou, Ziwei Liu, Chen Change Loy
    arXiv 2023. Paper  
    2023-06-13
  29. InstructP2P: Learning to Edit 3D Point Clouds with Text Instructions
    Jiale Xu, Xintao Wang, Yan-Pei Cao, Weihao Cheng, Ying Shan, Shenghua Gao
    arXiv 2023. Paper  
    2023-06-12
  30. MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images
    Junchen Zhu, Huan Yang, Huiguo He, Wenjing Wang, Zixi Tuo, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu
    arXiv 2023. Paper  
    2023-06-12
  31. Controlling Text-to-Image Diffusion by Orthogonal Finetuning
    Zeju Qiu, Weiyang Liu, Haiwen Feng, Yuxuan Xue, Yao Feng, Zhen Liu, Dan Zhang, Adrian Weller, Bernhard Schölkopf
    arXiv 2023. Paper  
    2023-06-12
  32. Language-Guided Traffic Simulation via Scene-Level Diffusion
    Ziyuan Zhong, Davis Rempe, Yuxiao Chen, Boris Ivanovic, Yulong Cao, Danfei Xu, Marco Pavone, Baishakhi Ray
    arXiv 2023. Paper  
    2023-06-10
  33. Improving Tuning-Free Real Image Editing with Proximal Guidance
    Ligong Han, Song Wen, Qi Chen, Zhixing Zhang, Kunpeng Song, Mengwei Ren, Ruijiang Gao, Yuxiao Chen, Di Liu, Qilong Zhangli, Anastasis Stathopoulos, Jindong Jiang, Zhaoyang Xia, Akash Srivastava, Dimitris Metaxas
    arXiv 2023. Paper  
    2023-06-08
  34. SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions
    Yuseung Lee, Kunho Kim, Hyunjin Kim, Minhyuk Sung
    arXiv 2023. Paper   Project   Github  
    2023-06-08
  35. Grounded Text-to-Image Synthesis with Attention Refocusing
    Quynh Phung, Songwei Ge, Jia-Bin Huang
    arXiv 2023. Paper  
    2023-06-08
  36. BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping
    Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Lingjie Liu, Josh Susskind
    arXiv 2023. Paper  
    2023-06-08
  37. Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance
    Gihyun Kwon, Jong Chul Ye
    arXiv 2023. Paper  
    2023-06-07
  38. Integrating Geometric Control into Text-to-Image Diffusion Models for High-Quality Detection Data Generation via Text Prompt
    Kai Chen, Enze Xie, Zhe Chen, Lanqing Hong, Zhenguo Li, Dit-Yan Yeung
    arXiv 2023. Paper  
    2023-06-07
  39. Multi-modal Latent Diffusion
    Mustapha Bounoua, Giulio Franzese, Pietro Michiardi
    arXiv 2023. Paper  
    2023-06-07
  40. Designing a Better Asymmetric VQGAN for StableDiffusion
    Zixin Zhu, Xuelu Feng, Dongdong Chen, Jianmin Bao, Le Wang, Yinpeng Chen, Lu Yuan, Gang Hua
    arXiv 2023. Paper   Github  
    2023-06-07
  41. ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
    Maitreya Patel, Tejas Gokhale, Chitta Baral, Yezhou Yang
    arXiv 2023. Paper  
    2023-06-07
  42. WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models
    Changhoon Kim, Kyle Min, Maitreya Patel, Sheng Cheng, Yezhou Yang
    arXiv 2023. Paper  
    2023-06-07
  43. User-friendly Image Editing with Minimal Text Input: Leveraging Captioning and Injection Techniques
    Sunwoo Kim, Wooseok Jang, Hyunsu Kim, Junho Kim, Yunjey Choi, Seungryong Kim, Gayeong Lee
    arXiv 2023. Paper  
    2023-06-05
  44. Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark
    Shuyu Yang, Yinan Zhou, Yaxiong Wang, Yujiao Wu, Li Zhu, Zhedong Zheng
    arXiv 2023. Paper  
    2023-06-05
  45. Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions
    Shaoxu Li
    arXiv 2023. Paper  
    2023-06-05
  46. HeadSculpt: Crafting 3D Head Avatars with Text
    Xiao Han, Yukang Cao, Kai Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang, Kwan-Yee K. Wong
    arXiv 2023. Paper   Project  
    2023-06-05
  47. LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading
    Yochai Yemini, Aviv Shamsian, Lior Bracha, Sharon Gannot, Ethan Fetaya
    arXiv 2023. Paper   Project  
    2023-06-05
  48. Stable Diffusion is Unstable
    Chengbin Du, Yanxi Li, Zhongwei Qiu, Chang Xu
    arXiv 2023. Paper  
    2023-06-05
  49. Detector Guidance for Multi-Object Text-to-Image Generation
    Luping Liu, Zijian Zhang, Yi Ren, Rongjie Huang, Xiang Yin, Zhou Zhao
    arXiv 2023. Paper  
    2023-06-04
  50. Efficient Text-Guided 3D-Aware Portrait Generation with Score Distillation Sampling on Distribution
    Yiji Cheng, Fei Yin, Xiaoke Huang, Xintong Yu, Jiaxiang Liu, Shikun Feng, Yujiu Yang, Yansong Tang
    arXiv 2023. Paper  
    2023-06-03
  51. Word-Level Explanations for Analyzing Bias in Text-to-Image Models
    Alexander Lin, Lucas Monteiro Paes, Sree Harsha Tanneru, Suraj Srinivas, Himabindu Lakkaraju
    arXiv 2023. Paper  
    2023-06-03
  52. Privacy Distillation: Reducing Re-identification Risk of Multimodal Diffusion Models
    Virginia Fernandez, Pedro Sanchez, Walter Hugo Lopez Pinaya, Grzegorz Jacenków, Sotirios A. Tsaftaris, Jorge Cardoso
    arXiv 2023. Paper  
    2023-06-02
  53. Audio-Visual Speech Enhancement with Score-Based Generative Models
    Julius Richter, Simone Frintrop, Timo Gerkmann
    arXiv 2023. Paper  
    2023-06-02
  54. Video Colorization with Pre-trained Text-to-Image Diffusion Models
    Hanyuan Liu, Minshan Xie, Jinbo Xing, Chengze Li, Tien-Tsin Wong
    arXiv 2023. Paper  
    2023-06-02
  55. Probabilistic Adaptation of Text-to-Video Models
    Mengjiao Yang, Yilun Du, Bo Dai, Dale Schuurmans, Joshua B. Tenenbaum, Pieter Abbeel
    arXiv 2023. Paper   Project  
    2023-06-02
  56. FigGen: Text to Scientific Figure Generation
    Juan A. Rodriguez, David Vazquez, Issam Laradji, Marco Pedersoli, Pau Rodriguez
    ICLR 2023. Paper  
    2023-06-01
  57. UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning
    Xiao Dong, Runhui Huang, Xiaoyong Wei, Zequn Jie, Jianxing Yu, Jian Yin, Xiaodan Liang
    arXiv 2023. Paper  
    2023-06-01
  58. Wuerstchen: Efficient Pretraining of Text-to-Image Models
    Pablo Pernias, Dominic Rampas, Marc Aubreville
    arXiv 2023. Paper  
    2023-06-01
  59. Inserting Anybody in Diffusion Models via Celeb Basis
    Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng
    arXiv 2023. Paper   Project  
    2023-06-01
  60. Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
    Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong
    arXiv 2023. Paper   Project  
    2023-06-01
  61. Cocktail: Mixing Multi-Modality Controls for Text-Conditional Image Generation
    Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, Dacheng Tao, Tat-Jen Cham
    arXiv 2023. Paper   Project   Github  
    2023-06-01
  62. The Hidden Language of Diffusion Models
    Hila Chefer, Oran Lang, Mor Geva, Volodymyr Polosukhin, Assaf Shocher, Michal Irani, Inbar Mosseri, Lior Wolf
    arXiv 2023. Paper   Project  
    2023-06-01
  63. ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation
    Shaozhe Hao, Kai Han, Shihao Zhao, Kwan-Yee K. Wong
    arXiv 2023. Paper   Github  
    2023-06-01
  64. Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models
    Chang Liu, Haoning Wu, Yujie Zhong, Xiaoyun Zhang, Weidi Xie
    arXiv 2023. Paper   Project  
    2023-06-01
  65. Intriguing Properties of Text-guided Diffusion Models
    Qihao Liu, Adam Kortylewski, Yutong Bai, Song Bai, Alan Yuille
    arXiv 2023. Paper  
    2023-06-01
  66. StyleDrop: Text-to-Image Generation in Any Style
    Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein, Dilip Krishnan
    arXiv 2023. Paper   Project  
    2023-06-01
  67. Diffusion Self-Guidance for Controllable Image Generation
    Dave Epstein, Allan Jabri, Ben Poole, Alexei A. Efros, Aleksander Holynski
    arXiv 2023. Paper   Project  
    2023-06-01
  68. StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners
    Yonglong Tian, Lijie Fan, Phillip Isola, Huiwen Chang, Dilip Krishnan
    arXiv 2023. Paper  
    2023-06-01
  69. Boosting Text-to-Image Diffusion Models with Fine-Grained Semantic Rewards
    Guian Fang, Zutao Jiang, Jianhua Han, Guansong Lu, Hang Xu, Xiaodan Liang
    arXiv 2023. Paper   Github  
    2023-05-31
  70. Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor
    Ruizhi Shao, Jingxiang Sun, Cheng Peng, Zerong Zheng, Boyao Zhou, Hongwen Zhang, Yebin Liu
    arXiv 2023. Paper   Project  
    2023-05-31
  71. Understanding and Mitigating Copying in Diffusion Models
    Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein
    CVPR 2023. Paper   Github  
    2023-05-31
  72. Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images
    Peyman Gholami, Robert Xiao
    arXiv 2023. Paper  
    2023-05-31
  73. LayerDiffusion: Layered Controlled Image Editing with Diffusion Models
    Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li
    arXiv 2023. Paper  
    2023-05-30
  74. HiFA: High-fidelity Text-to-3D with Advanced Diffusion Guidance
    Junzhe Zhu, Peiye Zhuang
    arXiv 2023. Paper  
    2023-05-30
  75. StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation
    Chi Zhang, Yiwen Chen, Yijun Fu, Zhenglin Zhou, Gang Yu, Billzb Wang, Bin Fu, Tao Chen, Guosheng Lin, Chunhua Shen
    arXiv 2023. Paper  
    2023-05-30
  76. Nested Diffusion Processes for Anytime Image Generation
    Noam Elata, Bahjat Kawar, Tomer Michaeli, Michael Elad
    arXiv 2023. Paper  
    2023-05-30
  77. Video ControlNet: Towards Temporally Consistent Synthetic-to-Real Video Translation Using Conditional Image Diffusion Models
    Ernie Chu, Shuo-Yen Lin, Jun-Cheng Chen
    arXiv 2023. Paper  
    2023-05-30
  78. PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation
    Jialu Li, Mohit Bansal
    arXiv 2023. Paper   Project   Github  
    2023-05-30
  79. Perturbation-Assisted Sample Synthesis: A Novel Approach for Uncertainty Quantification
    Yifei Liu, Rex Shen, Xiaotong Shen
    arXiv 2023. Paper  
    2023-05-30
  80. Conditional Score Guidance for Text-Driven Image-to-Image Translation
    Hyunsoo Lee, Minsoo Kang, Bohyung Han
    arXiv 2023. Paper  
    2023-05-29
  81. InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions
    Qian Wang, Biao Zhang, Michael Birsak, Peter Wonka
    arXiv 2023. Paper  
    2023-05-29
  82. Text-Only Image Captioning with Multi-Context Data Generation
    Feipeng Ma, Yizhou Zhou, Fengyun Rao, Yueyi Zhang, Xiaoyan Sun
    arXiv 2023. Paper  
    2023-05-29
  83. Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
    Fu-Yun Wang, Wenshuo Chen, Guanglu Song, Han-Jia Ye, Yu Liu, Hongsheng Li
    arXiv 2023. Paper   Github  
    2023-05-29
  84. Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
    Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou
    arXiv 2023. Paper   Project  
    2023-05-29
  85. RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths
    Zeyue Xue, Guanglu Song, Qiushan Guo, Boxiao Liu, Zhuofan Zong, Yu Liu, Ping Luo
    arXiv 2023. Paper  
    2023-05-29
  86. Controllable Text-to-Image Generation with GPT-4
    Tianjun Zhang, Yi Zhang, Vibhav Vineet, Neel Joshi, Xin Wang
    arXiv 2023. Paper  
    2023-05-29
  87. Cognitively Inspired Cross-Modal Data Generation Using Diffusion Models
    Zizhao Hu, Mohammad Rostami
    NeurIPS 2023. Paper  
    2023-05-28
  88. FISEdit: Accelerating Text-to-image Editing via Cache-enabled Sparse Diffusion Inference
    Zihao Yu, Haoyang Li, Fangcheng Fu, Xupeng Miao, Bin Cui
    arXiv 2023. Paper  
    2023-05-27
  89. Towards Consistent Video Editing with Text-to-Image Diffusion Models
    Zicheng Zhang, Bonan Li, Xuecheng Nie, Congying Han, Tiande Guo, Luoqi Liu
    arXiv 2023. Paper  
    2023-05-27
  90. Text-to-image Editing by Image Information Removal
    Zhongping Zhang, Jian Zheng, Jacob Zhiyuan Fang, Bryan A. Plummer
    arXiv 2023. Paper  
    2023-05-27
  91. Negative-prompt Inversion: Fast Image Inversion for Editing with Text-guided Diffusion Models
    Daiki Miyake, Akihiro Iohara, Yu Saito, Toshiyuki Tanaka
    arXiv 2023. Paper  
    2023-05-26
  92. Improved Visual Story Generation with Adaptive Context Modeling
    Zhangyin Feng, Yuchen Ren, Xinmiao Yu, Xiaocheng Feng, Duyu Tang, Shuming Shi, Bing Qin
    arXiv 2023. Paper  
    2023-05-26
  93. ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing
    Min Zhao, Rongzhen Wang, Fan Bao, Chongxuan Li, Jun Zhu
    arXiv 2023. Paper   Project  
    2023-05-26
  94. Custom-Edit: Text-Guided Image Editing with Customized Diffusion Models
    Jooyoung Choi, Yunjey Choi, Yunji Kim, Junho Kim, Sungroh Yoon
    arXiv 2023. Paper  
    2023-05-25
  95. On Architectural Compression of Text-to-Image Diffusion Models
    Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells, Shinkook Choi
    arXiv 2023. Paper  
    2023-05-25
  96. ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation
    Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu
    arXiv 2023. Paper   Project  
    2023-05-25
  97. ProSpect: Expanded Conditioning for the Personalization of Attribute-aware Image Generation
    Yuxin Zhang, Weiming Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu
    arXiv 2023. Paper  
    2023-05-25
  98. Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models
    Xingqian Xu, Jiayi Guo, Zhangyang Wang, Gao Huang, Irfan Essa, Humphrey Shi
    arXiv 2023. Paper   Github  
    2023-05-25
  99. Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation
    Lisa Dunlap, Alyssa Umino, Han Zhang, Jiezhi Yang, Joseph E. Gonzalez, Trevor Darrell
    arXiv 2023. Paper   Github  
    2023-05-25
  100. Break-A-Scene: Extracting Multiple Concepts from a Single Image
    Omri Avrahami, Kfir Aberman, Ohad Fried, Daniel Cohen-Or, Dani Lischinski
    arXiv 2023. Paper   Project  
    2023-05-25
  101. Parallel Sampling of Diffusion Models
    Andy Shih, Suneel Belkhale, Stefano Ermon, Dorsa Sadigh, Nima Anari
    arXiv 2023. Paper   Github  
    2023-05-25
  102. Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
    Shihao Zhao, Dongdong Chen, Yen-Chun Chen, Jianmin Bao, Shaozhe Hao, Lu Yuan, Kwan-Yee K. Wong
    arXiv 2023. Paper   Project   Github  
    2023-05-25
  103. DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models
    Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, Moonkyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee
    arXiv 2023. Paper  
    2023-05-25
  104. Are Diffusion Models Vision-And-Language Reasoners?
    Benno Krojer, Elinor Poole-Dayan, Vikram Voleti, Christopher Pal, Siva Reddy
    arXiv 2023. Paper   Github  
    2023-05-25
  105. BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing
    Dongxu Li, Junnan Li, Steven C. H. Hoi
    arXiv 2023. Paper  
    2023-05-24
  106. I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors
    Tuhin Chakrabarty, Arkadiy Saakyan, Olivia Winn, Artemis Panagopoulou, Yue Yang, Marianna Apidianaki, Smaranda Muresan
    arXiv 2023. Paper  
    2023-05-24
  107. DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models
    Sungnyun Kim, Junsoo Lee, Kibeom Hong, Daesik Kim, Namhyuk Ahn
    arXiv 2023. Paper   Github  
    2023-05-24
  108. ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation
    Dongxu Yue, Qin Guo, Munan Ning, Jiaxi Cui, Yuesheng Zhu, Li Yuan
    arXiv 2023. Paper  
    2023-05-24
  109. MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation
    Marco Bellagente, Manuel Brack, Hannah Teufel, Felix Friedrich, Björn Deiseroth, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Koen Oostermeijer, Andres Felipe Cruz-Salinas, Patrick Schramowski, Kristian Kersting, Samuel Weinbach
    arXiv 2023. Paper  
    2023-05-24
  110. LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
    Long Lian, Boyi Li, Adam Yala, Trevor Darrell
    arXiv 2023. Paper  
    2023-05-23
  111. Understanding Text-driven Motion Synthesis with Keyframe Collaboration via Diffusion Models
    Dong Wei, Xiaoning Sun, Huaijiang Sun, Bin Li, Shengxiang Hu, Weiqing Li, Jianfeng Lu
    arXiv 2023. Paper  
    2023-05-23
  112. Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models
    Weifeng Chen, Jie Wu, Pan Xie, Hefeng Wu, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin
    arXiv 2023. Paper  
    2023-05-23
  113. Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
    Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, Yang Zhang
    arXiv 2023. Paper  
    2023-05-23
  114. Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models
    Ruichen Wang, Zekang Chen, Chen Chen, Jian Ma, Haonan Lu, Xiaodong Lin
    arXiv 2023. Paper  
    2023-05-23
  115. The CLIP Model is Secretly an Image-to-Prompt Converter
    Yuxuan Ding, Chunna Tian, Haoxuan Ding, Lingqiao Liu
    arXiv 2023. Paper  
    2023-05-22
  116. AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation
    Guy Yariv, Itai Gat, Lior Wolf, Yossi Adi, Idan Schwartz
    arXiv 2023. Paper  
    2023-05-22
  117. ControlVideo: Training-free Controllable Text-to-Video Generation
    Yabo Zhang, Yuxiang Wei, Dongsheng Jiang, Xiaopeng Zhang, Wangmeng Zuo, Qi Tian
    arXiv 2023. Paper   Github  
    2023-05-22
  118. If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection
    Shyamgopal Karthik, Karsten Roth, Massimiliano Mancini, Zeynep Akata
    arXiv 2023. Paper   Project  
    2023-05-22
  119. Training Diffusion Models with Reinforcement Learning
    Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine
    arXiv 2023. Paper  
    2023-05-22
  120. FACTIFY3M: A Benchmark for Multimodal Fact Verification with Explainability through 5W Question-Answering
    Megha Chakraborty, Khusbu Pahwa, Anku Rani, Adarsh Mahor, Aditya Pakala, Arghya Sarkar, Harshit Dave, Ishan Paul, Janvita Reddy, Preethi Gurumurthy, Ritvik G, Samahriti Mukherjee, Shreyas Chatterjee, Kinjal Sensharma, Dwip Dalal, Suryavardan S, Shreyash Mishra, Parth Patwa, Aman Chadha, Amit Sheth, Amitava Das
    arXiv 2023. Paper  
    2023-05-22
  121. LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On
    Davide Morelli, Alberto Baldrati, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita Cucchiara
    arXiv 2023. Paper  
    2023-05-22
  122. Adversarial Nibbler: A Data-Centric Challenge for Improving the Safety of Text-to-Image Models
    Alicia Parrish, Hannah Rose Kirk, Jessica Quaye, Charvi Rastogi, Max Bartolo, Oana Inel, Juan Ciro, Rafael Mosquera, Addison Howard, Will Cukierski, D. Sculley, Vijay Janapa Reddi, Lora Aroyo
    arXiv 2023. Paper  
    2023-05-22
  123. InstructVid2Vid: Controllable Video Editing with Natural Language Instructions
    Bosheng Qin, Juncheng Li, Siliang Tang, Tat-Seng Chua, Yueting Zhuang
    arXiv 2023. Paper  
    2023-05-21
  124. SneakyPrompt: Evaluating Robustness of Text-to-image Generative Models' Safety Filters
    Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, Yinzhi Cao
    arXiv 2023. Paper  
    2023-05-20
  125. Efficient Cross-Lingual Transfer for Chinese Stable Diffusion with Images as Pivots
    Jinyi Hu, Xu Han, Xiaoyuan Yi, Yutong Chen, Wenhao Li, Zhiyuan Liu, Maosong Sun
    arXiv 2023. Paper  
    2023-05-19
  126. Brain Captioning: Decoding human brain activity into images and text
    Matteo Ferrante, Furkan Ozcelik, Tommaso Boccato, Rufin VanRullen, Nicola Toschi
    arXiv 2023. Paper  
    2023-05-19
  127. Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields
    Jingbo Zhang, Xiaoyu Li, Ziyu Wan, Can Wang, Jing Liao
    arXiv 2023. Paper  
    2023-05-19
  128. Any-to-Any Generation via Composable Diffusion
    Zineng Tang, Ziyi Yang, Chenguang Zhu, Michael Zeng, Mohit Bansal
    arXiv 2023. Paper   Project   Github  
    2023-05-19
  129. Late-Constraint Diffusion Guidance for Controllable Image Synthesis
    Chang Liu, Dong Liu
    arXiv 2023. Paper   Project   Github  
    2023-05-19
  130. Inspecting the Geographical Representativeness of Images from Text-to-Image Models
    Abhipsa Basu, R. Venkatesh Babu, Danish Pruthi
    arXiv 2023. Paper  
    2023-05-18
  131. X-IQE: eXplainable Image Quality Evaluation for Text-to-Image Generation with Visual Large Language Models
    Yixiong Chen
    arXiv 2023. Paper   Github  
    2023-05-18
  132. LDM3D: Latent Diffusion Model for 3D
    Gabriela Ben Melech Stan, Diana Wofk, Scottie Fox, Alex Redden, Will Saxton, Jean Yu, Estelle Aflalo, Shao-Yen Tseng, Fabio Nonato, Matthias Muller, Vasudev Lal
    arXiv 2023. Paper  
    2023-05-18
  133. VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
    Wenjing Wang, Huan Yang, Zixi Tuo, Huiguo He, Junchen Zhu, Jianlong Fu, Jiaying Liu
    arXiv 2023. Paper  
    2023-05-18
  134. TextDiffuser: Diffusion Models as Text Painters
    Jingye Chen, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei
    arXiv 2023. Paper  
    2023-05-18
  135. AIwriting: Relations Between Image Generation and Digital Writing
    Scott Rettberg, Talan Memmott, Jill Walker Rettberg, Jason Nelson, Patrick Lichty
    ISEA 2023. Paper  
    2023-05-18
  136. Zero-Day Backdoor Attack against Text-to-Image Diffusion Models via Personalization
    Yihao Huang, Qing Guo, Felix Juefei-Xu
    arXiv 2023. Paper  
    2023-05-18
  137. Discriminative Diffusion Models as Few-shot Vision and Language Learners
    Xuehai He, Weixi Feng, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang
    arXiv 2023. Paper  
    2023-05-18
  138. Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
    Songwei Ge, Seungjun Nah, Guilin Liu, Tyler Poon, Andrew Tao, Bryan Catanzaro, David Jacobs, Jia-Bin Huang, Ming-Yu Liu, Yogesh Balaji
    arXiv 2023. Paper   Project  
    2023-05-17
  139. Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation
    Samaneh Azadi, Akbar Shah, Thomas Hayes, Devi Parikh, Sonal Gupta
    arXiv 2023. Paper   Project  
    2023-05-16
  140. Generating coherent comic with rich story using ChatGPT and Stable Diffusion
    Ze Jin, Zorina Song
    arXiv 2023. Paper  
    2023-05-16
  141. AMD: Autoregressive Motion Diffusion
    Bo Han, Hao Peng, Minjing Dong, Chang Xu, Yi Ren, Yixuan Shen, Yuheng Li
    arXiv 2023. Paper  
    2023-05-16
  142. Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models
    Krishna Sri Ipsit Mantri, Nevasini Sasikumar
    arXiv 2023. Paper  
    2023-05-15
  143. Common Diffusion Noise Schedules and Sample Steps are Flawed
    Shanchuan Lin, Bingchen Liu, Jiashi Li, Xiao Yang
    arXiv 2023. Paper  
    2023-05-15
  144. Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts
    Yuyang Zhao, Enze Xie, Lanqing Hong, Zhenguo Li, Gim Hee Lee
    arXiv 2023. Paper   Project   Github  
    2023-05-15
  145. Null-text Guidance in Diffusion Models is Secretly a Cartoon-style Creator
    Jing Zhao, Heliang Zheng, Chaoyue Wang, Long Lan, Wanrong Huang, Wenjing Yang
    arXiv 2023. Paper   Project   Github  
    2023-05-11
  146. iEdit: Localised Text-guided Image Editing with Weak Supervision
    Rumeysa Bodur, Erhan Gundogdu, Binod Bhattarai, Tae-Kyun Kim, Michael Donoser, Loris Bazzani
    arXiv 2023. Paper  
    2023-05-10
  147. Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer
    Nisha Huang, Yuxin Zhang, Weiming Dong
    arXiv 2023. Paper  
    2023-05-09
    2023-05-09
  148. SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models
    Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
    arXiv 2023. Paper   Github  
    2023-05-09
    2023-05-09
  149. Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models
    Wenkai Dong, Song Xue, Xiaoyue Duan, Shumin Han
    arXiv 2023. Paper  
    2023-05-08
    2023-05-08
  150. ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation
    Yupei Lin, Sen Zhang, Xiaojun Yang, Xiao Wang, Yukai Shi
    arXiv 2023. Paper   Project  
    2023-05-08
    2023-05-08
  151. IIITD-20K: Dense captioning for Text-Image ReID
    A V Subramanyam, Niranjan Sundararajan, Vibhu Dubey, Brejesh Lall
    arXiv 2023. Paper  
    2023-05-08
    2023-05-08
  152. DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models
    Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Ming Cheng, Long Xiao
    IJCAI 2023. Paper   Github  
    2023-05-08
    2023-05-08
  153. Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning
    Shengfang Zhai, Yinpeng Dong, Qingni Shen, Shi Pu, Yuejian Fang, Hang Su
    arXiv 2023. Paper  
    2023-05-07
    2023-05-07
  154. AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion
    Seungwoo Lee, Chaerin Kong, Donghyeon Jeon, Nojun Kwak
    arXiv 2023. Paper  
    2023-05-06
    2023-05-06
  155. Guided Image Synthesis via Initial Image Editing in Diffusion Model
    Jiafeng Mao, Xueting Wang, Kiyoharu Aizawa
    arXiv 2023. Paper  
    2023-05-05
    2023-05-05
  156. DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation
    Hong Chen, Yipeng Zhang, Xin Wang, Xuguang Duan, Yuwei Zhou, Wenwu Zhu
    arXiv 2023. Paper   Project  
    2023-05-05
    2023-05-05
  157. Data Curation for Image Captioning with Text-to-Image Generative Models
    Wenyan Li, Jonas F. Lotz, Chen Qiu, Desmond Elliott
    arXiv 2023. Paper  
    2023-05-05
    2023-05-05
  158. Multimodal-driven Talking Face Generation, Face Swapping, Diffusion Model
    Chao Xu, Shaoting Zhu, Junwei Zhu, Tianxin Huang, Jiangning Zhang, Ying Tai, Yong Liu
    arXiv 2023. Paper  
    2023-05-04
    2023-05-04
  159. Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion
    Seongmin Lee, Benjamin Hoover, Hendrik Strobelt, Zijie J. Wang, ShengYun Peng, Austin Wright, Kevin Li, Haekyu Park, Haoyang Yang, Duen Horng Chau
    arXiv 2023. Paper   Project  
    2023-05-04
    2023-05-04
  160. Multimodal Data Augmentation for Image Captioning using Diffusion Models
    Changrong Xiao, Sean Xin Xu, Kunpeng Zhang
    arXiv 2023. Paper  
    2023-05-03
    2023-05-03
  161. In-Context Learning Unlocked for Diffusion Models
    Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou
    arXiv 2023. Paper   Project   Github  
    2023-05-01
    2023-05-01
  162. SceneGenie: Scene Graph Guided Diffusion Models for Image Synthesis
    Azade Farshad, Yousef Yeganeh, Yu Chi, Chengzhi Shen, Björn Ommer, Nassir Navab
    arXiv 2023. Paper  
    2023-04-28
    2023-04-28
  163. Edit Everything: A Text-Guided Generative System for Images Editing
    Defeng Xie, Ruichen Wang, Jian Ma, Chen Chen, Haonan Lu, Dong Yang, Fobo Shi, Xiaodong Lin
    arXiv 2023. Paper   Github  
    2023-04-27
    2023-04-27
  164. It is all about where you start: Text-to-image generation with seed selection
    Dvir Samuel, Rami Ben-Ari, Simon Raviv, Nir Darshan, Gal Chechik
    arXiv 2023. Paper  
    2023-04-27
    2023-04-27
  165. Training-Free Location-Aware Text-to-Image Synthesis
    Jiafeng Mao, Xueting Wang
    arXiv 2023. Paper  
    2023-04-26
    2023-04-26
  166. TextMesh: Generation of Realistic 3D Meshes From Text Prompts
    Christina Tsalicoglou, Fabian Manhardt, Alessio Tonioni, Michael Niemeyer, Federico Tombari
    arXiv 2023. Paper  
    2023-04-24
    2023-04-24
  167. Using Text-to-Image Generation for Architectural Design Ideation
    Ville Paananen, Jonas Oppenlaender, Aku Visuri
    arXiv 2023. Paper  
    2023-04-20
    2023-04-20
  168. Anything-3D: Towards Single-view Anything Reconstruction in the Wild
    Qiuhong Shen, Xingyi Yang, Xinchao Wang
    arXiv 2023. Paper   Github  
    2023-04-19
    2023-04-19
  169. Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
    Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis
    CVPR 2023. Paper   Project  
    2023-04-18
    2023-04-18
  170. TTIDA: Controllable Generative Data Augmentation via Text-to-Text and Text-to-Image Models
    Yuwei Yin, Jean Kaddour, Xiang Zhang, Yixin Nie, Zhenguang Liu, Lingpeng Kong, Qi Liu
    arXiv 2023. Paper  
    2023-04-18
    2023-04-18
  171. UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer
    Soon Yau Cheong, Armin Mustafa, Andrew Gilbert
    arXiv 2023. Paper   Github  
    2023-04-18
    2023-04-18
  172. MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
    Mingdeng Cao, Xintao Wang, Zhongang Qi, Ying Shan, Xiaohu Qie, Yinqiang Zheng
    arXiv 2023. Paper   Github  
    2023-04-17
    2023-04-17
  173. Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation
    Jie An, Songyang Zhang, Harry Yang, Sonal Gupta, Jia-Bin Huang, Jiebo Luo, Xi Yin
    arXiv 2023. Paper   Project  
    2023-04-17
    2023-04-17
  174. Text2Performer: Text-Driven Human Video Generation
    Yuming Jiang, Shuai Yang, Tong Liang Koh, Wayne Wu, Chen Change Loy, Ziwei Liu
    arXiv 2023. Paper   Project  
    2023-04-17
    2023-04-17
  175. Delta Denoising Score
    Amir Hertz, Kfir Aberman, Daniel Cohen-Or
    arXiv 2023. Paper   Project  
    2023-04-14
    2023-04-14
  176. Text-Conditional Contextualized Avatars For Zero-Shot Personalization
    Samaneh Azadi, Thomas Hayes, Akbar Shah, Guan Pang, Devi Parikh, Sonal Gupta
    arXiv 2023. Paper  
    2023-04-14
    2023-04-14
  177. Soundini: Sound-Guided Diffusion for Natural Video Editing
    Seung Hyun Lee, Sieun Kim, Innfarn Yoo, Feng Yang, Donghyeon Cho, Youngseo Kim, Huiwen Chang, Jinkyu Kim, Sangpil Kim
    arXiv 2023. Paper   Project  
    2023-04-13
    2023-04-13
  178. Expressive Text-to-Image Generation with Rich Text
    Songwei Ge, Taesung Park, Jun-Yan Zhu, Jia-Bin Huang
    arXiv 2023. Paper   Project   Github  
    2023-04-13
    2023-04-13
  179. Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA
    James Seale Smith, Yen-Chang Hsu, Lingyu Zhang, Ting Hua, Zsolt Kira, Yilin Shen, Hongxia Jin
    arXiv 2023. Paper   Project  
    2023-04-12
    2023-04-12
  180. An Edit Friendly DDPM Noise Space: Inversion and Manipulations
    Inbar Huberman-Spiegelglas, Vladimir Kulikov, Tomer Michaeli
    arXiv 2023. Paper  
    2023-04-12
    2023-04-12
  181. Improving Diffusion Models for Scene Text Editing with Dual Encoders
    Jiabao Ji, Guanhua Zhang, Zhaowen Wang, Bairu Hou, Zhifei Zhang, Brian Price, Shiyu Chang
    arXiv 2023. Paper   Github  
    2023-04-12
    2023-04-12
  182. Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond
    Mohammadreza Armandpour, Huangjie Zheng, Ali Sadeghian, Amir Sadeghian, Mingyuan Zhou
    arXiv 2023. Paper  
    2023-04-11
    2023-04-11
  183. HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
    Eslam Mohamed Bakr, Pengzhan Sun, Xiaoqian Shen, Faizan Farooq Khan, Li Erran Li, Mohamed Elhoseiny
    arXiv 2023. Paper   Project  
    2023-04-11
    2023-04-11
  184. Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models
    Nikita Starodubcev, Dmitry Baranchuk, Valentin Khrulkov, Artem Babenko
    arXiv 2023. Paper  
    2023-04-10
    2023-04-10
  185. HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation
    Xuan Ju, Ailing Zeng, Chenchen Zhao, Jianan Wang, Lei Zhang, Qiang Xu
    arXiv 2023. Paper   Github  
    2023-04-09
    2023-04-09
  186. Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis
    Qiucheng Wu, Yujian Liu, Handong Zhao, Trung Bui, Zhe Lin, Yang Zhang, Shiyu Chang
    arXiv 2023. Paper   Github  
    2023-04-07
    2023-04-07
  187. DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model
    Hoigi Seo, Hayeon Kim, Gwanghyun Kim, Se Young Chun
    arXiv 2023. Paper   Project  
    2023-04-06
    2023-04-06
  188. Benchmarking Robustness to Text-Guided Corruptions
    Mohammadreza Mofayezi, Yasamin Medghalchi
    arXiv 2023. Paper  
    2023-04-06
    2023-04-06
  189. Training-Free Layout Control with Cross-Attention Guidance
    Minghao Chen, Iro Laina, Andrea Vedaldi
    arXiv 2023. Paper   Project   Github  
    2023-04-06
    2023-04-06
  190. Zero-shot Generative Model Adaptation via Image-specific Prompt Learning
    Jiayi Guo, Chaofei Wang, You Wu, Eric Zhang, Kai Wang, Xingqian Xu, Shiji Song, Humphrey Shi, Gao Huang
    CVPR 2023. Paper   Github  
    2023-04-06
    2023-04-06
  191. A Diffusion-based Method for Multi-turn Compositional Image Generation
    Chao Wang, Xiaoyu Yang, Jinmiao Huang, Kevin Ferreira
    arXiv 2023. Paper  
    2023-04-05
    2023-04-05
  192. Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models
    Xuhui Jia, Yang Zhao, Kelvin C.K. Chan, Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, Yu-Chuan Su
    arXiv 2023. Paper  
    2023-04-05
    2023-04-05
  193. Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models
    Jaewoong Lee, Sangwon Jang, Jaehyeong Jo, Jaehong Yoon, Yunji Kim, Jin-Hwa Kim, Jung-Woo Ha, Sung Ju Hwang
    arXiv 2023. Paper  
    2023-04-04
    2023-04-04
  194. PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image Diffusion
    Gwanghyun Kim, Ji Ha Jang, Se Young Chun
    arXiv 2023. Paper   Project  
    2023-04-04
    2023-04-04
  195. Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing
    Alberto Baldrati, Davide Morelli, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita Cucchiara
    arXiv 2023. Paper  
    2023-04-04
    2023-04-04
  196. viz2viz: Prompt-driven stylized visualization generation using a diffusion model
    Jiaqi Wu, John Joon Young Chung, Eytan Adar
    arXiv 2023. Paper  
    2023-04-04
    2023-04-04
  197. DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models
    Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong
    arXiv 2023. Paper  
    2023-04-03
    2023-04-03
  198. ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
    Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu
    arXiv 2023. Paper   Project   Github  
    2023-04-03
    2023-04-03
  199. DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance
    Longwen Zhang, Qiwei Qiu, Hongyang Lin, Qixuan Zhang, Cheng Shi, Wei Yang, Ye Shi, Sibei Yang, Lan Xu, Jingyi Yu
    arXiv 2023. Paper   Project  
    2023-04-01
    2023-04-01
  200. GlyphDraw: Learning to Draw Chinese Characters in Image Synthesis Models Coherently
    Jian Ma, Mingjun Zhao, Chen Chen, Ruichen Wang, Di Niu, Haonan Lu, Xiaodong Lin
    arXiv 2023. Paper   Project  
    2023-03-31
    2023-03-31
  201. LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation
    Guangcong Zheng, Xianpan Zhou, Xuewei Li, Zhongang Qi, Ying Shan, Xi Li
    CVPR 2023. Paper   Github  
    2023-03-30
    2023-03-30
  202. DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder
    Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian
    arXiv 2023. Paper  
    2023-03-30
    2023-03-30
  203. Discriminative Class Tokens for Text-to-Image Diffusion Models
    Idan Schwartz, Vésteinn Snæbjarnarson, Sagie Benaim, Hila Chefer, Ryan Cotterell, Lior Wolf, Serge Belongie
    arXiv 2023. Paper  
    2023-03-30
    2023-03-30
  204. Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
    Wen Wang, Kangyang Xie, Zide Liu, Hao Chen, Yue Cao, Xinlong Wang, Chunhua Shen
    arXiv 2023. Paper  
    2023-03-30
    2023-03-30
  205. DiffCollage: Parallel Generation of Large Content with Diffusion Models
    Qinsheng Zhang, Jiaming Song, Xun Huang, Yongxin Chen, Ming-Yu Liu
    CVPR 2023. Paper   Project  
    2023-03-30
    2023-03-30
  206. Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
    Eric Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, Humphrey Shi
    arXiv 2023. Paper   Github  
    2023-03-30
    2023-03-30
  207. Social Biases through the Text-to-Image Generation Lens
    Ranjita Naik, Besmira Nushi
    arXiv 2023. Paper  
    2023-03-30
    2023-03-30
  208. PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion Models
    Vidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi
    arXiv 2023. Paper   Github  
    2023-03-30
    2023-03-30
  209. AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control
    Ruixiang Jiang, Can Wang, Jingbo Zhang, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao
    arXiv 2023. Paper   Project   Github  
    2023-03-30
    2023-03-30
  210. MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion Path
    Qian Wang, Biao Zhang, Michael Birsak, Peter Wonka
    arXiv 2023. Paper   Github  
    2023-03-29
    2023-03-29
  211. 4D Facial Expression Diffusion Model
    Kaifeng Zou, Sylvain Faisan, Boyang Yu, Sébastien Valette, Hyewon Seo
    arXiv 2023. Paper   Github  
    2023-03-29
    2023-03-29
  212. StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing
    Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang
    arXiv 2023. Paper  
    2023-03-28
    2023-03-28
  213. Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion
    Hiromichi Kamata, Yuiko Sakuma, Akio Hayakawa, Masato Ishii, Takuya Narihira
    arXiv 2023. Paper   Github  
    2023-03-28
    2023-03-28
  214. Anti-DreamBooth: Protecting users from personalized text-to-image synthesis
    Thanh Van Le, Hao Phung, Thuan Hoang Nguyen, Quan Dao, Ngoc Tran, Anh Tran
    SIGGRAPH 2023. Paper   Github  
    2023-03-27
    2023-03-27
  215. Debiasing Scores and Prompts of 2D Diffusion for Robust Text-to-3D Generation
    Susung Hong, Donghoon Ahn, Seungryong Kim
    arXiv 2023. Paper  
    2023-03-27
    2023-03-27
  216. Seer: Language Instructed Video Prediction with Latent Diffusion Models
    Xianfan Gu, Chuan Wen, Jiaming Song, Yang Gao
    CVPR Workshop 2023. Paper  
    2023-03-27
    2023-03-27
  217. GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents
    Tenglong Ao, Zeyi Zhang, Libin Liu
    arXiv 2023. Paper  
    2023-03-26
    2023-03-26
  218. Better Aligning Text-to-Image Models with Human Preference
    Xiaoshi Wu, Keqiang Sun, Feng Zhu, Rui Zhao, Hongsheng Li
    arXiv 2023. Paper   Github  
    2023-03-25
    2023-03-25
  219. Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation
    Rui Chen, Yongwei Chen, Ningxin Jiao, Kui Jia
    arXiv 2023. Paper  
    2023-03-24
    2023-03-24
  220. CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout
    Yiqi Lin, Haotian Bai, Sijia Li, Haonan Lu, Xiaodong Lin, Hui Xiong, Lin Wang
    arXiv 2023. Paper   Project  
    2023-03-24
    2023-03-24
  221. DiffuScene: Scene Graph Denoising Diffusion Probabilistic Model for Generative Indoor Scene Synthesis
    Jiapeng Tang, Yinyu Nie, Lev Markhasin, Angela Dai, Justus Thies, Matthias Nießner
    arXiv 2023. Paper   Project  
    2023-03-24
    2023-03-24
  222. ISS++: Image as Stepping Stone for Text-Guided 3D Shape Generation
    Zhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing Fu
    ICLR 2023. Paper  
    2023-03-24
    2023-03-24
  223. MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models
    Jing Zhao, Heliang Zheng, Chaoyue Wang, Long Lan, Wenjing Yang
    arXiv 2023. Paper   Project   Github  
    2023-03-23
    2023-03-23
  224. Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
    Levon Khachatryan, Andranik Movsisyan, Vahram Tadevosyan, Roberto Henschel, Zhangyang Wang, Shant Navasardyan, Humphrey Shi
    arXiv 2023. Paper   Github  
    2023-03-23
    2023-03-23
  225. Ablating Concepts in Text-to-Image Diffusion Models
    Nupur Kumari, Bingliang Zhang, Sheng-Yu Wang, Eli Shechtman, Richard Zhang, Jun-Yan Zhu
    arXiv 2023. Paper   Project   Github  
    2023-03-23
    2023-03-23
  226. ReVersion: Diffusion-Based Relation Inversion from Images
    Ziqi Huang, Tianxing Wu, Yuming Jiang, Kelvin C.K. Chan, Ziwei Liu
    arXiv 2023. Paper   Project   Github  
    2023-03-23
    2023-03-23
  227. Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions
    Ayaan Haque, Matthew Tancik, Alexei A. Efros, Aleksander Holynski, Angjoo Kanazawa
    arXiv 2023. Paper   Project  
    2023-03-22
    2023-03-22
  228. Pix2Video: Video Editing using Image Diffusion
    Duygu Ceylan, Chun-Hao Paul Huang, Niloy J. Mitra
    arXiv 2023. Paper   Project  
    2023-03-22
    2023-03-22
  229. 3D-CLFusion: Fast Text-to-3D Rendering with Contrastive Latent Diffusion
    Yu-Jhe Li, Kris Kitani
    arXiv 2023. Paper  
    2023-03-21
    2023-03-21
  230. CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion
    Geonmo Gu, Sanghyuk Chun, Wonjae Kim, HeeJae Jun, Yoohoon Kang, Sangdoo Yun
    arXiv 2023. Paper  
    2023-03-21
    2023-03-21
  231. Vox-E: Text-guided Voxel Editing of 3D Objects
    Etai Sella, Gal Fiebelman, Peter Hedman, Hadar Averbuch-Elor
    arXiv 2023. Paper   Project  
    2023-03-21
    2023-03-21
  232. SALAD: Part-Level Latent Diffusion for 3D Shape Generation and Manipulation
    Juil Koo, Seungwoo Yoo, Minh Hieu Nguyen, Minhyuk Sung
    arXiv 2023. Paper   Project  
    2023-03-21
    2023-03-21
  233. Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models
    René Haas, Inbar Huberman-Spiegelglas, Rotem Mulayoff, Tomer Michaeli
    arXiv 2023. Paper  
    2023-03-20
    2023-03-20
  234. SVDiff: Compact Parameter Space for Diffusion Fine-Tuning
    Ligong Han, Yinxiao Li, Han Zhang, Peyman Milanfar, Dimitris Metaxas, Feng Yang
    arXiv 2023. Paper  
    2023-03-20
    2023-03-20
  235. Localizing Object-level Shape Variations with Text-to-Image Diffusion Models
    Or Patashnik, Daniel Garibi, Idan Azuri, Hadar Averbuch-Elor, Daniel Cohen-Or
    arXiv 2023. Paper   Project  
    2023-03-20
    2023-03-20
  236. Text2Tex: Text-driven Texture Synthesis via Diffusion Models
    Dave Zhenyu Chen, Yawar Siddiqui, Hsin-Ying Lee, Sergey Tulyakov, Matthias Nießner
    arXiv 2023. Paper   Project  
    2023-03-20
    2023-03-20
  237. SKED: Sketch-guided Text-based 3D Editing
    Aryan Mikaeili, Or Perel, Daniel Cohen-Or, Ali Mahdavi-Amiri
    arXiv 2023. Paper  
    2023-03-19
    2023-03-19
  238. FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model
    Jiwen Yu, Yinhuai Wang, Chen Zhao, Bernard Ghanem, Jian Zhang
    arXiv 2023. Paper   Github  
    2023-03-17
    2023-03-17
  239. DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
    Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Xiangyang Ji, Chang Liu, Li Yuan, Jie Chen
    arXiv 2023. Paper  
    2023-03-17
    2023-03-17
  240. GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
    Can Qin, Ning Yu, Chen Xing, Shu Zhang, Zeyuan Chen, Stefano Ermon, Yun Fu, Caiming Xiong, Ran Xu
    arXiv 2023. Paper  
    2023-03-17
    2023-03-17
  241. DialogPaint: A Dialog-based Image Editing Model
    Jingxuan Wei, Shiyu Wu, Xin Jiang, Yequan Wang
    arXiv 2023. Paper  
    2023-03-17
    2023-03-17
  242. P+: Extended Textual Conditioning in Text-to-Image Generation
    Andrey Voynov, Qinghao Chu, Daniel Cohen-Or, Kfir Aberman
    arXiv 2023. Paper   Project  
    2023-03-16
    2023-03-16
  243. HIVE: Harnessing Human Feedback for Instructional Visual Editing
    Shu Zhang, Xinyi Yang, Yihao Feng, Can Qin, Chia-Chih Chen, Ning Yu, Zeyuan Chen, Huan Wang, Silvio Savarese, Stefano Ermon, Caiming Xiong, Ran Xu
    arXiv 2023. Paper  
    2023-03-16
    2023-03-16
  244. FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
    Chenyang Qi, Xiaodong Cun, Yong Zhang, Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen
    arXiv 2023. Paper   Project   Github  
    2023-03-16
    2023-03-16
  245. Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation
    Yiyang Ma, Huan Yang, Wenjing Wang, Jianlong Fu, Jiaying Liu
    arXiv 2023. Paper  
    2023-03-16
    2023-03-16
  246. Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer
    Serin Yang, Hyunmin Hwang, Jong Chul Ye
    arXiv 2023. Paper  
    2023-03-15
    2023-03-15
  247. Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models
    Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
    arXiv 2023. Paper   Github  
    2023-03-15
    2023-03-15
  248. Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion
    Inhwa Han, Serin Yang, Taesung Kwon, Jong Chul Ye
    arXiv 2023. Paper  
    2023-03-15
    2023-03-15
  249. Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
    Junyoung Seo, Wooseok Jang, Min-Seop Kwak, Jaehoon Ko, Hyeonsu Kim, Junho Kim, Jin-Hwa Kim, Jiyoung Lee, Seungryong Kim
    arXiv 2023. Paper  
    2023-03-14
    2023-03-14
  250. Editing Implicit Assumptions in Text-to-Image Diffusion Models
    Hadas Orgad, Bahjat Kawar, Yonatan Belinkov
    arXiv 2023. Paper   Project   Github  
    2023-03-14
    2023-03-14
  251. Edit-A-Video: Single Video Editing with Object-Aware Consistency
    Chaehun Shin, Heeseung Kim, Che Hyun Lee, Sang-gil Lee, Sungroh Yoon
    arXiv 2023. Paper   Project  
    2023-03-14
    2023-03-14
  252. Erasing Concepts from Diffusion Models
    Rohit Gandikota, Joanna Materzynska, Jaden Fiotto-Kaufman, David Bau
    arXiv 2023. Paper   Project   Github  
    2023-03-13
    2023-03-13
  253. One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
    Fan Bao, Shen Nie, Kaiwen Xue, Chongxuan Li, Shi Pu, Yaole Wang, Gang Yue, Yue Cao, Hang Su, Jun Zhu
    arXiv 2023. Paper   Github  
    2023-03-12
    2023-03-12
  254. Cones: Concept Neurons in Diffusion Models for Customized Generation
    Zhiheng Liu, Ruili Feng, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
    arXiv 2023. Paper  
    2023-03-09
    2023-03-09
  255. A Prompt Log Analysis of Text-to-Image Generation Systems
    Yutong Xie, Zhaoying Pan, Jinge Ma, Jie Luo, Qiaozhu Mei
    arXiv 2023. Paper  
    2023-03-08
    2023-03-08
  256. Video-P2P: Video Editing with Cross-attention Control
    Shaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia
    arXiv 2023. Paper   Project  
    2023-03-08
    2023-03-08
  257. Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
    Chenfei Wu, Shengming Yin, Weizhen Qi, Xiaodong Wang, Zecheng Tang, Nan Duan
    arXiv 2023. Paper   Github  
    2023-03-08
    2023-03-08
  258. Zeroth-Order Optimization Meets Human Feedback: Provable Learning via Ranking Oracles
    Zhiwei Tang, Dmitry Rybin, Tsung-Hui Chang
    arXiv 2023. Paper   Github  
    2023-03-07
    2023-03-07
  259. Unleashing Text-to-Image Diffusion Models for Visual Perception
    Wenliang Zhao, Yongming Rao, Zuyan Liu, Benlin Liu, Jie Zhou, Jiwen Lu
    arXiv 2023. Paper   Github  
    2023-03-03
    2023-03-03
  260. Collage Diffusion
    Vishnu Sarukkai, Linden Li, Arden Ma, Christopher Ré, Kayvon Fatahalian
    arXiv 2023. Paper  
    2023-03-01
    2023-03-01
  261. Towards Enhanced Controllability of Diffusion Models
    Wonwoong Cho, Hareesh Ravi, Midhun Harikumar, Vinh Khuc, Krishna Kumar Singh, Jingwan Lu, David I. Inouye, Ajinkya Kale
    arXiv 2023. Paper  
    2023-02-28
    2023-02-28
  262. Directed Diffusion: Direct Control of Object Placement through Attention Guidance
    Wan-Duo Kurt Ma, J.P. Lewis, W. Bastiaan Kleijn, Thomas Leung
    arXiv 2023. Paper  
    2023-02-25
    2023-02-25
  263. Modulating Pretrained Diffusion Models for Multimodal Image Synthesis
    Cusuh Ham, James Hays, Jingwan Lu, Krishna Kumar Singh, Zhifei Zhang, Tobias Hinz
    arXiv 2023. Paper  
    2023-02-24
    2023-02-24
  264. Controlled and Conditional Text to Image Generation with Diffusion Prior
    Pranav Aggarwal, Hareesh Ravi, Naveen Marri, Sachin Kelkar, Fengbin Chen, Vinh Khuc, Midhun Harikumar, Ritiz Tambi, Sudharshan Reddy Kakumanu, Purvak Lapsiya, Alvin Ghouas, Sarah Saber, Malavika Ramprasad, Baldo Faieta, Ajinkya Kale
    arXiv 2023. Paper  
    2023-02-23
    2023-02-23
  265. Region-Aware Diffusion for Zero-shot Text-driven Image Editing
    Nisha Huang, Fan Tang, Weiming Dong, Tong-Yee Lee, Changsheng Xu
    arXiv 2023. Paper   Github  
    2023-02-23
    2023-02-23
  266. Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC
    Yilun Du, Conor Durkan, Robin Strudel, Joshua B. Tenenbaum, Sander Dieleman, Rob Fergus, Jascha Sohl-Dickstein, Arnaud Doucet, Will Grathwohl
    arXiv 2023. Paper   Project  
    2023-02-22
    2023-02-22
  267. Learning 3D Photography Videos via Self-supervised Diffusion on Single Images
    Xiaodong Wang, Chenfei Wu, Shengming Yin, Minheng Ni, Jianfeng Wang, Linjie Li, Zhengyuan Yang, Fan Yang, Lijuan Wang, Zicheng Liu, Yuejian Fang, Nan Duan
    arXiv 2023. Paper  
    2023-02-21
    2023-02-21
  268. Boundary Guided Mixing Trajectory for Semantic Control with Diffusion Models
    Ye Zhu, Yu Wu, Zhiwei Deng, Olga Russakovsky, Yan Yan
    arXiv 2023. Paper  
    2023-02-16
    2023-02-16
  269. MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation
    Omer Bar-Tal, Lior Yariv, Yaron Lipman, Tali Dekel
    arXiv 2023. Paper   Project   Github  
    2023-02-16
    2023-02-16
  270. T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
    Chong Mou, Xintao Wang, Liangbin Xie, Jian Zhang, Zhongang Qi, Ying Shan, Xiaohu Qie
    arXiv 2023. Paper   Github  
    2023-02-16
    2023-02-16
  271. Text-driven Visual Synthesis with Latent Diffusion Prior
    Ting-Hsuan Liao, Songwei Ge, Yiran Xu, Yao-Chih Lee, Badour AlBahar, Jia-Bin Huang
    arXiv 2023. Paper   Project  
    2023-02-16
    2023-02-16
  272. Exploring the Representation Manifolds of Stable Diffusion Through the Lens of Intrinsic Dimension
    Henry Kvinge, Davis Brown, Charles Godfrey
    arXiv 2023. Paper  
    2023-02-16
    2023-02-16
  273. PRedItOR: Text Guided Image Editing with Diffusion Prior
    Hareesh Ravi, Sachin Kelkar, Midhun Harikumar, Ajinkya Kale
    arXiv 2023. Paper  
    2023-02-15
    2023-02-15
  274. Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation
    Joshua Vendrow, Saachi Jain, Logan Engstrom, Aleksander Madry
    arXiv 2023. Paper   Github  
    2023-02-15
    2023-02-15
  275. Universal Guidance for Diffusion Models
    Arpit Bansal, Hong-Min Chu, Avi Schwarzschild, Soumyadip Sengupta, Micah Goldblum, Jonas Geiping, Tom Goldstein
    arXiv 2023. Paper   Github  
    2023-02-14
    2023-02-14
  276. Text-Guided Scene Sketch-to-Photo Synthesis
    AprilPyone MaungMaung, Makoto Shing, Kentaro Mitsui, Kei Sawada, Fumio Okura
    arXiv 2023. Paper  
    2023-02-14
    2023-02-14
  277. Analyzing Multimodal Objectives Through the Lens of Generative Diffusion Guidance
    Chaerin Kong, Nojun Kwak
    arXiv 2023. Paper  
    2023-02-10
    2023-02-10
  278. Adding Conditional Control to Text-to-Image Diffusion Models
    Lvmin Zhang, Maneesh Agrawala
    arXiv 2023. Paper   Github  
    2023-02-10
    2023-02-10
  279. Is This Loss Informative? Speeding Up Textual Inversion with Deterministic Objective Evaluation
    Anton Voronov, Mikhail Khoroshikh, Artem Babenko, Max Ryabinin
    arXiv 2023. Paper  
    2023-02-09
    2023-02-09
  280. Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion Models
    Hyeonho Jeong, Gihyun Kwon, Jong Chul Ye
    arXiv 2023. Paper  
    2023-02-08
    2023-02-08
  281. GLAZE: Protecting Artists from Style Mimicry by Text-to-Image Models
    Shawn Shan, Jenna Cryan, Emily Wenger, Haitao Zheng, Rana Hanocka, Ben Y. Zhao
    arXiv 2023. Paper  
    2023-02-08
    2023-02-08
  282. Q-Diffusion: Quantizing Diffusion Models
    Xiuyu Li, Long Lian, Yijiang Liu, Huanrui Yang, Zhen Dong, Daniel Kang, Shanghang Zhang, Kurt Keutzer
    arXiv 2023. Paper   Github  
    2023-02-08
    2023-02-08
  283. Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery
    Yuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Geiping, Tom Goldstein
    arXiv 2023. Paper   Github  
    2023-02-07
    2023-02-07
  284. Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness
    Felix Friedrich, Patrick Schramowski, Manuel Brack, Lukas Struppek, Dominik Hintersdorf, Sasha Luccioni, Kristian Kersting
    arXiv 2023. Paper  
    2023-02-07
    2023-02-07
  285. Structure and Content-Guided Video Synthesis with Diffusion Models
    Patrick Esser, Johnathan Chiu, Parmida Atighehchian, Jonathan Granskog, Anastasis Germanidis
    arXiv 2023. Paper   Project  
    2023-02-06
    2023-02-06
  286. Zero-shot Image-to-Image Translation
    Gaurav Parmar, Krishna Kumar Singh, Richard Zhang, Yijun Li, Jingwan Lu, Jun-Yan Zhu
    arXiv 2023. Paper  
    2023-02-06
    2023-02-06
  287. Eliminating Prior Bias for Semantic Image Editing via Dual-Cycle Diffusion
    Zuopeng Yang, Tianshu Chu, Xin Lin, Erdun Gao, Daqing Liu, Jie Yang, Chaoyue Wang
    arXiv 2023. Paper  
    2023-02-05
    2023-02-05
  288. ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval
    Kexun Zhang, Xianjun Yang, William Yang Wang, Lei Li
    arXiv 2023. Paper  
    2023-02-05
    2023-02-05
  289. Mixture of Diffusers for scene composition and high resolution image generation
    Álvaro Barbero Jiménez
    arXiv 2023. Paper   Github  
    2023-02-05
    2023-02-05
  290. Semantic-Guided Image Augmentation with Pre-trained Models
    Bohan Li, Xinghao Wang, Xiao Xu, Yutai Hou, Yunlong Feng, Feng Wang, Wanxiang Che
    SIGGRAPH 2023. Paper   Project  
    2023-02-04
    2023-02-04
  291. TEXTure: Text-Guided Texturing of 3D Shapes
    Elad Richardson, Gal Metzer, Yuval Alaluf, Raja Giryes, Daniel Cohen-Or
    arXiv 2023. Paper   Project   Github  
    2023-02-03
    2023-02-03
  292. Dreamix: Video Diffusion Models are General Video Editors
    Eyal Molad, Eliahu Horwitz, Dani Valevski, Alex Rav Acha, Yossi Matias, Yael Pritch, Yaniv Leviathan, Yedid Hoshen
    arXiv 2023. Paper   Project  
    2023-02-02
    2023-02-02
  293. Trash to Treasure: Using text-to-image models to inform the design of physical artefacts
    Amy Smith, Hope Schroeder, Ziv Epstein, Michael Cook, Simon Colton, Andrew Lippman
    AAAI 2023. Paper  
    2023-02-01
    2023-02-01
  294. Zero3D: Semantic-Driven Multi-Category 3D Shape Generation
    Bo Han, Yitong Liu, Yixuan Shen
    arXiv 2023. Paper  
    2023-01-31
    2023-01-31
  295. Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models
    Hila Chefer, Yuval Alaluf, Yael Vinker, Lior Wolf, Daniel Cohen-Or
    SIGGRAPH 2023. Paper   Project   Github  
    2023-01-31
    2023-01-31
  296. GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
    Ming Tao, Bing-Kun Bao, Hao Tang, Changsheng Xu
    CVPR 2023. Paper   Github  
    2023-01-30
    2023-01-30
  297. PromptMix: Text-to-image diffusion models enhance the performance of lightweight networks
    Arian Bakhtiarnia, Qi Zhang, Alexandros Iosifidis
    arXiv 2023. Paper   Github  
    2023-01-30
    2023-01-30
  298. Shape-aware Text-driven Layered Video Editing
    Yao-Chih Lee, Ji-Ze Genevieve Jang, Yi-Ting Chen, Elizabeth Qiu, Jia-Bin Huang
    arXiv 2023. Paper   Project  
    2023-01-30
    2023-01-30
  299. Towards Equitable Representation in Text-to-Image Synthesis Models with the Cross-Cultural Understanding Benchmark (CCUB) Dataset
    Zhixuan Liu, Youeun Shin, Beverley-Claire Okogwu, Youngsik Yun, Lia Coleman, Peter Schaldenbrand, Jihie Kim, Jean Oh
    arXiv 2023. Paper  
    2023-01-28
    2023-01-28
  300. SEGA: Instructing Diffusion using Semantic Dimensions
    Manuel Brack, Felix Friedrich, Dominik Hintersdorf, Lukas Struppek, Patrick Schramowski, Kristian Kersting
    arXiv 2023. Paper  
    2023-01-28
    2023-01-28
  301. Text-To-4D Dynamic Scene Generation
    Uriel Singer, Shelly Sheynin, Adam Polyak, Oron Ashual, Iurii Makarov, Filippos Kokkinos, Naman Goyal, Andrea Vedaldi, Devi Parikh, Justin Johnson, Yaniv Taigman
    arXiv 2023. Paper  
    2023-01-26
    2023-01-26
  302. Guiding Text-to-Image Diffusion Model Towards Grounded Generation
    Ziyi Li, Qinye Zhou, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
    arXiv 2023. Paper   Project  
    2023-01-12
    2023-01-12
  303. Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
    Dan Bigioi, Shubhajit Basak, Hugh Jordan, Rachel McDonnell, Peter Corcoran
    arXiv 2023. Paper  
    2023-01-10
    2023-01-10
  304. DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis
    Shuai Shen, Wenliang Zhao, Zibin Meng, Wanhua Li, Zheng Zhu, Jie Zhou, Jiwen Lu
    arXiv 2023. Paper  
    2023-01-10
    2023-01-10
  305. Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
    Dan Bigioi, Shubhajit Basak, Hugh Jordan, Rachel McDonnell, Peter Corcoran
    arXiv 2023. Paper   Project   Github  
    2023-01-10
    2023-01-10
  306. Visual Story Generation Based on Emotion and Keywords
    Yuetian Chen, Ruohua Li, Bowen Shi, Peiru Liu, Mei Si
    AIIDE INT 2022. Paper  
    2023-01-07
    2023-01-07
  307. Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation
    Michał Stypułkowski, Konstantinos Vougioukas, Sen He, Maciej Zięba, Stavros Petridis, Maja Pantic
    arXiv 2023. Paper   Project  
    2023-01-06
    2023-01-06
  308. Muse: Text-To-Image Generation via Masked Generative Transformers
    Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan
    arXiv 2023. Paper   Project  
    2023-01-02
    2023-01-02
  309. Exploring Vision Transformers as Diffusion Learners
    He Cao, Jianan Wang, Tianhe Ren, Xianbiao Qi, Yihao Chen, Yuan Yao, Lei Zhang
    arXiv 2022. Paper  
    2022-12-28
    2022-12-28
  310. Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models
    Jiale Xu, Xintao Wang, Weihao Cheng, Yan-Pei Cao, Ying Shan, Xiaohu Qie, Shenghua Gao
    CVPR 2023. Paper   Project  
    2022-12-28
    2022-12-28
  311. Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
    Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou
    arXiv 2022. Paper   Project  
    2022-12-22
    2022-12-22
  312. Contrastive Language-Vision AI Models Pretrained on Web-Scraped Multimodal Data Exhibit Sexual Objectification Bias
    Robert Wolfe, Yiwei Yang, Bill Howe, Aylin Caliskan
    arXiv 2022. Paper  
    2022-12-21
    2022-12-21
  313. Optimizing Prompts for Text-to-Image Generation
    Yaru Hao, Zewen Chi, Li Dong, Furu Wei
    arXiv 2022. Paper   Project   Github  
    2022-12-19
    2022-12-19
  314. Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
    Qiucheng Wu, Yujian Liu, Handong Zhao, Ajinkya Kale, Trung Bui, Tong Yu, Zhe Lin, Yang Zhang, Shiyu Chang
    arXiv 2022. Paper   Github  
    2022-12-16
    2022-12-16
  315. TeTIm-Eval: a novel curated evaluation data set for comparing text-to-image models
    Federico A. Galatolo, Mario G. C. A. Cimino, Edoardo Cogotti
    arXiv 2022. Paper  
    2022-12-15
    2022-12-15
  316. The Infinite Index: Information Retrieval on Generative Text-To-Image Models
    Niklas Deckers, Maik Fröbe, Johannes Kiesel, Gianluca Pandolfo, Christopher Schröder, Benno Stein, Martin Potthast
    CHIIR 2023. Paper  
    2022-12-14
    2022-12-14
  317. Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
    Su Wang, Chitwan Saharia, Ceslee Montgomery, Jordi Pont-Tuset, Shai Noy, Stefano Pellegrini, Yasumasa Onoe, Sarah Laszlo, David J. Fleet, Radu Soricut, Jason Baldridge, Mohammad Norouzi, Peter Anderson, William Chan
    CVPR 2023. Paper  
    2022-12-13
    2022-12-13
  318. LidarCLIP or: How I Learned to Talk to Point Clouds
    Georg Hess, Adam Tonderski, Christoffer Petersson, Lennart Svensson, Kalle Åström
    arXiv 2022. Paper   Github  
    2022-12-13
    2022-12-13
  319. The Stable Artist: Steering Semantics in Diffusion Latent Space
    Manuel Brack, Patrick Schramowski, Felix Friedrich, Dominik Hintersdorf, Kristian Kersting
    arXiv 2022. Paper  
    2022-12-12
    2022-12-12
  320. Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
    Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang
    ICLR 2023. Paper   Github  
    2022-12-09
    2022-12-09
  321. SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
    Shaoan Xie, Zhifei Zhang, Zhe Lin, Tobias Hinz, Kun Zhang
    arXiv 2022. Paper  
    2022-12-09
    2022-12-09
  322. Executing your Commands via Motion Diffusion in Latent Space
    Xin Chen, Biao Jiang, Wen Liu, Zilong Huang, Bin Fu, Tao Chen, Jingyi Yu, Gang Yu
    arXiv 2022. Paper   Project  
    2022-12-08
    2022-12-08
  323. Diffusion Guided Domain Adaptation of Image Generators
    Kunpeng Song, Ligong Han, Bingchen Liu, Dimitris Metaxas, Ahmed Elgammal
    arXiv 2022. Paper   Project  
    2022-12-08
    2022-12-08
  324. Multi-Concept Customization of Text-to-Image Diffusion
    Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, Jun-Yan Zhu
    arXiv 2022. Paper   Project  
    2022-12-08
    2022-12-08
  325. SINE: SINgle Image Editing with Text-to-Image Diffusion Models
    Zhixing Zhang, Ligong Han, Arnab Ghosh, Dimitris Metaxas, Jian Ren
    arXiv 2022. Paper   Project   Github  
    2022-12-08
    2022-12-08
  326. SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation
    Yen-Chi Cheng, Hsin-Ying Lee, Sergey Tulyakov, Alexander Schwing, Liangyan Gui
    arXiv 2022. Paper   Project  
    2022-12-08
    2022-12-08
  327. MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis
    Rishabh Dabral, Muhammad Hamza Mughal, Vladislav Golyanik, Christian Theobalt
    arXiv 2022. Paper   Project  
    2022-12-08
    2022-12-08
  328. Judge, Localize, and Edit: Ensuring Visual Commonsense Morality for Text-to-Image Generation
    Seongbeom Park, Suhong Moon, Jinkyu Kim
    arXiv 2022. Paper  
    2022-12-07
    2022-12-07
  329. Magic: Multi Art Genre Intelligent Choreography Dataset and Network for 3D Dance Generation
    Ronghui Li, Junfan Zhao, Yachao Zhang, Mingyang Su, Zeping Ren, Han Zhang, Xiu Li
    arXiv 2022. Paper  
    2022-12-07
    2022-12-07
  330. Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors
    Zhentao Yu, Zixin Yin, Deyu Zhou, Duomin Wang, Finn Wong, Baoyuan Wang
    arXiv 2022. Paper   Project  
    2022-12-07
    2022-12-07
  331. Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
    Gyeongman Kim, Hajin Shim, Hyunsu Kim, Yunjey Choi, Junho Kim, Eunho Yang
    CVPR 2023. Paper   Project   Github  
    2022-12-06
    2022-12-06
  332. M-VADER: A Model for Diffusion with Multimodal Context
    Samuel Weinbach, Marco Bellagente, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Björn Deiseroth, Koen Oostermeijer, Hannah Teufel, Andres Felipe Cruz-Salinas
    arXiv 2022. Paper  
    2022-12-06
    2022-12-06
  333. ADIR: Adaptive Diffusion for Image Reconstruction
    Shady Abu-Hussein, Tom Tirer, Raja Giryes
    arXiv 2022. Paper   Project  
    2022-12-06
    2022-12-06
  334. Diffusion-SDF: Text-to-Shape via Voxelized Diffusion
    Muheng Li, Yueqi Duan, Jie Zhou, Jiwen Lu
    CVPR 2023. Paper   Project   Github  
    2022-12-06
    2022-12-06
  335. Semantic-Conditional Diffusion Networks for Image Captioning
    Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Jianlin Feng, Hongyang Chao, Tao Mei
    CVPR 2023. Paper   Github  
    2022-12-06
    2022-12-06
  336. NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors
    Congyue Deng, Chiyu "Max" Jiang, Charles R. Qi, Xinchen Yan, Yin Zhou, Leonidas Guibas, Dragomir Anguelov
    arXiv 2022. Paper  
    2022-12-06
    2022-12-06
  337. Shape-Guided Diffusion with Inside-Outside Attention
    Dong Huk Park, Grace Luo, Clayton Toste, Samaneh Azadi, Xihui Liu, Maka Karalashvili, Anna Rohrbach, Trevor Darrell
    arXiv 2022. Paper   Project  
    2022-12-01
    2022-12-01
  338. Unite and Conquer: Cross Dataset Multimodal Synthesis using Diffusion Models
    Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, Vishal M. Patel
    arXiv 2022. Paper   Project  
    2022-12-01
    2022-12-01
  339. DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model
    Gwanghyun Kim, Se Young Chun
    CVPR 2023. Paper   Github  
    2022-11-29
    2022-11-29
  340. SinDDM: A Single Image Denoising Diffusion Model
    Vladimir Kulikov, Shahar Yadin, Matan Kleiner, Tomer Michaeli
    arXiv 2022. Paper   Project  
    2022-11-29
    2022-11-29
  341. Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning
    Xian Zhong, Zipeng Li, Shuqin Chen, Kui Jiang, Chen Chen, Mang Ye
    arXiv 2022. Paper   Github  
    2022-11-28
    2022-11-28
  342. Unified Discrete Diffusion for Simultaneous Vision-Language Generation
    Minghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, Dacheng Tao, Ponnuthurai N. Suganthan
    arXiv 2022. Paper  
    2022-11-27
    2022-11-27
  343. SpaText: Spatio-Textual Representation for Controllable Image Generation
    Omri Avrahami, Thomas Hayes, Oran Gafni, Sonal Gupta, Yaniv Taigman, Devi Parikh, Dani Lischinski, Ohad Fried, Xi Yin
    CVPR 2023. Paper   Project  
    2022-11-25
    2022-11-25
  344. 3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models
    Gang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng, Dacheng Tao
    arXiv 2022. Paper  
    2022-11-25
    2022-11-25
  345. Shifted Diffusion for Text-to-image Generation
    Yufan Zhou, Bingchen Liu, Yizhe Zhu, Xiao Yang, Changyou Chen, Jinhui Xu
    CVPR 2023. Paper  
    2022-11-24
    2022-11-24
  346. Sketch-Guided Text-to-Image Diffusion Models
    Andrey Voynov, Kfir Aberman, Daniel Cohen-Or
    arXiv 2022. Paper   Project  
    2022-11-24
    2022-11-24
  347. Schrödinger's Bat: Diffusion Models Sometimes Generate Polysemous Words in Superposition
    Jennifer C. White, Ryan Cotterell
    arXiv 2022. Paper  
    2022-11-23
    2022-11-23
  348. Make-A-Story: Visual Memory Conditioned Consistent Story Generation
    Tanzila Rahman, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Shweta Mahajan, Leonid Sigal
    CVPR 2023. Paper  
    2022-11-23
    2022-11-23
  349. SinDiffusion: Learning a Diffusion Model from a Single Natural Image
    Weilun Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li
    arXiv 2022. Paper   Github  
    2022-11-22
    2022-11-22
  350. Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark
    Vitali Petsiuk, Alexander E. Siemenn, Saisamrit Surbehera, Zad Chin, Keith Tyser, Gregory Hunter, Arvind Raghavan, Yann Hicke, Bryan A. Plummer, Ori Kerret, Tonio Buonassisi, Kate Saenko, Armando Solar-Lezama, Iddo Drori
    NeurIPS Workshop 2022. Paper  
    2022-11-22
    2022-11-22
  351. Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
    Narek Tumanyan, Michal Geyer, Shai Bagon, Tali Dekel
    CVPR 2023. Paper   Github  
    2022-11-22
    2022-11-22
  352. EDICT: Exact Diffusion Inversion via Coupled Transformations
    Bram Wallace, Akash Gokul, Nikhil Naik
    arXiv 2022. Paper   Github  
    2022-11-22
    2022-11-22
  353. VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models
    Ajay Jain, Amber Xie, Pieter Abbeel
    arXiv 2022. Paper   Project  
    2022-11-21
    2022-11-21
  354. Investigating Prompt Engineering in Diffusion Models
    Sam Witteveen, Martin Andrews
    NeurIPS Workshop 2022. Paper  
    2022-11-21
    2022-11-21
  355. Exploring Discrete Diffusion Models for Image Captioning
    Zixin Zhu, Yixuan Wei, Jianfeng Wang, Zhe Gan, Zheng Zhang, Le Wang, Gang Hua, Lijuan Wang, Zicheng Liu, Han Hu
    arXiv 2022. Paper   Github  
    2022-11-21
    2022-11-21
  356. SinFusion: Training Diffusion Models on a Single Image or Video
    Yaniv Nikankin, Niv Haim, Michal Irani
    arXiv 2022. Paper   Github  
    2022-11-21
    2022-11-21
  357. Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
    Xichen Pan, Pengda Qin, Yuhong Li, Hui Xue, Wenhu Chen
    arXiv 2022. Paper   Github  
    2022-11-20
    2022-11-20
  358. DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization
    Nisha Huang, Yuxin Zhang, Fan Tang, Chongyang Ma, Haibin Huang, Yong Zhang, Weiming Dong, Changsheng Xu
    arXiv 2022. Paper  
    2022-11-19
    2022-11-19
  359. Invariant Learning via Diffusion Dreamed Distribution Shifts
    Priyatham Kattakinda, Alexander Levine, Soheil Feizi
    arXiv 2022. Paper  
    2022-11-18
    2022-11-18
  360. Magic3D: High-Resolution Text-to-3D Content Creation
    Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin
    CVPR 2023. Paper   Project  
    2022-11-18
    2022-11-18
  361. InstructPix2Pix: Learning to Follow Image Editing Instructions
    Tim Brooks, Aleksander Holynski, Alexei A. Efros
    CVPR 2023. Paper   Project   Github  
    2022-11-17
    2022-11-17
  362. Null-text Inversion for Editing Real Images using Guided Diffusion Models
    Ron Mokady, Amir Hertz, Kfir Aberman, Yael Pritch, Daniel Cohen-Or
    arXiv 2022. Paper  
    2022-11-17
    2022-11-17
  363. Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models
    Adham Elarabawy, Harish Kamath, Samuel Denton
    arXiv 2022. Paper  
    2022-11-15
    2022-11-15
  364. Versatile Diffusion: Text, Images and Variations All in One Diffusion Model
    Xingqian Xu, Zhangyang Wang, Eric Zhang, Kai Wang, Humphrey Shi
    arXiv 2022. Paper   Github  
    2022-11-15
    2022-11-15
  365. Arbitrary Style Guidance for Enhanced Diffusion-Based Text-to-Image Generation
    Zhihong Pan, Xin Zhou, Hao Tian
    WACV 2023. Paper  
    2022-11-14
    2022-11-14
  366. Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
    Patrick Schramowski, Manuel Brack, Björn Deiseroth, Kristian Kersting
    CVPR 2023. Paper   Github  
    2022-11-09
    2022-11-09
  367. Rickrolling the Artist: Injecting Invisible Backdoors into Text-Guided Image Generation Models
    Lukas Struppek, Dominik Hintersdorf, Kristian Kersting
    arXiv 2022. Paper   Github  
    2022-11-04
    2022-11-04
  368. eDiffi: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
    Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu
    arXiv 2022. Paper   Github  
    2022-11-02
    2022-11-02
  369. UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance
    Wei Li, Xue Xu, Xinyan Xiao, Jiachen Liu, Hu Yang, Guohao Li, Zhanpeng Wang, Zhifan Feng, Qiaoqiao She, Yajuan Lyu, Hua Wu
    arXiv 2022. Paper  
    2022-10-28
    2022-10-28
  370. MagicMix: Semantic Mixing with Diffusion Models
    Jun Hao Liew, Hanshu Yan, Daquan Zhou, Jiashi Feng
    arXiv 2022. Paper   Project  
    2022-10-28
    2022-10-28
  371. ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts
    Zhida Feng, Zhenyu Zhang, Xintong Yu, Yewei Fang, Lanxin Li, Xuyi Chen, Yuxiang Lu, Jiaxiang Liu, Weichong Yin, Shikun Feng, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
    CVPR 2023. Paper  
    2022-10-27
    2022-10-27
  372. How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?
    Hritik Bansal, Da Yin, Masoud Monajatipoor, Kai-Wei Chang
    EMNLP 2022. Paper   Github  
    2022-10-27
    2022-10-27
  373. DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models
    Zijie J. Wang, Evan Montoya, David Munechika, Haoyang Yang, Benjamin Hoover, Duen Horng Chau
    arXiv 2022. Paper   Project   Github  
    2022-10-26
    2022-10-26
  374. Lafite2: Few-shot Text-to-Image Generation
    Yufan Zhou, Chunyuan Li, Changyou Chen, Jianfeng Gao, Jinhui Xu
    arXiv 2022. Paper  
    2022-10-25
    2022-10-25
  375. High-Resolution Image Editing via Multi-Stage Blended Diffusion
    Johannes Ackermann, Minjun Li
    NeurIPS Workshop 2022. Paper   Github  
    2022-10-24
    2022-10-24
  376. A Visual Tour Of Current Challenges In Multimodal Language Models
    Shashank Sonkar, Naiming Liu, Richard G. Baraniuk
    arXiv 2022. Paper  
    2022-10-22
    2022-10-22
  377. Conditional Diffusion with Less Explicit Guidance via Model Predictive Control
    Max W. Shen, Ehsan Hajiramezanali, Gabriele Scalia, Alex Tseng, Nathaniel Diamant, Tommaso Biancalani, Andreas Loukas
    arXiv 2022. Paper  
    2022-10-21
    2022-10-21
  378. Diffusion Models already have a Semantic Latent Space
    Mingi Kwon, Jaeseok Jeong, Youngjung Uh
    ICLR 2023. Paper   Project  
    2022-10-20
    2022-10-20
  379. DiffEdit: Diffusion-based semantic image editing with mask guidance
    Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord
    ICLR 2023. Paper  
    2022-10-20
    2022-10-20
  380. Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation
    Ruijun Li, Weihua Li, Yi Yang, Hanyu Wei, Jianhua Jiang, Quan Bai
    arXiv 2022. Paper  
    2022-10-18
    2022-10-18
  381. UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image
    Dani Valevski, Matan Kalman, Yossi Matias, Yaniv Leviathan
    arXiv 2022. Paper  
    2022-10-18
    2022-10-18
  382. Imagic: Text-Based Real Image Editing with Diffusion Models
    Bahjat Kawar, Shiran Zada, Oran Lang, Omer Tov, Huiwen Chang, Tali Dekel, Inbar Mosseri, Michal Irani
    CVPR 2023. Paper   Project  
    2022-10-17
    2022-10-17
  383. Leveraging Off-the-shelf Diffusion Model for Multi-attribute Fashion Image Manipulation
    Chaerin Kong, DongHyeon Jeon, Ohjoon Kwon, Nojun Kwak
    WACV 2022. Paper  
    2022-10-12
    2022-10-12
  384. Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance
    Chen Henry Wu, Fernando De la Torre
    arXiv 2022. Paper   Github-1   Github-2  
    2022-10-11
    2022-10-11
  385. clip2latent: Text driven sampling of a pre-trained StyleGAN using denoising diffusion and CLIP
    Justin N. M. Pinkney, Chuan Li
    BMVC 2022. Paper   Github  
    2022-10-05
    2022-10-05
  386. LDEdit: Towards Generalized Text Guided Image Manipulation via Latent Diffusion Models
    Paramanand Chandramouli, Kanchana Vaishnavi Gandikota
    BMVC 2022. Paper  
    2022-10-05
    2022-10-05
  387. DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
    Ivan Kapelyukh, Vitalis Vosylius, Edward Johns
    IEEE RA-L 2022. Paper  
    2022-10-05
    2022-10-05
  388. Imagen Video: High Definition Video Generation with Diffusion Models
    Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, Tim Salimans
    arXiv 2022. Paper  
    2022-10-05
    2022-10-05
  389. Membership Inference Attacks Against Text-to-image Generation Models
    Yixin Wu, Ning Yu, Zheng Li, Michael Backes, Yang Zhang
    arXiv 2022. Paper  
    2022-10-03
    2022-10-03
  390. Creative Painting with Latent Diffusion Models
    Xianchao Wu
    arXiv 2022. Paper  
    2022-09-29
    2022-09-29
  391. Re-Imagen: Retrieval-Augmented Text-to-Image Generator
    Wenhu Chen, Hexiang Hu, Chitwan Saharia, William W. Cohen
    arXiv 2022. Paper  
    2022-09-29
    2022-09-29
  392. DreamFusion: Text-to-3D using 2D Diffusion
    Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall
    arXiv 2022. Paper   Github  
    2022-09-29
    2022-09-29
  393. Make-A-Video: Text-to-Video Generation without Text-Video Data
    Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan Hu, Harry Yang, Oron Ashual, Oran Gafni, Devi Parikh, Sonal Gupta, Yaniv Taigman
    arXiv 2022. Paper  
    2022-09-29
    2022-09-29
  394. Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion
    Nisha Huang, Fan Tang, Weiming Dong, Changsheng Xu
    ACM MM 2022. Paper   Github  
    2022-09-27
    2022-09-27
  395. Personalizing Text-to-Image Generation via Aesthetic Gradients
    Victor Gallego
    NeurIPS Workshop 2022. Paper   Github  
    2022-09-25
    2022-09-25
  396. Best Prompts for Text-to-Image Models and How to Find Them
    Nikita Pavlichenko, Dmitry Ustalov
    NeurIPS Workshop 2022. Paper  
    2022-09-23
    2022-09-23
  397. The Biased Artist: Exploiting Cultural Biases via Homoglyphs in Text-Guided Image Generation Models
    Lukas Struppek, Dominik Hintersdorf, Kristian Kersting
    arXiv 2022. Paper   Github  
    2022-09-19
    2022-09-19
  398. Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models
    Chen Henry Wu, Saman Motamed, Shaunak Srivastava, Fernando De la Torre
    NeurIPS 2022. Paper   Github  
    2022-09-14
    2022-09-14
  399. ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation
    Zhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing Fu
    ICLR 2023. Paper   Github  
    2022-09-09
    2022-09-09
  400. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
    Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman
    CVPR 2023. Paper   Project   Github  
    2022-08-25
    2022-08-25
  401. Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models
    Robin Rombach, Andreas Blattmann, Björn Ommer
    arXiv 2022. Paper   Github  
    2022-07-26
    2022-07-26
  402. Discrete Contrastive Diffusion for Cross-Modal and Conditional Generation
    Ye Zhu, Yu Wu, Kyle Olszewski, Jian Ren, Sergey Tulyakov, Yan Yan
    ICLR 2023. Paper   Github  
    2022-06-15
    2022-06-15
  403. Blended Latent Diffusion
    Omri Avrahami, Ohad Fried, Dani Lischinski
    ACM 2022. Paper   Project   Github  
    2022-06-06
    2022-06-06
  404. Compositional Visual Generation with Composable Diffusion Models
    Nan Liu, Shuang Li, Yilun Du, Antonio Torralba, Joshua B. Tenenbaum
    ECCV 2022. Paper   Project   Github  
    2022-06-03
    2022-06-03
  405. DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder
    Jie Shi, Chenfei Wu, Jian Liang, Xiang Liu, Nan Duan
    arXiv 2022. Paper  
    2022-06-01
    2022-06-01
  406. Text2Human: Text-Driven Controllable Human Image Generation
    Yuming Jiang, Shuai Yang, Haonan Qiu, Wayne Wu, Chen Change Loy, Ziwei Liu
    ACM 2022. Paper   Github  
    2022-05-31
    2022-05-31
  407. Improved Vector Quantized Diffusion Models
    Zhicong Tang, Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen
    arXiv 2022. Paper   Github  
    2022-05-31
    2022-05-31
  408. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
    Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S. Sara Mahdavi, Rapha Gontijo Lopes, Tim Salimans, Jonathan Ho, David J Fleet, Mohammad Norouzi
    NeurIPS 2022. Paper   Github  
    2022-05-23
    2022-05-23
  409. Retrieval-Augmented Diffusion Models
    Andreas Blattmann, Robin Rombach, Kaan Oktay, Björn Ommer
    NeurIPS 2022. Paper   Github  
    2022-04-25
    2022-04-25
  410. Hierarchical Text-Conditional Image Generation with CLIP Latents
    Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, Mark Chen
    arXiv 2022. Paper   Github  
    2022-04-13
    2022-04-13
  411. KNN-Diffusion: Image Generation via Large-Scale Retrieval
    Oron Ashual, Shelly Sheynin, Adam Polyak, Uriel Singer, Oran Gafni, Eliya Nachmani, Yaniv Taigman
    ICLR 2023. Paper  
    2022-04-06
    2022-04-06
  412. High-Resolution Image Synthesis with Latent Diffusion Models
    Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer
    CVPR 2022. Paper   Github  
    2021-12-20
    2021-12-20
  413. Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
    Zhisheng Xiao, Karsten Kreis, Arash Vahdat
    ICLR 2022 (Spotlight). Paper   Project  
    2021-12-15
    2021-12-15
  414. More Control for Free! Image Synthesis with Semantic Diffusion Guidance
    Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell
    WACV 2023. Paper   Project  
    2021-12-10
    2021-12-10
  415. Blended Diffusion for Text-driven Editing of Natural Images
    Omri Avrahami, Dani Lischinski, Ohad Fried
    CVPR 2022. Paper   Project   Github  
    2021-11-29
    2021-11-29
  416. Vector Quantized Diffusion Model for Text-to-Image Synthesis
    Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Lu Yuan, Baining Guo
    CVPR 2022. Paper   Github  
    2021-11-29
    2021-11-29
  417. DiffusionCLIP: Text-guided Image Manipulation Using Diffusion Models
    Gwanghyun Kim, Jong Chul Ye
    CVPR 2022. Paper   Github  
    2021-10-06
    2021-10-06
Counts - 417