-
UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data. Heeseung Kim, Sungwon Kim, Jiheum Yeom, Sungroh Yoon. arXiv 2023. 2023-06-28
-
Diffusion Posterior Sampling for Informed Single-Channel Dereverberation. Jean-Marie Lemercier, Simon Welker, Timo Gerkmann. arXiv 2023. 2023-06-21
-
Text-Driven Foley Sound Generation With Latent Diffusion Model. Yi Yuan, Haohe Liu, Xubo Liu, Xiyuan Kang, Peipei Wu, Mark D. Plumbley, Wenwu Wang. arXiv 2023. 2023-06-17
-
CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models. Hao-Wen Dong, Xiaoyu Liu, Jordi Pons, Gautam Bhattacharya, Santiago Pascual, Joan Serrà, Taylor Berg-Kirkpatrick, Julian McAuley. arXiv 2023. 2023-06-16
-
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis. Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter. arXiv 2023. 2023-06-15
-
Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement. Zilu Guo, Jun Du, Chin-Hui Lee, Yu Gao, Wenbin Zhang. arXiv 2023. 2023-06-14
-
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models. Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani. arXiv 2023. 2023-06-13
-
UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding. Chenpeng Du, Yiwei Guo, Feiyu Shen, Zhijun Liu, Zheng Liang, Xie Chen, Shuai Wang, Hui Zhang, Kai Yu. arXiv 2023. 2023-06-13
-
HiddenSinger: High-Quality Singing Voice Synthesis via Neural Audio Codec and Latent Diffusion Models. Ji-Sang Hwang, Sang-Hoon Lee, Seong-Whan Lee. arXiv 2023. 2023-06-12
-
Boosting Fast and High-Quality Speech Synthesis with Linear Diffusion. Haogeng Liu, Tao Wang, Jie Cao, Ran He, Jianhua Tao. arXiv 2023. 2023-06-09
-
Interpretable Style Transfer for Text-to-Speech with ControlVAE and Diffusion Bridge. Wenhao Guan, Tao Li, Yishuang Li, Hukai Huang, Qingyang Hong, Lin Li. Interspeech 2023. 2023-06-07
-
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias. Ziyue Jiang, Yi Ren, Zhenhui Ye, Jinglin Liu, Chen Zhang, Qian Yang, Shengpeng Ji, Rongjie Huang, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao. 2023-06-06
-
Zero-Shot Blind Audio Bandwidth Extension. Eloi Moliner, Filip Elvander, Vesa Välimäki. arXiv 2023. 2023-06-02
-
UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model. Anastasiia Iashchenko, Pavel Andreev, Ivan Shchekotov, Nicholas Babaev, Dmitry Vetrov. Interspeech 2023. 2023-06-01
-
EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis. Haobin Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao. Interspeech 2023. 2023-06-01
-
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation. Jiawei Huang, Yi Ren, Rongjie Huang, Dongchao Yang, Zhenhui Ye, Chen Zhang, Jinglin Liu, Xiang Yin, Zejun Ma, Zhou Zhao. arXiv 2023. 2023-05-29
-
Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model. Xiang Li, Songxiang Liu, Max W. Y. Lam, Zhiyong Wu, Chao Weng, Helen Meng. Interspeech 2023. 2023-05-26
-
DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion. Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee. 2023-05-25
-
Efficient Neural Music Generation. Max W. Y. Lam, Qiao Tian, Tang Li, Zongyu Yin, Siyuan Feng, Ming Tu, Yuliang Ji, Rui Xia, Mingbo Ma, Xuchen Song, Jitong Chen, Yuping Wang, Yuxuan Wang. 2023-05-25
-
2023-05-24
-
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models. Ziyue Jiang, Qian Yang, Jialong Zuo, Zhenhui Ye, Rongjie Huang, Yi Ren, Zhou Zhao. 2023-05-23
-
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models. Minki Kang, Wooseok Han, Sung Ju Hwang, Eunho Yang. arXiv 2023. 2023-05-23
-
SE-Bridge: Speech Enhancement with Consistent Brownian Bridge. Zhibin Qiu, Mengfan Fu, Fuchun Sun, Gulila Altenbek, Hao Huang. arXiv 2023. 2023-05-23
-
U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech. Xin Jing, Yi Chang, Zijiang Yang, Jiangjian Xie, Andreas Triantafyllopoulos, Bjoern W. Schuller. 2023-05-22
-
DiffAVA: Personalized Text-to-Audio Generation with Visual Alignment. Shentong Mo, Jing Shi, Yapeng Tian. arXiv 2023. 2023-05-22
-
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer. Huadai Liu, Rongjie Huang, Xuan Lin, Wenqiang Xu, Maozong Zheng, Hong Chen, Jinzheng He, Zhou Zhao. 2023-05-22
-
Duplex Diffusion Models Improve Speech-to-Speech Translation. Xianchao Wu. ACL 2023. 2023-05-22
-
A Preliminary Study on Augmenting Speech Emotion Recognition using a Diffusion Model. Ibrahim Malik, Siddique Latif, Raja Jurdak, Björn Schuller. arXiv 2023. 2023-05-19
-
RMSSinger: Realistic-Music-Score based Singing Voice Synthesis. Jinzheng He, Jinglin Liu, Zhenhui Ye, Rongjie Huang, Chenye Cui, Huadai Liu, Zhou Zhao. 2023-05-18
-
Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders. Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji. arXiv 2023. 2023-05-18
-
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model. Zhen Ye, Wei Xue, Xu Tan, Jie Chen, Qifeng Liu, Yike Guo. 2023-05-11
-
Diffusion-based Signal Refiner for Speech Separation. Masato Hirano, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji. arXiv 2023. 2023-05-10
-
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model. Deepanway Ghosal, Navonil Majumder, Ambuj Mehrish, Soujanya Poria. 2023-04-24
-
DiffVoice: Text-to-Speech with Latent Diffusion. Zhijun Liu, Yiwei Guo, Kai Yu. ICASSP 2023. 2023-04-23
-
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models. Yuancheng Wang, Zeqian Ju, Xu Tan, Lei He, Zhizheng Wu, Jiang Bian, Sheng Zhao. 2023-04-03
-
Data Augmentation for Environmental Sound Classification Using Diffusion Probabilistic Model with Top-k Selection Discriminator. Yunhao Chen, Yunjie Zhu, Zihui Yan, Jianlu Shen, Zhen Ren, Yifan Huang. 2023-03-27
-
Enhancing Unsupervised Speech Recognition with Diffusion GANs. Xianchao Wu. ICASSP 2023. 2023-03-23
-
Speech Signal Improvement Using Causal Generative Diffusion Models. Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Tal Peer, Timo Gerkmann. ICASSP 2023. 2023-03-15
-
2023-03-15
-
DiffuseRoll: Multi-track multi-category music generation based on diffusion model. Hongfei Wang. arXiv 2023. 2023-03-14
-
An investigation into the adaptability of a diffusion-based TTS model. Haolin Chen, Philip N. Garner. arXiv 2023. 2023-03-03
-
Defending against Adversarial Audio via Diffusion Model. Shutong Wu, Jiongxiao Wang, Wei Ping, Weili Nie, Chaowei Xiao. 2023-03-02
-
Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement. Bunlong Lay, Simon Welker, Julius Richter, Timo Gerkmann. arXiv 2023. 2023-02-28
-
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech. Jiyoung Lee, Joon Son Chung, Soo-Whan Chung. ICASSP 2023. 2023-02-27
-
Metric-oriented Speech Enhancement using Diffusion Probabilistic Model. Chen Chen, Yuchen Hu, Weiwei Weng, Eng Siong Chng. arXiv 2023. 2023-02-23
-
ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models. Pengfei Zhu, Chao Pang, Shuohuan Wang, Yekun Chai, Yu Sun, Hao Tian, Hua Wu. arXiv 2023. 2023-02-09
-
Noise2Music: Text-conditioned Music Generation with Diffusion Models. Qingqing Huang, Daniel S. Park, Tao Wang, Timo I. Denk, Andy Ly, Nanxin Chen, Zhengdong Zhang, Zhishuai Zhang, Jiahui Yu, Christian Frank, Jesse Engel, Quoc V. Le, William Chan, Wei Han. 2023-02-08
-
Multi-Source Diffusion Models for Simultaneous Music Generation and Separation. Giorgio Mariani, Irene Tallini, Emilian Postolache, Michele Mancusi, Luca Cosmo, Emanuele Rodolà. 2023-02-04
-
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt. Dongchao Yang, Songxiang Liu, Rongjie Huang, Guangzhi Lei, Chao Weng, Helen Meng, Dong Yu. 2023-01-31
-
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models. Rongjie Huang, Jiawei Huang, Dongchao Yang, Yi Ren, Luping Liu, Mingze Li, Zhenhui Ye, Jinglin Liu, Xiang Yin, Zhou Zhao. 2023-01-30
-
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models. Haohe Liu, Zehua Chen, Yi Yuan, Xinhao Mei, Xubo Liu, Danilo Mandic, Wenwu Wang, Mark D. Plumbley. 2023-01-29
-
Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion. Flavio Schneider, Zhijing Jin, Bernhard Schölkopf. 2023-01-27
-
Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation. Shahar Lutati, Eliya Nachmani, Lior Wolf. arXiv 2023. 2023-01-25
-
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech. Zehua Chen, Yihan Wu, Yichong Leng, Jiawei Chen, Haohe Liu, Xu Tan, Yang Cui, Ke Wang, Lei He, Sheng Zhao, Jiang Bian, Danilo Mandic. 2022-12-30
-
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation. Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann. ICASSP 2023. 2022-12-22
-
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation. Ludan Ruan, Yiyang Ma, Huan Yang, Huiguo He, Bei Liu, Jianlong Fu, Nicholas Jing Yuan, Qin Jin, Baining Guo. 2022-12-19
-
Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder. Yusuke Yasuda, Tomoki Toda. ICASSP 2023. 2022-12-16
-
Any-speaker Adaptive Text-To-Speech Synthesis with Diffusion Models. Minki Kang, Dongchan Min, Sung Ju Hwang. 2022-11-17
-
EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance. Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu. 2022-11-17
-
Unsupervised vocal dereverberation with diffusion-based generative models. Koichi Saito, Naoki Murata, Toshimitsu Uesaka, Chieh-Hsin Lai, Yuhta Takida, Takao Fukui, Yuki Mitsufuji. ICASSP 2023. 2022-11-08
-
DiffPhase: Generative Diffusion-based STFT Phase Retrieval. Tal Peer, Simon Welker, Timo Gerkmann. ICASSP 2023. 2022-11-08
-
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS. Dongchao Yang, Songxiang Liu, Jianwei Yu, Helin Wang, Chao Weng, Yuexian Zou. ICASSP 2023. 2022-11-04
-
Cold Diffusion for Speech Enhancement. Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux. ICASSP 2023. 2022-11-04
-
Analysing Diffusion-based Generative Approaches versus Discriminative Approaches for Speech Restoration. Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann. 2022-11-04
-
SDMuse: Stochastic Differential Music Editing and Generation via Hybrid Representation. Chen Zhang, Yi Ren, Kejun Zhang, Shuicheng Yan. 2022-11-01
-
Diffusion-based Generative Speech Source Separation. Robin Scheibler, Youna Ji, Soo-Whan Chung, Jaeuk Byun, Soyeon Choe, Min-Seok Choi. ICASSP 2023. 2022-10-31
-
SRTNet: Time Domain Speech Enhancement Via Stochastic Refinement. Zhibin Qiu, Mengfan Fu, Yinfeng Yu, LiLi Yin, Fuchun Sun, Hao Huang. 2022-10-30
-
A Versatile Diffusion-based Generative Refiner for Speech Enhancement. Ryosuke Sawata, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji. ICASSP 2023. 2022-10-27
-
Conditioning and Sampling in Variational Diffusion Models for Speech Super-resolution. Chin-Yun Yu, Sung-Lin Yeh, György Fazekas, Hao Tang. 2022-10-27
-
Solving Audio Inverse Problems with a Diffusion Model. Eloi Moliner, Jaakko Lehtinen, Vesa Välimäki. ICASSP 2023. 2022-10-27
-
Full-band General Audio Synthesis with Score-based Diffusion. Santiago Pascual, Gautam Bhattacharya, Chunghsin Yeh, Jordi Pons, Joan Serrà. arXiv 2022. 2022-10-26
-
TransFusion: Transcribing Speech with Multinomial Diffusion. Matthew Baas, Kevin Eloff, Herman Kamper. 2022-10-14
-
Hierarchical Diffusion Models for Singing Voice Neural Vocoder. Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji. arXiv 2022. 2022-10-14
-
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration. Yuma Koizumi, Kohei Yatabe, Heiga Zen, Michiel Bacchiani. 2022-10-03
-
Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN. Yin-Ping Cho, Yu Tsao, Hsin-Min Wang, Yi-Wen Liu. 2022-09-21
-
Instrument Separation of Symbolic Music by Explicitly Guided Diffusion Model. Sangjun Han, Hyeongrae Ihm, DaeHan Ahn, Woohyung Lim. NeurIPS Workshop 2022. 2022-09-05
-
Speech Enhancement and Dereverberation with Diffusion-based Generative Models. Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Timo Gerkmann. 2022-08-11
-
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation. Da-Yi Wu, Wen-Yi Hsiao, Fu-Rong Yang, Oscar Friedman, Warren Jackson, Scott Bruzenak, Yi-Wen Liu, Yi-Hsuan Yang. 2022-08-09
-
Diffsound: Discrete Diffusion Model for Text-to-sound Generation. Dongchao Yang, Jianwei Yu, Helin Wang, Wen Wang, Chao Weng, Yuexian Zou, Dong Yu. 2022-07-20
-
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech. Rongjie Huang, Zhou Zhao, Huadai Liu, Jinglin Liu, Chenye Cui, Yi Ren. 2022-07-13
-
NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates. Seungu Han, Junhyeok Lee. 2022-06-17
-
CARD: Classification and Regression Diffusion Models. Xizewen Han, Huangjie Zheng, Mingyuan Zhou. NeurIPS 2022. 2022-06-15
-
Adversarial Audio Synthesis with Complex-valued Polynomial Networks. Yongtao Wu, Grigorios G Chrysos, Volkan Cevher. ICML workshop 2022. 2022-06-14
-
Multi-instrument Music Synthesis with Spectrogram Diffusion. Curtis Hawthorne, Ian Simon, Adam Roberts, Neil Zeghidour, Josh Gardner, Ethan Manilow, Jesse Engel. ISMIR 2022. 2022-06-11
-
Universal Speech Enhancement with Score-based Diffusion. Joan Serrà, Santiago Pascual, Jordi Pons, R. Oguz Araz, Davide Scaini. arXiv 2022. 2022-06-07
-
Zero-Shot Voice Conditioning for Denoising Diffusion TTS Models. Alon Levkovitch, Eliya Nachmani, Lior Wolf. 2022-06-05
-
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data. Sungwon Kim, Heeseung Kim, Sungroh Yoon. 2022-05-30
-
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis. Yichong Leng, Zehua Chen, Junliang Guo, Haohe Liu, Jiawei Chen, Xu Tan, Danilo Mandic, Lei He, Xiang-Yang Li, Tao Qin, Sheng Zhao, Tie-Yan Liu. 2022-05-30
-
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis. Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, Zhou Zhao. 2022-04-21
-
Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain. Simon Welker, Julius Richter, Timo Gerkmann. 2022-03-31
-
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping. Yuma Koizumi, Heiga Zen, Kohei Yatabe, Nanxin Chen, Michiel Bacchiani. Interspeech 2022. 2022-03-31
-
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis. Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu. 2022-03-25
-
Conditional Diffusion Probabilistic Model for Speech Enhancement. Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, Yu Tsao. 2022-02-10
-
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training. Zehua Chen, Xu Tan, Ke Wang, Shifeng Pan, Danilo Mandic, Lei He, Sheng Zhao. ICASSP 2022. 2022-02-08
-
ItôWave: Itô Stochastic Differential Equation Is All You Need For Wave Generation. Shoule Wu, Ziqiang Shi. 2022-01-29
-
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs. Songxiang Liu, Dan Su, Dong Yu. 2022-01-28
-
Itô-Taylor Sampling Scheme for Denoising Diffusion Probabilistic Models using Ideal Derivatives. Hideyuki Tachibana, Mocho Go, Muneyoshi Inahara, Yotaro Katayama, Yotaro Watanabe. arXiv 2021. 2021-12-26
-
Guided-TTS: Text-to-Speech with Untranscribed Speech. Heeseung Kim, Sungwon Kim, Sungroh Yoon. ICML 2021. 2021-11-30
-
Denoising Diffusion Gamma Models. Eliya Nachmani, Robin San Roman, Lior Wolf. arXiv 2021. 2021-10-10
-
EdiTTS: Score-based Editing for Controllable Text-to-Speech. Jaesung Tae, Hyeongju Kim, Taesu Kim. 2021-10-06
-
Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme. Vadim Popov, Ivan Vovk, Vladimir Gogoryan, Tasnima Sadekova, Mikhail Kudinov, Jiansheng Wei. 2021-09-28
-
A Study on Speech Enhancement Based on Diffusion Probabilistic Model. Yen-Ju Lu, Yu Tsao, Shinji Watanabe. APSIPA 2021. 2021-07-25
-
Variational Diffusion Models. Diederik P. Kingma, Tim Salimans, Ben Poole, Jonathan Ho. 2021-07-01
-
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan. 2021-06-17
-
CRASH: Raw Audio Score-based Generative Modeling for Controllable High-resolution Drum Sound Synthesis. Simon Rouard, Gaëtan Hadjeres. 2021-06-14
-
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Driven Adaptive Prior. Sang-gil Lee, Heeseung Kim, Chaehun Shin, Xu Tan, Chang Liu, Qi Meng, Tao Qin, Wei Chen, Sungroh Yoon, Tie-Yan Liu. 2021-06-11
-
DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion. Songxiang Liu, Yuewen Cao, Dan Su, Helen Meng. 2021-05-28
-
ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation. Shoule Wu, Ziqiang Shi. 2021-05-17
-
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech. Vadim Popov, Ivan Vovk, Vladimir Gogoryan, Tasnima Sadekova, Mikhail Kudinov. 2021-05-13
-
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism. Jinglin Liu, Chengxi Li, Yi Ren, Feiyang Chen, Peng Liu, Zhou Zhao. 2021-05-06
-
Restoring degraded speech via a modified diffusion model. Jianwei Zhang, Suren Jayasuriya, Visar Berisha. Interspeech 2021. 2021-04-22
-
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling. Junhyeok Lee, Seungu Han. 2021-04-06
-
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech. Myeonghun Jeong, Hyeongju Kim, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim. Interspeech 2021. 2021-04-03
-
Symbolic Music Generation with Diffusion Models. Gautam Mittal, Jesse Engel, Curtis Hawthorne, Ian Simon. 2021-03-30
-
DiffWave: A Versatile Diffusion Model for Audio Synthesis. Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, Bryan Catanzaro. 2020-09-21
-
WaveGrad: Estimating Gradients for Waveform Generation. Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan. 2020-09-02