
Transformer-TTS: A PyTorch implementation of "Neural Speech Synthesis with Transformer Network"

Uploader: weixin_42099906 | Upload time: 2016/5/8 12:34:51 | File size: 1.51MB | File type: ZIP
Transformer-TTS is a PyTorch implementation of "Neural Speech Synthesis with Transformer Network". Compared with well-known seq2seq models such as Tacotron, this model trains about 3 to 4 times faster, and the quality of the synthesized speech is almost the same.
Experiments confirmed that each training step takes about 0.5 seconds.
Instead of a WaveNet vocoder, the post-network was trained using Tacotron's CBHG module, and the Griffin-Lim algorithm is used to convert the spectrogram into a raw waveform.
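The repository's own Griffin-Lim code (in utils.py) is not shown on this page. As a minimal NumPy sketch of the idea, with assumed toy parameters (n_fft=256, hop=64) rather than the repo's actual hyperparameters: given only a magnitude spectrogram, the algorithm repeatedly re-estimates phase by round-tripping through the inverse and forward STFT.

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Forward STFT via Hann-windowed FFT frames."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

def istft(spec, n_fft=256, hop=64):
    """Overlap-add inverse STFT with window-energy normalization."""
    win = np.hanning(n_fft)
    out = np.zeros(hop * (len(spec) - 1) + n_fft)
    norm = np.zeros_like(out)
    for i, frame in enumerate(spec):
        out[i * hop:i * hop + n_fft] += np.fft.irfft(frame) * win
        norm[i * hop:i * hop + n_fft] += win ** 2
    return out / np.maximum(norm, 1e-8)

def griffin_lim(mag, n_iter=30, n_fft=256, hop=64):
    """Recover a waveform from a magnitude spectrogram by iteratively
    re-estimating the phase (Griffin-Lim)."""
    phase = np.exp(2j * np.pi * np.random.rand(*mag.shape))
    for _ in range(n_iter):
        x = istft(mag * phase, n_fft, hop)
        phase = np.exp(1j * np.angle(stft(x, n_fft, hop)))
    return istft(mag * phase, n_fft, hop)
```

In practice the repo applies this to the linear spectrogram predicted by the CBHG post-network; more iterations trade speed for audio quality.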
Requirements: install Python 3, install pytorch==0.4.0, then install the dependencies with `pip install -r requirements.txt`.
Data: I used the LJSpeech dataset, which consists of pairs of text transcripts and wav files.
The full dataset (13,100 pairs) is available for download.
I used … and … as the preprocessing code.
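The archive's prepare_data.py and preprocess.py are not shown on this page, so as a hedged sketch of the data side only: LJSpeech ships a pipe-separated metadata.csv (file id | raw text | normalized text), and pairing transcripts with wav ids can be done roughly like this (the function name is hypothetical, not from the repo).

```python
import csv

def load_ljspeech_pairs(metadata_path):
    """Parse LJSpeech's metadata.csv into (wav_id, text) pairs.
    Each line is pipe-separated: file id | raw text | normalized text.
    Hypothetical helper, not the repo's actual preprocessing code."""
    pairs = []
    with open(metadata_path, encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE):
            wav_id, text = row[0], row[2]  # use the normalized transcript
            pairs.append((wav_id, text))
    return pairs
```

The wav file for each pair is then `wavs/<wav_id>.wav` inside the dataset directory.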
Pretrained models: you can download the pretrained models (160K steps for the AR model, 100K for the postnet); they can be found in the checkpoint/ directory.
Attention plots: diagonal alignment appears after about 15k steps.
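The diagonal alignment the author describes can be eyeballed from the attention PNGs in the archive, but it can also be checked numerically. The following diagnostic is not part of the Transformer-TTS repo; it is a simple assumed metric that measures how much attention mass lies near the diagonal of a decoder-step × encoder-step attention matrix.

```python
import numpy as np

def diagonality(attn, band=0.1):
    """Fraction of attention mass within a relative band of the diagonal.
    attn: (decoder_steps, encoder_steps) non-negative attention weights.
    Hypothetical diagnostic, not part of the Transformer-TTS repo."""
    t_steps, s_steps = attn.shape
    rows = np.arange(t_steps)[:, None] / max(t_steps - 1, 1)
    cols = np.arange(s_steps)[None, :] / max(s_steps - 1, 1)
    near_diag = np.abs(rows - cols) < band   # band of the unit diagonal
    return float((attn * near_diag).sum() / attn.sum())
```

A value near 1.0 indicates the clean diagonal expected after ~15k steps, while an untrained (near-uniform) attention map scores close to the band width.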
The attention plots below are at 16…

File Download

Resource Details

Archive contents (63 files, 1.51MB):

Transformer-TTS-master/
    .gitignore (119B)
    text/
        __init__.py (2.15KB)
        symbols.py (702B)
        cleaners.py (2.36KB)
        numbers.py (2.09KB)
        cmudict.py (1.91KB)
    requirements.txt (106B)
    hyperparams.py (742B)
    synthesis.py (2.10KB)
    train_postnet.py (2.00KB)
    network.py (6.29KB)
    samples/
        test.wav (428.67KB)
    LICENSE (1.04KB)
    train_transformer.py (3.79KB)
    utils.py (4.12KB)
    png/
        mel_original.png (54.67KB)
        mel_pred.png (49.92KB)
        training_loss.png (113.36KB)
        model.png (137.17KB)
        alphas.png (98.99KB)
        attention.gif (167.15KB)
        attention_encoder.gif (33.84KB)
        attention_decoder.gif (326.38KB)
        attention/ (12 images: attention_0_0.png … attention_2_3.png)
        attention_encoder/ (12 images: attention_enc_0_0.png … attention_enc_2_3.png)
        attention_decoder/ (12 images: attention_dec_0_0.png … attention_dec_2_3.png)
    README.md (4.73KB)
    prepare_data.py (1.32KB)
    module.py (15.30KB)
    preprocess.py (5.41KB)

Comments

Disclaimer

The resources on 【好快吧下载】 come from user sharing and are provided for study and research only. Please delete them within 24 hours of downloading; they must not be used for any other purpose, and you bear the consequences of doing otherwise. Given the nature of the Internet, 【好快吧下载】 cannot substantively review the ownership, legality, compliance, authenticity, scientific validity, completeness, or effectiveness of works, information, or content transmitted by users. Regardless of whether the operator of 【好快吧下载】 has conducted such a review, users themselves bear any legal liability, such as infringement or ownership disputes, that may arise or has arisen from the works, information, or content they transmit.
The resources on this site do not represent the views or positions of this site; they are based on user sharing. In accordance with Article 22 of China's Regulations on the Protection of the Right of Communication through Information Networks, if a resource involves infringement or related issues, please contact our customer service at 8686821#qq.com (replace # with @). This site will give its fullest support and cooperation, responding and handling the matter promptly. For more on copyright and disclaimers, see the Copyright and Disclaimer page.