DistributedDeepLearning: tutorials on running distributed deep learning on Batch AI (source code)

Uploader: weixin_42127748 | Upload time: 2023/7/28 19:18:56 | File size: 437KB | File type: ZIP
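Distributed training on Batch AI
This repository is a tutorial on how to train a CNN model in a distributed fashion using Batch AI.
The scenario covered is image classification, but the solution can be generalized to other deep learning scenarios such as segmentation and object detection.
Image classification is a common task in computer vision applications and is usually tackled by training a convolutional neural network (CNN).
For large models with large datasets, training on a single GPU can take weeks or months.
In some cases the model is so large that a reasonable batch size cannot fit on the GPU.
Distributed training helps shorten training time in these situations.
In this specific scenario, a ResNet50 CNN model is trained with Horovod on the ImageNet dataset as well as on synthetic data.
The tutorial demonstrates how to accomplish this with three of the most popular deep learning frameworks: TensorFlow, Keras, and PyTorch.
There are many ways to train deep learning models in a distributed fashion, including data-parallel and model-parallel approaches based on synchronous and asynchronous updates.
Currently the most common scenario is data parallelism with synchronous updates: it is the easiest to implement and is sufficient for most use cases.
In data-parallel distributed training with synchronous updates, the model is replicated across N hardware devices, and a mini-batch of training samples is divided into N micro-batches (see Figure 2).
Each device …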
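The data-parallel scheme with synchronous updates described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the code from the repository: the description is cut off before it says what each device does, so the per-device step here follows the usual pattern (each replica computes a gradient on its own micro-batch, the gradients are averaged across devices, and every replica applies the same update). The one-parameter model `y_hat = w * x` and the helper names `split_into_micro_batches`, `local_gradient`, `all_reduce_mean`, and `synchronous_step` are hypothetical and exist only for this illustration; no real GPUs or Horovod calls are involved.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair; hypothetical toy data format


def split_into_micro_batches(batch: Sequence[Sample], n_devices: int) -> List[List[Sample]]:
    """Divide one mini-batch into n_devices equally sized micro-batches."""
    if len(batch) % n_devices != 0:
        raise ValueError("batch size must be divisible by the number of devices")
    size = len(batch) // n_devices
    return [list(batch[i * size:(i + 1) * size]) for i in range(n_devices)]


def local_gradient(w: float, micro_batch: Sequence[Sample]) -> float:
    """Gradient of the mean squared error for the toy model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in micro_batch) / len(micro_batch)


def all_reduce_mean(values: Sequence[float]) -> float:
    """Stand-in for an allreduce: every device ends up with the mean gradient."""
    return sum(values) / len(values)


def synchronous_step(w: float, batch: Sequence[Sample], n_devices: int, lr: float) -> float:
    """One synchronous data-parallel update of the replicated weight w."""
    micro_batches = split_into_micro_batches(batch, n_devices)
    # Each simulated device holds the same w and sees only its own micro-batch.
    grads = [local_gradient(w, mb) for mb in micro_batches]
    # All replicas apply the same averaged gradient, so they stay identical.
    return w - lr * all_reduce_mean(grads)


if __name__ == "__main__":
    # Toy mini-batch generated from y = 3 * x.
    batch = [(float(x), 3.0 * x) for x in range(1, 9)]
    w = 0.0
    for _ in range(50):
        w = synchronous_step(w, batch, n_devices=4, lr=0.01)
    print(f"learned w = {w:.6f}")
```

Because the micro-batches are equally sized, the averaged gradient equals the gradient over the whole mini-batch, so a step with 4 simulated devices matches a step on a single device. That equivalence is what makes synchronous data parallelism comparatively easy to reason about.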

Resource details

(31 files, 437KB) DistributedDeepLearning: tutorials on running distributed deep learning on Batch AI (source code)
DistributedDeepLearning-master/
    HorovodTF/
        01_TrainTensorflowModel.ipynb  12.32KB
        src/
            imagenet_estimator_tf_horovod.py  13.40KB
            resnet_model.py  13.18KB
        Docker/
            Dockerfile  2.26KB
        00_CreateImageAndTest.ipynb  5.60KB
    .gitignore  1.17KB
    images/
        dist_training_diag2.png  65.44KB
    00_DataProcessing.ipynb  4.20KB
    Makefile  1.18KB
    HorovodKeras/
        src/
            imagenet_keras_horovod.py  11.71KB
            data_generator.py  1.80KB
        01_TrainKerasModel.ipynb  12.28KB
        Docker/
            Dockerfile  2.40KB
        00_CreateImageAndTest.ipynb  5.58KB
    LICENSE  1.13KB
    HorovodPytorch/
        src/
            imagenet_pytorch_horovod.py  10.54KB
        01_TrainPyTorchModel.ipynb  12.22KB
        Docker/
            Dockerfile  2.99KB
        cluster_config/
            nodeprep.sh  159B
            docker.service  1.23KB
            cluster.json  295B
        00_CreateImageAndTest.ipynb  5.59KB
    Docker/
        dockerfile  2.16KB
        environment.yml  269B
        jupyter_notebook_config.py  166B
    01_CreateResources.ipynb  17.28KB
    README.md  4.94KB
    include/
        build.mk  325B
    common/
        timer.py  2.93KB
        utils.py  871B
    valprep.sh  2.12MB
