Main author: Szepesvári, Csaba.
Title: Algorithms for reinforcement learning [electronic resource] / Csaba Szepesvári.
Publication details: San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA): Morgan & Claypool, c2010.
Summary note: Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long-term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms' merits and limitations. Reinforcement learning is of great interest because of the large number of practical applications that it can be used to address, ranging from problems in artificial intelligence to operations research or control engineering. In this book, we focus on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming. We give a fairly comprehensive catalog of learning problems, describe the core ideas, note a large number of state-of-the-art algorithms, followed by the discussion of their theoretical properties and limitations.
Reinforcement learning