batch_rl: Offline Reinforcement Learning (a.k.a. Batch Reinforcement Learning) on Atari 2600 Games - source code

Uploader: 42098759 | Upload time: 2021-07-07 20:36:41 | File size: 63KB | File type: ZIP
An Optimistic Perspective on Offline Reinforcement Learning (ICML 2020). This project provides an open-source implementation, built on the Dopamine framework, for running the experiments described in the paper. In this work, the logged experiences of a DQN agent are used to train off-policy agents in a purely offline setting (i.e., batch RL), without any new interaction with the environment during training. Refer to the project page for details, including how to train offline agents on the 50M-transition dataset without running out of RAM.

DQN Replay Dataset (logged DQN data)

The DQN Replay Dataset was collected as follows: a DQN agent was first trained on each of 60 Atari 2600 games for 200 million frames (the standard protocol) with sticky actions enabled, and all experience tuples of (observation, action, reward, next observation) encountered during training (approximately 50 million per game) were saved.

This logged DQN data is available in the public GCS bucket gs://atari-replay-datasets and can be downloaded with gsutil. To install gsutil, follow the official installation instructions. After installing gsutil, run the following command to copy the entire dataset:

gsutil -m cp -R gs://atari-replay-datasets
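To make the offline setting concrete, here is a minimal, illustrative sketch of training an off-policy agent purely from logged transitions, with no new environment interaction. It uses tabular Q-learning on toy data; the actual project trains Dopamine DQN-family agents on Atari frames, so every name and number below is a simplifying assumption, not the repo's API.

```python
# Offline (batch) RL sketch: fit Q purely from a fixed dataset of
# logged (state, action, reward, next_state, done) tuples.
import numpy as np

def offline_q_learning(dataset, n_states, n_actions,
                       gamma=0.99, lr=0.1, epochs=100):
    """Replay a fixed dataset repeatedly; never query the environment."""
    q = np.zeros((n_states, n_actions))
    for _ in range(epochs):
        for s, a, r, s_next, done in dataset:
            # Off-policy TD target: bootstrap from the greedy value
            # of the next state, regardless of the logging policy.
            target = r if done else r + gamma * q[s_next].max()
            q[s, a] += lr * (target - q[s, a])
    return q

# Toy logged experience from a 2-state chain: in state 0,
# action 1 moves to terminal state 1 with reward 1.
logged = [
    (0, 0, 0.0, 0, False),  # action 0 stays put, no reward
    (0, 1, 1.0, 1, True),   # action 1 earns reward 1 and terminates
]
q = offline_q_learning(logged, n_states=2, n_actions=2)
print(q[0].argmax())  # the offline agent learns to prefer action 1
```

The key property mirrored from the paper's setup is that the dataset is frozen: the learner only replays previously recorded tuples, exactly as the Dopamine-based agents replay the DQN Replay Dataset.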


Resource details

(44 files, 63KB) batch_rl: Offline Reinforcement Learning (a.k.a. Batch Reinforcement Learning) on Atari 2600 Games - source code

batch_rl-master/
    README.md (8.09KB)
    batch_rl/
        baselines/
            configs/
                random.gin (1.31KB)
                quantile.gin (1.43KB)
                dqn.gin (1.52KB)
            train.py (2.68KB)
            agents/
                __init__.py (608B)
                random_agent.py (1.26KB)
                dqn_agent.py (2.29KB)
                quantile_agent.py (2.68KB)
            __init__.py (608B)
            run_experiment.py (1017B)
            replay_memory/
                logged_replay_buffer.py (5.04KB)
                __init__.py (608B)
                logged_prioritized_replay_buffer.py (5.77KB)
        tests/
            atari_init_test.py (1.60KB)
            fixed_replay_runner_test.py (2.86KB)
        fixed_replay/
            configs/
                quantile.gin (1.50KB)
                c51.gin (1.70KB)
                multi_head_dqn.gin (1.62KB)
                rem.gin (1.63KB)
                dqn.gin (1.62KB)
            train.py (3.24KB)
            agents/
                multi_network_dqn_agent.py (3.08KB)
                multi_head_dqn_agent.py (3.14KB)
                __init__.py (608B)
                dqn_agent.py (3.56KB)
                rainbow_agent.py (3.59KB)
                quantile_agent.py (3.63KB)
            __init__.py (608B)
            run_experiment.py (4.33KB)
            replay_memory/
                fixed_replay_buffer.py (6.61KB)
        multi_head/
            multi_network_dqn_agent.py (8.86KB)
            multi_head_dqn_agent.py (5.57KB)
            atari_helpers.py (14.20KB)
            __init__.py (608B)
            quantile_agent.py (9.30KB)
        __init__.py (608B)
    LICENSE (11.15KB)
    CONTRIBUTING.md (1.23KB)
    online/
        configs/
            quantile.gin (1.57KB)
            c51.gin (1.50KB)
            rem.gin (1.40KB)
            dqn.gin (1.32KB)
        train.py (2.30KB)
