(Advanced) 004 - AWS DeepRacer: A Collection of Essential Resources

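This post covers the AWS DeepRacer reward function parameters and how they are used to guide the car, including reward strategies for following the center line and staying within the track borders. It also points to a log analysis tool and community resources, and explains where to get track waypoint data and how to compute an optimal racing line, as guidance for more advanced DeepRacer training and tuning.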


Must-read resources for advancing with DeepRacer

1. reward_function input parameters

https://docs.aws.amazon.com/deepracer/latest/developerguide/deepracer-reward-function-input.html
{
    "all_wheels_on_track": Boolean,        # flag to indicate if the agent is on the track
    "x": float,                            # agent's x-coordinate in meters
    "y": float,                            # agent's y-coordinate in meters
    "closest_objects": [int, int],         # zero-based indices of the two closest objects to the agent's current position of (x, y).
    "closest_waypoints": [int, int],       # indices of the two nearest waypoints.
    "distance_from_center": float,         # distance in meters from the track center 
    "is_crashed": Boolean,                 # Boolean flag to indicate whether the agent has crashed.
    "is_left_of_center": Boolean,          # Flag to indicate if the agent is on the left side to the track center or not. 
    "is_offtrack": Boolean,                # Boolean flag to indicate whether the agent has gone off track.
    "is_reversed": Boolean,                # flag to indicate if the agent is driving clockwise (True) or counter clockwise (False).
    "heading": float,                      # agent's yaw in degrees
    "objects_distance": [float, ],         # list of the objects' distances in meters between 0 and track_length in relation to the starting line.
    "objects_heading": [float, ],          # list of the objects' headings in degrees between -180 and 180.
    "objects_left_of_center": [Boolean, ], # list of Boolean flags indicating whether elements' objects are left of the center (True) or not (False).
    "objects_location": [(float, float),], # list of object locations [(x,y), ...].
    "objects_speed": [float, ],            # list of the objects' speeds in meters per second.
    "progress": float,                     # percentage of track completed
    "speed": float,                        # agent's speed in meters per second (m/s)
    "steering_angle": float,               # agent's steering angle in degrees
    "steps": int,                          # number of steps completed
    "track_length": float,                 # track length in meters.
    "track_width": float,                  # width of the track
    "waypoints": [(float, float), ]        # list of (x,y) as milestones along the track center
}

The most important parameters are:

  1. all_wheels_on_track
  2. x,y
  3. distance_from_center
  4. heading
  5. progress
  6. speed
  7. steps
  8. track_width
  9. waypoints
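
As a rough illustration of how several of these parameters work together, the sketch below compares the car's `heading` with the direction of the track segment between the two `closest_waypoints`. The 10-degree threshold and the 0.5 penalty factor are assumptions chosen for illustration; tune them for your own track and action space.

```python
import math

def reward_function(params):
    # Read input parameters
    waypoints = params['waypoints']
    closest_waypoints = params['closest_waypoints']
    heading = params['heading']

    reward = 1.0

    # Direction of the center line between the two closest waypoints, in degrees
    prev_point = waypoints[closest_waypoints[0]]
    next_point = waypoints[closest_waypoints[1]]
    track_direction = math.degrees(
        math.atan2(next_point[1] - prev_point[1], next_point[0] - prev_point[0])
    )

    # Smallest angle between the track direction and the car's heading
    direction_diff = abs(track_direction - heading)
    if direction_diff > 180:
        direction_diff = 360 - direction_diff

    # Assumed threshold: penalize the car when it points too far off the track direction
    DIRECTION_THRESHOLD = 10.0
    if direction_diff > DIRECTION_THRESHOLD:
        reward *= 0.5

    return float(reward)
```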

2. Common reward_function examples

1. Follow the center line

def reward_function(params):
    # Read input parameters
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']

    # Calculate 3 markers that are increasingly further away from the center line
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    # Give higher reward if the car is closer to center line and vice versa
    if distance_from_center <= marker_1:
        reward = 1
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely crashed/ close to off track

    # Always return a float value
    return float(reward)
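
The three markers give a stepped reward. A continuous variant of the same idea can be sketched as follows. The function name `centerline_reward` and the linear falloff are assumptions made for illustration, not part of the AWS examples; the loop at the end shows how a reward function can be called locally with a hand-written `params` dictionary.

```python
def centerline_reward(params):
    # Read input parameters
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']

    # 1.0 on the center line, falling linearly to 0.0 at the track edge
    half_width = 0.5 * track_width
    closeness = 1.0 - distance_from_center / half_width

    # Never return less than the small default reward
    return float(max(closeness, 1e-3))


# Quick local check with hand-written inputs (values are illustrative)
for d in (0.0, 0.15, 0.3):
    print(d, centerline_reward({'track_width': 0.6, 'distance_from_center': d}))
```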

2. Stay within the track borders

def reward_function(params):
    # Read input parameters
    all_wheels_on_track = params['all_wheels_on_track']
    distance_from_center = params['distance_from_center']
    track_width = params['track_width']
    
    # Give a very low reward by default
    reward = 1e-3

    # Give a high reward if no wheels go off the track and 
    # the car is somewhere in between the track borders 
    if all_wheels_on_track and (0.5*track_width - distance_from_center) >= 0.05:
        reward = 1.0

    # Always return a float value
    return reward

3. Prevent zig-zagging


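def reward_function(params):
    # Read input parameters
    all_wheels_on_track = params['all_wheels_on_track']
    distance_from_center = params['distance_from_center']
    track_width = params['track_width']
    abs_steering = abs(params['steering_angle'])  # only the magnitude matters

    # Give a very low reward by default
    reward = 1e-3

    # Give a high reward if no wheels go off the track and 
    # the car is somewhere in between the track borders 
    if all_wheels_on_track and (0.5*track_width - distance_from_center) >= 0.05:
        reward = 1.0

    # Penalize large steering angles to discourage zig-zagging.
    # The 15-degree threshold and 0.8 factor are assumptions; tune them
    # to your own action space.
    ABS_STEERING_THRESHOLD = 15.0
    if abs_steering > ABS_STEERING_THRESHOLD:
        reward *= 0.8

    # Always return a float value
    return float(reward)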

3. Log analysis

link: lulu2002

Usage:

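  1. Download your log (training log or evaluation log)
  2. Click the button to upload the model
  3. In the table, click the episode you want to analyze
  4. The plot on the right shows the car's path and speed distribution for that episode
  5. Other information is also available through the panel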
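
For a quick look at a log without the web tool, per-episode statistics can be sketched in plain Python. This assumes the log contains `SIM_TRACE_LOG:` lines whose first twelve comma-separated fields are in the order listed in `FIELDS`, and that a discrete action space is used (so the action field holds no commas). Both are assumptions; check them against your own log before relying on the numbers.

```python
from collections import defaultdict

# Assumed column order of a SIM_TRACE_LOG line (discrete action space)
FIELDS = ("episode", "step", "x", "y", "heading", "steering_angle",
          "speed", "action", "reward", "done", "all_wheels_on_track", "progress")
PREFIX = "SIM_TRACE_LOG:"


def parse_trace_line(line):
    """Return a dict for one SIM_TRACE_LOG line, or None for any other line."""
    start = line.find(PREFIX)
    if start < 0:
        return None
    values = line[start + len(PREFIX):].strip().split(",")
    if len(values) < len(FIELDS):
        return None
    row = dict(zip(FIELDS, values))  # extra trailing fields are ignored
    for key in ("episode", "step"):
        row[key] = int(row[key])
    for key in ("x", "y", "heading", "steering_angle", "speed", "reward", "progress"):
        row[key] = float(row[key])
    return row


def summarize_episodes(lines):
    """Aggregate step count, total reward, mean speed and final progress per episode."""
    rows_by_episode = defaultdict(list)
    for line in lines:
        row = parse_trace_line(line)
        if row is not None:
            rows_by_episode[row["episode"]].append(row)

    summary = {}
    for episode, rows in rows_by_episode.items():
        summary[episode] = {
            "steps": len(rows),
            "total_reward": sum(r["reward"] for r in rows),
            "mean_speed": sum(r["speed"] for r in rows) / len(rows),
            "final_progress": rows[-1]["progress"],
        }
    return summary
```

Passing an open file object works as well, for example `summarize_episodes(open(path))`, where `path` points to your downloaded log.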

3. Getting track waypoints from the AWS DeepRacer community

link: aws-deepracer-community

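Here you can get the waypoint data for every track, along with each track's total length, width, and release date.
For example, 2022_may_pro.npy stores the waypoint values for that track.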
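
Once the waypoint file is loaded, the track length and width can be recomputed as a sanity check. The sketch below assumes each row holds six floats in the order center x, center y, inner x, inner y, outer x, outer y; that layout is an assumption, so confirm it against the file you download. The helper names are hypothetical.

```python
import math


def split_borders(rows):
    """Split rows of six floats into center, inner and outer (x, y) lists."""
    center = [(r[0], r[1]) for r in rows]
    inner = [(r[2], r[3]) for r in rows]
    outer = [(r[4], r[5]) for r in rows]
    return center, inner, outer


def loop_length(points):
    """Length of the closed polyline through the given points."""
    total = 0.0
    for i, (x, y) in enumerate(points):
        nx, ny = points[(i + 1) % len(points)]
        total += math.hypot(nx - x, ny - y)
    return total


def mean_width(inner, outer):
    """Average distance between matching inner and outer border points."""
    gaps = [math.hypot(ox - ix, oy - iy) for (ix, iy), (ox, oy) in zip(inner, outer)]
    return sum(gaps) / len(gaps)


# With NumPy installed, the rows could be obtained like this (assumed usage):
# rows = numpy.load("2022_may_pro.npy").tolist()
```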

4. Getting information from the AWS DeepRacer community

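link: DeepRacer community
The DeepRacer community has most of the information you will need. One of its projects, deepracer-for-cloud, is a repository that lets you run training in the cloud or on a local machine. Training directly on AWS costs money, about $3.55 per hour (roughly ¥24 per hour), which is hard on hobbyists, so if this interests you, consider deploying it locally. The hardware requirements are fairly high, though: my machine has 16 GB of RAM and a GPU with 3 GB of memory, and it barely copes. I can only train with a small batch size; otherwise CPU/GPU usage gets so high that the system tends to kill the process.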

CPU usage is mostly above 80%.
GPU usage is mostly above 95%.
Training crashes from time to time with out-of-memory errors.

I recommend 32 GB of RAM and a GPU with 8 GB of memory.

5. Computing the optimal racing line

link: cdthompson/deepracer-k1999-race-lines

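The idea behind the optimal line is to replace the curvature at each intermediate point with a value derived from its two neighboring points, which makes the curve smoother and flatter. A later post will extract the core code and run the calculation.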
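
To make the smoothing idea concrete, here is a simplified sketch. It is not the K1999 implementation from the linked repository: it measures curvature at a point from its two neighbors, then pulls each point toward the midpoint of its neighbors, and it omits the check that keeps the line inside the track borders. All function names are hypothetical.

```python
import math


def menger_curvature(p_prev, p, p_next):
    """Signed curvature (1 / radius) of the circle through three points."""
    (x1, y1), (x2, y2), (x3, y3) = p_prev, p, p_next
    # Twice the signed area of the triangle formed by the three points
    cross = (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)
    a = math.hypot(x2 - x1, y2 - y1)
    b = math.hypot(x3 - x2, y3 - y2)
    c = math.hypot(x3 - x1, y3 - y1)
    if a * b * c == 0.0:
        return 0.0  # coincident points: treat as straight
    return 2.0 * cross / (a * b * c)


def relax_point(p_prev, p, p_next, alpha=0.5):
    """Move p a fraction alpha of the way toward the midpoint of its neighbors."""
    mid_x = 0.5 * (p_prev[0] + p_next[0])
    mid_y = 0.5 * (p_prev[1] + p_next[1])
    return (p[0] + alpha * (mid_x - p[0]), p[1] + alpha * (mid_y - p[1]))


def smooth_loop(points, iterations=10, alpha=0.5):
    """Repeatedly relax every point of a closed loop (no track-border check)."""
    pts = list(points)
    n = len(pts)
    for _ in range(iterations):
        pts = [relax_point(pts[i - 1], pts[i], pts[(i + 1) % n], alpha)
               for i in range(n)]
    return pts
```

Relaxing a point lowers the magnitude of the curvature at that point, which is the flattening effect described above. A real racing-line solver also has to clamp every moved point between the inner and outer borders.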

6. Capstone

link: Capstone

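  1. Computing the action space, with data analysis and visualization
  2. Computing the optimal speed, with speed distribution and visualization: RaceLine_Speed_ActionSpace
  3. Computing the optimal racing line, based on the K1999 algorithm: Race-Line-Calculation
  4. Using Selenium to automate model submission and training submission: Selenium_DeepRacer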
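
One common way to turn a racing line into target speeds is to cap the lateral acceleration in each corner, so that v = sqrt(a_lat * r) for a corner of radius r. The sketch below illustrates that relation only. It is not taken from the Capstone repository, and the default limits of 4.0 m/s and 4.0 m/s² are placeholder assumptions.

```python
import math


def corner_speed(radius_m, max_speed=4.0, max_lateral_accel=4.0):
    """Highest speed (m/s) that keeps v**2 / r within the lateral acceleration limit."""
    if radius_m <= 0.0:
        raise ValueError("radius must be positive")
    # Tight corners are limited by lateral grip, gentle ones by the top speed
    return min(max_speed, math.sqrt(max_lateral_accel * radius_m))
```

The radius at each point of the line is the reciprocal of the absolute curvature there; a straight section has an infinite radius and simply gets `max_speed`.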

7. References

  1. https://github.com/cdthompson/deepracer-k1999-race-lines
  2. https://github.com/aws-deepracer-community/deepracer-analysis
  3. https://github.com/aws-deepracer-community/deepracer-simapp/tree/master/bundle/deepracer_simulation_environment/share/deepracer_simulation_environment/routes

In later posts I will walk through how to use each of these resources, along with practical tips, based on my own understanding.

