huggingbutt/roller_ball
RollerBall
Objective:
The blue ball aims to catch the red block.
Agent:
The agent is a blue ball.
The action is a 2-element np.array of dtype np.float32. The first element ranges from -1 to 1 and is the force applied along the x-axis. The second element ranges from -1 to 1 and is the force applied along the z-axis.
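As a minimal sketch of building a valid action, the helper below clips two force values to the documented [-1, 1] range and packs them into an np.float32 array. The name make_action is hypothetical and not part of the environment's API.

```python
import numpy as np

def make_action(force_x, force_z):
    """Hypothetical helper: pack two forces into a valid action array."""
    # Clip each force to the documented [-1, 1] range.
    action = np.clip([force_x, force_z], -1.0, 1.0)
    # The environment expects dtype np.float32.
    return action.astype(np.float32)

# The z force is out of range here, so it is clipped to -1.0.
action = make_action(0.5, -2.0)
```

Clipping before sending the action keeps a policy's raw outputs inside the range the agent accepts.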
Observation:
The observation space consists of 13 dimensions:
target_position_x: x-axis position of the target block
target_position_y: y-axis position of the target block
target_position_z: z-axis position of the target block
player_position_x: x-axis position of the blue ball
player_position_y: y-axis position of the blue ball
player_position_z: z-axis position of the blue ball
player_velocity_x: x-axis velocity of the blue ball
player_velocity_y: y-axis velocity of the blue ball
player_velocity_z: z-axis velocity of the blue ball
player_angular_velocity_x: x-axis angular velocity of the blue ball
player_angular_velocity_y: y-axis angular velocity of the blue ball
player_angular_velocity_z: z-axis angular velocity of the blue ball
distance: distance between the target block and the blue ball, computed from their positions and used to decide when the game terminates
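The exact formula for distance is not stated here. Assuming it is the straight-line (Euclidean) distance between the two positions, it could be recomputed from the position fields as sketched below. The function name euclidean_distance is hypothetical.

```python
import numpy as np

def euclidean_distance(target_pos, player_pos):
    """Hypothetical helper: straight-line distance between two 3D points."""
    # Assumption: the environment's `distance` is the Euclidean norm
    # of the difference between target and player positions.
    diff = np.asarray(target_pos, dtype=np.float64) - np.asarray(player_pos, dtype=np.float64)
    return float(np.linalg.norm(diff))

# Example positions given as (x, y, z) tuples.
d = euclidean_distance((3.0, 0.5, 4.0), (0.0, 0.5, 0.0))
```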
Customizable Functions:
transform_fun: Processes the raw observation returned by the environment for further handling by reward_fun and control_fun.
reward_fun: Determines the reward for the current step based on the original observation.
control_fun: Determines the environment's state (for example, whether the episode terminates) based on the original observation.
Default Function Implementations:
def transform_fun(obs):
    return np.array([
        # Write your code here.
        obs.player_position_x,
        obs.player_position_y,
        obs.player_position_z,
        obs.player_velocity_x,
        obs.player_velocity_y,
        obs.player_velocity_z,
        obs.player_angular_velocity_x,
        obs.player_angular_velocity_y,
        obs.player_angular_velocity_z,
        obs.target_position_x,
        obs.target_position_y,
        obs.target_position_z
        # End of your code.
    ], dtype=np.float32)

def reward_fun(obs):
    # You must return a floating-point number as the reward of this step.
    # Write your reward function here.
    if obs.distance <= 1.42:
        return 1.0
    else:
        return 0.0
    # End of your code.
    # Do not alter the following line.
    return 0.0

def control_fun(obs):
    response = {}
    # Write your code here.
    if obs.distance <= 1.42:
        response['terminated'] = 'true'
    if obs.player_position_y <= 0:
        response['terminated'] = 'true'
        response['re-entry'] = 'true'
    # End of your code.
    return response
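
As an illustration of customizing these functions, here is a sketch of an alternative reward that adds a small distance-based penalty on top of the default success reward. The name shaped_reward_fun and the 0.01 coefficient are assumptions chosen for the example, not part of the environment. The mock observation built with SimpleNamespace only imitates the attribute access used by the default functions.

```python
from types import SimpleNamespace

def shaped_reward_fun(obs):
    """Hypothetical alternative to the default reward_fun."""
    # Same success condition as the default implementation.
    if obs.distance <= 1.42:
        return 1.0
    # Assumption: a small penalty proportional to distance nudges the
    # ball toward the target; the 0.01 coefficient is arbitrary.
    return -0.01 * float(obs.distance)

# Mock observations for a quick local sanity check (not real environment output).
near = SimpleNamespace(distance=1.0, player_position_y=0.5)
far = SimpleNamespace(distance=5.0, player_position_y=0.5)

near_reward = shaped_reward_fun(near)
far_reward = shaped_reward_fun(far)
```

Checking a custom function against hand-built observations like these makes it easier to catch mistakes before running the environment itself.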