Gym-Softrobot Environment Suite#
Environment Setup#
Our environments contain a set of controllable slender bodies that must accomplish a set of tasks. The theory and details of the physics simulation are documented at CosseratRods.org.
Interface#
State#
The state information may or may not reflect the entirety of the system; that is, the environment can be partially observable. The state space is composed of the following parameters:
Position (relative) and director at each element (spatial discretization in simulation)
Could be supplemented with velocity/acceleration and angular velocity/acceleration
Six strain modes (normal/binormal shear, normal/binormal curvature, stretch, twist)
Absolute position
Target location and its velocity (if applicable)
Index of the agent (multi-agent case)
Previous action
Action#
Internal curvature resembling tendon-driven actuation or muscle actuation.
The number of DoFs (degrees of freedom) along the arm depends on the length of the arm, and may vary with the environment and its version. The actuation function is obtained by interpolating the provided DoFs.
Internal torque/force for direct activation
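As a rough illustration of how a handful of control DoFs could be interpolated into an actuation profile along the arm, the sketch below places the control values at evenly spaced points and blends linearly between them. The helper name `interpolate_actuation`, the even spacing, and the use of linear interpolation are all assumptions made for this example; the environments themselves may interpolate differently.

```python
def interpolate_actuation(dofs, n_elements):
    """Linearly interpolate control values onto n_elements points along the arm.

    Hypothetical helper: control points are assumed to be evenly spaced
    over the normalized arc-length [0, 1].
    """
    if len(dofs) < 2:
        raise ValueError("need at least two control values")
    if n_elements < 2:
        raise ValueError("need at least two elements")
    profile = []
    for i in range(n_elements):
        s = i / (n_elements - 1)        # normalized arc-length in [0, 1]
        x = s * (len(dofs) - 1)         # position in control-point index space
        k = min(int(x), len(dofs) - 2)  # index of the control point to the left
        w = x - k                       # blend weight toward the right neighbor
        profile.append((1.0 - w) * dofs[k] + w * dofs[k + 1])
    return profile


# Three DoFs spread over five elements.
curvature = interpolate_actuation([0.0, 1.0, 0.0], n_elements=5)
```

With few DoFs the policy acts in a small action space while the simulated arm still receives a value at every element.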
Reward#
The reward function is not yet finalized. Different versions may use different reward functions.
The reward is defined as the composition of the following quantities:
Forward Reward: typically used in locomotion case
Velocity of the body
Position difference between control-steps
Survive Reward: given for stability and to prevent wild policies
NaN penalty: unstable or unexpected behavior
Cross-over penalty: penalty for multiple arms crossing each other (in 2D)
Control Penalty: minimum-actuation solution
Square-average of the action
Energy: minimum-energy solution
Total bending energy
Total shear energy
Time Limit:
Small constant penalty at each step
Large constant penalty for not achieving the goal.
Miscellaneous:
Target reaching/grabbing reward
Remaining distance to the target
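Since the reward is described as a composition of several terms, a minimal sketch of one such composition may help. The function name `compose_reward`, the weight values, and the specific way the terms are combined (a weighted sum) are assumptions for illustration only; the reward in any given environment version may differ.

```python
import math


def compose_reward(forward_velocity, action, bending_energy, shear_energy,
                   has_nan=False, weights=None):
    """Combine reward terms into a scalar (hypothetical weights and form)."""
    w = {"forward": 1.0, "survive": 0.1, "control": 0.01,
         "energy": 0.001, "time": 0.01, "nan": 100.0}
    if weights:
        w.update(weights)

    # NaN penalty: the simulation or the action has become invalid.
    if has_nan or any(math.isnan(a) for a in action):
        return -w["nan"]

    # Control penalty: square-average of the action.
    control = sum(a * a for a in action) / len(action) if action else 0.0
    # Energy penalty: total bending plus total shear energy.
    energy = bending_energy + shear_energy

    return (w["forward"] * forward_velocity  # forward reward
            + w["survive"]                   # survive reward
            - w["control"] * control
            - w["energy"] * energy
            - w["time"])                     # small constant per-step penalty


reward = compose_reward(1.0, [0.0, 0.0], bending_energy=0.0, shear_energy=0.0)
```

Keeping the weights in a single dictionary makes it easy to compare reward variants between environment versions.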
Octopus#
Single Arm Control#
This environment is the test case for the simplest one-arm control.
OctoArmSingle-v0
[Alpha]
Multi Arm Control#
OctoFlat-v0
[Alpha]
Advanced#
The goal of this environment is to control the 8 arms of the octopus attached to the body and move towards the target location.
OctoReach-v0
[Work in Progress]
OctoSwim-v0
[Work in Progress]
OctoHunt-v0
[Work in Progress]
Soft Arm#
Soft Arm with Muscle Control#
SoftArmTrackingEnv-v0
Miscellaneous Env#
Soft Pendulum [Transfer learning, Soft Arm Control]#
SoftPendulum-v0
Snake#
ContinuumSnake-v0
[Alpha]
This environment is inspired by the Continuum Snake case in PyElastica.
The goal is to control the snake so that it moves as fast as possible.
Unlike the original example, where the control is defined by the amplitude and phase-shift of the sinusoidal activation, our environment challenges the player to give an action every dt time-steps.
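To make the contrast concrete, the sketch below evaluates a fixed sinusoidal traveling wave at every control step, which yields one action vector per step and could serve as an open-loop baseline. The function `traveling_wave_action`, its parameters, and the wave form are hypothetical and are not taken from PyElastica or gym-softrobot; the number of DoFs and the value of `dt` are also assumed.

```python
import math


def traveling_wave_action(t, n_dofs, amplitude=1.0, period=1.0,
                          wavelength=1.0, phase_shift=0.0):
    """Return one action vector sampled from a sinusoidal traveling wave.

    Hypothetical baseline: DoF i sits at normalized position s = i / (n_dofs - 1).
    """
    actions = []
    for i in range(n_dofs):
        s = i / max(n_dofs - 1, 1)
        phase = 2.0 * math.pi * (t / period - s / wavelength) + phase_shift
        actions.append(amplitude * math.sin(phase))
    return actions


dt = 0.1  # assumed control interval
# One action per control step, as the environment would expect from a policy.
trajectory = [traveling_wave_action(step * dt, n_dofs=4) for step in range(10)]
```

A learned policy would replace `traveling_wave_action` and could condition each action on the current observation instead of on time alone.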
Available Wrappers#
Here is the list of available wrappers that we provide. The purpose of a wrapper is to convert the environment so that it is compatible with external packages. These wrappers should be compatible with typical OpenAI Gym wrappers, such as SubprocVecEnv or VecFrameStack, although we highly recommend using our wrapper as the outermost layer to be safe.
Note
Because of the nature of the wrapper design, we cannot guarantee 100% compatibility with all other available tools. Please open a GitHub issue if you find any bug in this feature.
Converter#
Description#
Convert environment to PyMarl multi-agent environments.
Built-in Wrappers#
- class gym_softrobot.wrapper.ConvertToPyMarlEnv(ma_env, *args, **kwargs)[source]#
Convert environment to PyMarl multi-agent environments. Template from: oxwhirl/pymar/src/envs/multiagentenv.py
Available benchmark algorithms:
This wrapper converts a gym_softrobot environment (one that is MA-compatible) to a ‘multiagentenv’ that is compatible with PyMARL. The purpose is to run benchmark studies with standard CTDE algorithms, such as QMIX, COMA, etc. Here is an example snippet:
env = ConvertToPyMarlEnv(ma_env=gym.make("OctoCrawl-v0"))
- __init__(ma_env, *args, **kwargs)[source]#
- Parameters
- ma_env : gym.Env
Multi-agent compatible environment. In gym-softrobot, the ‘multiagent’ tag in the environment's metadata must be true.
- get_obs_size()[source]#
Returns the shape of the observation
The observation in regular gym_softrobot is given as (n * state_space)
- get_state()[source]#
Returns the global state.
Notes
Ideally, this function should not be used during decentralized execution.
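The relationship between per-agent observations and the global state can be illustrated with a toy stand-in. The class `ToyMultiAgentView` below is not the actual `ConvertToPyMarlEnv` implementation. It assumes that the joint observation is a list of n per-agent vectors of length `state_space`, and that the global state is simply their concatenation. The method names mirror the PyMARL `multiagentenv` template.

```python
class ToyMultiAgentView:
    """Illustrative stand-in for a PyMARL-style view of a joint observation."""

    def __init__(self, joint_obs):
        # joint_obs: n per-agent observations, each of length state_space (assumed layout)
        self.joint_obs = [list(obs) for obs in joint_obs]
        self.n_agents = len(self.joint_obs)

    def get_obs(self):
        """Return the list of all per-agent observations."""
        return [list(obs) for obs in self.joint_obs]

    def get_obs_agent(self, agent_id):
        """Return the observation of a single agent."""
        return list(self.joint_obs[agent_id])

    def get_obs_size(self):
        """Return the size of one agent's observation."""
        return len(self.joint_obs[0])

    def get_state(self):
        """Return the global state (assumed here: concatenated observations)."""
        return [x for obs in self.joint_obs for x in obs]

    def get_state_size(self):
        """Return the size of the global state."""
        return self.n_agents * self.get_obs_size()


view = ToyMultiAgentView([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
```

In a CTDE setup, `get_state()` would feed the centralized component during training, while each agent acts only on its own `get_obs_agent(i)` at execution time.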