IGLU Environments

All environments provided here are created to be used for the Silent builder task of IGLU competition. In the Silent builder task, the goal is to create an agent that has access to the past conversation that led to some structure built by human builder.

IGLUSilentBuilder-v0

The agent spawns at the center of building zone which is a 11x9x11 cuboid above the blocks which are marked white. Each step the agent gets a pov image, an inventory state, a position, and the goal information which is described below. The agent can navigate inside the building zone, select block stack from the inventory and place/break blocks. The goal of the agent is to build the target structure using only the text of the conversation of human architect with human builder taken from the dataset.

Observation space

Observation space of silent builder consisits of six components

Dict({
    "pov": Box(low=0, high=255, shape=(64, 64, 3)),
    "inventory": Box(low=0, high=20, shape=(6,)),
    "agentPos": Box(low= [-5, 0, -5, 0,  -90],
                    high=[ 5, 8,  5, 360, 90],
                    shape=(5,)),
    "grid": Box(low=-1, high=5, shape=(9, 11, 11)),
    "compass": Dict({"angle": Box(low=-180.0, high=180.0, shape=())}),
    "chat": String()
})

First, "pov" is a 64x64 RGB first-person view image of the agent. In "inventory" there are stack counts for each of six block stacks: blue, yellow, green, orange, purple, red. At the start of the episode blue stack is active. The "agentPos" component is described by 5 numbers which are x, y, z coordinates and pitch, yaw angles. "grid" observation contains block ids of voxel grid captured from the building zone. Id -1 coorresponds to air block and the rest of them are ordered as in the "inventory" observation. "compass" component is provided since there is no information about the dlobal direction inside the images (the building zone looks the same at each direction). Finally, "chat" represents the conversation between the architect and the builder acquired from human-human interation which coorresponds to the current task.

Additionally, the agent has access to target structure of the current task. It is stored inside info dictionary by 'target_grid' key. The representation is the same as for "grid" observation component.

Warning

This observation space will not be used for evaluation in the Silent Builder task of the IGLU competition. For the evaluation environment, see IGLUSilentBuilderVisual-v0.

Action space

The IGLUSilentBuilder-v0 environment can be customized with three different action spaces.

Human-level actions:

Dict({
    "forward": Discrete(2),
    "back": Discrete(2),
    "left": Discrete(2),
    "right": Discrete(2),
    "jump": Discrete(2),
    "attack": Discrete(2),
    "use": Discrete(2),
    "camera": Box(low=-180.0, high=180.0, shape=(2,)),
    "hotbar": Discrete(7),
})

This action space is the same as that in MineRL competition environments except there are "hotbar" selection commands added. "hotbar" commands correspond to the selection of 6 block stacks of different colours + one action that does nothing with the selected stack.

Discrete coordinate actions:

Dict({
    "move": Discrete(3),
    "strafe": Discrete(3),
    "jump": Discrete(2),
    "attack": Discrete(2),
    "use": Discrete(2),
    "camera": Box(low=-180.0, high=180.0, shape=(2,)),
    "hotbar": Discrete(7),
})

Following these actions, the agent would move over discrete positions coorresponding to centers of blocks. For navigation commands ("move", "strafe"), there are 3 options which coorrespond to no-op, forward, and backward movement (no-op, right, and left in case of "strafe"). If "jump" action is non-zero alongsize the movement action, the jump would occur simultaneously with movement (as otherwise the agent would be unable to jump upstairs). Take this into account when designing your action space discretization.

Note that states are changed correspondingly immidiately after applying each of these actions.

Continuous movement actions:

Dict({
    "move_x": Box(low=-1, high=1, shape=()),
    "move_y": Box(low=-1, high=1, shape=()),
    "move_z": Box(low=-1, high=1, shape=()),
    "camera": Box(low=-180.0, high=180.0, shape=(2,)),
    "attack": Discrete(2),
    "use": Discrete(2),
    "hotbar": Discrete(7),
})

This action space allows agent to fly freely inside the building zone without collisions (except with the ground and invisible walls surrounding the building zone). The rest components of the action space are the same as in the previous two spaces.

Note that due to how Minecraft processes that kind of events, states are changed with the delay of 2-4 actions.

To select a proper action space, one can simply pass the corresponding argument to the environment constructor:

# For human level actions
env = gym.make('IGLUSilentBuilder-v0', action_space='human-level')
# For discrete coordinates movement
env = gym.make('IGLUSilentBuilder-v0', action_space='discrete')
# For continuous coordinates movement
env = gym.make('IGLUSilentBuilder-v0', action_space='continuous')

The default value is 'human-level'.

Warning

To speed up the environment, iglu doesn’t reload the whole Minecraft mission at each env.reset() (as it takes 3-5 seconds). Instead, it cleans up the building zone, teleports the agent into the initial position refilling its inventory. Such “fake reset” operation costs just one environment step. But this is an experimental feature that may work unstable, leading to unwanted bugs with minerl socket interaction. If you experience such kind of problems, you can disable fake reset using export IGLU_DISABLE_FAKE_RESET=1 before running script.

IGLUSilentBuilderVisual-v0

This environment will be used during the evaluation of the solutions to Silent Builder task of IGLU competition. It provides a reduced observation space and the same actions.

Observation space

Observation space of visual silent builder consisits of four components

Dict({
    "pov": Box(low=0, high=255, shape=(64, 64, 3)),
    "inventory": Box(low=0, high=20, shape=(6,)),
    "compass": Dict({"angle": Box(low=-180.0, high=180.0, shape=())}),
    "chat": String()
})

Each of them was described in the previous section.

Action space

In this environment there is again a freedom to select any action space you want.

# For human level actions
env = gym.make('IGLUSilentBuilderVisual-v0', action_space='human-level')
# For discrete coordinates movement
env = gym.make('IGLUSilentBuilderVisual-v0', action_space='discrete')
# For continuous coordinates movement
env = gym.make('IGLUSilentBuilderVisual-v0', action_space='continuous')

The default value is 'human-level'.

Reward calculation

Each step, the agent receives a reward. Its value is determined by the current goal structure and the one built so far. The reward is determined regardless of global spatial position of currently placed blocks, it only takes into account how much the built blocks are similar to the target structure. To make it possible, at each step we calculate the intersection between the built and the target structures for each spatial translation within the horizontal plane and rotation around the vertical axis. Then we take the maximal intersection value among all translation and rotations. To calculate the reward, we compare the maximal intersection size from the current step with the one from the previous step. We reward the agent with 2 for the increase of the maximal intersection size, with -2 for the decrease of the maximal intersection size, and with 1/-1 for removing/placing a block without a change of the maximal intersection size.

In the image below, there is an example of L-shaped target and two blocks placed diagonally. Despite the target is somewhere at the center of the zone, it fully covers two diagonal blocks (for some spatial alignment) forcing the agent to complete the structure where they started placing blocks.