2048
Usage
or you can directly load the Play2048 class
Description
2048 ...
Specs
Name | Value |
---|---|
Version | v0 |
Number of players | 1 |
Number of actions | 4 |
Observation shape | (4, 4, 31) |
Observation type | bool |
Rewards | {0, 2, 4, ...} |
Observation
Our observation design basically follows [Antonoglou+22]:
In our 2048 experiments we used a binary representation of the observation as an input to our model. Specifically, the 4 × 4 board was flattened into a single vector of size 16, and a binary representation of 31 bits for each number was obtained, for a total size of 496 numbers.
However, instead of a 496-d flat vector, we employ an array of shape (4, 4, 31).
Index | Description |
---|---|
[i, j, b] | represents that square (i, j) has a tile of 2 ^ b if b > 0 |
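The indexing rule in the table can be illustrated with a small, library-independent sketch. It is not the environment's actual implementation: the function name encode_board is hypothetical, the board is assumed to be a 4 × 4 list of tile values with 0 for an empty square, and channel b = 0 is assumed to mark an empty square (the table only specifies b > 0).

```python
NUM_CHANNELS = 31  # matches the observation shape (4, 4, 31)


def encode_board(board):
    """Encode a 4x4 board of tile values (0 = empty) as a 4x4x31 nested list of bools.

    obs[i][j][b] is True when square (i, j) holds the tile 2 ** b (for b > 0).
    ASSUMPTION: an empty square sets channel 0; the source table does not say so.
    """
    obs = [[[False] * NUM_CHANNELS for _ in range(4)] for _ in range(4)]
    for i in range(4):
        for j in range(4):
            value = board[i][j]
            # For a power of two, bit_length() - 1 is its exponent (2 -> 1, 4 -> 2, ...).
            b = value.bit_length() - 1 if value > 0 else 0
            obs[i][j][b] = True
    return obs


board = [
    [2, 0, 0, 0],
    [0, 4, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 2048],
]
obs = encode_board(board)
```

Flattening this array gives 4 × 4 × 31 = 496 entries, the same size as the flat vector described in the quoted passage.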
Action
Each action corresponds to 0 (left), 1 (up), 2 (right), 3 (down).
Rewards
Sum of merged tiles.
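The action numbering and the reward rule can be sketched together in plain Python. This is only an illustration under stated assumptions, not the environment's code: the helper names (slide_row_left, apply_action) are hypothetical, row 0 is assumed to be the top of the board, each tile is assumed to merge at most once per move, and the random tile spawned after a move is omitted.

```python
def slide_row_left(row):
    """Slide one row of tile values (0 = empty) to the left, merging equal neighbours once.

    Returns the new row and the reward, taken here as the sum of the merged tiles.
    """
    tiles = [v for v in row if v != 0]
    merged, reward, i = [], 0, 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)
            reward += tiles[i] * 2  # the two merged tiles sum to the new tile's value
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged)), reward


def transpose(board):
    return [list(col) for col in zip(*board)]


def mirror(board):
    return [row[::-1] for row in board]


def apply_action(board, action):
    """Apply action 0 (left), 1 (up), 2 (right) or 3 (down); return (new_board, reward)."""
    if action in (1, 3):  # up/down: treat columns as rows
        board = transpose(board)
    if action in (2, 3):  # right/down: slide toward the far end
        board = mirror(board)
    rows, reward = [], 0
    for row in board:
        new_row, r = slide_row_left(row)
        rows.append(new_row)
        reward += r
    if action in (2, 3):
        rows = mirror(rows)
    if action in (1, 3):
        rows = transpose(rows)
    return rows, reward


board = [
    [2, 0, 0, 0],
    [2, 0, 0, 0],
    [0, 0, 0, 0],
    [4, 0, 0, 0],
]
up_board, up_reward = apply_action(board, 1)
down_board, down_reward = apply_action(board, 3)
```

Reducing every direction to a left slide through transposition and mirroring keeps the merge logic in one place; it is a common way to write 2048 moves, chosen here for brevity.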
Termination
If all squares are filled with tiles and no legal actions are available, the game terminates.
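The termination condition can be checked with a short sketch. It assumes that a legal action is one that changes the board, so a full board is terminal exactly when no two horizontally or vertically adjacent tiles are equal; the function name is_terminal is hypothetical and the board is a list of rows of tile values with 0 for an empty square.

```python
def is_terminal(board):
    """Return True when the board is full and no adjacent pair of tiles can merge."""
    n = len(board)
    for i in range(n):
        for j in range(n):
            value = board[i][j]
            if value == 0:
                return False  # an empty square means some move is still possible
            if j + 1 < n and board[i][j + 1] == value:
                return False  # horizontal merge available
            if i + 1 < n and board[i + 1][j] == value:
                return False  # vertical merge available
    return True


stuck = [
    [2, 4, 2, 4],
    [4, 2, 4, 2],
    [2, 4, 2, 4],
    [4, 2, 4, 2],
]
mergeable = [
    [2, 4, 2, 4],
    [4, 2, 4, 2],
    [2, 4, 2, 4],
    [4, 2, 4, 4],
]
```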
Version History
v2: Two updates (v2.0.0)
- Fix legal_action_mask by @sotetsuk in #1049
- Specify rng key explicitly (API v2) by @sotetsuk in #1058
v1: Fix reward overflow bug by @sotetsuk in #1034 (v1.4.0)
v0: Initial release (v1.0.0)
Reference
[Antonoglou+22] "Planning in Stochastic Environments with a Learned Model", ICLR