A3C LSTM GA for language grounding #7
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
|
||
.idea/ | ||
|
||
*.pyc | ||
/experiments/* | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,24 +18,30 @@ More detailed information on ai2thor environment can be found on their | |
|
||
<div align="center"> | ||
<img src="docs/bowls_fp_404_compressed_gif.gif" width="294px" /> | ||
<p>A3C agent learning during training on NaturalLanguagePickUpMultipleObjectTask in one of our customized scenes and tasks with the target object being CUPS!</p> | ||
<p>A3C agent training on NaturalLanguagePickUpMultipleObjectTask in one of our customized scenes and tasks with the target object being CUPS!</p> | ||
</div> | ||
|
||
## Overview | ||
## Running algorithms on ai2thor | ||
|
||
This project will include implementations and adaptations of the following papers as a benchmark of | ||
the current state-of-the-art approaches to the problem: | ||
|
||
- [Ikostrikov's A3C](https://github.com/ikostrikov/pytorch-a3c) | ||
- [A3C](https://arxiv.org/abs/1602.01783) ([code from Ikostrikov](https://github.com/ikostrikov/pytorch-a3c)) | ||
- [Gated-Attention Architectures for Task-Oriented Language Grounding](https://arxiv.org/abs/1706.07230) | ||
-- *Original code available on [DeepRL-Grounding](https://github.com/devendrachaplot/DeepRL-Grounding)* | ||
also based on Ikostrikov's A3C | ||
-- A3C with gated attention (A3C_LSTM_GA). *Original code available on [DeepRL-Grounding](https://github.com/devendrachaplot/DeepRL-Grounding)*, | ||
also based on Ikostrikov's A3C. | ||
|
||
Implementations of these can be found in the algorithms folder and a3c can be run on AI2ThorEnv with: | ||
`python algorithms/a3c/main.py` | ||
- `python algorithms/a3c/main.py` | ||
- To run the A3C_LSTM_GA model with a config file set to the BowlsVsCups variant of the NaturalLanguagePickUpObjectTask in tasks.py: | ||
`python algorithms/a3c/main.py --config-file-name NL_pickup_bowls_vs_cups_fp1_config.json --verbose-num-steps True --num-random-actions-at-init 4` | ||
- To run [ViZDoom](https://github.com/mwydmuch/ViZDoom) (which you will need to install first) synchronously with 1 process: | ||
`python algorithms/a3c/main.py --verbose-num-steps True --sync --vizdoom -v 1` | ||
- To run Atari with 8 processes: | ||
`python algorithms/a3c/main.py --atari --num-processes 8` | ||
|
||
Check the argparse help for more details and variations of running the algorithm with different | ||
hyperparams and on the atari environment as well. | ||
For A3C, the `-eid` param sets an experiment name, which is used to create folders for checkpoints and hyperparameters; if it is omitted, the experiment name is the current date concatenated with a random GUID. Check the argparse help for more details and variations of running the algorithm with different | ||
hyperparams. | ||
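As a rough illustration of the fallback naming described above, here is a minimal sketch of building an experiment name from the current date plus a random GUID. The function name `default_experiment_id`, the date format, and the `-` separator are assumptions made for this example; the actual code in `algorithms/a3c/main.py` may format the name differently.

```python
import uuid
from datetime import datetime


def default_experiment_id(now=None):
    """Return a fallback experiment name: the date followed by a random GUID.

    Hypothetical helper: the exact format used by the project is not shown
    in the README, so the layout here is only an assumption.
    """
    now = now or datetime.now()
    # uuid4() is random, so two runs started on the same day get distinct folders.
    return "{}-{}".format(now.strftime("%Y-%m-%d"), uuid.uuid4().hex)
```

A name like this keeps checkpoint folders sortable by date while avoiding collisions between runs launched on the same day.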
|
||
## Installation | ||
|
||
|
@@ -74,9 +80,11 @@ for episode in range(N_EPISODES): | |
|
||
### Environment and Task configurations | ||
|
||
##### JSON config files and config_dict | ||
|
||
The environment is typically defined by a JSON configuration file located in the `gym_ai2thor/config_files` | ||
folder. You can find an example `config_example.json` to see how to customize it. Here there is one | ||
as well: | ||
folder. See `default_config.json` for a full example of how to customize it. Here is | ||
another one: | ||
|
||
``` | ||
# gym_ai2thor/config_files/myconfig.json | ||
|
@@ -86,6 +94,8 @@ as well: | |
'acceptable_receptacles': ['CounterTop', 'TableTop', 'Sink'], | ||
'openable_objects': ['Microwave'], | ||
'scene_id': 'FloorPlan28', | ||
'gridSize': 0.1, | ||
'continuous_movement': True, | ||
'grayscale': True, | ||
'resolution': (300, 300), | ||
'task': {'task_name': 'PickUp', | ||
|
@@ -95,7 +105,11 @@ as well: | |
For experimentation it is important to be able to make slight modifications of the environment | ||
without having to create a new config file each time. The class `AI2ThorEnv` includes the keyword | ||
argument `config_dict`, which lets you pass a Python dictionary **in addition to** the config file | ||
that overrides the parameters described in the config. | ||
that overrides the parameters described in the config. In summary, the full constructor interface is: | ||
|
||
`env = AI2ThorEnv(config_file=config_file_name, config_dict=config_dict)` | ||
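To make the override behaviour concrete, the following is a minimal sketch of how a config file and a `config_dict` could be combined. The helpers `load_config` and `merge_overrides` are hypothetical names introduced here, and merging nested dictionaries key by key is an assumption; the real `AI2ThorEnv` constructor may handle overrides differently.

```python
import json


def merge_overrides(base, overrides):
    """Return a copy of `base` in which values from `overrides` take precedence.

    Assumption: nested dictionaries (such as the 'task' entry) are merged
    key by key rather than replaced wholesale.
    """
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(config_file, config_dict=None):
    """Read a JSON config file, then apply the optional overrides on top."""
    with open(config_file) as f:
        config = json.load(f)
    return merge_overrides(config, config_dict or {})
```

With this approach, passing `config_dict={'scene_id': 'FloorPlan1'}` would change only the scene and leave every other value from the file untouched.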
|
||
##### Tasks and TaskFactory | ||
|
||
The tasks are defined in `envs/tasks.py` and allow for particular configurations regarding the | ||
rewards given and termination conditions for an episode. You can use the tasks that we defined | ||
|
@@ -128,11 +142,19 @@ class MoveAheadTask(BaseTask): | |
|
||
    def reset(self): | ||
        self.step_num = 0 | ||
``` | ||
``` | ||
|
||
Some tasks allow you to return extra state by implementing the `get_extra_state()` function (e.g. to return a natural language instruction within the state). Again, check | ||
`tasks.py` for more details. | ||
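The idea of returning extra state can be sketched as follows. Both classes are illustrative stand-ins: `BaseTask` here is a simplified placeholder rather than the real class from `tasks.py`, and `InstructionTask`, its `target_object` argument, and the instruction wording are hypothetical.

```python
class BaseTask:
    """Simplified stand-in for the base class in tasks.py (the real one may differ)."""

    def __init__(self):
        self.step_num = 0

    def reset(self):
        self.step_num = 0

    def get_extra_state(self):
        # By default a task adds nothing beyond the usual observation.
        return None


class InstructionTask(BaseTask):
    """Hypothetical task that exposes a natural language instruction as extra state."""

    def __init__(self, target_object="Cup"):
        super().__init__()
        self.instruction = "pick up the {}".format(target_object.lower())

    def get_extra_state(self):
        # The environment could return this alongside the image observation.
        return self.instruction
```

An agent such as A3C_LSTM_GA could then consume the instruction string together with the visual input, though how the environment packages the two is not shown here.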
|
||
##### Examples and Task variants | ||
|
||
We encourage you to explore the scripts in the `examples` folder, which demonstrate the wrapper's | ||
functionality and show how to create more customized versions of ai2thor environments and | ||
tasks. | ||
|
||
Most importantly, config files and tasks can be combined to form **Task variants**, e.g. NaturalLanguagePickUpObjectTask restricted so that only | ||
cups and bowls can be picked up, as configured in `gym_ai2thor/config_files/NL_pickup_bowls_vs_cups_fp1_config.json`. | ||
|
||
Here is the desired result of an example task in which the goal of the agent is to place a cup in the | ||
sink. | ||
|
@@ -145,7 +167,7 @@ sink. | |
|
||
## The Team | ||
|
||
[The M Tank](http://www.themtank.org/) is a non-partisan organisation that works solely to recognise the multifaceted | ||
[MTank](http://www.themtank.org/) is a non-partisan organisation that works solely to recognise the multifaceted | ||
Review comments on this line:
- Not an issue for you here, but in general: we should change the message from MTank to something more related to what we do now. This seems very outdated.
- yep, i can do it here if we think of something
- opened issue in slack |
||
nature of Artificial Intelligence research and to highlight key developments within all sectors affected by these | ||
advancements. Through the creation of unique resources, the combination of ideas and their provision to the public, | ||
this project hopes to encourage the dialogue which is beginning to take place globally. | ||
|