06 March -- 12 March, 2022

Last Week’s Work Review NIL. You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2022-03-04 11:32:45 +1100 AEDT'>March 4, 2022</span>&nbsp;·&nbsp;5 min&nbsp;·&nbsp;Sukai Huang

20 February -- 5 March, 2022

Last Week’s Work Review. Feedback from the pre-confirmation meeting: the annotation process is complicated, so a concrete plan is required; I need to know how to resolve misinterpretation and quality issues; and I should make my research concrete, with a timeline and a deadline for each research step before the next confirmation meeting. Annotations: explain the purpose of this annotation work. What do you want to prove? What is the problem, what are the input and output, and what is your solution? Think about how to design an annotation protocol and an annotation interface (tools). Make some annotations first to demonstrate, using IGLU data, NetHack data, Angry Birds data and MineRL data, and think about the annotation process details: https://nethackchallenge.com/ https://minerl.io/dataset/ Readings: read related works and papers on “Linear Temporal Logic” and “NetHack”; check this out: https://ai.facebook.com/blog/minihack-a-new-sandbox-for-open-ended-reinforcement-learning Spartan: take my desktop back home and check how to use the Spartan resource. You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2022-02-18 14:52:59 +1100 AEDT'>February 18, 2022</span>&nbsp;·&nbsp;8 min&nbsp;·&nbsp;Sukai Huang

13 February -- 19 February, 2022

Last Week’s Work Review. Since the IGLU environment has bugs when we use its “discrete action space”, our work on the silent builder competition model is stuck (though it is not important). This week we continue revising our pre-confirmation report and also prepare slides for the presentation. You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2022-02-16 15:27:05 +1100 AEDT'>February 16, 2022</span>&nbsp;·&nbsp;1 min&nbsp;·&nbsp;Sukai Huang

06 February -- 12 February, 2022

Last Week’s Work Review. Although our Phase 1 and Phase 2 modules are not achieving good performance, we are going to build our Phase 3 module now. Don’t be upset: the results from Phase 1 and Phase 2 mean exactly that there is room for further research… You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2022-02-06 18:40:28 +1100 AEDT'>February 6, 2022</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;Sukai Huang

23 January -- 29 January, 2022

Last Week’s Work Review. Recall our Phase 1 visual module training scheme and our Phase 2 language module training scheme. The testing accuracy could reach $99.0\%$, but I doubt the true performance of this model: it can achieve high accuracy just by memorising the current 3D grid value as long as the incremental change is small, which holds for this dataset because one instruction is often paired with a small change in the environment. I should change the current training scheme to prevent the model from taking shortcuts (local addition, one step, no intermediate states). After completing the language module, our next step is to build a PDDL model to bridge the gap between the current state and the target state described by the instruction. You need a password to access the content; go to Slack *#phdsukai to find more. ...
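One way to check the shortcut worry above is a trivial copy baseline that predicts the target grid to be identical to the current grid; if it already scores near the model’s $99\%$, the per-cell accuracy metric mostly rewards memorisation. A minimal sketch in plain Python (grid sizes, block ids, and all names are illustrative assumptions, not the IGLU data format):

```python
import random

def copy_baseline_accuracy(current_grids, target_grids):
    """Per-cell accuracy of predicting 'no change' for every instruction."""
    total = correct = 0
    for cur, tgt in zip(current_grids, target_grids):
        for c, t in zip(cur, tgt):
            total += 1
            correct += (c == t)
    return correct / total

# Toy data: 10 scenes, each a flattened 5x5x5 grid (125 cells) of block ids;
# one instruction changes exactly one block, mimicking small incremental edits.
random.seed(0)
current = [[random.randrange(4) for _ in range(125)] for _ in range(10)]
target = [row[:] for row in current]
for row in target:
    i = random.randrange(125)
    row[i] = (row[i] + 1) % 4  # guaranteed to differ from the original id

print(copy_baseline_accuracy(current, target))  # 0.992
```

A model whose test accuracy barely exceeds this copy baseline has learned little beyond memorising the input grid, which is the argument for removing intermediate states from the training scheme.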

<span title='2022-01-25 12:36:30 +1100 AEDT'>January 25, 2022</span>&nbsp;·&nbsp;4 min&nbsp;·&nbsp;Sukai Huang

16 January -- 22 January, 2022

Last Week’s Work Review. Key points from last week’s meeting: the sequence of instructions matters, and we need to capture this key feature; e.g., the agent must build pillars before it can build the bell. To generate a training dataset, we can also use Monte Carlo methods to randomly extract movements from trajectories of human players; compared to creating random agents from scratch, these Monte Carlo imitation methods generate more human-like trajectories. I finished the visual module training code on Wednesday this week (a little bit late), but much of the code is reusable. So far the accuracy of the visual module is desirable ($\approx 98.9\%$). This week we are going to focus on “text to grids” conversion. Originally the second stage was to label the intents of conversation sentences for the “Modular RL model”; however, Trevor and Nir suggested that I start with simpler work first. One possible task is to rebuild the human Builder’s (intermediate) structure based on (partial) instructions from the Architect agent. Are we assuming that the human Builder’s actions are a perfect solution to the Architect’s instructions, and how can we break this assumption? We can imitate human players first, and after that we can still improve the agent by providing it with a good reward signal. Besides the Builder task, the Architect task is also very interesting: e.g., how to convey instructions to the agent, and what would be the best workload for each conversation step? Baseline: one block at a time … You need a password to access the content; go to Slack *#phdsukai to find more. ...
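The Monte Carlo extraction idea above amounts to sampling contiguous movement snippets from recorded human trajectories instead of rolling out a random agent from scratch, so the snippets inherit human action statistics. A small illustration (the trajectory format and all names are assumptions, not the IGLU or MineRL API):

```python
import random

def sample_subtrajectories(trajectories, n_samples, min_len=5, max_len=20, seed=0):
    """Randomly extract contiguous movement snippets from human trajectories.

    Contiguous slices keep human-like pacing and action ordering, unlike
    step-by-step rollouts of a from-scratch random agent.
    """
    rng = random.Random(seed)
    snippets = []
    for _ in range(n_samples):
        traj = rng.choice(trajectories)
        length = rng.randint(min_len, min(max_len, len(traj)))
        start = rng.randrange(len(traj) - length + 1)
        snippets.append(traj[start:start + length])
    return snippets

# Toy trajectories: action sequences recorded from three human players.
human = [["forward", "jump", "place_block", "turn_left"] * 10 for _ in range(3)]
subs = sample_subtrajectories(human, n_samples=5)
print(len(subs), all(5 <= len(s) <= 20 for s in subs))  # 5 True
```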

<span title='2022-01-19 21:01:40 +1100 AEDT'>January 19, 2022</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;Sukai Huang

09 January -- 15 January, 2022

Last Week’s Work Review. Last week I checked the environment code for the IGLU competition, so this week we should start building our Modular RL model. You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2022-01-07 22:20:48 +1100 AEDT'>January 7, 2022</span>&nbsp;·&nbsp;3 min&nbsp;·&nbsp;Sukai Huang

02 January -- 08 January, 2022

Last Week’s Work Review. We continue reading the code of the winning model of the IGLU competition. We also need to check the IGLU Slack channel, download the training dataset, and understand how to use the dataloader tools. You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2022-01-01 17:07:59 +1100 AEDT'>January 1, 2022</span>&nbsp;·&nbsp;5 min&nbsp;·&nbsp;Sukai Huang

19 December -- 31 December, 2021

Last Week’s Work Review. We decided to focus on the Modular RL model and the Policy Sketch idea. Our testing environment could be normal Minecraft or the IGLU competition environment; I would like to use the IGLU environment because it has a great training dataset for imitation learning as well as a group of active engineers who can help answer questions regarding their platform. Their dataset is dialogue-based. (We can try the model on different types of dataset: dialogue, walkthrough, speech, etc.) In order to extend the previous work, we need to add labels to the dialogue dataset so as to classify “policy sketches” in it. (Our policy sketch identification task then becomes a supervised learning task; we can improve this to an unsupervised clustering task later.) You need a password to access the content; go to Slack *#phdsukai to find more. ...
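Once the extra labels exist, sketch identification becomes ordinary supervised classification: utterance in, sketch-step label out. A tiny framing sketch with a majority-class baseline as the floor any classifier must beat (labels and utterances are invented examples, not the real IGLU dialogue data):

```python
from collections import Counter

# Hypothetical labelled dialogue turns: (utterance, policy-sketch label).
labelled = [
    ("place a red block on the corner", "build_block"),
    ("now make a pillar three high", "build_pillar"),
    ("remove the blue one", "destroy_block"),
    ("great, another pillar next to it", "build_pillar"),
]

utterances = [u for u, _ in labelled]
labels = [lab for _, lab in labelled]

# Majority-class baseline: predict the most frequent sketch label everywhere.
counts = Counter(labels)
majority_label, majority_n = counts.most_common(1)[0]
print(majority_label, majority_n / len(labels))  # build_pillar 0.5
```

Moving to the unsupervised clustering variant later would drop the label column and group utterances by similarity instead, so keeping the labelled version first gives a clean evaluation set either way.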

<span title='2021-12-20 19:26:40 +1100 AEDT'>December 20, 2021</span>&nbsp;·&nbsp;4 min&nbsp;·&nbsp;Sukai Huang

12 December -- 18 December, 2021

Last Week’s Work Review. In the research field of Grounded Language Learning, what will my research questions be? Formalise the question, think about the assumptions, and pick out one problem (not big but specific) from the existing experiments. You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2021-12-10 13:59:41 +1100 AEDT'>December 10, 2021</span>&nbsp;·&nbsp;12 min&nbsp;·&nbsp;Sukai Huang

05 December -- 11 December, 2021

Last Week’s Work Review. We finally decided to work on Interactive Grounded Language Understanding in a Minecraft environment. First of all, we should test the provided baseline models and check their performance. Note that the NeurIPS Minecraft competition workshop will start Fri Dec 10, 06:45 AM -- 07:05 AM (AEDT). We should follow up on the competition models when they become available. You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2021-12-05 10:48:45 +1100 AEDT'>December 5, 2021</span>&nbsp;·&nbsp;6 min&nbsp;·&nbsp;Sukai Huang