20 March -- 2 April, 2022

Last Week’s Work Review: Our first step should be writing code for our baseline RL model; after that we can try adding a language interpreter on top of it and see whether interpreting the guidebook improves performance. We now have two things to do. (1) Build a baseline RL model for both the NetHack and MiniHack environments, then try to feed language data into the model; the Decision Transformer seems a future-proof model for embedding language information. (2) Build a user-friendly and useful annotation tool for annotators that can record the gameplay, annotate the objects, and add instructions. You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2022-03-21 14:29:31 +1100 AEDT'>March 21, 2022</span>&nbsp;·&nbsp;8 min&nbsp;·&nbsp;Sukai Huang

13 March -- 19 March, 2022

Last Week’s Work Review: Do not restrict what people annotate, and do not limit the vocabulary… We can use a modern BERT model to interpret natural language utterances. Before we dive into converting natural language utterances into logical forms, we can first try general NLP models for an end-to-end trial… You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2022-03-14 14:29:03 +1100 AEDT'>March 14, 2022</span>&nbsp;·&nbsp;4 min&nbsp;·&nbsp;Sukai Huang

06 March -- 12 March, 2022

Last Week’s Work Review: NIL. You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2022-03-04 11:32:45 +1100 AEDT'>March 4, 2022</span>&nbsp;·&nbsp;5 min&nbsp;·&nbsp;Sukai Huang

20 February -- 5 March, 2022

Last Week’s Work Review. Feedback from the pre-confirmation meeting: the annotation process is complicated and a concrete plan is required; I need to know how to solve misinterpretation and quality issues; I should make my research concrete and get the timeline and deadline for each research step before the next confirmation meeting. Annotations: Explain the purpose of this annotation work: what do you want to prove, what is the problem, what are the input and output, and what is your solution? Think about how to design an annotation protocol and an annotation interface (tools). Make some annotations first to demonstrate, using IGLU data, NetHack data, Angry Birds data, and MineRL data. Maybe think about the annotation process details: https://nethackchallenge.com/ https://minerl.io/dataset/ Readings: Read related works and papers on “Linear Temporal Logic” and “NetHack”; check this out: https://ai.facebook.com/blog/minihack-a-new-sandbox-for-open-ended-reinforcement-learning Spartan: take my desktop back home; check how to use Spartan resources. You need a password to access the content; go to Slack *#phdsukai to find more. ...

<span title='2022-02-18 14:52:59 +1100 AEDT'>February 18, 2022</span>&nbsp;·&nbsp;8 min&nbsp;·&nbsp;Sukai Huang