Alba Gragera Exploring the Limitations of Using LLMs to Fix Planning Tasks 2023

[TOC] Title: Exploring the Limitations of Using LLMs to Fix Planning Tasks Author: Alba Gragera et al. Publish Year: ICAPS 2023 (KEPS workshop) Review Date: Wed, Sep 20, 2023 url: https://icaps23.icaps-conference.org/program/workshops/keps/KEPS-23_paper_3645.pdf Summary of paper Motivation In this work, the authors present ongoing efforts on exploring the limitations of LLMs in tasks requiring reasoning and planning competencies: that of assisting humans in the process of fixing planning tasks. Contribution investigate how good LLMs are at repairing planning tasks when the prompt is given in PDDL and when it is given in natural language....

<span title='2023-09-20 20:22:32 +1000 AEST'>September 20, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;403 words&nbsp;·&nbsp;Sukai Huang

Tathagata Chakraborti Plan Explanations as Model Reconciliation 2017

[TOC] Title: Plan Explanations as Model Reconciliation: Moving beyond explanation as soliloquy Author: Tathagata Chakraborti Publish Year: 30 May 2017 Review Date: Tue, Sep 19, 2023 url: https://arxiv.org/pdf/1701.08317.pdf Summary of paper Motivation Past work on plan explanations primarily involved the AI system explaining the correctness of its plan and the rationale for its decision in terms of its own model. Such soliloquy is inadequate (think of the case where GPT-4 cannot find errors in a PDDL domain file due to overconfidence). In this work, the author argues that, because the human and the AI system differ in their domain and task models, soliloquy is inadequate....

<span title='2023-09-19 22:04:06 +1000 AEST'>September 19, 2023</span>&nbsp;·&nbsp;3 min&nbsp;·&nbsp;630 words&nbsp;·&nbsp;Sukai Huang

Vishal Pallagani Plansformer Tool Demonstrating Generation of Symbolic Plans Using Transformers 2023

[TOC] Title: Plansformer – Tool Demonstrating Generation of Symbolic Plans Using Transformers Author: Vishal Pallagani et al. Publish Year: IJCAI-23 Review Date: Sat, Sep 16, 2023 url: https://www.ijcai.org/proceedings/2023/0839.pdf Summary of paper Motivation building a bridge between planning with LLMs and planning with traditional automated planners Design of Plansformer in the evaluation phase, planner testing helps to validate the plan (both syntax validation and plan-optimality validation), while model testing helps to enforce linguistic consistency (in this case it supervises the semantics)....

<span title='2023-09-16 00:46:56 +1000 AEST'>September 16, 2023</span>&nbsp;·&nbsp;1 min&nbsp;·&nbsp;105 words&nbsp;·&nbsp;Sukai Huang

Junnan_li Blip2 Bootstrapping Language Image Pretraining 2023

[TOC] Title: BLIP2 - Bootstrapping Language Image Pretraining 2023 Author: Junnan Li et al. Publish Year: 15 Jun 2023 Review Date: Mon, Aug 28, 2023 url: https://arxiv.org/pdf/2301.12597.pdf Summary of paper The paper titled "BLIP-2" proposes a new and efficient pre-training strategy for vision-and-language models. The cost of training such models has become increasingly prohibitive due to the large scale of the models. BLIP-2 aims to address this issue by leveraging off-the-shelf, pre-trained image encoders and large language models (LLMs) that are kept frozen during the pre-training process....
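A minimal sketch of the frozen-backbone recipe, assuming hypothetical `vision_encoder` and `llm` modules: only a small bridge module is trained (BLIP-2 uses a Q-Former; a linear projection stands in for it here).

```python
import torch
import torch.nn as nn

class FrozenBridgeVLM(nn.Module):
    """Toy stand-in for the BLIP-2 recipe: freeze a vision encoder and an
    LLM, and train only a small bridge that maps image features into the
    LLM's embedding space (the paper uses a Q-Former; a linear layer here)."""

    def __init__(self, vision_encoder: nn.Module, llm: nn.Module,
                 vis_dim: int, llm_dim: int):
        super().__init__()
        self.vision_encoder = vision_encoder.requires_grad_(False)
        self.llm = llm.requires_grad_(False)
        self.bridge = nn.Linear(vis_dim, llm_dim)  # the only trainable part

    def forward(self, images, text_embeds):
        with torch.no_grad():
            vis_feats = self.vision_encoder(images)   # (B, T, vis_dim)
        prefix = self.bridge(vis_feats)               # image tokens for the LLM
        return self.llm(torch.cat([prefix, text_embeds], dim=1))
```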

<span title='2023-08-28 18:48:08 +1000 AEST'>August 28, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;327 words&nbsp;·&nbsp;Sukai Huang

Peng_gao Llama Adapter V2 2023

[TOC] Title: Llama Adapter V2 Author: Peng Gao et al. Publish Year: 28 Apr 2023 Review Date: Mon, Aug 28, 2023 url: https://arxiv.org/pdf/2304.15010.pdf Summary of paper The paper presents LLaMA-Adapter V2, an enhanced version of the original LLaMA-Adapter designed for multi-modal reasoning and instruction following. The paper aims to address the limitations of the original LLaMA-Adapter, which could not generalize well to open-ended visual instructions and lagged behind GPT-4 in performance....

<span title='2023-08-28 18:47:05 +1000 AEST'>August 28, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;246 words&nbsp;·&nbsp;Sukai Huang

Rodrigo Reward Machines Exploiting Reward Function Structure in Rl 2022

[TOC] Title: Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning 2022 Author: Rodrigo Toro Icarte et al. Publish Year: 2022 AI Access Foundation Review Date: Thu, Aug 17, 2023 url: https://arxiv.org/abs/2010.03950 Summary of paper Motivation in most RL applications, however, users have to program the reward function; there is hence an opportunity to make the reward function visible, so that the RL agent can exploit the function's internal structure to learn optimal policies in a more sample-efficient manner....
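A minimal sketch of the formalism, assuming a toy "get coffee, then deliver it" task: a reward machine is a finite-state machine over propositional events whose transitions emit rewards, so the agent can read the task structure instead of a black-box reward.

```python
class RewardMachine:
    """Toy reward machine: states u0/u1 encode 'get coffee, then deliver'."""

    def __init__(self):
        # delta[(u, event)] -> (next_u, reward); unmatched events self-loop
        self.delta = {
            ("u0", "coffee"): ("u1", 0.0),    # picked up the coffee
            ("u1", "office"): ("u_acc", 1.0)  # delivered: terminal reward
        }

    def step(self, u, true_props):
        for event in true_props:              # fire first matching transition
            if (u, event) in self.delta:
                return self.delta[(u, event)]
        return u, 0.0                         # self-loop, zero reward

rm = RewardMachine()
print(rm.step("u0", {"coffee"}))              # -> ('u1', 0.0)
```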

<span title='2023-08-17 16:32:09 +1000 AEST'>August 17, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;321 words&nbsp;·&nbsp;Sukai Huang

Rodrigo Using Reward Machines for High Level Task Specification and Decomposition in Rl 2018

[TOC] Title: Reward Machines for High Level Task Specification and Decomposition in Reinforcement Learning Author: Rodrigo Toro Icarte et al. Publish Year: PMLR 2018 Review Date: Thu, Aug 17, 2023 url: http://proceedings.mlr.press/v80/icarte18a/icarte18a.pdf Summary of paper Motivation proposing reward machines that expose reward function structure to the learner and support decomposition. Contribution in contrast to hierarchical RL methods, which might converge to suboptimal policies, QRM is proven to converge to an optimal policy in the tabular case....
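A minimal sketch of the QRM idea, assuming a reward-machine transition function `rm_step(u, props) -> (next_u, reward)` (as in the RewardMachine sketch above): keep one tabular Q-function per reward-machine state and update all of them counterfactually from each environment transition.

```python
from collections import defaultdict

def qrm_update(Q, s, a, s2, props, rm_states, rm_step,
               actions, alpha=0.1, gamma=0.9):
    """One QRM-style update: for every RM state u, use the RM to compute the
    reward and next RM state that *would* have occurred, and update Q[u]."""
    for u in rm_states:                       # counterfactual over RM states
        u2, r = rm_step(u, props)             # reward the RM would have given
        best_next = max(Q[(u2, s2, b)] for b in actions)
        Q[(u, s, a)] += alpha * (r + gamma * best_next - Q[(u, s, a)])

Q = defaultdict(float)                        # Q[(rm_state, env_state, action)]
```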

<span title='2023-08-17 11:13:24 +1000 AEST'>August 17, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;360 words&nbsp;·&nbsp;Sukai Huang

William_berrios Towards Language Models That Can See 2023

[TOC] Title: Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language Author: William Berrios et al. Publish Year: 28 Jun 2023 Review Date: Mon, Jul 3, 2023 url: https://arxiv.org/pdf/2306.16410.pdf Summary of paper Contribution proposing LENS, a modular approach that addresses computer vision tasks by harnessing the few-shot, in-context learning abilities of language models through natural language descriptions of visual inputs. LENS enables any off-the-shelf LLM to have visual capabilities without auxiliary training or data. LENS framework a redundant text prompt might be helpful LENS components....
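A minimal sketch of the LENS control flow, where `tagger`, `captioner`, and `ask_llm` are hypothetical callables standing in for the frozen vision modules and the off-the-shelf LLM: the vision modules emit text, and the LLM answers from that text alone.

```python
def lens_answer(image, question, tagger, captioner, ask_llm):
    """Frozen vision modules describe the image in text; a frozen LLM
    answers the question from those descriptions (no extra training)."""
    tags = ", ".join(tagger(image))           # e.g. CLIP-style tags
    caption = captioner(image)                # e.g. BLIP-style caption
    prompt = (
        f"Tags: {tags}\n"
        f"Caption: {caption}\n"               # redundant text can still help
        f"Question: {question}\nShort answer:"
    )
    return ask_llm(prompt)
```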

<span title='2023-07-03 19:33:22 +1000 AEST'>July 3, 2023</span>&nbsp;·&nbsp;1 min&nbsp;·&nbsp;152 words&nbsp;·&nbsp;Sukai Huang

Lionel_wong From Word Models to World Models 2023

[TOC] Title: From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought Author: Lionel Wong et al. Publish Year: 23 Jun 2023 Review Date: Sun, Jul 2, 2023 url: https://arxiv.org/pdf/2306.12672.pdf Summary of paper Motivation leverage a theory of linguistic meaning to build machines that think in more human-like ways: we frame linguistic meaning as a context-sensitive mapping from NL into a probabilistic language of thought (PLoT) – a general-purpose symbolic substrate for probabilistic, generative world modelling....

<span title='2023-07-02 21:24:50 +1000 AEST'>July 2, 2023</span>&nbsp;·&nbsp;3 min&nbsp;·&nbsp;460 words&nbsp;·&nbsp;Sukai Huang

Jianing_wang Boosting Language Models Reasoning With Chain of Knowledge Prompting 2023

[TOC] Title: Boosting Language Models Reasoning With Chain of Knowledge Prompting Author: Jianing Wang et al. Publish Year: 10 Jun 2023 Review Date: Sun, Jul 2, 2023 url: https://arxiv.org/pdf/2306.06427.pdf Summary of paper Motivation "Chain of Thought (CoT)" aims at designing a simple prompt like "Let's think step by step"; however, the generated rationales often come with mistakes, yielding unfactual and unfaithful reasoning chains. To mitigate this brittleness, we propose a novel Chain-of-Knowledge (CoK) prompting, with knowledge evidence in the form of structured triples. Contribution Benefiting from CoK, we additionally introduce an F^2-Verification method to estimate the reliability of the response; wrong evidence can be flagged to prompt the LLM to rethink....
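A minimal sketch of a Chain-of-Knowledge-style prompt builder (the exact prompt wording is an assumption, not the paper's template): evidence is supplied as structured (subject, relation, object) triples so the model grounds its reasoning chain in explicit facts.

```python
def cok_prompt(question, triples):
    """Build a prompt whose evidence is a list of structured triples."""
    evidence = "\n".join(f"({s}, {r}, {o})" for s, r, o in triples)
    return (
        "Evidence triples:\n" + evidence +
        "\nAnswer the question using the evidence, citing the triples "
        "you rely on.\nQuestion: " + question
    )

print(cok_prompt(
    "Where was Alan Turing born?",
    [("Alan Turing", "born_in", "London"),
     ("London", "capital_of", "United Kingdom")],
))
```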

<span title='2023-07-02 16:09:58 +1000 AEST'>July 2, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;264 words&nbsp;·&nbsp;Sukai Huang

Lin_guan Leveraging Pretrained Llm to Construct and Utilise World Models for Model Based Task Planning 2023

[TOC] Title: Leveraging Pretrained Large Language Models to Construct and Utilise World Models for Model Based Task Planning Author: Lin Guan et al. Publish Year: 24 May 2023 Review Date: Sun, Jun 4, 2023 url: https://arxiv.org/pdf/2305.14909.pdf Summary of paper Motivation However, methods that use LLMs directly as planners are currently impractical due to several factors, including the limited correctness of plans, strong reliance on feedback from interactions with simulators or even the actual environment, and inefficiency in utilizing human feedback....

<span title='2023-06-04 12:01:46 +1000 AEST'>June 4, 2023</span>&nbsp;·&nbsp;3 min&nbsp;·&nbsp;499 words&nbsp;·&nbsp;Sukai Huang

Dharma_kc Neural Machine Translation for Code Generation 2023

[TOC] Title: Neural Machine Translation for Code Generation Author: Dharma KC et al. Publish Year: 22 May 2023 Review Date: Sun, May 28, 2023 url: https://arxiv.org/pdf/2305.13504.pdf Summary of paper Motivation Recently, NMT methods have been adapted to the generation of program code. In NMT for code generation, the task is to generate output source code that satisfies constraints expressed in the input. Conclusion NMT-based architectures are getting quite popular for source code generation from various inputs....

<span title='2023-05-28 09:52:32 +1000 AEST'>May 28, 2023</span>&nbsp;·&nbsp;1 min&nbsp;·&nbsp;181 words&nbsp;·&nbsp;Sukai Huang

Jiannan_xiang Language Models Meet World Models 2023

[TOC] Title: Language Models Meet World Models: Embodied Experiences Enhance Language Models Author: Jiannan Xiang et al. Publish Year: 22 May 2023 Review Date: Fri, May 26, 2023 url: https://arxiv.org/pdf/2305.10626v2.pdf Summary of paper Motivation LLMs often struggle with simple reasoning and planning in physical environments; the limitation arises from the fact that LMs are trained only on written text and miss essential embodied knowledge and skills. Contribution we propose a new paradigm of enhancing LMs by finetuning them with world models, to gain diverse embodied knowledge while retaining their general language capabilities....

<span title='2023-05-26 01:00:02 +1000 AEST'>May 26, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;357 words&nbsp;·&nbsp;Sukai Huang

Ryan_yang PG3 Policy Guided Planning for Generalised Policy Generation 2022

[TOC] Title: PG3 Policy Guided Planning for Generalised Policy Generation Author: Ryan Yang et al. Publish Year: 21 Apr 2022 Review Date: Wed, May 24, 2023 url: https://arxiv.org/pdf/2204.10420.pdf Summary of paper Motivation a longstanding objective in classical planning is to synthesise policies that generalise across multiple problems from the same domain; in this work, we study generalised policy search-based methods with a focus on the score function used to guide the search over policies. Contribution we study a specific instantiation of policy search where planning problems are PDDL-based and policies are lifted decision lists....

<span title='2023-05-24 19:57:16 +1000 AEST'>May 24, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;304 words&nbsp;·&nbsp;Sukai Huang

Shunyu_yao Tree of Thoughts 2023

[TOC] Title: Tree of Thoughts: Deliberate Problem Solving with LLM Author: Shunyu Yao et al. Publish Year: 17 May 2023 Review Date: Wed, May 24, 2023 url: https://arxiv.org/pdf/2305.10601.pdf Summary of paper Motivation LLMs might benefit from augmentation by a more deliberate "System 2" planning process that (1) maintains and explores diverse alternatives for current choices instead of just picking one, and (2) evaluates its current status and actively looks ahead or backtracks to make more global decisions....
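A minimal sketch of Tree-of-Thoughts-style breadth-first search, where `propose(state)` and `score(state)` are hypothetical stand-ins for LLM calls: keep the best partial solutions at each depth instead of committing to a single chain.

```python
def tree_of_thoughts(root, propose, score, depth=3, beam=2, branch=3):
    """BFS over thought sequences. `root` is a list of thoughts so far;
    `propose` suggests next thoughts, `score` evaluates a state."""
    frontier = [root]
    for _ in range(depth):
        candidates = [state + [thought]
                      for state in frontier
                      for thought in propose(state)[:branch]]
        # evaluate each candidate state and keep only the best `beam`
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return max(frontier, key=score)
```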

<span title='2023-05-24 16:35:10 +1000 AEST'>May 24, 2023</span>&nbsp;·&nbsp;1 min&nbsp;·&nbsp;104 words&nbsp;·&nbsp;Sukai Huang

Tom_silver Generalised Planning in PDDL Domains With Pretrained Large Language Models 2023

[TOC] Title: Generalised Planning in PDDL Domains With Pretrained Large Language Models Author: Tom Silver et al. Publish Year: 18 May 2023 Review Date: Tue, May 23, 2023 url: https://arxiv.org/pdf/2305.11014.pdf Summary of paper Motivation in particular, we consider PDDL domains and use GPT-4 to synthesize Python programs; we also consider Chain of Thought (CoT) summarisation, where the LLM is prompted to summarize the domain and propose a strategy in words before synthesizing the program; and we consider automated debugging, where the program is validated with respect to the training tasks and, in case of errors, the LLM is re-prompted with four types of feedback....
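A minimal sketch of the synthesize-validate-reprompt loop described above, assuming hypothetical `llm` and `run_on_tasks` callables (the paper's four feedback types are collapsed into one feedback string here):

```python
def synthesize_policy(domain_text, tasks, llm, run_on_tasks, max_rounds=4):
    """Ask the LLM for a generalized Python program, validate it on the
    training tasks, and re-prompt with error feedback until it passes."""
    prompt = ("Summarize the domain and propose a strategy in words, then "
              "write a Python program solving any task.\n" + domain_text)
    program = llm(prompt)                         # CoT summarise + synthesize
    for _ in range(max_rounds):
        ok, feedback = run_on_tasks(program, tasks)  # validate on train set
        if ok:
            return program
        program = llm(prompt + "\nYour program failed:\n" + feedback +
                      "\nPlease fix it.")            # automated debugging
    return None
```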

<span title='2023-05-23 21:27:15 +1000 AEST'>May 23, 2023</span>&nbsp;·&nbsp;3 min&nbsp;·&nbsp;551 words&nbsp;·&nbsp;Sukai Huang

Yongliang Hugginggpt 2023

[TOC] Title: HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face Author: Yongliang Shen et al. Publish Year: 2 Apr 2023 Review Date: Tue, May 23, 2023 url: https://arxiv.org/pdf/2303.17580.pdf Summary of paper Motivation while there are abundant AI models available for different domains and modalities, they cannot handle complicated AI tasks; we advocate that LLMs could act as a controller to manage existing AI models to solve complicated AI tasks, with language as a generic interface to empower this. Contribution specifically, we use ChatGPT to conduct task planning when receiving a user request, select models according to their function descriptions available in Hugging Face, execute each subtask with the selected AI model, and summarize the response according to the execution results....
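A minimal sketch of that four-stage control flow, assuming hypothetical `llm`, `model_zoo` (name -> callable), and `descriptions` inputs; the prompt texts are illustrative, not the paper's:

```python
def hugging_gpt(request, llm, model_zoo, descriptions):
    """Plan subtasks, pick a model per subtask from text descriptions,
    execute each subtask, then summarize the results."""
    subtasks = llm(
        f"Decompose into subtasks, one per line:\n{request}"
    ).splitlines()                                     # 1. task planning
    results = []
    for task in subtasks:
        name = llm(f"Pick the best model for '{task}' from:\n{descriptions}")
        results.append((task, model_zoo[name.strip()](task)))  # 2+3. select, run
    report = "\n".join(f"{t}: {r}" for t, r in results)
    return llm(f"Summarize these results for the user:\n{report}")  # 4. respond
```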

<span title='2023-05-23 11:57:02 +1000 AEST'>May 23, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;288 words&nbsp;·&nbsp;Sukai Huang

Yaqi_xie Translating Natural Language to Planning Goals With Llm 2023

[TOC] Title: Translating Natural Language to Planning Goals With LLM Author: Yaqi Xie et al. Publish Year: 10 Feb 2023 Review Date: Mon, May 22, 2023 url: https://arxiv.org/pdf/2302.05128.pdf Summary of paper Motivation Unfortunately, recent work has also shown that LLMs are unable to perform accurate reasoning or solve planning problems; an LLM can instead act as a natural interface between the planner and human users. Our empirical results on GPT-3.5 variants show that LLMs are much better suited to translation than to planning....

<span title='2023-05-22 12:30:25 +1000 AEST'>May 22, 2023</span>&nbsp;·&nbsp;1 min&nbsp;·&nbsp;142 words&nbsp;·&nbsp;Sukai Huang

Bo_liu Llmp Empowering Large Language Models With Optimal Planning Proficiency 2023

[TOC] Title: LLM+P Empowering Large Language Models With Optimal Planning Proficiency Author: Bo Liu Publish Year: 5 May 2023 Review Date: Mon, May 22, 2023 url: https://arxiv.org/pdf/2304.11477.pdf Summary of paper Motivation However, so far, LLMs cannot reliably solve long-horizon planning problems. By contrast, classical planners, once a problem is given in a formatted way, can use efficient search algorithms to quickly identify correct, or even optimal, plans. Contribution introduce LLM+P, which takes in a natural language description of a planning problem and then returns a correct plan for solving that problem in natural language....
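A minimal sketch of the LLM+P pipeline, assuming hypothetical `llm` and `classical_planner` callables: the LLM only translates between natural language and PDDL, and the classical planner does the actual search.

```python
def llm_plus_p(nl_problem, domain_pddl, llm, classical_planner):
    """NL task -> PDDL problem (LLM) -> plan (classical planner) -> NL plan."""
    problem_pddl = llm(
        "Translate this task into a PDDL problem file for the given domain.\n"
        f"Domain:\n{domain_pddl}\nTask: {nl_problem}"
    )
    plan = classical_planner(domain_pddl, problem_pddl)  # sound, efficient search
    return llm(f"Explain this plan in plain language:\n{plan}")
```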

<span title='2023-05-22 11:56:15 +1000 AEST'>May 22, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;251 words&nbsp;·&nbsp;Sukai Huang

Siyu_yuan Distilling Script Knowledge From Large Language Models for Constrained Language Planning 2023

[TOC] Title: Distilling Script Knowledge From Large Language Models for Constrained Language Planning Author: Siyu Yuan et al. Publish Year: 18 May 2023 Review Date: Mon, May 22, 2023 url: https://arxiv.org/pdf/2305.05252.pdf Summary of paper Motivation to accomplish everyday goals, humans usually plan their actions in accordance with step-by-step instructions; such instructions are known as goal-oriented scripts. In this paper, we define the task of constrained language planning for the first time....

<span title='2023-05-22 11:31:39 +1000 AEST'>May 22, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;304 words&nbsp;·&nbsp;Sukai Huang

Junnan_li BLIP Bootstrapping Language Image Pre Training for Unified Vision Language Understanding and Generation 2022

[TOC] Title: BLIP Bootstrapping Language Image Pre Training for Unified Vision Language Understanding and Generation 2022 Author: Junnan Li et al. Publish Year: 15 Feb 2022 Review Date: Mon, May 22, 2023 url: https://arxiv.org/pdf/2201.12086.pdf Summary of paper Motivation performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision. Contribution BLIP effectively utilises the noisy web data by bootstrapping the captions, where a captioner generates synthetic captions and a filter removes the noisy ones....
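A minimal sketch of the caption-bootstrapping loop, assuming hypothetical `captioner` and `filter_score` callables: the captioner proposes a synthetic caption for each web image, and the filter keeps only image-text pairs judged matching, yielding a cleaner pre-training set.

```python
def bootstrap_captions(web_pairs, captioner, filter_score, threshold=0.5):
    """Given noisy (image, caption) web pairs, keep original and synthetic
    captions that the filter scores as matching their image."""
    cleaned = []
    for image, noisy_caption in web_pairs:
        for caption in (noisy_caption, captioner(image)):
            if filter_score(image, caption) > threshold:  # drop noisy pairs
                cleaned.append((image, caption))
    return cleaned
```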

<span title='2023-05-22 11:17:28 +1000 AEST'>May 22, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;240 words&nbsp;·&nbsp;Sukai Huang

Harsh_jhamtani Natural Language Decomposition and Interpretation of Complex Utterances 2023

[TOC] Title: Natural Language Decomposition and Interpretation of Complex Utterances Author: Harsh Jhamtani et al. Publish Year: 15 May 2023 Review Date: Mon, May 22, 2023 url: https://arxiv.org/pdf/2305.08677.pdf Summary of paper Motivation natural language interfaces often require supervised data to translate user requests into structured intent representations; however, during data collection, it can be difficult to anticipate and formalise the full range of user needs. We introduce an approach for equipping a simple language-to-code model to handle complex utterances via a process of hierarchical natural language decomposition....
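A minimal sketch of hierarchical decomposition, with hypothetical `llm` and `nl_to_code` callables and illustrative prompt wording: a complex utterance is rewritten as simpler natural-language steps, and each simple step is handled by the plain language-to-code model.

```python
def interpret(utterance, llm, nl_to_code, max_depth=3):
    """Recursively decompose an utterance until each piece is simple
    enough for the base language-to-code model."""
    if max_depth == 0 or llm(
            f"Is this a single simple step? Answer yes/no: {utterance}"
    ).strip() == "yes":
        return [nl_to_code(utterance)]
    steps = llm(f"Rewrite as simpler steps, one per line:\n{utterance}")
    code = []
    for step in steps.splitlines():          # recurse on each simpler step
        code.extend(interpret(step, llm, nl_to_code, max_depth - 1))
    return code
```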

<span title='2023-05-22 09:54:04 +1000 AEST'>May 22, 2023</span>&nbsp;·&nbsp;10 min&nbsp;·&nbsp;2088 words&nbsp;·&nbsp;Sukai Huang

Alexander_kirillov Segment Anything 2023

[TOC] Title: Segment Anything Author: Alexander Kirillov et al. Publish Year: 5 Apr 2023 Review Date: Sun, May 21, 2023 url: https://arxiv.org/pdf/2304.02643.pdf Summary of paper Motivation we introduce the Segment Anything project: a new task, model, and dataset for image segmentation. Using the model in a data collection loop, we built the largest segmentation dataset to date. Contribution the model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and tasks....

<span title='2023-05-21 11:56:54 +1000 AEST'>May 21, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;356 words&nbsp;·&nbsp;Sukai Huang

Rohit_girdhar Imagebind One Embedding Space to Bind Them All 2023

[TOC] Title: ImageBind One Embedding Space to Bind Them All Author: Rohit Girdhar et al. Publish Year: 9 May 2023 Review Date: Mon, May 15, 2023 url: https://arxiv.org/pdf/2305.05665.pdf Summary of paper Motivation we present ImageBind, an approach to learning a joint embedding across six different modalities; ImageBind can leverage recent large-scale vision-language models and extend their zero-shot capabilities to new modalities just using their natural pairing with images. Contribution we show that not all combinations of paired data are necessary to train such a joint embedding; image-paired data alone is sufficient to bind the modalities together....
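A minimal sketch of the binding objective, assuming pre-computed embedding batches: each modality is aligned to images with a symmetric InfoNCE loss over matched pairs, so modalities never paired directly (e.g. audio and depth) still align through the shared image space.

```python
import torch
import torch.nn.functional as F

def infonce_bind(img_emb, other_emb, temperature=0.07):
    """Symmetric InfoNCE between a batch of image embeddings and the
    paired embeddings of another modality (matched pairs on the diagonal)."""
    img = F.normalize(img_emb, dim=-1)
    oth = F.normalize(other_emb, dim=-1)
    logits = img @ oth.t() / temperature      # pairwise cosine similarities
    target = torch.arange(len(img))           # i-th image matches i-th clip
    return (F.cross_entropy(logits, target) +
            F.cross_entropy(logits.t(), target)) / 2
```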

<span title='2023-05-15 15:06:48 +1000 AEST'>May 15, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;235 words&nbsp;·&nbsp;Sukai Huang

Qinghao_hitea Hierarchical Temporal Aware Video Language Pre Training 2022

[TOC] Title: Hierarchical Temporal Aware Video Language Pre Training Author: Qinghao Ye, Fei Huang et al. Publish Year: 30 Dec 2022 Review Date: Thu, Apr 6, 2023 url: https://arxiv.org/pdf/2212.14546.pdf Summary of paper Motivation most previous methods directly inherit or adapt typical image-language pre-training paradigms to video-language pretraining, thus not fully exploiting the unique characteristic of video, i.e., temporality. Contribution this paper proposes two novel pretraining tasks for modelling cross-modal alignment between moments and texts, as well as the temporal relations of video-text pairs....

<span title='2023-04-06 10:02:22 +0800 +0800'>April 6, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;411 words&nbsp;·&nbsp;Sukai Huang