Gerevini Plan Constraints and Preferences in Pddl3 2005

[TOC] Title: Gerevini Plan Constraints and Preferences in PDDL3 Author: Alfonso Gerevini, Derek Long Publish Year: 2005 Review Date: Thu, Jan 11, 2024 url: http://www.cs.yale.edu/~dvm/papers/pddl-ipc5.pdf Summary of paper Motivation the notion of plan quality in automated planning is a practically important issue: we want to generate plans of good or optimal quality, and we need a way to express plan quality. The proposed extended language allows us to express strong and soft constraints on plan trajectories, i.e., constraints over possible actions in the plan and intermediate states reached by the plan, as well as strong and soft problem goals. Some key terms some scenarios ...
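
To make the strong/soft constraint distinction concrete, here is a minimal PDDL3 sketch embedded as a Python string (the predicates and the weight `10` are hypothetical, not taken from the paper; only the `always`, `preference`, `sometime`, and `is-violated` constructs are PDDL3 syntax):

```python
# Minimal PDDL3 sketch: a hard trajectory constraint plus a soft
# preference whose violation is priced into the plan metric.
PDDL3_PROBLEM_SNIPPET = """
(:constraints (and
  ;; strong (hard) constraint: fragile packages must always stay clear
  (always (forall (?p - package) (implies (fragile ?p) (clear ?p))))
  ;; soft constraint (preference): package1 eventually visits london
  (preference p1 (sometime (at package1 london)))))

;; soft goals are priced in the metric: each violated preference
;; adds a penalty to the plan cost
(:metric minimize (+ (total-time) (* 10 (is-violated p1))))
"""
print(PDDL3_PROBLEM_SNIPPET)
```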

January 11, 2024 · 1 min · 122 words · Sukai Huang

Nir Lipo Planning With Perspectives Using Functional Strips 2022

[TOC] Title: Planning With Perspectives – Decomposing Epistemic Planning using Functional STRIPS Author: Guang Hu, Nir Lipovetzky Publish Year: 2022 Review Date: Thu, Jan 11, 2024 url: https://nirlipo.github.io/publication/hu-2022-planning/ Summary of paper Motivation we present a novel approach to epistemic planning called planning with perspectives (PWP) that is both more expressive and computationally more efficient than existing state-of-the-art epistemic planning tools. Contribution in this paper, we decompose epistemic planning by delegating reasoning about epistemic formulae to an external solver, implemented via Functional STRIPS (F-STRIPS). F-STRIPS supports the use of external, black-box functions within action models. Building on recent work that demonstrates the relationship between what an agent ‘sees’ and what it knows, we define the perspective of each agent using an external function, and build a solver for epistemic logic around this. Some key terms external functions (black-box) ...
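
To make the external-function idea concrete, here is a minimal Python sketch (the function name `sees`, its signature, and the geometry are my own illustration, not the paper's code): the planner treats the function as a black box evaluated on states, instead of axiomatising visibility inside the PDDL model.

```python
import math

# Hypothetical external (black-box) function in the spirit of F-STRIPS:
# the planner calls it to decide whether agent `a` sees object `o`,
# rather than encoding perspectives as logical axioms in the model.
def sees(state, a, o, fov_deg=90.0):
    ax, ay, heading = state[a]          # agent pose (x, y, heading in degrees)
    ox, oy = state[o]                   # object position
    angle = math.degrees(math.atan2(oy - ay, ox - ax))
    diff = (angle - heading + 180) % 360 - 180   # signed angular difference
    return abs(diff) <= fov_deg / 2     # object inside the field of view?

state = {"agent1": (0.0, 0.0, 0.0), "ball": (5.0, 1.0)}
print(sees(state, "agent1", "ball"))    # True: the ball is ahead of agent1
```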

January 11, 2024 · 2 min · 267 words · Sukai Huang

Alex_coulter Theory Alignment via a Classical Encoding of Regular Bisimulation 2022

[TOC] Title: Theory Alignment via a Classical Encoding of Regular Bisimulation 2022 Author: Alex Coulter et al. Publish Year: KEPS 2022 Review Date: Wed, Nov 29, 2023 url: https://icaps22.icaps-conference.org/workshops/KEPS/KEPS-22_paper_7781.pdf Summary of paper Motivation the main question we seek to answer is how we can test whether two models align (where the fluents and action implementations may differ), and if not, where that misalignment occurs. Contribution the work is built on a foundation of regular bisimulation. They found that the proposed alignment was not only viable, with many submissions having “solutions” to the merged model that show where a modelling error occurs, but that several cases also demonstrated errors in the submitted domains that were subtle and detected only by this added approach. Some key terms Bisimulation ...
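
As a reminder of what bisimulation requires, here is a naive fixpoint check over two labelled transition systems (my own sketch, not the paper's classical encoding): a pair of states survives only if every move on one side can be matched by a related move on the other.

```python
def bisimulation(states1, states2, trans1, trans2):
    """Greatest bisimulation between two labelled transition systems.
    trans maps (state, action) -> set of successor states."""
    actions = {a for (_, a) in list(trans1) + list(trans2)}
    rel = {(s, t) for s in states1 for t in states2}  # start full, refine down
    changed = True
    while changed:
        changed = False
        for (s, t) in list(rel):
            for a in actions:
                succ_s = trans1.get((s, a), set())
                succ_t = trans2.get((t, a), set())
                # every successor on either side must be matched by a
                # related successor on the other side (transfer condition)
                ok = all(any((s2, t2) in rel for t2 in succ_t) for s2 in succ_s) \
                 and all(any((s2, t2) in rel for s2 in succ_s) for t2 in succ_t)
                if not ok:
                    rel.discard((s, t))
                    changed = True
                    break
    return rel

# two one-step systems that match each other move for move
t1 = {("p", "go"): {"q"}}
t2 = {("x", "go"): {"y"}}
print(bisimulation({"p", "q"}, {"x", "y"}, t1, t2))  # {('p','x'), ('q','y')}
```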

November 29, 2023 · 6 min · 1083 words · Sukai Huang

Pascal Bercher Detecting Ai Planning Modelling Mistakes Potential Errors and Benchmark Domains 2023

[TOC] Title: Detecting AI Planning Modelling Mistakes: Potential Errors and Benchmark Domains Author: Pascal Bercher et al. Publish Year: 2023 Review Date: Mon, Nov 13, 2023 url: https://bercher.net/publications/2023/Sleath2023PossibleModelingErrors.pdf Summary of paper Contribution the authors provide a compilation of potential modelling errors; they supply a public repository of 56 (flawed) benchmark domains; they conducted an evaluation of well-known AI planning tools for their ability to diagnose those errors, showing that not a single tool is able to spot all errors, with no tool being strictly stronger than another. Some key terms list of errors ...

November 13, 2023 · 2 min · 408 words · Sukai Huang

Yecheng Jason Ma Eureka Human Level Reward Design via Coding Large Language Models 2023

[TOC] Title: Eureka: Human-Level Reward Design via Coding Large Language Models 2023 Author: Yecheng Jason Ma et al. Publish Year: 19 Oct 2023 Review Date: Fri, Oct 27, 2023 url: https://arxiv.org/pdf/2310.12931.pdf Summary of paper Motivation harnessing LLMs to learn complex low-level manipulation tasks remains an open problem. We bridge this fundamental gap by using LLMs to produce rewards that can be used to acquire complex skills via reinforcement learning. Contribution Eureka generates reward functions that outperform expert human-engineered rewards. The generality of Eureka also enables a new gradient-free in-context learning approach to reinforcement learning from human feedback (RLHF). Some key terms given detailed environment code and a natural language description of the task, the LLM can generate reward function candidates by sampling. As many real-world RL tasks admit sparse rewards that are difficult for learning, reward shaping that provides incremental learning signals is necessary in practice. reward design problem ...
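
A hedged sketch of the Eureka-style evolutionary loop (the helpers `llm_propose_rewards` and `train_and_score` are stand-ins, not the paper's API): sample several executable reward-function candidates from the LLM, train a policy against each, and feed a textual "reward reflection" of the best candidate back into the next round.

```python
import random

# Hypothetical stand-ins (NOT the paper's API): a real system would
# prompt an LLM with the environment source and task text, then train
# an RL policy under each candidate reward function.
def llm_propose_rewards(env_source, task_desc, feedback, n):
    return [f"def reward_v{random.randint(0, 999)}(state): ..." for _ in range(n)]

def train_and_score(reward_code):
    return random.random(), {"episodes": 100}   # (fitness, training stats)

def eureka_search(env_source, task_desc, iterations=5, samples=4):
    """Evolutionary reward design in the spirit of Eureka (sketch)."""
    best_code, best_score, feedback = None, float("-inf"), ""
    for _ in range(iterations):
        candidates = llm_propose_rewards(env_source, task_desc, feedback, samples)
        for code in candidates:
            score, stats = train_and_score(code)
            if score > best_score:
                best_code, best_score = code, score
                # "reward reflection": textual training summary fed back
                feedback = f"best candidate scored {score:.2f}; stats: {stats}"
    return best_code, best_score

print(eureka_search("<env source code>", "make the robot spin a pen"))
```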

October 27, 2023 · 6 min · 1163 words · Sukai Huang

Mark Chen Evaluating Large Language Models Trained on Code 2021

[TOC] Title: Evaluating Large Language Models Trained on Code Author: Mark Chen et al., OpenAI Publish Year: 14 Jul 2021 Review Date: Mon, Oct 16, 2023 url: https://arxiv.org/pdf/2107.03374.pdf Summary of paper Motivation this is the research paper behind the GitHub Copilot technology. More recently, language models have also fueled progress towards the longstanding challenge of program synthesis. Contribution we find that repeated sampling from the model is a surprisingly effective strategy for producing working solutions to difficult prompts. Limitation difficulty with docstrings describing long chains of operations and with binding operations to variables. Some key terms HumanEval ...
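
The paper's headline metric for "repeated sampling" is pass@k, estimated without bias from n samples of which c pass. The numerically stable estimator from the paper is:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k),
    computed in a numerically stable product form."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

print(pass_at_k(n=200, c=10, k=1))   # ~0.05, i.e. c/n for k=1
```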

October 16, 2023 · 2 min · 298 words · Sukai Huang

Baptiste Roziere Code Llama Open Foundation Model for Code 2023

[TOC] Title: Code Llama: Open Foundation Models for Code Author: Baptiste Roziere et al., Meta AI Publish Year: 2023 Review Date: Mon, Oct 16, 2023 url: https://scontent.fmel13-1.fna.fbcdn.net/v/t39.2365-6/369856151_1754812304950972_1159666448927483931_n.pdf?_nc_cat=107&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=Hcg6QsYJx1wAX_okEZO&_nc_ht=scontent.fmel13-1.fna&oh=00_AfAYtfHJfYeomAQWiMUTRo96iP8d4sZrlIfD_KAeYlYaDQ&oe=6531E8CF Summary of paper Motivation Code Llama offers support for large input contexts and zero-shot instruction following ability for programming tasks. Contribution Code Llama reaches SOTA performance among open models on several code benchmarks. Some key terms by training on domain-specific datasets, LLMs have proved effective more broadly on applications that require advanced natural language understanding. ...

October 16, 2023 · 2 min · 284 words · Sukai Huang

Haotian Liu Improved Baselines With Visual Instruction Tuning 2023

[TOC] Title: Improved Baselines With Visual Instruction Tuning Author: Haotian Liu et al. Publish Year: Oct 5 2023 Review Date: Sun, Oct 8, 2023 url: https://arxiv.org/pdf/2310.03744.pdf Summary of paper Motivation we show that the fully-connected vision-language cross-modal connector in LLaVA is surprisingly powerful and data-efficient. Contribution with simple modifications to LLaVA, namely using CLIP-ViT with an MLP projection and adding academic-task-oriented VQA data with simple response formatting prompts, they establish stronger baselines. Some key terms Improvement one: MLP cross-modal connector ...
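
The first modification is tiny in code terms: replace the single linear projection with a two-layer MLP between the frozen CLIP-ViT patch features and the LLM embedding space. A minimal PyTorch sketch (the dimensions are illustrative):

```python
import torch
import torch.nn as nn

class MLPProjector(nn.Module):
    """Vision-language connector in the spirit of LLaVA-1.5:
    a 2-layer MLP with GELU instead of a single linear layer."""
    def __init__(self, vision_dim=1024, llm_dim=4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_features):      # (B, num_patches, vision_dim)
        return self.proj(patch_features)    # (B, num_patches, llm_dim)

tokens = MLPProjector()(torch.randn(2, 576, 1024))
print(tokens.shape)   # torch.Size([2, 576, 4096]) -> fed to the LLM
```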

October 8, 2023 · 2 min · 240 words · Sukai Huang

Christabel Wayllace Goal Recognition Design With Stochastic Agent Action Outcomes 2016

[TOC] Title: Christabel Wayllace Goal Recognition Design With Stochastic Agent Action Outcomes 2016 Author: Christabel Wayllace et al. Publish Year: IJCAI 2016 Review Date: Fri, Oct 6, 2023 url: https://www.ijcai.org/Proceedings/16/Papers/464.pdf Summary of paper Motivation in this paper, they generalize the Goal Recognition Design (GRD) problem to Stochastic GRD (S-GRD) problems, which handle stochastic action outcomes. Some key terms Plan and goal recognition problem it aims to identify the actual plan or goal of an agent given its behaviour. Goal Recognition Design ...

October 6, 2023 · 1 min · 191 words · Sukai Huang

Alba Gragera Pddl Domain Repair Fixing Domains With Incomplete Action Effects 2023

[TOC] Title: PDDL Domain Repair: Fixing Domains With Incomplete Action Effects Author: Alba Gragera et al. Publish Year: ICAPS 2023 Review Date: Wed, Sep 20, 2023 url: https://icaps23.icaps-conference.org/demos/papers/2791_paper.pdf Summary of paper Contribution in this paper, they present a tool to repair planning models where the effects of some actions are incomplete. The received input is compiled into a new extended planning task, in which actions are permitted to insert possible missing effects. The solution is a plan that achieves the goals of the original problem while also alerting users to the modifications made. ...
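
A hedged sketch of the compilation idea (my own simplification, not the tool's actual encoding): for each action and each candidate missing effect, emit a copy of the action that additionally asserts that effect, with a cost penalty so the planner prefers the unmodified model whenever possible.

```python
def repair_variants(action_name, effects, candidate_effects):
    """For each candidate missing effect, emit a penalised variant of the
    action that additionally asserts that effect (illustrative sketch;
    the action and predicate names are hypothetical)."""
    variants = []
    for i, extra in enumerate(candidate_effects):
        variants.append(f"""
(:action {action_name}-repair-{i}
  ;; same parameters and preconditions as {action_name} (omitted here)
  :effect (and {' '.join(effects + [extra])}
               (increase (total-cost) 1)))   ; penalise using a repair
""")
    return variants

print(repair_variants("open-door", ["(door-open ?d)"],
                      ["(not (door-locked ?d))"])[0])
```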

September 20, 2023 · 1 min · 153 words · Sukai Huang

Alba Gragera Exploring the Limitations of Using LLMs to Fix Planning Tasks 2023

[TOC] Title: Exploring the Limitations of Using LLMs to Fix Planning Tasks Author: Alba Gragera et al. Publish Year: ICAPS 2023 (KEPS workshop) Review Date: Wed, Sep 20, 2023 url: https://icaps23.icaps-conference.org/program/workshops/keps/KEPS-23_paper_3645.pdf Summary of paper Motivation in this work, the authors present ongoing efforts on exploring the limitations of LLMs in tasks requiring reasoning and planning competences: that of assisting humans in the process of fixing planning tasks. Contribution they investigate how good LLMs are at repairing planning tasks when the prompt is given in PDDL and when it is given in natural language. They also tested on incomplete initial states as well as incomplete domains which lack a necessary action effect to achieve the goals. In all cases, LLMs are used stand-alone, and the authors directly assess the correctness of the solutions they generate. Conclusion: they demonstrate that although LLMs can in principle facilitate iterative refinement of PDDL models through user interaction, their limited reasoning abilities render them insufficient for identifying meaningful changes to ill-defined planning models that result in solvable planning tasks. ...

September 20, 2023 · 2 min · 403 words · Sukai Huang

Tathagata Chakraborti Plan Explanations as Model Reconciliation 2017

[TOC] Title: Plan Explanations as Model Reconciliation: Moving Beyond Explanation as Soliloquy Author: Tathagata Chakraborti Publish Year: 30 May 2017 Review Date: Tue, Sep 19, 2023 url: https://arxiv.org/pdf/1701.08317.pdf Summary of paper Motivation past work on plan explanations primarily involved the AI system explaining the correctness of its plan and the rationale for its decision in terms of its own model. Such soliloquy is inadequate (think about the case where GPT-4 cannot find errors in a PDDL domain file due to overconfidence). In this work, the author argues that, because the human and the AI system differ in their domain and task models, soliloquy is inadequate. Contribution they show how explanation can be seen as a “model reconciliation problem” (MRP), where the AI system in effect suggests changes to the human’s model, so as to make its plan optimal with respect to that changed human model. In other words, they need to update the human’s mindset about the domain and task model such that the plan generated by the AI system matches the human’s expectations. Some key terms Definition of a classical planning problem ...
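
The excerpt cuts off at the definition the paper builds on; the standard STRIPS-style formulation (paraphrased and simplified, not quoted from the paper) is:

```latex
\[
\Pi = \langle D, \mathcal{I}, \mathcal{G} \rangle,\qquad
D = \langle F, A \rangle,\qquad
a = \langle \mathrm{pre}(a), \mathrm{add}(a), \mathrm{del}(a) \rangle \ \ \forall a \in A
\]
% F: fluents; \mathcal{I} \subseteq F: initial state; \mathcal{G} \subseteq F: goal.
% Model reconciliation then compares the robot's model \Pi^R with the
% human's mental model \Pi^H and searches for a minimal update to \Pi^H
% under which the robot's plan is optimal.
```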

September 19, 2023 · 3 min · 630 words · Sukai Huang

Vishal Pallagani Plansformer Tool Demonstrating Generation of Symbolic Plans Using Transformers 2023

[TOC] Title: Plansformer – Tool Demonstrating Generation of Symbolic Plans Using Transformers Author: Vishal Pallagani et al. Publish Year: IJCAI-23 Review Date: Sat, Sep 16, 2023 url: https://www.ijcai.org/proceedings/2023/0839.pdf Summary of paper Motivation building a bridge between planning with LLMs and planning with traditional automated planners Design of Plansformer in the evaluation phase, planner testing helps to validate the plan (both syntax validation and plan optimality validation), while model testing helps to enforce linguistic consistency (in this case it supervises the semantics). Function of this Plansformer Plansformer operates as an AI planner designed for plan generation, not for creating PDDL from natural language descriptions. ...

September 16, 2023 · 1 min · 105 words · Sukai Huang

Junnan_li Blip2 Bootstrapping Language Image Pretraining 2023

[TOC] Title: BLIP-2: Bootstrapping Language-Image Pretraining 2023 Author: Junnan Li et al. Publish Year: 15 Jun 2023 Review Date: Mon, Aug 28, 2023 url: https://arxiv.org/pdf/2301.12597.pdf Summary of paper The paper titled “BLIP-2” proposes a new and efficient pre-training strategy for vision-and-language models. The cost of training such models has been increasingly prohibitive due to the large scale of the models. BLIP-2 aims to address this issue by leveraging off-the-shelf, pre-trained image encoders and large language models (LLMs) that are kept frozen during the pre-training process. ...

August 28, 2023 · 2 min · 327 words · Sukai Huang

Peng_gao Llama Adapter V2 2023

[TOC] Title: Llama Adapter V2 Author: Peng Gao et al. Publish Year: 28 Apr 2023 Review Date: Mon, Aug 28, 2023 url: https://arxiv.org/pdf/2304.15010.pdf Summary of paper The paper presents LLaMA-Adapter V2, an enhanced version of the original LLaMA-Adapter designed for multi-modal reasoning and instruction following. The paper aims to address the limitations of the original LLaMA-Adapter, which could not generalize well to open-ended visual instructions and lagged behind GPT-4 in performance. ...

August 28, 2023 · 2 min · 246 words · Sukai Huang

Rodrigo Reward Machines Exploiting Reward Function Structure in Rl 2022

[TOC] Title: Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning 2022 Author: Rodrigo Toro Icarte et al. Publish Year: 2022, AI Access Foundation Review Date: Thu, Aug 17, 2023 url: https://arxiv.org/abs/2010.03950 Summary of paper Motivation in most RL applications, however, users have to program the reward function; hence, there is an opportunity to make the reward function visible, so that the RL agent can exploit the function’s internal structure to learn optimal policies in a more sample-efficient manner. Contribution a different methodology of RL for reward machines: compared to their previous studies, this work tests a collection of RL methods that can exploit a reward machine’s internal structure to improve sample efficiency. Some key terms counterfactual experiences for reward machines (CRM) ...
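
A minimal sketch of CRM (tabular flavour; the names are mine, not the paper's code): from a single environment transition, synthesise one experience per reward-machine state, which is possible because the RM's transition function δ and reward function σ are both known to the learner.

```python
def crm_experiences(s, a, s_next, rm_states, delta, sigma, labeler):
    """Counterfactual experiences for reward machines (sketch).
    delta: (u, label) -> u'; sigma: (u, label) -> reward;
    labeler: state -> set of propositions that hold in it."""
    label = labeler(s_next)
    batch = []
    for u in rm_states:                  # counterfactual RM states
        u_next = delta(u, label)
        r = sigma(u, label)
        batch.append(((s, u), a, r, (s_next, u_next)))
    return batch                         # feed all of these to Q-learning

# toy RM: u0 --{"goal"}--> u1 with reward 1, otherwise stay with reward 0
delta = lambda u, l: "u1" if (u == "u0" and "goal" in l) else u
sigma = lambda u, l: 1.0 if (u == "u0" and "goal" in l) else 0.0
print(crm_experiences("s3", "right", "s4", ["u0", "u1"],
                      delta, sigma, labeler=lambda s: {"goal"}))
```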

August 17, 2023 · 2 min · 321 words · Sukai Huang

Rodrigo Using Reward Machines for High Level Task Specification and Decomposition in Rl 2018

[TOC] Title: Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning Author: Rodrigo Toro Icarte et al. Publish Year: PMLR 2018 Review Date: Thu, Aug 17, 2023 url: http://proceedings.mlr.press/v80/icarte18a/icarte18a.pdf Summary of paper Motivation proposing reward machines, which expose reward function structure to the learner and support decomposition. Contribution in contrast to hierarchical RL methods, which might converge to suboptimal policies, QRM is proven to converge to an optimal policy in the tabular case. Some key terms intro ...
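
To pin down the object being discussed: a reward machine is a finite-state machine over propositional labels, with a state-transition function and a reward output. A toy encoding (mine, not the paper's code) of the classic "get coffee, then reach the office" task:

```python
class RewardMachine:
    """Toy reward machine: fetch coffee (label 'c'), then office ('o')."""
    def __init__(self):
        self.u0 = 0                     # initial RM state

    def step(self, u, label):
        """Return (next RM state, reward) for the observed label set."""
        if u == 0 and "c" in label:
            return 1, 0.0               # got coffee, no reward yet
        if u == 1 and "o" in label:
            return 2, 1.0               # delivered: task complete
        return u, 0.0                   # otherwise: stay, zero reward

rm = RewardMachine()
print(rm.step(0, {"c"}))   # (1, 0.0)
print(rm.step(1, {"o"}))   # (2, 1.0)
# QRM then learns a separate Q-function Q_u per RM state u, so the
# policy can condition on how much of the task is already done.
```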

August 17, 2023 · 2 min · 360 words · Sukai Huang

William_berrios Towards Language Models That Can See 2023

[TOC] Title: Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language Author: William Berrios et al. Publish Year: 28 Jun 2023 Review Date: Mon, Jul 3, 2023 url: https://arxiv.org/pdf/2306.16410.pdf Summary of paper Contribution proposing LENS, a modular approach that addresses computer vision tasks by harnessing the few-shot, in-context learning abilities of language models through natural language descriptions of visual inputs. LENS enables any off-the-shelf LLM to have visual capabilities without auxiliary training or data. LENS framework ...
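
A hedged sketch of the LENS idea (the module names and callables are stand-ins, not the paper's code): independent vision modules emit text, which is concatenated into a prompt for an unmodified, frozen LLM.

```python
def lens_answer(image, question, llm, vision_modules):
    """LENS-style pipeline (sketch): vision -> text -> frozen LLM.
    vision_modules is a dict of hypothetical callables returning text."""
    descriptions = [f"{name}: {fn(image)}" for name, fn in vision_modules.items()]
    prompt = "\n".join(descriptions + [f"Question: {question}", "Answer:"])
    return llm(prompt)          # no auxiliary multimodal training needed

# toy stand-ins for the vision modules and the LLM
modules = {"tags": lambda im: "dog, frisbee, park",
           "caption": lambda im: "a dog catching a frisbee on grass"}
print(lens_answer(None, "What is the dog doing?",
                  llm=lambda p: "(LLM output here)", vision_modules=modules))
```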

July 3, 2023 · 1 min · 152 words · Sukai Huang

Lionel_wong From Word Models to World Models 2023

[TOC] Title: From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought Author: Lionel Wong et al. Publish Year: 23 Jun 2023 Review Date: Sun, Jul 2, 2023 url: https://arxiv.org/pdf/2306.12672.pdf Summary of paper Motivation leverage a theory of linguistic meaning to build machines that think in more human-like ways. We frame linguistic meaning as a context-sensitive mapping from NL into a probabilistic language of thought (PLoT) – a general-purpose symbolic substrate for probabilistic, generative world modelling ...

July 2, 2023 · 3 min · 460 words · Sukai Huang

Jianning_wang Boosting Language Models Reasoning With Chain of Knowledge Prompting 2023

[TOC] Title: Boosting Language Models Reasoning With Chain-of-Knowledge Prompting Author: Jianing Wang et al. Publish Year: 10 Jun 2023 Review Date: Sun, Jul 2, 2023 url: https://arxiv.org/pdf/2306.06427.pdf Summary of paper Motivation “Chain of Thought (CoT)” prompting designs a simple prompt like “Let’s think step by step”; however, the generated rationales often come with mistakes, resulting in unfactual and unfaithful reasoning chains. To mitigate this brittleness, we propose novel Chain-of-Knowledge (CoK) prompting, where the knowledge evidence takes the form of structured triples. Contribution benefiting from CoK, we additionally introduce an F²-Verification method to estimate the reliability of a response; wrong evidence can be flagged to prompt the LLM to rethink. Some key terms ...
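
A sketch of what a chain-of-knowledge prompt might look like (the idea of triple-formatted evidence is the paper's; this particular rendering and template are mine):

```python
def cok_prompt(question, triples):
    """Render structured knowledge triples as explicit evidence ahead
    of the question (chain-of-knowledge style sketch)."""
    evidence = "\n".join(f"({s}, {r}, {o})" for s, r, o in triples)
    return (f"Evidence triples:\n{evidence}\n"
            f"Question: {question}\n"
            "Answer step by step, citing the triples you rely on:")

print(cok_prompt("Where was the author of Hamlet born?",
                 [("Hamlet", "written_by", "William Shakespeare"),
                  ("William Shakespeare", "born_in", "Stratford-upon-Avon")]))
```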

July 2, 2023 · 2 min · 264 words · Sukai Huang

Lin_guan Leveraging Pretrained Llm to Construct and Utilise World Models for Model Based Task Planning 2023

[TOC] Title: Leveraging Pretrained Large Language Models to Construct and Utilise World Models for Model Based Task Planning Author: Lin Guan et al. Publish Year: 24 May 2023 Review Date: Sun, Jun 4, 2023 url: https://arxiv.org/pdf/2305.14909.pdf Summary of paper Motivation however, methods that use LLMs directly as planners are currently impractical due to several factors, including limited correctness of plans, strong reliance on feedback from interactions with simulators or even the actual environment, and the inefficiency in utilizing human feedback. Contribution they introduce an alternative paradigm that constructs an explicit world (domain) model in the planning domain definition language (PDDL) and then uses it to plan with sound domain-independent planners. Users can correct the PDDL before the actual planning. Findings GPT-4 can readily correct all the errors according to natural language feedback from PDDL validators and humans. Some key terms approach ...
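
A hedged sketch of the construct-then-correct loop (the helper names are hypothetical; the paper pairs GPT-4 with PDDL validators and human feedback):

```python
def build_pddl_domain(llm, task_description, validator, max_rounds=3):
    """Sketch: an LLM drafts a PDDL domain, a validator (or human)
    returns natural-language error feedback, and the LLM revises."""
    domain = llm(f"Write a PDDL domain for: {task_description}")
    for _ in range(max_rounds):
        errors = validator(domain)       # e.g., syntax or semantic issues
        if not errors:
            break                        # hand off to a sound planner
        domain = llm(f"Fix these problems in the PDDL domain:\n"
                     f"{errors}\n---\n{domain}")
    return domain

# toy stand-ins so the sketch runs end to end
drafts = iter(["(define (domain d) BROKEN", "(define (domain d))"])
llm = lambda prompt: next(drafts)
validator = lambda dom: "unbalanced parentheses" if "BROKEN" in dom else ""
print(build_pddl_domain(llm, "stack blocks", validator))
```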

June 4, 2023 · 3 min · 499 words · Sukai Huang

Dharma_kc Neural Machine Translation for Code Generation 2023

[TOC] Title: Neural Machine Translation for Code Generation Author: Dharma KC et al. Publish Year: 22 May 2023 Review Date: Sun, May 28, 2023 url: https://arxiv.org/pdf/2305.13504.pdf Summary of paper Motivation recently, NMT methods have been adapted to the generation of program code. In NMT for code generation, the task is to generate output source code that satisfies constraints expressed in the input. Conclusion NMT-based architectures are getting quite popular for source code generation from various inputs. NMT-based code generation is useful in multiple domains such as code generation from input binary or assembly (decompilation), code-to-code translation, code repair, bug fixing, and many more. Some open problems: source code has long-range dependencies in multiple places, and the next-token prediction technique may lose this dependency information; methods that can break a problem down into smaller problems, generate code for such subprograms, and evaluate them are a promising research direction; sample efficiency; current code generation does not combine code abstractions into higher-level abstractions as humans do; execution-guided synthesis currently works with DSLs, but extending it to real-world source code generation is a research direction; the Retrieve-and-Edit framework.

May 28, 2023 · 1 min · 181 words · Sukai Huang

Jiannan_xiang Language Models Meet World Models 2023

[TOC] Title: Language Models Meet World Models: Embodied Experiences Enhance Language Models Author: Jiannan Xiang et al. Publish Year: 22 May 2023 Review Date: Fri, May 26, 2023 url: https://arxiv.org/pdf/2305.10626v2.pdf Summary of paper Motivation LLMs often struggle with simple reasoning and planning in physical environments; the limitation arises from the fact that LMs are trained only on written text and miss essential embodied knowledge and skills. Contribution we propose a new paradigm of enhancing LMs by finetuning them with world models, to gain diverse embodied knowledge while retaining their general language capabilities. Experiences in a virtual physical-world simulation environment are used to finetune LMs to teach diverse abilities of reasoning and acting in the physical world, e.g., planning and completing goals, object permanence and tracking, etc. To preserve the generalisation ability of LMs, we use elastic weight consolidation (EWC) for selective weight updates, combined with low-rank adapters (LoRA) for training efficiency. Some key terms ...
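
The EWC part is easy to state concretely: penalise movement of parameters that were important for the original language-modelling objective, weighted by the diagonal Fisher information. A PyTorch-flavoured sketch (mine, simplified; a real Fisher estimate would come from gradients on the pretraining data):

```python
import torch
import torch.nn as nn

def ewc_penalty(model, fisher, theta_star, lam=1.0):
    """Elastic weight consolidation penalty:
    (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    loss = torch.tensor(0.0)
    for name, p in model.named_parameters():
        if name in fisher:
            loss = loss + (fisher[name] * (p - theta_star[name]) ** 2).sum()
    return 0.5 * lam * loss

model = nn.Linear(4, 4)   # stand-in for the LM
theta_star = {n: p.detach().clone() for n, p in model.named_parameters()}
fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}  # stub
print(ewc_penalty(model, fisher, theta_star))  # tensor(0.) before any update
# during finetuning: total_loss = task_loss + ewc_penalty(...), with LoRA
# adapters carrying the trainable low-rank updates
```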

May 26, 2023 · 2 min · 357 words · Sukai Huang

Ryan_yang PG3 Policy Guided Planning for Generalised Policy Generation 2022

[TOC] Title: PG3 Policy Guided Planning for Generalised Policy Generation Author: Ryan Yang et al. Publish Year: 21 Apr 2022 Review Date: Wed, May 24, 2023 url: https://arxiv.org/pdf/2204.10420.pdf Summary of paper Motivation a longstanding objective in classical planning is to synthesise policies that generalise across multiple problems from the same domain. In this work, we study generalised policy search-based methods with a focus on the score function used to guide the search over policies. Contribution we study a specific instantiation of policy search where planning problems are PDDL-based and policies are lifted decision lists. Some key terms what is generalised planning and generalised policy search (GPS) ...
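
A hedged sketch of score-guided policy search (PG3's distinctive score function uses the candidate policy to guide planning on the training problems; the helpers here are generic stand-ins):

```python
def policy_search(initial_policy, neighbors, score, iterations=100):
    """Greedy generalised policy search (sketch): hill-climb over
    candidate policies, e.g. lifted decision lists, using a score
    computed on a set of training problems."""
    policy, best = initial_policy, score(initial_policy)
    for _ in range(iterations):
        improved = False
        for cand in neighbors(policy):   # e.g., add or modify a rule
            s = score(cand)              # PG3: plan with the candidate
            if s > best:                 # policy to score it
                policy, best = cand, s
                improved = True
                break
        if not improved:
            break                        # local optimum reached
    return policy

# toy stand-in: policies are integers, the score peaks at 7
print(policy_search(0, neighbors=lambda p: [p - 1, p + 1],
                    score=lambda p: -(p - 7) ** 2))   # -> 7
```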

May 24, 2023 · 2 min · 304 words · Sukai Huang

Shunyu_yao Tree of Thoughts 2023

[TOC] Title: Tree of Thoughts: Deliberate Problem Solving with LLMs Author: Shunyu Yao et al. Publish Year: 17 May 2023 Review Date: Wed, May 24, 2023 url: https://arxiv.org/pdf/2305.10601.pdf Summary of paper Motivation LM inference might benefit from augmentation by a more deliberate “System 2” planning process that (1) maintains and explores diverse alternatives for current choices instead of just picking one, and (2) evaluates its current status and actively looks ahead or backtracks to make more global decisions. The idea is to search through a combinatorial problem space, represented as a tree; we thus propose the Tree of Thoughts (ToT) framework for general problem solving with language models. Contribution limitation ...
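
The deliberate search is easy to sketch. Here is a breadth-first ToT loop with beam width b (a minimal sketch; in the paper, `propose` and `value` are both LLM calls, here replaced by toy stand-ins):

```python
def tot_bfs(root, propose, value, depth=3, beam=2):
    """Tree of Thoughts, BFS variant (sketch): expand each partial
    'thought', score candidates with a value function, keep the top-b."""
    frontier = [root]
    for _ in range(depth):
        candidates = [t for s in frontier for t in propose(s)]
        frontier = sorted(candidates, key=value, reverse=True)[:beam]
    return max(frontier, key=value)

# toy stand-ins: thoughts are strings, more 'a's score higher
propose = lambda s: [s + "a", s + "b"]
value = lambda s: s.count("a")
print(tot_bfs("", propose, value))   # 'aaa'
```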

May 24, 2023 · 1 min · 104 words · Sukai Huang