Sudhir Agarwal Translate Infer Compile for Accurate Text to Plan 2024

[TOC]

Title: TIC: Translate-Infer-Compile for accurate “text to plan” using LLMs and logical intermediate representations
Author: Sudhir Agarwal et al.
Publish Year: Jan 2024
Review Date: Sat, Feb 17, 2024
url: https://arxiv.org/pdf/2402.06608.pdf

Summary of paper

Motivation

Using an LLM to generate the task PDDL from a natural language planning task description is challenging. One of the primary reasons for failure is that the LLM often makes errors when generating information that must abide by the constraints specified in the domain knowledge or the task description.
- Crucial domain knowledge is often only implicit in the descriptions, which makes it difficult for the LLM to generate the task PDDL accurately.
- In this paper, the specific issue is that LLMs do not handle “fact enumeration and object enumeration” well. For example, the description “3 dispensers for 3 ingredients” is more directly represented as the shortcut map(dispenser, dispenses, ingredient) than as an explicit enumeration of all the facts.

Contribution

The approach described focuses on bridging the gap between natural language understanding and classical planning. It combines the strengths of large language models (LLMs) for natural language processing and classical planning tools for task planning. Unlike previous methods that directly use LLMs for generating Planning Domain Definition Language (PDDL) representations, this approach involves three steps:

- Translate: LLMs are used to generate a logically interpretable intermediate representation of natural language task descriptions.
- Infer: Additional logically dependent information is derived from the intermediate representation using a logic reasoner, such as an Answer Set Programming solver.
- Compile: The target task PDDL is generated from the base and inferred information.

By only using LLMs to output the intermediate representation, errors are significantly reduced. This approach, known as the TIC approach, achieves high accuracy in generating task PDDLs for all evaluated domains, leveraging the strengths of both LLMs and classical planning tools.
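The three stages can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the authors' implementation: `translate` is a hand-written stand-in for the LLM call, `infer` uses plain Python where the paper uses an ASP solver, and the dictionary layout, the function names, the object naming scheme (`dispenser1`, ...) and the goal fact are all hypothetical.

```python
def translate(task_text):
    """Stand-in for the LLM step: returns a hand-written intermediate representation."""
    # Hypothetical IR layout: object counts, high-level maps, and the goal.
    return {
        "counts": {"dispenser": 3, "ingredient": 3, "shot": 1},
        "maps": [("dispenser", "dispenses", "ingredient")],
        "goal": ["(contains shot1 ingredient1)"],  # illustrative goal only
    }


def infer(ir):
    """Stand-in for the reasoner: materialise objects and facts from the IR."""
    objects = {
        type_name: [f"{type_name}{i}" for i in range(1, n + 1)]
        for type_name, n in ir["counts"].items()
    }
    facts = []
    for src_type, relation, dst_type in ir["maps"]:
        # Pair the i-th source object with the i-th target object.
        for src, dst in zip(objects[src_type], objects[dst_type]):
            facts.append(f"({relation} {src} {dst})")
    return objects, facts


def compile_pddl(name, domain, objects, facts, goal):
    """Render the materialised information as a PDDL problem string."""
    lines = [
        f"(define (problem {name})",
        f"  (:domain {domain})",
        "  (:objects",
        *[f"    {' '.join(names)} - {t}" for t, names in objects.items()],
        "  )",
        "  (:init",
        *[f"    {fact}" for fact in facts],
        "  )",
        f"  (:goal (and {' '.join(goal)}))",
        ")",
    ]
    return "\n".join(lines)


ir = translate("3 dispensers for 3 ingredients.")
objects, facts = infer(ir)
pddl = compile_pddl("barman-example", "barman", objects, facts, ir["goal"])
print(pddl)
```

The point of the split is that the error-prone enumeration happens in `infer`, which is deterministic, while the LLM only has to produce the compact representation returned by `translate`.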

Some key terms

Intermediate Representation

The core idea of the TIC approach is the introduction of an intermediate representation.
Unlike the in-context examples used by direct PDDL generation approaches, an intermediate representation does not contain information that is required in the task PDDL but absent from the task description.
- This means that the LLM does not need to derive dispenser 1, dispenser 2, dispenser 3 from the phrase “3 dispensers”.
Some of this information is captured by higher-level rules, so the LLM should not explicitly enumerate all the facts.
- For example, a barman task description may not contain information corresponding to the PDDL facts (next l0 l1), (next l1 l2), because the knowledge that the shaker levels are ordered is domain knowledge common to all barman queries.
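As a minimal sketch of how such ordering facts can be derived rather than written out by the LLM, the snippet below generates the `next` facts from an ordered list of levels. The paper does this with ASP rules; plain Python is used here as a stand-in, and the function name and the level names `l0`, `l1`, `l2` are assumptions made for illustration.

```python
def infer_next_facts(levels):
    """Derive one (next a b) fact for each pair of consecutive levels."""
    return [f"(next {a} {b})" for a, b in zip(levels, levels[1:])]


# Assumed level names; the ordering itself is domain knowledge, not task text.
levels = ["l0", "l1", "l2"]
next_facts = infer_next_facts(levels)
print(next_facts)
```

Because the rule is the same for every barman task, it is written once by hand and never has to appear in the LLM's output.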

comments from sukai

PDDL snippets are sometimes very concrete (e.g., you need to state every fact (next n n+1)), and we should prevent the LLM from doing this because LLMs are not good at stating low-level details.

Results

Summary

The core idea of the paper is that having LLMs output PDDL directly is challenging because they handle very concrete information poorly, such as enumerating every fact for every object, which leads to error-prone translations. Instead, the paper has LLMs translate to an intermediate semi-formal representation that can express higher-level rules. This removes the need for the LLM to explicitly enumerate all objects and facts: the LLM outputs more abstract representations (high-level declarations and high-level rules), and the authors then use manually written ASP rules and an ASP solver to infer the materialised representation. ASP stands for answer set programming.

Potential future work

Could we replace the whole ASP stage by asking LLMs to write Python code that handles the “object and fact enumeration”? In that case, we would not need to manually write the ASP rules described in Section 3.3. Translating to both PDDL and Python code… well, that leads to the hybrid modelling topic.
