[TOC]

  1. Title: NLtoPDDL: One-Shot Learning of PDDL Models from Natural Language Process Manuals
  2. Author: Shivam Miglani et al.
  3. Publish Year: 2020
  4. Review Date: Mar 2022

Summary of paper

Motivation

Pipeline

[Figure: NLtoPDDL pipeline overview]

Pipeline architecture

[Figure: pipeline architecture]

In Phase 1, a DQN learns to extract the words that represent action names, action arguments, and the sequence of actions present in annotated NL process manuals. (Why only action names? Do we need to extract other information?) Again, why is this called DQN RL? Is it just normal supervised learning… (Check the EASDRL paper to understand Phase 1.)

In Phase 2, we extract action sequences by feeding an unseen process manual to Phase 1’s trained DQN. From these extracted sequences (a sketch of what such traces look like follows the list below),

  1. the LOCM2 algorithm learns a partial PDDL model in one shot
  2. the model can then be completed with human input in an interactive fashion
  3. they name this interactive-LOCM (iLOCM)
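
To make the hand-off concrete, here is a minimal sketch (my own illustration, not the paper’s code; the trace contents and type names are assumptions) of the kind of input Phase 2 works with: each manual becomes a trace of grounded actions, i.e. action names plus their argument objects, which is the form a LOCM-family learner consumes.

```python
# Hypothetical illustration (names and structure are my assumptions,
# not the paper's code) of the Phase 1 -> Phase 2 hand-off: each
# process manual becomes a trace of grounded actions.
from typing import List, Tuple

GroundedAction = Tuple[str, List[str]]  # (action name, argument objects)

# One extracted trace from a cooking-style manual (illustrative only).
trace: List[GroundedAction] = [
    ("pour", ["milk", "saucepan"]),
    ("heat", ["saucepan"]),
    ("add",  ["cocoa", "saucepan"]),
    ("stir", ["saucepan"]),
]

# A LOCM-family learner only needs the action names and the argument
# positions in which each object occurs across the trace.
for name, args in trace:
    print(f"({name} {' '.join(args)})")
```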

EASDRL: how it works

It casts action-sequence extraction as an RL problem and uses two DQNs. Each DQN can perform only two RL actions on a word: select it or reject it.

The first DQN selects only action words, while the second DQN selects argument words conditioned on the already-selected action words.
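
Here is a conceptual sketch of that select/reject loop under a greedy inference-time policy; the `q_values` stub is an assumption standing in for the trained deep Q-network over word embeddings and the selection history.

```python
# Conceptual sketch (not the authors' implementation) of the EASDRL
# decision process: walk through the text word by word and let a
# Q-function pick one of two RL actions, SELECT or REJECT. In EASDRL
# the Q-function is a trained deep network over word embeddings and
# the history of already-selected words; here it is a random stub.
import random

SELECT, REJECT = 0, 1

def q_values(word, selected_so_far):
    """Stand-in for the first DQN (action words): [Q(s, SELECT), Q(s, REJECT)]."""
    return [random.random(), random.random()]  # assumption: dummy scores

def extract_action_words(words):
    selected = []
    for word in words:
        q = q_values(word, selected)
        if q[SELECT] > q[REJECT]:        # greedy policy at inference time
            selected.append(word)
    return selected

text = "pour the cold milk into the saucepan and heat it gently".split()
print(extract_action_words(text))
# The second DQN would run the same select/reject loop over the
# remaining words, conditioned on each selected action word, to pick
# that action's argument words.
```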

Some pre-processing methods

The authors replace pronouns in the unseen process manual with the nouns they refer to, reducing ambiguity via a neural coreference-resolution technique from the Hugging Face neuralcoref library (https://github.com/huggingface/neuralcoref).
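
A minimal usage example of neuralcoref with its standard spaCy-2.x API (the example sentence is mine, not from the paper’s corpora):

```python
# Standard neuralcoref usage (requires spaCy 2.x); the example sentence
# is my own, not from the paper's datasets.
import spacy
import neuralcoref

nlp = spacy.load("en_core_web_sm")
neuralcoref.add_to_pipe(nlp)          # adds coreference resolution to the pipeline

doc = nlp("Pour the milk into the saucepan. Heat it until it simmers.")
print(doc._.coref_resolved)           # pronouns replaced by their cluster's main mention
```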

Some key terms

LOCM2: what it is

This method clusters similar-looking actions and arguments under one template, resulting in higher-quality PDDL models with precise action sets.
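
To give a flavour of how a model can be induced from traces alone, here is a rough sketch of the first LOCM-style step as I understand the LOCM family (not the paper’s implementation): for every object, record the sequence of (action, argument-position) transitions it participates in; consecutive transitions constrain the object’s state machine, and LOCM2 additionally allows several state machines per object sort when a single one cannot explain the observed transitions.

```python
# Rough sketch of the first LOCM-style step (my reading of the LOCM
# family, not the paper's implementation): record, per object, the
# sequence of (action, argument-position) transitions it goes through.
from collections import defaultdict

trace = [
    ("pour", ["milk", "saucepan"]),
    ("heat", ["saucepan"]),
    ("stir", ["saucepan"]),
]

transitions = defaultdict(list)       # object -> [(action, arg position), ...]
for action, args in trace:
    for pos, obj in enumerate(args):
        transitions[obj].append((action, pos))

# Consecutive transitions of the same object must chain through a common
# object state; LOCM merges these constraints into a state machine per
# sort, and LOCM2 allows several machines per sort when one is not enough.
for obj, ts in transitions.items():
    print(obj, list(zip(ts, ts[1:])))
# saucepan [(('pour', 1), ('heat', 0)), (('heat', 0), ('stir', 0))]
```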

Some limitations of the proposed model

Since the DQNs were trained to extract single words, the learned model can extract multiple separate words for what is really one argument, and it also misses implicit arguments.

Also, extracting multi-word phrases as single arguments, such as adjective-noun pairs (“cold milk”) or compound nouns (“training personnel”), would be better (word association could help here!).
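
One possible remedy (my suggestion, not something the paper does) is to expand each extracted head word to the noun chunk that contains it, e.g. with spaCy; the `selected_heads` set below is an assumed output of the argument DQN.

```python
# One possible remedy (my suggestion, not the paper's method): expand a
# selected head word to the spaCy noun chunk containing it, so that
# adjectives and compound nouns stay attached to the argument.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Pour the cold milk into the saucepan and inform the training personnel.")

selected_heads = {"milk", "personnel"}   # assumed single-word output of the argument DQN

for chunk in doc.noun_chunks:
    if chunk.root.text in selected_heads:
        print(chunk.root.text, "->", chunk.text)   # e.g. milk -> the cold milk
```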

Minor comments

The Flair library for word embeddings might be useful:

https://github.com/flairNLP/flair
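
A minimal example of getting word embeddings with Flair (the choice of GloVe embeddings is just one option):

```python
# Minimal Flair usage for word embeddings; the GloVe backend is just one choice.
from flair.data import Sentence
from flair.embeddings import WordEmbeddings

embedding = WordEmbeddings("glove")
sentence = Sentence("Pour the cold milk into the saucepan")
embedding.embed(sentence)             # attaches a vector to every token

for token in sentence:
    print(token.text, token.embedding.shape)
```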

Annotated dataset

For training the DRL models, the pre-annotated datasets are taken from Feng, Zhuo, and Kambhampati’s repository for three real-world domains:

  1. Microsoft Windows Help and Support (WinHelp) documents
  2. CookingTutorial (Cooking)
  3. WikiHow Home and Garden (WikiHG)

Some examples for Phase 1 action extraction

[Figure: examples of Phase 1 action extraction]

Learned state machine via iLOCM

[Figures: state machines learned via iLOCM]

Potential future work

I see that there is no special method to find multi-word arguments or their adjectives.

Let’s see if we can use some word-association technique to do that, so as to improve the argument-extraction score.