Code Generation

Jia Li Structured Cot Prompting for Code Generation 2023

[TOC] Title: Structured Chaint of Thought Prompting for Code Generation 2023 Author: Jia Li et. al. Publish Year: 7 Sep 2023 Review Date: Wed, Feb 28, 2024 url: https://arxiv.org/pdf/2305.06599.pdf Summary of paper Contribution The paper introduces Structured CoTs (SCoTs) and a novel prompting technique called SCoT prompting for improving code generation with Large Language Models (LLMs) like ChatGPT and Codex. Unlike the previous Chain-of-Thought (CoT) prompting, which focuses on natural language reasoning steps, SCoT prompting leverages the structural information inherent in source code. By incorporating program structures (sequence, branch, and loop structures) into intermediate reasoning steps (SCoTs), LLMs are guided to generate more structured and accurate code. Evaluation on three benchmarks demonstrates that SCoT prompting outperforms CoT prompting by up to 13.79% in Pass@1, is preferred by human developers in terms of program quality, and exhibits robustness to various examples, leading to substantial improvements in code generation performance. ...

Mark Chen Evaluating Large Language Models Trained on Code 2021

[TOC] Title: Evaluating Large Language Models Trained on Code Author: Mark Chen et. al. OPENAI Publish Year: 14 Jul 2021 Review Date: Mon, Oct 16, 2023 url: https://arxiv.org/pdf/2107.03374.pdf Summary of paper Motivation it is the research paper behind Github Copilot tech more recently, language models have also fueled progress towards the longstanding challenge of program synthesis. Contribution we find that repeated sampling from the model is a surprisingly effective strategy for producing working solutions to difficult prompts. limitation difficulty with docstrings describing long chain of operations and with binding operations to variables. Some key terms HumanEval ...

Baptiste Roziere Code Llama Open Foundation Model for Code 2023

[TOC] Title: Code Llama Open Foundation Model for Code Author: Baptiste Roziere et. al. META AI Publish Year: 2023 Review Date: Mon, Oct 16, 2023 url: https://scontent.fmel13-1.fna.fbcdn.net/v/t39.2365-6/369856151_1754812304950972_1159666448927483931_n.pdf?_nc_cat=107&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=Hcg6QsYJx1wAX_okEZO&_nc_ht=scontent.fmel13-1.fna&oh=00_AfAYtfHJfYeomAQWiMUTRo96iP8d4sZrlIfD_KAeYlYaDQ&oe=6531E8CF Summary of paper Motivation CODE Llama, support for large input contexts, and zero-shot instruction following ability for programming tasks. Contribution CODE llama reaches SOTA performance among open models on several code benchmarks Some key terms By training on domain-specific datasets, LLM have proved effective more broadly on applications that require advanced natural language understanding. ...

Dharma_kc Neural Machine Translation for Code Generation 2023

[TOC] Title: Neural Machine Translation for Code Generation Author: Dharma KC et. al. Publish Year: 22 May 2023 Review Date: Sun, May 28, 2023 url: https://arxiv.org/pdf/2305.13504.pdf Summary of paper Motivation Recently, NMT methods have been adapted to the generation of program code. In NMT for code generation, the task is to generate output source code that satisfies constraints expressed in the input. Conclusion NMT-based architecture are getting quite popular for source generation from various input. The NMT-based code generation is useful in multiple domains such as code generation from input binary or assembly (decompilation), code-to-code translation, code repair, bug fixing, and many more. some open problems source code has long dependencies in multiple places next-token prediction technique may lost the dependency information Methods that can break down a problem into small problems, generate code for such subprograms, and evaluate them are good potential research direction sample efficiency Current code generation does not combine code abstraction to higher-level abstractions as human do. Execution-guided synthesis currently works with DSLs, but extending them to real-world source code generation is a research direction. Retrieve-and-Edit framework