Researchers propose a finite-state machine framework for text generation, offering precise control and improved performance.
Author: Brandon T. Willard, Normal Computing; Rémi Louf, Normal Computing.

Table of Links
Abstract and Intro
LLM Sampling and Guided Generation
Iterative FSM Processing and Indexing
Extensions to Iterative Parsing
Discussion, References and Acknowledgments

Abstract

In this article we show how the problem of neural text generation can be constructively reformulated in terms of transitions between the states of a finite-state machine.
This framework leads to an efficient approach to guiding text generation with regular expressions and context-free grammars by allowing the construction of an index over a language model's vocabulary. The approach is model agnostic, allows one to enforce domain-specific knowledge and constraints, and enables the construction of reliable interfaces by guaranteeing the structure of the generated text. It adds little overhead to the token sequence generation process and significantly outperforms existing solutions. An implementation is provided in the open source Python library Outlines.

1. Introduction

We are concerned with the problem of generating sequences of tokens from a large language model (LLM) that conform to regular expressions or context-free grammars (CFGs). This kind of guided LLM generation is used to make LLM output usable under rigid formatting requirements that are either hard or costly to capture through fine-tuning alone. Such features have recently been generalized in prompting libraries and interfaces, but their applicability can be limited by their scaling costs.

Most implementations of guided generation bias the score values used to determine the probabilities of the tokens in an LLM's vocabulary. A common and sufficient approach involves repeated evaluations over the entire vocabulary in order to determine which tokens are valid, according to the constraints and previously sampled tokens, and setting the probabilities of invalid tokens to zero. This approach entails a fixed O(N) cost for each token generated, where N is the size of the LLM's vocabulary.

We propose an approach that uses the finite-state machine (FSM) formulation of regular expressions to both arbitrarily start and stop guided generation and allow the construction of an index with which the set of nonzero-probability tokens can be obtained efficiently at each step. The result is an algorithm that costs O(1) on average.
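To make the O(N) baseline concrete, here is a minimal hypothetical sketch (not the Outlines implementation) of the per-step vocabulary scan, using a toy hand-built DFA for the regular expression [0-9]+ and an invented toy vocabulary:

```python
import numpy as np

# Hand-built DFA for the regular expression [0-9]+:
# state 0 is the start state, state 1 is accepting, and any
# non-digit character has no transition (the walk dies).
DIGITS = "0123456789"
TRANSITIONS = {(s, c): 1 for s in (0, 1) for c in DIGITS}

def walk(state, token):
    """Advance the DFA over each character of a candidate token.
    Returns the resulting state, or None if the token is invalid."""
    for ch in token:
        nxt = TRANSITIONS.get((state, ch))
        if nxt is None:
            return None
        state = nxt
    return state

def naive_mask(logits, vocab, state):
    """O(N) per generation step: test every token in the vocabulary
    and set the logits of invalid continuations to -inf, so that
    sampling assigns them zero probability."""
    masked = logits.copy()
    for i, token in enumerate(vocab):
        if walk(state, token) is None:
            masked[i] = -np.inf
    return masked

vocab = ["1", "12", "a", "3.4", "42", "foo"]
logits = np.zeros(len(vocab))
print(naive_mask(logits, vocab, state=0))
# "a", "3.4", and "foo" are masked to -inf; the digit tokens survive
```

Note that the scan must run again after every sampled token, since the valid set depends on the DFA state reached so far; this repeated full-vocabulary walk is exactly the cost the indexing approach removes.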
For the regular expression case, our approach shares the most similarity with Kuchnik et al., which uses a transducer formulation to obtain FSMs defined over a language model's vocabulary; these FSMs contain much of the same information and scaling benefits as the indices described here. Our approach does not require the complete transducer abstraction and can be used to more easily extend existing, efficient regular expression libraries without modifying the underlying automatons and their implementations. More importantly, our indexing approach can also be extended to CFGs and LALR parsers to allow for efficient guided generation according to popular data formats and programming languages. The transition to parsing is made by way of augmentations to traditional LALR parser components and operations, making it, again, an approach that can be used to extend existing parser implementations.

This paper is available on arxiv under CC 4.0 license.
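The indexing idea can be sketched in the same toy setting: precompute, once per vocabulary-and-FSM pair, a map from each FSM state to the tokens that remain valid from it, so that each generation step becomes a single dictionary lookup (O(1) on average) rather than a vocabulary scan. This is a hypothetical minimal sketch under an assumed hand-built DFA for [0-9]+, not the Outlines implementation:

```python
# Hand-built DFA for [0-9]+: state 0 is the start, state 1 accepting.
DIGITS = "0123456789"
TRANSITIONS = {(s, c): 1 for s in (0, 1) for c in DIGITS}

def walk(state, token):
    """Advance the DFA over a token; None means the token is invalid."""
    for ch in token:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return None
    return state

def build_index(vocab, states=(0, 1)):
    """One-time O(N * |states|) preprocessing: for each FSM state,
    record which token ids are valid from it and which state the FSM
    reaches after consuming each such token."""
    index = {s: {} for s in states}
    for i, token in enumerate(vocab):
        for s in states:
            end_state = walk(s, token)
            if end_state is not None:
                index[s][i] = end_state
    return index

vocab = ["1", "12", "a", "3.4", "42", "foo"]
index = build_index(vocab)
# From state 0, only token ids 0, 1, and 4 ("1", "12", "42") are
# valid, and each leaves the FSM in the accepting state 1.
print(index[0])
```

At generation time, the sampler looks up `index[current_state]` to mask logits and, after sampling a token id, reads the stored end state to advance the FSM, so no per-step vocabulary walk is needed.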