RestGPT: Connecting Large Language Models with Real-World RESTful APIs

Yifan Song1, Weimin Xiong1, Dawei Zhu1, Wenhao Wu1, Han Qian1, Mingbo Song1, Hailiang Huang1, Cheng Li2, Ke Wang2, Rong Yao2, Ye Tian2, Sujian Li1
1Peking University, 2Huawei Technologies

The code is available at https://github.com/Yifan-Song793/RestGPT.

Abstract

This work aims to construct RestGPT, an autonomous agent based on a large language model (LLM), to control real-world applications such as a movie database and a music player. To achieve this, we connect LLMs with RESTful APIs and tackle the practical challenges of planning, API calling, and response parsing. To fully evaluate the performance of RestGPT, we propose RestBench, a high-quality test set that consists of two real-world scenarios and human-annotated instructions with gold solution paths.

RESTful APIs

RESTful APIs have become a popular way to expose the functionality and data of web services to client applications. Millions of RESTful APIs are available on the Internet, from services such as Spotify, Twitter, and Gmail. RESTful APIs usually follow the OpenAPI Specification (OAS), which describes the operations, parameters, and response schemas of each API endpoint.
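
To make the role of OAS concrete, the following is a minimal sketch of how the operations, parameters, and response codes of an endpoint can be read from an OpenAPI document. The movie service, its `/movies/search` path, and the helper `list_operations` are hypothetical and chosen only for illustration; the top-level keys (`paths`, `parameters`, `responses`) follow the OpenAPI 3.0 layout.

```python
# A hypothetical, heavily trimmed OpenAPI 3.0 document for an imaginary movie service.
SPEC = {
    "openapi": "3.0.0",
    "info": {"title": "Example Movie API", "version": "1.0"},
    "paths": {
        "/movies/search": {
            "get": {
                "summary": "Search for movies by title",
                "parameters": [
                    {"name": "query", "in": "query", "required": True,
                     "schema": {"type": "string"}},
                ],
                "responses": {
                    "200": {
                        "description": "Matching movies",
                        "content": {"application/json": {"schema": {
                            "type": "object",
                            "properties": {
                                "results": {"type": "array", "items": {
                                    "type": "object",
                                    "properties": {
                                        "id": {"type": "integer"},
                                        "title": {"type": "string"},
                                    },
                                }},
                            },
                        }}},
                    }
                },
            }
        }
    },
}

# Path items may also hold non-operation keys (e.g. "parameters"), so filter by method.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(spec):
    """Return one summary dict per (method, path) operation found in the spec."""
    ops = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue
            ops.append({
                "endpoint": f"{method.upper()} {path}",
                "summary": op.get("summary", ""),
                "parameters": [p["name"] for p in op.get("parameters", [])],
                "response_codes": sorted(op.get("responses", {})),
            })
    return ops
```

A compact listing like the one returned by `list_operations(SPEC)` is the kind of endpoint summary that could be placed in an LLM prompt, while the full parameter and response schemas stay available for the later calling and parsing steps.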

RestGPT

RestGPT consists of three main components: a Planner, an API Selector, and an Executor. The core of each component is a prompted LLM. Unlike previous work that generates static plans, which cannot adapt to environment feedback, RestGPT employs a coarse-to-fine online planning framework. The planning is coarse-to-fine because the Planner decomposes the user instruction into natural-language sub-tasks, which the API Selector then maps to API calls. It is online because the Planner plans each subsequent sub-task based on the Executor's response. To execute RESTful API calls, we further divide the Executor into two modules: a Caller and a response Parser. The Caller reads the complete API documentation to organize the API call parameters, while the Parser generates Python code that parses responses based on the response schema defined in the OAS.
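
The control flow described above can be sketched as a simple loop. This is an illustrative outline under stated assumptions, not the released implementation: every name below (`Step`, `run_agent`, the `toy_*` functions, the endpoint strings, and the canned responses) is hypothetical, and each component is a plain Python function standing in for a prompted LLM. In particular, the toy parser extracts values directly instead of generating parsing code from the response schema.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    subtask: str   # natural-language sub-task produced by the planner
    api_plan: str  # API call chosen by the API selector
    result: str    # parsed execution result fed back to the planner


def run_agent(
    instruction: str,
    planner: Callable[[str, List[Step]], Optional[str]],
    selector: Callable[[str, List[Step]], str],
    caller: Callable[[str], dict],
    parser: Callable[[str, dict], str],
    max_steps: int = 10,
) -> List[Step]:
    """Online loop: plan one sub-task, map it to an API call, execute, then re-plan."""
    history: List[Step] = []
    for _ in range(max_steps):
        subtask = planner(instruction, history)  # coarse: next sub-task, or None when done
        if subtask is None:
            break
        api_plan = selector(subtask, history)    # fine: concrete API call for the sub-task
        raw = caller(api_plan)                   # executor, part 1: issue the call
        result = parser(subtask, raw)            # executor, part 2: extract what is needed
        history.append(Step(subtask, api_plan, result))
    return history


# Toy stand-ins for the LLM-backed components (assumptions for illustration only).
def toy_planner(instruction, history):
    if not history:
        return "search for the movie named in the instruction"
    if len(history) == 1:
        return f"get the director of movie {history[-1].result}"
    return None  # nothing left to plan


def toy_selector(subtask, history):
    if subtask.startswith("search"):
        return "GET /movies/search"
    return "GET /movies/{id}/credits"


def toy_caller(api_plan):
    canned = {
        "GET /movies/search": {"results": [{"id": 42, "title": "Example Film"}]},
        "GET /movies/{id}/credits": {"crew": [{"job": "Director", "name": "Jane Doe"}]},
    }
    return canned[api_plan]


def toy_parser(subtask, response):
    if "results" in response:
        return str(response["results"][0]["id"])
    return next(c["name"] for c in response["crew"] if c["job"] == "Director")
```

Calling `run_agent("Who directed Example Film?", toy_planner, toy_selector, toy_caller, toy_parser)` runs two planning rounds. The second sub-task is built from the first step's parsed result, which is the feedback path that a static, up-front plan would lack.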

BibTeX

@misc{song2023restgpt,
      title={RestGPT: Connecting Large Language Models with Real-World RESTful APIs}, 
      author={Yifan Song and Weimin Xiong and Dawei Zhu and Wenhao Wu and Han Qian and Mingbo Song and Hailiang Huang and Cheng Li and Ke Wang and Rong Yao and Ye Tian and Sujian Li},
      year={2023},
      eprint={2306.06624},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}