0

0

Enhance Reasoning for Large Language Models in the Game Werewolf

    Published 4/1/2024 by Shuang Wu, Liwen Zhu, Tao Yang, Shiwei Xu, Qiang Fu, Yang Wei, Haobo Fu

    This paper presents a framework that integrates Large Language Models (LLMs) with an external Thinker module to enhance the reasoning capabilities of LLM-based agents. Unlike using prompt engineering to augment LLMs, the Thinker directly accesses knowledge from databases and employs optimization techniques.

    The framework forms a reasoning hierarchy where LLMs handle intuitive tasks like natural language processing, while the Thinker focuses on complex logical analysis and domain-specific knowledge tasks. The authors demonstrate the framework using a 9-player Werewolf game that requires dual-system reasoning.

    The paper introduces a communication protocol between LLMs and the Thinker, and trains the Thinker using data from 18800 human sessions and reinforcement learning. Experiments show the framework's effectiveness in deductive reasoning, speech generation, and online game evaluation. The authors also fine-tune a 6B LLM to surpass GPT4 when integrated with the Thinker.

    Additionally, the paper contributes the largest dataset for social deduction games to date.

    Full paper

    Loading...

    Loading PDF viewer...

    Read original: arXiv:2402.02330



    This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

    Total Score

    0

    Follow @aimodelsfyi on 𝕏 →

    Related Papers

    Explore the Reasoning Capability of LLMs in the Chess Testbed
    Total Score

    0

    Explore the Reasoning Capability of LLMs in the Chess Testbed

    Shu Wang, Lei Ji, Renxi Wang, Wenxiao Zhao, Haokun Liu, Yifan Hou, Ying Nian Wu

    Reasoning is a central capability of human intelligence. In recent years, with the advent of large-scale datasets, pretrained large language models have emerged with new capabilities, including reasoning. However, these models still struggle with long-term, complex reasoning tasks, such as playing chess. Based on the observation that expert chess players employ a dual approach combining long-term strategic play with short-term tactical play along with language explanation, we propose improving the reasoning capability of large language models in chess by integrating annotated strategy and tactic. Specifically, we collect a dataset named MATE, which consists of 1 million chess positions with candidate moves annotated by chess experts for strategy and tactics. We finetune the LLaMA-3-8B model and compare it against state-of-the-art commercial language models in the task of selecting better chess moves. Our experiments show that our models perform better than GPT, Claude, and Gemini models. We find that language explanations can enhance the reasoning capability of large language models.

    Read more

    11/12/2024

    🔍

    Total Score

    0

    An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration

    Yihao Li, Ru Zhang, Jianyi Liu

    While Large Language Models (LLMs) demonstrate exceptional performance in a multitude of Natural Language Processing (NLP) tasks, they encounter challenges in practical applications, including issues with hallucinations, inadequate knowledge updating, and limited transparency in the reasoning process. To overcome these limitations, this study innovatively proposes a collaborative training-free reasoning scheme involving tight cooperation between Knowledge Graph (KG) and LLMs. This scheme first involves using LLMs to iteratively explore KG, selectively retrieving a task-relevant knowledge subgraph to support reasoning. The LLMs are then guided to further combine inherent implicit knowledge to reason on the subgraph while explicitly elucidating the reasoning process. Through such a cooperative approach, our scheme achieves more reliable knowledge-based reasoning and facilitates the tracing of the reasoning results. Experimental results show that our scheme significantly progressed across multiple datasets, notably achieving over a 10% improvement on the QALD10 dataset compared to the best baseline and the fine-tuned state-of-the-art (SOTA) work. Building on this success, this study hopes to offer a valuable reference for future research in the fusion of KG and LLMs, thereby enhancing LLMs' proficiency in solving complex issues.

    Read more

    6/13/2024

    💬

    Total Score

    0

    Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf

    Yuzhuang Xu, Shuo Wang, Peng Li, Fuwen Luo, Xiaolong Wang, Weidong Liu, Yang Liu

    Communication games, which we refer to as incomplete information games that heavily depend on natural language communication, hold significant research value in fields such as economics, social science, and artificial intelligence. In this work, we explore the problem of how to engage large language models (LLMs) in communication games, and in response, propose a tuning-free framework. Our approach keeps LLMs frozen, and relies on the retrieval and reflection on past communications and experiences for improvement. An empirical study on the representative and widely-studied communication game, ``Werewolf'', demonstrates that our framework can effectively play Werewolf game without tuning the parameters of the LLMs. More importantly, strategic behaviors begin to emerge in our experiments, suggesting that it will be a fruitful journey to engage LLMs in communication games and associated domains.

    Read more

    5/14/2024

    Rational Metareasoning for Large Language Models
    Total Score

    0

    Rational Metareasoning for Large Language Models

    C. Nicol`o De Sabbata, Theodore R. Sumers, Thomas L. Griffiths

    Being prompted to engage in reasoning has emerged as a core technique for using large language models (LLMs), deploying additional inference-time compute to improve task performance. However, as LLMs increase in both size and adoption, inference costs are correspondingly becoming increasingly burdensome. How, then, might we optimize reasoning's cost-performance tradeoff? This work introduces a novel approach based on computational models of metareasoning used in cognitive science, training LLMs to selectively use intermediate reasoning steps only when necessary. We first develop a reward function that incorporates the Value of Computation by penalizing unnecessary reasoning, then use this reward function with Expert Iteration to train the LLM. Compared to few-shot chain-of-thought prompting and STaR, our method significantly reduces inference costs (20-37% fewer tokens generated across three models) while maintaining task performance across diverse datasets.

    Read more

    10/10/2024