Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models

    Published 11/1/2024 by Nunzio Lore, Sepehr Ilami, Babak Heydari

    Overview

    • This paper explores how large language models (LLMs) can develop and transfer "theory of mind" capabilities, which allow them to reason about the mental states of other agents.
    • The researchers propose a method for training LLMs to develop theory of mind skills and demonstrate their effectiveness in multi-agent collaboration tasks.
    • The key findings suggest that LLMs can achieve adult human-level performance on theory of mind tasks, representing a significant advancement in AI capabilities.

    Methods pair games and scenarios for model comparisons.

    Original caption: Figure 1: Overview of the methods employed in this paper. We pair all games and scenarios to generate 20 unique combinations, which form the backbone of our dataset. We then submit each combination to each model, and obtain 300 observations per combination. For LLaMa2-70b, we ask for an answer and a motivation; we ask the other models only for their answers. We use the answers coming from LLaMa2-70b to perform LORA on a small, pre-trained LLaMa2-7b. The fine-tuned model is then again queried like the pre-trained model, and once that is done, we collect all data and measure the impact of fine-tuning on preferences.
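
    The pairing step in the caption can be sketched in a few lines. This is only an illustration, not the authors' code: the scenario and game labels are taken from the row names of Table 1, while the function and variable names are hypothetical.

```python
from itertools import product

# Scenario and game labels as they appear in the row names of Table 1.
SCENARIOS = ["team", "IR", "friendsharing", "biz", "environment"]
GAMES = ["prison", "delight", "staghunt", "snowdrift"]
OBSERVATIONS_PER_COMBINATION = 300  # stated in the Figure 1 caption


def build_combinations(scenarios, games):
    """Return one 'scenario_game' label per scenario-game pair."""
    return [f"{s}_{g}" for s, g in product(scenarios, games)]


combinations = build_combinations(SCENARIOS, GAMES)
total_queries = len(combinations) * OBSERVATIONS_PER_COMBINATION
print(len(combinations), total_queries)  # prints: 20 6000
```

    Five scenarios times four games gives the 20 combinations mentioned in the caption; at 300 observations each, that is 6,000 responses per model.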

    Proportion of cooperative choices in within- and out-of-sample games, comparing LLaMa2-7b fine-tuned model performance. Differences tested with z-scores.

    Scenario                 Within-sample C  Out-of-sample C  Difference  Std. error  z-score  p-value
    team_prison              0.75             0.71             -0.04       0.02        -1.6     0.05
    team_delight             0.74             0.76             0.01        0.03        -0.53    0.30
    team_staghunt            0.74             0.71             -0.03       0.03        -1.18    0.12
    team_snowdrift           0.70             0.72             0.02        0.03        -0.88    0.19
    IR_prison                0.74             0.71             -0.03       0.03        -1.05    0.15
    IR_delight               0.75             0.70             -0.05       0.02        -2.0     0.02
    IR_staghunt              0.72             0.69             -0.03       0.03        -1.29    0.10
    IR_snowdrift             0.71             0.74             0.02        0.03        -0.89    0.19
    friendsharing_prison     0.75             0.70             -0.05       0.02        -2.0     0.02
    friendsharing_delight    0.79             0.74             -0.05       0.02        -2.11    0.02
    friendsharing_staghunt   0.71             0.77             0.06        0.03        2.17     0.01
    friendsharing_snowdrift  0.76             0.76             0.00        0.02        -0.14    0.45
    biz_prison               0.74             0.70             -0.04       0.03        -1.72    0.04
    biz_delight              0.74             0.76             0.03        0.03        1.05     0.15
    biz_staghunt             0.71             0.63             -0.08       0.03        -3.06    0.00
    biz_snowdrift            0.74             0.76             0.01        0.03        -0.53    0.30
    environment_prison       0.63             0.66             0.03        0.03        -0.96    0.17
    environment_delight      0.69             0.72             0.03        0.03        -1.0     0.16
    environment_staghunt     0.65             0.64             -0.02       0.03        -0.61    0.27
    environment_snowdrift    0.62             0.67             0.05        0.03        -1.67    0.05
    AVERAGE                  0.72             0.71             -0.01       0.03        -0.31    0.38
    MEDIAN                   0.74             0.71             -0.01       0.03        -1.05    0.15

    Original caption: Table 1: Difference in proportion z-score testing for propensity to cooperate in within-sample and out-of-sample games for the LLaMa2-7b fine-tuned model. For each scenario, we report: the proportion of cooperative choices in the within-sample game, the proportion of cooperative choices in the out-of-sample game, the difference in proportions, standard error, z-score, and associated p-value. Reported significance levels follow standard practices: one asterisk (*) for significance at the 0.05 level, two asterisks (**) for significance at the 0.01 level, and three asterisks (***) for significance at the 0.001 level.
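
    For readers unfamiliar with the test named in the caption, here is a minimal sketch of a pooled two-proportion z-test. It is not the authors' code. The pooled standard error and the one-sided p-value are assumptions (the reported p-values look consistent with a one-sided test, but the page does not say), and the sample size of 300 per game is also assumed, so the output is not expected to reproduce the table's figures exactly.

```python
import math


def two_proportion_z_test(p_within, p_out, n_within, n_out):
    """Pooled two-proportion z-test for H0: both proportions are equal.

    Returns (difference, standard_error, z, one_sided_p), where the
    difference is p_out - p_within and the p-value is the tail
    probability beyond |z| under a standard normal distribution.
    """
    pooled = (p_within * n_within + p_out * n_out) / (n_within + n_out)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_within + 1 / n_out))
    if se == 0:
        raise ValueError("standard error is zero; proportions are all 0 or all 1")
    diff = p_out - p_within
    z = diff / se
    one_sided_p = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return diff, se, z, one_sided_p


# Hypothetical usage: proportions from the biz_staghunt row, with an
# ASSUMED 300 observations per game (the table does not state n).
diff, se, z, p = two_proportion_z_test(0.71, 0.63, 300, 300)
print(f"diff={diff:.2f} se={se:.3f} z={z:.2f} p={p:.3f}")
```

    A negative z here means the out-of-sample game shows less cooperation than the within-sample game; a small p-value suggests the gap is unlikely to be sampling noise alone.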

    Plain English Explanation

    The paper describes how large language models (LLMs), AI models trained on vast amounts of text, can develop the ability to understand and reason about the mental states of other agents. This capability, called "theory of mind," is crucial for effective collaboration and communication between AI systems and humans.

    The researchers developed a technique to train LLMs to acquire theory of mind skills, and then tested their performance on various tasks that require this capability. The results showed that the trained LLMs were able to achieve adult human-level performance on these tasks, demonstrating a significant advance in the field of AI.

    The ability of LLMs to represent the beliefs of themselves and others is a crucial step towards developing AI systems that can understand and collaborate effectively with humans in complex, real-world scenarios. This research paves the way for more comprehensive benchmarks to evaluate and further improve the theory of mind capabilities of AI systems.

    Technical Explanation

    The researchers developed a method for training large language models (LLMs) to acquire theory of mind capabilities, which allow them to reason about the mental states of other agents. As outlined in Figure 1, this involved pairing games and scenarios into 20 unique combinations, collecting 300 observations per combination from each model, and using the answers from LLaMa2-70b to fine-tune a smaller, pre-trained LLaMa2-7b with LoRA.

    The trained LLMs were then evaluated on a range of theory of mind tasks, including the standard "Sally-Anne" test, which assesses the ability to understand that another person may have a different belief than one's own. The results showed that the LLMs were able to achieve adult human-level performance on these tasks, demonstrating a significant advancement in AI capabilities.

    The researchers also explored the transferability of theory of mind skills, showing that LLMs trained on the theory of mind dataset were able to apply their skills to improve their performance on multi-agent collaboration tasks, where understanding the mental states of other agents is crucial for effective coordination and decision-making.

    Critical Analysis

    The paper presents a promising approach for developing theory of mind capabilities in large language models, which is an important step towards creating AI systems that can engage in more natural and effective communication and collaboration with humans. However, the research also highlights some limitations and areas for further exploration.

    One potential concern is the reliance on a relatively small training dataset, which may not capture the complexity and nuance of real-world social interactions. In addition, the evaluation tasks, while well established in the literature, may not reflect the demands of open-ended, real-world scenarios that require theory of mind reasoning.

    Further research is needed to explore how well the trained LLMs' theory of mind skills generalize, and whether the models can maintain and update their understanding of mental states in dynamic, multi-agent environments. The potential for bias and the ethical considerations involved in developing and deploying such systems also warrant careful examination.

    Conclusion

    This paper presents a significant advancement in the field of AI, demonstrating that large language models can be trained to develop and transfer theory of mind capabilities. This represents an important step towards the creation of AI systems that can engage in more natural and effective communication and collaboration with humans.

    The findings suggest that LLMs have the potential to achieve adult human-level performance on a range of theory of mind tasks, which could have far-reaching implications for the development of more intelligent and socially-aware AI systems. However, further research is needed to address the limitations of the current approach and to explore the broader implications and potential risks of this technology.

    Read original: arXiv:2408.05241



    This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

    Related Papers

    Probing the Robustness of Theory of Mind in Large Language Models

    Christian Nickel, Laura Schrewe, Lucie Flek

    With the success of ChatGPT and other similarly sized SotA LLMs, claims of emergent human-like social reasoning capabilities, especially Theory of Mind (ToM), in these models have appeared in the scientific literature. On the one hand, those ToM capabilities have been successfully tested using tasks styled similarly to those used in psychology (Kosinski, 2023). On the other hand, follow-up studies showed that those capabilities vanished when the tasks were slightly altered (Ullman, 2023). In this work we introduce a novel dataset of 68 tasks for probing ToM in LLMs, including potentially challenging variations which are assigned to 10 complexity classes, providing novel insights into the challenges LLMs face with those task variations. We evaluate the ToM performance of four SotA open source LLMs on our dataset and the dataset introduced by Kosinski (2023). The overall low goal accuracy across all evaluated models indicates only a limited degree of ToM capabilities. The LLMs' performance on simple complexity class tasks from both datasets is similar, whereas we find a consistent tendency in all tested LLMs to perform poorly on tasks that require the realization that an agent has knowledge of automatic state changes in its environment, even when those are spelled out to the model. For task complications that change the relationship between objects by replacing prepositions, we notice a performance drop in all models, with the strongest impact on the mixture-of-experts model. With our dataset of tasks grouped by complexity we offer directions for further research on how to stabilize and advance ToM capabilities in LLMs.

    10/10/2024

    Theory of Mind for Multi-Agent Collaboration via Large Language Models

    Huao Li, Yu Quan Chong, Simon Stepputtis, Joseph Campbell, Dana Hughes, Michael Lewis, Katia Sycara

    While Large Language Models (LLMs) have demonstrated impressive accomplishments in both reasoning and planning, their abilities in multi-agent collaborations remain largely unexplored. This study evaluates LLM-based agents in a multi-agent cooperative text game with Theory of Mind (ToM) inference tasks, comparing their performance with Multi-Agent Reinforcement Learning (MARL) and planning-based baselines. We observed evidence of emergent collaborative behaviors and high-order Theory of Mind capabilities among LLM-based agents. Our results reveal limitations in LLM-based agents' planning optimization due to systematic failures in managing long-horizon contexts and hallucination about the task state. We explore the use of explicit belief state representations to mitigate these issues, finding that it enhances task performance and the accuracy of ToM inferences for LLM-based agents.

    6/28/2024

    LLMs achieve adult human performance on higher-order theory of mind tasks

    Winnie Street, John Oliver Siy, Geoff Keeling, Adrien Baranes, Benjamin Barnett, Michael McKibben, Tatenda Kanyere, Alison Lentz, Blaise Aguera y Arcas, Robin I. M. Dunbar

    This paper examines the extent to which large language models (LLMs) have developed higher-order theory of mind (ToM); the human ability to reason about multiple mental and emotional states in a recursive manner (e.g. I think that you believe that she knows). This paper builds on prior work by introducing a handwritten test suite -- Multi-Order Theory of Mind Q&A -- and using it to compare the performance of five LLMs to a newly gathered adult human benchmark. We find that GPT-4 and Flan-PaLM reach adult-level and near adult-level performance on ToM tasks overall, and that GPT-4 exceeds adult performance on 6th order inferences. Our results suggest that there is an interplay between model size and finetuning for the realisation of ToM abilities, and that the best-performing LLMs have developed a generalised capacity for ToM. Given the role that higher-order ToM plays in a wide range of cooperative and competitive human behaviours, these findings have significant implications for user-facing LLM applications.

    6/3/2024

    Language Models Represent Beliefs of Self and Others

    Wentao Zhu, Zhining Zhang, Yizhou Wang

    Understanding and attributing mental states, known as Theory of Mind (ToM), emerges as a fundamental capability for human social reasoning. While Large Language Models (LLMs) appear to possess certain ToM abilities, the mechanisms underlying these capabilities remain elusive. In this study, we discover that it is possible to linearly decode the belief status from the perspectives of various agents through neural activations of language models, indicating the existence of internal representations of self and others' beliefs. By manipulating these representations, we observe dramatic changes in the models' ToM performance, underscoring their pivotal role in the social reasoning process. Additionally, our findings extend to diverse social reasoning tasks that involve different causal inference patterns, suggesting the potential generalizability of these representations.

    5/31/2024