Original Article
Individual differences in learning behaviours in humans: Asocial exploration tendency does not predict reliance on social learning

https://doi.org/10.1016/j.evolhumbehav.2016.11.001

Abstract

A number of empirical studies have suggested that individual differences in asocial exploration tendencies in animals may be related to those in social information use. However, because the ‘exploration tendency’ in most previous studies has been measured without considering information-gathering processes, it remains hard to conclude that animal asocial exploration strategies are tied to social information use. Here, we studied human learning behaviour in both asocial and social two-armed bandit tasks. By fitting reinforcement learning models including asocial and/or social decision processes, we measured each individual's (1) asocial exploration tendency and (2) social information use. We found consistent individual differences in exploration tendency in the asocial tasks. We also found substantive heterogeneity in the learning strategies adopted in the social task: nearly one-third of participants predominantly used the copy-when-uncertain strategy, while the remaining two-thirds were most likely to have relied only on asocial learning. However, we found no significant individual-level association between exploration frequency in the asocial task and the use of social information in the social task. Our results suggest that social learning strategies may be independent of asocial exploration strategies in humans.

Introduction

To find better behavioural options in foraging, mate choice, nest search, etc., group-living animals can benefit from asocial information-gathering strategies (e.g., reinforcement learning rules (Sutton and Barto, 1998, Trimmer et al., 2012)) and from the strategic use of social information (Boyd and Richerson, 1985, Laland, 2004). Although there has been much recent interest in inter-individual variation in both asocial and social learning behaviour (Mesoudi et al., 2016, Reader, 2015), little is known about whether, and if so how, the two are associated.

Individual differences in asocial exploration tendency might be related to different individual optimums in the exploration-exploitation trade-off. Given the limited time/energy budget, a single animal must strike the right balance between trying unfamiliar behaviours to sample information (i.e., ‘exploration’) versus choosing known best behaviour (i.e., ‘exploitation’) so as to improve the long-term net decision performance (Cohen et al., 2007, Hills et al., 2014). The optimal balance of exploration-exploitation depends on the costs and benefits of information gathering, which may differ between individuals. For example, an individual with poor information processing performance may have lower benefits of exploration, an individual with shorter expected life-span may benefit less from sampling more information, while an individual experiencing a temporary volatile environment may be forced to explore so as to update its knowledge (Reader, 2015).

On the other hand, the individual variation in reliance on social information might come from the balance of cost and benefit of copying others (Mesoudi et al., 2016). For instance, an individual possessing inaccurate private information will potentially incur a large cost if relying solely on the private knowledge and hence may tend to copy others more (e.g., ‘copy-when-uncertain’ (Laland, 2004, Rendell et al., 2011)), an individual living in a large group may benefit more from following the majority (King & Cowlishaw, 2007), while an individual faced with a highly volatile environment may rely more on private information due to the potentially large cost from copying an out-of-date behaviour (Aoki & Feldman, 2014).

Some factors may simultaneously affect the individual differences in both domains: environmental volatility may increase asocial exploration tendency while decreasing copying tendency. On the other hand, a common cognitive ability underlying both asocial and social learning may generate a positive correlation between them (Mesoudi et al., 2016). Indeed, a growing body of empirical studies has shown both negative and positive correlations between asocial exploration tendency and social information use (but see Webster & Laland, 2015). For instance, individual exploration propensity negatively correlates with the individual tendency to copy conspecifics in barnacle geese Branta leucopsis (Kurvers et al., 2010a, Kurvers et al., 2010b) and zebra finches Taeniopygia guttata (Rosa, Nguyen, & Dubois, 2012), while the opposite is true in three-spined sticklebacks Gasterosteus aculeatus (Nomakuchi, Park, & Bell, 2009) and great tits Parus major (Marchetti & Drent, 2000).

However, the term ‘exploration’ has not been explicitly defined as information-gathering behaviour in the previous literature, and might have been confounded with other personality traits (reviewed in Réale, Reader, Sol, McDougall, & Dingemanse, 2007). Broadly speaking, more active, neophilic, or bolder individuals tend to be labelled as ‘explorative’ while more inactive, neophobic, or shyer individuals tend to be labelled as ‘non-explorative’ (Réale et al., 2007). It has remained untested, though, whether individuals labelled as ‘explorative’ actually gather more information during the learning process than those labelled as ‘non-explorative’ (Carere and Locurto, 2011, Groothuis and Carere, 2005, Koolhaas et al., 1999). Therefore, it remains unclear whether individual differences in asocial information-gathering strategy are associated with those in social information use.

In this study, we focused on human learning behaviour in a two-armed bandit (2AB) problem, and tested whether the individual differences in asocial exploration tendency predicted the reliance on social learning. Because the possible individual correlation between asocial exploration and social learning could be either positive or negative (Mesoudi et al., 2016), we did not make any specific prediction about the direction of the correlation.

The 2AB is the most basic test-bed problem of reinforcement learning (Sutton & Barto, 1998). Therefore, we were able to fit a computational reinforcement learning model to each participant's decision data so as to estimate individual information-gathering processes (Daw et al., 2006, Keasar et al., 2002, O'Doherty et al., 2003, Racey et al., 2011, Toyokawa et al., 2014). In the task, individuals have two choice options, but at the outset they do not have exact knowledge of which option is more profitable (Fig. 1a). However, they can update their knowledge of the options through the experiences of earning rewards. Fitting the learning model, we can infer the knowledge-updating process for each participant, so as to categorise each decision made by each participant into either exploitation (i.e., choosing the option with higher estimated reward value as of that round; see the Material and methods section) or exploration (i.e., choosing the other option with lower estimated reward value). The ‘exploration’ measured in this study, therefore, directly relates to information-gathering behaviour during the reinforcement learning.
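The exploitation/exploration labelling described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the fitted model used in the study: it assumes a simple delta-rule value update with a fixed learning rate `alpha`, a common starting value `q_init`, and it counts ties between the two estimated values as exploitation. The function names `label_choices` and `exploration_frequency` are hypothetical.

```python
def label_choices(choices, rewards, alpha=0.3, q_init=0.0):
    """Label each choice in a two-armed bandit as 'exploit' or 'explore'.

    choices: sequence of 0/1 option indices chosen on each round.
    rewards: sequence of payoffs received on each round.
    alpha:   assumed learning rate of a delta-rule update (hypothetical value).
    q_init:  assumed initial value estimate for both options.
    """
    if len(choices) != len(rewards):
        raise ValueError("choices and rewards must have the same length")
    q = [q_init, q_init]  # current estimated reward value of each option
    labels = []
    for choice, reward in zip(choices, rewards):
        other = 1 - choice
        # Exploitation: the chosen option has the higher estimated value
        # as of this round. Ties are counted as exploitation (assumption).
        labels.append("exploit" if q[choice] >= q[other] else "explore")
        # Delta-rule update of the chosen option only.
        q[choice] += alpha * (reward - q[choice])
    return labels


def exploration_frequency(labels):
    """Proportion of rounds labelled as exploration."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(1 for label in labels if label == "explore") / len(labels)
```

In practice the learning rate and any choice-rule parameters would be estimated per participant by model fitting rather than fixed in advance, so the labels depend on the fitted values.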

In addition to the asocial situation where participants engaged in the 2AB task alone (hereafter, ‘solitary task’), participants also played the 2AB task in a pairwise situation (‘paired task’) in which they were able to observe the other participant's choice (but not the peer's earned payoff) displayed on the monitor. To examine whether the participants adopted a social learning strategy in the paired task, we fitted several asocial- and social-learning models to each participant's decision data, and then selected the most likely learning model for each individual. We also analysed each participant's gaze movements measured by an eye-tracker in order to confirm the participant's information use during the task. Finally, we examined whether the exploration tendency in the solitary task (i.e., asocial exploration) predicted the use of social information in the paired task.
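To make the idea of a social-learning model concrete, the sketch below shows one possible form of a copy-when-uncertain choice rule in Python. It is an assumption for illustration only and is not taken from the study's model specification: the asocial choice probabilities come from a softmax with inverse temperature `beta`, uncertainty is measured as the normalised entropy of those probabilities, and the weight given to the peer's observed choice scales with that uncertainty up to a hypothetical ceiling `sigma_max`.

```python
import math


def softmax_probs(q, beta):
    """Softmax choice probabilities over estimated values q."""
    top = max(q)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (value - top)) for value in q]
    total = sum(weights)
    return [w / total for w in weights]


def normalised_entropy(probs):
    """Shannon entropy of probs, scaled to lie between 0 and 1."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return entropy / math.log2(len(probs))


def copy_when_uncertain_probs(q, beta, peer_choice, sigma_max):
    """Mix asocial and social choice probabilities (hypothetical rule).

    The copying weight grows with the learner's own uncertainty, so a
    learner with clearly separated value estimates mostly ignores the peer.
    """
    p_asocial = softmax_probs(q, beta)
    sigma = sigma_max * normalised_entropy(p_asocial)
    # The peer's choice is observed, but not the peer's payoff.
    p_social = [1.0 if i == peer_choice else 0.0 for i in range(len(q))]
    return [(1 - sigma) * a + sigma * s for a, s in zip(p_asocial, p_social)]
```

A purely asocial model corresponds to `sigma_max = 0`, which is one way the asocial and social models could be compared on the same decision data.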

Participants

Fifty-six right-handed undergraduate students were randomly selected from a subject pool at Hokkaido University in Japan to participate in the experiment. Of these 56 participants, 1 participant failed to complete the second solitary task and 4 participants failed to complete the paired task due to computer problems, leaving us with 55 participants (27 females; Mean age ± S.D. = 18.95 ± 0.89) and 52 participants (26 females; Mean age ± S.D. = 18.96 ± 0.91) to be included in the behavioural analysis for the

Individual consistency in the asocial exploration tendency

The frequency of explorative choices in the first solitary task was positively correlated with that in the second solitary task (r = 0.58, p < 0.001; Fig. 2). This result indicates that participants exhibited a stable asocial exploration tendency across the two solitary tasks. Results of the fitting of the asocial learning model are shown in Online Supporting result S2-1.
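The consistency check above is a Pearson correlation between each participant's exploration frequency in the two solitary tasks. A minimal Python sketch of that statistic follows; the function name `pearson_r` is hypothetical, no participant data are reproduced here, and the significance test is omitted.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross-products.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

Here `x` would hold the per-participant exploration frequencies from the first solitary task and `y` those from the second, in the same participant order.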

Detecting social learning strategies

The Bayesian Model Selection (BMS) resulted in a heterogeneous distribution of learning strategies among the participants (Fig. 3a

Discussion

In this study, we investigated human search strategies in asocial and social two-armed bandit tasks, and tested whether individual differences in asocial exploration tendency in isolated settings predict the use of social learning in group settings.

Across the first and second solitary tasks, our results showed consistent individual differences in asocial exploration tendency (Fig. 2). Since participants were not informed how the environmental change would occur in

Ethics

This study was approved by the Institutional Review Board of the Centre for Experimental Research in Social Science at Hokkaido University (No. H26-01). Written informed consent was obtained from all participants before beginning the task.

Data accessibility

All data are available from the corresponding author upon request.

Competing interest

We have no competing interest.

Funding

This study was funded by JSPS KAKENHI Grant Numbers 25245063 and 25118004, and a JSPS KAKENHI grant-in-aid for JSPS fellows 24004583. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Acknowledgments

We are grateful to Shinsuke Suzuki for valuable discussions related to the computational learning models, and Mike Webster for helpful comments on an earlier draft of this manuscript. We also thank two anonymous reviewers for valuable feedback on earlier drafts.

References (52)

  • E. Payzan-LeNestour et al. The neural representation of unexpected uncertainty during value-based decision making. Neuron (2013)
  • J.W. Peirce. PsychoPy—Psychophysics software in Python. Journal of Neuroscience Methods (2007)
  • W.D. Penny. Comparing dynamic causal models using AIC, BIC and free energy. NeuroImage (2012)
  • L. Rendell et al. Cognitive culture: Theoretical and empirical insights into social learning strategies. Trends in Cognitive Sciences (2011)
  • L. Rigoux et al. Bayesian model selection for group studies—Revisited. NeuroImage (2014)
  • K.E. Stephan et al. Bayesian model selection for group studies. NeuroImage (2009)
  • P.C. Trimmer et al. Does natural selection favour the Rescorla-Wagner rule? Journal of Theoretical Biology (2012)
  • W. Yoshida et al. Resolution of uncertainty in prefrontal cortex. Neuron (2006)
  • J. Bouchard et al. Social learning and innovation are positively correlated in pigeons (Columba livia). Animal Cognition (2007)
  • R. Boyd et al. Culture and the evolutionary process (1985)
  • R. Boyd et al. An evolutionary model of social learning: The effects of spatial and temporal variation
  • C. Carere et al. Interaction between animal personality and animal cognition. Current Zoology (2011)
  • J.D. Cohen et al. Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Philosophical Transactions of the Royal Society B (2007)
  • I. Coolen et al. Species difference in adaptive use of public information in sticklebacks. Proceedings of the Royal Society B (2003)
  • N.D. Daw et al. Cortical substrates for exploratory decisions in humans. Nature (2006)
  • M.J. Frank et al. Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nature Neuroscience (2009)

1 Present address: School of Biology, University of St Andrews, St Andrews, Fife, UK, KY16 9TH.