The most personal device: Researchers probe how much psychological data smartphones generate
Everyone who uses a smartphone unavoidably generates masses of digital data that are accessible to others, and these data provide clues to the user's personality. Psychologists at LMU are studying how revealing these clues are.
For most people around the world, smartphones have become an integral and indispensable component of their daily lives. The digital data that these devices incessantly collect are a veritable goldmine, and not only for the five largest American IT companies, which use them for advertising purposes. They are also of considerable interest in other contexts. For instance, computational social scientists use smartphone data to learn more about personality traits and social behavior. In a study that appears in the journal PNAS, a team of researchers led by LMU psychologist Markus Bühner set out to determine whether conventional data passively collected by smartphones (such as times or frequencies of use) provide insights into users' personalities. The answer was clear-cut. "Yes, automated analysis of these data does allow us to draw conclusions about the personalities of users, at least for most of the major dimensions of personality," says Clemens Stachl, who used to work with Markus Bühner (Chair of Psychological Methodologies and Diagnostics at LMU) and is now a researcher at Stanford University in California.
The LMU team recruited 624 volunteers for their PhoneStudy project. The participants agreed to fill out an extensive questionnaire describing their personality traits, and to install on their phones, for 30 days, an app that had been developed specially for the study. The app was designed to collect coded information relating to the behavior of the user. The researchers were primarily interested in data pertaining to communication patterns, social behavior and mobility, together with users' choice and consumption of music, the selection of apps used, and the temporal distribution of their phone usage over the course of the day. All the data on personality and smartphone use were then analyzed with the aid of machine-learning algorithms, which were trained to recognize and extract patterns from the behavioral data, and relate these patterns to the information obtained from the personality surveys. The ability of the algorithms to predict the personality traits of the users was then cross-validated on the basis of a new dataset. "By far the most difficult part of the project was the pre-processing of the huge amount of data collected and the training of the predictive algorithms," says Stachl. "In fact, in order to perform the necessary calculations, we had to resort to the cluster of high-performance computers at the Leibniz Supercomputing Center in Garching (LRZ)."
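To make the idea of cross-validation concrete, here is a minimal sketch in Python. It is not the team's pipeline: the article does not name the algorithms used, so a one-feature least-squares line stands in for them, and the data are synthetic. The feature "calls per day", the "extraversion" scores, and every function name below are assumptions made purely for illustration. The point it shows is that each participant's score is predicted by a model that never saw that participant during training.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x for a single feature (a stand-in model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den


def cross_validated_predictions(xs, ys, k=5, seed=0):
    """Predict every score with a model trained on the other k-1 folds only."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint groups of participants
    preds = [None] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = a + b * xs[i]
    return preds


def make_synthetic(n=200, seed=1):
    """Hypothetical data: calls per day and a noisy self-reported score."""
    rng = random.Random(seed)
    calls = [rng.uniform(0, 20) for _ in range(n)]
    score = [2.0 + 0.1 * c + rng.gauss(0, 0.5) for c in calls]
    return calls, score


calls, extraversion = make_synthetic()
preds = cross_validated_predictions(calls, extraversion)
r = pearson(preds, extraversion)
print(f"out-of-sample correlation: {r:.2f}")
```

The correlation between the held-out predictions and the questionnaire scores is the kind of figure that indicates how well behavior predicts a trait. Measuring it on the training data instead would overstate the model's ability.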
The researchers focused on the five most significant personality dimensions (the Big Five) identified by psychologists, which enable them to characterize personality differences between individuals in a comprehensive way. These dimensions relate to the self-assessed contribution of each of the following traits to a given individual's personality: (1) openness (willingness to adopt new ideas, experiences and values), (2) conscientiousness (dependability, punctuality, ambitiousness and discipline), (3) extraversion (sociability, assertiveness, adventurousness, dynamism and friendliness), (4) agreeableness (willingness to trust others, good-natured, outgoing, obliging, helpful) and (5) emotional stability (self-confidence, equanimity, positivity, self-control). The automated analysis revealed that the algorithm was indeed able to derive most of these personality traits from combinations of the multifarious elements of users' smartphone usage. Moreover, the results provide hints as to which types of digital behavior are most informative for specific self-assessments of personality. For example, data pertaining to communication patterns and social behavior (as reflected by smartphone use) correlated strongly with levels of self-reported extraversion, while information relating to patterns of daytime and night-time activity was significantly predictive of self-reported degrees of conscientiousness. Notably, links with the category "openness" only became apparent when highly disparate types of data (e.g., app usage) were combined.
The results of the study are of great value to researchers, as personality studies have so far been based almost exclusively on self-assessments. This conventional method has proven to be sufficiently reliable in predicting levels of professional success, for instance. "Nevertheless, we still know very little about how people actually behave in their everyday lives, apart from what they choose to tell us on our questionnaires," says Markus Bühner. "Thanks to their broad distribution, their intensive use and their very high level of performance, smartphones are an ideal tool with which to probe the relationships between self-reported and real patterns of behavior."
Clemens Stachl is aware that his research might further stimulate the appetites of the dominant IT firms for data. In addition to regulating the use of passively collected data and strengthening rights to privacy, we also need to take a comprehensive look at the field of artificial intelligence, he says. "The user, not the machine, must be the primary focus of research in this area. It would be a serious mistake to adopt machine-based methods of learning without serious consideration of their wider implications." The potential of these applications—in both research and business—is tremendous. "The opportunities opened up by today's data-driven society will undoubtedly improve the lives of large numbers of people," Stachl says. "But we must ensure that all sections of the population share the benefits offered by digital technologies."