Researchers estimate it takes approximately 1.5 megabytes of data to store language information in the brain

A pair of researchers, one with the University of Rochester and the other with the University of California, has found that the data needed to store and use the English language in the brain adds up to approximately 1.5 megabytes. In their paper published in the journal Royal Society Open Science, Francis Mollica and Steven Piantadosi describe applying information theory to add up the amount of data needed to store the various parts of the English language.

As infants, humans begin acquiring and speaking the language of those around them. How this happens is still a mystery, but scientists know that it entails much more than storing words alongside definitions, as a dictionary does. Words carry associative cues, for example, such as the concept of flight with the word "bird," or even "wing" or "robin." There is also information that tells the brain how to pronounce a word, which sounds make it up when spoken, and how it can and cannot be used with other words. In the new effort, Mollica and Piantadosi undertook the task of converting all of the ways our brain might store a language into data amounts. To do so, they used information theory, a branch of mathematics that focuses on how information is coded via sequences of symbols.

To make their calculations, the researchers assigned quantifiable size estimates to the various aspects of the English language. They began with phonemes, the sounds that stack into spoken words. They noted that humans use approximately 50 phonemes and suggested each would require approximately 15 bits to store. They next moved on to vocabulary, estimating that the average person knows approximately 40,000 words; taken together, they estimated these would add up to approximately 400,000 bits. Next on the list was semantics for those 40,000 words, which added up to approximately 12 million bits. They also noted that word frequency is important, and added another 80,000 bits to account for it. Finally, they added 700 bits to store syntax rules. The total came to approximately 1.56 megabytes, close to the amount needed to store a single digital picture.
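The arithmetic behind that total can be checked with a short sketch. It uses only the figures quoted above, and it rests on three assumptions that the article does not state: that the phoneme total is simply 50 phonemes times 15 bits, that a byte is 8 bits, and that a megabyte is the decimal 1,000,000 bytes. The variable and function names are illustrative, not taken from the paper.

```python
# Component estimates in bits, as quoted in the article.
# Assumption: the phoneme total is the simple product 50 * 15.
PHONEME_COUNT = 50
BITS_PER_PHONEME = 15

components_bits = {
    "phonemes": PHONEME_COUNT * BITS_PER_PHONEME,
    "vocabulary": 400_000,
    "semantics": 12_000_000,
    "word_frequency": 80_000,
    "syntax": 700,
}


def total_megabytes(components, bytes_per_megabyte=1_000_000):
    """Sum the bit estimates and convert to megabytes.

    Assumptions: 8 bits per byte and a decimal megabyte (10**6 bytes).
    """
    total_bits = sum(components.values())
    return total_bits / 8 / bytes_per_megabyte


total_bits = sum(components_bits.values())
semantics_share = components_bits["semantics"] / total_bits

print(f"total bits: {total_bits:,}")
print(f"megabytes:  {total_megabytes(components_bits):.2f}")
print(f"semantics share of total: {semantics_share:.0%}")
```

Under these assumptions the sum reproduces the reported figure of about 1.56 megabytes, and it shows that the semantics estimate accounts for nearly all of the total, while phonemes and syntax contribute almost nothing.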

More information: Francis Mollica et al. Humans store about 1.5 megabytes of information during language acquisition, Royal Society Open Science (2019). DOI: 10.1098/rsos.181393

Journal information: Royal Society Open Science

© 2019 Science X Network

Citation: Researchers estimate it takes approximately 1.5 megabytes of data to store language information in the brain (2019, March 27) retrieved 18 September 2019 from
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

User comments

Mar 27, 2019
They noted that humans use approximately 50 phonemes and suggested each would require approximately 15 bits to store.

That sounds implausibly low, considering that you need much more data to identify a sound in order to code it into a phoneme, and you also need the muscle sequence to reproduce it.

Anyone who's programmed a robot knows that a simple action like opening and closing your hand takes a wee bit more information and memory than a handful of bits. 15 bits would just about store enough information to encode the position of one single joint of a finger - reproducing a phoneme is a whole set of intricate muscle movements and coordinated actions of dozens of muscles from your chest to your tongue and lips.

Mar 27, 2019
Did they include prosody, diction, semantics, cross-references to other media such as the sound, smell and imagery associated with a word, and the all-important subjective evaluation of a word (qualia), without which a word has no meaning beyond its verbal definition? All up, I estimate that around 2 meg per noun is required...

Mar 28, 2019
The myth of the fantastic human brain. It has trouble accessing and processing 1.5 mb... and we are trying to make AI that works like it does?

The only reason to do this would be for the purpose of deceiving real human brains. And then it would only have to be a simple subroutine.

Humans get to invent fantasies such as soul and consciousness because they really have no cognizance of their brain functioning. They have no idea why they think most of the things that occur to them, and they mistake this mystery for mystical superiority when it's only the dysfunction of an animal organ forced to perform far beyond its natural limits.

Another unfortunate result of domestication.

Mar 29, 2019
The richness of human thought might be due to one person's 1.5 megabytes being slightly different from the next person's. Is that so?

Apr 05, 2019
I hope whoever wrote this did not earn a PhD for it.

Apr 05, 2019
The content of this article clearly stated that it takes 1.5 MBytes to store the ENGLISH language in the brain, NOT any language! Some languages might take more than 1.5 MBytes, some less! And people who speak several languages might need much more memory in their brains!
