Professor Deb Roy of the Massachusetts Institute of Technology (MIT) was more interested in language for the sake of robots than in understanding how babies develop it. But when he had a baby of his own and realised that existing research on language acquisition provided only snapshots of baby babble, Professor Roy decided to turn his home into the Big Brother house.
The house was fitted with spy cameras that filmed his newborn son for 14 hours a day, every day, for three years. The project, dubbed the Human Speechome Project, gathered 250 gigabytes of data a day. The sheer scale of the enterprise required new software for handling the data.
For example, to scan video footage quickly, the team developed an application that showed only movement, rendering successive frames as a ribbon of colour similar to car lights caught on time-lapse film. Sound was represented as a spectrogram. This enabled analysts to discard periods of inactivity and silence, as well as periods of activity that did not involve their subject.
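The idea of discarding inactive footage can be illustrated with a very simple technique called frame differencing. The sketch below is not the project's software, whose internals are not described here. It assumes each video frame is a flat list of greyscale pixel values, and the names `motion_scores` and `active_spans` and the threshold are hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# Assumption: one frame is a flat sequence of greyscale intensities (0-255).
Frame = Sequence[int]


def motion_scores(frames: Sequence[Frame]) -> List[float]:
    """Mean absolute pixel change between each frame and the one before it."""
    scores = []
    for prev, curr in zip(frames, frames[1:]):
        total_change = sum(abs(a - b) for a, b in zip(prev, curr))
        scores.append(total_change / len(curr))
    return scores


def active_spans(scores: Sequence[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, where the score exceeds the threshold."""
    spans = []
    start: Optional[int] = None
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i  # movement begins
        elif score <= threshold and start is not None:
            spans.append((start, i))  # movement ends
            start = None
    if start is not None:
        spans.append((start, len(scores)))  # movement runs to the end
    return spans


# Six tiny four-pixel frames: nothing changes at first, then a bright spot moves.
frames = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 100, 0, 0],
    [0, 0, 100, 0],
    [0, 0, 100, 0],
    [0, 0, 100, 0],
]
scores = motion_scores(frames)
spans = active_spans(scores, threshold=10)
```

Only the stretch in which the bright spot appears and moves is flagged as active; everything outside it could be skipped by a reviewer. A real system would also have to cope with camera noise, lighting changes and, as the article notes, movement that does not involve the subject.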
However, this application could not reveal in detail what was happening in the time periods of interest. The project wanted to transcribe all speech made by and around Roy’s son, but automated speech recognition software was too inaccurate, and purely human transcription was too time-consuming, with one hour of speech taking 10 hours to transcribe. So the team came up with another tool that found and recorded sound bites for human transcribers to work from. The combination of automation and human effort reduced the ratio of speech time to transcription time from 1:10 to 1:2.
The project is still analysing the masses of data, but Roy says some interesting findings have already been made. He describes a process he calls ‘word births’. The adult begins by speaking to the child in complex sentences, then subconsciously reduces the complexity until the child understands the word. After this, the adult gradually builds the complexity up again.
Professor Roy’s research will add valuable insights to the study of language acquisition. Other research teams have identified a gene that appears to have given humans the ability to develop language. Studies of how the DNA of apes and humans diverged over time have identified a gene that gives humans the ability to move the face in ways that allow for speech.
Other studies have looked at how language is processed, and found marked regional specialisation within the brain: specific brain areas are responsible for specific tasks, such as processing speech or visual stimuli. One study has theorised that psychoses result from a breakdown of this specialisation, leaving the brain unable, for example, to tell whether speech has originated internally or externally.
While Professor Roy and his small son will no doubt contribute to this burgeoning field of study, the research has also had other effects. One is the potential use of the ‘quick video scan’ software to analyse CCTV footage more efficiently.
Another spin-off is a semi-automated architectural design application that determines the effect of built spaces on human flows. It could be used, for example, to assess how changes to a retail space will affect the way shoppers move around it.
And Roy hasn’t forgotten his love of robotics. “What if we can build a machine that can step into the shoes of a child and learn in human-like ways?” he asks. “Imagine transferring that into a video game character or into a domestic robot that can now learn to communicate and interact in social ways.”