
Neurocinematics database: From fMRI data mining to knowledge discovery using real-world stimulation


We are building a database of functional magnetic resonance imaging (fMRI) data collected while people watch television and full-length films. There are two overarching rationales for this endeavour: the first is scientific, and the second pertains to knowledge transfer. Scientifically, we do not know how the brain normally processes information, because neuroimaging studies typically use reductionist and unnatural stimuli (e.g., speech perception is studied with isolated phonemes, like “ba”, rather than anything that might be recognized as language). This has arguably produced severe knowledge gaps in our understanding of the brain. To name a few, these gaps include:

- We have little to no knowledge of how the brain carves up the world (e.g., are phonemes separable from words, syntax, language, communication and other cognitive processes in the brain?).
- If the brain carves the world up along these traditional descriptive divides, how parallel and distributed is the process?
- If parallel, how do perceptual and cognitive processes interact in these networks? Do they cooperate or compete? What role does context play?

If we start answering these questions, we might begin to explain the immense amount of unexplained variance in how the brain works. For example, it has been estimated that we understand ‘only 10-20% of how V1 actually operates under normal conditions’ (Olshausen and Field, 2004), and V1, also known as primary visual cortex, is one of the most intensely studied regions in the human brain. While a huge amount of data is available from people lying in fMRI scanners doing nothing (i.e., resting-state data), there is no comparable database for natural (active) stimulation. We hope this database will serve as the antidote to this state of affairs, permitting data mining and knowledge discovery in the hope of filling in some of those gaps in our understanding.

The second rationale for the database is to advance knowledge transfer. In particular, the database will be freely available to scientists in raw form so that, as stated, it can be mined to promote knowledge discovery. It will also be available in analyzed form on a queryable website and mobile app. This too will provide scientific data for knowledge discovery. It can, however, also serve as an educational yet entertaining view into how the brain works for people who love television and film. We imagine this would be a particularly effective tool in the classroom for educating students about human neuroscience. For example, students could find high-frequency terms from reviews of films they like and see which brain networks are associated with those terms.

Project Status & Timeline

1. Collection and Analyses of fMRI Television and Film Watching Data for a Neurocinematics Database
A team of psychologists and neuroscientists has begun collecting fMRI data while participants watch television and movies in BUCNI at University College London (UCL). The scanner is run continuously with a multiband sequence with a TR of 1 second (the first show collected was ‘Genie: Secret of the Wild Child’). We will begin making the data available in January 2015 (Project 2 below), and soon thereafter we will solicit data from the neuroimaging community (Project 3 below). Our goal is to collect, analyze and make publicly available thousands of datasets. In addition to television and film, datasets will ultimately include other natural forms of stimulation, like reading books and webpages, listening to podcasts, music and audiobooks, and playing video games.

Analysis will proceed in three phases. In the first phase, currently underway, we are using a blind-source-separation and neuroimaging meta-analysis-based approach to decode the resulting brain data with no a priori assumptions (see Project 2 description). In the second phase, as the database grows, we will introduce a text-mining approach to decoding brain images that is based on internet reviews and discussions. Finally, in phase three, we will introduce a more targeted analysis based on crowdsourced and gamified annotation of videos (Project 4 below).
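To make the first analysis phase concrete, here is a minimal sketch of blind source separation on a toy data matrix. This is purely illustrative and assumes nothing about the project's actual pipeline: a real analysis would use dedicated tools (e.g., spatial ICA), whereas this sketch recovers dominant components with a simple SVD on synthetic "voxel" data shaped like a continuous scan at TR = 1 s.

```python
import numpy as np
from numpy.linalg import svd

rng = np.random.default_rng(0)

n_timepoints = 300   # 300 TRs at TR = 1 second (multiband sequence)
n_voxels = 1000      # toy "brain", flattened to one axis

# Two hidden sources with distinct time courses, mixed into voxel data.
t = np.arange(n_timepoints)
sources = np.stack([np.sin(t / 10.0), np.sign(np.sin(t / 25.0))])
mixing = rng.normal(size=(2, n_voxels))
data = sources.T @ mixing + 0.1 * rng.normal(size=(n_timepoints, n_voxels))

# Blind source separation with no a priori assumptions: recover the
# dominant components directly from the centered data matrix.
data -= data.mean(axis=0)
u, s, vt = svd(data, full_matrices=False)
timecourses = u[:, :2] * s[:2]   # recovered component time courses
spatial_maps = vt[:2]            # recovered spatial maps (voxel weights)
```

Because the two planted sources carry far more variance than the noise, the first two singular values dominate and the leading components capture the hidden time courses; decoding would then compare the spatial maps against meta-analytic term maps.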
Started: September, 2017
Completion date: Indefinite

2. Interface for the Internet Neurocinematics Database


A first team of three Computer Scientists is building a publicly accessible web and app interface that allows people to view brain images associated with people watching movies. Each image has a probabilistic, term-based description of its function (e.g., one brain image might be highly associated with the term ‘color’ and another with ‘spatial processing’). Interactants will be able to view and sort all of the brain images for a specific film by the probabilities of their associated terms. They will be able to find high-probability terms in reviews of a movie and see the associated brain images. They will also be able to sort by specific terms and view which movies and accompanying images are most associated with those terms.
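The sorting behaviour described above can be sketched in a few lines. The record layout and field names below are assumptions for illustration, not the project's actual schema:

```python
# Hypothetical records: each brain image for a film carries
# probabilistic term weights (structure assumed for illustration).
film_images = [
    {"image_id": "img_001", "terms": {"color": 0.81, "motion": 0.12}},
    {"image_id": "img_002", "terms": {"spatial processing": 0.74, "color": 0.20}},
    {"image_id": "img_003", "terms": {"color": 0.55, "language": 0.30}},
]

def sort_images_by_term(images, term):
    """Rank a film's brain images by the probability of one term."""
    return sorted(images, key=lambda im: im["terms"].get(term, 0.0), reverse=True)

ranked = sort_images_by_term(film_images, "color")
print([im["image_id"] for im in ranked])  # ['img_001', 'img_003', 'img_002']
```

The same function inverted (term fixed, images pooled across films) would support the "which movies are most associated with this term" query.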

Completion date: January, 2017

3. Interface for a Social Data Repository for the Internet Neurocinematics Database


In addition to fMRI data from BUCNI @ UCL, the database will be ‘social’, giving other scientists the opportunity to contribute their television and film watching fMRI data to the database. For this reason, a second team of three Computer Scientists is building a web-based interface that allows researchers to upload their raw data to our server. We will provide an FAQ for questions that might arise with regard to collecting the data. Interactants will need to enter specific information into forms about how the data were collected and the participants who watched the movies.
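A minimal sketch of how such an upload form might be validated server-side is below. The field names and validation rules are assumptions for illustration; the project's actual required metadata is not specified here:

```python
from dataclasses import dataclass

# Hypothetical upload metadata record; field names are illustrative only.
@dataclass
class ScanMetadata:
    film_title: str
    tr_seconds: float      # repetition time, e.g. 1.0 for a multiband sequence
    n_participants: int
    scanner_model: str     # free-text description of the scanner
    notes: str = ""

    def validate(self):
        """Return a list of human-readable errors (empty list = valid)."""
        errors = []
        if not self.film_title:
            errors.append("film_title is required")
        if self.tr_seconds <= 0:
            errors.append("tr_seconds must be positive")
        if self.n_participants < 1:
            errors.append("n_participants must be at least 1")
        return errors

meta = ScanMetadata("Genie: Secret of the Wild Child", 1.0, 12, "3T scanner")
meta.validate()  # [] — a valid submission
```

Structured validation of this kind lets the server reject incomplete submissions before raw imaging data is accepted.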

Completion date: January, 2017

4. Interface for crowdsourcing and gamifying annotations of video in the Internet Neurocinematics Database


To do advanced analyses of television and films, it will be necessary to know when certain events occur in the videos. Thus, a third team of three Computer Scientists is building an interface to obtain crowdsourced annotations of videos of different durations. This interface implements different annotation types that are ‘gamified’. These include: 1) ‘Label’: press specific keys when specific content occurs and receive points; 2) ‘Describe’: watch a clip and describe it with as many words as possible in a finite period of time, with points assigned based on the number of non-function words; 3) ‘Rate’: give continuous numeric ratings of ongoing content (for example, press 1-9 to indicate negative, neutral or positive emotional responses), with points assigned based on the number of ratings completed. There will be a leaderboard on which annotators’ scores can be displayed, along with other positive feedback.
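The ‘Describe’ scoring rule above can be sketched directly: one point per non-function word. The function-word list below is a small illustrative subset, not the project's actual lexicon:

```python
# Illustrative subset of English function words (real lists are far longer).
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "at",
    "to", "is", "are", "was", "were", "it", "he", "she", "they",
}

def describe_score(description: str) -> int:
    """Award one point per non-function word in the annotator's description."""
    words = description.lower().split()
    return sum(1 for w in words if w not in FUNCTION_WORDS)

describe_score("a child runs across the sunny garden")  # 5 points
```

Content words ("child", "runs", "across", "sunny", "garden") score, while "a" and "the" do not, rewarding annotators for descriptive richness rather than raw word count.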

Completion date: January, 2017