Caveat lector: This blog is where I try out new ideas. I will often be wrong, but that's the point.



SfN 2014

This will be my first year attending SfN as an actual professor (I was hired but hadn't started by SfN 2013).

This means I'm on the lookout for potential PhD students and post-docs. Nothing certain yet, as grants haven't come back, but if you're looking for a place to do your PhD, or thinking about a post-doc in the next year or two, hit me up.

It turns out, San Diego's a pretty nice city, and has pretty good cognitive science and neuroscience programs (FALSE HUMILITY PRECEDING).

You can find me in the following ways:

  • Via my email address on my faculty webpage
  • At the UCSD booth at the 4th Enhancing Neuroscience Diversity through Undergraduate Research Education Experiences (ENDURE) meeting on Saturday, Nov 15 from 9:30-11:00am at the Marriott Marquis, Independence Ballroom EFGH
  • At our book signing at the Princeton University Press booth from 11:00-12:00 on Monday, Nov 17
  • At BANTER on Monday night (probably)
  • At my collaborator's poster session Tuesday, Nov 18, 13:00-17:00, off and on (abstract below)
Presentation Title: Automated “spectral fingerprinting” of electrophysiological oscillations
Location: WCC Hall A-C
Presentation time: Tuesday, Nov 18, 2014, 1:00 PM - 5:00 PM
Presenter at Poster: Tue, Nov. 18, 2014, 3:00 PM - 4:00 PM
Topic: G.04.e. Electrophysiology: Electrode arrays
Authors: M. HALLER1, P. VARMA1,2, T. NOTO4, R. T. KNIGHT1,3, A. SHESTYUK1,3, *B. VOYTEK4,5,6;
1Helen Wills Neurosci. Inst., 2Electrical Engin. and Computer Sci., 3Psychology, Univ. of California, Berkeley, Berkeley, CA; 4Cognitive Sci., 5Neurosciences Grad. Program, 6Inst. for Neural Computation, UCSD, La Jolla, CA
Abstract: Neuronal oscillations play an important role in neural communication and network coordination. Low frequency oscillations are comodulated with local neuronal firing rates and correlate with physiological, perceptual, and cognitive processes. Changes in the population firing rate are reflected by a broadband shift in the power spectral density of the local field potential. On top of this broadband, 1/f^α field, there may exist concurrent, low frequency oscillations. The spectral peak and bandwidth of low frequency oscillations differ among people, brain regions, and cognitive states. Despite this widely-acknowledged variability, the vast majority of research uses a priori bands of interest (e.g., 1-4 Hz delta, 4-8 Hz theta, 8-12 Hz alpha, 12-30 Hz beta). Here we present a novel method for identifying the oscillatory components of the physiological power spectrum on an individual basis, which captures 95-99% of the variance in the power spectral density of the signal with a minimal number of parameters. This algorithm isolates the center frequency and bandwidth of each oscillation, providing a blind method for identifying individual spectral differences. We demonstrate how automated identification of individual oscillatory components can improve neurobehavioral correlations and identify population differences in spectral and oscillatory parameters.
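For readers who want a feel for what this kind of "spectral fingerprinting" involves, the core idea can be sketched as a curve fit of a 1/f^α background plus a Gaussian oscillatory peak in log-power space. This is a toy illustration on simulated data, not the authors' actual algorithm; every parameter value here is invented:

```python
import numpy as np
from scipy.optimize import curve_fit

def spectral_model(freqs, offset, alpha, amp, cf, bw):
    """Log power: 1/f^alpha background plus one Gaussian oscillatory peak."""
    background = offset - alpha * np.log10(freqs)
    peak = amp * np.exp(-((freqs - cf) ** 2) / (2 * bw ** 2))
    return background + peak

# Simulate a toy spectrum: 1/f^2 background with an ~10 Hz alpha oscillation
rng = np.random.default_rng(0)
freqs = np.linspace(2, 40, 200)
log_power = spectral_model(freqs, 1.0, 2.0, 0.8, 10.0, 1.5)
log_power += rng.normal(0, 0.05, freqs.size)

# Fit blindly: recover the oscillation's center frequency and bandwidth
# without assuming an a priori band of interest
popt, _ = curve_fit(spectral_model, freqs, log_power, p0=(0, 1, 0.5, 9, 2))
offset, alpha, amp, cf, bw = popt
print(f"center frequency ≈ {cf:.1f} Hz, bandwidth ≈ {bw:.1f} Hz")
```

The point is that the peak's center frequency and bandwidth come out of the fit itself, rather than from a fixed 8-12 Hz "alpha" band.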


The tenure-track: The first months

Apparently people read this blog and have noticed I’ve not updated in a few months. People are all like, “hey man, what happened to your bloviating?”

Here’s the (shocking!) gist: moving to a new city, starting a faculty job, writing a book, having a second child, and creating a new class from scratch has been somewhat time consuming.

My burgeoning lab has just been renovated.

Voytek lab: pre-renovation (L) and post-renovation (R)
I’ve got one post-doc (Erik Peterson) working with me, two PhD students with two more rotating, and one more post-doc joining in early 2015. I’m the diversity chair for Cognitive Science and a diversity committee member for neuroscience (positions I take very seriously, as diversity is a topic near and dear to my heart). I’m constantly busy playing catch up, trying to finish my post-doctoral research while also trying to establish my scientific independence and build my own lab.

Where does that leave me now that I’m a few months in?

Mentorship is difficult, and I’m trying to do well by my trainees. As a mentor there’s a balance between giving guidance and providing freedom, and I’m still learning that.

I (very amicably and somewhat sadly!) resigned my position as data scientist at Uber, mostly because I was too busy and, between my family, my lab, and Uber, something had to give and it wasn’t going to be my family or my lab. So my nearly four-year-long side career in helping build out a multi-billion dollar multi-national company has come to an end (and I could write a whole book about that, probably).

That decision led me to reassess life and career choices yet again.

I recently gave a talk at TEDxSanDiego about failure, the “passion trap”, and the narratives we tell ourselves. We love a good narrative, and for the past 10 years my personal narrative has been one of a failed student reformed as a neuroscientist. That’s mostly the narrative I’ve shared here on this blog. But I want to make sure I’m not getting caught up in my own narrative and that I don’t pigeonhole myself into a particular way of thinking.

The Uber thing was partially an attempt at pushing myself outside of my comfort zone and away from my neuroscience narrative. Same with writing the zombie book (Do Zombies Dream of Undead Sheep? now available for sale on Amazon or as an audiobook on Audible BUY IT NOW). In fact, when I’m presented with a new opportunity, one of the major factors in my decision making is, “how weird/novel is this opportunity?”

So I’ve made several deviations in my career but I keep coming back to neuroscience research. I’m just a few months into the tenure-track, and while it’s been a hell of a thing, I have little doubt that I’ve made the right choice. While I’m working a lot, I’m still able to hold evenings and weekends as protected family time, with most weekends spent with my kids at musea, the beach, the zoo, and so on.

Anyway, this is all just a bunch of words to put down on here to excuse my absence so I can get back to writing other stuff.


The language of science

The Washington Post has a headline that reads, verbatim, "A toddler squeezed through the White House gate and caused a security alert. Seriously."

Isn't the Washington Post a "real" newspaper with like, journalistic standards and stuff? Does the headline really need the "Seriously." part, just in case we all thought they were just kiddingsies?

Given how little actually annoys or bothers me, I'm surprised at my own internal response to this. I'm all for the evolution of language, but this seems weirdly out of place.

If I tried to write a scientific paper titled, "Oscillations are fucking rad and you wouldn't believe the four behaviors they control!" it might more accurately capture my personal feelings and excitement, but I wouldn't do it because it's such a culturally-narrow, biased way of talking about the topic.

What I mean is that the language we use conveys information not just through the words themselves, but through the combinations of words, their structures, and so on, which provide context about when they were written and their emotional content.

Science papers are often (rightly) criticized for being dry, but that "blandness" is a cultural artifact of an attempt at impartiality and a recitation of facts with minimal emotional bias.

The fact that major news outlets are dropping even the pretense of this is what bothers me, I guess?


Biologists, stop trying to make "moon shot" a thing

Kennedy's "moon shot" was a huge success with regards to throwing a ton of government money at a problem to facilitate and expedite a solution. Getting humans on the moon--and then getting them back home, all using decades-old computer technology--was an incredible feat of engineering, cooperation, and technology development.

But it was a well-defined problem with a clear goal. Here's how it works:

"Hey look, there's the moon. It's pretty far away. Let's go to there."
"Cool! We can get stuff into space. Let's see if we can't make that stuff better so that humans can go, too."

But in biology, the problems are never so well-defined, and sometimes the goals aren't, either! Yet the "moon shot" metaphor is pervasive, especially now with regards to Obama's BRAIN Initiative. That's unfortunate, because calling upon the metaphor to drum up public support also sets public expectations; if the project fails to meet those expectations, public trust in large-scale, government-supported research endeavors will erode.

Metaphors are very powerful, and carry with them a lot of meaning and emotional weight, and thus should not be called upon lightly.

In biology, there is no clear goal, nor even a well-defined problem. For the moon landing, we could easily envision what a solution would look like. For neuroscience, we don't know the scope of the problem, nor do we even know what a solution to "understanding the brain" would look like.

Is "understanding the brain" something done at the cellular or sub-cellular level? What about brain/body interactions? What about emergent phenomena that only arise when individual neurons are all wired up and placed in a complex, dynamic electrochemical environment such as the brain?

We just don't know.

Sadly, this isn't the first time this metaphor has been called upon in biology.

In 1996, the Human Genome Project was the "moonshot":
"If the human sequence is biology's moonshot, then having the yeast sequence is like John Glenn orbiting the earth a few times," said Francis Collins, M.D., Ph.D., director of the National Institutes of Health's National Center for Human Genome Research... (source)
The Human Genome Project was a great success in terms of having a clear goal (sequence the human genome) and reaching it. But of course, as we all know, some of the promises about what that information would provide in terms of treating disease, especially mental illness, have fallen short.
Collins noted that most diseases have both genetic and behavioral components that jointly shape the course of disease and the body's response to particular drug treatments. Many such diseases are of interest to psychologists, including bipolar illness, schizophrenia, autism and attention-deficit and hyperactivity disorder. Medical scientists will soon be able to examine which of the 0.1 percent of the genome that varies across humans correlates with which of these diseases, Collins said. (source)
Today, for Collins, the moon shot is the BRAIN Initiative:
“While these estimates are provisional and subject to congressional appropriations, they represent a realistic estimate of what will be required for this moon-shot initiative,” Collins said. “As the Human Genome Project did with precision medicine, the BRAIN Initiative promises to transform the way we prevent and treat devastating brain diseases and disorders while also spurring economic development. (source)
This moon shot metaphor appears to be a major talking point, but as always I'm concerned about what leveraging such metaphors does to erode public support in the long term as yet another major effort fails to find effective treatments or cures for major neurological and psychiatric disorders.


A Neuroscientist Walks Into a Startup (redux)

Someone just pointed out to me that a video of the talk I gave at the 2013 Society for Neuroscience conference on careers beyond the bench is online.

This talk is informal and conversational (which is how I normally talk). Distilling out all my rambling, the main points I touch on are:

  • Why networking?
  • Social networking as an alternative to travel (good for parents like me!)
  • What skills do we have as PhDs?
  • Science communication.
  • Doing something versus knowing something.
  • Breaking into data science.
  • The importance of side projects.

I've said a few of these things before (my time with Uber and my decision to stay in academia) but this is the most comprehensive collection of what little I've learned along the way.


The Language of Peer Review

When I sent off the draft of my first paper to my PhD advisor I really felt like all the hard work was finished. I'd spent years getting my project started: getting IRB approval, identifying subjects, collecting data, and so on. Then I spent many more months analyzing the data, poring over every detail. Then I spent many more months putting together figures and writing the draft.

But all of that?

That's only the tip of the iceberg. Let's get into my personal statistics (since those are the data from which I have to draw) to show you the language of peer review.

Over the past few years I've been co-author on 15 research manuscripts, of which I have been the main author on 7. I currently have 3 more that have undergone at least one round of peer review. These 10 first-author manuscripts have collectively undergone about 20 rounds of review, each consisting of 2-4 reviewers.

More than 16,000 words have been written by reviewers about these 10 papers. In response to them, I have written over 14,000 more.

Of note, the total word length for all 10 of these papers comes in at around 40,000 (not including references). This means that, for every paper I write, I will probably have to write around 35% more than what I consider to be the "final" version in order to justify its publication to reviewers.

Keep that in mind next time you try to estimate how much longer it will take for you to publish your manuscript. I often forget this.

Now, on the flip side, I've performed approximately 25 reviews for some of the most prestigious journals in cognitive neuroscience, including: Nature, PNAS, Nature Neuroscience, Neuron, Journal of Neuroscience, Neurology, NeuroImage, Journal of Cognitive Neuroscience, Cerebral Cortex, and so on.

For these 25 reviews I have written over 14,000 words. That may seem light at only ~560 words per review but my personal reviewing philosophy is to be concise but thorough. It is much easier to critique than to create!

If you have not experienced it, peer review is a strange affair. It's sort of like a masquerade ball in that people's identities are unknown (at least in one direction), but you know that you'll probably have to see these people unmasked at some point, so you'd better give the appearance of propriety.

There's a lot of weird and interesting language play and kowtowing, with phrases such as "we thank the reviewers for their insightful comments" and "this is a very interesting manuscript, but..."

The word of the day is "asteism": "Polite irony; a genteel and ingenious manner of deriding another."

This generally sums up the peer-review process.

Just out of curiosity I decided to run all my reviews, comments received by reviewers, and response to reviewers through a tag cloud generator (thank you tagcrowd) just to see what it looked like. Check it out, I believe this to be a decent, quick insight into the language of peer-review.
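For the curious: under the hood, a tag cloud is just word-frequency counting with common words filtered out. A minimal sketch (the stopword list here is a made-up stub; tagcrowd's real list is longer, and it also handles layout and font scaling):

```python
import re
from collections import Counter

# Tiny stopword stub; a real tag-cloud tool filters many more words
STOPWORDS = {"the", "a", "of", "to", "and", "is", "that", "we", "in", "for"}

def tag_cloud_counts(text, top_n=10):
    """Count word frequencies, minus stopwords -- the core of a tag cloud."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

sample = ("We thank the reviewer. The reviewer is correct that "
          "the EEG coupling result is interesting.")
print(tag_cloud_counts(sample, top_n=3))
```

The most frequent surviving words are what the generator renders largest.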

My Reviews
You'll notice right away that certain keywords appear that represent the general class of manuscripts I'm asked to review: "EEG", "coupling", "gamma", "patients", and so on. The appearance of words such as "addressed", "important", and "interesting" reflects the kind of language I use (I generally do find most papers I review to be genuinely interesting, by the way).

Comments Received by Reviewers

Here again you see "interesting" appear. Everyone is interesting! We're all special snowflakes!

This is a good exercise for me though because I can see that reviewers have a tendency to use words like "specific" and "literature" against me. When I write I have a tendency to "jump ahead" and just assume that people will follow my logic without "showing my work". This is sloppy on my part and I struggle with the need to explicitly connect my thoughts.

My wife has to remind me of this fact for every. Single. Paper. That I write. For every talk I give. You'd think I'd have learned my lesson by now.

Response to Reviewers

This is great. You see how "reviewer", "correct", "thank", and "suggested" show up a lot in my response to reviewers? This is another interesting aspect of peer review. This shows the deferential language that scientists use in responding to their peers. This represents all the times I've said, "the reviewer is correct" and "we thank the reviewers for their suggestions" and the like.

Anyway, this was my attempt to peel back the curtain on peer review a bit if you don't have a lot of experience with it.

I don't have a clever or insightful ending for this post, so, uh, I'd like to thank the readers for their valuable time and for the intelligent, thought-provoking comments that are sure to follow.


A decade of reverse-engineering the brain

Salesmanship trumps science. Every. Single. Time.

The big news in the tech world today is the superstar team-up of Elon Musk, Mark Zuckerberg, and Ashton Kutcher investing $40M in Vicarious, whose aim is to, "[t]ranslate the neocortex into computer code". Because then “you have a computer that thinks like a person," according to Vicarious co-founder Scott Phoenix. “Except it doesn’t have to eat or sleep.”

I took a look at this mystery team of neuroscientists who've secretly reverse-engineered how the human brain works and, according to the Vicarious team page, the scientific talent (and, I assume, lead) is Dileep George.

George was formerly the CTO of Numenta, the company that was spun out of Palm founder Jeff Hawkins' book On Intelligence (which is a fine book with a neat theory, by the way).

Hawkins founded the Redwood Neuroscience Institute which eventually was absorbed into UC Berkeley as the Redwood Center for Theoretical Neuroscience. This was all happening right when I began my PhD at Berkeley.

In 2004.

George gave a talk at the Accelerating Change conference in 2005, the abstract of which reads:
We are at a juncture where great progress has been made in the understanding of the workings of the human neocortex. This gives us a unique opportunity to convert this knowledge into a technology that will solve important problems in computer vision, artificial intelligence, robotics and machine learning. In this talk, based on joint work with Jeff Hawkins, I will describe the state of our understanding of neocortical function and the role Numenta is playing in the development of a new technology modeled after the neocortex.
My question is, how is Vicarious different? What's changed in the last 9 or 10 years or so? Because the high-level press release stuff sounds exactly the same as the Numenta stuff from a decade ago.

What happened to Numenta's lofty aims?

They're now called "Grok" and, according to their about page:
Grok, formerly known as Numenta, builds solutions that help companies automatically and intelligently act on their data. Grok’s technology and product platform are based on biologically inspired machine learning technology first described in co-founder Jeff Hawkins' book, On Intelligence. Grok ingests data streams and creates actionable predictions in real time. Grok's automated modeling and continuous learning capabilities makes it uniquely suited to drive intelligent action from fast data.
George did some amazing computational neuroscience research at Numenta. But for all the talk about how slow academia is, you'd think after ten years and tens (hundreds?) of millions of dollars spent in the fast-paced world of private industry, the sales pitch would have changed by now.

The Blue Brain Project is nearing the end of its first decade as well. And, again, there's some great work coming out of these places, but I cannot overstate my frustration at the hype-to-deliverables ratio of these organizations.

Granted, I wasn't in the meetings. Maybe a lot has changed, but none of that change is making its way out to anywhere where the rest of us can see it.

Having watched this stuff for a decade now, the grand promises have not been delivered on. It's clear to me that VCs need some skeptics on their advisory teams. Any neuroscientist and/or machine learning researcher in that meeting would certainly ask:

"What's different?"


The Passion Trap

The first email I ever sent was to Stephen Hawking. I sent the email in the spring of 1998 when I was 16 years old from a computer at my high school (because I didn't have internet at home) using a friend's AOL account. I had just finished reading Hawking's A Brief History of Time and knew that I wanted to be an astrophysicist (or a cosmologist). I emailed Hawking to tell him how much of an inspiration he was to me and how passionate I was about physics.

Passion. Follow your passion. For those of us lucky to have choices in our life trajectory we're bombarded by advice to follow our passions. Chase your dreams. Go to culinary school. Major in whatever you love. Drop out and start a company! Listen to no one, just follow your heart!

A quick look over at Amazon suggests this is a lucrative bit of advice.

But (and I'm definitely not the first to say this) it's not the best advice.

When I was in high school I was a physics and math chauvinist. I saw psychology and the biological sciences as "soft". My love for physics was probably planted in part by this goofy book:

I've always been inclined toward the sciences, but that book lit a fire in me. It got my imagination going about what could be possible if enough smart people got together to work on a Big Idea. This creative aspect of science really drew me in and, I've come to realize, shaped my career.

Whenever anyone asked 10-year-old me what he wanted to be when he grew up, I'd answer "an astrophysicist". Yeah, I wasn't the coolest kid. But that spark stayed with me and I found some fun outlets. I spent a lot of time in high school playing video games, role playing games with friends, etc. All of the nerd-flavored creative outlets.

As for school, it was easy and I coasted through.

Home life was… non-standard… so when I was given the opportunity to skip my senior year of high school to attend the University of Southern California I seized it. One late August night in 1998, at about 2 am, I called up my buddy Curtis and he drove me to Los Angeles to drop me off at college.

I immediately declared as a physics major and kept going with all the "advanced" versions of the courses. Around the same time I discovered that I enjoyed socializing and I made a lot of new friends. One part of my life was rewarding, the other was not, so I stopped going to classes. I did poorly, but I had a lot of fun doing it. My love for physics started waning due to the monotony of the work and the lack of wonder exhibited by the professionals I saw working in academic physics.

The only reason I didn't drop physics sooner was the fear that my physics friends would make fun of me for "going soft". And because I didn't know what else to do.

Physics was all I'd ever wanted to do. Physics was my passion.

There's that word again.

Becoming an astrophysicist was this grand ideal I'd built up for myself. It had become part of my identity. Once you start defining yourself by one thing—a political belief, religious affiliation, career, family, whatever—you lose identity to that thing. You reduce the number of paths to happiness and success and wrap your entire self around it.

To put it mildly: that can be unhealthy.

Modern psychological thinking generally breaks "passion" into two distinct subtypes. In their highly influential 2003 Journal of Personality and Social Psychology paper, Les Passions de l’Âme: On Obsessive and Harmonious Passion, Vallerand and colleagues differentiate harmonious passion (HP) from obsessive passion (OP):
Harmonious passion (HP) results from an autonomous internalization of the activity into the person’s identity. An autonomous internalization occurs when individuals have freely accepted the activity as important for them without any contingencies attached to it. This type of internalization produces a motivational force to engage in the activity willingly and engenders a sense of volition and personal endorsement about pursuing the activity. Individuals are not compelled to do the activity but rather they freely choose to do so. With this type of passion, the activity occupies a significant but not overpowering space in the person’s identity and is in harmony with other aspects of the person’s life. 
Obsessive passion (OP), by contrast, results from a controlled internalization of the activity into one’s identity. Such an internalization originates from intrapersonal and/or interpersonal pressure either because certain contingencies are attached to the activity such as feelings of social acceptance or self-esteem, or because the sense of excitement derived from activity engagement becomes uncontrollable. Thus, although individuals like the activity, they feel compelled to engage in it because of these internal contingencies that come to control them. They cannot help but to engage in the passionate activity. The passion must run its course as it controls the person. Because activity engagement is out of the person’s control, it eventually takes disproportionate space in the person’s identity and causes conflict with other activities in the person’s life. 
I hate to go all "Medical students' disease" here but this really seems to capture the gist of my personal physics passion struggle. Breaking out of that was very hard for me. It really felt like I was abandoning my identity. Or like I was lying to myself about who I am.

During my sophomore year I lived in a crazy place. One of my friends wanted to take a psych class and, because I had a free slot in my schedule and I had no idea what to do, I took that class with him. The classes I did attend were pretty cool. Dammit if it didn't turn out that people, and not just particles, are fascinating, too!

Fast forward one semester: I go to register for classes my junior year and find out that my grades had been too low for too long and I was basically kicked out of school. Long story short: I pleaded and begged, got a one-semester reprieve, got my shit together, and became a psychology major. I finished all the required courses in a semester.

I devoured the stuff.

At the time USC only had a cell/molecular biology major. No cognitive neuroscience. So I basically made my own major (though my final degree was in Psychology). I took C++ and Java classes, AI, Philosophy of Mind, Communication, etc.

I volunteered in a research lab as an RA and discovered that my ability to write code was a semi-magical skill because I could automate a lot of laborious manual jobs. I learned that I had a "knack" for approaching problems that way.

Really my interests as a doe-eyed wannabe cosmologist kid aren't that different from those of my doe-eyed adult neuroscientist self. My weird childhood, party-fueled and tumultuous college years, and crazy friends made me odd but kept me optimistic and protected me from being jaded. Ironically I now use a ton of math and physics in my neuroscience work.

Take that chauvinistic past-me.

Now, instead of asking "how are we all here, these tiny specks in the vast universe, pondering our origins?" I spend my days asking "how are we all here pondering our origins, we tiny specks in this vast universe?"

Ask yourself whether you are harmoniously passionate or obsessively so, and if the answer is the latter, remember: you are not your job, your belief, your class, your color, or your passion. To paraphrase a dear friend of mine: don't follow your passions, follow your competencies, and you might just find you enjoy doing something you're good at.


Vallerand, R., Blanchard, C., Mageau, G., Koestner, R., Ratelle, C., Léonard, M., Gagné, M., & Marsolais, J. (2003). Les passions de l'âme: On obsessive and harmonious passion. Journal of Personality and Social Psychology, 85 (4), 756-767 DOI: 10.1037/0022-3514.85.4.756


Neuroscience, culture, and the public trust

Every year, the Society for Neuroscience holds a "Social Issues Roundtable" at their annual conference. Social issues in neuroscience are near and dear to me, and so this year I took a stab at submitting a proposal.

What better place to talk about these issues than at a conference of 35,000 neuroscientists?

Sadly, the proposal (below) was rejected. The proposed program had five speakers:

  • Carl Zimmer (New York Times science writer extraordinaire)
  • Sally Satel (Psychiatrist and author of, most recently, Brainwashed: The Seductive Appeal of Mindless Neuroscience)
  • Vaughan Bell (Clinical psychologist, science writer, blogger at MindHacks)
  • Jeni Kubota (Neuroscientist studying stereotype and prejudice change)
  • David Higgins (Science fiction scholar, head of Science Fiction Literature (SF) for the International Association for the Fantastic in the Arts)

Note I did not include myself on the panel. I would have simply moderated.

Despite being rejected, I still think it's an important topic. So I'm looking into the possibility of hosting something anyway, either as a satellite to SfN, or at my university (UC San Diego).

If this sounds interesting or intriguing, please let me know!


The Decade of the Brain. The $300M+ BRAIN Initiative. The public looks to neuroscience for answers about mental illness, cognitive decline, law, and disease. Society's expectations are shaped by media representations—from The Matrix to Malcolm Gladwell—which give an unrealistic view of both the certainty and capabilities of neuroscience. In this program we examine the effect of the dissonance between societal expectations and the nuances of scientific research on the public trust.

This roundtable aims to foster communication between media experts and creators, primary neuroscience researchers, mental health professionals, and culture studies researchers. This communication will focus on the bi-directional relationship between neuroscience research and the popular media and press, and how this relationship affects the public trust. Specifically, there are three central themes we will explore:

  1. Why does neuroscientific research so readily capture the public attention?
  2. How do recurring themes of popular press accounts of neuroscientific research affect the public trust in the research? Examples of such themes include appeals to new research having "implications toward a cure for", e.g., autism, depression, anxiety, Alzheimer's, schizophrenia, etc. while never fulfilling those promises.
  3. How is neuroscientific research and funding affected by shifts in the public interest, from Matrix-like brain-computer interfaces to current trends in Big Data?

By addressing these issues and opening the dialog between neuroscientists and the media, we as a society can better understand how neuroscience is perceived in modern society and the role that we play as guardians of the public trust.

The 2013 BRAIN Initiative put neuroscience back into the forefront of the public’s awareness. Similarly, the release of the much-maligned DSM-5 and Thomas Insel's official statement on behalf of the NIMH on supporting RDoC have led to a resurgence in discussions surrounding our understanding of the biological basis for mental illness. Finally, neuroscience-based bestselling books abound: Grandin's "The Autistic Brain", Ariely's "Predictably Irrational", and Kahneman's "Thinking, Fast and Slow".


Big data: What's it good for?

Recently I was interviewed for a piece in the Independent titled, "The number crunch: Will Big Data transform your life - or make it a misery?"

Part of this interview (my portion of which amusingly got truncated to "STUFF IS COOL") was around what "Big Data" is "for". Because what was included in the interview was shorter than what we talked about, I thought I'd use my own personal platform here to flesh it out a bit.

First, to get this out of the way: "Big Data" is literally just a lot of data. While it's more of a marketing term than anything, the implication is usually that you have so much data that it can't all be held in memory (RAM) at once to process and analyze.

This means that analyses usually have to be done on random samples of the data: you build a model on one sample, then test it against the rest of the data.

To break that down in simple terms, let's say that Facebook wants to know which ads work best for people with college degrees. Let's say there are 200,000,000 Facebook users with college degrees, each of whom has been served 100 ads. That's 20,000,000,000 events of interest, and each "event" (an ad being served) contains several data points (features) about the ad: what was the ad for? Did it have a picture in it? Was there a man or a woman in the ad? How big was the ad? What was the most prominent color? Let's say each ad has 50 "features". This means you have 1,000,000,000,000 (one trillion) pieces of data to sort through. If each "piece" of data were only 100 bytes, you'd have about 91 terabytes of data to parse. That's pretty big, but you get the idea.
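The back-of-envelope sizing above is easy to check in a few lines of Python. All the numbers here are the same hypothetical figures from the example, not real Facebook data:

```python
# Back-of-envelope sizing for the hypothetical ad example above.
users = 200_000_000        # college-educated users (hypothetical)
ads_per_user = 100         # ads served to each user
features_per_ad = 50       # features recorded per ad
bytes_per_feature = 100    # assumed storage cost per feature value

events = users * ads_per_user                  # ad impressions
data_points = events * features_per_ad         # total feature values
total_bytes = data_points * bytes_per_feature  # raw storage estimate

print(f"{events:,} events")                    # 20,000,000,000 events
print(f"{data_points:,} data points")          # 1,000,000,000,000 data points
print(f"{total_bytes / 1024**4:.0f} TiB")      # 91 TiB
```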

Your goal is to figure out which features are most effective at getting college grads to click ads. Maybe your first-pass model on a random sample of 1,000,000 users finds that 200x200-pixel ads with people in them that are about food get the most clicks. Now you have a "prediction model" for what college grads want, and you can test how well that prediction (based on the 1,000,000 college grads) holds up against the other 199,000,000.
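That sample-then-validate workflow can be sketched in plain Python. Everything here is a toy: the feature names, click probabilities, and population sizes are made up for illustration, and the "model" is just a click-rate estimate rather than anything Facebook actually uses:

```python
import random

# Toy version of the sample-then-validate workflow described above.
# Each "user" is a dict of ad features plus whether they clicked.
random.seed(0)

def make_user():
    has_person = random.random() < 0.5
    about_food = random.random() < 0.3
    # Clicks are (by construction) more likely for ads with people and food.
    p_click = 0.02 + 0.05 * has_person + 0.08 * about_food
    return {"has_person": has_person,
            "about_food": about_food,
            "clicked": random.random() < p_click}

population = [make_user() for _ in range(100_000)]
sample, holdout = population[:10_000], population[10_000:]

def click_rate(users, **conditions):
    """Observed click rate among users whose ads match all conditions."""
    matches = [u for u in users
               if all(u[k] == v for k, v in conditions.items())]
    return sum(u["clicked"] for u in matches) / len(matches)

# "Prediction model" built on the sample: food ads with people do best.
predicted = click_rate(sample, has_person=True, about_food=True)
# Validate that prediction against the rest of the population.
observed = click_rate(holdout, has_person=True, about_food=True)
print(f"sample: {predicted:.3f}, holdout: {observed:.3f}")
```

If the rate estimated from the sample closely matches the rate in the holdout set, the model generalizes; in real systems the same split-fit-validate loop just runs over far more data and far richer models.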

Now, for what it can do in "daily life", well, pretty much any company with a significant tech group (Google, Twitter, Facebook, any bank or financial institution, any communications and mobile service, energy, etc.) is doing this kind of thing. To serve ads, to improve their services, to predict future growth and demand, whatever. Relatively benign, boring, money-making stuff.

But what about other uses?

Google famously showed that they could predict flu outbreaks based upon when and where people were searching for flu-related terms.

There's the famous story about how Target's algorithms discovered a girl was pregnant.

Researchers are using Facebook statuses to look at how gender and age affect language use.

Doctors can look at what patients are writing about in online disease forums to try and get an idea of how off-label drug use affects certain diseases.

We can look at the evolution of language, or at the suppression of ideas.

We can look at how people move based on their cell phone use, or how money physically moves.

Or, like my work with Uber, people's actual travel, and how various real-world events (like the 2013 U.S. Federal Government Shutdown) affect the way people move around.

These are only the tip of the iceberg. 90% of the world's digital data was created in the last two years, so we're just starting to figure out the possibilities. Note that in my cognition research I'm using a ton of data on people's behavior to try to infer how age, location, education, etc. affect our cognitive abilities. Those data aren't published or peer-reviewed yet, so it's not appropriate to discuss them in detail, but the results are fascinating.

So yes, while the early focus of Big Data was essentially basic, profit-driven advertising, one shouldn't believe that's all it's good for.

Unfortunately, this is an extremely complex topic that sits at the intersection of personal freedom, privacy, industry, science, medicine, etc. The next ten years will be dominated (rightly so) by conversations surrounding data ownership rights and privacy. There's no reason these kinds of analyses can't be done on anonymized data--so we shouldn't throw the baby out with the bathwater--but scientists, researchers, and analysts should be mindful of these issues.