
Caveat lector: This blog is where I try out new ideas. I will often be wrong, but that's the point.



21.7.11

The role of Facebook and Twitter in scientific citations and impact factors

This past weekend I was messing around on the internets (when I should have been relaxing by a river) and was struck by a question: is there a way to see what impact social media has had on science?

Turns out, social media such as Facebook and Twitter may have a big influence on journal impact factors and paper citations... but more on that in a second. (For a primer on impact factor and citations, read my old post: "Something ghoti with science citations".)




First: my rambling thoughts.

Many of us scientists have this perception that "social media" is "important" for science, but I don't yet have a grasp on what that means, and I don't think anyone else does, either.

In fact, just yesterday over on Wired, @Sheril_ asked a bunch of science twitterers why they use Twitter. The responses are all over the place! People use it for self promotion, for sharing and learning new ideas, for staying in touch, professional networking, etc.

I've talked about the role of various social networking/Web2.0 sites in the scientific process ad nauseam. I wrote about my enthusiasm for Quora in my post "Quora for Scientists":

...with Quora, anyone can connect with potential experts. The question is how can Quora engage these experts? Why would anyone contribute their expert knowledge for free? I mean, I do it because... well I really enjoy talking with the public about science. Sometimes it's the really simple questions (e.g., What is the neurological basis of curiosity?) that really make me stop and think about "easy" questions I normally wouldn't worry about. That's exciting to me. I like to solve hard problems (especially ones that seem easy), and I think most scientists feel the same way.

Again, that's just another opinion. Another rationale added to many others.

But what do the data suggest?

I turned to the ISI Web of Knowledge, which provides excellent journal-specific metrics such as the total number of citations a journal has received, the journal's impact factor, and so on. (Sadly these metrics are closed to the general public; you need a license to view them.)

For each journal I got its 2010 impact factor as well as the total number of citations the journal has received. The former measures the average number of times papers the journal published in 2008 and 2009 were cited during 2010. The latter is more of a cumulative "popularity" measure that grows with the number of publications in the journal.
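For concreteness, here's a toy version of that impact factor calculation in Python. Every count below is made up, picked only so the answer lands on NEJM's 53.5:

    # Toy 2010 impact factor calculation (hypothetical counts, not real data)
    cites_in_2010_to_recent_papers = 5350  # citations in 2010 to 2008-2009 papers
    citable_items_2008_2009 = 100          # "citable" items published in 2008-2009

    impact_factor_2010 = cites_in_2010_to_recent_papers / citable_items_2008_2009
    print(impact_factor_2010)  # 53.5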

I wanted to see what impact social media has on these two common metrics. To do that I looked to Facebook and Twitter to find which journals have their own Facebook pages and Twitter accounts. In the end I was left with 17 of the "top" peer-reviewed general science and medical/biomedical publications that had both a Facebook page and a Twitter feed:

Nature, PNAS, Science, The New England Journal of Medicine, Cell, The Lancet, Journal of the American Medical Association, British Medical Journal, PLoS Biology, PLoS Medicine, Genes & Development, Nature Medicine, Genome Research, The Journal of Experimental Biology, and Cell Stem Cell

For each of these journals I then noted the number of "likes" on its Facebook page, which is a measure of how many people will see an announcement made by that journal. I also took note of how many Twitter followers each journal has, as well as how many people the journal follows and how many tweets per day the account makes, on average.
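If you want to play along at home, the whole dataset fits in one small table. Here's a sketch of its structure in Python/pandas; the column names are mine, and every number is a placeholder except the handful quoted later in this post:

    import pandas as pd

    # Sketch of the dataset (placeholder rows; the real values came from
    # ISI Web of Knowledge, Facebook, and Twitter for all 17 journals).
    journals = pd.DataFrame(
        [
            # journal, total cites, 2010 IF, FB likes, followers,
            # following, total tweets, tweets/day
            ("Nature",       511145, 36.1,  50000, 25000,  300, 3000, 3.0),
            ("Science",      450000, 31.4,  40000, 22000,  400, 2500, 2.5),
            ("NEJM",         200000, 53.5, 232385, 31863,  150, 2000, 1.2),
            ("The Lancet",   150000, 33.6,  30000, 15000,  500, 1800, 1.5),
            ("BMJ",           90000, 13.5,  15000, 20000, 1471, 5405, 5.4),
            ("PLoS Biology",  20000, 12.5,   8000,  9000,  700, 1500, 2.0),
        ],
        columns=["journal", "total_cites", "impact_factor", "fb_likes",
                 "followers", "following", "total_tweets", "tweets_per_day"],
    )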

First, some simple metrics:
* Nature has received the most total cites, with 511,145
* The New England Journal of Medicine had the highest 2010 impact factor, at 53.5
* NEJM also has--by far--the most Facebook likes, with 232,385
* And NEJM has the most Twitter followers, with 31,863
* The British Medical Journal has made the most tweets, with 5,405
* The BMJ follows the most twitterers, with 1,471
* And the BMJ is the most chatty, averaging 5.4 tweets per day

First-pass metrics show that there is a strong correlation between the total number of tweets a journal has made and its number of followers (r = 0.63, p = 0.012), as well as between the number of followers and the number of tweets per day (r = 0.54, p = 0.037).

Tweet more, you get more followers.


(It's important to note that all metrics are on a log scale!)
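For the curious: once everything is log-transformed, each of these correlations is a one-liner. A sketch, reusing the hypothetical journals table from above:

    import numpy as np
    from scipy.stats import pearsonr

    # Counts like tweets and followers are heavily skewed,
    # so correlate their logs rather than the raw values.
    r, p = pearsonr(np.log10(journals["total_tweets"]),
                    np.log10(journals["followers"]))
    print("r = %.2f, p = %.3f" % (r, p))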

Interestingly, total citations and impact factor are only weakly correlated (r = 0.49, p = 0.063).

The question is, do social media metrics explain any of the variance in citations or impact factor?

Again, looking at simple correlations, we see that the number of Facebook page likes a journal has and the number of Twitter followers it has are correlated (r = 0.53, p = 0.044). This probably reflects a general measure of social networking engagement.

But what's crazy is that the number of Facebook page likes is strongly correlated with the total number of citations a journal has received (r = 0.78, p = 0.001)!


Both Facebook page likes and number of Twitter followers correlate (equally well!) with impact factor (r = 0.59, p = 0.021 for each).

For those of you who like a little more scientific rigor than a bunch of uncorrected correlations, I also ran separate multiple linear regressions (on the log-transformed data) for total citations and for impact factor. What's interesting is that, in both cases, the number of Facebook page likes is the dominant significant predictor of both citations and impact factor.
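Here's roughly what that looks like with statsmodels, again on the hypothetical journals table from above; with the real 17 journals there are enough rows for the fit to mean something:

    import numpy as np
    import statsmodels.api as sm

    # Regress log impact factor on the log-transformed social media metrics.
    X = sm.add_constant(np.log10(journals[["fb_likes", "followers",
                                           "following", "tweets_per_day"]]))
    y = np.log10(journals["impact_factor"])

    fit = sm.OLS(y, X).fit()
    print(fit.summary())  # which predictors carry independent weight?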

However, for the impact factor analysis, both the number of people the journal follows on Twitter and the number of tweets per day are inversely related to impact factor (partial correlations of r = -0.44 and r = -0.66, respectively)!
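Partial correlations are simple to compute by hand, if you've never done it: regress both variables of interest on the controls and correlate the residuals. A sketch, with the same placeholder-data caveats as above:

    import numpy as np
    import statsmodels.api as sm

    def partial_corr(x, y, controls):
        """Correlation between x and y after regressing out the controls."""
        Z = sm.add_constant(controls)
        rx = sm.OLS(x, Z).fit().resid
        ry = sm.OLS(y, Z).fit().resid
        return np.corrcoef(rx, ry)[0, 1]

    # e.g., tweets per day vs. impact factor, controlling for other metrics
    controls = np.log10(journals[["fb_likes", "followers", "following"]])
    print(partial_corr(np.log10(journals["tweets_per_day"]),
                       np.log10(journals["impact_factor"]),
                       controls))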

So what does all of this mean?

Well, it seems like having a social media presence is probably a reflection of a journal's general popularity, but, on average, the journals that engage in the most social media activity (amongst these top-tier journals!) show the lowest impact factors.

That's a little disheartening.

16 comments:

  1. Anonymous 08:30

    don't you need to look at how impact factors are changing year on year, to see if SNS are having an impact?

  2. Of course! This is a first-pass analysis. Were I going to publish this, I'd dig deeper. But I've got a day job :)

  3. How about--high impact factor journals have already 'made it', but the others are trying harder?

  4. Dan H 09:49

    Scopus has an easy way to compare journal metrics from 1999 on.
    Their SJR measure (similar to impact factor) shows that most journals have been fairly constant before and after the age of social networking. Cell is the only one showing a decline, but the whole curve shows that decline, so it's not social-networking related.

    Their Source Normalized Impact per Paper (SNIP) metric shows a bit of flux. Essentially, the normalization is a way to correct for publication-frequency biases across fields. JAMA, NEJM, & Lancet all increase from 1999 to around 2004-2006 and are then flat. Nature & Science are flat until 2007 & then start showing large increases. Cell is flat across most of the time window.

    Perhaps the clinical journals reached their peak audience before social networking & their networks just show this while Science & Nature are bringing in more readers. If you don't have Scopus access, I can see what I'm allowed to pass along if you want to dig into this further.

  5. I'd be interested in a longitudinal study of individuals. Groups 1 & 2 matched by h-index or what have you at time T1. Group 1 tweets, Group 2 does something non-social. At time T2 we check h-index/citations etc. Alas, day job... :-)

  6. I've tried coming up with something witty to say and all I have is... Thank you... I love how you break things down for people who aren't as sciency so that they can still understand what you're trying to say.

    Blerg...not that witty....

  7. Given that impact factors are

    a) negotiated not calculated
    b) not reproducible (i.e. incorrect by up to 20%)
    c) the mean of a highly skewed distribution (i.e., an undergraduate mistake)

    it is not all that surprising that your results are somewhat counter-intuitive (or disheartening). Using a less flawed metric (i.e., any other) will probably make more sense.

    For sources for the above claim see:
    http://bjoern.brembs.net/comment-n397.html
    http://bjoern.brembs.net/comment-n499.html

  8. Very nice. As you say it's just a start but it's a good start.

    Ideally, I'd want to look at this on a paper-by-paper basis, because ultimately papers get citations, not journals. And nowadays I think almost everyone reads individual papers rather than actually reading a copy of a journal as they perhaps did 30 years ago.

    So what I'd want would be some metric of "social impact" to correlate with citations.

    Something like, the number of blog posts about the paper, or the number of Tweets (including reTweets) linking to it (or linking to a blog post about it).

    Social impact is probably going to be much faster than citations. So you could do this prospectively. See if the social impact within, say, 30 days of publication, correlates with citations 5 years later.

  9. @rpg: almost *certainly*. But smaller journals often don't have facebook and twitter accounts, so I had no data to go off of.

  10. @AmAmDa: Thank you :)

  11. @Bjoern: Oh, absolutely. There's no way I'd dispute the multitudes of problems with IF. That's why I looked at total citations, too... but yes, I shouldn't be propagating a poor system. Hell, I've written about these issues, too. I believe most young, open science bloggers probably have ranted about this at some point :)

    http://blog.ketyov.com/2011/01/something-ghoti-with-science-citations.html

  12. @Dan & @Neuroskeptic: I still have access to SCOPUS, I believe.

    Like Neuroskeptic suggests--looking at the effect of social media propagation on a paper-by-paper basis would also be interesting. PLoS and Frontiers in Neuroscience are tracking a lot of metrics, which is a great step toward this.

    But I think a better statistical method would be to compare two groups of journals that have similar library distribution, total citations, thematic content, etc. where they only differ by their social media presence.

  13. Dan H 05:52

    I think the trouble with the paper-by-paper basis is that, once a paper crosses some threshold & starts flying around social media, there's something unique about it & it would be hard to identify similar articles that didn't have the social media component. It could be interesting to figure out what makes a specific article cross that threshold. For example, I suspect there are certain science writers who greatly amplify the social media presence of an article if they decide to write about it. Perhaps one could look at the citation counts of research covered in the NY Times science section before & after the social media explosion (though that's far from automated data mining).

    For the effect of social media on citations, comparing internal changes within journals could allow for some automation & averaging, but even the first-level data look so messy. Why Science & Nature show citation increases with the growth of social media while NEJM doesn't probably has to do with more than just social media.

  14. Having just blogged about your original post (http://blog.uta.edu/~bradley/2011/07/25/the-role-of-facebook-and-twitter-in-scientific-citations-and-impact-factors/) before reading the comments, which contribute so much to the discussion (sorry--no unsupportiveness intended :-{)} ), I suspect that what commenters might be getting at is the difference between promoting a journal (which is what you're looking at) and promoting an individual('s work). Paper-by-paper measurement would most likely take a different approach than impact factor.

    Working in a university library, I am aware of 2 things: authors facing the challenge of publishing (and individuals striking out on their own), and journal metrics' advantage over social media analysis in terms of precise definition and measurability. Even Johan Bollen's work with usage statistics (http://www.mesur.org/MESUR.html) sticks with journals.

  15. Thanks for the mention, Brad!

    Paper-by-paper is *definitely* not related to impact factor. It's been shown (and Bjoern Brembs shows above in one of his links in his comment) that the IF of a journal can only *very* weakly (if at all!) predict the number of citations any one paper will receive. There are simply too many human factors involved in the publication process.

    And yes, I agree, paper-metrics are very different from what I'm looking into here (but they still interest me).

    Did you all see that PLoS just opened their article-level metrics API?
    http://blogs.plos.org/plos/2011/07/plos-article-level-metrics-api-launched

  16. A better way of measuring impact might be to see how often the journals are mentioned online in general - using a social media monitoring tool would help. Happy to assist if you need it.
