Is it Ethical to Harvest Public Twitter Accounts without Consent?

While I was participating in the workshop on Revisiting Research Ethics in the Facebook Era: Challenges in Emerging CSCW Research, the question arose as to whether it was ethical for researchers to follow and systematically capture public Twitter streams without first obtaining specific, informed consent from the subjects. Many in the room felt that consent was not necessary since the tweets are public, a conscious choice made by the user to allow the whole world to see her activity. In short, a user who does not restrict access to her account has no expectation of privacy.

I argued, however, that we cannot be so quick to presume the expectations of potential research subjects. Yes, setting one’s Twitter stream to public does mean that anyone can search for you, follow you, and view your activity. However, there is a reasonable expectation that one’s tweet stream will be “practically obscure” within the thousands (if not millions) of tweets similarly publicly viewable. Yes, the subject has consented to making her tweets visible to those who take the time and energy to seek her out, those who have a genuine interest in connecting with her and viewing her activity through this social network.

But she did not automatically consent, I argue, to having her tweet stream systematically followed, harvested, archived, and mined by researchers (no matter the positive intent of such research). That is not what is expected when making a Twitter account public, and it is my opinion that researchers should seek consent prior to capturing and using this data.

A healthy debate on this issue followed, and continued in a separate thread on Facebook, which included the following varied positions & responses (edited and condensed):

  1. “…if the account holder tweets to the general public, then it’d seem like there’s no expectation of privacy so no consent would be necessary.”
  2. (me) “But isn’t my expectation that even though my tweets are public, they’re often lost in a sea of hundreds of tweets among my followers, and I never anticipated someone would archive, mine, and perform research on them?”
  3. “If you’re comfortable with your anonymity being guaranteed only by virtue of your public tweets being hidden in plain sight among millions of others, then you’d have to realize that some determined person could follow just yours, archive them, and analyze them. I like my privacy, but I don’t worry about walking around a city or campus even though …”
  4. “…depends on how data are being presented – e.g. in aggregate vs specific “quotes” that could easily be traced.”
  5. “Many IRBs would say yes [consent is needed], or at least would require you to get a waiver–publicizing the extremes to which IRBs go…”
  6. “…IRB application is required. You could request that Informed consent be waived with the argument that you are only analyzing tweets broadcast publicly, and that you de-identify your data to eliminate potential risk to the individual”
  7. “I would say if it is for research and you are dealing only with publicly available documents, then no, you need no consent. you can run that by the irb and get a waiver, but in the end, you are dealing with publicly available documents… not people, subjects. If you are dealing with subjects and not documents, then you will need irb clearance.”
  8. “Tweets are publications. I think it’s absurd to even consider IRB review for anything dealing with things people have published”
  9. “The questions are: 1) Are you conducting research that is intended to be published; 2) Does your research involve human participants; 3) For these human participants, will you gather data through intervention or interaction with the individual; and/or will you gather identifiable private information about them. (45 CFR 46.102(f))
    If these 3 conditions are met, your research must be reviewed by IRB. They will work with you and determine whether or not informed consent is required. In your case, if you are NOT interacting with the individual publishing the tweets, and the tweets are broadcast and searchable as public records (that is, you don’t need access to their account to view tweets posted to a limited audience), then it won’t fall under the definition of research with human subjects.”
  10. “If i download all of Michael’s published papers, blog posts, twitter posts and each one he publishes thereafter… are they the same? or different? I’d argue the same, just for different audiences.”
  11. (me) “What if tomorrow, I decide to take my Tweet stream private. And I delete my blog posts. Does my affirmative action to purge my documents from the “live” web mean that you (researcher) need to treat that previously archived material differently?”
  12. “If the individual changes their intent regarding release of data, then by IRB standards what might previously have been considered publicly available information, then becomes private information, and your collection would likely require BOTH IRB review AND informed consent, b/c the user now has an expectation that their information is protected.”
  13. “Once tweeted, a birdsong is gone forever. No deleting or taking back what’s been broadcast to the world. If someone seeks privacy, they should seek another method of communication. If from the beginning, there was some kind of inherent expectation that tweets were private messages, then the situation might be different. But the whole idea of tweeting is to voluntarily publish or broadcast. It’s different from, say, e-mailing or IMing.”

What we see here are numerous intelligent researchers not in complete agreement about whether consent is necessary, about whether one’s tweets are “publications” not needing IRB review, or about whether Twitter-based research deals with “human subjects” and thus does require strict scrutiny. There’s also some question about how to deal with the fact that users might make information private after an initial release, something our current forms of communication allow more than in the past.

What do you think? If you have had experience with related research ethics issues, and with how your IRB dealt with them, please email me or leave a comment.

Aside: Interestingly, someone I’ve friended on Facebook saw that discussion and wanted to repost the thread on his blog. Respectful of the delicate nature of re-posting other people’s conversations and moving them from the controlled environs of Facebook to a public blog, he contacted me to ask permission. He didn’t, apparently, contact each of the commenters to ask for their permission. I felt it necessary to get consent from everyone in that thread before authorizing its re-posting. When I asked each of them, all agreed (with some edits), and some took the position that the Facebook conversation was de facto public, even though technically only a certain set of users (friends of the participants) could in reality see the thread.

Zimmer CSCW 2010