We browse, therefore we’re data
Chances are high you got here from a link on Facebook or Twitter. Or maybe you recently ordered a holiday gift from Amazon. If you’re like me, you use these social media platforms and web services with a hint of resignation and possible distaste for the way your data is tracked, stored, and commodified. For example, I know Facebook “follows” me around the internet, builds up ghost profiles of friends and family such that even those who choose not to be “on” it become, unwittingly, another line in Facebook’s giant ledger of data. The algorithms to capture and use data are so good that some people, I learned from a recent episode of the Reply All podcast, are convinced Facebook is actually listening to them through their smartphone microphone. (Hint – they’re not, but that doesn’t make the tracking much less invasive.)
We benefit, so we put up with it
Why do I and so many others keep using these services, knowing how we’re tracked and commodified? Mainly through self-examination, I’ve come to believe it’s for the convenience. Amazon’s analysis of my past purchases occasionally leads me to a product that turns out to be helpful and enjoyable. Facebook allows me to easily stay in touch with a large network of family and friends. Twitter links me to interesting and timely content about both personal and professional interests. Sure, I’m trading a fair amount of privacy, but in the micro-transactions of my daily browsing and meanderings online, it seems worth it.
DNA as data commodity
The transactional nature of trading our data for convenience, or perhaps trading it for entertainment (hello, Netflix), has a parallel in direct-to-consumer (DTC) genetics. While DTC genetics has been going strong for over a decade, it’s only been in the past year that privacy issues seem to have really grabbed the popular consciousness. AncestryDNA had a public kerfuffle when a consumer protection attorney wrote about how the company’s terms of service grant it “the broadest possible rights to own and exploit [customers’] genetic information.” In late November, Senator Chuck Schumer voiced concerns about the opaqueness of DTC genetic testing companies’ uses of customer data and called on the Federal Trade Commission to investigate companies’ privacy policies.
I don’t know how often the typical DTC customer considered privacy implications in the past, but given the recency of these more public debates I find it unlikely they would be completely unaware of privacy implications now. So why do DTC customer numbers keep growing by the millions? (See, for instance, here and here.) I suspect for the same reason we keep using the aforementioned online services. For DTC testing, it’s often entertaining to learn about genetic ancestry and to connect with relatives. Even much of the health and wellness trait information can seem “recreational” if taken lightly enough: “I have an increased chance of restless leg syndrome? That’s why I’m always fidgeting!” or “How silly, they say I likely have hazel eyes when in fact mine are brown…”. We have a vague sense our DNA sample is being stored and perhaps sold, but what we get in exchange seems, again, “worth it.”
Implications for research
While I research DTC genetics, I work in genetics research. So I also think about how these data transactions are changing the way genetic research occurs. When we exchange personal data (genetic or otherwise) for individual gain, that is fundamentally different from how research has traditionally operated. Researchers ask people to join studies not for their own personal benefit, but to benefit society at large. To increase knowledge and improve human health, not the individual participants’ knowledge or health. How can researchers continue to attract participants if they’re not offering a data transaction with direct individual gain?
The answer may be that they can’t. I’ve argued that researchers should consider offering individual data back to participants, if they hope to attract participants at a rate even comparable to DTC companies. Some academic groups are adopting a hybrid model, where they collect genetic data for research while returning to the participant a DTC-like report of personal information (e.g., Gencove, DNA.land). A new startup, DNAsimple, is offering individuals $50 each time a researcher wants to use their data for a new project. There are and surely will be many more examples like this, where academic researchers leverage data transactions in which the individual consumer/user/participant directly benefits in exchange for their contribution (sometimes more knowingly than others) to a larger dataset.
In October, my graduate program hosted a public screening of “The Immortal Life of Henrietta Lacks,” followed by an expert panel discussion. For those unfamiliar with Rebecca Skloot’s book and the story of HeLa cells, I highly recommend these two Radiolab episodes as background. Briefly, Lacks was an African-American woman diagnosed with cervical cancer in Baltimore during the 1950s. A sample of her tumor cells, taken during her time at Johns Hopkins Hospital, gave scientists their first success at growing living human cells in the lab, something they’d been desperately trying to do for quite a while. Her cell line was called “HeLa” and basically kickstarted modern biomedicine.
One of the many things I appreciated about the expert panel discussion (a full recording of which is available on YouTube) was how it helped to parse and sort various issues raised by Henrietta Lacks’ story. Watching the movie, you viscerally get a sense of multiple lines of injustice, of anguish, by Lacks herself and by her descendants and in particular her daughter, Deborah (played by Oprah). There’s the abundant racism, the absence of informed consent, the financial injustice of a wildly profitable biotech industry enabled by the unwitting donation of a poor black woman. Taking this story in, all your protests can easily get muddled up into one ball of outrage. But, as the panelists that night helped me see, it’s helpful to pick it apart a little. In doing so, you come to realize that perhaps the most unjust aspects of this story are not the extraordinary details but the rather ordinary backdrop of systemic racism.
One major theme in this story is the lack (no pun intended) of informed consent. Lacks was never asked (or even told) about her tumor cells being used for research. Furthermore, when her family members were later brought in for genetic testing — an opportunity for the researcher-clinicians to come clean — the family was actively misled. They were told it was for their own benefit (to detect cancer risk), rather than the real reason of wanting genetically-related samples to help detect HeLa cells in the lab.
This issue is a little thorny, even by today’s standards of biomedical research ethics, some of which were not even codified until the 1960’s. “Biobanking” is a fairly common practice now, especially in healthcare systems tied to research institutions. Biological samples collected for different purposes, sometimes clinical treatment (e.g., tumor biopsies), get stored and potentially used for various downstream research projects. The samples are typically de-identified and therefore subject to a laxer form of regulatory oversight. As Dr. Wylie Burke mentioned on the panel, ideally such systems make patients aware of the biobank and at least give them an opt-out option — neither of which applied to Henrietta’s tissue “donation.”
Another objection that comes up when watching or reading the HeLa story is that of profit sharing. The HeLa cell line, while initially given away freely by Johns Hopkins researchers, eventually became a massive revenue generator for numerous biotech companies. Yet the Lacks family, to this day, remains socio-economically depressed. While certainly unfair, by current legal standards I think it’s unlikely the Lacks family would be granted a share of the biotech companies’ profits, potential ethical obligations aside. A 1990 legal case sets a precedent here: Moore v. Regents of the University of California. Cells taken from a cancer patient, John Moore, became a cell line patented for commercial use. Moore sued and lost, the court ruling basically that Moore did not have a financial interest in the cells once excised from his body. (Note, this case also involved issues of informed consent, which of course was breached in the case of HeLa cell line creation.) The legal precedent of no property rights in excised cells/tissues was upheld in 2003, in Greenberg v. Miami Children’s Hospital Research Institute. An important caveat with comparing HeLa to this case is, again, that in Greenberg the court was considering a scenario where tissue samples were freely donated by patients, i.e. with proper informed consent.
The circumstances I’ve covered so far are indisputably concerning, but what if they’re distracting us from a greater injustice? This was the possibility raised to me by the expert panel after the screening that night, and in particular the renowned bioethicist Dr. Wylie Burke. What if the real whammy here, crystallized in this one extraordinary story, is the background of structural and institutional racism? This is different than the specific question of financial interests in the previous section. This is about a history of underprivilege and lack of access to educational, employment, and health resources.
One specific example was brought up by the panel moderator and former Dean of the School of Public Health, Dr. Gil Omenn. Henrietta went through two pregnancies likely with this tumor on her cervix — why did her health care providers not catch this? It was only once the tumor was advanced enough that Henrietta could feel it herself that she began to receive treatment. We can’t say for sure, but it seems likely that her standard of care was inadequate. You also see the system continue to fail for Henrietta Lacks’ living family members, many of whom suffer chronic physical and mental illnesses. These background injustices would be no less salient even if Henrietta Lacks’ cancer cells had not gone on to fuel a biotechnological revolution. It’s because her cells were so extraordinary, and because Skloot was so stubborn about accessing the family and this story for her book, that we get to and should talk about these underlying injustices.
I don’t know how to say this, but…I’ve been seeing other blogs!!
Just kidding. I mean I’ve started writing on another blog, hosted by the University of Washington Genomics Salon. The Salon brings together a diverse group of students, post-docs, faculty, and community members to discuss issues relevant to science and society. I am proud to be one of several co-organizers for the Salon, and have moderated or co-moderated two sessions in the past.
Here’s my guest post giving a recap of a Salon session I co-led last month on the topic of science and metaphor. The other co-moderator was the esteemed Dr. Leah Ceccarelli, rhetorical scholar and metaphor expert. Leah was a member of my Master’s thesis committee, a project I’ve written about previously here as well.
Enjoy, and I encourage you to follow the Salon blog as well!
I am currently fielding an online survey as part of my dissertation project. If you have taken a direct-to-consumer (DTC) genetics test and are at least 18 years old, please consider taking my survey. And feel free to pass it along to friends, family, and colleagues!
I’ll be back with another more substantive post soon.
Terra cotta warriors
During the last week in August, I attended a special exhibit of the Terra Cotta warriors at Seattle’s Pacific Science Center. The warriors are a touring collection of a tiny subset of the vast clay “army” buried over 2,000 years ago, encircling the tomb of China’s first emperor. While the clay soldiers have been excavated and meticulously studied and cataloged for decades, the tomb itself remains untouched and unopened. Why? Researchers are afraid that the mere act of opening it will irrevocably change (and perhaps destroy) its contents.
It’s an odd coincidence that I heard this story at the current moment of my dissertation research, as I’m beginning to realize that by studying my area of interest, I may actually be changing it. I have no delusions of grandeur that my poking around is akin to opening the tomb of China’s first emperor, undisturbed for millennia…but there is enough of a parallel for consideration.
Prospecting a survey
My dissertation research aims to understand what people do with their “raw” or uninterpreted genetic data, typically obtained from direct-to-consumer genetic testing (DTC-GT, we’re talking 23andMe, AncestryDNA, etc.). The section of the project I’m broaching now is to survey individuals who have done DTC-GT to understand whether they’ve accessed their raw data and, if so, what – if anything – they’re doing with it. A few gracious souls, some strangers and some friends, have agreed to pre-test my survey instrument before it goes out into the field. As I might have suspected, several of the pre-testers were not previously aware that their raw data was available to them. In taking my survey, they inevitably find out.
There’s nothing inherently wrong with downloading your data, but I don’t want to be the one to put a finger on the scale when someone is deciding whether to do it. And it’s not without consequence. For one, genetic data can be sensitive, and downloading it means it’s only as secure as your computer/device. Plus you might decide to use third-party interpretation services, another area of my research, several of which can be a waste of time and money, and again may potentially expose your data to security breaches.
Ultimately, I want to understand the behavior and motivations of both those who do and don’t get their raw data, without influencing or encouraging any particular behavior. I’m considering adding a disclaimer to the end of the survey, clarifying that I am not suggesting people should get their raw data if they haven’t already done so. But in a way, I’m asking them not to think of a pink elephant.
Social scientists will be screaming at me by now, “Duh, this is the Hawthorne effect!!” Observe people’s behavior and they will change it. Maybe this is actually a mini twist, where by asking people about their behavior, I may be changing it. At any rate, I am touching my finger to a pond and making ripples, when really I want to hover over the pond, swooping from shore to shore, and taking note of all there is to be seen.
I do take some solace in knowing that, unlike the sacred stillness of the Chinese emperor’s tomb, I am asking people about something they engage with in the messiness of everyday life, where they are exposed to many people and ideas that could ultimately plant the same seeds in their head. If not my survey today, then maybe 23andMe’s website tomorrow would have reminded them their raw data was available for download. But nevertheless, I am responsible for my actions as a researcher and for the consequences of asking the questions I do – a responsibility I take seriously.
(I hope you’re not still thinking of that pink elephant.)
On Monday the consumer genomics company Helix launched a “DNA App Store:” a one-stop interpretation shop for your personal genomic information. Commercialization of personal genetic information has been gaining momentum for over a decade, mostly through direct-to-consumer testing companies such as 23andMe and AncestryDNA, but this announcement from Helix seems to represent a phase change. Watching the fallout online (including some excellent coverage by the MIT Technology Review and Wired), this development strikes me as a Rorschach test for one’s feelings about personal genomics in general. You may cheer the further democratization of the genome: give people their data, damn it! (Though note, Helix actually doesn’t let people download their “raw” genetic data.) More likely, if you’re in the genetics research or medical community, you’re nauseated at the further commodification of genetics and interpretive overstepping of companies in this space. My reaction is mixed — I’ll give a few quick takes below.
But first, a summary of what Helix is doing. The company spun off from Illumina, the genetic technology giant that has dominated the DNA sequencing space for many years now. For a flat fee of $80, the Helix consumer service will sequence your exome, or the ~3% of your genetic material that codes for proteins. Helix holds onto your exome sequence for you and then lets you choose from a menu of interpretation services: this is the “DNA app store.” The services currently come in six categories: ancestry, entertainment, family, fitness, health and nutrition. These services are being developed by other companies and laboratories, then somehow vetted by Helix before being offered to its exome sequencing customers. Current ones include “Wine Explorer” by Vinome, which makes wine recommendations based on select genetic variants; an “Inherited Diabetes” analysis by Admera Health; and a determination of which traits may originate from Neanderthal ancestors, from Insitome. Also not to be overlooked: a company that will make a custom scarf based on your DNA sequence.
Before I get too snarky, I do want to point out some potential benefits of services like this.
- People may get excited about and interested in genetics. And they may just have some fun.
- People may take the time to learn more about the science behind some of these products (including the limitations). Increasing “genetic literacy” through this avenue could benefit people when encountering genetics in a more serious venue — for example, in a clinical test.
- Some traits and conditions have a primarily genetic cause that is easy to detect and can lead to improved health and quality of life. A good example is hereditary hemochromatosis (HH): while not currently in the Helix app store, it’s relatively straightforward to test for the known causal variants in the HFE gene, and it can help people access simple yet impactful treatment (regularly donating blood can have a huge impact on people with HH). Note most common diseases are not so genetically simple, so this is currently a slim category of consumer genomics tests with useful health impact.
- People may waste time and money on these services.
- The information may be inaccurate and misleading.
- People may start to think of genetic information as frivolous and unreliable. This could pose a problem for an envisioned future of health care that integrates genetic information. If people’s first exposure to their genetics is in this often scientifically flimsy space of gee-whiz-adry, they may have a hard time taking it seriously down the road.
Ultimately, many of these consumer-facing interpretive services seem to me like Narcissusome sequencers. We thirst for personal data, regardless of its relevance or utility, and can easily get lost in the fascination of it all.
I too am fascinated by genetics—that’s what drew me to the field I work and study in today. But we need to keep a healthy dose of skepticism moving forward, as I suspect in the coming years we will see many more like the Helix app store.
A mentor once passed on to me some exceedingly sage advice about writing: it’s very hard to write when you don’t know what you’re trying to say. So many of my difficulties in academic writing are explained by this obvious but compelling observation. It’s not that I don’t know things, but it’s hard to figure out what I’m trying to say about those things and in what way.
Build it up and strip it down
I blame this partly on my graduate school training. Granted, perhaps I put this on myself, but much of the intellectual work I’ve done in recent years has been about trying to build things up into complication and then to simplify them back down again. You may start with a basic idea or question, but then you must dress it up with theories, models, Foucault, etc., to gain traction with your academic audience. So you spend a year or two “complicating the hell out of it” (to quote Harrison Ford in Six Days Seven Nights), but then comes time to write it up (for your committee or for a journal publication) in the face of short attention spans and word or page limits. You’ve got to come up with a pithy and insightful introduction and conclusion, distilling all your many months of complication into something streamlined and laser sharp. This. Is. Hard.
But when you see the alchemy of simplicity-from-complexity done well, either in a talk or a paper, it is truly miraculous. I admire this skill more than any other academic prowess. Why don’t we celebrate this more, I wonder? Why does academia often dismiss or denigrate simplicity? It deserves more respect and more examination, which I will try to give it in the reflections below.
Back to the barre
From ages 8 to 18, I took on average about three ballet classes a week. Dance was my extracurricular activity of choice, and where I felt happiest during my tumultuous teenage years. My teacher was great – a brilliant choreographer and compassionate instructor. I went on to continue my dance education at UNC-Greensboro, a school I chose in part because of its excellent dance department. I started taking ballet classes with a dance legend, Gerri Houlihan. I went into my first class with Gerri expecting a whirlwind of complicated choreography. But instead we went to the barre and Gerri, calmly and with her lovely smile, started demonstrating the combinations. Plies, tendus from first, tendus from fifth…and I started to realize “wait, these combinations are…easy.” Simple, slow…easy.
It was a few classes in that it dawned on me these combinations weren’t easy. They were simple and slow, yes, but this meant you had nowhere to hide. Your technique was front and center, not glossed over by tempo and crazy patterns. My technique improved leaps and bounds (pun intended) during those classes with Gerri, in large part due to the simplicity of the choreography.
Me and the undergrads
I was reminded of ballet class with Gerri this past quarter, when I took an upper level undergraduate class in the Information School at UW. It’s not out of the ordinary for graduate students to take such classes, but it gets a little more unusual when you’re already a few years into a PhD program. But I was interested in the topic, “Information Policy and Ethics,” so I got myself into the class anyway. I attended every lecture and did almost none of the readings (hopefully my TA has already turned in grades if she is seeing this). Some of the moral philosophy content was review from my bioethics courses, but refreshers never hurt. The general pace of the class, and the fact that I wasn’t too taxed by the material, gave me room to come up with some rather interesting ideas for the term paper. The professor said to me at the end, “Wow, you must have been really bored during this class!” On the contrary. It gave me room to think and process, to distill some complicated ideas into what I think was a rather compelling and novel argument in my term paper. Simple — yes. Easy — no.
Three Minute Thesis
The final tale of simplicity I’ll share here was also from the last few months. I participated in a campus-wide speaking competition called “Three Minute Thesis.” Contestants are given three minutes and one PowerPoint slide to present their research project in front of a general audience and panel of judges. In parallel to my preparation for the TMT, I was also putting together a longer conference talk on the same research results. With the TMT, you have to measure out each phrase and idea you want to express — it is the ultimate stripping down. But going through that process, thinking “what is the one thing I want people to know here” helped me similarly focus my conference talk. (It also reminded me I should wear deodorant during high-pressure speaking events.) I think that for future research products, whether it be another talk or a manuscript, I should similarly try to assemble the “three minute” version to help me really hone in on the main points. Again, simple, and hard.
The graveyard of papers
Writing academic papers is a long process. The research is long, writing is long, journal review and revisions are long…it can all be rather tiring. I’ve only done this process a few times and I already grow weary at the thought of doing it once more. The result is what I’ve summed up in this graph:
That is, the longer you stay in academia, the larger your graveyard of abandoned papers becomes. Co-authors lose interest, you lose time, other projects take over. Too many journals reject it so you eventually give up. How sad. Collectively, how many thousands of hours spent conducting and reporting on the research, only to be shelved into the darkness. Sigh. I don’t exactly know how to fix this, but I suspect it’s some mixture of belligerence and….yes, simplicity. Economy of words, streamlining of ideas, stripping out the unnecessary complications you added in to gain traction with your peers and mentors.
If you can’t answer “why does this really matter,” there’s always the danger that it doesn’t.
“If there was an architect going through the neighborhood and they were drawing plans, I want a copy of the plans of my house…I am not going to build a house, I just want it.”
The above quote is from a focus group participant in a research study conducted by some of my colleagues at the University of Washington. The topic of the focus group was people’s willingness to participate in genetic research and whether they would want to receive their individual results if they were to participate. Here this participant is comparing her genetic results to architectural plans for her house, implying that she would want the information not for any specific use but solely for the sake of self-knowledge: it’s about her, it’s hers, and she wants it. (For my previous research on this and other metaphors, see this post.)
I have previously argued that research participants should be offered access to their personal genetic data. I recognize, however, that there are trade-offs to enabling such access. Through an information ethics and policy class I am currently taking, I have been reexamining arguments for personal data access. Below I present one product of this reexamination: exploring the intrinsic and instrumental value (or perhaps lack thereof) of having one’s personal genetic data.
Intrinsic value is worth in a being or object that originates merely from its existence. It is valued as an “end-in-itself” or, more colloquially, “just ‘cause.” The architect plan quote illustrates intrinsic value of genetic information. This woman doesn’t intend to use the information to do anything in particular: she’s not going to “build a house.” However, she feels some ownership and interest in the data as an end-in-itself. Information about our genetic make-up can have a special status. It’s familial, it’s personal, it’s perhaps integral to our very being. Those intrinsic aspects alone might draw us to want our genetic data.
Instrumental value flows from the usefulness of a being or object to achieve some outcome, as a means to an end. There are several ways personal genetic data can be instrumentally valuable to an individual. The routes to potential utility most prominent in my mind are online third-party interpretation tools, the subject of my research. These tools are heterogeneous in their creators, scope, and applications. People trying to learn about their genealogy or relatives have a pretty straight shot to utility via personal genetic data access plus third-party tool analysis. People who want concretely useful information about their health and wellness may be a little more hard-pressed to find it — in some cases because of the nature and evidence base of these tools, and in other cases just because we don’t yet know enough about the genome to make robust predictions.
Notably, some third-party tools are crowd-sourcing genomes for research, an aspect that supports the instrumental value of personal genetic data access a bit more strongly than the individually-focused efforts. But for these tools, the instrumental value arguably exists at the aggregate level, not in individual genomes.
Much of the broader discussion about what genetic information should be returned to people across different contexts invokes the general idea of utility. If there were ever a word to invoke instrumental value, this is it. Some thresholds for returning genetic information to people (e.g., patients or research participants) rest on the idea of clinical utility: genetic information that is useful in deciding the course of clinical care. Others have argued for considering non-clinical, personal utility: things like being able to make personal, non-medical choices based on genetic information (e.g., family planning, career choices, insurance coverage).
One aspect of personal utility can be the “value of knowing,” and here I wonder if we’re actually circling back to a concept of intrinsic value. Some people value having a genetic “answer” to their condition (in the case of a disease-causing genetic variant), even if there is nothing they can really do about it. Is this valuing the information as an end-in-itself, for intrinsic worth? Perhaps it’s also partly instrumental value, because the knowledge brings about peace of mind, some end to a diagnostic odyssey, perhaps.
Examining intrinsic versus instrumental values of objects and acts has implications for adopting different systems of moral reasoning. For example, utilitarian moral philosophies consider the instrumental value of acts – i.e., the rightness or wrongness of an act is based on the consequences. In contrast, deontological moral philosophies claim the intrinsic nature of an act influences its moral value.
How we act with respect to personal genetic information can be based on utilitarian or deontological ways of thinking. Utilitarianism would say that if having genetic data is useful to people, we should let them have it; if it’s not useful, we shouldn’t (or at least are not morally obligated to give it to them). In contrast, a deontological approach would examine the intrinsic rights of people to have information about themselves, and/or possibly the duty of researchers (to return to my Nature commentary) towards their participants. This duty may or may not encompass returning personal data to participants — I think that’s up for debate. It might be a good idea generally, but is it morally required of researchers to offer participants their data?
Implications for policy
Philosophizing about moral arguments is well and good, but at the end of the day policies are the real carrots and sticks. When it comes to personal genetic data access, I don’t think intrinsic value arguments are going to do enough. Instrumental value arguments will help decision-making, but it’s not clear-cut given the state of the science and the heterogeneity of third-party tools. Empirical evidence is needed to evaluate the claims of instrumental value, and here I’m excited to have my dissertation research play a role.
There is a file type used to store large-scale genetic data called a “vcf” file, short for “variant call format.” To a PC, however, a “.vcf” file extension means something completely different: it’s the “vCard” format used to send Microsoft Outlook contact information.
Therefore, if you click on a genetic “.vcf” file with a PC, it will likely suggest opening it with programs such as Microsoft Outlook, Windows Contact, or the like. In addition to being kind of hilarious, and potentially frustrating to a layperson trying to examine their genetic data, this clash is a microcosm of a fascinating larger trend. Non-specialists are getting access to personal genetic data and wanting to do something with it. But who or what assists them in these endeavors? How should the scientific community respond to people banging on the door of their genetic expertise and skillsets? I can’t supply all the answers, but I can break down this amusing “vcf” conundrum in service of exploring these larger questions.
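For the curious, the two kinds of “.vcf” file are easy to tell apart once you look inside rather than at the extension. Here is a minimal sketch in Python, assuming an uncompressed text file: a genetic variant call format file opens with a “##fileformat=VCF…” header line, while a vCard opens with “BEGIN:VCARD”. The function name sniff_vcf_kind and the labels it returns are my own invention for illustration, not part of any existing tool.

```python
def sniff_vcf_kind(path):
    """Guess what a .vcf file contains by peeking at its first non-blank line.

    Hypothetical helper for illustration only. Assumes a plain-text,
    uncompressed file (a gzipped .vcf.gz would need to be unpacked first).
    """
    # "utf-8-sig" silently drops a byte-order mark if one is present;
    # errors="replace" keeps odd bytes from raising an exception.
    with open(path, "r", encoding="utf-8-sig", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped:
                continue  # skip leading blank lines
            if stripped.startswith("##fileformat=VCF"):
                return "genetic variants"
            if stripped.upper().startswith("BEGIN:VCARD"):
                return "contact card"
            return "unknown"  # first real line matched neither format
    return "unknown"  # empty file
```

Calling sniff_vcf_kind on a downloaded file returns one of the three labels, which is roughly the check you would hope a friendlier operating system might do before offering to open your genome in an address book.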
Personal genetic data access
It is now easier than ever to get hold of your genetic data. Millions¹ of people have availed themselves of direct-to-consumer genetic tests, most of which allow the customer to download his or her “raw” genetic data file. In addition to DTC testing, people might gain access to their genetic sequence by getting a clinical genetic test. HIPAA laws allow people to access the full lab reports from clinical testing, so for sequencing tests this would likely include “raw” sequence data (given current data standards, probably in a “vcf” file). A third way people might gain access to their genetic data is by joining a research study that makes such data available to participants. This has historically not been common practice for research studies, but early adopters such as the Personal Genome Project and now the nationwide Precision Medicine Initiative are allowing this.
You might argue that acquiring and wrangling with your genetic sequence is still a rather niche endeavor, and I think that’s probably true. (Though note this is an empirical question I’m trying to answer in my dissertation research — exactly who is doing this and why?) But even so, I expect that personal genetic data acquisition will become more mainstream in the future. You have only to look at the popularity of fitness trackers and other wearables to see our society’s obsession with amassing and tracing data about ourselves.
Ok, so simply having a .vcf file of your genetic data doesn’t make you a genetics expert or even mean you can do the first thing with the data. But there are lots of middlemen out there in the form of third-party interpretation tools that will help you “do something” with your data. (Note that not all of them work with .vcf files, in part because DTC companies don’t typically provide customers their data in .vcf format, but that’s a technicality.) This ecosystem of raw data access plus third-party interpretation leads to a situation where people are trying to gain access to scientific expertise in new ways. You could say it’s a sort of redistricting of who gets to look at genetic data and try to put it to some use. The playing field is far from even when you compare a genetics researcher with a layperson, but the general trend is there.
This frustration someone might have trying to open a “.vcf” file is not hypothetical. I have heard of cases where people downloaded a “.vcf” file of their genetic data, from one of the third-party interpretation tools mentioned earlier, and were really annoyed and even angered by their inability to open and understand the file. And their PCs were of no help – potentially even actively misleading them as to the appropriate way to open the file (I admit I don’t know what a Mac OS would try to make of this file).
Why were people so mad? One possibility: we expect our technology to be intuitive. Our Google searches autocomplete, our smartphones remind us to breathe, and we can shout at Alexa across the room to play our favorite song. Learning how to work with and understand our genetic data is far more nuanced. Even for experts there is a lot of uncertainty about what certain genetic variants mean.
Ironically, this information age that is precipitating access to all this personal data may at the same time be conditioning us to expect instantaneous and even anticipated interpretation and utility from that data. If that’s true, it’s definitely a recipe for frustration when it comes to non-expert personal genetic data analysis.
Meanwhile, think before you double click.
1 – The three major DTC players are AncestryDNA, 23andMe, and FamilyTree DNA. AncestryDNA has over 3 million customers genotyped: https://blogs.ancestry.com/ancestry/2017/01/10/ancestrydna-surpasses-3-million-custom. 23andMe has over 1 million customers genotyped: https://mediacenter.23andme.com/fact-sheet/. I have been unable to find a count of genotyped FamilyTree DNA customers, so let me know if you have one!
Lately I’ve been reading, discussing, and thinking about the concept of “participation.” It’s an idea that gets thrown around a lot but without too much examination or critique. Specifically, I am researching theories and frameworks of “participation” as it relates to my dissertation. The question I’m asking is: does giving people access to their own genetic data increase their participation or level of empowerment: in their health care, in their research participation, in their lives in general?
While my project is about consumer genomics, my literature searches on “participation” have rippled out into politics, economics, media studies, and social and information sciences. Understanding how and where ideas of “participation” are invoked, and with what consequences, draws on all these fields. The topic is particularly salient in this age of digital information, where the ubiquity of the Internet and social media offers us unprecedented platforms to create, consume, and interact.
Below I’ll throw out some of the ideas I’ve encountered in my literature search. This will be a bit of a “spaghetti on a wall” type of exercise, so feel free to “participate” as much or as little as you’d like.
Dimensions of participation
Information scientist Kelty and colleagues tease out seven dimensions of participation and examine how different projects or communities stack up on these different dimensions. The dimensions get at things such as whether participants have control over resources (tangible or informational) and to what extent they help to define the goals and tasks of the project. Perhaps my favorite dimension is the affective experience of participation, delightfully described as “collective effervescence.” Do you feel like you are participating?
The authors give Facebook as one example of a participatory project that succeeds in some dimensions while falling far short in others. The ability of Facebook users to participate in decision-making or goal-setting is basically nonexistent. On the other hand, the collective effervescence is staggering. We skip along, posting, liking, registering one of six emotions (like, love, wow, haha, sad, angry), and all the while Facebook accrues an enormous, profitable, and powerful database of its 2 billion users.
Participation vs. engagement vs. involvement
In another paper, Woolley and friends examine the idea of “participation” as invoked in biomedical research. It’s become very trendy to tap into ideas of “citizen science” and “participatory research” even in more centralized, national research strategies. But this article argues that we should distinguish between three things: participation, engagement, and involvement. People may “participate” in studies simply by signing a consent form and giving a biological sample. But are they really engaged? Probably not. There’s not typically an ongoing relationship between the participant and the researcher, and the participant is not really weighing in on any part of the research in a democratic sense (e.g., developing research questions, interpreting what the results mean, etc.).
I’m reminded of the different levels of participation in our political system. I might participate in our democracy simply by voting in major elections, but am I really engaged? Engagement seems to mean something more, maybe expending extra effort to stay on top of political news outside of major election cycles and regularly calling my representatives to voice my opinion on various proceedings.
Empowerment with a hint of coercion
Moving into consumer genomics, my dissertation area, direct-to-consumer (DTC) companies have arguably introduced a more participatory form of genetic research. People can opt into research studies on a study-by-study basis (versus an up-front, blanket consent to myriad possible uses of their data), and they receive access to information about their genome (not so in traditional research). This does seem more participatory. By the Kelty dimensions of participation, this comes in strong on resource control, since people get to access their own data and not just contribute it to a larger effort, and on affective capacities, since customers can join online discussion boards, some oriented towards specific genotypes, as well as connect with family members if they so choose. You can just hear the collective effervescence fizzing.
But there are tensions underlying this participatory model of DTC genetics, some of which have been articulated by anthropologist Sandra Soo-Jin Lee and media scholar Kate O’Riordan. Lee has described how consumer genomics is concerned with “biological potential” — the idea that personal genetic information has some future ability to help people. But there are questions of power: who is equipped and positioned to realize the biological potential of having their genetic information? My question then becomes to what extent are DTC companies able to realize the collective biological potential of their customer base (probably quite a lot) to the exclusion of individual customers being able to realize and harness that potential? It’s not that the companies are ill-meaning, but it’s just a fact that for the most part studying genomes in the aggregate (i.e., lots of people at once, as is done in the research context) is more valuable than studying individual genomes. This is partly a result of the current state of genomics knowledge, so this might shift in the future. But as it stands, for most people, studying their individual genome is unlikely to lead to great insights about their health and identity.
O’Riordan has written about how DTC genetics and the subsequent access of personal genetic information by lay persons has created a “new digital genomic public” capable of new “readings” of genomes, now circulated as digital texts. (Without delving into the full article, let me just assure you these ideas are as cool as they first sound.) Skipping ahead to one of her conclusions:
“The features of DTC genomics are contradictory but indicate the conditions of a contemporary collectivity that is at once embodied and informatic, empowered and coerced, personal and public.”
Empowered and coerced — exactly. People submit samples to get genetic testing so that they can receive information about themselves, perhaps leading to empowerment, but in so doing they are perhaps coerced to share this information with others. Note that DTC companies generally allow customers to opt in or out of research, so people aren’t exactly coerced into participating in research. But they do become part of the company’s database, which can be used more broadly than just for research.
The Janus-face of empowerment and coercion circles back to the titular question of this post: do we participate in Facebook? Kelty’s dimensions of participation show us how Facebook is participatory in some ways but not in others. My own personal experience of Facebook is definitely one of empowerment with a hint of coercion. Despite my misgivings about how much data Facebook has on all of us, I continue to use it for the convenience of staying in touch with friends and family. To get what I want, I stay hooked into the system.
This is not to demonize Facebook, but rather to articulate the tension between participation, or being able to do what we want, and coercion into giving up some privacy or control over our personal (perhaps our genetic) data. We don’t need to quit using all these services, but at a minimum we should keep a critical eye on when and under what conditions we are being invited to participate.
C. Kelty et al., “Seven dimensions of contemporary participation disentangled,” J. Assoc. Inf. Sci. Technol., 2015.
“Is Facebook A Structural Threat To Free Society?,” TruthHawk (blog), 13-Mar-2017. [Online]. Available: http://www.truthhawk.com/is-facebook-a-structural-threat-to-free-society/. [Accessed: 27-Mar-2017].
J. P. Woolley et al., “Citizen science or scientific citizenship? Disentangling the uses of public engagement rhetoric in national research initiatives,” BMC Med. Ethics, vol. 17, no. 1, p. 33, Jun. 2016.
K. O’Riordan, “Biodigital Publics: Personal Genomes as Digital Media Artefacts,” Sci. Cult. (Lond)., vol. 22, no. 4, pp. 516–539, 2013.
Imagine a stranger approaches you on the street and demands to either (1) take a sample of your spit so they can sequence your DNA or (2) plug a device into your smartphone that will transfer over to them your last month of sleep and activity data. Which are you more likely to hand over? Which feels less personal, less intimate?
Until recently, I would have assumed that most people would be more reluctant to hand over their DNA sequence. But now I’m thinking it might be the opposite.
I’ll share this but not that
Back in December I talked with someone who developed and runs a website where people can upload genetic and other data for public use. The idea is that making such data publicly available enables researchers and other citizen-scientist types to easily access it and pursue scientific questions such as which genetic variants are associated with which traits and diseases.
Importantly, unlike in my hypothetical scenario above, all data submissions on that website are entirely voluntary, and in fact the creators actively try to dissuade people from contributing so that no one shares data and regrets it later. Genetic data was originally the focus of the tool, but more recently the developers considered adding the capability for people to upload their FitBit and other self-tracking device data to the site.
Their users and other commentators were generally not enthused about the idea of sharing that type of data. Why is that? Here are some of the developer’s thoughts:
“I think because still like the genotype data is pretty muddly, in terms of what you can learn from it, whereas it’s probably much more interesting how much sleep you are getting every night, how active you are over the day, things like this…people were like yes — sharing your genome I can somehow see but then the sharing, like, your weight, how much you sleep and how much you move over the day, this people found less easy about, I would say.”
Wait, your step counter is more precious to you than your “muddly” DNA? This all runs counter to the common phenomenon of “genetic exceptionalism,” where genetic information is held up above other types of personal information as more potent, more powerful, and perhaps in need of more protection. While many have argued this is a misguided position to take, especially when it comes to policy making and personal privacy protections, it is still a pervasive idea. But clearly not so much with the users of the data sharing website discussed above. People who decide to submit their genetic data for all the world to see are reluctant to share so openly data about their sleep, exercise, and nutrition.
What makes data personal?
What’s going on here? What are the criteria by which some information intuitively feels more private to us than others? I think there are at least three contributing factors.
Is the data visible to us, or tangible in some way? Even though our genetic sequence is partly responsible for building and maintaining our very visible and tangible bodies, it is a rather abstract concept to most of us. We can’t see or feel our DNA, unless we’ve done that favorite science fair experiment where we mix spit with some dish soap and other household items and watch our snot-like strands of DNA precipitate out of solution.
Sleep and activity, on the other hand, are very tangible, very immediate. We can envision the physical processes of going to bed and going for a walk. There are also specific places we go each day to carry out these activities.
Does the data affect how we feel day to day? Luckily, for most people, our DNA sequence doesn’t seem to directly impact how we feel or how we move through the world on a daily basis. (I’m thinking in contrast to people with genetic disorders that may affect their movement, diet, cognition, etc.)
For sleep, on the other hand, we can physically feel the results of excesses and deficits. It also has a cadence, a longitudinal pattern, that I think also makes it feel a little more relevant, in contrast to our (mostly static) genome.
Is the data identifiable? Now this one’s interesting, because despite what anyone tells you about “de-identified” genetic data, genetics is inherently identifiable. Given two DNA samples from the same person, you can tell with a high degree of certainty it’s the same person (or their identical twin). Granted, I’ve thought more about the identifiability of genetic data than of sleep and activity profiles, but let’s consider those. With sleep patterns, you might not be able to say exactly who someone is. But maybe you could say what type of person they are based on those patterns. Things like a morning person vs. a night owl would be relatively easy to tease out, as would perhaps parents with young children or someone who works a night shift.
Another potential factor here could be the “judginess” of certain data. With all our FitBits and default smartphone activity tracking, there’s certainly some societal pressure to get in your 10,000 steps a day and your 8 hours a night (though some would rather brag about their ability to thrive on only 4 or 5). Would we be similarly judgy about each other’s DNA? Films like GATTACA suggest we would. But if I’ve brought up GATTACA, then it’s clearly time to wrap up this post.
I’m curious to hear your thoughts: what types of personal data feel more private to you? Which would you be more or less likely to share?