Definitional clarity and “data governance”


A precocious playmate

One of my close friends growing up used a lot of big words. My vocabulary was decent, but with her I was often outpaced. I distinctly remember one time in third or fourth grade when I was at her house playing and she dropped one of her vocab words. I wish I could remember the exact word (I want to say it was something ironic, like “precocious”), but this time I decided to swallow my pride: “What does that word mean?” Her, unfazed: “Oh, I don’t really know.” And then we both shrugged and moved on.

Though we didn’t dwell on the moment together, my little mind was blown: using a word doesn’t necessarily mean the speaker knows what it means. Or they might think they know what it means but are mistaken. Or two people might be using the same word differently in conversation—their respective uses are within the bounds of the word’s meaning but not quite aligned. Though that full range of implications was not apparent to my 8-year-old self, the encounter seeded in me a curiosity to test the waters of words’ meanings and people’s understandings of those meanings.

Dictionary entry

Definitional clarity

Fast forward 20 years and I’m in graduate school. I latch onto a phrase one of my professors used as a touchstone in her lectures and class discussion: “definitional clarity.” YES, what are these terms we are using and not just what is the dictionary definition but what are we really talking about?! Definitional clarity is important not just for those $10 vocabulary words but also those seemingly simple, straightforward words where the potential for unexamined, misaligned meanings is heightened. A classic example is the word “significant”—it’s easy to assume this word means notable or deserving attention. To a statistician, however, a “significant” finding is one unlikely to have occurred by chance based on defined statistical principles. Or, an even more timely example: the word “airborne” and COVID-19.

The search for definitional clarity is especially important in interdisciplinary spaces, i.e. where you bring together people with different backgrounds and training to work on a common issue or cause. In my field of studying ethical and social issues of genetic information and technology, interdisciplinarity is a core feature and therefore finding definitional clarity is critical.

Defining “governance”

My latest search for definitional clarity is for a word I’ve encountered for years and in fact have helped build an entire research project about: “governance.” Dictionary.com gives a tidy “government; exercise of authority; control; also, a method or system of government or management.” In a recent conversation with several data science and legal scholars, I casually asked around for working definitions of the word and was directed to a United Nations report, which itself has a collection of definitions. An excerpt:

Governance is the system of values, policies and institutions by which a society manages its economic, political and social affairs through interactions within and among the state, civil society and private sector. It is the way a society organizes itself to make and implement decisions—achieving mutual understanding, agreement and action.

The definition goes on to note that governance includes “mechanisms and processes” and “rules, institutions, and practices” for different groups at every level of society to articulate and negotiate their own interests, rights, and obligations. I find this framing helpful but incredibly broad, and perhaps rightfully so in a public policy context where we’re talking about nations, states, and municipalities.

Why “data governance”?

The context in which I’m studying governance is biomedical research and, in particular, the governance of data. My working definition of “data governance” is “decision-making about how biomedical data are stored, accessed, and used by researchers, as well as communicated to research participants.” While data governance is increasingly in vogue (see upward Google search trends below), a related predecessor context is the governance of biobanks, i.e. collections of samples (e.g., blood draws or other biospecimens) from human research participants prospectively gathered or “banked” for to-be-determined research project(s).

Does governance follow resources?

Why do we have these subtypes of governance, e.g., specific to biobanks or to biomedical data more broadly? My working hypothesis is that where resources accrue, governance follows. Resources create value which creates (potentially competing) interests that must be navigated and adjudicated in a well-reasoned and systematic way. Governance at the broad level of states and nations helps organize thinking around natural resources and perhaps even “people power,” or the collective intellect and capabilities of the people in the unit of governance.

Perhaps data governance was solidified as a salient concept with the emerging idea of data as a valuable resource, evidenced in the abundant metaphorical framing of “big data” as “the new oil” (see Sara Watson’s article on dominant metaphors for big data, which I wrote more about in a 2016 post on metaphors). A downside of the “data as natural resource” metaphor is that it can obscure the agency and interests of the data contributors: the people from whom the data were generated or “extracted” (speaking of the “data as natural resource” metaphor…).

Data governance can perhaps try to reclaim or resurface the agency and interests of those contributors, but it can be difficult when they are proxied by participant representatives or other “data stewards” who will have varying degrees of removal from the original donor. Other ingredients of data governance may not have the donors’ interest in mind at all but rather the interest of institutions providing or using the data, or perhaps adherence to other federal laws and regulations. Balancing those various interests, and having a well-reasoned system of decision-making, is therefore crucial to a good system of data governance.

An important step in achieving definitional clarity is bouncing your working definitions off other people, seeing what resonates and what needs refining. Fortunately, part of the data governance project I mentioned is to conduct qualitative interviews with researchers and technologists building new data storage and analysis platforms, so it will be interesting to see how these different stakeholders define the contours of data governance. As with my third-grade friend, the first important step is to ask.

 
