On the Ethics of Open Source Security Research
Security research is an important part of cybersecurity, but it is not easy to carry out, and it is harder still to carry out scientifically. It is not entirely surprising, then, that a research team may be tempted to try “novel” tactics.
However, this April, a security research team at the University of Minnesota got into hot water for methods that were pushing ethical boundaries.
In this article we’ll explain why we need open-source security research so much, why cybersecurity research requires a scientific basis, and how the team at the University of Minnesota got it wrong.
Introduction to Open-Source Security Research
Cybersecurity is essentially a race. In one camp are malevolent actors racing to find new, unknown flaws (vulnerabilities) in computer software – including in open-source software.
These new, undiscovered flaws – also known as zero-day vulnerabilities – are extremely valuable to attackers. Because users have not yet protected their systems against these flaws, zero-day vulnerabilities are easier and more lucrative to exploit.
In the other camp are cybersecurity researchers, the teams that try to find these flaws before the bad guys do. When security researchers find flaws in open-source software, remedial action can be taken. Researchers carefully test mitigations – such as patches – and then go public with the flaw, publishing the mitigation methods alongside it.
Just like in any race, the more effort one team makes, the more likely it is to win. That’s why the efforts behind open-source security research matter so much: the more research that is done, the fewer opportunities malevolent actors will get to exploit zero-day flaws.
The above description is somewhat simplified. For example, in terms of software code, researchers will look at how one flaw can lead to a chain of compromises. And, more broadly, cybersecurity researchers look not just at lines of software code, but also at things like networking and hardware flaws, procedures and policies, and more strategic aspects of cybersecurity.
There is, arguably, an urgency to open-source security research. It is only a matter of time before a flaw is discovered, and it is better if a flaw is discovered by a security researcher – rather than a hacker.
However, this urgency does not warrant taking shortcuts. In cybersecurity research, following a scientific and ethical research methodology matters just as much as it does for natural or social sciences.
Understanding Scientific Research
Scientific research methods may appear unnecessarily rigid and restrictive, but they are tried and tested, and they exist so that researchers can submit results that withstand scrutiny. Structured methods help researchers avoid common pitfalls and errors that can undermine the validity of their research.
For example, research methods are designed to prevent the manipulation of data – whether intentional or accidental – through robust research design. The risk of research wrongs such as plagiarism is also reduced when studies are designed according to set methodologies.
A complete overview of the scientific method is beyond the scope of this article, but to summarize, we can split structured scientific research into two parts: method and governing principles.
In terms of method, scientific studies are essentially built around a hypothesis. First, the researcher formulates a hypothesis based on previous observations, experiments, or measurements. A prediction is then made and tested through experimentation. When the results come in, the hypothesis is analyzed and then confirmed or modified in light of those results.
Next, there is a set of principles that govern this process. Research must always be objective and free from bias. Data must be representative, for example, and not a specially selected subset trimmed from the source data in order to manipulate the results.
Why the Scientific Method matters
Other key tenets of the scientific method include the ability to reproduce results, and the ability to verify results. Together, these principles work to guide researchers so that the efforts going into scientific research produce results that are robust, reliable, and testable.
We could point to two key areas. First, when we think about scientific progress, whether in cybersecurity or another area, we often think of building blocks. Researchers build on the results of their predecessors, so these building blocks must be sufficiently solid.
By following a robust scientific method, researchers create “structures” that future researchers can build on. In contrast, if the science is flawed, the structure can collapse. That’s why the scientific method is so painstakingly slow and restrained. It only works with the systematic and empirical collection of evidence – so that the science can withstand scrutiny.
The second, and perhaps most critical, element of scientific research methods is ethics. It is not hard to see how “shortcuts” that cause harm to the subjects can lead to faster results in a scientific study. For that reason, studies are subject to ethical scrutiny, and it is why major educational institutions have ethics boards that review research projects.
It’s easy to imagine how, in the absence of any ethical limits, medical and social studies could harm the subjects being studied.
Even in cybersecurity research, it can be tempting to make much faster progress by running studies that cause harm to the subjects being studied. This is what happened at the University of Minnesota this year.
So, what happened in Minnesota?
In April, three researchers at the University of Minnesota (UMN) were performing a study on something called software supply chain attacks. In other words, the research team was studying how hackers could launch an attack by interfering with how software is developed. The team at UMN wanted to test how robust this development process is.
As subjects, the UMN team chose the group that governs Linux kernel development. The research depended on a methodology that is part and parcel of open-source software development – the ability of diverse organizations and individuals to contribute code to the Linux kernel, subject to a review process.
Like contributors at many other major educational institutions, the UMN team could contribute directly to the Linux kernel. UMN contributions would be subject to review, and it is this review process that the three researchers were interested in.
In the study, the research team attempted to insert a use-after-free (UAF) vulnerability, a bug in which code keeps using memory after it has been freed, into the Linux kernel by submitting deliberately flawed code through the normal contribution process. The team wanted to assess whether this flawed code would be caught by the review process.
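For readers unfamiliar with the term, a use-after-free bug can be sketched in a few lines of C. The example below is purely illustrative and is not the code the UMN team submitted: the session type and the session_new, session_free, and session_user_len helpers are hypothetical names invented here. The buggy pattern appears only in a comment, since actually running it would be undefined behavior, next to one common defensive habit: clearing the pointer once the memory is freed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, invented for this illustration. */
struct session {
    char user[32];
};

/* Allocate a session and copy in the user name (truncated if too long). */
static struct session *session_new(const char *user)
{
    struct session *s = malloc(sizeof(*s));
    if (s == NULL)
        return NULL;
    strncpy(s->user, user, sizeof(s->user) - 1);
    s->user[sizeof(s->user) - 1] = '\0';
    return s;
}

/*
 * Free the session and clear the caller's pointer, so that any later
 * access sees NULL instead of a dangling pointer.
 */
static void session_free(struct session **sp)
{
    if (sp == NULL || *sp == NULL)
        return;
    free(*sp);
    *sp = NULL;
}

/* Return the length of the user name, or -1 if the session is gone. */
static int session_user_len(const struct session *s)
{
    if (s == NULL)
        return -1;
    return (int)strlen(s->user);
}

/*
 * The buggy pattern, shown only as a comment:
 *
 *     free(s);
 *     n = strlen(s->user);   // use after free: s still points at freed memory
 *
 * The read may appear to work, crash, or return data an attacker has
 * placed in the reused memory, which is what makes such bugs exploitable.
 */
```

In real kernel code the freeing and the later use are usually far apart, often on different error paths, which is part of why such flaws can be hard to spot in review.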
Submissions were made multiple times and, as it turns out, the flaws were caught by the team that monitors submissions to the Linux kernel. However, the Linux kernel oversight team was not pleased to find out that they were the subjects of a research study they didn’t know about. The researchers were found to be submitting flawed code and, as a result, UMN was banned from submitting further updates to the Linux kernel.
From a research viewpoint, one can see why the UMN researchers asked the question that they did. If flawed code is submitted to the Linux kernel, will it get flagged? And if so, how long does it take for this code to get flagged? However, from an ethical viewpoint, the team’s methods were not acceptable.
Why the University of Minnesota approach is unethical
The UMN team’s approach contained several ethical flaws. First, the team never advised the open-source community that it was planning on conducting this research. As a result, the subjects of the study were experimented on without their consent – which goes against the ethical approach to scientific studies.
Next, there was a risk that the flawed code submitted by the UMN team could have stayed in the kernel and eventually been included in various Linux distributions, which could have caused damage to any number of parties further down the line.
Again, that was unethical simply because a research study should never lead to harm. The clean-up process, once the UMN team had been found out, was also quite challenging, as the researchers did not document their efforts clearly. In essence, all of UMN’s contributions to the kernel had to be pulled to eliminate the chance that “experimental” flawed code would make it into mainstream Linux distributions.
The researchers behind this study did not get the permission of their subjects – a basic violation of research ethics. The researchers also violated one of the key tenets of open-source development – trust – by introducing bugs into the kernel on purpose. This led to an angry reaction from the community.
Whatever one’s views on the merits of the study, there is little question that ethical boundaries were breached, and it is, therefore, no wonder that, a few weeks ago, the Linux kernel oversight team blocked further contributions from students and staff at the University of Minnesota.
Keeping cybersecurity research ethical
At the start of this article, we outlined why cybersecurity research is so critical – technology users are essentially in a war against malevolent actors that range from the common criminal to entire countries. Cyber attacks cost the economy billions and can destroy entire organizations.
Performing cybersecurity research in an ethical manner is just as important, so that results can be trusted, verified, and learned from – and so that harm is prevented. However, there is another issue at stake.
If there is a climate of fear and distrust around cybersecurity research, it will likely affect future research efforts: both the motivation to perform this research and trust in the research outcomes.
That is why it is so critical that researchers in the cybersecurity arena follow the scientific method and all of its associated characteristics – including an ethical approach to cybersecurity research.
At UMN, as elsewhere, it is the role of the ethics board to scrutinize a research study and flag ethical concerns. We can only speculate why the UMN board did not put a stop to this study. Perhaps the board did not fully appreciate the nuances of open-source software development or the ethical issues raised by the study.
As for the UMN team, the researchers should at the very least have asked the permission of the kernel maintainers to run the experiment. Not doing so abused the goodwill of the open-source community, given that kernel submissions are accepted in good faith.
Looking at scientific research more broadly, incidents such as the one we just described are uncommon, but not unheard of. They are certainly becoming less common over time as research oversight becomes more thorough. For example, we’re seeing few of the crazy medical experiments we saw a century or so ago.
When unethical research behavior is detected, a firm reaction and publicity can help deter similar behavior. The UMN team has been reprimanded, and the overall reputation of the university’s computer science department has taken somewhat of a beating.
It should serve as a lesson to others: even in cybersecurity research the scientific method and ethics matter. We won’t meaningfully progress against the cybercrime threat unless there is a collaborative approach that respects everyone’s interests.
Finally, the trusting nature of the open-source community must be preserved. It is not acceptable for cybersecurity researchers to abuse the trust that is so intrinsic to progress in open-source software. Those managing open-source software development should remain vigilant – and actively discourage unnecessary experimentation.
Bonus content: Open-Source Ethics, and how the University of Minnesota Failed Linux
Check out the discussion between Jay from LearnLinuxTV and our TuxCare Evangelist Joao Correia on how the University of Minnesota got open source wrong, and on the strengths and weaknesses of open source in general.