The Human Factor in Data Management

“Over the last two years alone 90 percent of the data in the world was generated.” Considering the exponential growth of data, driven in part by the rise of smart devices, it is important for organisations to consider how their policies and procedures affect their customers and society as a whole.

There are significant benefits associated with this increase in data. However, given that data gives companies power over people, it is important to analyse the trade-off between the benefits and the harms in order to understand what can be done to mitigate the risks. It is also important to consider how data breaches, leaks and discrimination can stem from Artificial Intelligence (AI) and from the increasing use of algorithms in place of human input.

The benefits and harms of data

We can look to ethical theories for insights into how we might develop codes of ethics for data management. Examples include Virtue Ethics, Social Contract Theory, Kantian Ethics and Utilitarianism.

Four ethical theories

Using ethical theories such as these helps build a framework by which we can analyse whether the decisions we make when managing data are in fact “ethical” (using a basic definition of ethical as “pertaining to right and wrong in conduct”). These theories also give us the tools and knowledge to rationalise decisions that depart from the original intentions of a code of conduct.

Even when presented with complex scenarios, humans are capable of making informed decisions based on codes of ethics stemming from ethical theories such as those above. In some cases, the risk of causing harm can be attributed to the trend towards AI and the reliance on algorithms. In their paper “What is Data Ethics?”, Floridi and Taddeo discuss how the “gradual reduction of human involvement … pose[s] pressing issues of fairness, responsibility and respect of human rights.”

Ultimately, we need humans (who naturally possess moral agency) to be responsible for how data is collected, processed and used, rather than becoming increasingly reliant on AI and algorithms in their current forms. “[We expect] people to act as moral agents, we hold people accountable for the harm they cause to others.” In a world where we have court systems to rule on right and wrong, it is risky to attribute, or even to attempt to attribute, moral agency to computers. They certainly lack the moral acumen to base decisions on ethical theories such as those outlined above; that is, they are unable to understand the moral consequences of their actions. In the future, computers may be able to act in ways that replicate human interaction, but what prevents this from becoming a full reality is their inability to form intent.

When people manage data, there is a level of nuance to their thought process. Good policy can enable (and even assist) people to make ethical decisions with regard for other human beings. As Lin Grensing-Pophal writes, “…intuition and experience can sometimes trump technology.” What is obvious to a person might be missed by a computer or by analytical tools.

Take the following graphic. It shows examples of questions that should be asked when dealing with data at each of three stages: collecting, processing and using it. These questions are based on the ethical theories above and will assist both in minimising the risks associated with managing the data and in building a framework around its use.

Ethical considerations for managing data

Using this infographic (grounded in good professional data ethics) can help the user identify when systems and safeguards have been breached.
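As a minimal sketch of how a checklist like this might be put into practice, the Python snippet below encodes hypothetical stand-ins for the graphic’s questions and reports which ones a human reviewer has not yet signed off at a given stage. The question text, stage names and function names here are illustrative assumptions, not the actual contents of the infographic.

```python
# Hypothetical encoding of a three-stage data ethics checklist.
# The questions are illustrative stand-ins, not the graphic's exact text.
CHECKLIST = {
    "collecting": [
        "Has the data subject given informed consent?",
        "Is every field being collected actually needed?",
    ],
    "processing": [
        "Could this processing introduce or amplify bias?",
        "Is the data anonymised where identity is not required?",
    ],
    "using": [
        "Does this use match the purpose the subject consented to?",
        "Who is accountable if this use causes harm?",
    ],
}

def unresolved(stage: str, signed_off: set[str]) -> list[str]:
    """Return the questions at this stage no human has yet signed off."""
    return [q for q in CHECKLIST[stage] if q not in signed_off]

# Example: review a processing step before allowing it to run.
for question in unresolved("processing", {
    "Is the data anonymised where identity is not required?",
}):
    print("Unresolved:", question)
```

The point of such a structure is not automation for its own sake: each sign-off is a recorded human judgement, which keeps a person (and their intent) attached to every stage of the data’s life.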

The development of AI is unceasing, but such systems cannot yet make rational decisions without human input. They could perhaps make informed decisions through more intricate algorithms, but we are not yet at the point where we can attribute moral agency to a computer in the way we can to a corporation or a person. The main reason, once again, is that we cannot assign intent to a computer.

A simple Google search will find a variety of examples where computers have “got it wrong”. How do such mistakes happen? Are the algorithms too simplistic? Or is it a case of people blindly trusting computers and their decisions?

Two particular examples of algorithms at fault include:
- Car insurance companies in the UK quoting significantly higher prices for people named Mohammed versus people named John
- The potential for the “best person for the job” to be missed because predictive analysis matched keywords in resumes without any human intervention from HR (a toy sketch of this failure mode follows this list)
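To make the second example concrete, here is a deliberately naive Python sketch (the keywords and resume text are invented) of how a purely keyword-based screen ranks candidates. A candidate with arguably equivalent experience scores zero simply because their resume uses different vocabulary, which is exactly the gap a human reviewer would catch.

```python
# A toy keyword-matching resume screen. All data here is invented.
REQUIRED_KEYWORDS = {"python", "etl", "data warehouse"}

resumes = {
    "candidate_a": "Built ETL jobs in Python feeding a data warehouse.",
    "candidate_b": "Python scripting, ETL, data warehouse maintenance.",
    # Equivalent experience described in different words:
    "candidate_c": "Designed ingestion pipelines in pandas and dbt, "
                   "modelling analytics tables in Snowflake.",
}

def keyword_score(text: str) -> int:
    """Count how many required keywords appear verbatim in the text."""
    lowered = text.lower()
    return sum(1 for keyword in REQUIRED_KEYWORDS if keyword in lowered)

# Rank candidates purely by keyword count, highest first.
for name, text in sorted(resumes.items(),
                         key=lambda item: keyword_score(item[1]),
                         reverse=True):
    print(name, keyword_score(text))
# candidate_c scores 0 and is ranked last despite describing the same
# kind of work; the "best person for the job" never reaches a human.
```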

Both of these examples of AI decision-making illustrate “how [individuals’] data may create unfair bias.” They also demonstrate the benefits of relying on real people, with expertise and moral values grounded in ethical codes and frameworks, rather than on AI and algorithms alone. The human factor is significant in the successful management of data and should not be undervalued. It is imperative to have the input of real people in the management of data, specifically to ensure that we are acting in a morally appropriate way, and with intent.

Data Ethics allows us to make value judgements, and these are more important than ever given the sheer volume of data being created each day. Data collection has huge potential benefits; however, it is imperative that humans remain involved in the management of data.