How to succeed with transparency

The use of artificial intelligence (AI) raises a wide range of issues that fall under the broad umbrella of transparency. As soon as AI is used in connection with personal data, transparency is required by the data protection regulations. Yet under this umbrella we also find ethical questions and technological issues relating to communication and design. We have therefore written an experience-based report on how to communicate when using AI.

Introduction

While the regulations set out clear requirements with respect to transparency, they do not draw razor-sharp boundary lines or prescribe exactly how to be transparent. Specific assessments must be made in each individual case. This is why we devote so much space in this report to examples, because the assessments and associated initiatives in these real-life examples can provide valuable lessons for others facing similar issues. This is not a complete guide to all aspects of transparency when using AI. However, we highlight some key sandbox discussions that we believe can be of value to others.

Trust is a recurring topic in the examples. If people are to be willing to use solutions and share personal information, they must feel confident that the solution works as intended and adequately protects their privacy.

The Norwegian Data Protection Authority's 2019/2020 survey showed clear indications of a “chilling effect” on public engagement. In other words, if people are unsure about how their personal data are going to be used, they change their behaviour. Over half of those questioned had avoided using a service due to uncertainty about how personal data are collected and used. As many as two out of three respondents felt they had little control and were powerless when it came to the flow of personal data online. It is reasonable to assume that this chilling effect applies not only to the internet, but that the scepticism spills over into other forms of personal data sharing, for example when people are confronted with AI-driven tools.

This experience-based report starts with a review of the most important statutory provisions relating to transparency when using AI. We then present three projects from the Norwegian Data Protection Authority's regulatory sandbox, where transparency has been an important topic. Finally, we have drawn up a checklist for transparency in relation to AI.

Whether you are a programmer sitting at the heart of the development process or an entrepreneur with a burgeoning idea, whether you work for a major enterprise or a small startup: we think this report could be useful for many of those who are developing, using or considering the procurement of AI-based solutions. We hope you will find it illuminating and inspiring, and that it can help you succeed with your AI endeavours.

Statutory transparency requirements

Transparency is a fundamental principle of the EU's General Data Protection Regulation (GDPR), which requires that the person the data relate to (the data subject) be notified of which personal data have been recorded and how the data are being processed. Transparency relating to the processing of personal data is a precondition for the ability of individuals to uphold their rights. Transparency can also help to uncover errors and unfair discrimination, and to engender trust.

Irrespective of whether or not personal data are processed using AI, the GDPR requires transparency and disclosure. In brief, the GDPR requires that:

  • Personal data must be processed in an open and transparent manner (see Article 5(1)(a)). This means, for example, that the data controller must ensure the data subject has sufficient information to uphold their own rights.
  • The data subject must be informed about how their personal data are being used, whether the data have been obtained from the data subject themselves or from one or more third parties (see Article 13 and Article 14).
  • If the data subject has supplied the data, such information must be provided in writing before or at the same time as the data are collected (see Article 13). If the data have been obtained from other sources, the data controller must notify the data subject within a reasonable period of time that the data in question have been collected (see Article 14).
  • This information must be written in an understandable fashion, using clear and simple language (see Article 12). It must also be easily accessible to the data subject.
  • The data subject is entitled to be told that their personal data are being processed and to inspect the data concerned (see Article 15).

A comprehensive guide may be found in WP29's/EDPB's guidelines on Transparency.

Transparency requirements relating to the development and use of artificial intelligence (AI)

The use of artificial intelligence (AI) is normally divided into three main phases:

  1. Development of the algorithm
  2. Application of the algorithm
  3. Continuous machine learning and improvement of the algorithm

The GDPR's transparency requirements are general and are essentially the same for all the phases. However, there are some requirements that are relevant only in certain phases. For example, the requirement to provide information on the algorithm’s underlying logic is, as a rule, only relevant in the application phase.

In the development phase, data is processed for the purpose of developing one or more algorithms or AI models. The personal data used are generally historic data that have been collected for a purpose other than the development of an AI model. In the application phase, the AI models are used to carry out a specific task in practice. The purpose of the data processing is normally linked to the task to be performed. In the final phase, the continuous machine learning phase, the AI model is further developed and improved. In this phase, the algorithm is continuously refined on the basis of new data collected during the application phase.

In the following, we presume that in all three phases the data are processed lawfully in accordance with Article 6 of the GDPR. Read more about the legal basis for data processing. Below, we examine in more detail the duty to provide information in the various phases.

Transparency requirements in the development phase

Articles 13 and 14 of the GDPR require undertakings to give notice when personal data are used in connection with the development of algorithms. Article 13 applies to data obtained directly from the data subject, for example by means of questionnaires or electronic tracking. Article 14 regulates situations where data are obtained from other sources or have already been collected, e.g. from one or more third parties or publicly available data.

When the data have been obtained directly from the data subject and are to be processed in connection with the development of AI systems, Article 13 requires the data controller to disclose the following:

  • The types of personal data to be processed
  • The purpose for which the algorithm is being developed
  • What will happen to the data once the development phase has finished
  • Where the data have been obtained from
  • The extent to which the AI model processes personal data and whether anonymisation measures have been implemented

In principle, the data subject will have specific rights in connection with all processing of their personal data. The most relevant rights are the right to request access to their data, to have them corrected or deleted and, in some cases, to object to the processing. Large quantities of personal data are often used in the development and training of AI. It is therefore important that both the solution's development and training are assessed specifically in relation to the regulatory framework.

On the whole, Article 14 sets out largely the same duties for data that have already been collected for a purpose other than the development of AI systems, such as information the undertaking has recorded about its customers or users.

However, Article 14(5) contains an exemption which may be relevant for the development of AI systems. Due to the vast quantities of data that are often required for the development of AI systems, notifying all the data subjects concerned can be a resource-intensive process. For example, in research projects involving the use of register data from hundreds of thousands of people, it may be difficult to notify each person individually. It follows from Article 14(5) that an exemption may be made if the data subject already has the information; if the provision of this information proves impossible or would involve a disproportionate effort; if the collection or disclosure is expressly permitted under EU law or the member states’ national legislation; or if the personal data must remain confidential under a duty of professional secrecy.

What constitutes a disproportionate effort will always rest on discretionary judgement and an overarching assessment of the specific circumstances. The Norwegian Data Protection Authority recommends that a minimum of information be provided in all cases, so the individual data subject knows in advance whether their personal data are being used for the development of AI. This may be ensured by means of the publication of general information concerning the data processing, e.g. on the undertaking's website. The information must be accessible to data subjects before further data processing commences.

Transparency requirements in the application phase

In the application phase, disclosure requirements will depend on whether the AI model is used for decision-support or to produce automated decisions.

For automated decisions which have a legal effect or significantly affect a person, specific disclosure requirements apply. If processing can be categorised as automated decision-making pursuant to Article 22, there are additional requirements for transparency. (See also Article 13(2)(f) and Article 14(2)(g).) The data subject is entitled to:

  • Information that they are the subject of an automated decision.
  • Information about their right not to be the subject of an automated decision pursuant to Article 22.
  • Meaningful information about the AI system's underlying logic.
  • The significance and expected consequences of being subject to an automated decision.

Although the provision of such supplementary information to the data subject is not expressly required when the AI system is being used as a decision-support tool, the Norwegian Data Protection Authority recommends that it be provided in such cases. This is particularly true where “meaningful information about the AI system’s underlying logic” can help the data subject to better uphold their rights.

A meaningful explanation will depend not only on technical and legal requirements, but also on linguistic and design-related considerations. An assessment must also be made of the target group for the explanation concerned. This could result in different wording for professional users (such as the NAV advisers and teachers referred to in the following examples) and more sporadic users (consumers, children, elderly people).

These EU guidelines provide advice on what a meaningful explanation of the logic could contain.

The data controller must assess how detailed to make the explanation of how the algorithm works, while ensuring that the information is clear and understandable for the data subjects. This may be achieved by including information about:

  • The categories of data that have been or will be used in the profiling or decision-making process.
  • Why these categories are considered relevant.
  • How a profile used in the automated decision-making process is constructed, including any statistics used in the analysis.
  • Why this profile is relevant for the automated decision-making process.
  • How it is used to make a decision that concerns the data subject.

It may also be useful to consider visualisation and interactive techniques to assist with algorithmic transparency.
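
As a simple illustration, the elements listed above could be gathered in a structured format that the undertaking then renders as plain-language text in its privacy information or user interface. The Python sketch below is purely hypothetical; the field names and wording are our own assumptions, not a format prescribed by the GDPR or the guidelines.

```python
# Hypothetical sketch: collecting the elements of a "meaningful explanation of
# the logic" in one structure that can be rendered as readable text.
# All names and texts are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class LogicExplanation:
    data_categories: list[str]               # categories of data used in the profiling/decision
    relevance_of_categories: dict[str, str]  # why each category is considered relevant
    profile_construction: str                # how the profile is built, incl. statistics used
    profile_relevance: str                   # why the profile is relevant for the decision
    decision_use: str                        # how the profile is used to reach a decision


def render(explanation: LogicExplanation) -> str:
    """Turn the structured explanation into short, readable text."""
    lines = ["We use the following categories of data: "
             + ", ".join(explanation.data_categories) + "."]
    for category, reason in explanation.relevance_of_categories.items():
        lines.append(f"Why {category} is relevant: {reason}")
    lines += [explanation.profile_construction,
              explanation.profile_relevance,
              explanation.decision_use]
    return "\n".join(lines)
```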

Public sector undertakings may be subject to other requirements relating to the provision of information concerning the reasons for automated decisions, e.g. the Norwegian Public Information Act or sector-related legislation.

In those cases where the data subject is entitled to object under Article 21, they must be made explicitly aware of their right to object pursuant to Article 21(4). The data controller is responsible for ensuring that this information is provided clearly and separately from other information, and that it is easily accessible – both physically and in the way it is framed. While it is natural to include such information in a privacy policy, this alone would probably not be sufficient to fulfil this requirement. The data subject should, in addition, be notified of their right to object in the interface where the processing of their data is initiated. In the case of an application portal, for example, the information should be clearly visible on the website or in the app where the personal data are entered.

In connection with the use of personal data collected in the application phase for continuous machine learning, the requirement to provide information will largely coincide with the requirements in the development phase.

Transparency when using AI in schools

The Aktivitetsdata for vurdering og tilpassing (AVT) project is a research and development project on the use of digital learning analytics in schools. The project explores how learning analytics and artificial intelligence (AI) can be used to analyse pupil activity data from various digital learning tools.

Activity data is the term used for the data that are generated when a pupil completes activities in a digital learning tool. Such data could comprise information about which activity the pupil completed, how long they spent working on it, and whether or not they answered correctly.

The purpose of the project is to develop a solution that can help teachers to adapt their teaching to the individual pupil. For example, when maths teacher Magnus starts preparing his class for the exam, the system will come up with a revision proposal based on the work the pupils have done recently. Maybe the AI will suggest more algebra for Alfred and more trigonometry for Tina because this is where it has identified the largest gaps in their knowledge?

In addition to individually adapted teaching, the project’s purpose is to give pupils greater insight into their own learning and support teachers in their pupil assessments. The goal of adapted teaching is to ensure that the pupils achieve the best possible learning outcome from their education. On a more general level, the AVT project aims to drive the development of national guidelines, norms and infrastructure for the use of AI in the teaching process.

Specifically, the AVT project uses an open learner model as well as analytics and recommendation algorithms to analyse learning progress and make recommendations for pupils. Analysis results are presented in an online portal (dashboard) customised for each user group – such as teachers, pupils and parents. Users log in to the portal via Feide.
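
To make this more concrete, the sketch below shows, in a minimal and purely hypothetical form, the kind of logic an open learner model with a recommendation step might use: an estimated mastery level is maintained per topic, and revision is suggested for the topics with the largest gaps. The names, the update rule and the numbers are our own assumptions; they do not describe the AVT project's actual algorithms.

```python
# Hypothetical sketch of gap-based recommendation from pupil activity data.
from dataclasses import dataclass


@dataclass
class ActivityEvent:
    pupil_id: str
    topic: str         # e.g. "algebra", "trigonometry"
    correct: bool      # whether the answer was correct
    duration_s: float  # time spent on the task


def update_mastery(mastery: dict[str, float], event: ActivityEvent,
                   learning_rate: float = 0.1) -> None:
    """Nudge the topic's mastery estimate towards 1.0 on a correct answer
    and towards 0.0 on an incorrect one (a simple moving average)."""
    current = mastery.get(event.topic, 0.5)
    target = 1.0 if event.correct else 0.0
    mastery[event.topic] = current + learning_rate * (target - current)


def recommend_revision(mastery: dict[str, float], n: int = 2) -> list[str]:
    """Recommend the n topics with the lowest estimated mastery."""
    return sorted(mastery, key=mastery.get)[:n]


# Example: after a few activity events, Alfred's largest gap is algebra.
alfred: dict[str, float] = {}
events = [
    ActivityEvent("alfred", "algebra", correct=False, duration_s=240),
    ActivityEvent("alfred", "trigonometry", correct=True, duration_s=90),
    ActivityEvent("alfred", "algebra", correct=False, duration_s=300),
]
for e in events:
    update_mastery(alfred, e)

print(recommend_revision(alfred, n=1))  # -> ['algebra']
```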

The project owner for the AVT2 project is the Norwegian Association of Local and Regional Authorities (KS). The project is led by the University of Bergen (UiB) and its Centre for the Science of Learning & Technology (SLATE). The City of Oslo’s Education Agency has been the project's main partner and driving force since it commenced in 2017. Recently, the Municipality of Bærum and the regional inter-municipal partnership Inn-Trøndelag have also joined the project in smaller roles.

The sandbox discussed three aspects of transparency:

  • User involvement to understand the risk and the types of information the user needs
  • How to provide information tailored to the users
  • Whether it is necessary to explain the algorithm's underlying logic

User involvement to understand risk and information needs

The AVT project invited pupils, parents/guardians, teachers and municipal data protection officers to participate in a project workshop to discuss privacy risks relating to the use of learning analytics. Understanding the risks to users posed by the system is important if relevant and adequate information is to be provided. Transparency about the use of personal data is not simply a regulatory requirement to enable the individual to have as much control as possible over their own data. Transparency about the use of data can also be important to reveal errors and distortions in the system.

The workshop participants were given a presentation on the learning analytics system, which was followed by discussions in smaller groups: one group comprising children and adults, and one group comprising only adults. The groups were tasked with identifying risks to the pupils’ privacy resulting from use of learning analytics. Below, we have summarised the discussions with respect to three types of risks.

Risk of altered behaviour/chilling effect

When pupils work with digital learning tools, potentially detailed data may be captured and stored. For example, how long a pupil spends on a task, the time of day they do their homework, improvement in performance over time, etc. Keeping track of what data have been registered about them and how these data are used can be challenging for pupils.

The pupils who participated in the workshop were especially worried about the system monitoring how long it took them to complete a task. They pointed out that if the time they spent working on a problem was recorded, they could feel pressured into solving the problems as quickly as possible, at the expense of quality and learning outcome. A chilling effect may arise if pupils change their behaviour when they are working with digital learning tools because they feel the learning analytics system is tracking them. In other words, they change their behaviour because they do not know how their data may be used.

Another example of a chilling effect mentioned in the discussions was that pupils may not feel as free to experiment in their problem-solving, because everything they do in the digital learning tools is recorded and may potentially affect the profile built by the learning analytics system.

If the introduction of an AI-based learning analytics system in education leads to a chilling effect, the AI tool may be counterproductive. Instead of the learning analytics system helping to provide each individual pupil with an education adapted to their needs, the individual pupil adapts their scholastic behaviour to the system.

Adequate information about the type of information collected and how it is used (including which information is not collected and used) is important in order to give the user a sense of assurance and control. It can also help to counteract unintended consequences, such as pupils potentially changing their behaviour unnecessarily. 

Risk of incorrect personal data in the system

A fundamental principle in the data protection regulations is that any personal data processed must be correct. Incorrect or inaccurate data in a learning analytics tool could have a direct impact on the individual pupil’s profile. This could, in turn, affect the teacher’s assessment of the pupil’s competence and the learning resources recommended for the pupil.

The learning analytics system collects data on the pupils’ activities from the digital learning tools used by the school. One potential source of incorrect data, which was discussed by the adult participants in the workshop, is when a pupil solves problems on someone else’s behalf. This has probably always been a risk in education, and there is no reason to believe that a transition to digital activities has changed anything in this regard.

However, the impact on the individual pupil may be far greater now, if the data from this problem-solving is included in an AI-based profile of the pupil. For example, the system may be tricked into believing that the pupil is performing at a higher level than they actually are, thus recommending problems the pupil does not yet have the skills to solve. This could have a demotivating effect on the pupil and reinforce their experience of being unable to master a subject or topic.

A similar source of incorrect data is when a pupil deliberately gives the wrong answer in order to manipulate the system into giving them easier or fewer tasks. This, too, is a familiar strategy, used by children since long before the digitalisation of education. What both of these examples have in common is that the problems must be addressed both technologically and by raising awareness in general.

Risk of the technology causing the pupils unwanted stress

Another issue that came up in the workshop was that, for the pupils, use of the learning analytics system risks blurring the line between an ordinary learning situation and a test. Teachers already use information from the pupils’ problem-solving and participation in class as a basis for assessing what pupils have learned. With a learning analytics system, however, this assessment will be systematised and visualised differently than it is today. Pupils expressed concerns that there would be an expectation to show their “score” in the system to peers and parents, in the same way as pupils currently feel pressure to share test results.

Measures to reduce this risk can be designed into the system, for example by presenting and visualising a “score” or results in a balanced manner. Adequate information about the type of information used for assessment (and what is not used) could also help reduce the uncertainty and stress pupils experience when being assessed in the learning situation.

How to provide information tailored to the users

It can sometimes be challenging to provide a clear and concise explanation of how an AI-based system processes personal data. For the AVT project, this situation is further complicated by the age range of its users. This system may potentially be used by children as young as six at one end of the range and by graduating pupils in upper secondary school at the other.

One central discussion in the sandbox was how the AVT project can provide information that is simple enough for the youngest pupils, while also meeting the information needs of older pupils and parents. These sandbox discussions can be summarised as follows:

  • Use language that takes into account the youngest pupils – adults also appreciate information that is simple and easy to understand.
  • Include all of the information required by law, but not necessarily in the same place at the same time. Adults and children alike can lose heart if the document or online article is too long. One guiding principle may be to focus not only on what the pupils/parents need to know, but also when they need this information.
  • It could be beneficial to provide information in layers, where the most basic information is presented first, while at the same time giving the reader an opportunity to read more detailed information on the various topics (a sketch of such a layered structure follows this list). Care must be taken to ensure that important information is not “hidden away” if this approach is used.
  • Consider whether it would be appropriate to provide (or repeat) information when the pupils are in a setting where the information in question is relevant, e.g. by means of pop-up windows.
  • Use different approaches – what works for one group may not necessarily work for another. The AVT project included text, video and images in its information materials, and feedback from data subjects indicates that different user groups respond differently to different formats.
  • Be patient and do not underestimate the complexity of the topic or how difficult it can be to understand how the learning analytics system works, as well as the purpose and consequences of implementing this type of system. This applies to both children and adults.
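
The layered approach mentioned above could, for example, be expressed as a simple configuration that the learning platform renders: a short basic layer shown first, an optional detail layer, a link to the full privacy policy, and triggers for contextual reminders. The sketch below is hypothetical; the texts, field names, URL and triggers are our own assumptions, not wording or design from the AVT project.

```python
# Hypothetical layered privacy information for a learning analytics tool.
LAYERED_INFO = {
    "basic": (
        "The learning tool records which tasks you solve and whether your "
        "answers are correct, so that your teacher can suggest what to practise next."
    ),
    "more_detail": (
        "We also store how long you spend on each task and how your results "
        "develop over time. You and your teacher can see this; other pupils cannot."
    ),
    "full_policy_url": "https://example.org/privacy/learning-analytics",
    # Contextual reminders, shown when the information is most relevant
    # (cf. the pop-up suggestion in the list above).
    "show_reminder_when": ["first_login", "new_subject_added"],
}


def first_layer_for(audience: str) -> str:
    """Choose which layer to present first for a given (assumed) user group."""
    if audience == "pupil":
        return LAYERED_INFO["basic"]
    return LAYERED_INFO["basic"] + " " + LAYERED_INFO["more_detail"]
```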

Explaining the system’s underlying logic

The AVT project’s learning analytics system is a decision-support system. This means that the system produces proposals and recommendations, but does not make autonomous decisions on behalf of the teacher or pupil. If the system had made automated decisions, it would have been covered by Article 22 of the GDPR, which requires relevant information to be provided about the system’s underlying logic. Whether information about the logic must be provided where there is no automated decision-making or profiling must be considered on a case-by-case basis, depending on whether it is necessary to ensure fair and transparent processing.

In the sandbox, we came to no conclusions about whether the AVT project is legally obligated to provide information about the underlying logic of the learning analytics system. We did, however, discuss the issue in light of the objective of the sandbox, which is to promote the development of ethical and responsible artificial intelligence. In this context, we discussed how explanations that provide users with increased insight into how the system works could increase trust in the system, promote its proper use and uncover potential defects.

But how detailed should an explanation of the system be? Is it sufficient to simply provide a general explanation for how the system processes personal data in order to produce a result, or should a reason for every single recommendation made by the system also be provided? And how does one provide the youngest pupils with a meaningful explanation? This sandbox project did not offer any final or exhaustive conclusions on these issues, but we did discuss benefits, drawbacks and various alternative solutions.

For the youngest pupils, creativity is a must when it comes to explanations, and these explanations do not necessarily need to be text-based. For example, the AVT project created an information video, which was presented to various stakeholders. The video was well-liked by the children, but garnered mixed reviews from the adults.

The children thought it explained the system in a straightforward way, but the adults found it did not include enough information. This illustrates, firstly, how different the needs of different people are and, secondly, how difficult it can be to find the right level and quantity of information.

The AVT project has also considered building a “dummy” version of the learning analytics system, which allows users to experiment with different variables. In this way, users can see how the information fed into the system affects the recommendations it makes. Visualisation is often quite effective at explaining advanced technology in a straightforward manner. One could also have different user interfaces for different target groups, such as one for the youngest pupils and another aimed at older pupils and parents.

A privacy policy is a useful way of providing general information about how the system processes personal data. We also discussed whether individual reasons for the system’s recommendations should be provided, that is, information about how the system has arrived at a specific recommendation and the data this recommendation is based on. Individual reasons could be made easily accessible to users, but do not necessarily have to be presented alongside the recommendation.

There are many benefits to providing reasons for the system’s recommendations. Giving pupils and teachers a broader understanding of how the system works could increase their trust in it. The reasons could also make teachers better able to genuinely assess the recommendations made by the system, thus mitigating the risk of it being used as a de facto automated decision-making system – in other words, teachers blindly trusting the system instead of using it to support their own decisions. In addition, examining the reasons given can help users to uncover errors and distortions in the system, thereby contributing to its improvement.
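
As an illustration of what an individual reason could look like, the hypothetical sketch below generates a short, human-readable explanation of a single recommendation, stating which data it is based on. The wording, fields and numbers are our own assumptions, not the AVT project's actual output.

```python
# Hypothetical sketch of an "individual reason" for one recommendation.
def explain_recommendation(pupil_name: str, topic: str,
                           estimated_mastery: float, recent_tasks: int) -> str:
    """Build a short, human-readable reason for a single recommendation."""
    return (
        f"The system suggests more practice in {topic} for {pupil_name} "
        f"because the estimated mastery of this topic is {estimated_mastery:.0%}, "
        f"based on the last {recent_tasks} tasks completed in the learning tools."
    )


print(explain_recommendation("Alfred", "algebra", estimated_mastery=0.41, recent_tasks=12))
```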
