Privacy Impact Assessments and Ethical Analysis
This page includes the privacy impact assessment (PIA), following the European Union Agency for Network and Information Security (ENISA) methodology, and provides an ethical analysis of the IMPETUS tools and platform that deal with sensitive and/or personal data, based on concrete situations.
Carrying out a DPIA is a responsibility of the Data controller, in accordance with art. 35 GDPR. The Data controller shall seek the advice of the data protection officer, where designated, when carrying out a DPIA. Data processors assist the Data controller in ensuring compliance with the obligations pursuant to art. 35 GDPR, taking into account the nature of the processing and the information available to the Data processor. With this in mind, the security measures applied to the IMPETUS tools were checked and mapped before the IMPETUS Live Exercises (“LEx”). The security measures were identified and described in accordance with the list provided in the “Handbook on security of personal data processing” prepared by ENISA. The privacy assessment of the tools provided hereunder is based on this list of security measures.
Indeed, for the LExs, the tools were connected to the Security Operation Centres (“SOCs”), networks, technological systems and other infrastructures of the different cities. The safety and efficiency of each tool, and of the data collected, depended largely on the cities' infrastructures and their settings; this should be considered for future implementations of the IMPETUS tools or similar technologies, as further explained here.
The different analysed technologies include: the UAD tool, the CTDR tool, the EO tool, the SMD tool, the Firearm Detector (FD) tool, the Workload Monitoring System (WMS) tool, the CTI tool, the BD tool and the IMPETUS platform.
Data Protection Impact Assessment (DPIA) During the IMPETUS LEx, the UAD tool did not process any personal data. For the project, the UAD tool was installed on a local server within the premises of the University of Milano, protected by a firewall and by other tools available on the University's premises, and was accessible only via VPN. Encryption of data in transit was applied. The tool collected aggregated data and numbers coming from the cameras and the SOC used by one of the cities. The images and the other data collected by the cameras were hashed within the SOC of the city. Only through the SOC could these data be deanonymized, e.g. for public security reasons. The UAD tool itself cannot deanonymize data; moreover, it was not connected to the tools within the SOC which are used to deanonymize data.
The data collected by the city of Oslo refer to individual buses. By collecting a sufficient amount of such data and comparing them with the databases of the municipality, it may become possible to identify specific persons. This did not happen during the LEx; therefore, a DPIA was not carried out.
Ethical issues The UAD tool integrates advanced machine learning algorithms. Human agency and oversight over the functioning of the tool and its algorithms have been considered and ensured at a sufficient level.
More specifically, UAD performs the following machine learning tasks: anomaly detection (identifying data that differ from what was previously observed) and classification (labelling data according to predefined categories, based on what the system learned from past data). The first is an unsupervised task, while the second is supervised. Nevertheless, an appropriate level of human control is ensured: the tool provides detected anomalies and classifications as “information” to the human operator, who then interprets the data and takes action accordingly.
The human operator is also capable of understanding the feedback received from the algorithm and of understanding how the algorithm produced that feedback, because the tool integrates a feature-ranking approach to justify the alerts it generates, as sketched below. These mechanisms also grant a sufficient level of transparency and explainability regarding the outcomes of the algorithmic system.
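The following sketch illustrates the two ideas just described: an unsupervised anomaly detector flags unusual aggregated readings, and a simple feature ranking justifies the alert to the operator. It is an illustration only: IsolationForest stands in for UAD's unspecified model, the feature names are invented, and the deviation-based ranking merely mimics the feature-ranking idea, not the project's actual method.

```python
# Illustrative sketch: anomaly detection on aggregated, non-personal readings
# plus a simple feature ranking to justify the alert to the human operator.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
FEATURES = ["vehicle_count", "avg_speed", "pedestrian_count"]  # hypothetical

# Hypothetical aggregated sensor readings (rows = time windows).
normal = rng.normal(loc=[50, 30, 20], scale=[5, 3, 4], size=(500, 3))
model = IsolationForest(random_state=0).fit(normal)

reading = np.array([[52, 5, 80]])       # unusually low speed, dense crowd
if model.predict(reading)[0] == -1:     # -1 = anomaly
    # Rank features by deviation from the training mean, as a justification
    # shown to the operator (a stand-in for the tool's feature-ranking approach).
    deviation = np.abs(reading[0] - normal.mean(axis=0)) / normal.std(axis=0)
    ranking = sorted(zip(FEATURES, deviation), key=lambda p: -p[1])
    print("Anomaly detected. Contributing features:", ranking)
```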
General considerations and recommendations As seen with the example of the municipality of Oslo, public entities may wish to process, through the UAD tool, the location data of public transport vehicles.
This could enable behavioural tracking of bus drivers, residents of remote areas who are usually the sole occupants of such buses in specific locations, and similar.
The smart city adopting the tool has to face this issue, since security measures and protocols of use would need to be implemented to ensure secure processing of the data of all citizens and data subjects.
Moreover, if the UAD tool is connected to datasets containing personal data and sensitive information, the access control module should be adapted to perform security- and privacy-aware transformations, ranging from pruning and reshaping to encrypting/decrypting or anonymizing the full resource or part of it, before giving access to the data, as in the sketch below.
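A minimal sketch of such transformations follows, assuming invented field names, roles and rules; it only illustrates pruning and pseudonymization applied by a hypothetical access-control module before a resource is released.

```python
# Hedged sketch of privacy-aware transformations (pruning, pseudonymization)
# applied before access is granted; roles and field names are invented.
import hashlib

RULES = {
    "soc_operator":   {"drop": ["driver_name"], "hash": ["bus_id"]},
    "soc_supervisor": {"drop": [], "hash": []},   # full access
}

def transform(record: dict, role: str) -> dict:
    rule = RULES.get(role, {"drop": list(record), "hash": []})  # default: deny all
    out = {k: v for k, v in record.items() if k not in rule["drop"]}
    for field in rule["hash"]:
        if field in out:  # pseudonymize instead of exposing the raw value
            out[field] = hashlib.sha256(str(out[field]).encode()).hexdigest()[:12]
    return out

print(transform({"bus_id": "OSL-42", "driver_name": "J. Doe", "speed": 31},
                "soc_operator"))
```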
With regard to the ethical use of the UAD tool, users must be specifically trained so that they can understand the outcomes, evaluate them and interpret them correctly, leading to the best possible decisions.
Data Protection Impact Assessment (DPIA) For the DPIA of the processing activities of the CTDR tool during the IMPETUS Live Exercises (“LEx”), the following information was considered:
Processed personal data: IP addresses of network scanning devices installed on city premises. Additionally, the tool collects other information that does not constitute personal data, such as network traffic, network scans, information about the organization's devices, and similar.
Storage location: IP addresses are saved on the premises of the Data controller (i.e., the public entity which adopts the tool), and can be processed (e.g., anonymized) before the tool and the IMPETUS platform receive the new contents.
Retention period: one month after the conclusion of each “project” (for anonymized data).
Data processors: the use of the tool and the data processing activities do not require external data processors.
The risks related to the data processing activities, including the evaluation of the impact and the analysis of the threats, were evaluated using the online tool provided by ENISA. The overall impact evaluation was medium and the overall threat occurrence probability was medium; therefore, the risk is “medium”. The risk assessment will depend heavily on the infrastructure and security measures adopted by the Data controller.
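As a simplified illustration of how the levels reported in this section combine, taking the higher of the two levels reproduces every assessment stated here; note that ENISA's online tool applies its own, more detailed matrix, so this is only a reading of the results, not the methodology itself.

```python
# Simplified illustration: overall risk as the higher of impact level and
# threat occurrence probability. This reproduces the assessments reported in
# this section; ENISA's actual tool uses its own, more detailed matrix.
LEVELS = ["low", "medium", "high", "very high"]

def risk(impact: str, probability: str) -> str:
    return LEVELS[max(LEVELS.index(impact), LEVELS.index(probability))]

assert risk("medium", "medium") == "medium"   # CTDR, FD, platform
assert risk("medium", "low") == "medium"      # SMD
assert risk("high", "low") == "high"          # WMS
```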
The technical and organisational security measures adopted for the LEx were considered adequate for the specific context of use. The Data controller shall consider that, depending on the specific context and situation, and especially on the extent of the scanned network and of the collected IP addresses, further security measures may be required.
The DPIA conducted for the LEx should be supplemented with information provided by the Data controller, as specified hereafter. In particular, the Data controller shall:
identify a valid legal basis for processing;
ascertain whether the data processing activity is proportionate and necessary, or not, considering also the impact on the rights of the public entity’s employees;
evaluate if it is required to conduct a complete DPIA in accordance with art. 35 GDPR and to consult the Data Protection Authority as provided for by art. 36 GDPR;
evaluate if a consultation of data subjects has to be done, in accordance with art. 35.9 GDPR, to seek their views on the intended processing.
Ethical issues The CTDR tool does not use machine learning, deep learning or other kinds of advanced algorithms. Data analysis is performed by the logical reasoner, which processes a network scan, and by algorithms developed in programming languages such as PHP, Python and JavaScript.
The CTDR tool and its algorithm grant a sufficient level of human control and oversight. The tool does not take final decisions; it only helps to analyse and connect alerts sent by different security tools. In any case, the outcomes require analysis and post-processing by IT and cybersecurity experts. The alerts received by human operators carry information about the vulnerability being exploited, the criticality (i.e., impact or collateral damage) and the possible mitigation actions, as illustrated below. In this way, human operators are able to understand how the algorithm produced the alert, which grants a sufficient level of transparency.
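A hypothetical shape for such an alert is sketched below; the field names and example values are assumptions, chosen only to show how vulnerability, criticality and mitigations travel together so the operator can see why the alert was raised.

```python
# Hypothetical CTDR alert structure; field names are assumptions, not the
# tool's actual schema.
from dataclasses import dataclass, field

@dataclass
class CtdrAlert:
    cve_id: str                      # vulnerability being exploited
    criticality: str                 # impact / collateral damage estimate
    affected_host: str               # pseudonymized or raw, per controller policy
    mitigations: list[str] = field(default_factory=list)

alert = CtdrAlert(
    cve_id="CVE-2021-44228",
    criticality="high",
    affected_host="host-7f3a",       # already pseudonymized in this example
    mitigations=["apply vendor patch", "isolate host from the network"],
)
print(f"[{alert.criticality.upper()}] {alert.cve_id} on {alert.affected_host}")
```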
General considerations and recommendations For future use case scenarios, public entities adopting the CTDR tool should evaluate further aspects.
Firstly, the tool was created applying the best security practices in the relevant field. This is due also to the fact that it is based on open-source software, which is subject to constant “peer review”.
The choice of Kafka to share data with the IMPETUS platform is also satisfactory, since Kafka allows the data to be encrypted with a public key whose private counterpart is held on the IMPETUS Platform, so that no other system reading from the Kafka bus can decrypt the data.
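A minimal sketch of this pattern follows, assuming the kafka-python and cryptography packages; the broker address, topic name and key file are invented, and the actual IMPETUS wiring may differ. The payload is encrypted with a symmetric session key, which is itself wrapped with the platform's RSA public key.

```python
# Sketch: hybrid encryption of a Kafka message so only the holder of the
# private key (the IMPETUS Platform) can read it. Names are assumptions.
import json
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding
from kafka import KafkaProducer

with open("impetus_platform_pub.pem", "rb") as f:   # hypothetical key file
    platform_pub = serialization.load_pem_public_key(f.read())

session_key = Fernet.generate_key()                 # symmetric key per message
wrapped_key = platform_pub.encrypt(                 # wrapped with the public key
    session_key,
    padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                 algorithm=hashes.SHA256(), label=None),
)
ciphertext = Fernet(session_key).encrypt(json.dumps({"scan": "..."}).encode())

producer = KafkaProducer(bootstrap_servers="localhost:9092")  # assumed broker
producer.send("ctdr-results", json.dumps({
    "key": wrapped_key.hex(), "payload": ciphertext.decode()
}).encode())
producer.flush()
```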
On the other hand, other features of the tool can be adapted to the needs of public entities, and such choices may bring both improvements and higher risks.
First of all, the file with the results of the scan of the network's vulnerabilities will be downloaded to the premises of the public entity. Therefore, before scanning, it is fundamental to ascertain that this file will be saved on a secure server (either local or in the cloud).
Secondly, other software can be used instead of Nessus to scan the network. This would require some adaptations of the CTDR tool's algorithm, and the public entity will be responsible for choosing software that grants at least the same level of efficiency and the same security measures as Nessus.
In the third place, it should be considered that the tool's ability to detect vulnerabilities depends on the knowledge graph used. This graph needs to be updated from time to time. Therefore, the development and use of a Natural Language Processing (“NLP”) model is foreseen, in order to update the knowledge graph automatically; a toy illustration follows. Updating the tool itself will be possible since it will be released as open-source software. The use of natural language processing could raise further issues, especially ethical ones: ethical issues associated with NLP do not subside with the process of data generation but recur at every stage, concerning learning bias as well as the evaluation, aggregation and deployment stages.
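The toy stand-in below extracts CVE identifiers from advisory text with a regular expression and adds them to a graph; a real NLP model would also extract relations between entities, which is precisely where the bias and evaluation issues mentioned above arise. The product name and advisory text are invented.

```python
# Toy stand-in for the foreseen NLP-based knowledge-graph update: a regex
# pulls CVE identifiers out of free text and links them to a product node.
import re

advisory = "Apache Log4j2 is affected by CVE-2021-44228 and CVE-2021-45046."
knowledge_graph: dict[str, set[str]] = {}      # product -> known CVEs

for cve in re.findall(r"CVE-\d{4}-\d{4,7}", advisory):
    knowledge_graph.setdefault("log4j2", set()).add(cve)

print(knowledge_graph)
```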
Lastly, anonymization of the IP addresses collected when vulnerabilities are detected (either exploited or likely to be exploited) should be considered; a sketch is given at the end of this discussion.
On the one hand, retaining IP addresses may facilitate a deeper analysis of vulnerabilities and the prevention of future attacks. Indeed, the organisation adopting the CTDR tool could better evaluate specific countermeasures, including training of the employees.
On the other hand, the lack of anonymization of IP addresses makes it possible to detect whether the action of a specific employee contributed to the exploitation of a certain vulnerability. In this way, the tool could be regarded as an instrument to monitor workers and to impose disciplinary sanctions.
This use of the tool could lead to breaches of labour law provisions, as analysed here. In general, each public entity must adopt internal procedures governing how much data is stored, how data are accessed and shared, and when they are deleted.
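As a closing illustration of the anonymization option discussed above, keyed hashing yields stable pseudonyms for IP addresses, so analysis remains possible without exposing which employee's device was involved; the key handling shown is illustrative only.

```python
# Sketch: keyed hashing of IP addresses so reports contain stable pseudonyms
# instead of raw IPs; re-identification stays under the controller's control.
import hashlib
import hmac

SECRET = b"controller-held-key"   # in practice: stored in a secrets manager

def anonymize_ip(ip: str) -> str:
    return hmac.new(SECRET, ip.encode(), hashlib.sha256).hexdigest()[:16]

print(anonymize_ip("10.0.3.17"))  # stable pseudonym, no raw IP in reports
```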
Data Protection Impact Assessment (DPIA) The EO tool does not require the processing of any personal data for its functioning. For example, in cities equipped with people-counting sensors, the tool would register only the number of people crossing a specific gate at a given time and calculate the density in a specific public space. It could also elaborate historical data on the number of people in a public space over a particular period of time.
On the other hand, the output data of the tool consist of guidelines and numbers representing parameters for managing crowds (time for egress, available gates, etc.), without any reference to identifiable persons.
Ethical issues The EO tool works according to the following principles: reference scenarios involving the egress of a crowd are pre-simulated through dedicated software, and the results are synthesized and turned into a set of written guidelines, a video of the simulated egress and a coloured risk class. The EO tool per se does not contain any algorithm. For ethical issues which may arise from its use in real-life scenarios, please refer to the general considerations and recommendations for use.
Data Protection Impact Assessment (DPIA) For the DPIA of the processing activities of the SMD tool during the Live Exercises (“LEx”), the following information was considered:
Processed personal data: data included in publicly available text messages on Twitter; the author's id, nickname, name and location (as provided by the user). A maximum of 150 messages per keyword is collected (per execution, in the case of projects set to run periodically); a lower value may be set.
Storage location: servers of Amazon Web Services located in Ireland during the LEx; servers of the Data controller (i.e., the public entity which adopts the tool) for real-life use cases.
Retention period: one month after the conclusion of each “project” (for anonymized data).
Data processors: during the LEx, Amazon. Normally, the use of the tool and the data processing activities do not necessarily require external data processors.
The risks related to the data processing activities, including the evaluation of the impact and the analysis of the threats, were evaluated using the online tool provided by ENISA. The overall impact evaluation was medium, while the overall threat occurrence probability was low; therefore, the risk is “medium”. The risk assessment will depend heavily on the infrastructure and security measures adopted by the Data controller.
Ethical issues In general, the tool and its algorithm have been developed in compliance with the rules of trustworthy AI, granting:
Human agency and oversight
Technical robustness and safety
Privacy and data governance
Diversity, non-discrimination and fairness
Accountability.
More specifically, the machine learning models only provide evaluations and classifications for different features; they do not take final decisions. It is the human operator who reviews and analyses the results and takes all the relevant decisions and subsequent steps. The human operator is capable of understanding the feedback received from the advanced algorithm and of understanding how that feedback was produced, since an explainability method explains the scores provided by the models.
Data that may lead to the identification of a user can be anonymized or pseudonymized, according to the requirements imposed by the Data controller. It is also possible to establish different levels of access to personal data, giving only certain users and roles the ability to unscramble pseudonymized data and return them to the original format, as sketched below. The key for decryption is stored in a separate section of the system, and this key is itself encrypted.
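The sketch below illustrates this scheme under stated assumptions: author fields are encrypted rather than stored in the clear, and only a privileged role may reverse the operation. The field names, role check and in-memory key are all invented; in the real tool the key is stored encrypted in a separate section of the system.

```python
# Hedged sketch of reversible pseudonymization with role-gated unscrambling.
from cryptography.fernet import Fernet

PII_FIELDS = ("author_id", "nickname", "name", "location")
field_key = Fernet.generate_key()   # illustrative; the real key is stored
vault = Fernet(field_key)           # encrypted, in a separate system section

def pseudonymize(message: dict) -> dict:
    out = dict(message)
    for f in PII_FIELDS:
        out[f] = vault.encrypt(str(out[f]).encode()).decode()
    return out

def unscramble(message: dict, role: str) -> dict:
    if role != "intelligence_analyst":   # only some roles may reverse it
        raise PermissionError("role not allowed to de-pseudonymize")
    return {k: (vault.decrypt(v.encode()).decode() if k in PII_FIELDS else v)
            for k, v in message.items()}

msg = pseudonymize({"author_id": "12345", "nickname": "user1",
                    "name": "Jane", "location": "Oslo", "text": "..."})
```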
Data Protection Impact Assessment (DPIA) For the DPIA of the processing activities of the FD tool during the LEx, the following information was considered:
Processed personal data: images of volunteers;
Storage location: images were processed on the edge device located inside the municipalities' premises and shared with the IMPETUS platform, hosted on a partner's secure servers, in case of detection of an “emergency”. Data are shared via the internet, but in future real-life use cases the platform itself should be installed on the premises of the Data controller;
Retention period: until the end of the LEx. To be decided with Data controllers for future use cases;
Data processors: the subject responsible for the training of the AI.
The risks related to the data processing activities, including the evaluation of the impact and the analysis of the threats, were evaluated using the online tool provided by ENISA. The overall impact evaluation for the use of the FD tool during the LEx was medium and the overall threat occurrence probability was medium; therefore, the risk is “medium”. The risk assessment will depend heavily on the infrastructure and security measures adopted by the Data controller, according to the description of the functioning of the tool provided in the previous paragraph.
The technical and organisational security measures adopted were considered adequate for the specific context of use of the LEx. The Data controller shall consider that, according to the specific context and situation, further security measures may be required.
Therefore, the DPIA conducted for the LEx should be supplemented with information provided by the Data controller, as specified here.
In particular, the Data controller shall:
identify a valid legal basis for processing;
ascertain whether the data processing activity is proportionate and necessary, or not;
appropriately inform the data subjects in accordance with art. 13 GDPR;
evaluate whether a consultation of the data protection authority in accordance with art. 36 GDPR is necessary, or not;
evaluate if a consultation of data subjects has to be done, in accordance with art. 35.9 GDPR, to seek their views on the intended processing.
Ethical issues During the LEx, the potentially relevant ethical issues that the use of the FD tool could pose were not considered an obstacle, since only a small number of images were made visible to SOC operators and the tool was used for a limited period of time.
Data Protection Impact Assessment (DPIA) For the DPIA of the processing activities of the WMS tool during the LEx, the following information was considered:
Processed personal data: biosignals (health data), namely brain signals collected through an electroencephalogram (EEG) and heart rate and blood flow signals collected through photoplethysmography (PPG). Training data: age and information about personality and health status.
Storage location: stored locally on a computer provided by the tool developer during the LEx; a device belonging to the Data controller (i.e., the public entity which adopts the tool) for real-life use cases.
Retention period: raw training data are anonymized immediately after the creation of the assessment model (they are buffered for only 5 seconds). Workload predictions are stored for 5 minutes before being deleted (the storage period can be changed according to the needs of the Data controller); see the sketch after this list.
Data processors: for the LEx, the tool developer. Normally, the use of the tool and the data processing activities do not require any data processor.
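The sketch below mirrors the retention rules just listed with two rolling buffers: raw samples survive only 5 seconds, workload predictions 5 minutes. The timing values come from the text above; the code itself is only an illustration of time-bounded storage, not the tool's implementation.

```python
# Illustrative time-bounded buffers matching the stated retention periods.
import time
from collections import deque

RAW_TTL, PREDICTION_TTL = 5, 300          # seconds, per the text above

raw_buffer: deque = deque()               # (timestamp, sample)
predictions: deque = deque()              # (timestamp, workload_class)

def add_sample(sample) -> None:
    now = time.time()
    raw_buffer.append((now, sample))
    while raw_buffer and now - raw_buffer[0][0] > RAW_TTL:
        raw_buffer.popleft()              # raw data dropped after 5 s

def add_prediction(label: str) -> None:
    now = time.time()
    predictions.append((now, label))
    while predictions and now - predictions[0][0] > PREDICTION_TTL:
        predictions.popleft()             # predictions deleted after 5 min
```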
The risks related to the data processing activities, including the evaluation of the impact and the analysis of the threats, were evaluated using the online tool provided by ENISA. The overall impact evaluation for the use of the WMS tool during the LEx was high, while the overall threat occurrence probability was low; therefore, the risk is “high”. The risk assessment will depend heavily on the infrastructure and security measures adopted by the Data controller.
The technical and organisational security measures adopted for the LEx were considered adequate for the specific context of use. The Data controller shall consider that, according to the specific context and situation, further security measures may be required.
The DPIA conducted for the LEx should be supplemented with information provided by the Data controller, as specified here.
In particular, the Data controller shall:
identify a valid legal basis for processing;
ascertain whether the data processing activity is proportionate and necessary, or not;
appropriately inform the data subjects in accordance with art. 13 GDPR;
evaluate whether a consultation of the data protection authority in accordance with art. 36 GDPR is necessary, or not;
evaluate if a consultation of data subjects has to be done, in accordance with art. 35.9 GDPR, to seek their views on the intended processing.
As for the legal basis, it should be considered that, in the European Union, consent is usually not recognised as a valid legal basis in the relationship between workers and employers.
Ethical issues Before the LEx, volunteers were thoroughly informed about the planned activities and the management of their personal (health) data. Moreover, information about their mental and physical health was deleted immediately after the end of the LEx.
In general, the tool and its algorithm have been developed in compliance with the rules of trustworthy AI, granting:
Human agency and oversight
Technical robustness and safety
Privacy and data governance
Accountability.
Moreover, the machine learning models cannot be extrapolated to tasks other than the one learned during training, nor can they provide accurate classifications for data from a domain other than the training data.
Workload classifications are provided to the supervisor through the WMS dashboard. Based on the classification, the supervisor is able to assess which action is required in order to guarantee the team's performance; there is therefore constant involvement of human operators in the decision-making.
Data Protection Impact Assessment (DPIA) The CTI tool collects the following types of information: all data that can be extracted from available sources in the clear, deep and dark web, including leaked data and threatened or breached databases, plus information and data related to the public entity which uses the tool. This refers, in particular, to domain names, IPs, aliases, BINs, CVEs (Common Vulnerabilities and Exposures) of their websites, data of executives (such as the mayor), etc. This information is necessary to set the monitoring target for the tool and to give the end users relevant alerts.
When a threat is detected, the alert and the context reported by the CTI tool may contain personal data referring to more or less identifiable persons, according to their nature. In the case of threatened or breached databases, the public entity which has adopted the tool will receive only the parts of them which relate to threats to its own organization. During the IMPETUS LEx, the analysed data were stored on Amazon Web Services servers and were retained only for the duration of the LEx.
Ethical issues The CTI tool provides alerts and insights based on collected and analysed data, but it is always possible for human operators to flag false positives, meaning alerts that are not relevant. Human control and traceability are also granted by logs which record how the algorithm works.
The explainability of the functioning of the advanced algorithm depends on the module used. Generally speaking, the algorithm applies a risk scoring calculation to different entities. In some modules, the human operator can understand how the tool calculates the risk score and which factors were taken into account; in others, the human operator only sees the alerts and the threats and sources from which they originated.
Data protection and ethical issues: The BD tool collects only environmental data, in particular the concentration of bacteria in the air. It does not collect any personal data.
The BD tool is an air analyser which does not contain any advanced analysis algorithm. The lists of immediate actions to be undertaken by SOC operators are defined in advance together with the public entity adopting the tool. The aim of the BD tool in this context is simply to facilitate the retrieval of the most suitable list of countermeasures.
Data Protection Impact Assessment (DPIA) For the DPIA of the processing activities of the platform during the IMPETUS Live Exercises (“LEx”), the following information was considered:
Storage location: stored locally on the servers of the platform provider. Servers of the Data controller for real-life future uses;
Retention period: for the duration of the LEx.
During the IMPETUS LEx, the platform could collect personal data only from the Firearm Detector and the Workload Monitoring System tools. Moreover, the platform may occasionally process personal data if they are contained in messages sent through the chat. The chat is meant to convey technical feedback or requests for support; therefore, it should not be used to share personal data. The use of the platform and the related data processing activities performed by any public entity do not require any data processor.
With regard to the FD tool, it shares images from CCTV cameras. The AI in the FD tool scans the footage provided by the cameras to detect the presence of weapons in the video images. To protect privacy, the AI systematically anonymizes people's faces by blacking them out. These images are never recorded. When the FD tool detects an alert, it asks the SOC operator to evaluate and confirm it. In this context, the tool and the platform process the following data:
a) jpeg snapshots with a visual bounding box of the anomaly (i.e., gun or assault rifle);
b) video sequence of the red alert with a visual bounding box of the anomaly;
c) a raw video sequence of the alert (clean of any bounding box); here the person holding the weapon will be visible;
d) GPS coordinates of the red alert (and therefore, of the person holding the weapon).
If the dispatcher of the SOC validates the alert as an “Emergency”, the alert is shared using the Security Operation Centre (SOC) room protocols.
As regards the WMS tool, the platform allows the end users (usually, SOC supervisors) to receive alerts when the workload of an employee wearing a specific sensor is considered excessive. The alert is associated with the employee. The Data controller must evaluate whether to show the exact name of the employee or rather an anonymous indication such as “sensor 01 – Excessive workload”; the latter reduces the processing of personal data through the platform, as in the sketch below.
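The sketch below illustrates this display choice: a controller-side configuration flag decides whether the alert shows the employee's name or only the neutral sensor label. The flag and the sensor-to-employee mapping are hypothetical.

```python
# Sketch of the controller's display choice for WMS alerts.
SHOW_NAMES = False                        # Data controller's configuration choice
SENSOR_OWNERS = {"sensor-01": "J. Doe"}   # held by the controller, not the platform

def alert_label(sensor_id: str) -> str:
    if SHOW_NAMES:
        return f"{SENSOR_OWNERS.get(sensor_id, sensor_id)} – Excessive workload"
    return f"{sensor_id} – Excessive workload"

print(alert_label("sensor-01"))           # "sensor-01 – Excessive workload"
```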
The risks related to the data processing activities during the LEx, including the evaluation of the impact and the analysis of the threats, were evaluated using the online tool provided by ENISA. The overall impact evaluation for the use of the platform during the LEx was medium and the overall threat occurrence probability was medium; therefore, the risk was “medium”.
The technical and organisational security measures adopted during the LEx were considered adequate for the specific context of use. In particular, it was considered that the platform implements role-based access: access to the different tools is possible only for specific types of end users (SOC operators, SOC supervisors, IT specialists, IT supervisors, intelligence analysts and technical administrators), as sketched below.
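A minimal sketch of such a role-based access model follows; the mapping of roles to tools is invented for illustration and would be defined per deployment.

```python
# Minimal role-based access sketch; the role-to-tool mapping is hypothetical.
ROLE_TOOLS = {
    "soc_operator":         {"UAD", "FD", "EO", "BD"},
    "soc_supervisor":       {"UAD", "FD", "EO", "BD", "WMS"},
    "it_specialist":        {"CTDR"},
    "intelligence_analyst": {"CTI", "SMD"},
}

def can_access(role: str, tool: str) -> bool:
    return tool in ROLE_TOOLS.get(role, set())

assert can_access("soc_supervisor", "WMS")
assert not can_access("soc_operator", "CTI")
```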
The DPIA conducted for the LEx should be supplemented with information provided by the Data controller, as specified here, since the risk assessment highly depends on the security of the infrastructure on which the platform is installed.
In particular, the Data controller shall:
identify a valid legal basis for processing;
ascertain whether the data processing activity is proportionate and necessary, or not;
appropriately inform the data subjects in accordance with art. 13 GDPR;
evaluate whether a consultation of the data protection authority in accordance with art. 36 GDPR is necessary, or not;
evaluate if a consultation of data subjects has to be done, in accordance with art. 35.9 GDPR, to seek their views on the intended processing.
Ethical issues The IMPETUS platform itself does not implement any algorithm. It only shows the results and the alerts produced by the algorithms of the various tools.
The IMPETUS platform gives end users easier access to the tools of interest, which can be all of them or only a selection. The ethical issues to be considered are those raised by the single tools; sometimes the concerns underlined with respect to one tool may be amplified when that tool is used in combination with others.