Server-side Techniques
Server-side enabling technologies fall into three main categories: data perturbation, Secure Multiparty Computation (SMC), and database anonymization.
Data perturbation
Data perturbation aims at intentionally making information difficult to understand or perceive, for security and privacy reasons. The speed at which information spreads, technical progress and the global nature of the Internet make it difficult to delete data that may be too personal, embarrassing or confidential. Perturbation therefore consists mainly in publishing large amounts of information that is false, imprecise, irrelevant and/or organized in such a way that the information to be protected is hidden, i.e., embedded in a large volume of data. Data perturbation techniques are used to enhance privacy in various querying services. To protect queries, one idea is to generate dummy queries that are sent to the central server along with the real query. The main issue with these techniques is the privacy-utility trade-off induced by suppression, i.e., the removal of some records or details.
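The dummy-query idea can be illustrated with a minimal Python sketch. The function name obfuscate_query, the decoy pool and the example queries are hypothetical and chosen only for illustration; a realistic system would also need decoys that are statistically plausible, which this sketch does not attempt.

```python
import random

def obfuscate_query(real_query, dummy_pool, n_dummies, rng=None):
    """Return the real query mixed with n_dummies decoys, in random order."""
    rng = rng or random.Random()
    # Exclude the real query from the candidates so no decoy duplicates it.
    candidates = [q for q in dummy_pool if q != real_query]
    if n_dummies > len(candidates):
        raise ValueError("not enough distinct dummy queries in the pool")
    batch = rng.sample(candidates, n_dummies) + [real_query]
    # Shuffle so that the position in the batch does not reveal the real query.
    rng.shuffle(batch)
    return batch

# Hypothetical decoy pool and real query.
pool = ["weather paris", "train timetable", "pasta recipe",
        "football scores", "used bikes"]
batch = obfuscate_query("clinic near me", pool, n_dummies=3,
                        rng=random.Random(0))
```

The server receives four queries instead of one, and the client simply discards the answers to the decoys. The cost is visible in the sketch: every real query multiplies the traffic and server load by n_dummies + 1.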
Secure Multiparty Computation (SMC)
Secure multiparty computation (SMC) techniques aim at protecting users’ privacy and the secrecy of data contents while these data are being processed. Their goal is to enable participating entities to carry out distributed computing tasks in a secure manner. That is, a group of participants wants to jointly compute a given function while each party keeps its own input data secret.
SMC has been used to solve several privacy-preserving problems, such as private database queries, secret voting, privacy-preserving data mining, and privacy-preserving intrusion detection tools and mechanisms.
Three approaches are generally used to provide secure multiparty computation functionalities, namely oblivious transfer, homomorphic encryption and secret sharing. Oblivious transfer protocols generate high processing and communication overheads. Secret sharing gives better results in terms of computation cost, thanks to its use of primitive operations. However, it requires secure channels between the participating entities and consumes a lot of bandwidth, because of the interactions involved between users. Homomorphic encryption does not require secure channels and ensures a high level of privacy. However, it needs several processing operations to ensure the homomorphism properties, which results in high computational complexity.
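As an illustration of the secret sharing approach, the following Python sketch computes a joint sum with additive shares. It is a simplified, single-process simulation under stated assumptions: the modulus PRIME, the function names and the salary figures are hypothetical, the parties are assumed to be honest-but-curious, and the secure channels are only mentioned in comments rather than implemented.

```python
import secrets

# Public modulus agreed on by all parties (a Mersenne prime); an assumption
# of this sketch. Inputs and their sum must stay below it.
PRIME = 2**61 - 1

def share(secret, n_parties):
    """Split secret into n_parties additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares sum to the secret.
    last = (secret - sum(shares)) % PRIME
    return shares + [last]

def secure_sum(inputs):
    """Simulate a joint sum in which no party sees another party's input."""
    n = len(inputs)
    # Party i splits its input; share j would be sent to party j over a
    # secure channel (n * (n - 1) messages in total).
    all_shares = [share(x, n) for x in inputs]
    # Party j adds the shares it received and publishes only this partial sum.
    partial = [sum(all_shares[i][j] for i in range(n)) % PRIME
               for j in range(n)]
    # Anyone can combine the published partial sums into the final result.
    return sum(partial) % PRIME

salaries = [52000, 61000, 47500]   # hypothetical private inputs
total = secure_sum(salaries)
```

Only modular additions are needed, which matches the low computation cost noted above, while the number of share messages grows quadratically with the number of parties, which accounts for the bandwidth cost.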
Database anonymization
Database anonymization techniques are mainly used to protect data within statistical databases. They address the trade-off between data usability and the preservation of users’ privacy: the released results, whether the database itself or a specific result computed over it, do not reveal information related to a specific user. These techniques also include Differential Privacy mechanisms.
Anonymization techniques are relevant for various use cases, namely applications that do not need to learn the original user’s identity, but only context information. They mainly refer to database privacy preservation. For cooperative applications where the database belongs to several corporations, however, they also concern the privacy protection of the various collaborating entities.
The main techniques for anonymizing databases with respect to respondent, owner and user privacy include k-anonymity, l-diversity and t-closeness. Note that these techniques, originally applied to statistical databases, have since been extended to dynamic data.
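To make two of these notions concrete, the Python sketch below measures the k-anonymity level of a small table (the size of the smallest group of records sharing the same quasi-identifier values) and its distinct l-diversity level (the smallest number of distinct sensitive values in any such group). The table, the column names and the function names are hypothetical, and the sketch only checks an already generalized table; it does not perform the generalization itself.

```python
from collections import Counter

def k_anonymity_level(records, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

def l_diversity_level(records, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values within any group."""
    values = {}
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        values.setdefault(key, set()).add(r[sensitive])
    return min(len(v) for v in values.values())

# Hypothetical table whose quasi-identifiers are already generalized.
table = [
    {"age": "30-39", "zip": "750**", "disease": "flu"},
    {"age": "30-39", "zip": "750**", "disease": "asthma"},
    {"age": "40-49", "zip": "690**", "disease": "flu"},
    {"age": "40-49", "zip": "690**", "disease": "flu"},
]
k = k_anonymity_level(table, ["age", "zip"])             # 2
l = l_diversity_level(table, ["age", "zip"], "disease")  # 1
```

The example table is 2-anonymous but only 1-diverse: everyone in the second group has the same disease, so knowing that a person belongs to that group reveals the sensitive value. This is the kind of gap that l-diversity and t-closeness are meant to close.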
Differential privacy (DP) is attracting growing interest, primarily as a way to guarantee privacy-preserving data mining. In a nutshell, differential privacy ensures that the removal or addition of a single database item does not (substantially) affect the outcome of any analysis, i.e., the probability distribution of released items does not change significantly. This property is enforced by adding random noise to the exact outcome.
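One common way to add such noise is the Laplace mechanism, sketched below in Python for a counting query. This is a minimal illustration rather than a production mechanism: the names laplace_noise and private_count, the privacy parameter epsilon and the sample data are assumptions of this sketch, and floating-point subtleties that matter in real deployments are ignored.

```python
import random

def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def private_count(records, predicate, epsilon, rng=None):
    """Release a noisy count of the records that satisfy predicate."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one record changes a count by at most 1.
    sensitivity = 1.0
    # A smaller epsilon means more noise and therefore stronger privacy.
    return true_count + laplace_noise(sensitivity / epsilon, rng)

ages = [23, 35, 41, 29, 52, 38, 61, 33]   # hypothetical database
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5,
                      rng=random.Random(1))
```

The exact count here is 3, and the released value is that count plus noise of scale 1/epsilon = 2. Because the noise scale depends only on the sensitivity and on epsilon, the presence or absence of any single record shifts the output distribution by a bounded factor, which is the property described above.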