

The Potential of Differential Privacy (decentriq)

The Expert Group met virtually on June 26, 2022.

Tim Geppert from ZHAW opened the meeting and introduced Andrew Knox from decentriq.
Andrew gave the group an intuitive introduction to the basics of differential privacy.

The following paragraphs summarize this introduction (a reference to further information can be found below).

To better understand how differential privacy works, we will use the example of a collaboration between a clothing brand and a digital newspaper. The first thing the brand wants to do with the digital newspaper's data is understand how many users have similar interests to the clothing brand's customers. Running these computations without any privacy control could easily allow the brand to single out specific newspaper customers, as well as to learn more than it is supposed to know about the reading habits of individual brand customers.

What differential privacy says is that, for a given output, you are limited in how sure you can be that a given input caused it. This limit on privacy leakage is achieved by adding some noise in the process of answering each question. Practically, this means that the (noisy) answer to the question the brand is asking will be (almost) the same even if any single user were removed from the dataset completely. Consequently, the clothing brand can never know whether the result it got came from a dataset that included a specific user, effectively protecting the privacy of any specific individual. The tuning part comes into play when we talk about the amount of noise added to each answer.

The amount of noise is determined by the parameter ε (epsilon). The lower the ε, the noisier (and more private) the data is. However, a differentially private system does not only add noise; it can also use its knowledge of ε to optimize the utility of the data by factoring the noise into the aggregate calculations. Determining the right ε for a differentially private system is a non-trivial task, because it requires the data owner to understand the privacy risk that a specific ε entails and to decide what level of risk they are comfortable undertaking.
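
To make this concrete, here is a minimal Python sketch of a counting query protected with the Laplace mechanism, one standard way of achieving ε-differential privacy. The dataset size, the query, and the ε values are made up for illustration and do not reflect decentriq's actual implementation:

```python
# Minimal sketch of the Laplace mechanism for a counting query.
# Illustrative only: the numbers and the query are hypothetical.
import numpy as np

rng = np.random.default_rng()

def noisy_count(true_count: int, epsilon: float) -> float:
    # Removing or adding one person changes a count by at most 1
    # (sensitivity = 1), so Laplace noise with scale 1/epsilon
    # gives epsilon-differential privacy for this query.
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# "How many newspaper readers share interests with the brand's customers?"
true_count = 12_340

print(noisy_count(true_count, epsilon=0.1))  # very noisy, very private
print(noisy_count(true_count, epsilon=5.0))  # close to the true count, less private
```

Note how ε directly controls the noise scale: a small ε means strong privacy but noisy answers, while a large ε means accurate answers but weaker privacy.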

Following the talk, the participants discussed the opportunities and challenges of this privacy-enhancing technology and possible industry use cases. A key takeaway was that differential privacy allows organizations to make more informed decisions about their data privacy, but the privacy/utility trade-off still exists.

If you would like more information about differential privacy, read the full introductory article by decentriq at https://blog.decentriq.com/differential-privacy-as-a-way-to-protect-first-party-data/, which provides additional insights into the limitations and features of differential privacy.

Expert Group Meeting – Privacy Technologies for Data Collaboration

Program
In the upcoming meeting, Roger Fontana and Hartmut Schulze will give a deep dive into Swarm Learning, a new PET for data-based collaborative learning. They will explain how the PET works and present example use cases.
See also the following publication for further reference:

Nature, Volume 594 Issue 7862, 10 June 2021
https://www.nature.com/articles/s41586-021-03583-3



We are looking forward to seeing you in the upcoming meeting.

Expert Group members please refer to the calendar invitation.
Other interested parties, please contact Tim Geppert (mail) no later than one week before the event (08.02.2022).

Expert Group “Privacy Technologies” meeting, 23.11.2021

The fourth meeting of the Expert Group Privacy Technologies for Data Collaboration took place in Zurich and online on November 23, 2021 in the afternoon. We were joined by 9 participants.

Juan Troncoso-Pastoriza from EPFL, co-founder of the startup Tune Insight, explained the concept of homomorphic encryption as well as its use for federated learning. He showed several use cases where homomorphic encryption had been applied (e.g., in the health sector).
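
To give a flavour of what homomorphic encryption makes possible, the following self-contained Python sketch implements a textbook, deliberately tiny (and therefore completely insecure) version of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. It illustrates the general idea only and is unrelated to the scheme or implementation used by Tune Insight:

```python
# Toy, textbook Paillier scheme -- illustrative only, NOT secure (tiny primes).
# Shows the additively homomorphic property: E(m1) * E(m2) decrypts to m1 + m2.
# Requires Python 3.9+ (math.lcm, pow(x, -1, n)).
import math
import random

p, q = 293, 433                  # toy primes; real systems use ~1024-bit primes
n = p * q
n_sq = n * n
g = n + 1                        # standard generator choice
lam = math.lcm(p - 1, q - 1)     # Carmichael's lambda(n)
mu = pow(lam, -1, n)             # modular inverse of lambda modulo n

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c: int) -> int:
    L = (pow(c, lam, n_sq) - 1) // n
    return (L * mu) % n

# Two parties encrypt their local counts; an aggregator adds them
# without ever seeing the plaintext values.
c1, c2 = encrypt(17), encrypt(25)
c_sum = (c1 * c2) % n_sq         # homomorphic addition = ciphertext multiplication
print(decrypt(c_sum))            # 42
```

In a federated learning setting, the same property lets a central aggregator combine encrypted contributions from several institutions without being able to inspect any individual input.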

In the second half of the meeting, the participants discussed the modus operandi for next year's meetings and fixed three dates for 2022.

data innovation alliance at the AI+X Summit

The ETH AI Center celebrated its first birthday on October 15, 2021, at the AI+X Summit and the data innovation alliance was there to congratulate and to join the inspiring crowd. The day started with workshops.

David Sturzenegger and Stefan Deml from Decentriq organized one of the workshops on “Privacy-preserving analytics and ML” in the name of the alliance.

It was our first in-person workshop in a long time, and such a great experience for us. We gave an overview of various privacy-enhancing technologies (PETs) to a very engaged and diverse audience of about 30 people. We had in-depth discussions about the use cases that PETs could unlock, and also presented Decentriq's data clean rooms and our use of confidential computing. Our product certainly generated a lot of follow-up interest, especially from those who wanted to reach out to demo the platform. We were also joined by a guest speaker from Hewlett Packard who spoke about “Swarm Learning”.

David Sturzenegger, Stefan Deml

Melanie Geiger from the data innovation alliance office attended the workshop on AI + Industry & Manufacturing led by Olga Fink from ETH. The overall goal of the workshop was to identify the next research topics. Small groups of representatives from manufacturing companies mixed with researchers discussed the challenges and opportunities of predictive maintenance, quality control, optimization, and computer vision. We identified research topics such as more generalizable predictive maintenance methods that work for multiple machines or even multiple manufacturing companies. But we also realized that some challenges lie more on the operational or applied-research side, such as integrating the methods into the whole manufacturing process and closing the feedback loop.

In the evening the exhibition and the program on the main stage attracted 1000 participants. We had many interesting discussions at our booth with a wonderful mix of students, entrepreneurs, researchers, and people from the industry. Of course, we also saw many familiar faces and due to the 3G policy, we got back some “normality”.

The Potential of Synthetic Data, 08.09.2021

The 3rd meeting of the Expert Group Privacy Technologies for Data Collaboration took place online on September 8, 2021 in the afternoon. We were joined by 14 participants.

Nico Ebert from ZHAW opened the meeting with a discussion about the possibility of holding the fourth meeting, on November 26, in person. The participants agreed to meet in Zurich. He also introduced the speaker for the upcoming meeting, Juan Troncoso-Pastoriza from EPFL, co-founder of the startup Tune Insight. Juan will introduce the group to the basics of homomorphic encryption.

Afterwards, Matthias Templ, an expert from ZHAW in the areas of data anonymization and synthetic data, presented the concept of synthetic data. Synthetic data is “any production data applicable to a given situation that are not obtained by direct measurement”, according to the McGraw-Hill Dictionary of Scientific and Technical Terms. Synthetic data is generated from datasets that often contain personal data and should not be shared with third parties. However, major properties of the synthetic dataset match those of the original dataset, so it can be used for similar purposes, such as learning about distributions.

Matthias explained that creating synthetic data first requires a good understanding of the original dataset (e.g. personal data about a population). This includes understanding its generation process and its inherent distributions (including marginal distributions). Afterwards, these distributions are rebuilt with one or more models (e.g. neural networks, decision trees). The models are then used to generate the synthetic dataset. Matthias has developed and published an R library to accomplish this task. He also demonstrated some real-world examples in which synthetic data had been applied. After Matthias' presentation, the participants discussed the potential of synthetic data. Another discussion point was which modelling techniques are required for which complexities of the original datasets (e.g. datasets with only a few features require less complex techniques).
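
As a rough sketch of this workflow (not Matthias' R library, just a hypothetical, heavily simplified Python illustration with a toy numeric dataset), the example below fits a simple model of the joint distribution and then samples synthetic records from it:

```python
# Minimal synthetic-data sketch: learn a distribution, then sample from it.
# The dataset and the model choice are purely illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Toy "original" dataset standing in for sensitive personal data.
original = pd.DataFrame({
    "age":    rng.normal(45, 12, size=1000),
    "income": rng.normal(70000, 15000, size=1000),
})

# 1) Model the distribution of the original data. Here we simply fit a
#    multivariate normal; real tools use richer models (trees, neural nets).
mean = original.mean().to_numpy()
cov = np.cov(original.to_numpy(), rowvar=False)

# 2) Generate synthetic records by sampling from the fitted model.
synthetic = pd.DataFrame(
    rng.multivariate_normal(mean, cov, size=1000),
    columns=original.columns,
)

# Aggregate properties (means, spreads, correlations) are close to the
# original, but no synthetic row corresponds to a real individual.
print(original.describe().loc[["mean", "std"]])
print(synthetic.describe().loc[["mean", "std"]])
```

Richer models are needed as soon as the original data has non-normal marginals, categorical variables, or more complex dependencies, which is exactly the point raised in the discussion about matching the modelling technique to the complexity of the dataset.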

In the second half of the meeting the participants discussed the potential benefits of the “Data Collaboration Canvas”. The Data Collaboration Canvas is a graphical workshop tool and has been developed with the help of the Expert Group. It is aimed at organizations that want to explore the potential of data innovation with other organizations at an early stage to create mutual added value. It offers a simple, visual structuring aid, e.g. in workshops, to identify common potentials and hurdles of collaboration. The canvas can not only be used to identify data collaboration opportunities between organizations such as companies but also within an organization (e.g. opportunities between different divisions or departments). Participants applied the canvas in two different use cases and discussed usability and comprehensibility of the canvas afterwards.

Expert Group Meeting – Privacy Technologies for Data Collaboration

Agenda:

1) In the upcoming meeting, Juan Troncoso-Pastoriza will give a deep dive into homomorphic encryption. He will explain how the PET works and present examples from his company Tune Insight.

2) Final notes to the Data Collaboration Canvas and our experiences from an in-person workshop

3) Planning of the 2022 Expert Group Content
a) We call for your participation: Company Use Case Talks

4) Apéro

Expert Group Meeting – Privacy Technologies

Dear Expert Group Participants

We appreciate your interest in exploring the potentials and challenges of Privacy Technologies for data collaboration.

Having established the Expert Group in February 2021, we would like to invite you all to the Kick-Off Workshop on Tuesday, 23.03.2021, from 4.00 to 6.00 pm.

During this first event, we would like to get to know each other, provide initial insight on Confidential Computing, and develop a common idea for the expert group.

During this session, we would like to ask you to briefly present your current experiences and questions regarding the topic area. Based on your input, we can shape the goals of the expert group.

To register, please use the form below.

Best Regards
Lead Expert Group Privacy Technologies for Data Collaboration
David Sturzenegger, Nico Ebert and Tim Geppert