My main focus is applying theoretical mathematics to improve data privacy, whether in machine learning or in communication protocols. I have proved theoretical guarantees on the privacy and the reliability of sequential privacy identifiers for the LoRaWAN protocol. I have also shown multiple theoretical results linking fairness and privacy in machine learning.
My new research project is at the interface of topological data analysis and differential privacy! I am currently looking for a postdoctoral position to carry this project forward; if you are interested, please contact me.
In this paper, we have shown how attribute inference attacks can be mitigated using fairness properties, namely demographic parity. I have shown multiple relations between those notions and I have introduced generalized demographic parity, which extends the notion to non-discrete random variables. More details can be found in my PhD thesis (in French).
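As a minimal illustration (my own sketch, not code from the paper), demographic parity for a classifier with a discrete sensitive attribute can be measured by comparing positive-prediction rates across groups; the attribute name and toy values below are made up for the example:

```python
def demographic_parity_gap(y_pred, sensitive):
    """Largest difference in positive-prediction rates across sensitive groups.
    A gap of 0 means the classifier satisfies demographic parity exactly."""
    rates = []
    for g in set(sensitive):
        preds = [p for p, s in zip(y_pred, sensitive) if s == g]
        rates.append(sum(preds) / len(preds))
    return max(rates) - min(rates)

# toy example with a binary sensitive attribute
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
s      = [0, 0, 0, 0, 1, 1, 1, 1]
gap = demographic_parity_gap(y_pred, s)  # 0.75 - 0.25 = 0.5
```

A large gap is exactly the kind of group-dependent behaviour an attribute inference attack can exploit.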
In this contribution we have updated and secured the LoRaWAN protocol. My contribution is a probabilistic analysis of collisions between data packets. It led to an extensive theoretical development, including a new probability distribution for which I computed the moments, the probability mass function and the cumulative distribution function. Because of page limits I could not include all of it in the paper, but I plan on publishing it at some point. Here is the preprint.
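To give a flavour of this kind of analysis, here is a small Monte Carlo sketch of a much simpler, birthday-style collision model (my own illustration with made-up parameters, not the paper's exact LoRaWAN analysis or its closed-form distribution):

```python
import random

def collision_probability(n_devices, n_slots, trials=20000, seed=0):
    """Monte Carlo estimate of the probability that at least two devices
    pick the same transmission slot, each choosing uniformly at random
    (a simplified, birthday-style collision model)."""
    rng = random.Random(seed)
    collisions = 0
    for _ in range(trials):
        slots = [rng.randrange(n_slots) for _ in range(n_devices)]
        if len(set(slots)) < n_devices:   # a slot was chosen twice
            collisions += 1
    return collisions / trials

# exact value for 2 devices and 4 slots is 1/4
p_hat = collision_probability(2, 4)
```

The paper replaces this kind of simulation with exact formulas: moments, mass function and cumulative distribution function of the collision law.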
In this paper we have developed a federated learning protocol that masks sensitive attributes of the updates. My contribution to this work was to prove that the new protocol does not degrade the utility of the final machine learning model.
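The general idea behind masking updates can be sketched with a simplified additive-masking toy (my own illustration of the secure-aggregation style of masking, not the protocol from the paper): each pair of clients shares a random mask applied with opposite signs, so individual updates are hidden while the aggregate is preserved.

```python
import random

def pairwise_masked(updates, seed=0):
    """Add a shared random mask to each pair of client updates with
    opposite signs. All masks cancel in the sum, so an aggregator sees
    perturbed individual values but recovers the exact aggregate."""
    rng = random.Random(seed)
    masked = list(updates)
    n = len(updates)
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.uniform(-1.0, 1.0)  # mask shared by clients i and j
            masked[i] += m
            masked[j] -= m
    return masked

updates = [0.5, -0.2, 0.7]   # toy client updates
masked = pairwise_masked(updates)
# each masked value differs from the original, but
# sum(masked) == sum(updates) up to floating-point error
```

Utility preservation in this toy is immediate because the aggregate is unchanged; the paper's result is the corresponding (non-trivial) guarantee for the actual protocol.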
In this short 15-minute presentation I have shown how demographic parity is equivalent to the maximum balanced accuracy of an attribute inference attack being equal to 1/m, where m is the number of sensitive attribute values.
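A quick way to see where the 1/m baseline comes from (my own toy, not the talk's proof): under demographic parity the model's outputs are identically distributed across sensitive groups, so no attacker can beat a constant guesser, whose balanced accuracy over m groups is exactly 1/m.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls."""
    recalls = []
    for c in set(y_true):
        hits = [p == c for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(hits) / len(hits))
    return sum(recalls) / len(recalls)

m = 3
s_true = [g for g in range(m) for _ in range(100)]  # m sensitive groups
s_pred = [0] * len(s_true)                           # constant guesser
ba = balanced_accuracy(s_true, s_pred)               # = 1/m
```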
In this presentation I explored how the regularisation parameters in exponentiated gradient descent for fair classification (https://proceedings.mlr.press/v80/agarwal18a.html) impact the success of attribute inference attacks.
I presented early experimental results showing issues and conflicts between fairness and privacy. I have shown that, on many real-world datasets, a random forest behaves differently on different subgroups of a population, based on sensitive attributes. This led to building an attribute inference attack that leverages the soft labels of random forests.
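The attack principle can be sketched on synthetic data (my own illustration with made-up score distributions, not the talk's experiments): if the model's soft labels are distributed differently across subgroups, a simple threshold on them already infers the sensitive attribute well above chance.

```python
import random

rng = random.Random(0)

# Synthetic "soft labels": the model is systematically more confident on
# group 1 than on group 0 (the kind of subgroup disparity observed above).
scores = [rng.gauss(0.4, 0.1) for _ in range(500)]   # group 0
scores += [rng.gauss(0.7, 0.1) for _ in range(500)]  # group 1
groups = [0] * 500 + [1] * 500

# Attacker: threshold the soft label to guess the sensitive attribute.
guesses = [1 if x > 0.55 else 0 for x in scores]
attack_acc = sum(g == t for g, t in zip(guesses, groups)) / len(groups)
# well above the 0.5 chance level for two balanced groups
```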
There is a tradeoff between privacy and fairness in machine learning. In this paper I explore how those notions can be aligned by using intermediate generated synthetic data, which requires some theoretical work in topology.
In this paper I introduce a new ensemble learning algorithm that maximizes balanced accuracy instead of accuracy, thus taking class imbalance into account during training. As explained in my PhD manuscript, this algorithm is useful for auditing sensitive attribute leakage from machine learning models.
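One standard way to target balanced accuracy during training (a generic sketch, not the algorithm from the paper) is to reweight samples inversely to class frequency, so that weighted accuracy coincides with balanced accuracy:

```python
def class_balanced_weights(y):
    """Per-sample weights proportional to 1 / class frequency, normalized
    so that the weighted accuracy equals the balanced accuracy."""
    counts = {c: y.count(c) for c in set(y)}
    return [1.0 / (len(counts) * counts[c]) for c in y]

# Imbalanced toy data: 90 negatives, 10 positives.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100                 # majority-class predictor
w = class_balanced_weights(y_true)

plain_acc = sum(p == t for p, t in zip(y_pred, y_true)) / 100           # 0.9
weighted_acc = sum(wi for wi, p, t in zip(w, y_pred, y_true) if p == t)  # 0.5
```

The majority-class predictor looks strong under plain accuracy (0.9) but is revealed as chance-level (0.5) under the balanced criterion, which is exactly why the imbalance-aware objective matters for auditing.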
This is a theoretical paper that solves the following problem: we toss a coin repeatedly; how many tosses are needed before obtaining m heads in a row? In the paper, I compute the probability distribution of the number of tosses before obtaining a sequence of m uninterrupted heads. This problem is important in reliability engineering, and we applied it to the study of LoRaWAN.
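The distribution can be computed numerically with a small dynamic program over the current head-streak length (my own sketch, not the paper's closed-form derivation); the known mean for a fair coin, 2^(m+1) - 2 tosses, gives a sanity check:

```python
def pmf_first_run(n, m, p=0.5):
    """P(the first run of m consecutive heads is completed at toss n),
    by dynamic programming over the current head-streak length 0..m-1."""
    probs = [1.0] + [0.0] * (m - 1)        # start with streak 0
    for t in range(1, n + 1):
        finish = probs[m - 1] * p          # streak m-1 + head: done at toss t
        if t == n:
            return finish
        new = [0.0] * m
        for k in range(m - 1):
            new[k + 1] = probs[k] * p      # a head extends the streak
        new[0] = sum(probs) * (1 - p)      # a tail resets the streak
        probs = new                        # 'finish' mass is absorbed
    return 0.0

# Sanity checks for m = 2, p = 1/2: P(first HH completed at toss 2) = 1/4,
# and the expected number of tosses is 2^(m+1) - 2 = 6.
mean = sum(n * pmf_first_run(n, 2) for n in range(1, 400))
```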
I was a tutorial instructor in mathematics at INSA-Lyon for two years, under the supervision of Romaric Pujol. I was in charge of a group of 30 first-year students. The lessons were in English because the students were part of an international program (SCiences & ANglais, SCAN).