Towards Usability in Private Data Analytics
Private data analytics systems should provide the analytic accuracy that analysts require and the privacy specified for the individuals whose data is analyzed. Devising a general system that works across a broad range of datasets and analytic scenarios has proven difficult.
Despite the advent of differentially private systems with formally proven privacy guarantees, industry still uses ad-hoc mechanisms that offer weaker privacy protection but better analytic accuracy. Differentially private mechanisms often need to add large amounts of noise to statistical results, which impairs their usability.
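To make the accuracy cost of noise concrete, the following is a minimal sketch of the standard Laplace mechanism applied to a counting query. It is not taken from the thesis: the function names (`laplace_noise`, `noisy_count`, `noise_std`) and the example values of epsilon are illustrative assumptions only.

```python
import math
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one zero-mean Laplace sample with the given scale.

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise calibrated to epsilon.

    A counting query has sensitivity 1: adding or removing one
    individual changes the result by at most 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def noise_std(epsilon: float, sensitivity: float = 1.0) -> float:
    """Standard deviation of Laplace noise with scale sensitivity/epsilon."""
    return math.sqrt(2.0) * sensitivity / epsilon


if __name__ == "__main__":
    rng = random.Random()  # illustration only, not a cryptographic source
    for eps in (1.0, 0.1):  # hypothetical privacy parameters
        print(eps, round(noise_std(eps), 2), noisy_count(100, eps, rng))
```

In this sketch the noise scale grows as epsilon shrinks: at epsilon = 0.1 the standard deviation of the noise is about 14.1, which can dominate counts over small groups.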
In my thesis, I follow two approaches to improve the usability of private data analytics systems in general and of differentially private systems in particular. First, I revisit ad-hoc mechanisms and explore the possibilities of systems that do not provide Differential Privacy or that provide only a weak version thereof. Based on an attack analysis, I devise a set of new protection mechanisms, including Query Based Bookkeeping (QBB). In contrast to previous systems, QBB requires only the history of analysts’ queries in order to provide privacy protection. In particular, QBB does not require knowledge about the protected individuals’ data.
In my second approach, I use the insights gained with QBB to propose UniTraX, the first differentially private analytics system that allows analysts to analyze part of a protected dataset without affecting the other parts and without giving up accuracy. I demonstrate UniTraX’s usability through multiple case studies on real-world datasets across different domains. UniTraX allows more queries than previous differentially private data analytics systems at moderate runtime overhead.