Scalable Distributed Data Anonymization


Sabrina De Capitani di Vimercati, Dario Facchinetti, Sara Foresti, Gianluca Oldani, Stefano Paraboschi, Matthew Rossi, Pierangela Samarati

In Proc. of the 19th International Conference on Pervasive Computing and Communications (PerCom 2021)
Kassel, Germany, March 22-26, 2021


We present an approach for enabling a distributed anonymization process over large collections of sensor data. Our approach anonymizes large datasets (which might not fit in main memory) using an arbitrary number of workers within the Spark framework. We describe how to parallelize the anonymization process through a proper partitioning of the dataset. Our experimental evaluation shows that the proposed approach is scalable and does not affect the quality of the anonymized dataset.

@inproceedings{mondrian,
	author = {S. {De Capitani di Vimercati} and D. Facchinetti and S. Foresti
	          and G. Oldani and S. Paraboschi and M. Rossi and P. Samarati},
	booktitle = {Proc. of the 19th International Conference on Pervasive
	             Computing and Communications (PerCom 2021)},
	title = {Scalable Distributed Data Anonymization},
	location = {Kassel, Germany},
	month = {March},
	day = {22-26},
	year = {2021},
}