Abstract: Researchers who share wireless-network traces with colleagues must first anonymize sensitive information, trading off the removal of information to protect identities against the preservation of useful data within the trace. While several metrics exist to quantify this privacy-utility tradeoff, they are often computationally expensive. Computing these metrics on a sample of the trace could save considerable time. In this paper, we examine several sampling methods to determine their effects on measurement of the privacy-utility tradeoff when anonymizing network traces. We tested the relative accuracy of several packet-sampling and flow-sampling methods on existing privacy and utility metrics. We concluded that, for our test trace, no single sampling method we examined allowed us to accurately measure the tradeoff, and that some sampling methods can produce grossly inaccurate estimates of those values. We call for further research to develop sampling methods that maintain relevant privacy and utility properties.
Keywords: anonymization, sanitization, privacy, network, wireless
Copyright © 2012 by IEEE. The copy made available here is the authors' version; for a definitive copy see the publisher's version described above.