Abstract: User association logs collected from a large-scale wireless LAN record where and when each user has used the network. Such information plays an important role in wireless network research. One concern with sharing these data with other researchers, however, is that the logs pose potential privacy risks for the network users. Today, the common practice for sanitizing these data before public release is to anonymize users' sensitive information, such as their devices' MAC addresses and their exact association locations. In this work, we demonstrate that such sanitization measures are insufficient to protect user privacy, because the differences between users' association behaviors can be modeled and many users are distinguishable as a result. Taking the role of an adversary, we propose a novel type of correlation attack in which the adversary uses the anonymized association log to build a signature for each user; when combined with auxiliary information, these signatures can help to identify users within the anonymized log. On a user association log that contains more than four thousand users and millions of association records, we demonstrate that this attack technique can pinpoint the victim's identity exactly with a probability as high as 70%, and narrow it down to a set of 20 candidates with a probability close to 100%. We further evaluate how effectively standard anonymization techniques, including generalization and perturbation, mitigate this correlation attack. Our experimental results show that these methods have only limited success, suggesting that more thorough treatment is needed when anonymizing wireless user association logs before public release.
Copyright © 2011 by the authors.
See also earlier version tan:crf.