Abstract: We consider a distributed system that disseminates high-volume event streams to many simultaneous monitoring applications over a low-bandwidth network. For bandwidth efficiency, we propose a group-aware stream filtering approach, used together with multicasting, that exploits two overlooked, yet important, properties of monitoring applications: 1) many of them can tolerate some degree of "slack" in their data quality requirements, and 2) there may exist multiple subsets of the source data satisfying the quality needs of an application. We can thus choose the "best alternative" subset for each application to maximize the data overlap within the group to best benefit from multicasting. Here we provide a general framework for the group-aware stream filtering problem, which we prove is NP-hard. We introduce a suite of heuristics-based algorithms that ensure data quality (specifically, granularity and timeliness) while preserving bandwidth. Our evaluation shows that group-aware stream filtering is effective in trading CPU time for bandwidth savings, compared with self-interested filtering.
Keywords: distributed computing
Copyright © 2008 by ACM. The copy made available here is the authors' version; for a definitive copy see the publisher's version described above.
See also later version li:ijcnds.