My current research interests include security and privacy for smart homes, and wireless networks. Below is a list of my papers, patents, and software artifacts, along with the theses produced by my students. For a more formal listing, see my vita. For more about me, please visit my home page. See also: papers by keyword, or download complete BibTeX. See also: overview of most of my research projects. My ORCID is 0000-0001-7411-2783. I also have research profiles on Zotero, SCOPUS, and Google Scholar.
Papers are listed in reverse-chronological order;
click an entry to pop up the abstract.
For full information and a PDF, please click the Details link.
Follow updates with RSS.
Methods: Study participants included adults receiving MOUD at a large outpatient treatment program. We predicted NPOU (EMA-based), medication nonadherence (Electronic Health Record [EHR]- and EMA-based), and treatment retention (EHR-based) using context-sensitive EMAs (e.g., stress, pain, social setting). We used recurrent deep learning models with 7-day sliding windows to predict the next-day outcomes, using Area Under the ROC Curve (AUC) for assessment. We employed SHapley Additive exPlanations (SHAP) to understand feature latency and importance.
Results: Participants comprised 62 adults with 14,322 observations. Model performance varied across EMA subtypes and outcomes, with AUCs spanning 0.59-0.97. Recent substance use was the best-performing predictor for EMA-based NPOU (AUC=0.97) and medication nonadherence (AUC=0.68); life-contextual factors performed best for EHR-based medication nonadherence (AUC=0.89) and retention (AUC=0.80). SHAP revealed varying latencies between predictors and outcomes.
Conclusions: Findings support the effectiveness of EMA and deep learning for forecasting actionable outcomes in persons receiving MOUD. These insights will enable the development of personalized dynamic risk profiles and just-in-time adaptive interventions (JITAIs) to mitigate high-risk OUD outcomes.
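The paper's sliding-window setup can be illustrated with a minimal sketch: pair each 7-day window of per-day EMA summaries with the following day's outcome label, producing examples a recurrent model could consume. The feature names and array shapes here are hypothetical, not taken from the study.

```python
import numpy as np

def make_windows(daily_features, daily_labels, window=7):
    """Pair each 7-day feature window with the *next* day's outcome label.

    daily_features: (n_days, n_features) array of per-day EMA summaries
    daily_labels:   (n_days,) binary outcomes (e.g., next-day opioid use)
    """
    X, y = [], []
    for t in range(window, len(daily_features)):
        X.append(daily_features[t - window:t])  # days t-7 .. t-1
        y.append(daily_labels[t])               # outcome on day t
    return np.stack(X), np.array(y)

# toy example: 30 days, 4 hypothetical EMA features
# (e.g., stress, pain, craving, social setting)
rng = np.random.default_rng(0)
feats = rng.normal(size=(30, 4))
labels = rng.integers(0, 2, size=30)
X, y = make_windows(feats, labels)
print(X.shape, y.shape)  # (23, 7, 4) (23,)
```

Each of the 23 windows carries a full week of context, which is what lets the model (and, later, SHAP) attribute a next-day outcome to features observed several days earlier.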
In this paper we present MOAT, a system that leverages Wi-Fi sniffers to analyze the physical properties of a device's wireless transmissions to infer whether that device is located inside or outside of a home. MOAT can adaptively self-update to accommodate changes in the home indoor environment to ensure robust long-term performance. Notably, MOAT does not require prior knowledge of the home's layout or cooperation from target devices, and is easy to install and configure.
We evaluated MOAT in four different homes with 21 diverse commercial smart devices and achieved an overall balanced accuracy rate of up to 95.6%. Our novel periodic adaptation technique allowed our approach to maintain high accuracy even after rearranging furniture in the home. MOAT is a practical and efficient first step for monitoring and managing devices in a smart home.
Our workshop curriculum centers on the smart-home device lifecycle: obtaining, installing, using, and removing devices in a home. For each phase of the lifecycle, we present possible vulnerabilities along with preventative measures relevant to a general audience. We integrate a hands-on activity for participants to put best practices into action throughout the presentation.
We ran our designed workshop at a science museum in June 2023, and used participant surveys to evaluate the effectiveness of our curriculum. Prior to the workshop, 38.8% of survey responses did not meet learning objectives, 22.4% partially met them, and 38.8% fully met them. After the workshop, only 9.2% of responses did not meet learning objectives, while 29.6% partially met them and 61.2% fully met them. Our experience shows that consumer-focused workshops can aid in bridging information gaps and are a promising form of outreach.
Objective: To examine patient engagement with multiple digital phenotyping methods among patients receiving buprenorphine medication for OUD.
Methods: The study enrolled 65 patients receiving buprenorphine for OUD between June 2020 and January 2021 from 4 addiction medicine programs in an integrated health care delivery system in Northern California. Ecological momentary assessment (EMA), sensor data, and social media data were collected by smartphone, smartwatch, and social media platforms over a 12-week period. Primary engagement outcomes were meeting measures of minimum phone carry (≥8 hours per day) and watch wear (≥18 hours per day) criteria, EMA response rates, social media consent rate, and data sparsity. Descriptive analyses, bivariate, and trend tests were performed.
Results: The participants’ average age was 37 years, 47% of them were female, and 71% of them were White. On average, participants met phone carrying criteria on 94% of study days, met watch wearing criteria on 74% of days, and wore the watch to sleep on 77% of days. The mean EMA response rate was 70%, declining from 83% to 56% from week 1 to week 12. Among participants with social media accounts, 88% of them consented to providing data; of them, 55% of Facebook, 54% of Instagram, and 57% of Twitter participants provided data. The amount of social media data available varied widely across participants. No differences by age, sex, race, or ethnicity were observed for any outcomes.
Conclusions: To our knowledge, this is the first study to capture these 3 digital data sources in this clinical population. Our findings demonstrate that patients receiving buprenorphine treatment for OUD had generally high engagement with multiple digital phenotyping data sources, but this was more limited for the social media data.
International Registered Report Identifier (IRRID): RR2-10.3389/fpsyt.2022.871916
Objective: We sought to develop and validate a smartwatch step-counting app for older adults and evaluate the algorithm in free-living settings over a long period of time.
Methods: We developed and evaluated a step-counting app for older adults on an open-source wrist-worn device (Amulet). The app includes algorithms to infer the level of physical activity and to count steps. We validated the step-counting algorithm in the lab (counting steps from a video recording, n=20) and in free-living conditions—one 2-day field study (n=6) and two 12-week field studies (using the Fitbit as ground truth, n=16). During app system development, we evaluated 4 walking patterns: normal, fast, up and down a staircase, and intermittent speed. For the field studies, we evaluated 5 different cut-off values for the algorithm, using correlation and error rate as the evaluation metrics.
Results: The step-counting algorithm performed well. In the lab study, the Amulet steps correlated more strongly with the video-validated steps for normal walking (R2=0.5); across all activities, the Amulet's count was on average 3.2 steps (2.1%) lower (SD 25.9) than the video-validated count. In the 2-day field study, the best parameter settings led to a strong association between Amulet and Fitbit step counts (R2=0.989), with the Amulet counting on average 3.1% (SD 25.1) fewer steps than the Fitbit. For the 12-week field study, the best parameter setting led to an R2 value of 0.669.
Conclusions: Our findings demonstrate the importance of an iterative process in algorithm development before field-based deployment. This work highlights various challenges and insights involved in developing and validating monitoring systems in real-world settings. Nonetheless, our step-counting app for older adults had good performance relative to the ground truth (a commercial Fitbit step counter). Our app could potentially be used to help improve physical activity among older adults.
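The two evaluation metrics named in the Methods (correlation and error rate against a reference counter) can be sketched as follows; the function name and toy step counts are illustrative, not from the study.

```python
import numpy as np

def agreement(app_steps, ref_steps):
    """Compare a step-count algorithm against a reference (e.g., video or Fitbit).

    Returns (r_squared, mean_pct_error): the squared Pearson correlation,
    and the mean percentage difference from the reference counts.
    """
    app = np.asarray(app_steps, dtype=float)
    ref = np.asarray(ref_steps, dtype=float)
    r = np.corrcoef(app, ref)[0, 1]            # Pearson correlation
    pct_err = 100.0 * np.mean((app - ref) / ref)  # signed percent error
    return r ** 2, pct_err

# toy daily totals: app undercounts slightly on the first day
r2, err = agreement([980, 1510, 2020], [1000, 1500, 2000])
print(round(r2, 4), round(err, 2))  # 0.9999 -0.11
```

A signed error (rather than absolute) preserves the direction of the bias, matching the abstract's "steps lower than" phrasing.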
In this thesis, we present an end-to-end solution for providing information provenance for mHealth data, which begins by securing mHealth data at its source: the mHealth device. To this end, we devise a memory-isolation method that combines compiler-inserted code and Memory Protection Unit (MPU) hardware to protect application code and data on ultra-low-power micro-controllers. Then we address the security of mHealth data outside of the source (e.g., data that has been uploaded to smartphone or remote-server) with our health-data system, Amanuensis, which uses Blockchain and Trusted Execution Environment (TEE) technologies to provide confidential, yet verifiable, data storage and computation for mHealth data. Finally, we look at identity privacy and data freshness issues introduced by the use of blockchain and TEEs. Namely, we present a privacy-preserving solution for blockchain transactions, and a freshness solution for data access-control lists retrieved from the blockchain.
We envision a solution called the SPLICEcube whose goal is to detect smart devices, locate them in three dimensions within the home, securely monitor their network traffic, and keep an inventory of devices and important device information throughout the device’s lifecycle. The SPLICEcube system consists of the following components: 1) a main cube, which is a centralized hub that incorporates and expands on the functionality of the home router, 2) a database that holds network data, and 3) a set of support cubelets that can be used to extend the range of the network and assist in gathering network data.
To deliver this vision of identifying, securing, and managing smart devices, we introduce an architecture that facilitates intelligent research applications (such as network anomaly detection, intrusion detection, device localization, and device firmware updates) to be integrated into the SPLICEcube. In this thesis, we design a general-purpose Wi-Fi architecture that underpins the SPLICEcube. The architecture specifically showcases the functionality of the cubelets (Wi-Fi frame detection, Wi-Fi frame parsing, and transmission to cube), the functionality of the cube (routing, reception from cubelets, information storage, data disposal, and research application integration), and the functionality of the database (network data storage). We build and evaluate a prototype implementation to demonstrate our approach is scalable to accommodate new devices and extensible to support different applications. Specifically, we demonstrate a successful proof-of-concept use of the SPLICEcube architecture by integrating a security research application: an "Inside-Outside detection" system that classifies an observed Wi-Fi device as being inside or outside the home.
Methods: This paper describes the protocol (including the study design and methodological considerations) from a novel study supported by the National Drug Abuse Treatment Clinical Trials Network at the National Institute on Drug Abuse (NIDA). This study (D-TECT) primarily seeks to evaluate the feasibility of collecting ecological momentary assessment (EMA), smartphone and smartwatch sensor data, and social media data among patients in outpatient MOUD treatment. It secondarily seeks to examine the utility of EMA, digital sensing, and social media data (separately and compared to one another) in predicting MOUD treatment retention, opioid use events, and medication adherence [as captured in electronic health records (EHR) and EMA data]. To our knowledge, this is the first project to include all three sources of digitally derived data (EMA, digital sensing, and social media) in understanding the clinical trajectories of patients in MOUD treatment. These multiple data streams will allow us to understand the relative and combined utility of collecting digital data from these diverse data sources. The inclusion of EHR data allows us to focus on the utility of digital health data in predicting objectively measured clinical outcomes.
Discussion: Results may be useful in elucidating novel relations between digital data sources and OUD treatment outcomes. It may also inform approaches to enhancing outcomes measurement in clinical trials by allowing for the assessment of dynamic interactions between individuals' daily lives and their MOUD treatment response.
Clinical Trial Registration: Identifier: NCT04535583.
To address this problem, in this paper we investigate the use of vibration, generated by a smartRing, as an out-of-band communication channel to unobtrusively share a secret with a smartThing. This exchanged secret can be used to bootstrap a secure wireless channel over which the smartphone (or another trusted device) and the smartThing can communicate. We present the design, implementation, and evaluation of this system, which we call VibeRing. We describe the hardware and software details of the smartThing and smartRing. Through a user study we demonstrate that it is possible to share a secret with various objects quickly, accurately, and securely compared with several existing techniques. Overall, we successfully exchange a secret between a smartRing and various smartThings at least 85.9% of the time. We show that VibeRing can perform this exchange at 12.5 bits/second at a bit error rate of less than 2.5%. We also show that VibeRing is robust to the smartThing's constituent material as well as the holding style. Finally, we demonstrate that a nearby adversary cannot decode or modify the message exchanged between the trusted devices.
In the first part of the thesis, we discuss our work on accurate sensing and detection of different states-of-vulnerability. We start by discussing our work on advancing the field of physiological stress sensing. We took the first step towards testing the reproducibility and validity of our methods and machine-learning models for stress detection. To this end, we analyzed data from 90 participants from four independent controlled studies, using two different types of sensors, with different study protocols and research goals. We evaluated new methods to improve the performance of stress-detection models and found that our methods led to a consistent increase in performance across all studies, irrespective of the device type, sensor type, or the type of stressor. Our thorough exploration of reproducibility in a controlled environment provides a critical foundation for deeper study of such methods, and is a prerequisite for tackling reproducibility in free-living conditions.
Next, we present our work on detecting at-risk indicators for patients undergoing Opioid Use Disorder (OUD) treatment. We conducted a 12-week study with 59 patients undergoing an OUD treatment and collected sensor data, like location, physical activity, sleep, and heart rate, from smartphones and wearables. We used the data collected to formulate low-level contextual features and high-level behavioral features and explored the feasibility of detecting self-reported stress, craving, and mood of the participants. Our results show that adaptive, personalized models can detect different at-risk behaviors with the area under the receiver operating characteristic (AUROC) values of up to 0.85.
In the second part of this dissertation, we discuss our contributions in the domain of state-of-receptivity for digital health interventions. We start by conducting a study with 189 participants in Switzerland to explore participant receptivity towards actual physical activity behavior change interventions and report novel and significant results, e.g., being more receptive to interventions leads to higher goal completion likelihood. We further built machine-learning models to predict state-of-receptivity and deployed those models in a real-world study with participants in the United States to evaluate their effectiveness. Our results show that participants were more receptive to interventions delivered at moments detected as ‘receptive’ by our models.
In addition to receptivity in daily living conditions, we explored how participants interact with affective health interventions while driving. We analyzed longitudinal data from 10 participants driving in their day-to-day lives for two months. In this exploratory work, we found that several high-level trip factors (traffic flow, trip length, and vehicle occupancy) and in-the-moment factors (road type, average speed, and braking behavior) showed significant associations with the participant’s decision to start or cancel an intervention. Based on our analysis, we provide solid recommendations on delivering interventions to maximize responsiveness and effectiveness and minimize the burden on the drivers.
Overall, this dissertation makes significant contributions to the respective sub-fields by addressing fundamental challenges, advancing the current state-of-the-art, and contributing new knowledge, thereby laying a solid foundation for designing, implementing, and delivering future digital health interventions.
In this poster we report preliminary work in which we infer social interactions of individuals from Wi-Fi connection traces in the campus network at Dartmouth College. We make the following contributions: (i) we propose several definitions of a pseudocorrelation matrix from Wi-Fi connection traces, which measure similarity between devices or users according to their temporal association profile to the Access Points (APs); (ii) we evaluate the accuracy of these pseudo-correlation variants in a simulation environment; and (iii) we contrast results with those found on a real trace.
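One way to realize contribution (i), the pseudocorrelation matrix, is to summarize each device as a temporal association profile over access points and compare profiles pairwise; a cosine-similarity variant is sketched below. The profiles and the choice of cosine similarity are illustrative assumptions, not necessarily the definitions used in the poster.

```python
import numpy as np

# Hypothetical profiles: each row is one device, each column the fraction of
# observed time that device spent associated with a given AP.
profiles = np.array([
    [0.6, 0.4, 0.0],   # device A
    [0.5, 0.5, 0.0],   # device B: movement pattern similar to A
    [0.0, 0.1, 0.9],   # device C: mostly at a different AP
])

# Normalize rows, then take pairwise dot products: a cosine-similarity
# "pseudocorrelation" matrix over devices.
unit = profiles / np.linalg.norm(profiles, axis=1, keepdims=True)
pseudo_corr = unit @ unit.T
print(np.round(pseudo_corr, 2))
```

High off-diagonal entries (here, A-B) suggest devices that co-occur at the same APs at similar rates, which is the signal used to infer social interaction.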
We leveraged prior work regarding receptivity to JITAIs and deployed a chatbot-based digital coach - Ally - that provided physical-activity interventions and motivated participants to achieve their step goals. We extended the original Ally app to include two types of machine-learning models that used contextual information about a person to predict when that person is receptive: a static model that was built before the study started and remained constant for all participants, and an adaptive model that continuously learned the receptivity of individual participants and updated itself as the study progressed. For comparison, we included a control model that sent intervention messages at random times. The app randomly selected a delivery model for each intervention message. We observed that the machine-learning models led to up to a 40% improvement in receptivity as compared to the control model. Further, we evaluated the temporal dynamics of the different models and observed that receptivity to messages from the adaptive model increased over the course of the study.
Although past work on generic phone notifications has found evidence that users are more likely to respond to notifications with content they view as useful, there is no existing research on whether users' intrinsic motivation for the underlying topic of mHealth interventions affects their receptivity. In this work, we explore whether relationships exist between intrinsic motivation and receptivity across topics and within topics for mHealth interventions. To this end, we conducted a study with 20 participants over 3 weeks, where participants received interventions about mental health, COVID-19, physical activity, and diet & nutrition. The interventions were delivered by Elena+, a chatbot-based iOS app built on the MobileCoach platform.
Our exploratory analysis found that significant differences in mean intrinsic motivation scores across topics were not associated with differences in mean receptivity metrics across topics. We also found that positive relationships exist between intrinsic motivation measures and receptivity for interventions about a topic.
First, we developed Auracle, a wearable earpiece that can automatically detect eating episodes. Using an off-the-shelf contact microphone placed behind the ear, Auracle captures the sound of a person chewing as it passes through the head. This audio data is then processed by a custom circuit board. We collected data with 14 participants for 32 hours in free-living conditions and achieved accuracy exceeding 92.8% and F1 score exceeding 77.5% for eating detection with 1-minute resolution.
Second, we adapted Auracle for measuring children’s eating behavior, and improved the accuracy and robustness of the eating-activity detection algorithms. We used this improved prototype in a laboratory study with a sample of 10 children for 60 total sessions and collected 22.3 hours of data in both meal and snack scenarios. Overall, we achieved 95.5% accuracy and 95.7% F1 score for eating detection with 1-minute resolution.
Third, we developed a computer-vision approach for eating detection in free-living scenarios. Using a miniature head-mounted camera, we collected data with 10 participants for about 55 hours. The camera was fixed under the brim of a cap, pointing to the mouth of the wearer and continuously recording video (but not audio) throughout their normal daily activity. We evaluated performance for eating detection using four different Convolutional Neural Network (CNN) models. The best model achieved 90.9% accuracy and 78.7% F1 score for eating detection with 1-minute resolution. Finally, we validated the feasibility of deploying the 3D CNN model in wearable or mobile platforms when considering computation, memory, and power constraints.
Methods: Thirty-two young adults participated in three exercise sessions with the exercise band, after which each completed an adapted version of the Usefulness, Satisfaction, and Ease of use (USE) questionnaire to characterize the strengths and weaknesses of the exercise system's usability.
Results: Questionnaire data reflected a positive and consistent user experience, with all 20 items receiving mean scores greater than 5.0 on a seven-point Likert scale. There were no specific areas of significant weakness in the device’s user experience.
Conclusions: The positive reception among young adults is a promising indication that the device can be successfully incorporated into exercise interventions and that the system can be further developed and tested for the target population of older adults.
To this end, we conducted a two-month longitudinal study with 10 participants, in which each participant was provided with a study car for their daily driving needs. We delivered two in-vehicle interventions - each aimed at improving affective well-being - and simultaneously recorded the participants' driving behavior. In our analysis, we found that several pre-trip characteristics (like trip length, traffic flow, and vehicle occupancy) and the pre-trip affective state of the participants had significant associations with whether the participants started an intervention or canceled a started intervention. Next, we found that several in-the-moment driving characteristics (like current road type, past average speed, and future brake behavior) showed significant associations with drivers' responsiveness to the intervention. Further, we identified several driving behaviors that "negated" the effectiveness of interventions and highlight the potential of using such "negative" driving characteristics to better inform intervention delivery. Finally, we compared trips with and without intervention and found that both interventions employed in our study did not have a negative effect on driving behavior. Based on our analyses, we provide solid recommendations on how to deliver interventions to maximize responsiveness and effectiveness and minimize the burden on the drivers.
Methods: A 6-month, non-randomized, non-blinded, single-arm study was conducted from October 2018 to May 2020 at a community-based aging center of adults aged ≥65 years with a body mass index (BMI) ≥30 kg/m2. Weekly dietitian visits focusing on behavior therapy and caloric restriction and twice-weekly physical therapist-led group strength, flexibility and balance training classes were delivered using video-conferencing to participants in their homes. Participants used a Fitbit Alta HR for remote monitoring with data feedback provided by the interventionists. An aerobic activity prescription was provided and monitored.
Results: Mean age was 72.9±3.9 years (82% female). Baseline anthropometric measures of weight, BMI, and waist circumference were 97.8±16.3 kg, 36.5±5.2 kg/m2, and 115.5±13.0 cm, respectively. A total of 142 participants were screened (n=27 ineligible), and 53 consented. There were nine dropouts (17%). Overall satisfaction with the trial (4.7±0.6, scale: 1 (low) to 5 (high)) and with Fitbit (4.2±0.9) were high. Fitbit was worn an average of 81.7±19.3% of intervention days. In completers, mean weight loss was 4.6±3.5 kg or 4.7±3.5% (p<0.001). Physical function measures of 30-s sit-to-stand repetitions increased from 13.5±5.7 to 16.7±5.9 (p<0.001), 6-min walk improved by 42.0±77.3 m (p=0.005) but no differences were observed in gait speed or grip strength. Subjective measures of late-life function improved (3.4±4.7 points, p<0.001).
Conclusions: A technology-based obesity intervention is feasible and acceptable to older adults with obesity and may lead to weight loss and improved physical function.
Methods: A 12-week pilot in 28 older rural adults with obesity (body mass index [BMI] ≥ 30 kg/m2) was conducted at a community aging center. The intervention consisted of individualized, weekly dietitian visits focusing on behavior therapy and caloric restriction with twice weekly physical therapist-led group strengthening training classes in a community-based aging center. All participants were provided a Fitbit Flex 2. An aerobic activity prescription outside the strength training classes was provided.
Results: Mean age was 72.9 ± 5.3 years (82% female). Baseline BMI was 37.1 kg/m2, and waist circumference was 120.0 ± 33.0 cm. Mean weight loss (pre/post) was 4.6 ± 3.2 kg (4.9 ± 3.4%; p < .001). Of the 40 eligible participants, 33 (75%) enrolled, and the completion rate was high (84.8%). Objective measures of physical function improved at follow-up: 6-minute walk test improved: 35.7 ± 41.2 m (p < .001); gait speed improved: 0.10 ± 0.24 m/s (p = .04); and five-times sit-to-stand improved by 2.1 seconds (p < .001). Subjective measures of late-life function improved (5.2 ± 7.1 points, p = .003), as did Patient-Reported Outcome Measurement Information Systems mental and physical health scores (5.0 ± 5.7 and 4.4 ± 5.0, both p < .001). Participants wore their Fitbit 93.9% of all intervention days, and were overall satisfied with the trial (4.5/5.0, 1–5 low–high) and with Fitbit (4.0/5.0).
Conclusions: A multicomponent obesity intervention incorporating a wearable device is feasible and acceptable to older adults with obesity, and potentially holds promise in enhancing health.
This paper takes the first step towards testing reproducibility and validity of methods and machine-learning models for stress detection. To this end, we analyzed data from 90 participants, from four independent controlled studies, using two different types of sensors, with different study protocols and research goals. We started by evaluating the performance of models built using data from one study and tested on data from other studies. Next, we evaluated new methods to improve the performance of stress-detection models and found that our methods led to a consistent increase in performance across all studies, irrespective of the device type, sensor type, or the type of stressor. Finally, we developed and evaluated a clustering approach to determine the stressed/not-stressed classification when applying models on data from different studies, and found that our approach performed better than selecting a threshold based on training data. This paper's thorough exploration of reproducibility in a controlled environment provides a critical foundation for deeper study of such methods, and is a prerequisite for tackling reproducibility in free-living conditions.
Objective: This study aims to develop a mobile app for a novel device through a user-centered design process with both older adults and clinicians, while exploring whether data collected through this process can be used in natural language processing (NLP) and sentiment analysis.
Methods: Through a user-centered design process, we conducted semistructured interviews during the development of a geriatric-friendly Bluetooth-connected resistance exercise band app. We interviewed patients and clinicians at weeks 0, 5, and 10 of the app development. Each semistructured interview consisted of heuristic evaluations, cognitive walkthroughs, and observations. We used the Bing sentiment library for a sentiment analysis of interview transcripts and then applied NLP-based latent Dirichlet allocation (LDA) topic modeling to identify differences and similarities in patient and clinician participant interviews. Sentiment was defined as the sum of positive and negative words (each word with a +1 or −1 value). To assess utility, we used quantitative assessment questionnaires—System Usability Scale (SUS) and Usefulness, Satisfaction, and Ease of use (USE). Finally, we used multivariate linear models—adjusting for age, sex, subject group (clinician vs patient), and development—to explore the association between sentiment analysis and SUS and USE outcomes.
Results: The mean age of the 22 participants was 68 (SD 14) years, and 17 (77%) were female. The overall mean SUS and USE scores were 66.4 (SD 13.6) and 41.3 (SD 15.2), respectively. Both patients and clinicians provided valuable insights into the needs of older adults when designing and building an app. The mean positive-negative sentiment per sentence was 0.19 (SD 0.21) and 0.47 (SD 0.21) for patient and clinician interviews, respectively. We found a positive association with positive sentiment in an interview and SUS score (β=1.38; 95% CI 0.37 to 2.39; P=.01). There was no significant association between sentiment and the USE score. The LDA analysis found no overlap between patients and clinicians in the 8 identified topics.
Conclusions: Involving patients and clinicians allowed us to design and build an app that is user friendly for older adults while supporting compliance. This is the first analysis using NLP and usability questionnaires in the quantification of user-centered design of technology for older adults.
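The sentiment definition in the Methods (each positive or negative word contributes +1 or -1, summed per sentence) can be sketched in a few lines. The toy word lists below are hypothetical stand-ins; the study used the Bing sentiment library.

```python
# Hypothetical mini-lexicon standing in for the Bing sentiment library.
POSITIVE = {"easy", "helpful", "clear", "great"}
NEGATIVE = {"confusing", "hard", "slow", "frustrating"}

def sentence_sentiment(sentence):
    """Sum of word values: +1 for each positive word, -1 for each negative."""
    score = 0
    for word in sentence.lower().strip(".!?").split():
        if word in POSITIVE:
            score += 1
        elif word in NEGATIVE:
            score -= 1
    return score

print(sentence_sentiment("The app was easy and helpful"))       # 2
print(sentence_sentiment("The setup felt slow and confusing"))  # -2
```

Averaging these per-sentence scores over an interview transcript yields the per-sentence sentiment means reported in the Results.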
In the Internet of Things (IoT), everyday objects are equipped with the ability to compute and communicate. These smart things have invaded the lives of everyday people, being constantly carried or worn on our bodies, and entering into our homes, our healthcare, and beyond. This has given rise to wireless networks of smart, connected, always-on, personal things that are constantly around us, and have unfettered access to our most personal data as well as all of the other devices that we own and encounter throughout our day. It should, therefore, come as no surprise that our personal devices and data are frequent targets of ever-present threats. Securing these devices and networks, however, is challenging. In this dissertation, we outline three critical problems in the context of Wireless Personal Area Networks (WPANs) and present our solutions to these problems.
First, I present our Trusted I/O solution (BASTION-SGX) for protecting sensitive user data transferred between wirelessly connected (Bluetooth) devices. This work shows how in-transit data can be protected from privileged threats, such as a compromised OS, on commodity systems. I present insights into the Bluetooth architecture, Intel’s Software Guard Extensions (SGX), and how a Trusted I/O solution can be engineered on commodity devices equipped with SGX.
Second, I present our work on AMULET and how we successfully built a wearable health hub that can run multiple health applications, provide strong security properties, and operate on a single charge for weeks or even months at a time. I present the design and evaluation of our highly efficient event-driven programming model, the design of our low-power operating system, and developer tools for profiling ultra-low-power applications at compile time.
Third, I present a new approach (VIA) that helps devices at the center of WPANs (e.g., smartphones) to verify the authenticity of interactions with other devices. This work builds on past work in anomaly detection techniques and shows how these techniques can be applied to Bluetooth network traffic. Specifically, we show how to create normality models based on fine- and coarse-grained insights from network traffic, which can be used to verify the authenticity of future interactions.
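The normality-model idea can be illustrated with a deliberately coarse sketch: learn a device's typical packet inter-arrival timing, then flag future interactions whose timing deviates sharply. The statistic (mean inter-arrival time), the 3-sigma rule, and the toy values are all illustrative assumptions, far simpler than VIA's actual models.

```python
import numpy as np

def fit_normality(train_intervals):
    """Learn the typical inter-arrival time (seconds) from benign traffic."""
    arr = np.asarray(train_intervals, dtype=float)
    return arr.mean(), arr.std()

def is_authentic(intervals, mean, std, k=3.0):
    """Flag an interaction as authentic if its mean timing is within k sigma."""
    return abs(np.mean(intervals) - mean) <= k * std

# Hypothetical benign inter-arrival times observed during training.
mu, sigma = fit_normality([0.10, 0.12, 0.11, 0.09, 0.10])
print(is_authentic([0.11, 0.10, 0.12], mu, sigma))  # True: matches training
print(is_authentic([0.50, 0.60, 0.55], mu, sigma))  # False: anomalous timing
```

Real traffic would need many more features (frame types, sizes, directions) and per-device baselines, but the verify-against-a-learned-baseline structure is the same.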
Purpose: To evaluate the effects of incentives, weekly planning, and daily self-monitoring prompts that were used as intervention components as part of the Ally app.
Methods: We conducted an 8-week optimization trial with n = 274 insurees of a health insurance company in Switzerland. At baseline, participants were randomized to different incentive conditions (cash incentives vs. charity incentives vs. no incentives). Over the course of the study, participants were randomized weekly to different planning conditions (action planning vs. coping planning vs. no planning) and daily to receiving or not receiving a self-monitoring prompt. The primary outcome was the achievement of personalized daily step goals.
Results: Study participants were more active and healthier than the general Swiss population. Daily cash incentives increased step-goal achievement by 8.1%, 95% confidence interval (CI): [2.1, 14.1] and, only in the no-incentive control group, action planning increased step-goal achievement by 5.8%, 95% CI: [1.2, 10.4]. Charity incentives, self-monitoring prompts, and coping planning did not affect physical activity. Engagement with planning interventions and self-monitoring prompts was low and 30% of participants stopped using the app over the course of the study.
Conclusions: Daily cash incentives increased physical activity in the short term. Planning interventions and self-monitoring prompts require revision before they can be included in future versions of the app. Selection effects and engagement can be important challenges for physical-activity apps.
Clinical Trial Information: This study was registered on ClinicalTrials.gov, NCT03384550.
Objective: The objective of this study was to evaluate the automatic recognition and segmentation of nocturnal asthmatic coughs and cough epochs in smartphone-based audio recordings that were collected in the field. We also aimed to distinguish partner coughs from patient coughs in contact-free audio recordings by classifying coughs based on sex.
Methods: We used a convolutional neural network model that we had developed in previous work for automated cough recognition. We further used techniques (such as ensemble learning, minibatch balancing, and thresholding) to address the imbalance in the data set. We evaluated the classifier in a classification task and a segmentation task. The cough-recognition classifier served as the basis for the cough-segmentation classifier from continuous audio recordings. We compared automated cough and cough-epoch counts to human-annotated cough and cough-epoch counts. We employed Gaussian mixture models to build a classifier for cough and cough-epoch signals based on sex.
Results: We recorded audio data from 94 adults with asthma (overall: mean 43 years; SD 16 years; female: 54/94, 57%; male: 40/94, 43%). Audio data were recorded by each participant in their everyday environment using a smartphone placed next to their bed; recordings were made over a period of 28 nights. Out of 704,697 sounds, we identified 30,304 sounds as coughs. A total of 26,166 coughs occurred without a 2-second pause between coughs, yielding 8238 cough epochs. The ensemble classifier performed well with a Matthews correlation coefficient of 92% in a pure classification task and achieved cough counts comparable to those of human annotators in the segmentation of coughing. The count difference between automated and human-annotated coughs was a mean –0.1 (95% CI –12.11, 11.91) coughs. The count difference between automated and human-annotated cough epochs was a mean 0.24 (95% CI –3.67, 4.15) cough epochs. The Gaussian mixture model cough epoch–based sex classification performed best, yielding an accuracy of 83%.
Conclusions: Our study showed longitudinal nocturnal cough and cough-epoch recognition from nightly recorded smartphone-based audio from adults with asthma. The model distinguishes partner cough from patient cough in contact-free recordings by identifying cough and cough-epoch signals that correspond to the sex of the patient. This research represents a step towards enabling passive and scalable cough monitoring for adults with asthma.
Methods: We recruited patients from the Dartmouth-Hitchcock Weight & Wellness Center into a single-arm, non-randomised study of a remotely delivered 16-week evidence-based healthy lifestyle programme. Every 4 weeks, participants completed surveys that included their willingness to pay for services like those experienced in the intervention. A two-item Willingness-to-Pay survey asked participants about their willingness to trade their face-to-face visits for videoconference visits, based on commute and copay.
Results: Overall, those with a travel duration of 31–45 min had a greater willingness to trade in-person visits for telehealth than any other group. Participants with travel durations of less than 15 min, 16–30 min, or 46–60 min showed a positive trend in willingness to have telehealth visits until Week 8, after which the trend toward trading in-person visits for virtual ones turned negative. Participants believed that telemedicine was useful and helpful.
Conclusions: In rural areas where patients travel 30–45 min, a telemedicine-delivered, intensive weight-loss intervention may be a well-received and cost-effective way for both patients and the clinical care team to connect.
We defined several metrics to gauge receptivity towards the interventions, and found that (1) several participant-specific characteristics (age, personality, and device type) show significant associations with overall participant receptivity over the course of the study, and (2) several contextual factors (day/time, phone battery, phone interaction, physical activity, and location) show significant associations with participant receptivity in the moment. Further, we explore the relationship between the effectiveness of the interventions and receptivity towards them; based on our analyses, we speculate that being receptive to interventions helped participants achieve physical activity goals, which in turn motivated participants to be more receptive to future interventions. Finally, we build machine-learning models to detect receptivity, with up to a 77% increase in F1 score over a biased random classifier.
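As a rough illustration of the receptivity-detection task above, the sketch below trains a simple classifier on synthetic contextual features and compares its F1 score against a biased random baseline. The feature names, data, and choice of logistic regression are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for contextual features logged at notification time:
# hour of day, battery level, recent phone interaction, activity level.
n = 2000
X = rng.random((n, 4))
# Hypothetical ground truth: users respond more when battery is high
# and they recently interacted with the phone.
p = 1 / (1 + np.exp(-(3 * X[:, 1] + 2 * X[:, 2] - 2.5)))
y = (rng.random(n) < p).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
f1_model = f1_score(y_te, clf.predict(X_te))

# Biased random baseline: predict positives at the training base rate.
y_rand = (rng.random(len(y_te)) < y_tr.mean()).astype(int)
f1_random = f1_score(y_te, y_rand)
print(f"model F1={f1_model:.2f}  biased-random F1={f1_random:.2f}")
```

On this synthetic data the learned model beats the biased random baseline, mirroring the kind of comparison reported in the paper.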
We present a theoretical and practical evaluation of a method called SNAP -- SiNgle Antenna Proximity -- that allows a single-antenna Wi-Fi device to quickly determine proximity with another Wi-Fi device. Our proximity detection technique leverages the repeating nature of Wi-Fi’s preamble and the behavior of a signal in a transmitting antenna’s near-field region to detect proximity with high probability; SNAP never falsely declares proximity at ranges longer than 14 cm.
Our system, CloseTalker, allows simple, secure, ad hoc communication between devices in close physical proximity, while jamming the signal so it is unintelligible to any receivers more than a few centimeters away. CloseTalker does not require any specialized hardware or sensors in the devices, does not require complex algorithms or cryptography libraries, occurs only when intended by the user, and can transmit a short burst of data or an address and key that can be used to establish long-term or long-range communications at full bandwidth.
In this paper we present a theoretical and practical evaluation of CloseTalker, which exploits Wi-Fi MIMO antennas and the fundamental physics of radio to establish secure communication between devices that have never previously met. We demonstrate that CloseTalker is able to facilitate secure in-band communication between devices in close physical proximity (about 5 cm), even though they have never met nor shared a key.
Methods: We conducted a convergent, parallel mixed-methods study using semi-structured interviews, focus groups, and self-reported questionnaires, using purposive sampling of 29 older adults, 4 community leaders and 7 clinicians in a rural setting. We developed codes informed by thematic analysis and assessed the quantitative data using descriptive statistics.
Results: All groups expressed that mHealth could improve health behaviors. Older adults were optimistic that mHealth could track health. Participants believed these technologies could improve patient insight into health, motivating change and ensuring accountability. Barriers to using technology were described, including infrastructure.
Conclusions: Older rural adults with obesity expressed excitement about the use of mHealth technologies to improve their health, yet barriers to implementation exist.
Objective: The primary objective of this study is to quantify main effects, interactions, and moderators of 3 intervention components of a smartphone-based intervention for physical activity. The secondary objective is the exploration of participants’ states of receptivity, that is, situations in which participants are more likely to react to intervention notifications through collection of smartphone sensor data.
Methods: In 2017, we developed the Assistant to Lift your Level of activitY (Ally), a chatbot-based mobile health intervention for increasing physical activity that utilizes incentives, planning, and self-monitoring prompts to help participants meet personalized step goals. We used a microrandomized trial design to meet the study objectives. Insurees of a large Swiss insurance company were invited to use the Ally app over a 12-day baseline and a 6-week intervention period. Upon enrollment, participants were randomly allocated to either a financial incentive, a charity incentive, or a no incentive condition. Over the course of the intervention period, participants were repeatedly randomized on a daily basis to either receive prompts that support self-monitoring or not and on a weekly basis to receive 1 of 2 planning interventions or no planning. Participants completed a Web-based questionnaire at baseline and postintervention follow-up.
Results: Data collection was completed in January 2018. In total, 274 insurees (mean age 41.73 years; 57.7% [158/274] female) enrolled in the study and installed the Ally app on their smartphones. Main reasons for declining participation were having an incompatible smartphone (37/191, 19.4%) and collection of sensor data (35/191, 18.3%). Step data are available for 227 (82.8%, 227/274) participants, and smartphone sensor data are available for 247 (90.1%, 247/274) participants.
Conclusions: This study describes the evidence-based development of a JITAI for increasing physical activity. If components prove to be efficacious, they will be included in a revised version of the app that offers scalable promotion of physical activity at low cost.
Trial Registration: ClinicalTrials.gov NCT03384550; https://clinicaltrials.gov/ct2/show/NCT03384550 (Archived by WebCite at http://www.webcitation.org/74IgCiK3d)
International Registered Report Identifier (IRRID): DERR1-10.2196/11540
We present an authentication method for desktops called Seamless Authentication using Wristbands (SAW), which addresses the lack of intentionality limitation of proximity-based methods. SAW uses a low-effort user input step for explicitly conveying user intentionality, while keeping the overall usability of the method better than password-based methods. In SAW, a user wears a wristband that acts as the user’s identity token, and to authenticate to a desktop, the user provides a low-effort input by tapping a key on the keyboard multiple times or wiggling the mouse with the wristband hand. This input to the desktop conveys that someone wishes to log in to the desktop, and SAW verifies the user who wishes to log in by confirming the user’s proximity and correlating the received keyboard or mouse inputs with the user’s wrist movement, as measured by the wristband. In our feasibility user study (n=17), SAW proved quick to authenticate (within two seconds), with a low false-negative rate of 2.5% and worst-case false-positive rate of 1.8%. In our user perception study (n=16), a majority of the participants rated it as more usable than passwords.
We propose a new approach: using jamming to thwart adversaries located more than a few centimeters away, while still allowing devices in close physical proximity to securely share data. To accomplish this secure data transfer we exploit MIMO antennas and the Inverse-Square Law.
The app implements an activity-level detection model we developed using a Linear Support Vector Machine (SVM). We trained our model using data from a user study, where subjects performed common physical activities (sitting, standing, lying down, walking, and running). We obtained accuracies up to 99.2% and 98.5% with 10-fold cross-validation and leave-one-subject-out (LOSO) cross-validation, respectively. We ran a week-long field study to evaluate the utility, usability, and battery life of the ActivityAware system, in which 5 older adults wore the Amulet as it monitored their activity level. The utility evaluation showed that the app was somewhat useful in achieving the daily physical activity goal. The usability feedback showed that the ActivityAware system has the potential to be used by people for monitoring their activity levels. Our energy-efficiency evaluation revealed a battery life of at least 1 week before needing to recharge. The results are promising, indicating that the app may be used for activity-level monitoring by individuals or researchers for epidemiological studies, and eventually for the development of interventions that could improve the health of older adults.
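The leave-one-subject-out evaluation described above can be sketched as follows. The synthetic features, subject count, and activity labels are stand-ins (assumptions); only the linear SVM and LOSO protocol come from the text.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(1)

# Synthetic stand-in for accelerometer features (e.g., mean and variance
# of magnitude) from 5 hypothetical subjects at 3 activity levels.
n_per = 60
X, y, groups = [], [], []
for subj in range(5):
    for label, center in enumerate([0.2, 1.0, 2.5]):  # sedentary/moderate/vigorous
        X.append(rng.normal(center, 0.15, size=(n_per, 2)))
        y += [label] * n_per
        groups += [subj] * n_per
X = np.vstack(X); y = np.array(y); groups = np.array(groups)

# Leave-one-subject-out cross-validation: each fold holds out one subject.
scores = cross_val_score(LinearSVC(), X, y, groups=groups, cv=LeaveOneGroupOut())
print(f"LOSO accuracy: {scores.mean():.2f}")
```

Grouping folds by subject, rather than shuffling samples, is what makes the LOSO estimate reflect generalization to unseen wearers.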
We present and evaluate a prototype implementation to demonstrate this protocol’s feasibility on low-power wearable devices, and present a case for the system’s ability to meet critical security properties under a specific adversary model and trust assumptions.
We introduce the Amulet Platform for constrained wearable devices, which includes an ultra-low-power hardware architecture and a companion software framework, including a highly efficient event-driven programming model, low-power operating system, and developer tools for profiling ultra-low-power applications at compile time. We present the design and evaluation of our prototype Amulet hardware and software, and show how the framework enables developers to write energy-efficient applications. Our prototype has battery lifetime lasting weeks or even months, depending on the application, and our interactive resource-profiling tool predicts battery lifetime within 6-10% of the measured lifetime.
First, we present the findings of a user study we conducted to understand people’s authentication behavior: things they authenticate to, how and when they authenticate, authentication errors they encounter and why, and their opinions about authentication. In our study, participants performed about 39 authentications per day on average; the majority of these authentications were to personal computers (desktop, laptop, smartphone, tablet) and with passwords, but the number of authentications to other things (e.g., car, door) was not insignificant. We saw a high failure rate for desktop and laptop authentication among our participants, affirming the need for a more usable authentication method. Overall, we found that authentication was a noticeable part of all our participants’ lives and burdensome for many participants, but they accepted it as a cost of security, devising their own ways to cope with it.
Second, we propose a new approach to authentication, called bilateral authentication, that leverages wrist-wearable technology to enable seamless authentication for things that people use with their hands, while wearing a smart wristband. In bilateral authentication two entities (e.g., user’s wristband and the user’s phone) share their knowledge (e.g., about user’s interaction with the phone) to verify the user’s identity. Using this approach, we developed a seamless authentication method for desktops and smartphones. Our authentication method offers quick and effortless authentication, continuous user verification while the desktop (or smartphone) is in use, and automatic deauthentication after use. We evaluated our authentication method through four in-lab user studies, evaluating the method’s usability and security from the system and the user’s perspective. Based on the evaluation, our authentication method shows promise for reducing users’ authentication burden for desktops and smartphones.
We built a user-friendly, mobile health-data collection system using wireless medical sensors that interface with an Android application. The data-collection system was designed to support minimally trained, non-clinical health workers to gather data about blood pressure and body weight using off-the-shelf medical sensors. This system comprises a blood-pressure cuff, a weighing scale, and a portable point-of-sale printer. With this system, we introduced a new method to record contextual information associated with a blood-pressure reading using a tablet’s touchscreen and accelerometer. This contextual information can be used to verify that a patient’s lower arm remained well-supported and stationary during her blood-pressure measurement. In a preliminary user study, we found that a binary support vector machine classifier could distinguish lower-arm movements from stationary arms with 90% accuracy. Predetermined thresholds for the accelerometer readings suffice to determine whether the tablet, and therefore the arm that rested on it, remained supported. Together, these two methods can allow mHealth applications to guide untrained patients (or health workers) in measuring blood pressure correctly.
Usability is a particularly important design and deployment challenge in remote, rural areas, given the limited resources for technology training and support. We conducted a field study to assess our system’s usability in Kolar town, India, where we logged health worker interactions with the app’s interface using an existing usability toolkit. Researchers analyzed logs from this toolkit to evaluate the app’s user experience and quantify specific usability challenges in the app. We have recorded experiential notes from the field study in this document.
Our recognition method uses bioimpedance, a measurement of how tissue responds when exposed to an electrical current. By collecting bioimpedance samples using a small wearable device we designed, our system can determine that (a) the wearer is indeed the expected person and (b) the device is physically on the wearer’s body. Our recognition method works with 98% balanced accuracy under a cross-validation of a day’s worth of bioimpedance samples from a cohort of 8 volunteer subjects. We also demonstrate that our system continues to recognize a subset of these subjects even several months later. Finally, we measure the energy requirements of our system as implemented on a Nexus S smartphone and custom-designed module for the Shimmer sensing platform.
We address this problem of balancing disclosure and utility of personal information collected by mobile technologies. We believe subjects can decide how best to share their information if they are aware of the benefits and risks of sharing. We developed ShareBuddy, a privacy-aware architecture that allows recipients to request information and specify the benefits the subjects will receive for sharing each piece of requested information; the architecture displays these benefits and warns subjects about the risks of sharing. We describe the ShareBuddy architecture in this poster.
To address this problem we propose ZEBRA. In ZEBRA, a user wears a bracelet (with a built-in accelerometer, gyroscope, and radio) on her dominant wrist. When the user interacts with a computer terminal, the bracelet records the wrist movement, processes it, and sends it to the terminal. The terminal compares the wrist movement with the inputs it receives from the user (via keyboard and mouse), and confirms the continued presence of the user only if they correlate. Because the bracelet is on the same hand that provides inputs to the terminal, the accelerometer and gyroscope data and input events received by the terminal should correlate because their source is the same -- the user’s hand movement. In our experiments ZEBRA performed continuous authentication with 85% accuracy in verifying the correct user and identified all adversaries within 11 s. For a different threshold that trades security for usability, ZEBRA correctly verified 90% of users and identified all adversaries within 50 s.
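A highly simplified sketch of ZEBRA's comparison step follows: the terminal labels its own input events, the bracelet labels wrist-motion segments, and the two label sequences are compared over a sliding window. The event labels, window size, and threshold here are hypothetical; ZEBRA's actual segmentation and correlation are more sophisticated.

```python
# Hypothetical ZEBRA-style verification: compare the terminal's view of
# recent interactions against the bracelet's wrist-motion classification.
def verify(terminal_events, bracelet_events, window=20, threshold=0.7):
    """Return True while the two label streams agree often enough."""
    recent_t = terminal_events[-window:]
    recent_b = bracelet_events[-window:]
    matches = sum(t == b for t, b in zip(recent_t, recent_b))
    return matches / max(len(recent_t), 1) >= threshold

# Same user: bracelet labels mostly agree with terminal labels.
terminal = ["type", "type", "scroll", "type", "scroll"] * 4
same_user = ["type", "type", "scroll", "type", "type"] * 4
adversary = ["scroll"] * 20
print(verify(terminal, same_user))   # high agreement: keep session
print(verify(terminal, adversary))   # low agreement: deauthenticate
```

The threshold trades security for usability, as in the paper's two reported operating points.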
In this thesis we describe solutions to two of these problems. First, we evaluate the use of bioimpedance for recognizing who is wearing these wireless sensors and show that bioimpedance is a feasible biometric. Second, we investigate the use of accelerometers for verifying whether two of these wireless sensors are on the same person and show that our method is successful at distinguishing between sensors on the same body and on different bodies. We stress that any solution to these problems must be usable, meaning the user should not have to do anything but attach the sensor to their body and have it just work.
These methods solve interesting problems in their own right, but it is the combination of these methods that shows their true power. Combined together they allow a network of wireless sensors to cooperate and determine whom they are sensing even though only one of the wireless sensors might be able to determine this fact. If all the wireless sensors know they are on the same body as each other and one of them knows which person it is on, then they can each exploit the transitive relationship to know that they must all be on that person’s body. We show how these methods can work together in a prototype system. This ability to operate unobtrusively, collecting in situ data and labeling it properly without interrupting the wearer’s activities of daily life, will be vital to the success of these wireless sensors.
We present a wearable sensor to passively recognize people. Our sensor uses the unique electrical properties of a person’s body to recognize their identity. More specifically, the sensor uses bioimpedance -- a measure of how the body’s tissues oppose a tiny applied alternating current -- and learns how a person’s body uniquely responds to alternating current of different frequencies. In this paper we demonstrate the feasibility of our system by showing its effectiveness at accurately recognizing people in a household 90% of the time.
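As an illustration of recognizing a person from a bioimpedance frequency response, the sketch below enrolls noisy frequency sweeps and matches new samples to the nearest template. The frequencies, impedance profiles, and nearest-template classifier are illustrative assumptions, not the paper's actual sensor data or model.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical bioimpedance magnitudes at several probe frequencies;
# each household member has a distinct frequency-response profile.
profiles = {"alice": np.array([120.0, 95.0, 70.0, 55.0]),
            "bob":   np.array([140.0, 110.0, 90.0, 60.0])}

def enroll(person, n=20, noise=3.0):
    """Simulate n noisy bioimpedance sweeps for a person."""
    return profiles[person] + rng.normal(0, noise, size=(n, 4))

# Each template is the mean of that person's enrollment sweeps.
templates = {p: enroll(p).mean(axis=0) for p in profiles}

def recognize(sample):
    """Nearest-template matching over enrolled household members."""
    return min(templates, key=lambda p: np.linalg.norm(sample - templates[p]))

print(recognize(profiles["alice"] + rng.normal(0, 3.0, 4)))
```

The key idea carried over from the abstract is that the body's response across frequencies, not any single measurement, serves as the identifying signature.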
In this paper, we describe Plug-n-Trust (PnT), a novel approach to protecting both the confidentiality and integrity of safety-critical medical sensing and data processing on vulnerable mobile phones. With PnT, a plug-in smart card provides a trusted computing environment, keeping data safe even on a compromised mobile phone. By design, PnT is simple to use and deploy, while providing a flexible programming interface amenable to a wide range of applications. We describe our implementation, designed for Java-based smart cards and Android phones, in which we use a split-computation model with a novel path hashing technique to verify proper behavior without exposing confidential data. Our experimental evaluation demonstrates that PnT achieves its security goals while incurring acceptable overhead.
In order for such a vision to be successful, these devices will need to seamlessly interoperate with no interaction required of the user. As difficult as it is for users to manage their wireless area networks, it will be even more difficult for a user to manage their wireless body-area network in a truly pervasive world. As such, we believe these wearable devices should form a wireless body-area network that is passive in nature. This means that these pervasive wearable devices will require no configuration, yet they will be able to form a wireless body-area network by (1) discovering their peers, (2) recognizing they are attached to the same body, (3) securing their communications, and (4) identifying to whom they are attached. While we are interested in all aspects of these passive wireless body-area networks, we focus on the last requirement: identifying who is wearing a device.
We conducted focus groups to understand the privacy concerns that patients have when they use mHealth devices. We conducted a user study to understand how willing patients are to share their personal health information that was collected using an mHealth device. To the best of our knowledge, ours is the first study that explores users’ privacy concerns by giving them the opportunity to actually share the information collected about them using mHealth devices. We found that patients tend to share more information with third parties than the public and prefer to keep certain information from their family and friends. Finally, based on these discoveries, we propose some guidelines to developing defaults for sharing settings in mHealth systems.
We make three contributions. First, we propose Adapt-lite, a set of two techniques that can be applied to existing wireless protocols to make them energy efficient without compromising their security or privacy properties. The techniques are: adaptive security, which dynamically modifies packet overhead; and MAC striping, which makes forgery difficult even for small-sized MACs. Second, we apply these techniques to an existing wireless protocol, and demonstrate a prototype on a Chronos wrist device. Third, we provide security, privacy, and energy analysis of our techniques.
We make three contributions. First, we propose an mHealth sensing protocol that provides strong security and privacy properties with low energy overhead, suitable for low-power sensors. The protocol uses three novel techniques: adaptive security, to dynamically modify transmission overhead; MAC striping, to make forgery difficult even for small-sized MACs; and an asymmetric resource requirement. Second, we demonstrate a prototype on a Chronos wrist device, and evaluate it experimentally. Third, we provide a security, privacy, and energy analysis of our system.
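The adaptive-security idea above can be sketched with a truncated HMAC: the sender shrinks the per-packet tag when transmission overhead matters more than assurance. This is a minimal sketch under stated assumptions (hypothetical key and payload); MAC striping additionally spreads tag bits across packets and is not shown here.

```python
import hmac
import hashlib

KEY = b"shared-sensor-key"  # hypothetical pre-shared key

def tag(payload: bytes, mac_len: int) -> bytes:
    """Compute an HMAC-SHA256 tag truncated to mac_len bytes."""
    return hmac.new(KEY, payload, hashlib.sha256).digest()[:mac_len]

def verify(payload: bytes, received_tag: bytes) -> bool:
    """Constant-time check of a (possibly truncated) tag."""
    return hmac.compare_digest(tag(payload, len(received_tag)), received_tag)

pkt = b"hr=72,ts=1700000000"
# Adaptive security: small tag for low overhead, large tag for assurance.
short_tag = tag(pkt, 4)
long_tag = tag(pkt, 16)
print(verify(pkt, short_tag), verify(pkt, long_tag))  # True True
print(verify(b"hr=190,ts=1700000000", short_tag))     # False
```

A 4-byte tag alone is easier to forge than a full-length one, which is exactly the gap that MAC striping is designed to close.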
We provide a method to probabilistically detect this situation. Because accelerometers are relatively cheap and require little power, we imagine that the cellphone and each sensor will have a companion accelerometer embedded with the sensor itself. We extract standard features from these companion accelerometers, and use a pair-wise statistic -- coherence, a measurement of how well two signals are related in the frequency domain -- to determine how well features correlate for different locations on the body. We then use these feature coherences to train a classifier to recognize whether a pair of sensors -- or a sensor and a cellphone -- are on the same body. We evaluate our method over a dataset of several individuals walking around with sensors in various positions on their body and experimentally show that our method is capable of achieving accuracies over 80%.
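The coherence statistic at the heart of this method can be illustrated with synthetic accelerometer signals: sensors on the same walking body share a periodic gait component, while sensors on different bodies do not. The sampling rate, gait band, and spectral parameters below are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(3)
fs = 50  # Hz, an assumed accelerometer sampling rate
t = np.arange(0, 20, 1 / fs)

# Walking imposes a shared ~2 Hz component on same-body sensors.
gait = np.sin(2 * np.pi * 2.0 * t)
same_a = gait + 0.3 * rng.standard_normal(t.size)
same_b = gait + 0.3 * rng.standard_normal(t.size)
# A different body walks at its own cadence and phase.
other = np.sin(2 * np.pi * 1.7 * t + 1.0) + 0.3 * rng.standard_normal(t.size)

def band_coherence(x, y):
    """Peak magnitude-squared coherence in an assumed 1-3 Hz gait band."""
    f, cxy = coherence(x, y, fs=fs, nperseg=256)
    band = (f >= 1) & (f <= 3)
    return cxy[band].max()

print(f"same body:      {band_coherence(same_a, same_b):.2f}")
print(f"different body: {band_coherence(same_a, other):.2f}")
```

In the full method, such coherence values become features for a trained same-body classifier rather than being thresholded directly.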
This poster describes a simple, flexible, and novel approach to protecting both the confidentiality and integrity of medical sensing and data processing on vulnerable mobile phones, using plug-in smart cards---even on a phone compromised by malware. We describe our design, implementation, and initial experimental results using real smart cards and Android smartphones.
In this chapter, we survey the growing body of research that addresses the risks, methods, and evaluation of network trace sanitization. Research on the risks of network trace sanitization attempts to extract information from published network traces, while research on sanitization methods investigates approaches that may protect against such attacks. Although researchers have recently proposed both quantitative and qualitative methods to evaluate the effectiveness of sanitization methods, such work has several shortcomings, some of which we highlight in a discussion of open problems. Sanitizing a network trace, however challenging, remains an important method for advancing network-based research.
We demonstrate deficiencies of previously studied methods that measure clock skews in 802.11 networks by means of an attack that spoofs clock skews. We then provide means to overcome those deficiencies, thereby improving the reliability of fingerprinting. Finally, we show how to perform the clock-skew arithmetic that enables network providers to publish clock skews of their access points for use by clients.
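A clock skew is commonly estimated as the slope of observed timestamp offsets over time; the sketch below fits that slope by least squares and then composes skews relative to a shared reference. The drift value, jitter model, and the approximately additive composition shown are illustrative assumptions for this sketch, not results from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical beacon stream: the AP's clock drifts at +38 ppm relative
# to the observer, with jitter in the observed arrival times.
true_skew_ppm = 38.0
t_obs = np.arange(0, 600, 0.1)  # observer time (s)
offset = true_skew_ppm * 1e-6 * t_obs + rng.normal(0, 5e-5, t_obs.size)

# Estimate skew as the slope of a least-squares line through the offsets.
slope, intercept = np.polyfit(t_obs, offset, 1)
est_ppm = slope * 1e6
print(f"estimated skew: {est_ppm:.1f} ppm")

# Skew arithmetic (assumed approximately additive): if a provider
# publishes s(AP, reference), a client measuring s(client, AP) can
# derive s(client, reference) and compare against the published value.
s_ap_ref, s_client_ap = 38.0, -12.0
s_client_ref = s_client_ap + s_ap_ref
print(f"client-vs-reference skew: {s_client_ref:.1f} ppm")
```

Composing skews through a common reference is what lets clients check a fingerprint without ever having measured the access point against the provider's clock directly.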
In this position paper, we propose Mobile-phone based Patient Compliance System (MPCS) that can reduce the time-consuming and error-prone processes of existing self-regulation practice to facilitate self-reporting, non-compliance detection, and compliance reminders. The novelty of this work is to apply social-behavior theories to engineer the MPCS to positively influence patients’ compliance behaviors, including mobile-delivered contextual reminders based on association theory; mobile-triggered questionnaires based on self-perception theory; and mobile-enabled social interactions based on social-construction theory. We discuss the architecture and the research challenges to realize the proposed MPCS.
We propose DEAMON (Distributed Energy-Aware MONitoring), an energy-efficient distributed algorithm for long-term sensor monitoring. Our approach assumes only that mobile nodes are tasked to report sensor data under conditions specified by a Boolean expression, and that a network of nearby sensor nodes contribute to monitoring subsets of the task’s sensors. Our algorithm to select sensor nodes and to monitor the sensing condition conserves energy of all nodes by limiting sensing and communication operations. We evaluate DEAMON with a stochastic analysis and with simulation results, and show that it should significantly reduce energy consumption.
We found that the applications used on the WLAN changed dramatically, with significant increases in peer-to-peer and streaming multimedia traffic. Despite the introduction of a Voice over IP (VoIP) system that includes wireless handsets, our study indicates that VoIP has been used little on the wireless network thus far, and most VoIP calls are made on the wired network.
We saw greater heterogeneity in the types of clients used, with more embedded wireless devices such as PDAs and mobile VoIP clients. We define a new metric for mobility, the “session diameter”. We use this metric to show that embedded devices have different mobility characteristics than laptops, and travel further and roam to more access points. Overall, users were surprisingly non-mobile, with half remaining close to home about 98% of the time.
We describe AnonySense, a privacy-aware architecture for realizing pervasive applications based on collaborative, opportunistic sensing by personal mobile devices. AnonySense allows applications to submit sensing tasks that will be distributed across anonymous participating mobile devices, later receiving verified, yet anonymized, sensor data reports back from the field, thus providing the first secure implementation of this participatory sensing model. We describe our trust model, and the security properties that drove the design of the AnonySense system. We evaluate our prototype implementation through experiments that indicate the feasibility of this approach, and through two applications: a Wi-Fi rogue access point detector and a lost-object finder.
We propose SenseRight, the first architecture for high-integrity people-centric sensing. The SenseRight approach, which extends and enhances AnonySense, assures integrity of both the sensor data (through use of tamper-resistant sensor devices) and the sensor context (through a time-constrained protocol), maintaining anonymity if desired.
We propose AnonySense, a general-purpose architecture for leveraging users’ mobile devices for measuring context, while maintaining the privacy of the users. AnonySense features multiple layers of privacy protection---a framework for nodes to receive tasks anonymously, a novel blurring mechanism based on tessellation and clustering to protect users’ privacy against the system while reporting context, and k-anonymous report aggregation to improve the users’ privacy against applications receiving the context. We outline the architecture and security properties of AnonySense, and focus on evaluating our tessellation and clustering algorithm against real mobility traces.
In this dissertation, we present Dart-Mesh: a Linux-based layer-3 dual-radio two-tiered mesh network that provides complete 802.11b coverage in the Sudikoff Lab for Computer Science at Dartmouth College. We faced several challenges in building, testing, monitoring and managing this network. These challenges motivated us to design and implement Mesh-Mon, a network monitoring system to aid system administrators in the management of a mobile mesh network. Mesh-Mon is a scalable, distributed and decentralized management system in which mesh nodes cooperate in a proactive manner to help detect, diagnose and resolve network problems automatically. Mesh-Mon is independent of the routing protocol used by the mesh routing layer and can function even if the routing protocol fails. We demonstrate this feature by running Mesh-Mon on two versions of Dart-Mesh, one running on AODV (a reactive mesh routing protocol) and the second running on OLSR (a proactive mesh routing protocol) in separate experiments.
Mobility can cause links to break, leading to disconnected partitions. We identify critical nodes in the network, whose failure may cause a partition. We introduce two new metrics based on social-network analysis: the Localized Bridging Centrality (LBC) metric and the Localized Load-aware Bridging Centrality (LLBC) metric, that can identify critical nodes efficiently and in a fully distributed manner.
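A sketch of how such a score can be computed from purely local information, using the standard Everett-Borgatti egocentric betweenness and the bridging coefficient from the bridging-centrality literature (the paper's exact LBC/LLBC formulas may differ):

```python
def ego_betweenness(adj, v):
    """Egocentric betweenness of v: for each pair of v's neighbours that
    are not directly linked, v carries 1/(number of two-hop paths between
    them inside the ego network) of their shortest-path traffic."""
    nbrs = sorted(adj[v])
    total = 0.0
    for i, a in enumerate(nbrs):
        for b in nbrs[i + 1:]:
            if b in adj[a]:
                continue  # direct edge: v lies on no shortest a-b path
            # v itself plus any mutual neighbours inside the ego network
            two_hop = 1 + len(adj[a] & adj[b] & set(nbrs))
            total += 1.0 / two_hop
    return total

def bridging_coefficient(adj, v):
    """High when v has few links but its neighbours are well connected."""
    return (1.0 / len(adj[v])) / sum(1.0 / len(adj[u]) for u in adj[v])

def lbc(adj, v):
    # both factors need only v's one-hop neighbourhood: fully distributed
    return ego_betweenness(adj, v) * bridging_coefficient(adj, v)
```

A node joining otherwise-disconnected neighbours (the hub of a star, say) scores high, flagging it as critical to connectivity.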
We run a monitoring component on client nodes, called Mesh-Mon-Ami, which also assists Mesh-Mon nodes in the dissemination of management information between physically disconnected partitions, by acting as carriers for management data.
We conclude, from our experimental evaluation on our 16-node Dart-Mesh testbed, that our system solves several management challenges in a scalable manner, and is a useful and effective tool for monitoring and managing real-world mesh networks.
This sampling approach may be sufficient, for example, for a system administrator or anomaly detection module to observe some unusual behavior in the network. Once an anomaly is detected, however, the administrator may require a more extensive traffic sample, or need to identify the location of an offending device.
We propose a method to allow measurement applications to dynamically modify the sampling strategy, refocusing the monitoring system to pay more attention to certain types of traffic than others. In this paper we show that refocusing is a necessary and promising new technique for wireless measurement.
By analyzing the RSS pattern of typical 802.11 transmitters in a 3-floor building covered by 20 air monitors, we observed that the RSS readings followed a mixture of multiple Gaussian distributions. We discovered that this phenomenon was mainly due to antenna diversity, a widely-adopted technique to improve the stability and robustness of wireless connectivity. This observation renders existing approaches ineffective because they assume a single RSS source. We propose an approach based on Gaussian mixture models, building RSS profiles for spoofing detection. Experiments on the same testbed show that our method is robust against antenna diversity and significantly outperforms existing approaches. At a 3% false positive rate, we detect 73.4%, 89.6% and 97.8% of attacks using the three proposed algorithms, based on local statistics of a single AM, combining local results from AMs, and global multi-AM detection, respectively.
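The heart of this detector, fitting a two-component Gaussian mixture to one-dimensional RSS readings, can be sketched with a small EM loop (a simplified stand-in, not the paper's implementation):

```python
import math, random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def _std(xs):
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def fit_gmm2(xs, iters=100):
    """Fit a 2-component 1-D Gaussian mixture by EM.
    Antenna diversity makes a transmitter's RSS bimodal, so a single
    Gaussian profile misfits while a 2-component mixture captures both lobes.
    Returns (weights, means, sigmas)."""
    mu = [min(xs), max(xs)]              # spread the initial means apart
    sd = [max(1.0, _std(xs))] * 2
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of component 1 for each reading
        r = []
        for x in xs:
            p0 = w[0] * normal_pdf(x, mu[0], sd[0])
            p1 = w[1] * normal_pdf(x, mu[1], sd[1])
            r.append(p1 / (p0 + p1))
        # M-step: re-estimate parameters from the responsibilities
        n1 = sum(r)
        n0 = len(xs) - n1
        w = [n0 / len(xs), n1 / len(xs)]
        mu = [sum((1 - ri) * x for ri, x in zip(r, xs)) / n0,
              sum(ri * x for ri, x in zip(r, xs)) / n1]
        sd = [max(0.5, math.sqrt(sum((1 - ri) * (x - mu[0]) ** 2 for ri, x in zip(r, xs)) / n0)),
              max(0.5, math.sqrt(sum(ri * (x - mu[1]) ** 2 for ri, x in zip(r, xs)) / n1))]
    return w, mu, sd
```

A spoofing detector would then flag readings (or reading sequences) with low likelihood under the fitted per-transmitter mixture profile.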
Effective monitoring of wireless network traffic, using commodity hardware, is a challenging task due to the limitations of the hardware. IEEE 802.11 networks support multiple channels, and a wireless interface can monitor only a single channel at a time. Thus no single interface can capture all frames on all channels, and we need strategies to capture the most representative sample.
When a large geographic area is to be monitored, several monitoring stations must be deployed, and these will typically overlap in their area of coverage. The competing goals of effective wireless monitoring are to capture as many frames as possible, while minimizing the number of those frames that are captured redundantly by more than one monitoring station. Both goals may be addressed with a sampling strategy that directs neighboring monitoring stations to different channels during any period. To be effective, such a strategy requires timely access to the nature of all recent traffic.
We propose a coordinated sampling strategy that meets these goals. Our implemented solution involves a central controller considering traffic characteristics from many monitoring stations to periodically develop specific sampling policies for each station. We demonstrate the effectiveness of our coordinated sampling strategy by comparing it with existing independent strategies. Our coordinated strategy enabled more distinct frames to be captured, providing a solid foundation for focused sampling and intrusion detection.
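One way a central controller could compute such per-station policies (a greedy sketch over assumed inputs, not the paper's actual algorithm) is to walk the monitors in order of observed traffic and steer each away from channels that an overlapping neighbour already covers:

```python
def assign_channels(traffic, neighbors):
    """Greedy coordinated channel assignment.

    traffic:   {monitor: {channel: estimated frame count}}
    neighbors: {monitor: set of monitors with overlapping coverage}
    Busiest monitors pick first; each prefers its busiest channel that no
    overlapping neighbour is already sampling, reducing redundant capture.
    """
    assignment = {}
    for m in sorted(traffic, key=lambda m: -sum(traffic[m].values())):
        taken = {assignment[n] for n in neighbors[m] if n in assignment}
        # (not-yet-taken, traffic volume) makes uncovered channels win ties
        assignment[m] = max(traffic[m], key=lambda c: (c not in taken, traffic[m][c]))
    return assignment
```

With two overlapping monitors that both see channel 1 as busiest, one samples channel 1 and the other is pushed to its next-best channel, so together they capture more distinct frames.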
In this chapter we discuss the measurement and analysis of the popular 802.11 family of wireless LANs. We describe the tools, metrics and techniques that are used to measure wireless LANs. The results of existing measurement studies are surveyed. We illustrate some of the problems that are specific to measuring wireless LANs, and outline some challenges for collecting future wireless traces.
We implemented and compared the prediction accuracy of several location predictors drawn from four major families of domain-independent predictors, namely Markov-based, compression-based, PPM, and SPM predictors. We found that low-order Markov predictors performed as well as or better than the more complex and more space-consuming compression-based predictors.
Although other researchers have explored mobility prediction in hypothetical scenarios, evaluating their predictors analytically or with synthetic data, few studies have been able to evaluate their predictors with real user mobility data. As a first step towards filling this fundamental gap, we work with a large data set collected from the Dartmouth College campus-wide wireless network that hosts more than 500 access points and 6,000 users. Extending our earlier work that focuses on predicting the next-visited access point (i.e., location), in this work we explore the predictability of the time of user mobility. Our contributions are twofold. First, we evaluate a series of predictors that reflect possible dependencies across time and space while benefiting from either individual or group mobility behaviors. Second, as a case study we examine voice applications and the use of handoff prediction for advance bandwidth reservation. Using application-specific performance metrics such as call drop and call block rates, we provide a picture of the potential gains of prediction.
Our results indicate that it is difficult to predict handoff time accurately, when applied to real campus WLAN data. However, the findings of our case study also suggest that application performance can be improved significantly even with predictors that are only moderately accurate. The gains depend on the applications’ ability to use predictions and tolerate inaccurate predictions. In the case study, we combine the real mobility data with synthesized traffic data. The results show that intelligent prediction can lead to significant reductions in the rate at which active calls are dropped due to handoffs with marginal increments in the rate at which new calls are blocked.
We consider a class of applications that wish to consider a user’s context when deciding whether to authorize a user’s access to important physical or information resources. Such a context-sensitive authorization scheme is necessary when mobile users move across multiple administrative domains where they are not registered in advance. Also, users interacting with their environment need a non-intrusive way to access resources, and clues about their context may be useful input into authorization policies for these resources. Existing systems for context-sensitive authorization take a logic-based approach, because a logical language makes it possible to define a context model where a contextual fact is expressed with a boolean predicate and to derive higher-level context information and authorization decisions from contextual facts.
However, those existing context-sensitive authorization systems have a central server that collects context information, and evaluates policies to make authorization decisions on behalf of a resource owner. A centralized solution assumes that all resource owners trust the server to make correct decisions, and all users trust the server not to disclose private context information. In many realistic applications of pervasive computing, however, the resources, users, and sources of context information are inherently distributed among many organizations that do not necessarily trust each other. Resource owners may not trust the integrity of context information produced by another domain, and context sensors may not trust others with the confidentiality of data they provide about users.
In this thesis, we present a secure distributed proof system for context-sensitive authorization. Our system enables multiple hosts to evaluate an authorization query in a peer-to-peer way, while preserving the confidentiality and integrity policies of mutually untrusted principals running those hosts. We also develop a novel caching and revocation mechanism to support context-sensitive policies that refer to information in dozens of different administrative domains. Contributions of this thesis include the definition of fine-grained security policies that specify trust relations among principals in terms of information confidentiality and integrity, the design and implementation of a secure distributed proof system, a proof for the correctness of our algorithm, and a performance evaluation showing that the amortized performance of our system scales to dozens of servers in different domains.
In this paper, we present a general methodology for extracting mobility information from wireless network traces, and for classifying mobile users and APs. We used the Fourier transform to convert time-dependent location information to the frequency domain, then chose the two strongest periods and used them as parameters to a classification system based on Bayesian theory. To classify mobile users, we computed diameter (the maximum distance between any two APs visited by a user during a fixed time period) and observed how this quantity changes or repeats over time. We found that user mobility had a strong period of one day, but there was also a large group of users that had either a much smaller or much bigger primary period. Both primary and secondary periods had important roles in determining classes of mobile users. Users with one day as their primary period and a smaller secondary period were most prevalent; we expect that they were mostly students taking regular classes. To classify APs, we counted the number of users that visited each AP. The primary period did not play a critical role because it was equal to one day for most of the APs; the secondary period was the determining parameter. APs with one day as their primary period and one week as their secondary period were most prevalent. By plotting the classes of APs on our campus map, we discovered that this periodic behavior of APs seemed to be independent of their geographical locations, but may depend on the relative locations of nearby APs. Ultimately, we hope that our study can help the design of location-aware services by providing a base for user mobility models that reflect the movements of real users.
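The period-extraction step can be sketched with a naive DFT (pure Python for clarity; any FFT routine would serve in practice). Given an hourly activity series, it returns the dominant periods in samples:

```python
import cmath, math

def top_periods(series, n_periods=2):
    """Return the n strongest periods (in samples) of a real-valued
    series, via a naive O(n^2) DFT over the non-DC bins."""
    n = len(series)
    mean = sum(series) / n
    xs = [v - mean for v in series]  # remove DC so bin 0 cannot dominate
    mags = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(xs))
        mags.append((abs(coeff), k))
    mags.sort(reverse=True)
    return [n / k for _, k in mags[:n_periods]]
```

For a two-week hourly trace mixing daily and weekly rhythms, the two strongest periods come out as 24 and 168 hours, the same primary/secondary structure the classification uses.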
We found that residential traffic dominated all other traffic, particularly in residences populated by newer students; students are increasingly choosing a wireless laptop as their primary computer. Although web protocols were the single largest component of traffic volume, network backup and file sharing contributed an unexpectedly large amount to the traffic. Although there was some roaming within a network session, we were surprised by the number of situations in which cards roamed excessively, unable to settle on one access point. Cross-subnet roams were an especial problem, because they broke IP connections, indicating the need for solutions that avoid or accommodate such roams.
This paper analyzes an extensive network trace from a mature 802.11 WLAN, including more than 550 access points and 7,000 users over seventeen weeks. We employ several measurement techniques, including syslogs, telephone records, SNMP polling and tcpdump packet sniffing. This is the largest WLAN study to date, and the first to look at a large, mature WLAN and consider geographic mobility. We compare this trace to a trace taken after the network’s initial deployment two years ago.
We found that the applications used on the WLAN changed dramatically. Initial WLAN usage was dominated by Web traffic; our new trace shows significant increases in peer-to-peer, streaming multimedia, and voice over IP (VoIP) traffic. On-campus traffic now exceeds off-campus traffic, a reversal of the situation at the WLAN’s initial deployment. Our study indicates that VoIP has been used little on the wireless network thus far, and most VoIP calls are made on the wired network. Most calls last less than a minute.
We saw greater heterogeneity in the types of clients used, with more embedded wireless devices such as PDAs and mobile VoIP clients. We define a new metric for mobility, the “session diameter.” We use this metric to show that embedded devices have different mobility characteristics than laptops, and travel further and roam to more access points. Overall, users were surprisingly non-mobile, with half remaining close to home about 98% of the time.
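Given AP coordinates, the session diameter of a session is just the largest pairwise distance among the APs it touched; a minimal sketch with hypothetical inputs:

```python
import math

def session_diameter(ap_positions, visited):
    """Maximum distance between any two APs a device associated with
    during one session: a coarse measure of its physical mobility.

    ap_positions: {ap_name: (x, y)}; visited: sequence of AP names.
    """
    pts = [ap_positions[a] for a in set(visited)]
    return max(
        (math.dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:]),
        default=0.0,  # a session pinned to one AP has zero diameter
    )
```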
In this study, we begin with a large outdoor routing experiment testing the performance of four popular ad hoc algorithms (AODV, APRL, ODMRP, and STARA). We present a detailed comparative analysis of these four implementations. Then, using the outdoor results as a baseline of reality, we disprove a set of common assumptions used in simulation design, and quantify the impact of these assumptions on simulated results. We also more specifically validate a group of popular radio models with our real-world data, and explore the sensitivity of various simulation parameters in predicting accurate results. We close with a series of specific recommendations for simulation and ad hoc routing protocol designers.
Using zero(), it is possible to efficiently implement applications including a variety of databases and I/O-efficient computation systems on top of the Unix file system. zero() can also be used to implement an efficient file-system-based paging mechanism. In some I/O-efficient computations, the availability of zero() effectively doubles disk capacity by allowing blocks of temporary files to be reallocated to new files as they are read.
Experiments on a Linux ext2 file system augmented by zero() demonstrate that where their functionality overlaps, zero() is more efficient than ftruncate(). Additional experiments reveal that in exchange for added effective disk capacity, I/O-efficient code pays only a small performance penalty.
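A toy model (not the ext2 implementation) of why zero() adds capacity that ftruncate() cannot: zeroed blocks return to the free pool immediately, and later reads of them simply return zeros:

```python
class BlockFile:
    """Toy file whose zero() both zeroes a block and reclaims its space,
    unlike ftruncate(), which can only trim blocks at the end of a file."""
    BLOCK = 4  # bytes per block in this toy

    def __init__(self, free_blocks):
        self.free = free_blocks      # blocks left on the "disk"
        self.blocks = {}             # block index -> data

    def write(self, idx, data):
        if idx not in self.blocks:
            if self.free == 0:
                raise OSError("disk full")
            self.free -= 1
        self.blocks[idx] = data

    def read(self, idx):
        return self.blocks.get(idx, b"\0" * self.BLOCK)  # holes read as zeros

    def zero(self, idx):
        if idx in self.blocks:       # drop the block, reclaim the space
            del self.blocks[idx]
            self.free += 1
```

An I/O-efficient computation can thus zero() blocks of a temporary file as it consumes them, letting the same physical blocks back a new file before the old one is deleted.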
We found that the applications used on the WLAN changed dramatically. Initial WLAN usage was dominated by Web traffic; our new trace shows significant increases in peer-to-peer, streaming multimedia, and voice over IP (VoIP) traffic. On-campus traffic now exceeds off-campus traffic, a reversal of the situation at the WLAN’s initial deployment. Our study indicates that VoIP has been used little on the wireless network thus far, and most VoIP calls are made on the wired network. Most calls last less than a minute.
We saw more heterogeneity in the types of clients used, with more embedded wireless devices such as PDAs and mobile VoIP clients. We define a new metric for mobility, the “session diameter.” We use this metric to show that embedded devices have different mobility characteristics than laptops, and travel further and roam to more access points. Overall, users were surprisingly non-mobile, with half remaining close to home about 98% of the time.
We implemented and compared the prediction accuracy of several location predictors drawn from two major families of domain-independent predictors, namely Markov-based and compression-based predictors. We found that low-order Markov predictors performed as well as or better than the more complex and more space-consuming compression-based predictors. Predictors of both families fail to make a prediction when the recent context has not been previously seen. To overcome this drawback, we added a simple fallback feature to each predictor and found that it significantly enhanced its accuracy in exchange for modest effort. Thus the Order-2 Markov predictor with fallback was the best predictor we studied, obtaining a median accuracy of about 72% for users with long trace lengths. We also investigated a simplification of the Markov predictors, where the prediction is based not on the most frequently seen context in the past, but the most recent, resulting in significant space and computational savings. We found that Markov predictors with this recency semantics can rival the accuracy of standard Markov predictors in some cases. Finally, we considered several seemingly obvious enhancements, such as smarter tie-breaking and aging of context information, and discovered that they had little effect on accuracy. The paper ends with a discussion and suggestions for further work.
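The best predictor found here, order-2 Markov with fallback, is simple enough to sketch; this is a schematic reimplementation, not the study's code:

```python
from collections import Counter, defaultdict

class MarkovPredictor:
    """Order-k Markov location predictor with fallback to lower orders."""

    def __init__(self, order=2):
        self.order = order
        # one frequency table per order: context tuple -> Counter of next location
        self.tables = [defaultdict(Counter) for _ in range(order + 1)]
        self.history = []

    def predict(self):
        # try the longest context first, falling back to shorter ones
        for k in range(self.order, -1, -1):
            ctx = tuple(self.history[-k:]) if k else ()
            counts = self.tables[k].get(ctx)
            if counts:
                return counts.most_common(1)[0][0]
        return None  # no history at all

    def update(self, loc):
        for k in range(self.order + 1):
            if len(self.history) >= k:
                ctx = tuple(self.history[-k:]) if k else ()
                self.tables[k][ctx][loc] += 1
        self.history.append(loc)
```

After observing A B C A B C A, the order-2 table maps context (C, A) to B; with no matching order-2 or order-1 context, the predictor falls back to the overall most-frequent location rather than abstaining.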
In this paper we present a data-dissemination service, PACK, which allows applications to specify customized data-reduction policies. These policies define how to discard or summarize data flows wherever buffers overflow on the dissemination path, notably at the mobile hosts where applications often reside. The PACK service provides an overlay infrastructure to support mobile data sources and sinks, using application-specific data-reduction policies where necessary along the data path. We uniformly apply the data-stream “packing” abstraction to buffer overflow caused by network congestion, slow receivers, and the temporary disconnections caused by end-host mobility. We demonstrate the effectiveness of our approach with an application example and experimental measurements.
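The packing abstraction, a bounded buffer with an application-supplied reduction policy, can be sketched as follows (the policy names are illustrative, not PACK's API):

```python
class PackBuffer:
    """Bounded queue that applies an application-specific reduction
    policy whenever it overflows, instead of dropping data blindly."""

    def __init__(self, capacity, policy):
        self.capacity, self.policy = capacity, policy
        self.items = []

    def push(self, item):
        self.items.append(item)
        if len(self.items) > self.capacity:
            self.items = self.policy(self.items, self.capacity)

def drop_oldest(items, cap):
    """Policy: discard the stalest data (suits live sensor feeds)."""
    return items[-cap:]

def summarize_oldest(items, cap):
    """Policy: fold the oldest readings into one averaged summary item."""
    k = len(items) - cap + 1
    return [sum(items[:k]) / k] + items[k:]
```

The same buffer can sit at any overflow point on the dissemination path (congested link, slow receiver, disconnected mobile host); only the policy changes per application.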
We represent applications as collections of mobile agents and introduce a distributed mechanism for allocating general computational priority to mobile agents. We derive a bidding strategy for an agent that plans expenditures given a budget and a series of tasks to complete. We also show that a unique Nash equilibrium exists between the agents under our allocation policy. We present simulation results to show that the use of our resource-allocation mechanism and expenditure-planning algorithm results in shorter mean job completion times compared to traditional mobile-agent resource allocation. We also observe that our resource-allocation policy adapts favorably to allocate overloaded resources to higher priority agents, and that agents are able to effectively plan expenditures even when faced with network delay and job-size estimation error.
We found that most of the users of Dartmouth's network have short association times and a high rate of mobility. This observation fits with the predominantly student population of Dartmouth College, because students do not have a fixed workplace and are moving to and from classes all day.
We found that residential traffic dominated all other traffic, particularly in residences populated by newer students; students are increasingly choosing a wireless laptop as their primary computer. Although web protocols were the single largest component of traffic volume, network backup and file sharing contributed an unexpectedly large amount to the traffic. Although there was some roaming within a network session, we were surprised by the number of situations in which cards roamed excessively, unable to settle on one access point. Cross-subnet roams were an especial problem, because they broke IP connections, indicating the need for solutions that avoid or accommodate such roams.
In this paper, we motivate and describe our graph abstraction, and discuss a variety of critical design issues. We also sketch our Solar system, an implementation that represents one point in the design space for our graph abstraction.
We found that residential traffic dominated all other traffic, particularly in residences populated by newer students; students are increasingly choosing a wireless laptop as their primary computer. Although web protocols were the single largest component of traffic volume, network backup and file sharing contributed an unexpectedly large amount to the traffic. Although there was some roaming within a network session, we were surprised by the number of situations in which cards roamed excessively, unable to settle on one access point. Cross-subnet roams were an especial problem, because they broke IP connections, indicating the need for solutions that avoid or accommodate such roams.
In this paper, we motivate and describe our graph abstraction, and discuss a variety of critical design issues. We also sketch our Solar system, an implementation that represents one point in the design space for our graph abstraction.
We describe our approach in terms of a specific context-dissemination framework, the Solar system, although the same principles would apply to systems with similar properties.
We discuss our market structure and mechanisms we have developed to foster secure exchange between agents and hosts. Additionally, we believe that certain agent applications encourage repeated interactions that benefit both agents and hosts, giving further reason for hosts to fairly accommodate agents. We apply our ideas to create a resource-allocation policy for mobile-agent systems, from which we derive an algorithm for a mobile agent to plan its expenditure and travel. With perfect information, the algorithm guarantees the agent’s optimal completion time.
We relax the assumptions underlying our algorithm design and simulate our planning algorithm and allocation policy to show that the policy prioritizes agents by endowment, handles bursty workloads, adapts to situations where network resources are overextended, and that delaying agents’ actions does not catastrophically affect agents’ performance.
Mobile agents represent informational and computational flow. We develop mechanisms that distributively allocate computation among mobile agents in two settings. The first models a situation where users collectively own networked computing resources and require priority enforcement. We extend the allocation mechanism to allow resource reservation to mitigate utility volatility. The second, more general model relaxes the ownership assumption. We apply our computational market to an open setting where a principal’s chief concern is revenue maximization.
Our simulations compare the performance of market-based allocation policies to traditional policies and relate the cost of ownership and consumption separation. We observe that our markets effectively prioritize applications’ performance, can operate under uncertainty and network delay, provide metrics to balance network load, and allow measurement of market-participation risk versus reservation-based computation.
In addition to allocation problems, we investigate resource selection to optimize execution time. The problem is NP-complete even if the costs and latencies are constant. The dependence of both metrics on the chosen set further complicates matters. We study how a greedy approach, a novel heuristic, and a shortest-constrained-path strategy perform in mobile-agent applications.
Market-based computational-resource allocation fertilizes applications where previously there was a dearth of motive for or means of cooperation. The rationale behind mobile-agent performance optimization is also useful for resource allocation in general distributed systems where an application has a sequence of dependent tasks or when data collection is expensive.
We present the design of Mobile Voice over IP (MVOIP), an application-level protocol that enables seamless cross-subnet mobility in a VOIP application based on the ITU H.323 protocol stack. An MVOIP application uses hints from the surrounding network to determine that it has switched subnets. It then initiates a hand-off procedure that comprises pausing its current calls, obtaining a valid IP address for the current subnet, and reconnecting to the remote party with whom it was in a call. Testing the system shows that on a Windows 2000 platform there is a perceivable delay in the hand-off process, most of which is spent in the Windows API for obtaining DHCP addresses. Despite this bottleneck, MVOIP works well on a wireless network.
The incentives of agents in the two markets drastically differ. The open-interest model motivates agents to be less trusting and to not share information. This aspect stems from the model’s greater applicability to resource allocation, but has a deep impact on system efficiency. In this paper, we summarize some economic theory and anecdotal evidence from our models and system implementations that support the claim, and conclude with guidelines for system development.
We describe boundaries that can interfere with end-to-end authorization, and outline our unified approach. We describe the system we built and the applications we adapted to use our unified authorization system, and measure its costs. We conclude that our system is a practical approach to the desirable goal of end-to-end authorization.
In our earlier work, we propose a policy for allocating general computational priority among agents posed as a competitive game for which we derive a unique computable Nash equilibrium. Here we improve on our earlier approach by implementing resource guarantees where mobile-agent hosts issue call options on computational resources. Call options allow an agent to reserve and guarantee the cost and time necessary to complete its itinerary before the agent begins execution.
We present an algorithm based upon the binomial options-pricing model that estimates future congestion to allow hosts to evaluate call options; methods for agents to measure the risk associated with their performance and compare their expected utility of competing in the computational spot market with utilizing resource options; and test our theory with simulations to show that option trade reduces variance in agent completion times.
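The binomial options-pricing model mentioned here is the textbook lattice; a generic Cox-Ross-Rubinstein pricer for a call option on a resource whose price evolves on the lattice (a sketch, not the paper's congestion estimator) looks like:

```python
import math

def crr_call_price(s0, strike, r, sigma, T, steps):
    """Cox-Ross-Rubinstein binomial price of a European call.

    s0: current resource price; strike: agreed price; r: discount rate;
    sigma: price volatility; T: time to expiry; steps: lattice depth.
    """
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))   # up factor per step
    d = 1 / u                             # down factor per step
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    disc = math.exp(-r * dt)
    # option payoffs at the terminal lattice layer
    values = [max(s0 * u**j * d**(steps - j) - strike, 0.0)
              for j in range(steps + 1)]
    # backward induction to the root
    for _ in range(steps):
        values = [disc * (p * values[j + 1] + (1 - p) * values[j])
                  for j in range(len(values) - 1)]
    return values[0]
```

A host would quote such a premium when selling a reservation, so an agent can weigh the guaranteed option cost against its expected spot-market cost and risk.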
The dissertation is organized into four main parts. First, I discuss the challenges and tradeoffs involved in naming resources and consider a variety of existing approaches to naming.
Second, I consider the architectural requirements for user-centric sharing. I evaluate existing systems with respect to these requirements.
Third, to support the sharing architecture, I develop a formal logic of sharing that captures the notion of restricted delegation. Restricted delegation ensures that users can use the same mechanisms to share resources consistently, regardless of the origin of the resource, or with whom the user wishes to share the resource next. A formal semantics gives unambiguous meaning to the logic. I apply the formalism to the Simple Public Key Infrastructure and discuss how the formalism either supports or discourages potential extensions to such a system.
Finally, I use the formalism to drive a user-centric sharing implementation for distributed systems. I show how this implementation enables end-to-end authorization, a feature that makes heterogeneous distributed systems more secure and easier to audit. Gateway services that bridge administrative domains, add abstraction, or translate protocols typically impede the flow of authorization information from client to server. In contrast, end-to-end authorization enables us to build gateway services that preserve authorization information, reducing the size of the trusted computing base and enabling more effective auditing. I demonstrate my implementation and show how it enables end-to-end authorization across various boundaries. I measure my implementation and argue that its performance tracks that of similar authorization mechanisms without end-to-end structure.
I conclude that my user-centric philosophy of naming and sharing benefits both users and administrators.
Agent Tcl is a mobile-agent system whose agents can be written in Tcl, Java, and Scheme. Agent Tcl has extensive navigation and communication services, security mechanisms, and debugging and tracking tools. In this article we focus on Agent Tcl’s architecture and security mechanisms, its RPC system, and its docking system, which lets an agent move transparently among mobile computers, regardless of when they are connected to the network.
We create a formal utility model to derive user-demand functions, allowing agents to efficiently plan expenditure and deal with price fluctuations. By quantifying demand and utility, resource owners can precisely set a value for a good. We simulate our model in a mobile agent scheduling environment and show how mobile agents may use server prices to distribute themselves evenly throughout a network.
Agent Tcl is a mobile-agent system whose agents can be written in Tcl, Java, and Scheme. Agent Tcl has extensive navigation and communication services, security mechanisms, and debugging and tracking tools. In this article we focus on Agent Tcl’s architecture and security mechanisms, its RPC system, and its docking system, which lets an agent move transparently among mobile computers, regardless of when they are connected to the network.
In this work we examine current multiprocessor file systems, as well as how those file systems are used by scientific applications. Contrary to the expectations of the designers of current parallel file systems, the workloads on those systems are dominated by requests to read and write small pieces of data. Furthermore, rather than being accessed sequentially and contiguously, as in uniprocessor and supercomputer workloads, files in multiprocessor file systems are accessed in regular, structured, but non-contiguous patterns.
Based on our observations of multiprocessor workloads, we have designed Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. In this work, we introduce Galley and discuss its design and implementation. We describe Galley’s new three-dimensional file structure and discuss how that structure can be used by parallel applications to achieve higher performance. We introduce several new data-access interfaces, which allow applications to explicitly describe the regular access patterns we found to be common in parallel file system workloads. We show how these new interfaces allow parallel applications to achieve tremendous increases in I/O performance. Finally, we discuss how Galley’s new file structure and data-access interfaces can be useful in practice.
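The benefit of a strided interface can be seen in a small sketch: one request descriptor names every record of a regular but non-contiguous pattern, so the file system sees the whole pattern at once instead of many independent small requests. The signature below is illustrative, not Galley's actual API.

```python
# Sketch of a strided-read descriptor (hypothetical names). A single
# (offset, record_size, stride, count) tuple describes a regular,
# non-contiguous access pattern; expanding it shows the byte ranges
# the file system can now schedule as one batch of disk I/O.

def strided_extents(offset: int, record_size: int, stride: int, count: int):
    """Expand one strided request into the byte ranges it covers."""
    return [(offset + i * stride, record_size) for i in range(count)]

# e.g. every 4 KB, read one 1 KB record -- three small requests become one
extents = strided_extents(offset=0, record_size=1024, stride=4096, count=3)
assert extents == [(0, 1024), (4096, 1024), (8192, 1024)]
```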
The design of a high-performance multiprocessor file system requires a comprehensive understanding of the expected workload. Unfortunately, until recently, no general workload studies of multiprocessor file systems have been conducted. The goal of the CHARISMA project was to remedy this problem by characterizing the behavior of several production workloads, on different machines, at the level of individual reads and writes. The first set of results from the CHARISMA project describe the workloads observed on an Intel iPSC/860 and a Thinking Machines CM-5. This paper is intended to compare and contrast these two workloads for an understanding of their essential similarities and differences, isolating common trends and platform-dependent variances. Using this comparison, we are able to gain more insight into the general principles that should guide multiprocessor file-system design.
We propose that the traditional functionality of parallel file systems be separated into two components: a fixed core that is standard on all platforms, encapsulating only primitive abstractions and interfaces, and a set of high-level libraries to provide a variety of abstractions and application-programmer interfaces (APIs).
We present our current and next-generation file systems as examples of this structure. Their features, such as a three-dimensional file structure, strided read and write interfaces, and I/O-node programs, are specifically designed with the flexibility and performance necessary to support a wide range of applications.
This method for software isolation has two particular advantages over processes. First, for frequently communicating modules, we significantly reduce context switch time. Thus, we demonstrate near-optimal inter-module communication using software fault isolation. Second, our software-based techniques provide an efficient and expedient solution in situations where only one address space is available (e.g., kernel, or a single-address-space operating system).
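The core idea behind software fault isolation, confining a module to its own segment by transforming the addresses it uses, can be sketched as follows. This is illustrative only: real SFI rewrites machine code at the instruction level, and the constants here are hypothetical.

```python
# Sketch of SFI-style address sandboxing. Each module is confined to a
# segment identified by the upper address bits; masking every address
# before use guarantees that stores stay in-segment, without a process
# boundary or a context switch between modules.

SEGMENT_BITS = 12                    # low 12 bits index within a 4 KB segment
SEGMENT_MASK = (1 << SEGMENT_BITS) - 1

def sandbox(addr: int, segment_base: int) -> int:
    """Force an address into the module's segment by masking."""
    return segment_base | (addr & SEGMENT_MASK)

base = 0x5000
assert sandbox(0x5123, base) == 0x5123   # in-segment address unchanged
assert sandbox(0x9123, base) == 0x5123   # stray address redirected in-segment
```

Because the mask is applied unconditionally, a faulty module cannot corrupt memory outside its segment, yet calls between co-located modules cost only a few instructions rather than a kernel-mediated context switch.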
We have found that the Galley File System provides a good environment on which to build high-performance libraries, and that the mesh of Panda and Galley was a successful combination.
Recent parallel file-system usage studies show that writes to write-only files are a dominant part of the workload. Therefore, optimizing writes could have a significant impact on overall performance. In this paper, we propose ENWRICH, a compute-processor write-caching scheme for write-only files in parallel file systems. ENWRICH combines low-overhead write caching at the compute processors with high performance disk-directed I/O at the I/O processors to achieve both low latency and high bandwidth. This combination facilitates the use of the powerful disk-directed I/O technique independent of any particular choice of interface. By collecting writes over many files and applications, ENWRICH lets the I/O processors optimize disk I/O over a large pool of requests. We evaluate our design via simulated implementation and show that ENWRICH achieves high performance for various configurations and workloads.
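The caching scheme can be sketched in a few lines. The class below is a simplified stand-in for ENWRICH (names, threshold policy, and flush mechanics are illustrative): small writes are absorbed locally, and at flush time the pooled blocks are handed over sorted by file offset so the I/O processors can schedule near-sequential, disk-directed writes.

```python
# Sketch of compute-processor write caching for write-only files
# (an ENWRICH-like policy, simplified). Small writes are absorbed in a
# local cache; a flush drains the pool in (file, offset) order so the
# I/O processors can optimize disk I/O over the whole batch.

class WriteCache:
    def __init__(self, flush_threshold: int):
        self.flush_threshold = flush_threshold
        self.blocks = {}                      # (file, offset) -> data

    def write(self, file: str, offset: int, data: bytes):
        self.blocks[(file, offset)] = data    # absorb the small write locally
        if len(self.blocks) >= self.flush_threshold:
            return self.flush()
        return []

    def flush(self):
        """Drain the pool sorted by (file, offset) for disk-directed I/O."""
        batch = sorted(self.blocks.items())
        self.blocks.clear()
        return batch

cache = WriteCache(flush_threshold=3)
cache.write("f", 8192, b"c")
cache.write("f", 0, b"a")
batch = cache.write("f", 4096, b"b")
assert [off for (_, off), _ in batch] == [0, 4096, 8192]
```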
We propose that the traditional functionality of parallel file systems be separated into two components: a fixed core that is standard on all platforms, encapsulating only primitive abstractions and interfaces, and a set of high-level libraries to provide a variety of abstractions and application-programmer interfaces (APIs). We think of this approach as the “RISC” of parallel file-system design.
We present our current and next-generation file systems as examples of this structure. Their features, such as a three-dimensional file structure, strided read and write interfaces, and I/O-node programs, are specifically designed with the flexibility and performance necessary to support a wide range of applications.
Revised on 1/8/96 to emphasize our use of a particular MPI implementation, MPICH.
Recent parallel file-system usage studies show that writes to write-only files are a dominant part of the workload. Therefore, optimizing writes could have a significant impact on overall performance. In this paper, we propose ENWRICH, a compute-processor write-caching scheme for write-only files in parallel file systems. ENWRICH combines low-overhead write caching at the compute processors with high performance disk-directed I/O at the I/O processors to achieve both low latency and high bandwidth. This combination facilitates the use of the powerful disk-directed I/O technique independent of any particular choice of interface. By collecting writes over many files and applications, ENWRICH lets the I/O processors optimize disk I/O over a large pool of requests. We evaluate our design via simulated implementation and show that ENWRICH achieves high performance for various configurations and workloads.
The design of a high-performance parallel file system requires a comprehensive understanding of the expected workload. Unfortunately, until recently, no general workload studies of parallel file systems have been conducted. The goal of the CHARISMA project was to remedy this problem by characterizing the behavior of several production workloads, on different machines, at the level of individual reads and writes. The first set of results from the CHARISMA project describe the workloads observed on an Intel iPSC/860 and a Thinking Machines CM-5. This paper is intended to compare and contrast these two workloads for an understanding of their essential similarities and differences, isolating common trends and platform-dependent variances. Using this comparison, we are able to gain more insight into the general principles that should guide parallel file-system design.
Of course, computational processes sharing a node with a file-system service may receive less CPU time, network bandwidth, and memory bandwidth than they would on a computation-only node. In this paper we begin to examine this issue experimentally. We found that high-performance I/O does not necessarily require substantial CPU time, leaving plenty of time for application computation. There were some complex file-system requests, however, which left little CPU time available to the application. (The impact on network and memory bandwidth still needs to be determined.) For applications (or users) that cannot tolerate an occasional interruption, we recommend that they continue to use only compute nodes. For tolerant applications needing more cycles than those provided by the compute nodes, we recommend that they take full advantage of both compute and I/O nodes for computation, and that operating systems should make this possible.
Most successful systems are based on a solid understanding of the characteristics of the expected workload, but until now there have been no comprehensive workload characterizations of multiprocessor file systems. We began the CHARISMA project in an attempt to fill that gap. We instrumented the common node library on the iPSC/860 at NASA Ames to record all file-related activity over a two-week period. Our instrumentation is different from previous efforts in that it collects information about every read and write request and about the mix of jobs running in the machine (rather than from selected applications).
The trace analysis in this paper leads to many recommendations for designers of multiprocessor file systems. First, the file system should support simultaneous access to many different files by many jobs. Second, it should expect to see many small requests, predominantly sequential and regular access patterns (although of a different form than in uniprocessors), little or no concurrent file-sharing between jobs, significant byte- and block-sharing between processes within jobs, and strong interprocess locality. Third, our trace-driven simulations showed that these characteristics led to great success in caching, both at the compute nodes and at the I/O nodes. Finally, we recommend supporting strided I/O requests in the file-system interface, to reduce overhead and allow more performance optimization by the file system.
Design of such high-performance parallel file systems depends on a thorough grasp of the expected workload. So far there have been no comprehensive usage studies of multiprocessor file systems. Our CHARISMA project intends to fill this void. The first results from our study involve an iPSC/860 at NASA Ames. This paper presents results from a different platform, the CM-5 at the National Center for Supercomputing Applications. The CHARISMA studies are unique because we collect information about every individual read and write request and about the entire mix of applications running on the machines.
The results of our trace analysis lead to recommendations for parallel file system design. First, the file system should support efficient concurrent access to many files, and I/O requests from many jobs under varying load conditions. Second, it must efficiently manage large files kept open for long periods. Third, it should expect to see small requests, predominantly sequential access patterns, application-wide synchronous access, no concurrent file-sharing between jobs, appreciable byte and block sharing between processes within jobs, and strong interprocess locality. Finally, the trace data suggest that node-level write caches and collective I/O request interfaces may be useful in certain environments.
Our approach is novel in three distinct and essential ways. First, we will teach parallel computing to freshmen in a course designed from beginning to end to do so. Second, we will motivate the course with examples from scientific computation. Third, we use multimedia and visualization as instructional aids. We have two primary objectives: to begin a reform of our undergraduate curriculum with a laboratory-based freshman course on parallel computation, and to produce tools and methodologies that improve student understanding of the basic principles of parallel computing.
Most successful systems are based on a solid understanding of the characteristics of the expected workload, but until now there have been no comprehensive workload characterizations of multiprocessor file systems. We began the CHARISMA project in an attempt to fill that gap. We instrumented the common node library on the iPSC/860 at NASA Ames to record all file-related activity over a two-week period. Our instrumentation is different from previous efforts in that it collects information about every read and write request and about the mix of jobs running in the machine (rather than from selected applications).
The trace analysis in this paper leads to many recommendations for designers of multiprocessor file systems. First, the file system should support simultaneous access to many different files by many jobs. Second, it should expect to see many small requests, predominantly sequential and regular access patterns (although of a different form than in uniprocessors), little or no concurrent file-sharing between jobs, significant byte- and block-sharing between processes within jobs, and strong interprocess locality. Third, our trace-driven simulations showed that these characteristics led to great success in caching, both at the compute nodes and at the I/O nodes. Finally, we recommend supporting strided I/O requests in the file-system interface, to reduce overhead and allow more performance optimization by the file system.
The contests we describe have distinct advantages over contests such as the ACM scholastic programming contest. The primary advantage is that there is no travel required — the whole contest is held in cyberspace. All interaction between participants and judges is via electronic mail.
Of course all contests build on and learn from others, and ours is no exception. This paper is intended to provide a description and philosophy of programming contests that will foster discussion, that will provide a model, and that will increase interest in programming as an essential aspect of computer science.
We found significant interest in parallel supercomputing on campus. An on-campus parallel supercomputing facility would not only support numerous courses and research projects, but would provide a locus for intellectual activity in parallel computing, encouraging interdisciplinary collaboration. We believe that this report is a first step in that direction.
This dissertation studies some of the file system issues needed to get high performance from parallel disk systems, since parallel hardware alone cannot guarantee good performance. The target systems are large MIMD multiprocessors used for scientific applications, with large files spread over multiple disks attached in parallel. The focus is on automatic caching and prefetching techniques. We show that caching and prefetching can transparently provide the power of parallel disk hardware to both sequential and parallel applications using a conventional file system interface. We also propose a new file system interface (compatible with the conventional interface) that could make it easier to use parallel disks effectively.
Our methodology is a mixture of implementation and simulation, using a software testbed that we built to run on a BBN GP1000 multiprocessor. The testbed simulates the disks and fully implements the caching and prefetching policies. Using a synthetic workload as input, we use the testbed in an extensive set of experiments. The results show that prefetching and caching improved the performance of parallel file systems, often dramatically.
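One of the simplest policies in this family, sequential one-block lookahead, can be sketched briefly. This is an illustrative stand-in for the testbed's policies, not the dissertation's exact implementation.

```python
# Sketch of sequential one-block-lookahead prefetching. On every read
# the cache ensures the next block is also present, so after the first
# miss a strictly sequential reader hits on every subsequent block.

class PrefetchCache:
    def __init__(self):
        self.cache = set()
        self.hits = self.misses = 0

    def read(self, block: int):
        if block in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            self.cache.add(block)
        self.cache.add(block + 1)       # always prefetch the next block

c = PrefetchCache()
for b in range(8):                      # strictly sequential access
    c.read(b)
assert (c.misses, c.hits) == (1, 7)
```

Even this trivial policy turns a sequential scan into near-perfect hits; the harder problem, as the experiments show, is choosing when prefetching pays off for less regular parallel access patterns.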
Experiments have been conducted with an interleaved file system testbed on the Butterfly Plus multiprocessor. Results of these experiments suggest that 1) the hit ratio, the accepted measure in traditional caching studies, may not be an adequate measure of performance when the workload consists of parallel computations and parallel file access patterns, 2) caching with prefetching can significantly improve the hit ratio and the average time to perform an I/O operation, and 3) an improvement in overall execution time has been observed in most cases. In spite of these gains, prefetching sometimes results in increased execution times (a negative result, given the optimistic nature of the study).
We explore why it is not trivial to translate savings on individual I/O requests into consistently better overall performance, and identify the key problems that need to be addressed in order to improve the potential of prefetching techniques in this environment.
Experiments have been conducted with an interleaved file system testbed on the Butterfly Plus multiprocessor. Results of these experiments suggest that 1) the hit ratio, the accepted measure in traditional caching studies, may not be an adequate measure of performance when the workload consists of parallel computations and parallel file access patterns, 2) caching with prefetching can significantly improve the hit ratio and the average time to perform an I/O operation, and 3) an improvement in overall execution time has been observed in most cases. In spite of these gains, prefetching sometimes results in increased execution times (a negative result, given the optimistic nature of the study).
We explore why it is not trivial to translate savings on individual I/O requests into consistently better overall performance, and identify the key problems that need to be addressed in order to improve the potential of prefetching techniques in this environment.
This design paper describes a symbolic design system called Prism. The motivation for designing Prism arose from the desire to improve symbolic-to-mask compaction -- specifically in the VIVID system. Current compactors run as totally batch processes. Running batch, a compactor must either smash the chip hierarchy and compact the entire chip as one cell or compact individual cells, making assumptions about the environment and connections for each cell. In either case, the area of the mask suffers. Also, compactors can take an extraordinary amount of time, and one small change -- even if it would make no change in the area of the compacted mask -- requires a total recompaction.
Experiences with using and creating VIVID indicated more reasons to build Prism. VIVID is one of the best existing symbolic systems, but strides in state-of-the-art communications, user interfaces, and design automation software engineering have left it behind. Prism is a descendant of VIVID, but Prism is a new model for symbolic design.