From nawyn at media.mit.edu Mon Aug 3 10:49:46 2009
From: nawyn at media.mit.edu (Jason Nawyn)
Date: Mon, 3 Aug 2009 10:49:46 -0400
Subject: [Datasets] CHI Workshop/Datasets mailing list
Message-ID: <358F98C3-F240-4BE1-B284-E6E54AF72DA6@media.mit.edu>

Dear Workshop Participants,

I am writing to announce the existence of a mailing list for participants in the CHI 2009 Workshop on "Developing Shared Home Behavior Datasets to Advance HCI and Ubiquitous Computing Research." You can post a message to the list by addressing an email to datasets at mit.edu.

We hope you and your associates will find this a useful venue for exchanging information about the collection, use, and sharing of information related to sensor data on home activity. Please feel free to pass the information about this list along to your colleagues who might be interested in contributing to the discussion.

Information about subscribing/unsubscribing or changing list preferences is available at:
http://mailman.mit.edu/mailman/listinfo/datasets

Thanks, and best wishes.
Jason

From nawyn at media.mit.edu Mon Aug 3 11:09:36 2009
From: nawyn at media.mit.edu (Jason Nawyn)
Date: Mon, 3 Aug 2009 11:09:36 -0400
Subject: [Datasets] Annotating home activity
Message-ID: <046D7769-51DA-4C7A-A71A-ECB89AF24D4B@media.mit.edu>

As part of our home data collection project, we will be making our multimodal sensor datasets publicly available for use in the research community. In order to enhance the value of these resources, we plan to provide observational annotations for a broad selection of high-level activities that typically occur in the home. Sample activities to be annotated include: brushing teeth, exercising, mopping, ironing, preparing a snack, etc.
A full list of the proposed items is available at:
http://boxlab.wikispaces.com/Annotation

In order to reflect the ambiguity that naturally occurs as an individual transitions among (often simultaneous) activities, we are proposing a discrete rating scale that will be applied to activity labels according to the annotator's confidence that the indicated behavior is currently happening. We are presently seeking feedback on this strategy.

The proposed "certainty values" are as follows:

- Not happening
- Possibly
- Likely
- Definitely

These values will be time stamped and stored as integers (0-3) in the annotation data file along with the activity labels. While an activity is ongoing, annotators will be encouraged to modify their certainty ratings according to whether the observed behavior increases or decreases the likelihood that the activity is occurring.

Certainty ratings should reflect ambiguity from a number of sources. Here are some suggestions for how this system might be applied:

- Behaviors that indicate preparation for an activity but are not intrinsic to the common sense definition of the activity might be included in the annotation, but at lower certainty values
- If an individual is engaged in an ongoing activity but briefly focuses on another task, the pause may be reflected by marking the first activity with lower certainty
- Long pauses that clearly interrupt the current activity may result in a rating of "Not happening" for that activity

We are currently testing a user interface that makes it relatively easy to apply these labels, and would welcome any comments you might have about this or other strategies to introduce confidence levels into annotation sets (preferably without requiring multiple independent coders). If you don't see the value of this exercise, we'd appreciate that perspective too.

Thanks,
Jason

From jfogarty at cs.washington.edu Mon Aug 3 22:47:17 2009
From: jfogarty at cs.washington.edu (James Fogarty)
Date: Mon, 3 Aug 2009 19:47:17 -0700
Subject: [Datasets] Annotating home activity
In-Reply-To: <046D7769-51DA-4C7A-A71A-ECB89AF24D4B@media.mit.edu>
References: <046D7769-51DA-4C7A-A71A-ECB89AF24D4B@media.mit.edu>
Message-ID:

One issue that came up in our coding of office activities was time granularity. Instead of coding things at extremely high granularity, I seem to recall we decided to code whether each activity occurred in each 5 second window. This made it easier to code multiple activities in a single pass, which was important to us because we had a lot of overlapping activities (sitting, typing, looking at the computer monitor).

Hopefully that makes sense. Our CHI_2003/TOCHI_2005 paper might have more detail, or I'd be willing to talk sometime about how we did our coding. Not sure whether it was a good or bad approach, but I can at least tell you what it was. :)

James

--
James A. Fogarty, Assistant Professor
Computer Science & Engineering, University of Washington
http://www.cs.washington.edu/homes/jfogarty/
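
The two ideas raised in this thread, timestamped certainty ratings stored as integers (0-3) and coding each activity per 5 second window, could be combined roughly as follows. This is only a sketch under stated assumptions: the Rating record, its field names, timestamps in seconds from the start of a recording, and the rule of keeping the highest certainty in effect within a window are all hypothetical, and none of it reflects the actual annotation file format or either group's coding tool.

```python
import math
from dataclasses import dataclass
from enum import IntEnum


class Certainty(IntEnum):
    """The four proposed certainty values, stored as integers 0-3."""
    NOT_HAPPENING = 0
    POSSIBLY = 1
    LIKELY = 2
    DEFINITELY = 3


@dataclass(frozen=True)
class Rating:
    """One timestamped rating (hypothetical record layout)."""
    time_s: float          # assumed: seconds since the recording started
    activity: str          # activity label, e.g. "ironing"
    certainty: Certainty   # stays in effect until the next rating


def to_windows(ratings, activity, duration_s, window_s=5.0):
    """Collapse one activity's ratings into one value per fixed window.

    Assumed rule: a window gets the highest certainty that was in
    effect at any moment inside it; before the first rating the
    activity counts as NOT_HAPPENING.
    """
    events = sorted(
        (r for r in ratings if r.activity == activity),
        key=lambda r: r.time_s,
    )
    n_windows = math.ceil(duration_s / window_s)
    current = Certainty.NOT_HAPPENING
    coded = []
    i = 0
    for w in range(n_windows):
        start, end = w * window_s, (w + 1) * window_s
        # Ratings made at or before the window start set its initial state.
        while i < len(events) and events[i].time_s <= start:
            current = events[i].certainty
            i += 1
        best = current
        # Ratings made inside the window may raise the window's value.
        while i < len(events) and events[i].time_s < end:
            current = events[i].certainty
            best = max(best, current)
            i += 1
        coded.append(best)
    return coded


# Made-up example data: two overlapping activities in a 25 second clip.
ratings = [
    Rating(3.0, "ironing", Certainty.POSSIBLY),
    Rating(4.0, "mopping", Certainty.LIKELY),
    Rating(7.0, "ironing", Certainty.DEFINITELY),
    Rating(16.0, "ironing", Certainty.NOT_HAPPENING),
]

print([int(c) for c in to_windows(ratings, "ironing", 25.0)])  # [1, 3, 3, 3, 0]
print([int(c) for c in to_windows(ratings, "mopping", 25.0)])  # [2, 2, 2, 2, 2]
```

Taking the maximum within a window is just one possible choice; a coder who cares more about interruptions might prefer the rating in effect at the end of each window instead.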