From nawyn at media.mit.edu  Mon Aug  3 10:49:46 2009
From: nawyn at media.mit.edu (Jason Nawyn)
Date: Mon, 3 Aug 2009 10:49:46 -0400
Subject: [Datasets] CHI Workshop/Datasets mailing list
Message-ID: <358F98C3-F240-4BE1-B284-E6E54AF72DA6@media.mit.edu>


Dear Workshop Participants,

I am writing to announce the existence of a mailing list for  
participants in the CHI 2009 Workshop on "Developing Shared Home  
Behavior Datasets to Advance HCI and Ubiquitous Computing  
Research."  You can post a message to the list by addressing an  
email to datasets at mit.edu.

We hope you and your associates will find this a useful venue for  
exchanging information about the collection, use, and sharing of  
information related to sensor data on home activity.

Please feel free to pass the information about this list along to your  
colleagues who might be interested in contributing to the discussion.   
Information about subscribing/unsubscribing or changing list  
preferences is available at:

http://mailman.mit.edu/mailman/listinfo/datasets

Thanks, and best wishes.

Jason


From nawyn at media.mit.edu  Mon Aug  3 11:09:36 2009
From: nawyn at media.mit.edu (Jason Nawyn)
Date: Mon, 3 Aug 2009 11:09:36 -0400
Subject: [Datasets] Annotating home activity
Message-ID: <046D7769-51DA-4C7A-A71A-ECB89AF24D4B@media.mit.edu>


As part of our home data collection project, we will be making our  
multimodal sensor datasets publicly available for use in the research  
community.  In order to enhance the value of these resources, we plan  
to provide observational annotations for a broad selection of  
high-level activities that typically occur in the home.  Sample activities  
to be annotated include: brushing teeth, exercising, mopping, ironing,  
preparing a snack, etc.  A full list of the proposed items is  
available at: http://boxlab.wikispaces.com/Annotation

In order to reflect the ambiguity that naturally occurs as an  
individual transitions among (often simultaneous) activities, we are  
proposing a discrete rating scale that will be applied to activity  
labels according to the annotator's confidence that the indicated  
behavior is currently happening.  We are presently seeking feedback on  
this strategy. The proposed "certainty values" are as follows:
	- Not happening
	- Possibly
	- Likely
	- Definitely

These values will be time stamped and stored as integers (0-3) in the  
annotation data file along with the activity labels.  While an  
activity is ongoing, annotators will be encouraged to modify their  
certainty ratings according to whether the observed behavior increases  
or decreases the likelihood that the activity is occurring. Certainty  
ratings should reflect ambiguity from a number of sources.  Here are  
some suggestions for how this system might be applied:
	- Behaviors that indicate preparation for an activity but are not  
intrinsic to the common-sense definition of the activity might be  
included in the annotation, but at lower certainty values
	- If an individual is engaged in an ongoing activity but briefly  
focuses on another task, the pause may be reflected by marking the  
first activity with lower certainty
	- Long pauses that clearly interrupt the current activity may result  
in a rating of "Not happening" for that activity
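
To make the proposal concrete, here is a minimal sketch of how the time-stamped certainty ratings might be represented and queried. The record layout, the names Certainty, Annotation, and certainty_at, the use of seconds as the time unit, and the rule that a rating holds until it is changed are all assumptions made for illustration; none of them is taken from the actual annotation file format.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Certainty(IntEnum):
    """The four proposed certainty values, stored as integers 0-3."""
    NOT_HAPPENING = 0
    POSSIBLY = 1
    LIKELY = 2
    DEFINITELY = 3


@dataclass
class Annotation:
    """One time-stamped rating change for one activity (hypothetical layout)."""
    timestamp: float      # seconds since the start of the recording (assumed unit)
    activity: str
    certainty: Certainty


def certainty_at(annotations: List[Annotation], activity: str, t: float) -> Certainty:
    """Return the most recent rating for `activity` at time t.

    Assumes a rating holds until the annotator changes it, and that an
    activity with no rating yet counts as NOT_HAPPENING.
    """
    current = Certainty.NOT_HAPPENING
    latest = float("-inf")
    for a in annotations:
        if a.activity == activity and latest <= a.timestamp <= t:
            current, latest = a.certainty, a.timestamp
    return current


# Example: the rating rises, dips during a brief distraction, then drops to 0.
ratings = [
    Annotation(0.0, "preparing a snack", Certainty.POSSIBLY),         # preparation only
    Annotation(20.0, "preparing a snack", Certainty.DEFINITELY),
    Annotation(65.0, "preparing a snack", Certainty.LIKELY),          # brief focus elsewhere
    Annotation(300.0, "preparing a snack", Certainty.NOT_HAPPENING),  # long interruption
]
```

Storing only the changes keeps the file small, and the rating at any instant can still be recovered with a lookup such as certainty_at(ratings, "preparing a snack", 30.0).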

We are currently testing a user interface that makes it relatively  
easy to apply these labels, and would welcome any comments you might  
have about this or other strategies to introduce confidence levels  
into annotation sets (preferably without requiring multiple  
independent coders).  If you don't see the value of this exercise,  
we'd appreciate that perspective too.

Thanks,

Jason


From jfogarty at cs.washington.edu  Mon Aug  3 22:47:17 2009
From: jfogarty at cs.washington.edu (James Fogarty)
Date: Mon, 3 Aug 2009 19:47:17 -0700
Subject: [Datasets] Annotating home activity
In-Reply-To: <046D7769-51DA-4C7A-A71A-ECB89AF24D4B@media.mit.edu>
References: <046D7769-51DA-4C7A-A71A-ECB89AF24D4B@media.mit.edu>
Message-ID: <E2763DC1E6046B43B39C6EB30B552E6A0B6EE24426@exchsrv1>

One issue that came up in our coding of office activities was time granularity.  Instead of coding things at extremely high granularity, I seem to recall we decided to code whether each activity occurred in each 5-second window.  This made it easier to code multiple activities in a single pass, which was important to us because we had a lot of overlapping activities (sitting, typing, looking at the computer monitor).
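
As a rough illustration of window-based coding, the sketch below marks which activities occurred in each 5-second window. The function name code_windows, the (activity, start, end) interval format in seconds, and the half-open treatment of intervals are assumptions for this example; it is not the procedure used in the study mentioned above.

```python
import math
from typing import List, Set, Tuple

WINDOW_SECONDS = 5  # window length from the description above


def code_windows(intervals: List[Tuple[str, float, float]],
                 total_seconds: float) -> List[Set[str]]:
    """Return, for each window, the set of activities that occurred in it.

    Each interval is (activity, start, end) in seconds and is treated as
    half-open, so an activity ending exactly on a window boundary is not
    counted in the following window.
    """
    n = math.ceil(total_seconds / WINDOW_SECONDS)
    windows: List[Set[str]] = [set() for _ in range(n)]
    for activity, start, end in intervals:
        first = max(int(start // WINDOW_SECONDS), 0)
        last = min(math.ceil(end / WINDOW_SECONDS), n)  # exclusive upper bound
        for i in range(first, last):
            windows[i].add(activity)
    return windows


# Overlapping activities land in the same window without extra passes.
observed = [
    ("sitting", 0, 14),
    ("typing", 3, 7),
    ("looking at the computer monitor", 6, 12),
]
coded = code_windows(observed, total_seconds=15)
```

Because every window simply holds a set of labels, overlapping activities need no special handling, at the cost of losing timing detail finer than the window length.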

Hopefully that makes sense.  Our CHI_2003/TOCHI_2005 paper might have more detail, or I'd be willing to talk sometime about how we did our coding.  Not sure whether it was a good or bad approach, but I can at least tell you what it was.  :)

James

--
James A. Fogarty, Assistant Professor
Computer Science & Engineering, University of Washington

http://www.cs.washington.edu/homes/jfogarty/
