[Datasets] Annotating home activity

Mon Aug 3 11:09:36 EDT 2009

As part of our home data collection project, we will be making our  
multimodal sensor datasets publicly available for use in the research  
community.  In order to enhance the value of these resources, we plan  
to provide observational annotations for a broad selection of  
high-level activities that typically occur in the home.  Sample  
activities to be annotated include brushing teeth, exercising, mopping,  
ironing, and preparing a snack.  A full list of the proposed items is  
available at: http://boxlab.wikispaces.com/Annotation

In order to reflect the ambiguity that naturally occurs as an  
individual transitions among (often simultaneous) activities, we are  
proposing a discrete rating scale that will be applied to activity  
labels according to the annotator’s confidence that the indicated  
behavior is currently happening.  We are presently seeking feedback on  
this strategy. The proposed “certainty values” are as follows:
	- Not happening
	- Possibly
	- Likely
	- Definitely

These values will be time-stamped and stored as integers (0-3) in the  
annotation data file along with the activity labels.  While an  
activity is ongoing, annotators will be encouraged to modify their  
certainty ratings according to whether the observed behavior increases  
or decreases the likelihood that the activity is occurring. Certainty  
ratings should reflect ambiguity from a number of sources.  Here are  
some suggestions for how this system might be applied:
	- Behaviors that indicate preparation for an activity but are not  
intrinsic to the common-sense definition of the activity might be  
included in the annotation, but at lower certainty values
	- If an individual is engaged in an ongoing activity but briefly  
focuses on another task, the pause may be reflected by marking the  
first activity with lower certainty
	- Long pauses that clearly interrupt the current activity may result  
in a rating of “Not happening” for that activity
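
To make the idea concrete, here is a rough Python sketch of how the time-stamped ratings might be represented and queried. It is only an illustration, not our actual file format or tool: the names Certainty, RatingChange, and certainty_at are hypothetical, the mapping of labels to 0-3 is assumed from the order of the list above, timestamps are assumed to be seconds from the start of a recording, and an activity with no rating yet is assumed to be "Not happening".

```python
from dataclasses import dataclass
from enum import IntEnum


class Certainty(IntEnum):
    """Proposed certainty values; the 0-3 mapping is assumed from list order."""
    NOT_HAPPENING = 0
    POSSIBLY = 1
    LIKELY = 2
    DEFINITELY = 3


@dataclass(frozen=True)
class RatingChange:
    """One time-stamped change in the annotator's certainty for one activity."""
    timestamp: float  # seconds from start of recording (assumed unit)
    activity: str
    certainty: Certainty


def certainty_at(changes, activity, t):
    """Return the most recent rating for `activity` at or before time `t`.

    An activity that has not been rated yet is treated as NOT_HAPPENING
    (an assumption of this sketch).
    """
    current = Certainty.NOT_HAPPENING
    for change in sorted(changes, key=lambda c: c.timestamp):
        if change.timestamp > t:
            break
        if change.activity == activity:
            current = change.certainty
    return current


# Hypothetical annotation of "ironing", following the suggestions above.
timeline = [
    RatingChange(0.0, "ironing", Certainty.POSSIBLY),         # preparation: setting up the board
    RatingChange(45.0, "ironing", Certainty.DEFINITELY),      # pressing clothes
    RatingChange(300.0, "ironing", Certainty.LIKELY),         # brief focus on another task
    RatingChange(320.0, "ironing", Certainty.DEFINITELY),     # back to ironing
    RatingChange(600.0, "ironing", Certainty.NOT_HAPPENING),  # long interruption
]

rating = certainty_at(timeline, "ironing", 310.0)
print(rating.name, int(rating))
```

Storing only the changes keeps the data file small, and the rating in force at any moment can be recovered by looking up the most recent change for that activity.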

We are currently testing a user interface that makes it relatively  
easy to apply these labels, and would welcome any comments you might  
have about this or other strategies to introduce confidence levels  
into annotation sets (preferably without requiring multiple  
independent coders).  If you don't see the value of this exercise,  
we'd appreciate that perspective too.

Thanks,

Jason