[Datasets] Annotating home activity
James Fogarty
jfogarty at cs.washington.edu
Mon Aug 3 22:47:17 EDT 2009
One issue that came up in our coding of office activities was time granularity. Instead of coding things at an extremely fine granularity, I seem to recall we decided to code whether each activity occurred in each 5-second window. This made it easier to code multiple activities in a single pass, which was important to us because we had a lot of overlapping activities (sitting, typing, looking at the computer monitor).
Hopefully that makes sense. Our CHI 2003/TOCHI 2005 paper might have more detail, or I'd be willing to talk sometime about how we did our coding. Not sure whether it was a good or bad approach, but I can at least tell you what it was. :)
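To make the 5-second window idea concrete, here is a minimal sketch in Python. It is not the procedure from the paper: the function name, the (activity, start, end) interval format, and the choice to derive windows from intervals (rather than having a coder mark each window directly) are all assumptions made for illustration.

```python
import math

WINDOW_SECONDS = 5  # window length mentioned in the message


def code_windows(intervals, total_seconds, window=WINDOW_SECONDS):
    """Return a list of sets; entry i holds the activities seen in window i.

    `intervals` is a hypothetical input format: (activity, start, end)
    tuples in seconds, treated as half-open ranges [start, end).
    """
    n_windows = math.ceil(total_seconds / window)
    windows = [set() for _ in range(n_windows)]
    for activity, start, end in intervals:
        if end <= start:
            continue  # skip empty or malformed intervals
        first = int(start // window)
        last_exclusive = min(math.ceil(end / window), n_windows)
        for i in range(first, last_exclusive):
            windows[i].add(activity)
    return windows


# Overlapping activities end up in the same window.
example = code_windows([("sitting", 0, 12), ("typing", 3, 7)], total_seconds=15)
```

Each window simply records every activity that touched it, so overlapping activities such as sitting and typing need no special handling.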
James
--
James A. Fogarty, Assistant Professor
Computer Science & Engineering, University of Washington
http://www.cs.washington.edu/homes/jfogarty/
-----Original Message-----
From: datasets-bounces at mit.edu [mailto:datasets-bounces at mit.edu] On Behalf Of Jason Nawyn
Sent: Monday, August 03, 2009 8:10 AM
To: datasets at mit.edu
Subject: [Datasets] Annotating home activity
As part of our home data collection project, we will be making our
multimodal sensor datasets publicly available for use in the research
community. In order to enhance the value of these resources, we plan
to provide observational annotations for a broad selection of high-
level activities that typically occur in the home. Sample activities
to be annotated include: brushing teeth, exercising, mopping, ironing,
preparing a snack, etc. A full list of the proposed items is
available at: http://boxlab.wikispaces.com/Annotation
In order to reflect the ambiguity that naturally occurs as an
individual transitions among (often simultaneous) activities, we are
proposing a discrete rating scale that will be applied to activity
labels according to the annotator's confidence that the indicated
behavior is currently happening. We are presently seeking feedback on
this strategy. The proposed "certainty values" are as follows:
- Not happening
- Possibly
- Likely
- Definitely
These values will be time stamped and stored as integers (0-3) in the
annotation data file along with the activity labels. While an
activity is ongoing, annotators will be encouraged to modify their
certainty ratings according to whether the observed behavior increases
or decreases the likelihood that the activity is occurring. Certainty
ratings should reflect ambiguity from a number of sources. Here are
some suggestions for how this system might be applied:
- Behaviors that indicate preparation for an activity but are not
intrinsic to the common sense definition of the activity might be
included in the annotation, but at lower certainty values
- If an individual is engaged in an ongoing activity but briefly
focuses on another task, the pause may be reflected by marking the
first activity with lower certainty
- Long pauses that clearly interrupt the current activity may result
in a rating of "Not happening" for that activity
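As a rough illustration of how such time-stamped ratings might be represented and queried, consider the Python sketch below. The record layout, the names, the use of seconds as the time unit, and the mapping of 0 through 3 onto the four labels in the order listed are assumptions; the actual annotation file format is not specified here.

```python
from dataclasses import dataclass
from enum import IntEnum


class Certainty(IntEnum):
    # Assumed mapping: the labels in the order listed, numbered 0-3.
    NOT_HAPPENING = 0
    POSSIBLY = 1
    LIKELY = 2
    DEFINITELY = 3


@dataclass(frozen=True)
class Rating:
    timestamp: float      # seconds from start of recording (assumed unit)
    activity: str
    certainty: Certainty


def certainty_at(ratings, activity, t):
    """Most recent rating for `activity` at or before time t.

    Returns NOT_HAPPENING if the activity has not been rated yet.
    """
    current = Certainty.NOT_HAPPENING
    for r in sorted(ratings, key=lambda r: r.timestamp):
        if r.timestamp > t:
            break
        if r.activity == activity:
            current = r.certainty
    return current


# Hypothetical ironing episode following the three suggestions above.
ratings = [
    Rating(10, "ironing", Certainty.POSSIBLY),        # preparation
    Rating(30, "ironing", Certainty.DEFINITELY),      # activity under way
    Rating(90, "ironing", Certainty.LIKELY),          # brief focus elsewhere
    Rating(100, "ironing", Certainty.DEFINITELY),     # resumed
    Rating(200, "ironing", Certainty.NOT_HAPPENING),  # long interruption
]
```

Under this layout only changes in certainty are stored, and the value at any moment is the most recent rating for that activity.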
We are currently testing a user interface that makes it relatively
easy to apply these labels, and would welcome any comments you might
have about this or other strategies to introduce confidence levels
into annotation sets (preferably without requiring multiple
independent coders). If you don't see the value of this exercise,
we'd appreciate that perspective too.
Thanks,
Jason