[Datasets] Annotating home activity
Jason Nawyn
nawyn at media.mit.edu
Mon Aug 3 11:09:36 EDT 2009
As part of our home data collection project, we will be making our
multimodal sensor datasets publicly available for use in the research
community. In order to enhance the value of these resources, we plan
to provide observational annotations for a broad selection of
high-level activities that typically occur in the home. Sample
activities to be annotated include brushing teeth, exercising,
mopping, ironing, and preparing a snack. A full list of the proposed
items is available at: http://boxlab.wikispaces.com/Annotation
In order to reflect the ambiguity that naturally occurs as an
individual transitions among (often simultaneous) activities, we are
proposing a discrete rating scale that will be applied to activity
labels according to the annotator’s confidence that the indicated
behavior is currently happening. We are presently seeking feedback on
this strategy. The proposed “certainty values” are as follows:
- Not happening
- Possibly
- Likely
- Definitely
These values will be timestamped and stored as integers (0-3) in the
annotation data file along with the activity labels. While an
activity is ongoing, annotators will be encouraged to modify their
certainty ratings according to whether the observed behavior increases
or decreases the likelihood that the activity is occurring. Certainty
ratings should reflect ambiguity from a number of sources. Here are
some suggestions for how this system might be applied:
- Behaviors that indicate preparation for an activity but are not
intrinsic to the common-sense definition of the activity might be
included in the annotation, but at lower certainty values
- If an individual is engaged in an ongoing activity but briefly
focuses on another task, the pause may be reflected by marking the
first activity with lower certainty
- Long pauses that clearly interrupt the current activity may result
in a rating of “Not happening” for that activity
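To make the suggestions above concrete, here is a minimal Python sketch
of how time-stamped certainty ratings might be represented and queried.
It is illustrative only and not the project's file format: the
integer-to-label mapping (list order, 0 = Not happening through
3 = Definitely), the Rating record, the certainty_at helper, and the
example times are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import IntEnum


class Certainty(IntEnum):
    # Assumed mapping: the post gives the range 0-3 but does not say
    # which integer goes with which label; list order is used here.
    NOT_HAPPENING = 0
    POSSIBLY = 1
    LIKELY = 2
    DEFINITELY = 3


@dataclass(frozen=True)
class Rating:
    """One time-stamped certainty change for one activity label (hypothetical record)."""
    timestamp: datetime
    activity: str
    certainty: Certainty


def certainty_at(ratings, activity, when):
    """Return the rating in effect for `activity` at time `when`.

    Assumes a rating stays in effect until the next rating for the same
    activity, and that an activity with no rating yet is NOT_HAPPENING.
    """
    current = Certainty.NOT_HAPPENING
    for r in sorted(ratings, key=lambda r: r.timestamp):
        if r.activity == activity and r.timestamp <= when:
            current = r.certainty
    return current


start = datetime(2009, 8, 3, 18, 0, 0)  # hypothetical session start


def at(minutes):
    """Time offset from the hypothetical session start."""
    return start + timedelta(minutes=minutes)


snack = "preparing a snack"
ratings = [
    Rating(at(0), snack, Certainty.POSSIBLY),        # preparatory behavior
    Rating(at(2), snack, Certainty.DEFINITELY),      # activity clearly under way
    Rating(at(5), snack, Certainty.LIKELY),          # brief focus on another task
    Rating(at(6), snack, Certainty.DEFINITELY),      # back to the activity
    Rating(at(10), snack, Certainty.NOT_HAPPENING),  # long pause interrupts it
]

for minute in (1, 3, 5.5, 30):
    level = certainty_at(ratings, snack, at(minute))
    print(f"t+{minute} min: {int(level)} ({level.name})")
```

Storing only the changes in certainty, rather than a value for every
time step, keeps the record small and matches the idea of annotators
adjusting a rating while an activity is ongoing.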
We are currently testing a user interface that makes it relatively
easy to apply these labels, and would welcome any comments you might
have about this or other strategies to introduce confidence levels
into annotation sets (preferably without requiring multiple
independent coders). If you don't see the value of this exercise,
we'd appreciate that perspective too.
Thanks,
Jason