Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes
Jacob O. Wobbrock
The Information School, University of Washington, Mary Gates Hall, Box 352840, Seattle, WA 98195-2840, wobbrock@u.washington.edu
Andrew D. Wilson
Microsoft Research
One Microsoft Way, Redmond, WA 98052, awilson@microsoft.com
Yang Li
Computer Science & Engineering, University of Washington, The Allen Center, Box 352350, Seattle, WA 98195-2350, yangli@cs.washington.edu
Summary:
Introduction:
The article presents $1, a cheap, compact gesture recognizer that needs only basic geometry and trigonometry in its implementation. With pens, fingers, and wands offering ever more opportunities for gestural input, there is a need for a gesture recognition algorithm that does not require expertise in AI or pattern matching. In this article, $1 is compared to two other well-known gesture recognition algorithms.
Criteria for the $1 recognizer:
1. be resilient to variations in sampling due to movement speed or sensing;
2. support optional and configurable rotation, scale, and position invariance;
3. require no specialized mathematical techniques;
4. be short enough to implement in few lines of code;
5. run fast enough for interactive use;
6. be able to learn new gestures from a single example;
7. return an N-best list with [0,1] scores that are independent of the number of input points;
8. provide recognition rates competitive with previous algorithms used in HCI over the same set of gestures.
$1 executes in four steps (sketched in code after this list):
1. The path is resampled to a fixed number of equidistantly spaced points: an average increment between points is computed, and a new point is interpolated whenever the accumulated distance along the path exceeds this increment. The candidate gesture and the template thus end up with the same number of points, which enables a point-by-point comparison between the template and candidate paths.
2. $1 searches for the angle that results in the best alignment between the two point paths. To make this search cheap, a rotation trick is applied first: each path is "rotated to zero," so that its indicative angle, the angle from the gesture's first point to its centroid, becomes 0.
3. The gesture is then scaled non-uniformly to a reference square. This ensures that any remaining pairwise point differences between the candidate and template paths are due to rotation rather than aspect ratio. The gesture is then translated so that its centroid lies at the origin.
4. A search for the global minimum is performed over the best angle and template: the template whose cumulative point-to-point distance to the candidate path is smallest wins. This may involve further rotation of the candidate path.
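For concreteness, here is a minimal Python sketch of the preprocessing in steps 1-3, assuming the paper's suggested constants of N = 64 resampled points and a 250-unit reference square; the helper names are my own, not the paper's.

import math

N = 64          # number of resampled points (the paper suggests N = 64)
SQUARE = 250.0  # side of the reference square (the paper's pseudocode uses 250)

def resample(points, n=N):
    # Step 1: resample the path into n equidistantly spaced points.
    path_len = sum(math.dist(points[i - 1], points[i])
                   for i in range(1, len(points)))
    interval = path_len / (n - 1)
    pts = list(points)
    new_points = [pts[0]]
    d = 0.0
    i = 1
    while i < len(pts):
        seg = math.dist(pts[i - 1], pts[i])
        if d + seg >= interval:
            t = (interval - d) / seg
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            new_points.append(q)
            pts.insert(i, q)       # q becomes the start of the next segment
            d = 0.0
        else:
            d += seg
        i += 1
    while len(new_points) < n:     # floating-point rounding can leave one short
        new_points.append(pts[-1])
    return new_points

def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def rotate_by(points, theta):
    # Rotate all points around the centroid by theta radians.
    cx, cy = centroid(points)
    c, s = math.cos(theta), math.sin(theta)
    return [((x - cx) * c - (y - cy) * s + cx,
             (x - cx) * s + (y - cy) * c + cy) for x, y in points]

def rotate_to_zero(points):
    # Step 2: rotate so the indicative angle (first point to centroid) is 0.
    cx, cy = centroid(points)
    theta = math.atan2(cy - points[0][1], cx - points[0][0])
    return rotate_by(points, -theta)

def scale_to_square(points, size=SQUARE):
    # Step 3a: scale non-uniformly to a size-by-size reference square.
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    w = (max(xs) - min(xs)) or 1e-6   # guard against degenerate 1-D gestures
    h = (max(ys) - min(ys)) or 1e-6
    return [(x * size / w, y * size / h) for x, y in points]

def translate_to_origin(points):
    # Step 3b: translate so the centroid sits at the origin.
    cx, cy = centroid(points)
    return [(x - cx, y - cy) for x, y in points]

def preprocess(points):
    return translate_to_origin(scale_to_square(rotate_to_zero(resample(points))))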
Rotation Invariance:
Iterating over every possible rotation when matching candidate and template paths would be wasteful, so $1 minimizes iterations with a seed-and-search approach. Hill climbing was found to always find a match where similar gesture pairs were concerned; with dissimilar pairs this was not the case, as the results were plagued by local minima and a sharp increase in the number of iterations. A middle ground was found in the Golden Section Search, which, with the bounds and threshold used, is guaranteed to terminate within 10 iterations.
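Below is a sketch of that Golden Section Search over candidate rotations, reusing the helpers from the previous sketch. The +/-45 degree bounds, 2 degree threshold, and the [0,1] score conversion follow the paper's pseudocode, but treat the details as an approximation rather than the reference implementation.

PHI = 0.5 * (math.sqrt(5) - 1)    # golden ratio constant used by the search

def path_distance(a, b):
    # Mean distance between corresponding points of two resampled paths.
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def distance_at_angle(points, template, theta):
    return path_distance(rotate_by(points, theta), template)

def distance_at_best_angle(points, template,
                           a=-math.radians(45), b=math.radians(45),
                           threshold=math.radians(2)):
    # Golden Section Search for the rotation that minimizes path distance;
    # with these bounds it converges within about 10 iterations.
    x1 = PHI * a + (1 - PHI) * b
    f1 = distance_at_angle(points, template, x1)
    x2 = (1 - PHI) * a + PHI * b
    f2 = distance_at_angle(points, template, x2)
    while abs(b - a) > threshold:
        if f1 < f2:
            b, x2, f2 = x2, x1, f1
            x1 = PHI * a + (1 - PHI) * b
            f1 = distance_at_angle(points, template, x1)
        else:
            a, x1, f1 = x1, x2, f2
            x2 = (1 - PHI) * a + PHI * b
            f2 = distance_at_angle(points, template, x2)
    return min(f1, f2)

def recognize(points, templates):
    # Step 4: return the best-matching template name and a [0,1] score.
    candidate = preprocess(points)
    best_name, best_d = None, float("inf")
    for name, tmpl in templates:          # templates are preprocessed paths
        d = distance_at_best_angle(candidate, tmpl)
        if d < best_d:
            best_name, best_d = name, d
    half_diag = 0.5 * math.sqrt(2 * SQUARE ** 2)
    return best_name, 1.0 - best_d / half_diag

A template set would be built once by preprocessing one recorded example per gesture, e.g. templates = [("circle", preprocess(circle_points)), ("arrow", preprocess(arrow_points))], where circle_points and arrow_points stand in for hypothetical recorded point lists.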
Limitations of the $1 Recognizer:
$1 is a geometric template matching algorithm. Because candidates are normalized for rotation, scale, and position, it cannot tell apart gestures whose identity depends on orientation, aspect ratio, or location, and because paths are resampled, it cannot distinguish gestures that differ only in speed or timing.
Evaluation:
Ten subjects participated, entering gestures on an HP iPAQ h4355 Pocket PC. Sixteen gesture types were used. After one practice gesture, the subjects performed three sets of four gestures at slow, medium, and fast speeds, making all gestures as accurately as possible.
The performance of $1 was compared to the Rubine recognizer and Dynamic Time Warping (DTW).
Results:
Recognition errors for $1 and DTW were 0.98% (SD=3.63) and 0.85% (SD=3.27), respectively, i.e., recognition rates of 99.02% and 99.15%.
Rubine trailed both, with a recognition rate of 92.02%.
Discussion:
The motivation behind $1 is particularly compelling: many developers would be far more willing to incorporate gesture recognition into their applications knowing they could start with a relatively simple algorithm. The comparison with other algorithms, the discussion of the algorithm's limitations, and the pseudocode at the end of the article are particularly helpful.