Sensing foot gestures from the pocket
Authors - Jeremy Scott, David Dearman, Koji Yatani, Khai N. Truong
Authors Bios - Jeremy Scott is a graduate student at MIT and received his Bachelor's from the University of Toronto.
David Dearman is a PhD student at the University of Toronto.
Koji Yatani is a PhD student at the University of Toronto interested in interactive systems.
Khai Truong is an associate professor at the University of Toronto specifically interested in human-computer interaction.
Venue - This paper appears in the proceedings of UIST '10, the 23rd Annual ACM Symposium on User Interface Software and Technology.
Summary
Hypothesis - In this paper, the researchers explore using foot gestures to perform tasks in a mobile setting and develop a system that supports them. Their primary goal is to let users perform tasks without having to rely on visual attention or visual feedback; they also hypothesize that such a system can learn from users over time to recognize gestures more accurately.
Methods - The researchers first studied which foot gestures could be used in such a system. The four gestures explored were dorsiflexion, plantar flexion, heel rotation, and toe rotation. Participants were asked to hold down a mouse button, perform one of these gestures by rotating to a specified angle from the start position, and release the button to indicate completion. The setup consisted of six motion-capture cameras and a laptop that told the participant which task to perform and recorded the data received from the cameras. Participants began with a training phase of 156 gestures that included visual feedback on their progress, followed by a testing phase of 468 gestures with no visual feedback. Lastly, participants were interviewed and asked to rank the gestures in order of preference. A second study, using the same procedure and equipment, later tested the researchers' machine learning algorithms by using different numbers of users' data as training data and the rest as test data. Based on the results of the first study (see below), heel rotation and plantar flexion were the only two gestures tested. Two classification procedures were used for the machine learning portion of this study: leave-one-participant-out (LOPO), which trains on all but one participant's data and tests on the remaining participant, and within-participant (WP), in which a single user performs a gesture many times, all but one of those trials are used as training data, and the remaining trial is used as test data. A brief sketch of the two procedures follows.
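To make the distinction between the two evaluation procedures concrete, here is a minimal sketch (not the authors' code) of how LOPO and WP cross-validation could be run, assuming feature vectors X, gesture labels y, and a participant ID per sample in groups. The GaussianNB classifier is only a stand-in; the paper's exact classifier is not reproduced here.

# Minimal sketch of the two evaluation schemes described above (illustrative only).
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, LeaveOneOut
from sklearn.naive_bayes import GaussianNB  # stand-in classifier, not necessarily the paper's choice

def lopo_accuracy(X, y, groups):
    # Leave-one-participant-out: train on every participant but one, test on the held-out participant.
    scores = []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
        clf = GaussianNB().fit(X[train_idx], y[train_idx])
        scores.append(clf.score(X[test_idx], y[test_idx]))
    return np.mean(scores)

def wp_accuracy(X, y):
    # Within-participant: for one participant's trials, hold out a single trial at a time.
    scores = []
    for train_idx, test_idx in LeaveOneOut().split(X):
        clf = GaussianNB().fit(X[train_idx], y[train_idx])
        scores.append(clf.score(X[test_idx], y[test_idx]))
    return np.mean(scores)

Under LOPO the classifier never sees any data from the held-out participant, which makes it the harder setting; WP trains and tests on trials from the same person, which matches the authors' suggestion of learning from an individual user.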
Results - The initial study found raising the heel, or plantar flexion, to be the most accurate and preferred gesture for vertical angles. Plantar flexion also showed a consistent error rate across all angles, whereas the other gestures' error increased as the angle increased. Among the rotation gestures, heel and toe rotation were comparable in error and range, but heel rotation was greatly preferred by the participants. The second study tested gesture recognition with the phone in a front pocket, a back pocket, and a hip mount, which resulted in successful recognition 50.8%, 34.9%, and 86.4% of the time, respectively. Higher percentages resulted when the algorithm only had to determine which gesture type was being performed (heel rotation or plantar flexion).
Contents - The researchers developed a program that recognizes foot gestures from data collected by a phone's accelerometer. The workflow consists of the user wearing the phone in a pocket, initiating the system by placing a foot at the origin and performing a double tap, and then performing the desired gesture, which the system recognizes and maps to the desired command. For this method to work well, the recognition algorithm has to be robust and able to adapt to an individual, so the researchers integrated machine learning into the workflow and ran several quick tests to guide the algorithm's development. They identified 34 features that could be used for gesture recognition and implemented them in an initial application, which was used in the second study. An illustrative sketch of this kind of feature extraction appears below.
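As a rough illustration (not the authors' implementation) of what extracting features from an accelerometer trace might look like, the sketch below computes a handful of common time-domain features from a 3-axis window of readings. The paper describes 34 features; the per-axis mean, standard deviation, energy, and pairwise correlations used here are only stand-ins for that feature set.

# Illustrative feature extraction from a window of accelerometer samples.
import numpy as np

def extract_features(window):
    # window: array of shape (n_samples, 3) holding x, y, z accelerometer readings.
    feats = []
    for axis in range(3):
        a = window[:, axis]
        feats += [a.mean(), a.std(), np.sum(a ** 2) / len(a)]  # mean, standard deviation, energy
    # Pairwise correlations between the three axes.
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        feats.append(np.corrcoef(window[:, i], window[:, j])[0, 1])
    return np.array(feats)

Feature vectors like this would then be fed to a classifier such as the one sketched under Methods, trained either across participants (LOPO) or on a single user's own trials (WP).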
Conclusion - The researchers conclude that foot gestures are a viable option for user input. Because the WP procedure performed significantly better than LOPO, they also conclude that such a system is best deployed by first learning from an individual user's own data before that person uses it.
Discussion
I think the researchers achieved their goal of demonstrating the viability of foot gestures as input, and their testing mostly convinced me of this. I would also have liked to see a framework that other developers could use to begin applying this technology in real-world applications, because I think its usefulness is still questionable to most people. Nevertheless, this technology could prove crucial to the future of device interfacing and help make interacting with devices a transparent task that does not depend on visual feedback.