Taking Advice from Intelligent Systems: The Double-Edged Sword of Explanations
Authors - Kate Ehrlich, Susanna Kirk, John Patterson, Jamie Rasmussen, Steven Ross, and Daniel Gruen
Authors Bios - Kate Ehrlich is a senior technical staff member at IBM Research and has a PhD from the University of Sussex.
Susanna Kirk is a principal lead at IBM Research.
John Patterson is a Distinguished Engineer in the Collaborative User Experience Research Group within IBM.
Jamie Rasmussen is an advisory software engineer at IBM Research and has an undergraduate degree from MIT.
Steven Ross is a senior technical staff member at IBM Research and attended MIT.
Daniel Gruen is a Research Scientist at the Cambridge Research Center and works with IBM Research as well.
Venue - This paper was presented at CHI '11, the 2011 annual conference on Human Factors in Computing Systems.
Summary
Hypothesis - The researchers note that intelligent systems help professionals make decisions in many different fields, and most of these systems provide an explanation for the choices they recommend, yet very little research has examined whether these recommendations and explanations can mislead users. The hypothesis is that intelligent systems have a significant impact on user choice and can lead users into making errors.
Methods - To test the hypothesis, the researchers conducted a study with network security professionals using a prototype security environment (NIMBLE) that displays threats to users and makes suggestions based on the output of a model trained on previous actions. 19 participants were selected for the study. Each participant was given time to practice using the console before beginning the approximately two-hour session. Each session consisted of 24 timed trials, and each trial consisted of ranking a set of 11 threat types by importance (priority). The system would make suggestions for the ranking of a particular threat, and these suggestions could be right or wrong. The following variables were tested:
- Recommendation - There were 3 recommendation conditions: 1) no recommendation, 2) recommendation only, and 3) recommendation and justification.
- Recommendation Correctness - The recommendations either 1) included 1 correct choice out of the 3 recommendations or 2) included no correct choices out of the 3 recommendations.
- Suggestion helpfulness was also recorded on a Likert scale.
Results - The researchers found that when a correct choice was present, the suggestion and justification conditions significantly improved accuracy compared to the baseline condition. When no correct choice was present, no significant decline in accuracy was found. The researchers also found that participants made significant use of the suggestions.
Content - The researchers found support for the view that professionals make their own decisions first and then consult suggestions as a way to check their work, but the value of suggestions never outweighs the value of their own judgment. The researchers also note that suggestions and justifications can lead users to select wrong answers, so intelligent systems should verify that confidence levels are met before recommending certain choices.
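The confidence check described above could be sketched as a simple gate that withholds low-confidence suggestions. This is a hypothetical illustration, not the paper's implementation; the function name, scoring scheme, and threshold value are all assumptions.

```python
# Hypothetical sketch of a confidence-gated recommender (not from the paper).
# A suggestion is shown to the user only when the model's confidence in its
# top-ranked choice clears a threshold; otherwise no suggestion is made.

CONFIDENCE_THRESHOLD = 0.8  # illustrative value, not from the study


def gated_recommendation(threat, model_scores):
    """Return (rank, justification) only if the model is confident enough.

    model_scores maps candidate priority ranks to confidence values in [0, 1].
    Returns None when confidence is below the threshold, so the interface
    can fall back to showing no recommendation at all.
    """
    best_rank = max(model_scores, key=model_scores.get)
    confidence = model_scores[best_rank]
    if confidence < CONFIDENCE_THRESHOLD:
        return None  # withhold the suggestion rather than risk misleading the user
    justification = f"Ranked {best_rank} with confidence {confidence:.2f}"
    return best_rank, justification


# A confident model yields a suggestion; an uncertain one yields None.
print(gated_recommendation("port scan", {1: 0.9, 2: 0.1}))
print(gated_recommendation("port scan", {1: 0.5, 2: 0.5}))
```

Gating on confidence trades coverage for trustworthiness: fewer suggestions are shown, but the ones that appear are less likely to mislead, which matches the paper's concern about justified-but-wrong recommendations.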
Conclusion - The researchers conclude by stating that providing justifications for recommendations can both enhance and hurt user decisions, so intelligent systems that make recommendations should strive to provide excellent reasoning for their decisions and point out potential flaws.
Discussion
I think the researchers support their hypothesis well in that they showed how explanations can both help and hurt a user who must make a decision. I also find the subject matter interesting because it investigated a negative aspect of intelligent systems, whereas most studies we have read have focused on the positive aspects.