
The Truth About Webcam Eye Tracking

By now everyone has probably heard of webcam eye tracking. If you haven't, it is exactly what it sounds like - detecting a person's gaze location using a webcam instead of a "real" eye tracker with all the bells and whistles, including infrared illuminators and high sensitivity cameras.

Because webcam eye tracking doesn't require any specialized equipment, participants don't have to come to a lab. They are tested remotely, sitting in front of their computer at home, wearing Happy Bunny pajamas and trying to keep their fat cat from rolling onto the keyboard. The only requirements are a webcam, an Internet connection, and eyes to track.

Companies that provide webcam eye tracking services include GazeHawk and EyeTrackShop (YouEye has had a website for as long as I can remember but their webcam eye tracking doesn't seem to be available yet). These companies recruit participants, administer the study, create visualizations, and report data by area-of-interest.

After reading about a few of their studies, including EyeTrackShop's recent Facebook vs. Google+ hit, I decided to experience webcam eye tracking first-hand. I signed up to be a participant.

One day I was sitting in a comfy leather chair at a Starbucks with my laptop in my lap, working on chapter 10 of my book on eye tracking, when an "Earn up to $4" email came through to let me know I had a GazeHawk study waiting for me. Because any distraction is a welcome distraction when I'm writing (in case you're wondering what's taking me so long), I decided to take an educational break and follow the link.

The first page informed me about how to prepare for the test.

Step 1. Preparing for your GazeHawk test

All screenshots courtesy of GazeHawk (thanks!).

The lighting at Starbucks seemed to match the "DO" pictures, so I proceeded. I was then asked for access to my webcam, which I promptly granted, only to discover that even a webcam adds 10 pounds.

Step 2. Setting up your camera

Center your face in the picture

I centered my face in the window and was ready for calibration.

Step 3. Calibrate for testing

I followed the red dot as instructed.

Checking your calibration

While my "results" were being uploaded (which seemed like a lifetime), I managed to check Twitter (follow me!), finish my pumpkin bread, send a few text messages, and help a kid plug his laptop into the outlet behind my chair. I then realized the instructions on the screen said not to move my head during the upload. Oops. Even though there was no mention of moving the webcam (or the whole laptop), I figured I shouldn't have done that either. Good thing I didn't get up to get a refill!

When the screen was finally ready for me to start testing, I was several minutes older and my calibration was probably already invalid, but since the interface didn't request a do-over, I continued with the study.

Step 4. Run a test

Step 4 of the process provided a scenario (auto insurance shopping) and instructed me to press Escape when I was done. I wasn't quite sure what I was supposed to be done with before pressing Escape, but I clicked on "Start Testing" anyway. An insurance company homepage appeared.

A second or two later my Outlook displayed an email notification and, compulsive email checker that I am, I opened the email. I also had to open it on my BlackBerry or the red light would have kept blinking for a while and I can't stand that.

I came back to the insurance homepage and looked around for a while but nothing was clickable - the page appeared to be static. I then remembered the instructions I saw previously, and the Escape key saved the day.

Based on this experience, a few other studies I participated in, and my conversations with people involved with GazeHawk and EyeTrackShop, I made a list of what I believe are the main limitations of this new technology:

  1. Webcam eye tracking has much lower accuracy than real eye trackers. While a typical remote eye tracker (e.g., Tobii T60) has an accuracy of 0.5 degrees of visual angle, a webcam will produce an accuracy of 2 - 5 degrees, provided that the participant is NOT MOVING. To give you an idea of what that means, five degrees correspond to about 2.5 inches (6 cm) on a computer monitor (assuming a viewing distance of 27 inches), so the actual gaze location could be anywhere within a radius of about 2.5 inches from the gaze location recorded with a webcam. I don't know about you, but I wouldn't be comfortable with that level of inaccuracy in my studies.
  2. Accuracy drops even further when participants move their heads, and the longer the session, the more likely this is to happen. Therefore, webcam eye tracking sessions have to be very short - typically less than 5 minutes, but ideally less than a minute. Studies conducted with real eye trackers, on the other hand, can last a lot longer with little impact on accuracy.
  3. Currently, webcam eye tracking can handle only single static pages. All four studies I have participated in and a few I read about were one-page studies. Without allowing participants to click on anything and go to another page, the applicability of webcam eye tracking is limited. This constraint also lowers the external validity of the studies.
  4. The rate at which the gaze location is sampled is much lower for webcams than real eye trackers. The typical frame rate of a remote (i.e., non-wearable) eye tracker is between 60 and 500 Hz (i.e., images per second). The webcam frame rate is somewhere between 5 and 30 Hz. The low frame rate makes analyzing fixations and saccades impossible. The analysis is limited to looking at rough gaze points.
  5. Due to imperfect lighting conditions, poor webcams, on-screen distractions, participants' head movement, and overall lower tracking robustness, out of every 10 people who participate in a study, only 3 - 7 will provide sufficiently useful data. While this may not be a problem in and of itself because of very low oversampling costs, what makes me uncomfortable is not knowing how the determination to exclude data from the analysis is made. Data cleansing is important in any study but it is absolutely critical in webcam eye tracking. Exclusion criteria should be made explicit for webcam eye tracking to gain trust among researchers.
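
The inches figure in item 1 follows from basic trigonometry. Here is a minimal Python sketch of that conversion, assuming a flat screen viewed head-on. The function name `gaze_error_on_screen` is made up for illustration; the 27-inch viewing distance and the angles come from the list above.

```python
import math

CM_PER_INCH = 2.54


def gaze_error_on_screen(angle_deg: float, viewing_distance_in: float) -> float:
    """Return the on-screen distance (in inches) subtended by a visual angle.

    Assumes the gaze point is straight ahead of the viewer on a flat screen,
    so the offset is distance * tan(angle).
    """
    return viewing_distance_in * math.tan(math.radians(angle_deg))


if __name__ == "__main__":
    # 0.5 degrees: typical remote eye tracker; 2 and 5 degrees: webcam range.
    for angle in (0.5, 2.0, 5.0):
        inches = gaze_error_on_screen(angle, viewing_distance_in=27.0)
        print(f"{angle} deg -> {inches:.2f} in ({inches * CM_PER_INCH:.1f} cm)")
```

At 27 inches, 5 degrees comes out at about 2.4 inches (6 cm), in the same ballpark as the figure in item 1, while 0.5 degrees is roughly a quarter of an inch.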

Regardless of its limitations, the contribution of webcam eye tracking to research is undeniable. Using webcams made it possible to conduct remote eye tracking studies and enjoy the benefits of remote testing, such as low cost, fast data collection, and global reach.

    While webcam eye tracking is not a substitute for in-person research that uses real eye trackers, it is a cheap option if you're looking for a quick and dirty indication of the distribution of attention on a single page (e.g., your homepage or an ad). As the technology and data collection processes employed by these services continue to improve, the applicability of webcam eye tracking will expand. Will it ever replace eye tracking as we know it? Doubtful, but I will keep an eye on it anyway.

    Comments

    Great article Aga! Thanks. I really was wondering how 'good' webcam eye trackers were. Your experience doesn't surprise me.

    Great article as always, Aga.

    Having done a few of these myself now, it's very much a You-Get-What-You-Pay-For scenario. The feeling is that "crowdsourced eyetracking" solutions are good for high-volume but high-level reactions to brands and page layout.

    The idea that moving your head would skew results and hence sessions need to be short and single screen is interesting. Naturally "real" eye trackers compensate for this! On top of the lack of fixation/saccade analysis you mentioned, this also effectively rules out more in-depth insights into the user journey, i.e., the learning/cognitive process, navigation, mental models, etc.

    There are some technical limits to webcam eyetracking, not only because of the inconsistency and quality of webcams, but also because it is missing a feedback loop between the machine and the eyes (i.e., the IR illumination and the black-box Voodoo that the likes of Tobii spend a fortune on R&D).

    Really enjoying your material Aga and looking forward to your book!

    Full disclosure: I'm a cofounder of GazeHawk

    Thanks for the great article Aga, it's great to see more exposure to webcam eye tracking.

    I don't want to cause anyone to miss the forest for the trees, so let me start by saying I think Aga's overall analysis is very good. Webcam eye tracking has a ton of potential, only a fraction of which has been realized. It is currently great for some use cases, but there are plenty of cases where it will never replace hardware eye trackers (ex: military applications and very low-level cognitive research).

    That being said, some points on GazeHawk's eye tracking capabilities:

    - Our accuracy, as reported on our blog based on real world numbers, is 70px (gazehawk.com/blog/on-accuracy), or less than an inch. This converts to ~2 degrees. I agree that 2.5 inches is too little accuracy for useful data, but the data we deliver is significantly better than that. By comparison to a T60 we find our accuracy to be about half of theirs (in the real world subjects calibrate to about 1 degree of accuracy on their system, based on conversations with eye tracking experts who have done detailed accuracy evaluations).

    - The studies you ran were on screenshots provided by our customers. We've run studies on interactive pages, though we generally do not do free-flow browsing since it leads to very non-quantitative data. That said, our reporting on multiple pages and dynamic content still needs work, so I do agree with you that this is one of the places where webcam eye tracking falls short (though not due to any technical limitation, but due to the newness of the companies offering this technology).

    - The average fixation is ~200ms when reading text, and ~350ms when watching a scene (Source: http://neuralcorrelate.com/martinez-conde_et_al_nrn_2004.pdf). The lowest framerate we've worked with without rejecting the data is 15fps. Depending on the fixation algorithm used, that means our minimum fixation length is between 133ms and 267ms. There are definitely some limitations as to use case, but I wouldn't go so far as to claim fixation analysis is impossible.

    - You are absolutely right on #5; transparency is important here. The main factor in exclusion is how confident the eye tracking is in its modeling (i.e., whether or not its model of the eye accurately represents the real world). The lack of detail is because this is one of the hardest parts of webcam eye tracking, and has a lot of proprietary and "trade secret" components to it.

    Again, thanks for the great article Aga. Looking forward to the book!
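
The 133 ms and 267 ms figures in the comment above can be reproduced with simple frame-rate arithmetic. This Python sketch assumes that a fixation detector needs some minimum number of consecutive samples and that each sample stands for one full frame interval. The thresholds of 2 and 4 samples are an inference from the numbers, not something GazeHawk states, and the function name is hypothetical.

```python
def min_fixation_ms(frame_rate_hz: float, samples_required: int) -> float:
    """Shortest fixation (ms) a detector can report at a given frame rate.

    Assumption: the detector needs `samples_required` consecutive gaze
    samples, and each sample is counted as one full frame interval.
    """
    frame_interval_ms = 1000.0 / frame_rate_hz
    return samples_required * frame_interval_ms


if __name__ == "__main__":
    # Hypothetical thresholds, inferred from the figures in the comment.
    for samples in (2, 4):
        print(f"15 fps, {samples} samples -> {min_fixation_ms(15, samples):.0f} ms")
```

Under these assumptions, 15 fps gives a frame interval of about 67 ms, so two samples span roughly 133 ms and four samples roughly 267 ms.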

    Brian, thanks so much for your comments!

    Is there any way we can test your accuracy? We obviously can test the accuracy of the eye tracking systems we have, so it would be good for someone unbiased to run a comparative study.

    Regarding fixation analysis, the average fixation duration is not really of concern here. This is more about the minimum fixation duration that we want to be able to detect, which is usually set to 80 - 100 ms (depending on the stimulus). Also, with so few samples per second (e.g., 15), there is just too much uncertainty in terms of when a fixation starts and when it ends.

    I look forward to learning about any improvements you make to this technology. Please keep us posted!
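
To put numbers on the uncertainty described in the reply above, the following Python sketch counts how many gaze samples can land inside a fixation of a given length. It assumes evenly spaced samples and a fixation that can start at any point relative to the sampling clock. The function name is chosen for illustration, and 60 Hz is used for comparison because it is the low end of the range the article gives for remote eye trackers.

```python
def samples_in_fixation(duration_ms: int, frame_rate_hz: int) -> tuple[int, int]:
    """Return the (fewest, most) samples that can fall inside one fixation.

    Samples are assumed evenly spaced; the fixation may start at any phase
    of the sampling clock. Integer math avoids floating-point edge cases.
    """
    scaled = duration_ms * frame_rate_hz  # fixation length in thousandths of a sample interval
    fewest = scaled // 1000
    most = -(-scaled // 1000)  # ceiling division
    return fewest, most


if __name__ == "__main__":
    for rate in (15, 60):
        for duration in (80, 100):
            fewest, most = samples_in_fixation(duration, rate)
            print(f"{duration} ms fixation at {rate} Hz: {fewest} to {most} samples")
```

At 15 Hz a fixation of 80 to 100 ms is covered by only one or two samples, so its start and end can each be off by up to one frame interval (about 67 ms). At 60 Hz the same fixation spans four to six samples.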

    Very useful analysis Aga, thanks so much! Question for you and Brian: What is the future of the technology? What should accuracy look like in 2-5 years?

    Great question, Darrel! There is definitely an upper bound on the accuracy of webcam eye tracking, as there are only so many pixels of eye that you can get from your average person. With a 640x480 webcam pushing for a reasonable success rate I believe this is ~1-2 degrees. Once HD webcams become more commonplace I believe this will drop down to 1 degree, but that likely won't happen for another 3 years. I think the big improvements over the next few years will be in success rate and consistency. We'll be able to track people in worse environments and with more head movement, as well as with less calibration time.

    At the same time I think custom hardware eye tracking will become cheaper and more integrated leading to an interesting division in roles. Custom hardware will be better for UI: you need a really high consistency and accuracy that you won't get with webcams. On the other hand, for UX studies and market research, everything seems to be moving toward online in-home. So I think in the end there will be 3 products:
    - High-end eye trackers for specialty research use (ex: PhD research in micro-saccades)
    - Inexpensive custom hardware for UI applications and assistive devices
    - Webcams for in-home UX studies (I don't believe the inexpensive custom hardware will be wide-spread enough to be useful for general-population studies)

    I'll let Aga comment on the in-home vs in-lab panel research methodologies, as this is a big debate in certain UX circles. However, I personally think that UX/market research will move significantly online. If there's only a 20 second calibration requirement to include eye tracking in that dataset, there's little reason not to do it.

    Hi Aga,

    My name is Magnus Linde and I work as Market Research Manager at EyeTrackShop. I didn’t see this great article until now but very much enjoyed reading both it and the comments below. Thanks!

    Overall I think your analysis of the technology gives a good and nuanced picture of what to expect from webcam eye tracking and the excellent comments from Brian at GazeHawk further clarify the huge potential of this new technology. Nevertheless I can’t pass on this opportunity to give a few comments on EyeTrackShop’s solution for web cam-based eye tracking and our view of the subject:

    First of all, I think you’re absolutely right; webcam eye tracking will not replace IR-based eye tracking just as web surveys didn’t replace telephone interviews or make face-to-face interviews obsolete. There is no such thing as a “one-size-fits-all” research approach. Instead I think webcam eye tracking should be considered a complement to other forms of eye tracking, offering great benefits such as low cost, fast turnaround and the ability to easily distribute tests geographically, i.e., it’s easy and cost effective to do large sample studies anywhere in the world. For example, EyeTrackShop has done parallel studies with hundreds of respondents across diverse countries, including all continents. To secure access to high-value audiences that are highly-engaged, thoroughly screened and meticulously segmented, we do not carry our own panels. Instead we have partnerships with some of the world’s biggest online panel providers such as Cint, uSamp and Toluna.

    So, as with all research, the methodology used for eye tracking research should be chosen with respect to the research objective – for many cases web cam eye tracking will be great, for some cases it will not.

    The EyeTrackShop webcam eye tracking solution has been developed in close cooperation with Tobii Technology and has been available to customers since December 2010. Since the first version, there have been several new releases with additional developments. Major developments include flexibility in stimuli content and build up, interfaces to other survey tools, flexibility in customer report layout and metrics, increased respondent trackability, and improved gaze data quality and quality surveillance. One feature to increase both trackability and accuracy is the EyeTrackShop interactive filter. Before calibration starts, the respondent automatically gets detailed feedback about their head position, eye position and lighting in order to optimize the eye tracking conditions. The system also rejects respondents with poor conditions where eyes are not trackable.

    To validate the EyeTrackShop solution, we’ve conducted several tests and are continuously conducting new ones as the ongoing development improves the output and results. We’ve done both precision tests and comparative/parallel tests benchmarking the EyeTrackShop results with the already proven high accuracy of Tobii eye trackers. The result from all these studies shows that under normal conditions, the standard deviation is ~2 degrees and that for an area of interest with a size equal to 6 % of a normal laptop screen, the metric “Seen Area of Interest” differs only about 10 percentage points from the result in a Tobii eye tracker (n = 30). (That said, if the “Area of Interest” is smaller than an area representing 6 % of the screen, our recommendation is to use Tobii Eye Trackers.)

    In short, the EyeTrackShop platform works great for Online Display Ads, Print Ads, Pretesting Website Design, and Packaging Design Studies. EyeTrackShop has also developed a methodology to test video and ran a beta test with Google in October this year. For studies such as Free-browsing Experience or in-depth usability studies we would recommend using a Tobii Eye Tracker.

    Webcam eye tracking offers huge benefits and is a cost and time effective alternative for large scale eye tracking studies. The full potential of the technology is far from captured and with on-going development the applicability will increase even further.

    Thanks Aga, for a great article and for addressing the topic. Looking forward to reading your book. If you need any input for the chapter on webcam eyetracking, we’ll be happy to assist ;)

    Webcam eye tracking is fundamentally flawed: uSamp, Toluna, Lightspeed and all the other panel companies reuse participants, which creates a Subject Expectancy Effect that invalidates the results regardless of how accurate the eye tracking aspect is: http://thinkeyetracking.com/2011/12/the-emperors-new-sneakers/
