Interview with Mike Ambinder of Valve Software


  • Valve Software has designed top-selling games including Left 4 Dead, Half-Life, and Team Fortress.  I recently spoke with Mike Ambinder, PhD, the company’s
    full-time experimental psychologist, to discuss the professional practices that
    ensure high-quality game experiences.

    Q: What’s your role at Valve?
    A: My job is to apply knowledge and methodologies from psychology to game
    design.  That means performing statistical analyses, developing
    playtesting methodologies, conducting design experiments, doing a little bit
    of interface design, and investigating alternative hardware, among other
    things.

    Q: How can psychology guide game design?

    A: Well for example, in the Left 4 Dead series there are several predetermined
    locations in the game called “drop points” where health items or
    weapons will spontaneously appear.  To decide what’s dropped, where, and
    when, we considered reward and reinforcement schedules, which are elements of
    behavioral psychology.  You can put things on a fixed schedule so that
    they’ll appear at regular intervals.  This makes the gameplay experience
    more predictable, and there can be real value in that.  Or you can use a
    variable schedule so that you don’t know what’s going to show up or when it’ll
    pop in.  Variable schedules can create a higher rate of engagement in the
    game and make the experience more enjoyable, because uncertainty about when
    something will occur can increase arousal.  A large component of the
    gameplay in the Left 4 Dead series
    is the use of these variable reinforcement schedules.
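
    To make the distinction concrete, here is a minimal Python sketch of the
    two schedule types.  It is an illustration only, not Valve’s
    implementation: the function names, the example interval, and the choice
    of exponentially distributed gaps for the variable schedule are all
    assumptions.

```python
import random


def fixed_schedule(interval, duration):
    """Fixed schedule: an item appears every `interval` seconds."""
    return list(range(interval, duration + 1, interval))


def variable_schedule(mean_interval, duration, rng):
    """Variable schedule: gaps between items are random.

    Gaps are drawn from an exponential distribution whose mean is
    `mean_interval` (an assumed choice), so the long-run drop rate matches
    the fixed schedule but the player cannot predict the next drop.
    """
    times = []
    t = 0.0
    while True:
        t += rng.expovariate(1.0 / mean_interval)
        if t > duration:
            return times
        times.append(t)


# Hypothetical usage: a 10-minute level with one drop every 30 s on average.
fixed_drops = fixed_schedule(30, 600)
variable_drops = variable_schedule(30, 600, random.Random(1))
```

    Both schedules hand out items at the same average rate; only the
    predictability of the timing differs.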

    Q: How is testing integrated into the design process?
    A: We’re constantly playtesting.  Our philosophy is to playtest as much as
    possible, and to start it as soon as we have a playable prototype.  Of
    course our designers are experienced and generally make good decisions about
    gameplay, but we don’t want to just assume we’ve got it right.  Game
    designs are hypotheses, and every instance of play is an experiment.

    Q: What’s your standard testing method?

    A: We use a variety of methods, but the most favored is direct observation of
    real players working their way through the game.  I’m not a fan of the
    think-aloud protocol, in part because the constant prompting detracts from the
    gameplay experience and can introduce inadvertent bias, and in part because
    people can be really bad at explaining why they do what they do.  Better
    to just sit back, watch, say nothing, and try to understand the player’s
    actions.  So quiet, direct observation is our preferred method, but we
    combine that with player Q&As, surveys, quantitative metrics, eyetracking, and
    design experiments, and we’re investigating methods of measuring the player’s
    emotional experience during gameplay.

    Q: How can eyetracking help to inform game design?
    A: Generally, you want to eliminate frequent long eye movements because they
    lead to fatigue.  For example, if the area map is in the bottom right
    corner of the display and your progress through that map is shown in the upper
    left, you’ll see the player’s eyes transiting the screen a lot.  The
    proximity compatibility principle tells us that things which are mentally
    proximal should also be physically proximal, and eyetracking can tell us which
    things are mentally proximal.  By arranging related information together,
    you can reduce fatigue and make the interface more efficient to use.

    Q: And how can you measure the emotional experience of gameplay?

    A: This is still early on, but we’re looking at biometric methods like EEGs
    which measure brainwaves, and EMGs which measure the electrical activity of
    muscles.  But there are questions of their cost and efficacy. 
    They’re also both very intrusive methods, requiring either a cap that’s wired
    into a machine or electrodes attached to the face.  In testing you want to
    mimic the home experience as much as possible, and EEGs and EMGs both make it
    feel more like a lab environment.  But new technologies are emerging that
    could change that.  Remote detection of facial expression seems promising;
    these systems produce data along the lines of an EMG but only use a camera to
    measure muscle activity in the face.

    Emotion can be viewed as a vector and measured along two scales: magnitude and
    valence.  Magnitude describes the intensity of the emotion, while valence
    describes its quality (either positive or negative).  You can measure the
    magnitude pretty reliably using something like heart rate, but understanding
    the valence is the tricky part.  How do you know if that intense emotional
    response is good or bad?  Of course you could just ask, but again that’s not
    a preferred method because people don’t describe their own experiences reliably
    and you’re introducing bias into the response.  Context is a better
    basis.  If someone is getting killed repeatedly, you can assume that
    they’re experiencing a negative emotion. 
    However, to validate that assumption, we’d love to have a system that
    quantifies valence in real time.

    Once we can measure these qualities reliably, we can start asking what the
    ideal emotional experience should look like over the course of the player’s
    interaction with the game.  Maybe that would be something like a pattern
    of peaks and valleys that steadily rises over time, as opposed to a prolonged
    burst of emotion that’s experienced all at once.  That seems like a
    plausible theory, but we won’t know until we’ve measured it.

    Q: What are some of the design elements that you’ve found make better player
    experiences?
    A: I can suggest a few things.  First, the player needs to be able to
    understand the experience.  If you die, you need to understand why you
    died.  If you reach a decision point, you need to understand what the
    implications are of taking path A or path B.  The designer needs to
    provide a sensible environment.

    Variety is also really important.  Don’t give people the same monsters
    again and again, or force them to traverse the same levels over and over. There
    are obvious counterpoints to this, and the constructs of the game may dictate a
    lack of variety, so it’s not a hard and fast rule (none of these are), but it
    is something we try to emphasize.  The Left 4 Dead series is a great
    example, because you’re always interacting with a new set of players with
    different skill levels and different tactics, and that will completely change
    the dynamic of the game.  It doesn’t play the same way twice.

    Third, you want to provide people with a feeling of continuous
    advancement.  People prize rewards if they increase in perceived
    value.  They want to feel that the required level of skill builds
    gradually as the game progresses.

    Finally, have the player make interesting choices.  Which weapon should I
    choose?  Which armor should I take?  If these decisions don’t involve
    meaningful tradeoffs, then you’re probably not creating an enjoyable
    experience.

    Q: How do you foster collaboration in multiplayer games?
    A: Left 4 Dead is really designed to force players to cooperate.  If you
    go out on your own, for example, you’ll get incapacitated very quickly. 
    The game doesn’t prevent you from doing that — it’s a choice you can exercise,
    but it’s inevitably a losing strategy.  If you have other players near you,
    you can collectively put up a stronger fight, and if you fall, they can
    easily revive you.

    Testing helped us improve collaboration in Left 4 Dead as well.  In the
    original design, the thinking was that players would build awareness of each
    other’s locations just through verbal cues, speaking to one another through a
    headset.  But it turned out that in the midst of gameplay that doesn’t
    work well.  When a teammate fell and needed to be revived, the other
    players had a difficult time finding him or her.  They needed another cue,
    so we introduced glowing outlines that appear around your teammates’ bodies, and
    which are visible through walls.  We found that really increased the
    players’ situational awareness, facilitated cooperation, and created a better
    gameplay experience.

    Q: What kinds of quantitative metrics do you use to inform design?
    A: We work with tons of data.  We can track any variable available in
    the game.  We’ll take information about where people die in each level,
    then overlay it on an image of the level to show whether people are dying in
    the right places, and in the right numbers.  We can examine the growth in
    players’ skill levels over time by any of various measures, depending upon the
    needs of the game’s design.  That may be a fairly coarse metric such as
    the ratio of kills to deaths, who gets the most kills, who stays alive the
    longest, and so on.  Or you can apply several measures in combination to
    satisfy a very precise definition of the ideal skill level, such as players who
    have a moderately high rate of kills but who win a lot and stay alive for a
    very long time.
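
    The metrics described above can be sketched in a few lines of Python.
    This is a hedged illustration rather than Valve’s tooling: the function
    names, the grid-cell approach to the death overlay, and every threshold
    in the combined skill profile are hypothetical.

```python
from collections import Counter


def death_grid(death_positions, cell_size):
    """Bin (x, y) death positions into grid cells for overlay on a level map."""
    counts = Counter()
    for x, y in death_positions:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts


def kill_death_ratio(kills, deaths):
    """Coarse skill metric; a player with zero deaths is treated as one death."""
    return kills / max(deaths, 1)


def fits_skill_profile(kills, deaths, wins, games, avg_seconds_alive,
                       kd_range=(1.0, 2.0), min_win_rate=0.6,
                       min_seconds_alive=300):
    """Combine several measures: moderate kill rate, many wins, long survival.

    All thresholds are placeholder values chosen for illustration.
    """
    kd = kill_death_ratio(kills, deaths)
    win_rate = wins / games if games else 0.0
    return (kd_range[0] <= kd <= kd_range[1]
            and win_rate >= min_win_rate
            and avg_seconds_alive >= min_seconds_alive)


# Hypothetical usage: three deaths binned into 10-unit cells.
grid = death_grid([(5, 5), (7, 2), (25, 5)], cell_size=10)
```

    The per-cell counts in `grid` could then be drawn over an image of the
    level to show where deaths cluster.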

    I really appreciate your time.  I’d wish you luck, but with these kinds
    of practices it really doesn’t sound like you need it.