UX Metrics Quiz

In a usability study, task time, success rate, and subjective ratings of ease of use would all be examples of what?

Dependent variables

Independent variables

Control variables

Ordinal variables

Assume you had 6 participants perform a task and the times they took were 34, 28, 36, 28, 35, and 30 seconds. Which of the following would be the most appropriate way to report the average, or mean, time?

31.8 secs

31.83 secs

31.833 secs

31.8333 secs
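As a quick check of the arithmetic behind this question, the sketch below computes the mean of the six times. Rounding to one decimal place is an assumed reporting convention (one more digit than the whole-second raw data), not a rule stated in the quiz.

```python
from statistics import mean

# Task times in seconds, taken from the question above.
times = [34, 28, 36, 28, 35, 30]

avg = mean(times)          # full-precision mean (191 / 6)
reported = round(avg, 1)   # assumed convention: one decimal beyond the raw data

print(f"raw mean: {avg}, reported: {reported} secs")
```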

If you increase the sample size for the calculation of a mean, what will probably happen to the confidence interval for that mean?

It will get smaller

It will get larger

No change

Impossible to know

Assume you calculate both a 90% and 95% confidence interval for a mean. What can you say about their relative sizes?

The 90% confidence interval will be smaller

The 95% confidence interval will be smaller

They will be the same

Impossible to say which will be smaller
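The two preceding questions can be explored numerically. This is a minimal sketch that uses a normal (z) approximation for the interval. With a sample this small a t-based interval would be wider, so treat the numbers as illustrative only. The helper name `ci_half_width` and the "doubled" sample are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, stdev

def ci_half_width(values, level):
    """Half-width of a normal-approximation confidence interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided critical value
    return z * stdev(values) / sqrt(len(values))

times = [34, 28, 36, 28, 35, 30]   # task times from the earlier question
doubled = times * 2                # hypothetical: same values, twice the sample size

w90 = ci_half_width(times, 0.90)
w95 = ci_half_width(times, 0.95)
w95_doubled = ci_half_width(doubled, 0.95)

print(f"90%: +/-{w90:.2f}  95%: +/-{w95:.2f}  95% with 2x sample: +/-{w95_doubled:.2f}")
```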

If the 90% confidence intervals for two means from independent samples do NOT overlap with each other, what can you say about the relationship between those means?

They are significantly different from each other at the 90% level.

They might be significantly different from each other but you will have to do a t-test to be sure.

They are not significantly different from each other

Impossible to say anything about their relationship.

True or False: A t-test can be used to determine if two means are significantly different from each other.

True

False

What is the median of a set of numbers?

The middle number when they’re put in order

The arithmetic average of the numbers

The most frequent number

The highest number
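For comparison, the four quantities named in the options above can each be computed with Python's standard library. The data reuse the task times from the earlier question, purely as an illustration.

```python
from statistics import mean, median, mode

times = [34, 28, 36, 28, 35, 30]   # task times from the earlier question

middle = median(times)        # middle of the sorted values (average of the two middle ones here)
average = mean(times)         # arithmetic average
most_frequent = mode(times)   # most frequent value
highest = max(times)          # highest value

print(middle, round(average, 1), most_frequent, highest)
```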

What’s wrong with this graph?

It should be a bar or column graph instead of a line graph.

It should not have error bars

There's nothing wrong with it

It should be a pie chart instead of a line graph

What would the correlation coefficient (r) be for these two sets of data?

+1.0

-1.0

Impossible to tell

You calculate the correlation coefficient between two variables and find that r=0.74. Is that statistically significant?

Can’t tell from the data given

Yes

Yes, if both variables are continuous

Which of the following statements is correct?

Formative usability testing is iterative; summative usability testing is generally not.

Summative usability testing is iterative; formative usability testing is generally not.

Summative usability testing is good; formative usability testing is not so good.

Formative usability testing is slow; summative usability testing is fast.

Performance metrics are primarily about which of the following?

What people do

What people say

What people think

What people feel

True or False: You can calculate task success across tasks for each participant or across participants for each task.

True

False

True or False: In a within-subjects design, each participant is compared with themselves across different conditions.

True

False

When task success is scored for each task as either a success or failure, this is an example of what kind of data?

Binary data

Continuous data

Ordinal data

Likert data

Task time data, especially from an online study, commonly has outliers (e.g., someone fell asleep during a task!). Which of these is NOT an appropriate way of identifying outliers?

Dropping any values above the geometric mean

Identifying natural breaks in the data

Calculating +/- 2 or 3 standard deviations from the mean

Using pre-defined thresholds
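One of the approaches listed above, flagging values more than a set number of standard deviations from the mean, might be sketched as follows. The data set and the function name `sd_outliers` are hypothetical, and the cutoff of 2 standard deviations is just one of the two values the option mentions.

```python
from statistics import mean, stdev

def sd_outliers(values, k=2):
    """Return values lying more than k standard deviations from the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) > k * s]

# Hypothetical task times in seconds; the 400 stands in for a participant
# who walked away (or fell asleep) mid-task.
times = [34, 28, 36, 28, 35, 30, 31, 29, 33, 32, 400]

outliers = sd_outliers(times, k=2)
print(outliers)
```

Note that an extreme value inflates both the mean and the standard deviation, so this rule can miss outliers in very small samples.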

With Design A, participants successfully completed 86% of the tasks in an average of 30 seconds each. With Design B, they successfully completed 90% of the tasks in an average of 27 seconds each. With which design were participants more efficient?

Probably Design B but you’ll need to do a significance test

Probably Design A but you’ll need to do a significance test

Definitely no difference in efficiency

Can't tell from the data given
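One common way to express efficiency is task success per unit of time. The sketch below applies that definition to the figures in the question. The choice of percent success per second as the ratio is an assumption, and the comparison says nothing about statistical significance.

```python
# (percent of tasks completed, mean seconds per task), from the question above.
designs = {"A": (86, 30), "B": (90, 27)}

# Assumed efficiency measure: percent success per second.
efficiency = {name: success / secs for name, (success, secs) in designs.items()}

for name, value in efficiency.items():
    print(f"Design {name}: {value:.2f} % success per second")
```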

A t-test comparing two sets of data in Excel returns a value of 3.91259E-11. How do you interpret that?

The means of the two sets of data are significantly different from each other.

The means of the two sets of data are NOT significantly different from each other.

There was an error in the calculation

We can’t determine if there is a significant difference between the means.
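The value in this question is written in scientific notation. A small sketch of how it can be read follows. The 0.05 threshold is the conventional significance level and is an assumption here, since the question does not state one.

```python
# Excel's "3.91259E-11" means 3.91259 x 10^-11.
p_value = float("3.91259E-11")

alpha = 0.05                     # assumed conventional significance threshold
significant = p_value < alpha

print(f"p = {p_value:.15f}; below {alpha}? {significant}")
```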

Self-reported metrics are primarily about which of the following?

What people say

What people do

The efficiency with which they accomplish tasks

None of these

Which of these would most likely represent an item on a Likert scale?

This quiz is easy: Strongly Agree … Strongly Disagree

This quiz is: Easy … Difficult

This quiz is: Easy … Not Easy

Yes or No: This quiz is easy

True or False: In most cases it is acceptable to calculate an average of the ratings by all respondents to a question on a 7-point rating scale.

True

False

Which of the following is NOT a statement in the System Usability Scale (SUS)?

It took too many clicks to get to what I wanted

I think that I would like to use this system frequently.

I found the system unnecessarily complex.

I thought the system was easy to use.

How would a System Usability Scale (SUS) score of 45 be characterized?

Very bad

Very good

Average

Good

The Net Promoter Score (NPS) is based on the responses to just one question. What is it?

How likely is it that you would recommend [whatever] to a friend or colleague?

How useful did you find [whatever]?

How likely is it that you will continue to use [whatever]?

How easy did you find [whatever] to use?
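For context, the NPS is conventionally computed from 0-10 responses as the percentage of promoters (9 or 10) minus the percentage of detractors (0 through 6). A minimal sketch, with hypothetical ratings and a hypothetical function name:

```python
def net_promoter_score(ratings):
    """NPS = % promoters (9-10) minus % detractors (0-6), on a 0-10 scale."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical responses from ten participants.
ratings = [10, 9, 9, 8, 7, 7, 6, 5, 10, 3]

nps = net_promoter_score(ratings)
print(nps)
```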

True or False: In Rolf Molich’s Comparative Usability Evaluation (CUE) studies, he has consistently found a high degree of correlation between the usability issues found by different teams when testing the same thing.

True

False

How many participants do you need for an effective usability test?

It depends on your objectives

True or False: It’s not possible to calculate confidence intervals for responses to purely survey-type questions (e.g., opinions).

True

False

MaxDiff is a method for determining which of the following?

Preference/importance scores for multiple items, such as brand preferences or product features

The difference between the lowest rating on a scale and the highest rating

The percentage of people who are considered “promoters”

The difference between the 10th percentile and the 90th percentile

Which of the following would NOT be an example of an efficiency measure?

Expectation vs. Experience ratings

# correct per minute

% correct per unit of time

Actual # of clicks compared to the optimum number

Which of these is a reasonable method for combining different metrics that have different scales (e.g., task success and task time)?

Converting each metric to a percentage and then combining them

Converting each metric to the mean and then combining them

Taking the geometric mean of the metrics and combining them

Calculating the correlation coefficient (r) between the metrics
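The percentage-based approach in the options above might look like the sketch below. The target time of 24 seconds, the helper `time_as_percent`, and the equal weighting are all assumptions chosen for illustration; other conversions and weights are possible.

```python
from statistics import mean

def time_as_percent(actual_secs, target_secs):
    """Express task time as a percentage of a target time, capped at 100."""
    return min(target_secs / actual_secs, 1.0) * 100

success_pct = 86.0                        # task success is already a percentage
time_pct = time_as_percent(30.0, 24.0)    # hypothetical 24-second target

# Equal weights here; a weighted average would work the same way.
combined = mean([success_pct, time_pct])
print(f"time: {time_pct:.1f}%  combined: {combined:.1f}%")
```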

True or False: When combining different metrics (e.g., task success, task time, and SUS scores) into an overall score you must give equal weight to each one.

True

False

True or False: Tree testing is one way of testing a menu system/information architecture.

True

False

Which of the following is least likely to be measured in a typical live-site A/B test?

System Usability Scale (SUS) ratings

Click-through rates

Conversion rates

Abandonment rates

Assume you want to know if the click rates for two different button treatments on a web page are significantly different from each other. What statistical test would be most appropriate for that?

Chi-square test

T-test

R-test

ANOVA
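To illustrate the kind of comparison this question describes, here is a minimal sketch of a chi-square test on a 2x2 table of clicks versus non-clicks, using only the standard library. The click counts are hypothetical, no continuity correction is applied, and the p-value formula is specific to one degree of freedom.

```python
from math import erfc, sqrt

def chi_square_2x2(clicks_a, views_a, clicks_b, views_b):
    """Chi-square statistic and p-value (1 df, no continuity correction)."""
    observed = [
        [clicks_a, views_a - clicks_a],
        [clicks_b, views_b - clicks_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected

    p = erfc(sqrt(chi2 / 2))   # survival function of chi-square with 1 df
    return chi2, p

# Hypothetical data: button A got 120 clicks in 1000 views, button B got 160.
chi2, p = chi_square_2x2(120, 1000, 160, 1000)
print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```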

Which type of exercise is most useful for learning about the categories users would create in organizing a set of items?

Open card-sort

Closed card-sort

Tree test

Multi-dimensional Scaling (MDS) exercise

Which of the following are appropriate analyses for the data from a card-sorting exercise?

Hierarchical cluster analysis and MDS analysis

T-test and R-test

ANOVA and multiple regression

None of these

True or False: It’s not possible to calculate an ROI (Return on Investment) for UX or usability work.

True

False

If you are doing a study comparing two conditions, which study design will generally require more participants in total?

Between-subjects design

Within-subjects design

Repeated-measures design

Impossible to say
