UX Metrics Quiz
In a usability study, task time, success rate, and subjective ratings of ease of use would all be examples of what?
Dependent variables
Independent variables
Control variables
Ordinal variables
Assume you had 6 participants perform a task and the times they took were 34, 28, 36, 28, 35, and 30 seconds. Which of the following would be the most appropriate way to report the average, or mean, time?
31.8 secs
31.83 secs
31.833 secs
31.8333 secs
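Worked example: a minimal Python sketch of the arithmetic behind this question, using the six task times given above. Rounding to one decimal place is an assumption here, following the common guideline of reporting at most one more digit of precision than the raw measurements carry.

```python
from statistics import mean

# Task times in seconds, taken from the question
times = [34, 28, 36, 28, 35, 30]

avg = mean(times)          # unrounded mean, 191 / 6
reported = round(avg, 1)   # assumption: one decimal beyond the whole-second data

print(f"unrounded: {avg}")
print(f"reported:  {reported} secs")
```

The raw times are whole seconds, so the extra decimals in the longer options suggest a precision the data do not have.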
If you increase the sample size for the calculation of a mean, what will probably happen to the confidence interval for that mean?
It will get smaller
It will get larger
No change
Impossible to know
Assume you calculate both a 90% and 95% confidence interval for a mean. What can you say about their relative sizes?
The 90% confidence interval will be smaller
The 95% confidence interval will be smaller
They will be the same
Impossible to say which will be smaller
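Worked example: a rough sketch comparing the two interval widths, reusing the six task times from the earlier question as sample data. The critical t values are hard-coded from a standard two-sided t table for 5 degrees of freedom; a statistics library would normally supply them.

```python
from math import sqrt
from statistics import mean, stdev

# Sample reused from the earlier mean-time question (seconds)
times = [34, 28, 36, 28, 35, 30]
n = len(times)
se = stdev(times) / sqrt(n)  # standard error of the mean

# Two-sided critical t values for df = n - 1 = 5 (from a standard t table)
T_CRIT = {0.90: 2.015, 0.95: 2.571}

# Half-width of each confidence interval: t * standard error
half_widths = {level: t * se for level, t in T_CRIT.items()}

for level, hw in half_widths.items():
    print(f"{level:.0%} CI: {mean(times):.1f} +/- {hw:.1f} secs")
```

For the same data, the higher confidence level uses a larger critical value, so its interval is wider.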
If the 90% confidence intervals for two means from independent samples do NOT overlap with each other, what can you say about the relationship between those means?
They are significantly different from each other at the 90% level.
They might be significantly different from each other but you will have to do a t-test to be sure.
They are not significantly different from each other.
Impossible to say anything about their relationship.
True or False: A t-test can be used to determine if two means are significantly different from each other.
True
False
What is the median of a set of numbers?
The middle number when they’re put in order
The arithmetic average of the numbers
The most frequent number
The highest number
What’s wrong with this graph?
It should be a bar or column graph instead of a line graph.
It should not have error bars.
There's nothing wrong with it.
It should be a pie chart instead of a line graph.
What would the correlation coefficient (r) be for these two sets of data?
+1.0
-1.0
0
Impossible to tell
You calculate the correlation coefficient between two variables and find that r=0.74. Is that statistically significant?
Can’t tell from the data given
Yes
Yes, if both variables are continuous
No
Which of the following statements is correct?
Formative usability testing is iterative; summative usability testing is generally not.
Summative usability testing is iterative; formative usability testing is generally not.
Summative usability testing is good; formative usability testing is not so good.
Formative usability testing is slow; summative usability testing is fast.
Performance metrics are primarily about which of the following?
What people do
What people say
What people think
What people feel
True or False: You can calculate task success across tasks for each participant or across participants for each task.
True
False
True or False: In a within-subjects design, each participant is compared with themselves across different conditions.
True
False
When task success is scored for each task as either a success or failure, this is an example of what kind of data?
Binary data
Continuous data
Ordinal data
Likert data
Task time data, especially from an online study, commonly has outliers (e.g., someone fell asleep during a task!). Which of these is NOT an appropriate way of identifying outliers?
Dropping any values above the geometric mean
Identifying natural breaks in the data
Calculating +/- 2 or 3 standard deviations from the mean
Using pre-defined thresholds
With Design A, participants successfully completed 86% of the tasks in an average of 30 seconds each. With Design B, they successfully completed 90% of the tasks in an average of 27 seconds each. With which design were participants more efficient?
Probably Design B but you’ll need to do a significance test
Probably Design A but you’ll need to do a significance test
Definitely no difference in efficiency
Can't tell from the data given
A t-test comparing two sets of data in Excel returns a value of 3.91259E-11. How do you interpret that?
The means of the two sets of data are significantly different from each other.
The means of the two sets of data are NOT significantly different from each other.
There was an error in the calculation.
We can’t determine if there is a significant difference between the means.
Self-reported metrics are primarily about which of the following?
What people say
What people do
The efficiency with which they accomplish tasks
None of these
Which of these would most likely represent an item on a Likert scale?
This quiz is easy: Strongly Agree … Strongly Disagree
This quiz is: Easy … Difficult
This quiz is: Easy … Not Easy
Yes or No: This quiz is easy
True or False: In most cases it is acceptable to calculate an average of the ratings by all respondents to a question on a 7-point rating scale.
True
False
Which of the following is NOT a statement in the System Usability Scale (SUS)?
It took too many clicks to get to what I wanted
I think that I would like to use this system frequently.
I found the system unnecessarily complex.
I thought the system was easy to use.
The Net Promoter Score (NPS) is based on the responses to just one question. What is it?
How likely is it that you would recommend [whatever] to a friend or colleague?
How useful did you find [whatever]?
How likely is it that you will continue to use [whatever]?
How easy did you find [whatever] to use?
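Worked example: a short sketch of how a Net Promoter Score is usually computed from 0-10 likelihood-to-recommend ratings. The ratings below are hypothetical. The cut-offs (9-10 for promoters, 0-6 for detractors) follow the standard NPS definition.

```python
# Hypothetical 0-10 likelihood-to-recommend ratings from ten respondents
ratings = [10, 9, 9, 8, 7, 10, 6, 3, 9, 8]

promoters = sum(r >= 9 for r in ratings)   # ratings of 9 or 10
detractors = sum(r <= 6 for r in ratings)  # ratings of 0 through 6

# NPS = percentage of promoters minus percentage of detractors
nps = 100 * (promoters - detractors) / len(ratings)

print(f"promoters={promoters}, detractors={detractors}, NPS={nps:.0f}")
```

Ratings of 7 and 8 (passives) count toward the total number of respondents but toward neither group.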
True or False: In Rolf Molich’s Comparative Usability Evaluation (CUE) studies, he has consistently found a high degree of correlation between the usability issues found by different teams when testing the same thing.
True
False
How many participants do you need for an effective usability test?
It depends on your objectives
6
10
20
True or False: It’s not possible to calculate confidence intervals for responses to purely survey-type questions (e.g., opinions).
True
False
MaxDiff is a method for determining which of the following?
Preference/importance scores for multiple items, such as brand preferences or product features
The difference between the lowest rating on a scale and the highest rating
The percentage of people who are considered “promoters”
The difference between the 10th percentile and the 90th percentile
Which of the following would NOT be an example of an efficiency measure?
Expectation vs. Experience ratings
# correct per minute
% correct per unit of time
Actual # of clicks compared to the optimum number
Which of these is a reasonable method for combining different metrics that have different scales (e.g., task success and task time)?
Converting each metric to a percentage and then combining them
Converting each metric to the mean and then combining them
Taking the geometric mean of the metrics and combining them
Calculating the correlation coefficient (r) between the metrics
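Worked example: one possible way to put task success and task time on a common percentage scale before combining them. The target time and the equal weighting are assumptions made only for illustration. The success rate and mean time are borrowed from Design A in the efficiency question.

```python
# Values borrowed from Design A in the efficiency question
success_rate = 0.86   # proportion of tasks completed successfully
mean_time = 30.0      # mean task time in seconds

# Hypothetical target time; faster-than-target is capped at 100%
target_time = 24.0
time_score = min(target_time / mean_time, 1.0)

# Assumption: equal weights; other weightings are equally legitimate
combined = 100 * (success_rate + time_score) / 2

print(f"time score: {time_score:.0%}, combined score: {combined:.1f}%")
```

Because time is "lower is better", it is inverted against a target so that a higher percentage means better performance on both metrics.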
True or False: When combining different metrics (e.g., task success, task time, and SUS scores) into an overall score you must give equal weight to each one.
True
False
Which of the following is least likely to be measured in a typical live-site A/B test?
System Usability Scale (SUS) ratings
Click-through rates
Conversion rates
Abandonment rates
Assume you want to know if the click rates for two different button treatments on a web page are significantly different from each other. What statistical test would be most appropriate for that?
Chi-square test
T-test
R-test
ANOVA
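Worked example: a sketch of a chi-square test of independence on a 2x2 table of click counts, computed by hand. The counts are hypothetical, no continuity correction is applied, and the critical value 3.841 is the standard chi-square cut-off for 1 degree of freedom at the 0.05 level.

```python
# Hypothetical counts: [clicked, did not click] for each button treatment
observed = [
    [120, 880],  # treatment A, 1000 visitors
    [160, 840],  # treatment B, 1000 visitors
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Sum (observed - expected)^2 / expected over all four cells
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

CRITICAL_05_DF1 = 3.841  # chi-square critical value, df = 1, alpha = 0.05
significant = chi2 > CRITICAL_05_DF1

print(f"chi-square = {chi2:.3f}, significant at 0.05: {significant}")
```

Click data are counts of a binary outcome (clicked or not), which is why a test on frequencies fits better here than a test on means.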
Which type of exercise is most useful for learning about the categories users would create in organizing a set of items?
Open card-sort
Closed card-sort
Tree test
Multi-dimensional Scaling (MDS) exercise
Which of the following are appropriate analyses for the data from a card-sorting exercise?
Hierarchical cluster analysis and MDS analysis
T-test and R-test
ANOVA and multiple regression
None of these