Data Mining Quiz Night 1

Test your knowledge in the exciting field of data mining! This quiz covers various essential concepts including instance-based learning, DBSCAN, association rules, and more.

Challenge yourself with multiple-choice questions that will help you gauge your understanding:

  • Instance-Based Learning
  • Clustering Techniques
  • Association Rules
  • N-grams and Probability
13 Questions | 3 Minutes | Created by MiningMaster321
Which of the following is FALSE regarding instance-based learning?
Hypothesis complexity can grow with the data
Classification costs are low
Constructs hypotheses directly from the training instances
RBF networks are an example of instance-based learning
Time complexity of this algorithm depends upon the size of training data
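As a rough illustration of the instance-based idea in the options above, here is a minimal k-nearest-neighbour sketch. The function name `knn_predict`, the toy data, and the choice of Euclidean distance are assumptions made for the example, not part of the quiz.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest stored instances.

    train: list of (feature_tuple, label) pairs (hypothetical toy format).
    """
    # No model is fitted in advance: every prediction scans all stored
    # instances, so the work per query grows with len(train).
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up training instances: two points near the origin, two near (1, 1).
train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.2), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.1), "b"),
]
print(knn_predict(train, (0.2, 0.1), k=3))
```

Adding a new instance is just `train.append(...)`, while the memory footprint and the per-query scan both grow with the stored data.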
What is the shape of the isodensity contours for the following covariance matrix:
Parallel to the variable axes and elongated along the x_3 axis
Elongated along vectors defined as linear combinations of x_1, x_2
Elongated along vectors defined as linear combinations of x_2, x_3
Elongated along vectors defined as linear combinations of x_1, x_2, x_3
Which of the following statements about lift is FALSE?
First, sort the instances in descending order of predicted probability
The smaller the lift factor, the better
The lift factor is the increase in positive instances in a sample vs. the overall positive rate in the population
In general, to plot the lift curve, we use the sample size as the x-axis and the number of positives as the y-axis
It allows a classifier to be evaluated by considering subsets of the instances
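A minimal sketch of the lift factor as the options above describe it: rank instances by predicted probability, take the top slice, and compare its positive rate with the overall positive rate. The name `lift_at` and the scored list are made up for illustration.

```python
def lift_at(scored, fraction):
    """Lift factor for the top `fraction` of instances.

    scored: list of (predicted_probability, is_positive) pairs, where
    is_positive is 1 or 0 (hypothetical toy format).
    """
    # Rank instances from most to least likely positive.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n = max(1, round(len(ranked) * fraction))
    sample_rate = sum(y for _, y in ranked[:n]) / n
    overall_rate = sum(y for _, y in ranked) / len(ranked)
    return sample_rate / overall_rate


# Made-up classifier output: 3 positives among 10 instances.
scored = [
    (0.90, 1), (0.80, 1), (0.70, 0), (0.60, 1), (0.40, 0),
    (0.30, 0), (0.20, 0), (0.10, 0), (0.05, 0), (0.01, 0),
]
print(lift_at(scored, 0.2))  # top 20% of the ranked instances
print(lift_at(scored, 1.0))  # the whole population
```

On this toy data the top 20% contains only positives, so its lift is well above 1, while the full population has a lift of 1 by construction.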
The probability of a given N-gram within a sequence of words is computed using the:
Markov chain rule
The accuracy
Factorization
Retrieval by content
TF-IDF Score
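To make the N-gram question concrete, the sketch below estimates the probability of a word sequence with a bigram model: the chain rule is applied with a first-order Markov assumption, so each word depends only on the previous one. The function name, the tiny corpus, and the unsmoothed count-based estimates are assumptions for the example.

```python
from collections import Counter


def bigram_sequence_probability(corpus_tokens, sequence):
    """Approximate P(w1..wn) as P(w1) * product of P(w_i | w_{i-1})."""
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))

    # P(w1) estimated from relative unigram frequency.
    prob = unigrams[sequence[0]] / len(corpus_tokens)
    for prev, word in zip(sequence, sequence[1:]):
        if unigrams[prev] == 0:
            return 0.0  # unseen context: no estimate without smoothing
        # P(word | prev) estimated as count(prev, word) / count(prev).
        prob *= bigrams[(prev, word)] / unigrams[prev]
    return prob


corpus = "the cat sat on the mat".split()
print(bigram_sequence_probability(corpus, ["the", "cat", "sat"]))
```

A real language model would add smoothing so that unseen bigrams do not force the whole product to zero.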
Which of the following statements is TRUE?
Clustering is a supervised learning task
We aim to maximize the within-cluster distance metric
Fuzzy clustering is interesting when we want to classify instances belonging to at most one cluster
In DBSCAN clustering, the discarded points are called noise points
The preferred distance metric for nominal attribute values is Manhattan distance
For DBSCAN parameter selection, what is the value of Eps given MinPts=4, and what would most likely happen if this value increases?
10; the number of clusters would decrease
10; the number of clusters would increase
30; the number of clusters would decrease
30; the number of clusters would increase
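The roles of Eps and MinPts can be explored with a small, unoptimized DBSCAN sketch. The data points and the two Eps values below are invented for illustration and are unrelated to the values in the answer options. MinPts here counts the point itself, which is one common convention.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels


# Two made-up groups of four points plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 5), (5, 6), (6, 5), (6, 6),
          (20, 20)]
for eps in (1.5, 6):
    labels = dbscan(points, eps, min_pts=4)
    print(eps, labels, "clusters:", len(set(labels) - {-1}))
```

On this toy data the isolated point is labelled -1 (noise) for both settings, and the larger Eps merges the two groups into a single cluster.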
Which of the following statements about instance-based learning is FALSE:
It is time-efficient in making predictions
It is easy to add new instances to the "model"
Does not make assumptions about the data
Can be memory intensive
Regarding association rules, which of the statements is FALSE:
Confidence of a rule is the number of instances satisfying the right-hand side of the rule, as a percentage of all the instances
Coverage is the number of instances with all the items of the rule
If an item has insufficient coverage, the Apriori algorithm won't compute k-item sets containing it
Support is the proportion of instances containing all items of the rule
Association rules are similar to classification rules but they aren't intended to be used together as a whole
Which of the following is a non-adaptive transformation for time series:
Piecewise Linear Approximation
Discrete Fourier Transform
Singular Value Decomposition
Principal Component Analysis
Support calculates:
Calculate the confidence of all possible rules given the frequent itemsets
The percentage of transactions that contain all of the items in an item set
The probability that a transaction that contains the items on the left-hand side of the rule also contains the item on the right-hand side
The probability of all of the items in a rule occurring together divided by the product of the probabilities of the items on the left- and right-hand sides occurring as if there was no association between them
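The three measures described in the options above (the share of transactions containing every item, the conditional probability of the right-hand side given the left-hand side, and the ratio to the independence baseline) can be computed side by side. This is a sketch on a made-up transaction list; `rule_metrics` is a hypothetical helper and assumes the left-hand side occurs at least once.

```python
def rule_metrics(transactions, lhs, rhs):
    """Return (support, confidence, lift) for the rule lhs -> rhs."""
    n = len(transactions)
    lhs, rhs = set(lhs), set(rhs)
    count_lhs = sum(lhs <= t for t in transactions)
    count_rhs = sum(rhs <= t for t in transactions)
    count_both = sum((lhs | rhs) <= t for t in transactions)

    support = count_both / n             # share of transactions with all items
    confidence = count_both / count_lhs  # P(rhs | lhs); assumes count_lhs > 0
    lift = support / ((count_lhs / n) * (count_rhs / n))
    return support, confidence, lift


# Made-up market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(rule_metrics(transactions, {"bread"}, {"milk"}))
```

A lift below 1, as in this toy example, means the two sides co-occur less often than they would if they were independent.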
What algorithm does Shazam use to identify songs?
Looks at the anchor peak pairs of the song.
Looks at the nearest neighbors using the frequency of the song with songs in the database
Matches anchor points between the song and songs in the database
Matches the constellation plot of the new song with the songs in the database.
Which among these is FALSE about histograms?
They reveal data quality problems
They can handle data in multiple dimensions
They can provide valuable information such as outliers
They are a nonparametric model
Which of the following is FALSE regarding the convolution operation in image processing? For each pixel (x, y):
Multiply the corresponding mask and pixel values
Average the products to perform max pooling
Sum these products to compute the new pixel values
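The multiply-and-sum steps named in the options above can be sketched in plain Python. This version only computes output pixels where the mask fits entirely inside the image and applies the mask without flipping it, which matches true convolution only for symmetric masks. The name `apply_mask` and the sample values are assumptions for the example.

```python
def apply_mask(image, mask):
    """Slide `mask` over `image`; each output pixel is the sum of products."""
    mh, mw = len(mask), len(mask[0])
    out_h = len(image) - mh + 1
    out_w = len(image[0]) - mw + 1
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            # Multiply corresponding mask and pixel values, then sum the products.
            total = sum(
                mask[i][j] * image[y + i][x + j]
                for i in range(mh)
                for j in range(mw)
            )
            row.append(total)
        out.append(row)
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
box = [[1, 1],
       [1, 1]]
print(apply_mask(image, box))  # [[12, 16], [24, 28]]
```

Pooling is a separate operation: max pooling takes the maximum over a window rather than a weighted sum.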