40 Questions to Test a Data Scientist on Clustering Techniques (Skill Test Solution)
Introduction
The idea of creating machines that learn by themselves has driven humans for decades. Unsupervised learning and clustering are key to fulfilling that dream. Unsupervised learning provides more flexibility, but is also more challenging.
Clustering plays an important role in deriving insights from unlabeled data. It classifies the data into similar groups, which improves various business decisions by providing a meta-understanding.
In this skill test, we tested our community on clustering techniques. A total of 1,566 people registered for the skill test. If you missed taking the test, here is your opportunity to find out how many questions you could have answered correctly.
If you are just getting started with Unsupervised Learning, here are some comprehensive resources to assist you in your journey:
- Machine Learning Certification Course for Beginners
- The Most Comprehensive Guide to K-Means Clustering You'll Ever Need
- Certified AI & ML Blackbelt+ Program
Overall Results
Below is the distribution of scores; this will help you evaluate your performance:
You can access your performance here. More than 390 people participated in the skill test and the highest score was 33. Here are a few statistics about the distribution.
Overall distribution
Mean Score: 15.11
Median Score: 15
Mode Score: 16
Helpful Resources
An Introduction to Clustering and different methods of clustering
Getting your clustering right (Part I)
Getting your clustering right (Part II)
Questions & Answers
Q1. Movie Recommendation systems are an example of:
1. Classification
2. Clustering
3. Reinforcement Learning
4. Regression
Options:
A. 2 only
B. 1 and 2
C. 1 and 3
D. 2 and 3
E. 1, 2 and 3
F. 1, 2, 3 and 4
Solution: (E)
Generally, movie recommendation systems cluster the users into a finite number of similar groups based on their previous activities and profile. Then, at a fundamental level, people in the same cluster are made similar recommendations.
In some scenarios, this can also be approached as a classification problem for assigning the most appropriate movie class to the users of a specific group. Also, a movie recommendation system can be viewed as a reinforcement learning problem, where it learns from its previous recommendations and improves future recommendations.
Q2. Sentiment Analysis is an example of:
1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning
Options:
A. 1 only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 3
E. 1, 2 and 4
F. 1, 2, 3 and 4
Solution: (E)
Sentiment analysis at a fundamental level is the task of classifying the sentiments represented in an image, text or speech into a set of defined sentiment classes like happy, sad, excited, positive, negative, etc. It can also be viewed as a regression problem for assigning a sentiment score of, say, 1 to 10 to a corresponding image, text or speech.
Another way of looking at sentiment analysis is from a reinforcement learning perspective, where the algorithm constantly learns from the accuracy of past sentiment analyses in order to improve future performance.
Q3. Can decision trees be used for performing clustering?
A. True
B. False
Solution: (A)
Decision trees can also be used to form clusters in the data, but clustering often generates natural clusters and is not dependent on any objective function.
Q4. Which of the following is the most appropriate strategy for data cleaning before performing clustering analysis, given a less than desirable number of data points:
1. Capping and flooring of variables
2. Removal of outliers
Options:
A. 1 only
B. 2 only
C. 1 and 2
D. None of the above
Solution: (A)
Removal of outliers is not recommended if the data points are few in number. In this scenario, capping and flooring of variables is the most appropriate strategy.
Q5. What is the minimum no. of variables/ features required to perform clustering?
A. 0
B. 1
C. 2
D. 3
Solution: (B)
At least a single variable is required to perform clustering analysis. Clustering analysis with a single variable can be visualized with the help of a histogram.
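To make this concrete, here is a minimal sketch (with made-up one-dimensional data, not from the test) that clusters a single variable with scikit-learn and visualizes the result as a histogram:

```python
# A minimal sketch: clustering a single variable and visualizing it with a histogram.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(6, 1, 200)])  # two 1-D groups

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(x.reshape(-1, 1))

for k in range(2):
    plt.hist(x[labels == k], bins=30, alpha=0.6, label=f"cluster {k}")
plt.legend()
plt.show()
```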
Q6. For two runs of K-Means clustering, is it expected to get the same clustering results?
A. Yes
B. No
Solution: (B)
The K-Means clustering algorithm converges to local minima, which may coincide with the global minimum in some cases but not always. Therefore, it's advised to run the K-Means algorithm multiple times before drawing inferences about the clusters.
However, note that it's possible to get the same clustering results from K-Means by setting the same seed value for each run. But that is done by simply making the algorithm choose the same set of random numbers for each run.
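As a quick illustration, the sketch below (assuming scikit-learn's KMeans and illustrative data) shows how fixing the seed makes runs reproducible, while unseeded runs may differ:

```python
# A minimal sketch: fixing the seed makes K-Means runs reproducible.
import numpy as np
from sklearn.cluster import KMeans

X = np.random.default_rng(42).normal(size=(300, 2))

# Same random_state -> identical initialization -> identical clustering.
a = KMeans(n_clusters=3, n_init=1, random_state=0).fit_predict(X)
b = KMeans(n_clusters=3, n_init=1, random_state=0).fit_predict(X)
print((a == b).all())  # True

# Without a fixed seed, two runs may converge to different local minima.
c = KMeans(n_clusters=3, n_init=1, random_state=None).fit_predict(X)
```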
Q7. Is it possible that the assignment of observations to clusters does not change between successive iterations in K-Means?
A. Yes
B. No
C. Can't say
D. None of these
Solution: (A)
When the K-Means algorithm has reached the local or global minimum, it will not change the assignment of data points to clusters for two successive iterations.
Q8. Which of the following can act as possible termination conditions in K-Means?
1. For a fixed number of iterations.
2. Assignment of observations to clusters does not change between iterations, except for cases with a bad local minimum.
3. Centroids do not change between successive iterations.
4. Terminate when RSS falls below a threshold.
Options:
A. 1, 3 and 4
B. 1, 2 and 3
C. 1, 2 and 4
D. All of the above
Solution: (D)
All four conditions can be used as possible termination conditions in K-Means clustering:
- This condition limits the runtime of the clustering algorithm, but in some cases the quality of the clustering will be poor because of an insufficient number of iterations.
- Except for cases with a bad local minimum, this produces a good clustering, but runtimes may be unacceptably long.
- This also ensures that the algorithm has converged at the minimum.
- Terminate when RSS falls below a threshold. This criterion ensures that the clustering is of a desired quality after termination. Practically, it's good practice to combine it with a bound on the number of iterations to guarantee termination (a minimal illustration follows this list).
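For reference, scikit-learn's KMeans exposes two of these stopping rules directly: a cap on iterations and a convergence tolerance on centroid movement. A minimal sketch, with illustrative data:

```python
# A minimal sketch: max_iter bounds the runtime, tol stops the run once
# centroid movement becomes negligible.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

km = KMeans(n_clusters=4, max_iter=300, tol=1e-4, n_init=10, random_state=0).fit(X)
print(km.n_iter_)   # iterations actually used before convergence
print(km.inertia_)  # final within-cluster sum of squares (related to RSS)
```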
Q9. Which of the following clustering algorithms suffers from the problem of convergence at local optima?
1. K-Means clustering algorithm
2. Agglomerative clustering algorithm
3. Expectation-Maximization clustering algorithm
4. Divisive clustering algorithm
Options:
A. 1 only
B. 2 and 3
C. 2 and 4
D. 1 and 3
E. 1, 2 and 4
F. All of the above
Solution: (D)
Out of the options given, only the K-Means clustering algorithm and the EM clustering algorithm have the drawback of converging at local minima.
Q10. Which of the following algorithms is most sensitive to outliers?
A. K-means clustering algorithm
B. K-medians clustering algorithm
C. K-modes clustering algorithm
D. K-medoids clustering algorithm
Solution: (A)
Out of all the options, the K-Means clustering algorithm is most sensitive to outliers, as it uses the mean of the cluster data points to find the cluster center.
Q11. After performing K-Means clustering analysis on a dataset, you observed the following dendrogram. Which of the following conclusions can be drawn from the dendrogram?
A. There were 28 data points in the clustering analysis
B. The best no. of clusters for the analyzed data points is 4
C. The proximity function used is average-link clustering
D. The above dendrogram interpretation is not possible for K-Means clustering analysis
Solution: (D)
A dendrogram is not possible for K-Means clustering analysis. However, one can create a clustergram based on K-Means clustering analysis.
Q12. How can Clustering (Unsupervised Learning) be used to improve the accuracy of a Linear Regression model (Supervised Learning):
1. Creating different models for different cluster groups.
2. Creating an input feature for cluster ids as an ordinal variable.
3. Creating an input feature for cluster centroids as a continuous variable.
4. Creating an input feature for cluster size as a continuous variable.
Options:
A. 1 only
B. 1 and 2
C. 1 and 4
D. 3 only
E. 2 and 4
F. All of the above
Solution: (F)
Creating an input feature for cluster ids as an ordinal variable or creating an input feature for cluster centroids as a continuous variable might not convey any relevant information to the regression model for multidimensional data. But for clustering in a single dimension, all of the given methods are expected to convey meaningful information to the regression model. For example, to cluster people into two groups based on their hair length, storing the cluster ID as an ordinal variable and the cluster centroids as continuous variables will convey meaningful information.
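As an illustrative sketch (hypothetical data, scikit-learn assumed), cluster ids from K-Means can be appended as an extra input feature for a linear regression:

```python
# A minimal sketch: using K-Means cluster ids as an input feature for regression.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 1))
y = 3 * X[:, 0] + rng.normal(scale=0.1, size=500)

cluster_id = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
X_aug = np.column_stack([X, cluster_id])  # cluster id appended as a feature

print(LinearRegression().fit(X_aug, y).score(X_aug, y))  # R^2 on training data
```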
Q13. What could be the possible reason(s) for producing two different dendrograms using an agglomerative clustering algorithm on the same dataset?
A. Proximity function used
B. No. of data points used
C. No. of variables used
D. B and C only
E. All of the above
Solution: (E)
A change in any of the proximity function, the no. of data points, or the no. of variables used will lead to different clustering results and hence different dendrograms.
Q14. In the figure below, if you draw a horizontal line on the y-axis at y=2, what will be the number of clusters formed?
A. 1
B. 2
C. 3
D. 4
Solution: (B)
Since the number of vertical lines intersecting the red horizontal line at y=2 in the dendrogram is two, two clusters will be formed.
Q15. What is the most appropriate no. of clusters for the data points represented by the following dendrogram:
A. 2
B. 4
C. 6
D. 8
Solution: (B)
The number of clusters that can best depict the different groups can be chosen by observing the dendrogram. The best choice of the no. of clusters is the no. of vertical lines in the dendrogram cut by a horizontal line that can traverse the maximum distance vertically without intersecting a cluster.
In the above example, the best choice of no. of clusters will be 4, as the red horizontal line in the dendrogram below covers the maximum vertical distance AB.
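A minimal sketch of this workflow with SciPy (illustrative data, not the figure's points): build the linkage, draw the dendrogram, and cut it at a chosen height to obtain flat clusters:

```python
# A minimal sketch: build a dendrogram and cut it at a height to get clusters.
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

X = np.random.default_rng(0).normal(size=(30, 2))

Z = linkage(X, method="average")
dendrogram(Z)
plt.show()

labels = fcluster(Z, t=2.0, criterion="distance")  # cut the tree at height 2.0
print(np.unique(labels))  # cluster ids formed below the cut
```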
Q16. In which of the following cases will K-Means clustering fail to give good results?
1. Data points with outliers
2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes
Options:
A. 1 and 2
B. 2 and 3
C. 2 and 4
D. 1, 2 and 4
E. 1, 2, 3 and 4
Solution: (D)
The K-Means clustering algorithm fails to give good results when the data contains outliers, when the density spread of data points across the data space is different, and when the data points follow non-convex shapes.
Q17. Which of the following metrics do we have for finding dissimilarity between two clusters in hierarchical clustering?
1. Single-link
2. Complete-link
3. Average-link
Options:
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. 1, 2 and 3
Solution: (D)
All three methods, i.e. single link, complete link and average link, can be used for finding dissimilarity between two clusters in hierarchical clustering.
Q18. Which of the following are true?
1. Clustering analysis is negatively affected by multicollinearity of features
2. Clustering analysis is negatively affected by heteroscedasticity
Options:
A. 1 only
B. 2 only
C. 1 and 2
D. None of them
Solution: (A)
Clustering analysis is not negatively affected by heteroscedasticity, but the results are negatively impacted by multicollinearity of the features/variables used in clustering, as a correlated feature/variable will carry extra weight in the distance calculation beyond what is desired.
Q19. Given six points with the following attributes:
Which of the following clustering representations and dendrograms depicts the use of the MIN or single-link proximity function in hierarchical clustering:
A.
B.
C.
D.
Solution: (A)
For the single-link or MIN version of hierarchical clustering, the proximity of two clusters is defined to be the minimum of the distance between any two points in the different clusters. For instance, from the table, we see that the distance between points 3 and 6 is 0.11, and that is the height at which they are joined into one cluster in the dendrogram. As another example, the distance between clusters {3, 6} and {2, 5} is given by dist({3, 6}, {2, 5}) = min(dist(3, 2), dist(6, 2), dist(3, 5), dist(6, 5)) = min(0.1483, 0.2540, 0.2843, 0.3921) = 0.1483.
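A minimal sketch of single-link clustering with SciPy; the six 2-D coordinates below are assumed for illustration and are not the article's table:

```python
# A minimal sketch: single-link (MIN) hierarchical clustering with SciPy.
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

points = np.array([[0.40, 0.53], [0.22, 0.38], [0.35, 0.32],
                   [0.26, 0.19], [0.08, 0.41], [0.45, 0.30]])  # placeholder data

# 'single' merges the pair of clusters with the smallest point-to-point distance.
Z = linkage(pdist(points), method="single")
print(Z)  # each row: [cluster i, cluster j, merge height, new cluster size]
```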
Q20. Given six points with the following attributes:
Which of the following clustering representations and dendrograms depicts the use of the MAX or complete-link proximity function in hierarchical clustering:
A.
B.
C.
D.
Solution: (B)
For the complete-link or MAX version of hierarchical clustering, the proximity of two clusters is defined to be the maximum of the distance between any two points in the different clusters. Similarly, here points 3 and 6 are merged first. However, {3, 6} is merged with {4} instead of {2, 5}. This is because dist({3, 6}, {4}) = max(dist(3, 4), dist(6, 4)) = max(0.1513, 0.2216) = 0.2216, which is smaller than dist({3, 6}, {2, 5}) = max(dist(3, 2), dist(6, 2), dist(3, 5), dist(6, 5)) = max(0.1483, 0.2540, 0.2843, 0.3921) = 0.3921 and dist({3, 6}, {1}) = max(dist(3, 1), dist(6, 1)) = max(0.2218, 0.2347) = 0.2347.
Q21. Given six points with the following attributes:
Which of the following clustering representations and dendrograms depicts the use of the group average proximity function in hierarchical clustering:
A.
B.
C.
D.
Solution: (C)
For the group average version of hierarchical clustering, the proximity of two clusters is defined to be the average of the pairwise proximities between all pairs of points in the different clusters. This is an intermediate approach between MIN and MAX. This is expressed by the following equation:

proximity(Ci, Cj) = ( Σ x∈Ci, y∈Cj dist(x, y) ) / ( |Ci| × |Cj| )

Here are the distances between some of the clusters: dist({3, 6, 4}, {1}) = (0.2218 + 0.3688 + 0.2347)/(3 × 1) = 0.2751. dist({2, 5}, {1}) = (0.2357 + 0.3421)/(2 × 1) = 0.2889. dist({3, 6, 4}, {2, 5}) = (0.1483 + 0.2843 + 0.2540 + 0.3921 + 0.2042 + 0.2932)/(3 × 2) = 0.2637. Because dist({3, 6, 4}, {2, 5}) is smaller than dist({3, 6, 4}, {1}) and dist({2, 5}, {1}), these two clusters are merged at the fourth stage.
Q22. Given six points with the following attributes:
Which of the following clustering representations and dendrograms depicts the use of Ward's method proximity function in hierarchical clustering:
A.
B.
C.
D.
Solution: (D)
Ward's method is a centroid method. The centroid method calculates the proximity between two clusters by calculating the distance between the centroids of the clusters. For Ward's method, the proximity between two clusters is defined as the increase in the squared error that results when the two clusters are merged. Applying Ward's method to the sample data set of six points produces a clustering that is somewhat different from those produced by MIN, MAX and group average.
Q23. What should be the best choice of no. of clusters based on the following results:
A. 1
B. 2
C. 3
D. 4
Solution: (C)
The silhouette coefficient is a measure of how similar an object is to its own cluster compared to other clusters. The number of clusters for which the silhouette coefficient is highest represents the best choice of the number of clusters.
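A minimal sketch (scikit-learn, illustrative blobs) of picking the number of clusters by the highest average silhouette coefficient:

```python
# A minimal sketch: choose k by the highest average silhouette coefficient.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=500, centers=3, random_state=0)

for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(k, silhouette_score(X, labels))
# Pick the k with the highest score.
```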
Q24. Which of the following is/are valid iterative strategies for treating missing values before clustering analysis?
A. Imputation with mean
B. Nearest Neighbor assignment
C. Imputation with Expectation Maximization algorithm
D. All of the above
Solution: (C)
All of the mentioned techniques are valid for treating missing values before clustering analysis, but only imputation with the EM algorithm is iterative in its functioning.
Q25. The K-Means algorithm has some limitations. One of its limitations is that it makes hard assignments of points to clusters (a point either completely belongs to a cluster or does not belong at all).
Note: Soft assignment can be considered as the probability of being assigned to each cluster (say K = 3, and for some point xn, p1 = 0.7, p2 = 0.2, p3 = 0.1).
Which of the following algorithm(s) allows soft assignments?
1. Gaussian mixture models
2. Fuzzy K-means
Options:
A. 1 only
B. 2 only
C. 1 and 2
D. None of these
Solution: (C)
Both Gaussian mixture models and Fuzzy K-means allow soft assignments.
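A minimal sketch of soft assignment with a Gaussian mixture model in scikit-learn (illustrative data):

```python
# A minimal sketch: a Gaussian mixture model yields soft assignments,
# i.e. per-cluster membership probabilities for each point.
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
probs = gmm.predict_proba(X)  # shape (n_samples, 3); each row sums to 1
print(probs[0])               # e.g. something like [0.7, 0.2, 0.1]
```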
Q26. Assume you want to cluster 7 observations into 3 clusters using the K-Means clustering algorithm. After the first iteration, clusters C1, C2 and C3 have the following observations:
C1: {(2,2), (4,4), (6,6)}
C2: {(0,4), (4,0)}
C3: {(5,5), (9,9)}
What will be the cluster centroids if you want to proceed to the second iteration?
A. C1: (4,4), C2: (2,2), C3: (7,7)
B. C1: (6,6), C2: (4,4), C3: (9,9)
C. C1: (2,2), C2: (0,0), C3: (5,5)
D. None of these
Solution: (A)
Finding the centroid for data points in cluster C1 = ((2+4+6)/3, (2+4+6)/3) = (4, 4)
Finding the centroid for data points in cluster C2 = ((0+4)/2, (4+0)/2) = (2, 2)
Finding the centroid for data points in cluster C3 = ((5+9)/2, (5+9)/2) = (7, 7)
Hence, C1: (4,4), C2: (2,2), C3: (7,7)
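The same centroid computation in NumPy, for reference:

```python
# A minimal sketch: recomputing the Q26 centroids with NumPy.
import numpy as np

C1 = np.array([(2, 2), (4, 4), (6, 6)])
C2 = np.array([(0, 4), (4, 0)])
C3 = np.array([(5, 5), (9, 9)])

for name, cluster in [("C1", C1), ("C2", C2), ("C3", C3)]:
    print(name, cluster.mean(axis=0))  # C1 (4,4), C2 (2,2), C3 (7,7)
```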
Q27. Assume you want to cluster 7 observations into 3 clusters using the K-Means clustering algorithm. After the first iteration, clusters C1, C2 and C3 have the following observations:
C1: {(2,2), (4,4), (6,6)}
C2: {(0,4), (4,0)}
C3: {(5,5), (9,9)}
What will be the Manhattan distance of observation (9, 9) from cluster centroid C1 in the second iteration?
A. 10
B. 5*sqrt(2)
C. 13*sqrt(2)
D. None of these
Solution: (A)
Manhattan distance between centroid C1, i.e. (4, 4), and (9, 9) = (9-4) + (9-4) = 10
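For reference, the same Manhattan (cityblock) distance via SciPy:

```python
# A minimal sketch: Manhattan distance between the centroid and the point.
from scipy.spatial.distance import cityblock

print(cityblock((4, 4), (9, 9)))  # |9-4| + |9-4| = 10
```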
Q28. If two variables, V1 and V2, are used for clustering, which of the following are true for K-Means clustering with k = 3?
1. If V1 and V2 have a correlation of 1, the cluster centroids will be in a straight line
2. If V1 and V2 have a correlation of 0, the cluster centroids will be in a straight line
Options:
A. 1 only
B. 2 only
C. 1 and 2
D. None of the above
Solution: (A)
If the correlation between the variables V1 and V2 is 1, then all the data points will lie on a straight line. Hence, all three cluster centroids will form a straight line as well.
Q29. Feature scaling is an important step before applying the K-Means algorithm. What is the reason behind this?
A. In distance calculation, it will give the same weight to all features
B. You always get the same clusters whether or not you use feature scaling
C. In Manhattan distance it is an important step, but in Euclidean distance it is not
D. None of these
Solution: (A)
Feature scaling ensures that all the features get the same weight in the clustering analysis. Consider a scenario of clustering people based on their weight (in kg) with range 55-110 and height (in feet) with range 5.6 to 6.4. In this case, the clusters produced without scaling can be very misleading, as the range of weight is much higher than that of height. Therefore, it's necessary to bring them to the same scale so that they have equal weightage in the clustering result.
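A minimal sketch of this scenario (illustrative ranges as above, scikit-learn assumed): standardize the features so weight does not dominate the distance calculation:

```python
# A minimal sketch: standardize features before K-Means so weight (kg)
# doesn't dominate height (feet) in the distance calculation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
weight = rng.uniform(55, 110, 200)   # large numeric range
height = rng.uniform(5.6, 6.4, 200)  # small numeric range
X = np.column_stack([weight, height])

X_scaled = StandardScaler().fit_transform(X)  # zero mean, unit variance per feature
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)
```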
Q30. Which of the following methods is used for finding the optimal number of clusters in the K-Means algorithm?
A. Elbow method
B. Manhattan method
C. Euclidean method
D. All of the above
E. None of these
Solution: (A)
Out of the given options, only the elbow method is used for finding the optimal number of clusters. The elbow method looks at the percentage of variance explained as a function of the number of clusters: one should choose a number of clusters such that adding another cluster doesn't give much better modeling of the data.
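A minimal sketch of the elbow method (scikit-learn, illustrative data): plot the within-cluster SSE (inertia) against k and look for the bend:

```python
# A minimal sketch: the "elbow" is where adding another cluster stops
# substantially reducing the within-cluster SSE.
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

ks = range(1, 10)
sse = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in ks]

plt.plot(ks, sse, marker="o")
plt.xlabel("number of clusters k")
plt.ylabel("within-cluster SSE")
plt.show()
```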
Q31. What is true about K-Means clustering?
1. K-Means is extremely sensitive to cluster center initializations
2. Bad initialization can lead to poor convergence speed
3. Bad initialization can lead to bad overall clustering
Options:
A. 1 and 3
B. 1 and 2
C. 2 and 3
D. 1, 2 and 3
Solution: (D)
All three of the given statements are true. K-Means is extremely sensitive to cluster center initialization. Also, bad initialization can lead to poor convergence speed as well as bad overall clustering.
Q32. Which of the following can be applied to get good results for the K-Means algorithm, corresponding to the global minimum?
1. Try to run the algorithm for different centroid initializations
2. Adjust the number of iterations
3. Find out the optimal number of clusters
Options:
A. 2 and 3
B. 1 and 3
C. 1 and 2
D. All of the above
Solution: (D)
All of these are standard practices that are used in order to obtain good clustering results.
Q33. What should be the best choice for number of clusters based on the following results:
A. 5
B. 6
C. 14
D. Greater than 14
Solution: (B)
Based on the above results, the best choice of the number of clusters using the elbow method is 6.
Q34. What should be the best choice for the number of clusters based on the following results:
A. 2
B. 4
C. 6
D. 8
Solution: (C)
Generally, a higher average silhouette coefficient indicates better clustering quality. In this plot, the optimal number of clusters should be 2, at which the value of the average silhouette coefficient is highest. However, the SSE of this clustering solution (k = 2) is too large. At k = 6, the SSE is much lower. In addition, the value of the average silhouette coefficient at k = 6 is also very high, only slightly lower than that at k = 2. Thus, the best choice is k = 6.
Q35. Which of the following sequences is correct for a K-Means algorithm using the Forgy method of initialization?
1. Specify the number of clusters
2. Assign cluster centroids randomly
3. Assign each data point to the nearest cluster centroid
4. Re-assign each point to the nearest cluster centroid
5. Re-compute cluster centroids
Options:
A. 1, 2, 3, 5, 4
B. 1, 3, 2, 4, 5
C. 2, 1, 3, 4, 5
D. None of these
Solution: (A)
The methods used for initialization in K-Means are Forgy and Random Partition. The Forgy method randomly chooses k observations from the data set and uses these as the initial means. The Random Partition method first randomly assigns a cluster to each observation and then proceeds to the update step, thus computing the initial mean to be the centroid of each cluster's randomly assigned points.
Q36. If you are using multinomial mixture models with the expectation-maximization algorithm for clustering a set of data points into two clusters, which of the following assumptions is important:
A. All the data points follow two Gaussian distributions
B. All the data points follow n Gaussian distributions (n > 2)
C. All the data points follow two multinomial distributions
D. All the data points follow n multinomial distributions (n > 2)
Solution: (C)
In the EM algorithm for clustering, it's essential to choose the same no. of clusters to classify the data points into as the no. of different distributions they are expected to be generated from, and the distributions must also be of the same type.
Q37. Which of the following is/are not true about the centroid-based K-Means clustering algorithm and the distribution-based expectation-maximization clustering algorithm:
1. Both start with random initializations
2. Both are iterative algorithms
3. Both have strong assumptions that the data points must fulfill
4. Both are sensitive to outliers
5. The expectation-maximization algorithm is a special case of K-Means
6. Both require prior knowledge of the no. of desired clusters
7. The results produced by both are non-reproducible
Options:
A. 1 only
B. 5 only
C. 1 and 3
D. 6 and 7
E. 4, 6 and 7
F. None of the above
Solution: (B)
All of the above statements are true except the 5th; instead, K-Means is a special case of the EM algorithm in which only the centroids of the cluster distributions are calculated at each iteration.
Q38. Which of the following is/are not true about the DBSCAN clustering algorithm:
1. For data points to be in a cluster, they must be within a distance threshold of a core point
2. It has strong assumptions for the distribution of data points in the data space
3. It has a substantially high time complexity of order O(n³)
4. It does not require prior knowledge of the no. of desired clusters
5. It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
E. 1 and 5
F. 1, 3 and 5
Solution: (D)
- DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions for the distribution of data points in the data space.
- DBSCAN has a low time complexity of order O(n log n) only (see the sketch below).
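A minimal DBSCAN sketch (scikit-learn, illustrative two-moons data): no preset cluster count is needed, and outliers are flagged rather than forced into clusters:

```python
# A minimal sketch: DBSCAN needs no preset cluster count and labels
# outliers as -1 instead of forcing them into a cluster.
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=400, noise=0.08, random_state=0)  # non-convex shapes

labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)
print(set(labels))  # cluster ids; -1 marks noise/outlier points
```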
Q39. Which of the following are the high and low bounds for the existence of the F-score?
A. [0,1]
B. (0,1)
C. [-1,one]
D. None of the above
Solution: (A)
The lowest and highest possible values of the F-score are 0 and 1, with 1 representing that every data point is assigned to the correct cluster and 0 representing that the precision and/or recall of the clustering analysis are 0. In clustering analysis, a high value of the F-score is desired.
Q40. Following are the results observed for clustering 6,000 data points into three clusters: A, B and C:
What is the F1-score with respect to cluster B?
A. 3
B. 4
C. 5
D. 6
Solution: (D)
Here,
True Positive, TP = 1200
True Negative, TN = 600 + 1600 = 2200
False Positive, FP = 1000 + 200 = 1200
False Negative, FN = 400 + 400 = 800
Therefore,
Precision = TP / (TP + FP) = 0.5
Recall = TP / (TP + FN) = 0.6
Hence,
F1 = 2 * (Precision * Recall) / (Precision + Recall) = 0.54 ~ 0.5
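The same arithmetic, reproduced in a few lines of Python:

```python
# A minimal sketch: reproducing the Q40 precision/recall/F1 arithmetic.
tp, fp, fn = 1200, 1000 + 200, 400 + 400

precision = tp / (tp + fp)  # 1200 / 2400 = 0.5
recall = tp / (tp + fn)     # 1200 / 2000 = 0.6
f1 = 2 * precision * recall / (precision + recall)
print(precision, recall, f1)  # f1 ≈ 0.545, i.e. about 0.5
```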
End Notes
I hope you enjoyed taking the test and found the solutions helpful. The test focused on conceptual as well as practical knowledge of clustering fundamentals and its various techniques.
I tried to clear all your doubts through this article, but if we have missed out on something, then let us know in the comments below. Also, if you have any suggestions or improvements you think we should make in the next skill test, you can let us know by dropping your feedback in the comments section.
Learn, compete, hack and get hired!
Source: https://www.analyticsvidhya.com/blog/2017/02/test-data-scientist-clustering/