Exam: DP-100: Designing and Implementing a Data Science Solution on Azure

Total Questions: 304

DRAG DROP -

You are planning to host practical training to acquaint staff with Docker for Windows.

Staff devices must support the installation of Docker.

Which of the following are requirements for this installation? Answer by dragging the correct options from the list to the answer area.

Select and Place:
Question image
Answer:
Answer image

This question is part of a series of questions that share the same setup; however, each question has a distinct result. Determine whether the recommendation satisfies the requirements.
You have been tasked with evaluating your model on a partial data sample via k-fold cross-validation.
You have already configured a k parameter as the number of splits. You now have to configure the k parameter for the cross-validation with the usual value choice.
Recommendation: You configure the use of the value k=3.
Will the requirements be satisfied?
A.
Yes
B.
No
Answer: B ✅ Explanation
Scenario recap:
- You need to evaluate your model on a partial data sample using k-fold cross-validation.
- You must set the k parameter (number of folds) to the usual value choice.
Analysis:
- k-fold cross-validation partitions the data into k subsets ("folds").
- The most common choices are k = 5 or k = 10, because they offer a good balance between bias and variance of the performance estimate.
- k = 3 is valid and will work, but it produces fewer folds and therefore a higher-variance estimate; it is only occasionally used when data volume is very limited or training time is a major constraint.
- Since the requirement calls for the usual value choice, k = 3 does not satisfy it.
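Outside the designer, the effect of the fold count can be compared directly with scikit-learn's `cross_val_score` (a code-first sketch, not an Azure ML Studio API; the dataset here is synthetic):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a "partial data sample"
X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression(max_iter=1000)

# The usual choices are k=5 or k=10; k=3 yields fewer folds and a
# higher-variance estimate of performance.
for k in (3, 5, 10):
    scores = cross_val_score(model, X, y, cv=k)
    print(f"k={k}: mean accuracy = {scores.mean():.3f}")
```

Each run returns one accuracy score per fold, so `cv=10` produces ten scores whose mean is the cross-validated estimate.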

This question is part of a series of questions that share the same setup; however, each question has a distinct result. Determine whether the recommendation satisfies the requirements.
You have been tasked with evaluating your model on a partial data sample via k-fold cross-validation.
You have already configured a k parameter as the number of splits. You now have to configure the k parameter for the cross-validation with the usual value choice.
Recommendation: You configure the use of the value k=10.
Will the requirements be satisfied?

A.
Yes
B.
No
Answer: A ✅ Explanation
The usual/typical values of k in k-fold cross-validation are k = 5 or k = 10. Using k = 10 is a very common choice in practice, as it offers a good balance between bias and variance of the performance estimate. The recommendation therefore satisfies the requirements, because k = 10 is a usual value choice for k-fold cross-validation.

You construct a machine learning experiment via Azure Machine Learning Studio.
You would like to split data into two separate datasets.
Which of the following actions should you take?
A.
You should make use of the Split Data module.
B.
You should make use of the Group Categorical Values module.
C.
You should make use of the Clip Values module.
D.
You should make use of the Group Data into Bins module.
Answer: A ✅ Explanation
The Split Data module is specifically designed to divide a dataset into two subsets (e.g., training and testing data, or other partitions). The other modules do not split datasets:
- Group Categorical Values: combines categories together.
- Clip Values: limits numeric values to a specified range.
- Group Data into Bins: converts continuous values into discrete bins (binning).
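For comparison, the same two-way split can be done in code with scikit-learn's `train_test_split` (a sketch of the code-first analogue of the Split Data module, not the module itself):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic dataset standing in for the experiment's data
X, y = make_classification(n_samples=100, random_state=0)

# A 0.7 split fraction in Split Data corresponds roughly to:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, random_state=0)
print(len(X_train), len(X_test))  # 70 30
```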

You have been tasked with creating a new Azure pipeline via the Machine Learning designer.
You have to make sure that the pipeline trains a model using data in a comma-separated values (CSV) file that is published on a website. A dataset for this file does not exist.
Data from the CSV file must be ingested into the designer pipeline with the least administrative effort possible.
Which of the following actions should you take?
A.
You should make use of the Convert to TXT module.
B.
You should add the Copy Data object to the pipeline.
C.
You should add the Import Data object to the pipeline.
D.
You should add the Dataset object to the pipeline.
Answer: C ✅ Explanation
Scenario recap: you need to ingest data from a CSV file on a website, no dataset has been created yet, and you want the least administrative effort.
C. Import Data object: the Import Data module lets you pull data directly from a URL (HTTP/HTTPS) without first creating and registering a dataset manually. It is designed exactly for this scenario and is the simplest way to ingest web-hosted CSV data, with minimal administrative overhead.
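In code-first workflows, the same ingestion can be sketched with pandas, which reads a web-hosted CSV straight from an HTTP(S) URL; the inline CSV text below is a local stand-in so the snippet is self-contained and runnable:

```python
import io
import pandas as pd

# Local stand-in for a web-hosted CSV; pd.read_csv also accepts an
# HTTP(S) URL directly, e.g. pd.read_csv("https://example.com/data.csv")
# (hypothetical URL, for illustration only).
csv_text = "age,income\n34,52000\n29,48000\n"
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)  # (2, 2)
```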

This question is part of a series of questions that share the same setup; however, each question has a distinct result. Determine whether the recommendation satisfies the requirements.
You are in the process of creating a machine learning model. Your dataset includes rows with null and missing values.
You plan to make use of the Clean Missing Data module in Azure Machine Learning Studio to detect and fix the null and missing values in the dataset.
Recommendation: You make use of the Replace with median option.
Will the requirements be satisfied?

A.
Yes
B.
No
Answer: A ✅ Explanation
The Clean Missing Data module in Azure Machine Learning Studio is explicitly designed to detect and handle missing/null values. One of its options, Replace with median, replaces missing values in each column with the median of the non-missing values. This is a standard, valid imputation technique and meets the goal of detecting and fixing null and missing values.
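The same median imputation can be sketched in code with scikit-learn's `SimpleImputer` (a code-first analogue, not the designer module itself):

```python
import numpy as np
from sklearn.impute import SimpleImputer

# One numeric column with a missing value
X = np.array([[1.0], [2.0], [np.nan], [4.0]])

imputer = SimpleImputer(strategy="median")
X_clean = imputer.fit_transform(X)
# The median of [1, 2, 4] is 2.0, so the NaN is replaced with 2.0
```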

This question is part of a series of questions that share the same setup; however, each question has a distinct result. Determine whether the recommendation satisfies the requirements.
You are in the process of creating a machine learning model. Your dataset includes rows with null and missing values.
You plan to make use of the Clean Missing Data module in Azure Machine Learning Studio to detect and fix the null and missing values in the dataset.
Recommendation: You make use of the Custom substitution value option.
Will the requirements be satisfied?

A.
Yes
B.
No
Answer: A ✅ Explanation
The Clean Missing Data module is specifically designed to detect and handle null or missing values. The Custom substitution value option replaces missing values with a value you specify (for example, 0, "Unknown", or another default). This fulfills the requirement to detect and fix null and missing values in the dataset.
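Custom substitution maps to `SimpleImputer` with a constant fill value in scikit-learn (again a code-first sketch, not the designer module):

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0], [np.nan], [3.0]])

# Replace every missing value with a user-chosen constant (here 0.0)
imputer = SimpleImputer(strategy="constant", fill_value=0.0)
X_clean = imputer.fit_transform(X)
```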

This question is part of a series of questions that share the same setup; however, each question has a distinct result. Determine whether the recommendation satisfies the requirements.
You are in the process of creating a machine learning model. Your dataset includes rows with null and missing values.
You plan to make use of the Clean Missing Data module in Azure Machine Learning Studio to detect and fix the null and missing values in the dataset.
Recommendation: You make use of the Remove entire row option.

Will the requirements be satisfied?

A.
Yes
B.
No
Answer: A ✅ Explanation
- The Clean Missing Data module can detect rows with null or missing values.
- The Remove entire row option deletes any row that contains at least one missing value, effectively cleaning the data.
- This meets the requirement to detect and fix (in this case, by removing) the missing/null values.
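Removing entire rows with missing values corresponds to pandas `dropna` in code-first workflows (a sketch on a small synthetic frame):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0],
                   "b": [4.0, 5.0, np.nan]})

# Drop every row that contains at least one missing value
cleaned = df.dropna()
print(len(cleaned))  # 1 (only the first row has no NaN)
```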

You need to consider the underlined segment to establish whether it is accurate.
To transform a categorical feature into a binary indicator, you should make use of the Clean Missing Data module.
Select `No adjustment required` if the underlined segment is accurate. If the underlined segment is inaccurate, select the accurate option.
A.
No adjustment required.
B.
Convert to Indicator Values
C.
Apply SQL Transformation
D.
Group Categorical Values
Answer: B ✅ Explanation
Clean Missing Data is used for detecting and handling missing or null values, not for encoding categories. To transform (encode) categorical values into binary indicators (dummy variables), you must use Convert to Indicator Values. Therefore, the underlined segment is inaccurate.
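In code, pandas `get_dummies` produces the same kind of binary indicator (dummy) columns as Convert to Indicator Values (a code-first sketch, not the designer module):

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "blue", "red"]})

# One binary indicator column per category value
indicators = pd.get_dummies(df["color"])
print(list(indicators.columns))  # ['blue', 'red']
```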

You need to consider the underlined segment to establish whether it is accurate.
To improve the amount of low incidence cases in a dataset, you should make use of the SMOTE module.
Select `No adjustment required` if the underlined segment is accurate. If the underlined segment is inaccurate, select the accurate option.

A.
No adjustment required.
B.
Remove Duplicate Rows
C.
Join Data
D.
Edit Metadata
Answer: A ✅ Explanation
SMOTE stands for Synthetic Minority Over-sampling Technique. It is used specifically to increase the number of low-incidence (minority class) cases by creating synthetic samples, which helps address class imbalance in datasets.
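In code-first workflows the usual implementation is `SMOTE` from `imblearn.over_sampling`; the NumPy sketch below hand-rolls only the core idea to show what "synthetic samples" means: each new point is interpolated between a minority sample and its nearest minority neighbor. The function name and the tiny 2-D minority set are illustrative assumptions, not a library API.

```python
import numpy as np

def smote_sketch(X_minority, n_new, seed=0):
    """Minimal SMOTE-style oversampling sketch: interpolate each new
    point between a random minority sample and its nearest minority
    neighbor."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)
    synthetic = []
    for _ in range(n_new):
        i = int(rng.integers(len(X)))
        # Distance to every other minority sample; exclude the point itself
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf
        j = int(np.argmin(d))
        gap = rng.random()  # random position along the line segment
        synthetic.append(X[i] + gap * (X[j] - X[i]))
    return np.array(synthetic)

minority = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
new_points = smote_sketch(minority, n_new=5)
```

Because each synthetic point lies on a segment between two existing minority points, the new samples stay inside the minority class's region of feature space.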