Introduction to Microsoft Certified - Azure Data Engineer Associate Exam
- Katy Morgan 
- Jun 3, 2021
- 6 min read
The Microsoft DP-200 exam is challenging, and thorough preparation is essential for success. This study guide is designed to help you prepare for the Implementing an Azure Data Solution certification exam. It contains a detailed list of the topics covered on the exam, as well as a list of preparation resources, to guide you through the study process for your certification.
DP-200 Microsoft Implementing an Azure Data Solution Exam Summary
● Exam Name: Microsoft Implementing an Azure Data Solution
● Exam Code: DP-200
● Exam Price: $165 (USD)
● Duration: 120 mins
● Number of Questions: 40-60
● Passing Score: 700 / 1000
● Books / Training: Course DP-200T01-A: Implementing an Azure Data Solution
● Schedule Exam: Pearson VUE
● Sample Questions: Microsoft Implementing an Azure Data Solution Sample Questions
● Recommended Practice: Microsoft DP-200 Certification Practice Exam
Exam Syllabus: DP-200 Microsoft Certified - Azure Data Engineer Associate
1. Implement Data Storage Solutions (40-45%)
● Implement non-relational data stores
- implement a solution that uses Cosmos DB, Data Lake Storage Gen2, or Blob storage
- implement data distribution and partitions
- implement a consistency model in Cosmos DB
- provision a non-relational data store
- provide access to data to meet security requirements
- implement for high availability, disaster recovery, and global distribution
● Implement relational data stores
- provide access to data to meet security requirements
- implement for high availability and disaster recovery
- implement data distribution and partitions for Azure Synapse Analytics
- implement PolyBase
● Manage data security
- implement data masking
- encrypt data at rest and in motion
2. Manage and Develop Data Processing (25-30%)
● Develop batch processing solutions
- develop batch processing solutions by using Data Factory and Azure Databricks
- ingest data by using PolyBase
- implement the integration runtime for Data Factory
- create linked services and datasets
- create pipelines and activities
- create and schedule triggers
- implement Azure Databricks clusters, notebooks, jobs, and autoscaling
- ingest data into Azure Databricks
● Develop streaming solutions
- configure input and output
- select the appropriate built-in functions
- implement event processing by using Stream Analytics
3. Monitor and Optimize Data Solutions (30-35%)
● Monitor data storage
- monitor relational and non-relational data stores
- implement Blob storage monitoring
- implement Data Lake Storage Gen2 monitoring
- implement Azure Synapse Analytics monitoring
- implement Cosmos DB monitoring
- configure Azure Monitor alerts
- implement auditing by using Azure Log Analytics
● Monitor data processing
- monitor Data Factory pipelines
- monitor Azure Databricks
- monitor Stream Analytics
- configure Azure Monitor alerts
- implement auditing by using Azure Log Analytics
● Optimize Azure data solutions
- troubleshoot data partitioning bottlenecks
- optimize Data Lake Storage Gen2
- optimize Stream Analytics
- optimize Azure Synapse Analytics
- manage the data lifecycle
Microsoft DP-200 Certification Sample Questions and Answers
To familiarize you with the structure of the Microsoft Implementing an Azure Data Solution (DP-200) certification exam, we have prepared this sample question set. We suggest that you try these sample questions to test your understanding of the DP-200 material in a format similar to the real Microsoft certification exam environment.
DP-200 Microsoft Implementing an Azure Data Solution Sample Questions:-
01. You are migrating a corporate research analytical solution from an internal data center to Azure. 45 TB of research data is currently stored in an on-premises Hadoop cluster. You plan to copy it to Azure Storage.
Your internal data center is connected to your Azure Virtual Network (VNet) with ExpressRoute private peering. The Azure Storage service endpoint is accessible from the same VNet. Corporate policy dictates that the research data cannot be transferred over the public internet.
You need to securely migrate the research data online. What should you do?
a) Transfer the data using Azure Data Factory in distributed copy (DistCp) mode, with an Azure Data Factory self-hosted Integration Runtime (IR) machine installed in the on-premises datacenter.
b) Transfer the data using Azure Data Factory in native Integration Runtime (IR) mode, with an Azure Data Factory self-hosted IR machine installed on the Azure VNet.
c) Transfer the data using Azure Data Box Heavy devices.
d) Transfer the data using Azure Data Box Disk devices.
02. Which offering provides scale-out parallel processing and dramatically accelerates performance of analytics clusters when integrated with the IBM Flash System?
a) IBM Cloud Object Storage
b) IBM Spectrum Accelerate
c) IBM Spectrum Scale
d) IBM Spectrum Connect
03. A company has an Azure SQL data warehouse. They want to use PolyBase to retrieve data from an Azure Blob storage account and ingest it into the data warehouse. The files are stored in Parquet format. The data needs to be loaded into a table called lead2pass_sales.
Which of the following actions need to be performed to implement this requirement?
(Choose 4)
a) Create an external file format that would map to the parquet-based files
b) Load the data into a staging table
c) Create an external table called lead2pass_sales_details
d) Create an external data source for the Azure Blob storage account
e) Create a master key on the database
f) Configure PolyBase to use the Azure Blob storage account
04. A company is planning on creating an Azure SQL database to support a mission critical application. The application needs to be highly available and not have any performance degradation during maintenance windows.
Which of the following technologies can be used to implement this solution?
(Choose 3)
a) Premium Service Tier
b) Virtual Machine Scale Sets
c) Basic Service Tier
d) SQL Data Sync
e) Always On Availability Groups
f) Zone-redundant configuration
05. Your company uses Azure Stream Analytics to monitor devices. The company plans to double the number of devices that are monitored.
You need to monitor a Stream Analytics job to ensure that there are enough processing resources to handle the additional load.
Which metric should you monitor?
a) Input Deserialization Errors
b) Early Input Events
c) Late Input Events
d) Watermark delay
06. The data engineering team manages Azure HDInsight clusters. The team spends a large amount of time creating and destroying clusters daily because most of the data pipeline process runs in minutes.
You need to implement a solution that deploys multiple HDInsight clusters with minimal effort. What should you implement?
a) Azure Databricks
b) Azure Traffic Manager
c) Azure Resource Manager templates
d) Ambari web user interface
07. You are a data engineer for your company. Your company has an on-premises SQL Server instance that contains 16 databases. Four of the databases require Common Language Runtime (CLR) features. You must be able to manage each database separately because each database has its own resource needs.
You plan to migrate these databases to Azure. You want to migrate the databases by using a backup and restore process by using SQL commands. You need to choose the most appropriate deployment option to migrate the databases.
What should you use?
a) Azure SQL Database with an elastic pool
b) Azure Cosmos DB with the SQL (DocumentDB) API
c) Azure SQL Database managed instance
d) Azure Cosmos DB with the Table API
08. A company is designing a hybrid solution to synchronize data from an on-premises Microsoft SQL Server database to Azure SQL Database. You must perform an assessment of the databases to determine whether data will move without compatibility issues.
You need to perform the assessment. Which tool should you use?
a) SQL Server Migration Assistant (SSMA)
b) Microsoft Assessment and Planning Toolkit
c) SQL Vulnerability Assessment (VA)
d) Azure SQL Data Sync
e) Data Migration Assistant (DMA)
09. Each day, a company plans to store hundreds of files in Azure Blob Storage and Azure Data Lake Storage. The company uses the Parquet format.
You must develop a pipeline that meets the following requirements:
- Process data every six hours
- Offer interactive data analysis capabilities
- Offer the ability to process data using solid-state drive (SSD) caching
- Use Directed Acyclic Graph (DAG) processing mechanisms
- Provide support for REST API calls to monitor processes
- Provide native support for Python
- Integrate with Microsoft Power BI
You need to select the appropriate data technology to implement the pipeline. Which data technology should you implement?
a) Azure SQL Data Warehouse
b) HDInsight Apache Storm cluster
c) Azure Stream Analytics
d) HDInsight Apache Hadoop cluster using MapReduce
e) HDInsight Spark cluster
10. You are a data engineer for an Azure SQL Database. You write the following SQL statements:
CREATE TABLE Customer (
    CustomerID int IDENTITY PRIMARY KEY,
    GivenName varchar(100) MASKED WITH (FUNCTION = 'partial(2,"XX",0)') NULL,
    SurName varchar(100) NOT NULL,
    Phone varchar(12) MASKED WITH (FUNCTION = 'default()')
);
INSERT Customer (GivenName, SurName, Phone) VALUES ('Sammy', 'Jack', '555.111.2222');
SELECT * FROM Customer;
You need to determine what is returned by the SELECT query. What data is returned?
a) 1 SaXX Jack XXX.XXX.2222
b) 1 XXXX Jack XXX.XXX.XXXX
c) 1 xx Jack XXX.XXX.2222
d) 1 SaXX Jack xxxx
Answers:-
Answer 1:- b
Answer 2:- c
Answer 3:- b, c, d, e
Answer 4:- a, e, f
Answer 5:- d
Answer 6:- c
Answer 7:- c
Answer 8:- e
Answer 9:- e
Answer 10:- d
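Answer 10 follows from how the two masking functions in the CREATE TABLE statement behave for a user who does not hold the UNMASK permission. The Python sketch below only imitates that behavior for illustration; it is not SQL Server code. The helper names mask_partial and mask_default are hypothetical, the lowercase "xxxx" output follows option d of the question, and the handling of values shorter than the exposed prefix and suffix is a simplifying assumption.

```python
def mask_partial(value: str, prefix: int, padding: str, suffix: int) -> str:
    """Imitate partial(prefix, padding, suffix): keep the first `prefix`
    characters, insert `padding`, then keep the last `suffix` characters."""
    if len(value) < prefix + suffix:
        # Simplifying assumption: expose nothing when the value is too short.
        return padding
    head = value[:prefix]
    tail = value[len(value) - suffix:] if suffix > 0 else ""
    return head + padding + tail


def mask_default(value: str, column_size: int = 12) -> str:
    """Imitate default() for a string column: a fixed run of up to four
    mask characters, regardless of the stored value."""
    return "x" * min(4, column_size)


# The row inserted in question 10: (CustomerID, GivenName, SurName, Phone).
row = (1, "Sammy", "Jack", "555.111.2222")

masked_row = (
    row[0],                              # CustomerID is not masked
    mask_partial(row[1], 2, "XX", 0),    # GivenName: partial(2,"XX",0)
    row[2],                              # SurName is not masked
    mask_default(row[3]),                # Phone: default()
)

print(*masked_row)
```

The answer key assumes the SELECT runs as a user without the UNMASK permission; a user who holds that permission, or the database owner, would see the original values instead.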



