The AWS DAS-C01 exam preparation guide gives candidates the information they need about the Data Analytics Specialty exam. It includes an exam summary, sample questions, a practice test, the exam objectives, and guidance on interpreting those objectives, so that candidates can assess the types of questions that may be asked during the AWS Certified Data Analytics - Specialty exam.
All candidates are advised to review the DAS-C01 objectives and sample questions provided in this preparation guide. The AWS Data Analytics Specialty certification is aimed mainly at candidates who want to build a career in the Specialty domain and demonstrate their expertise. We suggest using the practice exam listed in this guide to get used to the exam environment and to identify the knowledge areas that need more work before you take the actual AWS Certified Data Analytics - Specialty exam.
AWS DAS-C01 Exam Summary:
Exam Name | AWS Certified Data Analytics - Specialty (Data Analytics Specialty) |
Exam Code | DAS-C01 |
Exam Price | $300 USD |
Duration | 180 minutes |
Number of Questions | 65 |
Passing Score | 750 / 1000 |
Recommended Training / Books | Data Analytics Fundamentals; Building Data Lakes on AWS; Building Batch Data Analytics Solutions on AWS; Building Data Analytics Solutions using Amazon Redshift |
Schedule Exam | AWS Certification |
Sample Questions | AWS DAS-C01 Sample Questions |
Recommended Practice | AWS Certified Data Analytics - Specialty Practice Test |
AWS Data Analytics Specialty Syllabus:
Collection - 18%

Determine the operational characteristics of the collection system.
- Confirm that data loss is within tolerance limits in the event of failures.
- Evaluate costs associated with data acquisition, transfer, and provisioning from various sources into the collection system (for example, networking, bandwidth, ETL, data migration).
- Assess the failure scenarios that the collection system may experience, and take remediation actions based on impact.
- Determine data persistence at various points of data capture.
- Identify the latency characteristics of the collection system.

Select a collection system that handles the frequency, volume, and source of data.
- Describe and characterize the volume and flow characteristics of incoming data (for example, streaming, transactional, batch).
- Match the flow characteristics of data to potential solutions.
- Assess the tradeoffs between various ingestion services, and take into account scalability, cost, fault tolerance, and latency.
- Explain the throughput capability of a variety of types of data collection solutions, and identify bottlenecks.
- Choose a collection solution that satisfies connectivity constraints of the source data system.

Select a collection system that addresses the key properties of data, such as order, format, and compression.
- Describe how to capture data changes at the source.
- Discuss data structure and format, compression applied, and encryption requirements.
- Distinguish the impact of out-of-order delivery of data, duplicate delivery of data, and the tradeoffs between at-most-once, exactly-once, and at-least-once processing.
- Describe how to transform and filter data during the collection process.

Storage and Data Management - 22%

Determine the operational characteristics of the storage solution for analytics.
- Determine the appropriate storage service or services on the basis of cost compared to performance.
- Understand the durability, reliability, and latency characteristics of the storage solution based on requirements.
- Determine the requirements of a system for strong or eventual consistency of the storage system.
- Determine the appropriate storage solution to address data freshness requirements.

Determine data access and retrieval patterns.
- Determine the appropriate storage solution based on update patterns (for example, bulk, transactional, micro batching).
- Determine the appropriate storage solution based on access patterns (for example, sequential or random access, continuous usage or one-time usage).
- Determine the appropriate storage solution to address change characteristics of data (append-only changes or updates).
- Determine the appropriate storage solution for long-term storage and transient storage.
- Determine the appropriate storage solution for structured data and semi-structured data.
- Determine the appropriate storage solution to address query latency requirements.

Select appropriate data layout, schema, structure, and format.
- Determine appropriate mechanisms to address schema evolution requirements.
- Select the appropriate storage format for a specific task.
- Select the appropriate compression and encoding strategies for a chosen storage format.
- Select the appropriate data sorting and distribution strategies and the storage layout for efficient data access.
- Explain the cost and performance implications of different data distributions, layouts, and formats (for example, size and number of files).
- Implement data formatting and partitioning schemes for data-optimized analysis.

Define data lifecycles based on usage patterns and business requirements.
- Determine the appropriate strategy to address data lifecycle requirements.
- Apply appropriate lifecycle and data retention policies to different storage solutions.

Determine the appropriate system to catalog data and to manage metadata.
- Evaluate mechanisms to discover new and updated data sources.
- Evaluate mechanisms to create and update data catalogs and metadata.
- Explain mechanisms to search and retrieve data catalogs and metadata.
- Explain mechanisms to tag and classify data.

Processing - 24%

Determine appropriate data processing solution requirements.
- Understand data preparation and usage requirements.
- Understand different types of data sources and targets.
- Evaluate performance and orchestration needs.
- Evaluate appropriate services for cost, scalability, and availability.

Design a solution to transform and prepare data for analysis.
- Apply appropriate ETL and ELT techniques for batch workloads and real-time workloads.
- Implement failover, scaling, and replication mechanisms.
- Implement techniques to address concurrency needs.
- Implement techniques to improve cost-optimization efficiencies.
- Orchestrate workflows.
- Aggregate and enrich data for downstream consumption.

Automate and operationalize data processing solutions.
- Implement automated techniques for repeatable workflows.
- Apply methods to identify and recover from processing failures.
- Deploy logging and monitoring solutions to enable auditing and traceability.

Analysis and Visualization - 18%

Determine the operational characteristics of an analysis and visualization solution.
- Determine costs associated with analysis and visualization.
- Determine scalability associated with analysis.
- Determine failover recovery and fault tolerance within the RPO and RTO.
- Determine the availability characteristics of an analysis tool.
- Evaluate dynamic, interactive, and static presentations of data.
- Translate performance requirements to an appropriate visualization approach (for example, pre-compute and consume static data, consume dynamic data).

Select the appropriate data analysis solution for a given scenario.
- Evaluate and compare analysis solutions.
- Select the right type of analysis based on the customer use case (for example, streaming, interactive, collaborative, operational).

Select the appropriate data visualization solution for a given scenario.
- Evaluate output capabilities for a given analysis solution (for example, metrics, KPIs, tabular, API).
- Choose the appropriate method for data delivery (for example, web, mobile, email, collaborative notebooks).
- Choose and define the appropriate data refresh schedule.
- Choose appropriate tools for different data freshness requirements (for example, Amazon OpenSearch Service, Amazon QuickSight, Amazon EMR notebooks).
- Understand the capabilities of visualization tools for interactive use cases (for example, drill down, drill through, pivot).
- Implement the appropriate data access mechanism (for example, in memory, direct access).
- Implement an integrated solution from multiple heterogeneous data sources.

Security - 18%

Select appropriate authentication and authorization mechanisms.
- Implement appropriate authentication methods (for example, federated access, SSO, AWS Identity and Access Management [IAM]).
- Implement appropriate authorization methods (for example, policies, ACLs, table and column level permissions).
- Implement appropriate access control mechanisms (for example, security groups, role-based controls).

Apply data protection and encryption techniques.
- Determine data encryption and masking needs.
- Apply different encryption approaches (for example, server-side encryption, client-side encryption, AWS Key Management Service [AWS KMS], AWS CloudHSM).
- Implement at-rest and in-transit encryption mechanisms.
- Implement data obfuscation and masking techniques.
- Apply basic principles of key rotation and secrets management.

Apply data governance and compliance controls.
- Determine data governance and compliance requirements.
- Understand and configure access and audit logging across data analytics services.
- Implement appropriate controls to meet compliance requirements.
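
The Collection domain above asks candidates to distinguish the impact of duplicate and out-of-order delivery and the tradeoffs between at-most-once, exactly-once, and at-least-once processing. As a study aid only, the following Python sketch shows one common way a consumer might tolerate at-least-once delivery: it remembers the IDs it has already applied and skips repeats. The `Record` type, its `record_id` field, and the `IdempotentConsumer` class are hypothetical names chosen for illustration; they do not come from the exam guide or from any AWS SDK, and the in-memory set stands in for whatever durable store a real pipeline would need.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    # Hypothetical record shape; real streams carry their own identifiers.
    record_id: str
    payload: str


@dataclass
class IdempotentConsumer:
    """Applies each record once, even if the source delivers it repeatedly."""
    seen_ids: set = field(default_factory=set)
    applied: list = field(default_factory=list)

    def handle(self, record: Record) -> bool:
        # A repeated ID means a duplicate delivery, so skip it.
        if record.record_id in self.seen_ids:
            return False
        self.seen_ids.add(record.record_id)
        self.applied.append(record.payload)
        return True


def process_stream(records):
    """Feed every delivered record to a fresh consumer and return what was applied."""
    consumer = IdempotentConsumer()
    for record in records:
        consumer.handle(record)
    return consumer.applied


# Simulated at-least-once delivery: record "a" is redelivered after "b".
deliveries = [
    Record("a", "first"),
    Record("b", "second"),
    Record("a", "first"),
    Record("c", "third"),
]
result = process_stream(deliveries)
```

With deduplication on the consumer side, at-least-once delivery produces the same applied results as delivering each record once, at the cost of tracking the IDs already seen. Note that this sketch does not restore ordering; out-of-order delivery needs separate handling, such as sequence numbers.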