AWS Certified Data Engineer - Associate Exam (DEA-C01) - AWS Actual Exam Questions
Last updated on April 11, 2026
A data engineer needs to build an enterprise data catalog based on the company's Amazon S3 buckets and Amazon RDS databases. The data catalog must include storage format metadata for the data in the catalog. Which solution will meet these requirements with the LEAST effort?
Use an AWS Glue crawler to scan the S3 buckets and RDS databases and build a data catalog. Use data stewards to inspect the data and update the data catalog with the data format.
Use an AWS Glue crawler to build a data catalog. Use AWS Glue crawler classifiers to recognize the format of data and store the format in the catalog.
Use Amazon Macie to build a data catalog and to identify sensitive data elements. Collect the data format information from Macie.
Use scripts to scan data elements and to assign data classifications based on the format of the data.
to join the discussion
No discussions yet. Be the first to ask!
Delete Comment
Are you sure? This action cannot be undone.
A company saves customer data to an Amazon S3 bucket. The company uses server-side encryption with AWS KMS keys (SSE-KMS) to encrypt the bucket. The dataset includes personally identifiable information (PII) such as social security numbers and account details. Data that is tagged as PII must be masked before the company uses customer data for analysis. Some users must have secure access to the PII data during the preprocessing phase. The company needs a low-maintenance solution to mask and secure the PII data throughout the entire engineering pipeline. Which combination of solutions will meet these requirements? (Select TWO.)
Use AWS Glue DataBrew to perform extract, transform, and load (ETL) tasks that mask the PII data before analysis.
Use Amazon GuardDuty to monitor access patterns for the PII data that is used in the engineering pipeline.
Configure an Amazon Made discovery job for the S3 bucket.
Use AWS Identity and Access Management (IAM) to manage permissions and to control access to the PII data.
Write custom scripts in an application to mask the PII data and to control access.
to join the discussion
No discussions yet. Be the first to ask!
Delete Comment
Are you sure? This action cannot be undone.
A data engineer maintains a materialized view that is based on an Amazon Redshift database. The view has a column named load_date that stores the date when each row was loaded. The data engineer needs to reclaim database storage space by deleting all the rows from the materialized view. Which command will reclaim the MOST database storage space?
Option A
Option B
Option C
Option D
to join the discussion
No discussions yet. Be the first to ask!
Delete Comment
Are you sure? This action cannot be undone.
A retail company has a customer data hub in an Amazon S3 bucket. Employees from many countries use the data hub to support company-wide analytics. A governance team must ensure that the company's data analysts can access data only for customers who are within the same country as the analysts. Which solution will meet these requirements with the LEAST operational effort?
Create a separate table for each country's customer data. Provide access to each analyst based on the country that the analyst serves.
Register the S3 bucket as a data lake location in AWS Lake Formation. Use the Lake Formation row- level security features to enforce the company's access policies.
Move the data to AWS Regions that are close to the countries where the customers are. Provide access to each analyst based on the country that the analyst serves.
Load the data into Amazon Redshift. Create a view for each country. Create separate 1AM roles for each country to provide access to data from each country. Assign the appropriate roles to the analysts.
to join the discussion
No discussions yet. Be the first to ask!
Delete Comment
Are you sure? This action cannot be undone.
A company maintains a data warehouse in an on-premises Oracle database. The company wants to build a data lake on AWS. The company wants to load data warehouse tables into Amazon S3 and synchronize the tables with incremental data that arrives from the data warehouse every day. Each table has a column that contains monotonically increasing values. The size of each table is less than 50 GB. The data warehouse tables are refreshed every night between 1 AM and 2 AM. A business intelligence team queries the tables between 10 AM and 8 PM every day. Which solution will meet these requirements in the MOST operationally efficient way?
Use an AWS Database Migration Service (AWS DMS) full load plus CDC job to load tables that contain monotonically increasing data columns from the on-premises data warehouse to Amazon S3. Use custom logic in AWS Glue to append the daily incremental data to a full-load copy that is in Amazon S3.
Use an AWS Glue Java Database Connectivity (JDBC) connection. Configure a job bookmark for a column that contains monotonically increasing values. Write custom logic to append the daily incremental data to a full-load copy that is in Amazon S3.
Use an AWS Database Migration Service (AWS DMS) full load migration to load the data warehouse tables into Amazon S3 every day Overwrite the previous day's full-load copy every day.
Use AWS Glue to load a full copy of the data warehouse tables into Amazon S3 every day. Overwrite the previous day's full-load copy every day.
to join the discussion
No discussions yet. Be the first to ask!
Delete Comment
Are you sure? This action cannot be undone.
Finish Practice?
Are you sure you want to finish? This will end your practice session.