Professional Data Engineer Exam (PROFESSIONAL-DATA-ENGINEER) - Google Cloud Actual Exam Questions
Last updated on May 14, 2026
You’re using Bigtable for a real-time application, and you have a heavy load that is a mix of reads and writes. You’ve recently identified an additional use case and need to run an hourly analytical job to calculate certain statistics across the whole database. You need to ensure the reliability of both your production application and the analytical workload. What should you do?
Export a Bigtable dump to GCS and run your analytical job on top of the exported files.
Add a second cluster to the existing instance with multi-cluster routing, and use a live-traffic app profile for your regular workload and a batch-analytics app profile for the analytics workload.
Add a second cluster to the existing instance with single-cluster routing, and use a live-traffic app profile for your regular workload and a batch-analytics app profile for the analytics workload.
Double the size of your existing cluster and execute your analytics workload on the resized cluster.
You have data pipelines running on BigQuery, Cloud Dataflow, and Cloud Dataproc. You need to perform health checks and monitor their behavior, and then notify the team managing the pipelines if they fail. You also need to be able to work across multiple projects. Your preference is to use managed products or features of the platform. What should you do?
Export the information to Cloud Stackdriver, and set up an Alerting policy.
Run a Virtual Machine in Compute Engine with Airflow, and export the information to Stackdriver.
Export the logs to BigQuery, and set up App Engine to read that information and send emails if you find a failure in the logs.
Develop an App Engine application to consume logs using GCP API calls, and send emails if you find a failure in the logs.
You have a BigQuery dataset named "customers". All tables will be tagged by using a Data Catalog tag template named "gdpr". The template contains one mandatory field, "has_sensitive_data", with a boolean value. All employees must be able to do a simple search and find tables in the dataset that have either true or false in the "has_sensitive_data" field. However, only the Human Resources (HR) group should be able to see the data inside the tables for which "has_sensitive_data" is true. You give the all employees group the bigquery.metadataViewer and bigquery.connectionUser roles on the dataset. You want to minimize configuration overhead. What should you do next?
Create the "gdpr" tag template with private visibility. Assign the bigquery.dataViewer role to the HR group on the tables that contain sensitive data.
Create the "gdpr" tag template with private visibility. Assign the datacatalog.tagTemplateViewer role on this tag to the all employees group, and assign the bigquery.dataViewer role to the HR group on the tables that contain sensitive data.
Create the "gdpr" tag template with public visibility. Assign the bigquery.dataViewer role to the HR group on the tables that contain sensitive data.
Create the "gdpr" tag template with public visibility. Assign the datacatalog.tagTemplateViewer role on this tag to the all employees group, and assign the bigquery.dataViewer role to the HR group on the tables that contain sensitive data.
You are administering a BigQuery on-demand environment. Your business intelligence tool is submitting hundreds of queries each day that aggregate a large (50 TB) sales history fact table at the day and month levels. These queries have a slow response time and are exceeding cost expectations. You need to decrease response time, lower query costs, and minimize maintenance. What should you do?
Build materialized views on top of the sales table to aggregate data at the day and month level.
Build authorized views on top of the sales table to aggregate data at the day and month level.
Enable BI Engine and add your sales table as a preferred table.
Create a scheduled query to build sales_day and sales_month aggregate tables on an hourly basis.
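For readers unfamiliar with the materialized-view option, the following is a minimal sketch of the DDL such a view might use. It only builds the SQL text and does not call BigQuery. The table and view names, and the columns sale_date (assumed to be a DATE) and amount, are hypothetical and do not come from the question.

```python
def materialized_view_ddl(source_table: str, view_name: str, granularity: str) -> str:
    """Build BigQuery DDL for an aggregate materialized view.

    granularity must be "DAY" or "MONTH". The columns sale_date and
    amount are hypothetical placeholders for the real schema.
    """
    if granularity not in ("DAY", "MONTH"):
        raise ValueError("granularity must be 'DAY' or 'MONTH'")
    return (
        f"CREATE MATERIALIZED VIEW `{view_name}` AS\n"
        f"SELECT DATE_TRUNC(sale_date, {granularity}) AS period,\n"
        f"       SUM(amount) AS total_amount\n"
        f"FROM `{source_table}`\n"
        f"GROUP BY period"
    )


if __name__ == "__main__":
    # Hypothetical project, dataset, and table names.
    for level in ("DAY", "MONTH"):
        print(materialized_view_ddl(
            "my_project.sales.history",
            f"my_project.sales.mv_sales_{level.lower()}",
            level,
        ))
        print()
```

The generated statements could then be run in the BigQuery console or submitted through a client library.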
Flowlogistic wants to use Google BigQuery as their primary analysis system, but they still have Apache Hadoop and Spark workloads that they cannot move to BigQuery. Flowlogistic does not know how to store the data that is common to both workloads. What should they do?
Store the common data in BigQuery as partitioned tables.
Store the common data in BigQuery and expose authorized views.
Store the common data encoded as Avro in Google Cloud Storage.
Store the common data in the HDFS storage for a Google Cloud Dataproc cluster.