Databricks Certified Data Engineer Professional (DATABRICKS) - Databricks Actual Exam Questions

Last updated on April 15, 2026

97% Exam Compliance
180 Total Questions
1
Question

Which of the following code blocks returns about 150 randomly selected rows from the 1000-row DataFrame transactionsDf, assuming that any row can appear more than once in the returned DataFrame?

Select 2
Options
A

transactionsDf.resample(0.15, False, 3142)

B

transactionsDf.sample(0.15, False, 3142)

C

transactionsDf.sample(0.15)

D

transactionsDf.sample(0.85, 8429)

E

transactionsDf.sample(True, 0.15, 8261)
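
Note on the options: in PySpark the positional order is sample(withReplacement, fraction, seed), so "any row can appear more than once" corresponds to withReplacement=True and "about 150 of 1000 rows" to a fraction of 0.15. The returned row count is only approximate. The plain-Python sketch below illustrates why, assuming (as I understand Spark's behaviour) that sampling with replacement repeats each row a Poisson-distributed number of times. It is a stand-in model, not Spark's implementation; the helper names poisson and sample_with_replacement are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_with_replacement(rows, fraction, seed):
    """Repeat each row a Poisson(fraction)-distributed number of times."""
    rng = random.Random(seed)
    sampled = []
    for row in rows:
        sampled.extend([row] * poisson(fraction, rng))
    return sampled


rows = list(range(1000))  # stand-in for the 1000 rows of transactionsDf
sampled = sample_with_replacement(rows, fraction=0.15, seed=8261)

print(len(sampled))                      # close to 150, rarely exactly 150
print(len(sampled) - len(set(sampled)))  # how many picks repeat an earlier row
```

The seed only makes a run reproducible; it does not pin the result to exactly 150 rows.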

2
Question

Which of the following statements about Spark's DataFrames is incorrect?

Options
A

Spark's DataFrames are immutable.

B

Spark's DataFrames are equal to Python's DataFrames.

C

Data in DataFrames is organized into named columns.

D

RDDs are at the core of DataFrames.

E

The data in DataFrames may be split into multiple chunks.

3
Question

Which of the following code blocks adds a column predErrorSqrt to DataFrame transactionsDf that is the square root of column predError?

Options
A

transactionsDf.withColumn("predErrorSqrt", sqrt(predError))

B

transactionsDf.select(sqrt(predError))

C

transactionsDf.withColumn("predErrorSqrt", col("predError").sqrt())

D

transactionsDf.withColumn("predErrorSqrt", sqrt(col("predError")))

E

transactionsDf.select(sqrt("predError"))
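
Note on the options: the task asks for a new column added to the existing ones, which is what withColumn does, whereas select returns only the columns it is given. In PySpark, sqrt is a function from pyspark.sql.functions that takes a column, and a bare predError would be an undefined Python name. The plain-Python sketch below models only the withColumn versus select difference on a list of dicts. The helpers with_column and select and the transactionId column are hypothetical stand-ins, not Spark APIs.

```python
import math


def with_column(rows, name, fn):
    """Return new rows that keep every existing column and add `name`."""
    return [{**row, name: fn(row)} for row in rows]


def select(rows, name, fn):
    """Return new rows that contain only the computed column."""
    return [{name: fn(row)} for row in rows]


def sqrt_pred_error(row):
    return math.sqrt(row["predError"])


# Hypothetical sample data; only predError comes from the question.
transactions = [
    {"transactionId": 1, "predError": 9.0},
    {"transactionId": 2, "predError": 16.0},
]

added = with_column(transactions, "predErrorSqrt", sqrt_pred_error)
only = select(transactions, "predErrorSqrt", sqrt_pred_error)

print(added[0])         # {'transactionId': 1, 'predError': 9.0, 'predErrorSqrt': 3.0}
print(only[0])          # {'predErrorSqrt': 3.0}
print(transactions[0])  # {'transactionId': 1, 'predError': 9.0}
```

The last line shows that the input is left untouched, mirroring the fact that DataFrame transformations return a new DataFrame.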

4
Question

The code block displayed below contains an error. The code block is intended to return all columns of DataFrame transactionsDf except for columns predError, productId, and value. Find the error.

Code block:

transactionsDf.select(~col("predError"), ~col("productId"), ~col("value"))

Options
A

The select operator should be replaced by the drop operator and the arguments to the drop operator should be column names predError, productId and value wrapped in the col operator so they should be expressed like drop(col(predError), col(productId), col(value)).

B

The select operator should be replaced with the deselect operator.

C

The column names in the select operator should not be strings and wrapped in the col operator, so they should be expressed like select(~col(predError), ~col(productId), ~col(value)).

D

The select operator should be replaced by the drop operator.

E

The select operator should be replaced by the drop operator and the arguments to the drop operator should be column names predError, productId and value as strings. (Correct)
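
Note on the marked answer: applied to a column, ~ is a logical NOT, so it does not exclude the column from a select. Passing the names as strings, as in transactionsDf.drop("predError", "productId", "value"), is the form the marked option describes. The plain-Python sketch below models that behaviour on a list of dicts. The helper drop and the columns transactionId and storeId are hypothetical. As far as I know, Spark's drop also ignores names that are not in the schema, which the sketch mirrors.

```python
def drop(rows, *names):
    """Return new rows without the named columns; unknown names are ignored."""
    unwanted = set(names)
    return [
        {key: value for key, value in row.items() if key not in unwanted}
        for row in rows
    ]


# Hypothetical row; only predError, productId and value come from the question.
transactions = [
    {"transactionId": 1, "predError": 3, "value": 4, "storeId": 25, "productId": 1},
]

kept = drop(transactions, "predError", "productId", "value")
print(kept)  # [{'transactionId': 1, 'storeId': 25}]

# A name that does not exist changes nothing.
print(drop(transactions, "noSuchColumn") == transactions)  # True
```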

5
Question

The code block shown below should return a new 2-column DataFrame that shows one attribute from column attributes per row next to the associated itemName, for all suppliers in column supplier whose name includes Sports. Choose the answer that correctly fills the blanks in the code block to accomplish this.

Sample of DataFrame itemsDf:

+------+----------------------------------+-----------------------------+-------------------+
|itemId|itemName                          |attributes                   |supplier           |
+------+----------------------------------+-----------------------------+-------------------+
|1     |Thick Coat for Walking in the Snow|[blue, winter, cozy]         |Sports Company Inc.|
|2     |Elegant Outdoors Summer Dress     |[red, summer, fresh, cooling]|YetiX              |
|3     |Outdoors Backpack                 |[green, summer, travel]      |Sports Company Inc.|
+------+----------------------------------+-----------------------------+-------------------+

Code block:

itemsDf.__1__(__2__).select(__3__, __4__)

Select 2
Options
A

1. filter 2. col("supplier").isin("Sports") 3. "itemName" 4. explode(col("attributes"))

B

1. where 2. col("supplier").contains("Sports") 3. "itemName" 4. "attributes"

C

1. where 2. col(supplier).contains("Sports") 3. explode(attributes) 4. itemName

D

1. where 2. "Sports".isin(col("Supplier")) 3. "itemName" 4. array_explode("attributes")

E

1. filter 2. col("supplier").contains("Sports") 3. "itemName" 4. explode("attributes")
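
Note on the options: the two ideas being tested are substring matching (contains) versus exact membership (isin), and explode, which emits one output row per array element. The plain-Python sketch below models both on the sample rows from the question. The helpers explode_where, contains and isin are hypothetical stand-ins for the DataFrame operations, not Spark APIs.

```python
items = [
    {"itemId": 1, "itemName": "Thick Coat for Walking in the Snow",
     "attributes": ["blue", "winter", "cozy"], "supplier": "Sports Company Inc."},
    {"itemId": 2, "itemName": "Elegant Outdoors Summer Dress",
     "attributes": ["red", "summer", "fresh", "cooling"], "supplier": "YetiX"},
    {"itemId": 3, "itemName": "Outdoors Backpack",
     "attributes": ["green", "summer", "travel"], "supplier": "Sports Company Inc."},
]


def explode_where(rows, keep):
    """Filter rows with `keep`, then emit one (itemName, attribute) pair per array element."""
    return [
        (row["itemName"], attribute)
        for row in rows
        if keep(row)
        for attribute in row["attributes"]
    ]


def contains(needle):
    """Substring match on supplier, like col("supplier").contains(needle)."""
    return lambda row: needle in row["supplier"]


def isin(*values):
    """Exact membership test on supplier, like col("supplier").isin(*values)."""
    return lambda row: row["supplier"] in values


by_contains = explode_where(items, contains("Sports"))
by_isin = explode_where(items, isin("Sports"))

print(len(by_contains))  # 6
print(by_contains[0])    # ('Thick Coat for Walking in the Snow', 'blue')
print(by_isin)           # []
```

The isin variant returns nothing because no supplier is exactly equal to "Sports", and selecting attributes without explode would keep one row per item rather than one per attribute.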
