Databricks Certified Machine Learning Associate (DATABRICKS-MACHINE-LEARNING-ASSOCIATE) - Databricks Actual Exam Questions
Last updated on May 13, 2026
A machine learning engineer has been notified that a new Staging version of a model registered to the MLflow Model Registry has passed all tests. As a result, the machine learning engineer wants to put this model into production by transitioning it to the Production stage in the Model Registry. From which of the following pages in Databricks Machine Learning can the machine learning engineer accomplish this task?
The home page of the MLflow Model Registry
The experiment page in the Experiments observatory
The model version page in the MLflow Model Registry
The model page in the MLflow Model Registry
What is the name of the method that transforms categorical features into a series of binary indicator feature variables?
Leave-one-out encoding
Target encoding
One-hot encoding
Categorical
String indexing
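For reference, turning a categorical column into binary indicator columns can be sketched in plain Python as below. This is a minimal illustration only; the helper name `one_hot_encode` and the sample `colors` data are assumptions made for this example and do not come from Spark ML or scikit-learn.

```python
def one_hot_encode(values):
    """Return (categories, rows): one binary indicator per distinct category."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one indicator is set per input value
        rows.append(row)
    return categories, rows


# Hypothetical sample data for illustration.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)
```

Each input value becomes a row with a single 1 in the column for its category and 0 everywhere else.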
A machine learning engineer is trying to scale a machine learning pipeline by distributing its single-node model tuning process. After broadcasting the entire training data onto each core, each core in the cluster can train one model at a time. Because the tuning process is still running slowly, the engineer wants to increase the level of parallelism from 4 cores to 8 cores to speed up the tuning process. Unfortunately, the total memory in the cluster cannot be increased. In which of the following scenarios will increasing the level of parallelism from 4 to 8 speed up the tuning process?
When the tuning process is randomized
When the entire data can fit on each core
When the model is unable to be parallelized
When the data is particularly long in shape
When the data is particularly wide in shape
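The memory constraint in this scenario can be sketched with simple arithmetic: with total memory fixed, doubling the number of cores halves the memory available to each one, while every core still needs its own full copy of the data. The function name and all numbers below are hypothetical, and the sketch deliberately ignores the memory the model itself needs during training.

```python
def fits_on_each_core(total_memory_gb, n_cores, data_size_gb):
    """Check whether a full copy of the data fits in each core's memory share."""
    memory_per_core_gb = total_memory_gb / n_cores
    return data_size_gb <= memory_per_core_gb


# Hypothetical cluster: 64 GB of total memory that cannot be increased.
TOTAL_MEMORY_GB = 64

# A 10 GB data set fits at 4 cores (16 GB each) but not at 8 cores (8 GB each).
large_at_4 = fits_on_each_core(TOTAL_MEMORY_GB, 4, data_size_gb=10)
large_at_8 = fits_on_each_core(TOTAL_MEMORY_GB, 8, data_size_gb=10)

# A 6 GB data set still fits at 8 cores, so the extra parallelism is usable.
small_at_8 = fits_on_each_core(TOTAL_MEMORY_GB, 8, data_size_gb=6)
```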
A data scientist has developed a machine learning pipeline with a static input data set using Spark ML, but the pipeline is taking too long to process. They increase the number of workers in the cluster to get the pipeline to run more efficiently. They notice that the number of rows in the training set after reconfiguring the cluster is different from the number of rows in the training set prior to reconfiguring the cluster. Which of the following approaches will guarantee a reproducible training and test set for each model?
Manually configure the cluster
Write out the split data sets to persistent storage
Set a seed in the data splitting operation
Manually partition the input data
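The idea behind writing split data sets to persistent storage can be sketched as follows: the split is computed once, saved, and every later run reloads the saved files instead of splitting again. This is a plain-Python illustration, not Spark code; the helper names, the JSON file layout, and the temporary directory standing in for persistent storage are all assumptions made for this example.

```python
import json
import os
import random
import tempfile


def split_and_persist(rows, train_fraction, seed, out_dir):
    """Split rows once and write the train and test sets to files."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)

    paths = {}
    for name, part in (("train", shuffled[:cut]), ("test", shuffled[cut:])):
        path = os.path.join(out_dir, f"{name}.json")
        with open(path, "w") as handle:
            json.dump(part, handle)
        paths[name] = path
    return paths


def load_split(paths):
    """Reload the saved sets; no splitting logic runs at this point."""
    loaded = {}
    for name, path in paths.items():
        with open(path) as handle:
            loaded[name] = json.load(handle)
    return loaded


rows = list(range(100))          # hypothetical input data
out_dir = tempfile.mkdtemp()     # stand-in for persistent storage
paths = split_and_persist(rows, train_fraction=0.8, seed=42, out_dir=out_dir)

first_load = load_split(paths)
second_load = load_split(paths)  # identical, however the reader is configured
```

Because later runs only read the saved files, the contents of the training and test sets no longer depend on how the data is partitioned or on how the cluster is sized at read time.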
A data scientist wants to efficiently tune the hyperparameters of a scikit-learn model. They elect to use the Hyperopt library's fmin operation to facilitate this process. Unfortunately, the final model is not very accurate. The data scientist suspects that there is an issue with the objective_function being passed as an argument to fmin. They use the following code block to create the objective_function: Which of the following changes does the data scientist need to make to their objective_function in order to produce a more accurate model?
Add test set validation process
Add a random_state argument to the RandomForestRegressor operation
Remove the mean operation that is wrapping the cross_val_score operation
Replace the r2 return value with -r2
Replace the fmin operation with the fmax operation
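Hyperopt's fmin minimizes the value returned by the objective function, whereas a metric such as r2 is better when it is higher. The sketch below shows why the sign of the returned value matters. It does not use Hyperopt: `minimize` is a hypothetical stand-in for any minimizing search, and the scores are made-up illustrative numbers, not results from a real model.

```python
def minimize(objective, candidates):
    """Hypothetical stand-in for a minimizer: pick the candidate with the lowest value."""
    return min(candidates, key=objective)


# Made-up validation scores (higher is better) for three hypothetical
# max_depth settings. These are illustrative numbers only.
scores = {2: 0.61, 5: 0.83, 10: 0.78}


def objective_returning_score(max_depth):
    # Returning the raw score makes a minimizer favor the lowest score.
    return scores[max_depth]


def objective_returning_negated_score(max_depth):
    # Negating the score turns "maximize the score" into "minimize the loss".
    return -scores[max_depth]


picked_with_raw_score = minimize(objective_returning_score, scores.keys())
picked_with_negated_score = minimize(objective_returning_negated_score, scores.keys())
```

With the raw score, the minimizer settles on the setting with the lowest score; with the negated score, it settles on the setting with the highest one.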