Pre-Summer Sale - 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: dm70dm

MLA-C01 AWS Certified Machine Learning Engineer - Associate Questions and Answers

Questions 4

An ML engineer needs to implement a solution to host a trained ML model. The rate of requests to the model will be inconsistent throughout the day.

The ML engineer needs a scalable solution that minimizes costs when the model is not in use. The solution also must maintain the model ' s capacity to respond to requests during times of peak usage.

Which solution will meet these requirements?

Options:

A.

Create AWS Lambda functions that have fixed concurrency to host the model. Configure the Lambda functions to automatically scale based on the number of requests to the model.

B.

Deploy the model on an Amazon Elastic Container Service (Amazon ECS) cluster that uses AWS Fargate. Set a static number of tasks to handle requests during times of peak usage.

C.

Deploy the model to an Amazon SageMaker endpoint. Deploy multiple copies of the model to the endpoint. Create an Application Load Balancer to route traffic between the different copies of the model at the endpoint.

D.

Deploy the model to an Amazon SageMaker endpoint. Create SageMaker endpoint auto scaling policies that are based on Amazon CloudWatch metrics to adjust the number of instances dynamically.

Buy Now
Questions 5

A company stores training data as a .csv file in an Amazon S3 bucket. The company must encrypt the data and must control which applications have access to the encryption key.

Which solution will meet these requirements?

Options:

A.

Create a new SSH access key and use the AWS Encryption CLI to encrypt the file.

B.

Create a new API key by using Amazon API Gateway and use it to encrypt the file.

C.

Create a new IAM role with permissions for kms:GenerateDataKey and use the role to encrypt the file.

D.

Create a new AWS Key Management Service (AWS KMS) key and use the AWS Encryption CLI with the KMS key to encrypt the file.

Buy Now
Questions 6

A company has deployed an XGBoost prediction model in production to predict if a customer is likely to cancel a subscription. The company uses Amazon SageMaker Model Monitor to detect deviations in the F1 score.

During a baseline analysis of model quality, the company recorded a threshold for the F1 score. After several months of no change, the model ' s F1 score decreases significantly.

What could be the reason for the reduced F1 score?

Options:

A.

Concept drift occurred in the underlying customer data that was used for predictions.

B.

The model was not sufficiently complex to capture all the patterns in the original baseline data.

C.

The original baseline data had a data quality issue of missing values.

D.

Incorrect ground truth labels were provided to Model Monitor during the calculation of the baseline.

Buy Now
Questions 7

A company wants to host an ML model on Amazon SageMaker. An ML engineer is configuring a continuous integration and continuous delivery (Cl/CD) pipeline in AWS CodePipeline to deploy the model. The pipeline must run automatically when new training data for the model is uploaded to an Amazon S3 bucket.

Select and order the pipeline ' s correct steps from the following list. Each step should be selected one time or not at all. (Select and order three.)

• An S3 event notification invokes the pipeline when new data is uploaded.

• S3 Lifecycle rule invokes the pipeline when new data is uploaded.

• SageMaker retrains the model by using the data in the S3 bucket.

• The pipeline deploys the model to a SageMaker endpoint.

• The pipeline deploys the model to SageMaker Model Registry.

MLA-C01 Question 7

Options:

Buy Now
Questions 8

An ML engineer is developing a classification model. The ML engineer needs to use custom libraries in processing jobs, training jobs, and pipelines in Amazon SageMaker AI.

Which solution will provide this functionality with the LEAST implementation effort?

Options:

A.

Manually install the libraries in the SageMaker AI containers.

B.

Build a custom Docker container that includes the required libraries. Host the container in Amazon Elastic Container Registry (Amazon ECR). Use the ECR image in the SageMaker AI jobs and pipelines.

C.

Use a SageMaker AI notebook instance and install libraries at startup.

D.

Run code externally on Amazon EC2 and import results into SageMaker AI.

Buy Now
Questions 9

A financial company receives a high volume of real-time market data streams from an external provider. The streams consist of thousands of JSON records every second.

The company needs to implement a scalable solution on AWS to identify anomalous data points.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A.

Ingest real-time data into Amazon Kinesis Data Streams. Use the built-in RANDOM_CUT_FOREST function in Amazon Managed Service for Apache Flink to process the data streams and to detect data anomalies.

B.

Ingest real-time data into Amazon Kinesis Data Streams. Deploy an Amazon SageMaker AI endpoint for real-time outlier detection. Create an AWS Lambda function to detect anomalies. Use the data streams to invoke the Lambda function.

C.

Ingest real-time data into Apache Kafka on Amazon EC2 instances. Deploy an Amazon SageMaker AI endpoint for real-time outlier detection. Create an AWS Lambda function to detect anomalies. Use the data streams to invoke the Lambda function.

D.

Send real-time data to an Amazon Simple Queue Service (Amazon SQS) FIFO queue. Create an AWS Lambda function to consume the queue messages. Program the Lambda function to start an AWS Glue extract, transform, and load (ETL) job for batch processing and anomaly detection.

Buy Now
Questions 10

A company runs an ML model on Amazon SageMaker AI. The company uses an automatic process that makes API calls to create training jobs for the model. The company has new compliance rules that prohibit the collection of aggregated metadata from training jobs.

Which solution will prevent SageMaker AI from collecting metadata from the training jobs?

Options:

A.

Opt out of metadata tracking for any training job that is submitted.

B.

Ensure that training jobs are running in a private subnet in a custom VPC.

C.

Encrypt the training data with an AWS Key Management Service (AWS KMS) customer managed key.

D.

Reconfigure the training jobs to use only AWS Nitro instances.

Buy Now
Questions 11

A company is building an Amazon SageMaker AI pipeline for an ML model. The pipeline uses distributed processing and distributed training.

An ML engineer needs to encrypt network communication between instances that run distributed jobs. The ML engineer configures the distributed jobs to run in a private VPC.

What should the ML engineer do to meet the encryption requirement?

Options:

A.

Enable network isolation.

B.

Configure traffic encryption by using security groups.

C.

Enable inter-container traffic encryption.

D.

Enable VPC flow logs.

Buy Now
Questions 12

A company needs to combine data from multiple sources. The company must use Amazon Redshift Serverless to query an AWS Glue Data Catalog database and underlying data that is stored in an Amazon S3 bucket.

Select and order the correct steps from the following list to meet these requirements. Select each step one time or not at all. (Select and order three.)

• Attach the IAM role to the Redshift cluster.

• Attach the IAM role to the Redshift namespace.

• Create an external database in Amazon Redshift to point to the Data Catalog schema.

• Create an external schema in Amazon Redshift to point to the Data Catalog database.

• Create an IAM role for Amazon Redshift to use to access only the S3 bucket that contains underlying data.

• Create an IAM role for Amazon Redshift to use to access the Data Catalog and the S3 bucket that contains underlying data.

MLA-C01 Question 12

Options:

Buy Now
Questions 13

A company has trained and deployed an ML model by using Amazon SageMaker. The company needs to implement a solution to record and monitor all the API call events for the SageMaker endpoint. The solution also must provide a notification when the number of API call events breaches a threshold.

Use SageMaker Debugger to track the inferences and to report metrics. Create a custom rule to provide a notification when the threshold is breached.

Which solution will meet these requirements?

Options:

A.

Use SageMaker Debugger to track the inferences and to report metrics. Create a custom rule to provide a notification when the threshold is breached.

B.

Use SageMaker Debugger to track the inferences and to report metrics. Use the tensor_variance built-in rule to provide a notification when the threshold is breached.

C.

Log all the endpoint invocation API events by using AWS CloudTrail. Use an Amazon CloudWatch dashboard for monitoring. Set up a CloudWatch alarm to provide notification when the threshold is breached.

D.

Add the Invocations metric to an Amazon CloudWatch dashboard for monitoring. Set up a CloudWatch alarm to provide notification when the threshold is breached.

Buy Now
Questions 14

An ML engineer needs to use an Amazon EMR cluster to process large volumes of data in batches. Any data loss is unacceptable.

Which instance purchasing option will meet these requirements MOST cost-effectively?

Options:

A.

Run the primary node, core nodes, and task nodes on On-Demand Instances.

B.

Run the primary node, core nodes, and task nodes on Spot Instances.

C.

Run the primary node on an On-Demand Instance. Run the core nodes and task nodes on Spot Instances.

D.

Run the primary node and core nodes on On-Demand Instances. Run the task nodes on Spot Instances.

Buy Now
Questions 15

Case study

An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.

The dataset has a class imbalance that affects the learning of the model ' s algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.

The ML engineer needs to use an Amazon SageMaker built-in algorithm to train the model.

Which algorithm should the ML engineer use to meet this requirement?

Options:

A.

LightGBM

B.

Linear learner

C.

К-means clustering

D.

Neural Topic Model (NTM)

Buy Now
Questions 16

A hospital wants to predict patient outcomes for the coming year An ML engineer must improve several existing ML models that currently perform poorly.

Select the correct regularization method from the following list to improve each model Select each regularization method one time, more than one time, or not at all. (Select THREE.)

• L1 regularization

• L2 regularization

• Early stopping

MLA-C01 Question 16

Options:

Buy Now
Questions 17

Case Study

A company is building a web-based AI application by using Amazon SageMaker. The application will provide the following capabilities and features: ML experimentation, training, a

central model registry, model deployment, and model monitoring.

The application must ensure secure and isolated use of training data during the ML lifecycle. The training data is stored in Amazon S3.

The company is experimenting with consecutive training jobs.

How can the company MINIMIZE infrastructure startup times for these jobs?

Options:

A.

Use Managed Spot Training.

B.

Use SageMaker managed warm pools.

C.

Use SageMaker Training Compiler.

D.

Use the SageMaker distributed data parallelism (SMDDP) library.

Buy Now
Questions 18

Case study

An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.

The dataset has a class imbalance that affects the learning of the model ' s algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.

Which AWS service or feature can aggregate the data from the various data sources?

Options:

A.

Amazon EMR Spark jobs

B.

Amazon Kinesis Data Streams

C.

Amazon DynamoDB

D.

AWS Lake Formation

Buy Now
Questions 19

An ML engineer is building an ML model in Amazon SageMaker AI. The ML engineer needs to load historical data directly from Amazon S3, Amazon Athena, and Snowflake into SageMaker AI.

Which solution will meet this requirement?

Options:

A.

Use AWS Glue DataBrew to import the data into SageMaker AI.

B.

Build a pipeline in SageMaker Pipelines to process the data. Use AWS DataSync to load the processed data into SageMaker AI.

C.

Create a feature store in SageMaker Feature Store. Use an Apache Spark connector to Feature Store to access the data.

D.

Use SageMaker Data Wrangler to query and import the data.

Buy Now
Questions 20

An ML engineer uses an Amazon SageMaker AI notebook instance to run a training job that trains a neural network model with an estimator. The training job loads data iteratively from an Amazon S3 path that is configured as an environment variable. The ML engineer viewed a profiling report of the training job. The ML engineer discovered that a substantial amount of the training time is spent during data loading.

How can the ML engineer improve the training speed?

Options:

A.

Provision Amazon Elastic Block Store (Amazon EBS) Provisioned IOPS SSD io1 storage during the estimator initialization. Download the training data from the S3 path to Amazon EBS. Point the data loader to the EBS location.

B.

Provision Amazon Elastic File System (Amazon EFS) storage during the estimator initialization. Download the training data to Amazon EFS by using the S3 path. Point the data loader to the EFS location.

C.

Download the training data to the estimator by using fast file mode. Point the data loader to the location specified by the S3 path.

D.

Configure the path to the S3 bucket that contains the training data as a hyperparameter instead of an environment variable.

Buy Now
Questions 21

A company is building a near real-time data analytics application to detect anomalies and failures for industrial equipment. The company has thousands of IoT sensors that send data every 60 seconds. When new versions of the application are released, the company wants to ensure that application code bugs do not prevent the application from running.

Which solution will meet these requirements?

Options:

A.

Use Amazon Managed Service for Apache Flink with the system rollback capability enabled to build the data analytics application.

B.

Use Amazon Managed Service for Apache Flink with manual rollback when an error occurs to build the data analytics application.

C.

Use Amazon Data Firehose to deliver real-time streaming data programmatically for the data analytics application. Pause the stream when a new version of the application is released and resume the stream after the application is deployed.

D.

Use Amazon Data Firehose to deliver data to Amazon EC2 instances across two Availability Zones for the data analytics application.

Buy Now
Questions 22

An ML engineer needs to use Amazon SageMaker to fine-tune a large language model (LLM) for text summarization. The ML engineer must follow a low-code no-code (LCNC) approach.

Which solution will meet these requirements?

Options:

A.

Use SageMaker Studio to fine-tune an LLM that is deployed on Amazon EC2 instances.

B.

Use SageMaker Autopilot to fine-tune an LLM that is deployed by a custom API endpoint.

C.

Use SageMaker Autopilot to fine-tune an LLM that is deployed on Amazon EC2 instances.

D.

Use SageMaker Autopilot to fine-tune an LLM that is deployed by SageMaker JumpStart.

Buy Now
Questions 23

An ML engineering team has a data processing pipeline that ingests sensor data from IoT devices into an Amazon S3 bucket. The pipeline then processes the data by using AWS Glue extract, transform, and load (ETL) jobs for ML modeling. The team noticed throttling errors in the ETL jobs. The data ingestion process has also been slower than normal.

What is the cause of the problem?

Options:

A.

The AWS Glue service quotas have been reached.

B.

The network bandwidth between the IoT devices and the AWS Region is insufficient.

C.

The AWS Glue ETL jobs are not optimized for parallel processing.

D.

The AWS Glue execution role is missing Amazon S3 permissions.

Buy Now
Questions 24

A company is running ML models on premises by using custom Python scripts and proprietary datasets. The company is using PyTorch. The model building requires unique domain knowledge. The company needs to move the models to AWS.

Which solution will meet these requirements with the LEAST development effort?

Options:

A.

Use SageMaker AI built-in algorithms to train the proprietary datasets.

B.

Use SageMaker AI script mode and premade images for ML frameworks.

C.

Build a container on AWS that includes custom packages and a choice of ML frameworks.

D.

Purchase similar production models through AWS Marketplace.

Buy Now
Questions 25

A company collects customer data daily and stores it as compressed files in an Amazon S3 bucket partitioned by date. Each month, analysts process the data, check data quality, and upload results to Amazon QuickSight dashboards.

An ML engineer needs to automatically check data quality before the data is sent to QuickSight, with the LEAST operational overhead.

Which solution will meet these requirements?

Options:

A.

Run an AWS Glue crawler monthly and use AWS Glue Data Quality rules to check data quality.

B.

Run an AWS Glue crawler and create a custom AWS Glue job with PySpark to evaluate data quality.

C.

Use AWS Lambda with Python scripts triggered by S3 uploads to evaluate data quality.

D.

Send S3 events to Amazon SQS and use Amazon CloudWatch Insights to evaluate data quality.

Buy Now
Questions 26

An ML engineer is using Amazon SageMaker AI to train an ML model. The ML engineer needs to use SageMaker AI automatic model tuning (AMT) features to tune the model hyperparameters over a large parameter space.

The model has 20 categorical hyperparameters and 7 continuous hyperparameters that can be tuned. The ML engineer needs to run the tuning job a maximum of 1,000 times. The ML engineer must ensure that each parameter trial is built based on the performance of the previous trial.

Which solution will meet these requirements?

Options:

A.

Define the search space as categorical parameters of 1,000 possible combinations. Use grid search.

B.

Define the search space as continuous parameters. Use random search. Set the maximum number of tuning jobs to 1,000.

C.

Define the search space as categorical parameters and continuous parameters. Use Bayesian optimization. Set the maximum number of training jobs to 1,000.

D.

Define the search space as categorical parameters and continuous parameters. Use grid search. Set the maximum number of tuning jobs to 1,000.

Buy Now
Questions 27

An ML engineer needs to use data with Amazon SageMaker Canvas to train an ML model. The data is stored in Amazon S3 and is complex in structure. The ML engineer must use a file format that minimizes processing time for the data.

Which file format will meet these requirements?

Options:

A.

CSV files compressed with Snappy

B.

JSON objects in JSONL format

C.

JSON files compressed with gzip

D.

Apache Parquet files

Buy Now
Questions 28

A logistics company has installed in-vehicle cameras for basic monitoring of its drivers. The company wants to improve driver safety by identifying distractions that could lead to accidents.

Which solution will meet this requirement with the LEAST operational effort?

Options:

A.

Use Amazon Rekognition eye gaze direction detection to monitor driver behavior and identify distractions.

B.

Use Amazon SageMaker AI to customize an AI model to monitor driver behavior and identify distractions.

C.

Integrate a third-party driver monitoring system with Amazon Rekognition to monitor driver behavior and identify distractions.

D.

Use Amazon Comprehend to analyze text-based driver feedback and identify distractions.

Buy Now
Questions 29

An ML engineer has developed a binary classification model outside of Amazon SageMaker. The ML engineer needs to make the model accessible to a SageMaker Canvas user for additional tuning.

The model artifacts are stored in an Amazon S3 bucket. The ML engineer and the Canvas user are part of the same SageMaker domain.

Which combination of requirements must be met so that the ML engineer can share the model with the Canvas user? (Choose two.)

Options:

A.

The ML engineer and the Canvas user must be in separate SageMaker domains.

B.

The Canvas user must have permissions to access the S3 bucket where the model artifacts are stored.

C.

The model must be registered in the SageMaker Model Registry.

D.

The ML engineer must host the model on AWS Marketplace.

E.

The ML engineer must deploy the model to a SageMaker endpoint.

Buy Now
Questions 30

A company ingests sales transaction data using Amazon Data Firehose into Amazon OpenSearch Service. The Firehose buffer interval is set to 60 seconds.

The company needs sub-second latency for a real-time OpenSearch dashboard.

Which architectural change will meet this requirement?

Options:

A.

Use zero buffering in the Firehose stream and tune the PutRecordBatch batch size.

B.

Replace Firehose with AWS DataSync and enhanced fan-out consumers.

C.

Increase the Firehose buffer interval to 120 seconds.

D.

Replace Firehose with Amazon SQS.

Buy Now
Questions 31

An ML engineer is configuring auto scaling for an inference component of a model that runs behind an Amazon SageMaker AI endpoint. The ML engineer configures SageMaker AI auto scaling with a target tracking scaling policy set to 100 invocations per model per minute. The SageMaker AI endpoint scales appropriately during normal business hours. However, the ML engineer notices that at the start of each business day, there are zero instances available to handle requests, which causes delays in processing.

The ML engineer must ensure that the SageMaker AI endpoint can handle incoming requests at the start of each business day.

Which solution will meet this requirement?

Options:

A.

Reduce the SageMaker AI auto scaling cooldown period to the minimum supported value. Add an auto scaling lifecycle hook to scale the SageMaker AI instances.

B.

Change the target metric to CPU utilization.

C.

Modify the scaling policy target value to one.

D.

Apply a step scaling policy that scales based on an Amazon CloudWatch alarm. Apply a second CloudWatch alarm and scaling policy to scale the minimum number of instances from zero to one at the start of each business day.

Buy Now
Questions 32

An ML engineer is using Amazon SageMaker to train a deep learning model that requires distributed training. After some training attempts, the ML engineer observes that the instances are not performing as expected. The ML engineer identifies communication overhead between the training instances.

What should the ML engineer do to MINIMIZE the communication overhead between the instances?

Options:

A.

Place the instances in the same VPC subnet. Store the data in a different AWS Region from where the instances are deployed.

B.

Place the instances in the same VPC subnet but in different Availability Zones. Store the data in a different AWS Region from where the instances are deployed.

C.

Place the instances in the same VPC subnet. Store the data in the same AWS Region and Availability Zone where the instances are deployed.

D.

Place the instances in the same VPC subnet. Store the data in the same AWS Region but in a different Availability Zone from where the instances are deployed.

Buy Now
Questions 33

An ML engineer is using Amazon Quick Suite (previously known as Amazon QuickSight) anomaly detection to detect very high or very low machine operating temperatures compared to normal. The ML engineer sets the Severity parameter to Low and above. The ML engineer sets the Direction parameter to All.

What effect will the ML engineer observe in the anomaly detection results if the ML engineer changes the Direction parameter to Lower than expected?

Options:

A.

Increased anomaly identification frequency and increased recall

B.

Decreased anomaly identification frequency and decreased recall

C.

Increased anomaly identification frequency and decreased recall

D.

Decreased anomaly identification frequency and increased recall

Buy Now
Questions 34

A company is using an Amazon Redshift database as its single data source. Some of the data is sensitive.

A data scientist needs to use some of the sensitive data from the database. An ML engineer must give the data scientist access to the data without transforming the source data and without storing anonymized data in the database.

Which solution will meet these requirements with the LEAST implementation effort?

Options:

A.

Configure dynamic data masking policies to control how sensitive data is shared with the data scientist at query time.

B.

Create a materialized view with masking logic on top of the database. Grant the necessary read permissions to the data scientist.

C.

Unload the Amazon Redshift data to Amazon S3. Use Amazon Athena to create schema-on-read with masking logic. Share the view with the data scientist.

D.

Unload the Amazon Redshift data to Amazon S3. Create an AWS Glue job to anonymize the data. Share the dataset with the data scientist.

Buy Now
Questions 35

An ML engineer has trained an ML model by using Amazon SageMaker AI. The ML engineer determines that the model is overfitting and that the training data contains unnecessary features. The ML engineer must reduce the overfitting and the impact of the unnecessary features.

Which solution will meet these requirements?

Options:

A.

Apply L1 regularization to the training data. Retrain the model.

B.

Use SageMaker Debugger to apply L1 regularization to the running model.

C.

Increase the number of training iterations. Retrain the model.

D.

Decrease the number of training iterations. Retrain the model.

Buy Now
Questions 36

A company is planning to create several ML prediction models. The training data is stored in Amazon S3. The entire dataset is more than 5 ТВ in size and consists of CSV, JSON, Apache Parquet, and simple text files.

The data must be processed in several consecutive steps. The steps include complex manipulations that can take hours to finish running. Some of the processing involves natural language processing (NLP) transformations. The entire process must be automated.

Which solution will meet these requirements?

Options:

A.

Process data at each step by using Amazon SageMaker Data Wrangler. Automate the process by using Data Wrangler jobs.

B.

Use Amazon SageMaker notebooks for each data processing step. Automate the process by using Amazon EventBridge.

C.

Process data at each step by using AWS Lambda functions. Automate the process by using AWS Step Functions and Amazon EventBridge.

D.

Use Amazon SageMaker Pipelines to create a pipeline of data processing steps. Automate the pipeline by using Amazon EventBridge.

Buy Now
Questions 37

A company has an ML model that is deployed to an Amazon SageMaker AI endpoint for real-time inference. The company needs to deploy a new model. The company must compare the new model’s performance to the currently deployed model ' s performance before shifting all traffic to the new model.

Which solution will meet these requirements with the LEAST operational effort?

Options:

A.

Deploy the new model to a separate endpoint. Manually split traffic between the two endpoints.

B.

Deploy the new model to a separate endpoint. Use Amazon CloudFront to distribute traffic between the two endpoints.

C.

Deploy the new model as a shadow variant on the same endpoint as the current model. Route a portion of live traffic to the shadow model for evaluation.

D.

Use AWS Lambda functions with custom logic to route traffic between the current model and the new model.

Buy Now
Questions 38

A company uses a batching solution to process daily analytics. The company wants to provide near real-time updates, use open-source technology, and avoid managing or scaling infrastructure.

Which solution will meet these requirements?

Options:

A.

Create Amazon Managed Streaming for Apache Kafka (Amazon MSK) Serverless clusters.

B.

Create Amazon MSK Provisioned clusters.

C.

Create Amazon Kinesis Data Streams with Application Auto Scaling.

D.

Create self-hosted Apache Flink applications on Amazon EC2.

Buy Now
Questions 39

An ML engineer wants to re-train an XGBoost model at the end of each month. A data team prepares the training data. The training dataset is a few hundred megabytes in size. When the data is ready, the data team stores the data as a new file in an Amazon S3 bucket.

The ML engineer needs a solution to automate this pipeline. The solution must register the new model version in Amazon SageMaker Model Registry within 24 hours.

Which solution will meet these requirements?

Options:

A.

Create an AWS Lambda function that runs one time each week to poll the S3 bucket for new files. Invoke the Lambda function asynchronously. Configure the Lambda function to start the pipeline if the function detects new data.

B.

Create an Amazon CloudWatch rule that runs on a schedule to start the pipeline every 30 days.

C.

Create an S3 Lifecycle rule to start the pipeline every time a new object is uploaded to the S3 bucket.

D.

Create an Amazon EventBridge rule to start an AWS Step Functions TrainingStep every time a new object is uploaded to the S3 bucket.

Buy Now
Questions 40

A company ' s dataset for prediction analytics contains duplicate records, missing data, and unusually extreme high or low values. The company needs a solution to resolve the data quality issues quickly. The solution must maintain data integrity and have the LEAST operational overhead.

Which solution will meet these requirements?

Options:

A.

Use AWS Glue DataBrew to delete duplicate records, fill missing values with medians, and replace extreme values with values in a normal range.

B.

Configure an AWS Glue job to identify records with missing values and extreme measurements and delete them.

C.

Create an Amazon EMR Spark job to replace missing values with zeros and merge duplicate records.

D.

Use Amazon SageMaker Data Wrangler to delete duplicates, apply statistical modeling for missing values, and apply outlier detection algorithms.

Buy Now
Questions 41

An ML engineer is setting up an Amazon SageMaker AI pipeline for an ML model. The pipeline must automatically initiate a re-training job if any data drift is detected.

How should the ML engineer set up the pipeline to meet this requirement?

Options:

A.

Use an AWS Glue crawler and an AWS Glue extract, transform, and load (ETL) job to detect data drift. Use AWS Glue triggers to automate the retraining job.

B.

Use Amazon Managed Service for Apache Flink to detect data drift. Use an AWS Lambda function to automate the re-training job.

C.

Use SageMaker Model Monitor to detect data drift. Use an AWS Lambda function to automate the re-training job.

D.

Use Amazon Quick Suite (previously known as Amazon QuickSight) anomaly detection to detect data drift. Use an AWS Step Functions workflow to automate the re-training job.

Buy Now
Questions 42

A financial company receives a high volume of real-time market data streams from an external provider. The streams consist of thousands of JSON records per second.

The company needs a scalable AWS solution to identify anomalous data points with the LEAST operational overhead.

Which solution will meet these requirements?

Options:

A.

Ingest data into Amazon Kinesis Data Streams. Use the built-in RANDOM_CUT_FOREST function in Amazon Managed Service for Apache Flink to detect anomalies.

B.

Ingest data into Kinesis Data Streams. Deploy a SageMaker AI endpoint and use AWS Lambda to detect anomalies.

C.

Ingest data into Apache Kafka on Amazon EC2 and use SageMaker AI for detection.

D.

Send data to Amazon SQS and use AWS Glue ETL jobs for batch anomaly detection.

Buy Now
Questions 43

A company needs to create a central catalog for all the company ' s ML models. The models are in AWS accounts where the company developed the models initially. The models are hosted in Amazon Elastic Container Registry (Amazon ECR) repositories.

Which solution will meet these requirements?

Options:

A.

Configure ECR cross-account replication for each existing ECR repository. Ensure that each model is visible in each AWS account.

B.

Create a new AWS account with a new ECR repository as the central catalog. Configure ECR cross-account replication between the initial ECR repositories and the central catalog.

C.

Use the Amazon SageMaker Model Registry to create a model group for models hosted in Amazon ECR. Create a new AWS account. In the new account, use the SageMaker Model Registry as the central catalog. Attach a cross-account resource policy to each model group in the initial AWS accounts.

D.

Use an AWS Glue Data Catalog to store the models. Run an AWS Glue crawler to migrate the models from the ECR repositories to the Data Catalog. Configure cross-account access to the Data Catalog.

Buy Now
Questions 44

A company has developed a new ML model. The company requires online model validation on 10% of the traffic before the company fully releases the model in production. The company uses an Amazon SageMaker endpoint behind an Application Load Balancer (ALB) to serve the model.

Which solution will set up the required online validation with the LEAST operational overhead?

Options:

A.

Use production variants to add the new model to the existing SageMaker endpoint. Set the variant weight to 0.1 for the new model. Monitor the number of invocations by using Amazon CloudWatch.

B.

Use production variants to add the new model to the existing SageMaker endpoint. Set the variant weight to 1 for the new model. Monitor the number of invocations by using Amazon CloudWatch.

C.

Create a new SageMaker endpoint. Use production variants to add the new model to the new endpoint. Monitor the number of invocations by using Amazon CloudWatch.

D.

Configure the ALB to route 10% of the traffic to the new model at the existing SageMaker endpoint. Monitor the number of invocations by using AWS CloudTrail.

Buy Now
Questions 45

An ML engineer is using Amazon SageMaker Canvas to build a custom ML model from an imported dataset. The model must make continuous numeric predictions based on 10 years of data.

Which metric should the ML engineer use to evaluate the model’s performance?

Options:

A.

Accuracy

B.

InferenceLatency

C.

Area Under the ROC Curve (AUC)

D.

Root Mean Square Error (RMSE)

Buy Now
Questions 46

A company has a conversational AI assistant that sends requests through Amazon Bedrock to an Anthropic Claude large language model (LLM). Users report that when they ask similar questions multiple times, they sometimes receive different answers. An ML engineer needs to improve the responses to be more consistent and less random.

Which solution will meet these requirements?

Options:

A.

Increase the temperature parameter and the top_k parameter.

B.

Increase the temperature parameter. Decrease the top_k parameter.

C.

Decrease the temperature parameter. Increase the top_k parameter.

D.

Decrease the temperature parameter and the top_k parameter.

Buy Now
Questions 47

A music streaming company constantly streams song ratings from an application to an Amazon S3 bucket. The company wants to use the ratings as an input for training and inference of an Amazon SageMaker AI model.

The company has an AWS Glue Data Catalog that is configured with the S3 bucket as the source. An ML engineer needs to implement a solution to create a repository for this data. The solution must ensure that the data stays synchronized during batch training and real-time inference.

Which solution will meet these requirements?

Options:

A.

Ingest data into SageMaker Feature Store from the S3 bucket. Apply tags and indexes.

B.

Use Amazon Athena. Create tables by using CREATE TABLE AS SELECT (CTAS) queries to group data.

C.

Use AWS Lake Formation. Apply tag-based control on the data.

D.

Use the Generate Data Insights function in SageMaker Data Wrangler.

Buy Now
Questions 48

A company is running ML models on premises by using custom Python scripts and proprietary datasets. The company is using PyTorch. The model building requires unique domain knowledge. The company needs to move the models to AWS.

Which solution will meet these requirements with the LEAST effort?

Options:

A.

Use SageMaker built-in algorithms to train the proprietary datasets.

B.

Use SageMaker script mode and premade images for ML frameworks.

C.

Build a container on AWS that includes custom packages and a choice of ML frameworks.

D.

Purchase similar production models through AWS Marketplace.

Buy Now
Questions 49

A company wants to improve the sustainability of its ML operations.

Which actions will reduce the energy usage and computational resources that are associated with the company ' s training jobs? (Choose two.)

Options:

A.

Use Amazon SageMaker Debugger to stop training jobs when non-converging conditions are detected.

B.

Use Amazon SageMaker Ground Truth for data labeling.

C.

Deploy models by using AWS Lambda functions.

D.

Use AWS Trainium instances for training.

E.

Use PyTorch or TensorFlow with the distributed training option.

Buy Now
Questions 50

A company is developing a generative AI conversational interface to assist customers with payments. The company wants to use an ML solution to detect customer intent. The company does not have training data to train a model.

Which solution will meet these requirements?

Options:

A.

Fine-tune a sequence-to-sequence (seq2seq) algorithm in Amazon SageMaker JumpStart.

B.

Use an LLM from Amazon Bedrock with zero-shot learning.

C.

Use the Amazon Comprehend DetectEntities API.

D.

Run an LLM from Amazon Bedrock on Amazon EC2 instances.

Buy Now
Questions 51

An ML engineer needs to use AWS CloudFormation to create an ML model that an Amazon SageMaker endpoint will host.

Which resource should the ML engineer declare in the CloudFormation template to meet this requirement?

Options:

A.

AWS::SageMaker::Model

B.

AWS::SageMaker::Endpoint

C.

AWS::SageMaker::NotebookInstance

D.

AWS::SageMaker::Pipeline

Buy Now
Questions 52

A digital media entertainment company needs real-time video content moderation to ensure compliance during live streaming events.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A.

Use Amazon Rekognition and AWS Lambda to extract and analyze the metadata from the videos ' image frames.

B.

Use Amazon Rekognition and a large language model (LLM) hosted on Amazon Bedrock to extract and analyze the metadata from the videos’ image frames.

C.

Use Amazon SageMaker AI to extract and analyze the metadata from the videos ' image frames.

D.

Use Amazon Transcribe and Amazon Comprehend to extract and analyze the metadata from the videos ' image frames.

Buy Now
Questions 53

A company is using ML to predict the presence of a specific weed in a farmer ' s field. The company is using the Amazon SageMaker linear learner built-in algorithm with a value of multiclass_dassifier for the predictorjype hyperparameter.

What should the company do to MINIMIZE false positives?

Options:

A.

Set the value of the weight decay hyperparameter to zero.

B.

Increase the number of training epochs.

C.

Increase the value of the target_precision hyperparameter.

D.

Change the value of the predictorjype hyperparameter to regressor.

Buy Now
Questions 54

A company wants to use large language models (LLMs) that are supported by Amazon Bedrock to develop a chat interface for the company ' s internal technical documentation. The company stores the documentation as dozens of text files that are several megabytes in total size. The company updates the text files often.

Which solution will meet these requirements MOST cost-effectively?

Options:

A.

Create a new LLM on Amazon Bedrock. Train the new LLM on the original dataset and the company documentation. Make the new model available in Bedrock for calls from the chat interface.

B.

Integrate the company documentation with Amazon Bedrock guardrails. Invoke the guardrails for all Amazon Bedrock calls from the chat interface.

C.

Use all the text files to fine tune a model in Amazon Bedrock. Use the fine-tuned model to process user prompts.

D.

Upload all the text files to an Amazon Bedrock knowledge base. Use the knowledge base to provide context when the chat interface makes calls to Amazon Bedrock.

Buy Now
Questions 55

An ML engineer is developing a fraud detection model by using the Amazon SageMaker XGBoost algorithm. The model classifies transactions as either fraudulent or legitimate.

During testing, the model excels at identifying fraud in the training dataset. However, the model is inefficient at identifying fraud in new and unseen transactions.

What should the ML engineer do to improve the fraud detection for new transactions?

Options:

A.

Increase the learning rate.

B.

Remove some irrelevant features from the training dataset.

C.

Increase the value of the max_depth hyperparameter.

D.

Decrease the value of the max_depth hyperparameter.

Buy Now
Questions 56

A company is using Amazon SageMaker AI to develop a credit risk assessment model. During model validation, the company finds that the model achieves 82% accuracy on the validation data. However, the model achieved 99% accuracy on the training data. The company needs to address the model accuracy issue before deployment.

Which solution will meet this requirement?

Options:

A.

Add more dense layers to increase model complexity. Implement batch normalization. Use early stopping during training.

B.

Implement dropout layers. Use L1 or L2 regularization. Perform k-fold cross-validation.

C.

Use principal component analysis (PCA) to reduce the feature dimensionality. Decrease model layers. Implement cross-entropy loss functions.

D.

Augment the training dataset. Remove duplicate records from the training dataset. Implement stratified sampling.

Buy Now
Questions 57

A company uses a batching solution to process data analytics each day. The company wants to build an analytics platform to provide near real-time updates. The company wants to use open source technology and does not want to manage or scale the infrastructure.

Which solution will meet these requirements?

Options:

A.

Create Amazon Managed Streaming for Apache Kafka (Amazon MSK) Serverless clusters to process the data.

B.

Create Amazon Managed Streaming for Apache Kafka (Amazon MSK) Provisioned clusters. Configure the clusters based on data volume.

C.

Create data streams in Amazon Kinesis Data Streams. Use AWS Application Auto Scaling to scale the infrastructure.

D.

Create self-hosted Apache Flink applications on Amazon EC2. Run the applications as containers.

Buy Now
Questions 58

A company needs to ingest data from data sources into Amazon SageMaker Data Wrangler. The data sources are Amazon S3, Amazon Redshift, and Snowflake. The ingested data must always be up to date with the latest changes in the source systems.

Which solution will meet these requirements?

Options:

A.

Use direct connections to import data from the data sources into Data Wrangler.

B.

Use cataloged connections to import data from the data sources into Data Wrangler.

C.

Use AWS Glue to extract data from the data sources. Use AWS Glue also to import the data directly into Data Wrangler.

D.

Use AWS Lambda to extract data from the data sources. Use Lambda also to import the data directly into Data Wrangler.

Buy Now
Questions 59

An ML engineer wants to use Amazon SageMaker Data Wrangler to perform preprocessing on a dataset. The ML engineer wants to use the processed dataset to train a classification model. During preprocessing, the ML engineer notices that a text feature has a range of thousands of values that differ only by spelling errors. The ML engineer needs to apply an encoding method so that after preprocessing is complete, the text feature can be used to train the model.

Which solution will meet these requirements?

Options:

A.

Perform ordinal encoding to represent categories of the feature.

B.

Perform similarity encoding to represent categories of the feature.

C.

Perform one-hot encoding to represent categories of the feature.

D.

Perform target encoding to represent categories of the feature.

Buy Now
Questions 60

An ML engineer is deploying a generative AI model-based customer support agent that uses Amazon SageMaker AI for inference. The customer support agent must respond to customer questions about topics such as shipping policies, refund processes, and account management. The generative AI model generates one token at a time.

Customers report dissatisfaction with how long the customer support agent takes to generate lengthy responses to questions. The ML engineer must apply an inference optimization technique to improve the performance of the customer support agent.

Which solution will meet this requirement?

Options:

A.

Compilation

B.

Speculative decoding

C.

Quantization

D.

Fast model loading

Buy Now
Questions 61

A company uses Amazon Athena to query a dataset in Amazon S3. The dataset has a target variable that the company wants to predict.

The company needs to use the dataset in a solution to determine if a model can predict the target variable.

Which solution will provide this information with the LEAST development effort?

Options:

A.

Create a new model by using Amazon SageMaker Autopilot. Report the model ' s achieved performance.

B.

Implement custom scripts to perform data pre-processing, multiple linear regression, and performance evaluation. Run the scripts on Amazon EC2 instances.

C.

Configure Amazon Macie to analyze the dataset and to create a model. Report the model ' s achieved performance.

D.

Select a model from Amazon Bedrock. Tune the model with the data. Report the model ' s achieved performance.

Buy Now
Questions 62

An ML engineer needs to deploy ML models to get inferences from large datasets in an asynchronous manner. The ML engineer also needs to implement scheduled monitoring of the data quality of the models. The ML engineer must receive alerts when changes in data quality occur.

Which solution will meet these requirements?

Options:

A.

Deploy the models by using scheduled AWS Glue jobs. Use Amazon CloudWatch alarms to monitor the data quality and to send alerts.

B.

Deploy the models by using scheduled AWS Batch jobs. Use AWS CloudTrail to monitor the data quality and to send alerts.

C.

Deploy the models by using Amazon Elastic Container Service (Amazon ECS) on AWS Fargate. Use Amazon EventBridge to monitor the data quality and to send alerts.

D.

Deploy the models by using Amazon SageMaker AI batch transform. Use SageMaker Model Monitor to monitor the data quality and to send alerts.

Buy Now
Questions 63

An ML engineer needs to organize a large set of text documents into topics. The ML engineer will not know what the topics are in advance. The ML engineer wants to use built-in algorithms or pre-trained models available through Amazon SageMaker AI to process the documents.

Which solution will meet these requirements?

Options:

A.

Use the BlazingText algorithm to identify the relevant text and to create a set of topics based on the documents.

B.

Use the Sequence-to-Sequence algorithm to summarize the text and to create a set of topics based on the documents.

C.

Use the Object2Vec algorithm to create embeddings and to create a set of topics based on the embeddings.

D.

Use the Latent Dirichlet Allocation (LDA) algorithm to process the documents and to create a set of topics based on the documents.

Buy Now
Questions 64

A company wants to share data with a vendor in real time to improve the performance of the vendor ' s ML models. The vendor needs to ingest the data in a stream. The vendor will use only some of the columns from the streamed data.

Which solution will meet these requirements?

Options:

A.

Use AWS Data Exchange to stream the data to an Amazon S3 bucket. Use an Amazon Athena CREATE TABLE AS SELECT (CTAS) query to define relevant columns.

B.

Use Amazon Kinesis Data Streams to ingest the data. Use Amazon Managed Service for Apache Flink as a consumer to extract relevant columns.

C.

Create an Amazon S3 bucket. Configure the S3 bucket policy to allow the vendor to upload data to the S3 bucket. Configure the S3 bucket policy to control which columns are shared.

D.

Use AWS Lake Formation to ingest the data. Use the column-level filtering feature in Lake Formation to extract relevant columns.

Buy Now
Questions 65

A company is developing an ML model by using Amazon SageMaker AI. The company must monitor bias in the model and display the results on a dashboard. An ML engineer creates a bias monitoring job.

How should the ML engineer capture bias metrics to display on the dashboard?

Options:

A.

Capture AWS CloudTrail metrics from SageMaker Clarify.

B.

Capture Amazon CloudWatch metrics from SageMaker Clarify.

C.

Capture SageMaker Model Monitor metrics from Amazon EventBridge.

D.

Capture SageMaker Model Monitor metrics from Amazon SNS.

Buy Now
Questions 66

A company runs an Amazon SageMaker AI domain in a public subnet of a newly created VPC. The network is configured properly, and ML engineers can access the SageMaker AI domain.

Recently, the company discovered suspicious traffic to the domain from a specific IP address. The company needs to block traffic from the specific IP address.

Which update to the network configuration will meet this requirement?

Options:

A.

Create a security group inbound rule to deny traffic from the specific IP address. Assign the security group to the domain.

B.

Create a network ACL inbound rule to deny traffic from the specific IP address. Assign the rule to the default network ACL for the subnet where the domain is located.

C.

Create a shadow variant for the domain. Configure SageMaker Inference Recommender to send traffic from the specific IP address to the shadow endpoint.

D.

Create a VPC route table to deny inbound traffic from the specific IP address. Assign the route table to the domain.

Buy Now
Questions 67

An ML engineer wants to run a training job on Amazon SageMaker AI by using multiple GPUs. The training dataset is stored in Apache Parquet format.

The Parquet files are too large to fit into the memory of the SageMaker AI training instances.

Which solution will fix the memory problem?

Options:

A.

Attach an Amazon EBS Provisioned IOPS SSD volume and store the files on the EBS volume.

B.

Repartition the Parquet files by using Apache Spark on Amazon EMR and use the repartitioned files for training.

C.

Change to memory-optimized instance types with sufficient memory.

D.

Use SageMaker distributed data parallelism (SMDDP) to split memory usage.

Buy Now
Questions 68

An ML engineer has a custom container that performs k-fold cross-validation and logs an average F1 score during training. The ML engineer wants Amazon SageMaker AI Automatic Model Tuning (AMT) to select hyperparameters that maximize the average F1 score.

How should the ML engineer integrate the custom metric into SageMaker AI AMT?

Options:

A.

Define the average F1 score in the TrainingInputMode parameter.

B.

Define a metric definition in the tuning job that uses a regular expression to capture the average F1 score from the training logs.

C.

Publish the average F1 score as a custom Amazon CloudWatch metric.

D.

Write the F1 score to a JSON file in Amazon S3 and reference it in ObjectiveMetricName.

Buy Now
Questions 69

An ML engineer must choose the appropriate Amazon SageMaker algorithm to solve specific AI problems.

Select the correct SageMaker built-in algorithm from the following list for each use case. Each algorithm should be selected one time.

• Random Cut Forest (RCF) algorithm

• Semantic segmentation algorithm

• Sequence-to-Sequence (seq2seq) algorithm

MLA-C01 Question 69

Options:

Buy Now
Questions 70

A company has significantly increased the amount of data stored as .csv files in an Amazon S3 bucket. Data transformation scripts and queries are now taking much longer than before.

An ML engineer must implement a solution to optimize the data for query performance with the LEAST operational overhead.

Which solution will meet this requirement?

Options:

A.

Configure an AWS Lambda function to split the .csv files into smaller objects.

B.

Configure an AWS Glue job to drop string-type columns and save the results to S3.

C.

Configure an AWS Glue ETL job to convert the .csv files to Apache Parquet format.

D.

Configure an Amazon EMR cluster to process the data in S3.

Buy Now
Questions 71

An ML engineer is analyzing a classification dataset before training a model in Amazon SageMaker AI. The ML engineer suspects that the dataset has a significant imbalance between class labels that could lead to biased model predictions. To confirm class imbalance, the ML engineer needs to select an appropriate pre-training bias metric.

Which metric will meet this requirement?

Options:

A.

Mean squared error (MSE)

B.

Difference in proportions of labels (DPL)

C.

Silhouette score

D.

Structural similarity index measure (SSIM)

Buy Now
Questions 72

An ML engineer needs to use an ML model to predict the price of apartments in a specific location.

Which metric should the ML engineer use to evaluate the model’s performance?

Options:

A.

Accuracy

B.

Area Under the ROC Curve (AUC)

C.

F1 score

D.

Mean absolute error (MAE)

Buy Now
Exam Code: MLA-C01
Exam Name: AWS Certified Machine Learning Engineer - Associate
Last Update: May 22, 2026
Questions: 241

PDF + Testing Engine

$49.5  $164.99

Testing Engine

$37.5  $124.99
buy now MLA-C01 testing engine

PDF (Q&A)

$31.5  $104.99
buy now MLA-C01 pdf
dumpsmate guaranteed to pass

24/7 Customer Support

DumpsMate's team of experts is always available to respond your queries on exam preparation. Get professional answers on any topic of the certification syllabus. Our experts will thoroughly satisfy you.

Site Secure

mcafee secure

TESTED 22 May 2026