The Google GCP-PMLE exam preparation guide is designed to provide candidates with the information they need for the Professional Machine Learning Engineer exam. It includes an exam summary, sample questions, a practice test, and the exam objectives, along with guidance on interpreting those objectives, so candidates can gauge the types of questions they may encounter during the Google Cloud Platform - Professional Machine Learning Engineer (GCP-PMLE) exam.
All candidates are encouraged to review the GCP-PMLE objectives and sample questions provided in this preparation guide. The Google Professional Machine Learning Engineer certification targets candidates who want to build a career in the machine learning domain and demonstrate their expertise. We suggest using the practice exam listed in this guide to become familiar with the exam environment and to identify the knowledge areas that need more work before you take the actual Google Professional Machine Learning Engineer exam.
Google GCP-PMLE Exam Summary:
| Exam Name | Google Professional Machine Learning Engineer |
|---|---|
| Exam Code | GCP-PMLE |
| Exam Price | $200 USD |
| Duration | 120 minutes |
| Number of Questions | 50-60 |
| Passing Score | Pass / Fail (approx. 70%) |
| Recommended Training / Books | Google Cloud training; Google Cloud documentation |
| Schedule Exam | Google CertMetrics |
| Sample Questions | Google GCP-PMLE Sample Questions |
| Recommended Practice | Google Cloud Platform - Professional Machine Learning Engineer (GCP-PMLE) Practice Test |
Google Professional Machine Learning Engineer Syllabus:
**Architecting low-code AI solutions (13% of the exam)**

Developing ML models by using BigQuery ML. Considerations include (see the sketch after this list):

- Building the appropriate BigQuery ML model (e.g., linear and binary classification, regression, time-series, matrix factorization, boosted trees, autoencoders) based on the business problem
- Feature engineering or selection by using BigQuery ML
- Generating predictions by using BigQuery ML
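A minimal sketch of what this looks like in practice, assuming a hypothetical `my-project` project and `my_dataset` tables: a logistic regression model is trained entirely inside BigQuery and then queried with `ML.PREDICT` from the Python client.

```python
# Train and query a BigQuery ML model from Python.
# Project, dataset, table, and column names are placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project ID

# Train a binary classification model directly in BigQuery.
create_model_sql = """
CREATE OR REPLACE MODEL `my-project.my_dataset.churn_model`
OPTIONS (
  model_type = 'logistic_reg',
  input_label_cols = ['churned']
) AS
SELECT tenure_months, monthly_charges, churned
FROM `my-project.my_dataset.customers`
"""
client.query(create_model_sql).result()  # blocks until training finishes

# Generate predictions with ML.PREDICT.
predict_sql = """
SELECT predicted_churned, predicted_churned_probs
FROM ML.PREDICT(
  MODEL `my-project.my_dataset.churn_model`,
  (SELECT tenure_months, monthly_charges
   FROM `my-project.my_dataset.new_customers`))
"""
for row in client.query(predict_sql).result():
    print(dict(row))
```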
Building AI solutions by using ML APIs or foundational models. Considerations include (a sketch follows this list):

- Building applications by using ML APIs from Model Garden
- Building applications by using industry-specific APIs (e.g., Document AI API, Retail API)
- Implementing retrieval augmented generation (RAG) applications by using Vertex AI Agent Builder
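As a rough sketch of calling a foundation model through the Vertex AI SDK, assuming a hypothetical project and an example Model Garden model name:

```python
# Call a foundation model via the Vertex AI SDK.
# Project, location, and model name are placeholder assumptions.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")

model = GenerativeModel("gemini-1.5-flash")  # example model from Model Garden
response = model.generate_content(
    "Summarize the return policy in two sentences: ..."
)
print(response.text)
```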
Training models by using AutoML. Considerations include (see the sketch below):

- Preparing data for AutoML (e.g., feature selection, data labeling, Tabular Workflows on AutoML)
- Using available data (e.g., tabular, text, speech, images, videos) to train custom models
- Using AutoML for tabular data
- Creating forecasting models by using AutoML
- Configuring and debugging trained models
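A minimal AutoML tabular training sketch with the Vertex AI SDK; the bucket path, display names, target column, and budget are all hypothetical:

```python
# AutoML tabular classification with the Vertex AI SDK.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Create a managed tabular dataset from a CSV in Cloud Storage.
dataset = aiplatform.TabularDataset.create(
    display_name="churn-data",
    gcs_source=["gs://my-bucket/churn/train.csv"],
)

job = aiplatform.AutoMLTabularTrainingJob(
    display_name="churn-automl",
    optimization_prediction_type="classification",
)

model = job.run(
    dataset=dataset,
    target_column="churned",
    budget_milli_node_hours=1000,  # one node hour of training budget
)
```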
**Collaborating within and across teams to manage data and models (14% of the exam)**

Exploring and preprocessing organization-wide data (e.g., Cloud Storage, BigQuery, Cloud Spanner, Cloud SQL, Apache Spark, Apache Hadoop). Considerations include (a preprocessing sketch follows this list):

- Organizing different types of data (e.g., tabular, text, speech, images, videos) for efficient training
- Managing datasets in Vertex AI
- Data preprocessing (e.g., Dataflow, TensorFlow Extended [TFX], BigQuery)
- Creating and consolidating features in Vertex AI Feature Store
- Privacy implications of data usage and/or collection (e.g., handling sensitive data such as personally identifiable information [PII] and protected health information [PHI])
- Ingesting different data sources (e.g., text documents) into Vertex AI for inference
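A minimal Apache Beam sketch of the kind of batch preprocessing that could run on Dataflow; the bucket paths and field positions are placeholder assumptions:

```python
# Batch preprocessing with Apache Beam (runnable locally;
# pass DataflowRunner pipeline options to execute on Google Cloud).
import csv
import json
import apache_beam as beam

def parse_and_clean(line):
    """Parse one CSV line and normalize the amount field."""
    row = next(csv.reader([line]))
    return {"user_id": row[0], "amount": float(row[1]) / 100.0}

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "Read" >> beam.io.ReadFromText(
            "gs://my-bucket/raw/events.csv", skip_header_lines=1)
        | "Clean" >> beam.Map(parse_and_clean)
        | "ToJson" >> beam.Map(json.dumps)
        | "Write" >> beam.io.WriteToText("gs://my-bucket/clean/events")
    )
```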
Model prototyping using Jupyter notebooks. Considerations include (a quick prototyping sketch follows this list):

- Choosing the appropriate Jupyter backend on Google Cloud (e.g., Vertex AI Workbench, Colab Enterprise, notebooks on Dataproc)
- Applying security best practices in Vertex AI Workbench
- Using Spark kernels
- Integrating code source repositories
- Developing models in Vertex AI Workbench by using common frameworks (e.g., TensorFlow, PyTorch, sklearn, Spark, JAX)
- Leveraging a variety of foundational and open-source models in Model Garden
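A notebook-style prototyping sketch using scikit-learn; the built-in dataset is a stand-in for data you would normally load from BigQuery or Cloud Storage:

```python
# Quick model prototype, as you might run it in a Workbench notebook cell.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
print("holdout accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```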
Tracking and running ML experiments. Considerations include (a minimal tracking sketch follows):

- Choosing the appropriate Google Cloud environment for development and experimentation (e.g., Vertex AI Experiments, Kubeflow Pipelines, Vertex AI TensorBoard with TensorFlow and PyTorch) given the framework
- Evaluating generative AI solutions
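A minimal sketch of run tracking with Vertex AI Experiments; the experiment, run names, and logged values are placeholders:

```python
# Track parameters and metrics for one run in Vertex AI Experiments.
from google.cloud import aiplatform

aiplatform.init(
    project="my-project",
    location="us-central1",
    experiment="churn-experiments",
)

aiplatform.start_run("run-lr-01")
aiplatform.log_params({"learning_rate": 0.01, "batch_size": 64})
# ... train the model here ...
aiplatform.log_metrics({"val_accuracy": 0.91, "val_loss": 0.27})
aiplatform.end_run()
```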
**Scaling prototypes into ML models (18% of the exam)**

Building models. Considerations include (see the interpretability sketch below):

- Choosing ML framework and model architecture
- Modeling techniques given interpretability requirements
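One way interpretability requirements shape model choice: a linear model whose coefficients read directly as feature effects. A minimal scikit-learn sketch:

```python
# An interpretable model choice: standardized logistic regression,
# with coefficients inspected as per-feature effects.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(data.data, data.target)

# Largest absolute coefficients = most influential features.
coefs = model.named_steps["logisticregression"].coef_[0]
ranked = sorted(zip(data.feature_names, coefs), key=lambda t: -abs(t[1]))
for name, coef in ranked[:5]:
    print(f"{name}: {coef:+.2f}")
```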
Training models. Considerations include (see the custom-training sketch after this list):

- Organizing training data (e.g., tabular, text, speech, images, videos) on Google Cloud (e.g., Cloud Storage, BigQuery)
- Ingestion of various file types (e.g., CSV, JSON, images, Hadoop, databases) into training
- Training using different SDKs (e.g., Vertex AI custom training, Kubeflow on Google Kubernetes Engine, AutoML, tabular workflows)
- Using distributed training to organize reliable pipelines
- Hyperparameter tuning
- Troubleshooting ML model training failures
- Fine-tuning foundational models (e.g., Vertex AI, Model Garden)
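A minimal sketch of submitting a Vertex AI custom training job; the script path, container image, bucket, and arguments are placeholder assumptions:

```python
# Submit a custom training job on Vertex AI from a local training script.
from google.cloud import aiplatform

aiplatform.init(
    project="my-project",
    location="us-central1",
    staging_bucket="gs://my-staging-bucket",
)

job = aiplatform.CustomTrainingJob(
    display_name="churn-custom-train",
    script_path="train.py",  # your local training script
    container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-12:latest",
    requirements=["pandas"],
)

# Runs train.py in the container on the requested machine shape.
job.run(
    replica_count=1,
    machine_type="n1-standard-4",
    args=["--epochs", "10"],
)
```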
Choosing appropriate hardware for training. Considerations include (a distribution-strategy sketch follows):

- Evaluation of compute and accelerator options (e.g., CPU, GPU, TPU, edge devices)
- Distributed training with TPUs and GPUs (e.g., Reduction Server on Vertex AI, Horovod)
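A minimal TensorFlow sketch of synchronous data-parallel training across all GPUs visible on one machine; the model itself is a toy placeholder:

```python
# Mirror model variables across all local GPUs for data-parallel training.
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()  # uses all local GPUs, or CPU
print("replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():  # variables created here are mirrored across replicas
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

# model.fit(train_dataset) would now shard each batch across the replicas.
```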
**Serving and scaling models (20% of the exam)**

Serving models. Considerations include (see the deployment sketch below):

- Batch and online inference (e.g., Vertex AI, Dataflow, BigQuery ML, Dataproc)
- Using different frameworks (e.g., PyTorch, XGBoost) to serve models
- Organizing a model registry
- A/B testing different versions of a model
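A minimal sketch of registering a model and routing a slice of traffic to it for A/B testing behind a Vertex AI endpoint; artifact URIs, container image, and names are placeholder assumptions:

```python
# Register a model version and A/B test it behind one endpoint.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

model_v2 = aiplatform.Model.upload(
    display_name="churn-model",
    artifact_uri="gs://my-bucket/models/churn/v2/",
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-3:latest"
    ),
)

endpoint = aiplatform.Endpoint.create(display_name="churn-endpoint")
model_v2.deploy(
    endpoint=endpoint,
    machine_type="n1-standard-4",
    traffic_percentage=10,  # 10% to the new version; the rest stays put
)

# Online inference against whichever version the traffic split selects.
print(endpoint.predict(instances=[[12, 70.5]]))
```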
Scaling online model serving. Considerations include (an autoscaling sketch follows this list):

- Vertex AI Feature Store
- Vertex AI public and private endpoints
- Choosing appropriate hardware (e.g., CPU, GPU, TPU, edge)
- Scaling the serving backend based on throughput (e.g., Vertex AI Prediction, containerized serving)
- Tuning ML models for training and serving in production (e.g., simplification techniques, optimizing the ML solution for increased performance, latency, memory, throughput)
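A minimal sketch of throughput-based autoscaling settings at deploy time; the model resource name and hardware choices are placeholder assumptions:

```python
# Deploy with autoscaling bounds so the backend scales with request volume.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

model = aiplatform.Model(
    "projects/my-project/locations/us-central1/models/1234567890"
)
model.deploy(
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,
    min_replica_count=1,  # keep one replica warm to bound latency
    max_replica_count=5,  # scale out as throughput grows
)
```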
**Automating and orchestrating ML pipelines (22% of the exam)**

Developing end-to-end ML pipelines. Considerations include (see the pipeline sketch after this list):

- Data and model validation
- Ensuring consistent data pre-processing between training and serving
- Hosting third-party pipelines on Google Cloud (e.g., MLflow)
- Identifying components, parameters, triggers, and compute needs (e.g., Cloud Build, Cloud Run)
- Orchestration framework (e.g., Kubeflow Pipelines, Vertex AI Pipelines, Cloud Composer)
- Hybrid or multicloud strategies
- System design with TFX components or Kubeflow DSL (e.g., Dataflow)
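A minimal Kubeflow Pipelines (KFP v2) sketch of a two-step pipeline compiled for Vertex AI Pipelines; the component logic, names, and paths are placeholders:

```python
# Define, compile, and (optionally) submit a KFP pipeline.
from kfp import compiler, dsl

@dsl.component
def validate_data(rows: int) -> bool:
    # Stand-in for a real data validation step.
    return rows > 0

@dsl.component
def train_model(data_ok: bool) -> str:
    if not data_ok:
        raise ValueError("validation failed")
    return "gs://my-bucket/models/churn/"  # pretend artifact location

@dsl.pipeline(name="churn-training-pipeline")
def churn_pipeline(rows: int = 1000):
    check = validate_data(rows=rows)
    train_model(data_ok=check.output)  # runs only after validation

compiler.Compiler().compile(churn_pipeline, "churn_pipeline.json")

# To run on Vertex AI Pipelines:
# from google.cloud import aiplatform
# aiplatform.PipelineJob(
#     display_name="churn-pipeline",
#     template_path="churn_pipeline.json",
#     pipeline_root="gs://my-bucket/pipeline-root/",
# ).run()
```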
Automating model retraining. Considerations include (a trigger sketch follows):

- Determining an appropriate retraining policy
- Continuous integration and continuous delivery (CI/CD) model deployment (e.g., Cloud Build, Jenkins)
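One possible shape for an automated retraining trigger: a Cloud Functions-style handler that launches the training pipeline when, say, a drift alert arrives on Pub/Sub. The function name, template path, and event source are assumptions:

```python
# Pub/Sub-triggered retraining (1st-gen Cloud Functions signature).
from google.cloud import aiplatform

def trigger_retraining(event, context):
    """Kick off the retraining pipeline when a message arrives."""
    aiplatform.init(project="my-project", location="us-central1")
    aiplatform.PipelineJob(
        display_name="churn-retrain",
        template_path="gs://my-bucket/pipelines/churn_pipeline.json",
        pipeline_root="gs://my-bucket/pipeline-root/",
    ).submit()  # fire and forget; CI/CD (e.g., Cloud Build) ships the template
```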
Tracking and auditing metadata. Considerations include (see the sketch below):

- Tracking and comparing model artifacts and versions (e.g., Vertex AI Experiments, Vertex ML Metadata)
- Hooking into model and dataset versioning
- Model and data lineage
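A minimal sketch of auditing and comparing runs recorded in Vertex AI Experiments; the experiment name and column labels are placeholder assumptions:

```python
# Pull all runs of an experiment into a DataFrame for side-by-side comparison.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# One row per run; logged params and metrics appear as columns.
df = aiplatform.get_experiment_df("churn-experiments")
print(df[["run_name", "param.learning_rate", "metric.val_accuracy"]])
```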
**Monitoring AI solutions (13% of the exam)**

Identifying risks to AI solutions. Considerations include (an explainability sketch follows this list):

- Building secure AI systems by protecting against unintentional exploitation of data or models (e.g., hacking)
- Aligning with Google’s Responsible AI practices (e.g., monitoring for bias)
- Assessing AI solution readiness (e.g., fairness, bias)
- Model explainability on Vertex AI (e.g., Vertex AI Prediction)
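A minimal sketch of requesting feature attributions from a Vertex AI endpoint, assuming the model was deployed with an explanation spec; the endpoint ID and instance fields are placeholders:

```python
# Request per-feature attributions from an explanation-enabled endpoint.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890"
)
response = endpoint.explain(
    instances=[{"tenure_months": 12, "monthly_charges": 70.5}]
)
for explanation in response.explanations:
    for attribution in explanation.attributions:
        print(attribution.feature_attributions)  # per-feature contributions
```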
Monitoring, testing, and troubleshooting AI solutions. Considerations include (see the monitoring sketch below):

- Establishing continuous evaluation metrics (e.g., Vertex AI Model Monitoring, Explainable AI)
- Monitoring for training-serving skew
- Monitoring for feature attribution drift
- Monitoring model performance against baselines, simpler models, and across the time dimension
- Monitoring for common training and serving errors
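A minimal sketch of a Vertex AI Model Monitoring job watching a deployed endpoint for training-serving skew and prediction drift; the thresholds, email, and resource names are placeholder assumptions:

```python
# Create a model monitoring job for skew and drift on one endpoint.
from google.cloud import aiplatform
from google.cloud.aiplatform import model_monitoring

aiplatform.init(project="my-project", location="us-central1")

skew_config = model_monitoring.SkewDetectionConfig(
    data_source="bq://my-project.my_dataset.training_data",  # baseline
    target_field="churned",
    skew_thresholds={"tenure_months": 0.3},
)
drift_config = model_monitoring.DriftDetectionConfig(
    drift_thresholds={"tenure_months": 0.3},
)
objective_config = model_monitoring.ObjectiveConfig(
    skew_detection_config=skew_config,
    drift_detection_config=drift_config,
)

aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="churn-monitoring",
    endpoint="projects/my-project/locations/us-central1/endpoints/1234567890",
    logging_sampling_strategy=model_monitoring.RandomSampleConfig(
        sample_rate=0.5  # sample half of live requests
    ),
    schedule_config=model_monitoring.ScheduleConfig(monitor_interval=1),  # hours
    alert_config=model_monitoring.EmailAlertConfig(
        user_emails=["ml-team@example.com"]
    ),
    objective_configs=objective_config,
)
```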