Machine Learning Production Systems


Description

Using machine learning for products, services, and critical business processes is quite different from using ML in an academic or research setting, especially for recent ML graduates and those moving from research to a commercial environment. Whether you currently work to create products and services that use ML, or would like to in the future, this practical book gives you a broad view of the entire field.

Authors Robert Crowe, Hannes Hapke, Emily Caveness, and Di Zhu help you identify topics that you can dive into deeper, along with reference materials and tutorials that teach you the details. You'll learn the state of the art of machine learning engineering, including a wide range of topics such as modeling, deployment, and MLOps. You'll learn the basics and advanced aspects to understand the production ML lifecycle.

This book provides four in-depth sections that cover all aspects of machine learning engineering:
  • Data: collecting, labeling, validating, automation, and data preprocessing; data feature engineering and selection; data journey and storage
  • Modeling: high-performance modeling; model resource management techniques; model analysis and interpretability; neural architecture search
  • Deployment: model serving patterns and infrastructure for ML models and LLMs; management and delivery; monitoring and logging
  • Productionalizing: ML pipelines; classifying unstructured texts and images; GenAI model pipelines

Table of contents:

Foreword
Preface: Who Should Read This Book; Why We Wrote This Book; Navigating This Book; Conventions Used in This Book; Using Code Examples; O'Reilly Online Learning; How to Contact Us; Acknowledgments (Robert, Hannes, Emily, Di)
1. Introduction to Machine Learning Production Systems: What Is Production Machine Learning?; Benefits of Machine Learning Pipelines; Focus on Developing New Models, Not on Maintaining Existing Models; Prevention of Bugs; Creation of Records for Debugging and Reproducing Results; Standardization; The Business Case for ML Pipelines; When to Use Machine Learning Pipelines; Steps in a Machine Learning Pipeline; Data Ingestion and Data Versioning; Data Validation; Feature Engineering; Model Training and Model Tuning; Model Analysis; Model Deployment; Looking Ahead
2. Collecting, Labeling, and Validating Data: Important Considerations in Data Collection; Responsible Data Collection; Labeling Data: Data Changes and Drift in Production ML; Labeling Data: Direct Labeling and Human Labeling; Validating Data: Detecting Data Issues; Validating Data: TensorFlow Data Validation; Skew Detection with TFDV; Types of Skew; Example: Spotting Imbalanced Datasets with TensorFlow Data Validation; Conclusion
3. Feature Engineering and Feature Selection: Introduction to Feature Engineering; Preprocessing Operations; Feature Engineering Techniques; Normalizing and Standardizing; Bucketizing; Feature Crosses; Dimensionality and Embeddings; Visualization; Feature Transformation at Scale; Choose a Framework That Scales Well; Avoid Training-Serving Skew; Consider Instance-Level Versus Full-Pass Transformations; Using TensorFlow Transform; Analyzers; Code Example; Feature Selection; Feature Spaces; Feature Selection Overview; Filter Methods; Wrapper Methods; Forward selection; Backward elimination; Recursive feature elimination; Code example; Embedded Methods; Feature and Example Selection for LLMs and GenAI; Example: Using TF Transform to Tokenize Text; Benefits of Using TF Transform; Alternatives to TF Transform; Conclusion
4. Data Journey and Data Storage: Data Journey; ML Metadata; Using a Schema; Schema Development; Schema Environments; Changes Across Datasets; Enterprise Data Storage; Feature Stores; Metadata; Precomputed features; Time travel; Data Warehouses; Data Lakes; Conclusion
5. Advanced Labeling, Augmentation, and Data Preprocessing: Advanced Labeling; Semi-Supervised Labeling; Label propagation; Sampling techniques; Active Learning; Margin sampling; Other sampling techniques; Weak Supervision; Advanced Labeling Review; Data Augmentation; Example: CIFAR-10; Other Augmentation Techniques; Data Augmentation Review; Preprocessing Time Series Data: An Example; Windowing; Sampling; Conclusion
6. Model Resource Management Techniques: Dimensionality Reduction: Dimensionality Effect on Performance; Example: Word Embedding Using Keras; Curse of Dimensionality; Adding Dimensions Increases Feature Space Volume; Dimensionality Reduction; Three approaches; Algorithmic dimensionality reduction; Principal component analysis; Quantization and Pruning; Mobile, IoT, Edge, and Similar Use Cases; Quantization; Benefits and process of quantization; MobileNets; Post-training quantization; Quantization-aware training; Comparing results; Example: Quantizing models with TF Lite; Optimizing Your TensorFlow Model with TF Lite; Optimization Options; Pruning; The Lottery Ticket Hypothesis; Pruning in TensorFlow; Knowledge Distillation; Teacher and Student Networks; Knowledge Distillation Techniques; TMKD: Distilling Knowledge for a Q&A Task; Increasing Robustness by Distilling EfficientNets; Conclusion
7. High-Performance Modeling: Distributed Training; Data Parallelism; Synchronous versus asynchronous training; Distribution awareness; tf.distribute: Distributed training in TensorFlow; OneDeviceStrategy; MirroredStrategy; ParameterServerStrategy; Fault tolerance; Efficient Input Pipelines; Input Pipeline Basics; Input Pipeline Patterns: Improving Efficiency; Optimizing Your Input Pipeline with TensorFlow Data; Prefetching; Parallelizing data transformation; Caching; Training Large Models: The Rise of Giant Neural Nets and Parallelism; Potential Solutions and Their Shortcomings; Gradient accumulation; Swapping; Parallelism, revisited in the context of giant neural nets; Pipeline Parallelism to the Rescue?; Conclusion
8. Model Analysis: Analyzing Model Performance; Black-Box Evaluation; Performance Metrics and Optimization Objectives; Advanced Model Analysis; TensorFlow Model Analysis; The Learning Interpretability Tool; Advanced Model Debugging; Benchmark Models; Sensitivity Analysis; Random attacks; Partial dependence plots; Vulnerability to attacks; Measuring model vulnerability; Hardening your models; Residual Analysis; Model Remediation; Discrimination Remediation; Fairness; Fairness Evaluation; True/false positive/negative rates; Accuracy and AUC; Fairness Considerations; Continuous Evaluation and Monitoring; Conclusion
9. Interpretability: Explainable AI; Model Interpretation Methods; Method Categories; Intrinsic or post hoc?; Model specific or model agnostic?; Local or global?; Intrinsically Interpretable Models; Feature importance; Lattice models; Model-Agnostic Methods; Partial dependence plots; Permutation feature importance; Local Interpretable Model-Agnostic Explanations; Shapley Values; The SHAP Library; Testing Concept Activation Vectors; AI Explanations; Integrated gradients; XRAI; Example: Exploring Model Sensitivity with SHAP; Regression Models; Natural Language Processing Models; Conclusion
10. Neural Architecture Search: Hyperparameter Tuning; Introduction to AutoML; Key Components of NAS; Search Spaces; Macro search space; Micro search space; Search Strategies; Performance Estimation Strategies; Simple approach to performance estimation; More efficient performance estimation; AutoML in the Cloud; Amazon SageMaker Autopilot; Microsoft Azure Automated Machine Learning; Google Cloud AutoML; Using AutoML; Generative AI and AutoML; Conclusion
11. Introduction to Model Serving: Model Training; Model Prediction; Latency; Throughput; Cost; Resources and Requirements for Serving Models; Cost and Complexity; Accelerators; Feeding the Beast; Model Deployments; Data Center Deployments; Mobile and Distributed Deployments; Model Servers; Managed Services; Conclusion
12. Model Serving Patterns: Batch Inference; Batch Throughput; Batch Inference Use Cases; Product recommendations; Sentiment analysis; Demand forecasting; ETL for Distributed Batch and Stream Processing Systems; Introduction to Real-Time Inference; Synchronous Delivery of Real-Time Predictions; Asynchronous Delivery of Real-Time Predictions; Optimizing Real-Time Inference; Real-Time Inference Use Cases; Serving Model Ensembles; Ensemble Topologies; Example Ensemble; Ensemble Serving Considerations; Model Routers: Ensembles in GenAI; Data Preprocessing and Postprocessing in Real Time; Training Transformations Versus Serving Transformations; Windowing; Options for Preprocessing; Enter TensorFlow Transform; Postprocessing; Inference at the Edge and in the Browser; Challenges; Balancing energy consumption with processing power; Performing model retraining and updates; Securing the user data; Model Deployments via Containers; Training on the Device; Federated Learning; Runtime Interoperability; Inference in Web Browsers; Conclusion
13. Model Serving Infrastructure: Model Servers; TensorFlow Serving; Servables; Servable versions; Models; Loaders; Sources; Aspired versions; Managers; Core; NVIDIA Triton Inference Server; TorchServe; Building Scalable Infrastructure; Containerization; Traditional Deployment Era; Virtualized Deployment Era; Container Deployment Era; The Docker Containerization Framework; Docker daemon; Docker client; Docker registry; Docker objects; Docker image; Docker container; Container Orchestration; Kubernetes; Kubernetes components; Containers on clouds; Kubeflow; Reliability and Availability Through Redundancy; Observability; High Availability; Automated Deployments; Hardware Accelerators; GPUs; TPUs; Conclusion
14. Model Serving Examples: Example: Deploying TensorFlow Models with TensorFlow Serving; Exporting Keras Models for TF Serving; Setting Up TF Serving with Docker; Basic Configuration of TF Serving; Making Model Prediction Requests with REST; Making Model Prediction Requests with gRPC; Getting Predictions from Classification and Regression Models; Using Payloads; Getting Model Metadata from TF Serving; Making Batch Inference Requests; Example: Profiling TF Serving Inferences with TF Profiler; Prerequisites; TensorBoard Setup; Model Profile; Example: Basic TorchServe Setup; Installing the TorchServe Dependencies; Exporting Your Model for TorchServe; Setting Up TorchServe; Request handlers; TorchServe configuration; Making Model Prediction Requests; Making Batch Inference Requests; Setting batch configuration via config.properties; Setting batch configuration via REST request; Conclusion
15. Model Management and Delivery: Experiment Tracking; Experimenting in Notebooks; Experimenting Overall; Not just one big file; Tracking runtime parameters; Tools for Experiment Tracking and Versioning; TensorBoard; Tools for organizing experiment results; Introduction to MLOps; Data Scientists Versus Software Engineers; ML Engineers; ML in Products and Services; MLOps; MLOps Methodology; MLOps Level 0; MLOps Level 1; MLOps Level 2; Components of an Orchestrated Workflow; Three Types of Custom Components; Python Function-Based Components; Container-Based Components; Fully Custom Components; TFX Deep Dive; TFX SDK; Intermediate Representation; Runtime; Implementing an ML Pipeline Using TFX Components; Advanced Features of TFX; Component dependency; Data dependency; Task dependency; Importer; Conditional execution; Managing Model Versions; Approaches to Versioning Models; Versioning proposal; Arbitrary grouping; Black-box functional model; Pipeline execution versioning; Model Lineage; Model Registries; Continuous Integration and Continuous Deployment; Continuous Integration; Continuous Delivery; Progressive Delivery; Blue/Green Deployment; Canary Deployment; Live Experimentation; A/B testing; Multi-armed bandits; Contextual bandits; Conclusion
16. Model Monitoring and Logging: The Importance of Monitoring; Observability in Machine Learning; What Should You Monitor?; Custom Alerting in TFX; Logging; Distributed Tracing; Monitoring for Model Decay; Data Drift and Concept Drift; Model Decay Detection; Supervised Monitoring Techniques; Statistical process control; Sequential analysis; Error distribution monitoring; Unsupervised Monitoring Techniques; Clustering; Feature distribution monitoring; Model-dependent monitoring; Mitigating Model Decay; Retraining Your Model; When to Retrain; Automated Retraining; Conclusion
17. Privacy and Legal Requirements: Why Is Data Privacy Important?; What Data Needs to Be Kept Private?; Harms; Only Collect What You Need; GenAI Data Scraped from the Web and Other Sources; Legal Requirements; The GDPR and the CCPA; The GDPR's Right to Be Forgotten; Pseudonymization and Anonymization; Differential Privacy; Local and Global DP; Epsilon-Delta DP; Applying Differential Privacy to ML; Differentially Private Stochastic Gradient Descent; Private Aggregation of Teacher Ensembles; Confidential and Private Collaborative Learning; TensorFlow Privacy Example; Federated Learning; Encrypted ML; Conclusion
18. Orchestrating Machine Learning Pipelines: An Introduction to Pipeline Orchestration; Why Pipeline Orchestration?; Directed Acyclic Graphs; Pipeline Orchestration with TFX; Interactive TFX Pipelines; Converting Your Interactive Pipeline for Production; Orchestrating TFX Pipelines with Apache Beam; Orchestrating TFX Pipelines with Kubeflow Pipelines; Introduction to Kubeflow Pipelines; Installation and Initial Setup; Accessing Kubeflow Pipelines; The Workflow from TFX to Kubeflow; OpFunc Functions; Orchestrating Kubeflow Pipelines; Google Cloud Vertex Pipelines; Setting Up Google Cloud and Vertex Pipelines; Setting Up a Google Cloud Service Account; Orchestrating Pipelines with Vertex Pipelines; Executing Vertex Pipelines; Choosing Your Orchestrator; Interactive TFX; Apache Beam; Kubeflow Pipelines; Google Cloud Vertex Pipelines; Alternatives to TFX; Conclusion
19. Advanced TFX: Advanced Pipeline Practices; Configure Your Components; Import Artifacts; Use Resolver Node; Execute a Conditional Pipeline; Export TF Lite Models; Warm-Starting Model Training; Use Exit Handlers; Trigger Messages from TFX; Custom TFX Components: Architecture and Use Cases; Architecture of TFX Components; Use Cases of Custom Components; Using Function-Based Custom Components; Writing a Custom Component from Scratch; Defining Component Specifications; Defining Component Channels; Writing the Custom Executor; Writing the Custom Driver; Assembling the Custom Component; Using Our Basic Custom Component; Implementation Review; Reusing Existing Components; Creating Container-Based Custom Components; Which Custom Component Is Right for You?; TFX-Addons; Conclusion
20. ML Pipelines for Computer Vision Problems: Our Data; Our Model; Custom Ingestion Component; Data Preprocessing; Exporting the Model; Our Pipeline; Data Ingestion; Data Preprocessing; Model Training; Model Evaluation; Model Export; Putting It All Together; Executing on Apache Beam; Executing on Vertex Pipelines; Model Deployment with TensorFlow Serving; Conclusion
21. ML Pipelines for Natural Language Processing: Our Data; Our Model; Ingestion Component; Data Preprocessing; Putting the Pipeline Together; Executing the Pipeline; Model Deployment with Google Cloud Vertex; Registering Your ML Model; Creating a New Model Endpoint; Deploying Your ML Model; Requesting Predictions from the Deployed Model; Cleaning Up Your Deployed Model; Conclusion
22. Generative AI: Generative Models; GenAI Model Types; Agents and Copilots; Pretraining; Pretraining Datasets; Embeddings; Self-Supervised Training with Masks; Fine-Tuning; Fine-Tuning Versus Transfer Learning; Fine-Tuning Datasets; Fine-Tuning Considerations for Production; Fine-Tuning Versus Model APIs; Parameter-Efficient Fine-Tuning; LoRA; S-LoRA; Human Alignment; Reinforcement Learning from Human Feedback; Reinforcement Learning from AI Feedback; Direct Preference Optimization; Prompting; Chaining; Retrieval Augmented Generation; ReAct; Evaluation; Evaluation Techniques; Benchmarking Across Models; LMOps; GenAI Attacks; Jailbreaks; Prompt Injection; Responsible GenAI; Design for Responsibility; Conduct Adversarial Testing; Constitutional AI; Conclusion
23. The Future of Machine Learning Production Systems and Next Steps: Let's Think in Terms of ML Systems, Not ML Models; Bringing ML Systems Closer to Domain Experts; Privacy Has Never Been More Important; Conclusion
Index

Specification

Basic information

Author
  • Robert Crowe, Hannes Hapke, Emily Caveness, Di Zhu
Format
  • MOBI
  • EPUB
Number of pages
  • 474
Publication year
  • 2024