Hands-On Large Language Models

Description

AI has acquired startling new language capabilities in just the past few years. Driven by rapid advances in deep learning, language AI systems are able to write and understand text better than ever before. This trend is enabling the rise of new features, products, and entire industries. With this book, Python developers will learn the practical tools and concepts they need to use these capabilities today.

You'll learn how to use the power of pre-trained large language models for use cases like copywriting and summarization; create semantic search systems that go beyond keyword matching; build systems that classify and cluster text to enable scalable understanding of large collections of text documents; and use existing libraries and pre-trained models for text classification, search, and clustering.

This book also shows you how to:
  • Build advanced LLM pipelines to cluster text documents and explore the topics they belong to
  • Build semantic search engines that go beyond keyword search, with methods like dense retrieval and rerankers
  • Learn various use cases where these models can provide value
  • Understand the architecture of underlying Transformer models like BERT and GPT
  • Get a deeper understanding of how LLMs are trained
  • Understand how different methods of fine-tuning optimize LLMs for specific applications (generative model fine-tuning, contrastive fine-tuning, in-context learning, etc.)
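
A minimal sketch of the kind of workflow the early chapters cover: loading a pretrained model from the Hugging Face Hub and generating text with it. This is an illustrative example only, not code from the book; GPT-2 is assumed here purely because it is small and freely available, while the book works with a range of newer models.

    # Generate text with a pretrained language model via the Hugging Face
    # transformers pipeline. GPT-2 is an assumption chosen for illustration;
    # any causal language model from the Hub can be substituted.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    result = generator(
        "Large language models are",
        max_new_tokens=30,   # limit the length of the generated continuation
        do_sample=True,      # sample tokens instead of greedy decoding
        temperature=0.7,     # lower values make the output less random
    )
    print(result[0]["generated_text"])

The same pipeline interface also exposes other tasks (for example "summarization" or "text-classification"), which hints at how use cases like summarization and classification can be approached with existing pretrained models.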

Table of contents

Preface
An Intuition-First Philosophy · Prerequisites · Book Structure · Part I: Understanding Language Models · Part II: Using Pretrained Language Models · Part III: Training and Fine-Tuning Language Models · Hardware and Software Requirements · API Keys · Conventions Used in This Book · Using Code Examples · O'Reilly Online Learning · How to Contact Us · Acknowledgments

I. Understanding Language Models

1. An Introduction to Large Language Models
What Is Language AI? · A Recent History of Language AI · Representing Language as a Bag-of-Words · Better Representations with Dense Vector Embeddings · Types of Embeddings · Encoding and Decoding Context with Attention · Attention Is All You Need · Representation Models: Encoder-Only Models · Generative Models: Decoder-Only Models · The Year of Generative AI · The Moving Definition of a Large Language Model · The Training Paradigm of Large Language Models · Large Language Model Applications: What Makes Them So Useful? · Responsible LLM Development and Usage · Limited Resources Are All You Need · Interfacing with Large Language Models · Proprietary, Private Models · Open Models · Open Source Frameworks · Generating Your First Text · Summary

2. Tokens and Embeddings
LLM Tokenization · How Tokenizers Prepare the Inputs to the Language Model · Downloading and Running an LLM · How Does the Tokenizer Break Down Text? · Word Versus Subword Versus Character Versus Byte Tokens · Comparing Trained LLM Tokenizers · BERT base model (uncased) (2018) · BERT base model (cased) (2018) · GPT-2 (2019) · Flan-T5 (2022) · GPT-4 (2023) · StarCoder2 (2024) · Galactica · Phi-3 (and Llama 2) · Tokenizer Properties · Tokenization methods · Tokenizer parameters · The domain of the data · Token Embeddings · A Language Model Holds Embeddings for the Vocabulary of Its Tokenizer · Creating Contextualized Word Embeddings with Language Models · Text Embeddings (for Sentences and Whole Documents) · Word Embeddings Beyond LLMs · Using Pretrained Word Embeddings · The Word2vec Algorithm and Contrastive Training · Embeddings for Recommendation Systems · Recommending Songs by Embeddings · Training a Song Embedding Model · Summary

3. Looking Inside Large Language Models
An Overview of Transformer Models · The Inputs and Outputs of a Trained Transformer LLM · The Components of the Forward Pass · Choosing a Single Token from the Probability Distribution (Sampling/Decoding) · Parallel Token Processing and Context Size · Speeding Up Generation by Caching Keys and Values · Inside the Transformer Block · The feedforward neural network at a glance · The attention layer at a glance · Attention is all you need · How attention is calculated · Self-attention: Relevance scoring · Self-attention: Combining information · Recent Improvements to the Transformer Architecture · More Efficient Attention · Local/sparse attention · Multi-query and grouped-query attention · Optimizing attention: From multi-head to multi-query to grouped query · Flash Attention · The Transformer Block · Positional Embeddings (RoPE) · Other Architectural Experiments and Improvements · Summary

II. Using Pretrained Language Models

4. Text Classification
The Sentiment of Movie Reviews · Text Classification with Representation Models · Model Selection · Using a Task-Specific Model · Classification Tasks That Leverage Embeddings · Supervised Classification · What If We Do Not Have Labeled Data? · Text Classification with Generative Models · Using the Text-to-Text Transfer Transformer · ChatGPT for Classification · Summary

5. Text Clustering and Topic Modeling
ArXiv's Articles: Computation and Language · A Common Pipeline for Text Clustering · Embedding Documents · Reducing the Dimensionality of Embeddings · Cluster the Reduced Embeddings · Inspecting the Clusters · From Text Clustering to Topic Modeling · BERTopic: A Modular Topic Modeling Framework · Adding a Special Lego Block · KeyBERTInspired · Maximal marginal relevance · The Text Generation Lego Block · Summary

6. Prompt Engineering
Using Text Generation Models · Choosing a Text Generation Model · Loading a Text Generation Model · Controlling Model Output · Temperature · top_p · Intro to Prompt Engineering · The Basic Ingredients of a Prompt · Instruction-Based Prompting · Advanced Prompt Engineering · The Potential Complexity of a Prompt · In-Context Learning: Providing Examples · Chain Prompting: Breaking up the Problem · Reasoning with Generative Models · Chain-of-Thought: Think Before Answering · Self-Consistency: Sampling Outputs · Tree-of-Thought: Exploring Intermediate Steps · Output Verification · Providing Examples · Grammar: Constrained Sampling · Summary

7. Advanced Text Generation Techniques and Tools
Model I/O: Loading Quantized Models with LangChain · Chains: Extending the Capabilities of LLMs · A Single Link in the Chain: Prompt Template · A Chain with Multiple Prompts · Memory: Helping LLMs to Remember Conversations · Conversation Buffer · Windowed Conversation Buffer · Conversation Summary · Agents: Creating a System of LLMs · The Driving Power Behind Agents: Step-by-step Reasoning · ReAct in LangChain · Summary

8. Semantic Search and Retrieval-Augmented Generation
Overview of Semantic Search and RAG · Semantic Search with Language Models · Dense Retrieval · Dense retrieval example · Getting the text archive and chunking it · Embedding the text chunks · Building the search index · Search the index · Caveats of dense retrieval · Chunking long texts · One vector per document · Multiple vectors per document · Nearest neighbor search versus vector databases · Fine-tuning embedding models for dense retrieval · Reranking · Reranking example · Open source retrieval and reranking with sentence transformers · How reranking models work · Retrieval Evaluation Metrics · Scoring a single query with average precision · Scoring across multiple queries with mean average precision · Retrieval-Augmented Generation (RAG) · From Search to RAG · Example: Grounded Generation with an LLM API · Example: RAG with Local Models · Loading the generation model · Loading the embedding model · The RAG prompt · Advanced RAG Techniques · Query rewriting · Multi-query RAG · Multi-hop RAG · Query routing · Agentic RAG · RAG Evaluation · Summary

9. Multimodal Large Language Models
Transformers for Vision · Multimodal Embedding Models · CLIP: Connecting Text and Images · How Can CLIP Generate Multimodal Embeddings? · OpenCLIP · Making Text Generation Models Multimodal · BLIP-2: Bridging the Modality Gap · Preprocessing Multimodal Inputs · Preprocessing images · Preprocessing text · Use Case 1: Image Captioning · Use Case 2: Multimodal Chat-Based Prompting · Summary

III. Training and Fine-Tuning Language Models

10. Creating Text Embedding Models
Embedding Models · What Is Contrastive Learning? · SBERT · Creating an Embedding Model · Generating Contrastive Examples · Train Model · In-Depth Evaluation · Loss Functions · Cosine similarity · Multiple negatives ranking loss · Fine-Tuning an Embedding Model · Supervised · Augmented SBERT · Unsupervised Learning · Transformer-Based Sequential Denoising Auto-Encoder · Using TSDAE for Domain Adaptation · Summary

11. Fine-Tuning Representation Models for Classification
Supervised Classification · Fine-Tuning a Pretrained BERT Model · Freezing Layers · Few-Shot Classification · SetFit: Efficient Fine-Tuning with Few Training Examples · Fine-Tuning for Few-Shot Classification · Continued Pretraining with Masked Language Modeling · Named-Entity Recognition · Preparing Data for Named-Entity Recognition · Fine-Tuning for Named-Entity Recognition · Summary

12. Fine-Tuning Generation Models
The Three LLM Training Steps: Pretraining, Supervised Fine-Tuning, and Preference Tuning · Supervised Fine-Tuning (SFT) · Full Fine-Tuning · Parameter-Efficient Fine-Tuning (PEFT) · Adapters · Low-Rank Adaptation (LoRA) · Compressing the model for (more) efficient training · Instruction Tuning with QLoRA · Templating Instruction Data · Model Quantization · LoRA Configuration · Training Configuration · Training · Merge Weights · Evaluating Generative Models · Word-Level Metrics · Benchmarks · Leaderboards · Automated Evaluation · Human Evaluation · Preference-Tuning / Alignment / RLHF · Automating Preference Evaluation Using Reward Models · The Inputs and Outputs of a Reward Model · Training a Reward Model · Reward model training dataset · Reward model training step · Training No Reward Model · Preference Tuning with DPO · Templating Alignment Data · Model Quantization · Training Configuration · Training · Summary

Afterword

Index

Specification

Basic information

Author
  • Jay Alammar, Maarten Grootendorst
Format
  • MOBI
  • EPUB
Number of pages
  • 428
Year of publication
  • 2024