Transformers for Natural Language Processing - Second Edition (E-book)


Description

Transformers are...well...transforming the world of AI. There are many platforms and models out there, but which ones best suit your needs? Transformers for Natural Language Processing, 2nd Edition, guides you through the world of transformers, highlighting the strengths of different models and platforms, while teaching you the problem-solving skills you need to tackle model weaknesses. You'll use Hugging Face to pretrain a RoBERTa model from scratch, from building the dataset to defining the data collator to training the model. If you're looking to fine-tune a pretrained model, including GPT-3, then Transformers for Natural Language Processing, 2nd Edition, shows you how with step-by-step guides. The book investigates machine translation, speech-to-text, text-to-speech, question-answering, and many more NLP tasks. It provides techniques to solve hard language problems and may even help with fake news anxiety (read chapter 13 for more details). You'll see how cutting-edge platforms, such as OpenAI, have taken transformers beyond language into computer vision tasks and code creation using Codex. By the end of this book, you'll know how transformers work and how to implement them and resolve issues like an AI detective!

Table of contents:
  • Preface: Who this book is for; What this book covers; To get the most out of this book; Get in touch
  • Chapter 1, What are Transformers?: The ecosystem of transformers; Industry 4.0; Foundation models; Is programming becoming a sub-domain of NLP?; The future of artificial intelligence specialists; Optimizing NLP models with transformers; The background of transformers; What resources should we use?; The rise of Transformer 4.0 seamless APIs; Choosing ready-to-use API-driven libraries; Choosing a Transformer Model; The role of Industry 4.0 artificial intelligence specialists; Summary; Questions; References
  • Chapter 2, Getting Started with the Architecture of the Transformer Model: The rise of the Transformer: Attention is All You Need; The encoder stack; Input embedding; Positional encoding; Sublayer 1: Multi-head attention; Sublayer 2: Feedforward network; The decoder stack; Output embedding and position encoding; The attention layers; The FFN sublayer, the post-LN, and the linear layer; Training and performance; Transformer models in Hugging Face; Summary; Questions; References
  • Chapter 3, Fine-Tuning BERT Models: The architecture of BERT; The encoder stack; Preparing the pretraining input environment; Pretraining and fine-tuning a BERT model; Fine-tuning BERT; Hardware constraints; Installing the Hugging Face PyTorch interface for BERT; Importing the modules; Specifying CUDA as the device for torch; Loading the dataset; Creating sentences, label lists, and adding BERT tokens; Activating the BERT tokenizer; Processing the data; Creating attention masks; Splitting the data into training and validation sets; Converting all the data into torch tensors; Selecting a batch size and creating an iterator; BERT model configuration; Loading the Hugging Face BERT uncased base model; Optimizer grouped parameters; The hyperparameters for the training loop; The training loop; Training evaluation; Predicting and evaluating using the holdout dataset; Evaluating using the Matthews Correlation Coefficient; The scores of individual batches; Matthews evaluation for the whole dataset; Summary; Questions; References
  • Chapter 4, Pretraining a RoBERTa Model from Scratch: Training a tokenizer and pretraining a transformer; Building KantaiBERT from scratch; Step 1: Loading the dataset; Step 2: Installing Hugging Face transformers; Step 3: Training a tokenizer; Step 4: Saving the files to disk; Step 5: Loading the trained tokenizer files; Step 6: Checking resource constraints: GPU and CUDA; Step 7: Defining the configuration of the model; Step 8: Reloading the tokenizer in transformers; Step 9: Initializing a model from scratch; Exploring the parameters; Step 10: Building the dataset; Step 11: Defining a data collator; Step 12: Initializing the trainer; Step 13: Pretraining the model; Step 14: Saving the final model (+tokenizer + config) to disk; Step 15: Language modeling with FillMaskPipeline; Next steps; Summary; Questions; References
  • Chapter 5, Downstream NLP Tasks with Transformers: Transduction and the inductive inheritance of transformers; The human intelligence stack; The machine intelligence stack; Transformer performances versus Human Baselines; Evaluating models with metrics; Accuracy score; F1-score; Matthews Correlation Coefficient (MCC); Benchmark tasks and datasets; From GLUE to SuperGLUE; Introducing higher Human Baselines standards; The SuperGLUE evaluation process; Defining the SuperGLUE benchmark tasks; BoolQ; Commitment Bank (CB); Multi-Sentence Reading Comprehension (MultiRC); Reading Comprehension with Commonsense Reasoning Dataset (ReCoRD); Recognizing Textual Entailment (RTE); Words in Context (WiC); The Winograd schema challenge (WSC); Running downstream tasks; The Corpus of Linguistic Acceptability (CoLA); Stanford Sentiment TreeBank (SST-2); Microsoft Research Paraphrase Corpus (MRPC); Winograd schemas; Summary; Questions; References
  • Chapter 6, Machine Translation with the Transformer: Defining machine translation; Human transductions and translations; Machine transductions and translations; Preprocessing a WMT dataset; Preprocessing the raw data; Finalizing the preprocessing of the datasets; Evaluating machine translation with BLEU; Geometric evaluations; Applying a smoothing technique; Chencherry smoothing; Translation with Google Translate; Translations with Trax; Installing Trax; Creating the original Transformer model; Initializing the model using pretrained weights; Tokenizing a sentence; Decoding from the Transformer; De-tokenizing and displaying the translation; Summary; Questions; References
  • Chapter 7, The Rise of Suprahuman Transformers with GPT-3 Engines: Suprahuman NLP with GPT-3 transformer models; The architecture of OpenAI GPT transformer models; The rise of billion-parameter transformer models; The increasing size of transformer models; Context size and maximum path length; From fine-tuning to zero-shot models; Stacking decoder layers; GPT-3 engines; Generic text completion with GPT-2; Step 9: Interacting with GPT-2; Training a custom GPT-2 language model; Step 12: Interactive context and completion examples; Running OpenAI GPT-3 tasks; Running NLP tasks online; Getting started with GPT-3 engines; Running our first NLP task with GPT-3; NLP tasks and examples; Comparing the output of GPT-2 and GPT-3; Fine-tuning GPT-3; Preparing the data; Step 1: Installing OpenAI; Step 2: Entering the API key; Step 3: Activating OpenAI's data preparation module; Fine-tuning GPT-3; Step 4: Creating an OS environment; Step 5: Fine-tuning OpenAI's Ada engine; Step 6: Interacting with the fine-tuned model; The role of an Industry 4.0 AI specialist; Initial conclusions; Summary; Questions; References
  • Chapter 8, Applying Transformers to Legal and Financial Documents for AI Text Summarization: Designing a universal text-to-text model; The rise of text-to-text transformer models; A prefix instead of task-specific formats; The T5 model; Text summarization with T5; Hugging Face; Hugging Face transformer resources; Initializing the T5-large transformer model; Getting started with T5; Exploring the architecture of the T5 model; Summarizing documents with T5-large; Creating a summarization function; A general topic sample; The Bill of Rights sample; A corporate law sample; Summarization with GPT-3; Summary; Questions; References
  • Chapter 9, Matching Tokenizers and Datasets: Matching datasets and tokenizers; Best practices; Step 1: Preprocessing; Step 2: Quality control; Continuous human quality control; Word2Vec tokenization; Case 0: Words in the dataset and the dictionary; Case 1: Words not in the dataset or the dictionary; Case 2: Noisy relationships; Case 3: Words in the text but not in the dictionary; Case 4: Rare words; Case 5: Replacing rare words; Case 6: Entailment; Standard NLP tasks with specific vocabulary; Generating unconditional samples with GPT-2; Generating trained conditional samples; Controlling tokenized data; Exploring the scope of GPT-3; Summary; Questions; References
  • Chapter 10, Semantic Role Labeling with BERT-Based Transformers: Getting started with SRL; Defining semantic role labeling; Visualizing SRL; Running a pretrained BERT-based model; The architecture of the BERT-based model; Setting up the BERT SRL environment; SRL experiments with the BERT-based model; Basic samples; Sample 1; Sample 2; Sample 3; Difficult samples; Sample 4; Sample 5; Sample 6; Questioning the scope of SRL; The limit of predicate analysis; Redefining SRL; Summary; Questions; References
  • Chapter 11, Let Your Data Do the Talking: Story, Questions, and Answers: Methodology; Transformers and methods; Method 0: Trial and error; Method 1: NER first; Using NER to find questions; Location entity questions; Person entity questions; Method 2: SRL first; Question-answering with ELECTRA; Project management constraints; Using SRL to find questions; Next steps; Exploring Haystack with a RoBERTa model; Exploring Q&A with a GPT-3 engine; Summary; Questions; References
  • Chapter 12, Detecting Customer Emotions to Make Predictions: Getting started: Sentiment analysis transformers; The Stanford Sentiment Treebank (SST); Sentiment analysis with RoBERTa-large; Predicting customer behavior with sentiment analysis; Sentiment analysis with DistilBERT; Sentiment analysis with Hugging Face's models list; DistilBERT for SST; MiniLM-L12-H384-uncased; RoBERTa-large-mnli; BERT-base multilingual model; Sentiment analysis with GPT-3; Some Pragmatic I4.0 thinking before we leave; Investigating with SRL; Investigating with Hugging Face; Investigating with the GPT-3 playground; GPT-3 code; Summary; Questions; References
  • Chapter 13, Analyzing Fake News with Transformers: Emotional reactions to fake news; Cognitive dissonance triggers emotional reactions; Analyzing a conflictual Tweet; Behavioral representation of fake news; A rational approach to fake news; Defining a fake news resolution roadmap; The gun control debate; Sentiment analysis; Named entity recognition (NER); Semantic Role Labeling (SRL); Gun control SRL; Reference sites; COVID-19 and former President Trump's Tweets; Semantic Role Labeling (SRL); Before we go; Summary; Questions; References
  • Chapter 14, Interpreting Black Box Transformer Models: Transformer visualization with BertViz; Running BertViz; Step 1: Installing BertViz and importing the modules; Step 2: Load the models and retrieve attention; Step 3: Head view; Step 4: Processing and displaying attention heads; Step 5: Model view; LIT; PCA; Running LIT; Transformer visualization via dictionary learning; Transformer factors; Introducing LIME; The visualization interface; Exploring models we cannot access; Summary; Questions; References
  • Chapter 15, From NLP to Task-Agnostic Transformer Models: Choosing a model and an ecosystem; The Reformer; Running an example; DeBERTa; Running an example; From Task-Agnostic Models to Vision Transformers; ViT - Vision Transformers; The Basic Architecture of ViT; Vision transformers in code; CLIP; The Basic Architecture of CLIP; CLIP in code; DALL-E; The Basic Architecture of DALL-E; DALL-E in code; An expanding universe of models; Summary; Questions; References
  • Chapter 16, The Emergence of Transformer-Driven Copilots: Prompt engineering; Casual English with a meaningful context; Casual English with a metonymy; Casual English with an ellipsis; Casual English with vague context; Casual English with sensors; Casual English with sensors but no visible context; Formal English conversation with no context; Prompt engineering training; Copilots; GitHub Copilot; Codex; Domain-specific GPT-3 engines; Embedding2ML; Step 1: Installing and importing OpenAI; Step 2: Loading the dataset; Step 3: Combining the columns; Step 4: Running the GPT-3 embedding; Step 5: Clustering (k-means clustering) with the embeddings; Step 6: Visualizing the clusters (t-SNE); Instruct series; Content filter; Transformer-based recommender systems; General-purpose sequences; Dataset pipeline simulation with RL using an MDP; Training customer behaviors with an MDP; Simulating consumer behavior with an MDP; Making recommendations; Computer vision; Humans and AI copilots in metaverses; From looking at to being in; Summary; Questions; References
  • Appendix I, Terminology of Transformer Models: Stack; Sublayer; Attention heads
  • Appendix II, Hardware Constraints for Transformer Models: The Architecture and Scale of Transformers; Why GPUs are so special; GPUs are designed for parallel computing; GPUs are also designed for matrix multiplication; Implementing GPUs in code; Testing GPUs with Google Colab; Google Colab Free with a CPU; Google Colab Free with a GPU; Google Colab Pro with a GPU
  • Appendix III, Generic Text Completion with GPT-2: Step 1: Activating the GPU; Step 2: Cloning the OpenAI GPT-2 repository; Step 3: Installing the requirements; Step 4: Checking the version of TensorFlow; Step 5: Downloading the 345M-parameter GPT-2 model; Steps 6-7: Intermediate instructions; Steps 7b-8: Importing and defining the model; Step 9: Interacting with GPT-2; References
  • Appendix IV, Custom Text Completion with GPT-2: Training a GPT-2 language model; Step 1: Prerequisites; Steps 2 to 6: Initial steps of the training process; Step 7: The N Shepperd training files; Step 8: Encoding the dataset; Step 9: Training a GPT-2 model; Step 10: Creating a training model directory; Step 11: Generating unconditional samples; Step 12: Interactive context and completion examples; References
  • Appendix V, Answers to the Questions: Chapter 1, What are Transformers?; Chapter 2, Getting Started with the Architecture of the Transformer Model; Chapter 3, Fine-Tuning BERT Models; Chapter 4, Pretraining a RoBERTa Model from Scratch; Chapter 5, Downstream NLP Tasks with Transformers; Chapter 6, Machine Translation with the Transformer; Chapter 7, The Rise of Suprahuman Transformers with GPT-3 Engines; Chapter 8, Applying Transformers to Legal and Financial Documents for AI Text Summarization; Chapter 9, Matching Tokenizers and Datasets; Chapter 10, Semantic Role Labeling with BERT-Based Transformers; Chapter 11, Let Your Data Do the Talking: Story, Questions, and Answers; Chapter 12, Detecting Customer Emotions to Make Predictions; Chapter 13, Analyzing Fake News with Transformers; Chapter 14, Interpreting Black Box Transformer Models; Chapter 15, From NLP to Task-Agnostic Transformer Models; Chapter 16, The Emergence of Transformer-Driven Copilots
  • Other Books You May Enjoy
  • Index
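
The description mentions pretraining a RoBERTa model from scratch with Hugging Face, from building the dataset to defining the data collator to training the model (Chapter 4). The Python sketch below is a minimal, illustrative outline of that kind of workflow using the Hugging Face tokenizers and transformers libraries, not the book's exact KantaiBERT notebook; the corpus path, output directories, and hyperparameters are placeholder assumptions.

# Minimal sketch of a "RoBERTa from scratch" pretraining workflow.
# Assumptions: a plain-text corpus at corpus.txt (placeholder path) and the
# Hugging Face tokenizers/transformers libraries installed. All sizes and
# paths are illustrative, not the book's exact settings.
import os
from tokenizers import ByteLevelBPETokenizer
from transformers import (
    RobertaConfig, RobertaForMaskedLM, RobertaTokenizerFast,
    DataCollatorForLanguageModeling, LineByLineTextDataset,
    Trainer, TrainingArguments,
)

CORPUS = "corpus.txt"        # placeholder training file
TOKENIZER_DIR = "tokenizer"  # placeholder output directory

# 1. Train a byte-level BPE tokenizer on the raw corpus and save it to disk.
os.makedirs(TOKENIZER_DIR, exist_ok=True)
bpe = ByteLevelBPETokenizer()
bpe.train(files=[CORPUS], vocab_size=52_000, min_frequency=2,
          special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"])
bpe.save_model(TOKENIZER_DIR)

# 2. Reload the trained tokenizer in transformers and define a small RoBERTa config.
tokenizer = RobertaTokenizerFast.from_pretrained(TOKENIZER_DIR, model_max_length=512)
config = RobertaConfig(vocab_size=52_000, max_position_embeddings=514,
                       num_hidden_layers=6, num_attention_heads=12, type_vocab_size=1)
model = RobertaForMaskedLM(config=config)

# 3. Build the dataset and a masked-language-modeling data collator.
dataset = LineByLineTextDataset(tokenizer=tokenizer, file_path=CORPUS, block_size=128)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

# 4. Pretrain with the Trainer and save the final model to disk.
args = TrainingArguments(output_dir="model", num_train_epochs=1,
                         per_device_train_batch_size=64, save_steps=10_000)
trainer = Trainer(model=model, args=args, data_collator=collator, train_dataset=dataset)
trainer.train()
trainer.save_model("model")

After training, the chapter's final step loads the saved model into a fill-mask pipeline (FillMaskPipeline) to test masked-token predictions.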

Specification

Basic information

Author
  • Denis Rothman
Publisher
  • Packt Publishing
Format
  • PDF
  • EPUB
Number of pages
  • 602
Year of publication
  • 2022