Transformers are the superpower behind large language models (LLMs) like ChatGPT, Bard, and Llama. Transformers in Action gives you the insights, practical techniques, and extensive code samples you need to adapt pretrained transformer models to new and exciting tasks.
Inside Transformers in Action
- How transformers and LLMs work
- Adapt Hugging Face models to new tasks
- Automate hyperparameter search with Ray Tune and Optuna
- Optimize LLM performance
- Advanced prompting and zero/few-shot learning
- Text generation with reinforcement learning
- Responsible LLMs
Technically speaking, a “transformer” is a neural network model that finds relationships in sequences of words or other data by using a mathematical technique called attention in its encoder and decoder components. Attention lets a transformer model learn context and meaning from even long sequences of text, which leads to much more natural responses and predictions. Understanding the transformer architecture is the key to unlocking the power of LLMs for your own AI applications.
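To make the idea of attention concrete, here is a minimal sketch of scaled dot-product attention, the most common form of the technique. It is an illustration written for this page, not code from the book: the function names `softmax` and `attention` and the toy `tokens` vectors are assumptions, and it uses plain Python lists instead of a tensor library so it runs without dependencies.

```python
import math


def softmax(row):
    """Turn a list of scores into weights that are positive and sum to 1."""
    largest = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Each query is compared with every key; the resulting weights decide
    how much of each value vector flows into that query's output.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy input: three "tokens" with two features each.
# Self-attention uses the same vectors as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)

for token, row in zip(tokens, weights):
    print(token, "attends with weights", [round(w, 3) for w in row])
```

Each row of `weights` sums to 1, so every output is a blend of all the input vectors. A real transformer also applies learned projection matrices to produce the queries, keys, and values, and runs several attention heads in parallel; both are left out here to keep the sketch short.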
This comprehensive guide takes you from the origins of transformers all the way to fine-tuning an LLM for your own projects. Author Nicole Königstein grounds the essential mathematical and theoretical background of the transformer architecture in executable Jupyter notebooks, showing how this technology works in practice.
about the book
Transformers in Action adds the revolutionary transformer architecture to your AI toolkit. You’ll dive into the essential details of the model’s architecture, with all complex concepts explained through easy-to-understand examples and clever analogies, from sock sorting to skiing! Even complex foundational concepts start with practical applications, so you never have to struggle with abstract theory. The book includes an extensive code repository that lets you instantly start playing with and exploring different LLMs.
In this guide, you’ll start by applying transformers to fundamental NLP tasks like text summarization and text classification. Then, you’ll push transformers further with tasks like generating text, honing text generation with reinforcement learning, developing multimodal models, and few-shot learning. You’ll discover one-of-a-kind advice on prompt engineering, as well as tried-and-tested methods for optimizing and tuning large language models. Plus, you’ll find unique coverage of AI ethics, including bias mitigation and responsible usage.
about the reader
For both junior and experienced data scientists and machine learning engineers. Readers should be comfortable with the basics of ML, Python, and common data tools.
about the author
Nicole Königstein is a distinguished Data Scientist and Quantitative Researcher. She is presently the Chief Data Scientist and Head of AI & Quantitative Research at Wyden Capital.