If you want to go beyond scripting in Python, you need this book.
Master these effective techniques to reduce costs and run times, handle huge datasets, and implement complex machine learning applications efficiently in Python.
High Performance Python for Data Analytics is your guide to optimizing every part of your Python-based data analysis process, from the pure Python code you write to managing the resources of modern hardware and GPUs. You'll learn to rewrite inefficient data structures, improve underperforming code with multithreading, and simplify your datasets without sacrificing accuracy.
about the technology
Fast, accurate systems are vital for handling the huge datasets and complex analytical algorithms that are common in modern data science. Python programmers need to boost performance by writing faster pure-Python programs, optimizing the use of libraries, and utilizing modern multi-processor hardware; High Performance Python for Data Analytics shows you how.
about the book
High Performance Python for Data Analytics is a hands-on guide to writing Python code that can process more data, faster, and with fewer resources. It takes a holistic approach to Python performance, showing you how your code, libraries, and computing architecture interact and can be optimized together.
Written for experienced practitioners, this book dives right into practical solutions for improving computation and storage efficiency. You'll experiment with fun and interesting examples such as rewriting games in lower-level Cython and implementing a MapReduce framework from scratch. Finally, you'll go deep into Python GPU computing and learn how modern hardware has rehabilitated some former antipatterns and made counterintuitive ideas the most efficient way of working.
- Writing efficient pure-Python code
- Optimizing your use of the NumPy and pandas libraries
- Rewriting critical code in Cython
- Designing persistent data structures
- Tailoring code for different architectures
- Implementing Python GPU computing
about the reader
For intermediate Python programmers familiar with the basics of concurrency.
about the author
The author works in the field of genetics, analyzing very large datasets and implementing complex algorithms to process the data. He uses Python and its libraries for scientific computing and data engineering tasks. He is one of the co-authors of Biopython, a major bioinformatics package written in Python. He holds a BE in informatics and a PhD in bioinformatics.