Some people like to believe that all data is ready to be used immediately. Not so! Data in the wild is unkempt and unruly, and it's the job of data scientists to turn raw data into something that's ready to be used. To manage the data jungle, you need the right perspective and the right tools. (There's no point hacking at overgrowth with a spoon, after all!) Do your work well, and you'll create insight from chaos and discover the analytic patterns to put your business on the right track.
Exploring the Data Jungle: Finding, Preparing, and Using Real-World Data is a collection of three hand-picked chapters introducing you to the often-overlooked art of cleaning data. Brian Godsey, author of Think Like a Data Scientist, has selected these chapters to help you navigate data in the wild, process and prepare raw data for machine learning, and visualize results clearly. As you explore the data jungle, you'll discover real-world examples in Python, R, and other languages suitable for data science.