Robert Layton

Rob Layton is a data scientist and a past core contributor to scikit-learn. He holds a PhD in cybercrime analytics, for which he analysed phishing websites to identify authorship patterns. He runs his own data analytics business, dataPipeline, and has delivered training with expert training provider Python Charmers for more than 5 years to students in finance, government, and the private sector.

Projects by Robert Layton

Authorship Identification with Text Mining and Machine Learning

4 weeks · 7-10 hours per week · INTERMEDIATE

Included with a Manning Online subscription

In this liveProject, you’ll step into the shoes of an investigator trying to find the anonymous author of a seriously defamatory blog post. You’ve narrowed down your list of suspects and acquired a dataset of their writing samples, and you now plan to find the culprit with a custom machine learning model. Your challenge is to build an authorship analysis model that matches a writing sample to the defamatory blog post and reveals the guilty party. To do this, you’ll need to extract data from a corpus of documents, build a model that can learn authorship style, scale the model to handle hundreds of suspects, and finally develop a user-friendly program that lets non-technical colleagues make use of your findings.
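To give a feel for what "learning authorship style" can look like in code, here is a minimal sketch. It is not the liveProject's solution: the listing does not say which features or model the project uses, so the character trigram profiles, the cosine similarity, and the names `char_ngrams`, `cosine`, and `rank_suspects` are all assumptions chosen for illustration. It uses only the Python standard library.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in lowercased, whitespace-normalised text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_suspects(samples, disputed, n=3):
    """Rank authors by how closely their pooled n-gram profile matches the disputed text.

    `samples` maps an author name to a list of that author's known texts
    (a hypothetical layout, not the project's dataset format).
    Returns (author, similarity) pairs, most similar first.
    """
    target = char_ngrams(disputed, n)
    scores = {}
    for author, texts in samples.items():
        profile = Counter()
        for text in texts:
            profile.update(char_ngrams(text, n))
        scores[author] = cosine(profile, target)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
```

Calling `rank_suspects(samples, disputed_post)` returns the suspects ordered by stylistic similarity. A ranking like this is a lead rather than proof, and a realistic model would need far more text per author, held-out evaluation, and a way to say "none of the above".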