How to publish a best-selling book
Sebastian Raschka on upgrading your expertise and bringing it to the community
If you’re a bit of a machine learning geek, you might have come across Sebastian Raschka in the Twitterverse. If you haven’t, you’re about to. Raschka is a researcher at Michigan State University, an expert in applied machine learning and deep learning, and author of Packt’s best-selling book of all time, Python Machine Learning. With the upcoming launch of the book’s second edition, Raschka is here to give you some tips on becoming an expert in your field and getting your work recognised by a wider audience.
Learn and practice
The first and most important thing is to learn as much as you can and to really throw yourself into your topic. Probably the most influential factor in starting my career was a class I took on statistical pattern classification when I started my PhD. I was already a passionate Python programmer and I wasn’t happy about submitting my homework using MATLAB. Because of this, I spent a lot of time reimplementing algorithms from papers and textbooks in Python using NumPy and SciPy. This might sound like a tedious exercise (and you’d be right), but studying the concepts of statistical pattern classification more intensely helped me a great deal with translating ideas from theory to code.
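As an illustration of what reimplementing a textbook method in NumPy can look like, here is a minimal sketch of a nearest-class-mean classifier, one of the simplest techniques in statistical pattern classification. It is not taken from Raschka’s coursework or book; the function names and the tiny dataset are made up for this example.

```python
import numpy as np


def fit_class_means(X, y):
    """Return the sorted class labels and the mean feature vector of each class."""
    labels = np.unique(y)
    means = np.stack([X[y == label].mean(axis=0) for label in labels])
    return labels, means


def predict_nearest_mean(X, labels, means):
    """Assign each row of X to the class whose mean is closest (Euclidean distance)."""
    # distances[i, k] is the distance from sample i to the mean of class k
    distances = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    return labels[np.argmin(distances, axis=1)]


# Hypothetical toy data: two well-separated clusters in two dimensions.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                    [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

labels, means = fit_class_means(X_train, y_train)
X_new = np.array([[0.5, 0.5], [5.5, 5.5]])
predictions = predict_nearest_mean(X_new, labels, means)
print(predictions)
```

Writing even a small method like this by hand forces you to decide how the data is laid out and how distances are computed, which is the theory-to-code translation described above.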
If you’re starting out, I recommend beginning with a practical, introductory book or course to get a brief overview of your field and the different techniques that exist. For someone interested in my field, for example, a concrete place to start would be understanding the big picture and what data science and machine learning are capable of. I’d then recommend starting a project that you’re passionate about and applying your newly learned techniques to answer the complex questions that arise along the way. When you’re working on an exciting project, I think you’re more likely to be naturally motivated to read advanced material and improve your skills.
Around the time I enrolled in the statistical pattern classification class, I developed the strong urge to talk about all the things I was learning and discovering. I found it very useful in terms of my own understanding to discuss what I’d discovered and what techniques I found useful and exciting.
One thing led to another and I started a blog. I then became more and more active in the open-source community, and I avidly took part in discussions on social media. This helped me meet new people and air my ideas, and eventually I was contacted by Packt about an opportunity to dedicate a whole book to two of the things I’m most passionate about: machine learning and open-source tools. The rest, as they say, is history.
Show your expertise
I cover a lot of different subfields of machine learning in my book: classification, regression analysis and so forth – with the intention of showing that machine learning can be useful in almost every problem domain. I think that knowing as much as you can about your subject area is crucial in reaching a wide audience and being able to answer questions posed to you.
Demonstrating well-developed and maintained open-source software in my examples makes machine learning more accessible to a broad audience, including experienced programmers and people who are new to programming. And remember the basics! By introducing the basic mathematics behind machine learning, I aim to educate my audience and show that machine learning is more than just black box algorithms, giving readers an idea of both the capabilities and the limitations of machine learning, and of how to apply those algorithms wisely.
Listen to your audience
In the second edition of my book, we improved or added many things based on feedback, which is something that I think all writers should do, and not just when writing a book. Among the additions is a section on dealing with imbalanced datasets, which several readers thought was missing from the first edition. As time moves on, so does the world of software. When the first edition of Python Machine Learning was released in 2015, we included an introduction to deep learning via Theano. Since then, that material has had a substantial overhaul and is now based on TensorFlow, which has become a major player in my research toolbox since its open source release by Google in 2015.
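To make the imbalanced-dataset topic concrete, here is a minimal sketch of one common heuristic: weighting each class inversely to its frequency, so that rare classes count for more during training. This is an illustration only, not the approach taken in the book; the function name `balanced_class_weights` and the example labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Hypothetical dataset in which one class is nine times more common.
example_labels = ["healthy"] * 90 + ["sick"] * 10
weights = balanced_class_weights(example_labels)
print(weights)
```

With these made-up labels, the rare "sick" class receives a weight nine times larger than the "healthy" class, which offsets the 90-to-10 imbalance when the weights are passed to a learning algorithm that supports per-class weighting.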
I really appreciated all the helpful feedback from readers, and I recommend continually improving your work based on what your audience tells you. They will appreciate you going back to reword paragraphs where things might not be totally clear and adding explanations where necessary.