If you’re in a hurry, here are the slides for our talk.
A few weeks ago, I was chatting with Graham McBain at Galvanize, and he asked if I would be interested in giving a talk at IBM’s Datapalooza conference in Denver. I’m always looking for ways to help Denver’s growing big data and data science communities (ahem, and we help data scientists with our big data services), so I thought it was a great opportunity. But it was only two weeks away, and I would spend one of those weeks in airplanes and hotels. I had nothing prepared specifically for data scientists. On top of that, in the same two weeks we had to deliver the final results of a six-month customer project and launch a new product, AgilData Scalable Cluster for MySQL. Yikes.
Too busy? Get help!
OK, let’s say you’re too busy to pull this off all by yourself in two weeks. Bail? No way. The right approach is to con someone else into co-presenting with you and sharing the prep work! I reached out to my buddy Patrick Russell, the Data Science Director at Craftsy. If you don’t know about Craftsy, I recommend you check them out. They will make you smarter on just about any topic that involves your hands: cooking, sewing, blacksmithing, uranium enrichment, etc. They also have a great engineering blog. Long story short, Patrick agreed to co-present with me and we committed to the slot.
Meet in person to get the framework done quickly.
In a world of asynchronous communication like Slack, email, and Google Docs, there is still no substitute for highly focused, in-person “pressure cooker” collaboration. It’s a fantastic forcing function to get things done quickly.
Patrick and I met up on a Saturday afternoon. While cranking some loud metal on Patrick’s awesome stereo system, we established the framework for our presentation.
We chose to use Google Slides to make the deck. Yes, Keynote is better in almost every way. Yes, the animations are primitive. Yes, you might get screwed by spotty WiFi at the last second. All that said, because of the real-time interactive editing support, Google Slides still wins out for collaborating with other authors.
By the time we finished 3 beers, we had settled on our outline, our fonts, and our key photos. The first rule of presenting is to keep it simple. Use only 2-3 good fonts. Favor images and simple text over giant walls of bulleted text. Only hit your audience with three or fewer key themes. We settled on two themes: Data Cleansing and Operationalizing Data Science.
Patrick and I have fairly different work schedules. Judging by my highly scientific review of his comments on the slides, he was more available during the day, and I was more available for maker mode during the evening. The value of Google Slides becomes even more apparent when you are collaborating not only over distance, but also over time. We went back and forth on several slides, tuning the content to our expected audience and simplifying relentlessly.
Two hours before our session time, Patrick and I met up to review the deck, finalize any slide animations, and talk through a few of our “who’s talking about what” choices. We left the rest to ad lib.
Deliver a great presentation
This is simple, but not easy. There are tons of ways to psych yourself out, which hurts your confidence and your presentation. The solution? Be yourself, be honest, know your shit, and treat the audience like humans or, even better, smart humans. The real trick is that you have to force yourself to get up in front of people, no matter how uncomfortable it is. It does get easier with time. I promise.
It’s very important to make time after your presentation for questions. Some questions will come in the same room. Others will come afterward from people who weren’t comfortable asking in front of the larger audience. Take your time and get to know these people. What are their names? What do they do? Get their contact information and write them back, thanking them for attending.
Another key part of following up is to publish your slides. I use SlideShare for this, although there are several other options. It’s also critical to write a blog post about your presentation, linking to your slide deck and inviting people to contact you with any other questions.
There, I’m done. 🙂
Wait, so what did we actually talk about?
Patrick and I talked about how we in the industry often use toy examples when discussing or teaching data science, and how there is a big gap between those examples and doing data science in the real world. The data are spread out over multiple sources. They are in different formats and are sometimes terribly messy. They can have different timestamps and incorrect data types. The list goes on.
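To make that gap a little more concrete, here is a minimal sketch of the kind of cleanup involved. It is not code from the talk. The field names (`user_id`, `ts`, `amount`), the list of timestamp formats, and the choice to treat every timestamp as UTC are all assumptions made up for illustration, and it uses only the Python standard library.

```python
from datetime import datetime, timezone

# Hypothetical timestamp formats, one per imaginary source system.
TIMESTAMP_FORMATS = ("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M", "%Y-%m-%dT%H:%M:%SZ")


def parse_timestamp(raw):
    """Try each known format; return a datetime or None if nothing matches."""
    text = str(raw).strip()
    for fmt in TIMESTAMP_FORMATS:
        try:
            # Assumption: every source reports UTC. Real data rarely cooperates.
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    return None


def parse_amount(raw):
    """Coerce values like ' $1,200.50 ' or 19.99 to float; None if not numeric."""
    text = str(raw).strip().replace("$", "").replace(",", "")
    try:
        return float(text)
    except ValueError:
        return None


def clean_record(record):
    """Return a normalized copy of one raw record, or None if it can't be repaired."""
    ts = parse_timestamp(record.get("ts", ""))
    amount = parse_amount(record.get("amount", ""))
    if ts is None or amount is None:
        return None
    return {"user_id": int(record["user_id"]), "ts": ts, "amount": amount}


# Made-up rows standing in for two sources that disagree on formats and types.
raw_records = [
    {"user_id": "42", "ts": "2015-11-03 14:05:00", "amount": "19.99"},
    {"user_id": 42, "ts": "11/03/2015 14:07", "amount": "$1,200.50"},
    {"user_id": "7", "ts": "yesterday", "amount": "12"},  # unparseable timestamp
]

cleaned = [row for row in map(clean_record, raw_records) if row is not None]
print(f"kept {len(cleaned)} of {len(raw_records)} records")
```

Even in this toy version, most of the code is about deciding what to do when the data misbehaves. Silently dropping bad rows, as this sketch does, is itself a choice you would want to revisit on a real project.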
The second half of the presentation focused on the process of operationalizing data science. It’s one thing to produce a predictive model on your laptop. It’s quite another thing to operationalize that model so that it delivers continuous value to the business. This gets into DevOps, hosting, databases, ETL pipelines, and data orchestration tooling.
If you have any questions, I’d be happy to chat. Drop me a line at firstname.lastname@example.org.