If you are a data scientist, I am sure you have wondered whether deep neural networks will replace you at your job one day. Every time I read a report about researchers who managed to trick neural networks, I wonder whether they were thinking about their job security, or their professional pride, while performing the experiments. I think the first example of such a report is a 2014 paper by Christian Szegedy and his colleagues called “Intriguing properties of neural networks”. The main goal of this paper, it seems, was to peek into the black box of neural networks. In one of the experiments, the authors designed minor, invisible perturbations of the original images. These perturbations diminished the classification accuracy of a trained model.
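The idea behind such perturbations can be sketched in a few lines. The following is a hypothetical toy illustration, not code from the paper: a logistic-regression “model” with made-up weights, where nudging each input coordinate a tiny amount against the gradient flips a confident prediction (in practice you would do this with numpy or a deep-learning framework, and the real paper optimizes the perturbation rather than using a fixed step).

```python
# Toy sketch of a gradient-sign perturbation (hypothetical example,
# not the procedure from the Szegedy et al. paper).
import math
import random

random.seed(0)
w = [random.gauss(0, 1) for _ in range(50)]          # made-up model weights
norm = math.sqrt(sum(wi * wi for wi in w))
x = [0.1 * wi / norm for wi in w]                    # an input the model likes

def predict(x):
    """P(class = 1) under a logistic model with weights w."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# The gradient of the logit w.r.t. the input is just w, so step a tiny
# epsilon against the sign of each weight.
eps = 0.05
sign = lambda v: 1.0 if v > 0 else -1.0
x_adv = [xi - eps * sign(wi) for xi, wi in zip(x, w)]

print(predict(x))      # confidently positive
print(predict(x_adv))  # the same "image", barely changed, now negative
```

Each coordinate moves by at most 0.05, yet the prediction flips, because all those tiny moves are aligned against the model’s decision boundary. That is essentially what makes the perturbed images in the paper invisible to humans but devastating to the classifier.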
In the recent post “5 Ways to Troll Your Neural Network” Ben Orlin describes five different ways to “troll a network”.
Image credit: Figure 5 from “Intriguing properties of neural networks“.
Which of these two pictures should I choose for my gravatar?
Both were taken by Luca Sartoni
You know that I like reading ruthless critiques of others’ work — I like telling myself that by doing so I learn good practices (in reality, I suspect I’m just a case of what we call in Hebrew שמחה לאיד — the joy of someone else’s failure).
Anyhow, I’d like to share a set of posts by Lior Pachter in which he calls bullshit on several reputable people and concepts. Calling bullshit is easy; doing so with solid arguments is not. Lior Pachter worked hard to justify his opinions.
- Albert-László Barabási is a renowned network scientist. There’s a network model named after him. Some people claim that Prof. Barabási will receive the Nobel prize one day.
- Published one day after the previous one, this post critiques another renowned scientist, again with a lot of solid-sounding arguments.
- (“Where is part I?”, you may ask. Read the post to find out.)
Unfortunately, I don’t publish academic papers. But if I ever do, I will definitely want Prof. Pachter to read them and let the world know what he thinks, for good and for bad.
Speaking of calling bullshit: believe it or not, the University of Washington has a course with this exact title. The course is available online at http://callingbullshit.org/ and is worth watching. I watched all of the course’s videos during my last flight from Canada to Israel. The featured image of this post is a screenshot of the course’s homepage.
While going through my Machine Learning tag feed, I stumbled upon a pie chart that looked so terrible I was sure the post would be about bad practices in data visualization. I was wrong. The chart was there to convey actual information. The problem is that it is bad in so many ways that it is very hard to appreciate the information in the post. Especially when the post talks about data science, a field that relies so heavily on data visualization.
via Math required for machine learning — Youth Innovation
I would write a post about good practices in pie charts, but Robert Kosara, of https://eagereyes.org, does this so well that I don’t think I need to reinvent the wheel. Pie charts are very powerful in conveying information. Make sure you use this tool well. I strongly suggest reading everything Robert Kosara has to say on this topic.
Compiled by my teammate Yanir Seroussi, the following is a reading list on A/B tests that you should read even if you don’t plan to perform an A/B test anytime soon. The list is Yanir’s; the reviews are mine. Collective intelligence in action 🙂
- If you don’t pay attention, data can drive you off a cliff
In this post, Yanir lists seven mistakes that are common to any data-based analysis. At some point, you might think that this is a list of trivial truths. Maybe it is. The fact that Yanir’s points are trivial doesn’t make them less correct. Awareness doesn’t exist without knowledge; unfortunately, knowledge doesn’t ensure awareness. That is why reading trivial truths is a good thing to do from time to time.
- How to identify your marketing lies and start telling the truth
This post was written by Tiberio Caetano, a data science professor at the University of Sydney. If I had to summarize this post with a single phrase, it would be “confounding factors”. A confounding variable is a hidden variable that influences a measured effect. One example: you start an ad campaign for ice cream, your sales go up, and you conclude that the campaign was effective. What you forgot is that the campaign started at the beginning of the summer, when people start buying more ice cream anyhow.
See this link for a detailed textbook-quality review of confounding variables.
- Seven rules of thumb for web site experimenters
I read this review back in 2014, shortly after it was published by, among others, researchers from Microsoft and LinkedIn. Judging by the title, one would expect yet another list of trivial truths in a self-promoting product blog. This is not the case here. In this paper, you will find several real-life case studies, many references to marketing studies, and no advertising of shady products or schemes.
- A dirty dozen: Twelve common metric interpretation pitfalls in online controlled experiments
Another academic paper by Microsoft researchers. This one lists a lot of “don’ts”. As in the previous link, every piece of advice the authors give is based on established theory and backed up by real data.
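The confounding trap from the ice-cream example above fits in a few lines of code. This is a hypothetical simulation (mine, not from any of the linked posts): the ads have zero real effect, the season drives everything, and a naive before/after comparison still credits the campaign.

```python
# Hypothetical simulation of the ice-cream example: "summer" is a
# confounder that drives both the ad campaign and the sales lift.
import random

random.seed(42)

def sales(summer, ads):
    # The ads parameter is deliberately ignored: in this simulation
    # ads have NO real effect, only the season matters.
    base = 100 + (50 if summer else 0)
    return base + random.gauss(0, 5)

# Campaign launched at the start of summer, so the naive comparison
# is winter-without-ads vs. summer-with-ads.
before = [sales(summer=False, ads=False) for _ in range(30)]
after = [sales(summer=True, ads=True) for _ in range(30)]
naive_lift = sum(after) / len(after) - sum(before) / len(before)
print(f"naive ad 'effect': {naive_lift:.1f}")        # ~ +50, all seasonal

# Controlling for the season: compare ads vs. no ads within summer.
summer_no_ads = [sales(summer=True, ads=False) for _ in range(30)]
adjusted_lift = sum(after) / len(after) - sum(summer_no_ads) / len(summer_no_ads)
print(f"season-adjusted 'effect': {adjusted_lift:.1f}")  # ~ 0
```

The naive comparison reports a lift of roughly fifty units, all of which disappears once you compare like with like. That, in one toy example, is why randomized A/B tests exist.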
Perhaps you heard about Tay, Microsoft’s experimental Twitter chat-bot, and how within a day it became so offensive that Microsoft had to shut it down and never speak of it again. And you assumed that you would never make such a thing, because you’re not doing anything weird like letting random jerks on Twitter re-train […]
via How to make a racist AI without really trying — ConceptNet blog
Please leave a comment on this post. It doesn’t matter what you want to write. It can be short or long. Any comment. I need to know that humans read this blog. If you feel really generous, tell me how you found this blog, what you think of it.
In his blog post Big Data Or Big Hype?, Rick Ciesla asks whether the “Big Data” phenomenon is “a real thing” or just hype. I must admit that, until recently, I was sure the term “Data Science” was hype too — an overbroad term to describe various engineering and scientific activities. As time passes, I become more and more confident that Data Science is maturing into a separate profession. I haven’t yet decided whether the word “science” is fully appropriate in this case.
We have certainly heard a lot about Big Data in recent years, especially with regards to data science and machine learning. Just how large of a data set constitutes Big Data? What amount of data science and machine learning work involves truly stratospheric volumes of bits and bytes? There’s a survey for that, courtesy of […]
via Big Data Or Big Hype? — VenaData