• Calling bullshit on "persistence leads to success"

    May 14, 2020

    Did you know that J.K. Rowling, the author of Harry Potter, submitted her book 13 times before it was accepted? Did you know that Thomas Edison tried again and again, even though his teachers thought he was “too stupid to learn anything”? Did you know that Lior Raz (Fauda’s creator and lead actor) was an anonymous actor for more than ten years before he broke the barrier of anonymity? What do all these people have in common? They persisted, and they succeeded. BUT, and there is a big but.

    People keep telling us: follow your dream, and if you persist, it will come true. You will learn from your mistakes, improve, adapt, and finally reach your goal. I call bullshit.

    Think of the Martingale betting strategy, in which you double your bet after every loss. In theory, it works. Why doesn’t it work in practice? Because nobody has infinite time and infinite pockets. The same is true of chasing your dream. We need to pay for the shelter above our heads, the food on our tables, the clothes that we wear. Often other people depend on us. Time passes by. I hate to be a party pooper, but some people who chase their dreams will eat through all their savings and will either have to give up or declare bankruptcy (and then give up).
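
    To make the Martingale point concrete, here is a minimal Python sketch of a gambler with a finite bankroll who doubles the bet after every loss on an even-money game. The function names, the bankroll of 100, and the base bet of 1 are hypothetical values picked for illustration only.

```python
import random


def play_martingale(bankroll, base_bet, max_rounds, win_prob=0.5, draw=random.random):
    """Double-after-loss betting on an even-money game with a finite bankroll.

    Returns (final_bankroll, rounds_played, busted). All parameter values
    used below are hypothetical and chosen only for illustration.
    """
    bet = base_bet
    for rounds_played in range(max_rounds):
        if bet > bankroll:
            # The pocket is finite: the next doubled bet cannot be covered.
            return bankroll, rounds_played, True
        if draw() < win_prob:
            bankroll += bet
            bet = base_bet  # a win recovers the losing streak plus one base bet
        else:
            bankroll -= bet
            bet *= 2  # chase the loss by doubling
    return bankroll, max_rounds, False


def bust_rate(trials, seed=0, **kwargs):
    """Fraction of simulated gamblers who can no longer cover their next bet."""
    rng = random.Random(seed)
    busted = sum(play_martingale(draw=rng.random, **kwargs)[2] for _ in range(trials))
    return busted / trials


if __name__ == "__main__":
    for rounds in (10, 100, 1000):
        rate = bust_rate(2000, bankroll=100, base_bet=1, max_rounds=rounds)
        print(f"{rounds:>5} rounds: {rate:.1%} of gamblers could not cover the next bet")
```

    The losing streak is the instructive case: starting with 100 and a base bet of 1, six consecutive losses cost 63, and the seventh bet of 64 can no longer be placed.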

    Survivorship bias

    Read the story, it’s very educational

    But what about all those successful failers? What we see is a typical example of survivorship bias, the logical error of concentrating on the people or things that made it past some selection process and overlooking those that did not, typically because of their lack of visibility. We know the names Rowling, Edison, Raz, and others not because of their multiple failures but DESPITE them. For every Rowling, Edison, and Raz, there are thousands of failed writers, engineers, and actors who ended up broke and caused sorrow to their families.

    So, should I quit?

    I don’t know. Maybe. Maybe not. It’s your life, your decision.

    May 14, 2020 - 2 minute read -
    career career-advise professional-success blog Career advice
  • COVID-19 vs. influenza dataviz. The order is now correct

    May 12, 2020

    [Charts comparing COVID-19 and influenza: May, March, and February versions]

    Note about the numbers. While the COVID-19 casualties are based on more-or-less accurate live reports, the flu information is an estimate based on yearly average numbers.

    The code is here

    May 12, 2020 - 1 minute read -
    corona covid covid-19 blog
  • On a person that falls into the water. Or why thinking short-term is a good strategy in times of crisis

    May 11, 2020

    At the beginning of the COVID-19 crisis, I tried to explain to my daughter (and to myself) the rationale behind the draconian measures that governments take to fight the crisis. One rationalization that I found was the analogy of a person who falls into the water. In this situation, the person needs to act FAST to stabilize the situation. Only then can he or she start planning the next steps.

    I have been very vocal in criticizing the dramatic measures that many governments took at the beginning of this crisis. It looks like these measures were more-or-less correct, and that the countries that didn’t implement them are now in a much worse situation than the countries that did impose severe limitations. But even if, in retrospect, it turns out that one could have done much better without the many “hammers,” I tend to think that those hammers were inevitable.

    The conclusion? One day or another, we will all need to act very fast. This means that we need to be prepared, have plan Bs, work on resilience, and maybe perform emergency drills.

    May 11, 2020 - 1 minute read -
    covid covid-19 crisis blog
  • Inbox Zero

    May 11, 2020

    May 11, 2020 - 1 minute read -
    blog
  • Bad advice from a reputable source is bad advice.

    May 5, 2020

    Would you buy a grammar book with a clear spelling mistake on its cover? I hope not. That’s what happened to IBM when it published its new data visualization guide. I didn’t bother reading the manual because of what IBM decided to use as the first image of their guide.

    We use graphs to transfer information into images that are supposed to be later transformed in our brains back into information. What visual attributes do we use to interpret the information behind a pie chart? Is it the segment angle, its area, or maybe the arc length? Most probably, the answer is “all of the above” (see Robert Kosara’s works for more info). When done right, the three attributes of pie segments are linearly related to one another, which allows synergism between the visual cues.
    But what did our friends at IBM do? They deliberately distorted the data! I took the screenshot from the guide homepage and made some measurements.
    The purple segment has an angle of 182 degrees, and the angle of the black segment is 75 degrees, which gives us a ratio of 2.43. However, while the radius of the purple segment is 135 pixels, the radius of the black one is only 110 pixels. Why is this a problem? Well, due to the radius difference, the ratio between the arc lengths is 2.98, and the ratio between the areas is 3.66. So now, let me ask you: what is the ratio between the numbers represented by the purple and the black segments?
    It is true that the colors that IBM people used in their guide are neat, but data visualization that distorts information is not visualization but a piece of garbage. I assume that IBM produces decent computers, but don’t learn data visualization from them.
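
    The three ratios can be recomputed from the measurements quoted in this post (182° at 135 px for the purple segment, 75° at 110 px for the black one). This is a minimal sketch that assumes ideal circular sectors: arc length is proportional to angle × radius, and area to angle × radius². The function name `segment_ratios` is made up for this example.

```python
def segment_ratios(angle_a, radius_a, angle_b, radius_b):
    """Ratios (a over b) of the three cues a reader may use in a pie chart.

    For a circular sector, arc length is proportional to angle * radius and
    area is proportional to angle * radius ** 2, so unequal radii pull the
    three ratios apart.
    """
    angle_ratio = angle_a / angle_b
    arc_ratio = (angle_a * radius_a) / (angle_b * radius_b)
    area_ratio = (angle_a * radius_a ** 2) / (angle_b * radius_b ** 2)
    return angle_ratio, arc_ratio, area_ratio


# Measurements quoted in the post: purple segment vs. black segment.
ratios = segment_ratios(182, 135, 75, 110)

if __name__ == "__main__":
    for name, value in zip(("angle", "arc length", "area"), ratios):
        print(f"{name:>10} ratio: {value:.2f}")
```

    With equal radii, all three ratios collapse to the same number, which is exactly the property the distorted chart gives up.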

    May 5, 2020 - 2 minute read -
    bad-practice critique data visualisation Data Visualization dataviz ibm blog
  • Why is it (almost) impossible to set deadlines for data science projects?

    May 1, 2020

    I wrote this post in 2017. For some reason, it started gaining traffic in the last two weeks. I reviewed this post and couldn’t find any new insights. But maybe you can help me.

    May 1, 2020 - 1 minute read -
    blog
  • Online data science conference on May 28

    April 30, 2020

    NDR is a family of machine learning/data science conferences. Their next conference will be held online on May 28, and the agenda looks great.

    Now, I’m not super objective here, because I’m presenting at NDR’s July event. But look at the topics: what an impressive selection!

    April 30, 2020 - 1 minute read -
    conference data science machine learning ndr romania blog
  • The quintessence of data visualization usefulness

    April 27, 2020

    I have to admit, I was skeptical at the beginning of the COVID-19 crisis. I started becoming skeptical again now that it seems that the crisis didn’t hit my country too hard. But then I saw the graphs in this Financial Times article, and the skepticism disappeared. The graphs are accompanied by hundreds of words, but there is no need to read the text to understand almost everything.

    These graphs are so good, so convincing, so well executed, that they leave no room for doubt or misunderstanding of the message the author wants to convey.

    If you study data visualization, look at these graphs. Look at the color choice, legend location, and design. Look at the ticks on the X- and Y-axes, how they are spaced and typeset. Note the amount of details on the axes, specifically how sparse these details are.

    April 27, 2020 - 1 minute read -
    covid-19 data visualisation Data Visualization dataviz blog
  • Finally We May Have a Path to the Fundamental Theory of Physics… and It’s Beautiful — Stephen Wolfram Blog

    April 27, 2020

    OK, so Stephen Wolfram (a mega celebrity in the computational intelligence world and, among other things, a physicist) claims that he may have found a path to the Fundamental Theory of Physics. The blog post is long, and I hope to be able to finish reading it in a week or two. The accompanying technical text is a 450-page tome available on a dedicated site.

    Also, it turns out that Stephen Wolfram has a Twitch.tv channel in which he talks about science.

    Website: Wolfram Physics Project Technical Intro: A Class of Models with the Potential to Represent Fundamental Physics How We Got Here: The Backstory of the Wolfram Physics Project… 26,455 more words

    Finally We May Have a Path to the Fundamental Theory of Physics… and It’s Beautiful — Stephen Wolfram Blog

    April 27, 2020 - 1 minute read -
    physics reblog wolfram blog
  • Book review: Never Split the Difference by Chris Voss

    April 25, 2020

    TL;DR: Dull on the surface but has a lot of good points

    I read Never Split the Difference following a friend’s recommendation. While reading the book, I kept feeling a constant sense of disappointment and mental eye-rolling. The author, Chris Voss, is a former FBI negotiator. The book is full of FBI war stories and pieces of advice that, on the face of it, sound either trivial or well known. HOWEVER, when the book was over, I sat down to summarize my Kindle notes. Forty-five minutes later, I found myself staring at six pages of handwritten notes and takeaways. Which, surely, is a good sign.

    What I didn’t like: too many “war stories” from the author’s past as an FBI negotiator; their connection to the business world sometimes seems too far-fetched.

    What I liked: I liked the overall approach. Sometimes, the author cites academic research. Again, the fact that I took so many notes is very impressive (to me).

    The bottom line: 4/5 Read it, even if you already read a negotiation book.

    April 25, 2020 - 1 minute read -
    book review netotiations blog
  • The missing graves

    April 20, 2020

    Today, Israel marks Holocaust Day. Many words have been written about the Holocaust, and I want to write about missing graves.
    If you visit a Jewish cemetery, you might see a lot of gravestones with additional memorial plates.

    I took this picture in the Chișinău (Kishinev) Jewish cemetery. Burial of the deceased is considered the final act of kindness a person can perform for the dead. Erecting a “reminder and a name” (Yad-va-Shem), i.e., a gravestone, is an intrinsic part of the burial. The Hebrew term for this act of kindness is “Chesed shel emet” – the truthful kindness. Many people died during the Holocaust without a grave, without a gravestone, and without any sign of kindness around them. That is why, when the Holocaust survivors started passing away after the war, their relatives decided to perform this final act of kindness by adding the names of those who did not have the fortune to have their own grave.

    This is the gravestone of my grandmother’s sister Etl (Ester). The lower plate is a list of eleven relatives who never had a grave.

    April 20, 2020 - 1 minute read -
    chisinau gravestone holocaust kishinev blog
  • Why is forecasting s-curves hard?

    April 19, 2020

    Constance Crozier (@clcrozier on Twitter) shared an interesting simulation in which she tried to fit a sigmoid curve (s-curve) to predict a plateau in a time-series. The result was a very intuitive and convincing animation that shows how wrong her initial forecasts were.

    As a matter of fact, this phenomenon is not new at all. My first post-University job involved fitting numerous pharmacodynamics models. We always had to keep in mind that if the available data does not account for at least 95% of the maximum effect, the model will be very much suboptimal. It took me a while, but I managed to find the reference for this phenomenon [here]. Maybe, when I have some time, I will repeat Constance Crozier’s analysis and add confidence intervals to emphasize the point.

    EDIT: I came to the conclusion that the most important takeaway message of this demonstration is the necessity of reporting uncertainty with any forecast, and how small the value of a forecast is without uncertainty estimations.
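
    A small numeric illustration of why early data says so little about the plateau: the sketch below compares two logistic curves whose plateaus differ by a factor of two. It is not Constance Crozier’s simulation and not a pharmacodynamics model; the plateau, rate, and midpoint values are hypothetical, chosen so that the early exponential phases of the two curves nearly coincide.

```python
import math


def logistic(t, plateau, rate, midpoint):
    """Standard three-parameter logistic (s-curve)."""
    return plateau / (1.0 + math.exp(-rate * (t - midpoint)))


RATE = 0.5  # hypothetical growth rate shared by both curves

# Two hypothetical scenarios whose plateaus differ by a factor of two.
# Shifting the midpoint by ln(2) / RATE makes the early phases nearly coincide.
SMALL = {"plateau": 100.0, "rate": RATE, "midpoint": 10.0}
LARGE = {"plateau": 200.0, "rate": RATE, "midpoint": 10.0 + math.log(2) / RATE}


def max_gap(t_start, t_end, steps=100):
    """Largest absolute difference between the two curves on [t_start, t_end]."""
    times = (t_start + (t_end - t_start) * i / steps for i in range(steps + 1))
    return max(abs(logistic(t, **LARGE) - logistic(t, **SMALL)) for t in times)


if __name__ == "__main__":
    print(f"max gap for t in [0, 5]:   {max_gap(0, 5):.2f}")
    print(f"max gap for t in [15, 20]: {max_gap(15, 20):.2f}")
```

    On the early window the two curves differ by less than half a unit, which even modest measurement noise would hide, while near the plateau they are almost a hundred units apart. A forecast reported without an uncertainty estimate cannot tell these two scenarios apart.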

    https://player.vimeo.com/video/408599958?dnt=1&app_id=122963

    S-curves (or sigmoid functions) are commonly used to model the evolution of social or biological systems over time [1]. These functions start with exponential growth, then increase linearly, and finally level off (therefore end up looking like a wonky s). Many things that we think of as exponential functions will actually follow an s-curve (otherwise […]

    Forecasting s-curves is hard — Constance Crozier

    April 19, 2020 - 1 minute read -
    curve-fitting data science forecast forecasting modelling pk-pd repost blog
  • On organizing a data org in a company, job titles, and more

    April 16, 2020

    My colleague, Simon Ouderkik, recorded a REALLY interesting interview with Stephen Levin of Zapier and Emilie Schario of GitLab on organizing a data org in a company, job titles, career ladders, and other important stuff.

    April 16, 2020 - 1 minute read -
    reblog simon blog Career advice
  • If there is only one document you can read about data visualization, this is the one

    April 7, 2020

    I’m sorting my teaching material, and I found this gem. The UK Government Statistical Service published a guideline for effective data visualization and tables. If you know a busy person who doesn’t have time to study data visualization and can only read one document, this document is for them (it has fewer than 40 pages, full of examples). Click on the image above to go to the guideline.

    April 7, 2020 - 1 minute read -
    data visualisation Data Visualization dataviz documentation guidelines blog
  • Data giraffe is sometimes a feature, not a problem

    April 7, 2020

    I wrote about data giraffes two weeks ago. Usually, “data giraffes” are a problem, and we need to work hard to solve it. Sometimes, they are a useful feature. Take a look at this NYT front page that shows the number of new unemployment applications in the United States over time.

    And this is the pseudochart version of the same data

    https://twitter.com/davidgura/status/1245704384215945216

    Credits: I’ve found these examples on Scott Berkun’s page.

    April 7, 2020 - 1 minute read -
    data visualisation Data Visualization datavis giraffe blog
  • Everything is NOT just fine (repost)

    April 5, 2020

    My job wasn’t affected by the COVID madness in almost any way. I used to work from home before, and I work from home now; none of my customers cancelled any projects; the health system in Israel is still functioning; all of my relatives are in good health. Everything is just fine! I know how unusual I am in the current world, with the skyrocketing unemployment, non-functioning governments, and three-digit body counts. I was about to write about that, but then I read AnnMaria’s post.

    You should read it too

    I’ve read a lot of cheery tweets that said something like, “Buffy, Biff and I are isolated at home with our terrier, Boo. Here’s a picture. Isn’t he cute? We played card games, then I baked this three-course meal I saw on Pinterest. Biff is taking this time to finally become proficient in Mandarin with…

    Everything is NOT just fine — AnnMaria’s Blog

    April 5, 2020 - 1 minute read -
    covid distributed work reblog remote working repost blog
  • Blogging isn't what it used to be. Podcasting is on the rise

    April 2, 2020

    More than two years ago, I took a look at Google Trends for three phrases “start a blog”, “create a blog”, and “create a site”. I was surprised by the high volume of blog searches, compared to “create a site”.

    Today, I decided to go back to Google Trends and to add the new rising star: podcasting.

    It looks like podcasting is starting its exponential growth, while blogging continues its slow but steady decline. I will be unsurprised if, in 2022, the green podcasting line surpasses the other lines in this graph. Let’s wait and see.

    April 2, 2020 - 1 minute read -
    blogging forecast podcast podcasting blog
  • A super-important read on the COVID-19 situation. I'm finally convinced

    March 22, 2020

    Until now, I was very sceptical about the COVID-19 measures taken by many governments around the world, especially the Israeli one. Today, finally, I read a post that addressed the issues I was pointing to:

    1. This first lockdown will last for months, which seems unacceptable for many people.
    2. A months-long lockdown would destroy the economy.
    3. It wouldn’t even solve the problem, because we would be just postponing the epidemic: later on, once we release the social distancing measures, people will still get infected in the millions and die.
    4. My biggest concern: Either a lot of people die soon and we don’t hurt the economy today, or we hurt the economy today, just to postpone the deaths.

    There’s no point in rephrasing the original post here; just go and read it. I’m convinced. Thank you, Tomas Pueyo.


    Go and read. The image is clickable

    https://medium.com/@tomaspueyo/coronavirus-the-hammer-and-the-dance-be9337092b56

    March 22, 2020 - 1 minute read -
    convinced covid-19 blog
  • Data scientist? Thinking of working in a distributed company?

    March 20, 2020

    Data scientist? Thinking of working in a distributed company? The team at Automattic in which I used to work is looking for a Machine Learning specialist. It’s an awesome team. Give it a try: https://automattic.com/work-with-us/machine-learning-engineer/

    March 20, 2020 - 1 minute read -
    blog
  • The single most important thing about remote 1:1 meetings

    March 19, 2020

    The COVID-19 lockdown forced many organizations into a remote work mode. Recently, I spoke with three managers from three “conventional” companies, and all three told me how surprisingly efficient their 1:1 meetings became. This is how one of them described the situation: “I prepare the agenda, we log in, boom, boom, boom, and we are done”.

    The effectiveness of distributed work doesn’t surprise me; after all, I have been working in a distributed mode for about six years now. However, this super-efficiency has its own problems that one needs to know about. Here’s the thing. We, humans, are social creatures. We depend on social interactions for our mental and physical well-being. When people share the same physical office, they have enough social interactions “in-between”: in the hallway, next to the watercooler, and in the parking lot. However, working in a distributed team creates isolation. That is why it is very important to start and end every meeting with a personal conversation. It is also important to make sure that the meeting feels as personal as possible. To do so, place the chat window below the camera, so that the person feels as if you are looking at them. During the conversation, resist the urge to check emails, read your Facebook feed, or check my blog. Make the personal meeting personal, even if it’s remote.

    I have been working in distributed teams for about six years. If you need advice on how to make the transition easier for your organization, I’ll be glad to give one (or two).

    March 19, 2020 - 2 minute read -
    distributed work meetings remote working working-remotely blog
  • COVID-19 vs. influenza dataviz (an update)

    March 18, 2020

    Graph code: here.

    March 18, 2020 - 1 minute read -
    corona covid-19 blog
  • An interesting solution of the data giraffe problem

    March 18, 2020

    A data giraffe is a situation where a very prominent data point overshadows everything else. I learned this term from a post by Pini Yakuel and immediately liked it a lot.

    Taken from https://www.optimove.com/blog/beware-the-giraffes-in-your-data

    Dealing with data giraffes is hard, especially when dealing with bar charts. Today I saw one interesting approach to this problem.

    Katherine S. Rowell is a co-founder of a Boston firm that specializes in data visualization. In December, she published a post dedicated to one of the most popular but also most abused graph types, the bar chart. One of the examples in her post demonstrates a nice treatment of data giraffes.

    http://ksrowell.com/blog-visualizing-data/2019/12/18/bar-humbug/

    In this example, Katherine draws the graph twice. The zoomed-out version shows the giraffes in all their glory, while the zoomed-in one gives the spotlight to the foxes, hyenas, and mice.
    Also, note how these graphs respect the rule that every bar chart has to include zero.
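
    The draw-it-twice idea can be tried without any plotting library. The following is a text-only Python sketch, not Katherine Rowell’s chart: the animals and their values are made up, and marking a clipped bar with “>” in the zoomed-in view is my own assumption about how to signal that the giraffe runs off the scale. Both views keep the axis starting at zero.

```python
def text_bars(data, axis_max, width=40):
    """Render horizontal text bars on an axis that runs from zero to axis_max.

    Bars longer than the axis are clipped and flagged with '>' so that the
    reader can see that the value runs off the scale.
    """
    lines = []
    for label, value in data.items():
        clipped = value > axis_max
        length = round(min(value, axis_max) / axis_max * width)
        marker = ">" if clipped else ""
        lines.append(f"{label:<8}|{'#' * length}{marker} {value}")
    return lines


# Hypothetical data with one giraffe among small animals.
data = {"giraffe": 950, "fox": 40, "hyena": 25, "mouse": 5}

zoomed_out = text_bars(data, axis_max=max(data.values()))  # the giraffe in all its glory
zoomed_in = text_bars(data, axis_max=50)                   # spotlight on the small animals

if __name__ == "__main__":
    print("\n".join(zoomed_out))
    print()
    print("\n".join(zoomed_in))
```

    In the zoomed-out view the mouse gets no bar at all; in the zoomed-in view the small animals become comparable, while the giraffe is clipped but still labeled with its true value.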

    March 18, 2020 - 1 minute read -
    bar plot Data Visualization dataviz giraffe blog
  • Another piece of career advice

    March 17, 2020

    Here’s another email that I got with a question about switching to a data science career:

    Hello, my name is X. I saw your blog, and to be honest, I said, “Wow, is this me :)” I’m a pharmacist 5th-grade student currently working on a project in computational drug design. I started programming, and I loved it. After that, I heard the term “Data Science” and started to do some research […]

    Basically, I loved being on a computer and solving problems its a good career option for me (at least for now, you can’t predict future) my mom has a pharmacy I worked there (internship), and it is not for me (i am counting the time when I’m in a pharmacy.) so I have a few questions for you

    I don’t have any degree in statistics or CS or something equivalent I am determined to learn these topics, but some people want to see the degree, and probably no one accept a pharmacist to a master degree in statistics (I also wish to do my Ms in computational drug design because, in the end, I don’t want to be a data scientist in social sciences or economics, at least for now, I want to use that knowledge in my field which is drugs and pharmaceuticals)

    Ph.D. on Bioinformatics would help ? or Biostatistics ( is it easier for us to be accepted in biostatistics rather than statistics? To be honest, I don’t know the difference much, I took a biostatistics class, but it was just one semester and probably not enough for Ph.D. :))

    Do I really need a degree in CS or statistics to be a pharmaceutical data scientist? I want to do my Ph.D. but also want to be realistic, it sounds amazing doing online masters in statistics while you are doing Computational drug design or Bİoinformatics Ph.D., but it is very hard and frustrating and also decrease your productivity in both fields.

    I asked a lot of questions, sorry, but I have many :). You can reply when you have time. Thank you, and I loved your blog. I read and watched tons of things, but yours was the best suited for me because being a pharmacist, computational drug design, considering bioinformatics, it is all fits. By the way, I also considering cybersecurity (not working in a company but learning). I see that as a “martial arts of the future,” maybe I am wrong, but a person should know it to protect him/her self. Thank you again :)

    Indeed, X’s background sounds very much like mine.
    I’m not sure I have too much to add to what I already wrote here, in this blog. The only thing that I have to say is that, in my biased opinion, a Ph.D. is something worth pursuing. The more time passes by, the more Ph.D.s there are, and the lack of a degree might be a problem in the future job market. On the other hand, there are many smart and rich people who claim that university degrees are a waste of time. Go figure :-)

    I hope that this helps.

    March 17, 2020 - 3 minute read -
    data science careers feedback question blog Career advice
  • No signs (yet?) of the COVID-19 pandemic on StackOverflow job postings

    March 16, 2020

    I suppose that you know that THE software development Q&A site has its own job board. I suspected that the Corona pandemic would lead to a sharp decrease in the number of job postings on that board. I scraped the data, and it looks like, for now, there are no drastic changes in the number of postings published in the last couple of days.

    March 16, 2020 - 1 minute read -
    corona covid-19 crisis stackoverflow blog
  • Tips for making remote presentations

    March 11, 2020

    Before becoming a freelance data scientist, I used to work in a distributed company. Remote communication, including remote presentations, was the norm for me long before the remote work experiment no one asked for. In this post, I share some tips for delivering better presentations remotely.

    Me presenting in front of the computer

    * Stand up! Usually, we stand up when we present in front of a live audience. For some reason, when presenting remotely, people tend to sit. A sitting person is less dynamic and looks less engaging. I have a standing desk, which allows me to stand up and to raise the camera to my face level.
    * If you can’t raise the camera, stay sitting. You don’t want your audience staring at your groin.
    * I always use a presentation remote control. It frees me up and lets me move more naturally. My remote is almost ten years old, and I have a strong emotional attachment to it.
    * When presenting, it is very important to see your audience. Use two monitors: one for screen sharing, and the other one to see the audience.
    * Put the Skype/Zoom/whatever window that shows your audience under the camera. This way you’ll look most natural on the other side of the teleconference.
    * Starting a presentation in Powerpoint or Keynote “kidnaps” all the displays. You will not be able to see the audience when that happens. I export the presentation to a PDF file and use Acrobat Reader in full-screen mode. The up and down buttons on my presentation remote control work with the Reader. The “make screen black” button doesn’t.
    * I open a “lightable view” of my presentation and put it next to the audience screen. It’s not as useful as seeing the presenter’s notes using a “real” presentation program, but it is good enough.

    Auditorium in Chisinau showing me on their screen

    * Make a dry run. Ideally, the dry run should be a day or two before the event, to make sure all the technical problems are fixed.
    * Go online at least five minutes before the scheduled time. Be in front of the camera; don’t let the audience stare at your empty room.
    * Make sure nothing in your background will embarrass you. This risk is especially high if you present from home or a hotel. Nobody needs to see your bed during a business meeting.
    March 11, 2020 - 2 minute read -
    corona distributed work presentation presenting remote remote working skype zoom blog Data Visualization