Modern tools make your skills obsolete. So what?

Read this if you are a data scientist (or another professional) worried about your career. So many people, including me, write about how fields such as copywriting, drawing, or data science change from being accessible to a niche of highly professional individuals to a mere commodity. I claim it’s a good thing, not only for…

Book review: Extreme ownership

TL;DR Own your wins, own your failures, stay calm and make decisions. Read it. 5/5 “Extreme ownership” is a book about leadership in business written by two ex-SEAL fighters. This book is full of war stories, as in actual stories from a real war. I read this book on the recommendation (an instruction, really) of…

New position, new challenge

I will skip the usual “I’m thrilled and excited…”. I’ll just say it. As of today, I am the CTO of wizer.me, a platform for teachers and educators to create and share interactive worksheets. On a scale of 1 to 10, how thrilled am I? 10. On a scale of 1 to 10, how terrified am I? 10. On…

Back to in-person presentations

Today, I gave my first in-person presentation since the pandemic. It was awesome! I was talking about the study I performed with Nabeel Sulieman about data visualization in environments that use right-to-left writing systems. I wrote about this study in the past [one, two]. Today, you may find the results of our study at http://direction-matters.com/.…

An example of a very bad graph

Nature Medicine is a peer-reviewed journal that belongs to the very prestigious Nature group. Today, I was reading a paper that included THIS GEM. These two graphs are so bad. It looks as if the authors had a target to squeeze as many data visualization mistakes as possible…

On proper selection of colors in graphs

How do you properly select a colormap for a graph? What makes the rainbow color map a wrong choice, and what are the proper alternatives? Today, I stumbled upon a lengthy post that provides an in-depth review of the theory behind our color perception. The article concentrates on quantitative colormaps but also includes information relevant to…

Book review: The Hard Things About Hard Things by Ben Horowitz

TL;DR War stories and pieces of advice from a high-tech industry veteran. I read this book following recommendations by Reem Sherman, the host of the excellent (!!!) podcast Geekonomy (in Hebrew). Ben Horowitz is a veteran manager and entrepreneur who founded the company Opsware, which Hewlett-Packard acquired in 2007. This book describes Horowitz’s journey…

😦

Usually, I keep my blog for professional news only, but this time, I’ll make an exception. This frame is from a video that was taken a couple of days ago, less than one hour away from my home. Note how many people are there.  Some people will claim that what we see is a peaceful…

Opening a new notebook in my productivity system

Those who know me know that I always carry with me a cheap, thin notebook, which I use as an extension of my mind. Today, I opened a new notebook, and this is a good opportunity to share some links about my productivity system. Start with the post “The best productivity system I know” Failed…

Another example of the power of data visualization

I stumbled upon a great graph that tells a complex story compellingly. This graph compares the last two waves of COVID-19 in the United Kingdom and shows so clearly that the new wave (that is supposedly composed of the Delta variant) is much more infectious on the one hand, but on the other hand,…

Another evolution of my offline productivity system

This week, I mark an important milestone in my professional life. It is an excellent opportunity to start a new productivity notebook and tell you about the latest evolution of the best productivity system I know. To sum up, I use a custom variant of Mark Forster’s Final Version productivity system that uses a plain notebook to track,…

Experiment report

In January 2020, I started a new experiment. I quit what was a dream job and became a freelancer. Today, the experiment is over. This post serves as omphaloskepsis – a short reflection on what went well and what could have worked better. What worked well? To sum up, I declare this experiment successful. I had…

A new phase in my professional life

I’m excited to announce that I’m joining MyBiotics Pharma Ltd as the company’s Head of Data and Bioinformatics. I have been working with this fantastic company and its remarkable people as a freelancer for fourteen fruitful months. But today, I join the MyBiotics family as a full-time member. Together, we will strive to better understanding…

Black lives matter. Lior Pachter

Almost one year after it was originally published, I stumbled upon this powerful post. Today, June 10th 2020, black academic scientists are holding a strike in solidarity with Black Lives Matter protests. I strike with them and for them. This is why: I began to understand the enormity of racism against blacks thirty five years…

Super useful videos for advanced data visualizers

The great Robert Kosara, also known as “eager eyes,” has started publishing a series of videos he calls Chart Appreciation. In these videos, Robert takes a piece of data visualization from a reputable and known source, and discusses why this particular piece is so good, what decisions were made that made it possible, what…

Career advice. Upgrading data science career

From time to time, people send me emails asking for career advice. Here’s one recent exchange. Hi Boris, I am currently trying to decide on a career move and would like to ask for your advice. I have a MSc from a leading university in ML, without thesis. I have 5 years of experience in…

Interview 27: Racial discrimination and fair machine learning

I invited Dr. Charles Earl for this episode of my podcast “Job Interview” to talk about racial discrimination at the workplace and fairness in machine learning. Dr. Charles Earl is a data scientist at Automattic, my previous place of work. Charles holds a Ph.D. in computer science, M.A. in education, M.Sc. in electrical engineering, and…

On startup porn

Danny Lieberman managed teams of programmers before I could read, so when Danny writes a post as bold and blunt as this, you should read it.

Book review. The Persuasion Slide by Roger Dooley

TL;DR Very shallow and uninformative. It could be an OK series of blog posts for complete novices, but not a book. The Persuasion Slide by Roger Dooley was a disappointment for me. I love Dooley’s podcast Brainfluence, and I was sure that Roger’s book would be full of in-depth knowledge and case studies. However, it contained neither. The…

Graphical comparison of changes in large populations with “volcano plots”

I recently rediscovered the volcano plot, a scatter plot that aims to visualize changes in large populations. Volcano plots are very technical and specialized and, most probably, are not a good fit for explanatory data visualization. However, they can be useful during the exploration phase, and they come with a set of well-established metrics. Moreover,…
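As a rough illustration of what goes on the axes of a volcano plot, here is a minimal Python sketch. It assumes the common convention of plotting the log2 fold change on the x-axis and the negative log10 of the p-value on the y-axis; the function name and the sample numbers are hypothetical and are not taken from the post.

```python
import math


def volcano_coordinates(mean_before, mean_after, p_value):
    """Return (x, y) for one item on a volcano plot.

    x: log2 fold change (positive means an increase, negative a decrease)
    y: -log10(p-value), so more significant changes sit higher on the plot
    The inputs are hypothetical summary statistics for a single item.
    """
    x = math.log2(mean_after / mean_before)
    y = -math.log10(p_value)
    return x, y


# A made-up item whose mean grew fourfold with p = 0.001
x, y = volcano_coordinates(1.0, 4.0, 0.001)
print(f"log2 fold change: {x:.2f}, -log10(p): {y:.2f}")
```

Items with large, significant changes end up in the upper left and upper right corners, which is what gives the plot its volcano-like shape.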

Book review: Manager in shorts by Gal Zellermayer

TL;DR Nice’n’easy reading for novice managers. I read this book after hearing the author, Gal Zellermayer, in a podcast. Gal is an Israeli guy who has been working as a manager in several global companies’ Israeli offices. He brings a perspective that combines (what is perceived as) the best practices of American managing style with the…

Innumeracy

Innumeracy is the “inability to deal comfortably with the fundamental notions of number and chance”. I wish there was a better term for “innumeracy”, a term that would reflect the importance of analyzing risks, uncertainty, and chance. Unfortunately, I can’t find such a term. Nevertheless, the problem is huge. In this long post, Tom Breur reviews…

Before and after — stacked bar charts

A fellow data analyst asked a question: what do we do when we need to draw a stacked bar chart that has too many colors? How do we select the colors so that they are nice but also easily distinguishable? To answer this question, let’s look at data similar to what appeared in…

The Problem With Slope Charts (by Nick Desbarats)

Slope charts are often suggested as a valid alternative to clustered bar charts, especially for “before and after” cases. So, instead of a clustered bar chart like this, we tend to recommend a slope chart (or slope graph) like this. However, a slope chart isn’t free of problems either. In the past, I already wrote…

Before and after: Alternatives to a radar chart (spider chart)

Radar charts (sometimes called “spider charts”) look cool but are, in fact, pretty lame. So much so that when the data visualization author Stephen Few mentioned them in his book Show Me the Numbers, he did so in a chapter called “Silly graphs that are best forsaken.” Here, I will demonstrate some of their problems,…

Another language

بعد حوالي سنتين من الدراسة ، بحس حالي جاهز لإضافة اللغة العربية إلى قائمة اللغات في ال-LinkedIn  After about two years of study, I feel ready to add Arabic to LinkedIn’s language list

Basic data visualization video course (in Hebrew)

I had the honor to record an introductory data visualization course for high school students as a part of the Israeli national distance learning project. The course is in Hebrew, and since it targets high schoolers, it does not require any prior knowledge. I got paid for this job. However, when I divide the money…

Text Visualization Browser

I’ve stumbled upon an exciting project: the text visualization browser. It’s a web page that allows one to search for different text visualization techniques using keywords and publication time. The ability to limit the search to various years gives a nice historical perspective on this interesting topic. This site’s information is based on a 2015 paper Text…

Sharing the results of your Python code

If you work, but nobody knows about your results or cares about them, have you done any work at all?  As a data scientist, the product of my work is usually an algorithm, an analysis, or a model. What is a good way to share these results with my clients?  Since 99% of my time,…

The information is beautiful. The graphs are shit!

I apologize for my harsh language, but recently I was exposed to a bunch of graphs on the “information is beautiful” site, and I was offended (well, not really, but let’s pretend I was). I mean, I’m a liberal person, and I don’t care what graphs people make in their own time. Many people visit…

Career advice. Becoming a freelancer immediately after finishing a master’s degree

Will Cray [link] is a fresh M.Sc. in Computer Science who is considering becoming a freelancer in the Machine Learning / Artificial Intelligence / Data Science field. Will asked for advice on the LocallyOptimistic.com community Slack channel. Here’s Will’s question (all the names in this post are used with people’s permissions). Read more career advice [here]. Let’s begin.…

Exploring alternatives to population pyramids

A population pyramid, also called an “age-gender-pyramid”, is a graphical illustration that shows the distribution of various age groups in a population (typically that of a country or region of the world), which forms the shape of a pyramid when the population is growing [citation from Wikipedia]. In some cases, the pyramid provides interesting insights into…

The Mysterious Status of .blog Domains

When the .blog TLD was started by Automattic, employees were given the option to reserve a domain for free. In return […], they asked that the domain be used as a primary domain (no forwarding to a different site), and that the site be updated with new content at least once a month. This requirement…

ASCII histograms are quick, easy to use and to implement

From time to time, we need to look at a distribution of a group of values. Histograms are, I think, the most popular way to visualize distributions. “Back in the old days,” when most of my work was done in the console, and when creating a plot from Python required too much boilerplate code…
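To give a feel for how little code a console histogram needs, here is a minimal sketch in plain Python. It is not the implementation from the post; the function name `ascii_histogram`, its parameters, and the sample values are all assumptions made for illustration.

```python
def ascii_histogram(values, bins=5, width=40, char="#"):
    """Return one text line per bin: left bin edge, a bar, and the count.

    Assumes equal-width bins between min(values) and max(values);
    bar lengths are scaled so that the fullest bin is `width` characters long.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    counts = [0] * bins
    for v in values:
        # The maximum value would land in bin number `bins`, so clamp it
        idx = min(int((v - lo) / span * bins), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, count in enumerate(counts):
        left_edge = lo + span * i / bins
        bar = char * round(count / peak * width)
        lines.append(f"{left_edge:8.2f} | {bar} {count}")
    return lines


# Made-up sample data
for line in ascii_histogram([1, 2, 2, 3, 3, 3, 4, 4, 5], bins=5, width=30):
    print(line)
```

Returning the lines instead of printing them directly keeps the function easy to test and to reuse in logs.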

A short compilation of productivity blog posts

This post contains a bunch of links to blogs that write about productivity. Musings of Brown Girls This is not an exclusively productivity blog. The authors of this collective effort write about other interesting things. I read some posts, and I liked them. 2. Self care Do you know that feeling when you feel bad…

35 (and more) Ways Data Go Bad — Stats With Cats Blog

If you plan to work in data analysis or processing, read the excellent post in the “stats with cats blog” titled “35 Ways Data Go Bad”. I did experience each and every one of the 35 problems. However, this list is far from being complete. One should add the comprehensive list of Falsehoods Programmers Believe About…

Unexpected hitch of working in a distributed team

It has been about half a year since I became a freelance data scientist. Before my career change, I worked in a distributed team for more than five years. Today, I suddenly realized that working in a distributed team has a significant problem, inherent to its distributed, multinational, nature. My team was always spread over…

Data visualization is not only dots, bars, and pies

Look at this wonderful piece of data visualization (taken from here). If you know the terms “tertiary structure” and “glycan”, there is NO way you will miss the message that the author of this figure wanted to convey. Also, note how, by using appropriate colors in the title, the authors got rid of the graph legend.

How to become a Python professional in 42 hours?

Here’s an appealing ad that I saw: “How to become a Python professional in 42 hours?” I’ll tell you how. There is no way. I don’t know any field of knowledge in which one can become professional after 42 hours. Certainly not Python. Not even after 42 days. Maybe after 42 weeks if that’s mostly…

Book review. Five Stars by Carmine Gallo

TL;DR Good motivation to improve communication. Inadequate source of information on how to achieve that. The central premise of Five Stars: Communication Secrets to Get from Good to Great by Carmine Gallo is that professionals who don’t invest in communication skills are at high risk of being replaced by computers and robots. One of the…

The delicate art of fine trolling

I’m reading a 1991 paper by Barbara Tversky that deals with the directional representation of time. One sentence in the paper says “There does not seem to be strong universal cognitive associations of quantity or quality to left or right” Whenever I make a similar statement in the context of data visualization, I…

Lie factor in ad graphs

It’s fun to look at the visit statistics and to discover old stories. I wrote this post in 2016. For a reason I don’t know, this post has been one of the most viewed posts in my blogs during the last week.  So, I decided to publish it again. I won’t add any new examples,…

StellarGraph — another promising network analysis library for Python and Scala

Network (graph) analysis is a complicated topic. There are several tools available for this task with different pros and cons. Recently, I stumbled upon another tool, StellarGraph. StellarGraph authors claim to provide excellent performance; NumPy, Pandas, and TensorFlow integration; an impressive set of algorithms; compatibility with Neo4j (THE graph database); and much more. The documentation looks…

Nice but useless data visualization

Network visualization can mesmerize and hypnotize. Chord diagrams are especially cool because they are so colorful and smooth. The problem is that sometimes, the result doesn’t provide any actual value, and serves as a cute illustration. Cute illustrations are cute; they help put some “easiness” to the text without the risk of looking too unprofessional. …

Logarithmic scale misinforms. Period

Being a data scientist and a self-proclaimed data visualization expert, I like using log scale graphs when I find them appropriate. However, as a speaker and a communicator, I refrain from using them in presentations as much as possible. From my experience as a data visualization lecturer, I noticed that even “technical” people struggle to grasp the…

Visualising Odds Ratio — Henry Lau

Besides being a freelance data scientist and visualization expert, I teach. One of the toughest concepts to teach and to visualize is the odds ratio. Today, I stumbled upon a very interesting post that deals exactly with that.
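Part of what makes the odds ratio hard to teach is that it is easy to confuse with the risk ratio. The short Python sketch below computes both from a 2x2 table; the function names and the counts are made up for illustration and do not come from the linked post.

```python
def odds_ratio(events_a, total_a, events_b, total_b):
    """Odds ratio of group A versus group B from a 2x2 table.

    odds = events / non-events, so the ratio of the two odds is
    (events_a * non_events_b) / (non_events_a * events_b).
    """
    non_events_a = total_a - events_a
    non_events_b = total_b - events_b
    return (events_a * non_events_b) / (non_events_a * events_b)


def risk_ratio(events_a, total_a, events_b, total_b):
    """Ratio of the event probabilities (risks) of group A versus group B."""
    return (events_a / total_a) / (events_b / total_b)


# Hypothetical counts: 20 events out of 100 in group A, 10 out of 100 in group B
print("odds ratio:", odds_ratio(20, 100, 10, 100))
print("risk ratio:", risk_ratio(20, 100, 10, 100))
```

With these hypothetical counts, the event is twice as likely in group A, yet the odds ratio is larger than two. That gap between the two measures is one source of the confusion.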

Online data science conference on May 28

NDR is a family of machine learning/data science conferences. Their next conference will be held online on May 28, and the agenda looks great. Now, I’m not super objective here, because I’m presenting at the NDR July event. But look at the topics, what an impressive selection!

The missing graves

Today, Israel marks Holocaust Day. Many words have been written about the Holocaust, and I want to write about missing graves. If you visit a Jewish cemetery, you might see a lot of gravestones with additional memorial plates. I took this picture in the Chișinău (Kishinev) Jewish cemetery. Burial of the deceased is considered the final…

Why is forecasting s-curves hard?

Constance Crozier (@clcrozier) shared an interesting simulation in which she tried to fit a sigmoid curve (s-curve) to predict a plateau in a time-series. It took me a while to find the reference for a paper that explains why.

Data giraffe is sometimes a feature, not a problem

I wrote about data giraffes two weeks ago. Usually, “data giraffes” are a problem, and we need to work hard in order to solve it. Sometimes, they are a useful feature. Take a look at this NYT front page that shows the number of new unemployment applications in the United States over time. And…

Everything is NOT just fine (repost)

My job wasn’t affected by the COVID madness in almost any way. I used to work from home before, and I work from home now, none of my customers cancelled any projects, the health system in Israel is still functioning, all of my relatives are in good health, everything is just fine! I know how…