It has been about half a year since I became a freelance data scientist. Before my career change, I worked in a distributed team for more than five years. Today, I suddenly realized that working in a distributed team has a significant problem, inherent to its distributed, multinational nature. My team was always spread over … Continue reading Unexpected hitch of working in a distributed team
Here’s a neat method that helps me organize my week, increase my productivity and fight procrastination.
Even good graphs have room for improvement. Follow this post to
Look at this wonderful piece of data visualization (taken from here). If you know the terms “tertiary structure” and “glycan”, there is NO way you miss the message that the authors of this figure wanted to convey. Also, note how, by using appropriate colors in the title, the authors got rid of the graph legend.
Here’s an appealing ad that I saw: How to become a Python professional in 42 hours? I’ll tell you how. There is no way. I don’t know any field of knowledge in which one can become a professional after 42 hours. Certainly not Python. Not even after 42 days. Maybe after 42 weeks if that’s mostly … Continue reading How to become a Python professional in 42 hours?
I’m honored to take part in standardizing bidirectional language support in interfaces and visualization, as part of an expert group formed for the Hebrew Support in Computerized Systems Committee at the SII, the Standards Institution of Israel. The Committee is led by Gilad Almosnino. Below is Gilad’s project announcement.
TL;DR: Good motivation to improve communication. Inadequate source of information on how to achieve that. The central premise of Five Stars: The Communication Secrets to Get from Good to Great by Carmine Gallo is that professionals who don’t invest in communication skills are at high risk of being replaced by computers and robots. One of the … Continue reading Book review. Five Stars by Carmine Gallo
I’m reading a 1991 paper by Barbara Tversky that deals with the directional representation of time. One sentence in the paper says: “There does not seem to be strong universal cognitive associations of quantity or quality to left or right.” Whenever I make a similar statement in the context of data visualization, I … Continue reading The delicate art of fine trolling
It’s fun to look at the visit statistics and to discover old stories. I wrote this post in 2016. For a reason I don’t know, this post has been one of the most viewed posts in my blogs during the last week. So, I decided to publish it again. I won’t add any new examples, … Continue reading Lie factor in ad graphs
Network (graph) analysis is a complicated topic. There are several tools available for this task with different pros and cons. Recently, I stumbled upon another tool, StellarGraph. StellarGraph authors claim to provide excellent performance; NumPy, Pandas, and TensorFlow integration; an impressive set of algorithms; intercompatibility with Neo4j (THE graph database); and much more. The documentation looks … Continue reading StellarGraph — another promising network analysis library for Python and Scala
On the balance between specialization and the risk of becoming obsolete.
Network visualization can mesmerize and hypnotize. Chord diagrams are especially cool because they are so colorful and smooth. The problem is that sometimes, the result doesn’t provide any actual value, and serves only as a cute illustration. Cute illustrations are cute; they help add some “easiness” to the text without the risk of looking too unprofessional. … Continue reading Nice but useless data visualization
When I was in elementary school (back in the USSR of the mid 80’s), I had a friend whose father was a shoemaker. Due to the crazy stupid way the Soviet economy worked, a Soviet shoemaker was much richer than a physician or an engineer. But this is not the story. The story is that … Continue reading Bioinformatics career advice and a story about a Soviet shoemaker
Interview on leadership. The difference between statistically meaningful and practically meaningful; giving credit, being decent, and not cheating;
All good teamwork starts with effective communication;
You don’t know that the stuff that you know is unknown to others;
Is Distributed Work a Divide and Conquer Strategy?
Being a data scientist and a self-proclaimed data visualization expert, I like using log scale graphs when I find them appropriate. However, as a speaker and a communicator, I refrain from using them in presentations as much as possible. From my experience as a data visualization lecturer, I noticed that even “technical” people struggle to grasp the concept of log scale graphs.
Book review: The Year Without Pants. WordPress.com and the future of work. Read it if the history of work is your thing, or if you work in a small company that grows rapidly.
“Why it burns when you P” and other statistics rants
Besides being a freelance data scientist and visualization expert, I teach. One of the toughest concepts to teach and to visualize is the odds ratio. Today, I stumbled upon a very interesting post that deals exactly with that.
Did you know that J.K. Rowling, the author of Harry Potter, submitted her book 13 times before it was accepted? So what?
COVID-19 vs. influenza dataviz. The order is now correct
On a person who falls into the water. Or why thinking short-term is a good strategy in times of crisis.
One day or another, we will all need to act very fast. This means that we need to be prepared, have plan Bs, work on resilience, and maybe perform emergency drills.
It is correct that the colors that IBM people used in their guide are neat, but data visualization that distorts information is not visualization but a piece of garbage. I assume that IBM produces decent computers, but don’t learn data visualization from them
Originally posted on Boris Gorelik:
In many cases, attempts to set a deadline for a data science project result in a complete fiasco. Why is that? Why can managers have a reasonable time estimate for completion in many software projects, but not in most data science projects? The key points to answer this…
NDR is a family of machine learning/data science conferences. Their next conference will be held online on May 28, and the agenda looks great. Now, I’m not super objective here, because I’m presenting at the NDR July event. But look at the topics, what an impressive selection!
Finally We May Have a Path to the Fundamental Theory of Physics… and It’s Beautiful — Stephen Wolfram Blog
OK, so Stephen Wolfram (a mega celebrity in the computational intelligence world and, among other things, a physicist) claims that he may have found a path to the Fundamental Theory of Physics. The blog post is long, and I hope to be able to finish reading it in a week or two. The accompanying technical … Continue reading Finally We May Have a Path to the Fundamental Theory of Physics… and It’s Beautiful — Stephen Wolfram Blog
The quintessence of data visualization usefulness. These graphs are SOOOO good and convincing.
Never Split the Difference. A negotiation book that you might want to read. A book review.
Today, Israel marks Holocaust Day. Many words have been written about the Holocaust, and I want to write about missing graves. If you visit a Jewish cemetery, you might see a lot of gravestones with additional memorial plates. I took this picture in the Chișinău (Kishinev) Jewish cemetery. Burial of the deceased is considered the final … Continue reading The missing graves
Constance Crozier (@clcrozier) shared an interesting simulation in which she tried to fit a sigmoid curve (s-curve) to predict a plateau in a time-series. It took me a while to find the reference for a paper that explains why.
My colleague, Simon Ouderkik, recorded a REALLY interesting interview with Stephen Levin of Zapier and Emilie Schario of GitLab on organizing a data org in a company, job titles, career ladders, and other important stuff.
If there is only one document you can read about data visualization, this is the one
I wrote about data giraffes two weeks ago. Usually, “data giraffes” are a problem and we need to work hard in order to solve it. Sometimes, they are a useful feature. Take a look at this NYT front page that shows the number of new unemployment applications in the United States over time. And … Continue reading Data giraffe is sometimes a feature, not a problem
My job wasn’t affected by the COVID madness in almost any way. I used to work from home before, and I work from home now, none of my customers cancelled any projects, the health system in Israel is still functioning, all of my relatives are in good health, everything is just fine! I know how … Continue reading Everything is NOT just fine (repost)
More than two years ago, I took a look at Google Trends for three phrases “start a blog”, “create a blog”, and “create a site”. I was surprised by the high volume of blog searches, compared to “create a site”. Today, I decided to go back to Google Trends and to add the new rising … Continue reading Blogging isn’t what it used to be. Podcasting is on the rise
A super-important read on the COVID-19 situation. I’m finally convinced
Data scientist? Thinking of working in a distributed company? The team at Automattic in which I used to work is looking for a Machine Learning specialist. It’s an awesome team. Give it a try https://automattic.com/work-with-us/machine-learning-engineer/
Make the personal meeting personal, even if it’s remote.
An interesting solution to the data giraffe problem
COVID-19 vs. influenza dataviz (an update)
Here’s another email that I got with a question about switching to a data science career.
I suppose that you know that THE software development Q&A site has its own job board. I suspected that the Corona pandemic would lead to a sharp decrease in the number of job postings on that board. I scraped the data, and it looks like, for now, there are no drastic changes in the number of postings published in the last couple of days.
The cardiovascular safety of antiobesity drugs—analysis of signals in the FDA Adverse Event Report System Database
I am glad and proud to announce that a paper which I helped to prepare and publish is available on the Nature group’s site. The paper, The cardiovascular safety of antiobesity drugs—analysis of signals in the FDA Adverse Event Report System Database, by Einat Gorelik et al. (including myself) analyzes the data in the FDA Adverse … Continue reading The cardiovascular safety of antiobesity drugs—analysis of signals in the FDA Adverse Event Report System Database
Please leave a comment to this post. It doesn’t matter what, it can be a simple Hi or an interesting link. It doesn’t matter when or where you see it. I want to see how many real people are actually reading this blog.
Before becoming a freelance data scientist, I used to work in a distributed company. Remote communication, including remote presentations, was the norm for me, long before the remote work experiment no one asked for. In this post, I share some tips for delivering better presentations remotely. Stand up! Usually, we stand up when we present … Continue reading Tips for making remote presentations
Originally posted on בוריס גורליק (Boris Gorelik):
A pie chart as an adequate alternative to a bar chart. Throughout my professional life, I have heard a lot said against pie charts. The reason lies in the fact that it is very easy to create horrors with these charts. It didn’t help that, for a very long time, the default pie chart settings in all the major visualization tools produced completely distorted charts. Proponents of the pie chart boycott suggest the graph…
“One idea per slide” means one idea per slide. The simplest way to enforce this rule is to devote one slide to each sentence. Remember, adding slides is free; the audience’s attention is not.
Graph code: here.
Being a data science freelancer, and a long-time fan of AnnMaria, I HAVE to repost her latest post on consulting success.
People ask me for a good intro video to data visualization. I tend to ask them to look for one of my lectures. To save the search, here’s one of the most relevant talks that I gave. This lecture was part of the 2018 EuroSciPy conference, where I also ran a workshop.
Career advice. A clinical pharmacist, epidemiologist, and a Ph.D. student wants to become a data scientist.
From time to time, I get emails from people who seek advice on their career paths. This time, I got an email from a clinical pharmacist and a Ph.D. student.
Being a freelance data scientist, I get to talk to people about proposals that don’t materialize into projects.
I can’t elaborate yet, but in case you wondered what scientific satisfaction looks like, here’s a perfect illustration. Stay tuned.
Gilad Almosnino is an internationalization expert. I’m reading his post “Eight emojis that will create a more inclusive experience for Middle Eastern markets,” in which he mentions “Turkish or Arabic Coffee,” which reminded me of my last visit to Athens. When, in one restaurant, I asked for a Turkish coffee, the waiter looked at me harshly and … Continue reading Which coffee is this?
Do you believe in telepathy? Yesterday, I submitted final proofs of a paper in which I actively participated. During the proofreading, I noticed that our abstract ends with “further research is needed” and scratched my head. I submitted the proofs and then I saw this pearl in my blog feed.
TL;DR: shallow and disappointing. The Great Mental Models by Shane Parrish was highly praised by Automattic’s CEO Matt Mullenweg. Since I appreciate Matt’s opinion a lot, I decided to buy the book. I read it and was disappointed. This book is very ambitious, yet shallow and non-engaging. If you consider reading a book on … Continue reading Book review: Great mental models by Shane Parrish
Which data scientist can refuse more computing power? None. My collection of computing devices has a new addition: a Soviet arithmometer, Felix M.
TicToc — a flexible and straightforward stopwatch library for Python.
Why it is OK to have a loud argument with your co-workers.
The difference between Python decorators and inheritance that cost me three hours of hair-pulling
In playing cards, the Queen is worth less than the King. Is it time for a change? #gender-equality
Originally posted on Akshay Budhkar:
Introduction. I was fascinated by Zipf’s Law when I came across it on a VSauce video. It is an empirical law that states that the frequency of occurrence of a word in a large text corpus is inversely proportional to its rank in its frequency table. The frequency distribution…
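To make the rank-frequency idea concrete, here is a minimal sketch in Python. It is not taken from the original post: the function name `rank_frequency`, the tokenizing regular expression, and the toy sentence are all assumptions made for illustration. If a text followed Zipf’s Law closely, the product of rank and count would stay roughly constant across rows.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Hypothetical helper for illustration; ties keep first-seen order.
    """
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    ordered = Counter(words).most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


# Toy text, far too small to say anything about real language.
sample = "the cat and the dog and the bird saw the fish"

table = rank_frequency(sample)
for rank, word, count in table:
    # Under Zipf's Law, rank * count is expected to be roughly constant.
    print(f"{rank:>2}  {word:<5} count={count}  rank*count={rank * count}")
```

A real check needs a large corpus; plotting count against rank on log-log axes should then give something close to a straight line.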
A great piece of advice from an experienced freelance consultant
“Replay” by Ken Grimwood is an excellent piece of fiction. Here’s my review.
From time to time, we need to look at the distribution of a group of values. Histograms are, I think, the most popular way to visualize distributions. “Back in the old days,” when we did most of our work in the console, and when creating a plot from Python required too many boilerplate code lines, … Continue reading ASCII histograms are quick, easy to use and to implement
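For readers who want to try the idea right away, here is a minimal sketch of an ASCII histogram in plain Python. It is not the implementation from the full post: the function name `ascii_histogram`, its parameters (`bins`, `width`, `char`), and the equal-width binning are assumptions chosen for illustration.

```python
def ascii_histogram(values, bins=5, width=40, char="#"):
    """Return a list of text lines, one per equal-width bin.

    Hypothetical helper: each line shows the bin's left edge, a bar
    scaled so that the fullest bin is `width` characters, and the count.
    """
    values = list(values)
    if not values:
        return []
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    counts = [0] * bins
    for v in values:
        # The maximum value would land in bin number `bins`; clamp it to the last bin.
        idx = min(int((v - lo) / span * bins), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, count in enumerate(counts):
        left_edge = lo + span * i / bins
        bar = char * round(count / peak * width)
        lines.append(f"{left_edge:8.2f} | {bar} {count}")
    return lines


lines = ascii_histogram([1, 2, 2, 3, 3, 3, 4, 4, 5], bins=5, width=30)
print("\n".join(lines))
```

Returning the lines instead of printing them keeps the function easy to test and to embed in logs.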
Some people, in the face of important changes, visit tombs of the righteous for a blessing. I went to see WEIZAC — Israel’s first computer (and one of the first ones in the world) that was built in 1955.
I got a dream job at one of the biggest distributed companies in the world, almost by chance. It was an excellent experience, but it’s time for a change.
If you read my shortish post about staying employable as a data scientist, you might like a longer post by a colleague, Yanir Seroussi. In his post, Yanir lists four possible paths for a data scientist. To his list, I add two other options.
I received an email from a pharmacist who considers becoming a data scientist. Since this is not the first (or last) similar email that I receive, I think others will find this message exchange interesting.
On November 7, 2016, I started an experiment in personal productivity. I decided to use a notebook for thirty days to manage all of my tasks. The thirty days ended more than three years ago, and I still use notebooks to manage myself.
Don’t we all like a good contradiction? On gut feelings.
Is data science immune to becoming obsolete? I claim it is not. As time passes by, tools become stronger, smarter, and faster. To stay relevant, we need to be in constant movement.
Yes, ML transparency opens opportunities for hacking and abuse. However, this is EXACTLY the reason why such openness is needed. Hacking attempts will not disappear with transparency removal; they will be harder to defend against.
Last year I talked at NDR Iasi. I enjoyed it so much that when Vlad Iliescu, one of the NDR organizers, asked me to present at NDR Bucharest in June, I didn’t think twice.
TL;DR: a nice popular science book that covers many aspects of modern science. A Short History of Nearly Everything by Bill Bryson is a popular science book. I didn’t learn anything fundamental out of this book, but it was worth reading. I was particularly impressed by the intrigues, lies, and manipulations behind so many … Continue reading Book review. A Short History of Nearly Everything by Bill Bryson
Yesterday, a new episode was published in the Popcorn podcast, where the host, Lior Frenkel, interviewed me. Everyone who knows me knows how much I love talking about myself and what I do. I definitely used this opportunity to talk about the world of data. Some people who listened to this episode told me that … Continue reading Cow shit, virtual patient, big data, and the future of the human species
Data visualization as an engineering task – a methodological approach towards creating effective data visualization
Combining getting things done with a tangible Kanban method
Knowledge graphs and NLP — a conference summary
What are some Data science tools with a graphical user interface?
On differences in communication styles when working in a distributed company.
Sometimes, you don’t really need a legend in your graph.
What do we see when we look at slices of a pie chart? Angles? Areas? Arc length? The answer to this question isn’t clear and thus “experts” recommend avoiding pie charts altogether.
Inspired by A citation is not a citation is not a citation by Lior Pachter, this rant is about metrics. Lior Pachter is a researcher at Caltech. As many other researchers in academia, Dr. Pachter is measured by, among other things, publications and their impact as measured by citations. In his post, Lior Pachter criticised both the … Continue reading The problem with citation count as an impact metric
Book review. The War of Art by Steven Pressfield. TL;DR: This is a long motivational book that is “too spiritual” for the cynic materialist that I am.
Data visualization with statistical reasoning: seeing uncertainty with the bootstrap — Dataviz – Stats – Bayes
On Sunday, I wrote about bootstrapping. On Monday, I wrote about visualizing uncertainty. Let’s now talk about bootstrapping and uncertainty visualization. Robert Grant is a data visualization expert who wrote a book about interactive data visualization (which I should read, BTW). Robert runs an interesting blog from which I learned another approach to uncertainty visualization, … Continue reading Data visualization with statistical reasoning: seeing uncertainty with the bootstrap — Dataviz – Stats – Bayes
When Massive Open Online Courses (a.k.a. MOOCs) emerged some X years ago, I was ecstatic. I was sure that MOOCs were the Big Boom of higher education. Unfortunately, the MOOC impact turned out to be very modest. This modest impact, combined with the high production cost, was one of the reasons I quit making my … Continue reading On MOOCs
Just because you can put error bars on a bar chart doesn’t mean you should. Here’s why.
Not long ago, I wrote a post about a fast hack that increased my reading speed by tracking the reading with a finger. I think that the logic behind using a tracking finger is to suppress subvocalization. I noticed that, at least in my case, suppressing subvocalization reduces the fun of reading. I actually enjoy … Continue reading You don’t need a fast way to increase your reading speed by 25%. Or, don’t suppress subvocalization
Originally posted on Yanir Seroussi:
Bootstrapping the right way is a talk I gave earlier this year at the YOW! Data conference in Sydney. You can now watch the video of the talk and have a look through the slides. The content of the talk is similar to a post I published on bootstrapping pitfalls,…
From time to time, people (mostly conference organizers) ask for a picture of mine. Feel free to use any of these images.
Originally posted on richardbrath:
We create visualizations to aid viewers in making visual inferences. Different visualizations are suited to different inferences. Some visualizations offer more additional perceptual inferences over comparable visualizations. That is, the specific configuration enables additional inferences to be observed directly, without additional cognitive load. (e.g. see Gem Stapleton et al, Effective Representation…
Nir Eyal is known for his book “Hooked” in which he teaches how to create addictive products. In his new book “Indistractable”, Nir teaches how to live in a world full of addictive products. The book itself isn’t bad. It provides interesting information and, more importantly, practical tips and action items. Nir covers topics such … Continue reading Book review. Indistractable by Nir Eyal
Next time you wonder why your Israeli colleague, customer or partner barely works during October, recall this post
I was sceptical, but I tried, measured, and arrived at a conclusion. First, I set a timer to 60 seconds and read some text. I managed to read seventeen lines. Then, I used my finger to guide my eyes the same way kids do when they learn to read. It turned out that I was able … Continue reading A fast way to increase your reading speed by 25%
If you find yourself thinking about your professional future, or if you are looking for good career advice, I recommend reading The Formula.
As much as I love thinking that I live in a global world, most people whom I know speak Hebrew. From time to time, someone would tell me “nice post, but why not in Hebrew?”. So, from now on, I will try to translate all my new posts to Hebrew. I will try. Not promising … Continue reading My blog in Hebrew