Clearly, our support vector classifier is learning something from the text information that helps to enhance predictive power. However, the variable importance plot below presents two reasons for caution. First, the occurrence of the...
Data viz is the final step in delivering insights. Analysts craft beautiful insights, but sometimes they don’t have enough time to create amazing visualizations. Unfortunately, this could take away from the effectiveness of...
The industry-wide neglect of information design and data quality (and what you can do about it). My favorite way of explaining the difference between data science and data engineering is this: If data science is...
The industry-wide neglect of knowledge design and data quality (and what you can do about it). My favorite way of explaining the difference between data science and data engineering is this: If data...
Are You Searching Parameters Efficiently? We are usually not here to claim the best-performing model on the strength of marginal improvements in validation metrics. We must pursue business goals. In our simulated scenario,...
Here’s how to set a self-study routine that you’ll actually stick with while learning data science. While self-studying data science, you’ll end up in one of two hypothetical settings: on...
A step-by-step case study of how data scientists approach and execute a cluster evaluation. Cluster “1” has higher average arrests across all crimes. No observable difference in average urban population %. The three clusters appear to be...
4 other ways to look at it. Matrix AB is a sum of p rank-1 matrices of size m×n, where the i-th matrix (among the p) is the result of multiplying column i of A...
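The rank-1 view of the product can be checked with a few lines of plain Python. This is a minimal sketch, not the article's own code: it assumes the truncated sentence pairs column i of A with row i of B (the standard outer-product identity), and the helper names `outer`, `matmul_as_rank1_sum`, and `matmul` are hypothetical, introduced only for illustration.

```python
def outer(col, row):
    """Outer product of a length-m column and a length-n row: an m x n rank-1 matrix."""
    return [[c * r for r in row] for c in col]


def matmul_as_rank1_sum(A, B):
    """Compute AB as the sum of p outer products (column i of A) x (row i of B)."""
    m, p, n = len(A), len(B), len(B[0])
    if any(len(a_row) != p for a_row in A):
        raise ValueError("inner dimensions of A and B do not match")
    total = [[0] * n for _ in range(m)]
    for i in range(p):
        col_i = [A[r][i] for r in range(m)]  # column i of A (assumed pairing)
        row_i = B[i]                         # row i of B (assumed pairing)
        term = outer(col_i, row_i)           # one m x n rank-1 matrix
        for r in range(m):
            for c in range(n):
                total[r][c] += term[r][c]
    return total


def matmul(A, B):
    """Reference product using the usual row-times-column dot products."""
    return [[sum(a * b for a, b in zip(a_row, b_col)) for b_col in zip(*B)]
            for a_row in A]


A = [[1, 2], [3, 4], [5, 6]]           # m x p = 3 x 2
B = [[7, 8, 9, 10], [11, 12, 13, 14]]  # p x n = 2 x 4

print(matmul_as_rank1_sum(A, B) == matmul(A, B))
```

Here p = 2, so the product is the sum of exactly two 3×4 rank-1 matrices, and the comparison against the ordinary row-by-column product confirms both views give the same result.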