An Overview of Practical Skills for CRISP-DM
"Data are just summaries of thousands of stories – tell a few of those stories to help make the data meaningful." — Chip and Dan Heath
Data mining techniques change so rapidly that academic curricula must constantly evolve to stay relevant. Paper-based books full of text-based instruction are therefore inadequate: they cannot keep up with the rate of change. The purpose of this online book is to teach, through practice-based video tutorials, the latest and most common techniques for both descriptive and predictive data analytics. We use a current industry-leading tool, Tableau, to teach dashboard design and storytelling, which describe the current state of an organization based on measurable data. However, the supreme value of data is in its ability to predict the future. Prediction is also the most difficult and risky task. Therefore, we begin by teaching basic methods in Excel for multiple regression and the assumptions of linear regression. Afterward, the bulk of the course is spent covering more advanced algorithms and techniques using an industry-leading tool for predictive analysis: Microsoft Azure Machine Learning Studio. We chose these tools first because they are mainstream tools that you are likely to use across a variety of industries, and second because both come with free versions for students :)