Course Outline

Machine Learning Introduction

  • Types of machine learning – supervised vs unsupervised
  • From statistical learning to machine learning
  • The data mining workflow: business understanding, data preparation, modeling, deployment
  • Choosing the right algorithm for the task
  • Overfitting and the bias-variance tradeoff

Python and ML Libraries Overview

  • Why use programming languages for ML
  • Choosing between R and Python
  • Python crash course and Jupyter Notebooks
  • Python libraries: pandas, NumPy, scikit-learn, matplotlib, seaborn

Testing and Evaluating ML Algorithms

  • Generalization, overfitting, and model validation
  • Evaluation strategies: holdout, cross-validation, bootstrapping
  • Metrics for regression: ME, MSE, RMSE, MAPE
  • Metrics for classification: accuracy and the confusion matrix; handling unbalanced classes
  • Model performance visualization: profit curve, ROC curve, lift curve
  • Model selection and grid search for tuning
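
As an illustration of the regression metrics listed above, the following is a minimal sketch in plain Python. The function name regression_metrics and the sample numbers are hypothetical, and the error is taken as actual minus predicted, which is one common sign convention and an assumption here rather than the course's stated definition.

```python
import math


def regression_metrics(actual, predicted):
    """Return ME, MSE, RMSE and MAPE for two equal-length sequences.

    Assumption: error = actual - predicted. MAPE is undefined when any
    actual value is zero, so that case is rejected explicitly.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")

    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)

    me = sum(errors) / n                      # mean error (signed bias)
    mse = sum(e * e for e in errors) / n      # mean squared error
    rmse = math.sqrt(mse)                     # root mean squared error
    # mean absolute percentage error, in percent
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    return {"ME": me, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Hypothetical sample data, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

In this sample the signed errors cancel, so ME is zero while MSE and RMSE are not. This is why ME is usually read as a measure of bias rather than of accuracy.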

Data Preparation

  • Data import and storage in Python
  • Exploratory analysis and summary statistics
  • Handling missing values and outliers
  • Standardization, normalization, and transformation
  • Qualitative data recoding and data wrangling with pandas

Classification Algorithms

  • Binary vs multiclass classification
  • Logistic regression and discriminant functions
  • Naïve Bayes, k-nearest neighbors
  • Decision trees and tree ensembles: CART, random forests, bagging, boosting, XGBoost
  • Support Vector Machines and kernels
  • Ensemble learning techniques

Regression and Numerical Prediction

  • Least squares and variable selection
  • Regularization methods: L1 (lasso) and L2 (ridge)
  • Polynomial regression and nonlinear models
  • Regression trees and splines

Neural Networks

  • Introduction to neural networks and deep learning
  • Activation functions, layers, and backpropagation
  • Multilayer perceptrons (MLP)
  • Using TensorFlow or PyTorch for basic neural network modeling
  • Neural networks for classification and regression

Sales Forecasting and Predictive Analytics

  • Time series vs regression-based forecasting
  • Handling seasonal and trend-based data
  • Building a sales forecasting model using ML techniques
  • Evaluating forecast accuracy and uncertainty
  • Business interpretation and communication of results

Unsupervised Learning

  • Clustering techniques: k-means, k-medoids, hierarchical clustering, self-organizing maps (SOMs)
  • Dimensionality reduction: PCA, factor analysis, SVD
  • Multidimensional scaling

Text Mining

  • Text preprocessing and tokenization
  • Bag-of-words, stemming, and lemmatization
  • Sentiment analysis and word frequency
  • Visualizing text data with word clouds

Recommendation Systems

  • User-based and item-based collaborative filtering
  • Designing and evaluating recommendation engines

Association Pattern Mining

  • Frequent itemsets and Apriori algorithm
  • Market basket analysis and lift ratio
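
The lift ratio mentioned above can be sketched in a few lines of plain Python. This assumes the standard definition, lift(A → B) = support(A and B) / (support(A) × support(B)). The helper names support and lift and the basket data are hypothetical and chosen only for illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for basket in transactions if items <= set(basket))
    return hits / len(transactions)


def lift(transactions, antecedent, consequent):
    """Lift of the rule antecedent -> consequent.

    Values above 1 suggest the items co-occur more often than expected
    under independence; values below 1 suggest the opposite.
    """
    joint = support(transactions, set(antecedent) | set(consequent))
    denominator = support(transactions, antecedent) * support(transactions, consequent)
    if denominator == 0:
        raise ValueError("lift is undefined when either side has zero support")
    return joint / denominator


# Hypothetical market baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

bread_milk_lift = lift(baskets, {"bread"}, {"milk"})
print(round(bread_milk_lift, 3))
```

Here bread and milk each appear in three of four baskets and together in two, so the lift is 0.5 / (0.75 × 0.75), slightly below 1.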

Outlier Detection

  • Extreme value analysis
  • Distance-based and density-based methods
  • Outlier detection in high-dimensional data

Machine Learning Case Study

  • Understanding the business problem
  • Data preprocessing and feature engineering
  • Model selection and parameter tuning
  • Evaluation and presentation of findings
  • Deployment

Summary and Next Steps

Requirements

  • Basic knowledge of machine learning concepts such as supervised and unsupervised learning
  • Familiarity with Python programming (variables, loops, functions)
  • Some experience with data handling using libraries like pandas or NumPy is helpful but not required
  • No prior experience with advanced modeling or neural networks is expected

Audience

  • Data scientists
  • Business analysts
  • Software engineers and technical professionals working with data

Duration

  • 28 hours
