
KoVoidG/python-machine-learning


🍗 KFC Siam — Customer Behavior Data Analysis

A multi-page Streamlit dashboard for analyzing customer behavior data collected from KFC Siam (Siam University). This project covers statistical summaries, data visualizations, and machine learning models built from survey responses.

📋 Survey form (data collection): https://survey.htunthihamyo.com/kfc/


👥 Team Members

Global Academy, Siam University — Year 2, Semester 2 · Data Science Project


📋 Overview

This dashboard analyses responses collected via the KFC Siam Customer Behavior Survey. It explores:

  • Demographics — age, gender, nationality, occupation, faculty
  • Spending behavior — budget ranges and patterns by group
  • Ordering habits — visit frequency, dine type, order method, payment
  • Menu & add-on preferences — most popular items
  • Promotions — preferred promotion types
  • Ratings — flavor quality and service satisfaction
  • Machine learning models — budget and occupation prediction

🗂️ Project Structure

KFC final/
├── KFCfinal.py               # Entry point — Streamlit multi-page navigation
├── VisualizationforKFC.py    # Page 1: Key main charts & visualizations
├── Statistics.py             # Page 2: Statistical summaries (mode, median, etc.)
├── OtherFindings.py          # Page 3: Additional charts and correlation heatmap
├── DecisionTree.py           # Page 4: ML — Decision Tree (predict budget)
├── KNeighbour.py             # Page 5: ML — K-Nearest Neighbours (predict occupation)
├── KFC.xlsx                  # Survey dataset (Clean_data, Addon, Promotion sheets)
├── kfc_insights_export (1).xlsx  # Exported insights / summary data
└── README.md                 # This file

📊 Dashboard Pages

1 · Key Main Charts (VisualizationforKFC.py)

Interactive visualizations using Plotly and Matplotlib:

| Chart | Description |
| --- | --- |
| Budget by Occupation | Bar chart showing spending distribution across student, staff, TA, others |
| Demographic Spending by Age | Budget breakdown per age group |
| Demographic Spending by Nationality | Stacked bar chart per nationality |
| Most Popular Main Menu | Horizontal histogram; Chicken dominates |
| Add-on Popularity | French fries are the top add-on |
| Order Type Distribution | Pie chart; Individual (40.5%) and Promotion (38.1%) lead |
| Order Method by Age Group | Kiosk preferred by younger customers |
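
Grouped charts such as Budget by Occupation start from a count table of one category against another. The sketch below shows one way to build that table with `pandas.crosstab`. It is not taken from VisualizationforKFC.py: the rows are made up, and the budget labels `<100` and `200–299` are assumptions, since this README names only the 100–199 and 300+ tiers.

```python
import pandas as pd

# Made-up rows for illustration; not taken from KFC.xlsx.
df = pd.DataFrame({
    "occupation": ["Student", "Student", "Staff", "TA", "Student", "Others"],
    "budget": ["100–199", "100–199", "200–299", "300+", "<100", "100–199"],
})

# Rows are occupations, columns are budget tiers, cells are respondent counts.
counts = pd.crosstab(df["occupation"], df["budget"])

# normalize="index" turns each row into shares that sum to 1,
# which makes groups of different sizes comparable.
shares = pd.crosstab(df["occupation"], df["budget"], normalize="index")

print(counts)
print(shares.round(2))
```

Row-normalized shares are usually the safer input for a stacked bar chart when one group (here, students) is much larger than the others.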

2 · Statistics (Statistics.py)

Descriptive statistics with interpretations:

  • Gender & Nationality distribution with frequency tables
  • Dine type & Payment mode analysis
  • Age — Median & mode: 18–22 age group
  • Budget — Median & mode: 100–199 THB range
  • Visit Frequency — Median & mode: Sometimes
  • Ratings — Flavor and service rating medians & modes
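
Age, budget, and visit frequency are ordered categories, so their median and mode are reported as labels rather than numbers. The sketch below shows one way to compute both with the standard library; it is not the code in Statistics.py. The tier list is an assumption (only 100–199 and 300+ are named in this README), and the responses are made up.

```python
from statistics import median_low, mode

# Hypothetical ordered budget tiers; the real labels live in KFC.xlsx.
BUDGET_ORDER = ["<100", "100–199", "200–299", "300+"]


def ordinal_median_and_mode(responses, order):
    """Return (median_label, mode_label) for ordered categorical responses."""
    rank = {label: i for i, label in enumerate(order)}
    codes = [rank[r] for r in responses]
    # median_low always returns an observed value, so it maps back to a label.
    return order[median_low(codes)], order[mode(codes)]


# Made-up responses, for illustration only.
sample = ["100–199", "<100", "100–199", "200–299", "300+", "100–199", "<100"]
median_label, mode_label = ordinal_median_and_mode(sample, BUDGET_ORDER)
print(median_label, mode_label)
```

Using `median_low` instead of `median` avoids a fractional result such as 1.5 when the number of responses is even, which would not correspond to any category.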

3 · Other Findings (OtherFindings.py)

Additional exploratory charts:

| Chart | Key Finding |
| --- | --- |
| Budget Distribution | Most customers spend 100–199 THB |
| Budget vs Occupation Boxplot | TAs spend the most overall |
| Ordering Method Histogram | Kiosk is the most preferred ordering method |
| Budget by Age Boxplot | 28–35 group shifts toward 300+ budget tier |
| Promotion Preference | Discounts are the #1 preferred promotion |
| Order Type by Age | 18–22 group dominates individual and promotion orders |
| Major Distribution | Global Academy students = 74.5% of respondents |
| Correlation Heatmap | Age, budget, and visit frequency are strongly correlated (0.92–0.98) |
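
A correlation heatmap is a plot of a pairwise correlation matrix. The sketch below shows how such a matrix could be computed from the encoded columns with `DataFrame.corr`. The column names follow the Clean_data sheet, but the values are made up and the choice of method is an assumption; this README does not say which coefficient OtherFindings.py uses.

```python
import pandas as pd

# Made-up encoded responses; column names follow the Clean_data sheet.
df = pd.DataFrame({
    "age_enc": [1, 1, 2, 3, 4, 5],
    "budget_enc": [1, 2, 2, 3, 4, 4],
    "visitFrequency_enc": [1, 2, 3, 3, 4, 5],
})

# Pearson is the pandas default. Spearman is rank-based and is often
# preferred for ordinal codes, so both are computed for comparison.
pearson = df.corr()
spearman = df.corr(method="spearman")

print(pearson.round(2))
print(spearman.round(2))
```

Either matrix could then be rendered with `seaborn.heatmap(pearson, annot=True)`.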

4 · Decision Tree (DecisionTree.py)

Goal: Predict a customer's spending budget category.

  • Features: Order type, age, occupation
  • Algorithm: DecisionTreeClassifier (max depth = 3)
  • Preprocessing: StandardScaler + LabelEncoder
  • Interactive: Adjustable test/train ratio; real-time prediction on user input
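
The steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in DecisionTree.py: the rows are made up, the 0.25 test ratio and `random_state` are arbitrary, and the Streamlit widgets are left out. scikit-learn documents `LabelEncoder` as a target encoder, so `OrdinalEncoder` would be the more conventional choice for feature columns; `LabelEncoder` is used here only because this README names it.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Made-up rows for illustration; not survey data.
df = pd.DataFrame({
    "orderType": ["Individual", "Promotion", "Group", "Individual",
                  "Promotion", "Snack Sharing", "Individual", "Group",
                  "Promotion", "Individual", "Group", "Promotion"],
    "age_enc": [1, 1, 2, 3, 1, 2, 4, 3, 1, 2, 5, 1],
    "occupation": ["Student", "Student", "Staff", "TA", "Student", "Student",
                   "Staff", "TA", "Student", "Others", "Staff", "Student"],
    "budget_enc": [1, 2, 3, 4, 2, 2, 3, 4, 1, 2, 4, 2],
})

# Turn each text feature into integer codes, keeping the fitted encoder
# so that user input can be encoded the same way later.
X = df[["orderType", "age_enc", "occupation"]].copy()
encoders = {}
for col in ["orderType", "occupation"]:
    encoders[col] = LabelEncoder()
    X[col] = encoders[col].fit_transform(X[col])
y = df["budget_enc"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the scaler on the training split only, to avoid leaking test data.
scaler = StandardScaler().fit(X_train)
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(scaler.transform(X_train), y_train)
test_accuracy = model.score(scaler.transform(X_test), y_test)

# Predict the budget tier for one new customer.
new_customer = pd.DataFrame({
    "orderType": encoders["orderType"].transform(["Promotion"]),
    "age_enc": [1],
    "occupation": encoders["occupation"].transform(["Student"]),
})
predicted_budget = model.predict(scaler.transform(new_customer))[0]
print(test_accuracy, predicted_budget)
```

Scaling does not change the splits a decision tree chooses, so the `StandardScaler` step mainly keeps the preprocessing consistent with other models.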

5 · K-Nearest Neighbours (KNeighbour.py)

Goal: Predict a customer's occupation category.

  • Features: Age, budget, order type, order method
  • Algorithm: KNeighborsClassifier
  • Interactive: Selectable K value (1–15), adjustable test ratio, real-time prediction
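
To illustrate what the selectable K value controls, the sketch below fits one `KNeighborsClassifier` per K from 1 to 15 and records the test accuracy of each. It is not the code in KNeighbour.py: the data is randomly generated, the integer codes for order type and order method are assumptions, and the rule that assigns occupations is a toy one.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Randomly generated stand-in data; the real features come from Clean_data.
rng = np.random.default_rng(0)
n = 60
X = np.column_stack([
    rng.integers(1, 6, n),  # age_enc, 1–5
    rng.integers(1, 5, n),  # budget_enc, 1–4
    rng.integers(0, 4, n),  # order type as a hypothetical integer code
    rng.integers(0, 3, n),  # order method as a hypothetical integer code
])
# Toy labelling rule, for illustration only.
y = np.where(X[:, 0] <= 2, "Student", "Staff")

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# One model per K in the range the dashboard exposes (1–15).
accuracy_by_k = {}
for k in range(1, 16):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    accuracy_by_k[k] = knn.score(X_test, y_test)

best_k = max(accuracy_by_k, key=accuracy_by_k.get)
print(best_k, accuracy_by_k[best_k])
```

K must not exceed the number of training rows, so a small dataset combined with a large test ratio can make the higher K values unusable.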

🛠️ Tech Stack

| Technology | Purpose |
| --- | --- |
| Python 3 | Core language |
| Streamlit | Multi-page web dashboard |
| Pandas | Data loading & manipulation |
| NumPy | Numerical operations |
| Matplotlib & Seaborn | Static charts |
| Plotly Express | Interactive charts |
| scikit-learn | Machine learning models |

🚀 Getting Started

Prerequisites

Install the required Python packages:

pip install streamlit pandas numpy matplotlib seaborn plotly scikit-learn openpyxl

Run the dashboard

streamlit run KFCfinal.py

The app will open at http://localhost:8501 in your browser.

Note: KFC.xlsx must be in the same directory as the scripts. The workbook must contain three sheets: Clean_data, Addon, and Promotion.
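
One way to check the sheet requirement before any page renders is sketched below. The helper names `missing_sheets` and `load_kfc_workbook` are hypothetical and do not come from the project scripts; only the file name and the three sheet names are taken from this README.

```python
import pandas as pd

REQUIRED_SHEETS = ("Clean_data", "Addon", "Promotion")


def missing_sheets(available):
    """Return the required sheet names that are absent from `available`."""
    return [name for name in REQUIRED_SHEETS if name not in available]


def load_kfc_workbook(path="KFC.xlsx"):
    """Load the three required sheets as a {sheet name: DataFrame} dict."""
    # sheet_name=None asks pandas to read every sheet into a dict.
    sheets = pd.read_excel(path, sheet_name=None)
    missing = missing_sheets(sheets)
    if missing:
        raise ValueError(f"{path} is missing sheet(s): {', '.join(missing)}")
    return {name: sheets[name] for name in REQUIRED_SHEETS}
```

A relative path such as `KFC.xlsx` is resolved against the directory the command is run from, so `streamlit run KFCfinal.py` should be started from the project folder.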


📁 Dataset (KFC.xlsx)

| Sheet | Contents |
| --- | --- |
| Clean_data | Main cleaned survey responses (demographics, behavior, ratings, encoded columns) |
| Addon | Add-on item counts |
| Promotion | Promotion type counts |

Key Columns (Clean_data)

| Column | Description |
| --- | --- |
| age | Age range label |
| age_enc | Encoded age (1–5) |
| gender | Male / Female / LGBTQ+ |
| nationality | Country of origin |
| occupation | Student / Staff / TA / Others |
| major | Faculty (students only) |
| budget | Spending budget label |
| budget_enc | Encoded budget (1–4) |
| visitFrequency_enc | Encoded visit frequency (1–5) |
| dineType | Dine-in / Takeaway / Delivery |
| orderType | Individual / Promotion / Group / Snack Sharing |
| orderMethod | Counter / Kiosk / App |
| payment | Cash / PromptPay / Card / True Money |
| menuCategory | Most purchased menu category |
| flavorRating | Food flavor rating (1–5) |
| serviceRating | Service experience rating (1–5) |
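
The `_enc` columns pair each ordered label with an integer code. The sketch below shows one way such codes could be derived with `pandas.Categorical`. The label list is an assumption (only the 18–22 and 28–35 groups are named in this README), and the workbook may have been encoded differently.

```python
import pandas as pd

# Hypothetical label order; only "18–22" and "28–35" appear in this README.
AGE_ORDER = ["Under 18", "18–22", "23–27", "28–35", "36+"]


def encode_ordinal(labels, order):
    """Map ordered labels to 1-based codes; labels outside `order` become 0."""
    cat = pd.Categorical(labels, categories=order, ordered=True)
    # Categorical codes start at 0 and use -1 for unknown labels.
    return (cat.codes + 1).tolist()


age_enc = encode_ordinal(["18–22", "28–35", "Under 18", "unknown"], AGE_ORDER)
print(age_enc)
```

Keeping the label order in one list means the same mapping can be reused to turn a model's predicted code back into a readable label.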

📌 Key Insights

  • 🏆 Chicken is overwhelmingly the most popular menu category
  • 🍟 French fries are the most preferred add-on
  • 💸 Most customers spend 100–199 THB per visit
  • 👥 The majority of respondents are 18–22-year-old students
  • 🖥️ Kiosk is the most popular ordering method
  • 🏷️ Discounts are the most preferred promotion type
  • 📈 Age, budget, and visit frequency are strongly correlated (pairwise coefficients of 0.92–0.98)

© 2026 KFC Siam · Data Science Research Project · Global Academy, Siam University

About

This is a group project for a university Data Science course.
