
✈️ Predicting Airline Customer Satisfaction: From Insights to Random Forest (Part 2)

6 min read · Sep 4, 2025

In Part 1 of this series we transformed raw airline passenger data into a structured dataset. We handled missing values, encoded categorical variables, created new features like Total Delay, and standardized numerical features.



Now it’s time for the real action 🚀 → Exploratory Data Analysis (EDA), Model Training, and finding out which algorithm predicts customer satisfaction best.

📌 Target Distribution

  • 56.7% of passengers were dissatisfied.
  • 43.3% were satisfied.

This slight imbalance matters when evaluating models: a model that always predicts the majority class would already reach 56.7% accuracy, so accuracy alone wouldn’t be enough.
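
# Visualise the distribution of the target variable.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# y_encoded is the 0/1 target prepared in Part 1 (0 = Dissatisfied, 1 = Satisfied).
# Using it for both the plot and the percentages keeps the axis labels consistent with the title.
target = pd.Series(y_encoded, name='Satisfaction')

# Plot the number of passengers in each class
plt.figure(figsize=(6, 4))
sns.countplot(x=target, palette='Set2')

plt.title('Distribution of Customer Satisfaction (0 = Dissatisfied, 1 = Satisfied)')
plt.xlabel('Satisfaction')
plt.ylabel('Count')
plt.show()

# Print the share of each class as a percentage
satisfaction_counts = target.value_counts(normalize=True) * 100
print("Satisfaction Distribution (%):\n", satisfaction_counts)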
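
To make the imbalance point concrete, here is a minimal sketch of a majority-class baseline. It does not use the real dataset: `y_demo` is a synthetic label array built only to mirror the 56.7% / 43.3% split reported above, and the variable names are assumptions for illustration.

```python
import numpy as np

# Synthetic labels mirroring the reported split (assumption, not the real data):
# 0 = dissatisfied (56.7%), 1 = satisfied (43.3%)
y_demo = np.array([0] * 567 + [1] * 433)

# Baseline "model": always predict the most frequent class
majority_class = int(np.bincount(y_demo).argmax())
y_pred = np.full_like(y_demo, majority_class)

# Accuracy looks acceptable at first glance...
accuracy = float((y_pred == y_demo).mean())

# ...but recall on the satisfied class shows the baseline never finds them
satisfied_mask = y_demo == 1
recall_satisfied = float((y_pred[satisfied_mask] == 1).mean())

print(f"Majority class: {majority_class}")
print(f"Baseline accuracy: {accuracy:.3f}")
print(f"Recall on satisfied passengers: {recall_satisfied:.3f}")
```

Any model trained later should beat this baseline's accuracy, and per-class metrics such as recall or F1 are worth checking alongside it.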


Written by rahul sahay
