Student Mental Health Analysis

01 objective

Not "how many?" — but "how severe, and when?"

Most mental health analyses stop at prevalence: counting how many students report a condition. This project was structured around a different question — does psychological burden intensify as students advance academically?

The analysis was built on three principles: cohort-based comparison using Year of Study as the primary lens, severity stratification separating single from overlapping conditions, and proportional normalisation to prevent enrollment imbalance from distorting results. The goal was risk identification and intervention prioritisation, not descriptive reporting.

02 data preparation

Cleaning, removing noise, and engineering severity

The raw dataset had 11 columns and 101 rows collected via a survey form. Several structural problems needed resolving before any analysis could begin.

STEP_01

Removed Low-Signal Columns

Dropped Timestamp — narrow collection window made it analytically meaningless. Dropped Course — excessive categorical fragmentation with low signal value would have added noise, not insight.

STEP_02

Standardised Categorical Values

Column headers were renamed from verbose survey labels (e.g. "Do you have Anxiety?") to clean field names. CGPA was grouped into standardised ranges: 0–1.99, 2.00–2.49, 2.50–2.99, 3.00–3.49, 3.50–4.00. Year labels normalised to Year 1–4.

STEP_03

Removed Incomplete Records

Blank and incomplete rows were removed to ensure analytical reliability. The cleaned dataset retained all 101 valid student records across the four year cohorts.

STEP_04

Severity Engineering — 3 New Columns

Three derived classification columns were created to measure escalation intensity rather than simple presence. This is the core feature engineering step that powers the entire severity analysis.

03 column changes

What was removed, renamed, and built from scratch

Column	Status	Notes
Timestamp	REMOVED	Narrow survey window made time analysis meaningless
What is your course?	REMOVED	Too many fragmented categories, low analytical signal
Choose your gender → `Gender`	RENAMED	Shortened from verbose survey phrasing
Your current year of Study → `Year Of Study`	RENAMED	Standardised to clean label
What is your CGPA? → `CGPA range`	RENAMED	Grouped into 5 standardised ranges for pivot analysis
Do you have Depression? → `Depression?`	RENAMED	Shortened; same for Anxiety and Panic Attack
`S/N`	NEW	Serial number column added for row identification
`At least 1` — Any Condition	NEW	TRUE if student reports at least one condition. Baseline prevalence flag.
`At least 2` — Multiple Cases	NEW	TRUE if two or more conditions present concurrently. Elevated burden flag.
`All three` — Complex Cases	NEW	TRUE if all three conditions (Anxiety + Depression + Panic Attack) reported together. Highest severity tier.

04 formulas & dax logic

The calculations that drive the severity model

Excel — Severity Classification Columns

These three columns were calculated directly in Excel using OR/AND logic on the three condition columns before the data was loaded into Power BI.

At least 1 — Any Condition EXCEL

// TRUE if any one condition is reported
=OR(G2="Yes", H2="Yes", I2="Yes")
// G2=Depression, H2=Anxiety, I2=Panic Attack

At least 2 — Multiple Cases EXCEL

// TRUE if two or more conditions are present
=(COUNTIF(G2:I2,"Yes")>=2)
// Counts "Yes" values across the three condition columns
// Returns TRUE when count is 2 or more

All three — Complex Cases EXCEL

// TRUE only when all three conditions are simultaneously present
=AND(G2="Yes", H2="Yes", I2="Yes")
// Highest severity tier — flags compounded psychological burden

DAX — Proportional Prevalence (Power BI)

Raw counts alone would have been misleading — Year 1 has 43 students while Year 4 has only 8. DAX measures were written to calculate within-cohort prevalence, making year-on-year comparison valid.

Prevalence % Within Year of Study DAX

// % of students in the current year filter with any condition
Prevalence % =
  DIVIDE(
    CALCULATE(COUNTROWS(Students),
      Students[At least 1] = TRUE),
    COUNTROWS(Students),
    0
  )
// Context-aware: responds to Year of Study slicer
// DIVIDE with 0 fallback prevents division errors on empty filters

Complex Cases Count (DAX) DAX

// Conditional count — only rows where all three = TRUE
Complex Cases =
  CALCULATE(
    COUNTROWS(Students),
    Students[All three] = TRUE
  )

05 key insights

Year 3 is the highest-risk cohort — but raw counts hid it

⚠ CRITICAL FINDING

Year 1 had the largest raw case count (25 students). Without proportional normalisation, this looks like the most at-risk cohort. Proportional analysis tells the opposite story — Year 3 carries the highest burden at 75% prevalence.

Cohort-by-Cohort Breakdown

Cohort	n	Any Condition	Multiple Cases	Complex Cases
Year 1	43	58.1%	30.2%	9.3%
Year 2	26	73.1%	26.9%	7.7%
Year 3	24	75.0%	29.2%	12.5%
Year 4	8 ⚠	25.0%	12.5%	12.5%

⚠ Year 4 n=8. Low sample size limits generalisability — interpret with caution.

75%

Year 3 any-condition prevalence — the highest risk point in the academic journey

63.4%

Overall cohort prevalence — nearly 2 in 3 students report at least one condition

6 / 64

Only 6 of 64 affected students sought specialist care — a significant treatment gap

Academic Performance Is Not the Filter

CGPA segmentation shows that psychological burden is not isolated to lower-performing students. The highest concentration of students sits in the 3.00–4.00 CGPA range — and conditions are present across all tiers. Stress exposure is cross-performance, which means targeting only low-CGPA students for support would miss the majority of cases.

06 recommendations

Five evidence-based interventions

Front-load prevention in Year 1. Implement structured screening at entry level before accumulation begins. The data shows escalation is progressive — intervening at Year 1 disrupts the pathway to Year 3 complexity.
Concentrate counselling resources on Year 2 and Year 3. These cohorts show the steepest proportional burden. Mid-progression is the highest-risk zone, not the entry point.
Establish a monitoring protocol for Complex Cases. The 10 students carrying all three conditions simultaneously need prioritised, structured support — not standard triage.
Adopt universal screening, not CGPA-based targeting. Psychological burden appears across all academic performance tiers. A performance-gated support model would exclude the majority of students who need it.
Close the specialist care gap. Only 6 of 64 affected students accessed specialist treatment. Reducing friction — awareness, referral pathways, stigma — could significantly improve outcomes.

07 conclusion

This is a risk segmentation model, not a chart

The core contribution of this project is methodological. By combining Excel based severity engineering with DAX-driven proportional analysis in Power BI, the work moves beyond descriptive dashboarding into structured risk identification.

The finding that Year 3 represents the peak risk cohort, not Year 1 which had the highest raw count, would have been invisible without normalisation. That distinction is the difference between resource allocation that works and resource allocation that misses the problem entirely.

Student Mental Health
Risk Analysis