ALX DATA SCIENCE / PRODUCT INTELLIGENCE CASE STUDY

Learner progression, completion, and dropout pressure across a cohort-based programme.

This case models learner behaviour in a structured data science programme with sprints, peer review, assignments, and checkpoints. The goal is simple: show where momentum breaks, which signals matter early, and what product or programme decisions would improve completion.

Domain: EdTech
Series: III.IV Flagship 03
Model: Synthetic cohort data
Dropout rate: 46.7%. Overall learner loss across the model.
Completion rate: 46.1%. Learners who finish, measured across completed cohorts.
Critical sprint: Sprint 4. Where the biggest single drop cluster lands.
Mobile gap: 16.3pp. Desktop completion minus mobile completion.

Progress is not the same as completion.

In a cohort-based programme, learner loss is rarely random. It usually clusters around onboarding, pace pressure, and assignment strain. This case was built to separate those moments and show where intervention would matter most.

Five linked tables model the learner journey.

  • learners.csv tracks learner profile, cohort, device, and final status.
  • cohorts.csv defines size, dates, duration, and intake cycle.
  • lesson_progress.csv follows weekly progress, completion, and time on platform.
  • assignments.csv captures submission, score, lateness, and attempts.
  • dropout_flags.csv records final dropout timing, inactivity, and reason category.
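
As a rough illustration of how these tables could link together, the sketch below joins learner profiles to dropout records on a shared key. The column names (learner_id, cohort_id, device, final_status, dropout_week, reason_category) and the sample rows are assumptions made for the example; the case study describes what each table holds but not its exact headers.

```python
import csv
import io

# Hypothetical headers and rows, invented for illustration only.
LEARNERS_CSV = """learner_id,cohort_id,device,final_status
L001,C1,mobile,dropped
L002,C1,desktop,completed
L003,C2,desktop,active
"""

DROPOUT_FLAGS_CSV = """learner_id,dropout_week,reason_category
L001,4,pace
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_dropout(learners, flags):
    """Left-join dropout flags onto learners by learner_id.

    Learners without a dropout record keep None in the dropout fields.
    """
    flags_by_id = {row["learner_id"]: row for row in flags}
    joined = []
    for learner in learners:
        flag = flags_by_id.get(learner["learner_id"])
        joined.append({
            **learner,
            "dropout_week": int(flag["dropout_week"]) if flag else None,
            "reason_category": flag["reason_category"] if flag else None,
        })
    return joined


journey = join_dropout(read_rows(LEARNERS_CSV), read_rows(DROPOUT_FLAGS_CSV))
```

The same keyed lookup would extend to lesson_progress.csv and assignments.csv, which hold many rows per learner and would need grouping rather than a one-to-one join.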

The strongest risks are early pressure, late work, and mobile-only study.

  • 33.5% of all dropouts happen by week 6, which keeps onboarding as a major retention pressure point even though Sprint 4 is the sharpest single spike.
  • Sprint 4 carries the heaviest single drop cluster, which points to a second pressure point later in the programme.
  • Learners with high assignment lateness drop out at a rate 27.2 percentage points higher than low-lateness learners.
  • 57.8% of dropouts in weeks 6 to 10 come from employed learners, which suggests middle-stage pacing strain.
  • Mobile-only learners complete less often, with a 16.3 percentage point gap versus desktop learners.
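
The percentage point gaps quoted above are differences between two group rates. A minimal sketch of that arithmetic follows, assuming one record per learner with hypothetical device and final_status fields; the toy rows are invented and do not reproduce the 16.3pp or 27.2pp figures.

```python
def rate(records, group_key, group_value, outcome_key, outcome_value):
    """Percentage of records in one group that have the given outcome."""
    group = [r for r in records if r[group_key] == group_value]
    if not group:
        raise ValueError(f"no records with {group_key}={group_value!r}")
    hits = sum(1 for r in group if r[outcome_key] == outcome_value)
    return 100.0 * hits / len(group)


def pp_gap(records, group_key, group_a, group_b, outcome_key, outcome_value):
    """Rate for group_a minus rate for group_b, in percentage points."""
    return (rate(records, group_key, group_a, outcome_key, outcome_value)
            - rate(records, group_key, group_b, outcome_key, outcome_value))


# Toy data (hypothetical): 3 of 4 desktop learners complete, 2 of 4 mobile.
toy_learners = (
    [{"device": "desktop", "final_status": "completed"}] * 3
    + [{"device": "desktop", "final_status": "dropped"}]
    + [{"device": "mobile", "final_status": "completed"}] * 2
    + [{"device": "mobile", "final_status": "dropped"}] * 2
)

mobile_gap = pp_gap(toy_learners, "device", "desktop", "mobile",
                    "final_status", "completed")
```

A gap in percentage points is a subtraction of rates, not a ratio, so a 75% versus 50% split reads as 25pp rather than "50% higher".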

Where the programme starts to lose momentum.

Retention curves

Retention curves by cohort

Retention differs by cohort, but the overall shape still shows a sharp early decline and a second pressure point later in the cycle.
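
One way a per-cohort retention curve could be derived from dropout timing is sketched here. It assumes each learner record carries a hypothetical cohort_id and a dropout_week that is None for learners who never dropped; the convention that a learner is lost from their dropout week onward is also an assumption.

```python
from collections import defaultdict


def retention_curves(learners, total_weeks):
    """Return {cohort_id: [share retained at week 1..total_weeks]}.

    A learner with dropout_week = w counts as retained through week w - 1
    and as lost from week w onward. dropout_week None means never dropped.
    """
    weeks_by_cohort = defaultdict(list)
    for learner in learners:
        weeks_by_cohort[learner["cohort_id"]].append(learner["dropout_week"])

    curves = {}
    for cohort, dropout_weeks in weeks_by_cohort.items():
        size = len(dropout_weeks)
        curves[cohort] = [
            sum(1 for w in dropout_weeks if w is None or w > week) / size
            for week in range(1, total_weeks + 1)
        ]
    return curves


# Toy cohort (hypothetical): four learners, two of whom drop out.
toy = [
    {"cohort_id": "C1", "dropout_week": 2},
    {"cohort_id": "C1", "dropout_week": None},
    {"cohort_id": "C1", "dropout_week": 4},
    {"cohort_id": "C1", "dropout_week": None},
]
curves = retention_curves(toy, total_weeks=4)
```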

Sprint drop-off

Sprint drop-off analysis

Loss clusters in the first two sprints and then returns around Sprint 4 when the programme load gets heavier.
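
A sprint-level view like this can be produced by bucketing dropout weeks into sprints and counting. The sketch below assumes fixed-length sprints and takes weeks_per_sprint as a parameter; the case study does not state the sprint length, so the value of 2 and the toy weeks are purely illustrative.

```python
from collections import Counter


def sprint_of(week, weeks_per_sprint):
    """Map a 1-based week number to a 1-based sprint number."""
    return (week - 1) // weeks_per_sprint + 1


def dropouts_by_sprint(dropout_weeks, weeks_per_sprint):
    """Count dropouts per sprint, returned in sprint order."""
    counts = Counter(sprint_of(w, weeks_per_sprint) for w in dropout_weeks)
    return dict(sorted(counts.items()))


# Toy dropout weeks (hypothetical), with an assumed two-week sprint.
counts = dropouts_by_sprint([1, 2, 3, 7, 8, 8], weeks_per_sprint=2)

# The sprint holding the largest single cluster of dropouts.
critical_sprint = max(counts, key=counts.get)
```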

Engagement and completion

Engagement versus completion

Steady weekly participation is one of the clearest separators between completion and dropout.
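
"Steady" participation can be measured in several ways; the case study does not say which measure it uses. As one possible sketch, the functions below score a learner's weekly time on platform by the share of active weeks and the longest inactive streak. The min_minutes cutoff and the sample values are assumptions.

```python
def active_week_share(weekly_minutes, min_minutes=1):
    """Fraction of observed weeks with at least min_minutes on platform."""
    if not weekly_minutes:
        raise ValueError("weekly_minutes must not be empty")
    active = sum(1 for m in weekly_minutes if m >= min_minutes)
    return active / len(weekly_minutes)


def longest_inactive_streak(weekly_minutes, min_minutes=1):
    """Length of the longest run of consecutive weeks below min_minutes."""
    longest = current = 0
    for minutes in weekly_minutes:
        current = current + 1 if minutes < min_minutes else 0
        longest = max(longest, current)
    return longest


# Two hypothetical learners with the same total time (210 minutes).
steady = [60, 45, 50, 55]
bursty = [200, 0, 0, 10]
```

Both toy learners log the same total time, but only the first is active every week, which is the distinction a total-hours metric would miss.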

Assignment risk

Assignment lateness and dropout

Late work is not just an academic issue. It acts like an early warning signal for learner loss.
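
Treating lateness as an early warning could look something like the sketch below, which flags learners whose share of late submissions across their first few assignments crosses a threshold. The field names (due_week, days_late), the window of three assignments, and the 0.5 threshold are all hypothetical; the case study does not define what counts as high lateness.

```python
def late_share(assignments, first_n):
    """Share of a learner's first_n assignments (by due week) submitted late."""
    window = sorted(assignments, key=lambda a: a["due_week"])[:first_n]
    if not window:
        return 0.0
    return sum(1 for a in window if a["days_late"] > 0) / len(window)


def flag_at_risk(assignments_by_learner, first_n=3, threshold=0.5):
    """Return sorted learner ids whose early late share meets the threshold."""
    return sorted(
        learner_id
        for learner_id, rows in assignments_by_learner.items()
        if late_share(rows, first_n) >= threshold
    )


# Toy submissions (hypothetical values).
toy_assignments = {
    "L001": [
        {"due_week": 1, "days_late": 0},
        {"due_week": 2, "days_late": 2},
        {"due_week": 3, "days_late": 3},
    ],
    "L002": [
        {"due_week": 1, "days_late": 0},
        {"due_week": 2, "days_late": 0},
        {"due_week": 3, "days_late": 1},
    ],
}
at_risk = flag_at_risk(toy_assignments)
```

Restricting the window to early assignments keeps the flag usable as a leading signal rather than a summary computed after the learner has already left.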

Device pattern

Device type and completion

Mobile-only learners face the weakest completion outcomes in this run.

Cohort mix

Cohort completion mix

Each cohort keeps its own shape, but the broad pattern remains consistent enough to support programme-level action.

Built for cohort logic, not MOOC logic.

The synthetic model uses weekly lesson progression, sprint structure, assignment submission, and dropout timing. The programme is treated as a guided, cohort-based experience with deadlines and peer review, not a self-paced content library.

All data is synthetic. The programme structure is modelled to feel operationally believable rather than to mirror a private internal dataset.