Progress is not the same as completion.
In a cohort-based programme, learner loss is rarely random. It usually clusters around onboarding, pace pressure, and assignment strain. This case was built to separate those moments and show where intervention would matter most.
Five linked tables model the learner journey.
- learners.csv tracks learner profile, cohort, device, and final status.
- cohorts.csv defines size, dates, duration, and intake cycle.
- lesson_progress.csv follows weekly progress, completion, and time on platform.
- assignments.csv captures submission, score, lateness, and attempts.
- dropout_flags.csv records final dropout timing, inactivity, and reason category.
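One way to picture how the tables link is a join on a shared learner key. The sketch below is illustrative only: the column names (`learner_id`, `cohort_id`, `final_status`, `dropout_week`, `reason_category`) and the sample rows are assumptions, since the case lists what each file tracks but not its exact schema.

```python
import csv
import io

# Hypothetical schemas and toy rows; the real files may use different column names.
LEARNERS_CSV = """learner_id,cohort_id,device,final_status
L001,C1,mobile_only,dropped
L002,C1,desktop,completed
L003,C2,desktop,dropped
"""

DROPOUT_FLAGS_CSV = """learner_id,dropout_week,reason_category
L001,3,time_pressure
L003,9,workload
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def left_join(left_rows, right_rows, key):
    """Attach right-hand fields to each left-hand row that shares `key`.

    Assumes at most one right-hand row per key. Left rows with no match
    are kept unchanged, so completers simply carry no dropout fields.
    """
    index = {row[key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        merged = dict(row)
        for field, value in index.get(row[key], {}).items():
            if field != key:
                merged[field] = value
        joined.append(merged)
    return joined


learners = read_rows(LEARNERS_CSV)
flags = read_rows(DROPOUT_FLAGS_CSV)
journey = left_join(learners, flags, "learner_id")
```

The same pattern would extend to lesson_progress.csv and assignments.csv, although those hold many rows per learner and would need aggregating before a one-to-one join like this.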
The strongest risks are early pressure, late work, and mobile-only study.
- 33.5% of all dropouts happen by week 6, which makes onboarding a major retention pressure point.
- Sprint 4 carries the heaviest single drop cluster, which points to a second pressure point later in the programme.
- Learners with high lateness drop out at a rate 27.2 percentage points above that of low-lateness learners.
- 57.8% of dropouts in weeks 6 to 10 come from employed learners, which suggests pacing strain in the middle of the programme.
- Mobile-only learners complete less often, with a 16.3 percentage point gap versus desktop learners.
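The gaps above are stated in percentage points, meaning a difference between two group rates rather than a relative change. A minimal sketch of that arithmetic follows. The field names (`lateness_band`, `final_status`) and the toy rows are assumptions made for illustration, and the numbers are not the case's figures.

```python
def dropout_rate(rows):
    """Percentage of rows whose final_status is 'dropped'."""
    if not rows:
        raise ValueError("cannot compute a rate for an empty group")
    dropped = sum(1 for row in rows if row["final_status"] == "dropped")
    return 100.0 * dropped / len(rows)


def gap_pp(rows, field, group_a, group_b):
    """Dropout rate of group_a minus dropout rate of group_b, in percentage points."""
    in_a = [row for row in rows if row[field] == group_a]
    in_b = [row for row in rows if row[field] == group_b]
    return dropout_rate(in_a) - dropout_rate(in_b)


# Toy data: 3 of 4 high-lateness learners drop out, 1 of 4 low-lateness learners do.
rows = (
    [{"lateness_band": "high", "final_status": "dropped"}] * 3
    + [{"lateness_band": "high", "final_status": "completed"}] * 1
    + [{"lateness_band": "low", "final_status": "dropped"}] * 1
    + [{"lateness_band": "low", "final_status": "completed"}] * 3
)

gap = gap_pp(rows, "lateness_band", "high", "low")  # 75.0 - 25.0 = 50.0
```

The same function would cover the device comparison by computing a completion rate instead of a dropout rate and grouping on a device field.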
Where the programme starts to lose momentum.
Retention curves

Retention differs by cohort, but the overall shape still shows a sharp early decline and a second pressure point later in the cycle.
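A retention curve of this kind can be derived from dropout timing alone. The sketch below assumes each dropout is recorded with a week number and that the cohort size is known. The function name, its parameters, and the toy values are hypothetical, not taken from the case.

```python
def retention_curve(cohort_size, dropout_weeks, total_weeks):
    """Share of the cohort still active at the end of each week, weeks 1..total_weeks."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    drops_per_week = [0] * (total_weeks + 1)
    for week in dropout_weeks:
        if not 1 <= week <= total_weeks:
            raise ValueError(f"dropout week {week} is outside 1..{total_weeks}")
        drops_per_week[week] += 1

    curve = []
    active = cohort_size
    for week in range(1, total_weeks + 1):
        active -= drops_per_week[week]
        curve.append(active / cohort_size)
    return curve


# Toy cohort of 10: early losses in weeks 1 and 2, then one later loss in week 7.
curve = retention_curve(10, [1, 2, 2, 7], total_weeks=8)
# curve == [0.9, 0.7, 0.7, 0.7, 0.7, 0.7, 0.6, 0.6]
```

Running it once per cohort and plotting the results side by side would show both the cohort-level differences and the shared overall shape.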
Sprint drop-off

Loss clusters in the first two sprints and then returns around Sprint 4 when the programme load gets heavier.
Engagement and completion

Steady weekly participation is one of the clearest separators between completion and dropout.
Assignment risk

Late work is not just an academic issue. It acts like an early warning signal for learner loss.
Device pattern

Mobile-only learners face the weakest completion outcomes in this run.
Cohort mix

Each cohort keeps its own shape, but the broad pattern remains consistent enough to support programme-level action.
Built for cohort logic, not MOOC logic.
The synthetic model uses weekly lesson progression, sprint structure, assignment submission, and dropout timing. The programme is treated as a guided, cohort-based experience with deadlines and peer review, not a self-paced content library.
All data is synthetic. The programme structure is modelled to be operationally believable, not to mirror a private internal dataset.