Workforce Intelligence: From Movement History to Career Pathways

HR analytics usually stops at descriptive reporting: who moved where last year. The next question, who should move next and along which pathway, is the one the business actually pays for. Answering it takes both historical movement patterns and skills intelligence, at a scale that mirrors a real enterprise workforce, with an iteration loop that does not wait on a production extract every time someone wants to try an idea.

The problem

Workforce planning teams hold three big inputs that rarely line up.

Position histories run to hundreds of thousands of rows per financial year. A “movement” in that data can be a real career step, an administrative reclassification, a temporary secondment, or a record-keeping artefact, and telling them apart matters because the downstream model treats every one as evidence of mobility. Role definitions sit in a separate taxonomy, with skill requirements that update on a different cadence from the positions that reference them. Skills are mapped to roles with directional weights, because a move from one role to another is rarely as easy, or as hard, as the same move in reverse.

A platform that fuses all three has to detect transitions accurately, calculate transferability between roles, recommend pathways that weigh observed movement patterns against skill gaps, and stay testable at enterprise scale even when the real data cannot leave its environment.

The approach

Treat workforce analytics as product engineering, not as a notebook. The Workforce Intelligence Engine applies a small set of patterns that let the system survive taxonomy changes and produce the same results across environments.

Configuration-driven architecture. No hardcoded thresholds in source modules. Employee ID ranges, management level distributions, geographic splits, and skill category proportions live in YAML files. The same code runs against development, staging, and production scale by switching the configuration profile. Onboarding a new dataset means describing it, not editing the engine.
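A minimal sketch of the profile idea, not the engine's actual loader. The profile names, keys, and numbers below are illustrative assumptions, and a plain dictionary stands in for the parsed YAML files so the example runs with the standard library alone.

```python
from dataclasses import dataclass

# Hypothetical profile contents. In the engine these would be parsed from
# YAML files; the keys and values here are assumptions for illustration.
PROFILES = {
    "development": {
        "employee_id_range": [100_000, 100_999],
        "management_levels": {"IC": 0.80, "Manager": 0.15, "Executive": 0.05},
    },
    "production": {
        "employee_id_range": [100_000, 189_999],
        "management_levels": {"IC": 0.82, "Manager": 0.14, "Executive": 0.04},
    },
}


@dataclass(frozen=True)
class WorkforceConfig:
    employee_id_range: tuple[int, int]
    management_levels: dict[str, float]

    @property
    def headcount(self) -> int:
        low, high = self.employee_id_range
        return high - low + 1


def load_profile(name: str, profiles: dict = PROFILES) -> WorkforceConfig:
    """Validate one named profile and return it as a typed config object."""
    raw = profiles[name]
    low, high = raw["employee_id_range"]
    if low > high:
        raise ValueError(f"{name}: employee_id_range is inverted")
    levels = dict(raw["management_levels"])
    if abs(sum(levels.values()) - 1.0) > 1e-9:
        raise ValueError(f"{name}: management_levels must sum to 1")
    return WorkforceConfig((low, high), levels)


dev = load_profile("development")
prod = load_profile("production")
```

The code path is identical for both profiles; only the described data changes. Validating at load time means a bad distribution fails before any dataset is generated.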

Movement detection. Track position changes through temporal lookup keys, enrich each position with role profile and inherited skills, then validate the transition rate against ground-truth windows. The engine accepts that some “movements” are noise and tunes the detector against known clean periods rather than chasing a metric in isolation.
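One way to picture the temporal lookup, as a sketch under stated assumptions: yearly snapshots keyed by (employee_id, fin_year), with a deliberately simple rule that a changed position with an unchanged role is a reclassification. The Snapshot fields and the classification rule are hypothetical; the real detector is tuned against known-clean windows, which this example does not attempt.

```python
from typing import NamedTuple


class Snapshot(NamedTuple):
    employee_id: int
    fin_year: int
    position_id: str
    role: str


def detect_movements(snapshots):
    """Compare each snapshot with the same employee's prior-year snapshot."""
    # Temporal lookup key: (employee_id, fin_year) -> snapshot
    by_key = {(s.employee_id, s.fin_year): s for s in snapshots}
    movements = []
    for (emp, year), current in sorted(by_key.items()):
        previous = by_key.get((emp, year - 1))
        if previous is None:
            continue  # no prior record: a joiner, not a movement
        if previous.position_id == current.position_id:
            continue  # same position: nothing moved
        # Assumed rule: a new position with the same role is administrative.
        kind = "role_change" if previous.role != current.role else "reclassification"
        movements.append((emp, year, previous.role, current.role, kind))
    return movements


def movement_rate(snapshots, fin_year):
    """Share of employees present in both years whose role changed."""
    present = {s.employee_id for s in snapshots if s.fin_year == fin_year}
    prior = {s.employee_id for s in snapshots if s.fin_year == fin_year - 1}
    eligible = present & prior
    if not eligible:
        return 0.0
    moved = {m[0] for m in detect_movements(snapshots)
             if m[1] == fin_year and m[4] == "role_change"}
    return len(moved) / len(eligible)


history = [
    Snapshot(1, 2023, "P1", "Analyst"),
    Snapshot(1, 2024, "P2", "Senior Analyst"),   # career step
    Snapshot(2, 2023, "P3", "Analyst"),
    Snapshot(2, 2024, "P3", "Analyst"),          # no change
    Snapshot(3, 2023, "P4", "Engineer"),
    Snapshot(3, 2024, "P5", "Engineer"),         # reclassification
    Snapshot(4, 2024, "P6", "Manager"),          # joiner
]
movements = detect_movements(history)
rate_2024 = movement_rate(history, 2024)
```

Keeping reclassifications in the output but out of the rate is the design choice that matters here: the noise stays visible for auditing without being counted as mobility.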

Skills intelligence. Similarity between roles is asymmetric, because moving from analyst to manager is not the same step as moving from manager to analyst. Gap analysis names the skills a person would need to acquire, the ones they already have that the destination role does not require, and the overlap that explains why the transition is plausible at all.
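A small sketch of how a directional score and a gap report could be computed. The measure used here, the share of the destination role's weighted requirements that the source role already covers, is an assumed stand-in and not necessarily the engine's formula. The role profiles and weights are invented for illustration.

```python
def transferability(source: dict[str, int], target: dict[str, int]) -> float:
    """Share of the target role's weighted requirements covered by the source.

    Normalising by the target makes the score directional: the same pair of
    roles generally scores differently in each direction.
    """
    required = sum(target.values())
    if required == 0:
        return 1.0  # nothing is required, so any source qualifies
    covered = sum(min(source.get(skill, 0), weight)
                  for skill, weight in target.items())
    return covered / required


def gap_analysis(source: dict[str, int], target: dict[str, int]) -> dict[str, list[str]]:
    """Split skills into those to acquire, those not needed, and the overlap."""
    return {
        "to_acquire": sorted(set(target) - set(source)),
        "not_required": sorted(set(source) - set(target)),
        "overlap": sorted(set(source) & set(target)),
    }


# Hypothetical role profiles: skill -> required level on a 1-5 scale.
analyst = {"sql": 5, "statistics": 4, "reporting": 3}
manager = {"reporting": 3, "stakeholder_management": 5, "budgeting": 4, "coaching": 4}

up = transferability(analyst, manager)    # 3 / 16 = 0.1875
down = transferability(manager, analyst)  # 3 / 12 = 0.25
gaps = gap_analysis(analyst, manager)
```

Both directions share the same single overlapping skill, yet the scores differ because each is measured against what the destination demands.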

Synthetic data generation. Factory, strategy, builder, and adapter patterns produce datasets that match real workforce distributions without touching real employee records. Enterprise-scale demo runs become reproducible: anyone with the codebase can stand up a fresh dataset, run the engine, and see the same shape of output the production team sees.
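A compact sketch of the factory and strategy pairing, assuming a seeded random generator is what makes runs reproducible. The class names, fields, and distributions are hypothetical, and the builder and adapter parts are left out to keep the example short.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Employee:
    employee_id: int
    level: str
    region: str


# Strategy: a callable that draws one attribute value from a given RNG.
Strategy = Callable[[random.Random], str]


def weighted_choice(distribution: dict[str, float]) -> Strategy:
    labels = list(distribution)
    weights = list(distribution.values())

    def draw(rng: random.Random) -> str:
        return rng.choices(labels, weights=weights, k=1)[0]

    return draw


class EmployeeFactory:
    """Factory that assembles synthetic employees from pluggable strategies."""

    def __init__(self, level: Strategy, region: Strategy,
                 seed: int, first_id: int = 100_000):
        self._level = level
        self._region = region
        self._seed = seed
        self._first_id = first_id

    def build(self, count: int) -> list[Employee]:
        # A fresh RNG per call: the same seed always yields the same dataset.
        rng = random.Random(self._seed)
        return [
            Employee(self._first_id + i, self._level(rng), self._region(rng))
            for i in range(count)
        ]


# Illustrative distributions; real ones would come from configuration.
LEVELS = {"IC": 0.80, "Manager": 0.15, "Executive": 0.05}
REGIONS = {"North": 0.5, "South": 0.3, "Overseas": 0.2}

factory = EmployeeFactory(weighted_choice(LEVELS), weighted_choice(REGIONS), seed=42)
workforce = factory.build(10_000)
```

Swapping a strategy, for example a different regional split, changes the shape of the data without touching the factory, and no real employee record is ever read.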

Figure: Architecture from movement detection through skills analysis to pathways

Modes of operation

comprehensive: runs the full pipeline (movement history, skills similarity, pathway recommendations)
individual: returns pathway recommendations for a single employee ID
scenarios: answers strategic questions such as restructuring, acquisition fit, or expansion into a new region
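The three modes could be exposed through a command-line entry point along these lines. This is a sketch only: the program name, flag names, stage names, and scenario labels are all assumptions, and only the mode names come from the list above.

```python
import argparse

# Hypothetical stage names for the full pipeline.
PIPELINE = ["movement_history", "skills_similarity", "pathway_recommendations"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="workforce-intelligence")  # assumed name
    parser.add_argument("--mode", default="comprehensive",
                        choices=["comprehensive", "individual", "scenarios"])
    parser.add_argument("--employee-id", type=int,
                        help="required when --mode is individual")
    parser.add_argument("--scenario",
                        choices=["restructuring", "acquisition", "expansion"],
                        help="required when --mode is scenarios")
    return parser


def plan_run(argv: list[str]) -> dict:
    """Turn command-line arguments into a description of what to run."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.mode == "individual":
        if args.employee_id is None:
            parser.error("--employee-id is required in individual mode")
        return {"mode": "individual", "stages": PIPELINE[-1:],
                "employee_id": args.employee_id}
    if args.mode == "scenarios":
        if args.scenario is None:
            parser.error("--scenario is required in scenarios mode")
        return {"mode": "scenarios", "scenario": args.scenario}
    return {"mode": "comprehensive", "stages": list(PIPELINE)}


full = plan_run([])
single = plan_run(["--mode", "individual", "--employee-id", "100042"])
```

Returning a plan rather than running the stages directly keeps the dispatch logic testable without any data loaded.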

Figure: Conceptual career pathway from source role through intermediate roles to target
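A pathway through intermediate roles can be framed as a cheapest-path search over a directed graph of role transitions. The sketch below uses Dijkstra's algorithm with an assumed edge cost that blends skill gap with how rarely the move has been observed. The roles, gap values, historical shares, and the 50/50 weighting are invented for illustration and are not the engine's scoring.

```python
import heapq

# Hypothetical transitions: (source, target) -> (skill_gap, historical_share),
# both in [0, 1]. A higher gap is harder; a higher share is better trodden.
TRANSITIONS = {
    ("Analyst", "Senior Analyst"): (0.1, 0.70),
    ("Senior Analyst", "Team Lead"): (0.2, 0.50),
    ("Team Lead", "Manager"): (0.2, 0.60),
    ("Senior Analyst", "Manager"): (0.7, 0.05),
    ("Analyst", "Manager"): (0.9, 0.02),
}


def best_pathway(transitions, source, target, gap_weight=0.5):
    """Return (path, cost) for the cheapest route, or (None, inf) if none."""
    graph = {}
    for (src, dst), (gap, share) in transitions.items():
        # Assumed cost: blend of skill gap and rarity of the observed move.
        cost = gap_weight * gap + (1 - gap_weight) * (1 - share)
        graph.setdefault(src, []).append((dst, cost))

    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, role, path = heapq.heappop(queue)
        if role == target:
            return path, cost
        if role in settled:
            continue
        settled.add(role)
        for nxt, step in graph.get(role, []):
            if nxt not in settled:
                heapq.heappush(queue, (cost + step, nxt, path + [nxt]))
    return None, float("inf")


path, cost = best_pathway(TRANSITIONS, "Analyst", "Manager")
```

With these numbers the direct Analyst to Manager jump costs more than three smaller, well-trodden steps, so the search returns the route through Senior Analyst and Team Lead. Because the edges are directed, asking for the reverse route finds nothing.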

Evidence

~90k colleagues, ~45k positions, ~500k records per financial year on validated runs
100% position enrichment against role profiles
~59% movement rate detected with realistic false-positive control on known-clean windows
Zero processing errors on validated pipeline runs
Pairs naturally with the Skill Similarity Engine: similarity answers role-to-role fit, workforce intelligence answers person-to-pathway fit over time

The framing shift is the point. The work moves leadership conversations from “what happened?” to “what should happen next?”, and the synthetic data layer means the conversation does not stall waiting on production access.