Turning scattered organisational knowledge into a searchable, cited intelligence layer which reduces discovery time from 7 days to 20 minutes
Every experimentation program generates knowledge. The question is whether that knowledge compounds or disappears.
At Sykes, we were running hundreds of experiments annually across multiple brands and product squads. Each test generated insights, some conclusive, some directional, some dead ends that prevented harmful changes. This should have made the organisation smarter with every test. Instead, it created a different problem: knowledge access.
The problem surfaced in a pattern I kept seeing in practice: a product manager would ask 'have we ever tested checkout layout changes?' and the answer required coordinating across teams, manually searching Slack threads, digging through experiment dashboards, and cross-referencing UX research folders. The process took 30–45 minutes minimum and still left uncertainty about whether the complete picture had been found.
This wasn't a knowledge gap. It was a knowledge access problem. Decision-makers weren't acting on what the organisation already knew because finding what we knew took longer than making a gut call.
The stakes: at 300+ annual tests, even a 10% reduction in re-work compounds significantly over time. But the deeper cost was strategic. When institutional knowledge is hard to access, teams either re-test old ideas or avoid bold hypotheses entirely because they can't build confidence from past learnings.
I set out to build an intelligence layer that made the organisation's complete knowledge base accessible through a single natural language question.
The core architecture used Azure OpenAI for synthesis, Power Automate for orchestration, and intelligent multi-source routing. The system synthesised data from 1,200+ experiments and 200 UX research studies, enabling natural language queries across all sources simultaneously. All outputs were delivered as fully cited HTML reports.
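To make the routing-and-citation idea concrete, here is a minimal Python sketch. The source names, keyword heuristic, record IDs, and findings are invented for illustration; the production flow ran routing and synthesis through Power Automate and Azure OpenAI rather than anything this simple.

```python
from dataclasses import dataclass

# Illustrative source map; the real system drew on the experiment archive,
# the UX research repository, and other internal sources.
SOURCES = {
    "experiments": ["checkout", "pricing", "navigation", "layout"],
    "ux_research": ["usability", "interview", "survey", "journey"],
}

@dataclass
class Citation:
    source: str     # which repository the evidence came from
    record_id: str  # e.g. an experiment or study identifier
    summary: str    # one-line finding used in the synthesis

def route_question(question: str) -> list[str]:
    """Decide which sources a question should be sent to.
    A keyword heuristic stands in here for the routing step; the production
    flow used an LLM-based classifier behind Power Automate."""
    q = question.lower()
    matched = [name for name, terms in SOURCES.items()
               if any(term in q for term in terms)]
    return matched or list(SOURCES)  # fall back to querying everything

def synthesise(question: str, evidence: list[Citation]) -> str:
    """Assemble a cited answer from retrieved evidence.
    The real system sent the evidence to Azure OpenAI and rendered the result
    as an HTML report; here the 'synthesis' is a plain-text stub."""
    lines = [f"Question: {question}"]
    lines += [f"- {c.summary} [{c.source}:{c.record_id}]" for c in evidence]
    return "\n".join(lines)

if __name__ == "__main__":
    q = "Have we ever tested checkout layout changes?"
    print(route_question(q))
    print(synthesise(q, [Citation("experiments", "EXP-0412",
                                  "Single-page checkout lifted conversion on one brand")]))
```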
The implementation ran through a phased rollout: first indexing the existing experiment archive, then integrating the UX research repository, then adding automated meta-analysis as experiments concluded.
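The first phase, indexing the experiment archive, essentially meant turning scattered write-ups into structured, taggable records that could be looked up without manual collation. A rough sketch of that shape, using a hypothetical record structure (the real archive schema was richer than this):

```python
import json
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class ExperimentRecord:
    record_id: str
    brand: str
    hypothesis: str
    outcome: str      # e.g. "win", "loss", "flat", "inconclusive"
    tags: list[str]   # topics the experiment touched

def build_index(records: list[ExperimentRecord]) -> dict[str, list[str]]:
    """Map each topic tag to the record IDs that carry it, so past
    experiments can be found without coordinating across teams."""
    index: defaultdict[str, list[str]] = defaultdict(list)
    for rec in records:
        for tag in rec.tags:
            index[tag.lower()].append(rec.record_id)
    return dict(index)

# Invented example records for illustration only.
archive = [
    ExperimentRecord("EXP-0001", "brand-a", "A shorter checkout reduces drop-off", "win", ["checkout", "layout"]),
    ExperimentRecord("EXP-0002", "brand-b", "Larger hero imagery lifts engagement", "flat", ["homepage", "imagery"]),
]

print(json.dumps(build_index(archive), indent=2))
```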
The quantitative impact was immediate: discovery time dropped from 7+ days (coordinating teams, manual collation, pattern analysis) to 15–20 minutes through conversational queries.
Decision-makers stopped asking 'I wonder if we know anything about this' and started asking 'what do we know about this?', a subtle shift in language that reflected a fundamental shift in how knowledge was treated.
The system also had an unexpected cultural benefit: it reduced the pressure on the CRO team as the institutional memory holder. Product managers, designers, and engineers could now self-serve on historical context, which freed the CRO team to focus on forward-looking work rather than constantly answering 'did we test this before?' questions.
The insight was in how the system was framed. An AI search tool for experiments would have failed to win adoption. A conversational intelligence layer that answered strategic questions with cited evidence became indispensable.
The 80% confidence threshold before routing was also a hard-won design principle. Early tests with lower thresholds produced too many 'I'm not sure' responses, which can reduce trust. Setting the bar at 80% meant the system only answered questions it could genuinely synthesise and simply said 'no match' when it couldn't, rather than guessing.
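In code terms the gate itself is trivial; the hard-won part was choosing the number. A minimal sketch with illustrative names and an assumed 0-to-1 confidence score:

```python
CONFIDENCE_THRESHOLD = 0.80  # the bar described above

def answer_or_decline(confidence: float, draft_answer: str) -> str:
    """Return the synthesised answer only when routing confidence clears
    the threshold; otherwise decline explicitly rather than guess."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return draft_answer
    return "No match found in the experiment or UX research archives."

# A 0.62-confidence match is declined outright instead of answered hesitantly.
print(answer_or_decline(0.62, "Checkout layout: prior tests found..."))
```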
If I were to rebuild this today, I'd instrument adoption tracking earlier: understanding which queries drove the most value would have helped prioritise additional data source integrations faster.
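Something as simple as a structured query log would have been enough to start with. A sketch of what I mean, with hypothetical field names and file location:

```python
import json
import time
from pathlib import Path

LOG_PATH = Path("query_log.jsonl")  # illustrative location

def log_query(user: str, question: str, sources_hit: list[str], answered: bool) -> None:
    """Append one structured event per query so adoption can be analysed
    later: which questions recur, which sources get used, what goes unanswered."""
    event = {
        "ts": time.time(),
        "user": user,
        "question": question,
        "sources": sources_hit,
        "answered": answered,
    }
    with LOG_PATH.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")

log_query("pm-01", "Have we tested free-shipping thresholds?", ["experiments"], True)
```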
The transferable principle: institutional knowledge is an asset that degrades if it isn't systematically accessible. The value of an experimentation program isn't just in the tests you run; it's in whether each test makes the organisation smarter.