Case study
Everyone agreed on the problem, but no one could figure out what it would take to fix it
::
A little bit of background
My company had a customer experience problem that everyone could see but nobody could measure. Customers consistently reported trouble finding what they needed across the company’s sprawling digital landscape. The cause was well understood: during years of rapid growth, individual teams optimized for their own customers and spun up parallel sites to move quickly.
Everyone agreed on the problem, but no one had the data to make a business case for fixing it. Previous attempts to measure scope and ROI had collapsed under the weight of the problem’s complexity. I owned the initiative to finally put a number on it: a $500K program with VP sponsorship and visibility across multiple VP-led organizations.
the challenge
When it’s too complex to measure, the answer is always “not yet”
The scale of the problem was the problem: 20+ organizations were producing content for their customers across 800+ internal and external sites, none of them connected by a single platform or user interface. Previous initiatives ran into this wall and down-scoped to only the properties they controlled, which didn’t fix the fragmentation.
The business needed a definitive picture to make a case for prioritizing cross-organizational work, not another incremental change that could be dismissed as incomplete.
the action
A rigorous answer to an imprecise problem
I designed a three-stage program: 1/ compile a content inventory via web crawling, 2/ enrich the data using natural language processing with human-in-the-loop review to support robust analysis, and 3/ build a permanent technical solution my organization could run on an ongoing basis. I engaged a vendor for the first two stages while we worked on hiring an internal engineering team for the long-term solution.
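For a sense of what Stage 1 entails, here is a minimal sketch of an inventory crawl in Python. It is illustrative only: the actual work used the vendor’s proprietary scanner, and the seed URL and field names here are hypothetical.

```python
# Minimal sketch of a Stage 1 inventory crawl. Illustrative only: the real
# engagement used the vendor's proprietary scanner, and the seed URL is made up.
import csv
from collections import deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

SEEDS = ["https://docs.example-company.com/"]  # hypothetical starting point
MAX_PAGES = 1000  # budget guardrail: cap pages per run

def crawl(seeds, max_pages):
    seen, queue, inventory = set(seeds), deque(seeds), []
    while queue and len(inventory) < max_pages:
        url = queue.popleft()
        try:
            resp = requests.get(url, timeout=10)
        except requests.RequestException:
            continue  # dead links are themselves a finding; log them in a real run
        if "text/html" not in resp.headers.get("Content-Type", ""):
            continue
        soup = BeautifulSoup(resp.text, "html.parser")
        title = soup.title.string.strip() if soup.title and soup.title.string else ""
        inventory.append({"url": url, "title": title})
        # Discover pages by following same-domain links; this substitutes for
        # sitemaps, which many of the sites in question did not have.
        for a in soup.find_all("a", href=True):
            link = urljoin(url, a["href"]).split("#")[0]
            if urlparse(link).netloc == urlparse(url).netloc and link not in seen:
                seen.add(link)
                queue.append(link)
    return inventory

with open("inventory.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["url", "title"])
    writer.writeheader()
    writer.writerows(crawl(SEEDS, MAX_PAGES))
```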
The discovery process confirmed what everyone suspected — 800+ sites, 20+ owning teams — but inventorying that much content would have blown the budget and timeline before we got to any analysis.
The vendor’s first crawl missed large swaths of content: a lack of sitemaps and structured hierarchies was hiding pages from the scanner. Re-running the crawl at the necessary scale would have consumed the entire budget, leaving nothing for Stage 2.
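Triaging which of the 800+ sites even exposed a sitemap would have been the first step in scoping a re-crawl. A quick probe like this sketch (illustrative, assuming the conventional sitemap path) shows the kind of check involved:

```python
# Illustrative probe: does a site expose a sitemap at the conventional path?
# Sites without one can only be inventoried by link-following, which is what
# made re-crawling at full scale so expensive.
import requests

def has_sitemap(base_url: str) -> bool:
    try:
        resp = requests.get(base_url.rstrip("/") + "/sitemap.xml", timeout=10)
    except requests.RequestException:
        return False
    # Guard against servers that return an HTML 200 page for missing files.
    return resp.ok and (b"<urlset" in resp.content or b"<sitemapindex" in resp.content)
```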
A cross-functional working group couldn’t reach consensus on shared terms or descriptors. This wasn’t a collaboration failure. Each team was legitimately optimized for its own customers and had no incentive to conform to standards that might get in the way.
the result
For the first time, we had a map
::
Web crawling via a vendor engagement and the vendor’s proprietary scanner. This was pre-LLM, so data enrichment was manual: URL parsing and reconciliation of competing metadata. Microsoft Excel for synthesis and visualizations.
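To make the URL-parsing step concrete, here is the shape of it as a sketch. The path conventions below are invented; in practice every site had its own, which is exactly why reconciliation had to be manual.

```python
# Sketch of URL-based enrichment. The path conventions are invented; in
# practice every site had its own, which is why reconciliation was manual.
from urllib.parse import urlparse

def enrich(url: str) -> dict:
    parts = [p for p in urlparse(url).path.split("/") if p]
    return {
        "url": url,
        # Assumption: the first path segment often hints at the owning org.
        "candidate_owner": parts[0] if parts else "",
        # Assumption: the second segment often hints at a content type.
        "candidate_type": parts[1] if len(parts) > 1 else "",
        "depth": len(parts),  # very deep paths tend to flag legacy content
    }

print(enrich("https://support.example-company.com/payments/faq/refunds.html"))
# {'url': '...', 'candidate_owner': 'payments', 'candidate_type': 'faq', 'depth': 3}
```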
::
I called a halt to the program before it truly got off the ground. Once I’d weighed the remaining budget against the complexity we’d encountered, I concluded the ROI wouldn’t be there unless we could demonstrate direct bottom-line impact commensurate with the investment, and we couldn’t. Stopping felt like failure, but in fact it was strategic judgment.
::
This is exactly the kind of problem LLMs were built for. Today I wouldn’t need a vendor. I’d build an end-to-end agentic workflow: automated crawling, AI-assisted classification and metadata enrichment, and continuous auditing, all at a fraction of the cost and timeline. The Stage 3 permanent solution that wasn’t worth building then would be worth building now.
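As a sketch of what the classification step in that workflow could look like, using the OpenAI Python SDK as a stand-in (the model name, label set, and prompt are all assumptions, not a spec):

```python
# Sketch of the AI-assisted classification step. Assumes the OpenAI Python SDK
# and an OPENAI_API_KEY in the environment; the model name, label set, and
# prompt are placeholders, not a spec.
import json
from openai import OpenAI

client = OpenAI()
LABELS = ["product docs", "support", "marketing", "legal", "internal"]

def classify(url: str, title: str, excerpt: str) -> dict:
    prompt = (
        f"Classify this page into one of {LABELS} and suggest owning-team "
        f"metadata.\nURL: {url}\nTitle: {title}\nExcerpt: {excerpt}\n"
        "Respond as JSON with keys: label, confidence, suggested_owner."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)

# Each crawled page flows through classify(); low-confidence results go to a
# human review queue, and the whole pass re-runs on a schedule, which is the
# continuous-auditing piece the Stage 3 solution was meant to provide.
```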