Case study
Everyone agreed on the problem, but no one could figure out what it would take to fix it
::
A little bit of background
My company had a customer experience problem that everyone could see but nobody could measure. Customers consistently reported trouble finding what they needed across the company’s sprawling digital landscape. The cause was well understood: during years of rapid growth, individual teams optimized for their own customers and spun up parallel sites to move quickly.
Everyone agreed on the problem, but no one had the data to make a business case for fixing it. Previous attempts to measure scope and ROI had collapsed under the weight of the problem’s complexity. I owned the initiative to finally put a number on it: a $500K program with VP sponsorship and visibility across multiple VP-led organizations.
the challenge
When it’s too complex to measure, the answer is always “not yet”
The scale of the problem was the problem: 20+ organizations were producing content for their customers across 800+ internal and external sites, none of them connected by a single platform or user interface. Previous initiatives ran into this wall and down-scoped to only the properties they controlled, which didn’t fix the fragmentation.
The business needed a definitive picture to make a case for prioritizing cross-organizational work, not another incremental change that could be dismissed as incomplete.
the action
A rigorous answer to an imprecise problem
I designed a three-stage program: 1/ compile a content inventory via web crawling, 2/ enrich the data using natural language processing with human-in-the-loop review to support robust analysis, and 3/ build a permanent technical solution my organization could run on an ongoing basis. I engaged a vendor for the first two stages while we worked on hiring an internal engineering team for the long-term solution.
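For a sense of what Stage 1 entails, here is a minimal sketch of an inventory crawl in Python. It is illustrative only: the actual work used the vendor’s proprietary scanner, and the seed URL and field names here are hypothetical.

```python
# Minimal sketch of a Stage 1 inventory crawl. Illustrative only: the real
# engagement used the vendor's proprietary scanner, and the seed URL is made up.
import csv
from collections import deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

SEEDS = ["https://docs.example-company.com/"]  # hypothetical starting point
MAX_PAGES = 1000  # budget guardrail: cap pages per run

def crawl(seeds, max_pages):
    seen, queue, inventory = set(seeds), deque(seeds), []
    while queue and len(inventory) < max_pages:
        url = queue.popleft()
        try:
            resp = requests.get(url, timeout=10)
        except requests.RequestException:
            continue  # dead links are themselves a finding; log them in a real run
        if "text/html" not in resp.headers.get("Content-Type", ""):
            continue
        soup = BeautifulSoup(resp.text, "html.parser")
        title = soup.title.string.strip() if soup.title and soup.title.string else ""
        inventory.append({"url": url, "title": title})
        # Discover pages by following same-domain links; this substitutes for
        # sitemaps, which many of the sites in question did not have.
        for a in soup.find_all("a", href=True):
            link = urljoin(url, a["href"]).split("#")[0]
            if urlparse(link).netloc == urlparse(url).netloc and link not in seen:
                seen.add(link)
                queue.append(link)
    return inventory

with open("inventory.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["url", "title"])
    writer.writeheader()
    writer.writerows(crawl(SEEDS, MAX_PAGES))
```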
The discovery process confirmed what everyone suspected — 800+ sites, 20+ owning teams — but inventorying that much content would have blown the budget and timeline before we got to any analysis.
The vendor’s first crawl missed large swaths of content: a lack of sitemaps and structured hierarchies was hiding pages from the scanner. Re-running the crawl at the necessary scale would have consumed the entire budget, leaving nothing for Stage 2.
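Triaging which of the 800+ sites even exposed a sitemap would have been the first step in scoping a re-crawl. A quick probe like this sketch (illustrative, assuming the conventional sitemap path) shows the kind of check involved:

```python
# Illustrative probe: does a site expose a sitemap at the conventional path?
# Sites without one can only be inventoried by link-following, which is what
# made re-crawling at full scale so expensive.
import requests

def has_sitemap(base_url: str) -> bool:
    try:
        resp = requests.get(base_url.rstrip("/") + "/sitemap.xml", timeout=10)
    except requests.RequestException:
        return False
    # Guard against servers that return an HTML 200 page for missing files.
    return resp.ok and (b"<urlset" in resp.content or b"<sitemapindex" in resp.content)
```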
A cross-functional working group couldn’t reach consensus on shared terms or descriptors. This wasn’t a collaboration failure. Each team was legitimately optimized for its own customers and had no incentive to conform to standards that might get in the way.
the result
For the first time, we had a map
::
Web crawling via a vendor engagement and the vendor’s proprietary scanner. This was pre-LLM, so data enrichment was manual: URL parsing and reconciliation of competing metadata. Microsoft Excel for synthesis and visualizations.
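To make the URL-parsing step concrete, here is the shape of it as a sketch. The path conventions below are invented; in practice every site had its own, which is exactly why reconciliation had to be manual.

```python
# Sketch of URL-based enrichment. The path conventions are invented; in
# practice every site had its own, which is why reconciliation was manual.
from urllib.parse import urlparse

def enrich(url: str) -> dict:
    parts = [p for p in urlparse(url).path.split("/") if p]
    return {
        "url": url,
        # Assumption: the first path segment often hints at the owning org.
        "candidate_owner": parts[0] if parts else "",
        # Assumption: the second segment often hints at a content type.
        "candidate_type": parts[1] if len(parts) > 1 else "",
        "depth": len(parts),  # very deep paths tend to flag legacy content
    }

print(enrich("https://support.example-company.com/payments/faq/refunds.html"))
# {'url': '...', 'candidate_owner': 'payments', 'candidate_type': 'faq', 'depth': 3}
```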
::
I called a halt to the program before it truly got off the ground. Once I’d weighed the remaining budget against the complexity we’d encountered, I concluded the ROI wouldn’t be there unless we could demonstrate direct bottom-line impact commensurate with the investment, and we couldn’t. Stopping felt like failure, but in fact it was strategic judgment.
::
This is exactly the kind of problem LLMs were built for. Today I wouldn’t need a vendor. I’d build an end-to-end agentic workflow: automated crawling, AI-assisted classification and metadata enrichment, and continuous auditing, all at a fraction of the cost and timeline. The Stage 3 permanent solution that wasn’t worth building then would be worth building now.
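As a sketch of what the classification step in that workflow could look like, using the OpenAI Python SDK as a stand-in (the model name, label set, and prompt are all assumptions, not a spec):

```python
# Sketch of the AI-assisted classification step. Assumes the OpenAI Python SDK
# and an OPENAI_API_KEY in the environment; the model name, label set, and
# prompt are placeholders, not a spec.
import json
from openai import OpenAI

client = OpenAI()
LABELS = ["product docs", "support", "marketing", "legal", "internal"]

def classify(url: str, title: str, excerpt: str) -> dict:
    prompt = (
        f"Classify this page into one of {LABELS} and suggest owning-team "
        f"metadata.\nURL: {url}\nTitle: {title}\nExcerpt: {excerpt}\n"
        "Respond as JSON with keys: label, confidence, suggested_owner."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)

# Each crawled page flows through classify(); low-confidence results go to a
# human review queue, and the whole pass re-runs on a schedule, which is the
# continuous-auditing piece the Stage 3 solution was meant to provide.
```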