AI Documentation Strategy: University of Nebraska
Evaluating AI documentation tools and recommending RAG architecture for an 8-campus university system
Context
The University of Nebraska System supports 8 campuses and state colleges. Our documentation was scattered across wikis, Confluence pages, and tribal knowledge. Leadership wanted to explore AI-powered documentation tools that could help users find answers without manually searching through dozens of outdated pages.
I was asked to evaluate whether an AI documentation assistant was feasible, and if so, what architecture would work best.
My Roles & Responsibilities
As the lead evaluator on this initiative, I was responsible for:
- Defining the evaluation criteria and testing methodology
- Building a proof-of-concept with Mistral 7B
- Benchmarking RAG vs. fine-tuning approaches against 50 real user queries
- Presenting findings and a data-backed recommendation to senior IT leadership
The Challenge
The core question wasn't "can AI do this?" It was "which approach gives us accurate answers we can trust?"
Two main architectural approaches existed:
- Fine-tuning: Train a model on our documentation corpus. Potentially more accurate for our specific domain, but expensive and hard to update.
- RAG (Retrieval-Augmented Generation): Keep documents in a vector database and retrieve relevant chunks at query time. Easier to update but depends heavily on retrieval quality.
For a university system, source fidelity was critical. If the AI told a student the wrong deadline or policy, the consequences were real.
What I Did
Built a Proof-of-Concept with Mistral 7B
I chose Mistral 7B as the base model because it was open-source, ran on commodity hardware, and had strong performance benchmarks for its size.
The POC had two components:
- A RAG pipeline that ingested our documentation, chunked it, embedded it into a vector store, and retrieved relevant context at query time
- A fine-tuned version trained on a subset of our documentation for comparison
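The ingestion side of the pipeline was the chunking step: splitting documents into pieces small enough to embed and retrieve. A minimal sketch of fixed-size chunking with overlap (the window sizes are illustrative defaults, not the POC's exact settings):

```python
def chunk(text, size=500, overlap=50):
    # Split a document into overlapping character windows so a fact that
    # straddles a boundary still appears whole in at least one chunk.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk is then embedded and written to the vector store; when a document changes, only its chunks need re-indexing, which is what makes RAG cheap to keep current.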
Benchmarked Both Approaches
I tested both against a set of 50 questions drawn from real user queries (policy questions, procedural how-tos, deadline lookups). I measured:
- Factual accuracy: Did the answer match the source document?
- Source attribution: Could the system point to where the answer came from?
- Latency: How fast did it respond?
- Maintenance burden: How hard was it to update when documents changed?
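The first three metrics can be scored with a simple harness. A sketch of the evaluation loop, assuming each system exposes an `answer_fn` that returns an answer plus a cited source (or `None`), and each test case carries a grading function (names and structure here are illustrative, not the POC's actual code):

```python
import statistics
import time

def benchmark(answer_fn, test_set):
    # Run every benchmark question through a system and collect per-metric scores.
    results = []
    for case in test_set:
        start = time.perf_counter()
        answer, source = answer_fn(case["question"])
        latency = time.perf_counter() - start
        results.append({
            "accurate": case["grade"](answer),   # does the answer match the source doc?
            "attributed": source is not None,    # can the system cite where it came from?
            "latency_s": latency,
        })
    n = len(results)
    return {
        "accuracy": sum(r["accurate"] for r in results) / n,
        "attribution_rate": sum(r["attributed"] for r in results) / n,
        "p50_latency_s": statistics.median(r["latency_s"] for r in results),
    }
```

Running the same 50-question set through both the RAG pipeline and the fine-tuned model made the comparison apples-to-apples; the maintenance-burden criterion was assessed qualitatively rather than scored by the harness.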
Presented Findings to Stakeholders
I compiled the results into a presentation for senior IT leadership. The recommendation was clear: RAG outperformed fine-tuning on source fidelity because it retrieved and cited actual document chunks rather than generating from learned patterns. Fine-tuning produced more fluent responses but occasionally hallucinated policy details, which was unacceptable for a university system.
The Recommendation
I recommended RAG architecture for three reasons:
| Factor | RAG | Fine-tuning |
|---|---|---|
| Source fidelity | High: cites actual documents | Medium: can hallucinate details |
| Update speed | Fast: re-index changed docs | Slow: requires retraining |
| Cost | Lower: no GPU training needed | Higher: compute-intensive |
| Maintenance | Team can update docs, system adapts | Requires ML expertise to retrain |
The recommendation was accepted by leadership.
Takeaways
- The best technical solution isn't always the most impressive one. Fine-tuning felt more sophisticated, but RAG was the right choice because it prioritized accuracy over fluency, which is what mattered for our users.
- PMs need to translate technical trade-offs into business language. Stakeholders didn't care about embedding dimensions or chunk sizes. They cared about "will it give wrong answers?" and "how much will it cost to maintain?"
- Build the POC before the PowerPoint. Having a working prototype with real performance data made the recommendation credible. A slide deck alone wouldn't have been convincing.
Skills Applied
- AI architecture evaluation (RAG vs. fine-tuning)
- Proof-of-concept development
- Performance benchmarking and data analysis
- Stakeholder communication and technical translation
- Decision-making under uncertainty