RAG is not just uploading documents

Many RAG projects start by choosing a vector database and indexing every PDF, web page, and knowledge base article. The demo works. Then production users ask real questions, and the system returns vague answers, stale policy, or citations that do not support the response.

Storage is only one part of the problem. The harder parts are document preparation, retrieval quality, ranking, answer generation, citation, permissions, and evaluation.

Chunking sets the ceiling

Chunks that are too large contain unrelated material. Chunks that are too small lose context. A better approach respects document structure: headings, paragraphs, tables, code blocks, and FAQ sections should be handled differently. For example, a table or code block should stay in one piece, and an FAQ question should stay with its answer.
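One way to picture structure-aware chunking is the minimal sketch below. It assumes a document where blank lines separate blocks and heading lines start with "#". That convention, the `Chunk` type, and the `max_chars` limit are all illustrative assumptions, not a prescribed format. Tables and code blocks would need their own handling.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    section: str  # heading the chunk belongs to ("" before the first heading)
    text: str


def chunk_by_structure(doc: str, max_chars: int = 500) -> list[Chunk]:
    """Split at headings first, then pack whole paragraphs up to max_chars.

    A paragraph is never cut in the middle, so a single paragraph longer
    than max_chars becomes one oversized chunk.
    """
    chunks: list[Chunk] = []
    section = ""
    buffer: list[str] = []

    def flush() -> None:
        # Emit the buffered paragraphs as one chunk under the current heading.
        if buffer:
            chunks.append(Chunk(section, "\n\n".join(buffer)))
            buffer.clear()

    for block in doc.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):  # assumed heading marker
            flush()  # a chunk never spans two sections
            section = block.lstrip("#").strip()
            continue
        size = sum(len(b) for b in buffer) + len(block)
        if buffer and size > max_chars:
            flush()
        buffer.append(block)
    flush()
    return chunks
```

Because every chunk carries its section heading, a retrieved passage can later be shown with the heading it came from.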

Metadata is just as important. Store source, update time, product version, permission scope, and section path. Without metadata, it is hard to filter old content or show reliable citations.
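As a rough illustration, the metadata fields listed above could live on each chunk record like this. The `ChunkRecord` type, its field names, and the two helper functions are hypothetical. A real vector store would usually hold these fields as filterable attributes next to the embedding.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ChunkRecord:
    text: str
    source: str                    # file name or URL of the origin document
    updated_at: date               # last time the source content changed
    product_version: str
    permission_scope: str          # e.g. "public" or "internal"
    section_path: tuple[str, ...]  # headings from the document root to the chunk


def filter_current(records: list[ChunkRecord], product_version: str,
                   not_before: date) -> list[ChunkRecord]:
    """Keep chunks for one product version that were updated recently enough."""
    return [
        r for r in records
        if r.product_version == product_version and r.updated_at >= not_before
    ]


def format_citation(record: ChunkRecord) -> str:
    """Build a human-readable citation from the stored metadata."""
    path = " > ".join(record.section_path)
    return f"{record.source} > {path} (updated {record.updated_at.isoformat()})"
```

With these fields in place, filtering out old content and printing a citation are both one-line operations rather than guesswork.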

Retrieval must be evaluated

A RAG system needs a test set of real questions. For each question, record the expected source document or the key points a correct answer must contain. Then check whether the correct material appears in the top results before judging the final answer.
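A minimal sketch of that check is a hit rate at k: the share of test questions for which at least one expected document appears in the top k results. The `retrieve` callable stands in for whatever retriever you use. Its signature (question in, ranked document ids out) is an assumption.

```python
from typing import Callable


def hit_at_k(retrieved_ids: list[str], expected_ids: set[str], k: int) -> bool:
    """True if any expected document is among the first k retrieved ids."""
    return any(doc_id in expected_ids for doc_id in retrieved_ids[:k])


def hit_rate(test_set: list[tuple[str, set[str]]],
             retrieve: Callable[[str], list[str]],
             k: int = 5) -> float:
    """Fraction of questions whose expected source shows up in the top k."""
    if not test_set:
        raise ValueError("test_set must contain at least one question")
    hits = sum(hit_at_k(retrieve(question), expected, k)
               for question, expected in test_set)
    return hits / len(test_set)
```

Tracking this number separately from answer quality shows whether a change to chunking or embedding helped retrieval at all.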

When an answer is wrong, you need to know which stage failed: retrieval never found the right material, reranking pushed it out of the top results, or generation ignored or misread it.
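The stage-by-stage check can be sketched as a small function that looks at one logged request. The inputs are assumptions: `candidate_ids` is what the retriever returned, `reranked_ids` is the order after reranking, and `answer_correct` comes from a human or automated grader.

```python
def diagnose(expected_ids: set[str], candidate_ids: list[str],
             reranked_ids: list[str], answer_correct: bool, k: int = 3) -> str:
    """Name the first pipeline stage that lost the expected material."""
    if answer_correct:
        return "ok"
    if not any(doc_id in expected_ids for doc_id in candidate_ids):
        return "retrieval"   # the right document was never fetched
    if not any(doc_id in expected_ids for doc_id in reranked_ids[:k]):
        return "reranking"   # fetched, but ranked below the top k passed on
    return "generation"      # the model saw the right material and still failed
```

Counting these labels across the test set points to the stage that deserves attention first.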

Answer boundaries are product behavior

The model should answer from retrieved material, not from imagination. For product documentation, contracts, support policy, pricing rules, or compliance content, unsupported answers are dangerous. When the sources do not contain the answer, the system should say so.
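One hedged way to enforce this boundary is to decide before calling the model whether there is anything to answer from. In the sketch below, the passage tuple layout, the `min_score` threshold, and the prompt wording are all assumptions to tune for your own retriever and model. No specific LLM API is implied.

```python
from typing import Optional

NO_ANSWER = "I could not find this in the available sources."


def build_grounded_prompt(question: str,
                          passages: list[tuple[str, str, float]],
                          min_score: float = 0.5) -> Optional[str]:
    """Return a source-restricted prompt, or None if nothing is relevant enough.

    Each passage is (source_id, text, relevance_score). When None comes back,
    the caller should reply with NO_ANSWER instead of calling the model.
    """
    usable = [(pid, text) for pid, text, score in passages if score >= min_score]
    if not usable:
        return None
    sources = "\n".join(f"[{pid}] {text}" for pid, text in usable)
    return (
        "Answer using only the sources below and cite source ids in brackets.\n"
        f"If the sources do not contain the answer, reply exactly: {NO_ANSWER}\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
```

The gate and the instruction cover different cases: the gate handles empty retrieval, and the instruction handles retrieved text that turns out not to contain the answer.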

Showing citations is not decoration. It lets users verify the answer and builds trust.
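A cheap sanity check, sketched below, is to confirm that every citation in an answer points at a passage that was retrieved for that request. The bracketed `[source-id]` format is an assumed convention. This only catches invented source ids and does not prove that the cited passage supports the claim.

```python
import re


def unsupported_citations(answer: str, retrieved_ids: set[str]) -> set[str]:
    """Return cited ids that do not match any passage given to the model."""
    cited = set(re.findall(r"\[([^\[\]]+)\]", answer))
    return cited - retrieved_ids
```

A non-empty result is a reasonable signal to withhold or regenerate the answer.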

Updates and permissions are not optional

Documents expire. Permissions change. Deleted documents must be removed from the index. Internal-only material must not appear in public answers. These rules should be part of ingestion from the beginning, not added after a leak.
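To make these rules concrete, here is a minimal in-memory sketch of an index that treats them as part of its interface. `TinyIndex`, `IndexedChunk`, and the scope names are hypothetical. A real system would apply the same filters inside the vector store query rather than in application code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class IndexedChunk:
    doc_id: str
    text: str
    scope: str                         # e.g. "public" or "internal"
    expires_on: Optional[date] = None  # None means the chunk never expires


class TinyIndex:
    def __init__(self) -> None:
        self._chunks: dict[str, list[IndexedChunk]] = {}

    def upsert(self, doc_id: str, chunks: list[IndexedChunk]) -> None:
        # Replace every chunk of the document so stale pieces cannot linger.
        self._chunks[doc_id] = list(chunks)

    def delete(self, doc_id: str) -> None:
        # A deleted source document must disappear from the index as well.
        self._chunks.pop(doc_id, None)

    def visible(self, allowed_scopes: set[str], today: date) -> list[IndexedChunk]:
        """Chunks the caller may see: permitted scope and not yet expired."""
        return [
            chunk
            for chunks in self._chunks.values()
            for chunk in chunks
            if chunk.scope in allowed_scopes
            and (chunk.expires_on is None or chunk.expires_on >= today)
        ]
```

Filtering before ranking means internal or expired text never reaches the model, so it cannot leak into an answer.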

Takeaway

RAG becomes useful when the whole chain is controlled: structured chunks, measurable retrieval, reranking, source-backed answers, permission filters, and update handling. The vector database is only one component.