RAG is not just uploading documents
Many RAG projects start by choosing a vector database and indexing every PDF, web page, and knowledge base article. The demo works. Then production users ask real questions, and the system returns vague answers, stale policy, or citations that do not support the response.
Storage is only one part of the problem. The harder parts are document preparation, retrieval quality, ranking, answer generation, citation, permissions, and evaluation.
Chunking sets the ceiling
Chunks that are too large contain unrelated material. Chunks that are too small lose context. A better approach respects document structure: headings, paragraphs, tables, code blocks, and FAQ sections should be handled differently.
Metadata is just as important. Store source, update time, product version, permission scope, and section path. Without metadata, it is hard to filter old content or show reliable citations.
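As a rough illustration of structure-aware chunking with metadata, here is a minimal sketch in Python. It assumes the document uses Markdown-style "#" headings and splits only at those headings. The Chunk class, the chunk_by_headings function, and the field names are hypothetical, chosen to mirror the metadata listed above. Tables, code blocks, and FAQ sections would need their own handling, which this sketch does not attempt.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str              # file or URL the chunk came from
    section_path: list[str]  # heading trail, e.g. ["Billing", "Refunds"]
    updated_at: str
    product_version: str
    permission_scope: str


def chunk_by_headings(doc_text, *, source, updated_at, product_version, permission_scope):
    """Split a Markdown-style document into one chunk per heading section."""
    chunks = []
    path = []    # current heading trail
    buffer = []  # body lines collected for the current section

    def flush():
        body = "\n".join(buffer).strip()
        if body:
            chunks.append(Chunk(body, source, list(path), updated_at,
                                product_version, permission_scope))
        buffer.clear()

    for line in doc_text.splitlines():
        if line.startswith("#"):
            flush()
            level = len(line) - len(line.lstrip("#"))
            del path[level - 1:]  # drop headings at the same or deeper level
            path.append(line.lstrip("#").strip())
        else:
            buffer.append(line)
    flush()
    return chunks
```

Because every chunk carries its section path and source, a citation can point to "Billing > Refunds" in a named file rather than to an anonymous text fragment.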
Retrieval must be evaluated
A RAG system needs a test set of real questions. For each question, record the expected source document or the key points the answer must contain. Then check whether the correct material appears in the top results before judging the final answer.
If the answer is wrong, you need to know whether retrieval failed, reranking failed, or generation failed.
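The check described above can be sketched as a small evaluation loop. This is one possible shape, not a standard harness: the test-case keys ("question", "expected_sources"), the retrieve callable, and both function names are assumptions made for illustration. The retrieve callable is expected to return a ranked list of document IDs.

```python
def evaluate_retrieval(test_set, retrieve, k=5):
    """Share of questions with at least one expected source in the top k."""
    hits = 0
    misses = []
    for case in test_set:
        top_k = retrieve(case["question"])[:k]
        if any(doc_id in case["expected_sources"] for doc_id in top_k):
            hits += 1
        else:
            misses.append(case["question"])
    return {"hit_rate": hits / len(test_set), "misses": misses}


def diagnose_failure(candidates, reranked, expected, k, answer_ok):
    """Name the first stage at which the expected source was lost."""
    if answer_ok:
        return "ok"
    if not any(doc_id in expected for doc_id in candidates):
        return "retrieval"   # the right document was never fetched
    if not any(doc_id in expected for doc_id in reranked[:k]):
        return "reranking"   # fetched, but pushed out of the top k
    return "generation"      # the model saw the source and still got it wrong
```

Running the loop on every index or chunking change gives a number to compare, and the list of missed questions shows where to look first.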
Answer boundaries are product behavior
The model should answer from retrieved material, not from imagination. For product documentation, contracts, support policy, pricing rules, or compliance content, unsupported answers are dangerous. When the sources do not contain the answer, the system should say so.
Showing citations is not decoration. It lets users verify the answer and builds trust.
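One way to enforce this boundary in code is to refuse before calling the model when nothing relevant was retrieved, and otherwise to pass numbered sources with an explicit instruction. The sketch below assumes each hit is a dict with "text", "source", and "score" keys. The 0.5 threshold is an arbitrary placeholder, and a relevance score is only a crude proxy for whether the sources actually contain the answer. The function name and the refusal wording are hypothetical.

```python
NO_ANSWER = "I could not find this in the available sources."


def build_grounded_prompt(question, hits, min_score=0.5):
    """Return a source-restricted prompt, or None if nothing supports an answer."""
    supported = [h for h in hits if h["score"] >= min_score]
    if not supported:
        return None  # caller should reply with NO_ANSWER and skip the model

    # Number the sources so the answer can cite them as [1], [2], ...
    sources = "\n".join(
        f"[{i}] ({h['source']}) {h['text']}"
        for i, h in enumerate(supported, start=1)
    )
    return (
        "Answer using only the numbered sources below. "
        "Cite sources like [1]. If they do not contain the answer, "
        f"reply exactly: {NO_ANSWER}\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
```

Numbering the sources in the prompt is what makes the citations checkable: each bracketed number in the answer maps back to a specific chunk and file.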
Updates and permissions are not optional
Documents expire. Permissions change. Deleted documents must be removed from the index. Internal-only material must not appear in public answers. These rules should be part of ingestion from the beginning, not added after a leak.
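A minimal sketch of these rules, using a plain dict as a stand-in for the index. The "permission_scope" key follows the metadata listed earlier. The "expires_on" key and both function names are hypothetical. In a real system the same filters would usually run inside the vector store's query rather than in application code.

```python
from datetime import date


def visible_chunks(chunks, user_scopes, today):
    """Keep chunks the user may see and that have not expired."""
    return [
        c for c in chunks
        if c["permission_scope"] in user_scopes
        and (c["expires_on"] is None or c["expires_on"] > today)
    ]


def remove_document(index, source):
    """Delete every chunk of a removed document; return how many were dropped."""
    stale = [chunk_id for chunk_id, c in index.items() if c["source"] == source]
    for chunk_id in stale:
        del index[chunk_id]
    return len(stale)
```

Filtering before generation matters: if an internal chunk reaches the prompt, the model may paraphrase it even when the citation is later hidden.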
Takeaway
RAG becomes useful when the whole chain is controlled: structured chunks, measurable retrieval, reranking, source-backed answers, permission filters, and update handling. The vector database is only one component.