AI-3018 Learning Portal
Objective 2.1 | 30 min | high priority | Tags: vector-store, file-search, rag, embeddings, chunking

2.1 — Add Files to the Agent's Vector Store

Upload and index files in the agent's built-in vector store to enable retrieval-augmented generation (RAG) with the File Search tool.

Prerequisites: 1.1, 1.2
Concept — What & Why

The Built-In Vector Store

The Azure AI Foundry Agent Service gives every agent a built-in vector store: a managed embedding database inside the Foundry project that stores chunked, vectorized document content. The service manages the underlying index automatically, so no external infrastructure is required. When you upload files to an agent, the service chunks them, generates embeddings, and stores those vectors so that the File Search tool can perform semantic retrieval at query time. File Search is the built-in agent tool that retrieves relevant chunks from the vector store (or a connected Azure AI Search index) and injects them into the model's context before it generates a reply.

A vector store holds numeric representations of document content. Each chunk of text is converted into a high-dimensional embedding vector. At query time the user's question is embedded in the same space and the closest chunks are returned. Injecting those retrieved chunks into the model's context, so that its response is grounded in specific source material, is the pattern known as Retrieval-Augmented Generation (RAG). You can tune retrieval precision by setting a ranking threshold, a minimum similarity score (0–1) that a chunk must meet to be included in File Search results. Values of 0.5–0.7 filter out weakly matching chunks and reduce hallucination from tangentially related content.
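To make "closest chunks" concrete, here is a minimal sketch that ranks a few toy vectors against a query vector. It assumes cosine similarity as the distance measure; the source does not say which metric the service uses, and the three-dimensional vectors and chunk names are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def rank_chunks(query_vec, chunk_vecs):
    """Return (chunk_id, score) pairs sorted from most to least similar."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional "embeddings" (hypothetical); real embeddings have far more dimensions.
chunk_vecs = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "company-history": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.0]  # hypothetical embedding of "How do refunds work?"
ranking = rank_chunks(query_vec, chunk_vecs)
```

The chunk whose vector points in nearly the same direction as the query ranks first, which is the behavior the File Search tool relies on at query time.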

Concept | Detail
Chunking | Files are split into overlapping text chunks (default ~800 tokens, ~400 overlap)
Embedding model | Managed by the service; you do not select or host it
Index scope | Per-agent; each agent has its own vector store namespace
Persistence | Files persist until explicitly deleted; re-upload is not needed across runs
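
The overlapping chunking described in the table can be sketched as a sliding window. This is an illustrative approximation, not the service's actual splitter: it treats list items as tokens, and the function name `chunk_tokens` is hypothetical. The defaults mirror the ~800-token chunk size and ~400-token overlap listed above.

```python
def chunk_tokens(tokens, chunk_size=800, overlap=400):
    """Split a token list into fixed-size chunks; adjacent chunks share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this chunk already reaches the end of the document
    return chunks

# Placeholder strings stand in for real model tokens.
tokens = [f"w{i}" for i in range(2000)]
chunks = chunk_tokens(tokens)
```

With a 2,000-token document and the default settings, the window advances 400 tokens at a time, so the second half of each chunk reappears as the first half of the next. The overlap keeps a sentence that straddles a boundary intact in at least one chunk.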
Deep Dive — How It Works

Supported File Types and Size Limits

File Type | Extension | Notes
PDF | .pdf | Text-layer PDFs only; scanned images are not OCR'd
Word | .docx | Paragraph and table content extracted
Plain text | .txt | UTF-8 preferred
Markdown | .md | Headers and body extracted
PowerPoint | .pptx | Slide text extracted
HTML | .html | Tag content stripped
JSON / CSV | .json, .csv | Treated as structured text

Size limits: Single file maximum is 512 MB. The total vector store per agent is limited to 100 GB. Each file is processed asynchronously — large files may take several minutes before they become searchable.
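
A local pre-upload check can catch unsupported extensions and oversized files before you wait on asynchronous processing. The sketch below is a hypothetical helper, not part of any SDK. It uses the extensions and the 512 MB limit listed above, and it assumes "MB" means binary megabytes (1024 × 1024 bytes).

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".pptx", ".html", ".json", ".csv"}
MAX_FILE_BYTES = 512 * 1024 * 1024  # 512 MB, read as binary megabytes (an assumption)

def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the file passes these local checks."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_FILE_BYTES}")
    return problems

ok = check_upload("report.PDF", 10_000)   # extension match is case-insensitive
bad = check_upload("scan.tiff", 10_000)
```

Note that a check like this cannot tell whether a PDF has a text layer; a scanned PDF passes it and still yields no retrievable content.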

How the File Search Tool Uses the Vector Store

When a user sends a message, the File Search tool is invoked automatically (if enabled). The workflow is:

  1. The user query is embedded.
  2. The vector store returns the top-k most similar chunks (configurable via max_num_results).
  3. Retrieved chunks are injected into the system context before the model generates a reply.
  4. The response includes citations — inline references showing which file and chunk sourced each claim.
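
Steps 2 through 4 above can be sketched in a few lines: select the top-scoring chunks, then format them as numbered sources the model can cite. Everything here is illustrative. The `ScoredChunk` type, the prompt wording, and the sample data are assumptions, and the real service's context format is not documented in this section.

```python
from dataclasses import dataclass

@dataclass
class ScoredChunk:
    file_name: str
    chunk_index: int
    text: str
    score: float  # similarity to the query; higher means closer

def select_top_k(chunks, max_num_results=20):
    """Step 2: keep the max_num_results highest-scoring chunks."""
    return sorted(chunks, key=lambda c: c.score, reverse=True)[:max_num_results]

def build_context(chunks):
    """Step 3: format chunks as numbered sources so the reply can cite them (step 4)."""
    lines = ["Answer using only the sources below and cite them as [n]."]
    for n, c in enumerate(chunks, start=1):
        lines.append(f"[{n}] {c.file_name} (chunk {c.chunk_index}): {c.text}")
    return "\n".join(lines)

# Made-up retrieval results for demonstration.
retrieved = [
    ScoredChunk("handbook.pdf", 3, "Refunds are issued within 14 days.", 0.91),
    ScoredChunk("handbook.pdf", 7, "Shipping takes 3 to 5 business days.", 0.42),
    ScoredChunk("faq.md", 0, "Contact support to start a refund.", 0.78),
]
context = build_context(select_top_k(retrieved, max_num_results=2))
print(context)
```

Lowering `max_num_results` shrinks the context the model sees, which is why a very small value can drop the one passage that answers the question.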

File Search Tool Configuration Parameters

Parameter | Purpose | Default
max_num_results | Maximum chunks returned per query | 20
ranking_threshold | Minimum similarity score (0–1) to include a chunk | 0 (all returned)
chunk_overlap_tokens | Overlap between adjacent chunks (advanced) | 400

Setting ranking_threshold to 0.5–0.7 filters out weakly matching chunks and reduces hallucination risk from tangentially related content.
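
The effect of the threshold is easy to see on a handful of scored chunks. This is a minimal sketch with invented scores; it assumes a chunk is kept when its score is greater than or equal to the threshold, which matches the "minimum score" wording but is not confirmed for boundary values.

```python
def apply_ranking_threshold(scored_chunks, ranking_threshold=0.0):
    """Keep (chunk_id, score) pairs whose score meets the threshold."""
    if not 0.0 <= ranking_threshold <= 1.0:
        raise ValueError("ranking_threshold must be between 0 and 1")
    return [(cid, score) for cid, score in scored_chunks if score >= ranking_threshold]

# Hypothetical similarity scores for one query.
scored = [
    ("refund-policy", 0.91),
    ("refund-faq", 0.63),
    ("shipping-times", 0.35),
    ("company-history", 0.05),
]
all_results = apply_ranking_threshold(scored)                     # default 0 keeps everything
strict = apply_ranking_threshold(scored, ranking_threshold=0.6)   # keeps only strong matches
```

With the default of 0 all four chunks reach the model; at 0.6 only the two refund-related chunks remain.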

Criterion | Built-in Vector Store | Azure AI Search Connection
Setup effort | Zero (automatic) | Requires a separate Azure AI Search resource
Data location | Managed inside the Foundry project | Your own subscription
Custom indexing | Not configurable | Full control (analyzers, fields, facets)
Scale | Up to 100 GB per agent | Unlimited (tiered pricing)
Hybrid search | Not supported | Supported (BM25 + vector)
Best for | Quick RAG on small–medium corpora | Enterprise indexes, compliance requirements

Use the built-in vector store when you need rapid onboarding with minimal infrastructure. Choose Azure AI Search when you have an existing index, need hybrid retrieval, or must keep data in a specific region under your own tenant.

Hands-On Lab

Hands-On: Upload a PDF and Confirm Indexing

Goal: Upload a PDF to an agent and confirm it is indexed and searchable.

  1. Navigate to Azure AI Foundry (https://ai.azure.com) and open your project.
  2. Click Agents in the left navigation, then select or create an agent.
  3. In the agent editor, click the Files tab.
  4. Click + Upload file → select a PDF from your local machine (under 512 MB).
  5. Watch the Status column — it transitions from Processing to Indexed (typically 10–60 seconds for small files).
  6. Switch to the Playground tab, type a question whose answer is in the PDF, and press Send.
  7. Confirm the response includes an inline citation with the file name and page reference.
  8. Expand the Tool calls panel to see the raw File Search input, output chunks, and similarity scores.
  9. (Optional) Go back to Files, click the file name, and inspect the Chunks list to see how the document was split.
  10. Try an out-of-scope question (not in the document) and verify the agent acknowledges it cannot answer rather than hallucinating.
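
If you automate step 5 instead of watching the Status column, the wait becomes a polling loop. The sketch below is deliberately SDK-agnostic: `get_status` is a hypothetical callable you would back with whatever status lookup your client offers, and the status strings are the portal labels quoted above, which may differ from the values an API returns.

```python
import time

def wait_until_indexed(get_status, timeout_s=300.0, poll_interval_s=2.0, sleep=time.sleep):
    """Poll get_status() until it returns "Indexed"; raise on any other outcome or on timeout."""
    waited = 0.0
    while True:
        status = get_status()
        if status == "Indexed":
            return waited  # seconds spent sleeping between polls
        if status != "Processing":
            raise RuntimeError(f"unexpected file status: {status}")
        if waited >= timeout_s:
            raise TimeoutError(f"file still processing after {waited:.0f}s")
        sleep(poll_interval_s)
        waited += poll_interval_s

# Simulated status source for demonstration; a real one would query the service.
statuses = iter(["Processing", "Processing", "Indexed"])
elapsed = wait_until_indexed(lambda: next(statuses), sleep=lambda seconds: None)
```

Passing `sleep` as a parameter keeps the loop testable without real delays. For large files, raise `timeout_s`, since processing can take several minutes.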
Exam Angle — What AI-3018 Tests

AI-3018 Assessment Focus

File type limitations (especially scanned PDFs) and the distinction between vector store files and Code Interpreter files are the most common exam traps in this domain.

Exam Trap

"Upload via Azure Blob Storage to use File Search" — False. Files are uploaded directly through the Agent Service portal or SDK. You do not need a Storage account for the built-in vector store.

Exam Trap

"Scanned PDF images are automatically OCR'd" — False. The File Search tool extracts the text layer only. A scanned PDF with no embedded text produces no retrievable content.

Exam Trap

"The agent re-embeds files on every run" — False. Embeddings are computed once at upload time and stored persistently. They are not regenerated per conversation.

Exam Trap

"Setting max_num_results to 1 always improves accuracy" — Not necessarily. With too few chunks, the passage that answers the query may be left out, and the model may then hallucinate to fill the gap.

Must Memorize

File Search files and Code Interpreter files are separate upload targets. A file uploaded to the vector store is NOT accessible to Code Interpreter, and vice versa.

Q: What happens when you upload a scanned PDF to the agent's vector store?
A: Only the text layer is extracted and scanned images are not OCR'd, so a scanned PDF with no embedded text produces no retrievable content.

Q: What is the default maximum number of chunks returned by File Search per query?
A: 20 (the default for max_num_results).

Q: Are file embeddings recomputed at the start of each conversation?
A: No. Embeddings are computed once at upload time and stored persistently.

Q: What is the single-file size limit for the built-in vector store?
A: 512 MB.

Q: Which retrieval feature does Azure AI Search support that the built-in Foundry vector store does not?
A: Hybrid search (BM25 + vector).

Q: What does setting ranking_threshold to 0.6 do?
A: Chunks with a similarity score below 0.6 are excluded from File Search results.

Sources & Further Reading