Metadata Filtering
Metadata Filtering enhances Gaia's search by combining semantic retrieval with precise metadata-based filters. Instead of relying solely on embeddings and similarity scores, Gaia can detect implicit filters from a user's natural-language query and return more targeted results.
How It Works
When a user asks a question, Gaia:
- Parses the natural language query
- Detects implicit metadata filters (file names, email addresses, dates, etc.)
- Combines semantic search with the extracted filters
- Returns results that match both meaning and metadata
This happens transparently — your app doesn't need to pass explicit filter parameters.
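The steps above can be pictured with a small sketch. This is not Gaia's actual detection logic, which is not described here; it is a minimal illustration that assumes simple regular expressions are enough to spot email addresses and file-name-like tokens. The function name `detect_filters`, both patterns, and the returned keys are hypothetical.

```python
import re

# Illustrative patterns only; these are assumptions, not Gaia's real rules.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Treat tokens with underscores or a common file extension as file-name candidates.
FILE_RE = re.compile(r"\b\w+(?:_\w+)+\b|\b\w+\.(?:pdf|docx|xlsx|pptx|txt)\b")


def detect_filters(query: str) -> dict:
    """Split a query into hypothetical metadata filters and leftover semantic text."""
    emails = EMAIL_RE.findall(query)
    remainder = EMAIL_RE.sub(" ", query)
    file_names = FILE_RE.findall(remainder)
    remainder = FILE_RE.sub(" ", remainder)
    return {
        "emails": emails,
        "file_names": file_names,
        # Whatever is left would drive the semantic (embedding) search.
        "semantic_text": " ".join(remainder.split()),
    }


if __name__ == "__main__":
    print(detect_filters("Find emails sent to legal@company.com about contract renewals"))
    print(detect_filters("Extract disaster recovery steps from DR_Runbook_v3"))
```

The point of the sketch is the split itself: one part of the query becomes exact-match filters, and the rest is left for similarity search.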
Currently Supported Object Types
Metadata Filtering is available for Microsoft 365 Mailbox and Microsoft 365 OneDrive only.
Supported Filters
Microsoft 365 OneDrive
| Filter | Description |
|---|---|
| File name | Full or partial file name |
| File path | Complete file path |
Microsoft 365 Mailbox
| Filter | Description |
|---|---|
| Sender email | Sender's email address |
| Recipient emails (To) | List of recipient email addresses |
| CC emails | CC email addresses (full addresses required) |
| BCC emails | BCC email addresses (full addresses required) |
| Email subject | Subject line text |
| Sent time | Email sent timestamp |
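To make the Mailbox table concrete, the sketch below models the filters as a typed structure and applies them to one message. Everything here is hypothetical: `MailboxFilters`, `matches`, the dictionary keys, and the matching rules (exact, case-insensitive addresses; substring match on the subject; a sent-time range) are assumptions chosen for illustration, not Gaia's documented behavior.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class MailboxFilters:
    """Hypothetical container mirroring the Mailbox filter table."""
    sender: Optional[str] = None
    to: List[str] = field(default_factory=list)
    cc: List[str] = field(default_factory=list)
    bcc: List[str] = field(default_factory=list)
    subject_contains: Optional[str] = None
    sent_after: Optional[datetime] = None
    sent_before: Optional[datetime] = None


def _norm(addresses):
    """Lower-case addresses so comparisons ignore case."""
    return {a.lower() for a in addresses}


def matches(email: dict, f: MailboxFilters) -> bool:
    """Return True if the message satisfies every filter that is set."""
    if f.sender and email["sender"].lower() != f.sender.lower():
        return False
    # Every requested address must appear in the corresponding field (full addresses).
    for key, wanted in (("to", f.to), ("cc", f.cc), ("bcc", f.bcc)):
        if not _norm(wanted) <= _norm(email.get(key, [])):
            return False
    if f.subject_contains and f.subject_contains.lower() not in email["subject"].lower():
        return False
    if f.sent_after and email["sent"] < f.sent_after:
        return False
    if f.sent_before and email["sent"] > f.sent_before:
        return False
    return True


if __name__ == "__main__":
    message = {
        "sender": "alice@company.com",
        "to": ["legal@company.com"],
        "subject": "Incident Report Q3",
        "sent": datetime(2025, 1, 15),
    }
    print(matches(message, MailboxFilters(to=["legal@company.com"])))
```

In this model an unset filter never excludes a message, so a query that mentions only a recipient is constrained by that field alone.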
Query Examples
Queries That Work Well
| Query | What Gaia Detects |
|---|---|
| "Extract disaster recovery steps from DR_Runbook_v3" | File name filter: DR_Runbook_v3 |
| "Find compliance requirements in SOC2_Report" | File name filter: SOC2_Report |
| "Find emails sent to legal@company.com about contract renewals" | Recipient filter + semantic: legal@company.com + "contract renewals" |
| "Find emails with subject containing 'Incident Report'" | Subject filter: Incident Report |
Queries That Don't Work
Unsupported Patterns
These query types are not supported by metadata filtering:
- "Summarize the file HR2026" — summarization of a specific file
- "Find all documents from the HR department" — department-level filtering
- "List all emails from last month" — time-only queries without semantic content
- "Summarize all emails from finance@company.com last quarter" — aggregate summarization
Developer Integration
It's Automatic
Metadata filtering works without changes to your API calls:
# Gaia automatically extracts the recipient filter from the query.
# Call this from inside an async function, with `gaia` as your configured client.
response = await gaia.ask(
    dataset_names=["company-email"],
    query="Find emails sent to legal@company.com about contract renewals",
)
# Results are filtered by recipient AND semantic relevance
Building Better Search UX
While filtering is automatic, you can improve the experience by helping users craft effective queries:
import { useState } from "react";

// `gaiaApi` and `selectedDatasets` are assumed to be provided by your app.
function SearchForm() {
  const [query, setQuery] = useState("");
  const [fileFilter, setFileFilter] = useState("");
  const [emailFilter, setEmailFilter] = useState("");

  // Fold the optional fields into the natural-language query so Gaia can detect them.
  const buildQuery = () => {
    let fullQuery = query;
    if (fileFilter) fullQuery += ` in file ${fileFilter}`;
    if (emailFilter) fullQuery += ` from ${emailFilter}`;
    return fullQuery;
  };

  const handleSearch = (event) => {
    event.preventDefault(); // keep the browser from reloading the page on submit
    gaiaApi.ask(selectedDatasets, buildQuery());
  };

  return (
    <form onSubmit={handleSearch}>
      <input placeholder="What are you looking for?" value={query} onChange={(e) => setQuery(e.target.value)} />
      <input placeholder="File name (optional)" value={fileFilter} onChange={(e) => setFileFilter(e.target.value)} />
      <input placeholder="Email address (optional)" value={emailFilter} onChange={(e) => setEmailFilter(e.target.value)} />
      <button type="submit">Search</button>
    </form>
  );
}
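If your search box is backed by a Python service rather than a browser component, the same query-composition idea can be sketched as below. The helper name `build_query` and its parameters are hypothetical; the phrasing ("in file ...", "from ...") simply mirrors the form example above.

```python
from typing import Optional


def build_query(text: str, file_name: Optional[str] = None, sender: Optional[str] = None) -> str:
    """Fold optional structured inputs into one natural-language query string."""
    parts = [text.strip()]
    file_name = (file_name or "").strip()
    sender = (sender or "").strip()
    if file_name:
        parts.append(f"in file {file_name}")
    if sender:
        parts.append(f"from {sender}")
    # Skip empty pieces so a blank search box does not leave a leading space.
    return " ".join(p for p in parts if p)


if __name__ == "__main__":
    print(build_query("contract renewals", sender="legal@company.com"))
```

The resulting string can then be passed as the `query` argument shown earlier; no separate filter parameter is involved.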
Tips for Accuracy
- Include specific names — File names, email addresses, and case numbers help Gaia apply precise filters
- Combine semantic + metadata — "contract renewals from legal@company.com" works better than just "emails from legal@company.com"
- Use full email addresses — Partial addresses may not trigger metadata filtering
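These tips can be turned into a lightweight client-side check that nudges users before a query is sent. The sketch below is a rough heuristic under stated assumptions: the function name `query_warnings`, the email pattern, and the small filler-word list are all hypothetical and would need tuning for a real product.

```python
import re
from typing import List

# Assumed shape of a "full" address: local part, "@", and a dotted domain.
FULL_EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(?:\.[\w-]+)+$")
# Words that carry no topic on their own (illustrative, not exhaustive).
FILLER = {"find", "list", "show", "all", "emails", "email", "from", "to", "sent", "the", "me"}


def query_warnings(query: str) -> List[str]:
    """Return hints for queries that may not trigger metadata filtering well."""
    warnings = []
    topic_words = []
    for raw in query.split():
        token = raw.strip(".,;:!?\"'()")
        if "@" in token:
            if not FULL_EMAIL_RE.match(token):
                warnings.append(f"partial email address: {token}")
        elif token and token.lower() not in FILLER:
            topic_words.append(token)
    if not topic_words:
        warnings.append("no topic words: add what the emails or files are about")
    return warnings


if __name__ == "__main__":
    print(query_warnings("contract renewals from legal@company.com"))
    print(query_warnings("emails from legal@"))
```

A search form could show these hints inline and still let the user submit the query unchanged.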
Next Steps
- Exhaustive Search — Combine metadata filtering with exhaustive search for complete results.
- Querying & RAG — Core query patterns.
- Permission-Aware Answers — Layer permissions on top of filtered results.