SDK Reference
The `gaia_sdk` Python package provides a lightweight, async client for the Cohesity Gaia RAG API. It handles authentication, request serialization, error mapping, and SSE streaming, so you can focus on your application logic.
Installation
Install the SDK from the local package:
Or add it to your requirements.txt:
Dependencies
The SDK depends on httpx for async HTTP and pydantic for data models. Both are installed automatically.
Quick Start
```python
import asyncio

from gaia_sdk import GaiaClient


async def main():
    # Open the client for the duration of the block.
    async with GaiaClient(api_key="your-api-key") as gaia:
        datasets = await gaia.list_datasets()
        # Query the first dataset returned by the API.
        response = await gaia.ask(
            dataset_names=[datasets[0].name],
            query="What are the key findings?",
        )
        print(response.response_string)


asyncio.run(main())
```
Or initialize from environment variables:
GaiaClient Methods
Constructor
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str \| None | None | Gaia API key. Falls back to the GAIA_API_KEY environment variable. |
| base_url | str \| None | None | API base URL. Falls back to the GAIA_BASE_URL environment variable. |
| timeout | int | 60 | Request timeout in seconds. |
| verify_ssl | bool | True | Verify SSL certificates. |
| security_context | str \| None | None | Multi-tenant security context. |
Datasets
| Method | Signature | Description |
|---|---|---|
| list_datasets | (prefix: str \| None = None) → list[Dataset] | List available datasets, optionally filtered by name prefix. |
| get_dataset | (name: str) → DatasetDetails | Get detailed information about a specific dataset. |
| create_dataset | (name: str, **kwargs) → dict | Create a new dataset. |
| delete_dataset | (name: str) → dict | Delete a dataset. |
| trigger_indexing | (name: str) → dict | Trigger indexing for a dataset. |
RAG Queries
| Method | Signature | Description |
|---|---|---|
| ask | (dataset_names: list[str], query: str, conversation_id: str \| None, llm_name: str \| None) → AskResponse | Non-streaming RAG query. Returns the full response in one call. |
| ask_stream | (dataset_names: list[str], query: str, ...) → StreamResult | Streaming RAG query. Accumulates the streamed chunks and returns the complete result. |
| ask_stream_iter | (dataset_names: list[str], query: str, ...) → AsyncIterator[StreamChunk] | Streaming RAG query. Yields individual SSE chunks for real-time display. |
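The difference between accumulating a stream and iterating over it can be sketched without the SDK. The example below is illustrative only: `FakeChunk` and `fake_ask_stream_iter` are hypothetical stand-ins for `StreamChunk` and `ask_stream_iter`, and the `text` field is an assumption, since the chunk's real attributes are not listed on this page.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class FakeChunk:
    """Stand-in for StreamChunk; the `text` field is an assumption."""
    text: str


async def fake_ask_stream_iter(dataset_names, query) -> AsyncIterator[FakeChunk]:
    """Stand-in for gaia.ask_stream_iter that yields three canned chunks."""
    for piece in ("Hello, ", "streamed ", "world."):
        await asyncio.sleep(0)  # yield to the event loop, as network I/O would
        yield FakeChunk(piece)


async def collect(stream, on_chunk=None) -> str:
    """Consume chunks as they arrive and return the concatenated answer."""
    parts = []
    async for chunk in stream:
        if on_chunk is not None:
            on_chunk(chunk.text)  # e.g. write to a terminal or a websocket
        parts.append(chunk.text)
    return "".join(parts)
```

Passing a callback such as `print` as `on_chunk` gives the real-time behavior of iterating; ignoring it and using only the return value corresponds to the accumulate-then-return behavior described for `ask_stream`.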
Search & Refine
| Method | Signature | Description |
|---|---|---|
| exhaustive_search | (dataset_name: str, query: str, page_size: int, pagination_token: str \| None, conversation_id: str \| None) → ExhaustiveSearchResponse | Paginated document search across a dataset. |
| refine | (query_uid: str, dataset_names: list[str], query: str, doc_ids: list[str]) → RefineResponse | Refine a previous answer using specific documents. |
| search_similar_parts | (dataset_name: str, query: str, **kwargs) → dict | Semantic chunk retrieval for similar document parts. |
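Because `exhaustive_search` is paginated, callers typically loop until no further token is returned. The following is a rough sketch of that loop, not the SDK's own helper: `FakeClient` and `FakePage` are hypothetical stand-ins so the code runs on its own, and the response field names `results` and `pagination_token` are assumptions about `ExhaustiveSearchResponse`.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakePage:
    """Stand-in for ExhaustiveSearchResponse; both field names are assumptions."""
    results: list = field(default_factory=list)
    pagination_token: Optional[str] = None


class FakeClient:
    """Stand-in for GaiaClient that serves three canned pages."""
    _pages = {
        None: FakePage(["doc-1", "doc-2"], "t1"),
        "t1": FakePage(["doc-3", "doc-4"], "t2"),
        "t2": FakePage(["doc-5"], None),
    }

    async def exhaustive_search(self, dataset_name, query, page_size,
                                pagination_token=None, conversation_id=None):
        return self._pages[pagination_token]


async def search_all(client, dataset_name, query, page_size=2):
    """Follow pagination tokens until the service stops returning one."""
    hits, token = [], None
    while True:
        page = await client.exhaustive_search(
            dataset_name, query, page_size, pagination_token=token
        )
        hits.extend(page.results)
        token = page.pagination_token
        if not token:
            return hits
```

With a real client, the same loop would take the `GaiaClient` instance in place of `FakeClient()`, once the actual response field names are confirmed.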
Feedback
| Method | Signature | Description |
|---|---|---|
| send_feedback | (query_uid: str, is_good: bool, feedback_text: str \| None) → dict | Submit thumbs-up/down feedback on a query response. |
Document Upload
| Method | Signature | Description |
|---|---|---|
| create_upload_session | () → UploadSession | Create a new upload session for grouping file uploads. |
| upload_file | (session_id: str, file_path: str \| Path, file_name: str \| None) → dict | Upload a file to an existing upload session. |
Discovery & Conversations
| Method | Signature | Description |
|---|---|---|
| get_discovery | (dataset_id: str) → dict | Get discovery results (document hierarchy) for a dataset. |
| list_conversations | () → list[dict] | List all conversations. |
| get_chat_history | (conversation_id: str) → list[dict] | Get the message history for a conversation. |
| delete_conversation | (conversation_id: str) → dict | Delete a conversation and its messages. |
LLMs & Policies
| Method | Signature | Description |
|---|---|---|
| list_llms | () → list[dict] | List registered LLMs available for queries. |
| list_sensitive_data_policies | () → list[dict] | List sensitive data handling policies. |
Data Models
The SDK uses Pydantic models for type-safe request and response handling:
| Model | Purpose |
|---|---|
| AskRequest | Request body for RAG queries |
| AskResponse | Response from non-streaming RAG queries |
| Dataset | Dataset summary (name, status) |
| DatasetDetails | Full dataset information (documents, indexing status) |
| Document | Document metadata (ID, name, source) |
| ExhaustiveSearchRequest | Request body for exhaustive search |
| ExhaustiveSearchResponse | Paginated search results |
| RefineRequest | Request body for answer refinement |
| RefineResponse | Refined answer response |
| UploadSession | Upload session metadata |
Exception Hierarchy
All SDK exceptions inherit from GaiaError:
```text
GaiaError
├── GaiaAuthError       # 401: invalid or missing API key
├── GaiaNotFoundError   # 404: dataset or resource not found
├── GaiaRateLimitError  # 429: rate limit exceeded
├── GaiaServerError     # 500+: Gaia server error
└── GaiaTimeoutError    # Request timed out
```
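The status codes in the tree suggest how HTTP responses map onto exceptions. The sketch below illustrates that mapping under stated assumptions and is not the SDK's internal code: the classes are local stand-ins that mirror the hierarchy, `error_for_status` is a hypothetical helper, and treating unlisted 4xx codes as plain `GaiaError` is a guess.

```python
class GaiaError(Exception):
    """Local stand-ins that mirror the tree above so the sketch runs without the SDK."""


class GaiaAuthError(GaiaError): ...
class GaiaNotFoundError(GaiaError): ...
class GaiaRateLimitError(GaiaError): ...
class GaiaServerError(GaiaError): ...


# Status codes taken from the hierarchy above.
_STATUS_TO_ERROR = {
    401: GaiaAuthError,
    404: GaiaNotFoundError,
    429: GaiaRateLimitError,
}


def error_for_status(status_code: int):
    """Return the exception class for an HTTP status, or None for success."""
    if status_code < 400:
        return None
    if status_code >= 500:
        return GaiaServerError
    # Assumption: unlisted 4xx codes fall back to the base class.
    return _STATUS_TO_ERROR.get(status_code, GaiaError)
```

Because every class derives from `GaiaError`, a single `except GaiaError` clause catches any of them, while more specific clauses can be listed first.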
```python
import asyncio

from gaia_sdk import GaiaAuthError, GaiaClient, GaiaRateLimitError


async def main():
    async with GaiaClient.from_env() as gaia:
        try:
            response = await gaia.ask(["my-dataset"], "What happened?")
        except GaiaAuthError:
            print("Check your GAIA_API_KEY")
        except GaiaRateLimitError:
            print("Too many requests; back off and retry")


asyncio.run(main())
```
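One way to act on the rate-limit case is to retry with exponential backoff. This is a minimal sketch rather than a feature of the SDK: `with_backoff` is a hypothetical helper, and the two exception classes are local stand-ins so the snippet runs on its own. In real code they would be imported from `gaia_sdk`.

```python
import asyncio
import random


class GaiaError(Exception):
    """Local stand-in for gaia_sdk.GaiaError so the sketch runs on its own."""


class GaiaRateLimitError(GaiaError):
    """Local stand-in for gaia_sdk.GaiaRateLimitError (HTTP 429)."""


async def with_backoff(call, max_attempts=4, base_delay=0.5):
    """Await call() and retry on rate limiting with exponential backoff.

    `call` is a zero-argument function returning a fresh awaitable, for
    example `lambda: gaia.ask(["my-dataset"], "What happened?")`.
    """
    for attempt in range(max_attempts):
        try:
            return await call()
        except GaiaRateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt; jitter spreads out concurrent retries.
            delay = base_delay * (2 ** attempt)
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
```

The helper takes a function rather than a coroutine object because a coroutine can only be awaited once; each retry needs a fresh call.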
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| GAIA_API_KEY | Yes | (none) | Your Cohesity Gaia API key |
| GAIA_BASE_URL | No | https://helios.cohesity.com/v2/mcm/gaia | Gaia API base URL |
| GAIA_VERIFY_SSL | No | true | Enable SSL certificate verification |
| GAIA_SECURITY_CTX | No | (none) | Security context for multi-tenant operations |
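The defaults in this table can be pictured as a small resolution step. The following is a sketch of that behavior, not the SDK's actual implementation: `GaiaSettings` and `settings_from_env` are hypothetical names, and the rule for which strings disable SSL verification is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://helios.cohesity.com/v2/mcm/gaia"


@dataclass(frozen=True)
class GaiaSettings:
    """Hypothetical container for the values in the table above."""
    api_key: str
    base_url: str
    verify_ssl: bool
    security_context: Optional[str]


def settings_from_env(env: Mapping[str, str] = os.environ) -> GaiaSettings:
    """Resolve settings from environment variables, applying the documented defaults."""
    api_key = env.get("GAIA_API_KEY")
    if not api_key:
        # GAIA_API_KEY is the only required variable.
        raise ValueError("GAIA_API_KEY is required")
    # Assumption: "false", "0", and "no" (any case) disable verification.
    raw_verify = env.get("GAIA_VERIFY_SSL", "true").strip().lower()
    return GaiaSettings(
        api_key=api_key,
        base_url=env.get("GAIA_BASE_URL", DEFAULT_BASE_URL),
        verify_ssl=raw_verify not in {"false", "0", "no"},
        security_context=env.get("GAIA_SECURITY_CTX"),
    )
```

Accepting a mapping instead of reading `os.environ` directly keeps the function easy to test with a plain dictionary.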
Detailed Documentation
For the full SDK README with additional examples and contribution guidelines:
Next Steps
- Your First API Call: Use the SDK to make your first Gaia query.
- Building Your App: Integrate the SDK into a FastAPI application.
- Streaming Responses: Use `ask_stream_iter` for real-time token delivery.