SDK Reference

The gaia_sdk Python package provides a lightweight, async client for the Cohesity Gaia RAG API. It handles authentication, request serialization, error mapping, and server-sent events (SSE) streaming, so you can focus on your application logic.

Installation

Install the SDK from the local package:

Bash

pip install -e sdk/python/

Or add it to your requirements.txt:

Text Only

-e ./sdk/python

Dependencies

The SDK depends on httpx for async HTTP and pydantic for data models. Both are installed automatically.

Quick Start

Python

import asyncio
from gaia_sdk import GaiaClient

async def main():
    async with GaiaClient(api_key="your-api-key") as gaia:
        datasets = await gaia.list_datasets()
        response = await gaia.ask(
            dataset_names=[datasets[0].name],
            query="What are the key findings?",
        )
        print(response.response_string)

asyncio.run(main())

Or initialize from environment variables:

Python

async def main():
    # Reads GAIA_API_KEY and the other variables listed under Environment Variables
    async with GaiaClient.from_env() as gaia:
        ...

GaiaClient Methods

Constructor

Parameter	Type	Default	Description
`api_key`	`str | None`	`None`	Gaia API key. Falls back to `GAIA_API_KEY` env var.
`base_url`	`str | None`	`None`	API base URL. Falls back to `GAIA_BASE_URL` env var.
`timeout`	`int`	`60`	Request timeout in seconds.
`verify_ssl`	`bool`	`True`	Verify SSL certificates.
`security_context`	`str | None`	`None`	Multi-tenant security context.

Datasets

Method	Signature	Description
`list_datasets`	`(prefix: str | None = None) → list[Dataset]`	List available datasets, optionally filtered by name prefix.
`get_dataset`	`(name: str) → DatasetDetails`	Get detailed information about a specific dataset.
`create_dataset`	`(name: str, **kwargs) → dict`	Create a new dataset.
`delete_dataset`	`(name: str) → dict`	Delete a dataset.
`trigger_indexing`	`(name: str) → dict`	Trigger indexing for a dataset.
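
As a rough illustration of how these calls combine, the sketch below re-triggers indexing for every dataset whose name starts with a given prefix. It relies only on `list_datasets`, `trigger_indexing`, and the `name` field of `Dataset`. The helper `reindex_by_prefix` is hypothetical and not part of the SDK; the client is passed in so the snippet assumes nothing else about gaia_sdk.

```python
async def reindex_by_prefix(client, prefix: str) -> list[str]:
    """Trigger indexing for every dataset whose name starts with `prefix`.

    `client` is expected to be an open GaiaClient, or any object exposing
    the same `list_datasets` and `trigger_indexing` coroutines.
    Hypothetical helper, not part of the SDK.
    """
    datasets = await client.list_datasets(prefix=prefix)
    names = [dataset.name for dataset in datasets]
    # Run sequentially so a failure stops the loop at the offending dataset.
    for name in names:
        await client.trigger_indexing(name)
    return names
```

Inside an `async with GaiaClient.from_env() as gaia:` block this would be called as `await reindex_by_prefix(gaia, "hr-")`.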

RAG Queries

Method	Signature	Description
`ask`	`(dataset_names: list[str], query: str, conversation_id: str | None, llm_name: str | None) → AskResponse`	Non-streaming RAG query. Waits for the answer and returns the full response.
`ask_stream`	`(dataset_names: list[str], query: str, ...) → StreamResult`	Streaming RAG query. Accumulates and returns the complete result.
`ask_stream_iter`	`(dataset_names: list[str], query: str, ...) → AsyncIterator[StreamChunk]`	Streaming RAG query. Yields individual SSE chunks for real-time display.
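
To show how `ask_stream_iter` might be consumed, here is a minimal sketch that forwards each chunk as it arrives and also returns the joined text. The fields of `StreamChunk` are not documented on this page, so the caller supplies `get_text` to pull the text out of a chunk. Both `collect_stream` and `get_text` are hypothetical names, and the sketch assumes that calling `ask_stream_iter` returns an async iterator directly, as the signature suggests.

```python
from typing import Any, Callable


async def collect_stream(
    client,
    dataset_names: list[str],
    query: str,
    get_text: Callable[[Any], str] = str,
    on_text: Callable[[str], None] = lambda text: print(text, end="", flush=True),
) -> str:
    """Consume `ask_stream_iter`, forwarding each piece of text as it arrives.

    Hypothetical helper, not part of the SDK.
    """
    parts: list[str] = []
    # Assumption: ask_stream_iter is an async generator, so no `await` here.
    async for chunk in client.ask_stream_iter(dataset_names=dataset_names, query=query):
        text = get_text(chunk)  # caller decides how to read a StreamChunk
        on_text(text)
        parts.append(text)
    return "".join(parts)
```

When only the final answer matters, `ask_stream` already accumulates the result, so a helper like this is mainly useful for live display.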

Search & Refine

Method	Signature	Description
`exhaustive_search`	`(dataset_name: str, query: str, page_size: int, pagination_token: str | None, conversation_id: str | None) → ExhaustiveSearchResponse`	Paginated document search across a dataset.
`refine`	`(query_uid: str, dataset_names: list[str], query: str, doc_ids: list[str]) → RefineResponse`	Refine a previous answer using specific documents.
`search_similar_parts`	`(dataset_name: str, query: str, **kwargs) → dict`	Semantic chunk retrieval for similar document parts.
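
Because `exhaustive_search` is paginated, a caller usually loops until no further token is returned. The sketch below is one way to do that. It assumes the response exposes the next token as a `pagination_token` attribute, which this page does not confirm, and `iter_search_pages` and `max_pages` are hypothetical names introduced for illustration.

```python
from typing import Any, AsyncIterator


async def iter_search_pages(
    client,
    dataset_name: str,
    query: str,
    page_size: int = 50,
    max_pages: int = 100,
) -> AsyncIterator[Any]:
    """Yield each ExhaustiveSearchResponse page until the token runs out.

    Hypothetical helper, not part of the SDK.
    """
    token = None
    for _ in range(max_pages):  # hard cap guards against a token that never ends
        page = await client.exhaustive_search(
            dataset_name=dataset_name,
            query=query,
            page_size=page_size,
            pagination_token=token,
            conversation_id=None,
        )
        yield page
        # Assumption: the response carries the next token as `pagination_token`.
        token = getattr(page, "pagination_token", None)
        if not token:
            break
```

Pages are yielded whole, so the caller reads whatever result fields `ExhaustiveSearchResponse` actually provides.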

Feedback

Method	Signature	Description
`send_feedback`	`(query_uid: str, is_good: bool, feedback_text: str | None) → dict`	Submit thumbs-up/down feedback on a query response.
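
Feedback is tied to a query through its `query_uid`. The sketch below links an `ask` call to `send_feedback`, assuming `AskResponse` exposes that identifier as a `query_uid` attribute (not confirmed on this page). The helper `ask_with_feedback` and its `judge` callback are hypothetical.

```python
from typing import Callable


async def ask_with_feedback(
    client,
    dataset_names: list[str],
    query: str,
    judge: Callable[[str], bool],
):
    """Ask a question, then record thumbs-up or thumbs-down from `judge`.

    Hypothetical helper, not part of the SDK.
    """
    response = await client.ask(dataset_names=dataset_names, query=query)
    is_good = judge(response.response_string)
    # Assumption: AskResponse exposes the identifier as `query_uid`.
    await client.send_feedback(
        query_uid=response.query_uid,
        is_good=is_good,
        feedback_text=None,
    )
    return response
```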

Document Upload

Method	Signature	Description
`create_upload_session`	`() → UploadSession`	Create a new upload session for grouping file uploads.
`upload_file`	`(session_id: str, file_path: str | Path, file_name: str | None) → dict`	Upload a file to an existing upload session.
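
The two upload calls are meant to be used together: create one session, then send each file to it. A minimal sketch of that flow follows. The attribute that holds the session identifier on `UploadSession` is not documented here, so `session_id_attr` (defaulting to `"session_id"`) is an assumption to adjust. `upload_directory` is a hypothetical helper.

```python
from pathlib import Path


async def upload_directory(
    client,
    directory,
    pattern: str = "*.pdf",
    session_id_attr: str = "session_id",
) -> list[str]:
    """Upload every file in `directory` matching `pattern` in one session.

    Hypothetical helper, not part of the SDK.
    """
    session = await client.create_upload_session()
    # Assumption: the identifier lives on the attribute named by session_id_attr.
    session_id = getattr(session, session_id_attr)
    uploaded: list[str] = []
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            await client.upload_file(
                session_id=session_id,
                file_path=path,
                file_name=path.name,
            )
            uploaded.append(path.name)
    return uploaded
```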

Discovery & Conversations

Method	Signature	Description
`get_discovery`	`(dataset_id: str) → dict`	Get discovery results (document hierarchy) for a dataset.
`list_conversations`	`() → list[dict]`	List all conversations.
`get_chat_history`	`(conversation_id: str) → list[dict]`	Get the message history for a conversation.
`delete_conversation`	`(conversation_id: str) → dict`	Delete a conversation and its messages.

LLMs & Policies

Method	Signature	Description
`list_llms`	`() → list[dict]`	List registered LLMs available for queries.
`list_sensitive_data_policies`	`() → list[dict]`	List sensitive data handling policies.

Data Models

The SDK uses Pydantic models for type-safe request and response handling:

Model	Purpose
`AskRequest`	Request body for RAG queries
`AskResponse`	Response from synchronous RAG queries
`Dataset`	Dataset summary (name, status)
`DatasetDetails`	Full dataset information (documents, indexing status)
`Document`	Document metadata (ID, name, source)
`ExhaustiveSearchRequest`	Request body for exhaustive search
`ExhaustiveSearchResponse`	Paginated search results
`RefineRequest`	Request body for answer refinement
`RefineResponse`	Refined answer response
`UploadSession`	Upload session metadata

Exception Hierarchy

All SDK exceptions inherit from GaiaError:

Text Only

GaiaError
├── GaiaAuthError          # 401 — Invalid or missing API key
├── GaiaNotFoundError      # 404 — Dataset or resource not found
├── GaiaRateLimitError     # 429 — Rate limit exceeded
├── GaiaServerError        # 500+ — Gaia server error
└── GaiaTimeoutError       # Request timed out

Python

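import asyncio
from gaia_sdk import GaiaClient, GaiaAuthError, GaiaRateLimitError

async def main():
    async with GaiaClient.from_env() as gaia:
        try:
            response = await gaia.ask(["my-dataset"], "What happened?")
        except GaiaAuthError:
            # 401: the key is missing or was rejected
            print("Check your GAIA_API_KEY")
        except GaiaRateLimitError:
            # 429: slow down before sending the next request
            print("Too many requests; back off and retry")
        else:
            print(response.response_string)

asyncio.run(main())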
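
Rate-limit and timeout errors are usually transient, so callers often wrap requests in a retry loop. The sketch below shows one common pattern, exponential backoff with jitter. It is a generic helper, not something the SDK is known to provide: `retry_async` and its parameters are hypothetical, and the exception classes to retry on are passed in so the snippet has no dependency on gaia_sdk.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_async(
    call: Callable[[], Awaitable[T]],
    retry_on: tuple[type[BaseException], ...],
    attempts: int = 4,
    base_delay: float = 0.5,
) -> T:
    """Retry `call` with exponential backoff and jitter on selected exceptions.

    Hypothetical helper, not part of the SDK.
    """
    for attempt in range(attempts):
        try:
            return await call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each round (base_delay, 2x, 4x, ...) plus up to 100 ms of jitter.
            await asyncio.sleep(base_delay * 2**attempt + random.uniform(0, 0.1))
    raise RuntimeError("attempts must be at least 1")
```

With an open client this might be used as `await retry_async(lambda: gaia.ask(["my-dataset"], "What happened?"), retry_on=(GaiaRateLimitError, GaiaTimeoutError))`. `GaiaAuthError` is deliberately left out of `retry_on`, since retrying with the same bad key cannot succeed.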

Environment Variables

Variable	Required	Default	Description
`GAIA_API_KEY`	Yes	—	Your Cohesity Gaia API key
`GAIA_BASE_URL`	No	`https://helios.cohesity.com/v2/mcm/gaia`	Gaia API base URL
`GAIA_VERIFY_SSL`	No	`true`	Enable SSL certificate verification
`GAIA_SECURITY_CTX`	No	—	Security context for multi-tenant operations
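
For readers who want to see how these variables map onto the constructor parameters, here is a minimal sketch of the resolution logic. It is an illustration only: how `GaiaClient.from_env()` actually parses values (in particular which spellings of `GAIA_VERIFY_SSL` count as false) is an assumption, and `settings_from_env` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://helios.cohesity.com/v2/mcm/gaia"


def settings_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build GaiaClient keyword arguments from environment variables.

    Hypothetical helper, not part of the SDK.
    """
    env = os.environ if env is None else env
    api_key = env.get("GAIA_API_KEY")
    if not api_key:
        raise ValueError("GAIA_API_KEY is required")
    # Assumption: only these spellings disable certificate verification.
    raw_verify = env.get("GAIA_VERIFY_SSL", "true").strip().lower()
    return {
        "api_key": api_key,
        "base_url": env.get("GAIA_BASE_URL", DEFAULT_BASE_URL),
        "verify_ssl": raw_verify not in {"false", "0", "no"},
        "security_context": env.get("GAIA_SECURITY_CTX"),
    }
```

The returned keys match the constructor parameters, so the result could be passed as `GaiaClient(**settings_from_env())`.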

Detailed Documentation

For the full SDK README with additional examples and contribution guidelines:

SDK README

Next Steps

Your First API Call — Use the SDK to make your first Gaia query.
Building Your App — Integrate the SDK into a FastAPI application.
Streaming Responses — Use ask_stream_iter for real-time token delivery.