Infrintia Python SDK

The Infrintia SDK provides ComputeClient and AsyncComputeClient for interacting with the Infrintia GPU compute marketplace from Python.

Installation

pip install infrintia

ComputeClient

Synchronous client for the Infrintia API.

Initialization

from infrintia import ComputeClient

client = ComputeClient(
    base_url="https://api.infrintia.crossgl.net",
    username="alice",
    password="secure-password-123"
)

Parameter	Type	Required	Description
`base_url`	str	Yes	Infrintia API base URL
`username`	str	No	Username for authentication; used together with `password`
`password`	str	No	Password for authentication; used together with `username`
`api_key`	str	No	API key; an alternative to `username`/`password`
`timeout`	int	No	Request timeout in seconds (default: `60`)
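
Hardcoding credentials as in the snippet above is fine for a quick test, but they are usually read from the environment instead. The following is a minimal sketch of that idea. The `INFRINTIA_*` variable names and the `client_kwargs_from_env` helper are hypothetical and not part of the SDK, and the sketch assumes that passing `api_key` alone is enough to authenticate.

```python
import os


def client_kwargs_from_env(env=None):
    """Build ComputeClient keyword arguments from environment variables.

    The INFRINTIA_* names are hypothetical; the SDK does not define them.
    """
    env = os.environ if env is None else env
    kwargs = {"base_url": env["INFRINTIA_BASE_URL"]}

    api_key = env.get("INFRINTIA_API_KEY")
    if api_key:
        # Assumption: an API key replaces username/password entirely.
        kwargs["api_key"] = api_key
    else:
        kwargs["username"] = env["INFRINTIA_USERNAME"]
        kwargs["password"] = env["INFRINTIA_PASSWORD"]

    if "INFRINTIA_TIMEOUT" in env:
        kwargs["timeout"] = int(env["INFRINTIA_TIMEOUT"])
    return kwargs
```

With the package installed, the result can be passed straight to the constructor: `client = ComputeClient(**client_kwargs_from_env())`.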

Methods

`create_user(username, email, password)`

Create a new user account.

user = client.create_user(
    username="alice",
    email="alice@example.com",
    password="secure-password-123"
)

Returns: dict with user_id, username, email, credits, created_at.

`get_me()`

Get the authenticated user's profile.

profile = client.get_me()
print(f"Credits: {profile['credits']}")

`list_models()`

List all available models on the marketplace.

models = client.list_models()
for m in models:
    print(f"{m['name']} — {m['min_price_per_token']} credits/token")

Returns: list[dict] with model metadata.
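
Because each entry carries a `min_price_per_token` field, the list can be used to pick a model by price. A small sketch follows; `cheapest_model` is a hypothetical helper, and it assumes the price field is numeric on every entry.

```python
def cheapest_model(models):
    """Return the model dict with the lowest min_price_per_token.

    Returns None when the list is empty.
    """
    return min(models, key=lambda m: m["min_price_per_token"], default=None)
```

Typical use would be `best = cheapest_model(client.list_models())`, followed by passing `best["name"]` to `run_model`.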

`list_hosts()`

List all active GPU hosts.

hosts = client.list_hosts()
for h in hosts:
    print(f"{h['host_id']}: {h['gpu']} ({h['status']})")
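
For a quick overview of the fleet, the host list can be summarised by its `status` field. This sketch makes no assumption about which status values the API returns; `hosts_by_status` is a hypothetical helper.

```python
from collections import Counter


def hosts_by_status(hosts):
    """Count hosts per value of their 'status' field."""
    return Counter(h["status"] for h in hosts)
```

For example, `hosts_by_status(client.list_hosts())` yields a mapping from each status string to the number of hosts reporting it.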

`run_model(model, input_text, max_tokens=128, temperature=0.7)`

Submit an inference job.

job = client.run_model(
    model="meta-llama/Llama-3-8B-Instruct",
    input_text="What is gravity?",
    max_tokens=256,
    temperature=0.7
)
print(f"Job {job['job_id']} — status: {job['status']}")

Returns: dict with job_id, status, model, credits_reserved.

`get_job(job_id)`

Get the status and details of a specific job.

job = client.get_job("job_def456")
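
When streaming is not needed, a job can be polled with `get_job` until it finishes. The sketch below assumes that `get_job` returns a dict with a `status` key, like `run_model` does, and that `"completed"` and `"failed"` are terminal statuses. Only `"completed"` appears elsewhere in this document, so `"failed"` is a guess and can be overridden. `wait_for_job` is a hypothetical helper.

```python
import time


def wait_for_job(client, job_id, poll_interval=2.0, timeout=300.0,
                 terminal_statuses=("completed", "failed")):
    """Poll get_job until the job reaches a terminal status.

    Raises TimeoutError if the job is still running after `timeout` seconds.
    The terminal status names are assumptions, not documented values.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = client.get_job(job_id)
        if job["status"] in terminal_statuses:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job {job_id} still {job['status']!r} after {timeout}s"
            )
        time.sleep(poll_interval)
```

Using `time.monotonic()` rather than `time.time()` keeps the deadline correct even if the system clock is adjusted while waiting.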

`list_jobs(status=None, limit=20, offset=0)`

List jobs for the authenticated user.

jobs = client.list_jobs(status="completed", limit=10)
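
The `limit` and `offset` parameters suggest ordinary offset pagination. The sketch below walks all pages under two assumptions that this document does not confirm: `list_jobs` returns a list, and a page shorter than `limit` means there are no more results. `iter_jobs` is a hypothetical helper.

```python
def iter_jobs(client, status=None, page_size=20):
    """Yield every job for the user, fetching one page at a time.

    Assumes list_jobs returns a list and that a short page marks the end.
    """
    offset = 0
    while True:
        page = client.list_jobs(status=status, limit=page_size, offset=offset)
        yield from page
        if len(page) < page_size:
            break
        offset += page_size
```

Because this is a generator, a caller that stops early (for instance after finding one matching job) does not fetch the remaining pages.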

`stream_job(job_id)`

Stream results from a running job. Yields tokens as they are generated.

for token in client.stream_job("job_def456"):
    print(token, end="", flush=True)

Yields: str — individual tokens.
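
To display tokens as they arrive and still keep the full text afterwards, the stream can be accumulated while it is consumed. A minimal sketch; `collect_stream` and its `on_token` callback are hypothetical names, not SDK features.

```python
def collect_stream(client, job_id, on_token=None):
    """Consume stream_job and return the concatenated output.

    If on_token is given, it is called with each token as it arrives.
    """
    parts = []
    for token in client.stream_job(job_id):
        if on_token is not None:
            on_token(token)
        parts.append(token)
    return "".join(parts)
```

Passing `on_token=lambda t: print(t, end="", flush=True)` reproduces the live printing from the loop above while also returning the complete string.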

`get_metrics()`

Get platform-wide metrics.

metrics = client.get_metrics()
print(f"Active hosts: {metrics['active_hosts']}")

AsyncComputeClient

Async version of ComputeClient for use with asyncio.

Initialization

from infrintia import AsyncComputeClient

client = AsyncComputeClient(
    base_url="https://api.infrintia.crossgl.net",
    username="alice",
    password="secure-password-123"
)

Async Methods

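All methods mirror ComputeClient, but each one is a coroutine that must be awaited. The exception is stream_job, which is consumed with async for rather than await: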

import asyncio
from infrintia import AsyncComputeClient

async def main():
    client = AsyncComputeClient(
        base_url="https://api.infrintia.crossgl.net",
        username="alice",
        password="secure-password-123"
    )

    job = await client.run_model(
        model="meta-llama/Llama-3-8B-Instruct",
        input_text="Write a poem about silicon.",
        max_tokens=128
    )

    async for token in client.stream_job(job["job_id"]):
        print(token, end="", flush=True)
    print()

asyncio.run(main())
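
The main benefit of the async client is submitting several jobs concurrently. The following sketch uses `asyncio.gather` for that; `run_many` is a hypothetical helper, and it assumes the client tolerates multiple in-flight requests, which this document does not state.

```python
import asyncio


async def run_many(client, model, prompts, max_tokens=128):
    """Submit one job per prompt concurrently.

    Returns the job dicts in the same order as `prompts`.
    """
    pending = [
        client.run_model(model=model, input_text=p, max_tokens=max_tokens)
        for p in prompts
    ]
    # gather preserves argument order, regardless of completion order.
    return await asyncio.gather(*pending)
```

Inside an `async def main()`, this would be called as `jobs = await run_many(client, "meta-llama/Llama-3-8B-Instruct", ["prompt one", "prompt two"])`.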

Available Async Methods

Method	Description
`await create_user(...)`	Create user account
`await get_me()`	Get authenticated user profile
`await list_models()`	List available models
`await list_hosts()`	List GPU hosts
`await run_model(...)`	Submit inference job
`await get_job(job_id)`	Get job status
`await list_jobs(...)`	List user's jobs
`async for token in stream_job(job_id)`	Stream job results
`await get_metrics()`	Get platform metrics
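
Since `stream_job` is iterated rather than awaited, putting an overall time limit on a stream takes a small wrapper. The sketch below uses `asyncio.wait_for` for this; `collect_stream_async` is a hypothetical helper, and the 60-second default simply mirrors the client's documented request timeout rather than any streaming-specific setting.

```python
import asyncio


async def collect_stream_async(client, job_id, timeout=60.0):
    """Collect all tokens from stream_job, bounded by an overall timeout.

    Raises asyncio.TimeoutError if the stream does not finish in time.
    """
    async def _collect():
        parts = []
        async for token in client.stream_job(job_id):
            parts.append(token)
        return "".join(parts)

    return await asyncio.wait_for(_collect(), timeout=timeout)
```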