This guide walks through everything you need to call Stealth Agent reliably from your own backend. It assumes you’ve read the overview and picked a use case. If you’re migrating from the old business: true flag on /api/stealthify, this is the integration you’re moving to.

TL;DR

1. Get an API token
   Activate API access at stealthgpt.ai/stealthapi and add a payment method.

2. Build a request body for one preset
   Pick academic, seo, or social. Include prompt. For social, include platform. Optionally set enableFactCheck and enableImageGeneration.

3. POST with a 10-minute timeout
   Send to https://stealthgpt.ai/api/stealthify/agent with api-token and Content-Type: application/json headers. Configure your HTTP client for at least 600s.

4. Handle the response
   On 200, persist runId, documentId, result, and the billing fields. On 4xx/5xx, branch on status; see Error handling below.

For production integrations and automation platforms, use POST /api/stealthify/agent/runs instead. It returns immediately with a runId, supports idempotency-key, and can deliver the terminal result to webhookUrl.

Step 1: Pick a preset and build the request

The endpoint accepts a discriminated union on preset. Unknown fields are rejected by the schema, so don’t send extras.
{
  "preset": "academic",
  "prompt": "Explain how mitochondria became eukaryotic energy organelles, with citations.",
  "enableFactCheck": true,
  "enableImageGeneration": false
}
See the full schema and per-field documentation in the endpoint reference.
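As a rough illustration of the discriminated union, the helper below builds a body for one preset and refuses anything outside the fields named in this guide. `build_agent_body`, `PRESETS`, and `COMMON_FIELDS` are hypothetical names. The allowed set covers only what this page mentions, and rejecting platform on non-social presets is an assumption; the endpoint reference remains the authority on the schema.

```python
PRESETS = {"academic", "seo", "social"}
# Assumption: only the fields named in this guide; check the endpoint reference.
COMMON_FIELDS = {"preset", "prompt", "enableFactCheck", "enableImageGeneration"}


def build_agent_body(preset, prompt, platform=None, **options):
    """Build a request body for one preset, failing fast on local mistakes."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    if not prompt:
        raise ValueError("prompt is required")

    body = {"preset": preset, "prompt": prompt, **options}

    if preset == "social":
        if not platform:
            raise ValueError("platform is required for the social preset")
        body["platform"] = platform
    elif platform is not None:
        # Assumption: platform belongs to the social variant only.
        raise ValueError("platform is only sent with the social preset")

    allowed = COMMON_FIELDS | ({"platform"} if preset == "social" else set())
    extras = set(body) - allowed
    if extras:
        # Mirror the strict server-side schema by refusing unknown fields locally.
        raise ValueError(f"unexpected fields: {sorted(extras)}")
    return body
```

Catching a stray field locally is cheaper than a round trip that ends in a 400.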

Step 2: Configure your HTTP client

Stealth Agent runs are multi-step and can take several minutes. The server allows up to 600 seconds per request — your client must allow at least that long.
Most HTTP clients default to 30–60 seconds. If you don’t raise the timeout, your client will abort runs mid-pipeline and the user will see a misleading error while the server keeps working.
curl --request POST \
  --url 'https://stealthgpt.ai/api/stealthify/agent' \
  --header 'Content-Type: application/json' \
  --header 'api-token: YOUR_API_TOKEN' \
  --max-time 600 \
  --data '{
    "preset": "seo",
    "prompt": "Write a 1,800-word guide on choosing PostgreSQL vs MySQL for early-stage SaaS.",
    "enableFactCheck": true,
    "enableImageGeneration": true
  }'
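For Python callers, a comparable request can be sketched with the standard library alone. `build_sync_request` and `send_sync` are hypothetical helper names, and the sketch assumes the same URL and headers as the curl command above.

```python
import json
import urllib.request

AGENT_URL = "https://stealthgpt.ai/api/stealthify/agent"
SYNC_TIMEOUT_SECONDS = 600  # matches the server-side limit described above


def build_sync_request(body, api_token):
    """Prepare the POST without sending it, so it can be inspected or logged."""
    return urllib.request.Request(
        AGENT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"api-token": api_token, "Content-Type": "application/json"},
        method="POST",
    )


def send_sync(request, timeout=SYNC_TIMEOUT_SECONDS):
    """Send the request; the timeout applies to blocking socket operations."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so status branching belongs in an `except` clause around `send_sync`.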

Step 3: Choose sync or async delivery

A 5–10 minute synchronous request is fine for a CLI or a one-off script, but it’s a poor fit for an HTTP-facing API, UI, workflow engine, or agent integration.

Use sync for scripts

POST /api/stealthify/agent returns the final markdown directly, but your HTTP client must wait up to 10 minutes.

Use async for integrations

POST /api/stealthify/agent/runs returns 202 Accepted, lets you poll statusUrl, and can call your webhookUrl on completion.
If you use async runs, the production pattern is:
1. Accept a generation request synchronously
   Validate inputs, persist your own job row, and call POST /api/stealthify/agent/runs with an idempotency-key.

2. Store the run id
   Save runId and statusUrl from the 202 Accepted response next to your own job id.

3. Poll or receive webhook
   Poll GET /api/stealthify/agent/runs/{runId} every 5–15 seconds, or provide webhookUrl in the create request and wait for the terminal callback.

4. Persist the terminal payload
   On completed, store documentId, result, and billing fields. On failed or cancelled, store error.code and error.message.

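The polling step can be sketched as a small loop that stops on a terminal state. This is a minimal sketch under stated assumptions: `wait_for_run` is a hypothetical helper, the status is assumed to arrive under a `status` key, and the 900-second local budget is an arbitrary choice. The HTTP call is injected as `fetch_status` so the loop stays independent of any particular client library.

```python
import time

# Terminal states named in the steps above.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_run(fetch_status, interval_seconds=10, max_wait_seconds=900,
                 sleep=time.sleep):
    """Poll until the run reaches a terminal state or the local budget runs out.

    fetch_status is any zero-argument callable returning the parsed JSON of
    GET /api/stealthify/agent/runs/{runId}. The "status" key is an assumption.
    """
    waited = 0
    while True:
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATES:
            return payload
        if waited >= max_wait_seconds:
            # The run may still finish server-side; keep the job row and poll later.
            raise TimeoutError("run still in progress after local wait budget")
        sleep(interval_seconds)
        waited += interval_seconds
```

Giving up locally does not cancel the run, so the stored runId can be polled again later or reconciled when the webhook arrives.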
This pattern makes client timeouts non-disruptive, avoids tying up frontend servers, and gives workflow platforms a stable polling/callback contract.

Step 4: Check the balance before long runs

Stealth Agent can return 402 Payment Required if the account has no prepaid balance and pay-as-you-go is not enabled. To give a better user experience, call GET /api/stealthify/balance before queuing a long run:
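python
import requests

# Fetch the current balance; a short timeout is enough for this lightweight call.
balance = requests.get(
    "https://stealthgpt.ai/api/stealthify/balance",
    headers={"api-token": "YOUR_API_TOKEN"},
    timeout=10,
).json()

prepaid = balance["credits"]
# payg may be missing or null, so fall back to an empty dict before reading mode.
payg_mode = (balance.get("payg") or {}).get("mode")

# Stop early when neither prepaid balance nor active metered billing can cover the run.
if prepaid <= 0 and payg_mode != "metered_active":
    raise RuntimeError(
        "No prepaid words and metered billing is not active; "
        "have the user accept pay-as-you-go terms before queuing the run."
    )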
This is best-effort: another concurrent run can still drain the balance while your job waits in the queue. Always also handle the 402 response from the agent endpoint itself.

Step 5: Read the response

A successful response from the synchronous endpoint is a single JSON payload. Nothing remains to be polled afterwards; status polling applies only to async runs created through POST /api/stealthify/agent/runs.
{
  "runId": "0c5d…",
  "documentId": "9e1b…",
  "preset": "seo",
  "result": "# How to choose between PostgreSQL and MySQL\n\n…",
  "outputWords": 1842,
  "creditsSpent": 18420,
  "remainingCredits": 481580,
  "billingMode": "prepaid",
  "meteredChargedCredits": 0
}
What to persist on your side:
  • result — the markdown to render or save.
  • runId — useful when reporting issues to support.
  • documentId — the persisted document on StealthGPT’s side; pair it with your own row.
  • creditsSpent / billingMode / meteredChargedCredits — for usage dashboards and reconciliation.
outputWords is the canonical word count used for billing (creditsSpent === ceil(outputWords × 10)); use it directly instead of recounting result on your side.
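A reconciliation check for the billing formula can be as small as the sketch below. `expected_credits` and `billing_matches` are hypothetical helper names; the rate of 10 credits per word comes from the formula above.

```python
import math

CREDITS_PER_WORD = 10  # from creditsSpent === ceil(outputWords × 10)


def expected_credits(output_words):
    """Credits the documented formula predicts for a given word count."""
    return math.ceil(output_words * CREDITS_PER_WORD)


def billing_matches(response):
    """True when creditsSpent agrees with the documented per-word formula."""
    return response["creditsSpent"] == expected_credits(response["outputWords"])
```

For the sample response above, `expected_credits(1842)` gives 18420, which matches its creditsSpent value.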

Error handling

Stealth Agent returns standard HTTP status codes. Branch on response.status first, then read message and (when present) info:
| Status | Meaning | Recommended action |
| --- | --- | --- |
| 400 | Invalid JSON or schema violation. info lists the failing fields. | Don’t retry. Fix the request body and resubmit. |
| 401 | Missing api-token header or token not found. | Don’t retry. Surface to the user that the integration is misconfigured. |
| 402 | Insufficient credits, payg not available, payment method missing, or billing cap delay. | If message mentions a cap delay, retry after a few seconds. Otherwise prompt the user to top up or accept payg terms. |
| 499 | Request was aborted (your client disconnected). | No charge. Safe to retry. |
| 500 | Server error. | Retry with exponential backoff (e.g. 30s → 2m → 10m). After 3 attempts, escalate. |
A safe Node.js retry wrapper:
nodejs
async function runStealthAgent(body, { attempts = 3 } = {}) {
  let lastStatus
  for (let i = 0; i < attempts; i++) {
    // Network failures and the local 600s timeout reject here and propagate to the caller:
    // a timeout seen locally does not prove the server failed, so it is not retried blindly.
    const res = await fetch('https://stealthgpt.ai/api/stealthify/agent', {
      method: 'POST',
      headers: {
        'api-token': process.env.STEALTHGPT_API_TOKEN,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
      signal: AbortSignal.timeout(600_000),
    })

    if (res.ok) return await res.json()

    // Only 499 and 5xx are retried; every other status is final and thrown immediately.
    const isRetriable = res.status >= 500 || res.status === 499
    if (!isRetriable) {
      const error = await res.json().catch(() => ({}))
      throw new Error(`stealth-agent ${res.status}: ${error.message ?? ''}`)
    }
    lastStatus = res.status

    // Back off before the next attempt, but not after the last one.
    if (i < attempts - 1) {
      const backoffMs = Math.min(30_000 * 2 ** i, 600_000)
      await new Promise((resolve) => setTimeout(resolve, backoffMs))
    }
  }

  throw new Error(`stealth-agent: exhausted retries (last status ${lastStatus})`)
}
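The status table can also be expressed as a pure decision function, which is easy to unit test separately from any HTTP client. This is a sketch only: `classify_response` and its return labels are hypothetical, and matching the phrase "cap delay" in message is an assumption, since the exact wording of that message is not documented here.

```python
def classify_response(status, message=""):
    """Map an HTTP status (and optional message) to a coarse action label."""
    if 200 <= status < 300:
        return "done"
    if status == 402:
        # Assumption: a cap delay is recognisable from the message text.
        if "cap delay" in (message or "").lower():
            return "retry_soon"
        return "needs_billing"
    if status == 499 or status >= 500:
        return "retry_with_backoff"
    # 400, 401, and any other 4xx: fix the request or configuration first.
    return "do_not_retry"
```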

Idempotency and retries

For the synchronous endpoint, each 200 response creates a new run, persists a new document, and charges credits again. For async runs, send an idempotency-key when creating the run so retries of the same create request do not enqueue duplicates.

Retry only on 499 / 5xx

Those statuses confirm no charge happened. Anything else (200, 4xx) is final, except a 402 whose message mentions a billing cap delay, which can be retried after a few seconds.

Use async idempotency keys

Send idempotency-key to POST /api/stealthify/agent/runs when retries are possible.

Cap retry attempts

Three attempts with exponential backoff is plenty. Beyond that, escalate to manual review — you may be hitting a real outage.

Avoid blind sync retries

A timeout you observe locally does not mean the server failed. Prefer async runs when the caller cannot wait reliably.
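One way to keep the key stable across retries is to derive it from your own job id. The sketch below does so with a name-based UUID. `JOB_NAMESPACE`, `idempotency_key_for`, and `create_run_headers` are hypothetical names, and the assumption that the API accepts a UUID string as the key should be checked against the async runs reference.

```python
import uuid

# Hypothetical namespace; any fixed UUID private to your service works.
JOB_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "jobs.example.invalid")


def idempotency_key_for(job_id):
    """Derive the same key for every retry of the same local job."""
    return str(uuid.uuid5(JOB_NAMESPACE, str(job_id)))


def create_run_headers(api_token, job_id):
    """Headers for POST /api/stealthify/agent/runs, including the derived key."""
    return {
        "api-token": api_token,
        "Content-Type": "application/json",
        "idempotency-key": idempotency_key_for(job_id),
    }
```

Because the key is a pure function of the job id, a worker that crashes and restarts sends the same key without having to store it.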

Cost control checklist

Stealth Agent charges ceil(outputWords × 10) per successful run. To keep spend predictable:
1. Constrain output length in the prompt
   Include a target word range ("Write a 1,200-word essay…"). The agent respects length hints; longer outputs cost proportionally more.

2. Don't auto-retry 200 responses
   Each sync retry is a fresh charge. For async create retries, reuse the same idempotency-key.

3. Skip optional steps you don't need
   enableFactCheck and enableImageGeneration add latency. They don’t change the per-word price, but turning them off makes the run faster, which is useful if you’re rate-limited or paying for compute on your worker side.

4. Track usage with the response fields
   Record creditsSpent and billingMode on every job. A weekly aggregation surfaces drift before you see it on the invoice.
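The weekly aggregation could look roughly like the sketch below. `weekly_credits` is a hypothetical helper, and `finished_on` is an assumed column from your own job table rather than a response field.

```python
from collections import defaultdict
from datetime import date


def weekly_credits(jobs):
    """Sum creditsSpent per (ISO year, ISO week, billingMode).

    Each job is a dict with a `finished_on` date (hypothetical field from your
    own job table) plus the creditsSpent and billingMode response fields.
    """
    totals = defaultdict(int)
    for job in jobs:
        iso = job["finished_on"].isocalendar()  # (ISO year, ISO week, weekday)
        totals[(iso[0], iso[1], job["billingMode"])] += job["creditsSpent"]
    return dict(totals)
```

Grouping by billingMode as well as week makes it easier to see when spend shifts from prepaid balance to metered charges.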

Migrating from /api/stealthify business mode

If your current code looks like this:
python (legacy)
requests.post(
    "https://stealthgpt.ai/api/stealthify",
    headers={"api-token": TOKEN, "Content-Type": "application/json"},
    json={
        "prompt": "Write a 1,500-word essay on the causes of the French Revolution.",
        "rephrase": False,
        "tone": "PhD",
        "business": True,
    },
)
Switch it to:
python (stealth agent)
requests.post(
    "https://stealthgpt.ai/api/stealthify/agent",
    headers={"api-token": TOKEN, "Content-Type": "application/json"},
    json={
        "preset": "academic",
        "prompt": "Write a 1,500-word essay on the causes of the French Revolution.",
        "enableFactCheck": True,
    },
    timeout=600,
)
Migration checklist:
  • Map the use case to a preset: citations or scholarly tone → academic; public-facing blog content → seo; LinkedIn or Medium → social.
  • Remove legacy fields: tone, mode, rephrase, business, detector, qualityMode, and outputFormat are not part of the agent schema and will fail the strict validator. Send only the documented fields.
  • Raise client timeouts: replace 30–60s defaults with 600s. The legacy endpoint typically returned in seconds; the agent can take minutes.
  • Update billing parsing: the agent does not return wordsSpent/tokensSpent. Use outputWords for the word count and creditsSpent for the actual charge.
  • Move long calls off the request path: synchronous multi-minute calls block UI threads and request handlers. Queue them.
See the changelog entry for the full breakdown of what changed and why.
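During migration, a small adapter can strip the legacy-only fields before a body reaches the agent endpoint. This is a sketch with hypothetical names (`LEGACY_FIELDS`, `to_agent_body`). It does not guess the preset from tone or any other legacy field, so that choice stays explicit at the call site.

```python
# Legacy-only fields listed in the migration checklist above.
LEGACY_FIELDS = {"tone", "mode", "rephrase", "business", "detector",
                 "qualityMode", "outputFormat"}


def to_agent_body(legacy_body, preset, **agent_options):
    """Drop legacy-only fields and attach the explicitly chosen preset."""
    body = {k: v for k, v in legacy_body.items() if k not in LEGACY_FIELDS}
    body["preset"] = preset
    body.update(agent_options)  # e.g. enableFactCheck=True
    return body
```

Applied to the legacy payload shown earlier with `preset="academic"` and `enableFactCheck=True`, it yields the same body as the Stealth Agent example.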

Where to go next

Async runs

Create runs, poll status, receive callbacks, and use idempotency keys.

Use cases

Concrete prompt patterns and recommended option combinations.

Pricing

Per-endpoint billing rules, tiers, and pay-as-you-go behavior.