This guide walks through everything you need to call POST /api/stealthify/agent reliably from your own backend. It assumes you’ve read the overview and picked a use case.
If you’re migrating from the old business: true flag on /api/stealthify, this is the integration you’re moving to.
TL;DR
Get an API token
Activate API access at stealthgpt.ai/stealthapi and add a payment method.
Build a request body for one preset
Pick academic, seo, or social. Include prompt. For social, include platform. Optionally enable enableFactCheck and enableImageGeneration.
POST with a 10-minute timeout
Send to https://stealthgpt.ai/api/stealthify/agent with api-token and Content-Type: application/json headers. Configure your HTTP client for at least 600s.
Handle the response
On 200, persist runId, documentId, result, and the billing fields. On 4xx/5xx, branch on status (see Error handling below).
Step 1: Pick a preset and build the request
The endpoint accepts a discriminated union on preset. Unknown fields are rejected by the schema, so don’t send extras.
- academic
- seo
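As a rough illustration, a request body for each preset could be assembled with a small helper like the one below. It uses only the field names mentioned in this guide (preset, prompt, platform, enableFactCheck, enableImageGeneration). The helper name build_agent_body, the rule that platform is rejected outside the social preset, and any platform value you pass in are assumptions for the sketch; check the endpoint reference for the exact accepted values.

```python
import json

PRESETS = ("academic", "seo", "social")


def build_agent_body(preset, prompt, platform=None,
                     enable_fact_check=False, enable_image_generation=False):
    """Build a request body using only the fields this guide names."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    if not prompt:
        raise ValueError("prompt is required")

    body = {"preset": preset, "prompt": prompt}

    if preset == "social":
        if not platform:
            raise ValueError("the social preset needs a platform")
        body["platform"] = platform
    elif platform is not None:
        # Assumption: platform belongs only to the social variant, and the
        # strict schema would reject it on the other presets.
        raise ValueError("platform only applies to the social preset")

    # Assumption: leaving an optional flag out is the same as disabling it.
    if enable_fact_check:
        body["enableFactCheck"] = True
    if enable_image_generation:
        body["enableImageGeneration"] = True
    return body


if __name__ == "__main__":
    example = build_agent_body(
        "academic",
        "Write a 1,200-word essay on tidal energy.",
        enable_fact_check=True,
    )
    print(json.dumps(example, indent=2))
```

Building the body through one function keeps stray keys out of the payload, which matters because the schema rejects unknown fields.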
Step 2: Configure your HTTP client
Stealth Agent runs are multi-step and can take several minutes. The server allows up to 600 seconds per request, so your client must allow at least that long.
Step 3: Run it from a background worker
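For reference, a plain synchronous call that honors the 600-second limit from Step 2 could look like the sketch below. It uses only the Python standard library. The function names build_request and call_agent are illustrative, and the assumption that error responses carry a JSON body is based on the message and info fields mentioned under Error handling.

```python
import json
import urllib.error
import urllib.request

AGENT_URL = "https://stealthgpt.ai/api/stealthify/agent"
TIMEOUT_SECONDS = 600  # the server allows up to 600 seconds per request


def build_request(api_token, body):
    """Create the POST request with the two headers this guide requires."""
    return urllib.request.Request(
        AGENT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"api-token": api_token, "Content-Type": "application/json"},
        method="POST",
    )


def call_agent(api_token, body, timeout=TIMEOUT_SECONDS):
    """Return (status, payload). Network errors and timeouts propagate."""
    request = build_request(api_token, body)
    try:
        # urllib applies the timeout to each blocking socket operation, so
        # 600 here means the client waits at least as long as the server may.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        raw = err.read().decode("utf-8", errors="replace")
        try:
            # Assumption: error bodies are JSON with message (and maybe info).
            return err.code, json.loads(raw)
        except json.JSONDecodeError:
            return err.code, {"message": raw}
```

If you use a different HTTP client, the same idea applies: set the read timeout to 600 seconds or more, and check any proxy or gateway in front of your worker for shorter limits.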
A 5–10 minute synchronous request is fine for a CLI or a one-off script, but it’s a poor fit for an HTTP-facing API or a UI. Treat Stealth Agent calls as background jobs:
Accept a generation request synchronously
Validate inputs, persist a job row, and return a job id immediately to the caller.
Process the job in a worker
Have a worker (Trigger.dev, a cron-driven queue, BullMQ, etc.) pick up the job and call POST /api/stealthify/agent with a 600s timeout.
Persist the response and notify
On 200, store runId, documentId, result, and the billing fields. Notify the user via webhook, email, or websocket.
Don't auto-retry on success-shaped errors
A 200 already charged credits, so never retry it. Only retry on 5xx or 499 (cancelled). See Idempotency and retries.
Step 4: Preflight balance check (optional but recommended)
Stealth Agent can return 402 Payment Required if the account has no prepaid balance and pay-as-you-go is not enabled. To give a better user experience, call GET /api/stealthify/balance before queuing a long run. The preflight check is only a hint: the balance can change before the run starts, so still handle a 402 response from the agent endpoint itself.
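A preflight check could be sketched as follows. This guide does not show the fields of the balance response, so the sketch leaves that decision to a caller-supplied predicate (has_funds) instead of guessing field names. The host, the api-token header on this route, and the 30-second timeout are also assumptions.

```python
import json
import urllib.request

# Assumption: the balance route lives on the same host as the agent endpoint
# and authenticates with the same api-token header.
BALANCE_URL = "https://stealthgpt.ai/api/stealthify/balance"


def build_balance_request(api_token):
    """Create the GET request for the balance route."""
    return urllib.request.Request(
        BALANCE_URL,
        headers={"api-token": api_token},
        method="GET",
    )


def fetch_balance(api_token, timeout=30):
    """Fetch and decode the balance payload. Raises on HTTP or network errors."""
    request = build_balance_request(api_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def preflight_ok(api_token, has_funds, fetch=fetch_balance):
    """Return True when the has_funds predicate accepts the balance payload.

    has_funds is supplied by the caller because the response shape is not
    documented in this guide. fetch can be swapped out in tests.
    """
    return bool(has_funds(fetch(api_token)))
```

Queue the job only when preflight_ok returns True, and treat a False result as a prompt to top up rather than as a hard failure.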
Step 5: Read the response
A successful response is a single JSON payload: there is no polling and no runId-based status endpoint to call afterwards.
- result: the markdown to render or save.
- runId: useful when reporting issues to support.
- documentId: the persisted document on StealthGPT’s side; pair it with your own row.
- creditsSpent / billingMode / meteredChargedCredits: for usage dashboards and reconciliation.
outputWords is the canonical word count used for billing (creditsSpent === ceil(outputWords × 10)); use it directly instead of recounting result on your side.
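To make the billing rule concrete, here is a minimal sketch that maps the response fields named above onto a record and checks creditsSpent against ceil(outputWords × 10). The class and function names are illustrative, the types of billingMode and meteredChargedCredits are left open because this guide does not specify them, and any sample values you feed in are your own.

```python
import math
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class AgentRun:
    run_id: str
    document_id: str
    result: str
    output_words: int
    credits_spent: int
    billing_mode: Any             # type not specified in this guide
    metered_charged_credits: Any  # type not specified in this guide


def expected_credits(output_words):
    """Apply the documented rule: ceil(outputWords x 10)."""
    return math.ceil(output_words * 10)


def parse_run(payload):
    """Pick the documented fields out of a decoded 200 response."""
    return AgentRun(
        run_id=payload["runId"],
        document_id=payload["documentId"],
        result=payload["result"],
        output_words=payload["outputWords"],
        credits_spent=payload["creditsSpent"],
        billing_mode=payload.get("billingMode"),
        metered_charged_credits=payload.get("meteredChargedCredits"),
    )


def credits_reconcile(run):
    """True when the charge matches the per-word rule."""
    return run.credits_spent == expected_credits(run.output_words)
```

Under this rule a run with outputWords of 1,200 costs 12,000 credits. A mismatch in credits_reconcile is worth logging alongside runId so support can trace it.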
Error handling
Stealth Agent returns standard HTTP status codes. Branch on response.status first, then read message and (when present) info:
| Status | Meaning | Recommended action |
|---|---|---|
| 400 | Invalid JSON or schema violation. info lists the failing fields. | Don’t retry. Fix the request body and resubmit. |
| 401 | Missing api-token header or token not found. | Don’t retry. Surface to the user that the integration is misconfigured. |
| 402 | Insufficient credits, payg not available, payment method missing, or billing cap delay. | If message mentions a cap delay, retry after a few seconds. Otherwise prompt the user to top up or accept payg terms. |
| 499 | Request was aborted (your client disconnected). | No charge. Safe to retry. |
| 500 | Server error. | Retry with exponential backoff (e.g. 30s → 2m → 10m). After 3 attempts, escalate. |
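The table above can be turned into a small decision function. This is a sketch, not the official client logic: the Action names are made up for illustration, and detecting the cap-delay case by looking for the word "cap" in message is an assumption, since the exact wording of that message is not given here.

```python
from enum import Enum


class Action(Enum):
    PERSIST = "persist"                        # 200: store the result
    DO_NOT_RETRY = "do_not_retry"              # 400, 401, anything unexpected
    RETRY_SHORTLY = "retry_shortly"            # 402 caused by a billing cap delay
    PROMPT_BILLING = "prompt_billing"          # 402 for any other reason
    RETRY_WITH_BACKOFF = "retry_with_backoff"  # 499 and 5xx


def next_action(status, message=""):
    """Map a response status (and message) to the recommended action."""
    if status == 200:
        return Action.PERSIST
    if status == 402:
        # Assumption: a cap-delay message contains the word "cap".
        if "cap" in (message or "").lower():
            return Action.RETRY_SHORTLY
        return Action.PROMPT_BILLING
    if status == 499 or 500 <= status <= 599:
        return Action.RETRY_WITH_BACKOFF
    # 400 and 401 need a fix on your side; unknown codes are treated the same
    # way so that nothing is retried by accident.
    return Action.DO_NOT_RETRY
```

Defaulting unknown statuses to DO_NOT_RETRY is a deliberately conservative choice: a retry that turns out to be unnecessary can cost credits, while a missed retry only costs time.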
Idempotency and retries
Each 200 response creates a new run, persists a new document, and charges credits again. There is no idempotency key, so you must protect against double-charging from your side:
Retry only on 499 / 5xx
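Those statuses confirm no charge happened. Anything else (200, 4xx) is final, with the one exception noted in the error table: a 402 caused by a billing cap delay can be retried after a few seconds.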
Deduplicate at the job layer
If you accept generation jobs over HTTP, dedupe by your own request id (e.g. Idempotency-Key) before enqueuing.
Cap retry attempts
Three attempts with exponential backoff is plenty. Beyond that, escalate to manual review — you may be hitting a real outage.
Never retry inside `try { await agent(...) } catch`
A timeout you observe locally does not mean the server failed. The run may still complete and charge. Always check the status code, never raw network errors alone.
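Taken together, these rules suggest a worker loop along the following lines. It is a minimal in-memory sketch: call_agent and save_result are hypothetical callables you would supply, the set-based dedupe stands in for a unique constraint in your own database, and reading "three attempts" as three calls in total (with waits of 30 seconds and 2 minutes between them) is an interpretation of the backoff guidance.

```python
import time

MAX_ATTEMPTS = 3
# First two steps of the 30s -> 2m -> 10m schedule; with three attempts in
# total there are only two waits.
BACKOFF_SECONDS = (30, 120)


def is_retryable(status):
    """Only 499 and 5xx confirm that no charge happened."""
    return status == 499 or 500 <= status <= 599


def enqueue_once(queue, seen_keys, idempotency_key, job):
    """Enqueue a job unless its key was already accepted."""
    if idempotency_key in seen_keys:
        return False
    seen_keys.add(idempotency_key)
    queue.append(job)
    return True


def run_job(job, call_agent, save_result, sleep=time.sleep):
    """Run one job. call_agent(job) must return (status, payload).

    Exceptions from call_agent (timeouts, dropped connections) are not
    caught here: a local network error does not prove the run failed, so the
    job should go to review rather than be retried blindly.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        status, payload = call_agent(job)
        if status == 200:
            save_result(job, payload)  # persist runId etc. before anything else
            return "done"
        if not is_retryable(status):
            return "failed"
        if attempt < MAX_ATTEMPTS:
            sleep(BACKOFF_SECONDS[attempt - 1])
    return "needs_review"
```

The sleep parameter is injectable so the loop can be tested without waiting. In a real queue you would usually reschedule the job with a delay instead of sleeping inside the worker.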
Cost control checklist
Stealth Agent charges ceil(outputWords × 10) per successful run. To keep spend predictable:
Constrain output length in the prompt
Include a target word range ("Write a 1,200-word essay…"). The agent respects length hints; longer outputs cost proportionally more.
Don't auto-retry 200 responses
Each retry is a fresh charge. Persist runId immediately on success and dedupe in your own database.
Skip optional steps you don't need
enableFactCheck and enableImageGeneration add latency. They don’t change the per-word price, but turning them off makes the run faster, which is useful if you’re rate-limited or paying for compute on your worker side.
Migrating from /api/stealthify business mode
If your current code calls /api/stealthify with business: true, work through the following changes:
Pick the right preset for your prompt
Citations or scholarly tone → academic. Public-facing blog content → seo. LinkedIn or Medium → social.
Drop fields the agent doesn't use
tone, mode, rephrase, business, detector, qualityMode, outputFormat are not part of the agent schema and will fail the strict validator. Send only the documented fields.
Raise your client timeout
Replace 30–60s defaults with 600s. The legacy endpoint typically returned in seconds; the agent can take minutes.
Switch from `wordsSpent` to `outputWords` + `creditsSpent`
The agent does not return wordsSpent/tokensSpent. Use outputWords for the word count and creditsSpent for the actual charge.
Move the call into a background worker
Synchronous multi-minute calls block UI threads and request handlers. Queue them.
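The request-body part of this migration could be sketched as a small conversion helper. It assumes the legacy body carries a prompt key, which this guide does not state, and it takes an allow-list approach: everything except the agent fields named in this guide is dropped and reported. The helper name migrate_body is illustrative.

```python
# Fields this guide names for the agent endpoint.
AGENT_FIELDS = {"preset", "prompt", "platform",
                "enableFactCheck", "enableImageGeneration"}


def migrate_body(legacy_body, preset, platform=None):
    """Return (agent_body, dropped_keys) for a legacy request body.

    Assumption: the legacy body has a "prompt" key that can be reused as-is.
    """
    dropped = sorted(key for key in legacy_body if key not in AGENT_FIELDS)
    body = {"preset": preset, "prompt": legacy_body["prompt"]}
    if preset == "social":
        if not platform:
            raise ValueError("the social preset needs a platform")
        body["platform"] = platform
    return body, dropped


if __name__ == "__main__":
    # Hypothetical legacy payload; the values are placeholders.
    legacy = {"prompt": "Draft a post about our launch.",
              "rephrase": False, "tone": "Standard", "business": True}
    new_body, removed = migrate_body(legacy, "seo")
    print(new_body)
    print("dropped:", removed)
```

Logging the dropped keys during rollout makes it easy to spot callers that still depend on legacy options such as tone or mode.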
Where to go next
Endpoint reference
Per-preset request bodies, response fields, and error codes.
Use cases
Concrete prompt patterns and recommended option combinations.
Pricing
Per-endpoint billing rules, tiers, and pay-as-you-go behavior.