Best Practices

Get the most out of cloud agents by writing clear prompts, structuring tasks effectively, and managing costs. These practices are based on patterns observed across thousands of agent runs.

Writing Effective Prompts

The quality of a cloud agent's output depends heavily on how well you describe the task. A specific, well-scoped prompt consistently outperforms a vague one.

Be Specific About the Goal

Tell the agent exactly what you want it to produce, not just what area to look at.

| Weak Prompt | Strong Prompt |
| --- | --- |
| "Review the auth code" | "Review the changes in src/auth/ for SQL injection risks, missing input validation, and insecure token storage. Post inline comments on any issues found." |
| "Add tests" | "Generate Vitest unit tests for all exported functions in src/utils/validation.ts. Cover edge cases: empty input, null, undefined, and boundary values. Follow the test structure in src/utils/__tests__/format.test.ts." |
| "Fix the bug" | "The /api/users endpoint returns 500 when the email field contains unicode characters. Find the validation logic, fix the regex to support unicode, and add a test case." |

Provide Context

Cloud agents start with a fresh clone and no prior conversation history. Include any context that would help a new engineer understand the task (see the example after this list).

  • Mention which files or directories are relevant.
  • Reference the testing framework, coding style, or architecture patterns your project uses.
  • Explain domain-specific terminology that might not be obvious from the code.
  • If the task is part of a larger effort, describe the bigger picture.
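
A single prompt that applies all four points might look like this. The files, framework, and terminology named here are illustrative, not from any real project:

{
  "prompt": "Add integration tests for the webhook handler in src/webhooks/stripe.ts. We use Vitest and follow the arrange-act-assert structure in src/webhooks/__tests__/. In this codebase, 'reconciliation' means matching webhook events to pending rows in the orders table. This is part of a larger effort to bring the payments module to 80% test coverage."
}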

Define the Output Format

If you want the agent to produce something specific (a diff, a report, PR comments), say so explicitly.

{
  "prompt": "Analyze the codebase for unused exports. Output a markdown report listing each unused export with its file path and line number. Group by directory. Include a summary count at the top."
}
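
If you launch runs through the API rather than the dashboard, the prompt travels in the request body. A minimal sketch; the POST /v1/agents launch path here is an assumption, so verify it against the API reference:

# Assumed launch endpoint -- verify against the API reference
curl -X POST https://api.creor.ai/v1/agents \
  -H "Authorization: Bearer $CREOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Analyze the codebase for unused exports. Output a markdown report grouped by directory."
  }'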

Structuring Tasks

How you break up work into agent runs affects both quality and cost.

One Task Per Run

Each cloud agent run should have a single, clear objective. Multi-objective prompts ("review the code AND add tests AND update the docs") tend to produce lower-quality results because the agent spreads its context window across too many concerns.

| Approach | Quality | Cost |
| --- | --- | --- |
| One run: "review, test, and document" | Lower -- agent loses focus | Higher -- more tokens spent switching context |
| Three runs: review, test, document separately | Higher -- each run is focused | Similar total -- but each result is better |
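
For example, the combined prompt splits naturally into three independent request bodies (prompts are illustrative):

{ "prompt": "Review src/billing/ for correctness and security issues. Post inline comments." }
{ "prompt": "Generate Vitest unit tests for the exported functions in src/billing/." }
{ "prompt": "Update docs/billing.md to match the current public API of src/billing/." }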

Scope Appropriately

  • Small scope: "Add error handling to the createUser function" -- fast, cheap, precise.
  • Medium scope: "Add input validation to all API routes in src/routes/" -- good balance of breadth and quality.
  • Large scope: "Refactor the entire auth system to use JWTs" -- may hit timeout limits. Break into sub-tasks.

Use Follow-ups

If a run produces partial results or you want to iterate, use the follow-up API to continue the conversation without starting a new clone.

# Add a follow-up message to a running or completed agent
curl -X POST https://api.creor.ai/v1/agents/run_abc123/followup \
  -H "Authorization: Bearer $CREOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Good start, but also check the middleware functions in src/middleware/ for the same issues."
  }'

Cost Management

Cloud agent costs come from two sources: LLM token usage and container compute time. Here is how to keep costs predictable.

Choose the Right Model

| Task Type | Recommended Model | Reason |
| --- | --- | --- |
| Code review (complex) | Claude Sonnet, GPT-4o | Best reasoning for subtle bugs |
| Code review (simple) | Claude Haiku, GPT-4o-mini | Fast and cheap for straightforward checks |
| Code generation | Claude Sonnet, GPT-4o | Best code quality |
| Documentation | Claude Haiku, GPT-4o-mini | Good enough for docs, much cheaper |
| Large-scale refactoring | Claude Sonnet | Needs strong multi-file reasoning |

Reduce Token Usage

  • Use file path filters to limit which files the agent reads. A prompt scoped to "src/auth/**" reads far fewer files than one scoped to the entire repo.
  • Exclude test fixtures, generated files, and vendored code from the agent's search scope.
  • Keep prompts concise. Lengthy preambles consume tokens without improving results.
  • Set a max token budget in agent settings to cap spending per run (see the combined sketch after this list).
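
Putting these together, a scoped run might look like the sketch below. The maxTokens field matches the spending-limits example that follows; the include and exclude fields are hypothetical names for path scoping, so check the agent settings reference for the actual fields:

# Sketch only: "include"/"exclude" are hypothetical field names
{
  "prompt": "Review src/auth/ for insecure token storage.",
  "maxTokens": 50000,
  "include": ["src/auth/**"],
  "exclude": ["**/fixtures/**", "dist/**", "vendor/**"]
}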

Set Spending Limits

Configure per-run and monthly spending limits in the dashboard to prevent runaway costs.

# Set a per-run token limit via the API
{
  "maxTokens": 100000,
  "maxMinutes": 10,
  "model": "claude-haiku-3.5"
}

Tip

Start with cheaper models and shorter timeouts. Upgrade to stronger models only for tasks where the cheaper model's output is not good enough.

Cloud vs Local: When to Use Each

Cloud agents and local agents are complementary. Here is a practical guide for choosing the right one.

| Scenario | Use Cloud | Use Local |
| --- | --- | --- |
| Interactive coding session | No | Yes -- immediate feedback loop |
| Automated PR review | Yes -- triggered by events | No -- requires manual action |
| One-off refactoring of 5 files | Either works | Local is faster for small scope |
| Codebase-wide migration | Yes -- long-running, parallel | No -- blocks your editor |
| Exploring and understanding code | No | Yes -- conversational flow |
| Nightly test generation | Yes -- scheduled, unattended | No -- requires IDE open |
| Bug investigation with debugging | No -- limited shell access | Yes -- full terminal access |

Common Pitfalls

Overly broad prompts

"Review the entire codebase" will consume a lot of tokens and produce generic feedback. Scope prompts to specific files, functions, or concerns.

Missing context about project conventions

Cloud agents do not have your CREOR.md or local rules unless they are committed to the repository. Add a .creor/rules/ directory to your repo with project instructions that the agent will pick up automatically.
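
A minimal rules file might look like this; the filename and contents are illustrative, so check the rules documentation for the supported format:

# .creor/rules/project.md (illustrative)
- All tests use Vitest; follow the structure in src/utils/__tests__/.
- API routes must validate input before touching the database.
- Prefer named exports over default exports.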

Expecting interactive behavior

Cloud agents cannot ask you questions mid-run (unlike local agents). If the task might require decisions, provide all necessary information upfront or break it into smaller steps with decision points.
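
For example, rather than a prompt that would force a mid-run decision, state the decision upfront (illustrative):

{
  "prompt": "Migrate src/db/ from callbacks to async/await. Where a function is used by both the legacy CLI and the API, keep the callback version and add an async wrapper instead of changing the callers."
}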

Ignoring the timeout

The default timeout is 10 minutes. Complex tasks on large repos can exceed this. Check the estimated runtime in the dashboard before launching, and increase the timeout for heavy tasks.
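
For a heavy task, raise the limits in the run settings using the same fields shown in the spending-limits example (values are illustrative):

# Allow a long-running migration more time and budget
{
  "maxMinutes": 30,
  "maxTokens": 200000
}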