Best Practices

Get the most out of cloud agents by writing clear prompts, structuring tasks effectively, and managing costs. These practices are based on patterns observed across thousands of agent runs.

Writing Effective Prompts

The quality of a cloud agent's output depends heavily on how well you describe the task. A specific, well-scoped prompt consistently outperforms a vague one.

Be Specific About the Goal

Tell the agent exactly what you want it to produce, not just what area to look at.

| Weak Prompt | Strong Prompt |
| --- | --- |
| "Review the auth code" | "Review the changes in src/auth/ for SQL injection risks, missing input validation, and insecure token storage. Post inline comments on any issues found." |
| "Add tests" | "Generate Vitest unit tests for all exported functions in src/utils/validation.ts. Cover edge cases: empty input, null, undefined, and boundary values. Follow the test structure in src/utils/__tests__/format.test.ts." |
| "Fix the bug" | "The /api/users endpoint returns 500 when the email field contains unicode characters. Find the validation logic, fix the regex to support unicode, and add a test case." |

Provide Context

Cloud agents start with a fresh clone and no prior conversation history. Include any context that would help a new engineer understand the task (see the example after this list).

  • Mention which files or directories are relevant.
  • Reference the testing framework, coding style, or architecture patterns your project uses.
  • Explain domain-specific terminology that might not be obvious from the code.
  • If the task is part of a larger effort, describe the bigger picture.
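
A single prompt that applies all four points might look like this. The files, framework, and terminology named here are illustrative, not from any real project:

{
  "prompt": "Add integration tests for the webhook handler in src/webhooks/stripe.ts. We use Vitest and follow the arrange-act-assert structure in src/webhooks/__tests__/. In this codebase, 'reconciliation' means matching webhook events to pending rows in the orders table. This is part of a larger effort to bring the payments module to 80% test coverage."
}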

Define the Output Format

If you want the agent to produce something specific (a diff, a report, PR comments), say so explicitly.

{
  "prompt": "Analyze the codebase for unused exports. Output a markdown report listing each unused export with its file path and line number. Group by directory. Include a summary count at the top."
}
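
If you launch runs through the API rather than the dashboard, the prompt travels in the request body. A minimal sketch; the POST /v1/agents launch path here is an assumption, so verify it against the API reference:

# Assumed launch endpoint -- verify against the API reference
curl -X POST https://api.creor.ai/v1/agents \
  -H "Authorization: Bearer $CREOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Analyze the codebase for unused exports. Output a markdown report grouped by directory."
  }'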

Structuring Tasks

How you break up work into agent runs affects both quality and cost.

One Task Per Run

Each cloud agent run should have a single, clear objective. Multi-objective prompts ("review the code AND add tests AND update the docs") tend to produce lower-quality results because the agent spreads its context window across too many concerns.

| Approach | Quality | Cost |
| --- | --- | --- |
| One run: "review, test, and document" | Lower -- agent loses focus | Higher -- more tokens spent switching context |
| Three runs: review, test, document separately | Higher -- each run is focused | Similar total -- but each result is better |
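
For example, the combined prompt splits naturally into three independent request bodies (prompts are illustrative):

{ "prompt": "Review src/billing/ for correctness and security issues. Post inline comments." }
{ "prompt": "Generate Vitest unit tests for the exported functions in src/billing/." }
{ "prompt": "Update docs/billing.md to match the current public API of src/billing/." }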

Scope Appropriately

  • Small scope: "Add error handling to the createUser function" -- fast, cheap, precise.
  • Medium scope: "Add input validation to all API routes in src/routes/" -- good balance of breadth and quality.
  • Large scope: "Refactor the entire auth system to use JWTs" -- may hit timeout limits. Break into sub-tasks.

Use Follow-ups

If a run produces partial results or you want to iterate, use the follow-up API to continue the conversation without starting a new clone.

# Add a follow-up message to a running or completed agent
curl -X POST https://api.creor.ai/v1/agents/run_abc123/followup \
  -H "Authorization: Bearer $CREOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Good start, but also check the middleware functions in src/middleware/ for the same issues."
  }'

Cost Management

Cloud agent costs come from two sources: LLM token usage and container compute time. Here is how to keep costs predictable.

Choose the Right Model

| Task Type | Recommended Model | Reason |
| --- | --- | --- |
| Code review (complex) | Claude Sonnet, GPT-4o | Best reasoning for subtle bugs |
| Code review (simple) | Claude Haiku, GPT-4o-mini | Fast and cheap for straightforward checks |
| Code generation | Claude Sonnet, GPT-4o | Best code quality |
| Documentation | Claude Haiku, GPT-4o-mini | Good enough for docs, much cheaper |
| Large-scale refactoring | Claude Sonnet | Needs strong multi-file reasoning |

Reduce Token Usage

  • Use file path filters to limit which files the agent reads. A prompt scoped to "src/auth/**" reads far fewer files than one scoped to the entire repo.
  • Exclude test fixtures, generated files, and vendored code from the agent's search scope.
  • Keep prompts concise. Lengthy preambles consume tokens without improving results.
  • Set a max token budget in agent settings to cap spending per run (see the combined sketch after this list).
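
Putting these together, a scoped run might look like the sketch below. The maxTokens field matches the spending-limits example that follows; the include and exclude fields are hypothetical names for path scoping, so check the agent settings reference for the actual fields:

# Sketch only: "include"/"exclude" are hypothetical field names
{
  "prompt": "Review src/auth/ for insecure token storage.",
  "maxTokens": 50000,
  "include": ["src/auth/**"],
  "exclude": ["**/fixtures/**", "dist/**", "vendor/**"]
}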

Set Spending Limits

Configure per-run and monthly spending limits in the dashboard to prevent runaway costs.

# Set a per-run token limit via the API
{
  "maxTokens": 100000,
  "maxMinutes": 10,
  "model": "claude-haiku-3.5"
}

Tip

Start with cheaper models and shorter timeouts. Upgrade to stronger models only for tasks where the cheaper model's output is not good enough.

Cloud vs Local: When to Use Each

Cloud agents and local agents are complementary. Here is a practical guide for choosing the right one.

| Scenario | Use Cloud | Use Local |
| --- | --- | --- |
| Interactive coding session | No | Yes -- immediate feedback loop |
| Automated PR review | Yes -- triggered by events | No -- requires manual action |
| One-off refactoring of 5 files | Either works | Local is faster for small scope |
| Codebase-wide migration | Yes -- long-running, parallel | No -- blocks your editor |
| Exploring and understanding code | No | Yes -- conversational flow |
| Nightly test generation | Yes -- scheduled, unattended | No -- requires IDE open |
| Bug investigation with debugging | No -- limited shell access | Yes -- full terminal access |

Common Pitfalls

Overly broad prompts

"Review the entire codebase" will consume a lot of tokens and produce generic feedback. Scope prompts to specific files, functions, or concerns.

Missing context about project conventions

Cloud agents do not have your CREOR.md or local rules unless they are committed to the repository. Add a .creor/rules/ directory to your repo with project instructions that the agent will pick up automatically.
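
A minimal rules file might look like this; the filename and contents are illustrative, so check the rules documentation for the supported format:

# .creor/rules/project.md (illustrative)
- All tests use Vitest; follow the structure in src/utils/__tests__/.
- API routes must validate input before touching the database.
- Prefer named exports over default exports.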

Expecting interactive behavior

Cloud agents cannot ask you questions mid-run (unlike local agents). If the task might require decisions, provide all necessary information upfront or break it into smaller steps with decision points.
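
For example, rather than a prompt that would force a mid-run decision, state the decision upfront (illustrative):

{
  "prompt": "Migrate src/db/ from callbacks to async/await. Where a function is used by both the legacy CLI and the API, keep the callback version and add an async wrapper instead of changing the callers."
}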

Ignoring the timeout

The default timeout is 10 minutes. Complex tasks on large repos can exceed this. Check the estimated runtime in the dashboard before launching, and increase the timeout for heavy tasks.
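
For a heavy task, raise the limits in the run settings using the same fields shown in the spending-limits example (values are illustrative):

# Allow a long-running migration more time and budget
{
  "maxMinutes": 30,
  "maxTokens": 200000
}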