Deep Research
Execute autonomous multi-step research using the Gemini Deep Research API — structured prompts, async polling, follow-up queries, output formatting, and production integration patterns.
Run autonomous research tasks that plan, search, read, and synthesize information into comprehensive, cited reports. The Gemini Deep Research Agent executes dozens of search queries, reads and cross-references sources, and produces structured output — all from a single prompt.
How the Deep Research Agent Works
The Deep Research Agent operates through an asynchronous Interactions API. When you submit a query, the agent creates a research plan, executes multiple web searches (often 20-50 queries per task), reads and analyzes the results, and synthesizes everything into a cohesive report with inline citations. The entire process runs server-side — your client polls for status updates or streams progress in real time.
A single research task typically takes 2-10 minutes and consumes 250k-900k input tokens and 60k-80k output tokens. At current Gemini pricing, expect $2-5 per task depending on complexity.
Requirements
- Python 3.8+
- `httpx` (`pip install httpx`)
- `GEMINI_API_KEY` environment variable
Setup
Get a Gemini API key from Google AI Studio and export it:
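```shell
export GEMINI_API_KEY="your-api-key-here"
```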
Or create a .env file in the skill directory for persistent configuration.
Crafting Effective Research Prompts
The quality of your research output depends heavily on prompt structure. For best results, state the objective, constrain the scope, specify the output format, and define how the agent should handle missing data.
The key principle: be specific about what you want, how you want it structured, and how the agent should handle missing data. Vague prompts produce vague reports.
Prompt Engineering Examples
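As an illustration, here is a vague prompt next to a structured one that names the objective, scope, output format, and missing-data policy. The topic and figures are invented for the example:

```python
# A vague prompt vs. a structured one. The structured version names the
# objective, the scope, the output format, and the missing-data policy.
VAGUE_PROMPT = "Tell me about the EV battery market."

STRUCTURED_PROMPT = """\
Objective: Compare the top 5 EV battery manufacturers by 2023 production capacity.
Scope: Lithium-ion cells only; exclude suppliers below 10 GWh/year.
Output: A markdown table (manufacturer, capacity, key customers), followed by
a three-paragraph summary of market trends, with inline citations.
Missing data: If 2023 figures are unavailable, use the latest available year
and state it is a projection.
"""

def has_required_sections(prompt: str) -> bool:
    """Check that a prompt names all four framework elements."""
    return all(
        key in prompt
        for key in ("Objective:", "Scope:", "Output:", "Missing data:")
    )
```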
Core Usage Patterns
Start a Research Task
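A minimal sketch of submitting a task with `httpx`. The endpoint path, the `agent` and `config` field names, and the response shape are assumptions for illustration, not the documented API surface:

```python
import os

# NOTE: the endpoint path and payload fields below are illustrative
# assumptions; consult the current Gemini API reference for exact names.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"

def build_request(prompt: str) -> dict:
    """Assemble the research-task request body (assumed shape)."""
    return {
        "agent": "deep-research",
        "input": prompt,
        "config": {"output_format": "markdown"},
    }

def start_research(prompt: str) -> str:
    """Submit a task and return its interaction ID for later polling."""
    import httpx  # deferred so the payload helper works without the dependency
    resp = httpx.post(
        f"{API_BASE}/interactions",
        params={"key": os.environ["GEMINI_API_KEY"]},
        json=build_request(prompt),
        timeout=30.0,
    )
    resp.raise_for_status()
    return resp.json()["id"]

if __name__ == "__main__":
    print(start_research("Compare the top 5 EV battery manufacturers..."))
```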
Monitor and Retrieve Results
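A polling sketch with capped exponential backoff. The status endpoint and the `status` values (`completed`, `failed`) are assumptions; the backoff helper is a hypothetical utility, not part of any SDK:

```python
import os
import time

def next_delay(attempt: int, base: float = 2.0, cap: float = 30.0) -> float:
    """Exponential backoff for polling: 2s, 4s, 8s, ... capped at 30s."""
    return min(base * (2 ** attempt), cap)

def wait_for_result(interaction_id: str, timeout: float = 900.0) -> dict:
    """Poll the (assumed) status endpoint until the task finishes."""
    import httpx  # deferred so next_delay works without the dependency
    deadline = time.monotonic() + timeout
    attempt = 0
    while time.monotonic() < deadline:
        resp = httpx.get(
            f"https://generativelanguage.googleapis.com/v1beta/interactions/{interaction_id}",
            params={"key": os.environ["GEMINI_API_KEY"]},
            timeout=30.0,
        )
        resp.raise_for_status()
        data = resp.json()
        if data.get("status") in ("completed", "failed"):
            return data
        time.sleep(next_delay(attempt))
        attempt += 1
    raise TimeoutError(f"research task {interaction_id} did not finish in {timeout}s")
```

Capping the backoff at 30s keeps status updates reasonably fresh during the typical 2-10 minute run without hammering the API early on.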
Follow-Up Queries
After a research task completes, you can ask follow-up questions that reference the original research context without re-running the entire workflow:
Follow-ups are significantly cheaper than new research tasks because the agent reuses its existing context rather than executing new searches.
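A sketch of a follow-up request body. The `previous_interaction_id` field name is an assumption standing in for however the API links a follow-up to its parent task:

```python
def build_followup(previous_id: str, question: str) -> dict:
    """Follow-up request reusing the original research context (assumed fields)."""
    return {
        "agent": "deep-research",
        "previous_interaction_id": previous_id,  # ties the query to prior context
        "input": question,
    }
```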
Output Formats
The default markdown output includes inline citations linking back to source URLs. The JSON format wraps the same content in a structured envelope with metadata about token usage, sources consulted, and execution time.
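A sketch of unpacking the JSON envelope. The field names (`report`, `sources`, `usage`) are assumptions about the envelope's shape, and the sample values are invented:

```python
def summarize_envelope(envelope: dict) -> dict:
    """Pull the report body and key metadata out of the (assumed) JSON envelope."""
    usage = envelope.get("usage", {})
    return {
        "report_chars": len(envelope.get("report", "")),
        "sources": len(envelope.get("sources", [])),
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
    }

# Invented sample matching the assumed shape above.
sample = {
    "report": "# Findings\n...",
    "sources": [{"url": "https://example.com"}],
    "usage": {"input_tokens": 412000, "output_tokens": 71000},
}
```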
Multimodal Research
The Deep Research Agent supports multimodal inputs — you can attach PDFs, images, or documents to ground the research in your specific context:
Use multimodal inputs cautiously — they increase token consumption and cost significantly. Reserve them for cases where the agent genuinely needs your internal context to produce useful output.
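The snippet above can be grounded with a sketch of attaching a local file. Inlining the bytes as base64 under an `inline_data` part mirrors the general Gemini request style, but the exact field names in a Deep Research request are assumptions:

```python
import base64
from pathlib import Path

def build_multimodal_request(prompt: str, attachment: Path) -> dict:
    """Inline a local file as base64 alongside the prompt (assumed shape)."""
    mime = "application/pdf" if attachment.suffix == ".pdf" else "application/octet-stream"
    return {
        "agent": "deep-research",
        "input": [
            {"text": prompt},
            {"inline_data": {
                "mime_type": mime,
                # base64-encode the raw bytes for JSON transport
                "data": base64.b64encode(attachment.read_bytes()).decode("ascii"),
            }},
        ],
    }
```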
Cost and Performance
| Metric | Typical Range |
|---|---|
| Execution time | 2-10 minutes |
| Cost per task | $2-5 (varies by complexity) |
| Input tokens | 250k-900k |
| Output tokens | 60k-80k |
| Follow-up cost | ~30-50% of initial task |
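The follow-up ratio in the table lends itself to a simple budget guard. This is a hypothetical helper, not part of any SDK; it assumes the table's ~30-50% follow-up cost (midpoint 0.4 by default):

```python
def remaining_followups(budget_usd: float, spent_usd: float,
                        initial_task_usd: float, followup_ratio: float = 0.4) -> int:
    """How many follow-ups still fit in the budget, assuming each costs
    roughly followup_ratio times the initial task."""
    followup_cost = initial_task_usd * followup_ratio
    if followup_cost <= 0:
        return 0
    return max(0, int((budget_usd - spent_usd) // followup_cost))
```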
Production Integration Patterns
For CI/CD or automated pipelines, use the async pattern with polling:
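One way to structure that pattern, as a sketch: the `submit` and `poll` callables are injected (they would wrap the actual API calls), so the control flow itself is testable offline, and the return values mirror the documented exit codes:

```python
import time

EXIT_OK, EXIT_ERROR, EXIT_CANCELLED = 0, 1, 130  # mirrors the documented exit codes

def run_pipeline(submit, poll, timeout_s: float = 900.0, interval_s: float = 10.0) -> int:
    """Submit a research task and poll until done, honoring a hard timeout.

    submit() -> task id; poll(task_id) -> status string (assumed values
    "completed" / "failed" / anything else meaning still running).
    """
    try:
        task_id = submit()
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            status = poll(task_id)
            if status == "completed":
                return EXIT_OK
            if status == "failed":
                return EXIT_ERROR
            time.sleep(interval_s)
        return EXIT_ERROR  # timeout is treated as an error
    except KeyboardInterrupt:
        return EXIT_CANCELLED  # Ctrl+C
```

Setting `timeout_s` above the worst observed run (complex tasks can exceed 10 minutes) prevents a hung pipeline stage.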
Best Practices
- Constrain the scope: broad queries produce shallow reports. Narrow the topic, specify criteria, and define the output format.
- Handle missing data explicitly: tell the agent what to do when data is unavailable ("state it is a projection" vs. "estimate based on trends").
- Use follow-ups for iteration: refining via `--continue` is cheaper and faster than re-running the entire research task.
- Stream for interactive use: use `--stream` when a human is waiting; use `--no-wait` for automated pipelines.
- Validate citations: the agent provides sources, but verify critical claims. Hallucinated citations are rare but possible.
- Budget token usage: multimodal inputs and broad queries can push costs above $5 per task. Start with text-only prompts and add files only when necessary.
Common Pitfalls
- Submitting vague, open-ended queries that produce unfocused reports.
- Not specifying output structure, resulting in inconsistent formatting across tasks.
- Running expensive multimodal research when a text-only prompt would suffice.
- Ignoring follow-up capabilities and re-running full research for minor refinements.
- Not setting timeouts in automated pipelines — complex research can exceed 10 minutes.
Exit Codes
- `0`: Success
- `1`: Error (API error, config issue, timeout)
- `130`: Cancelled by user (Ctrl+C)