aiwithgrant
Vertex AI Docs
Intermediate

Gemini 3 Prompting Guide

Gemini 3 specific quirks and techniques on Vertex AI. Temperature gotchas, thinking modes, persona grounding, verbosity control, and multi-source synthesis.

Official Google docs →
Content sourced from official Google documentation
1. Temperature: don't touch it

This is the number one gotcha with Gemini 3. Keep temperature at the default 1.0. Lowering it, which is standard advice for every other provider, causes unexpected behavior with Gemini 3: looping, degraded reasoning, and poor performance on math tasks. This is counterintuitive if you're coming from OpenAI or Anthropic where lower temp means more consistency.

💡If you're getting inconsistent results with Gemini 3, the fix is NOT lowering temperature. Instead, improve your prompt specificity and use few-shot examples.
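A minimal sketch of what this looks like in a Vertex AI generateContent request body. The model prompt is a placeholder; `generationConfig.temperature` is the standard Vertex AI field — the point is simply to leave it at 1.0 (or omit it entirely) rather than lowering it.

```python
import json

# Sketch of a Vertex AI generateContent request body.
# Leave temperature at the default 1.0 for Gemini 3 -- do NOT lower it.
request_body = {
    "contents": [
        {"role": "user", "parts": [{"text": "Summarize the report below..."}]}
    ],
    "generationConfig": {
        "temperature": 1.0,  # Gemini 3: keep the default
        # No sampling tweaks for consistency -- fix the prompt instead.
    },
}
print(json.dumps(request_body["generationConfig"]))
```

If consistency is the goal, tighten the prompt and add few-shot examples instead of reaching for the temperature knob.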
2. Thinking modes and latency

Gemini 3 has a thinking mode that lets it reason internally. Set thinking level to LOW and use directives like 'think silently' when speed matters more than deep reasoning. For production applications, this can significantly reduce response latency without sacrificing basic quality.

💡Use HIGH thinking for complex analysis, coding, and math. Use LOW thinking for classification, extraction, and simple Q&A. Match the thinking budget to the task complexity.
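One way to encode that rule of thumb is a small routing helper that picks a thinking level per task type. The field names (`thinkingConfig.thinkingLevel`) and the exact accepted values are assumptions here — check the current Vertex AI reference before using them in a real request.

```python
# Hypothetical sketch: route cheap tasks to LOW thinking, hard ones to HIGH.
# Field names and values are assumptions -- verify against the Vertex AI docs.
def thinking_config(task: str) -> dict:
    low_tasks = {"classification", "extraction", "simple_qa"}
    level = "LOW" if task in low_tasks else "HIGH"
    return {"thinkingConfig": {"thinkingLevel": level}}

print(thinking_config("extraction"))  # low-latency path
print(thinking_config("coding"))      # deep-reasoning path
```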
3. Deduction vs. outside knowledge

When you need Gemini to work strictly from provided context, broad restrictions like 'do not infer' don't work well. Instead, be specific: 'perform calculations and logical deductions based strictly on the provided text' while avoiding external information. The key insight: you WANT the model to deduce and reason, you just don't want it pulling in training data.

💡The difference matters: deduction from your data is good, inference from training data is hallucination. Frame your constraints to allow the former and block the latter.
Grounding in context

Too broad: "Answer using the text. Do not infer or deduce anything."

Better: "Perform calculations and logical deductions based strictly on the provided text. Do not incorporate any information from your training data. If the answer cannot be determined from the provided text alone, say 'insufficient information.'"

The first prompt blocks all reasoning, including valid deductions. The second allows logical reasoning from your data while blocking outside knowledge.
4. Prompt structure matters more

Structure prompts with context first, then main instructions, ending with negative and formatting constraints. Placing critical restrictions at the end prevents Gemini 3 from dropping them. This is because of recency bias: the model pays more attention to what comes last. For complex prompts, this ordering is the difference between hitting 70% and 95% of your requirements.

💡Put your 'must not' constraints at the very end of the prompt. They're less likely to be ignored there.
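The ordering rule is easy to enforce mechanically. Here's a minimal sketch (helper name and tags are illustrative) that assembles a prompt as context first, instructions second, constraints last:

```python
def build_prompt(context: str, instructions: str, constraints: list[str]) -> str:
    # Context first, instructions next, constraints last --
    # recency bias means the model weighs the end of the prompt most.
    constraint_block = "\n".join(f"- {c}" for c in constraints)
    return (
        f"<context>\n{context}\n</context>\n\n"
        f"{instructions}\n\n"
        f"Constraints (follow all of these):\n{constraint_block}"
    )

prompt = build_prompt(
    context="Q3 revenue was $4.2M, up 12% quarter over quarter...",
    instructions="Summarize the quarter for the board.",
    constraints=["Do not mention competitors.", "Output exactly 3 bullet points."],
)
```

Keeping assembly in one function also guarantees the 'must not' constraints never drift away from the end as the prompt grows.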
5. Personas: powerful but dangerous

Gemini 3 takes personas seriously. Really seriously. It may prioritize staying in character over following your other instructions if they conflict. Always review persona assignments against your other requirements. Also, the model may reference training data over your provided context when in a persona. State explicitly: 'the provided context is the only source of truth for the current session.'

💡Test your persona prompts with edge cases where the persona's 'natural behavior' might conflict with your instructions. Catch these conflicts before production.
Persona grounding
[System] You are a Harvard economics professor. The provided context is the only source of truth for this session. Do not reference any external data or examples from your training.

Analyze this company's financials using ONLY the data below:
<data>{{DATA}}</data>
Without grounding, the 'professor' persona might reference real companies or academic papers from training data instead of focusing on your actual data.
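In a Vertex AI request, the persona plus its grounding clause belongs in the system instruction so it governs the whole session, separate from the user turn that carries the data. A sketch (the `systemInstruction` field is the standard Vertex AI one; the persona text and helper name are illustrative):

```python
def persona_request(persona: str, data: str) -> dict:
    # Grounding clause travels with the persona in the system instruction,
    # so the model can't "stay in character" by reaching into training data.
    grounding = (
        "The provided context is the only source of truth for this session. "
        "Do not reference any external data or examples from your training."
    )
    return {
        "systemInstruction": {"parts": [{"text": f"{persona} {grounding}"}]},
        "contents": [{
            "role": "user",
            "parts": [{"text": (
                "Analyze this company's financials using ONLY the data below:\n"
                f"<data>{data}</data>"
            )}],
        }],
    }
```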
6. Multi-source synthesis

When working with multiple documents, place specific questions after the full context. Use anchoring phrases like 'Based on the entire document above...' or 'Drawing from all provided sources...' to ensure Gemini processes everything comprehensively instead of latching onto the first relevant section.

💡For long multi-document contexts, explicitly ask Gemini to cite which source each claim comes from. This forces thorough reading.
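Putting those pieces together — all sources first, each in a tagged block, then the anchored question and a citation requirement at the end — might look like this (tag names and helper are illustrative):

```python
def synthesis_prompt(sources: dict[str, str], question: str) -> str:
    # All documents first, each wrapped in an id-tagged block so the
    # model can cite them; the question goes last with an anchoring phrase.
    blocks = "\n\n".join(
        f'<source id="{name}">\n{text}\n</source>' for name, text in sources.items()
    )
    return (
        f"{blocks}\n\n"
        f"Drawing from all provided sources: {question}\n"
        f"Cite the source id for each claim."
    )
```

The per-source ids make the citation requirement checkable: you can verify the answer references every source rather than just the first relevant one.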
7. Controlling verbosity

Gemini 3 defaults to concise, direct answers. This is different from older models that tended to be verbose. If you want conversational, explanatory, or detailed responses, you need to ask for it explicitly: 'explain this as a friendly, talkative assistant' or 'provide a detailed explanation with examples.'

💡For most production use cases, the default conciseness is actually what you want. Only override it for user-facing conversational interfaces.

Key topics covered

Gemini 3 features
Thinking mode
Grounding
Persona control
Context handling
Verbosity tuning