Gemini 3 Prompting Guide
Gemini 3-specific quirks and techniques on Vertex AI: temperature gotchas, thinking modes, persona grounding, verbosity control, and multi-source synthesis.
Temperature: don't touch it
This is the number one gotcha with Gemini 3. Keep temperature at the default of 1.0. Lowering it, which is common advice for other providers, can cause unexpected behavior with Gemini 3: looping, degraded reasoning, and poor performance on math tasks. This is counterintuitive if you're coming from OpenAI or Anthropic, where a lower temperature means more consistent output.
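One way to enforce this in application code is a small guard that ignores temperature overrides. This is a minimal sketch: `build_generation_config` and the dict key names are hypothetical and would need to be mapped onto the config object of whatever SDK you use.

```python
import warnings

DEFAULT_TEMPERATURE = 1.0  # the Gemini 3 default this guide recommends keeping


def build_generation_config(temperature=None, **other_settings):
    """Return a plain-dict config that always keeps the default temperature.

    Key names are illustrative assumptions, not a specific SDK's schema.
    """
    config = dict(other_settings)
    if temperature is not None and temperature != DEFAULT_TEMPERATURE:
        # Surface the override instead of silently applying it.
        warnings.warn(
            f"Ignoring temperature={temperature}; "
            f"keeping {DEFAULT_TEMPERATURE} for Gemini 3."
        )
    config["temperature"] = DEFAULT_TEMPERATURE
    return config
```

A caller that passes `temperature=0.2` out of habit gets a warning and a config that still carries 1.0.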
Thinking modes and latency
Gemini 3 has a thinking mode that lets it reason internally before answering. When speed matters more than deep reasoning, set the thinking level to LOW and add a directive such as 'think silently'. In production applications, this can significantly reduce response latency while keeping quality acceptable for simple tasks.
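A sketch of how a latency-sensitive code path might apply both settings together. The `thinking_level` key and the `latency_tuned_request` helper are assumptions for illustration; check your SDK for the actual field name and accepted values.

```python
def latency_tuned_request(prompt, latency_sensitive):
    """Build a request dict; fast paths get low thinking plus a silent-thinking directive."""
    if latency_sensitive:
        return {
            "thinking_level": "low",  # assumed field name, verify against your SDK
            "prompt": f"{prompt}\n\nThink silently and answer directly.",
        }
    # Otherwise leave the thinking level at the model's default.
    return {"prompt": prompt}
```

Leaving the key out on the slow path keeps the model's default thinking behavior rather than guessing at a value.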
Deduction vs. outside knowledge
When you need Gemini to work strictly from provided context, broad restrictions like 'do not infer' don't work well. Instead, be specific: tell it to 'perform calculations and logical deductions based strictly on the provided text' and not to use external information. The key insight: you WANT the model to deduce and reason, you just don't want it pulling in training data.
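As an illustration, the specific wording can live in one constant that every grounded prompt reuses. The helper name and the "Context:" / "Question:" labels are hypothetical choices, not a required format.

```python
GROUNDING_RULE = (
    "Perform calculations and logical deductions based strictly on the "
    "provided text. Do not use external information."
)


def grounded_prompt(context, question):
    """Put the context first, then the question, then the grounding rule."""
    return f"Context:\n{context}\n\nQuestion: {question}\n\n{GROUNDING_RULE}"


prompt = grounded_prompt(
    "Widget A costs $4. Widget B costs $6.",
    "What do two of each cost in total?",
)
```

The example question requires a calculation the text never states outright, which is exactly the kind of deduction the rule is meant to allow.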
Prompt structure matters more
Structure prompts with context first, then the main instructions, and end with negative and formatting constraints. Placing critical restrictions at the end helps prevent Gemini 3 from dropping them, because of recency bias: the model pays more attention to what comes last. For complex prompts, this ordering can be the difference between hitting 70% and 95% of your requirements.
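The ordering described above can be sketched as a small prompt builder. The section labels (`CONTEXT`, `TASK`, `CONSTRAINTS`) and the function name are illustrative assumptions; only the order is taken from the guidance.

```python
def build_prompt(context, instructions, negative_constraints, format_constraints):
    """Assemble a prompt as: context, main instructions, then constraints last."""
    constraints = list(negative_constraints) + list(format_constraints)
    sections = [
        "CONTEXT:\n" + context,
        "TASK:\n" + instructions,
        # Constraints go last so they are the most recent thing the model reads.
        "CONSTRAINTS:\n" + "\n".join(f"- {c}" for c in constraints),
    ]
    return "\n\n".join(sections)


example = build_prompt(
    context="Q3 sales report text goes here.",
    instructions="Summarize the three largest revenue changes.",
    negative_constraints=["Do not speculate about Q4."],
    format_constraints=["Answer in at most three bullet points."],
)
```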
Personas: powerful but dangerous
Gemini 3 takes personas seriously. Really seriously. If staying in character conflicts with your other instructions, it may prioritize the persona. Always review persona assignments against your other requirements. The model may also favor its training data over your provided context when in a persona, so state explicitly: 'the provided context is the only source of truth for the current session.'
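A minimal sketch of a system instruction that pairs a persona with the grounding statement. The explicit priority line is an added assumption about one way to resolve conflicts, not something the guidance above prescribes, and the helper name is hypothetical.

```python
SOURCE_OF_TRUTH = (
    "The provided context is the only source of truth for the current session."
)


def persona_system_instruction(persona, requirements):
    """Combine a persona with hard requirements and the source-of-truth rule."""
    lines = [
        f"You are {persona}.",
        # Assumption: spell out which side wins when persona and rules conflict.
        "If staying in character conflicts with a requirement below, "
        "follow the requirement.",
    ]
    lines.extend(f"- {r}" for r in requirements)
    lines.append(SOURCE_OF_TRUTH)
    return "\n".join(lines)


instruction = persona_system_instruction(
    "a cheerful pirate tour guide",
    ["Answer in plain English.", "Keep answers under 100 words."],
)
```

Listing the requirements next to the persona also makes the review step easier: conflicts such as "pirate dialect" versus "plain English" are visible in one place.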
Multi-source synthesis
When working with multiple documents, place specific questions after the full context. Use anchoring phrases like 'Based on the entire document above...' or 'Drawing from all provided sources...' to ensure Gemini processes everything comprehensively instead of latching onto the first relevant section.
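For instance, a helper along these lines puts every document first and the question last, with an anchoring phrase in between. The `[Source: ...]` labels and the function name are assumptions made for the example.

```python
def synthesis_prompt(documents, question):
    """Build a multi-source prompt.

    documents: dict mapping a source label to its full text.
    """
    blocks = [f"[Source: {name}]\n{text}" for name, text in documents.items()]
    anchor = "Drawing from all provided sources above, answer the following."
    # The full context comes first; the anchor and question come last.
    return "\n\n".join(blocks + [anchor, question])


multi = synthesis_prompt(
    {"report_a": "Revenue rose 4%.", "report_b": "Costs rose 7%."},
    "Did margins improve?",
)
```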
Controlling verbosity
Gemini 3 defaults to concise, direct answers. This is different from older models, which tended to be verbose. If you want conversational, explanatory, or detailed responses, you need to ask for them explicitly: 'explain this as a friendly, talkative assistant' or 'provide a detailed explanation with examples.'
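One possible way to make the verbosity choice explicit in code is a lookup of directives keyed by style. The style names and the helper are hypothetical; the directive wording is taken from the examples above.

```python
VERBOSITY_DIRECTIVES = {
    "concise": "",  # Gemini 3's default, so no extra directive is needed
    "detailed": "Provide a detailed explanation with examples.",
    "chatty": "Explain this as a friendly, talkative assistant.",
}


def with_verbosity(prompt, style="concise"):
    """Append the directive for the requested style, if it has one."""
    directive = VERBOSITY_DIRECTIVES[style]
    return f"{prompt}\n\n{directive}" if directive else prompt
```

An unknown style raises `KeyError`, which catches typos early instead of silently falling back to the concise default.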