Calling Gemini API from Java — Google GenAI SDK Setup (2026 Guide)

Series: GenAI with Java | Post 2 of 21

If you are a Java developer who wants to start building AI-powered applications using Google Gemini, you are in the right place. In this tutorial, we will set up the Google GenAI Java SDK, connect to the Gemini API, and make our first AI call — all from plain Java.

No Python. No workarounds. Google now offers a first-class, production-ready Java SDK that is stable, actively maintained, and the recommended way to access Gemini models.

By the end of this post, you will have:

A working Maven project connected to the Gemini API
Code to generate text, hold a multi-turn conversation, configure generation parameters, and stream responses
A clean project structure you can reuse for every post in this series

Prerequisites

Before we begin, make sure you have the following:

Java 11 or higher installed (Java 17+ recommended)
Maven 3.6+ installed
A Google AI Studio account — sign up for free at aistudio.google.com
A Gemini API key (we will generate this in Step 1)

Step 1 — Get Your Gemini API Key

The Gemini API key is what authenticates your Java application with Google’s servers.

Go to aistudio.google.com
Click “Get API key” in the top navigation
Click “Create API key”
Copy the generated key and store it somewhere safe, such as a password manager, and treat it like a password

Important: Never hardcode your API key directly in your source code. We will use an environment variable to keep it secure.

Set the environment variable in your terminal:

# Linux / macOS
export GOOGLE_API_KEY="your-api-key-here"

# Windows (Command Prompt)
set GOOGLE_API_KEY=your-api-key-here

# Windows (PowerShell)
$env:GOOGLE_API_KEY="your-api-key-here"

The Google GenAI SDK automatically reads the GOOGLE_API_KEY environment variable, so you do not need to pass it in code.
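
Because a missing variable only shows up later as a client error, it can help to check for it at startup. The following is a minimal sketch using only the standard library, not part of the SDK: the class name ApiKeyCheck and the method findApiKey are hypothetical, and the environment is passed in as a map so the logic can be tried without touching real variables.

```java
import java.util.Map;
import java.util.Optional;

public class ApiKeyCheck {

    // Name of the environment variable set in Step 1.
    static final String KEY_VAR = "GOOGLE_API_KEY";

    // Returns the trimmed key if it is present and not blank, otherwise an empty Optional.
    static Optional<String> findApiKey(Map<String, String> env) {
        String value = env.get(KEY_VAR);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }

    public static void main(String[] args) {
        // System.getenv() returns the real process environment.
        // The key itself is never printed, only whether it was found.
        if (findApiKey(System.getenv()).isPresent()) {
            System.out.println(KEY_VAR + " is set.");
        } else {
            System.err.println(KEY_VAR + " is missing or blank; set it before running the examples.");
        }
    }
}
```

Run it once in the same terminal you will use for the later steps; a variable exported in one shell is not visible to an IDE or terminal that was already open.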

Step 2 — Create the Maven Project

Create a new Maven project. You can use your IDE or run the following from the terminal:

mvn archetype:generate \
  -DgroupId=com.gangforcode \
  -DartifactId=gemini-java-demo \
  -DarchetypeArtifactId=maven-archetype-quickstart \
  -DarchetypeVersion=1.4 \
  -DinteractiveMode=false

cd gemini-java-demo

Step 3 — Add the Google GenAI SDK Dependency

Open your pom.xml and add the following dependency. This guide uses version 1.0.0, a stable GA release; check Maven Central for newer 1.x versions before you start:

<dependencies>
  <dependency>
    <groupId>com.google.genai</groupId>
    <artifactId>google-genai</artifactId>
    <version>1.0.0</version>
  </dependency>
</dependencies>

Also set the compiler level. Java 11 is the minimum listed in the prerequisites; the snippet below uses the recommended Java 17:

<properties>
  <maven.compiler.source>17</maven.compiler.source>
  <maven.compiler.target>17</maven.compiler.target>
</properties>

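Your full pom.xml should look like this. It omits the JUnit dependency that the quickstart archetype adds, so delete the generated src/test/java/com/gangforcode/AppTest.java (or keep the JUnit dependency) before building, otherwise test compilation will fail: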

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
         http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.gangforcode</groupId>
  <artifactId>gemini-java-demo</artifactId>
  <version>1.0-SNAPSHOT</version>

  <properties>
    <maven.compiler.source>17</maven.compiler.source>
    <maven.compiler.target>17</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>com.google.genai</groupId>
      <artifactId>google-genai</artifactId>
      <version>1.0.0</version>
    </dependency>
  </dependencies>
</project>

Run the following to download the dependency:

mvn clean install

Step 4 — Your First Gemini API Call

Create a new file at src/main/java/com/gangforcode/GeminiDemo.java:

package com.gangforcode;

import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;

public class GeminiDemo {

    public static void main(String[] args) {
        // The Client automatically reads GOOGLE_API_KEY from environment
        Client client = new Client();

        GenerateContentResponse response = client.models.generateContent(
            "gemini-2.5-flash",
            "Explain what Java generics are in 3 sentences.",
            null
        );

        System.out.println(response.text());
    }
}

Run it:

mvn compile exec:java -Dexec.mainClass="com.gangforcode.GeminiDemo"

You should see a clear, concise explanation of Java generics printed to your console. That is your first successful Gemini API call from Java.

Step 5 — Understanding the Client

The Client class is the single entry point for all Gemini API interactions. It supports three initialization styles:

Option A — Environment Variable (Recommended)

// Reads GOOGLE_API_KEY automatically
Client client = new Client();

This is the cleanest approach for local development and the one used throughout this series.

Option B — Explicit API Key via Builder

Client client = Client.builder()
    .apiKey("your-api-key")
    .build();

Use this when you are managing keys programmatically (e.g., reading from a secrets manager).

Option C — Vertex AI Backend

Client client = Client.builder()
    .project("your-gcp-project")
    .location("us-central1")
    .vertexAI(true)
    .build();

Use this for enterprise deployments on Google Cloud. We will cover Vertex AI in a later post in this series.

Step 6 — Choosing the Right Gemini Model

The model name you pass to generateContent() determines which Gemini model handles your request. Here is a quick reference for the models available in 2026:

Model                    Best For                          Speed       Cost
gemini-2.5-flash         Most tasks: fast and capable      Very fast   Low
gemini-2.5-pro           Complex reasoning, long context   Moderate    Higher
gemini-3-flash-preview   Latest preview features           Fast        Low

For this tutorial series, we will default to gemini-2.5-flash for all examples. It offers an excellent balance of speed and quality for everyday development tasks.
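
Since the model is selected by a plain string, a typo only fails at request time. One way to reduce that risk is to keep the names in a single place. This is a small sketch under assumptions: the class GeminiModels, the Task enum, and the mapping itself are hypothetical illustrations of the table above, not something the SDK provides.

```java
public class GeminiModels {

    // Model IDs copied from the table above; verify them against the current model list.
    static final String FLASH = "gemini-2.5-flash";
    static final String PRO = "gemini-2.5-pro";

    // Hypothetical task categories, used only for this illustration.
    enum Task { GENERAL, COMPLEX_REASONING, LONG_CONTEXT }

    // Maps a task category to a model ID, defaulting to the fast model.
    static String modelFor(Task task) {
        switch (task) {
            case COMPLEX_REASONING:
            case LONG_CONTEXT:
                return PRO;
            default:
                return FLASH;
        }
    }

    public static void main(String[] args) {
        System.out.println(modelFor(Task.GENERAL));           // prints gemini-2.5-flash
        System.out.println(modelFor(Task.COMPLEX_REASONING)); // prints gemini-2.5-pro
    }
}
```

The returned string can be passed as the first argument to generateContent() in place of the literal model name.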

Step 7 — Configuring Generation Parameters

You can control how the model generates text using GenerateContentConfig. This lets you tune creativity, response length, and safety filters.

package com.gangforcode;

import com.google.genai.Client;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;

public class GeminiWithConfig {

    public static void main(String[] args) {
        Client client = new Client();

        // Wrap the system instruction text in a Content object for the builder
        Content systemInstruction = Content.fromParts(Part.fromText(
            "You are a helpful Java programming tutor. " +
            "Keep answers beginner-friendly."));

        GenerateContentConfig config = GenerateContentConfig.builder()
            .temperature(0.7f)       // lower = more deterministic, higher = more varied
            .maxOutputTokens(512)    // limit response length
            .topP(0.9f)              // nucleus sampling
            .systemInstruction(systemInstruction)
            .build();

        GenerateContentResponse response = client.models.generateContent(
            "gemini-2.5-flash",
            "What is the difference between an interface and an abstract class in Java?",
            config
        );

        System.out.println(response.text());
    }
}

Key parameters explained:

temperature controls randomness. Use 0.2 for factual/code tasks, 0.7-0.9 for creative writing.
maxOutputTokens sets the upper limit on response length. One token is roughly 4 characters.
systemInstruction sets a persistent role or persona for the model — this is how you make Gemini behave like a tutor, code reviewer, or customer support agent.
topP controls diversity of word choices. Leave at 0.9 for most tasks.
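
To get a feel for what a given maxOutputTokens value means in practice, the 4-characters-per-token rule of thumb can be turned into a quick estimate. This is a rough sketch only: the class TokenBudget and its methods are hypothetical, and the real token count depends on the model's tokenizer and on the language of the text.

```java
public class TokenBudget {

    // Rule of thumb from the text: one token is roughly 4 characters (an approximation).
    static final int APPROX_CHARS_PER_TOKEN = 4;

    // Approximate number of characters that fit in a given token limit.
    static int approxChars(int maxOutputTokens) {
        return maxOutputTokens * APPROX_CHARS_PER_TOKEN;
    }

    // Approximate token limit needed for a reply of the given length, rounded up.
    static int approxTokens(int characters) {
        return (characters + APPROX_CHARS_PER_TOKEN - 1) / APPROX_CHARS_PER_TOKEN;
    }

    public static void main(String[] args) {
        System.out.println(approxChars(512));   // prints 2048
        System.out.println(approxTokens(1000)); // prints 250
    }
}
```

So the maxOutputTokens(512) setting above allows a reply of roughly 2,000 characters, a few paragraphs of English text.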

Step 8 — Multi-Turn Conversations (Chat)

Real applications need back-and-forth conversations with memory. The SDK provides a Chat object for this:

package com.gangforcode;

import com.google.genai.Chat;
import com.google.genai.Client;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;

public class GeminiChat {

    public static void main(String[] args) {
        Client client = new Client();

        // Wrap the system instruction text in a Content object for the builder
        Content systemInstruction = Content.fromParts(Part.fromText(
            "You are a Java interview coach. " +
            "Ask one question at a time and give feedback."));

        GenerateContentConfig config = GenerateContentConfig.builder()
            .systemInstruction(systemInstruction)
            .temperature(0.5f)
            .build();

        // Create a chat session — conversation history is managed automatically
        Chat chat = client.chats.create("gemini-2.5-flash", config);

        // Turn 1
        GenerateContentResponse r1 = chat.sendMessage("Start the interview. Ask me a question.");
        System.out.println("Gemini: " + r1.text());

        // Turn 2 — the model remembers the previous exchange
        GenerateContentResponse r2 = chat.sendMessage(
            "A HashMap does not maintain insertion order, " +
            "but a LinkedHashMap does."
        );
        System.out.println("Gemini: " + r2.text());

        // Turn 3
        GenerateContentResponse r3 = chat.sendMessage("Ask me the next question.");
        System.out.println("Gemini: " + r3.text());
    }
}

The Chat object internally maintains a List<Content> representing the conversation history. Each call to sendMessage() appends the new turn and sends the full history to the model. You do not have to manage this yourself.
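
To make that bookkeeping concrete, here is a plain-Java model of the idea. It is a sketch of the concept, not the SDK's implementation: HistorySketch and Turn are hypothetical names, and the model is replaced by a stand-in function so the example runs without an API key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class HistorySketch {

    // One conversation turn: who spoke ("user" or "model") and what was said.
    static final class Turn {
        final String role;
        final String text;

        Turn(String role, String text) {
            this.role = role;
            this.text = text;
        }
    }

    private final List<Turn> history = new ArrayList<>();
    private final Function<List<Turn>, String> model; // stand-in for the real API call

    HistorySketch(Function<List<Turn>, String> model) {
        this.model = model;
    }

    // Appends the user turn, sends a copy of the whole history, then records the reply.
    String sendMessage(String userText) {
        history.add(new Turn("user", userText));
        String reply = model.apply(new ArrayList<>(history));
        history.add(new Turn("model", reply));
        return reply;
    }

    int turnCount() {
        return history.size();
    }

    public static void main(String[] args) {
        // Fake model that reports how many turns it received.
        HistorySketch chat = new HistorySketch(turns -> "I can see " + turns.size() + " turn(s)");
        System.out.println(chat.sendMessage("Hello"));        // prints I can see 1 turn(s)
        System.out.println(chat.sendMessage("Still there?")); // prints I can see 3 turn(s)
    }
}
```

The second call sees three turns because the first question and its answer travel along with the new message. This is also why long conversations consume more input tokens with every turn.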

Step 9 — Streaming Responses

For long responses, waiting for the entire text before displaying anything creates a poor user experience. Streaming lets you print each chunk of text as it arrives, similar to the typewriter effect in chat UIs such as ChatGPT.

package com.gangforcode;

import com.google.genai.Client;
import com.google.genai.ResponseStream;
import com.google.genai.types.GenerateContentResponse;

public class GeminiStreaming {

    public static void main(String[] args) {
        Client client = new Client();

        // generateContentStream returns a ResponseStream, which is iterable and closeable;
        // try-with-resources releases the underlying connection when the loop ends
        try (ResponseStream<GenerateContentResponse> stream = client.models.generateContentStream(
                "gemini-2.5-flash",
                "Write a detailed explanation of how the Java garbage collector works.",
                null)) {

            // Print each chunk as it arrives
            for (GenerateContentResponse chunk : stream) {
                if (chunk.text() != null) {
                    System.out.print(chunk.text());
                    System.out.flush(); // Ensure immediate output
                }
            }
        }

        System.out.println(); // New line after stream ends
    }
}

Streaming is especially useful when building REST APIs that pipe Gemini output to a frontend using Server-Sent Events (SSE). We will cover that pattern in the Spring AI post later in this series.
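
When chunks are forwarded to another consumer, you often still want the complete text afterwards, for logging or caching. The sketch below shows one way to do both in a single pass. It uses plain strings as a stand-in for the chunk texts, and the names ChunkCollector and forwardAndCollect are hypothetical.

```java
import java.util.Arrays;
import java.util.function.Consumer;

public class ChunkCollector {

    // Forwards each non-null chunk to the sink as it arrives and returns the full text.
    static String forwardAndCollect(Iterable<String> chunks, Consumer<String> sink) {
        StringBuilder full = new StringBuilder();
        for (String chunk : chunks) {
            if (chunk == null) {
                continue; // skip chunks that carry no text
            }
            sink.accept(chunk);
            full.append(chunk);
        }
        return full.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the chunk texts a real stream would produce.
        Iterable<String> fake = Arrays.asList("The garbage ", null, "collector ", "frees memory.");
        String full = forwardAndCollect(fake, System.out::print);
        System.out.println();
        System.out.println("Total characters: " + full.length());
    }
}
```

With the real SDK you would pass each chunk.text() value through the same loop; the sink could be System.out, a log, or whatever writes to the HTTP response.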

Project Structure

After following all the steps above, your project should look like this:

gemini-java-demo/
├── pom.xml
└── src/
    └── main/
        └── java/
            └── com/
                └── gangforcode/
                    ├── GeminiDemo.java          ← Basic text generation
                    ├── GeminiWithConfig.java    ← Generation parameters
                    ├── GeminiChat.java          ← Multi-turn chat
                    └── GeminiStreaming.java     ← Streaming responses

Common Errors and Fixes

IllegalArgumentException: API key must be provided
Your GOOGLE_API_KEY environment variable is not set in the process that runs your code. Double-check with echo $GOOGLE_API_KEY (Linux/macOS), echo %GOOGLE_API_KEY% (Windows Command Prompt), or echo $env:GOOGLE_API_KEY (PowerShell).

429 Resource Exhausted
You have hit the free tier rate limit. Wait a moment and retry, or enable billing on your project for higher limits.

404 Model not found
Check the model name string. Common mistake: using gemini-pro (old name) instead of gemini-2.5-flash.

NullPointerException on response.text()
The model may have returned no candidates due to safety filters. Add a null check: if (response.text() != null).
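
For the 429 case above, "wait a moment and retry" is usually automated with exponential backoff. The following is a generic sketch using only the standard library. The names RetrySketch and withBackoff are hypothetical, and it assumes the failure surfaces as an unchecked exception; it retries on any RuntimeException for brevity, whereas real code should retry only on rate-limit errors.

```java
import java.util.function.Supplier;

public class RetrySketch {

    // Runs the call, retrying on RuntimeException with a wait that doubles each time.
    // At least one attempt is always made; the last failure is rethrown when attempts run out.
    static <T> T withBackoff(Supplier<T> call, int maxAttempts, long initialDelayMillis) {
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the last failure
                }
                sleep(delay);
                delay *= 2; // double the wait before the next attempt
            }
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting to retry", ie);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake call that fails twice before succeeding, standing in for a rate-limited request.
        String result = withBackoff(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("429 Resource Exhausted (simulated)");
            }
            return "ok";
        }, 5, 100);
        System.out.println(result + " after " + calls[0] + " attempts"); // prints ok after 3 attempts
    }
}
```

In a real project the supplier would wrap the generateContent() call, and the catch block would inspect the exception to confirm it is a rate-limit error before retrying.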

What We Built

In this post we covered the complete setup of the Google GenAI Java SDK:

Getting a Gemini API key from Google AI Studio
Adding the google-genai Maven dependency
Three ways to initialize the Client
Choosing the right Gemini model for your use case
Controlling output with GenerateContentConfig
Building multi-turn conversations with the Chat API
Streaming long responses chunk by chunk

What’s Next

In Post 3, we will go beyond plain text and explore Gemini’s multimodal capabilities — sending images and PDFs alongside text prompts, all from Java.

Post 3: Streaming responses + multimodal input (text, image, PDF) with Gemini in Java
Post 4: Prompt engineering techniques with Gemini in Java

Part of the GenAI with Java series | All examples use Google Gemini models