Building Multi-Agent Systems in Java Without Leaving the JVM
You don’t need Python to build multi-agent systems.
I know that’s a contrarian take. The entire AI agent ecosystem — CrewAI, AutoGen, LangGraph — is Python-first. But if your backend is Java, your CI is Gradle or Maven, and your team thinks in terms of interfaces and generics rather than decorators and duck typing, there’s no reason to introduce a second language just for agent orchestration.
This post is a hands-on walkthrough. We’ll build four progressively more sophisticated multi-agent systems, all in Java, all running on the JVM, using AgentEnsemble.
Add the dependency to your build.gradle.kts:
```kotlin
dependencies {
    implementation("net.agentensemble:agentensemble-core:2.3.0")
}
```

Or if you're using Maven:
```xml
<dependency>
    <groupId>net.agentensemble</groupId>
    <artifactId>agentensemble-core</artifactId>
    <version>2.3.0</version>
</dependency>
```

You'll also need a LangChain4j model provider. For OpenAI:
```kotlin
implementation("dev.langchain4j:langchain4j-open-ai:1.0.0-beta4")
```

Build 1: The Research-Writer Pipeline
The classic starting point — two agents, sequential execution, one feeds into the other.
```java
// Create an LLM model
ChatLanguageModel model = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o-mini")
    .build();

// Define agents with roles and goals
Agent researcher = Agent.builder()
    .role("Senior Research Analyst")
    .goal("Find comprehensive, accurate information about {{topic}}")
    .background("Expert at synthesizing information from multiple sources")
    .build();

Agent writer = Agent.builder()
    .role("Technical Writer")
    .goal("Transform research into clear, engaging content")
    .background("Skilled at making complex topics accessible")
    .build();

// Define tasks -- the writer's task depends on the researcher's output
Task researchTask = Task.builder()
    .description("Research {{topic}} thoroughly, covering key concepts, "
        + "current state, and recent developments")
    .expectedOutput("Comprehensive research notes with sources")
    .agent(researcher)
    .build();

Task writeTask = Task.builder()
    .description("Write a well-structured article based on the research")
    .expectedOutput("A polished, publication-ready article")
    .agent(writer)
    .context(List.of(researchTask)) // <-- this creates the dependency
    .build();

// Build and run the ensemble
EnsembleOutput output = Ensemble.builder()
    .agents(researcher, writer)
    .tasks(researchTask, writeTask)
    .chatLanguageModel(model)
    .inputs(Map.of("topic", "WebAssembly beyond the browser"))
    .build()
    .run();

System.out.println(output.getRaw());
```

A few things to notice:
- `{{topic}}` is a template variable, resolved at runtime from the `inputs()` map.
- `context(List.of(researchTask))` tells the framework that `writeTask` needs the output of `researchTask`. This is how dependencies are expressed.
- No workflow declaration. The framework sees a linear dependency chain and infers sequential execution.
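The template-variable step is plain string substitution. Here's a minimal sketch of the mechanism; the `TemplateResolver` class and `resolve` method are hypothetical names for illustration, not AgentEnsemble's actual internals:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of {{variable}} resolution -- not library internals.
public class TemplateResolver {

    private static final Pattern VAR = Pattern.compile("\\{\\{(\\w+)}}");

    // Replace each {{name}} with the matching value from the inputs map;
    // unknown variables are left untouched.
    public static String resolve(String template, Map<String, String> inputs) {
        Matcher m = VAR.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = inputs.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Called with the `inputs()` map from the example above, `resolve("Research {{topic}} thoroughly", Map.of("topic", "WebAssembly beyond the browser"))` yields the fully expanded task description.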
Build 2: The Hierarchical Team
Now let's build something more interesting — a manager agent that delegates to specialist workers.
```java
ChatLanguageModel managerModel = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o")
    .build();

ChatLanguageModel workerModel = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o-mini")
    .build();

Agent marketResearcher = Agent.builder()
    .role("Market Research Specialist")
    .goal("Analyze market trends and competitive landscape")
    .llm(workerModel)
    .build();

Agent financialAnalyst = Agent.builder()
    .role("Financial Analyst")
    .goal("Analyze financial data and provide investment insights")
    .llm(workerModel)
    .build();

Agent reportWriter = Agent.builder()
    .role("Report Writer")
    .goal("Compile findings into a comprehensive report")
    .llm(workerModel)
    .build();

// Note: no .agent() -- the manager decides who handles what
Task comprehensiveReport = Task.builder()
    .description("Create a comprehensive analysis of {{company}}")
    .expectedOutput("A detailed report covering market position, "
        + "financials, and strategic outlook")
    .build();

EnsembleOutput output = Ensemble.builder()
    .agents(marketResearcher, financialAnalyst, reportWriter)
    .tasks(comprehensiveReport)
    .chatLanguageModel(managerModel) // the manager's brain
    .workflow(Workflow.HIERARCHICAL)
    .inputs(Map.of("company", "Tesla"))
    .build()
    .run();
```

Key differences from the sequential pipeline:
- No agent assigned to the task. The manager decides which worker handles what.
- Different models for different roles. The manager gets `gpt-4o` for its coordination reasoning; workers get the cheaper `gpt-4o-mini`.
- `Workflow.HIERARCHICAL` is explicit here because there's a single unassigned task — the framework needs to know you want delegation, not just a single-agent execution.
You can also add constraints to the delegation:
```java
HierarchicalConstraints constraints = HierarchicalConstraints.builder()
    .requiredWorkers(List.of("Market Research Specialist", "Financial Analyst"))
    .maxCallsPerWorker(2)
    .globalMaxDelegations(6)
    .build();

Ensemble.builder()
    // ... agents, tasks, model ...
    .workflow(Workflow.HIERARCHICAL)
    .hierarchicalConstraints(constraints)
    .build()
    .run();
```

This ensures the manager consults both specialists and doesn't get stuck in a loop delegating endlessly to one agent.
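The bookkeeping behind those caps is easy to picture. Here's a minimal sketch of per-worker and global delegation budgets; the `DelegationBudget` class is a hypothetical illustration of the idea, not AgentEnsemble's implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of delegation budgets, mirroring
// maxCallsPerWorker / globalMaxDelegations -- not library internals.
public class DelegationBudget {

    private final int maxCallsPerWorker;
    private final int globalMaxDelegations;
    private final Map<String, Integer> callsByWorker = new HashMap<>();
    private int totalCalls = 0;

    public DelegationBudget(int maxCallsPerWorker, int globalMaxDelegations) {
        this.maxCallsPerWorker = maxCallsPerWorker;
        this.globalMaxDelegations = globalMaxDelegations;
    }

    // Records the call and returns true if the manager may still delegate
    // to this worker; false means a cap would be exceeded.
    public boolean tryDelegate(String worker) {
        int workerCalls = callsByWorker.getOrDefault(worker, 0);
        if (totalCalls >= globalMaxDelegations || workerCalls >= maxCallsPerWorker) {
            return false;
        }
        callsByWorker.put(worker, workerCalls + 1);
        totalCalls++;
        return true;
    }
}
```

With `maxCallsPerWorker(2)`, a third delegation to the same worker is rejected, which is exactly the looping behavior the constraints exist to prevent.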
Build 3: The Parallel DAG
What if you have tasks that can run concurrently? Let's build a competitive intelligence pipeline:
```java
Agent marketAnalyst = Agent.builder()
    .role("Market Analyst")
    .goal("Analyze market positioning and trends")
    .build();

Agent financialAnalyst = Agent.builder()
    .role("Financial Analyst")
    .goal("Analyze financial performance and projections")
    .build();

Agent strategist = Agent.builder()
    .role("Strategy Consultant")
    .goal("Synthesize findings into strategic recommendations")
    .build();

// These two tasks are independent -- they can run in parallel
Task marketResearch = Task.builder()
    .description("Analyze market position of {{company}}")
    .expectedOutput("Market analysis report")
    .agent(marketAnalyst)
    .build();

Task financialAnalysis = Task.builder()
    .description("Analyze financial performance of {{company}}")
    .expectedOutput("Financial analysis report")
    .agent(financialAnalyst)
    .build();

// This task depends on BOTH -- it waits for them to finish
Task swotAnalysis = Task.builder()
    .description("Create a SWOT analysis based on market and financial findings")
    .expectedOutput("Complete SWOT analysis")
    .agent(strategist)
    .context(List.of(marketResearch, financialAnalysis)) // both must complete first
    .build();

// This depends on everything
Task executiveSummary = Task.builder()
    .description("Write an executive summary of all findings")
    .expectedOutput("One-page executive summary")
    .agent(strategist)
    .context(List.of(marketResearch, financialAnalysis, swotAnalysis))
    .build();

EnsembleOutput output = Ensemble.builder()
    .agents(marketAnalyst, financialAnalyst, strategist)
    .tasks(marketResearch, financialAnalysis, swotAnalysis, executiveSummary)
    .chatLanguageModel(model)
    .inputs(Map.of("company", "Nvidia"))
    .build()
    .run();
```

Again, no explicit workflow declaration. The framework sees that `marketResearch` and `financialAnalysis` have no dependencies on each other, so it runs them concurrently. `swotAnalysis` waits for both. `executiveSummary` waits for all three.
The dependency graph is a DAG (directed acyclic graph), and the framework does topological scheduling automatically.
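To make "topological scheduling" concrete, here's a generic sketch of the idea using Kahn's algorithm: group tasks into waves, where every task in a wave has all of its dependencies satisfied by earlier waves and can therefore run concurrently. This is an illustration of the general technique, not AgentEnsemble's internals, and the `Map<String, List<String>>` graph representation is an assumption made for the example:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Generic illustration of topological scheduling over a task DAG.
// Keys are task names; values are the tasks they depend on (their "context").
public class DagScheduler {

    public static List<List<String>> schedule(Map<String, List<String>> deps) {
        // Track how many unfinished dependencies each task still has.
        Map<String, Integer> remaining = new HashMap<>();
        deps.forEach((task, ds) -> remaining.put(task, ds.size()));

        List<List<String>> waves = new ArrayList<>();
        Set<String> done = new HashSet<>();
        while (done.size() < deps.size()) {
            // Every task with zero remaining dependencies joins the next wave.
            List<String> wave = new ArrayList<>();
            for (var e : remaining.entrySet()) {
                if (!done.contains(e.getKey()) && e.getValue() == 0) {
                    wave.add(e.getKey());
                }
            }
            if (wave.isEmpty()) {
                throw new IllegalStateException("cycle detected -- not a DAG");
            }
            Collections.sort(wave); // deterministic ordering for the example
            done.addAll(wave);
            // Finished tasks unblock their dependents.
            for (String finished : wave) {
                deps.forEach((task, ds) -> {
                    if (ds.contains(finished)) {
                        remaining.merge(task, -1, Integer::sum);
                    }
                });
            }
            waves.add(wave);
        }
        return waves;
    }
}
```

Fed the graph from Build 3, this produces three waves: the two independent analyses first, then the SWOT, then the executive summary.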
If you want resilience, add an error strategy:
```java
Ensemble.builder()
    // ...
    .workflow(Workflow.parallel()
        .errorStrategy(ParallelErrorStrategy.CONTINUE_ON_ERROR)
        .build())
    .build()
    .run();
```

Now if the financial analysis fails, the market research still completes, and downstream tasks get whatever results are available.
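The continue-on-error semantics are the same pattern you'd build by hand with `CompletableFuture`: run the branches concurrently, convert each failure into an empty result instead of letting it poison the whole run. A generic sketch of that idea (plain JDK, not AgentEnsemble code; `ContinueOnError` and `runAll` are illustrative names):

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Generic illustration of CONTINUE_ON_ERROR semantics with CompletableFuture.
public class ContinueOnError {

    // Runs all branches concurrently; a failed branch yields Optional.empty()
    // so the surviving results are still collected.
    public static Map<String, Optional<String>> runAll(Map<String, Supplier<String>> branches) {
        Map<String, CompletableFuture<Optional<String>>> futures = branches.entrySet().stream()
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> CompletableFuture.supplyAsync(e.getValue())
                        .thenApply(Optional::of)
                        .exceptionally(ex -> Optional.empty())));
        return futures.entrySet().stream()
            .collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue().join()));
    }
}
```

Downstream consumers then check which branches produced a result and work with what's there, which is exactly what the framework's downstream tasks do under this strategy.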
Build 4: Typed Structured Output
Raw text output is fine for articles, but many use cases need structured data. Java records make this clean:
```java
record CompetitorProfile(
    String name,
    String marketPosition,
    List<String> strengths,
    List<String> weaknesses,
    double estimatedMarketShare) {}

Task profileTask = Task.builder()
    .description("Create a detailed profile of {{competitor}}")
    .expectedOutput("A structured competitor profile")
    .agent(analyst)
    .outputType(CompetitorProfile.class)
    .build();

EnsembleOutput output = Ensemble.builder()
    .agents(analyst)
    .tasks(profileTask)
    .chatLanguageModel(model)
    .inputs(Map.of("competitor", "AMD"))
    .build()
    .run();

// Typed access -- no parsing, no casting
CompetitorProfile profile = output.getTaskOutputs().get(0)
    .getStructuredOutput(CompetitorProfile.class);

System.out.println(profile.name());                 // "AMD"
System.out.println(profile.strengths());            // ["Strong GPU lineup", ...]
System.out.println(profile.estimatedMarketShare()); // 24.5
```

The framework instructs the LLM to return JSON conforming to the record's schema, deserializes it, and hands you back a typed object. If the LLM's output doesn't parse correctly, the framework retries automatically (configurable via `maxOutputRetries()`).
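That retry loop is a simple, general pattern: ask for output, try to parse it, and re-prompt on failure up to a cap. A minimal sketch of the mechanism (the `OutputRetry` helper is hypothetical, for illustration only; it is not how AgentEnsemble actually implements `maxOutputRetries()`):

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch of retry-on-unparseable-output -- not library internals.
public class OutputRetry {

    // Fetch raw output and parse it, retrying on parse failure up to
    // maxRetries additional attempts after the first.
    public static <T> T parseWithRetries(Supplier<String> rawSource,
                                         Function<String, T> parser,
                                         int maxRetries) {
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return parser.apply(rawSource.get());
            } catch (RuntimeException e) {
                last = e; // malformed output -- re-prompt and try again
            }
        }
        throw new IllegalStateException("output still unparseable after retries", last);
    }
}
```

In the real framework, `rawSource` would be another LLM call (typically with the parse error fed back into the prompt) and `parser` the JSON-to-record deserializer.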
The Common Thread
All four examples share the same building blocks: `Agent.builder()`, `Task.builder()`, `Ensemble.builder()`. The same API produces a simple pipeline, a hierarchical delegation system, a parallel DAG, or a typed extraction pipeline.
And all of it runs on your existing JVM. Your existing Gradle build. Your existing CI pipeline. Your existing logging and monitoring infrastructure.
No Python sidecar. No REST wrapper. No new deployment topology.
What’s Next
This post covered the core building blocks. In upcoming posts in this series, we'll dig into:
- Production concerns: observability, error handling, cost tracking, rate limiting
- Advanced patterns: MapReduce ensembles, dynamic agent creation, tool pipelines
- Human-in-the-loop: review gates, approval workflows, pre-flight validation
Get started:
- Documentation — guides, examples, and API reference
- Getting Started — up and running in 5 minutes
- Examples — runnable code for every pattern
- GitHub — source, issues, and contributions
AgentEnsemble is MIT-licensed and available on GitHub.