Skip to content

CAMEL-23697: Add responseType parameter for structured output in lang…#23996

Open
zbendhiba wants to merge 1 commit into
apache:mainfrom
zbendhiba:CAMEL-23697-langchain4j-agent-java-response-type
Open

CAMEL-23697: Add responseType parameter for structured output in lang…#23996
zbendhiba wants to merge 1 commit into
apache:mainfrom
zbendhiba:CAMEL-23697-langchain4j-agent-java-response-type

Conversation

@zbendhiba

@zbendhiba zbendhiba commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

…chain4j-agent

Description

Target

  • I checked that the commit is targeting the correct branch (Camel 4 uses the main branch)

Tracking

  • If this is a large change, bug fix, or code improvement, I checked there is a JIRA issue filed for the change (usually before you start working on it).

Apache Camel coding standards and style

  • I checked that each commit in the pull request has a meaningful subject line and body.
  • I have run mvn clean install -DskipTests locally from root folder and I have committed all auto-generated changes.

…chain4j-agent

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@github-actions

Copy link
Copy Markdown
Contributor

🌟 Thank you for your contribution to the Apache Camel project! 🌟
🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

  • First-time contributors require MANUAL approval for the GitHub Actions to run
  • You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot although they are normally detected and executed by CI.
  • You can label PRs using skip-tests and test-dependents to fine-tune the checks executed by this PR.
  • Build and test logs are available in the summary page. Only Apache Camel committers have access to the summary.

⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

@github-actions

Copy link
Copy Markdown
Contributor

🧪 CI tested the following changed modules:

  • catalog/camel-catalog
  • components/camel-ai/camel-langchain4j-agent-api
  • components/camel-ai/camel-langchain4j-agent
  • dsl/camel-componentdsl
  • dsl/camel-endpointdsl

⚠️ Some tests are disabled on GitHub Actions (@DisabledIfSystemProperty(named = "ci.env.name")) and require manual verification:

  • components/camel-ai/camel-langchain4j-agent: 16 test(s) disabled on GitHub Actions
All tested modules (12 modules)
  • Camel :: AI :: LangChain4j :: Agent
  • Camel :: AI :: LangChain4j :: Agent :: API
  • Camel :: Catalog :: Camel Catalog
  • Camel :: Component DSL
  • Camel :: Endpoint DSL
  • Camel :: JBang :: MCP
  • Camel :: JBang :: Plugin :: Route Parser
  • Camel :: JBang :: Plugin :: TUI
  • Camel :: JBang :: Plugin :: Validate
  • Camel :: Launcher :: Container
  • Camel :: YAML DSL :: Validator
  • Camel :: YAML DSL :: Validator Maven Plugin

⚙️ View full build and test results

@Croway Croway left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated code review of this PR — Claude Code on behalf of Federico Mariani (@Croway).

The feature works well on the tested path (inline agentConfiguration + POJO responseType). The main theme of the inline findings: the new validation only covers one of the three agent-creation paths (agent bean, agentConfiguration, agentFactory), so several supported configurations end up with responseType parsing enabled while the JSON schema was never applied to the model — failing per-message at runtime instead of fast at startup. There are also two smaller correctness/consistency issues in the schema-derivation branch (misleading error for non-POJO types, schema-name divergence from the documented camel_schema convention) and a few design/cleanup notes.

Details inline. One finding (producer field race) is pre-existing and only noted for a follow-up JIRA.

String response = agent.chat(aiAgentBody, toolProvider);
exchange.getMessage().setBody(response);

Class<?> responseType = endpoint.getConfiguration().getResponseType();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

responseType parsing runs even when the schema was never applied to the model. This lookup makes process() parse with responseType regardless of whether resolveAndApplyStructuredOutput() actually ran in doInit(). Two reachable configurations bypass it:

  1. agent + agentConfiguration both set — both options are autowired=true, so a registry containing one bean of each auto-injects both. The agent != null branch wins and the ResponseFormat is never applied, yet the startup guard passes because agentConfiguration != null.
  2. agentFactory + agentConfiguration + responseType — the factory-created agent replaces the internal agent on every exchange and never receives the ResponseFormat that doInit() applied to the (discarded) internal agent.

In both cases the model gets no JSON schema, answers in free-form prose, and SERVICE_OUTPUT_PARSER.parseText throws OutputParsingException on every message at runtime instead of failing fast at startup. Suggest extending the doInit() validation to reject responseType when agent or agentFactory is configured (or fixing it at the root — see the comment on setResponseFormat below).

if (endpoint.getConfiguration().getAgent() != null) {
agent = endpoint.getConfiguration().getAgent();
} else if (endpoint.getConfiguration().getAgentConfiguration() != null) {
if (ObjectHelper.isNotEmpty(endpoint.getConfiguration().getJsonSchema())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mutual-exclusion check is unreachable when an agent bean is also configured. Because this check is nested inside the else if (agentConfiguration != null) branch, an endpoint with agent=#myAgent&agentConfiguration=#cfg&jsonSchema=...&responseType=... starts up cleanly with both "mutually exclusive" options set — jsonSchema is silently dropped while responseType still drives response parsing. Moving the check to the top-level validation in doInit() (alongside the responseType-requires-agentConfiguration guard) would make the documented contract hold on every path.


String resolved = endpoint.getCamelContext().resolvePropertyPlaceholders(jsonSchema);
if (responseType != null) {
JsonSchema schema = JsonSchemas.jsonSchemaFrom(responseType)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-POJO responseType values fail with a misleading error message. In langchain4j 1.16.2, JsonSchemas.jsonSchemaFrom() returns Optional.empty() for any type whose output parser is not PojoOutputParserString, Integer, enums, List/Set, primitives. So e.g. responseType=java.lang.Integer fails startup with "Ensure the class has public fields or getters" even though Integer has getters; the real restriction is POJO-only. Suggest rewording the message (e.g. "responseType must be a POJO class; simple types, enums and collections are not supported") so users aren't sent chasing the wrong fix.

+ ". Ensure the class has public fields or getters and is not a raw generic type."));
ResponseFormat derivedFormat = ResponseFormat.builder()
.type(ResponseFormatType.JSON)
.jsonSchema(schema)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Schema name diverges from the documented camel_schema convention. JsonSchemas.jsonSchemaFrom() names the schema rawClass.getSimpleName() (verified in langchain4j 1.16.2 sources), while the jsonSchema branch below deliberately uses the fixed name camel_schema with a comment citing cross-component consistency with camel-openai. Migrating a route from jsonSchema to responseType therefore silently changes the json_schema.name sent to OpenAI-compatible providers from camel_schema to e.g. CarRentalRecommendation. Consider rebuilding the derived schema with .name("camel_schema") to keep both options consistent.


Class<?> responseType = endpoint.getConfiguration().getResponseType();
if (responseType != null) {
exchange.getMessage().setBody(SERVICE_OUTPUT_PARSER.parseText(responseType, response));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No error handling around parsing — raw response is lost on failure. If the model returns a null/blank/unparseable response (content filter, guardrail rewrite, provider quirk), parseText throws a raw dev.langchain4j.service.output.OutputParsingException and the original response text is gone — onException handlers can't log it or route a fallback. Consider wrapping in a Camel-friendly exception that carries the raw response (or preserving it in a header) so routes keep an escape hatch.

// ServiceOutputParser is used intentionally: beyond JSON parsing, it strips markdown code fences
// (e.g. ```json ... ```) that some LLMs emit even when ResponseFormat is explicitly set to JSON.
// Plain Jackson would throw a JsonParseException on such responses.
private static final ServiceOutputParser SERVICE_OUTPUT_PARSER = new ServiceOutputParser();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heads-up: ServiceOutputParser is annotated @Internal in langchain4j 1.16.2 (verified via javap; JsonSchemas by contrast is public API), and this is the only place in Camel that uses it. The code-fence-stripping rationale in the comment is a fair tradeoff, but its signature can change without deprecation on any langchain4j upgrade — worth keeping in mind when bumping the dependency.

.type(ResponseFormatType.JSON)
.jsonSchema(schema)
.build();
((AbstractAgent<?>) agent).setResponseFormat(derivedFormat);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Design suggestion: rather than the cast+setter, a responseFormat field on AgentConfiguration consumed by AbstractAgent.configureBuilder() (which already sources guardrails, tools, RAG and memory from the configuration) would let factory-created agents and user-provided AbstractAgent subclasses receive the format too — removing the "requires agentConfiguration" restriction at the root and fixing the agentFactory gap noted above. The cast pattern pre-dates this PR (CAMEL-23642 introduced it for jsonSchema), but this PR adds a second cast site, so it may be the right moment to generalize.

* the agent is created internally from agentConfiguration.
*/
private void resolveAndApplyJsonSchema() throws Exception {
private void resolveAndApplyStructuredOutput() throws Exception {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: both branches of this method duplicate the ResponseFormat.builder().type(JSON).jsonSchema(...).build() + ((AbstractAgent<?>) agent).setResponseFormat(...) epilogue, and the feature's validation is spread over three places (top of doInit, inside the agentConfiguration branch, and the early-return here). Deriving the JsonSchema per branch and building/applying the format once — plus one consolidated validation block — would prevent drift; the two branches have already diverged on schema naming.

@@ -102,7 +121,13 @@ public void process(Exchange exchange) throws Exception {

ToolProvider toolProvider = createComposedToolProvider(tags, exchange);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing issue adjacent to this change (not introduced by this PR, noting for a follow-up JIRA): a few lines above, agent = agentFactory.createAgent(exchange) writes the shared instance field from process() on a singleton producer, so concurrent exchanges race — thread A can end up chatting with thread B's factory-created agent (cross-tenant memory/credential leakage with per-tenant factories). The fix is a local variable: Agent agent = agentFactory != null ? agentFactory.createAgent(exchange) : this.agent;.

+ "Camel will automatically derive the JSON schema from the class structure and unmarshal the response. "
+ "This option works only when using agentConfiguration (inline agent creation mode). "
+ "Mutually exclusive with jsonSchema.")
private Class<?> responseType;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consistency note: camel-openai exposes the equivalent feature as outputClass — a String resolved at request time via ClassResolver, overridable per message via the CamelOpenAIOutputClass header, with the body left as the raw JSON string. This option differs in name, type, dynamicity and body semantics (auto-unmarshalled POJO), even though the PR borrows camel-openai's camel_schema convention. A Class<?> bound at startup also bypasses the ClassResolver indirection, which matters in classloader-partitioned runtimes. Worth aligning the surface (or at least documenting the difference) so users moving between camel-ai components aren't surprised.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants