Skip to content

coprocessor: fix parsing of graphql responses with null as data #7141

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Apr 18, 2025

Conversation

IvanGoncharov
Copy link
Contributor

@IvanGoncharov IvanGoncharov commented Mar 28, 2025

Coprocessor deserialized GraphQL response using serde_json directly that resulted in data: null were ignored.
After my change, it's using Response::from_value, which applies the same steps for parsing GraphQL response that we are already using for parsing subgraph response.


Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

  • Changes are compatible1
  • Documentation2 completed
  • Performance impact assessed and acceptable
  • Tests added and passing3
    • Unit Tests
    • Integration Tests
    • Manual Tests

Exceptions

Note any exceptions here

Notes

Footnotes

  1. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.

  2. Configuration is an important part of many changes. Where applicable please try to document configuration examples.

  3. Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

This comment has been minimized.

@svc-apollo-docs
Copy link
Collaborator

svc-apollo-docs commented Mar 28, 2025

✅ Docs preview has no changes

The preview was not built because there were no changes.

Build ID: fb1f29e54dfdeafbe06042a0

@IvanGoncharov IvanGoncharov force-pushed the i1g/fix_coprocessor_data_null branch from e5c2803 to 43052be Compare March 28, 2025 14:37
@IvanGoncharov IvanGoncharov force-pushed the i1g/fix_coprocessor_data_null branch from 43052be to 554654e Compare April 7, 2025 20:21
@IvanGoncharov IvanGoncharov marked this pull request as ready for review April 10, 2025 06:15
@IvanGoncharov IvanGoncharov requested a review from a team as a code owner April 10, 2025 06:15
Copy link
Contributor

@bnjjj bnjjj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It contains a lot of changes unrelated to the fix itself (formatting fixes), I tried to read everything carefully to make sure I didn't miss anything but feel free to tell me in which files changes are really important to look at.

@IvanGoncharov IvanGoncharov requested a review from a team as a code owner April 10, 2025 11:36
@IvanGoncharov
Copy link
Contributor Author

It contains a lot of changes unrelated to the fix itself (formatting fixes), I tried to read everything carefully to make sure I didn't miss anything but feel free to tell me in which files changes are really important to look at.

@bnjjj I will add comments on all important changes.

Ok(Some(s)) => Ok(s.as_str().to_string()),
Ok(None) => Err(MalformedResponseError {
reason: "missing required `message` property within error".to_owned(),
}),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new behaviour for subgraph response parsing.
Previously, if the description was missing, it was set up by unwrap_or_default, which resulted in an empty string.
Since the coprocessor used serde for parsion, it was erroring on missing message.
Since the GraphQL spec requires message I just added validation here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it potentially a breaking change or could it break existing implementation ? I think it's a bug fix but as it's more strict maybe it could be interesting to add notes about this in the changelog to avoid confusions

@@ -0,0 +1,9 @@
### Fix Parsing of Coprocessor GraphQL Responses ([PR #7141](https://github.com/apollographql/router/pull/7141))

Standardized GraphQL response parsing and validation between coprocessors and subgraphs to ensure consistent behavior throughout the Router.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a user focused changset entry. Something like.

Previously coprocessors allowed null body in graphql responses, hoever this is not allowed in the graphql spec. The router now correctly enforces that the body of a coprocessor response is valid, and will reject such responses.

If your coprocessor relied on this behaviour then you must update your coprocessor to return a calid graphql response.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense 👍

pub(crate) struct MalformedResponseError {
/// The reason the deserialization failed.
pub(crate) reason: String,
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we reported all errors related to parsing/validation graphql response as FetchError::SubrequestMalformedResponse which made this code only usable for subgraph responses. Since now we have generic error it could be reused for coprocessor responses.

@@ -206,7 +207,7 @@ where

let body_to_send = request_config
.body
.then(|| serde_json::from_slice::<serde_json::Value>(&bytes))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general idea was to switch the entire coprocessor plugin to use serde_json_bytes since it's the only way to reuse code written for subgraph response, which is already tied to serde_json_bytes. In general, I think it's a good idea since even if the coprocessor changed the GraphQL response in a minor way, it still needs to be deserialized in its entirety, and serde_json_bytes provides necessary optimisation for large responses.

@@ -1137,29 +1138,28 @@ where
let code = control.get_http_status()?;

let res = {
let graphql_response: crate::graphql::Response =
match co_processor_output.body.unwrap_or(serde_json::Value::Null) {
serde_json::Value::String(s) => crate::graphql::Response::builder()
Copy link
Contributor Author

@IvanGoncharov IvanGoncharov Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is weird. I think it would be better to just have a custom error saying you need to provide body if you return Break. Instead, we push null instead of body, which results in a parsing error.
But I decided to stop myself here since I had already gone too deep into refactoring this code.

@@ -274,3 +266,51 @@ async fn test_coprocessor_demand_control_access() -> Result<(), BoxError> {

Ok(())
}

#[tokio::test(flavor = "multi_thread")]
async fn test_coprocessor_proxying_error_response() -> Result<(), BoxError> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a test that tests the customer's issue:

@IvanGoncharov IvanGoncharov enabled auto-merge (squash) April 17, 2025 20:17
@IvanGoncharov IvanGoncharov disabled auto-merge April 18, 2025 07:57
@IvanGoncharov IvanGoncharov enabled auto-merge (squash) April 18, 2025 07:58
@IvanGoncharov IvanGoncharov disabled auto-merge April 18, 2025 09:46
@IvanGoncharov IvanGoncharov merged commit 388afab into dev Apr 18, 2025
15 checks passed
@IvanGoncharov IvanGoncharov deleted the i1g/fix_coprocessor_data_null branch April 18, 2025 12:33
@IvanGoncharov
Copy link
Contributor Author

@mergify backport 1.x

Copy link
Contributor

mergify bot commented Apr 18, 2025

backport 1.x

✅ Backports have been created

IvanGoncharov added a commit that referenced this pull request Apr 18, 2025
)

(cherry picked from commit 388afab)

# Conflicts:
#	apollo-router/src/graphql/request.rs
#	apollo-router/src/metrics/mod.rs
#	apollo-router/src/plugins/coprocessor/execution.rs
#	apollo-router/src/plugins/coprocessor/supergraph.rs
#	apollo-router/src/plugins/coprocessor/test.rs
@abernix abernix mentioned this pull request Apr 29, 2025
@abernix abernix mentioned this pull request May 6, 2025
@abernix abernix mentioned this pull request Jul 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants