Improve BigQueryIO Storage Write API error messages #38948

Open
damccorm wants to merge 5 commits into apache:master from damccorm:feature/improve-bq-error-messages

@damccorm damccorm commented Jun 12, 2026

This PR improves the error messages thrown by the BigQuery Storage Write API (both sharded and unsharded paths) when retries are exhausted or when persistent errors are encountered.

* Surface the root cause in the RuntimeException when AppendRows retries are exhausted.
* Elevate the final failure logging to ERROR level.
* Provide actionable advice for PERMISSION_DENIED and NOT_FOUND errors: check that the destination table exists and that the service account has the TABLES_UPDATE_DATA permission.
* Add a unit test to verify the improved error messages on retry exhaustion.
@damccorm

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request improves error reporting when BigQuery Storage API write retries are exhausted. It enhances error messages with the last encountered error and provides specific hints for PERMISSION_DENIED and NOT_FOUND status codes (e.g., checking permissions and table existence). These changes are applied to both sharded and unsharded record writers, and a new unit test is added to verify this behavior. Feedback on the changes highlights potential NullPointerException risks when comparing the statusCode enum using .equals(). It is recommended to use the == operator instead, which is null-safe and standard practice for enum comparisons in Java.
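
To illustrate the enum-comparison point, here is a minimal sketch. `StatusCodeChecks` and its nested `StatusCode` enum are hypothetical stand-ins, since the PR's actual code is not shown on this page; the sketch only demonstrates why `==` tolerates a null reference while `.equals()` does not.

```java
// Hypothetical example class; not taken from the PR diff.
final class StatusCodeChecks {
  // Hypothetical stand-in for a gRPC-style status code enum.
  enum StatusCode { OK, NOT_FOUND, PERMISSION_DENIED, UNAVAILABLE }

  private StatusCodeChecks() {}

  // Null-safe: == compares references and never dereferences statusCode.
  static boolean needsPermissionHint(StatusCode statusCode) {
    return statusCode == StatusCode.PERMISSION_DENIED || statusCode == StatusCode.NOT_FOUND;
  }

  // Not null-safe: calling .equals() on a null statusCode throws NullPointerException.
  static boolean needsPermissionHintUnsafe(StatusCode statusCode) {
    return statusCode.equals(StatusCode.PERMISSION_DENIED)
        || statusCode.equals(StatusCode.NOT_FOUND);
  }
}
```

With a null status code, `needsPermissionHint(null)` simply returns false, whereas `needsPermissionHintUnsafe(null)` throws.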

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@damccorm

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request improves error reporting when BigQuery Storage API write retries are exhausted by appending actionable troubleshooting advice for PERMISSION_DENIED and NOT_FOUND errors, and adds a unit test to verify this behavior. The review feedback suggests removing duplicate logging before throwing exceptions to avoid cluttered logs, preserving the full exception chain rather than discarding the outer exception, and simplifying the status code extraction.
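
A rough sketch of the logging and exception-chain suggestions, assuming a hypothetical helper named `AppendFailures.retriesExhausted` (not taken from the PR diff). The helper does not log before throwing, leaving that to whoever finally handles the exception, and it passes the outer exception as the cause so that no link in the chain is discarded.

```java
// Hypothetical helper; the name and message wording are assumptions for illustration.
final class AppendFailures {
  private AppendFailures() {}

  // Builds the exception to throw once retries are exhausted.
  // The outer failure is kept as the cause, so its own cause stays reachable too.
  static RuntimeException retriesExhausted(int attempts, Exception lastFailure) {
    String message =
        String.format(
            "AppendRows failed after %d attempts. Last error: %s",
            attempts, lastFailure.getMessage());
    return new RuntimeException(message, lastFailure);
  }
}
```

A caller would write `throw AppendFailures.retriesExhausted(attempts, e);` without a preceding log statement, so the failure is reported once, with its full stack of causes.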

@damccorm

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request improves error messages when BigQuery Storage API writes fail after exhausting retries by including the last encountered error and suggesting permission checks for PERMISSION_DENIED and NOT_FOUND statuses. The review feedback recommends handling InterruptedException properly in StorageApiWritesShardedRecords to avoid swallowing the interrupt status, and updating the referenced IAM permission to the official 'bigquery.tables.updateData' name across the codebase and tests.
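
The InterruptedException recommendation can be sketched as follows. `InterruptAwareRunner` is a hypothetical wrapper, not code from StorageApiWritesShardedRecords; it shows the usual pattern of restoring the thread's interrupt flag before rethrowing, so that callers further up the stack can still observe the interruption.

```java
import java.util.concurrent.Callable;

// Hypothetical wrapper used only to illustrate the pattern.
final class InterruptAwareRunner {
  private InterruptAwareRunner() {}

  static <T> T call(Callable<T> operation) {
    try {
      return operation.call();
    } catch (InterruptedException e) {
      // Catching InterruptedException clears the flag; set it again before rethrowing.
      Thread.currentThread().interrupt();
      throw new RuntimeException("Interrupted while calling AppendRows", e);
    } catch (Exception e) {
      throw new RuntimeException("AppendRows failed", e);
    }
  }
}
```

Catching `InterruptedException` in its own clause, ahead of the general `Exception` clause, is what keeps the interrupt from being swallowed by the generic handler.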

@damccorm

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request improves error reporting in BigQuery Storage API writes (both sharded and unsharded records) when retries are exhausted. It enhances the error messages to include the last encountered error and, for PERMISSION_DENIED or NOT_FOUND status codes, adds actionable advice to check table existence and service account permissions. It also introduces a unit test to verify this behavior using a simulated append failure in FakeDatasetService. I have no feedback to provide as there are no review comments.

@damccorm damccorm marked this pull request as ready for review June 12, 2026 18:11
@damccorm

R: @ahmedabu98

@github-actions

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment "assign set of reviewers".

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the debugging experience for users of the BigQuery Storage Write API by providing more descriptive and actionable error messages. When AppendRows operations fail after exhausting retries or encountering persistent issues, the system will now output detailed information, including the last error encountered and specific guidance on common problems like incorrect permissions or missing tables. This change aims to reduce the time and effort required to diagnose and resolve write failures.

Highlights

  • Enhanced Error Messages: Improved error messages for BigQuery Storage Write API failures, providing more context on the last encountered error and suggesting potential causes like missing permissions or non-existent tables.
  • Error Handling in Sharded Writes: Implemented try-catch blocks in the sharded write path to capture and re-throw exceptions with the new, more informative error messages.
  • Testability Improvements: Modified the FakeDatasetService to allow injecting AppendRows errors, enabling robust testing of error handling scenarios.
  • New Test Case: Added a new test testStorageApiWriteFailureExhaustedRetries to validate the improved error messages when retries are exhausted due to permission issues.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request improves error reporting for BigQuery Storage API writes by appending the last encountered error and helpful troubleshooting instructions (such as checking table existence and permissions) to the exception message when AppendRows fails. It also adds a unit test to verify this behavior. The reviewer suggested simplifying the error message in the sharded write path from "More than %d attempts to call AppendRows failed" to "AppendRows failed" because persistent errors can cause immediate failure on the first attempt, making the original message misleading. A corresponding update to the test assertions was also recommended.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request improves the developer experience when working with the BigQuery Storage Write API by providing more actionable error messages. By capturing and surfacing the last encountered error and offering specific troubleshooting advice for common permission and configuration issues, it helps users diagnose and resolve failures more efficiently when retries are exhausted.

Highlights

  • Enhanced Error Messaging: Updated BigQueryIO Storage Write API to provide more descriptive error messages when retries are exhausted, including the last encountered error.
  • Actionable Permission Feedback: Added specific guidance for PERMISSION_DENIED and NOT_FOUND errors, suggesting checks for table existence and service account permissions.
  • Improved Exception Handling: Wrapped retry logic in sharded and unsharded paths to ensure failures are caught and reported with the new, detailed error context.
  • Test Coverage: Introduced a new test case to verify that exhausted retries correctly trigger the updated error messages and suggestions.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request improves error reporting when BigQuery Storage API write retries are exhausted by including the last encountered error and adding actionable troubleshooting hints for PERMISSION_DENIED and NOT_FOUND errors. The reviewer suggests extracting the duplicated error message formatting logic between StorageApiWriteUnshardedRecords and StorageApiWritesShardedRecords into a shared helper method to improve maintainability.
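
One possible shape for such a shared helper is sketched below. Everything in it is an assumption made for illustration: the class name `StorageApiErrorMessages`, the method `appendRowsFailure`, the nested `StatusCode` enum (a stand-in for the real status type), and the exact wording. It draws on the other review rounds by using the "AppendRows failed" phrasing and the `bigquery.tables.updateData` permission name.

```java
// Hypothetical shared helper; names and message wording are assumptions, not the PR's code.
final class StorageApiErrorMessages {
  // Hypothetical stand-in for the status code type used by the writers.
  enum StatusCode { NOT_FOUND, PERMISSION_DENIED, UNAVAILABLE }

  private StorageApiErrorMessages() {}

  // Formats one message that both the sharded and unsharded writers could reuse.
  static String appendRowsFailure(String tableId, StatusCode statusCode, Throwable lastError) {
    StringBuilder message =
        new StringBuilder("AppendRows failed for ").append(tableId).append('.');
    if (lastError != null) {
      message.append(" Last error: ").append(lastError).append('.');
    }
    // == keeps this check safe when no status code could be extracted (statusCode is null).
    if (statusCode == StatusCode.PERMISSION_DENIED || statusCode == StatusCode.NOT_FOUND) {
      message.append(
          " Check that the destination table exists and that the service account has the"
              + " bigquery.tables.updateData permission.");
    }
    return message.toString();
  }
}
```

Keeping the formatting in one place means a later wording change, such as a corrected permission name, only has to be made once and the test assertions only have to track one message.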

@gemini-code-assist

Warning

Gemini encountered an error creating the summary. You can try again by commenting /gemini summary.
