Fix/add experiment refactor #38252
Summary of Changes (Gemini Code Assist)
This pull request performs a clean refactor of how pipeline experiments are managed in the codebase. By migrating from manual list operations to established helper methods, the implementation becomes more concise and robust, ensuring consistent handling of experiment flags without altering existing behavior.
|
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
CI Analysis: The Python CI failure (
None of these appear in the error trace. This is a pre-existing test infrastructure issue. |
|
assign set of reviewers |
|
Assigning reviewers: R: @shunping for label python. Note: If you would like to opt out of this review, comment Available commands:
The PR bot will only process comments in the main thread (not review comments). |
|
Reminder, please take a look at this pr: @shunping |
|
Hi! Could you rebase your PR to the current master? Hopefully, that will reduce the number of failed tests. |
bf53c7f to 86063e1
|
Reminder, please take a look at this pr: @shunping |
|
Assigning new set of reviewers because Pr has gone too long without review. If you would like to opt out of this review, comment R: @tvalentyn for label python. Available commands:
|
|
@ash6898 thanks for the contribution. Could you resolve the conflict please? Thanks! |
|
Apologies for the delays and for us asking you again to rebase; looks like this fell through the cracks due to some tooling issues and OOOs. |
Replaces manual list manipulation pattern with DebugOptions.add_experiment() which handles null-init and deduplication internally. Resolves apache#19347
… Java Replaces getExperiments()/modify/setExperiments() boilerplate with ExperimentalOptions.addExperiment() which handles null-init and deduplication. Resolves apache#19347
86063e1 to 13549dc
|
/gemini review |
Code Review
This pull request refactors how pipeline experiments are added across both the Java and Python SDKs by replacing manual list manipulation and copying with helper methods (ExperimentalOptions.addExperiment in Java and DebugOptions.add_experiment in Python). The review feedback correctly points out that several pre-emptive copies of the experiments list to a mutable ArrayList in the Java codebase are redundant, as ExperimentalOptions.addExperiment already handles null-initialization and list copying internally. Removing these redundant blocks will further clean up the code.
| // Ensure the experiments list is mutable before any experiments are added. | ||
| if (options.getExperiments() != null) { | ||
| options.setExperiments(new ArrayList<>(options.getExperiments())); | ||
| } | ||
| job.setName(options.getJobName().toLowerCase()); |
This pre-emptive copy of the experiments list to an ArrayList is redundant. ExperimentalOptions.addExperiment already handles null-initialization and copies the existing list into a new mutable ArrayList internally before adding any new experiments. You can safely remove this block.
| // Ensure the experiments list is mutable before any experiments are added. | |
| if (options.getExperiments() != null) { | |
| options.setExperiments(new ArrayList<>(options.getExperiments())); | |
| } | |
| job.setName(options.getJobName().toLowerCase()); | |
| job.setName(options.getJobName().toLowerCase()); |
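To illustrate why a pre-emptive copy is redundant when the helper copies internally, here is a minimal Python sketch. It is not Beam's Java implementation: `with_experiment` and the `enable_foo` flag are hypothetical names used only for illustration, and a tuple stands in for an immutable Java list.

```python
from typing import List, Optional, Sequence


def with_experiment(existing: Optional[Sequence[str]], name: str) -> List[str]:
    """Hypothetical helper: returns a fresh mutable list that contains `name`."""
    # Copy into a new list, or start empty when nothing was set (null-init).
    updated = list(existing) if existing is not None else []
    # Only append when the flag is not already present.
    if name not in updated:
        updated.append(name)
    return updated


# The caller passes an immutable sequence and does no copying of its own.
frozen = ("enable_foo",)  # hypothetical flag name
result = with_experiment(frozen, "upload_graph")
```

Because the helper never mutates its input, the caller's sequence can stay immutable and no defensive copy is needed at the call site.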
| // Ensure the experiments list is mutable before any experiments are added. | ||
| if (options.getExperiments() != null) { | ||
| options.setExperiments(new ArrayList<>(options.getExperiments())); | ||
| } | ||
| // Multi-language pipelines and pipelines that include upgrades should automatically be upgraded |
This pre-emptive copy of the experiments list to an ArrayList is redundant. ExperimentalOptions.addExperiment already handles null-initialization and copies the existing list into a new mutable ArrayList internally before adding any new experiments. You can safely remove this block.
| // Ensure the experiments list is mutable before any experiments are added. | |
| if (options.getExperiments() != null) { | |
| options.setExperiments(new ArrayList<>(options.getExperiments())); | |
| } | |
| // Multi-language pipelines and pipelines that include upgrades should automatically be upgraded | |
| // Multi-language pipelines and pipelines that include upgrades should automatically be upgraded |
| // Ensure the experiments list is mutable before any experiments are added. | ||
| if (options.getExperiments() != null) { | ||
| options.setExperiments(new ArrayList<>(options.getExperiments())); | ||
| } | ||
| options.setProject("project"); |
This pre-emptive copy of the experiments list to an ArrayList is redundant. ExperimentalOptions.addExperiment already handles null-initialization and copies the existing list into a new mutable ArrayList internally before adding any new experiments. You can safely remove this block.
| // Ensure the experiments list is mutable before any experiments are added. | |
| if (options.getExperiments() != null) { | |
| options.setExperiments(new ArrayList<>(options.getExperiments())); | |
| } | |
| options.setProject("project"); | |
| options.setProject("project"); |
here and in other places: why was this copy necessary?
| ImmutableList.<String>builder().addAll(experiments).add("upload_graph").build()); | ||
| if (jobGraphByteSize >= CREATE_JOB_REQUEST_LIMIT_BYTES && !useUnifiedWorker(options)) { | ||
| ExperimentalOptions.addExperiment(options, "upload_graph"); | ||
| LOG.info( |
nit: looks like this log line will now be printed even if the experiment was already enabled.
Let's add && !hasExperiment(options, "upload_graph") back?
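The guard being asked for can be sketched in Python, assuming a plain list of flags. `has_experiment`, `enable_upload_graph`, the `graph_too_large` parameter, and the log text are hypothetical stand-ins, not the Java code under review.

```python
import logging
from typing import List

LOG = logging.getLogger(__name__)


def has_experiment(experiments: List[str], name: str) -> bool:
    """Hypothetical stand-in for a hasExperiment-style check."""
    return name in experiments


def enable_upload_graph(experiments: List[str], graph_too_large: bool) -> bool:
    """Adds the flag and logs only when it was not already enabled.

    Returns True when the flag was newly added, False otherwise.
    """
    if not graph_too_large or has_experiment(experiments, "upload_graph"):
        # Nothing to do, so nothing to log.
        return False
    experiments.append("upload_graph")
    # Placeholder message; the real log text is not reproduced here.
    LOG.info("Enabling upload_graph because the job graph is large.")
    return True


flags: List[str] = []
first = enable_upload_graph(flags, graph_too_large=True)
second = enable_upload_graph(flags, graph_too_large=True)
```

With the check in place, the second call is a no-op and produces no log line.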
addresses #19347
Summary
Replace manual experiment list manipulation with the existing helper methods:
- Python: experiments.append(...) to debug_options.add_experiment(...) in 5 files
- Java: getExperiments()/modify/setExperiments() pattern with ExperimentalOptions.addExperiment(...) in 3 files

This is a pure refactor with no behavior change. The helper methods already handle null-init and deduplication internally.
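A rough before/after sketch of the Python side, assuming the helper behaves as described (null-init plus deduplication). `SketchDebugOptions` and `add_manually` are hypothetical stand-ins written for illustration; this is not Beam's `DebugOptions` class.

```python
from typing import List, Optional


class SketchDebugOptions:
    """Hypothetical stand-in for an options object holding experiment flags."""

    def __init__(self, experiments: Optional[List[str]] = None) -> None:
        self.experiments = experiments

    def add_experiment(self, experiment: str) -> None:
        # Null-init: create the list on first use.
        if self.experiments is None:
            self.experiments = []
        # Deduplication: append only if the flag is not already present.
        if experiment not in self.experiments:
            self.experiments.append(experiment)


def add_manually(options: SketchDebugOptions, experiment: str) -> None:
    """The manual pattern: each call site repeats both checks itself."""
    experiments = options.experiments or []
    if experiment not in experiments:
        experiments.append(experiment)
    options.experiments = experiments


# Before: manual list manipulation at the call site.
manual = SketchDebugOptions()
add_manually(manual, "upload_graph")
add_manually(manual, "upload_graph")

# After: a single helper call.
helper = SketchDebugOptions()
helper.add_experiment("upload_graph")
helper.add_experiment("upload_graph")
```

Both paths end with the same single-entry list, which is the sense in which the change is behavior-preserving.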
Files changed
Python:
- sdks/python/apache_beam/pipeline.py
- sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py
- sdks/python/apache_beam/io/iobase_test.py
- sdks/python/apache_beam/io/external/xlang_parquetio_test.py
- sdks/python/apache_beam/runners/dataflow/internal/apiclient.py

Java:
- runners/google-cloud-dataflow-java/src/main/java/.../DataflowPipelineTranslator.java
- runners/google-cloud-dataflow-java/src/main/java/.../DataflowRunner.java
- runners/google-cloud-dataflow-java/worker/src/main/.../GrpcWindmillServer.java