Skip to content

[query] allow attaching debugger to Spark driver JVM#15377

Open
patrick-schultz wants to merge 1 commit into
hail-is:mainfrom
patrick-schultz:ps/push-ulsopwrsmpvq
Open

[query] allow attaching debugger to Spark driver JVM#15377
patrick-schultz wants to merge 1 commit into
hail-is:mainfrom
patrick-schultz:ps/push-ulsopwrsmpvq

Conversation

@patrick-schultz
Copy link
Copy Markdown
Member

Change Description

Introduce an environment variable HAIL_DEBUG_JVM, which makes the JVM created by py4j or pyspark wait for a remote debugger to attach. Add a note in the dev-docs on how to get this working with IntelliJ.

Security Assessment

  • This change cannot impact the Hail Batch instance as deployed by Broad Institute in GCP

Copy link
Copy Markdown
Member

@ehigham ehigham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice, but I have a couple of comments.

I think it would be better to support passing java options instead of these flags. In the past, I've hacked py4j so that I can set JAVA_TOOL_OPTIONS in the envrionment to
- debug two versions of hail side-by-side. This implementation wont work there as the port is already bound by the first
- attach the yourkit profilter

Ideally, I would do something like hl.init(java_opts=['-Xmx8', 'agentlib:blah'])

Copy link
Copy Markdown
Member Author

Supporting arbitrary java options makes sense. But since I expect the debugger listening options will be the most used (and they're quite long), I'd like some way to enable that without needing to look up the exact options every time.

Likewise, allowing java options to be passed to hl.init makes sense, but I think we also need to be able to configure them through envvars. The way I use this is to debug a failing python test using HAIL_DEBUG_JVM=1 make pytest PYTEST_ARGS='-k foo'.

Other than those desiderata, I don't have strong opinions about the interface for passing java opts.

Perhaps something like this covers all the bases?

  • a java_opts argument to hl.init
  • a HAIL_DRIVER_JVM_OPTS envvar
  • a HAIL_JVM_DEBUGGER_PORT envvar (if set, enable the debugger listener using this port)
  • we append all three to the jvm opts

@cjllanwarne
Copy link
Copy Markdown
Collaborator

If #15395 merges, the test batch for this PR would only have run the following steps:

[
    'check_hail',
    'merge_code',
    'test_batch_invariants',
    'test_hail_python',
    'test_hail_python_local_backend',
    'test_hail_python_service_backend_gcp',
    'test_hail_python_unchecked_allocator',
    'test_hail_spark_conf_requester_pays_parsing',
    'test_hailgenetics_hail_image',
    'test_hailgenetics_hail_image_python_3_10',
    'test_hailgenetics_hail_image_python_3_12',
    'test_hailgenetics_hail_image_python_3_13',
    'test_python_docs',
]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants