Stream SDK file uploads and downloads instead of buffering in memory#1433
Stream SDK file uploads and downloads instead of buffering in memory#1433mishushakov wants to merge 23 commits into
Conversation
PR SummaryMedium Risk Overview Uploads: Downloads: Other: Empty files in JS Reviewed by Cursor Bugbot for commit 4351aae. Bugbot is set up for automated code reviews on this repo. Configure here. |
🦋 Changeset detectedLatest commit: 4351aae The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Package ArtifactsBuilt from d182095. Download artifacts from this workflow run. JS SDK ( npm install ./e2b-2.30.3-mishushakov-stream-write-file-upload.0.tgzCLI ( npm install ./e2b-cli-2.12.1-mishushakov-stream-write-file-upload.0.tgzPython SDK ( pip install ./e2b-2.29.2+mishushakov.stream.write.file.upload-py3-none-any.whl |
…memory
- Volume.writeFile/write_file: stream ReadableStream (JS, non-browser) and
file-like objects (Python) to the API instead of buffering them in memory
- Sandbox.files.write with octet-stream upload: stream ReadableStream data
(JS, non-browser) and file-like objects (Python), with chunked gzip
compression
- Python Sandbox.files.read(format="stream"): stream the response body
instead of downloading it into memory before iterating (sync and async)
- JS Sandbox.files.read({ format: 'stream' }): bound only the initial
handshake by the request timeout instead of killing an actively-consumed
stream; the user signal can still cancel it mid-stream
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Empty files short-circuit response parsing (Content-Length: 0), so Sandbox.files.read() with format 'blob' returned '' and Volume.readFile() with 'blob'/'stream' returned undefined. Return an empty Blob or ReadableStream matching the requested format instead. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
baf796b to
9d72b2e
Compare
- js-sdk: handle empty files explicitly in the bytes path of read() instead of relying on new Uint8Array(undefined) coercion - python-sdk: bound the request timeout to the initial handshake for read(format="stream") in both sync and async implementations, matching the JS SDK behavior; document the semantics in the stream overloads Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6cda03fa92
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
matthewlouisbrockman
left a comment
There was a problem hiding this comment.
this will leak the connection if we don't use it right? do we need to 404 immediately vs wait to actually use it?
…ite-file-upload # Conflicts: # packages/js-sdk/src/sandbox/filesystem/index.ts # packages/python-sdk/e2b/sandbox_async/filesystem/filesystem.py # packages/python-sdk/e2b/sandbox_sync/filesystem/filesystem.py
read(format="stream") returned a bare (async) generator whose finally only closed the response if iteration had begun. A reader that was created but never consumed (or never started) held its pooled connection open until the client was closed, leaking connections. Wrap the streamed response in FileStreamReader / AsyncFileStreamReader, which: - release the connection when the stream is fully consumed or errors, - expose deterministic cleanup via close()/aclose() and (async) context manager support, - register a weakref.finalize safety net so an abandoned reader releases its connection on garbage collection (the async variant schedules aclose() on the running loop). Both remain Iterator[bytes] / AsyncIterator[bytes], so existing usage is unchanged. Adds credential-free unit tests covering consume/context manager/close/GC, plus live-sandbox tests for the context manager and partial-then-close paths. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Match the Python SDK's connection lifecycle for read(format='stream'): - explicitly cancel the unconsumed error body before propagating, instead of relying solely on the abort controller (parity with r.close()) - add a FinalizationRegistry safety net so an abandoned stream releases its connection on GC, mirroring Python's weakref.finalize on FileStreamReader Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eader The async stream reader's garbage-collection safety net (_schedule_response_aclose via weakref.finalize) was best-effort at best: loop.create_task(aclose()) from a finalizer is not thread-safe, has no guarantee of running before loop teardown, and is useless once the loop is gone. Remove it and rely on the cleanup that actually works—auto-close on full consume / read error, aclose(), and the async context manager. The sync FileStreamReader keeps its weakref.finalize(response.close) net, which is reliable because close() is synchronous. Document that an abandoned async stream holds its pooled connection until the client is closed, and update the unit test accordingly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Follow-ups on the stream upload/download work, applying the established stream policy consistently and addressing review findings: - volumes: drop the client read timeout on `read_file`/`readFile` streams (Python `httpx.Timeout(..., read=None)`, JS handshake-bounded controller + `wrapStreamWithConnectionCleanup`), matching the sandbox files stream path and the RPC streams. The request timeout now bounds only the handshake, not body consumption. - JS sandbox streaming uploads: use the file-transfer timeout (1h) instead of the 60s request default so large streamed uploads aren't aborted mid-transfer; buffered uploads keep the short default. Centralize `FILE_TIMEOUT_MS` in connectionConfig and reuse it from volume. - JS: factor the stream cleanup + GC-finalizer logic into a shared `wrapStreamWithConnectionCleanup` used by both sandbox files and volumes. - stream handshake error mapping (Bugbot): map dropped connections during the stream handshake to typed, health-checked errors — JS via `handleEnvdApiFetchError`, Python via the `httpx.RemoteProtocolError` wrapper — mirroring the non-stream read paths. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Streaming an upload only happens on the octet-stream path; the multipart path buffers (`toBlob` in JS, `.read()` for text file-likes in Python), so with the old `useOctetStream`/`use_octet_stream` default of false a streamed write was silently buffered into memory. Default the flag to auto-detect instead: use octet-stream when any write entry is streamable (JS `ReadableStream`; Python file-like / non-str-bytes), and `multipart/form-data` otherwise. Browsers stay on multipart since they can't stream request bodies. An explicit flag value still wins, gzip still implies octet-stream, and the old-envd fallback is preserved. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Streamed (file-like) sandbox writes used the 60s request timeout for the write phase, so a large or slow streamed upload could trip WriteTimeout while the body was still being sent — inconsistent with the JS SDK (1h) and Python volume writes (1h). Relax the write timeout to FILE_TIMEOUT (1h) when any write entry is streamable, keeping connection setup and the response read bounded by the request timeout. Buffered str/bytes uploads keep the request timeout. FILE_TIMEOUT is shared via e2b/connection_config.py, mirroring the JS SDK's FILE_TIMEOUT_MS. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Revert the volume read/write streaming changes so this PR is scoped to the sandbox files streaming work. The volume changes land in a follow-up PR that builds on the shared streaming infrastructure introduced here. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
…ite-file-upload-v1 # Conflicts: # packages/js-sdk/src/connectionConfig.ts
A fully consumed stream returns its connection to the pool, where it can linger as an idle keep-alive entry until the server-side close is observed. Asserting on total pool size therefore flaked under load (test_sync_full_ consume_releases_connection saw 1 instead of 0 on CI). Count only checked-out (non-idle) connections, which is what the helper name promises and is the actual leak condition. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The FinalizationRegistry safety net for `read({ format: 'stream' })` only ran
`cleanup()` (aborting the handshake AbortController), unlike the cancel and
error paths which explicitly cancel the response body to release the pooled
envd connection. Abandoned streams could leave connections checked out until
the client was torn down. Mirror the cancel/error paths (and the Python sync
finalizer's `response.close`) by cancelling the body reader before cleanup.
Adds unit tests for wrapStreamWithConnectionCleanup, including a GC-abandonment
test (needs --expose-gc, enabled for the connectionConfig vitest project).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Drop the FinalizationRegistry (JS) and weakref.finalize (sync Python) safety nets on streamed reads in favor of a deterministic idle-read timeout that reclaims a stalled stream's pooled connection. Python maps it to httpx's per-chunk read timeout; JS arms a per-chunk timer that aborts the request controller. Configurable via streamIdleTimeoutMs / stream_idle_timeout (default 60s, 0/None disables), and the consume/close contract is now documented consistently across all three readers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bound a stalled streamed transfer with a per-chunk idle timeout (default the request timeout, configurable via streamIdleTimeoutMs / stream_idle_timeout, 0/None disables) on both reads and writes, so a producer or consumer that stops making progress no longer holds the pooled connection. Reads map it to httpx's per-chunk read timeout / a JS idle-abort wrapper; writes use httpx's per-write timeout / a JS upload idle-abort wrapper. The total-transfer cap is intentionally left to the server (envd): a client cap is advisory, can't protect against non-conforming clients, and would mean maintaining the same ceiling across three SDKs. The pre-existing client-side 1h upload total is removed for consistency with reads. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
aiter_io_chunks / agzip_iter were async generators doing synchronous file reads and zlib compression inline, stalling the asyncio event loop for the duration of those operations on large AsyncSandbox uploads. Offload both to a worker thread via asyncio.to_thread. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Arm the JS read idle timer only around the network read and clear it the moment a chunk arrives, so a slow or paused consumer no longer trips it; it fires only when the server stops sending mid-stream (a held-but-unread stream is reclaimed server-side). Matches Python's httpx read timeout, which only counts during socket reads. Drop the JS upload idle wrapper: it bounded producer latency (local), not the upload wire (not observable through fetch). Stalled uploads are bounded server-side or via the caller's signal; Python keeps its per-write httpx timeout, which does bound the wire. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
I will wait with this PR until we have server-side connection handling for dropping idle clients. |
After the filesystem streaming revert, FILE_TIMEOUT is used only by the volume client; sandbox filesystem streaming bounds each chunk by the request timeout and leaves the total to the server. Reword the comment so it no longer reads as a general streaming-transfer timeout (addresses a Bugbot review note). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…imeout The per-write httpx timeout on streamed uploads guards a stuck socket write (server stops reading); it can't observe the opposite direction. Record that envd >= 0.6.7's per-read idle timeout backstops that case. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Remove the JS 'empty file' read test now subsumed by 'read empty file in all formats', and trim the duplicate double-close idempotency assertions from the Python partial-then-close tests (idempotency is covered by the stream-reader unit tests). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The main connection_config FILE_TIMEOUT was never imported or re-exported; volume transfers use the constant in volume/connection_config.py and sandbox filesystem streaming now bounds each chunk by the request timeout. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 4351aae. Configure here.

Description
Removes full in-memory buffering from the SDK sandbox file-transfer paths, in both JS and Python (sync + async).
Streamed uploads —
Sandbox.files.write/write_filesstreamsReadableStream(JS, outside the browser) and file-like (Python) input to the sandbox with chunk-by-chunk gzip compression, instead of buffering the whole body in memory.useOctetStream/use_octet_streamnow defaults to auto-detect — octet-stream when any entry is streamable (so streamed uploads aren't silently buffered),multipart/form-dataotherwise; browsers always usemultipart/form-datasince streaming request bodies aren't supported there. A streamed upload is bounded by a per-chunk timeout on the wire (Python's per-writehttpxtimeout, default the request timeout); a stalled upload the wire can't observe is bounded server-side. On Python'sAsyncSandbox, the blocking file reads and gzip compression of a streamed upload now run in a worker thread so a large upload doesn't stall the event loop.Streamed downloads —
Sandbox.files.read(format="stream")now streams the response body from the sandbox instead of downloading it into memory before iterating (Python sync + async), and the 60s request timeout no longer kills the stream while it's being consumed:read()option (streamIdleTimeoutMsin JS,stream_idle_timeoutin Python; default the request timeout — 60s —0/Noneto disable). It's armed only while waiting on a network read and cleared the moment a chunk arrives, so it aborts only when the server stops sending mid-stream; a slow or paused consumer never trips it (a held-but-unread stream is reclaimed server-side, not by this timer).signalcan still cancel an in-flight stream.FileStreamReader/AsyncFileStreamReadersupporting deterministic cleanup viaclose()/aclose()and (async) context-manager use; both still satisfyIterator[bytes]/AsyncIterator[bytes], so existing iteration is unchanged.Empty files — JS
Sandbox.files.read()withbloborstreamformat now returns a format-correct empty value (emptyBlob/ emptyReadableStream) for empty files instead of"".Note
The equivalent volume streaming changes (
Volume.writeFile/write_file,Volume.readFile/read_filestreams) live in a follow-up PR, #1453, which is based on this branch.Usage
🤖 Generated with Claude Code