Skip to content
Draft
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 11 additions & 8 deletions packages/dds/cell/src/cell.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,11 @@
* Licensed under the MIT License.
*/

import { assert, unreachableCase } from "@fluidframework/core-utils/internal";
import {
DoublyLinkedList,
assert,
unreachableCase,
} from "@fluidframework/core-utils/internal";
import type {
IChannelAttributes,
IFluidDataStoreRuntime,
Expand Down Expand Up @@ -85,7 +89,7 @@ export class SharedCell<T = any>
*/
private messageIdObserved: number = -1;

private readonly pendingMessageIds: number[] = [];
private readonly pendingMessageIds = new DoublyLinkedList<number>();

private attribution: AttributionKey | undefined;

Expand Down Expand Up @@ -264,9 +268,7 @@ export class SharedCell<T = any>
0x00c /* "messageId is incorrect from from the local client's ACK" */,
);
assert(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep Review: The messageId !== messageIdObserved watermark predicate is a sound proxy for "any pending writes?" today because the lifecycle is strictly push-on-submit / shift-on-ACK / pop-on-rollback — messageId advances exactly when pendingMessageIds grows, and the head moves in lockstep with ACKs.

This PR's stated motivation (per the description) is to enable future O(1) list.remove(node) for arbitrary-position drop in the squash follow-up. As soon as squash/arbitrary-removal lands and a mid-queue entry is dropped, messageId (the last-submitted id) keeps advancing while messageIdObserved (the last-ACKed id) lags, leaving the predicate true even when no actual pending op remains — which would suppress remote-op application indefinitely.

The cross-cutting invariant — that this guard must move to !this.pendingMessageIds.empty when arbitrary-position removal lands — is easy to miss from inside a future squash PR that's focused on reSubmitSquashed plumbing. Two options:

  1. Migrate the guard to !this.pendingMessageIds.empty in this PR. Behavioral no-op today, and you're already touching every other site that reads the queue.
  2. Add a one-line // TODO: at this predicate referencing the squash/arbitrary-removal follow-up, so the next maintainer can't silently miss the dependency.

Either is fine; the goal is just to make the coupling visible at the predicate site.

// eslint-disable-next-line @typescript-eslint/prefer-optional-chain -- TODO: ADO#58518 Code owners should verify if this code change is safe and make it if so or update this comment otherwise
this.pendingMessageIds !== undefined &&
this.pendingMessageIds[0] === cellOpMetadata.pendingMessageId,
this.pendingMessageIds.first?.data === cellOpMetadata.pendingMessageId,
0x471 /* Unexpected pending message received */,
);
this.pendingMessageIds.shift();
Expand Down Expand Up @@ -306,9 +308,10 @@ export class SharedCell<T = any>
previousValue?: Serializable<T>,
): ICellLocalOpMetadata {
const pendingMessageId = ++this.messageId;
this.pendingMessageIds.push(pendingMessageId);
const { first: pendingNode } = this.pendingMessageIds.push(pendingMessageId);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep Review: DoublyLinkedList.push(...items) returns a ListNodeRange spanning all newly appended nodes (packages/common/core-utils/src/list.ts:196-205). For a single-item push, range.first === range.last, so this is correct today — but push appends at the tail, so last reads correctly as "the node I just appended." If a future caller ever batches multiple ids in one push, first would silently grab the head of the inserted range instead of the most recently appended node.

Suggested change
const { first: pendingNode } = this.pendingMessageIds.push(pendingMessageId);
const { last: pendingNode } = this.pendingMessageIds.push(pendingMessageId);

const localMetadata: ICellLocalOpMetadata = {
pendingMessageId,
pendingNode,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep Review: PR #12273 wired applyStashedOp through createLocalOpMetadata, so stashed ops push onto pendingMessageIds and emit ICellPendingLocalOpMetadata carrying a pendingNode. The new Pending op bookkeeping describe covers direct cell.set(...) submits and LIFO rollback but doesn't exercise stash → rehydrate → ACK.

This is narrower than it might look: applyStashedOp isn't an independent producer — it delegates to set()/delete()submitCellMessage()createLocalOpMetadata(), the same path the new tests already cover, so the "different list instance" failure mode is ruled out by construction. The residual risk is a future regression in stash-rehydrate sequencing through processCore's ICellPendingLocalOpMetadata cast surfacing only in production.

One applyStashedOp round-trip test in the new Pending op bookkeeping describe (drive a stashed set/delete op, then ACK, assert no 0x471 and ordering preserved) closes that gap cheaply. If you prefer to defer to the squash follow-up that first reads pendingNode, note that in the PR description so the gap is intentional.

Is there an existing stash → rehydrate → ACK regression elsewhere in the DDS test corpus that exercises applyStashedOp end-to-end through the pending-id queue? If yes, this is fully addressed by reference.

previousValue,
};
return localMetadata;
Expand Down Expand Up @@ -352,8 +355,8 @@ export class SharedCell<T = any>
this.setCore(cellOpMetadata.previousValue as Serializable<T>);
}

const lastPendingMessageId = this.pendingMessageIds.pop();
if (lastPendingMessageId !== cellOpMetadata.pendingMessageId) {
const lastPendingNode = this.pendingMessageIds.pop();
if (lastPendingNode?.data !== cellOpMetadata.pendingMessageId) {
throw new Error("Rollback op does not match last pending");
}
} else {
Expand Down
8 changes: 8 additions & 0 deletions packages/dds/cell/src/interfaces.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
* Licensed under the MIT License.
*/

import type { ListNode } from "@fluidframework/core-utils/internal";
import type { Serializable } from "@fluidframework/datastore-definitions/internal";
import type { AttributionKey } from "@fluidframework/runtime-definitions/internal";
import type {
Expand Down Expand Up @@ -125,6 +126,13 @@ export interface ICellLocalOpMetadata<T = any> {
*/
pendingMessageId: number;

/**
* The node in the pending message list corresponding to this operation (op).
* Holding a direct reference to the node enables O(1) removal from arbitrary
* positions in the pending list, which is required for future squash support.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep Review: ListNode<number> leaks onto the exported ICellLocalOpMetadata here, and the new import type { ListNode } from "@fluidframework/core-utils/internal" at the top of this file couples the cell package's interfaces module to a linked-list node type that is purely a runtime implementation detail.

Containment check: ICellLocalOpMetadata is not re-exported from packages/dds/cell/src/index.ts and the only literal construction is updated in cell.ts:306-317, so this is not a public-API break today. The cost is encapsulation and maintainability — embedding ListNode<number> on a package-shaped interface makes the metadata shape harder to mock or re-implement out of tree, and your own PR self-summary flagged this same risk. PR #12273's reviewer (scarlettjlee) historically pushed back on widening this exact interface with implementation details.

Suggested fix: declare a private ICellPendingLocalOpMetadata extends ICellLocalOpMetadata inside cell.ts and use that as the concrete metadata type at the three creation/consumption sites — interfaces.ts then doesn't need the core-utils/internal import. If pendingNode must remain on the exported type for a reason not visible in the diff, mark it optional and tag it @internal. The O(1) ACK and the future O(1) arbitrary-removal capability are preserved either way.

*/
pendingNode: ListNode<number>;

/**
* The value of the {@link ISharedCell} prior to this operation (op).
*/
Expand Down
Loading