Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: tree index api prototype #22491

Draft
wants to merge 37 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 36 commits
Commits
Show all changes
37 commits
Select commit Hold shift + click to select a range
25b0e1f
add noah's code
jenn-le Jul 30, 2024
9a701be
Merge remote-tracking branch 'upstream/main' into index-api
jenn-le Jul 30, 2024
7fa4244
format
jenn-le Jul 30, 2024
f4e3ced
start trying some type stuff
jenn-le Jul 30, 2024
8a1224b
format
jenn-le Jul 31, 2024
9587173
some fixes
jenn-le Jul 31, 2024
c305255
some type inference fixes
jenn-le Aug 1, 2024
1d73679
add some comments
jenn-le Aug 2, 2024
bcf4b4f
some more comments
jenn-le Aug 7, 2024
28ef175
more comments
jenn-le Aug 7, 2024
ad8c52a
more comments
jenn-le Aug 8, 2024
4bea967
Merge remote-tracking branch 'upstream/main' into index-api
jenn-le Aug 13, 2024
eb274ce
more cleanup
jenn-le Aug 17, 2024
8f84ef5
Merge remote-tracking branch 'upstream/main' into index-api
jenn-le Aug 19, 2024
849121a
some more comments
jenn-le Aug 22, 2024
80e77a2
some updates
jenn-le Aug 23, 2024
9f41b27
some type fixes
jenn-le Aug 26, 2024
db696d0
Merge remote-tracking branch 'upstream/main' into index-api
jenn-le Aug 26, 2024
95a78ef
format
jenn-le Aug 26, 2024
35bbda8
comment out stuff
jenn-le Aug 27, 2024
3ea7711
Merge remote-tracking branch 'upstream/main' into index-api
jenn-le Aug 27, 2024
2ab3e95
stuff
jenn-le Aug 27, 2024
42844b6
stuff
jenn-le Sep 3, 2024
ef70f40
Merge remote-tracking branch 'upstream/main' into index-api
jenn-le Sep 4, 2024
d9eaa93
stuff
jenn-le Sep 4, 2024
6b2075c
stuff
jenn-le Sep 4, 2024
4b07bbf
Merge remote-tracking branch 'upstream/main' into index-api
jenn-le Sep 9, 2024
25e29b1
stuff to get tests working
jenn-le Sep 11, 2024
a38f598
fix deps and get tests passing
jenn-le Sep 12, 2024
459d17c
some changes
jenn-le Sep 13, 2024
8dee5df
Merge remote-tracking branch 'upstream/main' into index-api
jenn-le Sep 13, 2024
2455119
fix test
jenn-le Sep 13, 2024
5cb8fa7
add skipped test and todos
jenn-le Sep 14, 2024
23de9d2
pr feedback
jenn-le Sep 16, 2024
850c146
pr feedback
jenn-le Sep 16, 2024
592d0fe
stuff
jenn-le Sep 16, 2024
e652c9d
remove onlys
jenn-le Sep 16, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions .changeset/stinky-shoes-melt.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
---
"fluid-framework": minor
---
---
"section": "tree"
---
Export alpha index APIs

`fluid-framework/beta` now contains the `@beta` APIs from `@fluidframework/tree/beta`.
15 changes: 15 additions & 0 deletions packages/dds/tree/api-report/tree.alpha.api.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,15 @@ export interface CommitMetadata {
readonly kind: CommitKind;
}

// @alpha
export function createIdentifierIndex(context: FlexTreeContext): IdentifierIndex;

// @alpha
export function createSimpleTreeIndex<TKey extends TreeValue, TValue>(context: FlexTreeContext, indexer: (schema: TreeNodeSchema) => KeyFinder<TKey> | undefined, getValue: (nodes: TreeIndexNodes<TreeNode>) => TValue): SimpleTreeIndex<TKey, TValue>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want this in alpha in the current form. Having users provide a KeyFinder (and take the burden of determinism/inval) seems like a bug pit. I feel like a reduced scope form (maybe they provide simple tree schema type + a value field pair and we have a KeyFinder that knows how to handle those) should be exposed first.

Copy link
Contributor

@taylorsw04 taylorsw04 Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely can't expose it until the value field mutation inval issue is solved (if it exists at all)


// @public (undocumented)
export function createSimpleTreeIndex<TKey extends TreeValue, TValue, TSchema extends TreeNodeSchema>(context: FlexTreeContext, indexer: (schema: TSchema) => KeyFinder<TKey> | undefined, getValue: (nodes: TreeIndexNodes<NodeFromSchema<TSchema>>) => TValue, indexableSchema: readonly TSchema[]): SimpleTreeIndex<TKey, TValue>;

// @public @sealed
interface DefaultProvider extends ErasedType<"@fluidframework/tree.FieldProvider"> {
}
Expand Down Expand Up @@ -87,6 +96,9 @@ type FlexListToUnion<TList extends FlexList> = ExtractItemType<TList[number]>;
// @alpha
export function getJsonSchema(schema: ImplicitAllowedTypes): JsonTreeSchema;

// @alpha
export type IdentifierIndex = SimpleTreeIndex<string, TreeNode>;

// @public
export type ImplicitAllowedTypes = AllowedTypes | TreeNodeSchema;

Expand Down Expand Up @@ -427,6 +439,9 @@ export class SchemaFactory<out TScope extends string | undefined = string | unde
// @public
type ScopedSchemaName<TScope extends string | undefined, TName extends number | string> = TScope extends undefined ? `${TName}` : `${TScope}.${TName}`;

// @alpha
export type SimpleTreeIndex<TKey extends TreeValue, TValue> = TreeIndex<TKey, TValue>;

// @public
export type TransactionConstraint = NodeInDocumentConstraint;

Expand Down
3 changes: 3 additions & 0 deletions packages/dds/tree/api-report/tree.beta.api.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,9 @@ export interface CommitMetadata {
readonly kind: CommitKind;
}

// @public (undocumented)
export function createSimpleTreeIndex<TKey extends TreeValue, TValue, TSchema extends TreeNodeSchema>(context: FlexTreeContext, indexer: (schema: TSchema) => KeyFinder<TKey> | undefined, getValue: (nodes: TreeIndexNodes<NodeFromSchema<TSchema>>) => TValue, indexableSchema: readonly TSchema[]): SimpleTreeIndex<TKey, TValue>;

// @public @sealed
interface DefaultProvider extends ErasedType<"@fluidframework/tree.FieldProvider"> {
}
Expand Down
3 changes: 3 additions & 0 deletions packages/dds/tree/api-report/tree.public.api.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,9 @@ export interface CommitMetadata {
readonly kind: CommitKind;
}

// @public (undocumented)
export function createSimpleTreeIndex<TKey extends TreeValue, TValue, TSchema extends TreeNodeSchema>(context: FlexTreeContext, indexer: (schema: TSchema) => KeyFinder<TKey> | undefined, getValue: (nodes: TreeIndexNodes<NodeFromSchema<TSchema>>) => TValue, indexableSchema: readonly TSchema[]): SimpleTreeIndex<TKey, TValue>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume these will go away (public and beta)?


// @public @sealed
interface DefaultProvider extends ErasedType<"@fluidframework/tree.FieldProvider"> {
}
Expand Down
8 changes: 8 additions & 0 deletions packages/dds/tree/src/feature-libraries/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -250,3 +250,11 @@ export {
tryGetMapTreeNode,
UnhydratedContext,
} from "./flex-map-tree/index.js";

export {
type KeyFinder,
AnchorTreeIndex,
hasElement,
type TreeIndex,
type TreeIndexNodes,
} from "./indexing/index.js";
291 changes: 291 additions & 0 deletions packages/dds/tree/src/feature-libraries/indexing/anchorTreeIndex.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,291 @@
/*!
* Copyright (c) Microsoft Corporation and contributors. All rights reserved.
* Licensed under the MIT License.
*/

import { assert } from "@fluidframework/core-utils/internal";
import { disposeSymbol, fail, getOrCreate } from "../../util/index.js";
import {
type Anchor,
type AnchorNode,
type FieldKey,
type IForestSubscription,
type ITreeSubscriptionCursor,
type TreeNodeSchemaIdentifier,
type TreeValue,
forEachField,
forEachNode,
} from "../../core/index.js";
import type { TreeIndex, TreeIndexNodes } from "./types.js";
import { TreeStatus } from "../flex-tree/index.js";

/**
* A function that gets the value to index a node on.
* The given cursor should point to the node that will be indexed.
*
* @returns a value on the node that the index will use as
*
* @remarks
* This function does not own the cursor in any way, it walks the cursor to find the key the node is indexed on
* but returns the cursor to the state it was in before being passed to the function. It should also not be disposed by this function
* and must be disposed elsewhere.
*/
export type KeyFinder<TKey extends TreeValue> = (tree: ITreeSubscriptionCursor) => TKey;

/**
* An index from some arbitrary keys to anchor nodes. Keys can be anything that is a {@link TreeValue}.
* A key can map to multiple nodes but each collection of nodes only results in a single value.
*
* @remarks
* Detached nodes are stored in the index but filtered out when any public facing apis are called. This means that
* calling {@link keys} will not include any keys that are stored in the index but only map to detached nodes.
*
* TODO: need to make sure key finders are deterministic or have a way to invalidate them
* TODO: the index does not update on leaf node changes
*/
export class AnchorTreeIndex<TKey extends TreeValue, TValue>
implements TreeIndex<TKey, TValue>
{
/**
* Caches {@link KeyFinder}s for each schema definition. If a schema maps to null, it does not
* need to be considered at all for this index. This allows us to skip subtrees that aren't relevant
* as a performance optimization.
*/
private readonly keyFinders = new Map<TreeNodeSchemaIdentifier, KeyFinder<TKey> | null>();
/**
* The actual index from keys to anchor nodes.
*/
private readonly nodes = new Map<TKey, TreeIndexNodes<AnchorNode>>();
/**
* Keeps track of anchors for disposal.
*/
private readonly anchors = new Map<AnchorNode, Anchor>();

/**
* @param forest - the forest that is being indexed
* @param indexer - a function that retrieves the key finder based on a given schema or undefined if the schema does not have an associated key finder
* @param getValue - a function that returns the value or undefined given at least one anchor node
* @param checkTreeStatus - a function that gets the tree status from an anchor node, used for filtering out detached nodes
*/
public constructor(
private readonly forest: IForestSubscription,
indexer: (schemaId: TreeNodeSchemaIdentifier) => KeyFinder<TKey> | undefined,
private readonly getValue: (anchorNodes: TreeIndexNodes<AnchorNode>) => TValue | undefined,
private readonly checkTreeStatus: (node: AnchorNode) => TreeStatus | undefined,
) {
/**
* Given a cursor in field mode, recursively indexes all nodes under the field.
*/
const indexField = (fieldCursor: ITreeSubscriptionCursor): void => {
forEachNode(fieldCursor, (nodeCursor) => {
jenn-le marked this conversation as resolved.
Show resolved Hide resolved
const keyFinder = getOrCreate(
this.keyFinders,
// the node schema type to look up
nodeCursor.type,
// if the indexer does not return a key finder for this schema, we cache a null value to indicate the indexer
// does not need to be called if this schema is encountered in the future
(schema) => indexer(schema) ?? null,
);

if (keyFinder !== null) {
const key = keyFinder(nodeCursor);
const anchor = nodeCursor.buildAnchor();
const anchorNode = forest.anchors.locate(anchor) ?? fail("expected anchor node");

const nodes = this.nodes.get(key);
if (nodes !== undefined) {
// if the key already exists in the index, the anchor node is appended to its list of nodes
this.nodes.set(key, [...nodes, anchorNode]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to clone the array rather than push to it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the array is readonly for api purposes and gets super ugly (not even sure if it's completely feasible) to make a bunch of casts

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we just need the output type (the thing we hand to getValue) to implement readonly array...which array does? can we not just type our internal type as arry, e.g. private readonly nodes = new Map<TKey, AnchorNode[]>();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we filter the array (meaning allocate a new one) before handing it to getValue anyway, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yeah, you're right, I was just doing something stupid when I tried it the first time

} else {
this.nodes.set(key, [anchorNode]);
}

this.anchors.set(anchorNode, anchor);
// when the anchor node is destroyed, delete it from the index
anchorNode.on("afterDestroy", () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having talked to Noah, we think this might be better written as a delta visitor instead of event subscriptions

const indexedNodes = this.nodes.get(key);
assert(
indexedNodes !== undefined,
"destroyed anchor node should be tracked by index",
);
const index = indexedNodes.indexOf(anchorNode);
assert(index !== -1, "destroyed anchor node should be tracked by index");
const newNodes = filterNodes(nodes, (n) => n !== anchorNode);
if (newNodes !== undefined) {
this.nodes.set(key, newNodes);
} else {
this.nodes.delete(key);
}
assert(
this.anchors.delete(anchorNode),
"destroyed anchor should be tracked by index",
);
});
}

forEachField(nodeCursor, (f) => {
indexField(f);
});
});
};

const detachedFieldKeys: FieldKey[] = [];
const detachedFieldsCursor = forest.getCursorAboveDetachedFields();
forEachField(detachedFieldsCursor, (field) => {
detachedFieldKeys.push(field.getFieldKey());
});

// index all existing trees (this includes the primary document tree and all other detached/removed trees)
for (const fieldKey of detachedFieldKeys) {
const cursor = forest.allocateCursor();
forest.tryMoveCursorToField({ fieldKey, parent: undefined }, cursor);
indexField(cursor);
cursor.free();
}

// index any new trees that are created later
// TODO this event isn't fired when leaf values are changed
forest.on("afterRootFieldCreated", (fieldKey) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this event guaranteed to fire for all content created in the tree, including modifications of value fields (e.g. user does foo.x = 5, and the x field is indexed)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not, I created a skipped test and added a todo to use a different event. Will add the proposal for it in the api doc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, so are indexes not totally broken until that works? is there another forest event that is fired when that happens that we can subscribe to?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@noencke is claiming that primitives are wrapped in nodes and so even value nodes start out detached and are moved in. this sounds vaguely familiar. have you tested this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, that's what I thought too but then I wrote that skipped test to make sure and it failed

const cursor = forest.allocateCursor();
forest.tryMoveCursorToField({ fieldKey, parent: undefined }, cursor);
indexField(cursor);
cursor.free();
});
}

/**
* Returns the value associated with the given key if it has been indexed
*/
public get(key: TKey): TValue | undefined {
return this.getFilteredValue(this.nodes.get(key));
}

/**
* Returns true iff the key exists in the index
*/
public has(key: TKey): boolean {
return this.get(key) !== undefined;
}

/**
* Returns the number of values that are indexed
*/
public get size(): number {
let s = 0;
for (const nodes of this.nodes.values()) {
if (this.getFilteredValue(nodes) !== undefined) {
s += 1;
}
}
return s;
}

/**
* Returns all keys in the index
*/
public *keys(): IterableIterator<TKey> {
for (const [key, nodes] of this.nodes.entries()) {
if (this.getFilteredValue(nodes) !== undefined) {
yield key;
}
}
}

/**
* Returns an iterable of values in the index
*/
public *values(): IterableIterator<TValue> {
for (const nodes of this.nodes.values()) {
const filtered = this.getFilteredValue(nodes);
if (filtered !== undefined) {
yield filtered;
}
}
}

/**
* Returns an iterable of key, value pairs for every entry in the index
*/
public *entries(): IterableIterator<[TKey, TValue]> {
for (const [key, nodes] of this.nodes.entries()) {
const filtered = this.getFilteredValue(nodes);
if (filtered !== undefined) {
yield [key, filtered];
}
}
}

public [Symbol.iterator](): IterableIterator<[TKey, TValue]> {
return this.entries();
}

/**
* Applies the provided callback to each entry in the index.
*/
public forEach(
callbackfn: (value: TValue, key: TKey, map: AnchorTreeIndex<TKey, TValue>) => void,
thisArg?: unknown,
): void {
for (const [key, nodes] of this.nodes.entries()) {
const filtered = this.getFilteredValue(nodes);
if (filtered !== undefined) {
callbackfn.call(thisArg, filtered, key, this);
}
}
}

/**
* Disposes this index and all the anchors it holds onto.
*/
public [disposeSymbol](): void {
for (const anchor of this.anchors.values()) {
this.forest.forgetAnchor(anchor);
}
this.anchors.clear();
Reflect.defineProperty(this, disposeSymbol, {
value: () => {
throw new Error("Index is already disposed");
},
});
}

/**
* Filters out any anchor nodes that are detached and returns the value for the remaining nodes.
*/
private getFilteredValue(
anchorNodes: TreeIndexNodes<AnchorNode> | undefined,
): TValue | undefined {
const attachedNodes = filterNodes(anchorNodes, (anchorNode) => {
const nodeStatus = this.checkTreeStatus(anchorNode);
return nodeStatus === TreeStatus.InDocument;
});

if (attachedNodes !== undefined) {
return this.getValue(attachedNodes as unknown as TreeIndexNodes<AnchorNode>);
}
}
}

/**
* Filters the given anchor nodes based on the given filter function.
*/
function filterNodes(
anchorNodes: readonly AnchorNode[] | undefined,
filter: (node: AnchorNode) => boolean,
): TreeIndexNodes<AnchorNode> | undefined {
if (anchorNodes !== undefined) {
const filteredNodes: readonly AnchorNode[] = anchorNodes.filter(filter);
if (hasElement(filteredNodes)) {
return filteredNodes;
}
}

return undefined;
}

/**
* Checks that an array is of the type {@link TreeIndexNodes} and has at least one element.
*/
export function hasElement<T>(array: readonly T[]): array is TreeIndexNodes<T> {
return array.length >= 1;
}
11 changes: 11 additions & 0 deletions packages/dds/tree/src/feature-libraries/indexing/index.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
/*!
* Copyright (c) Microsoft Corporation and contributors. All rights reserved.
* Licensed under the MIT License.
*/

export {
AnchorTreeIndex,
type KeyFinder,
hasElement,
} from "./anchorTreeIndex.js";
export type { TreeIndex, TreeIndexNodes } from "./types.js";
13 changes: 13 additions & 0 deletions packages/dds/tree/src/feature-libraries/indexing/types.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
/*!
* Copyright (c) Microsoft Corporation and contributors. All rights reserved.
* Licensed under the MIT License.
*/

import type { TreeValue } from "../../core/index.js";

/**
* an array of nodes that is guaranteed to have at least one element
*/
export type TreeIndexNodes<TNode> = readonly [first: TNode, ...rest: TNode[]];

export interface TreeIndex<TKey extends TreeValue, TValue> extends ReadonlyMap<TKey, TValue> {}
Loading
Loading