silo/docs/DAG.md
Forbes 9a8b3150ff docs: add DAG and worker system specifications
DAG.md describes the two-tier dependency graph (BOM DAG + feature DAG),
node/edge data model, validation states, dirty propagation, forward/backward
cone queries, DAG sync payload format, and REST API.

WORKERS.md describes the general-purpose async compute job system: YAML job
definitions, job lifecycle (pending→claimed→running→completed/failed),
runner registration and authentication, claim semantics (SELECT FOR UPDATE
SKIP LOCKED), timeout enforcement, SSE events, and REST API.
2026-02-14 13:03:48 -06:00


# Dependency DAG Specification
**Status:** Draft
**Last Updated:** 2026-02-13
---
## 1. Purpose
The Dependency DAG is a server-side graph that tracks how features, constraints, and assembly relationships depend on each other. It enables three capabilities described in [MULTI_USER_EDITS.md](MULTI_USER_EDITS.md):
1. **Interference detection** -- comparing dependency cones of concurrent edit sessions to classify conflicts as none, soft, or hard before the user encounters them.
2. **Incremental validation** -- marking changed nodes dirty and propagating only through the affected subgraph, using input-hash memoization to stop early when inputs haven't changed.
3. **Structured merge safety** -- walking the DAG to determine whether concurrent edits share upstream dependencies, deciding if auto-merge is safe or manual review is required.
---
## 2. Two-Tier Model
Silo maintains two levels of dependency graph:
### 2.1 BOM DAG (existing)
The assembly-to-part relationship graph already stored in the `relationships` table. Each row represents a parent item containing a child item with a quantity and relationship type (`component`, `alternate`, `reference`). This graph is queried via `GetBOM`, `GetExpandedBOM`, `GetWhereUsed`, and `HasCycle` in `internal/db/relationships.go`.
The BOM DAG is **not modified** by this specification. It continues to serve its existing purpose.
### 2.2 Feature DAG (new)
A finer-grained graph stored in `dag_nodes` and `dag_edges` tables. Each node represents a feature within a single item's revision -- a sketch, pad, fillet, pocket, constraint, body, or part-level container. Edges represent "depends on" relationships: if Pad003 depends on Sketch001, an edge runs from Sketch001 to Pad003.
The feature DAG is populated by clients (silo-mod) when users save, or by runners after compute jobs. Silo stores and queries it but does not generate it -- the Create client has access to the feature tree and is the authoritative source.
### 2.3 Cross-Item Edges
Assembly constraints often reference geometry on child parts (e.g., "mate Face6 of PartA to Face2 of PartB"). These cross-item dependencies are stored in `dag_cross_edges`, linking a node in one item to a node in another. Each cross-edge optionally references the `relationships` row that establishes the BOM connection.
---
## 3. Data Model
### 3.1 dag_nodes
| Column | Type | Description |
|--------|------|-------------|
| `id` | UUID | Primary key |
| `item_id` | UUID | FK to `items.id` |
| `revision_number` | INTEGER | Revision this DAG snapshot belongs to |
| `node_key` | TEXT | Feature name from Create (e.g., `Sketch001`, `Pad003`, `Body`) |
| `node_type` | TEXT | One of: `sketch`, `pad`, `pocket`, `fillet`, `chamfer`, `constraint`, `body`, `part`, `datum`, `mirror`, `pattern`, `boolean` |
| `properties_hash` | TEXT | SHA-256 of the node's parametric inputs (sketch coordinates, fillet radius, constraint values). Used for memoization -- if the hash hasn't changed, validation can skip this node. |
| `validation_state` | TEXT | One of: `clean`, `dirty`, `validating`, `failed` |
| `validation_msg` | TEXT | Error message when `validation_state = 'failed'` |
| `metadata` | JSONB | Type-specific data (sketch coords, feature params, constraint definitions) |
| `created_at` | TIMESTAMPTZ | Row creation time |
| `updated_at` | TIMESTAMPTZ | Last state change |
**Uniqueness:** `(item_id, revision_number, node_key)` -- one node per feature per revision.
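The table says `properties_hash` is a SHA-256 of the node's parametric inputs but does not say how those inputs are serialized. A minimal sketch, assuming canonical JSON (sorted keys, compact separators) as the serialization; both that choice and the function name `properties_hash` are illustrative, not part of this specification:

```python
import hashlib
import json


def properties_hash(inputs: dict) -> str:
    """Return a hex SHA-256 digest of a node's parametric inputs.

    Assumption: inputs are serialized as canonical JSON (sorted keys,
    compact separators) so that key order does not change the hash.
    """
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same inputs in a different key order hash identically.
pad_a = properties_hash({"length": 15.0, "direction": [0, 0, 1]})
pad_b = properties_hash({"direction": [0, 0, 1], "length": 15.0})
# A changed parameter produces a different hash.
pad_c = properties_hash({"length": 16.0, "direction": [0, 0, 1]})
```

Whatever serialization is chosen, clients and runners would need to agree on it, otherwise identical inputs could produce different hashes and defeat memoization.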
### 3.2 dag_edges
| Column | Type | Description |
|--------|------|-------------|
| `id` | UUID | Primary key |
| `source_node_id` | UUID | FK to `dag_nodes.id` -- the upstream node |
| `target_node_id` | UUID | FK to `dag_nodes.id` -- the downstream node that depends on source |
| `edge_type` | TEXT | `depends_on` (default), `references`, `constrains` |
| `metadata` | JSONB | Optional edge metadata |
**Direction convention:** An edge from A to B means "B depends on A". A is upstream, B is downstream. Forward-cone traversal from A walks edges where A is the source.
**Uniqueness:** `(source_node_id, target_node_id, edge_type)`.
**Constraint:** `source_node_id != target_node_id` (no self-edges).
### 3.3 dag_cross_edges
| Column | Type | Description |
|--------|------|-------------|
| `id` | UUID | Primary key |
| `source_node_id` | UUID | FK to `dag_nodes.id` -- node in item A |
| `target_node_id` | UUID | FK to `dag_nodes.id` -- node in item B |
| `relationship_id` | UUID | FK to `relationships.id` (nullable) -- the BOM entry connecting the two items |
| `edge_type` | TEXT | `assembly_ref` (default) |
| `metadata` | JSONB | Reference details (face ID, edge ID, etc.) |
**Uniqueness:** `(source_node_id, target_node_id)`.
---
## 4. Validation States
Each node has a `validation_state` that tracks whether its computed geometry is current:
| State | Meaning |
|-------|---------|
| `clean` | Node's geometry matches its `properties_hash`. No recompute needed. |
| `dirty` | The node's own inputs changed, it was marked dirty manually, or an upstream change has propagated to it. Recompute required. |
| `validating` | A compute job is currently revalidating this node. |
| `failed` | Recompute failed. `validation_msg` contains the error. |
### 4.1 State Transitions
```
clean → dirty (upstream change detected, or MarkDirty called)
dirty → validating (compute job claims this node)
validating → clean (recompute succeeded, properties_hash updated)
validating → failed (recompute produced an error)
failed → dirty (upstream change detected, retry possible)
dirty → clean (properties_hash matches previous -- memoization shortcut)
```
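The transition list above can be encoded as a small lookup table so that illegal state changes are rejected early. This is a sketch only; the names `ALLOWED_TRANSITIONS`, `can_transition`, and `transition` are hypothetical and do not refer to existing Silo code:

```python
# Allowed validation_state transitions, transcribed from the list above.
ALLOWED_TRANSITIONS = {
    ("clean", "dirty"),        # upstream change detected, or MarkDirty called
    ("dirty", "validating"),   # compute job claims this node
    ("validating", "clean"),   # recompute succeeded
    ("validating", "failed"),  # recompute produced an error
    ("failed", "dirty"),       # upstream change detected, retry possible
    ("dirty", "clean"),        # memoization shortcut
}


def can_transition(current: str, new: str) -> bool:
    """Return True if the state change appears in the transition list."""
    return (current, new) in ALLOWED_TRANSITIONS


def transition(current: str, new: str) -> str:
    """Return the new state, or raise ValueError for an unlisted change."""
    if not can_transition(current, new):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```

For example, `transition("dirty", "validating")` returns `"validating"`, while `transition("clean", "validating")` raises `ValueError` because a node must be dirty before a job can claim it.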
### 4.2 Dirty Propagation
When a node is marked dirty, all downstream nodes in its forward cone are also marked dirty. This is done atomically in a single recursive CTE:
```sql
WITH RECURSIVE forward_cone AS (
    SELECT $1::uuid AS node_id
    UNION
    SELECT e.target_node_id
    FROM dag_edges e
    JOIN forward_cone fc ON fc.node_id = e.source_node_id
)
-- 'failed' is included so the failed → dirty transition in 4.1 can fire
UPDATE dag_nodes SET validation_state = 'dirty', updated_at = now()
WHERE id IN (SELECT node_id FROM forward_cone)
  AND validation_state IN ('clean', 'failed');
```
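The same propagation can be modelled in memory, which is convenient for unit tests that do not need a database. A minimal sketch, assuming node keys as plain strings and edges as `(source, target)` pairs; the function name `mark_dirty` and the `markable` parameter are hypothetical. The default for `markable` follows the transitions in Section 4.1, where only `clean` and `failed` nodes move to `dirty`:

```python
from collections import deque


def mark_dirty(edges, states, start, markable=("clean", "failed")):
    """Mark `start` and its forward cone dirty; return the keys that changed.

    edges:  iterable of (source, target) pairs; target depends on source.
    states: dict of node key -> validation_state, updated in place.
    """
    downstream = {}
    for source, target in edges:
        downstream.setdefault(source, []).append(target)

    seen = {start}
    queue = deque([start])
    changed = []
    while queue:
        node = queue.popleft()
        # Only overwrite states that are allowed to become dirty.
        if states.get(node) in markable:
            states[node] = "dirty"
            changed.append(node)
        # The walk continues even through nodes that were not overwritten.
        for child in downstream.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return changed


edges = [("Sketch001", "Pad003"), ("Pad003", "Fillet001"), ("Sketch002", "Pocket001")]
states = {
    "Sketch001": "clean",
    "Pad003": "validating",
    "Fillet001": "failed",
    "Sketch002": "clean",
    "Pocket001": "clean",
}
changed = mark_dirty(edges, states, "Sketch001")
```

Here `Pad003` is left in `validating` because Section 4.1 lists no transition from `validating` to `dirty`; how a node that is mid-validation should react to a new upstream change is not specified here.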
### 4.3 Memoization
Before marking a node dirty, the system can compare the new `properties_hash` against the stored value. If they match, the change did not affect this node's inputs, and propagation stops. This is the memoization boundary described in MULTI_USER_EDITS.md Section 5.2.
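One reading of this rule is a guard applied at the node where a change is reported: propagation starts only if the hash actually differs. The sketch below takes that reading; the helper name `hash_changed` is hypothetical, and treating a node with no stored hash as changed is an assumption:

```python
def hash_changed(stored_hashes: dict, node_key: str, new_hash: str) -> bool:
    """True when the node's inputs changed, or when the node is new."""
    return stored_hashes.get(node_key) != new_hash


stored = {"Sketch001": "aaa", "Pad003": "bbb"}
incoming = {"Sketch001": "aaa", "Pad003": "ccc", "Fillet001": "ddd"}

# Only these nodes would start a dirty propagation; Sketch001 is skipped
# because its hash matches the stored value (the memoization boundary).
to_propagate = [key for key, h in incoming.items() if hash_changed(stored, key, h)]
```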
---
## 5. Graph Queries
### 5.1 Forward Cone
Returns all nodes downstream of a given node -- everything that would be affected if the source node changes. Used for interference detection: if two users' forward cones overlap, there is potential interference.
```sql
WITH RECURSIVE forward_cone AS (
    SELECT target_node_id AS node_id
    FROM dag_edges WHERE source_node_id = $1
    UNION
    SELECT e.target_node_id
    FROM dag_edges e
    JOIN forward_cone fc ON fc.node_id = e.source_node_id
)
SELECT n.* FROM dag_nodes n JOIN forward_cone fc ON n.id = fc.node_id;
```
### 5.2 Backward Cone
Returns all nodes upstream of a given node -- everything the target node depends on.
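The backward cone is plausibly the mirror image of the forward-cone query in 5.1: start from edges whose target is the given node and walk from target to source. The sketch below runs such a query against an in-memory SQLite database as a stand-in for the PostgreSQL schema. The cut-down `dag_edges` table, the text keys in place of UUIDs, the `?` placeholder in place of `$1`, and the function name `backward_cone` are all assumptions made for illustration:

```python
import sqlite3

# Mirror of the forward-cone CTE: follow edges from target back to source.
BACKWARD_CONE_SQL = """
WITH RECURSIVE backward_cone(node_id) AS (
    SELECT source_node_id FROM dag_edges WHERE target_node_id = ?
    UNION
    SELECT e.source_node_id
    FROM dag_edges e
    JOIN backward_cone bc ON bc.node_id = e.target_node_id
)
SELECT node_id FROM backward_cone ORDER BY node_id
"""


def backward_cone(conn, node_id):
    """Return the ids of every node upstream of node_id, sorted by id."""
    return [row[0] for row in conn.execute(BACKWARD_CONE_SQL, (node_id,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_edges (source_node_id TEXT, target_node_id TEXT)")
conn.executemany(
    "INSERT INTO dag_edges VALUES (?, ?)",
    [("Sketch001", "Pad003"), ("Pad003", "Fillet001"), ("Sketch002", "Pocket001")],
)
upstream = backward_cone(conn, "Fillet001")
```

As with the forward cone, the starting node itself is not part of the result.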
### 5.3 Dirty Subgraph
Returns all nodes for a given item where `validation_state != 'clean'`, along with their edges. This is the input to an incremental validation job.
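A sketch of this selection over an in-memory graph. The text does not say whether "their edges" means edges with both endpoints non-clean or edges touching at least one non-clean node; the sketch assumes the latter so that clean upstream inputs at the boundary stay visible. The function name `dirty_subgraph` is hypothetical:

```python
def dirty_subgraph(states, edges):
    """Return (sorted non-clean node keys, edges touching any of them).

    states: dict of node key -> validation_state.
    edges:  list of (source, target) pairs.
    """
    not_clean = {key for key, state in states.items() if state != "clean"}
    touching = [(s, t) for s, t in edges if s in not_clean or t in not_clean]
    return sorted(not_clean), touching


states = {
    "Sketch001": "clean",
    "Pad003": "dirty",
    "Fillet001": "failed",
    "Sketch002": "clean",
    "Pocket001": "clean",
}
edges = [("Sketch001", "Pad003"), ("Pad003", "Fillet001"), ("Sketch002", "Pocket001")]
nodes, sub_edges = dirty_subgraph(states, edges)
```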
### 5.4 Cycle Detection
Before adding an edge, check that it would not create a cycle. Uses the same recursive ancestor-walk pattern as `HasCycle` in `internal/db/relationships.go`.
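The check rests on one observation: adding an edge from `source` to `target` closes a cycle exactly when `target` is already an ancestor of `source`. The sketch below walks ancestors over an in-memory edge list. It illustrates the idea only and is not a transcription of `HasCycle`, whose implementation is not shown in this document; the name `would_create_cycle` is hypothetical:

```python
def would_create_cycle(edges, source, target):
    """True if adding the edge source -> target would create a cycle.

    edges: iterable of existing (source, target) pairs.
    """
    if source == target:
        return True  # self-edges are rejected outright

    upstream = {}
    for s, t in edges:
        upstream.setdefault(t, []).append(s)

    # Walk the ancestors of `source`; reaching `target` means a path
    # target -> ... -> source already exists.
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        for parent in upstream.get(node, []):
            if parent == target:
                return True
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return False


edges = [("Sketch001", "Pad003"), ("Pad003", "Fillet001")]
closes_loop = would_create_cycle(edges, "Fillet001", "Sketch001")
is_safe = not would_create_cycle(edges, "Sketch001", "Fillet001")
```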
---
## 6. DAG Sync
Clients push the full feature DAG to Silo via `PUT /api/items/{partNumber}/dag`. The sync payload is a JSON document:
```json
{
  "revision": 3,
  "nodes": [
    {
      "key": "Sketch001",
      "type": "sketch",
      "properties_hash": "a1b2c3...",
      "metadata": {
        "coordinates": [[0, 0], [10, 0], [10, 5]],
        "constraints": ["horizontal", "vertical"]
      }
    },
    {
      "key": "Pad003",
      "type": "pad",
      "properties_hash": "d4e5f6...",
      "metadata": {
        "length": 15.0,
        "direction": [0, 0, 1]
      }
    }
  ],
  "edges": [
    {
      "source": "Sketch001",
      "target": "Pad003",
      "type": "depends_on"
    }
  ]
}
```
The server processes this within a single transaction:
1. Upsert all nodes (matched by `item_id + revision_number + node_key`).
2. Replace all edges for this item/revision.
3. Compare new `properties_hash` values against stored values to detect changes.
4. Mark changed nodes and their forward cones dirty.
5. Publish `dag.updated` SSE event.
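The steps above can be sketched against an in-memory store. This is an illustration under stated assumptions, not the server implementation: the `store` layout and the name `sync_dag` are hypothetical, new nodes are assumed to start as `clean` before being marked, nodes missing from the payload are left untouched, and the transaction and SSE publication are left to the caller:

```python
from collections import deque


def sync_dag(store, payload):
    """Apply a sync payload in place; return the sorted keys in the affected cones.

    store: {"hashes": {key: hash}, "states": {key: state}, "edges": [(src, dst)]}
    """
    # Steps 1 and 3: upsert nodes and note which hashes changed.
    changed = []
    for node in payload["nodes"]:
        key = node["key"]
        if store["hashes"].get(key) != node["properties_hash"]:
            changed.append(key)
        store["hashes"][key] = node["properties_hash"]
        store["states"].setdefault(key, "clean")

    # Step 2: replace all edges for this item/revision.
    store["edges"] = [(e["source"], e["target"]) for e in payload["edges"]]

    # Step 4: mark changed nodes and their forward cones dirty.
    downstream = {}
    for source, target in store["edges"]:
        downstream.setdefault(source, []).append(target)
    queue, seen = deque(changed), set(changed)
    while queue:
        key = queue.popleft()
        if store["states"].get(key) in ("clean", "failed"):
            store["states"][key] = "dirty"
        for child in downstream.get(key, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)

    # Step 5 (publishing dag.updated) would follow here.
    return sorted(seen)


store = {
    "hashes": {"Sketch001": "h1", "Pad003": "h2", "Fillet001": "h3"},
    "states": {"Sketch001": "clean", "Pad003": "clean", "Fillet001": "clean"},
    "edges": [],
}
payload = {
    "revision": 3,
    "nodes": [
        {"key": "Sketch001", "type": "sketch", "properties_hash": "h1"},
        {"key": "Pad003", "type": "pad", "properties_hash": "h2-new"},
        {"key": "Fillet001", "type": "fillet", "properties_hash": "h3"},
    ],
    "edges": [
        {"source": "Sketch001", "target": "Pad003", "type": "depends_on"},
        {"source": "Pad003", "target": "Fillet001", "type": "depends_on"},
    ],
}
affected = sync_dag(store, payload)
```

Only `Pad003` changed, so `Pad003` and its downstream `Fillet001` become dirty while `Sketch001` stays clean. Pushing the same payload a second time affects nothing, because every hash now matches the stored value.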
---
## 7. Interference Detection
When a user registers an edit context (MULTI_USER_EDITS.md Section 3.1), the server:
1. Looks up the node(s) being edited by `node_key` within the item's current revision.
2. Computes the forward cone for those nodes.
3. Compares the cone against all active edit sessions' cones.
4. Classifies interference:
   - **No overlap** → no interference, fully concurrent.
   - **Overlap, different objects** → soft interference, visual indicator via SSE.
   - **Same object, same edit type** → hard interference, edit blocked.
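The classification can be sketched as a comparison of two sessions. The `EditSession` type, its fields, and the edit-type labels are hypothetical. The list above does not cover the case of the same object with a different edit type; the sketch treats it as soft interference, which is an assumption:

```python
from dataclasses import dataclass, field


@dataclass
class EditSession:
    edited: set      # node keys being edited
    edit_type: str   # hypothetical label, e.g. "parameter"
    cone: set = field(default_factory=set)  # forward cone of the edited nodes


def classify(a: EditSession, b: EditSession) -> str:
    """Return "none", "soft", or "hard" for a pair of edit sessions."""
    # Same object and same edit type: hard interference.
    if (a.edited & b.edited) and a.edit_type == b.edit_type:
        return "hard"
    # Any other overlap between the affected regions: soft interference.
    if (a.cone | a.edited) & (b.cone | b.edited):
        return "soft"
    return "none"


a = EditSession({"Sketch001"}, "parameter", {"Pad003", "Fillet001"})
b = EditSession({"Sketch002"}, "parameter", {"Pocket001"})
c = EditSession({"Pad003"}, "parameter", {"Fillet001"})
d = EditSession({"Sketch001"}, "parameter", {"Pad003", "Fillet001"})
results = (classify(a, b), classify(a, c), classify(a, d))
```

The three pairs evaluate to `none`, `soft`, and `hard` respectively.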
---
## 8. REST API
All endpoints are under `/api/items/{partNumber}` and require authentication.
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| `GET` | `/dag` | viewer | Get full feature DAG for current revision |
| `GET` | `/dag/forward-cone/{nodeKey}` | viewer | Get forward dependency cone |
| `GET` | `/dag/dirty` | viewer | Get dirty subgraph |
| `PUT` | `/dag` | editor | Sync full feature tree (from client or runner) |
| `POST` | `/dag/mark-dirty/{nodeKey}` | editor | Manually mark a node and its cone dirty |
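A client-side sketch of building requests for these endpoints with the Python standard library. The host name, the part number, the helper name `dag_request`, and the use of a bearer token are assumptions; this document only states that authentication is required, not which scheme is used:

```python
from urllib.parse import quote
from urllib.request import Request


def dag_request(base_url, part_number, *segments, method="GET", token=None):
    """Build (but do not send) a request under /api/items/{partNumber}/dag."""
    parts = ("api", "items", part_number, "dag", *segments)
    path = "/".join(quote(part, safe="") for part in parts)
    req = Request(f"{base_url.rstrip('/')}/{path}", method=method)
    if token:
        # Assumed scheme; replace with whatever the server actually expects.
        req.add_header("Authorization", f"Bearer {token}")
    return req


cone_req = dag_request("https://silo.example.com", "PN-0001", "forward-cone", "Sketch001")
mark_req = dag_request(
    "https://silo.example.com", "PN-0001", "mark-dirty", "Pad003",
    method="POST", token="example-token",
)
```

Passing either request to `urllib.request.urlopen` would send it. Each path segment is percent-encoded so that part numbers or node keys containing reserved characters do not break the URL.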
---
## 9. References
- [MULTI_USER_EDITS.md](MULTI_USER_EDITS.md) -- Full multi-user editing specification
- [WORKERS.md](WORKERS.md) -- Worker/runner system that executes validation jobs
- [ROADMAP.md](ROADMAP.md) -- Tier 0 Dependency DAG entry