# Dependency DAG Specification

**Status:** Draft
**Last Updated:** 2026-02-13

---

## 1. Purpose

The Dependency DAG is a server-side graph that tracks how features, constraints, and assembly relationships depend on each other. It enables three capabilities described in [MULTI_USER_EDITS.md](MULTI_USER_EDITS.md):

1. **Interference detection** -- comparing dependency cones of concurrent edit sessions to classify conflicts as none, soft, or hard before the user encounters them.
2. **Incremental validation** -- marking changed nodes dirty and propagating only through the affected subgraph, using input-hash memoization to stop early when inputs haven't changed.
3. **Structured merge safety** -- walking the DAG to determine whether concurrent edits share upstream dependencies, deciding if auto-merge is safe or manual review is required.

---

## 2. Two-Tier Model

Silo maintains two levels of dependency graph:

### 2.1 BOM DAG (existing)

The assembly-to-part relationship graph already stored in the `relationships` table. Each row represents a parent item containing a child item with a quantity and relationship type (`component`, `alternate`, `reference`). This graph is queried via `GetBOM`, `GetExpandedBOM`, `GetWhereUsed`, and `HasCycle` in `internal/db/relationships.go`.

The BOM DAG is **not modified** by this specification. It continues to serve its existing purpose.

### 2.2 Feature DAG (new)

A finer-grained graph stored in `dag_nodes` and `dag_edges` tables. Each node represents a feature within a single item's revision -- a sketch, pad, fillet, pocket, constraint, body, or part-level container. Edges represent "depends on" relationships: if Pad003 depends on Sketch001, an edge runs from Sketch001 to Pad003.

The feature DAG is populated by clients (silo-mod) when users save, or by runners after compute jobs. Silo stores and queries it but does not generate it -- the Create client has access to the feature tree and is the authoritative source.

### 2.3 Cross-Item Edges

Assembly constraints often reference geometry on child parts (e.g., "mate Face6 of PartA to Face2 of PartB"). These cross-item dependencies are stored in `dag_cross_edges`, linking a node in one item to a node in another. Each cross-edge optionally references the `relationships` row that establishes the BOM connection.

---

## 3. Data Model

### 3.1 dag_nodes

| Column | Type | Description |
|--------|------|-------------|
| `id` | UUID | Primary key |
| `item_id` | UUID | FK to `items.id` |
| `revision_number` | INTEGER | Revision this DAG snapshot belongs to |
| `node_key` | TEXT | Feature name from Create (e.g., `Sketch001`, `Pad003`, `Body`) |
| `node_type` | TEXT | One of: `sketch`, `pad`, `pocket`, `fillet`, `chamfer`, `constraint`, `body`, `part`, `datum`, `mirror`, `pattern`, `boolean` |
| `properties_hash` | TEXT | SHA-256 of the node's parametric inputs (sketch coordinates, fillet radius, constraint values). Used for memoization -- if the hash hasn't changed, validation can skip this node. |
| `validation_state` | TEXT | One of: `clean`, `dirty`, `validating`, `failed` |
| `validation_msg` | TEXT | Error message when `validation_state = 'failed'` |
| `metadata` | JSONB | Type-specific data (sketch coords, feature params, constraint definitions) |
| `created_at` | TIMESTAMPTZ | Row creation time |
| `updated_at` | TIMESTAMPTZ | Last state change |

**Uniqueness:** `(item_id, revision_number, node_key)` -- one node per feature per revision.

### 3.2 dag_edges

| Column | Type | Description |
|--------|------|-------------|
| `id` | UUID | Primary key |
| `source_node_id` | UUID | FK to `dag_nodes.id` -- the upstream node |
| `target_node_id` | UUID | FK to `dag_nodes.id` -- the downstream node that depends on source |
| `edge_type` | TEXT | `depends_on` (default), `references`, `constrains` |
| `metadata` | JSONB | Optional edge metadata |

**Direction convention:** An edge from A to B means "B depends on A". A is upstream, B is downstream. Forward-cone traversal from A walks edges where A is the source.

**Uniqueness:** `(source_node_id, target_node_id, edge_type)`.

**Constraint:** `source_node_id != target_node_id` (no self-edges).

### 3.3 dag_cross_edges

| Column | Type | Description |
|--------|------|-------------|
| `id` | UUID | Primary key |
| `source_node_id` | UUID | FK to `dag_nodes.id` -- node in item A |
| `target_node_id` | UUID | FK to `dag_nodes.id` -- node in item B |
| `relationship_id` | UUID | FK to `relationships.id` (nullable) -- the BOM entry connecting the two items |
| `edge_type` | TEXT | `assembly_ref` (default) |
| `metadata` | JSONB | Reference details (face ID, edge ID, etc.) |

**Uniqueness:** `(source_node_id, target_node_id)`.

---

## 4. Validation States

Each node has a `validation_state` that tracks whether its computed geometry is current:

| State | Meaning |
|-------|---------|
| `clean` | Node's geometry matches its `properties_hash`. No recompute needed. |
| `dirty` | An upstream change has propagated to this node. Recompute required. |
| `validating` | A compute job is currently revalidating this node. |
| `failed` | Recompute failed. `validation_msg` contains the error. |

### 4.1 State Transitions

```
clean → dirty       (upstream change detected, or MarkDirty called)
dirty → validating  (compute job claims this node)
validating → clean  (recompute succeeded, properties_hash updated)
validating → failed (recompute produced an error)
failed → dirty      (upstream change detected, retry possible)
dirty → clean       (properties_hash matches previous -- memoization shortcut)
```
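
The transition list can be encoded as a lookup table so that callers reject any state change the list does not name. The following is a minimal sketch, not the server's implementation; the names `ValidationState`, `allowed`, and `CanTransition` are hypothetical.

```go
package main

// ValidationState mirrors the four values of dag_nodes.validation_state.
type ValidationState string

const (
	Clean      ValidationState = "clean"
	Dirty      ValidationState = "dirty"
	Validating ValidationState = "validating"
	Failed     ValidationState = "failed"
)

// allowed lists the transitions from Section 4.1, keyed by source state.
var allowed = map[ValidationState][]ValidationState{
	Clean:      {Dirty},
	Dirty:      {Validating, Clean}, // dirty -> clean is the memoization shortcut
	Validating: {Clean, Failed},
	Failed:     {Dirty},
}

// CanTransition reports whether from -> to appears in the transition list.
func CanTransition(from, to ValidationState) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}
```

With this table, `CanTransition(Failed, Clean)` is false: a failed node has to pass through `dirty` and `validating` again before it can become clean.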

### 4.2 Dirty Propagation

When a node is marked dirty, every node in its forward cone is marked dirty as well. A single statement built on a recursive CTE does this atomically. The CTE seeds the cone with the changed node, so that node is updated too:

```sql
WITH RECURSIVE forward_cone AS (
    SELECT $1::uuid AS node_id
    UNION
    SELECT e.target_node_id
    FROM dag_edges e
    JOIN forward_cone fc ON fc.node_id = e.source_node_id
)
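-- Clean and failed nodes both move to dirty (Section 4.1); nodes that are
-- already dirty or validating are left as they are.
UPDATE dag_nodes SET validation_state = 'dirty', updated_at = now()
WHERE id IN (SELECT node_id FROM forward_cone)
  AND validation_state IN ('clean', 'failed');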
```
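
The same traversal can be illustrated in memory, which is convenient for unit tests that have no database. This is a sketch under assumptions: `Edge` and `MarkDirty` are hypothetical names, node IDs are plain strings, and the function applies both transitions into `dirty` that Section 4.1 lists (from `clean` and from `failed`).

```go
package main

// Edge follows the document's convention: Target depends on Source.
type Edge struct{ Source, Target string }

// MarkDirty walks the start node and its forward cone breadth-first. Nodes in
// state "clean" or "failed" become "dirty"; other states are left untouched.
// states maps node ID to validation_state and is modified in place.
// The returned slice holds the IDs that changed, in visit order.
func MarkDirty(states map[string]string, edges []Edge, start string) []string {
	downstream := make(map[string][]string)
	for _, e := range edges {
		downstream[e.Source] = append(downstream[e.Source], e.Target)
	}
	seen := map[string]bool{start: true}
	queue := []string{start}
	var changed []string
	for len(queue) > 0 {
		id := queue[0]
		queue = queue[1:]
		if s := states[id]; s == "clean" || s == "failed" {
			states[id] = "dirty"
			changed = append(changed, id)
		}
		// Keep walking even through nodes that did not change state,
		// as the CTE does.
		for _, next := range downstream[id] {
			if !seen[next] {
				seen[next] = true
				queue = append(queue, next)
			}
		}
	}
	return changed
}
```

The `seen` set plays the role of `UNION` in the CTE: a node reachable along several paths is visited once.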

### 4.3 Memoization

Before marking a node dirty, the system can compare the new `properties_hash` against the stored value. If they match, the change did not affect this node's inputs, and propagation stops. This is the memoization boundary described in MULTI_USER_EDITS.md Section 5.2.
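
A minimal sketch of the hash comparison follows. It assumes the parametric inputs are serialized as JSON before hashing; this specification does not name a canonical encoding, so that choice, along with the names `PropertiesHash` and `NeedsDirty`, is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// PropertiesHash returns the hex-encoded SHA-256 of the JSON encoding of params.
// Assumption: JSON is the canonical form. encoding/json writes map keys in
// sorted order, so two maps with equal contents produce the same hash.
func PropertiesHash(params map[string]interface{}) (string, error) {
	raw, err := json.Marshal(params)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

// NeedsDirty reports whether a node must be marked dirty: only when the newly
// computed hash differs from the stored one. Equal hashes are the
// memoization boundary where propagation stops.
func NeedsDirty(storedHash, newHash string) bool {
	return storedHash != newHash
}
```

Any deterministic encoding would serve equally well, as long as clients and runners agree on it; otherwise identical inputs would hash differently and the memoization shortcut would never apply.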

---

## 5. Graph Queries

### 5.1 Forward Cone

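Returns all nodes downstream of a given node -- everything that would be affected if the source node changes. The start node itself is not part of the result, unlike the dirty-propagation CTE in Section 4.2, which seeds the cone with the changed node. Used for interference detection: if two users' forward cones overlap, there is potential interference.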

```sql
WITH RECURSIVE forward_cone AS (
    SELECT target_node_id AS node_id
    FROM dag_edges WHERE source_node_id = $1
    UNION
    SELECT e.target_node_id
    FROM dag_edges e
    JOIN forward_cone fc ON fc.node_id = e.source_node_id
)
SELECT n.* FROM dag_nodes n JOIN forward_cone fc ON n.id = fc.node_id;
```

### 5.2 Backward Cone

Returns all nodes upstream of a given node -- everything the target node depends on.
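
The backward cone is the forward-cone traversal with the edge direction reversed: start at the target and repeatedly follow edges from `target_node_id` back to `source_node_id`. The sketch below shows this on an in-memory edge list; `Edge` and `BackwardCone` are hypothetical names, and like the forward-cone query in Section 5.1 the result excludes the start node (assuming the graph is acyclic).

```go
package main

import "sort"

// Edge follows the document's convention: Target depends on Source.
type Edge struct{ Source, Target string }

// BackwardCone returns the sorted IDs of every node upstream of start.
func BackwardCone(edges []Edge, start string) []string {
	// Index edges by target so each step finds a node's direct dependencies.
	upstream := make(map[string][]string)
	for _, e := range edges {
		upstream[e.Target] = append(upstream[e.Target], e.Source)
	}
	seen := make(map[string]bool)
	stack := []string{start}
	for len(stack) > 0 {
		id := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, parent := range upstream[id] {
			if !seen[parent] {
				seen[parent] = true
				stack = append(stack, parent)
			}
		}
	}
	cone := make([]string, 0, len(seen))
	for id := range seen {
		cone = append(cone, id)
	}
	sort.Strings(cone) // deterministic order for callers and tests
	return cone
}
```

For the edges `Sketch001 → Pad003 → Fillet001`, the backward cone of `Fillet001` is `Pad003` and `Sketch001`.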

### 5.3 Dirty Subgraph

Returns all nodes for a given item where `validation_state != 'clean'`, along with their edges. This is the input to an incremental validation job.
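
As an illustration, the selection can be sketched over in-memory slices. `Node`, `Edge`, and `DirtySubgraph` are hypothetical names. One point is an assumption rather than something this specification states: "their edges" is read here as the edges whose endpoints are both non-clean.

```go
package main

// Node carries the two dag_nodes columns this query looks at.
type Node struct {
	Key   string
	State string // clean, dirty, validating, or failed
}

// Edge follows the document's convention: Target depends on Source.
type Edge struct{ Source, Target string }

// DirtySubgraph returns the nodes whose state is not "clean", plus the edges
// that connect two such nodes (assumption, see the lead-in).
func DirtySubgraph(nodes []Node, edges []Edge) ([]Node, []Edge) {
	inSet := make(map[string]bool)
	var dirtyNodes []Node
	for _, n := range nodes {
		if n.State != "clean" {
			inSet[n.Key] = true
			dirtyNodes = append(dirtyNodes, n)
		}
	}
	var dirtyEdges []Edge
	for _, e := range edges {
		if inSet[e.Source] && inSet[e.Target] {
			dirtyEdges = append(dirtyEdges, e)
		}
	}
	return dirtyNodes, dirtyEdges
}
```

If a validation job also needs the clean inputs that feed a dirty node, the edge filter would change to `inSet[e.Target]` alone.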

### 5.4 Cycle Detection

Before adding an edge, check that it would not create a cycle. Uses the same recursive ancestor-walk pattern as `HasCycle` in `internal/db/relationships.go`.
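
A new edge from `source` to `target` closes a cycle exactly when `target` is already an ancestor of `source`. The sketch below performs that ancestor walk on an in-memory edge list. It is not the code of `HasCycle`, which is not reproduced in this document; `Edge` and `WouldCreateCycle` are hypothetical names.

```go
package main

// Edge follows the document's convention: Target depends on Source.
type Edge struct{ Source, Target string }

// WouldCreateCycle reports whether adding the edge source -> target to edges
// would create a cycle. Self-edges count as cycles.
func WouldCreateCycle(edges []Edge, source, target string) bool {
	if source == target {
		return true
	}
	upstream := make(map[string][]string)
	for _, e := range edges {
		upstream[e.Target] = append(upstream[e.Target], e.Source)
	}
	// Walk the ancestors of source; reaching target means a path
	// target -> ... -> source already exists.
	seen := map[string]bool{source: true}
	stack := []string{source}
	for len(stack) > 0 {
		id := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, parent := range upstream[id] {
			if parent == target {
				return true
			}
			if !seen[parent] {
				seen[parent] = true
				stack = append(stack, parent)
			}
		}
	}
	return false
}
```

With `Sketch001 → Pad003 → Fillet001` in place, adding `Fillet001 → Sketch001` is rejected, while adding `Sketch001 → Fillet001` is a redundant but acyclic edge.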

---

## 6. DAG Sync

Clients push the full feature DAG to Silo via `PUT /api/items/{partNumber}/dag`. The sync payload is a JSON document:

```json
{
  "revision": 3,
  "nodes": [
    {
      "key": "Sketch001",
      "type": "sketch",
      "properties_hash": "a1b2c3...",
      "metadata": {
        "coordinates": [[0, 0], [10, 0], [10, 5]],
        "constraints": ["horizontal", "vertical"]
      }
    },
    {
      "key": "Pad003",
      "type": "pad",
      "properties_hash": "d4e5f6...",
      "metadata": {
        "length": 15.0,
        "direction": [0, 0, 1]
      }
    }
  ],
  "edges": [
    {
      "source": "Sketch001",
      "target": "Pad003",
      "type": "depends_on"
    }
  ]
}
```

The server processes this within a single transaction:

1. Upsert all nodes (matched by `item_id + revision_number + node_key`).
2. Replace all edges for this item/revision.
3. Compare each new `properties_hash` against the value that was stored before the upsert to detect changes.
4. Mark changed nodes and their forward cones dirty.
5. Publish a `dag.updated` SSE event.
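
The decoding and change-detection part of these steps can be sketched as follows. The struct and function names are hypothetical, and two behaviours are assumptions not stated above: a node with no stored hash counts as changed, and an edge naming a node that is absent from the payload is rejected.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SyncPayload mirrors the JSON document shown above.
type SyncPayload struct {
	Revision int        `json:"revision"`
	Nodes    []SyncNode `json:"nodes"`
	Edges    []SyncEdge `json:"edges"`
}

type SyncNode struct {
	Key            string          `json:"key"`
	Type           string          `json:"type"`
	PropertiesHash string          `json:"properties_hash"`
	Metadata       json.RawMessage `json:"metadata"`
}

type SyncEdge struct {
	Source string `json:"source"`
	Target string `json:"target"`
	Type   string `json:"type"`
}

// ChangedKeys decodes a sync payload and returns the keys of nodes whose hash
// differs from stored, a map of node_key to the hash held before the upsert.
func ChangedKeys(raw []byte, stored map[string]string) ([]string, error) {
	var p SyncPayload
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, fmt.Errorf("decode sync payload: %w", err)
	}
	known := make(map[string]bool, len(p.Nodes))
	var changed []string
	for _, n := range p.Nodes {
		known[n.Key] = true
		// Assumption: a node that was never stored counts as changed.
		if old, ok := stored[n.Key]; !ok || old != n.PropertiesHash {
			changed = append(changed, n.Key)
		}
	}
	for _, e := range p.Edges {
		// Assumption: edges may only reference nodes in the same payload.
		if !known[e.Source] || !known[e.Target] {
			return nil, fmt.Errorf("edge %s -> %s names an unknown node", e.Source, e.Target)
		}
	}
	return changed, nil
}
```

The returned keys are the seeds for the dirty propagation in step 4.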

---

## 7. Interference Detection

When a user registers an edit context (MULTI_USER_EDITS.md Section 3.1), the server:

1. Looks up the node(s) being edited by `node_key` within the item's current revision.
2. Computes the forward cone for those nodes.
3. Compares the cone against all active edit sessions' cones.
4. Classifies interference:
   - **No overlap** → no interference, fully concurrent.
   - **Overlap, different objects** → soft interference, visual indicator via SSE.
   - **Same object, same edit type** → hard interference, edit blocked.
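
The classification rules can be sketched as a pure function over two edit contexts. Everything named here (`EditContext`, `Classify`, the field names) is hypothetical. Two assumptions go beyond the list above: each cone is taken to include the edited node itself, as the CTE in Section 4.2 does, and the case the list leaves open (same object, different edit type) is treated as soft interference.

```go
package main

type Interference string

const (
	None Interference = "none"
	Soft Interference = "soft"
	Hard Interference = "hard"
)

// EditContext is a hypothetical in-memory view of one active edit session.
type EditContext struct {
	NodeKey  string          // object being edited
	EditType string          // kind of edit, as registered by the client
	Cone     map[string]bool // forward cone of NodeKey, assumed to include NodeKey
}

// Classify compares two edit contexts using the rules in this section.
func Classify(a, b EditContext) Interference {
	if a.NodeKey == b.NodeKey {
		if a.EditType == b.EditType {
			return Hard
		}
		// Assumption: not covered by the list above; treated as soft here.
		return Soft
	}
	if overlaps(a.Cone, b.Cone) {
		return Soft
	}
	return None
}

// overlaps reports whether the two sets share at least one node key.
func overlaps(a, b map[string]bool) bool {
	if len(a) > len(b) {
		a, b = b, a // iterate over the smaller set
	}
	for key := range a {
		if b[key] {
			return true
		}
	}
	return false
}
```

A server would run `Classify` for the new context against every active session and act on the most severe result.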

---

## 8. REST API

All endpoints are under `/api/items/{partNumber}` and require authentication.

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| `GET` | `/dag` | viewer | Get full feature DAG for current revision |
| `GET` | `/dag/forward-cone/{nodeKey}` | viewer | Get forward dependency cone |
| `GET` | `/dag/dirty` | viewer | Get dirty subgraph |
| `PUT` | `/dag` | editor | Sync full feature tree (from client or runner) |
| `POST` | `/dag/mark-dirty/{nodeKey}` | editor | Manually mark a node and its cone dirty |
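
As a client-side illustration, the sync endpoint can be addressed with the Go standard library as sketched below. `NewDAGSyncRequest` and the base URL are hypothetical. The `Content-Type` header is an assumption based on the JSON payload in Section 6, and authentication is left to the caller because this document does not specify the scheme.

```go
package main

import (
	"bytes"
	"net/http"
	"net/url"
)

// NewDAGSyncRequest builds the PUT /api/items/{partNumber}/dag request.
// baseURL is the server root without a trailing slash; body is the JSON
// sync payload. The caller adds whatever credentials the server expects.
func NewDAGSyncRequest(baseURL, partNumber string, body []byte) (*http.Request, error) {
	// PathEscape keeps part numbers with reserved characters in one segment.
	endpoint := baseURL + "/api/items/" + url.PathEscape(partNumber) + "/dag"
	req, err := http.NewRequest(http.MethodPut, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}
```

The returned request can be sent with `http.DefaultClient.Do(req)` once credentials have been attached.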

---

## 9. References

- [MULTI_USER_EDITS.md](MULTI_USER_EDITS.md) -- Full multi-user editing specification
- [WORKERS.md](WORKERS.md) -- Worker/runner system that executes validation jobs
- [ROADMAP.md](ROADMAP.md) -- Tier 0 Dependency DAG entry