| `fiFamily` | `fi_fam fi` |
| `fiTyVars` | `fi_tvs fi` |
| `fiArgs` | `fi_tys fi`, filtered through `visibleArgs (famInstTyCon fi)` |
-| `fiRhs` | `fi_rhs fi` |
+| `fiRhs` | `fi_rhs fi` (note: for data fam-instances this is the synthetic R: TyCon, which we then rewrite — see below) |
| `fiSrc` | `coAxBranchSpan (coAxiomSingleBranch (fi_axiom fi))` |
| `fiIsData` | `fi_flavor fi == DataFamilyInst _` |
+| `fiDataCons` | `tyConDataCons rep_tc`, where `rep_tc` is pulled out of `DataFamilyInst rep_tc` (see below) |
+| `fiDefinedIn` | filled in by the merge step from the enclosing `ModuleDump`'s package id (after normalisation) |
A few extraction choices worth documenting:
The family's `Name` would point at the `data family` / `type family`
declaration, which is wrong for every instance after the first.
-- **Data-family RHS is intentionally hidden in the viewer.** GHC
- synthesises a fresh data-constructor TyCon for every
- `data instance`, with a name like `R:Foo[List]Intmk`. That name has
- no source-level meaning, so we only record it in the schema (for
- completeness) and the viewer drops it from labels when `fiIsData`.
+- **Data-family RHS is rewritten to the abstract family form.** GHC
+ represents `data instance Foo Args = …` internally as a synthetic
+ `R:FooArgs` TyCon. When `R:FooArgs` shows up inside a constraint
+ or instance head, the user reads it as `Foo Args` (the abstract
+ application). `Classgraph.Extract.typeArg` gates on
+ `tyConFamInst_maybe` *before* the regular `TyConApp` path: a TyCon
+ for which it returns `Just (parent, parentArgs)` is emitted as
+ `FamilyApp parent parentArgs'`, where `parentArgs'` is
+ `parentArgs` after substituting the rep TyCon's tyvars
+ (`tyConTyVars rep_tc`) with the use-site args from
+ `splitTyConApp_maybe`. Without that substitution, `parentArgs`
+ references the rep's *internal* tyvars and they leak through
+ `typeArg` as `OtherArg "<reptv>"` because they're not in the
+ caller's `boundTvs`.
+
+- **Data-family RHS is *also* circular by construction.** The
+ rewrite above means the `fiRhs` of a data fam-instance is
+ structurally equal to its LHS (e.g. for `data instance Crate
+ Int = …`, `fiRhs` is `FamilyApp Crate [Int]`). The viewer hides
+ the RHS for data fam-instances in node labels (and in the side
+ panel — see below); `reduceTypeArg` skips `fiIsData` rows
+ entirely to avoid an infinite recursion that would otherwise
+ loop forever rewriting `Crate Int` → `Crate Int`.
+
+- **`fiDataCons` from `rep_tc`, not `famInstTyCon`.** For a data
+ fam-instance, `famInstTyCon` returned the family TyCon (not the
+ rep) on at least GHC 9.14.1, opposite to what the GHC docs
+ suggested. We pull the rep TyCon directly out of the
+ `DataFamilyInst rep_tc` constructor and call `tyConDataCons` on
+ it. Each `DataCon` becomes a `DataConInfo` with the constructor's
+ name, its arg types (as `TypeArg`s in the fam-instance's
+ `fiTyVars` scope), and its field labels (when record syntax is
+ used; otherwise an empty list). The viewer renders just the
+ constructor *names* on the node label (`Family args = ConA |
+ ConB`), and the full constructor declarations (with field types
+ / record fields) in the side panel.
## Type-family resolution in the viewer
2. **For each fam-instance `fi` of that family:**
- 1. **Bidirectional unification** between `fi.fiArgs` and the
- `FamilyApp`'s args. `biUnify` allows TyVarRefs on either side to
- bind to anything on the other side. *If unification fails, this
- `fi` is irrelevant — skip it entirely.* (This is what stops e.g.
- a `Foo Int`-shaped predicate from being matched against a
- `type instance F Bool = …` equation.)
-
- 2. **Substitute `fi.fiRhs` into the predicate.** The bidirectional
+ 1. **Reduce the pivot's sub-args first.** Before unifying we run
+ `reduceTypeArg` over the *FamilyApp pivot's sub-args*, so a
+ nested family application gets unwrapped where possible. For
+ a use site `F (G a)` with `type instance G _ = something`,
+ the pivot's args go from `[FamilyApp G [a]]` to
+ `[something]` — which is what gives `biUnify` a chance to
+ match `F`'s fam-instances. Without this step nested-family
+ use sites looked unresolvable even when the equations to make
+ them resolvable were right there.
+
+ 2. **Bidirectional unification** between `fi.fiArgs` and the
+ (reduced) `FamilyApp`'s args. `biUnify` allows TyVarRefs on
+ either side to bind to anything on the other side. *If
+ unification fails, this `fi` is irrelevant — skip it
+ entirely.* (This is what stops e.g. a `Foo Int`-shaped
+ predicate from being matched against a `type instance F Bool
+ = …` equation.) The wildcard rule is intentionally
+ permissive — when the inner reduction unwraps to a `TyVarRef`
+ (an abstract tyvar in the use site), every concrete
+ data-instance of the outer family ends up matching it. The
+ user sees a chain edge per match; we deliberately don't try
+ to disambiguate further.
+
+ 3. **Substitute `fi.fiRhs` into the predicate.** The bidirectional
unification gave us substitutions for both fi's tyvars (used to
rewrite fi's RHS) and the predicate's tyvars (used to specialise
the rest of the predicate). The result is a fully-substituted
predicate with no occurrence of the originating family.
- 3. **Reduce the result.** `reduceTypeArg` walks the resulting
+ 4. **Reduce the result.** `reduceTypeArg` walks the resulting
`TypeArg` and tries to apply *any* available fam-instance
(using one-sided unification — predicate vars stay free) to
collapse remaining family applications. This handles chained
reductions like `F (G x)` where `F` and `G` are both reducible.
+ Data fam-instances are skipped here (their `fiRhs` is
+ structurally equal to the LHS post-rewrite, so naïve
+ recursion looped forever).
- 4. **Find a matching class instance.** `findMatchingInstances`
+ 5. **Find a matching class instance.** `findMatchingInstances`
scans `pdInstances` for an instance of the predicate's class
whose `iiArgs` shape-match the reduced args (TyVarRefs on
- either side acting as wildcards).
+ either side acting as wildcards). If one or more are found,
+ draw a `fam-resolves` edge from the `fi` node to each.
+
+ 6. **Fall back to the predicate node.** If no class instance
+ matched (typically because the class's defining package
+ isn't in the dumps), draw a `fam-resolves` edge from the
+ `fi` node to the originating *predicate node* (the one
+ created from the context constraint or unmatched superclass).
+ Without this fallback the fam-instance ended up visually
+ orphaned even though we knew exactly which constraint it was
+ discharging.
- 5. **Draw the chain.** Edge from focused instance → family node →
- `fi` node → matched class instance.
+ 7. **Draw the chain.** Net result: focused instance → predicate
+ node → fam-instance(s) → class instance(s) (or back to the
+ predicate node if no class instance was found).
-If any step fails — relevance check, no fam-instance applies, no
-matching class instance exists in the dump — the chain stops at the
-last node we can justify.
+If the relevance check kills every `fi` of `fa` — see the next
+section on placeholders for what happens in that branch.
## Are the resolution chains deterministic?
visual language as the placeholder, so the chain reads as one
continuous "outside this project" thread.
+- **Unapplied-family use sites get no placeholder.** When a
+ constraint references a family with *no* args — passed as a
+ higher-kinded thing, e.g. `class HasTables l where …` and a
+ context `HasTables LedgerState` where `LedgerState` is a data
+ family of kind `Type -> MapKind -> Type` left unapplied — the
+ use-site args list is empty. A `LedgerState = ?` placeholder
+ would convey nothing beyond what the family node itself shows,
+ so the synthesizer skips empty-args use sites entirely.
+ Concretizing the unapplied form to e.g. `LedgerState
+ (ShelleyBlock proto era) mk` would require constraint-solver-
+ like propagation across the constraint set (matching `l ~
+ LedgerState (ShelleyBlock proto era)` from elsewhere) — work
+ GHC does at typecheck time but we don't preserve in the dump.
+
+- **Side panel honestly explains why a placeholder fired.** The
+ panel for a placeholder fam-instance lists both common reasons
+ (external family vs. nested/abstract args we couldn't reduce
+ through) so the user knows which dump to add or which limitation
+ is in play, rather than silently looking unresolved.
+
## What the viewer is *not* doing
A few things we deliberately don't do, in case you wonder: