-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace unification-fd
with a custom implementation of unification
#1802
Merged
Merged
Changes from 47 commits
Commits
Show all changes
50 commits
Select commit
Hold shift + click to select a range
ea1d6ef
some initial explorations into custom unification code (towards #1661)
byorgey 35c94fb
WIP: substitution
byorgey 9cff7ed
Merge branch 'main' into refactor/custom-unification
byorgey 2305185
fix `.cabal`
byorgey 5d321da
fix warning
byorgey 3e9b7b8
better names in Subst
byorgey 61f5b14
WIP: start ripping out unification-fd
byorgey f258ff4
use `wrapFix` in preference to directly using the `Fix` constructor
byorgey 8b395a7
fix a few warnings
byorgey aa0e056
fix unification code, and derive some necessary instances
byorgey 7a03018
start converting `Swarm.Language.Typecheck` away from `unification-fd`
byorgey 8524418
bit more conversion
byorgey dee9b1e
WIP: work on adding unification effect
byorgey b16ddec
WIP: implement basic unification
byorgey ba30a71
WIP: generalize unification a bit
byorgey 402ddc2
WIP: work on unification effect
byorgey 9a2ed46
WIP: bit of work converting old code
byorgey 26d7242
more unification effect operations
byorgey 5f3345e
WIP: convert some typechecking things to new framework
byorgey 1df0b2d
WIP: convert more typechecking things
byorgey e3b1fe5
WIP: get runTC working, progress on uncommenting things
byorgey 78fe0d9
WIP: get most of Typecheck working
byorgey 8d651fb
WIP: fix bug in subst
byorgey d33f82e
refactor + formatting
byorgey ae54401
refactoring, cleanup, fix comment, fix pretty-printing bug
byorgey f7dd9a3
fix definition of substitution composition
byorgey a31056e
add a todo
byorgey e5b4d3e
refactor: rearrange some code
byorgey 242b966
split out naive unification carrier to separate module
byorgey 0e7ab3d
fix .cabal and formatting
byorgey 3118a13
remove unneeded LANGUAGE pragmas
byorgey 3ce440d
first cut at a faster implementation of unification
byorgey baa2cc8
make use of utility `histogram` function
byorgey 4b42098
some documentation, and fix bug
byorgey c766cfc
comments, remove commented-out code
byorgey 9bc9fef
code cleanup + comments
byorgey 2bc52b1
factor out common unification infrastructure into new module
byorgey bcdf6a6
remove dead-end exploration
byorgey 4228e3e
cleanup and comments
byorgey e0e0ab8
Restyled by fourmolu (#1803)
restyled-io[bot] 1dc6abb
get rid of `unifyCheck`; it now does the same as unification itself
byorgey 4939ff1
Merge branch 'refactor/custom-unification' of github.com:swarm-game/s…
byorgey a3a7e1d
apply `cabal-gild`
byorgey 217682d
update `cabal-gild` version
byorgey 761a872
update to `cabal-gild-setup-action@v2`
byorgey fd25c6d
apply `hlint` hints
byorgey a46f3ab
Merge branch 'main' into refactor/custom-unification
byorgey b3e396a
fix misspelling
byorgey 496c7ef
use `insert` instead of `@@` with singleton as a small optimization
byorgey 1520994
Merge branch 'main' into refactor/custom-unification
mergify[bot] File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
{-# LANGUAGE GADTs #-} | ||
{-# LANGUAGE KindSignatures #-} | ||
|
||
-- | | ||
-- SPDX-License-Identifier: BSD-3-Clause | ||
-- Description: This module defines an effect signature for | ||
-- computations that support doing unification. The intention is for | ||
-- code needing unification to use the operations defined in this | ||
-- module, and then import 'Swarm.Effect.Unify.Fast' to dispatch the | ||
-- 'Unification' effects. | ||
module Swarm.Effect.Unify where | ||
|
||
import Control.Algebra | ||
import Data.Kind (Type) | ||
import Data.Set (Set) | ||
import Swarm.Language.Types hiding (Type) | ||
|
||
-- | Data type representing available unification operations. | ||
data Unification (m :: Type -> Type) k where | ||
Unify :: UType -> UType -> Unification m (Either UnificationError UType) | ||
ApplyBindings :: UType -> Unification m UType | ||
FreshIntVar :: Unification m IntVar | ||
FreeUVars :: UType -> Unification m (Set IntVar) | ||
|
||
-- | Unify two types, returning a type equal to both, or a 'UnificationError' if | ||
-- the types definitely do not unify. | ||
(=:=) :: Has Unification sig m => UType -> UType -> m (Either UnificationError UType) | ||
t1 =:= t2 = send (Unify t1 t2) | ||
|
||
-- | Substitute for all the unification variables that are currently | ||
-- bound. It is guaranteed that any unification variables remaining | ||
-- in the result are not currently bound, /i.e./ we have learned no | ||
-- information about them. | ||
applyBindings :: Has Unification sig m => UType -> m UType | ||
applyBindings = send . ApplyBindings | ||
|
||
-- | Compute the set of free unification variables of a type (after | ||
-- substituting away any which are already bound). | ||
freeUVars :: Has Unification sig m => UType -> m (Set IntVar) | ||
freeUVars = send . FreeUVars | ||
|
||
-- | Generate a fresh unification variable. | ||
freshIntVar :: Has Unification sig m => m IntVar | ||
freshIntVar = send FreshIntVar | ||
|
||
-- | An error that occurred while running the unifier. | ||
data UnificationError where | ||
-- | Occurs check failure, i.e. the solution to some unification | ||
-- equations was an infinite term. | ||
Infinite :: IntVar -> UType -> UnificationError | ||
-- | Mismatch error between the given terms. | ||
UnifyErr :: TypeF UType -> TypeF UType -> UnificationError | ||
deriving (Show) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
{-# LANGUAGE DerivingStrategies #-} | ||
{-# LANGUAGE GeneralisedNewtypeDeriving #-} | ||
|
||
-- | | ||
-- SPDX-License-Identifier: BSD-3-Clause | ||
-- | ||
-- Description: Common definitions used in both the naive and fast | ||
-- implementations of unification. | ||
module Swarm.Effect.Unify.Common where | ||
|
||
import Control.Algebra | ||
import Control.Effect.State (State, get) | ||
import Data.Map (Map) | ||
import Data.Map qualified as M | ||
import Data.Set (Set) | ||
import Prelude hiding (lookup) | ||
|
||
------------------------------------------------------------ | ||
-- Substitutions | ||
|
||
-- | A value of type @Subst n a@ is a substitution which maps | ||
-- names of type @n@ (the /domain/, see 'dom') to values of type | ||
-- @a@. Substitutions can be /applied/ to certain terms (see | ||
-- 'subst'), replacing any free occurrences of names in the | ||
-- domain with their corresponding values. Thus, substitutions can | ||
-- be thought of as functions of type @Term -> Term@ (for suitable | ||
-- @Term@s that contain names and values of the right type). | ||
-- | ||
-- Concretely, substitutions are stored using a @Map@. | ||
newtype Subst n a = Subst {getSubst :: Map n a} | ||
deriving (Eq, Ord, Show, Functor) | ||
|
||
-- | The domain of a substitution is the set of names for which the | ||
-- substitution is defined. | ||
dom :: Subst n a -> Set n | ||
dom = M.keysSet . getSubst | ||
|
||
-- | The identity substitution, /i.e./ the unique substitution with an | ||
-- empty domain, which acts as the identity function on terms. | ||
idS :: Subst n a | ||
idS = Subst M.empty | ||
|
||
-- | Construct a singleton substitution, which maps the given name to | ||
-- the given value. | ||
(|->) :: n -> a -> Subst n a | ||
x |-> t = Subst (M.singleton x t) | ||
|
||
-- | Look up the value a particular name maps to under the given | ||
-- substitution; or return @Nothing@ if the name being looked up is | ||
-- not in the domain. | ||
lookup :: Ord n => n -> Subst n a -> Maybe a | ||
lookup x (Subst m) = M.lookup x m | ||
|
||
-- | Look up a name in a substitution stored in a state effect. | ||
lookupS :: (Ord n, Has (State (Subst n a)) sig m) => n -> m (Maybe a) | ||
lookupS x = lookup x <$> get |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,232 @@ | ||
{-# LANGUAGE DerivingStrategies #-} | ||
{-# LANGUAGE GADTs #-} | ||
{-# LANGUAGE GeneralizedNewtypeDeriving #-} | ||
{-# LANGUAGE UndecidableInstances #-} | ||
|
||
-- | | ||
-- SPDX-License-Identifier: BSD-3-Clause | ||
-- | ||
-- Description: Fast yet purely functional implementation of | ||
-- unification, using a map as a lazy substitution, i.e. a | ||
-- manually-mainted "functional shared memory". | ||
-- | ||
-- See Dijkstra, Middelkoop, & Swierstra, "Efficient Functional | ||
-- Unification and Substitution", Utrecht University tech report | ||
-- UU-CS-2008-027 (section 5) for the basic idea, and Peyton Jones et | ||
-- al, "Practical type inference for arbitrary-rank types" | ||
-- (pp. 74--75) for a correct implementation of unification via | ||
-- references. | ||
module Swarm.Effect.Unify.Fast where | ||
|
||
import Control.Algebra | ||
import Control.Applicative (Alternative) | ||
import Control.Carrier.State.Strict (StateC, evalState) | ||
import Control.Carrier.Throw.Either (ThrowC, runThrow) | ||
import Control.Category ((>>>)) | ||
import Control.Effect.State (State, get, gets, modify) | ||
import Control.Effect.Throw (Throw, throwError) | ||
import Control.Monad.Free | ||
import Control.Monad.Trans (MonadIO) | ||
import Data.Function (on) | ||
import Data.Map qualified as M | ||
import Data.Map.Merge.Lazy qualified as M | ||
import Data.Set qualified as S | ||
import Swarm.Effect.Unify | ||
import Swarm.Effect.Unify.Common | ||
import Swarm.Language.Types hiding (Type) | ||
import Prelude hiding (lookup) | ||
|
||
------------------------------------------------------------ | ||
-- Substitutions | ||
|
||
-- | Compose two substitutions. Applying @s1 \@\@ s2@ is the same as | ||
-- applying first @s2@, then @s1@; that is, semantically, | ||
-- composition of substitutions corresponds exactly to function | ||
-- composition when they are considered as functions on terms. | ||
-- | ||
-- As one would expect, composition is associative and has 'idS' as | ||
-- its identity. | ||
-- | ||
-- Note that we do /not/ apply @s1@ to all the values in @s2@, since | ||
-- the substitution is maintained lazily; we do not need to maintain | ||
-- the invariant that values in the mapping do not contain any of | ||
-- the keys. This makes composition much faster, at the cost of | ||
-- making application more complex. | ||
(@@) :: (Ord n, Substitutes n a a) => Subst n a -> Subst n a -> Subst n a | ||
(Subst s1) @@ (Subst s2) = Subst (s2 `M.union` s1) | ||
|
||
-- | Class of things supporting substitution. @Substitutes n b a@ means | ||
-- that we can apply a substitution of type @Subst n b@ to a | ||
-- value of type @a@, replacing all the free names of type @n@ | ||
-- inside the @a@ with values of type @b@, resulting in a new value | ||
-- of type @a@. | ||
-- | ||
-- We also do a lazy occurs-check during substitution application, | ||
-- so we need the ability to throw a unification error. | ||
class Substitutes n b a where | ||
subst :: Has (Throw UnificationError) sig m => Subst n b -> a -> m a | ||
|
||
-- | We can perform substitution on terms built up as the free monad | ||
-- over a structure functor @f@. | ||
instance Substitutes IntVar UType UType where | ||
subst s = go S.empty | ||
where | ||
go seen (Pure x) = case lookup x s of | ||
Nothing -> pure $ Pure x | ||
Just t | ||
| S.member x seen -> throwError $ Infinite x t | ||
| otherwise -> go (S.insert x seen) t | ||
go seen (Free t) = Free <$> goF seen t | ||
|
||
goF _ t@(TyBaseF {}) = pure t | ||
goF _ t@(TyVarF {}) = pure t | ||
goF seen (TySumF t1 t2) = TySumF <$> go seen t1 <*> go seen t2 | ||
goF seen (TyProdF t1 t2) = TyProdF <$> go seen t1 <*> go seen t2 | ||
goF seen (TyRcdF m) = TyRcdF <$> mapM (go seen) m | ||
goF seen (TyCmdF c) = TyCmdF <$> go seen c | ||
goF seen (TyDelayF c) = TyDelayF <$> go seen c | ||
goF seen (TyFunF t1 t2) = TyFunF <$> go seen t1 <*> go seen t2 | ||
|
||
------------------------------------------------------------ | ||
-- Carrier type | ||
|
||
-- Note: this carrier type and the runUnification function are | ||
-- identical between this module and Swarm.Effect.Unify.Naive, but it | ||
-- seemed best to duplicate it, so we can modify the carriers | ||
-- independently in the future if we want. | ||
|
||
-- | Carrier type for unification: we maintain a current substitution, | ||
-- a counter for generating fresh unification variables, and can | ||
-- throw unification errors. | ||
newtype UnificationC m a = UnificationC | ||
{ unUnificationC :: | ||
StateC (Subst IntVar UType) (StateC FreshVarCounter (ThrowC UnificationError m)) a | ||
} | ||
deriving newtype (Functor, Applicative, Alternative, Monad, MonadIO) | ||
|
||
-- | Counter for generating fresh unification variables. | ||
newtype FreshVarCounter = FreshVarCounter {getFreshVarCounter :: Int} | ||
deriving (Eq, Ord, Enum) | ||
|
||
-- | Run a 'Unification' effect via the 'UnificationC' carrier. | ||
runUnification :: Algebra sig m => UnificationC m a -> m (Either UnificationError a) | ||
runUnification = | ||
unUnificationC >>> evalState idS >>> evalState (FreshVarCounter 0) >>> runThrow | ||
|
||
------------------------------------------------------------ | ||
-- Unification | ||
|
||
-- The idea here (using an explicit substitution as a sort of | ||
-- "functional shared memory", instead of directly using IORefs), is | ||
-- based on Dijkstra et al. Unfortunately, their implementation of | ||
-- unification is subtly wrong; fortunately, a single integration test | ||
-- in the Swarm test suite failed, leading to discovering the bug. | ||
-- The basic issue is that when unifying an equation between two | ||
-- variables @x = y@, we must look up *both* to see whether they are | ||
-- already mapped by the substitution (and if so, replace them by | ||
-- their referent and keep recursing). Dijkstra et al. only look up | ||
-- @x@ and simply map @x |-> y@ if x is not in the substitution, but | ||
-- this can lead to cycles where e.g. x is mapped to y, and later we | ||
-- unify @y = x@ resulting in both @x |-> y@ and @y |-> x@ in the | ||
-- substitution, which at best leads to a spurious infinite type | ||
-- error, and at worst leads to infinite recursion in the unify function. | ||
-- | ||
-- Peyton Jones et al. show how to do it correctly: when unifying x = y and | ||
-- x is not mapped in the substitution, we must also look up y. | ||
|
||
-- | Implementation of the 'Unification' effect in terms of the | ||
-- 'UnificationC' carrier. | ||
instance Algebra sig m => Algebra (Unification :+: sig) (UnificationC m) where | ||
alg hdl sig ctx = UnificationC $ case sig of | ||
L (Unify t1 t2) -> (<$ ctx) <$> runThrow (unify t1 t2) | ||
L (ApplyBindings t) -> do | ||
s <- get @(Subst IntVar UType) | ||
(<$ ctx) <$> subst s t | ||
L FreshIntVar -> do | ||
v <- IntVar <$> gets getFreshVarCounter | ||
modify @FreshVarCounter succ | ||
return $ v <$ ctx | ||
L (FreeUVars t) -> do | ||
s <- get @(Subst IntVar UType) | ||
(<$ ctx) . fuvs <$> subst s t | ||
R other -> alg (unUnificationC . hdl) (R (R (R other))) ctx | ||
|
||
-- | Unify two types, returning a unified type equal to both. Note | ||
-- that for efficiency we /don't/ do an occurs check here, but | ||
-- instead lazily during substitution. | ||
unify :: | ||
( Has (Throw UnificationError) sig m | ||
, Has (State (Subst IntVar UType)) sig m | ||
) => | ||
UType -> | ||
UType -> | ||
m UType | ||
unify ty1 ty2 = case (ty1, ty2) of | ||
(Pure x, Pure y) | x == y -> pure (Pure x) | ||
(Pure x, y) -> do | ||
mxv <- lookupS x | ||
case mxv of | ||
Nothing -> unifyVar x y | ||
Just xv -> unify xv y | ||
(x, Pure y) -> unify (Pure y) x | ||
(Free t1, Free t2) -> Free <$> unifyF t1 t2 | ||
|
||
-- | Unify a unification variable which /is not/ bound by the current | ||
-- substitution with another term. If the other term is also a | ||
-- variable, we must look it up as well to see if it is bound. | ||
unifyVar :: | ||
( Has (Throw UnificationError) sig m | ||
, Has (State (Subst IntVar UType)) sig m | ||
) => | ||
IntVar -> | ||
UType -> | ||
m UType | ||
unifyVar x (Pure y) = do | ||
myv <- lookupS y | ||
case myv of | ||
-- x = y but the variable y is not bound: just add (x |-> y) to the current Subst | ||
Nothing -> modify @(Subst IntVar UType) ((x |-> Pure y) @@) >> pure (Pure y) | ||
-- x = y and y is bound to v: recurse on x = v. | ||
Just yv -> unify (Pure x) yv | ||
|
||
-- x = t for a non-variable t: just add (x |-> t) to the Subst. | ||
unifyVar x t = modify ((x |-> t) @@) >> pure t | ||
|
||
-- | Perform unification on two non-variable terms: check that they | ||
-- have the same top-level constructor and recurse on their | ||
-- contents. | ||
unifyF :: | ||
( Has (Throw UnificationError) sig m | ||
, Has (State (Subst IntVar UType)) sig m | ||
) => | ||
TypeF UType -> | ||
TypeF UType -> | ||
m (TypeF UType) | ||
unifyF t1 t2 = case (t1, t2) of | ||
(TyBaseF b1, TyBaseF b2) -> case b1 == b2 of | ||
True -> pure t1 | ||
False -> unifyErr | ||
(TyBaseF {}, _) -> unifyErr | ||
-- Note that *type variables* are not the same as *unification variables*. | ||
-- Type variables must match exactly. | ||
(TyVarF v1, TyVarF v2) -> case v1 == v2 of | ||
True -> pure t1 | ||
False -> unifyErr | ||
(TyVarF {}, _) -> unifyErr | ||
(TySumF t11 t12, TySumF t21 t22) -> TySumF <$> unify t11 t21 <*> unify t12 t22 | ||
(TySumF {}, _) -> unifyErr | ||
(TyProdF t11 t12, TyProdF t21 t22) -> TyProdF <$> unify t11 t21 <*> unify t12 t22 | ||
(TyProdF {}, _) -> unifyErr | ||
(TyRcdF m1, TyRcdF m2) -> | ||
case ((==) `on` M.keysSet) m1 m2 of | ||
False -> unifyErr | ||
_ -> fmap TyRcdF . sequence $ M.merge M.dropMissing M.dropMissing (M.zipWithMatched (const unify)) m1 m2 | ||
(TyRcdF {}, _) -> unifyErr | ||
(TyCmdF c1, TyCmdF c2) -> TyCmdF <$> unify c1 c2 | ||
(TyCmdF {}, _) -> unifyErr | ||
(TyDelayF c1, TyDelayF c2) -> TyDelayF <$> unify c1 c2 | ||
(TyDelayF {}, _) -> unifyErr | ||
(TyFunF t11 t12, TyFunF t21 t22) -> TyFunF <$> unify t11 t21 <*> unify t12 t22 | ||
(TyFunF {}, _) -> unifyErr | ||
where | ||
unifyErr = throwError $ UnifyErr t1 t2 |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which test, out of curiosity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was
Challenges/Sliding Puzzles/_sliding-puzzle/solution.sw
that failed to typecheck, specifically thegetInitialBlankDestination
function. Something about the way it involved inferring the type of a higher-order function that took tuple projections as an argument tickled the type inference algorithm in just the right way to trigger the bug. So thanks for writing that code! 😄