Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom implementation of unification #1661

Closed
byorgey opened this issue Nov 25, 2023 · 2 comments · Fixed by #1802
Closed

Custom implementation of unification #1661

byorgey opened this issue Nov 25, 2023 · 2 comments · Fixed by #1802
Assignees
Labels
C-Moderate Effort Should take a moderate amount of time to address. L-Type inference The process of inferring the type of a Swarm expression. Z-Refactoring This issue is about restructuring the code without changing the behaviour to improve code quality.

Comments

@byorgey
Copy link
Member

byorgey commented Nov 25, 2023

Right now we use the unification-fd package to implement unification of types during type checking, as described here. However, I think perhaps it's time to replace our use of unification-fd with our own custom implementation of unification. This would have a number of benefits:

  • unification-fd is quite old and it has some idiosyncracies that have been awkward to work around. For example, it insists on working in the context of a monad transformer stack that has an error transformer on top of its own IntBindingT transformer. If we implement our own we can integrate it with our existing effects framework instead.
  • Implementing our own unification means the possibility to generate better custom error messages.
  • unification-fd can only do pure structural unification. However, several features we are contemplating might need more custom processing:
    • Type synonyms (Add type synonyms (or just generalize to System Fω) #153) would require expanding type synonyms to their definitions whenever they are encountered during unification
    • Recursive types (Recursive types #154) require being able to unroll recursive types one step when they are encountered during unification.
      • With equirecursive types, we also would have to track a set of encountered pairs during unification so that we can cut off the recursion when seeing a given pair again
      • Even with isorecursive types, it is too fragile to require that a recursive type be available in the right mode (check or infer) when encountering every roll or unroll; we need to be able to solve for unification variables matched with recursive types, which means being able to unroll them as needed during unification.

On the other hand there would also be some downsides:

  • unification-fd is quite sophisticated and optimized; it is almost surely way faster than anything we might implement ourselves. However, I am hoping that this will not make too much of a difference as long as our new implementation is "good enough". And even if it isn't, we might be able to incorporate some optimization techniques from unification-fd into our own implementation.
  • It represents a significant refactoring effort.
@byorgey byorgey added Z-Refactoring This issue is about restructuring the code without changing the behaviour to improve code quality. C-Moderate Effort Should take a moderate amount of time to address. L-Type inference The process of inferring the type of a Swarm expression. labels Nov 25, 2023
@byorgey byorgey changed the title Custom unification algorithm Custom implementation of unification Nov 25, 2023
@byorgey
Copy link
Member Author

byorgey commented Apr 13, 2024

It may look like I haven't been very active recently --- and it's true I haven't had as much time to work on Swarm in general --- but in fact I've been working steadily on this. In the refactor/custom-unification branch I have finally succeeded in completely ripping out unification-fd and replacing it with a custom (naive) implementation which passes all the tests. Unfortunately it is quite a bit slower (running the entire test suite went from around 120s on my machine to 190s, a 60% increase). I am currently reading Efficient Functional Unification and Substitution and hopeful it may provide ideas for an implementation which is fast enough but not terribly complex (I have looked into the guts of unification-fd and it is so highly optimized and complex that I despair of understanding it).

Once I finish this I will be able to tackle #154 which I am very excited about. Then eventually I'll get back around to #1087 .

@mergify mergify bot closed this as completed in #1802 Apr 27, 2024
mergify bot pushed a commit that referenced this issue Apr 27, 2024
…1802)

Closes #1661; towards #154.

`unification-fd` is very powerful, and extremely fast, but it was written a long time ago and its age shows.  It was not possible to incorporate it into our effects system in a nice way, necessitating the use of concrete monad transformers in the typechecking code.  In addition it is impossible to customize, and we have been contemplating new type system features such as #153 and #154 that turn out to require hooking into the way the unification algorithm works (see #154 (comment) for more details).

This PR thus removes the dependency on `unification-fd` and implements our own version of unification.  It is not quite as fast as `unification-fd` but I consider the slowdown acceptable in order to gain e.g. recursive types.  And of course there is also room to optimize it.

The custom `UTerm` from `unification-fd` is replaced with the standard `Free` (free monad) construction from the `free` package, and the custom `Fix` from `unification-fd` is replaced with the one from `data-fix`.

We also get rid of the `unifyCheck` function, which used to be a quick short-circuiting way to check whether two types definitely did not unify or might unify, allowing us to give better error messages more quickly.  Now, the `=:=` unification operator itself just does this.
@byorgey byorgey mentioned this issue Jun 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-Moderate Effort Should take a moderate amount of time to address. L-Type inference The process of inferring the type of a Swarm expression. Z-Refactoring This issue is about restructuring the code without changing the behaviour to improve code quality.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant