Fix A* implementation #125

gdalle · 2022-05-13T11:18:10Z

This PR resolves two issues with the A* algorithm:

[Port] [BUG] a_star fails on SimpleWeightedDiGraph #59 To make A* work with custom edge types E that do not define a constructor E(s,d), I created an undocumented method make_edge(g,s,d) that falls back on SimpleEdge. This follows a lengthy discussion in Change A* output to a vector of tuples #116 where several approaches were debated, some of them breaking.
Error in A* implementation #120 To make A* correct, I removed the closed_set variable which prevented it to be exponential in the worst-case scenario (see this Wikipedia paragraph from the article which inspired our implementation). However, I see no obvious way to test this pathological behavior so I'm submitting the PR as is.

codecov · 2022-05-13T11:25:59Z

Codecov Report

Merging #125 (74477b4) into master (cc041b8) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #125      +/-   ##
==========================================
- Coverage   97.54%   97.54%   -0.01%     
==========================================
  Files         109      109              
  Lines        6318     6314       -4     
==========================================
- Hits         6163     6159       -4     
  Misses        155      155

etiennedeg · 2022-05-15T13:56:08Z

Why name it make_edge(g,s,d) when it gathers an existing edge?

gdalle · 2022-05-15T16:29:14Z

@etiennedeg it doesn't necessarily gather an existing edge, it recreates an edge from s to d. If you look at the A* code I use it on (s, d) = (0, 0) to get the right type for the vector of edges returned by the method. This edge obviously doesn't exist, but that's the workaround I thought I needed.

EDIT: I just realized this is breaking too, since the constructor E(s, d) of the custom edge type will never be called even when it exists. So I guess this only leaves the proposal by @matbesancon (#116 (comment)):

(::Type{E})(src::Integer, dst::Integer) where {E <: AbstractEdge} = SimpleEdge(src, dst)

simonschoelly · 2022-05-15T16:47:48Z

I was not involved in that discussion but I still wonder, if we can't solve that issue without adding another method to the interface?

gdalle · 2022-05-15T16:51:58Z

@simonschoelly it was quite a long discussion (see this previous PR), but the conclusion I drew is that there is no clean way out of this mess.
@matbesancon suggested what I think is the only non-breaking fix, but the real solution would be to change the behavior of A* and return a vector of vertices instead of edges.
The hidden addition to the interface (not documented) was the option favored by @etiennedeg but as stated above I now believe it doesn't work.

Please feel free to weigh in though! In my view it is a really important issue: if A* doesn't work on the standard library for weighted graphs, this kinda looks bad ^^

jpfairbanks · 2022-05-15T17:01:56Z

I think that switching to returning a sequence of vertices precludes support for multigraphs in the future. I would rather return a sequence of SimpleEdge for all graph types or have it give a copy of the actual edges in the input graph. I think the latter option would avoid needing a way to generically construct an edge of arbitrary type.

We could add a flag to the function signature to enable this behavior. If avoiding a SemVer bump is important.

gdalle · 2022-05-15T17:11:58Z

Fair point, I hadn't though of multigraphs.
Returning the actual edge object was the idea behind the proposal by @etiennedeg : he suggested we add a get_edge(g, s, d) method to the interface (#116 (comment)). The only issue is that we would need to enforce the existence of this method in all AbstractGraph implementations, which (to me) seems just as breaking as changing the return type of a_star.
Of course, if we don't care about semver bumps, then we can go ahead and tag 2.0 for this, but to me it seems like a waste of a good major version, because there surely are many more things that we would like to break.

gdalle · 2022-05-15T17:25:46Z

Another option would be to introduce a new method with a different name, say astar instead of a_star, and to clearly explain the difference between them in the docs. That way we ensure there is no loss of functionality, while providing a new working implementation to our liking

simonschoelly · 2022-05-15T17:46:08Z

I think a get_edge would be something we need in the future (actually get_edges so that we can support multigraphs) - but is probably a 2.0 thing.

The way make_edge is implemented here is also sort of breaking - if another graph type had an edgetype E that supported E(s, d) then this change will make a-star stop working.

My proposal is to somehow fix this downstream, i.e. in SimpeWeightedGraphs and other packages. As we probably do not to copy the whole a-star code into that package, we use some helper function similar to yours, lets call it _make_edge, that works somehow like that

   function _make_edge(g, s, d) 
      return edgetype(g)(s, d)
      # TODO maybe catch exception and output a friendly message to show what to implement
    end

then instead of exporting that function in the interface, we keep it somewhere near the astar function. We can then import and override that function in SimpeWeightedGraphs.jl.

This is of course also not ideal, but I think we should avoid making the interface more complicated than it already is, as otherwise we will scare people away from creating their own graph types.

Apart from that, I also like James idea of adding a keyword argument to the function to control behavior.

simonschoelly · 2022-05-15T17:55:58Z

Another option would be to introduce a new method with a different name, say astar instead of a_star, and to clearly explain the difference between them in the docs. That way we ensure there is no loss of functionality, while providing a new working implementation to our liking

This could also be in a submodule where we can stage v2.0 functions without respecting semver (although I am not sure about name collisions if we use the same name).

gdalle · 2022-05-15T19:58:02Z

I think a get_edge would be something we need in the future (actually get_edges so that we can support multigraphs) - but is probably a 2.0 thing.

Totally agree

The way make_edge is implemented here is also sort of breaking - if another graph type had an edgetype E that supported E(s, d) then this change will make a-star stop working.

Yeah, that is what I realized this afternoon. Sorry about that

My proposal is to somehow fix this downstream, i.e. in SimpleWeightedGraphs and other packages. As we probably do not to copy the whole a-star code into that package, we use some helper function similar to yours, lets call it _make_edge

Just to be clear, cause I'm new to this whole versioning stuff: if we require downstream users to add something to their packages for a_star to keep working, that's breaking, right?

Apart from that, I also like James idea of adding a keyword argument to the function to control behavior.

To me this sounds like a good compromise waiting for 2.0, and it keeps the interface just as simple. To enable static dispatch I think the additional argument could be the type of the edges we want in the returned path:

function a_star(g::AbstractGraph, s::Integer, d::Integer, ::Type{E}) where {E <: AbstractEdge}
    # Returns a list of edges of type `E`. The constructor `E(s, d)` should exist.
end

jpfairbanks · 2022-05-16T07:05:59Z

I think we have a broader problem here. When all graphs are SimpleGraphs then there is an equivalence between lists of vertices and paths in the graph. All of the algorithms that compute paths use this equivalence at some point. When the graphs have either metadata attached to the edges (weights, properties, etc.) or multiedges, this equivalence is no longer canonical. You need to know additional information about the graph in order to construct it.

We could either say that Graphs.jl is really a SimpleGraph library and push the handling of this additional information to the downstream packages, or we could find generic ways to providing hooks for downstream libraries to tell us what we need to know.

The approach used in this PR, introducing edgetype_to_return(came_from[curr_idx], curr_idx) as a function that is passed to the implementation makes sense as a hook for downstream libraries to tell us how to make edges. If we enforce that this function be the constructor of the Edge Type of the graph, then downstream libs have less control over what information can be required to make an edge. Having that function be an arbitrary closure makes sense because then weightedgraphs can enclose the graph variable into that closure and look up the weights that they need. This would allow our algorithms to return lists of edges of a type defined in the downstream library.

I think the core problem is that Graphs.jl implementations assume that there is an equivalence between sequences of vertices and paths in the graph. That core problem is much bigger than astar or path finding algorithms in general. We are presented with a choice to expand scope to more graph types or cleanly separate our scope from the applications of graphs.

gdalle · 2022-05-16T07:23:15Z

I think the core problem is that Graphs.jl implementations assume that there is an equivalence between sequences of vertices and paths in the graph. That core problem is much bigger than astar or path finding algorithms in general. We are presented with a choice to expand scope to more graph types or cleanly separate our scope from the applications of graphs.

Indeed it seems like there are larger things at stake than A*, namely the fate of non-simple graphs. But that would require rethinking a large part of our package, most notably related to edge construction. For instance, the get_edge(g, s, d) function suggested by @etiennedeg may no longer be precise enough if there are multi-edges. The same goes for my edgetype_to_return(s, d) hack.

However, I think the most urgent task is is to reach a consensus on this PR so that A* finally works on (simple) weighted graphs. This is a pretty serious bug that has gone unfixed for a long long time, and it bothers me a lot ^^
In parallel, the rest of the discussion initiated by @jpfairbanks could be moved to a new issue, where it will be more visible and allow the community to weigh in. Maybe something entitled "Graphs.jl in the multigraph of madness"?

jpfairbanks · 2022-05-16T07:44:47Z

Ok, so to get this fixed I think we need to just return E(u,v) or SimpleEdge(u,v) as the fallback.

This delegates the handling of weights to the downstream library, which is fine because only the downstream consumer knows how to handle the additional structure. This doesn't fix or preclude multi graphs in the future. I think that so much of Graphs.jl needs to get fixed for multgraphs that we can keep the status quo until we have an interested person who wants to really dive into multigraphs.

Questions about the PR specifically:

We don't have any other reverse dependencies in the tests now and so I am hesitant to add one for this. It would be more consistent to have the test in SWG. There is some risk introduced by the possibility that a change in SWG would break the tests for Graphs.jl
We could avoid breaking the signature of astar_impl! just by making the edgetype_to_return an optional argument with a default of edgetype(g). While the _impl functions are generally considered internal to the API, I think that some people use them for performance critical tasks we should take the opportunity to avoid the breakage.
There appear to be some superficial formatting diffs that could be removed from the PR unless they are homogenizing style.

gdalle · 2022-05-16T07:53:50Z

Yeah but the thing is, returning E(u, v) is the current (buggy) behavior, and returning SimpleEdge(u, v) would be breaking. My workaround with the additional type argument keeps the current behavior, but allows the user of eg SWG to specify that they want to return a vector of SimpleEdges. Of course this means users have to change their code to fix the bug, but it is better than us breaking the current behavior. What do you think? As for your other remarks : 1) Good point, I’ll move the test over to SWG, it makes sense there anyway 2) I was trying to respect the typical order of arguments, as specified in the official style guide, but you think internal stability trumps that? 3) Ok I’ll revert that. But someday we need to put everything through JuliaFormatter during CI because the code base is ugly ^^

…

Le 16 mai 2022 à 09:45, James ***@***.***> a écrit : Ok, so to get this fixed I think we need to just return E(u,v) or SimpleEdge(u,v) as the fallback. This delegates the handling of weights to the downstream library, which is fine because only the downstream consumer knows how to handle the additional structure. This doesn't fix or preclude multi graphs in the future. I think that so much of Graphs.jl needs to get fixed for multgraphs that we can keep the status quo until we have an interested person who wants to really dive into multigraphs. Questions about the PR specifically: We don't have any other reverse dependencies in the tests now and so I am hesitant to add one for this. It would be more consistent to have the test in SWG. There is some risk introduced by the possibility that a change in SWG would break the tests for Graphs.jl We could avoid breaking the signature of astar_impl! just by making the edgetype_to_return an optional argument with a default of edgetype(g). While the _impl functions are generally considered internal to the API, I think that some people use them for performance critical tasks we should take the opportunity to avoid the breakage. There appear to be some superficial formatting diffs that could be removed from the PR unless they are homogenizing style. — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.

etiennedeg · 2022-05-16T09:04:14Z

EDIT: I just realized this is breaking too, since the constructor E(s, d) of the custom edge type will never be called even when it exists.

Yes, maybe we can default get_edge to E(s, d), but it's a bit messy to default an API function to a no API function.

For instance, the get_edge(g, s, d) function suggested by @etiennedeg may no longer be precise enough if there are multi-edges.

If we decide to support multi-graphs, we will need to have a Trait to distinguish between simple graphs and general graphs, so the get_edge could just be implemented for simple graphs. (And we would also have a get_edges in the API, which could be called by get_edge)

In the current PR implementation, the edgetype_to_return argument is not optional so it is heavily breaking.
I'm not found of the flag implementation, even if it provides a workaround, it keeps the buggy behavior with the default arguments, which feels kind of unnecessary, when the graph carry enough information to avoid passing the edgetype. If no solution is perfect, as you mentioned, I think the cleaner is to deprecate a_star to astar

jpfairbanks · 2022-05-16T09:11:56Z

I was trying to respect the typical order of arguments, as specified in the official style guide, but you think internal stability trumps that?

Yes, especially here where there is a good reason for a performance-based usage of the unexpected function.

I'm not found of the flag implementation, even if it provides a workaround, it keeps the buggy behavior with the default arguments, which feels kind of unnecessary, when the graph carry enough information to avoid passing the edgetype. If no solution is perfect, as you mentioned, I think the cleaner is to deprecate a_star to astar

It makes sense to me for SWG to define a method of a_star that passes the right constructor to get back a vector of weighted edges. Then a_star on a weighted graph will do the right thing via multiple dispatch. I see this patch as adding in a hook for client libraries to tell Graphs.jl how to build edges when reconstructing the path. That constructor can lookup the right weight in the input graph via a closure.

etiennedeg · 2022-05-16T09:41:48Z

Ok, but it requires downstream packages to implement it. If we ask users to implement a hack specifically to get a_star working, we can as well ask them to implement a new API method. Graphs must already implement edgetype(g), so it feels weird to ask users what edge it should return when it is supposed to be non ambiguous.
Also, this approach will not work with multi graph, because the edges needs to be gathered during the search (and not after) so its really necessary to get a way to gather the edges in the API.

gdalle · 2022-05-16T10:26:59Z

In the current PR implementation, the edgetype_to_return argument is not optional so it is heavily breaking.

@etiennedeg I don't think so, take a look at the default 5-argument method:

function a_star(
    g::AbstractGraph{U},
    s::Integer,
    t::Integer,
    distmx::AbstractMatrix{T}=weights(g),
    heuristic::Function=n -> zero(T),
) where {T,U}
    return a_star(edgetype(g), g, s, t, distmx, heuristic)
end

https://github.com/gdalle/Graphs.jl/blob/77334fc520dfdcbd166ab2de4b266486891ead17/src/shortestpaths/astar.jl#L97-L105

Ok, but it requires downstream packages to implement it. If we ask users to implement a hack specifically to get a_star working, we can as well ask them to implement a new API method.

With the default method we don't, which is why this patch is non-breaking in my opinion.

gdalle · 2022-05-16T10:31:59Z

It makes sense to me for SWG to define a method of a_star that passes the right constructor to get back a vector of weighted edges.

@jpfairbanks with my suggestion they can do that by specifying edgetype_to_return (to be faire, a_star will error if they don't anyway). At the moment I constrain it to be a type, but it could be any callable

jpfairbanks · 2022-05-16T10:37:28Z

Right, I think that we shouldn't restrict that argument to be a type because graphs with data on their edges need to have some enclosed variable to store that metadata. It will be easiest for SWG to pass the closure below:

function edge_constructor(g)
  newedge(u,v) = SimpleWeightedEdge(u, v, g.weights[u,v])
  return newedge
end

a_star(g::SimpleWeightedGraph, ...) = a_star(g, ..., edge_constructor(g))

gdalle · 2022-05-16T11:03:50Z

In terms of order of arguments, I put the edgetype_to_return first because otherwise people have to input distmx and heuristic before getting to the edge type

etiennedeg · 2022-05-16T12:19:32Z

Ok, but it requires downstream packages to implement it. If we ask users to implement a hack specifically to get a_star working, we can as well ask them to implement a new API method.

With the default method we don't, which is why this patch is non-breaking in my opinion.

This is the same if we default get_edge to E(s, d), but as I said, it's messy to default an API function to a non API function...
To avoid that, maybe we can avoid implementing a default get_edge by checking in the code if get_edge is implemented, and if it's not the case, we use E(s, d).

gdalle · 2022-05-16T12:37:02Z

The current buggy behavior may actually be something we wish to keep, unless users explicitly declare that they want to output edges of a certain type (compatible with E(s, d)). It prevents surprises where a SimpleEdge would pop up in a code that never even mentions it.

etiennedeg · 2022-05-16T12:46:14Z

I don't understand, my proposition does not change the actual behavior until a user redefine something (either E(s, d) or get_edge)...

gdalle · 2022-05-16T12:48:51Z

Oh, right, I though you wanted get_edge(g, s, d) to fall back on SimpleEdge(s, d) (as I did in the first version of this PR). But that was breaking

gdalle · 2022-05-16T12:50:10Z

Well then it comes down to personal taste. I'm in favor of adding an argument to a_star instead of another (hidden?) function to the interface for now. I think get_edge will definitely have its place in our roadmap for 2.0, but right now we don't really need it and so I would rather do without

simonschoelly · 2022-05-19T17:29:25Z

I think the solution here is good enough enough, we should definitiely refactor this function in the future though- @gdalle could you try, if you could change SimpleWeightedGraphs, so that it works correctly with the changed function?

gdalle · 2022-05-19T19:53:54Z

I think the solution here is good enough enough, we should definitiely refactor this function in the future though- @gdalle could you try, if you could change SimpleWeightedGraphs, so that it works correctly with the changed function?

Yeah, I'll polish this PR during the week-end and transfer the test over to SWG.
Mental note: I should also use a concrete type in the A* queue while we're at it (see #129)

gdalle · 2022-05-21T08:48:00Z

So I pushed a new version that adresses most of the previous concerns and provides a satisfying fix until 2.0 comes along:

The new argument edgetype_to_return comes at the end to avoid breaking the signature of internal functions
JuliaFormatter changes have been reverted
SimpleWeightedGraphs has been deleted from the test dependencies (I'll add the test there soon)

I think this works but I have two problems left unsolved, for which I need your advice.

edgetype_to_return is still constrained to be a Type because I need it to initialize the total_path vector. This prevents the use of general closures, as wished by @jpfairbanks.
@simonschoelly pointed out in Custom priority queues for Dijkstra & Co #129 that the queue we use in A* has type PriorityQueue{Integer,T}. This can hurt performance, but I don't know which concrete type to pick.
- If we choose Int, people who want to work with BigInt will see errors.
- If we use the index type U (from g::AbstractGraph{U}), I think this can cause other problems: what happens if U = Int32 but s::Int64?

gdalle · 2022-05-21T08:53:35Z

Note however that these problems are not bugs, they only impact generality / performance. As a result, I would be in favor of merging first and improving later

gdalle · 2022-05-21T09:02:50Z

I'd be glad to get feedback on the A* test over in SWG (see my fork branch).

etiennedeg · 2022-05-21T09:26:00Z

If we use the index type U (from g::AbstractGraph{U}), I think this can cause other problems: what happens if U = Int32 but s::Int64?

If s is in the range of Int32, then it will be correctly cast. If it is not, then it can't represent a vertex of the graph, so this is a wrong input. eltype(G) is definitely the way to go.

src/shortestpaths/astar.jl

gdalle · 2022-05-22T06:02:01Z

Note to self: following the latest updates on #120, I should probably add back the closed_set

.gitignore

gdalle · 2022-05-22T19:44:36Z

I say we leave it as a function for now, this has waited long enough

…

Le 22 mai 2022 à 21:02, Simon Schölly ***@***.***> a écrit : @simonschoelly commented on this pull request. In src/shortestpaths/astar.jl: > """ function a_star(g::AbstractGraph{U}, # the g s::Integer, # the start vertex t::Integer, # the end vertex distmx::AbstractMatrix{T}=weights(g), - heuristic::Function=n -> zero(T)) where {T, U} - - E = Edge{eltype(g)} - + heuristic=n -> zero(T), Ah I didn't know that this would not for function-like objects. There is also Base.Callable but that is also too restrictive in that case. — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were assigned.

gdalle · 2022-05-22T19:45:01Z

Yes because a fully reproducible docs build seemed like a good idea. I don’t know whether it is the default practice

…

Le 22 mai 2022 à 21:05, Simon Schölly ***@***.***> a écrit : @simonschoelly commented on this pull request. In .gitignore: > @@ -7,4 +7,4 @@ docs/site/ benchmark/.results/* benchmark/.tune.jld *.cov -Manifest.toml Is that on purpose? I see you also commited docs/Manifest.toml. — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were assigned.

gdalle requested review from matbesancon and etiennedeg May 13, 2022 11:18

gdalle mentioned this pull request May 18, 2022

Roadmap for Graphs.jl 2.0 #128

Open

gdalle self-assigned this May 19, 2022

gdalle removed request for matbesancon and etiennedeg May 19, 2022 19:54

Fix A* with additional argument edgetype_to_return

298b9e2

gdalle requested review from matbesancon, jpfairbanks and simonschoelly May 21, 2022 08:52

Concrete key type in A* priority queue

74e5ae1

gdalle requested a review from etiennedeg May 21, 2022 13:32

This was referenced May 21, 2022

kruskal_mst and prim_mst return different types from SimpleWeightedGraph input JuliaGraphs/SimpleWeightedGraphs.jl#17

Closed

Coherent return types for Kruskal vs Prim #130

Open

simonschoelly reviewed May 21, 2022

View reviewed changes

src/shortestpaths/astar.jl Outdated Show resolved Hide resolved

simonschoelly reviewed May 21, 2022

View reviewed changes

src/shortestpaths/astar.jl Show resolved Hide resolved

Add back closed_set and type restriction heuristic::Function

74477b4

simonschoelly reviewed May 22, 2022

View reviewed changes

.gitignore Show resolved Hide resolved

simonschoelly approved these changes May 22, 2022

View reviewed changes

gdalle merged commit 92acc07 into JuliaGraphs:master May 22, 2022

This was referenced May 22, 2022

Error in A* implementation #120

Closed

[Port] [BUG] a_star fails on SimpleWeightedDiGraph #59

Closed

gdalle mentioned this pull request May 30, 2024

A* Star is problematic when working with SimpleWeightedGraph JuliaGraphs/SimpleWeightedGraphs.jl#48

Closed

Fix A* implementation #125

Fix A* implementation #125

Conversation

gdalle commented May 13, 2022

codecov bot commented May 13, 2022 • edited Loading

Codecov Report

etiennedeg commented May 15, 2022

gdalle commented May 15, 2022 • edited Loading

simonschoelly commented May 15, 2022

gdalle commented May 15, 2022 • edited Loading

jpfairbanks commented May 15, 2022

gdalle commented May 15, 2022

gdalle commented May 15, 2022

simonschoelly commented May 15, 2022

simonschoelly commented May 15, 2022

gdalle commented May 15, 2022

jpfairbanks commented May 16, 2022

gdalle commented May 16, 2022 • edited Loading

jpfairbanks commented May 16, 2022

gdalle commented May 16, 2022 via email

etiennedeg commented May 16, 2022

jpfairbanks commented May 16, 2022

etiennedeg commented May 16, 2022

gdalle commented May 16, 2022 • edited Loading

gdalle commented May 16, 2022

jpfairbanks commented May 16, 2022

gdalle commented May 16, 2022

etiennedeg commented May 16, 2022

gdalle commented May 16, 2022

etiennedeg commented May 16, 2022

gdalle commented May 16, 2022

gdalle commented May 16, 2022

simonschoelly commented May 19, 2022

gdalle commented May 19, 2022

gdalle commented May 21, 2022

gdalle commented May 21, 2022

gdalle commented May 21, 2022

etiennedeg commented May 21, 2022 • edited Loading

gdalle commented May 22, 2022

gdalle commented May 22, 2022 via email

gdalle commented May 22, 2022 via email

codecov bot commented May 13, 2022 •

edited

Loading

gdalle commented May 15, 2022 •

edited

Loading

gdalle commented May 15, 2022 •

edited

Loading

gdalle commented May 16, 2022 •

edited

Loading

gdalle commented May 16, 2022 •

edited

Loading

etiennedeg commented May 21, 2022 •

edited

Loading