Talk:Dijkstra's algorithm
Pretty muddy
If you try to take the pseudocode and directly turn it into code, it fails within about 5 lines. You set all distances to infinity, then you check whether the edge distance is infinity? Of course it is, you just set it to that, and now you exit. —Preceding unsigned comment added by 64.134.224.50 (talk) 00:50, 21 March 2011 (UTC)
Reply: Well actually you set the root to 0, so that will be smaller than infinity, then you set its neighbors and so on... —Preceding unsigned comment added by 70.42.29.3 (talk) 20:53, 4 May 2011 (UTC)
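For concreteness, here is a minimal runnable sketch (plain Python, array-based scan instead of a heap; the dict-of-dicts graph format is my own assumption, not the article's) showing why the loop does not exit immediately: the source starts at 0 while everything else starts at infinity.

```python
import math

def dijkstra(graph, source):
    # graph: dict mapping node -> {neighbor: edge weight}
    # (assumed input format; every node, including sinks, must appear as a key)
    dist = {v: math.inf for v in graph}  # every tentative distance starts at infinity...
    dist[source] = 0                     # ...except the source, which starts at 0
    unvisited = set(graph)

    while unvisited:
        # pick the unvisited node with the smallest tentative distance
        u = min(unvisited, key=lambda v: dist[v])
        if dist[u] == math.inf:
            break                        # everything left is unreachable
        unvisited.remove(u)
        for v, w in graph[u].items():    # relax every edge out of u
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist
```

For example, `dijkstra({'a': {'b': 7}, 'b': {'a': 7}}, 'a')` returns `{'a': 0, 'b': 7}`.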
Wrong anim?
Look at the animation. First it goes to node #2, then when it wants to go to node #3 it counts the length as 9. That is absolutely wrong, because if you want to go from #2 to #3 you must pass along an edge of length 10. So what is the meaning of this? Can anyone explain it to me? —Preceding unsigned comment added by 109.162.218.57 (talk) 12:31, 7 April 2011 (UTC)
Read the algorithm: it keeps the minimum distance found so far. 129.67.95.240 (talk) 13:03, 9 August 2011 (UTC)
Python code
According to WP:NOTREPOSITORY, the Python code should not be here. I suggest that it be deleted soon. Please discuss it below. - Subh83 (talk | contribs) 17:29, 18 April 2011 (UTC)
- I don't agree that Python code is never justified on Wikipedia — often it can make a good alternative to pseudocode as a way of communicating the essence of an algorithm. But in this case I don't think it adds much to the pseudocode that is already here, especially since the style of writing gives up most of the advantages of conciseness and readability that Python usually has. So in this case I agree that it should be deleted. —David Eppstein (talk) 18:03, 18 April 2011 (UTC)
Although I agree Python code is quite intuitive in general, I am not sure any specific language should be preferred, since many programmers may not use that language and may be unfamiliar with its syntax. Also, the preferred syntax may vary between versions of the programming language (Wikipedia:Algorithms_on_Wikipedia). - Subh83 (talk | contribs) 22:25, 18 April 2011 (UTC)
- Removed it.
Running Time
The running time is mentioned as O((|E| + |V|)log |V|) if a binary heap is used. I think it assumes Relax(v) = O(log|V|), but with a real binary heap the operations "remove v from heap" and "re-add v" are O(|V|) and O(log|V|), respectively, making Relax(v) = O(|V|). Hence, the total running time should be O(|V|log|V| + |E||V|). Can someone verify? -- Naba Kumar (talk) 21:21, 26 May 2011 (UTC)
- Removing an element from a binary heap takes logarithmic time, so the bound in the article is correct. -- X7q (talk) 21:57, 26 May 2011 (UTC)
- Removing an element (in this case, a vertex) requires finding it first, which is O(|V|), not O(log|V|) - note: you don't know the index. Relax(v) can alternatively be defined as = find key (O(|V|)) + decrease-key (O(log |V|)) = O(|V|) -- Naba Kumar (talk) 22:13, 26 May 2011 (UTC)
- If you don't know the index and have to search sequentially for it, you're doing it wrong. In this case, you do know the index: it's always at the front of the heap. And more generally if you want to find the position of a vertex in a heap (say for decrease key operations) you can simply maintain an index for each vertex that stores its position in the heap, and change these indexes whenever you change the heap. —David Eppstein (talk) 22:52, 26 May 2011 (UTC)
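For illustration, here is a rough sketch of that indexed-heap idea in Python (class and method names are mine, not from the article or any cited source): a vertex → position map is updated on every swap, so decrease-key can find its target in O(1) and restore heap order in O(log n).

```python
class IndexedMinHeap:
    """Binary min-heap of (key, vertex) pairs with a vertex -> position map,
    so decrease-key needs no O(n) search. Sketch only, under the assumptions above."""

    def __init__(self):
        self.heap = []   # list of (key, vertex) pairs
        self.pos = {}    # vertex -> index of its pair in self.heap

    def _swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.pos[self.heap[i][1]] = i
        self.pos[self.heap[j][1]] = j

    def _sift_up(self, i):
        while i > 0 and self.heap[i][0] < self.heap[(i - 1) // 2][0]:
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            smallest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and self.heap[c][0] < self.heap[smallest][0]:
                    smallest = c
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest

    def insert(self, vertex, key):
        self.heap.append((key, vertex))
        self.pos[vertex] = len(self.heap) - 1
        self._sift_up(len(self.heap) - 1)

    def decrease_key(self, vertex, key):
        i = self.pos[vertex]          # O(1) lookup instead of an O(n) search
        self.heap[i] = (key, vertex)  # key must only ever decrease
        self._sift_up(i)

    def extract_min(self):
        self._swap(0, len(self.heap) - 1)
        key, vertex = self.heap.pop()
        del self.pos[vertex]
        if self.heap:
            self._sift_down(0)
        return vertex, key
```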
- The decrease-key operation I pointed out is not about removing from the top, but a step that happens between steps 16 and 19 in the pseudocode (it's implied, but it should rather be written down for clarity). Also, it's not about being "wrong". Removing a non-min element from a binary heap cannot be done in less than O(n) time without the data-structure augmentation you just described, in which case the statement in the article "With a binary heap, the algorithm requires O((|E| + |V|)log |V|) time ..." is wrong when read verbatim against the pseudocode. At best, it is insufficient to show a casual reader how one arrives at O((|E| + |V|)log |V|) as the algorithm's performance. I think the article should make this clarification and mention the performance of both the non-augmented and the augmented structure. Naba Kumar (talk) 10:03, 28 May 2011 (UTC)
- Okay, I have fixed the article by (1) adding step 19 explaining the need for the decrease-key operation and (2) explaining the runtime with and without the extra indexing. I believe this is much clearer now. Thanks for your inputs. Naba Kumar (talk) 10:27, 28 May 2011 (UTC)
- I don't think we need to list that suboptimal time. No one implements Dijkstra that way. It's worse than Dijkstra without a heap (O(V^2) is better than O(VE + whatever) for connected graphs). A note explaining that an array of indexes into the binary heap is required to achieve the stated time bounds would be useful, though. -- X7q (talk) 10:53, 28 May 2011 (UTC)
- Agreed. I just rephrased it better. Naba Kumar (talk) 15:52, 28 May 2011 (UTC)
In addition, with a "vanilla binary heap" without the reverse index, it is still easy to get O(m log n) time: simply re-insert the vertex after each decrease-key ignoring the fact that it already has another entry in the heap, and check whenever you do a delete-min whether it's a vertex you've already seen. The disadcantage of doing it this way is primarily space (you get a heap entry per edge rather than per vertex). —David Eppstein (talk) 15:41, 28 May 2011 (UTC)
- I don't think that works in practice, because the vertex referenced by both entries (the newly inserted one and the old one) is the same and therefore indistinguishable, and the old entry is quite likely already in the wrong place in the heap because of the change in its value. In addition, simply being seen before is probably not enough of a distinction, because a vertex can have its key decreased more than once (so a count is needed, rather than a flag). The reverse-index approach is a sensible one (albeit a bit dirty, because you break the abstraction of the Queue object by storing Queue data inside the Vertex structure, unless you resort to an external hashtable of vertex->heap_index). I hope someone identifies a cleaner data structure to use for the Queue instead of a binary heap. Kh naba (talk) 16:12, 28 May 2011 (UTC)
- It does too work. I've implemented and tested it. You just have to ignore heap entries whose priority is larger than the already-determined distance for a vertex. And the reverse index is not only "a sensible approach", it is exactly the standard approach. —David Eppstein (talk) 16:21, 28 May 2011 (UTC)
- Hmm, so you kept priorities separate from the vertex min-distances, something like Q = [.., {v1, p1}, .. {v1, p2}, ..] and v1 = [min_distance, previous, ...]? That would certainly work. I had the queue totally dependent on the vertex structures, like Q = [... v1, ... v1, ....] and v1 = [min_distance, previous, ...]. There it doesn't work, because min_distance is the priority in both instances of v1 - as seen by the queue. Your approach is clearly better (although it introduces an additional structure {v, p} to hold the queue, that's better than the reverse-index approach). Why not describe it in the article/pseudo-code so that others could follow it? -- Naba Kumar (talk) 19:24, 28 May 2011 (UTC)
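A sketch of that separation in Python (same assumed dict-of-dicts graph format as above, not from the article): the heap stores (priority, vertex) pairs, duplicate entries are allowed, and stale pairs are simply skipped when popped.

```python
import heapq
import math

def dijkstra_lazy(graph, source):
    # graph: dict mapping node -> {neighbor: edge weight} (assumed format).
    # Lazy-deletion variant: instead of decrease-key, push a fresh (priority, vertex)
    # pair whenever a shorter distance is found, and skip stale pairs on pop.
    dist = {v: math.inf for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                   # stale pair: u was already settled with a smaller distance
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))  # duplicates are fine; old pairs become stale
    return dist
```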
- And thanks for pointing out this approach, I would go for this one instead of the reverse index one. --Naba Kumar (talk) 19:31, 28 May 2011 (UTC)
- Re why not to add it to the article: because we need a reliable source that describes this alternative approach. I discussed it eight years ago here but I don't think that counts as a reliable source unless you want to go with the "established expert" clause of WP:SPS (plausible as I have published about other aspects of shortest paths but I'm certainly not going to add it myself). I don't know of published sources that talk about doing it this way. —David Eppstein (talk) 19:51, 28 May 2011 (UTC)
- Here's one source which talks about this idea: [1]. They further eliminate the space disadvantage you mentioned by using a balanced tree (standard in C++), which does everything heaps can do plus an efficient find operation (if you know the key's priority). -- X7q (talk) 20:54, 28 May 2011 (UTC)
- @X7q: That's also an interesting approach. However, I see that "find" in the balanced tree using the priority key could be risky, because there can potentially be vertices with the same priority (you want to decrease-key your current vertex, not any vertex with the same priority). I believe that's a reason why this may not be a reliable choice. -- Naba Kumar (talk) 12:06, 31 May 2011 (UTC)
- We use (priority, vertex number) pairs as keys in the tree, not just priorities. And there's at most one key per vertex - decrease-key operation is implemented by removing (previous priority, vertex) and inserting (new priority, vertex). This is why it works correctly. -- X7q (talk) 14:02, 31 May 2011 (UTC)
- @David: That approach is so useful for implementers that I would hazard a "citation needed" and still mention it. Ideally, it could be described so that it is trivially logical, so as not to require an external proof/citation. For example, as long as it is explained that the priority of a vertex only decreases and never increases, and that the queue "does not care" about duplicate entries since the old priorities are still intact for the older entries, then I think it can survive. It's a tip most people will easily miss (or re-invent) if it isn't mentioned. Let me put this tip in the article (with a ref. to you, of course) and see if it survives scrutiny :). -- Naba Kumar (talk) 11:41, 31 May 2011 (UTC)
Is it certain that Dijkstra in general has running time O((|E| + |V|)log |V|)? If sources are needed I highly recommend taking a look at this book: http://www.cs.berkeley.edu/~vazirani/algorithms/all.pdf (S. Dasgupta, C. H. Papadimitriou, and U. V. Vazirani, 2006: "Algorithms"). High-quality stuff written by world-class researchers. Section 4.4 deals with Dijkstra specifically in the case of a priority queue, with an explicit definition of what is meant thereby. Superpronker (talk) 13:00, 7 June 2011 (UTC)
- In fact, on page 125 of the abovementioned book (the box: "Which heap is best?") the running time is given as O(|V| · deletemin + (|V| + |E|) · insert), where the running time for decreasekey is the same as for insert. This becomes important in explaining some of the running times. Please, someone, verify that I have got this correctly and update appropriately if there is a mistake in the article. Superpronker (talk) 14:03, 7 June 2011 (UTC)
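For what it's worth, substituting a binary heap's O(log |V|) cost for each operation into that expression recovers the bound quoted in the article (my arithmetic, not a quotation from the book):

```latex
% binary heap: deletemin, insert and decreasekey each cost O(\log|V|)
|V| \cdot O(\log|V|) \;+\; (|V| + |E|) \cdot O(\log|V|) \;=\; O\bigl((|V| + |E|)\log|V|\bigr)
```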
Algorithm
Step four of the algorithm is confusing. Its language states that a visited node's minimum distance will never be changed. This is wrong, as when the current node is switched a new minimum distance is calculated - keeping the distance that is minimum.
This is seen in the animation, where node 4 changes its value from 22 to 22 from steps 2 to 3.
Additionally, the end condition that 'all nodes' have been visited seems tenuous: given an infinite undirected graph with two identified points (a, b), this would take infinite computational time. The pseudocode on this page seems to address this - can we make it less ambiguous? 129.67.95.240 (talk) 13:10, 9 August 2011 (UTC)
- It's not wrong: when node 4's distance changes, that is the 'tentative distance' mentioned in step 3 changing. At that point node 4 is not the current node; node 3 is. Node 4 is being checked as a neighbor of the current node, node 3.
- The all-nodes termination is fine, too. Dijkstra's algorithm finds the minimal distance between a particular node and all other nodes in the graph (not a single destination), so it's obviously never going to terminate when run against an infinite graph. You need some variant of Dijkstra (such as A*) to handle infinite graphs. - MrOllie (talk) 14:58, 9 August 2011 (UTC)
For me, the confusing steps are 2 and 3. In step 2, you mark all nodes as unvisited, and then, in step 3, you add all neighbors of the current node to the unvisited set. Firstly, is there a reason for the unvisited mark when there is already a visited mark? Secondly, maybe the unvisited set is also unnecessary (as it just ties up useful space), because the nodes belonging to this set are those that have distance less than infinity and are not marked as visited. Vevek (talk) 23:28, 15 August 2011 (UTC)
I would suggest changing line 3 in the code for getting the shortest path to:
```
1  S := empty sequence
2  u := target
3  while u is defined:
4      insert u at the beginning of S
5      u := previous[u]
6  end while
```
This way you also get the start node, which is not the case in the current version. Milcke (talk) 17:06, 8 October 2011 (UTC)
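A runnable Python rendering of that suggestion (assuming `previous` maps each node to its predecessor and has no entry - or None - for the start node, which is what makes the loop stop):

```python
def shortest_path(previous, target):
    # previous: dict mapping each node to its predecessor on the shortest-path tree.
    # The start node is assumed to have no entry (or None), which ends the loop,
    # so the start node itself is included in the returned path.
    path = []
    u = target
    while u is not None:
        path.insert(0, u)       # insert u at the beginning of the path
        u = previous.get(u)     # becomes None at the start node
    return path
```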