Module TSort
In: lib/tsort.rb

TSort implements topological sorting using Tarjan's algorithm for strongly connected components.

TSort is designed to work with any object that can be interpreted as a directed graph.

To interpret an object as a graph, TSort requires two methods: tsort_each_node, which iterates over all nodes of the graph, and tsort_each_child, which iterates over the child nodes of a given node.

Node equality is defined by eql? and hash, since TSort uses a Hash internally.
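
As a minimal sketch of what the equality rule means in practice, the following defines a node class whose instances compare by name. The Task and TaskGraph classes are hypothetical names used only for illustration; they are not part of TSort.

```ruby
require 'tsort'

# Hypothetical node type: two Task objects with the same name count as one node.
class Task
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # TSort tracks visited nodes in a Hash, so eql? and hash must agree.
  def eql?(other)
    other.is_a?(Task) && name == other.name
  end

  def hash
    name.hash
  end
end

# Hypothetical wrapper that maps each Task to the Tasks it depends on.
class TaskGraph
  include TSort

  def initialize(edges)
    @edges = edges
  end

  def tsort_each_node(&block)
    @edges.each_key(&block)
  end

  def tsort_each_child(node, &block)
    @edges.fetch(node).each(&block)
  end
end

# The child of "link" is a different object that is eql? to the "compile" key.
graph = TaskGraph.new({
  Task.new('compile') => [],
  Task.new('link')    => [Task.new('compile')]
})
order = graph.tsort.map(&:name)
puts order.inspect
```

Without the eql? and hash definitions, the second Task.new('compile') would be treated as a separate node, and the fetch call in this sketch would fail to find it.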

A Simple Example

The following example demonstrates how to mix the TSort module into an existing class (in this case, Hash). Here, we're treating each key in the hash as a node in the graph, and so we simply alias the required tsort_each_node method to Hash's each_key method. For each key in the hash, the associated value is an array of the node's child nodes. This choice in turn leads to our implementation of the required tsort_each_child method, which fetches the array of child nodes and then iterates over that array using the user-supplied block.

  require 'tsort'

  class Hash
    include TSort
    alias tsort_each_node each_key
    def tsort_each_child(node, &block)
      fetch(node).each(&block)
    end
  end

  {1=>[2, 3], 2=>[3], 3=>[], 4=>[]}.tsort
  #=> [3, 2, 1, 4]

  {1=>[2], 2=>[3, 4], 3=>[2], 4=>[]}.strongly_connected_components
  #=> [[4], [2, 3], [1]]

A More Realistic Example

A very simple `make'-like tool can be implemented as follows:

  require 'tsort'

  class Make
    def initialize
      @dep = {}
      @dep.default = []
    end

    def rule(outputs, inputs=[], &block)
      triple = [outputs, inputs, block]
      outputs.each {|f| @dep[f] = [triple]}
      @dep[triple] = inputs
    end

    def build(target)
      each_strongly_connected_component_from(target) {|ns|
        if ns.length != 1
          fs = ns.delete_if {|n| Array === n}
          raise TSort::Cyclic.new("cyclic dependencies: #{fs.join ', '}")
        end
        n = ns.first
        if Array === n
          outputs, inputs, block = n
          inputs_time = inputs.map {|f| File.mtime f}.max
          begin
            outputs_time = outputs.map {|f| File.mtime f}.min
          rescue Errno::ENOENT
            outputs_time = nil
          end
          if outputs_time == nil ||
             inputs_time != nil && outputs_time <= inputs_time
            sleep 1 if inputs_time != nil && inputs_time.to_i == Time.now.to_i
            block.call
          end
        end
      }
    end

    def tsort_each_child(node, &block)
      @dep[node].each(&block)
    end
    include TSort
  end

  def command(arg)
    print arg, "\n"
    system arg
  end

  m = Make.new
  m.rule(%w[t1]) { command 'date > t1' }
  m.rule(%w[t2]) { command 'date > t2' }
  m.rule(%w[t3]) { command 'date > t3' }
  m.rule(%w[t4], %w[t1 t3]) { command 'cat t1 t3 > t4' }
  m.rule(%w[t5], %w[t4 t2]) { command 'cat t4 t2 > t5' }
  m.build('t5')

Bugs

  • 'tsort.rb' is a misnomer, because this library uses Tarjan's algorithm for strongly connected components. 'strongly_connected_components.rb' would be accurate, but it is too long.

References

  1. R. E. Tarjan, "Depth First Search and Linear Graph Algorithms", SIAM Journal on Computing, Vol. 1, No. 2, pp. 146-160, June 1972.

Classes and Modules

Class TSort::Cyclic

Public Instance methods

each_strongly_connected_component is the iterator version of the strongly_connected_components method. obj.each_strongly_connected_component is similar to obj.strongly_connected_components.each, except that modifying obj during the iteration may lead to unexpected results.

each_strongly_connected_component returns nil.

  # File lib/tsort.rb, line 178
  def each_strongly_connected_component # :yields: nodes
    id_map = {}
    stack = []
    tsort_each_node {|node|
      unless id_map.include? node
        each_strongly_connected_component_from(node, id_map, stack) {|c|
          yield c
        }
      end
    }
    nil
  end
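
The following is a small usage sketch for each_strongly_connected_component, reusing the cyclic graph from the simple example. The Graph wrapper class is a hypothetical name introduced here so that Hash itself is not reopened.

```ruby
require 'tsort'

# Hypothetical wrapper around a Hash of node => array of child nodes.
class Graph
  include TSort

  def initialize(edges)
    @edges = edges
  end

  def tsort_each_node(&block)
    @edges.each_key(&block)
  end

  def tsort_each_child(node, &block)
    @edges.fetch(node).each(&block)
  end
end

graph = Graph.new({ 1 => [2], 2 => [3, 4], 3 => [2], 4 => [] })

# Each component is yielded as soon as it is complete, children first.
components = []
result = graph.each_strongly_connected_component { |nodes| components << nodes }

puts components.inspect  # nodes 2 and 3 form a cycle, so they share a component
puts result.inspect      # the iterator itself returns nil
```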

each_strongly_connected_component_from iterates over the strongly connected components in the subgraph reachable from node.

The return value is unspecified.

each_strongly_connected_component_from doesn't call tsort_each_node.

  # File lib/tsort.rb, line 199
  def each_strongly_connected_component_from(node, id_map={}, stack=[]) # :yields: nodes
    minimum_id = node_id = id_map[node] = id_map.size
    stack_length = stack.length
    stack << node

    tsort_each_child(node) {|child|
      if id_map.include? child
        child_id = id_map[child]
        minimum_id = child_id if child_id && child_id < minimum_id
      else
        sub_minimum_id =
          each_strongly_connected_component_from(child, id_map, stack) {|c|
            yield c
          }
        minimum_id = sub_minimum_id if sub_minimum_id < minimum_id
      end
    }

    if node_id == minimum_id
      component = stack.slice!(stack_length .. -1)
      component.each {|n| id_map[n] = nil}
      yield component
    end

    minimum_id
  end
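
Because each_strongly_connected_component_from never calls tsort_each_node, a class that only needs to walk the graph from a known starting node can get by with tsort_each_child alone. The sketch below illustrates this under that assumption; the Deps class and the node names are hypothetical.

```ruby
require 'tsort'

# Hypothetical dependency table that defines only tsort_each_child.
class Deps
  include TSort

  def initialize(edges)
    @edges = edges
  end

  def tsort_each_child(node, &block)
    @edges.fetch(node, []).each(&block)
  end
end

deps = Deps.new({
  'app'  => ['lib'],
  'lib'  => ['core'],
  'core' => [],
  'docs' => ['app']
})

# Only nodes reachable from 'lib' are visited; 'app' and 'docs' are not.
reached = []
deps.each_strongly_connected_component_from('lib') { |nodes| reached.concat(nodes) }
puts reached.inspect
```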

strongly_connected_components returns the strongly connected components as an array of arrays of nodes. The array is sorted from children to parents. Each element of the array represents a strongly connected component.

  # File lib/tsort.rb, line 163
  def strongly_connected_components
    result = []
    each_strongly_connected_component {|component| result << component}
    result
  end

tsort returns a topologically sorted array of nodes. The array is sorted from children to parents, i.e. the first element has no children and the last element has no parents.

If there is a cycle, TSort::Cyclic is raised.

  # File lib/tsort.rb, line 134
  def tsort
    result = []
    tsort_each {|element| result << element}
    result
  end
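
To illustrate the cycle behaviour of tsort, here is a short sketch that sorts one acyclic graph and rescues TSort::Cyclic for a cyclic one. The Graph wrapper class is a hypothetical helper, not part of the library.

```ruby
require 'tsort'

# Hypothetical wrapper around a Hash of node => array of child nodes.
class Graph
  include TSort

  def initialize(edges)
    @edges = edges
  end

  def tsort_each_node(&block)
    @edges.each_key(&block)
  end

  def tsort_each_child(node, &block)
    @edges.fetch(node).each(&block)
  end
end

acyclic = Graph.new({ a: [:b], b: [:c], c: [] })
sorted = acyclic.tsort
puts sorted.inspect  # children come before their parents

cyclic = Graph.new({ a: [:b], b: [:a] })
raised = false
begin
  cyclic.tsort
rescue TSort::Cyclic => e
  raised = true
  puts "cycle detected: #{e.message}"
end
```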

tsort_each is the iterator version of the tsort method. obj.tsort_each is similar to obj.tsort.each, except that modifying obj during the iteration may lead to unexpected results.

tsort_each returns nil. If there is a cycle, TSort::Cyclic is raised.

  # File lib/tsort.rb, line 148
  def tsort_each # :yields: node
    each_strongly_connected_component {|component|
      if component.size == 1
        yield component.first
      else
        raise Cyclic.new("topological sort failed: #{component.inspect}")
      end
    }
  end
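
One practical difference from tsort is that tsort_each yields nodes as it goes, so some nodes may already have been processed when a cycle is discovered later. The sketch below shows this with a hypothetical Graph wrapper class.

```ruby
require 'tsort'

# Hypothetical wrapper around a Hash of node => array of child nodes.
class Graph
  include TSort

  def initialize(edges)
    @edges = edges
  end

  def tsort_each_node(&block)
    @edges.each_key(&block)
  end

  def tsort_each_child(node, &block)
    @edges.fetch(node).each(&block)
  end
end

# Node 1 stands alone; nodes 2 and 3 depend on each other.
graph = Graph.new({ 1 => [], 2 => [3], 3 => [2] })

seen = []
raised = false
begin
  graph.tsort_each { |node| seen << node }
rescue TSort::Cyclic
  raised = true
end

puts seen.inspect  # node 1 was yielded before the cycle was reached
```

If partial processing is not acceptable, calling tsort first and iterating over its result avoids it, at the cost of building the whole array up front.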

tsort_each_child should be implemented by the class that includes TSort.

It is used to iterate over the child nodes of node.

  # File lib/tsort.rb, line 240
  def tsort_each_child(node) # :yields: child
    raise NotImplementedError.new
  end

tsort_each_node should be implemented by the class that includes TSort.

It is used to iterate over all nodes of the graph.

  # File lib/tsort.rb, line 231
  def tsort_each_node # :yields: node
    raise NotImplementedError.new
  end
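
The two hook methods do not have to be backed by a Hash. As a rough sketch, the hypothetical EdgeList class below stores the graph as a node list plus [parent, child] pairs, and the last lines show what happens when the hooks are left undefined.

```ruby
require 'tsort'

# Hypothetical graph stored as a list of nodes and [parent, child] edges.
class EdgeList
  include TSort

  def initialize(nodes, edges)
    @nodes = nodes
    @edges = edges
  end

  # Yield every node in the graph.
  def tsort_each_node(&block)
    @nodes.each(&block)
  end

  # Yield each child of the given node.
  def tsort_each_child(node)
    @edges.each { |parent, child| yield child if parent == node }
  end
end

graph = EdgeList.new(%w[shirt tie jacket], [%w[tie shirt], %w[jacket tie]])
order = graph.tsort
puts order.inspect

# An object that extends TSort without defining the hooks cannot be sorted.
bare = Object.new.extend(TSort)
missing = false
begin
  bare.tsort
rescue NotImplementedError
  missing = true
end
```

Note that NotImplementedError is not a StandardError, so a bare rescue clause would not catch it.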
