Class DiffImpl

java.lang.Object
graphql.schema.diffing.DiffImpl

public class DiffImpl extends Object
This is an algorithm calculating the optimal edit to change the source graph into the target graph.

It is based on the following two papers (both papers are from the same authors. The first one is newer, but the older one is more detailed in some aspects)

Accelerating Graph Similarity Search via Efficient GED Computation (https://lijunchang.github.io/pdf/2022-ged-tkde.pdf)

Efficient Graph Edit Distance Computation and Verification via Anchor-aware Lower Bound Estimation (https://arxiv.org/abs/1709.06810)

The algorithm is a modified version of "AStar-BMao". It is adapted to directed graphs as a GraphQL schema is most naturally represented as directed graph (vs the undirected graphs used in the papers).

  • Field Details

  • Constructor Details

  • Method Details

    • diffImpl

      DiffImpl.OptimalEdit diffImpl(Mapping startMapping, List<Vertex> allSources, List<Vertex> allTargets, AtomicInteger algoIterationCount) throws Exception
      Throws:
      Exception
    • addChildToQueue

      private void addChildToQueue(int fixedEditorialCost, DiffImpl.MappingEntry parentEntry, PriorityQueue<DiffImpl.MappingEntry> queue, DiffImpl.OptimalEdit optimalEdit, List<Vertex> allSources, List<Vertex> allTargets)
    • updateOptimalEdit

      private void updateOptimalEdit(DiffImpl.OptimalEdit optimalEdit, int newGed, Mapping mapping)
    • calculateRestOfChildren

      private void calculateRestOfChildren(List<Vertex> availableTargetVertices, HungarianAlgorithm hungarianAlgorithm, double[][] costMatrixCopy, double editorialCostForMapping, Mapping partialMapping, Vertex v_i, int upperBound, int level, LinkedBlockingQueue<DiffImpl.MappingEntry> siblings)
    • addSiblingToQueue

      private void addSiblingToQueue(int fixedEditorialCost, int level, PriorityQueue<DiffImpl.MappingEntry> queue, DiffImpl.OptimalEdit optimalEdit, List<Vertex> allSources, List<Vertex> allTargets, DiffImpl.MappingEntry mappingEntry) throws InterruptedException
      Throws:
      InterruptedException
    • expandMappingAndUpdateOptimalMapping

      private void expandMappingAndUpdateOptimalMapping(int fixedEditorialCost, int level, DiffImpl.OptimalEdit optimalEdit, List<Vertex> allSources, Mapping toExpand, int[] assignments, List<Vertex> availableTargetVertices, double lowerBoundCost)
      Extend the partial mapping to a full mapping according to the optimal matching (hungarian algo result) and update the optimal edit if we found a better one.
    • getCostMatrixSum

      private double getCostMatrixSum(double[][] costMatrix, int[] assignments)
    • calcLowerBoundMappingCost

      private double calcLowerBoundMappingCost(Vertex v, Vertex u, Mapping partialMapping, Map<Vertex,Double> isolatedVerticesCache, Map<Vertex,Vertex> nonFixedParentRestrictions)
      lower bound mapping cost between for v -> u in respect to a partial mapping. It basically tells the minimal costs we can expect for all mappings that come from extending the partial mapping with v -> u.

      This is basically the formula (5) from page 6 of https://lijunchang.github.io/pdf/2022-ged-tkde.pdf.

      The main difference is that the formula works with undirected graphs, but we have a directed graph, hence there is no 1/2 factor and for comparing the labels of anchored vertices to v/u we need to take both directions into account.

      The other optimization is that a schema graph will have never a lot of adjacent edges compared to the overall vertices count, therefore the algorithm for the anchored vertices costs iterates over the adjacent edges of v/u instead of all the mapped vertices.

      Additionally, there is a shortcut for isolated vertices, representing deletion/insertion which is also cached.

      Some naming: an anchored vertex is a vertex that is mapped via the partial mapping. An inner edge is an edge between two vertices that are both not anchored (mapped). The vertices v and u are by definition not mapped.

    • calcAnchoredVerticesCost

      private int calcAnchoredVerticesCost(Vertex v, Vertex u, Mapping partialMapping)
    • calcLowerBoundMappingCostForIsolated

      private double calcLowerBoundMappingCostForIsolated(Vertex vertex, Mapping partialMapping, boolean sourceOrTarget)
      Simplified lower bound calc if the source/target vertex is isolated