Class HnswUtil

java.lang.Object
org.apache.lucene.util.hnsw.HnswUtil

public class HnswUtil extends Object
Utilities for use in tests involving HNSW graphs
  • Constructor Details

    • HnswUtil

      private HnswUtil()
  • Method Details

    • isRooted

      static boolean isRooted(HnswGraph knnValues) throws IOException
      Returns true if every node on every level is reachable from node 0.
      Throws:
      IOException
    • componentSizes

      static List<Integer> componentSizes(HnswGraph hnsw) throws IOException
      Returns the sizes of the distinct graph components on level 0. If the graph is fully-rooted the list will have one entry. If it is empty, the returned list will be empty.
      Throws:
      IOException
    • componentSizes

      static List<Integer> componentSizes(HnswGraph hnsw, int level) throws IOException
      Returns the sizes of the distinct graph components on the given level. The forest starting at the entry points (nodes in the next highest level) is considered as a single component. If the entire graph is rooted in the entry points--that is, every node is reachable from at least one entry point--the returned list will have a single entry. If the graph is empty, the returned list will be empty.
      Throws:
      IOException
    • components

      static List<HnswUtil.Component> components(HnswGraph hnsw, int level, FixedBitSet notFullyConnected, int maxConn) throws IOException
      Throws:
      IOException
    • markRooted

      private static HnswUtil.Component markRooted(HnswGraph hnswGraph, int level, FixedBitSet connectedNodes, FixedBitSet notFullyConnected, int maxConn, int entryPoint) throws IOException
      Count the nodes in a rooted component of the graph and set the bits of its nodes in connectedNodes bitset. Rooted means nodes that can be reached from a root node.
      Parameters:
      hnswGraph - the graph to check
      level - the level of the graph to check
      connectedNodes - a bitset the size of the entire graph with 1's indicating nodes that have been marked as connected. This method updates the bitset.
      notFullyConnected - a bitset the size of the entire graph. On output, we mark nodes visited having fewer than maxConn connections. May be null.
      maxConn - the maximum number of connections for any node (aka M).
      entryPoint - a node id to start at
      Throws:
      IOException
    • nextClearBit

      private static int nextClearBit(FixedBitSet bits, int index)
    • graphIsRooted

      public static boolean graphIsRooted(IndexReader reader, String vectorField) throws IOException
      In graph theory, "connected components" are really defined only for undirected (ie bidirectional) graphs. Our graphs are directed, because of pruning, but they are *mostly* undirected. In this case we compute components starting from a single node so what we are really measuring is whether the graph is a "rooted graph". TODO: measure whether the graph is "strongly connected" ie there is a path from every node to every other node.
      Throws:
      IOException