Does anyone know whether it’s possible to emulate a graph database with a relational by adding specific meta-data tables or indices? Emulate in the sense that you can do efficient graph traversals that drill along a path for a certain non-trivial depth.

  • YummyLateness
    link
    fedilink
    English
    arrow-up
    3
    ·
    1 year ago

    I recall seeing some implementation that did that by having a node table (with whatever metadata you need to assign to each node) and a link table (with a first column as starting node and second column as ending node, with extra columns if the links require extra metadata).

    In the easiest graph configuration (a direct acyclic graph or DAG) you can perform traversals by repeatedly querying by first column on the link table. A Djkstra should also be straightforward (assuming a weight column in the link table; you can store the distances either in memory or into a ephemeral table, depending on the graph size magnitude and use case).

    More complex graphs may require some conventions (if links are not directional you would need to establish a convention on first and second column, and if more than one link can exist between two nodes then you cannot do a unique together with those two columns and need an extra identifier for each link).

    That feels like the most naive approach to your problem. If you need a more complex solution maybe look at the RDF strategies (although those are used as an alternative to relational databases if I remember correctly, so maybe they do not make sense I’m this contex).

    • MattolOPM
      link
      fedilink
      English
      arrow-up
      1
      ·
      1 year ago

      Thanks, that’s helpful! Is this repeated querying essentially what graph databases do under the hood? Can this be more efficient than multiple joins? Also, what are RDF strategies?

      • YummyLateness
        link
        fedilink
        English
        arrow-up
        2
        ·
        1 year ago

        Of course it can be done more efficiently… with a non-relational database xD

        If you want performance look into real graph databases implementations. Optimizing a graph database over a relational database seems a waste of time IMHO. I may be mistaken.

        If you haven’t read this, start here: https://en.wikipedia.org/wiki/Triplestore

      • lurkingllama
        link
        fedilink
        English
        arrow-up
        2
        ·
        1 year ago

        You can do many common graph search operations in only one query using recursive CTEs (common table expressions). You should probably look into the documentation of your RDBMS because details may vary, but here’s an example from PostgreSQL’s documentation on recursive CTEs:

        WITH RECURSIVE search_graph(id, link, data, depth) AS (
            SELECT g.id, g.link, g.data, 1
            FROM graph g
          UNION ALL
            SELECT g.id, g.link, g.data, sg.depth + 1
            FROM graph g, search_graph sg
            WHERE g.id = sg.link
        ) CYCLE id SET is_cycle USING path
        SELECT * FROM search_graph;
        
        • MattolOPM
          link
          fedilink
          English
          arrow-up
          2
          ·
          1 year ago

          How do sql query engines backed by S3 like Trino (Athena) behave for euch queries?

          • lurkingllama
            link
            fedilink
            English
            arrow-up
            2
            ·
            1 year ago

            I don’t know. But recursive CTEs habe been part of the standard since SQL:1999, so if they advertise compliance with SQL:1999 or a newer version of the standard, they should at least be able to execute the query. To find out whether they do so efficiently, you could do an experiment (remember most SQL engines allow you to look at the query plan for a given query) or read their documentation.