Message-ID: <20240306205923.17190-1-kuniyu@amazon.com>
Date: Wed, 6 Mar 2024 12:59:23 -0800
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <pabeni@...hat.com>
CC: <davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
<kuni1840@...il.com>, <kuniyu@...zon.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH v4 net-next 12/15] af_unix: Assign a unique index to SCC.
From: Paolo Abeni <pabeni@...hat.com>
Date: Tue, 05 Mar 2024 09:44:00 +0100
> On Thu, 2024-02-29 at 18:22 -0800, Kuniyuki Iwashima wrote:
> > The definition of the lowlink in Tarjan's algorithm is the
> > smallest index of a vertex that is reachable with at most one
> > back-edge in SCC. This is not useful for a cross-edge.
> >
> > If we start traversing from A in the following graph, the final
> > lowlink of D is 3. The cross-edge here is one between D and C.
> >
> >   A -> B -> D   D = (4, 3)  (index, lowlink)
> >   ^    |    |   C = (3, 1)
> >   |    V    |   B = (2, 1)
> >   `--- C <--'   A = (1, 1)
> >
> > This is because the lowlink of D is updated with the index of C.
> >
> > In the following patch, we detect a dead SCC by checking two
> > conditions for each vertex.
> >
> > 1) vertex has no edge directed to another SCC (no bridge)
> > 2) vertex's out_degree is the same as the refcount of its file
> >
> > If 1) is false, there is a receiver of all fds of the SCC and
> > its ancestor SCC.
> >
> > To evaluate 1), we need to assign a unique index to each SCC and
> > assign it to all vertices in the SCC.
> >
> > This patch changes the lowlink update logic for cross-edge so
> > that in the example above, the lowlink of D is updated with the
> > lowlink of C.
> >
> >   A -> B -> D   D = (4, 1)  (index, lowlink)
> >   ^    |    |   C = (3, 1)
> >   |    V    |   B = (2, 1)
> >   `--- C <--'   A = (1, 1)
> >
> > Then, all vertices in the same SCC have the same lowlink, and we
> > can quickly find a bridge connecting to a different SCC, if one exists.
> >
> > However, it is no longer called lowlink, so we rename it to
> > scc_index. (It's sometimes called lowpoint.)
>
> I'm wondering if there is any reference to this variation of Tarjan's
> algorithm you can point to, to help with understanding, future memory,
> and reviewing.
I don't have any reference... Perhaps we can add a comment like
/* why ? git-blame me. */ or an .rst file under Documentation/ about
why GC is needed, how GC works, what algorithm is used, etc.

When I was wondering the same thing, I googled and found someone
who had the same question, but there was no reference.

https://stackoverflow.com/questions/23213993/what-is-the-lowelink-mean-of-tarjans-algorithm

There might be a textbook, but I couldn't find online resources.
Even the wiki says it looks odd.
> // The next line may look odd - but is correct.
> // It says w.index not w.lowlink; that is deliberate and from the original paper
> v.lowlink := min(v.lowlink, w.index)
https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
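
To make the difference concrete, here is a small Python sketch (not kernel
code; the function name `tarjan` and the flag `use_lowlink_for_visited` are
made up for illustration). It runs Tarjan's DFS on the four-vertex graph from
the commit message. With the flag off it applies the wiki rule
min(v.lowlink, w.index); with the flag on it applies the rule from this patch,
min(v.lowlink, w.lowlink), for an already-visited vertex still on the stack.
The edge order B -> C before B -> D is an assumption chosen so that the DFS
indexes match the tables in the commit message.

```python
def tarjan(graph, start, use_lowlink_for_visited):
    """Run Tarjan's DFS from `start`; return (index, low, sccs)."""
    index = {}        # DFS discovery number of each vertex
    low = {}          # lowlink (classic) or scc_index-like value (modified)
    stack = []        # vertices whose SCC has not been popped yet
    on_stack = set()
    sccs = []
    counter = 0

    def visit(v):
        nonlocal counter
        counter += 1
        index[v] = low[v] = counter
        stack.append(v)
        on_stack.add(v)

        for w in graph[v]:
            if w not in index:
                # Tree edge: recurse, then inherit the child's value.
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                # Back-edge or cross-edge inside the current SCC.
                # Classic rule uses index[w]; the modified rule uses low[w].
                other = low[w] if use_lowlink_for_visited else index[w]
                low[v] = min(low[v], other)

        if low[v] == index[v]:
            # v is the root of an SCC: pop its members off the stack.
            scc = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            sccs.append(scc)

    visit(start)
    return index, low, sccs


# Graph from the commit message: A -> B, B -> C, B -> D, C -> A, D -> C.
GRAPH = {"A": ["B"], "B": ["C", "D"], "C": ["A"], "D": ["C"]}

idx, classic, _ = tarjan(GRAPH, "A", use_lowlink_for_visited=False)
_, modified, _ = tarjan(GRAPH, "A", use_lowlink_for_visited=True)
print("D classic :", (idx["D"], classic["D"]))    # (4, 3)
print("D modified:", (idx["D"], modified["D"]))   # (4, 1)
```

On this graph the first call leaves D at (4, 3) and the second at (4, 1),
matching the two tables in the commit message. Both calls pop the same single
SCC here, since the root test low[v] == index[v] is unchanged.
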
Regarding "lowpoint", I saw it in the wiki for the first time.
> The lowlink is different from the lowpoint, which is the smallest
> index reachable from v through any part of the graph.
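
Read literally, that definition of lowpoint can be computed by brute force:
for each vertex, take the smallest DFS index among everything reachable from
it. A minimal Python sketch, assuming the example graph from the commit
message and the indexes A=1 .. D=4 (the helper name `lowpoint` is
hypothetical, and counting v itself as reachable is my assumption):

```python
from collections import deque


def lowpoint(graph, index):
    """For each vertex, the smallest index among vertices reachable from it.

    The vertex itself counts as reachable (zero-length path); that is an
    assumption, the quoted definition does not spell it out.
    """
    result = {}
    for v in graph:
        seen = {v}
        queue = deque([v])
        # Plain BFS over all edges, ignoring tree/back/cross distinctions.
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        result[v] = min(index[u] for u in seen)
    return result


# Graph and DFS indexes from the commit message example.
GRAPH = {"A": ["B"], "B": ["C", "D"], "C": ["A"], "D": ["C"]}
INDEX = {"A": 1, "B": 2, "C": 3, "D": 4}

LOWPOINT = lowpoint(GRAPH, INDEX)
print(LOWPOINT)
```

On this graph every vertex can reach A, so each lowpoint is 1, which is the
same value the modified update gives in the second table of the commit
message.
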
In a pdf linked from the wiki:
> lowpoint(v) = The lowest numbered vertex reachable from v using
> zero or more tree edges followed by at most one back or cross edge.
https://www.cs.cmu.edu/~15451-f18/lectures/lec19-DFS-strong-components.pdf
But I've just found that the original paper used LOWPT, which
is called lowlink now... :S
> LOWPT(v) := min(LOWPT(v), NUMBER(w));
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4569669