linux-kernel - Re: [RFC] DEPT(DEPendency Tracker) with DLM(Distributed Lock Manager)

Message-ID: <20250529072202.GA13739@system.software.com>
Date: Thu, 29 May 2025 16:22:02 +0900
From: Byungchul Park <byungchul@...com>
To: Alexander Aring <aahringo@...hat.com>
Cc: kernel_team@...ynix.com, linux-kernel@...r.kernel.org,
	gfs2 <gfs2@...ts.linux.dev>
Subject: Re: [RFC] DEPT(DEPendency Tracker) with DLM(Distributed Lock Manager)

On Wed, May 28, 2025 at 08:00:02AM -0400, Alexander Aring wrote:
> Hi,
> 
> On Sun, May 25, 2025 at 8:13 PM Alexander Aring <aahringo@...hat.com> wrote:
> >
> > Hi,
> >
> > On Thu, May 22, 2025 at 1:28 AM Byungchul Park <byungchul@...com> wrote:
> > >
> > > On Thu, May 22, 2025 at 02:24:53PM +0900, Byungchul Park wrote:
> > > > Hi Alexander,
> > > >
> > > > We briefly talked about dept with DLM in an external channel.  However,
> > > > it'd be great to discuss what to aim and how to make it in more detail,
> > > > in this mailing list.
> > > >
> > > > It's worth noting that dept doesn't track dependencies beyond different
> > > > contexts to avoid adding false dependencies by any chance, which means
> > > > though dept checks the dependency sanity *globally*, when it comes to
> > > > creating dependencies, it happens only within e.g. each single system
> > > > call context, each single irq context, each worker context, and so on,
> > > > with its unique context id assigned to each independent context.
> > > >
> > > > In order for dept to work on DLM, we need a way to assign a unique
> > > > context id to each interesting context in DLM's point of view, and let
> > > > dept know the id.  Once making it done, I think dept can work on DLM
> > > > perfectly.
> > >
> > > Plus, we need a way to share the global dependency graph used by dept
> > > between nodes too.
> > >
> >
> > Having everything simulated and having nodes separated as
> > net-namespaces in one Linux kernel instance is I think at first
> > simpler to do and will show the "proof of concepts".
> > Sharing data between nodes is then just some memory area that is not
> > separated by per "struct net" context.
> 
> Alternatively the master node of the lock (this node knows everything
> about the lock operations being done including the nodes that are
> waiting to get the lock granted) can be used to detect cycles, we

Sounds good.

> already do that for some simple cases when converting locks directly
> [0]. Maybe this is already enough to have all the information, but it

It seems that DLM already tries to detect deadlocks.  Can you provide
an example scenario where the current detection logic doesn't work?
It'd help me work out what needs to be done to improve both DLM and dept.

> is not just a "wait_event()" mechanism, there needs to be some other
> API to use DEPT for this case?

Dept would need to be modified to work with isolated context ids, with
each id corresponding to a node rather than to a plain kernel context
such as a system call or irq context.  I don't think that is hard to
implement.
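
A minimal userspace sketch of the per-node context id idea, not dept
code.  Every name here (toy_node_ctx_id(), toy_may_link(), MAX_NODES) is
hypothetical, and the only rule modelled is the one stated earlier in
the thread: dependencies are created only within one context.

```c
#include <stdbool.h>

#define MAX_NODES 16	/* hypothetical limit for the sketch */

static unsigned int next_ctx_id = 1;		/* 0 means "no context" */
static unsigned int node_ctx_id[MAX_NODES];	/* lazily assigned ids */

/* Return the context id for a node, assigning one on first use. */
static unsigned int toy_node_ctx_id(int nodeid)
{
	if (nodeid < 0 || nodeid >= MAX_NODES)
		return 0;
	if (!node_ctx_id[nodeid])
		node_ctx_id[nodeid] = next_ctx_id++;
	return node_ctx_id[nodeid];
}

/*
 * A wait and an event request may be linked into a dependency only if
 * both were recorded under the same, valid context id.
 */
static bool toy_may_link(unsigned int req_ctx, unsigned int wait_ctx)
{
	return req_ctx != 0 && req_ctx == wait_ctx;
}
```

With this, two requests from the same node share an id and can be
linked, while requests from different nodes never create a dependency
directly.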

To answer your question, we might need to add a few dept annotations.
I'm afraid I don't understand how DLM works well enough yet, but for
example:

   1. when receiving a lock(L1) request from a node(N1) that might wait,

      annotate dept_wait(L1, events that can wake up N1) for N1 context,
      where events are all the events that can wake up N1 from waiting.

   2. when receiving a lock(L1) request from a node(N1) that does not
      involve waiting but just tries,

      no need to annotate.

   3. when the request is granted,

      annotate dept_request_event(L1), which means that, from now on,
      there might be waiters for L1 to be released.

   4. when releasing the lock(L1) no matter who releases the lock - the
      releaser doesn't have to be N1,

      annotate dept_event(L1) for the releaser.
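
The four steps above can be modelled with a toy userspace sketch.  This
is not the dept API: toy_wait(), toy_request_event(), toy_event() and
toy_has_cycle() are hypothetical stand-ins, and recording the dependency
at wait time is a simplifying assumption.  Step 2 (trylock) has no hook
because it needs no annotation.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_NODES 8	/* hypothetical limits for the toy model */
#define MAX_LOCKS 8

/* held[n][l]: lock l was granted in node n's context, not yet released. */
static bool held[MAX_NODES][MAX_LOCKS];
/* dep[a][b]: the release of lock a may have to wait for lock b. */
static bool dep[MAX_LOCKS][MAX_LOCKS];

static void toy_reset(void)
{
	memset(held, 0, sizeof(held));
	memset(dep, 0, sizeof(dep));
}

/* Step 1: a request for lock l that might wait arrives from node n. */
static void toy_wait(int n, int l)
{
	for (int h = 0; h < MAX_LOCKS; h++)
		if (held[n][h] && h != l)
			dep[h][l] = true;
}

/* Step 3: lock l is granted to node n. */
static void toy_request_event(int n, int l)
{
	held[n][l] = true;
}

/* Step 4: lock l is released; the releaser need not be the grantee. */
static void toy_event(int l)
{
	for (int n = 0; n < MAX_NODES; n++)
		held[n][l] = false;
}

/* Depth-first search: is 'target' reachable from 'from' via dep edges? */
static bool reaches(int from, int target, bool *seen)
{
	for (int next = 0; next < MAX_LOCKS; next++) {
		if (!dep[from][next])
			continue;
		if (next == target)
			return true;
		if (!seen[next]) {
			seen[next] = true;
			if (reaches(next, target, seen))
				return true;
		}
	}
	return false;
}

/* True if the release of some lock transitively depends on itself. */
static bool toy_has_cycle(void)
{
	for (int l = 0; l < MAX_LOCKS; l++) {
		bool seen[MAX_LOCKS] = { false };

		if (reaches(l, l, seen))
			return true;
	}
	return false;
}
```

In this model, N1 holding L1 while waiting for L2, together with N2
holding L2 while waiting for L1, records L1 -> L2 and L2 -> L1, so
toy_has_cycle() reports the ABBA pattern.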

Roughly, these annotations are needed, but again, it'd be helpful if you
could provide an example scenario where your current detection logic
doesn't work, for better discussion.

	Byungchul

> 
> - Alex
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/dlm/lock.c?h=v6.15#n2163