lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 14 Nov 2023 10:24:23 -1000
From:   Tejun Heo <tj@...nel.org>
To:     junxiao.bi@...cle.com
Cc:     linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
        gregkh@...uxfoundation.org
Subject: Re: [PATCH] kernfs: support kernfs notify in memory recliam context

Hello,

On Tue, Nov 14, 2023 at 12:09:19PM -0800, junxiao.bi@...cle.com wrote:
> On 11/14/23 11:06 AM, Tejun Heo wrote:
> > On Tue, Nov 14, 2023 at 10:59:47AM -0800, Junxiao Bi wrote:
> > > kernfs notify is used in write path of md (md_write_start) to wake up
> > > userspace daemon, like "mdmon" for updating md superblock of imsm raid,
> > > md write will wait for that update done before issuing the write, if this
> > How is forward progress guarnateed for that userspace daemon? This sounds
> > like a really fragile setup.
> 
> For imsm raid, userspace daemon "mdmon" is responsible for updating raid
> metadata, kernel will use kernfs_notify to wake up the daemon anywhere
> metadata update is required. If the daemon can't move forward, write may
> hung, but that will be a bug in the daemon?

I see. That sounds very fragile and I'm not quite sure that can ever be made
to work reliably. While memlocking everything needed and being really
judicious about which syscalls to make will probably get you pretty far,
there are things like task_work which gets scheduled and executed when a
task is about to return to userspace. Those things are allowed to grab e.g.
mutexes in the kernel and allocate memory and so on, and can deadlock under
memory pressure.

Even just looking at kernfs_notify, it down_reads()
root->kernfs_supers_rwsem which might already be write-locked by somebody
who's waiting on memory allocation.

The patch you're proposing removes one link in the dependency chain but
there are many on there. I'm not sure this is fixable. Nobody writes kernel
code thinking that userspace code can be on the memory reclaim dependency
chain.

Thanks.

-- 
tejun

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ