lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <49a43774-bf97-4b20-8382-4fb921f34c66@csgroup.eu>
Date: Tue, 17 Dec 2024 09:59:16 +0100
From: Christophe Leroy <christophe.leroy@...roup.eu>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S. Miller" <davem@...emloft.net>, Jakub Kicinski
 <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 Simon Horman <horms@...nel.org>, "Eric W. Biederman"
 <ebiederm@...ssion.com>, linux-kernel@...r.kernel.org,
 netdev@...r.kernel.org, Maxime Chevallier <maxime.chevallier@...tlin.com>,
 TRINH THAI Florent <florent.trinh-thai@...soprasteria.com>,
 CASAUBON Jean Michel <jean-michel.casaubon@...soprasteria.com>
Subject: Re: [PATCH net] net: sysfs: Fix deadlock situation in sysfs accesses



Le 17/12/2024 à 09:16, Eric Dumazet a écrit :
> On Tue, Dec 17, 2024 at 8:18 AM Christophe Leroy
> <christophe.leroy@...roup.eu> wrote:
>>
>> The following problem is encountered on kernel built with
>> CONFIG_PREEMPT. An snmp daemon running with normal priority is
>> regularly calling ioctl(SIOCGMIIPHY). Another process running with
>> SCHED_FIFO policy is regularly reading /sys/class/net/eth0/carrier.
>>
>> After some random time, the snmp daemon gets preempted while holding
>> the RTNL mutex then the high priority process is busy looping into
>> carrier_show which bails out early due to a non-successfull
>> rtnl_trylock() which implies restart_syscall(). Because the snmp
>> daemon has a lower priority, it never gets the chances to release
>> the RTNL mutex and the high-priority task continues to loop forever.
>>
>> Replace the trylock by lock_interruptible. This will increase the
>> priority of the task holding the lock so that it can release it and
>> allow the reader of /sys/class/net/eth0/carrier to actually perform
>> its read.
>>

...

>>
>> Fixes: 336ca57c3b4e ("net-sysfs: Use rtnl_trylock in sysfs methods.")
>> Signed-off-by: Christophe Leroy <christophe.leroy@...roup.eu>
>> ---
> 
> At a first glance, this might resurface the deadlock issue Eric W. Biederman
> was trying to fix in 336ca57c3b4e ("net-sysfs: Use rtnl_trylock in
> sysfs methods.")

Are you talking about the deadlock fixed (incompletely) by 5a5990d3090b 
("net: Avoid race between network down and sysfs"), or the complement 
provided by 336ca57c3b4e ?

My understanding is that mutex_lock() will return EINTR only if a signal 
is pending so there is no need to set signal_pending like it was when 
using mutex_trylock() which does nothing when the mutex is already locked.

And an EINTR return is expected and documented for a read() or a 
write(), I can't see why we want ERESTARTNOINTR instead of ERSTARTSYS. 
Isn't it the responsibility of the user app to call again read or write 
if it has decided to not install the necessary sigaction for an 
automatic restart ?

Do you think I should instead use rtnl_lock_killable() and return 
ERESTARTNOINTR in case of failure ? In that case, is it still possible 
to interrupt a blocked 'cat /sys/class/net/eth0/carrier' which CTRL+C ?

> 
> I was hoping that at some point, some sysfs write methods could be
> marked as : "We do not need to hold the sysfs lock"


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ