Message-ID: <41455798-1dcb-135f-516d-25ab9a8082f5@linux.intel.com>
Date:   Wed, 19 Oct 2022 18:57:38 +0100
From:   Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>
To:     "Jason A. Donenfeld" <Jason@...c4.com>
Cc:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        linux-kernel@...r.kernel.org,
        "Intel-gfx@...ts.freedesktop.org" <Intel-gfx@...ts.freedesktop.org>,
        Ville Syrjälä <ville.syrjala@...ux.intel.com>
Subject: Re: signal: break out of wait loops on kthread_stop()


On 19/10/2022 17:00, Jason A. Donenfeld wrote:
> On Wed, Oct 19, 2022 at 7:31 AM Tvrtko Ursulin
> <tvrtko.ursulin@...ux.intel.com> wrote:
>>
>>
>> Hi,
>>
>> A question regarding a7c01fa93aeb ("signal: break out of wait loops on
>> kthread_stop()") if I may.
>>
>> We have a bunch of code in i915, possibly limited to selftests (i.e. debug
>> builds) but still important for our flows, which spawns kernel threads
>> and exercises parts of the driver.
>>
>> The problem we are hitting with this patch is that this code did not
>> really need to be signal aware until now. To put it more accurately: we
>> were able to test code which is normally executed from userspace, and so
>> is signal aware, without worrying about -ERESTARTSYS or -EINTR within the
>> test cases themselves.
>>
>> For example threads which exercise an internal API for a while until the
>> parent calls kthread_stop. Now those tests can hit unexpected errors.
>>
>> Question is how to best approach working around this change. It is of
>> course technically possible to rework our code in more than one way,
>> although with some cost and impact already felt due to reduced pass rates
>> in our automated test suites.
>>
>> Maybe an opt out kthread flag from this new behavior? Would that be
>> acceptable as a quick fix? Or any other comments?
> 
> You can opt out by running `clear_tsk_thread_flag(current,
> TIF_NOTIFY_SIGNAL);` at the top of your kthread. But you should really
> fix your code instead. Were I your reviewer, I wouldn't merge code
> that took the lazy path like that. However, that should work, if you
> do opt for the quick fix.

Right, but our hand is a bit forced at the moment. Since 6.1-rc1
propagated to our development tree on Monday, our automated testing has
been failing significantly, which prevents us from merging new work until
this is resolved. So in the short term a quick fix trumps the ideal road,
simply because it is quick.

Also, are you confident that the change will not catch anyone else by 
surprise? In the original thread I did not spot any concerns about the 
kthreads being generally unprepared to start receiving EINTR/ERESTARTSYS 
from random call chains.
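
To make the failure mode concrete, here is a minimal userspace model in C. It is not kernel or i915 code. All names (`model_thread`, `model_kthread_stop`, `model_exercise_api`, `naive_worker`, `tolerant_worker`) are hypothetical, and the assumption that kthread_stop() leaves a pending-signal flag set on the thread is inferred from the TIF_NOTIFY_SIGNAL hint in the quoted reply. The sketch contrasts a test thread that treats every error as a failure with one that accepts an interrupted call once a stop has been requested.

```c
#include <errno.h>
#include <stdbool.h>

/* ERESTARTSYS is kernel-internal (value 512) and absent from userspace <errno.h>. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical stand-in for one test kthread; not a kernel structure. */
struct model_thread {
	bool should_stop;   /* models kthread_should_stop() */
	bool notify_signal; /* models a pending-signal flag on the thread (assumption) */
	int iterations;     /* API calls that completed successfully */
	int stop_after;     /* parent requests stop once this many calls completed */
};

/* Models the parent calling kthread_stop() with the new behavior (assumption). */
static void model_kthread_stop(struct model_thread *t)
{
	t->should_stop = true;
	t->notify_signal = true;
}

/* Models a signal-aware internal API whose interruptible wait gets cut short. */
static int model_exercise_api(struct model_thread *t)
{
	/* The stop request arrives while the thread is inside the call. */
	if (t->iterations == t->stop_after)
		model_kthread_stop(t);
	if (t->notify_signal)
		return -ERESTARTSYS;
	t->iterations++;
	return 0;
}

/* Test thread written before the change: any error counts as a failure. */
static int naive_worker(struct model_thread *t)
{
	while (!t->should_stop) {
		int err = model_exercise_api(t);

		if (err)
			return err;
	}
	return 0;
}

/* Reworked test thread: an interrupted call is benign once a stop was requested. */
static int tolerant_worker(struct model_thread *t)
{
	while (!t->should_stop) {
		int err = model_exercise_api(t);

		if ((err == -EINTR || err == -ERESTARTSYS) && t->should_stop)
			break;
		if (err)
			return err;
	}
	return 0;
}
```

In this model, `naive_worker()` reports -ERESTARTSYS when the stop lands mid-call, while `tolerant_worker()` exits cleanly. A real rework would also have to consider calls that return these codes for reasons other than the stop request, which is why the tolerant variant still checks `should_stop` before ignoring the error.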

Regards,

Tvrtko
