Message-ID: <874k3qxc3i.fsf@email.froward.int.ebiederm.org>
Date:   Tue, 22 Mar 2022 10:04:33 -0500
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Tony Battersby <tonyb@...ernetics.com>
Cc:     Jens Axboe <axboe@...nel.dk>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Oleg Nesterov <oleg@...hat.com>,
        Olivier Langlois <olivier@...llion01.com>
Subject: Re: [PATCH] kernel: make TIF_NOTIFY_SIGNAL and core dumps co-exist

Tony Battersby <tonyb@...ernetics.com> writes:

> On 8/19/21 10:59, Jens Axboe wrote:
>> On 8/18/21 8:57 PM, Linus Torvalds wrote:
>>> On Tue, Aug 17, 2021 at 8:06 PM Jens Axboe <axboe@...nel.dk> wrote:
>>>> task_work being added with notify == TWA_SIGNAL will utilize
>>>> TIF_NOTIFY_SIGNAL for signaling the targeted task that work is available.
>>>> If this happens while a task is going through a core dump, it'll
>>>> potentially disturb and truncate the dump as a signal interruption.
>>> This patch seems (a) buggy and (b) hacky.
>>>
>>>> --- a/kernel/task_work.c
>>>> +++ b/kernel/task_work.c
>>>> @@ -41,6 +41,12 @@ int task_work_add(struct task_struct *task, struct callback_head *work,
>>>>                 head = READ_ONCE(task->task_works);
>>>>                 if (unlikely(head == &work_exited))
>>>>                         return -ESRCH;
>>>> +               /*
>>>> +                * TIF_NOTIFY_SIGNAL notifications will interfere with
>>>> +                * a core dump in progress, reject them.
>>>> +                */
>>>> +               if (notify == TWA_SIGNAL && (task->flags & PF_SIGNALED))
>>>> +                       return -ESRCH;
>>> This basically seems to check task->flags with no serialization.
>>>
>>> I'm sure it works 99.9% of the time in practice, since you'd be really
>>> unlucky to hit any races, but I really don't see what the
>>> serialization logic is.
>>>
>>> Also, the main user that actually triggered the problem already has
>>>
>>>         if (unlikely(tsk->flags & PF_EXITING))
>>>                 goto fail;
>>>
>>> just above the call to task_work_add(), so this all seems very hacky indeed.
>>>
>>> Of course, I don't see what the serialization for _that_ one is either.
>>>
>>> Pls explain. You can't just randomly add tests for random flags that
>>> get modified by other random code.
>> You're absolutely right. On the io_uring side, in the current tree,
>> there's only one check where current != task being checked - and that's
>> in the poll rewait arming. That one should likely just go away. It may
>> be fine as it is, as it just pertains to ring exit cancelations. We want
>> to ensure that we don't rearm poll requests if the process is canceling
>> and going away. I'll take a closer look at that one.
>>
>> For this particular patch, I agree it's racy. I'll see if I can come up
>> with something better...
>>
>
> Continuing this thread from August 2021:
>
> I previously tested a version of Jens' patch backported to 5.10 and it
> fixed my problem.  Now I am trying to upgrade kernels, and 5.17 still
> has the same problem - coredumps from an io_uring program to a pipe are
> truncated.  Jens' patch applied to 5.17 again fixes the problem.  Has
> there been any progress with fixing the problem upstream?
>
> Reference:
>
> https://lore.kernel.org/all/8af373ec-9609-35a4-f185-f9bdc63d39b7@cybernetics.com/
> https://lore.kernel.org/all/76d3418c-e9ba-4392-858a-5da8028e3526@kernel.dk/

I am still slowly working on this.  (Unfortunately, I was preempted by
regressions elsewhere that were painful to track down and fix.)

When I double-checked that I understood the problem, the case you
describe turned out to be one of the easy cases that needs to be handled.

There is at least one more difficult interaction that is not solved by
squelching task_work_add after PF_SIGNALED is set, and I am not 100%
convinced that it is even correct to squelch task_work_add at that point
in the code.

The progress I have made to date that I am sending to Linus for v5.18 is
the removal of tracehook.h which makes the code much more
understandable.

I think I have a general solution, which I plan to post after
v5.18-rc1.  I have not yet tested it on the cases that I know about,
but I expect it will work.

So I think that puts a good general fix 2-3 weeks out.

This is quite possibly a case where perfection is getting in the way of
the good, but I honestly can't judge anything except a fix that cleans
up everything and is complete.  There are too many weird and subtle
interactions that I don't understand.

So I am going to continue concentrating on a good general solution so
that the code is readable and makes sense.

Eric
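
For readers following the serialization concern raised in the quoted
review, here is a minimal user-space sketch of what a serialized version
of "reject new work once the dump has started" could look like.  It is
only an analogy, not kernel code and not the fix discussed in this
thread.  The names work_queue, queue_begin_dump and queue_add are
hypothetical, and a pthread mutex stands in for whatever locking the
kernel would actually need.  The point is that the flag test and the
list insertion happen under the same lock that is held when the flag is
set, so an add can never slip in after the flag has been observed clear.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these names exist in the kernel. */
struct work_item {
	struct work_item *next;
};

struct work_queue {
	pthread_mutex_t lock;	/* serializes 'dumping', 'head' and 'count' */
	bool dumping;		/* plays the role of the PF_SIGNALED test */
	struct work_item *head;
	size_t count;
};

void queue_init(struct work_queue *q)
{
	assert(q != NULL);
	pthread_mutex_init(&q->lock, NULL);
	q->dumping = false;
	q->head = NULL;
	q->count = 0;
}

/* Set the flag under the lock, so every later queue_add() observes it. */
void queue_begin_dump(struct work_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->dumping = true;
	pthread_mutex_unlock(&q->lock);
}

/*
 * Check the flag and insert the item in one critical section.
 * Returns 0 on success, or -ESRCH once the dump has started.
 */
int queue_add(struct work_queue *q, struct work_item *w)
{
	int ret = 0;

	pthread_mutex_lock(&q->lock);
	if (q->dumping) {
		ret = -ESRCH;
	} else {
		w->next = q->head;
		q->head = w;
		q->count++;
	}
	pthread_mutex_unlock(&q->lock);
	return ret;
}
```

Work added before queue_begin_dump() is accepted, and anything added
afterwards gets -ESRCH.  The patch quoted above instead tests
task->flags without holding a lock shared with the code that sets
PF_SIGNALED, which is the gap the review points out.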
