lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190726080133.yrxsaaxasxudyjj4@brauner.io>
Date:   Fri, 26 Jul 2019 10:01:34 +0200
From:   Christian Brauner <christian@...uner.io>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Oleg Nesterov <oleg@...hat.com>, Arnd Bergmann <arnd@...db.de>,
        Kees Cook <keescook@...omium.org>,
        Joel Fernandes <joel@...lfernandes.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Tejun Heo <tj@...nel.org>, David Howells <dhowells@...hat.com>,
        Jann Horn <jannh@...gle.com>,
        Andrew Lutomirski <luto@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Aleksa Sarai <cyphar@...har.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Android Kernel Team <kernel-team@...roid.com>
Subject: Re: [RFC][PATCH 1/5] exit: kill struct waitid_info

On Thu, Jul 25, 2019 at 07:46:50AM -0500, Eric W. Biederman wrote:
> Linus Torvalds <torvalds@...ux-foundation.org> writes:
> 
> > On Wed, Jul 24, 2019 at 7:47 AM Christian Brauner <christian@...uner.io> wrote:
> >>
> >> The code here uses a struct waitid_info to catch basic information about
> >> process exit including the pid, uid, status, and signal that caused the
> >> process to exit. This information is then stuffed into a struct siginfo
> >> for the waitid() syscall. That seems like an odd thing to do. We can
> >> just pass down a siginfo_t struct directly which let's us cleanup and
> >> simplify the whole code quite a bit.
> >
> > Ack. Except I'd like the commit message to explain where this comes
> > from instead of that "That seems like an odd thing to do".
> >
> > The _original_ reason for "struct waitid_info" was that "siginfo_t" is
> > huge because of all the insane padding that various architectures do.
> >
> > So it was introduced by commit 67d7ddded322 ("waitid(2): leave copyout
> > of siginfo to syscall itself") very much to avoid stack usage issues.
> > And I quote:
> >
> >     collect the information needed for siginfo into
> >     a small structure (waitid_info)
> >
> > simply because "sigset_t" was big.
> >
> > But that size came from the explicit "pad it out to 128 bytes for
> > future expansion that will never happen", and the kernel using the
> > same exact sigset_t that was exposed to user space.
> >
> > Then in commit 4ce5f9c9e754 ("signal: Use a smaller struct siginfo in
> > the kernel") we got rid of the insane padding for in-kernel use,
> > exactly because it causes issues like this.
> >
> > End result: that "struct waitid_info" no longer makes sense. It's not
> > appreciably smaller than kernel_siginfo_t is today, but it made sense
> > at the time.
> 
> Apologies.  I meant to reply yesterday but I was preempted by baby
> issues.
> 
> I strongly disagree that this direction makes sense.  The largest
> value that I see from struct waitid_info is that it makes it possible to
> reason about which values are returned where struct kernel_siginfo does
> not.
> 
> One of the details the existence of struct waitid_info makes clear is
> that unlike the related child death path the wait code does not
> fillin si_utime and si_stime.  Which is very important to know when you
> are dealing with y2038 issues and Arnd Bergmann is.
> 
> The most egregious example I know of using siginfo wrong is:
> 70f1b0d34bdf ("signal/usb: Replace kill_pid_info_as_cred with
> kill_pid_usb_asyncio").  But just by moving struct siginfo out of the
> program logic and into dedicated little functions that just deal with
> the craziness of struct siginfo I have found lots of little bugs.
> 
> We don't need that kind of invitation to bugs in the wait logic.

I don't think it's a strong enough argument for rejecting this change.
Suspecting that something might go wrong if we simplify something is a
valid call to proceed with caution and be on the lookout for potential
regressions so we can react fast. I respect that. But it's not
necessarily a good argument to reject a change.

I'm happy to switch from an initializer (which is not even clear is a
bug) to using clear_siginfo().
And I'm also going to split this patch out of the P_PIDFD patch but I'm
going to send this out again. I haven't heard a sound argument why this
patch is worse than what we have right now in there.

Christian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ