linux-kernel - Re: [PATCH v2] x86: Fix x32 System V message queue syscalls

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <7842A462-0ADB-4EE3-B4CB-AE6DCD70CE1C@jrtc27.com>
Date:   Sun, 1 Nov 2020 01:27:35 +0000
From:   Jessica Clarke <jrtc27@...c27.com>
To:     Rich Felker <dalias@...c.org>
Cc:     Andy Lutomirski <luto@...nel.org>,
        Florian Weimer <fweimer@...hat.com>,
        linux-x86_64@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] x86: Fix x32 System V message queue syscalls

On 1 Nov 2020, at 01:22, Rich Felker <dalias@...c.org> wrote:
> On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote:
>> cc: some libc folks
>> 
>> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@...c27.com> wrote:
>>> 
>>> POSIX specifies that the first field of the supplied msgp, namely mtype,
>>> is a long, not a __kernel_long_t, and it's a user-defined struct due to
>>> the variable-length mtext field so we can't even bend the spec and make
>>> it a __kernel_long_t even if we wanted to. Thus we must use the compat
>>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
>>> msgrcv respectively.
>> 
>> This is a mess.
>> 
>> include/uapi/linux/msg.h has:
>> 
>> /* message buffer for msgsnd and msgrcv calls */
>> struct msgbuf {
>>        __kernel_long_t mtype;          /* type of message */
>>        char mtext[1];                  /* message text */
>> };
>> 
>> Your test has:
>> 
>> struct msg_long {
>>    long mtype;
>>    char mtext[8];
>> };
>> 
>> struct msg_long_ext {
>>    struct msg_long msg_long;
>>    char mext[4];
>> };
>> 
>> and I'm unclear as to exactly what you're trying to do there with the
>> "mext" part.
>> 
>> POSIX says:
>> 
>>       The application shall ensure that the argument msgp points to  a  user-
>>       defined  buffer that contains first a field of type long specifying the
>>       type of the message, and then a data portion that holds the data  bytes
>>       of the message. The structure below is an example of what this user-de‐
>>       fined buffer might look like:
>> 
>>           struct mymsg {
>>               long   mtype;       /* Message type. */
>>               char   mtext[1];    /* Message text. */
>>           }
>> 
>> NTP has this delightful piece of code:
>> 
>>   44 typedef union {
>>   45   struct msgbuf msgp;
>>   46   struct {
>>   47     long mtype;
>>   48     int code;
>>   49     struct timeval tv;
>>   50   } msgb;
>>   51 } MsgBuf;
>> 
>> bluefish has:
>> 
>> struct small_msgbuf {
>> long mtype;
>> char mtext[MSQ_QUEUE_SMALL_SIZE];
>> } small_msgp;
>> 
>> 
>> My laptop has nothing at all in /dev/mqueue.
>> 
>> So I don't really know what the right thing to do is.  Certainly if
>> we're going to apply this patch, we should also fix the header.  I
>> almost think we should *delete* struct msgbuf from the headers, since
>> it's all kinds of busted, but that will break the NTP build.  Ideally
>> we would go back in time and remove it from the headers.
>> 
>> Libc people, any insight?  We can probably fix the bug without
>> annoying anyone given how lightly x32 is used and how lightly POSIX
>> message queues are used.
> 
> If it's that outright wrong and always has been, I feel like the old
> syscall numbers should just be deprecated and new ones assigned.
> Otherwise, there's no way for userspace to be safe against data
> corruption when run on older kernels. If there's a new syscall number,
> libc can just use the new one unconditionally (giving ENOSYS on
> kernels where it would be broken) or have a x32-specific
> implementation that makes the old syscall and performs translation if
> the new one fails with ENOSYS.

That doesn't really help broken code continue to work reliably, as
upgrading libc will just pull in the new syscall for a binary that's
expecting the broken behaviour, unless you do symbol versioning, but
then it'll just break when you next recompile the code, and there's no
way for that to be diagnosed given the *application* has to define the
type. But given it's application-defined I really struggle to see how
any code out there is actually expecting the current x32 behaviour as
you'd have to go really out of your way to find out that x32 is broken
and needs __kernel_long_t. I don't think there's any way around just
technically breaking ABI whilst likely really fixing ABI in 99.999% of
cases (maybe 100%).

Jess