linux-kernel - Re: [PATCH 2/2] coredump: add a new elf note with siginfo fields of the signal

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAK1hOcN79Gewb4x1W=WEDL6ESpcRQ0SD5mPzjRN6TY8jJvhCQw@mail.gmail.com>
Date:	Fri, 14 Sep 2012 14:27:37 +0200
From:	Denys Vlasenko <vda.linux@...glemail.com>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Amerigo Wang <amwang@...hat.com>,
	Roland McGrath <roland@...k.frob.com>
Subject: Re: [PATCH 2/2] coredump: add a new elf note with siginfo fields of
 the signal

On Thu, Sep 13, 2012 at 5:36 PM, Oleg Nesterov <oleg@...hat.com> wrote:
> On 09/13, Denys Vlasenko wrote:
>>
>> This patch adds a new elf note, NT_SIGINFO, which contains
>> the remaining fields of siginfo_t.
>
> I can't really comment this patch, but...
>
>> +struct coredump_siginfo {
...
>> +};
...
>
> I can't understand the layout. struct siginfo is union, for example
> si_overrun only makes sense if si_code = SI_TIMER.
>
> Not sure this is right. I think fill_siginfo_note() should either do
> memcpy() and let userspace to decode this (raw) info, or this layout
> should be unified with copy_siginfo_to_user().

The reason I did not do that is two-fold. First, and most important one,
is layout and alignment reasons. This is *coredump file*,
not in-memory structure. While siginfo_t in memory is always on the same
machine, coredump may be looked at on a completely different
architecture. It would be not too hard if the only differences are 32/64
bitness and whether longs need to be aligned as ints or twice that,
but there are many more subtleties when we deal with insane nested
struct/union beast siginfo_t is.

Example #1:

        union {
...
                struct {
                        __kernel_pid_t _pid;    /* sender's pid */
                        __ARCH_SI_UID_T _uid;   /* sender's uid */
                } _kill;

How wide is __ARCH_SI_UID_T?
Are we sure developer working on machine with architecture A
will get it right for arches B, C, D, and obscure one XYZ
he doesn't even know exists?

Example #2: on which architectures si_trapno exists?

Basically, to resolve these problems userspace coredump
parsing code will be forced to carry *per-architecture*
struct siginfo definitions. Which probably means 20+ structs.
A small nightmare. Imagine how many "oh no, our struct
siginfo for (e.g.) Blackfin was broken for last 3 years
and we didn't notice!" bugs there will be...

With my approach, userspace developers won't be haunted
by these problems: in coredump, we use *unambiguous
format*, where all fields exist on all architectures,
all fields are separate (no nested unions) and all fields
are ints or longs (no shorts, no chars).
Basically, only two different layouts will exist:
32-bit one, and 64-bit one.

Yes, we might be wasting a few bytes, but (1) we also _save_
a few bytes by not saving signo/errno/code redundantly,
and (2) a few bytes saved in *coredump* is well in the noise.

> Note also that we do not expose the upper bits of si_code to user-space,
> probably coredump should do the same, I dunno.

Yes, we shouldn't leak data.

-- 
vda
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/