linux-kernel - Re: [BUG -tip] kmemleak and stacktrace cause page faul

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.DEB.2.21.1910231457400.2308@nanos.tec.linutronix.de>
Date:   Wed, 23 Oct 2019 15:21:05 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Cyrill Gorcunov <gorcunov@...il.com>
cc:     LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>, linux-mm@...ck.org,
        Catalin Marinas <catalin.marinas@....com>
Subject: Re: [BUG -tip] kmemleak and stacktrace cause page faul

On Tue, 22 Oct 2019, Cyrill Gorcunov wrote:
> On Tue, Oct 22, 2019 at 05:23:25PM +0300, Cyrill Gorcunov wrote:
> > 
> > I presume the kmemleak tries to save stack trace too early when estack_pages are not
> > yet filled.
> 
> Indeed, at this stage of boot the percpu_setup_exception_stacks has not been called
> yet and estack_pages full of crap
> 
> [    0.157502] stk 0x1008 k 1 begin 0x0 end 0xd000 estack_pages 0xffffffff82014880 ep 0xffffffff82014888
> [    0.159395] estack_pages[0] = 0x0
> [    0.160046] estack_pages[1] = 0x5100000001000
> [    0.160881] estack_pages[2] = 0x0
> [    0.161530] estack_pages[3] = 0x6100000003000
> [    0.162343] estack_pages[4] = 0x0
> [    0.162962] estack_pages[5] = 0x0
> [    0.163523] estack_pages[6] = 0x0
> [    0.164065] estack_pages[7] = 0x8100000007000
> [    0.164978] estack_pages[8] = 0x0
> [    0.165624] estack_pages[9] = 0x9100000009000
> [    0.166448] estack_pages[10] = 0x0
> [    0.167064] estack_pages[11] = 0xa10000000b000
> [    0.168055] estack_pages[12] = 0x0

Errm. estack_pages is statically initialized and it's an array of:.

struct estack_pages {
        u32     offs;
        u16     size;
        u16     type;
};

[0,2,4,5,6,8,10,12] are guard pages so 0 is not that crappy at all

The rest looks completely valid if you actually decode it proper.

e.g. 0x51000 00001000

     bit  0-31: 00001000		Offset 0x1000: 1 Page
     bit 32-47: 1000			Size 0x1000:   1 Page
     bit 48-63: 5			Type 5: STACK_TYPE_EXCEPTION + ESTACK_DF

So, no. This is NOT the problem.

But yes, you are right that percpu_setup_exception_stacks() has not yet
been called simply because the percpu entry area has not been mapped yet.

So lets look at the full context:

        begin = (unsigned long)__this_cpu_read(cea_exception_stacks);

When percpu_setup_exception_stacks() has not been called yet, then begin
should be 0.

        end = begin + sizeof(struct cea_exception_stacks);

end should be 0 + sizeof(struct cea_exception_stacks);

        /* Bail if @stack is outside the exception stack area. */
        if (stk < begin || stk >= end)
                return false;

So 'begin <= stk < end' must be true to get to the below:

        /* Calc page offset from start of exception stacks */
        k = (stk - begin) >> PAGE_SHIFT;

which gives a valid 'k' no matter what 'begin' is. And obviously 'k' cannot
be outside of the array size of estack_pages.

        /* Lookup the page descriptor */
        ep = &estack_pages[k];

Ergo ep must be a valid pointer pointing to the statically allocated and
statically initialized estack_pages array.

        /* Guard page? */
        if (!ep->size)

How on earth can dereferencing ep crash the machine?

                return false;

That does not make any sense.

Surely, we should not even try to decode exception stack when
cea_exception_stacks is not yet initialized, but that does not explain
anything what you are observing.

Thanks,

	tglx