[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <276439ce-aad9-e72f-7bc9-c57fb4a59339@c-s.fr>
Date: Fri, 14 Dec 2018 09:01:52 +0100
From: Christophe Leroy <christophe.leroy@....fr>
To: Michael Ellerman <mpe@...erman.id.au>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>
Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH] powerpc/mm: make NULL pointer deferences explicit on bad
page faults.
Hi Michael,
Le 14/12/2018 à 01:57, Michael Ellerman a écrit :
> Hi Christophe,
>
> You know it's the trivial patches that are going to get lots of review
> comments :)
I'm so happy to get comments.
>
> Christophe Leroy <christophe.leroy@....fr> writes:
>> As several other arches including x86, this patch makes it explicit
>> that a bad page fault is a NULL pointer dereference when the fault
>> address is lower than PAGE_SIZE
>
> I'm being pedantic, but it's not necessarily a NULL pointer dereference.
> It might just be a direct access to a low address, eg:
>
> char *p = 0x100;
> *p = 0;
>
> That's not a NULL pointer dereference.
>
> But other arches do print this so I guess it's OK to add, and in most
> cases it will be an actual NULL pointer dereference.
>
> I wonder though if we should use 4096 rather than PAGE_SIZE, given
> that's the actual value other arches are using. We support 256K pages on
> some systems, which is getting quite large.
Those invalid accesses are catched because the first page is marked non
present or non accessible in the page table, so I thing using PAGE_SIZE
here is valid regardless of the page size.
Looks like the arches have PAGE_SHIFT ranging from 12 to 16 mainly.
>
>> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
>> index d51cf5f4e45e..501a1eadb3e9 100644
>> --- a/arch/powerpc/mm/fault.c
>> +++ b/arch/powerpc/mm/fault.c
>> @@ -631,13 +631,16 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
>> switch (TRAP(regs)) {
>> case 0x300:
>> case 0x380:
>> - printk(KERN_ALERT "Unable to handle kernel paging request for "
>> - "data at address 0x%08lx\n", regs->dar);
>> + pr_alert("Unable to handle kernel %s for data at address 0x%08lx\n",
>> + regs->dar < PAGE_SIZE ? "NULL pointer dereference" :
>> + "paging request",
>> + regs->dar);
>
> This is now too long I think, with printk time you get:
>
> [ 1096.450711] Unable to handle kernel NULL pointer dereference for data at address 0x00000000
>
> Which is 93 columns. It's true on many systems it doesn't really matter
> any more, but it would still be good if it was shorter.
>
> I like that on x86 they prefix it with "BUG:", just to avoid any confusion.
>
> What if we had for the NULL pointer case:
>
> BUG: Kernel NULL pointer dereference at 0x00000000
>
> And for the normal case:
>
> BUG: Unable to handle kernel data access at 0x00000000
>
> Note on the very next line we print:
> Faulting instruction address: 0xc000000000795cc8
>
> So there should be no confusion about whether "at" refers to the data
> address or the instruction address.
Agreed
>
>> case 0x400:
>> case 0x480:
>> - printk(KERN_ALERT "Unable to handle kernel paging request for "
>> - "instruction fetch\n");
>> + pr_alert("Unable to handle kernel %s for instruction fetch\n",
>> + regs->nip < PAGE_SIZE ? "NULL pointer dereference" :
>> + "paging request");
>
> I don't really like using "NULL pointer dereference" here, that
> terminology makes me think of a load/store, I think it confuses things
> rather than making it clearer.
>
> What about:
>
> BUG: Unable to handle kernel instruction fetch at 0x00000000
I think we still need to make it explicit that we jumped there due to a
NULL function pointer, allthought I don't have a good text idea yet for
this.
>
>
>> break;
>> case 0x600:
>> printk(KERN_ALERT "Unable to handle kernel paging request for "
>
> It would be good to clean up these other cases as well. They seem to be
> trying to use the "page request for" terminology which leads to them
> being very wordy. I assume that was done to help people grepping kernel
> logs for errors, but I think we should not worry about that if we have
> the "BUG:" prefix.
>
> So we have:
> printk(KERN_ALERT "Unable to handle kernel paging request for "
> "unaligned access at address 0x%08lx\n", regs->dar);
>
> What about:
>
> BUG: Unable to handle kernel unaligned access at 0x00000000
>
> And:
> printk(KERN_ALERT "Unable to handle kernel paging request for "
> "unknown fault\n");
>
> What about:
>
> BUG: Unable to handle unknown paging fault at 0x00000000
>
>
> Thoughts?
Looks like good ideas I'll carry on.
Christophe
Powered by blists - more mailing lists