Message-ID: <b6b5300d-35a0-3bc0-ad1d-f2af433ef27e@csgroup.eu>
Date: Fri, 16 Apr 2021 16:40:52 +0200
From: Christophe Leroy <christophe.leroy@...roup.eu>
To: Steven Price <steven.price@....com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
akpm@...ux-foundation.org
Cc: linux-arch@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-kernel@...r.kernel.org, linux-s390@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
linux-riscv@...ts.infradead.org, x86@...nel.org, linux-mm@...ck.org
Subject: Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()
On 16/04/2021 at 15:00, Steven Price wrote:
> On 16/04/2021 12:08, Christophe Leroy wrote:
>>
>>
>> On 16/04/2021 at 12:51, Steven Price wrote:
>>> On 16/04/2021 11:38, Christophe Leroy wrote:
>>>>
>>>>
>>>> On 16/04/2021 at 11:28, Steven Price wrote:
>>>>> On 15/04/2021 18:18, Christophe Leroy wrote:
>>>>>> In order to support large pages on powerpc, notepage()
>>>>>> needs to know the page size of the page.
>>>>>>
>>>>>> Add a page_size argument to notepage().
>>>>>>
>>>>>> Signed-off-by: Christophe Leroy <christophe.leroy@...roup.eu>
>>>>>> ---
>>>>>> arch/arm64/mm/ptdump.c | 2 +-
>>>>>> arch/riscv/mm/ptdump.c | 2 +-
>>>>>> arch/s390/mm/dump_pagetables.c | 3 ++-
>>>>>> arch/x86/mm/dump_pagetables.c | 2 +-
>>>>>> include/linux/ptdump.h | 2 +-
>>>>>> mm/ptdump.c | 16 ++++++++--------
>>>>>> 6 files changed, 14 insertions(+), 13 deletions(-)
>>>>>>
>>>>> [...]
>>>>>> diff --git a/mm/ptdump.c b/mm/ptdump.c
>>>>>> index da751448d0e4..61cd16afb1c8 100644
>>>>>> --- a/mm/ptdump.c
>>>>>> +++ b/mm/ptdump.c
>>>>>> @@ -17,7 +17,7 @@ static inline int note_kasan_page_table(struct mm_walk *walk,
>>>>>> {
>>>>>> struct ptdump_state *st = walk->private;
>>>>>> - st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]));
>>>>>> + st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]), PAGE_SIZE);
>>>>>
>>>>> I'm not completely sure what the page_size is going to be used for, but note that KASAN
>>>>> presents an interesting case here. We short-cut by detecting it's a KASAN region at a high
>>>>> level (PGD/P4D/PUD/PMD) and instead of walking the tree down just call note_page() *once* but
>>>>> with level==4 because we know KASAN sets up the page table like that.
>>>>>
>>>>> However the one call actually covers a much larger region - so while PAGE_SIZE matches the
>>>>> level it doesn't match the region covered. AFAICT this will lead to odd results if you enable
>>>>> KASAN on powerpc.
>>>>
>>>> Hmm... I successfully tested it with KASAN, but I now realise that I tested it with
>>>> CONFIG_KASAN_VMALLOC selected. In this situation, since
>>>> https://github.com/torvalds/linux/commit/af3d0a686 we don't have any common shadow page table
>>>> anymore.
>>>>
>>>> I'll test again without CONFIG_KASAN_VMALLOC.
>>>>
>>>>>
>>>>> To be honest I don't fully understand why powerpc requires the page_size - it appears to be
>>>>> using it purely to find "holes" in the calls to note_page(), but I haven't worked out why such
>>>>> holes would occur.
>>>>
>>>> It was indeed introduced for KASAN. We have a first commit
>>>> https://github.com/torvalds/linux/commit/cabe8138 which uses the page size to detect whether
>>>> the region is a KASAN-like area.
>>>>
>>>> Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a fix. I can't remember what the
>>>> problem was exactly, something around the use of hugepages for kernel memory, came as part of
>>>> the series
>>>> https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.leroy@csgroup.eu/
>>>
>>>
>>>
>>>
>>> Ah, that's useful context. So it looks like powerpc took a different route to reducing the KASAN
>>> output to x86.
>>>
>>> Given the generic ptdump code has handling for KASAN already it should be possible to drop that
>>> from the powerpc arch code, which I think means we don't actually need to provide page size to
>>> notepage(). Hopefully that means more code to delete ;)
>>>
>>
>> Yes ... and no.
>>
>> It looks like the generic ptdump handles the case where several pgdir entries point to the same
>> kasan_early_shadow_pte. But it doesn't take into account the powerpc case where we have regular
>> page tables in which several (if not all) PTEs point to the kasan_early_shadow_page.
>
> I'm not sure I follow quite how powerpc is different here. But could you have a similar check for
> PTEs against kasan_early_shadow_pte as the other levels already have?
>
> I'm just worried that page_size isn't well defined in this interface and it's going to cause
> problems in the future.
>
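
To make the suggestion above concrete, a PTE-level check could look roughly like the user-space toy below. This is only a sketch, not kernel code: toy_pte_pfn(), toy_table_is_all_shadow(), TOY_PAGE_SHIFT and the PTE layout (flags in the low 12 bits) are all hypothetical names and assumptions made for illustration. It reports whether a regular page table contains nothing but entries mapping one shared shadow page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12u /* assumption: 4K base pages, flags in the low bits */

/* Extract the page frame number from a toy PTE (hypothetical layout). */
static uint64_t toy_pte_pfn(uint64_t pte)
{
	return pte >> TOY_PAGE_SHIFT;
}

/*
 * Return true when every entry of a PTE table maps the same shared
 * shadow page, i.e. the table is a regular one but its content looks
 * like the common early-shadow table. An empty table returns false.
 */
static bool toy_table_is_all_shadow(const uint64_t *ptes, size_t n,
				    uint64_t shadow_pfn)
{
	assert(ptes != NULL);

	for (size_t i = 0; i < n; i++) {
		if (toy_pte_pfn(ptes[i]) != shadow_pfn)
			return false; /* at least one real mapping */
	}
	return n > 0;
}
```

A walker built on such a check could report the whole table once instead of once per PTE, in the same spirit as the existing checks at the higher levels.
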
I'm trying. I reverted the two commits b00ff6d8c and cabe8138.

At the moment, I don't get exactly what I expect: for linear memory I get one line for each 8M page,
whereas before reverting the patches I got one 16M line and one 112M line.

And for the KASAN shadow area I get two lines for the 2x 8M pages shadowing linear memory, then one 4M
line for each PGDIR entry pointing to kasan_early_shadow_pte:

0xf8000000-0xf87fffff 0x07000000 8M huge rw present
0xf8800000-0xf8ffffff 0x07800000 8M huge rw present
0xf9000000-0xf93fffff 0x01430000 4M r present
0xf9400000-0xf97fffff 0x01430000 4M r present
0xf9800000-0xf9bfffff 0x01430000 4M r present
0xf9c00000-0xf9ffffff 0x01430000 4M r present
0xfa000000-0xfa3fffff 0x01430000 4M r present
0xfa400000-0xfa7fffff 0x01430000 4M r present
0xfa800000-0xfabfffff 0x01430000 4M r present
0xfac00000-0xfaffffff 0x01430000 4M r present
0xfb000000-0xfb3fffff 0x01430000 4M r present
0xfb400000-0xfb7fffff 0x01430000 4M r present
0xfb800000-0xfbbfffff 0x01430000 4M r present
0xfbc00000-0xfbffffff 0x01430000 4M r present
0xfc000000-0xfc3fffff 0x01430000 4M r present
0xfc400000-0xfc7fffff 0x01430000 4M r present
0xfc800000-0xfcbfffff 0x01430000 4M r present
0xfcc00000-0xfcffffff 0x01430000 4M r present
0xfd000000-0xfd3fffff 0x01430000 4M r present
0xfd400000-0xfd7fffff 0x01430000 4M r present
0xfd800000-0xfdbfffff 0x01430000 4M r present
0xfdc00000-0xfdffffff 0x01430000 4M r present
0xfe000000-0xfe3fffff 0x01430000 4M r present
0xfe400000-0xfe7fffff 0x01430000 4M r present
0xfe800000-0xfebfffff 0x01430000 4M r present
0xfec00000-0xfeffffff 0x01430000 4M r present
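
One way to reason about the listing above is a toy model of how a dumper merges neighbouring entries into one output line. The sketch below is user-space C, not the kernel code: struct mapping, count_lines() and the merge rule itself are assumptions made for illustration. It assumes an entry extends the current line only when its flags match and its physical address advanced by exactly `step` bytes, with `step == 0` meaning "use the real size of the previous entry".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One leaf mapping as reported to a dumper (hypothetical type). */
struct mapping {
	uint32_t va;    /* virtual start address */
	uint32_t pa;    /* physical start address */
	uint32_t size;  /* bytes covered by this entry */
	uint32_t flags; /* opaque attribute bits */
};

/*
 * Count the output lines under the assumed merge rule: an entry joins
 * the previous line only if the flags are equal and the physical
 * address is exactly the previous one plus `step`. With step == 0 the
 * size of the previous entry is used instead of a fixed step.
 */
static size_t count_lines(const struct mapping *m, size_t n, uint32_t step)
{
	size_t lines = 0;

	assert(m != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (i > 0) {
			uint32_t expect = m[i - 1].pa +
					  (step ? step : m[i - 1].size);

			if (m[i].flags == m[i - 1].flags && m[i].pa == expect)
				continue; /* contiguous: extends the current line */
		}
		lines++; /* starts a new output line */
	}
	return lines;
}
```

Fed with the 26 entries of the listing, this model gives 26 lines with a fixed 4K step (an assumed base page size) and 25 lines when the real entry size is used, because only the two 8M pages are physically contiguous. The 24 entries of 4M all carry the same physical address 0x01430000, so a pure contiguity rule never merges them, whatever the step.
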
Any idea?
Christophe