Date:   Fri, 6 Oct 2017 11:25:16 -0400
From:   Pasha Tatashin <pasha.tatashin@...cle.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     linux-kernel@...r.kernel.org, sparclinux@...r.kernel.org,
        linux-mm@...ck.org, linuxppc-dev@...ts.ozlabs.org,
        linux-s390@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        x86@...nel.org, kasan-dev@...glegroups.com, borntraeger@...ibm.com,
        heiko.carstens@...ibm.com, davem@...emloft.net,
        willy@...radead.org, ard.biesheuvel@...aro.org,
        mark.rutland@....com, will.deacon@....com, catalin.marinas@....com,
        sam@...nborg.org, mgorman@...hsingularity.net,
        steven.sistare@...cle.com, daniel.m.jordan@...cle.com,
        bob.picco@...cle.com
Subject: Re: [PATCH v10 05/10] mm: zero reserved and unavailable struct pages

Hi Michal,

> 
> As I've said in other reply this should go in only if the scenario you
> describe is real. I am somehow suspicious to be honest. I simply do not
> see how those weird struct pages would be in a valid pfn range of any
> zone.
> 

There are examples of both cases: unavailable memory that is not part of
any zone, and unavailable memory that is part of a zone.

I run Linux in kvm with these arguments:

         qemu-system-x86_64 \
         -enable-kvm \
         -cpu kvm64 \
         -kernel $kernel \
         -initrd $initrd \
         -m 512 \
         -smp 2 \
         -device e1000,netdev=net0 \
         -netdev user,id=net0 \
         -boot order=nc \
         -no-reboot \
         -watchdog i6300esb \
         -watchdog-action debug \
         -rtc base=localtime \
         -serial stdio \
         -display none \
         -monitor null

This patch reports that there are 98 unavailable pages.

They are: pfn 0 and pfns in range [159, 255].

Note that trim_low_memory_range() reserves only pfns in the range
[0, 15]; it does not reserve the [159, 255] ones.

e820__memblock_setup() reports to Linux that the following physical
ranges are available:
     [1, 158]
     [256, 130783]

Notice that exactly the unavailable pfns are missing!
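
As a quick arithmetic check of the numbers above, here is a minimal sketch that counts the pfns below the end of the last available range that are not covered by any available range. It is not how the patch finds unavailable pages; struct pfn_range and count_holes() are hypothetical names used only for this illustration.

```c
#include <stddef.h>

/* Inclusive pfn range, in the same notation as the ranges quoted above. */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Count pfns in [0, end of last range] that fall outside every
 * available range. The ranges must be sorted and non-overlapping.
 */
static unsigned long count_holes(const struct pfn_range *avail, size_t n)
{
	unsigned long next = 0;		/* first pfn not yet accounted for */
	unsigned long holes = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (avail[i].start > next)
			holes += avail[i].start - next;
		next = avail[i].end + 1;
	}
	return holes;
}
```

Fed the two ranges [1, 158] and [256, 130783], this counts pfn 0 (one page) plus pfns [159, 255] (97 pages), which matches the 98 pages reported.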

Now, let's check what we have in zone 0: [1, 131039]

pfn 0 is not part of the zone, but pfns [1, 158] are.

However, the bigger problem if we do not initialize these struct pages
is with memory hotplug, because that path operates at 2M boundaries
(section_nr) and checks whether a 2M range of pages is hot removable. It
starts with the first pfn of the zone, rounds it down to a 2M boundary
(struct pages are allocated at 2M boundaries when vmemmap is created),
and checks whether that section is hot removable. In this case it starts
with pfn 1 and rounds it down to pfn 0.
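
The rounding step can be sketched as follows, assuming 4K pages (so a 2M range spans 512 pfns). The SKETCH_* macros and block_start_pfn() are hypothetical names for this illustration, not kernel identifiers.

```c
/* Assumption: 4K pages, as on the x86_64 guest described above. */
#define SKETCH_PAGE_SHIFT	12UL
#define SKETCH_BLOCK_BYTES	(2UL << 20)	/* 2M */
#define SKETCH_BLOCK_PFNS	(SKETCH_BLOCK_BYTES >> SKETCH_PAGE_SHIFT)

/* Round a pfn down to the first pfn of its 2M-aligned block. */
static unsigned long block_start_pfn(unsigned long pfn)
{
	return pfn & ~(SKETCH_BLOCK_PFNS - 1);
}
```

With these assumptions, block_start_pfn(1) is 0, so the check touches the struct page for pfn 0 even though pfn 0 is outside the zone.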

Later the pfn is converted to a struct page, and some fields are checked.
Now, if we do not zero the struct pages, we get unpredictable results. In
fact, when CONFIG_DEBUG_VM is enabled and we explicitly set all vmemmap
memory to ones, I get the following panic with a test kernel that does
not have this patch applied:

[   23.277793] BUG: unable to handle kernel NULL pointer dereference at           (null)
[   23.278863] IP: is_pageblock_removable_nolock+0x35/0x90
[   23.279619] PGD 0 P4D 0
[   23.280031] Oops: 0000 [#1] PREEMPT
[   23.280527] CPU: 0 PID: 249 Comm: udevd Not tainted 4.14.0-rc3_pt_memset10-00335-g5e2c7478bed5-dirty #8
[   23.281735] task: ffff88001f4e2900 task.stack: ffffc90000314000
[   23.282532] RIP: 0010:is_pageblock_removable_nolock+0x35/0x90
[   23.283275] RSP: 0018:ffffc90000317d60 EFLAGS: 00010202
[   23.283948] RAX: ffffffffffffffff RBX: ffff88001d92b000 RCX: 0000000000000000
[   23.284862] RDX: 0000000000000000 RSI: 0000000000200000 RDI: ffff88001d92b000
[   23.285771] RBP: ffffc90000317d80 R08: 00000000000010c8 R09: 0000000000000000
[   23.286542] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88001db2b000
[   23.287264] R13: ffffffff81af6d00 R14: ffff88001f7d5000 R15: ffffffff82a1b6c0
[   23.287971] FS:  00007f4eb857f7c0(0000) GS:ffffffff81c27000(0000) knlGS:0000000000000000
[   23.288775] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   23.289355] CR2: 0000000000000000 CR3: 000000001f4e6000 CR4: 00000000000006b0
[   23.290066] Call Trace:
[   23.290323]  ? is_mem_section_removable+0x5a/0xd0
[   23.290798]  show_mem_removable+0x6b/0xa0
[   23.291204]  dev_attr_show+0x1b/0x50
[   23.291565]  sysfs_kf_seq_show+0xa1/0x100
[   23.291967]  kernfs_seq_show+0x22/0x30
[   23.292349]  seq_read+0x1ac/0x3a0
[   23.292687]  kernfs_fop_read+0x36/0x190
[   23.293074]  ? security_file_permission+0x90/0xb0
[   23.293547]  __vfs_read+0x16/0x30
[   23.293884]  vfs_read+0x81/0x130
[   23.294214]  SyS_read+0x44/0xa0
[   23.294537]  entry_SYSCALL_64_fastpath+0x1f/0xbd
[   23.295003] RIP: 0033:0x7f4eb7c660a0
[   23.295364] RSP: 002b:00007ffda6cffe28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   23.296152] RAX: ffffffffffffffda RBX: 00000000000003de RCX: 00007f4eb7c660a0
[   23.296934] RDX: 0000000000001000 RSI: 00007ffda6cffec8 RDI: 0000000000000005
[   23.297963] RBP: 00007ffda6cffde8 R08: 7379732f73656369 R09: 6f6d656d2f6d6574
[   23.299198] R10: 726f6d656d2f7972 R11: 0000000000000246 R12: 0000000000000022
[   23.300400] R13: 0000561d68ea7710 R14: 0000000000000000 R15: 00007ffda6d05c78
[   23.301591] Code: c1 ea 35 49 c1 e8 2b 48 8b 14 d5 c0 b6 a1 82 41 83 e0 03 48 85 d2 74 0c 48 c1 e8 29 25 f0 0f 00 00 48 01 c2 4d 69 c0 98 05 00 00 <48> 8b 02 48 89 fa 48 83 e0 f8 49 8b 88 28 b5 d3 81 48 29 c2 49
[   23.304739] RIP: is_pageblock_removable_nolock+0x35/0x90 RSP: ffffc90000317d60
[   23.305940] CR2: 0000000000000000
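
To illustrate why a struct page filled with ones can lead to a bad lookup while a zeroed one does not, here is a simplified userspace sketch. Everything in it is an assumption made for illustration: struct fake_page, the 4-bit section field and the 8-entry table are hypothetical and do not reflect the real, config-dependent layout of page->flags.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the section index lives in the top bits of flags. */
#define FAKE_SECTION_BITS	4
#define FAKE_NR_SECTIONS	8	/* fewer entries than 1 << FAKE_SECTION_BITS */

struct fake_page {
	unsigned long flags;
	void *other;
};

static int fake_sections[FAKE_NR_SECTIONS];

/* Decode the section index from the top bits of flags. */
static unsigned long fake_page_section(const struct fake_page *p)
{
	return p->flags >> (sizeof(p->flags) * 8 - FAKE_SECTION_BITS);
}

/* Look up the section entry; NULL when the decoded index has no entry. */
static int *fake_section_lookup(const struct fake_page *p)
{
	unsigned long nr = fake_page_section(p);

	return nr < FAKE_NR_SECTIONS ? &fake_sections[nr] : NULL;
}
```

In this sketch a zeroed fake_page decodes to index 0 and resolves to a valid entry, while one filled with 0xff decodes to the maximum index and the lookup returns NULL. A caller that dereferences the result without checking would fault, which is the kind of failure the oops above shows.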
