Message-Id: <20170405145304.wxzfavqxnyqtrlru@arbab-laptop>
Date:   Wed, 5 Apr 2017 09:53:05 -0500
From:   Reza Arbab <arbab@...ux.vnet.ibm.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Mel Gorman <mgorman@...e.de>, linux-mm@...ck.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Yasuaki Ishimatsu <yasu.isimatu@...il.com>,
        Tang Chen <tangchen@...fujitsu.com>, qiuxishi@...wei.com,
        Kani Toshimitsu <toshi.kani@....com>, slaoub@...il.com,
        Joonsoo Kim <js1304@...il.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Zhang Zhen <zhenzhang.zhang@...wei.com>,
        David Rientjes <rientjes@...gle.com>,
        Daniel Kiper <daniel.kiper@...cle.com>,
        Igor Mammedov <imammedo@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Chris Metcalf <cmetcalf@...lanox.com>,
        Dan Williams <dan.j.williams@...il.com>,
        Heiko Carstens <heiko.carstens@...ibm.com>,
        Lai Jiangshan <laijs@...fujitsu.com>,
        Martin Schwidefsky <schwidefsky@...ibm.com>
Subject: Re: [PATCH 0/6] mm: make movable onlining suck less

On Wed, Apr 05, 2017 at 11:24:27AM +0200, Michal Hocko wrote:
>On Wed 05-04-17 08:42:39, Michal Hocko wrote:
>> On Tue 04-04-17 16:43:39, Reza Arbab wrote:
>> > It's new. Without this patchset, I can repeatedly
>> > add_memory()->online_movable->offline->remove_memory() all of a node's
>> > memory.
>>
>> This is quite unexpected because the code obviously cannot handle the
>> first memory section. Could you paste /proc/zoneinfo and
>> grep . -r /sys/devices/system/memory/auto_online_blocks /sys/devices/system/memory/memory*, after
>> onlining for both patched and unpatched kernels?
>
>Btw. how do you test this? I am really surprised you managed to
>hotremove such a low pfn range.

When I boot, I have node 0 (4GB) and node 1 (empty):

Early memory node ranges
  node   0: [mem 0x0000000000000000-0x00000000ffffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x00000000ffffffff]
On node 0 totalpages: 65536
  DMA zone: 64 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 65536 pages, LIFO batch:1
Could not find start_pfn for node 1
Initmem setup node 1 [mem 0x0000000000000000-0x0000000000000000]
On node 1 totalpages: 0

My steps from there:

1. add_memory(1, 0x100000000, 0x100000000)
2. echo online_movable > /sys/devices/system/node/node1/memory[511..256]
3. echo offline > /sys/devices/system/node/node1/memory[256..511]
4. remove_memory(1, 0x100000000, 0x100000000)
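
Spelled out, steps 2 and 3 are loops over the per-block "state" files;
something like this (block numbers 256..511 correspond to the 4GB being
hot-added here, per the valid_zones output below):

  # Step 2: online each block as movable, top-down. On the unpatched
  # kernel, online_movable is only allowed for blocks adjacent to
  # ZONE_MOVABLE / at the end of the node, hence the descending order.
  for i in $(seq 511 -1 256); do
          echo online_movable > /sys/devices/system/node/node1/memory$i/state
  done

  # Step 3: offline the blocks again, bottom-up.
  for i in $(seq 256 511); do
          echo offline > /sys/devices/system/node/node1/memory$i/state
  done

(Steps 1 and 4 call add_memory()/remove_memory() in the kernel, so they
are driven from kernel code rather than sysfs.)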

After step 2, regardless of kernel:

$ cat /proc/zoneinfo
Node 0, zone      DMA
  per-node stats
      nr_inactive_anon 418
      nr_active_anon 2710
      nr_inactive_file 4895
      nr_active_file 1945
      nr_unevictable 0
      nr_isolated_anon 0
      nr_isolated_file 0
      nr_pages_scanned 0
      workingset_refault 0
      workingset_activate 0
      workingset_nodereclaim 0
      nr_anon_pages 2654
      nr_mapped    739
      nr_file_pages 7314
      nr_dirty     1
      nr_writeback 0
      nr_writeback_temp 0
      nr_shmem     474
      nr_shmem_hugepages 0
      nr_shmem_pmdmapped 0
      nr_anon_transparent_hugepages 0
      nr_unstable  0
      nr_vmscan_write 0
      nr_vmscan_immediate_reclaim 0
      nr_dirtied   3259
      nr_written   460
  pages free     53520
        min      63
        low      128
        high     193
   node_scanned  0
        spanned  65536
        present  65536
        managed  65218
      nr_free_pages 53520
      nr_zone_inactive_anon 418
      nr_zone_active_anon 2710
      nr_zone_inactive_file 4895
      nr_zone_active_file 1945
      nr_zone_unevictable 0
      nr_zone_write_pending 1
      nr_mlock     0
      nr_slab_reclaimable 438
      nr_slab_unreclaimable 808
      nr_page_table_pages 32
      nr_kernel_stack 2080
      nr_bounce    0
      numa_hit     313226
      numa_miss    0
      numa_foreign 0
      numa_interleave 3071
      numa_local   313226
      numa_other   0
      nr_free_cma  0
        protection: (0, 0, 0, 0)
  pagesets
    cpu: 0
              count: 2
              high:  6
              batch: 1
  vm stats threshold: 12
  node_unreclaimable:  0
  start_pfn:           0
  node_inactive_ratio: 0
Node 1, zone  Movable
  per-node stats
      nr_inactive_anon 0
      nr_active_anon 0
      nr_inactive_file 0
      nr_active_file 0
      nr_unevictable 0
      nr_isolated_anon 0
      nr_isolated_file 0
      nr_pages_scanned 0
      workingset_refault 0
      workingset_activate 0
      workingset_nodereclaim 0
      nr_anon_pages 0
      nr_mapped    0
      nr_file_pages 0
      nr_dirty     0
      nr_writeback 0
      nr_writeback_temp 0
      nr_shmem     0
      nr_shmem_hugepages 0
      nr_shmem_pmdmapped 0
      nr_anon_transparent_hugepages 0
      nr_unstable  0
      nr_vmscan_write 0
      nr_vmscan_immediate_reclaim 0
      nr_dirtied   0
      nr_written   0
  pages free     65536
        min      63
        low      128
        high     193
   node_scanned  0
        spanned  65536
        present  65536
        managed  65536
      nr_free_pages 65536
      nr_zone_inactive_anon 0
      nr_zone_active_anon 0
      nr_zone_inactive_file 0
      nr_zone_active_file 0
      nr_zone_unevictable 0
      nr_zone_write_pending 0
      nr_mlock     0
      nr_slab_reclaimable 0
      nr_slab_unreclaimable 0
      nr_page_table_pages 0
      nr_kernel_stack 0
      nr_bounce    0
      numa_hit     0
      numa_miss    0
      numa_foreign 0
      numa_interleave 0
      numa_local   0
      numa_other   0
      nr_free_cma  0
        protection: (0, 0, 0, 0)
  pagesets
    cpu: 0
              count: 0
              high:  6
              batch: 1
  vm stats threshold: 14
  node_unreclaimable:  1
  start_pfn:           65536
  node_inactive_ratio: 0

After step 2, on v4.11-rc5:

$ grep . /sys/devices/system/memory/memory*/valid_zones
/sys/devices/system/memory/memory[0..254]/valid_zones:DMA
/sys/devices/system/memory/memory255/valid_zones:DMA Normal Movable
/sys/devices/system/memory/memory256/valid_zones:Movable Normal
/sys/devices/system/memory/memory[257..511]/valid_zones:Movable

After step 2, on v4.11-rc5 + all the patches from this thread:

$ grep . /sys/devices/system/memory/memory*/valid_zones
/sys/devices/system/memory/memory[0..255]/valid_zones:DMA
/sys/devices/system/memory/memory[256..511]/valid_zones:Movable

On v4.11-rc5, I can do steps 1-4 ad nauseam.
On v4.11-rc5 + all the patches from this thread, I can also repeat
steps 1-4, but starting with the second iteration, the

  /sys/devices/system/node/node1/memory*

symlinks are no longer created. I can still proceed using the actual files,

  /sys/devices/system/memory/memory[256..511]

instead. I think it may be because step 4 does node_set_offline(1). That 
is, the node is not only emptied of memory, it is offlined completely.
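
For example, on the second and later iterations, step 2 becomes
something like this, using the memory device paths directly instead of
the node1 symlinks:

  for i in $(seq 511 -1 256); do
          echo online_movable > /sys/devices/system/memory/memory$i/state
  done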

I hope this made sense. :/

-- 
Reza Arbab
