Message-Id: <20220828005545.94389-1-bhe@redhat.com>
Date:   Sun, 28 Aug 2022 08:55:43 +0800
From:   Baoquan He <bhe@...hat.com>
To:     linux-kernel@...r.kernel.org
Cc:     linux-arm-kernel@...ts.infradead.org, catalin.marinas@....com,
        ardb@...nel.org, rppt@...nel.org, guanghuifeng@...ux.alibaba.com,
        mark.rutland@....com, will@...nel.org, linux-mm@...ck.org,
        thunder.leizhen@...wei.com, wangkefeng.wang@...wei.com,
        kexec@...ts.infradead.org, Baoquan He <bhe@...hat.com>
Subject: [PATCH 0/2] arm64, kdump: enforce to take 4G as the crashkernel low memory end

Problem:
=======
On arm64, block and section mappings are supported when building page
tables. However, base page mapping is currently enforced for the whole
linear mapping if CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled and the
crashkernel kernel parameter is set. This makes the linear mapping take
longer during bootup and causes severe performance degradation at run
time.

Root cause:
==========
On arm64, crashkernel reservation relies on knowing the upper limit of
the low memory zone, because it needs to reserve memory in that zone so
that devices' DMA addressing in the kdump kernel can be satisfied. However,
the limit on arm64 varies, and it can only be determined late, once
bootmem_init() has been called.

In addition, the crashkernel region must be mapped with base page
granularity in the linear mapping, because kdump needs to protect the
crashkernel region via set_memory_valid(,0) after the kdump kernel is
loaded. However, arm64 cannot reliably split an already built block or
section mapping due to some CPU restrictions [1]. And unfortunately, the
linear mapping is done before bootmem_init().

To resolve the above conflict on arm64, the current compromise is to
enforce base page mapping for the entire linear mapping if crashkernel is
set and CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled. Hence
performance is sacrificed.

Solution:
=========
To fix the problem, we should always take 4G as the crashkernel low
memory end when CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled.
With this, we don't need to defer the crashkernel reservation until
bootmem_init() is called to set arm64_dma_phys_limit. As soon as
memblock init is done, we can determine the upper limit of the low
memory zone.

1) both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, or
   memblock_start_of_DRAM() > 4G:
   limit = PHYS_ADDR_MAX+1  (corner cases)
2) CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled:
   limit = 4G  (generic case)
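
The two cases above can be sketched as a small standalone C function.
This is only an illustration of the decision, not the code from the
patches: crash_low_mem_end(), its parameters and NO_LOW_LIMIT are
hypothetical names, the config options are modelled as plain booleans,
and NO_LOW_LIMIT stands in for PHYS_ADDR_MAX+1, which does not fit in a
64-bit value.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_4G        (1ULL << 32)
/* Hypothetical sentinel standing in for PHYS_ADDR_MAX+1 ("no low limit"). */
#define NO_LOW_LIMIT UINT64_MAX

/*
 * Hypothetical helper: pick the crashkernel low memory end.
 * zone_dma / zone_dma32 model CONFIG_ZONE_DMA / CONFIG_ZONE_DMA32;
 * dram_start models the value returned by memblock_start_of_DRAM().
 */
static inline uint64_t crash_low_mem_end(bool zone_dma, bool zone_dma32,
                                         uint64_t dram_start)
{
    /* Case 1: no DMA zones at all, or all of DRAM sits above 4G. */
    if ((!zone_dma && !zone_dma32) || dram_start > SZ_4G)
        return NO_LOW_LIMIT;

    /* Case 2: generic case, the low memory end is fixed at 4G. */
    return SZ_4G;
}
```

Because the result depends only on the config options and the start of
DRAM, it is known as soon as memblock init is done, which is the point
of the proposal.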

Justification:
==============
In fact, the kdump kernel doesn't need to cover the addressing bits of
all peripherals. Only the device used as the dump target needs to be taken
care of and have its addressing bits satisfied. Currently, there are two
kinds of dumping: to a local storage disk, or through a network card to a
remote storage server. This means only the storage disk or network card
used as the dump target needs to be considered when checking whether
addressing bits are satisfied. To save memory, we usually generate a
kdump-specific initramfs that includes only the kernel modules necessary
for the dump target devices. All other kernel modules are excluded, and
their corresponding devices won't be initialized during kdump kernel
bootup.

So far, only the Raspberry Pi 4 has some peripherals which can only
address a 30-bit memory range, as reported in [2]. Devices on all other
arm64 systems can address a 32-bit memory range.
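
To make the 30-bit versus 32-bit distinction concrete, here is a small
standalone C sketch. The helper names dma_addr_limit() and
dma_can_reach() are made up for illustration and are not kernel APIs.
It only shows the arithmetic: 30 address bits reach the first 1G, 32
bits reach the first 4G.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: exclusive upper bound of the addresses a device
 * with `bits` DMA address bits can reach. Saturates for bits >= 64.
 */
static inline uint64_t dma_addr_limit(unsigned int bits)
{
    return bits >= 64 ? UINT64_MAX : (1ULL << bits);
}

/*
 * Hypothetical helper: true if the whole region [base, base + size)
 * lies below the device's addressing limit.
 */
static inline bool dma_can_reach(unsigned int bits, uint64_t base,
                                 uint64_t size)
{
    uint64_t limit = dma_addr_limit(bits);

    /* Written this way so that base + size cannot overflow. */
    return size <= limit && base <= limit - size;
}
```

For example, a 256M reservation at 3G is below 4G and so is reachable
by a 32-bit device, but not by a device limited to 30 bits, which would
need the region to end at or below 1G.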

So by enforcing 4G as the crashkernel low memory end, the only risk is
that an RPi4 has a storage disk or network card which can't address a
32-bit memory range, since such a device could be set as the dump target.
Even if the RPi4 truly has storage devices or network cards which can only
address a 30-bit memory range, that should be a corner case. We can
document it, since crashkernel is mostly used as a server feature. Besides,
the RPi4 can still use crashkernel=xM@yM to specify a location for 32-bit
addressing if it really has that kind of storage device or network card
and kdump is expected to work.

[1]
https://lore.kernel.org/all/YrIIJkhKWSuAqkCx@arm.com/T/#u

[2]
[PATCH v6 0/4] Raspberry Pi 4 DMA addressing support
https://lore.kernel.org/linux-arm-kernel/20190911182546.17094-1-nsaenzjulienne@suse.de/T/


======
Question to Nicolas:

Hi Nicolas,

In the cover letter of patchset [2], you said the RPi4 has peripherals
which can only address a 30-bit range. In the sentence below, do you mean
that "the PCIe, V3D, GENET" can't address a 32-bit range, or that they have
a wider view of the address space, the same as the 40-bit DMA channels? I
am confused about that.

And can the storage device or network card on the RPi4 address a 30-bit
range or a 32-bit range? Do we have documentation, or do you happen to
know?

"""
The new Raspberry Pi 4 has up to 4GB of memory but most peripherals can
only address the first GB: their DMA address range is
0xc0000000-0xfc000000 which is aliased to the first GB of physical
memory 0x00000000-0x3c000000. Note that only some peripherals have these
limitations: the PCIe, V3D, GENET, and 40-bit DMA channels have a wider
view of the address space by virtue of being hooked up trough a second
interconnect.
"""


Baoquan He (2):
  arm64, kdump: enforce to take 4G as the crashkernel low memory end
  arm64: remove unneed defer_reserve_crashkernel() and crash_mem_map

 arch/arm64/include/asm/memory.h |  5 ----
 arch/arm64/mm/init.c            | 24 ++++++++-------
 arch/arm64/mm/mmu.c             | 53 ++++++++++++++-------------------
 3 files changed, 36 insertions(+), 46 deletions(-)


base-commit: 10d4879f9ef01cc6190fafe4257d06f375bab92c
-- 
2.34.1
