Message-ID: <3399d2af-3d42-4ac1-9b74-8475bec25f7f@163.com>
Date: Tue, 27 Feb 2024 15:51:25 +0800
From: Yaxiong Tian <13327272236@....com>
To: Mike Rapoport <rppt@...nel.org>, David Hildenbrand <david@...hat.com>
Cc: rafael@...nel.org, pavel@....cz, len.brown@...el.com,
 keescook@...omium.org, tony.luck@...el.com, gpiccoli@...lia.com,
 akpm@...ux-foundation.org, ardb@...nel.org, wangkefeng.wang@...wei.com,
 catalin.marinas@....com, will@...nel.org, linux-pm@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-hardening@...r.kernel.org,
 Yaxiong Tian <tianyaxiong@...inos.cn>, xiongxin <xiongxin@...inos.cn>
Subject: Re: [PATCH] PM: hibernate: Fix level3 translation fault in
 swsusp_save()


On 2024/2/26 17:14, Mike Rapoport wrote:
> On Mon, Feb 26, 2024 at 09:37:06AM +0100, David Hildenbrand wrote:
>> On 26.02.24 04:42, Yaxiong Tian wrote:
>>> From: Yaxiong Tian <tianyaxiong@...inos.cn>
>>>
>>> On ARM64 machines using UEFI, if the linear map is not set (can_set_direct_map()
>>> returns false), swsusp_save() will fail because it cannot find the page table
>>> mapping for the NOMAP memory, for example:
> can_set_direct_map() has nothing to do with presence or absence of the
> linear map.
>
> Do you mean that kernel_page_present() presumes that a page is present when
> can_set_direct_map() returns false even for NOMAP ranges?
Yes. In swsusp_save()->copy_data_pages()->page_is_saveable(),
kernel_page_present() presumes that a page is present when
can_set_direct_map() returns false, even for NOMAP ranges. So NOMAP pages
are later selected for saving, and accessing those pages then causes the
level 3 translation fault.
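
To make the path above concrete, here is a minimal user-space model of the decision, not kernel code. The struct, the page table and all model_* names are hypothetical and only illustrate the behaviour I described; the real checks live in saveable_page() and the arm64 kernel_page_present().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified per-page state (not the kernel's struct page). */
struct model_page {
	bool reserved;   /* stands in for PageReserved() */
	bool mapped;     /* a valid PTE exists in the linear map */
	bool nosave;     /* stands in for pfn_is_nosave() */
};

#define MODEL_NR_PAGES 3UL

/* pfn 0: NOMAP firmware memory, pfn 1: normal RAM, pfn 2: nosave section */
struct model_page model_pages[MODEL_NR_PAGES] = {
	{ .reserved = true,  .mapped = false, .nosave = false },
	{ .reserved = false, .mapped = true,  .nosave = false },
	{ .reserved = true,  .mapped = true,  .nosave = true  },
};

/* Models the result of can_set_direct_map(); false is the failing case. */
bool model_can_set_direct_map = false;

bool model_kernel_page_present(unsigned long pfn)
{
	assert(pfn < MODEL_NR_PAGES);
	/* The shortcut under discussion: assume "present" without a table walk. */
	if (!model_can_set_direct_map)
		return true;
	return model_pages[pfn].mapped;
}

/* Mirrors only the PageReserved() test from saveable_page(). */
bool model_saveable(unsigned long pfn)
{
	assert(pfn < MODEL_NR_PAGES);
	const struct model_page *p = &model_pages[pfn];

	if (p->reserved && (!model_kernel_page_present(pfn) || p->nosave))
		return false;
	return true;
}
```

In this model, pfn 0 (reserved, unmapped) is reported as saveable while model_can_set_direct_map is false, which corresponds to the snapshot code later touching an address that has no PTE.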
>>> [   48.532162] Unable to handle kernel paging request at virtual address ffffff8000000000
>>> [   48.532162] Mem abort info:
>>> [   48.532162]   ESR = 0x0000000096000007
>>> [   48.532162]   EC = 0x25: DABT (current EL), IL = 32 bits
>>> [   48.532162]   SET = 0, FnV = 0
>>> [   48.532162]   EA = 0, S1PTW = 0
>>> [   48.532162]   FSC = 0x07: level 3 translation fault
>>> [   48.532162] Data abort info:
>>> [   48.532162]   ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
>>> [   48.532162]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>>> [   48.532162]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>>> [   48.532162] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000eeb0b000
>>> [   48.532162] [ffffff8000000000] pgd=180000217fff9803, p4d=180000217fff9803, pud=180000217fff9803, pmd=180000217fff8803, pte=0000000000000000
>>> [   48.532162] Internal error: Oops: 0000000096000007 [#1] SMP
>>> [   48.532162] Internal error: Oops: 0000000096000007 [#1] SMP
>>> [   48.532162] Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
>>> [   48.532162] CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ #76
>>> [   48.532162] Source Version: 4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
>>> [   48.532162] Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
>>> [   48.532162] pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> [   48.532162] pc : swsusp_save+0x280/0x538
>>> [   48.532162] lr : swsusp_save+0x280/0x538
>>> [   48.532162] sp : ffffffa034a3fa40
>>> [   48.532162] x29: ffffffa034a3fa40 x28: ffffff8000001000 x27: 0000000000000000
>>> [   48.532162] x26: ffffff8001400000 x25: ffffffc08113e248 x24: 0000000000000000
>>> [   48.532162] x23: 0000000000080000 x22: ffffffc08113e280 x21: 00000000000c69f2
>>> [   48.532162] x20: ffffff8000000000 x19: ffffffc081ae2500 x18: 0000000000000000
>>> [   48.532162] x17: 6666662074736420 x16: 3030303030303030 x15: 3038666666666666
>>> [   48.532162] x14: 0000000000000b69 x13: ffffff9f89088530 x12: 00000000ffffffea
>>> [   48.532162] x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffffc08193f0d0
>>> [   48.532162] x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
>>> [   48.532162] x5 : ffffffa0fff09dc8 x4 : 0000000000000000 x3 : 0000000000000027
>>> [   48.532162] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000004e
>>> [   48.532162] Call trace:
>>> [   48.532162]  swsusp_save+0x280/0x538
>>> [   48.532162]  swsusp_arch_suspend+0x148/0x190
>>> [   48.532162]  hibernation_snapshot+0x240/0x39c
>>> [   48.532162]  hibernate+0xc4/0x378
>>> [   48.532162]  state_store+0xf0/0x10c
>>> [   48.532162]  kobj_attr_store+0x14/0x24
>>>
>>> QEMU ARM64 using UEFI also shows the problem when can_set_direct_map() is
>>> made to return false.
> Huh?
> Why would you do that?
I discovered this problem when upgrading from 5.4 to 6.6 using the 5.4
configuration. I then tried the latest linux-next code and found that the
problem still exists. To rule out the effects of a particular machine, I
also used QEMU to check it.
>
>>> Since the NOMAP regions are now marked as PageReserved(), pfn walkers
>>> and the rest of core mm will treat them as unusable memory. So these
>>> regions should not be saved in hibernation.
>>>
>>> This problem may be caused by changes to the pfn_valid() logic in commit
>>> a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify pfn_valid()").
>>>
>>> So to fix it, we add a pfn_is_map_memory() check in saveable_page(), so
>>> that such regions are not saved in hibernation.
>>>
>>> Fixes: a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
>>> Co-developed-by: xiongxin <xiongxin@...inos.cn>
>>> Signed-off-by: xiongxin <xiongxin@...inos.cn>
>>> Signed-off-by: Yaxiong Tian <tianyaxiong@...inos.cn>
>>> ---
>>>    kernel/power/snapshot.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
>>> index 0f12e0a97e43..a06e3b1869d2 100644
>>> --- a/kernel/power/snapshot.c
>>> +++ b/kernel/power/snapshot.c
>>> @@ -1400,7 +1400,7 @@ static struct page *saveable_page(struct zone *zone, unsigned long pfn)
>>>    		return NULL;
>>>    	if (PageReserved(page)
>>> -	    && (!kernel_page_present(page) || pfn_is_nosave(pfn)))
>>> +	    && (!kernel_page_present(page) || pfn_is_nosave(pfn) || !pfn_is_map_memory(pfn)))
> I think adding the check for !pfn_is_map_memory() to arm64::pfn_is_nosave()
> is the best way to fix this.
Thanks, I also think this is the best fix.
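
As a rough illustration of that suggestion, the sketch below models an arch-level pfn_is_nosave() that also rejects pfns which are not map memory. It is a user-space sketch under stated assumptions, not the actual arm64 implementation: the pfn layout and every sketch_* name are hypothetical, and the real function has additional conditions that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_PFNS 8UL

/* Hypothetical layout: pfns 0-1 are NOMAP, pfns 6-7 are the nosave section. */
static const bool sketch_map_memory[SKETCH_NR_PFNS] = {
	false, false, true, true, true, true, true, true
};
static const unsigned long sketch_nosave_begin_pfn = 6;
static const unsigned long sketch_nosave_end_pfn = 7;

/* Stand-in for pfn_is_map_memory(): false for NOMAP ranges. */
bool sketch_pfn_is_map_memory(unsigned long pfn)
{
	assert(pfn < SKETCH_NR_PFNS);
	return sketch_map_memory[pfn];
}

/* Arch-level check: nosave section OR not mapped memory at all. */
bool sketch_pfn_is_nosave(unsigned long pfn)
{
	bool in_nosave = pfn >= sketch_nosave_begin_pfn &&
			 pfn <= sketch_nosave_end_pfn;

	return in_nosave || !sketch_pfn_is_map_memory(pfn);
}

/* Generic caller keeps its existing shape; no arch-specific call needed. */
bool sketch_saveable(unsigned long pfn, bool reserved, bool page_present)
{
	if (reserved && (!page_present || sketch_pfn_is_nosave(pfn)))
		return false;
	return true;
}
```

With the check placed on the arch side, the generic code in kernel/power/snapshot.c would not need to call pfn_is_map_memory() itself, which also avoids the build problem on other architectures raised below.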
>>>    		return NULL;
>>>    	if (page_is_guard(page))
>> On top of which tree does this apply?
>>
>> All occurrences of pfn_is_map_memory() are in arch/arm64, how does this
>> compile on other architectures?
>>
>> -- 
>> Cheers,
>>
>> David / dhildenb
>>

