lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <88DC34334CA3444C85D647DBFA962C2735B662B6@SHSMSX104.ccr.corp.intel.com>
Date:   Tue, 1 Oct 2019 05:00:25 +0000
From:   "Zhang, Jun" <jun.zhang@...el.com>
To:     Borislav Petkov <bp@...en8.de>
CC:     "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "luto@...nel.org" <luto@...nel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "hpa@...or.com" <hpa@...or.com>, "He, Bo" <bo.he@...el.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] x86/PAT: priority the PAT warn to error to highlight
 the developer

Please see my comments.

Thanks,
Jun

-----Original Message-----
From: Borislav Petkov <bp@...en8.de> 
Sent: Monday, September 30, 2019 8:03 PM
To: Zhang, Jun <jun.zhang@...el.com>
Cc: dave.hansen@...ux.intel.com; luto@...nel.org; peterz@...radead.org; tglx@...utronix.de; mingo@...hat.com; hpa@...or.com; He, Bo <bo.he@...el.com>; x86@...nel.org; linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86/PAT: priority the PAT warn to error to highlight the developer

On Sun, Sep 29, 2019 at 03:20:31PM +0800, jun.zhang@...el.com wrote:
> From: zhang jun <jun.zhang@...el.com>
> 
> Documentation/x86/pat.txt says:
> set_memory_uc() or set_memory_wc() must use together with 
> set_memory_wb()

I had to open that file to see what it actually says - btw, the filename is pat.rst now - and you're very heavily paraphrasing what is there. So try again explaining what the requirement is.
[ZJ] next parts come from pat.txt in kernel version 4.19
Drivers wanting to export some pages to userspace do it by using mmap
interface and a combination of
1) pgprot_noncached()
2) io_remap_pfn_range() or remap_pfn_range() or vm_insert_pfn()

With PAT support, a new API pgprot_writecombine is being added. So, drivers can
continue to use the above sequence, with either pgprot_noncached() or
pgprot_writecombine() in step 1, followed by step 2.

In addition, step 2 internally tracks the region as UC or WC in memtype
list in order to ensure no conflicting mapping.

Note that this set of APIs only works with IO (non RAM) regions. If driver
wants to export a RAM region, it has to do set_memory_uc() or set_memory_wc()
as step 0 above and also track the usage of those pages and use set_memory_wb()
before the page is freed to free pool.

> if break the PAT attribute, there are tons of warning like:
> [   45.846872] x86/PAT: NDK MediaCodec_:3753 map pfn RAM range req

That's some android NDK thing, it seems: "The Android NDK is a toolset that lets you implement parts of your app in native code,... " lemme guess, they have a kernel module?
[ZJ] no, "NDK MediaCodec_" is an android codec2.0 process. It want to use WC memory.

> write-combining for [mem 0x1e7a80000-0x1e7a87fff], got write-back and 
> in the extremely case, we see kernel panic unexpected like:
> list_del corruption. prev->next should be ffff88806dbe69c0, but was 
> ffff888036f048c0

This is not really helpful. You need to explain what exactly you're doing - not shortening the error messages.
[ZJ] android codec2.0 want to use WC memory. Which use ion to allocate memory. So, we enable drivers/staging/android/ion, which work well except X86, x86 need to set_memory_wc().
So there are tons of warning, then list_del corruption. I use this patch(https://www.lkml.org/lkml/2019/9/29/25), list crash disappear.
Next is error message.
<4>[49967.389732] x86/PAT: NDK MediaCodec_:10602 map pfn RAM range req write-combining for [mem 0x1091f8000-0x1091f8fff], got write-back
<4>[49967.389747] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req write-combining for [mem 0x1091f4000-0x1091f7fff], got write-back
<4>[49967.390622] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req write-combining for [mem 0x109090000-0x109090fff], got write-back
<4>[49967.390687] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req write-combining for [mem 0x1091f8000-0x1091fbfff], got write-back
<4>[49967.390855] x86/PAT: NDK MediaCodec_:10602 map pfn RAM range req write-combining for [mem 0x1091f4000-0x1091f4fff], got write-back
<4>[49967.391405] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req write-combining for [mem 0x109098000-0x109098fff], got write-back
<4>[49967.391454] x86/PAT: NDK MediaCodec_:10602 map pfn RAM range req write-combining for [mem 0x1091f8000-0x1091f8fff], got write-back
<4>[49967.391474] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req write-combining for [mem 0x1091f4000-0x1091f7fff], got write-back
<4>[49967.392641] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req write-combining for [mem 0x14eb68000-0x14eb68fff], got write-back
<4>[49967.392708] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req write-combining for [mem 0x1091f8000-0x1091fbfff], got write-back
<4>[49967.393001] x86/PAT: NDK MediaCodec_:10602 map pfn RAM range req write-combining for [mem 0x1091f4000-0x1091f4fff], got write-back
<4>[49967.394066] x86/PAT: NDK MediaCodec_:10602 map pfn RAM range req write-combining for [mem 0x1091f8000-0x1091f8fff], got write-back
<6>[50045.677129] binder: 3390:3390 transaction failed 29189/-22, size 88-0 line 3131
<3>[50046.153621] list_del corruption. prev->next should be ffff89598004c960, but was ffff895ad46e4590
<4>[50046.163464] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[50046.169297] CPU: 1 PID: 18792 Comm: Binder:3390_1B Tainted: G     U     O      4.19.68-PKT-190905T163945Z-00031-g9de920e66b4e #1
<4>[50046.182213] RIP: 0010:__list_del_entry_valid+0x78/0x90
<4>[50046.187947] Code: e8 00 a1 c1 ff 0f 0b 48 89 fe 48 c7 c7 60 1a b6 a5 e8 ef a0 c1 ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 98 1a b6 a5 e8 db a0 c1 ff <0f> 0b 48 c7 c7 d8 1a b6 a5 e8 cd a0 c1 ff 0f 0b 90 90 90 90 90 90
<4>[50046.208906] RSP: 0018:ffffa5d5c85cbc88 EFLAGS: 00010282
<4>[50046.214733] RAX: 0000000000000054 RBX: ffff89598004c960 RCX: 0000000000000000
<4>[50046.222689] RDX: 0000000000000000 RSI: ffffffffa5b3c047 RDI: 00000000ffffffff
<4>[50046.230645] RBP: ffffa5d5c85cbc88 R08: 00000000001d8f83 R09: 0000000000000036
<4>[50046.238608] R10: ffffa5d5c7e97b08 R11: ffffffffa5b10d5e R12: ffff895ad46e4400
<4>[50046.246570] R13: ffff8959d9237e48 R14: ffff8959d9237e00 R15: ffff895ad46e4400
<4>[50046.254535] FS:  0000736cc6941010(0000) GS:ffff895afba80000(0000) knlGS:0000736ccaaa9400
<4>[50046.263568] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[50046.269982] CR2: 00000000133bb018 CR3: 000000014e91c000 CR4: 00000000003406e0
<4>[50046.277950] Call Trace:
<4>[50046.280680]  binder_dequeue_work_head_ilocked+0x1f/0x50
<4>[50046.286512]  binder_thread_read+0x2c3/0x12e0
<4>[50046.291277]  ? binder_alloc_free_buf+0x2a/0x30
<4>[50046.296238]  ? finish_wait+0x90/0x90
<4>[50046.300226]  ? __might_sleep+0x1b/0x20
<4>[50046.304404]  binder_ioctl+0x79d/0xa46
<4>[50046.308489]  do_vfs_ioctl+0xa9/0x6d0
<4>[50046.312480]  ? _raw_spin_unlock_irqrestore+0x28/0x50
<4>[50046.318021]  ksys_ioctl+0x75/0x80
<4>[50046.321719]  __x64_sys_ioctl+0x1a/0x20
<4>[50046.325902]  do_syscall_64+0x55/0x100
<4>[50046.329984]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[50046.335620] RIP: 0033:0x736dae4f57b7
<4>[50046.339606] Code: cc cc cc b8 60 00 00 00 0f 05 48 3d 01 f0 ff ff 72 09 f7 d8 89 c7 e8 c8 fc ff ff c3 cc cc cc cc cc cc cc b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 72 09 f7 d8 89 c7 e8 a8 fc ff ff c3 cc cc cc cc
<4>[50046.360566] RSP: 002b:0000736cc6940aa8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
<4>[50046.369018] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000736dae4f57b7
<4>[50046.376979] RDX: 0000736cc6940ba0 RSI: 00000000c0306201 RDI: 0000000000000021
<4>[50046.384940] RBP: 00000000fffffff7 R08: 0000000000000000 R09: 0000000000000000
<4>[50046.392903] R10: 0000000000000000 R11: 0000000000000206 R12: 0000736d20c68a00
<4>[50046.400863] R13: 0000736cc6940ba0 R14: 0000736d20c68aa8 R15: 0000736d20c68b20
<4>[50046.408826] Modules linked in: hci_uart bluetooth ecdh_generic rfkill_gpio dwc3_pci dwc3 pcie8xxx(O) snd_usb_audio xhci_pci tpm_crb mei_me xhci_hcd snd_usbmidi_lib tpm mlan(O) mei snd_hwdep igb_avb(O) cfg80211 brd snd_soc_sst_bxt_tdf8532 snd_soc_tdf8532 snd_soc_skl trusty_timer snd_soc_skl_ipc trusty_wall snd_soc_sst_ipc trusty_log snd_soc_sst_dsp trusty_virtio trusty_ipc trusty_mem snd_hda_ext_core trusty snd_hda_core virtio_ring virtio videobuf2_dma_sg
<4>[50046.454020] ---[ end trace 66821e043574850e ]---


> so it's better to priority the warn to error to highlight to remind 
> the developer.

Whut?

From reading what is trying hard to be a sentence, I can only guess what you're trying to say here. And it doesn't make it clear why is pr_warn() not enough and it has to be pr_err().
[ZJ] usually warning don't output to console, and could be ignored. I think PAT error could cause crash, so it is very serious. 

> Signed-off-by: zhang jun <jun.zhang@...el.com>
> Signed-off-by: he, bo <bo.he@...el.com>

And this SOB chain is wrong.
[ZJ] What is SOB? I use git format-patch  -1. Checkpatch.pl, then git send-email.

> ---
>  arch/x86/mm/pat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 
> d9fbd4f69920..43a4dfdcedc8 100644
> --- a/arch/x86/mm/pat.c
> +++ b/arch/x86/mm/pat.c
> @@ -897,7 +897,7 @@ static int reserve_pfn_range(u64 paddr, unsigned 
> long size, pgprot_t *vma_prot,
>  
>  		pcm = lookup_memtype(paddr);
>  		if (want_pcm != pcm) {
> -			pr_warn("x86/PAT: %s:%d map pfn RAM range req %s for [mem %#010Lx-%#010Lx], got %s\n",
> +			pr_err("x86/PAT: %s:%d map pfn RAM range req %s for [mem 
> +%#010Lx-%#010Lx], got %s!!!\n",

Three "!!!" would make this more urgent, huh?

How about you make the error message more informative and user-friendly, instead?
[ZJ] the whole log, please see the attachment. 

Thx.

--
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

View attachment "pat.txt" of type "text/plain" (11318 bytes)

Download attachment "apanic_console" of type "application/octet-stream" (27313 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ