Message-ID: <CAPcyv4jgtYMKgEB4jnQ0g4fQPO39BCOmQM8Zo231=_D7L6wH=A@mail.gmail.com>
Date: Tue, 6 Aug 2019 14:11:56 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Jane Chu <jane.chu@...cle.com>
Cc: Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
Linux MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-nvdimm <linux-nvdimm@...ts.01.org>
Subject: Re: [PATCH v4 2/2] mm/memory-failure: Poison read receives SIGKILL
instead of SIGBUS if mmaped more than once
On Tue, Aug 6, 2019 at 10:28 AM Jane Chu <jane.chu@...cle.com> wrote:
>
> Mmap /dev/dax more than once, then read the poison location using an address
> from one of the mappings. Each of the other mappings, because it does not have
> the page mapped in, causes a SIGKILL to be delivered to the process. SIGKILL
> takes precedence over SIGBUS, so the user process loses the opportunity to
> handle the uncorrectable error (UE).
>
> Although one may add MAP_POPULATE to mmap(2) to work around the issue,
> MAP_POPULATE makes mapping 128GB of pmem several orders of magnitude slower,
> so it isn't always an option.
>
> Details -
>
> ndctl inject-error --block=10 --count=1 namespace6.0
>
> ./read_poison -x dax6.0 -o 5120 -m 2
> mmaped address 0x7f5bb6600000
> mmaped address 0x7f3cf3600000
> doing local read at address 0x7f3cf3601400
> Killed
>
> Console messages in instrumented kernel -
>
> mce: Uncorrected hardware memory error in user-access at edbe201400
> Memory failure: tk->addr = 7f5bb6601000
> Memory failure: address edbe201: call dev_pagemap_mapping_shift
> dev_pagemap_mapping_shift: page edbe201: no PUD
> Memory failure: tk->size_shift == 0
> Memory failure: Unable to find user space address edbe201 in read_poison
> Memory failure: tk->addr = 7f3cf3601000
> Memory failure: address edbe201: call dev_pagemap_mapping_shift
> Memory failure: tk->size_shift = 21
> Memory failure: 0xedbe201: forcibly killing read_poison:22434 because of failure to unmap corrupted page
> => to deliver SIGKILL
> Memory failure: 0xedbe201: Killing read_poison:22434 due to hardware memory corruption
> => to deliver SIGBUS
>
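For readers following the trace, the addresses in the log above can be cross-checked with a little arithmetic. This is only a sketch under stated assumptions: 4 KiB base pages (PAGE_SHIFT of 12, as on x86-64) and 512-byte blocks for the --block argument. The variable names are illustrative and do not come from the patch or from read_poison.

```python
# Cross-check of the addresses in the quoted report.
# Assumptions (not stated in the report): 4 KiB base pages and
# 512-byte blocks for "ndctl inject-error --block".
PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT
SECTOR_SIZE = 512

# Values copied from the quoted report.
map_bases = [0x7f5bb6600000, 0x7f3cf3600000]  # "mmaped address" lines
read_offset = 5120                             # read_poison -o 5120
phys_addr = 0xedbe201400                       # address in the mce line
size_shift = 21                                # tk->size_shift, second mapping


def page_base(addr: int) -> int:
    """Round an address down to the start of its base page."""
    return addr & ~(PAGE_SIZE - 1)


# The read goes through the second mapping at base + offset.
read_addr = map_bases[1] + read_offset

# The page frame number is the physical address without the page offset.
pfn = phys_addr >> PAGE_SHIFT

# Each mapping covers the poisoned page at the page-aligned user address.
tk_addrs = [page_base(base + read_offset) for base in map_bases]

print(hex(read_addr))               # 0x7f3cf3601400
print(hex(pfn))                     # 0xedbe201
print([hex(a) for a in tk_addrs])   # ['0x7f5bb6601000', '0x7f3cf3601000']
print(10 * SECTOR_SIZE)             # 5120
print((1 << size_shift) // 1024)    # 2048 (KiB), i.e. a 2 MiB mapping
```

Under those assumptions, both tk->addr values are the page-aligned user addresses of the same pfn, one per mapping. Only the mapping that was actually read reports a size_shift of 21; the untouched one reports "no PUD" and a size_shift of 0.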
> Signed-off-by: Jane Chu <jane.chu@...cle.com>
> Suggested-by: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
Looks good. Ignore the checkpatch warning about the too-long subject line;
it looks appropriate to me:
Reviewed-by: Dan Williams <dan.j.williams@...el.com>