Message-ID: <20210603051055.GA244241@hori.linux.bs1.fc.nec.co.jp>
Date: Thu, 3 Jun 2021 05:10:56 +0000
From: HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@....com>
To: Andrew Morton <akpm@...ux-foundation.org>
CC: "linux-mm@...ck.org" <linux-mm@...ck.org>,
Tony Luck <tony.luck@...el.com>,
Aili Yao <yaoaili@...gsoft.com>,
Oscar Salvador <osalvador@...e.de>,
David Hildenbrand <david@...hat.com>,
Borislav Petkov <bp@...en8.de>,
Andy Lutomirski <luto@...nel.org>, Jue Wang <juew@...gle.com>,
Naoya Horiguchi <nao.horiguchi@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5 3/3] mm,hwpoison: Send SIGBUS with error virutal
address
On Fri, May 21, 2021 at 12:01:56PM +0900, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@....com>
>
> Now an action required MCE in already hwpoisoned address surely sends a
> SIGBUS to current process, but the SIGBUS doesn't convey error virtual
> address. That's not optimal for hwpoison-aware applications.
>
> To fix the issue, make memory_failure() call kill_accessing_process(),
> that does pagetable walk to find the error virtual address. It could
> find multiple virtual addresses for the same error page, and it seems
> hard to tell which virtual address is correct one. But that's rare
> and sending incorrect virtual address could be better than no address.
> So let's report the first found virtual address for now.
>
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@....com>
> ---
> change log v4 -> v5:
> - switched to first found approach,
> - introduced check_hwpoisoned_pmd_entry() to fix build failure on arch
> without thp support.
>
> change log v3 -> v4:
> - refactored hwpoison_pte_range to save indentation,
> - updated patch description
>
> change log v1 -> v2:
> - initialize local variables in check_hwpoisoned_entry() and
> hwpoison_pte_range()
> - fix and improve logic to calculate error address offset.
> ---
...
> +static int kill_accessing_process(struct task_struct *p, unsigned long pfn,
> + int flags)
> +{
> + int ret;
> + struct hwp_walk priv = {
> + .pfn = pfn,
> + };
> + priv.tk.tsk = p;
> +
> + mmap_read_lock(p->mm);
> + ret = walk_page_range(p->mm, 0, TASK_SIZE, &hwp_walk_ops,
> + (void *)&priv);
> + if (!ret && priv.tk.addr)
Sorry, I found a silly mistake: since v5, walk_page_range() returns 1 when it
finds at least one error virtual address, so the old "!ret" check never matches
in the found case. The if-condition should be changed like this:
@@ -691,7 +691,8 @@ static int kill_accessing_process(struct task_struct *p, unsigned long pfn,
mmap_read_lock(p->mm);
ret = walk_page_range(p->mm, 0, TASK_SIZE, &hwp_walk_ops,
(void *)&priv);
- if (!ret && priv.tk.addr)
+ if (ret == 1 && priv.tk.addr)
kill_proc(&priv.tk, pfn, flags);
mmap_read_unlock(p->mm);
return ret ? -EFAULT : -EHWPOISON;
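
To illustrate why the old check never fired, here is a minimal userspace
sketch. It is not kernel code: walk_model(), struct walk_result and the two
should_kill_*() helpers are hypothetical names that only imitate the return
convention described above (1 when an address was found, 0 when the walk
finished without a match).

```c
#include <stddef.h>

/* Hypothetical stand-in for the result the page walk records. */
struct walk_result {
	unsigned long addr;	/* stays 0 when nothing was found */
};

/*
 * Hypothetical model of the walk: stop at the first entry matching the
 * target pfn, record a fake nonzero address, and return 1.  Return 0 if
 * the whole range was walked without a match.
 */
static int walk_model(const unsigned long *pfns, size_t n,
		      unsigned long target, struct walk_result *res)
{
	for (size_t i = 0; i < n; i++) {
		if (pfns[i] == target) {
			res->addr = (unsigned long)(i + 1) * 4096UL;
			return 1;
		}
	}
	return 0;
}

/* Old condition: requires ret == 0, which means "nothing found". */
static int should_kill_old(int ret, unsigned long addr)
{
	return !ret && addr != 0;
}

/* Fixed condition: ret == 1 means "found", and addr must be set. */
static int should_kill_fixed(int ret, unsigned long addr)
{
	return ret == 1 && addr != 0;
}
```

In this model a successful walk gives ret == 1 with a nonzero addr, so
should_kill_old() is false and should_kill_fixed() is true; when nothing is
found both are false because addr stays 0.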
Andrew, this patch is now in linux-mm, so could you apply this fix on top of
mmhwpoison-send-sigbus-with-error-virutal-address.patch?
Or if it's better to resend the whole patch, please let me know.
Thanks,
Naoya Horiguchi