Message-ID: <CAPcxDJ5gH9XvZ1bMsRqqU8bTpGLsz75+pWMnj52b-nMZHKhdtQ@mail.gmail.com>
Date: Mon, 19 Apr 2021 19:03:01 -0700
From: Jue Wang <juew@...gle.com>
To: nao.horiguchi@...il.com, "Luck, Tony" <tony.luck@...el.com>
Cc: akpm@...ux-foundation.org, bp@...en8.de, david@...hat.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, luto@...nel.org,
naoya.horiguchi@....com, osalvador@...e.de, yaoaili@...gsoft.com
Subject: Re: [PATCH v1 3/3] mm,hwpoison: add kill_accessing_process() to find
error virtual address
On Tue, 13 Apr 2021 07:43:20 +0900, Naoya Horiguchi wrote:
> This patch suggests to do page table walk to find the error virtual
> address. If we find multiple virtual addresses in walking, we now can't
> determine which one is correct, so we fall back to sending SIGBUS in
> kill_me_maybe() without error info as we do now. This corner case needs
> to be solved in the future.
Instead of walking the page tables, what about the following idea:
When it fails to get the vaddr, memory_failure() just ensures that the mapping
is removed and a hwpoison swap pte is put in place, or that the original page is
flagged with PG_hwpoison and kept in the radix tree (e.g., for SHMEM THP).
NOTE: no SIGBUS is sent to user space at this point.
Then do_machine_check() just returns to user space to resume execution. The
re-executed access will result in a #PF and should land in the exact page fault
handling code that generates a SIGBUS with the precise vaddr info:
https://github.com/torvalds/linux/blob/7af08140979a6e7e12b78c93b8625c8d25b084e2/mm/memory.c#L3290
https://github.com/torvalds/linux/blob/7af08140979a6e7e12b78c93b8625c8d25b084e2/mm/memory.c#L3647
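To illustrate what user space would see on that path, here is a minimal
sketch of a SIGBUS handler that reads the faulting address from si_addr.
Injecting a real hwpoison error needs privileges, so as a stand-in (an
assumption of this sketch, not part of the proposal) the SIGBUS is provoked
by touching a file mapping beyond EOF, which is also delivered from the page
fault path. For a real memory error si_code would be BUS_MCEERR_AR rather
than BUS_ADRERR. The function and variable names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;
static void *volatile reported_addr;        /* si_addr seen by the handler */
static volatile sig_atomic_t reported_code; /* si_code seen by the handler */

/* Record the fault details and jump back past the faulting access. */
static void on_sigbus(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    reported_addr = info->si_addr;
    reported_code = info->si_code;
    siglongjmp(recover_point, 1);
}

/*
 * Returns 0 if the SIGBUS raised from the page fault path reported an
 * address inside the page that was actually accessed, -1 otherwise.
 */
int sigbus_reports_fault_vaddr(void)
{
    struct sigaction sa = {0};
    char path[] = "/tmp/sigbus-demo-XXXXXX";
    long page = sysconf(_SC_PAGESIZE);
    int result = -1;

    assert(page > 0);
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    /* Map one page of the file, then shrink the file underneath it. */
    if (ftruncate(fd, page) != 0) {
        close(fd);
        return -1;
    }
    char *map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }
    if (ftruncate(fd, 0) != 0) {
        munmap(map, (size_t)page);
        close(fd);
        return -1;
    }

    volatile char *target = map + 123;
    if (sigsetjmp(recover_point, 1) == 0) {
        (void)*target; /* takes a page fault, which raises SIGBUS */
    } else {
        /* Compare at page granularity to stay architecture-neutral. */
        uintptr_t mask = ~((uintptr_t)page - 1);
        if (((uintptr_t)reported_addr & mask) == (uintptr_t)map)
            result = 0;
    }

    munmap(map, (size_t)page);
    close(fd);
    return result;
}
```

With a handler like this, an application that gets the SIGBUS from the
re-executed access can tell which of its mappings was hit without the kernel
having to search for the vaddr in advance.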
Thanks,
-Jue