Message-ID: <20230427023045.GA3499768@hori.linux.bs1.fc.nec.co.jp>
Date:   Thu, 27 Apr 2023 02:31:08 +0000
From:   HORIGUCHI NAOYA(堀口 直也) 
        <naoya.horiguchi@....com>
To:     Kefeng Wang <wangkefeng.wang@...wei.com>
CC:     "Luck, Tony" <tony.luck@...el.com>,
        "chu, jane" <jane.chu@...cle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Christian Brauner <brauner@...nel.org>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Miaohe Lin <linmiaohe@...wei.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Tong Tiangen <tongtiangen@...wei.com>,
        Jens Axboe <axboe@...nel.dk>
Subject: Re: [PATCH v2] mm: hwpoison: coredump: support recovery from
 dump_user_range()

On Thu, Apr 27, 2023 at 09:06:46AM +0800, Kefeng Wang wrote:
> 
> 
> On 2023/4/26 23:45, Luck, Tony wrote:
> > > > > > Thanks for confirming. What is your opinion on adding
> > > > > > MCE_IN_KERNEL_COPYIN to the EX_TYPE_DEFAULT_MCE_SAFE/FAULT_MCE_SAFE types
> > > > > > so that do_machine_check calls queue_task_work(&m, msg, kill_me_never),
> > > > > > which would remove every memory_failure_queue() call after an MC-safe copy returns?
> > > > 
> > > > I haven't been following this thread closely. Can you give a link to the e-mail
> > > > where you posted a patch that does this? Or just repost that patch if easier.
> > > 
> > > > The main diff is in [1]; I will post a formal patch once -rc1 is out,
> > > > thanks.
> > > 
> > > [1]
> > > https://lore.kernel.org/linux-mm/6dc1b117-020e-be9e-7e5e-a349ffb7d00a@huawei.com/
> > 
> > There seem to be a few misconceptions in that message. Not sure if all of them
> > were resolved.  Here are some pertinent points:
> > 
> > > > > > In my understanding, an MCE should not be triggered when an MC-safe
> > > > > > copy tries to access a memory error.  So I feel that we might be
> > > > > > talking about different scenarios.
> > 
> > > This is wrong. There is still a machine check when an MC-safe copy does a read
> > > from a location that has a memory error.

Yes, the above was my first impression, which has since been proven wrong ;)

> > 
> > The recovery flow in this case does not involve queue_task_work(). That is only
> > useful for machine check exceptions taken in user context. The queued work will
> > be executed to call memory_failure() from the kernel, but in process context (not
> > from the machine check exception stack) to handle the error.
> > 
> > For machine checks taken by kernel code (MC-safe copy functions) the recovery
> > path is here:
> > 
> >                  if (m.kflags & MCE_IN_KERNEL_RECOV) {
> >                          if (!fixup_exception(regs, X86_TRAP_MC, 0, 0))
> >                                  mce_panic("Failed kernel mode recovery", &m, msg);
> >                  }
> > 
> >                  if (m.kflags & MCE_IN_KERNEL_COPYIN)
> >                          queue_task_work(&m, msg, kill_me_never);
> > 
> > The "fixup_exception()" ensures that on return from the machine check handler
> > code returns to the extable[] fixup location instead of the instruction that was
> > loading from the memory error location.
> > 
> > When the exception was from one of the copy_from_user() variants it makes
> > sense to also do the queue_task_work() because the kernel is going to return
> > to the user context (with an EFAULT error code from whatever system call was
> > attempting the copy_from_user()).
> > 
> > But in the core dump case there is no return to user. The process is being
> > terminated by the signal that leads to this core dump. So even though you
> > may consider the page being accessed to be a "user" page, you can't fix
> > it by queueing work to run on return to user.
> 
> For coredump, the task work will be called too; see the following call chain:
> 
> get_signal
> 	sig_kernel_coredump
> 		elf_core_dump
> 			dump_user_range
> 				_copy_from_iter // with MC-safe copy, return without panic
> 	do_group_exit(ksig->info.si_signo);
> 		do_exit
> 			exit_task_work
> 				task_work_run
> 					kill_me_never
> 						memory_failure
> 
> I also added a debug print to check the memory_failure() processing after
> adding MCE_IN_KERNEL_COPYIN to the MCE_SAFE exception types, and tested CoW
> of a normal page and a huge page; it works too.

Sounds nice to me.
Maybe this information is worth documenting in the patch description.
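
For such a description, the ordering in the call chain quoted above can be
modelled in user space. This is only a rough sketch and not kernel code:
every model_* name is hypothetical, and it assumes nothing beyond what the
chain states, namely that the work queued during the MC-safe copy is not run
during the dump itself but later, when the exit path runs pending task work.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_HANDLED 8

struct model_task;

/* Hypothetical stand-in for a queued task-work item. */
struct model_work {
	void (*func)(struct model_task *, struct model_work *);
	struct model_work *next;
	unsigned long pfn;		/* page that reported the memory error */
};

/* Hypothetical stand-in for the dumping task. */
struct model_task {
	struct model_work *pending;	/* work queued but not yet run */
	unsigned long handled[MODEL_MAX_HANDLED];
	int nr_handled;			/* pages passed to the handler so far */
};

/* Models queueing: the work is only recorded here, not executed. */
static void model_task_work_add(struct model_task *t, struct model_work *w)
{
	assert(w->func != NULL);
	w->next = t->pending;
	t->pending = w;
}

/* Models the exit path draining all pending work. */
static void model_task_work_run(struct model_task *t)
{
	while (t->pending) {
		struct model_work *w = t->pending;

		t->pending = w->next;
		w->func(t, w);
	}
}

/* Models the callback that ends up handling the poisoned page. */
static void model_kill_me_never(struct model_task *t, struct model_work *w)
{
	if (t->nr_handled < MODEL_MAX_HANDLED)
		t->handled[t->nr_handled++] = w->pfn;
}

/*
 * Models one page copy during the dump. On a poisoned page the copy
 * fails without a panic and the handling is queued for later.
 */
static int model_dump_page(struct model_task *t, struct model_work *w,
			   unsigned long pfn, int poisoned)
{
	if (!poisoned)
		return 0;
	w->pfn = pfn;
	w->func = model_kill_me_never;
	model_task_work_add(t, w);
	return -1;
}

/* Returns how many poisoned pages were handled once the task has exited. */
int model_coredump_then_exit(unsigned long pfn, int poisoned)
{
	struct model_task t = {0};
	struct model_work w = {0};

	(void)model_dump_page(&t, &w, pfn, poisoned);
	assert(t.nr_handled == 0);	/* nothing is handled during the dump */
	model_task_work_run(&t);	/* exit path runs the queued work */
	return t.nr_handled;
}
```

Calling model_coredump_then_exit(pfn, 1) goes through the queue-then-run
path, while passing 0 as the second argument models a dump that never touches
a poisoned page, so nothing is queued.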

Thanks,
Naoya Horiguchi
