Message-ID: <20210408084958.GC10192@zn.tnic>
Date: Thu, 8 Apr 2021 10:49:58 +0200
From: Borislav Petkov <bp@...en8.de>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Andy Lutomirski <luto@...nel.org>,
	Aili Yao <yaoaili@...gsoft.com>,
	HORIGUCHI NAOYA( 堀口 直也) <naoya.horiguchi@....com>
Subject: Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

On Wed, Apr 07, 2021 at 02:43:10PM -0700, Luck, Tony wrote:
> On Wed, Apr 07, 2021 at 11:18:16PM +0200, Borislav Petkov wrote:
> > On Thu, Mar 25, 2021 at 05:02:34PM -0700, Tony Luck wrote:
> > > Andy Lutomirski pointed out that sending SIGBUS to tasks that
> > > hit poison in the kernel copying syscall parameters from user
> > > address space is not the right semantic.
> >
> > What does that mean exactly?
>
> Andy said that a task could check a memory range for poison by
> doing:
>
> 	ret = write(fd, buf, size);
> 	if (ret == size) {
> 		/* memory range is all good */
> 	}
>
> That doesn't work if the kernel sends a SIGBUS.
>
> It doesn't seem a likely scenario ... but Andy is correct that
> the above ought to work.
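
For reference, the check quoted above could be sketched as a small
userspace helper along the following lines. This is only an illustration
under stated assumptions: the name range_is_readable() is made up here
and is not part of the patch, and fd is assumed to refer to something
whose write() path really copies from the user buffer (a regular file or
a pipe, say).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the patch: report whether the kernel
 * managed to copy all of buf[0..size) from user space.
 *
 * Assumption: write() on fd actually reads the buffer. As far as I know
 * that is not the case for /dev/null, which discards the data without
 * copying it, so it is no use for this check.
 */
static bool range_is_readable(int fd, const void *buf, size_t size)
{
	ssize_t ret;

	assert(fd >= 0);

	ret = write(fd, buf, size);

	/*
	 * A short write or an error means not everything was copied. That
	 * can also happen for reasons unrelated to poison (full disk,
	 * signal, non-blocking fd), so "false" is only a hint.
	 */
	return ret >= 0 && (size_t)ret == size;
}
```

The point of the discussion is that such a helper can only return false
if the kernel reports the failed copy through the write() return value
rather than by sending SIGBUS to the caller.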

We need to document properly what this is aiming to fix. Andy said
something yesterday along the lines of kthread_use_mm() hitting a SIGBUS
when a kthread "attaches" to an address space. I'm still unclear as to
how exactly that happens - there are only a handful of kthread_use_mm()
users in the tree...

> Yes. This is for kernel reading memory belonging to "current" task.

Provided "current" is really the task to which the poison page belongs.
That kthread_use_mm() thing sounded like the wrong task gets killed. But that
needs more details.

> Same in that the page gets unmapped. Different in that there
> is no SIGBUS if the kernel did the access for the user.

What is even the actual use case with sending tasks SIGBUS on poison
consumption? KVM? Others?

Are we documenting somewhere: "if your process gets a SIGBUS and this
and that, which means your page got offlined, you should do this and
that to recover"?

Thx.

--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette