Date:   Wed, 7 Oct 2020 18:49:51 +0000
From:   "Luck, Tony" <tony.luck@...el.com>
To:     David Laight <David.Laight@...LAB.COM>,
        Borislav Petkov <bp@...en8.de>
CC:     "Song, Youquan" <youquan.song@...el.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v3 4/6] x86/mce: Avoid tail copy when machine check
 terminated a copy from user

>> Machine checks are more serious. Just give up at the point where the
>> main copy loop triggered the #MC and return from the copy code as if
>> the copy succeeded. The machine check handler will use task_work_add() to
>> make sure that the task is sent a SIGBUS.
>
> Isn't that just plain wrong?

It isn't pretty. I'm not sure how wrong it is.

> If copy is reported as succeeding the kernel code will use the 'old'
> data that is in the buffer as if it had been read from userspace.
> This could end up with kernel stack data being written to a file.

I ran a test with:

	write(fd, buf, 512)

With poison injected into buf[256] to force a machine check mid-copy.
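A minimal sketch of a harness around that write() could look like the following. It is not the program I ran: the helper name write_and_inspect(), the 0x5A fill byte and the file path are hypothetical, and the poison injection step is deliberately left out because it depends on platform error-injection support. Without injection it just reports the file size and how many bytes past offset 256 read back as zero.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define TEST_LEN   512	/* size of the write() in the test */
#define POISON_OFF 256	/* offset where poison would be injected */

/*
 * Write TEST_LEN bytes of 0x5A to a fresh file, then report the resulting
 * file size and the number of zero bytes read back from offset POISON_OFF on.
 * Returns 0 on success, -1 on error.
 */
int write_and_inspect(const char *path, off_t *size_out, int *zeros_out)
{
	unsigned char buf[TEST_LEN], back[TEST_LEN];
	struct stat st;
	ssize_t n;
	int fd, i, zeros = 0;

	memset(buf, 0x5A, sizeof(buf));
	/* Poison injection into buf[POISON_OFF] would go here; not shown. */

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	if (write(fd, buf, sizeof(buf)) < 0 || fstat(fd, &st) < 0 ||
	    lseek(fd, 0, SEEK_SET) < 0) {
		close(fd);
		return -1;
	}

	/* Pre-fill with nonzero bytes so a short read is not counted as zeroes. */
	memset(back, 0xFF, sizeof(back));
	n = read(fd, back, sizeof(back));
	close(fd);
	if (n < 0)
		return -1;

	for (i = POISON_OFF; i < n; i++)
		if (back[i] == 0)
			zeros++;

	*size_out = st.st_size;
	*zeros_out = zeros;
	return 0;
}
```

On a run with no poison injected, the size should come back as 512 with no zero bytes past offset 256, since the whole buffer is copied.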

The size of the file did get incremented by 512 rather than 256, which isn't good.

The data in the file up to the 256 byte mark was the user data from buf[0 ... 255].

The data in the file past offset 256 was all zeroes. I suspect that isn't by chance.
The kernel has to defend against a user writing a partial page and using mmap(2)
on the same file to peek at data past EOF and up to the next PAGE_SIZE boundary.
So I think it must zero new pages allocated in page cache as they are allocated to
a file.
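The mmap(2) behaviour described above can be checked from user space with something like this sketch. The helper name nonzero_bytes_past_eof() and the 0xAA fill byte are hypothetical, and it only shows what a mapping exposes past EOF. It does not show where in the kernel the zeroing happens.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write partial_len bytes of 0xAA to a fresh file, map its first page, and
 * count the nonzero bytes between EOF and the end of that page.
 * Returns the count, or -1 on error.
 */
long nonzero_bytes_past_eof(const char *path, size_t partial_len)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char buf[512];
	unsigned char *map;
	long i, nonzero = 0;
	int fd;

	assert(page > 0);
	if (partial_len == 0 || partial_len > sizeof(buf) ||
	    partial_len >= (size_t)page)
		return -1;

	memset(buf, 0xAA, sizeof(buf));

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	if (write(fd, buf, partial_len) != (ssize_t)partial_len) {
		close(fd);
		return -1;
	}

	/* Map the whole first page even though the file is shorter. */
	map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);
	if (map == MAP_FAILED)
		return -1;

	/* Peek at everything from EOF up to the page boundary. */
	for (i = (long)partial_len; i < page; i++)
		if (map[i] != 0)
			nonzero++;

	munmap(map, (size_t)page);
	return nonzero;
}
```

Calling it with a partial_len of 256 should return 0, i.e. nothing but zeroes is visible between EOF and the next PAGE_SIZE boundary.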

> Even zeroing the rest of the kernel buffer is wrong.

It wouldn't help/change anything.

> IIRC the code to try to maximise the copy has been removed.
> So the 'slow' retry won't happen any more.

Which code has been removed, and when? (TIP, and my testing, is based on 5.9-rc1.)

-Tony
