[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <dec32bd7-36ad-7273-5874-62328a98bb30@virtuozzo.com>
Date: Thu, 12 Jan 2017 12:46:03 +0300
From: Dmitry Safonov <dsafonov@...tuozzo.com>
To: Andy Lutomirski <luto@...capital.net>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Dmitry Safonov" <0x7f454c46@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
"Ingo Molnar" <mingo@...hat.com>, "H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Borislav Petkov <bp@...e.de>, X86 ML <x86@...nel.org>
Subject: Re: [PATCH 1/2] x86/mm: don't mmap() over 4GB with compat syscall
On 01/12/2017 01:26 AM, Andy Lutomirski wrote:
> On Wed, Jan 11, 2017 at 10:17 AM, Dmitry Safonov <dsafonov@...tuozzo.com> wrote:
>> During fixing CRIU bugs on ZDTM tests for 32-bit C/R, I found that
>> compatible ia32/x32 syscalls mmap() and mmap2() can return address
>> over 4Gb in x86_64 applications, which results in returning lower
>> 4 bytes of address while dropping the higher bytes.
>> It happens because mmap() upper limit doesn't differ native/compat
>> syscalls for 64-bit task, it's: (TASK_UNMAPPED_BASE + random_factor)
>> which is: (PAGE_ALIGN(TASK_SIZE / 3)) + random_factor
>> (in case of legacy mmap it's just TASK_SIZE).
>> This patch limits higher address that can be mmaped with compat
>> syscalls in 64-bit applications with IA32_PAGE_OFFSET (+randomization).
>>
>> Signed-off-by: Dmitry Safonov <dsafonov@...tuozzo.com>
>> ---
>> arch/x86/kernel/sys_x86_64.c | 37 +++++++++++++++++++++++++++++--------
>> 1 file changed, 29 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
>> index a55ed63b9f91..0893725db6e6 100644
>> --- a/arch/x86/kernel/sys_x86_64.c
>> +++ b/arch/x86/kernel/sys_x86_64.c
>> @@ -100,7 +100,7 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
>> static void find_start_end(unsigned long flags, unsigned long *begin,
>> unsigned long *end)
>> {
>> - if (!test_thread_flag(TIF_ADDR32) && (flags & MAP_32BIT)) {
>> + if (!test_thread_flag(TIF_ADDR32)) {
>> /* This is usually used needed to map code in small
>> model, so it needs to be in the first 31bit. Limit
>> it to that. This means we need to move the
>> @@ -109,14 +109,24 @@ static void find_start_end(unsigned long flags, unsigned long *begin,
>> malloc knows how to fall back to mmap. Give it 1GB
>> of playground for now. -AK */
>> *begin = 0x40000000;
>> - *end = 0x80000000;
>> - if (current->flags & PF_RANDOMIZE) {
>> - *begin = randomize_page(*begin, 0x02000000);
>> +
>> + if (flags & MAP_32BIT) {
>> + if (current->flags & PF_RANDOMIZE)
>> + *begin = randomize_page(*begin, 0x02000000);
>> + *end = 0x80000000;
>> + return;
>> + }
>> + if (current->thread.status & TS_COMPAT) {
>> + if (current->flags & PF_RANDOMIZE)
>> + *begin = randomize_page(*begin,
>> + 1UL << mmap_rnd_compat_bits);
>> + *end = IA32_PAGE_OFFSET;
>> + return;
>> }
>> - } else {
>> - *begin = current->mm->mmap_legacy_base;
>> - *end = TASK_SIZE;
>> }
>> +
>> + *begin = current->mm->mmap_legacy_base;
>> + *end = TASK_SIZE;
>> }
>>
>> unsigned long
>> @@ -187,10 +197,21 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
>> return addr;
>> }
>>
>> + if (current->thread.status & TS_COMPAT) {
>
> in_compat_syscall(), please.
Indeed, forgot about the helper.
>
> Also, we need to verify that, if this is called execve(), it does the
> right thing.
Hmm, not sure I get it right.
A test for calling compat sys_execve() from and for 64-bit ELF?
>> + if (current->flags & PF_RANDOMIZE) {
>> + unsigned long rnd = 1UL << mmap_rnd_compat_bits;
>> +
>> + info.high_limit =
>> + randomize_page(IA32_PAGE_OFFSET - rnd, rnd);
>> + } else {
>> + info.high_limit = IA32_PAGE_OFFSET;
>> + }
>> + } else {
>> + info.high_limit = mm->mmap_base;
>> + }
>
> This code was incomprehensible before and it's worse now. Could you
> try to clean it up a bit? For example, a patch that simply folds
> find_start_end() into its sole caller as the first patch in the series
> without changing any semantics would probably help.
Well, yep, I also don't like how this code looks like.
That will need to add a parameter to find_start_end() whether
allocation is bottom-up or up-bottom.
I'll try to cleanup for v2.
>
> --Andy
>
--
Dmitry
Powered by blists - more mailing lists