Message-ID: <48178DEB.90200@henry.ne.arcor.de>
Date:	Tue, 29 Apr 2008 23:06:51 +0200
From:	Henry Nestler <Henry.Ne@...or.de>
To:	Pekka Enberg <penberg@...helsinki.fi>
CC:	Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Alexander Viro <viro@....linux.org.uk>,
	Vegard Nossum <vegard.nossum@...il.com>
Subject: Re: [PATCH] x86: endless page faults in mount_block_root for Linux
 2.6

Pekka Enberg wrote:
> On Tue, Apr 29, 2008 at 5:33 PM, Ingo Molnar <mingo@...e.hu> wrote:
>>  btw., i have a kmemcheck-reported bug fixed in this same area with the
>>  patch below. I dont remember the details anymore, but the root mount
>>  code did something really, really weird here.
>>
>>  Subject: init: root mount fix
>>  From: Ingo Molnar <mingo@...e.hu>
>>  Date: Tue Apr 29 16:31:50 CEST 2008
>>
>>  Signed-off-by: Ingo Molnar <mingo@...e.hu>
>>  ---
>>   init/do_mounts.c |    8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>>  Index: linux/init/do_mounts.c
>>  ===================================================================
>>  --- linux.orig/init/do_mounts.c
>>  +++ linux/init/do_mounts.c
>>  @@ -201,9 +201,13 @@ static int __init do_mount_root(char *na
>>         return 0;
>>   }
>>
>>  +#if PAGE_SIZE < PATH_MAX
>>  +# error increase the fs_names allocation size here
>>  +#endif
>>  +
>>   void __init mount_block_root(char *name, int flags)
>>   {
>>  -       char *fs_names = __getname();
>>  +       char *fs_names = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
>>         char *p;
>>   #ifdef CONFIG_BLOCK
>>         char b[BDEVNAME_SIZE];
>>  @@ -251,7 +255,7 @@ retry:
>>   #endif
>>         panic("VFS: Unable to mount root fs on %s", b);
>>   out:
>>  -       putname(fs_names);
>>  +       free_pages((unsigned long)fs_names, 1);
>>   }
>>
>>   #ifdef CONFIG_ROOT_NFS
> 
> It could have been a bug in early kmemcheck too. We don't check memory
> allocated with the page allocator, only slab, so this shouldn't
> trigger anything.
> 

Using "__get_free_pages" doesn't help. The real problem is the page after
the allocated page, not the page where fs_names starts.

I have just printk'ed some addresses of fs_names. They are c1152000,
c1150000, c2736000, c0450000, and so on. None of these addresses are in
the vmalloc area. See the boot messages. I was booting with mem=40:
  virtual kernel memory layout:
    fixmap  : 0xffffc000 - 0xfffff000   (  12 kB)
    vmalloc : 0xc3000000 - 0xffffa000   ( 975 MB)
    lowmem  : 0xc0000000 - 0xc2800000   (  40 MB)
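
As a rough cross-check of the addresses above against this layout, here is
a minimal sketch in C. The enum and function names are made up for
illustration, and the ranges are hardcoded from the boot messages of this
one machine, so this is not kernel code:

```c
#include <stdint.h>

/* Hypothetical names, only for this illustration. */
enum sketch_region { SKETCH_LOWMEM, SKETCH_VMALLOC, SKETCH_FIXMAP, SKETCH_OTHER };

/* Classify a 32-bit kernel virtual address using the ranges printed
 * in the boot messages above (start inclusive, end exclusive). */
static enum sketch_region sketch_classify(uint32_t addr)
{
    if (addr >= 0xc0000000u && addr < 0xc2800000u)
        return SKETCH_LOWMEM;
    if (addr >= 0xc3000000u && addr < 0xffffa000u)
        return SKETCH_VMALLOC;
    if (addr >= 0xffffc000u && addr < 0xfffff000u)
        return SKETCH_FIXMAP;
    return SKETCH_OTHER;
}
```

All four printed fs_names addresses fall into the lowmem range with this
check, and none into vmalloc.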

In mount_block_root the loop
   for (p = fs_names; *p; p += strlen(p)+1) {
can point beyond the allocated page. What happens if the function
exact_copy_from_user accesses "p+PAGE_SIZE", where p=fs_names+9, and this
page is not mapped?
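
To put a number on that, a small sketch in C. It assumes 4 kB pages and a
buffer that occupies exactly one page. The macro and function names are
hypothetical, not from the kernel:

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed 4 kB pages */

/* For a read of SKETCH_PAGE_SIZE bytes starting at addr, return how
 * many of those bytes lie in the page following the one that contains
 * addr. This equals the offset of addr within its own page. */
static uint32_t sketch_bytes_past_page(uint32_t addr)
{
    return addr & (SKETCH_PAGE_SIZE - 1);
}
```

With fs_names=c1152000, a full-page read from fs_names itself stays inside
the page, but a read from p=fs_names+9 reaches 9 bytes into the next page.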

The problem I see is that sys_mount is designed for userland calls.
But mount_block_root passes kernel space addresses as parameters
(address >= c0000000). In copy_mount_options (fs/namespace.c) the size
will roll over and is then limited to PAGE_SIZE. For example, with
TASK_SIZE=c0000000 and data=c1152000...c2736000:
   size = TASK_SIZE - (unsigned long)data;
   if (size > PAGE_SIZE)
           size = PAGE_SIZE;
   i = size - exact_copy_from_user((void *)page, data, size);
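
The roll over can be reproduced outside the kernel with 32-bit unsigned
arithmetic. This is only a sketch of the three lines above: the names are
made up, TASK_SIZE is taken from the example, and a PAGE_SIZE of 4096 is an
assumption:

```c
#include <stdint.h>

#define SKETCH_TASK_SIZE 0xc0000000u   /* TASK_SIZE from the example */
#define SKETCH_PAGE_SIZE 4096u         /* assumed 4 kB pages */

/* Mirror the size computation quoted above using 32-bit unsigned math. */
static uint32_t sketch_copy_size(uint32_t data)
{
    /* Wraps around to a huge value when data is above TASK_SIZE. */
    uint32_t size = SKETCH_TASK_SIZE - data;

    if (size > SKETCH_PAGE_SIZE)
        size = SKETCH_PAGE_SIZE;
    return size;
}
```

For data=c1152000 the subtraction wraps to feeae000, which is then clamped
to 4096. For a userland pointer just below TASK_SIZE, such as bffffff0, the
same code gives 16, which is the case the check was written for.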

So "exact_copy_from_user" is always called with a size of 4096 when the
call comes from mount_block_root. That's why I would pass only page-aligned
parameters from mount_block_root to sys_mount.

Sorry for working with hex numbers. Memory mapping is not my favorite
part of the source code, and with the numbers it is clearer to see here.

-- 
Henry N.