linux-kernel - Re: ia64 hang/mca running gdb 'make check'

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100730043423.GH20265@lackof.org>
Date:	Thu, 29 Jul 2010 22:34:23 -0600
From:	dann frazier <dannf@...nf.org>
To:	Hugh Dickins <hughd@...gle.com>
Cc:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	linux-ia64@...r.kernel.org, linux-kernel@...r.kernel.org,
	Rik van Riel <riel@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Nick Piggin <npiggin@...e.de>, Mel Gorman <mel@....ul.ie>,
	Minchan Kim <minchan.kim@...il.com>,
	Ralf Baechle <ralf@...ux-mips.org>
Subject: Re: ia64 hang/mca running gdb 'make check'

On Thu, Jul 29, 2010 at 07:01:56PM -0700, Hugh Dickins wrote:
> On Thu, 29 Jul 2010, dann frazier wrote:
> > On Wed, Jul 28, 2010 at 08:50:18PM -0700, Hugh Dickins wrote:
> > > 
> > > Let's note that gdb's gcore is building up its own version of a
> > > coredump, not going through the get_dump_page() code I was wondering
> > > about.  If I read gcore correctly (possibly not!), it will be reading
> > > selected areas from /proc/<pid>/mem i.e. using access_process_vm().
> > 
> > This appears to be correct. I was able to collect the following
> > stacktrace using INIT:
> > 
> > [ 2535.074197] Backtrace of pid 4605 (gdb)
> > [ 2535.074197] 
> > [ 2535.074197] Call Trace:
> > [ 2535.074197]  [<a00000010000bb00>] ia64_native_leave_kernel+0x0/0x270
> > [ 2535.074197]                                 sp=e000004081c77c40 bsp=e000004081c71018
> > [ 2535.074197]  [<a000000100334720>] __copy_user+0x160/0x960
> > [ 2535.074197]                                 sp=e000004081c77e10 bsp=e000004081c71018
> > [ 2535.074197]  [<a000000100176b00>] access_process_vm+0x2c0/0x380
> > [ 2535.074197]                                 sp=e000004081c77e10 bsp=e000004081c70f60
> 
> Thanks a lot, dann.  But it was the [vdso] line in foo's /proc/<pid>/maps
> which you sent me privately, that set me thinking on the right track.
> Here's what I believe is the appropriate patch: please give it a try
> and let us know...

dannf@...600:~> gdb foo
GNU gdb (GDB) SUSE (7.0-0.4.16)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "ia64-suse-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/dannf/foo...done.
(gdb) break leaf
Breakpoint 1 at 0x4000000000000401: file foo.c, line 2.
(gdb) run
Starting program: /home/dannf/foo 

Breakpoint 1, leaf () at foo.c:2
2	     return 0;
(gdb) gcore
Saved corefile core.3952
(gdb) 

good work Hugh!

     -dann

> 
> [PATCH] mm: fix ia64 crash when gcore reads gate area
> 
> Debian's ia64 autobuilders have been seeing kernel freeze or reboot
> when running the gdb testsuite (Debian bug 588574): dannf bisected to
> 2.6.32 62eede62dafb4a6633eae7ffbeb34c60dba5e7b1 "mm: ZERO_PAGE without
> PTE_SPECIAL"; and reproduced it with gdb's gcore on a simple target.
> 
> I'd missed updating the gate_vma handling in __get_user_pages(): that
> happens to use vm_normal_page() (nowadays failing on the zero page),
> yet reported success even when it failed to get a page - boom when
> access_process_vm() tried to copy that to its intermediate buffer.
> 
> Fix this, resisting cleanups: in particular, leave it for now reporting
> success when not asked to get any pages - very probably safe to change,
> but let's not risk it without testing exposure.
> 
> Why did ia64 crash with 16kB pages, but succeed with 64kB pages?
> Because setup_gate() pads each 64kB of its gate area with zero pages.
> 
> Reported-by: Andreas Barth <aba@....so.argh.org>
> Bisected-by: dann frazier <dannf@...ian.org>
> Signed-off-by: Hugh Dickins <hughd@...gle.com>
> Cc: stable@...nel.org
> ---
> 
>  mm/memory.c |   16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> --- 2.6.35-rc6/mm/memory.c	2010-05-30 17:58:57.000000000 -0700
> +++ linux/mm/memory.c	2010-07-29 17:57:29.000000000 -0700
> @@ -1394,10 +1394,20 @@ int __get_user_pages(struct task_struct
>  				return i ? : -EFAULT;
>  			}
>  			if (pages) {
> -				struct page *page = vm_normal_page(gate_vma, start, *pte);
> +				struct page *page;
> +
> +				page = vm_normal_page(gate_vma, start, *pte);
> +				if (!page) {
> +					if (!(gup_flags & FOLL_DUMP) &&
> +					     is_zero_pfn(pte_pfn(*pte)))
> +						page = pte_page(*pte);
> +					else {
> +						pte_unmap(pte);
> +						return i ? : -EFAULT;
> +					}
> +				}
>  				pages[i] = page;
> -				if (page)
> -					get_page(page);
> +				get_page(page);
>  			}
>  			pte_unmap(pte);
>  			if (vmas)
> 

-- 
dann frazier

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/