lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080826203449.GD5300@tux1.brauer.lan>
Date:	Tue, 26 Aug 2008 22:34:49 +0200
From:	Gerhard Brauer <gerhard.brauer@....de>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc:	"Luiz Fernando N. Capitulino" <lcapitulino@...driva.com.br>,
	"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...e.hu>,
	linux-kernel@...r.kernel.org
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Tue, Aug 26, 2008 at 02:15:58PM -0400, Mathieu Desnoyers wrote:
> 
> Ok, it might still be caused by paravirt and alternatives instruction
> patching. What if you also do :
> 
> alternative_instructions()
> 
> +        unsigned long flags;
>         /* The patching is not fully atomic, so try to avoid local interruptions
>            that might execute the to be patched code.
>            Other CPUs are not running. */
>         stop_nmi();
> #ifdef CONFIG_X86_MCE
>         stop_mce();
> #endif
> +        local_irq_save(flags);
> 
> 
> ...
> +        local_irq_restore(flags);
>         restart_nmi();
> #ifdef CONFIG_X86_MCE
>         restart_mce();
> #endif
> 
> ?

Hej! This last changes (in addition to the others you mentioned) seems
to be a good shot. I could reboot 8 times the guest, compile several
packages (something which always leeds to the oops) and currently i
build two big packages simultan. So this is heavy IO.

I will try tomorrow more heavy build tests (to gain the good feeling to
the vbox+guest kernel again like it was with 2.6.25), but i think your
changes goes in the right direction.

Here is the diff what i've changed on your hints:

,----[ arch/x86/kernel/alternative.c ]
| --- alternative.c.org	2008-07-13 23:51:29.000000000 +0200
| +++ alternative.c	2008-08-26 21:35:20.000000000 +0200
| @@ -343,6 +343,7 @@
|  void alternatives_smp_switch(int smp)
|  {
|  	struct smp_alt_module *mod;
| +	unsigned long flags;
|  
|  #ifdef CONFIG_LOCKDEP
|  	/*
| @@ -359,7 +360,7 @@
|  		return;
|  	BUG_ON(!smp && (num_online_cpus() > 1));
|  
| -	spin_lock(&smp_alt);
| +	spin_lock_irqsave(&smp_alt, flags);
|  
|  	/*
|  	 * Avoid unnecessary switches because it forces JIT based VMs to
| @@ -383,7 +384,7 @@
|  						mod->text, mod->text_end);
|  	}
|  	smp_mode = smp;
| -	spin_unlock(&smp_alt);
| +	spin_unlock_irqrestore(&smp_alt, flags);
|  }
|  
|  #endif
| @@ -420,6 +421,7 @@
|  
|  void __init alternative_instructions(void)
|  {
| +	unsigned long flags;
|  	/* The patching is not fully atomic, so try to avoid local interruptions
|  	   that might execute the to be patched code.
|  	   Other CPUs are not running. */
| @@ -427,6 +429,7 @@
|  #ifdef CONFIG_X86_MCE
|  	stop_mce();
|  #endif
| +	local_irq_save(flags);
|  
|  	apply_alternatives(__alt_instructions, __alt_instructions_end);
|  
| @@ -465,6 +468,7 @@
|  				(unsigned long)__smp_locks,
|  				(unsigned long)__smp_locks_end);
|  
| +	local_irq_restore(flags);
|  	restart_nmi();
|  #ifdef CONFIG_X86_MCE
|  	restart_mce();
| @@ -508,33 +512,5 @@
|   */
|  void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
|  {
| -	unsigned long flags;
| -	char *vaddr;
| -	int nr_pages = 2;
| -	struct page *pages[2];
| -	int i;
| -
| -	if (!core_kernel_text((unsigned long)addr)) {
| -		pages[0] = vmalloc_to_page(addr);
| -		pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
| -	} else {
| -		pages[0] = virt_to_page(addr);
| -		WARN_ON(!PageReserved(pages[0]));
| -		pages[1] = virt_to_page(addr + PAGE_SIZE);
| -	}
| -	BUG_ON(!pages[0]);
| -	if (!pages[1])
| -		nr_pages = 1;
| -	vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
| -	BUG_ON(!vaddr);
| -	local_irq_save(flags);
| -	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
| -	local_irq_restore(flags);
| -	vunmap(vaddr);
| -	sync_core();
| -	/* Could also do a CLFLUSH here to speed up CPU recovery; but
| -	   that causes hangs on some VIA CPUs. */
| -	for (i = 0; i < len; i++)
| -		BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]);
| -	return addr;
| +	return text_poke_early(addr, opcode, len);
|  }
`----

So if Luiz and others could also try all 3 mentioned changes, maybe we
have a solution. I also will build tomorrow a new LiveCD/Install ISO
with these patches to see if the error there is also gone.

> Thanks,
> 
> Mathieu

Gerhard


-- 
Was wir wissen, ist ein Tropfen.
Was wir nicht wissen, ein Ozean (Newton)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ