Date:	Wed, 05 Aug 2009 10:44:38 +0900
From:	Tejun Heo <tj@...nel.org>
To:	Jeremy Fitzhardinge <jeremy@...p.org>
CC:	Rusty Russell <rusty@...tcorp.com.au>,
	Ingo Molnar <mingo@...hat.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Problem with percpu values when bringing up second CPU?

Hello,

Jeremy Fitzhardinge wrote:
> I just tracked down a bug I was having to a change where I changed one
> of my Xen event channel variables to a percpu variable, relating to
> masking an event channel.
> 
> The symptom was that shortly after bringing up the second CPU, the first
> CPU's timer events stopped arriving, apparently because they had become
> masked.

Hmmmm...

> The event channel masks are declared as:
> 
> #define NR_EVENT_CHANNEL_LONGS (NR_EVENT_CHANNELS/BITS_PER_LONG)
> static DEFINE_PER_CPU(unsigned long,
>                      cpu_evtchn_mask[NR_EVENT_CHANNEL_LONGS]) =
>        {[0 ... NR_EVENT_CHANNEL_LONGS-1] = ~0ul };	/* everything masked by default */
> 
> My theory about what's happening is that when the second CPU comes up,
> it allocates separate percpu areas for each CPU, but it is somehow
> failing to accurately copy CPU 0's percpu data over; either it isn't
> copying it all (ie, using the initialized values rather than the current
> values), or failing to copy the values in an interrupt-atomic way.
>
> Does this sound plausible?

Percpu areas aren't set up when the first cpu comes up.  They're
allocated and copied from the master copy during early init when only
the boot cpu is running.
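
As a rough illustration of that model, here is a userspace sketch (not
kernel code, and not the kernel's percpu allocator).  All names in it
(master_mask, cpu_area, setup_areas, NR_CPUS_SIM, NR_MASK_LONGS) are
made up for the example.  It assumes only what is described above: one
statically initialized master image, copied once into a separate area
for each CPU before any secondary CPU runs.

```c
/* Userspace simulation only; this is NOT the kernel's percpu code. */
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define NR_CPUS_SIM   2   /* hypothetical CPU count for the demo */
#define NR_MASK_LONGS 4   /* stand-in for NR_EVENT_CHANNEL_LONGS */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* "Master copy": the static image, everything masked by default. */
static const unsigned long master_mask[NR_MASK_LONGS] = {
	~0ul, ~0ul, ~0ul, ~0ul
};

/* One private area per simulated CPU. */
static unsigned long *cpu_area[NR_CPUS_SIM];

/*
 * Models early init: every CPU's area is allocated and filled from
 * the master copy up front, while only the "boot CPU" is running.
 * Returns 0 on success, -1 if an allocation fails.
 */
static int setup_areas(void)
{
	for (int cpu = 0; cpu < NR_CPUS_SIM; cpu++) {
		cpu_area[cpu] = malloc(sizeof(master_mask));
		if (!cpu_area[cpu])
			return -1;
		memcpy(cpu_area[cpu], master_mask, sizeof(master_mask));
	}
	return 0;
}

/* Clear one bit in the given CPU's private mask. */
static void unmask(int cpu, unsigned int bit)
{
	cpu_area[cpu][bit / BITS_PER_ULONG] &= ~(1ul << (bit % BITS_PER_ULONG));
}

/* Return nonzero if the bit is still set in the given CPU's mask. */
static int is_masked(int cpu, unsigned int bit)
{
	return (cpu_area[cpu][bit / BITS_PER_ULONG] >> (bit % BITS_PER_ULONG)) & 1ul;
}
```

Calling setup_areas() and then unmask(0, 5) from a small main() clears
the bit only in CPU 0's area; is_masked(1, 5) stays true.  In this
model nothing is copied again when a later CPU starts, so a value
written by CPU 0 after init would not be overwritten at that point.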

> When I convert this back to an ad-hoc percpu variable (an array indexed
> by cpu number), it goes back to working.  Also, if I boot with maxcpus=1
> it also works with percpu data.

Hmmm... strange.  Can you try to print out the values along the boot
process and see when things go wrong?

> Also, because we don't have large pages under Xen, it always allocates
> percpu as 4k pages:
> 
> PERCPU: Allocated 21 4k pages, static data 82080 bytes

I don't think the choice of first chunk allocator would cause any
difference.

Thanks.

-- 
tejun