lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.58.0802041650510.4454@gandalf.stny.rr.com>
Date:	Mon, 4 Feb 2008 17:03:47 -0500 (EST)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Christoph Hellwig <hch@...radead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	Gregory Haskins <ghaskins@...ell.com>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tim Bird <tim.bird@...sony.com>,
	Sam Ravnborg <sam@...nborg.org>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	Jan Kiszka <jan.kiszka@...mens.com>,
	John Stultz <johnstul@...ibm.com>,
	Arjan van de Ven <arjan@...radead.org>,
	Steven Rostedt <srostedt@...hat.com>
Subject: Re: [PATCH 02/22 -v7] Add basic support for gcc profiler instrumentation


On Mon, 4 Feb 2008, Paul E. McKenney wrote:
> OK, will see what I can do...
>
> > On Sat, 2 Feb 2008, Paul E. McKenney wrote:
> >
> > > Yep, you have dependencies, so something like the following:
> > >
> > > initial state:
> > >
> > > 	struct foo {
> > > 		int a;
> > > 	};
> > > 	struct foo x = { 0 };
> > > 	struct foo y = { 0 };
> > > 	struct foo *global_p = &y;
> > > 	/* other variables are appropriately declared auto variables */
> > >
> > > 	/* No kmalloc() or kfree(), hence no RCU grace periods. */
> > > 	/* In the terminology of http://lwn.net/Articles/262464/, we */
> > > 	/* are doing only publish-subscribe, nothing else. */
> > >
> > > writer:
> > >
> > > 	x.a = 1;
> > > 	smp_wmb();  /* or smp_mb() */
> > > 	global_p = &x;
> > >
> > > reader:
> > >
> > > 	p = global_p;
> > > 	ta = p->a;
> > >
> > > Both Alpha and aggressive compiler optimizations can result in the reader
> > > seeing the new value of the pointer (&x) but the old value of the field
> > > (0).  Strange but true.  The fix is as follows:
> > >
> > > reader:
> > >
> > > 	p = global_p;
> > > 	smp_read_barrier_depends();  /* or use rcu_dereference() */
> > > 	ta = p->a;
> > >
> > > So how can this happen?  First note that if smp_read_barrier_depends()
> > > was unnecessary in this case, it would be unnecessary in all cases.
> > >
> > > Second, let's start with the compiler.  Suppose that a highly optimizing
> > > compiler notices that in almost all cases, the reader finds p==global_p.
> > > Suppose that this compiler also notices that one of the registers (say
> > > r1) almost always contains this expected value of global_p, and that
> > > cache pressure ensures that an actual load from global_p almost always
> > > generates an expensive cache miss.  Such a compiler would be within its
> > > rights (as defined by the C standard) to generate code assuming that r1
> > > already had the right value, while also generating code to validate this
> > > assumption, perhaps as follows:
> > >
> > > 	r2 = global_p;  /* high latency, other things complete meanwhile */
> > > 	ta == r1->a;
> > > 	if (r1 != r2)
> > > 		ta = r2->a;
> > >
> > > Now consider the following sequence of events on a superscalar CPU:
> >
> > I think you missed one step here (causing my confusion). I don't want to
> > assume so I'll try to put in the missing step:
> >
> > 	writer: r1 = p;  /* happens to use r1 to store parameter p */
>
> You lost me on this one...  The writer has only the following three steps:

You're right. I meant "writer:  r1 = x;"

>
> writer:
>
> 	x.a = 1;
> 	smp_wmb();  /* or smp_mb() */
> 	global_p = &x;
>
> Where did the "r1 = p" come from?  For that matter, where did "p" come
> from?
>
> > > 	reader: r2 = global_p; /* issued, has not yet completed. */
> > > 	reader: ta = r1->a; /* which gives zero. */
> > > 	writer: x.a = 1;
> > > 	writer: smp_wmb();
> > > 	writer: global_p = &x;
> > > 	reader: r2 = global_p; /* this instruction now completes */
> > > 	reader: if (r1 != r2) /* and these are equal, so we keep bad ta! */
> >
> > Is that the case?
>
> Ah!  Please note that I am doing something unusual here in that I am
> working with global variables, as opposed to the normal RCU practice of
> dynamically allocating memory.  So "x" is just a global struct, not a
> pointer to a struct.
>

But lets look at a simple version of my original code anyway ;-)

Writer:

void add_op(struct myops *x) {
	/* x->next may be garbage here */
	x->next = global_p;
	smp_wmb();
	global_p = x;
}

Reader:

void read_op(void)
{
	struct myops *p = global_p;

	while (p != NULL) {
		p->func();
		p = next;
		/* if p->next is garbage we crash */
	}
}


Here, we are missing the read_barrier_depends(). Lets look at the Alpha
cache issue:


reader reads the new version of global_p, and then reads the next
pointer. But since the next pointer is on a different cacheline than
global_p, it may have somehow had that in it's cache still. So it uses the
old next pointer which contains the garbage.

Is that correct?

But I will have to admit, that I can't see how an aggressive compiler
might have screwed this up. Being that x is a parameter, and the function
add_op is not in a header file.

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ