Date:	Sun, 19 Apr 2009 17:53:54 -0700
From:	Roland Dreier <rdreier@...co.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	"H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"Robert P. J. Day" <rpjday@...shcourse.ca>,
	Hitoshi Mitake <h.mitake@...il.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: arch/x86/Kconfig selects invalid HAVE_READQ, HAVE_WRITEQ vars

 > Look at the drivers that define their own wrappers:
 > 
 > #ifndef readq
 > static inline unsigned long long readq(void __iomem *addr)
 > {
 >         return readl(addr) | (((unsigned long long)readl(addr + 4)) << 32LL);
 > }
 > #endif
 > 
 > ... it's the obvious 32-bit semantics for reading a 64-bit value 
 > from an mmio address. We made that available on 32-bit too.

But look at, say, drivers/infiniband/hw/amso100/c2.h:

#ifndef readq
static inline u64 readq(const void __iomem * addr)
{
	u64 ret = readl(addr + 4);
	ret <<= 32;
	ret |= readl(addr);

	return ret;
}
#endif

Notice that it reads from addr+4 *before* it reads from addr, rather
than after as in your example (and in fact your example depends on
unspecified evaluation order, since there is no sequence point between
the two operands of the | operator, so the compiler may issue the two
reads in either order).  Now, I don't know that hardware, so I don't
know if it makes a difference, but the niu example I gave in my
original email shows that given hardware with clear-on-read registers,
the order does very much matter.
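
To make the ordering point concrete, here is a minimal userspace
sketch.  Everything in it is hypothetical: struct fake_dev, fake_readl()
and the rule that reading the low half clears the whole register are an
invented model, not the behavior of niu, amso100 or any real device.
Each helper puts its two reads in separate statements, so the order is
fixed by the code rather than left to the compiler.

```c
#include <stdint.h>

/*
 * Hypothetical simulated device: one 64-bit status register exposed as
 * two 32-bit halves.  Reading the LOW half (offset 0) clears the whole
 * register.  Invented model for illustration only.
 */
struct fake_dev {
	uint32_t lo;
	uint32_t hi;
};

/* Stand-in for readl(): returns one half, with the clear-on-read side effect. */
static uint32_t fake_readl(struct fake_dev *dev, unsigned int offset)
{
	uint32_t val;

	if (offset == 0) {
		val = dev->lo;
		dev->lo = 0;	/* side effect: reading the low half ... */
		dev->hi = 0;	/* ... clears both halves */
	} else {
		val = dev->hi;
	}
	return val;
}

/* Low half first: the high half is already cleared when it is read. */
static uint64_t readq_lo_hi(struct fake_dev *dev)
{
	uint64_t ret = fake_readl(dev, 0);

	ret |= (uint64_t)fake_readl(dev, 4) << 32;
	return ret;
}

/* High half first: both halves are captured before the clear happens. */
static uint64_t readq_hi_lo(struct fake_dev *dev)
{
	uint64_t ret = fake_readl(dev, 4);

	ret <<= 32;
	ret |= fake_readl(dev, 0);
	return ret;
}
```

On this made-up device, readq_hi_lo() returns the full 64-bit value
while readq_lo_hi() silently loses the upper 32 bits.  A device with
the opposite rule would need the opposite order, which is why a single
generic wrapper cannot be right for every driver.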

In a similar vein, drivers/infiniband/hw/mthca (which I wrote) deals
with hardware that has 64-bit registers, where we can write in two
32-bit chunks, as long as we have the right order and no other writes to
the same page of registers come in between.  So on 32-bit architectures,
the driver must use a spinlock around the pair of 32-bit writes (see
drivers/infiniband/hw/mthca/mthca_doorbell.h for the code).  And the
simple fact is that if that driver used "#ifdef writeq" (instead of "#if
BITS_PER_LONG == 64" as it actually does) then it would be broken on
32-bit x86 right now.
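
The locking pattern can be sketched in userspace as follows.  This is
not the mthca code (that lives in mthca_doorbell.h); struct
fake_doorbell, fake_writel() and write64_locked() are hypothetical
names, a C11 atomic_flag stands in for the kernel spinlock, and the
"offset 0 first, then offset 4" order is an assumption made for the
example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit doorbell made of two 32-bit words. */
struct fake_doorbell {
	uint32_t word[2];	/* two adjacent 32-bit "registers" */
	unsigned int log[8];	/* byte offsets, in the order written */
	unsigned int nlog;
};

/* Userspace substitute for the driver's spinlock. */
static atomic_flag doorbell_lock = ATOMIC_FLAG_INIT;

/* Stand-in for writel(): stores one word and records which offset was hit. */
static void fake_writel(struct fake_doorbell *db, uint32_t val,
			unsigned int offset)
{
	db->word[offset / 4] = val;
	if (db->nlog < 8)
		db->log[db->nlog++] = offset;
}

/*
 * Emit the two halves back to back, in a fixed order, with the lock
 * held so that no other writer can slip in between them.
 */
static void write64_locked(struct fake_doorbell *db, uint32_t first,
			   uint32_t second)
{
	while (atomic_flag_test_and_set_explicit(&doorbell_lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */

	fake_writel(db, first, 0);
	fake_writel(db, second, 4);

	atomic_flag_clear_explicit(&doorbell_lock, memory_order_release);
}
```

A driver that tested "#ifdef writeq" would skip this locked path as soon
as the architecture supplied a writeq() of its own, even when that
writeq() is just two unlocked 32-bit writes.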

 > > So I would strongly suggest reverting 2c5643b1 since as far as I 
 > > can tell it just sets a trap for subtle bugs that only show up on 
 > > 32-bit x86 [...]

 > Heh. It "only" shows up on the platform that ~80% of all our kernel 
 > testers use? ;-)

Well, most of the drivers using readq()/writeq() are probably driving
"high-end" hardware (InfiniBand, 10G ethernet, "enterprise" SCSI) that
is much more tilted to 64-bit architectures.  But yes, such bugs would
probably be seen quickly -- but the effort to debug "works on x86-64,
fails on x86-32 under high load" bugs is pretty big, given that the
symptoms of non-atomic access to a 64-bit register are probably pretty
mysterious (you can read about how the niu bug I mentioned was fixed --
it took a while to zero in on the root cause).

 > So, are you arguing for a per driver definition of readq/writeq? If 
 > so then that does not make much technical sense. If not ... then 
 > what is your technical point?

Yes, I am arguing for exactly that, because dealing with the semantics
of non-atomic access to 64-bit registers involves low-level knowledge of
the specific hardware being driven.

As it stands, 32-bit x86 has readq()/writeq() that are subtly different
from those of all 64-bit architectures, in a way that sets a booby trap
for any driver that uses them.  So yes, I stick to my original point
that the commit that added them for 32-bit x86 should be reverted.

 - R.