Message-ID: <alpine.LFD.1.00.0804021422570.14670@woody.linux-foundation.org>
Date:	Wed, 2 Apr 2008 14:31:46 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Mikulas Patocka <mikulas@...ax.karlin.mff.cuni.cz>
cc:	viro@...iv.linux.org.uk, linux-kernel@...r.kernel.org
Subject: Re: [PATCH]: Fix SMP-reordering race in mark_buffer_dirty

On Wed, 2 Apr 2008, Mikulas Patocka wrote:
> 
> So you're right, the gain from mfence is so small that you can remove
> it and use only test_set_buffer_dirty.

Well, I suspect that part of the issue is that quite often you end up 
with *both* barriers, because the buffer wasn't already dirty to begin 
with.
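
To make this concrete, the shape we're talking about is roughly this 
(hand-waving the details: the buffer_head helpers are the real ones, 
but this is a sketch, not the exact patch):

void mark_buffer_dirty(struct buffer_head *bh)
{
	/*
	 * Barrier so the dirty-bit test below cannot be reordered
	 * before the caller's earlier stores into the buffer (the
	 * SMP race this patch fixes).  On x86 this is mfence.
	 */
	smp_mb();
	if (buffer_dirty(bh))
		return;		/* already dirty: no locked RMW needed */

	/*
	 * Clean buffer: this path pays for *both* barriers, the
	 * smp_mb() above and the one implied by the locked RMW
	 * inside test_set_buffer_dirty().
	 */
	if (!test_set_buffer_dirty(bh))
		__set_page_dirty(bh->b_page, page_mapping(bh->b_page), 0);
}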

Re-dirtying a dirty buffer is pretty common for things like bitmap 
blocks etc., so it's probably a worthy optimization if it has no cost, 
and on Core 2 I suspect your version is worth it, but it's not 
necessarily going to be a 99% kind of case. I suspect quite a lot of 
the mark_buffer_dirty() calls are actually on clean buffers.

(Of course, a valid argument is that if it was already dirty, we'll skip 
the other expensive parts, so only the "already dirty" case is worth 
optimizing for. Maybe true. There might also be cases where it means one 
less dirty cacheline in memory.)

> I don't know if there are other architectures where smp_mb() would be 
> significantly faster than test_and_set_bit().

Probably none, since test_and_set_bit() implies an smp_mb(), and 
generally the bigger cost is in the barrier, not in the bit setting 
itself.
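
On x86 the "lock" prefix on a read-modify-write instruction is itself a 
full barrier, which is exactly why test_and_set_bit() doesn't need a 
separate fence. A userspace toy version (illustrative only, assuming 
x86-64 and gcc-style inline asm; this is not the kernel's actual 
implementation):

#include <stdio.h>

/*
 * Toy x86-64 test-and-set, illustrative only.  The "lock" prefix
 * makes the read-modify-write atomic *and* acts as a full memory
 * barrier, so no separate mfence is needed around it.
 */
static inline int tas_bit(volatile unsigned long *addr, int nr)
{
	unsigned char oldbit;

	asm volatile("lock btsl %2, %1\n\t"
		     "setc %0"
		     : "=r"(oldbit), "+m"(*addr)
		     : "Ir"(nr)
		     : "cc", "memory");
	return oldbit;
}

int main(void)
{
	volatile unsigned long word = 0;

	printf("first set:  %d\n", tas_bit(&word, 3)); /* 0: was clear */
	printf("second set: %d\n", tas_bit(&word, 3)); /* 1: already set */
	return 0;
}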

Core 2 is the outlier in having a noticeably faster "mfence" than 
atomic instructions (and judging by the noises Intel makes, Nehalem 
will undo that outlier).
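
If you want to eyeball the difference on a particular CPU, a rough 
userspace microbenchmark along these lines will do; this is just a 
sketch, assuming x86-64 with gcc/clang and __rdtsc() from x86intrin.h, 
and the absolute cycle counts will vary by microarchitecture:

#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>

#define ITERS 10000000UL

static volatile long word;

/* Time one operation, averaged over ITERS iterations, in TSC cycles. */
static double cycles_per_op(void (*op)(void))
{
	uint64_t start = __rdtsc();
	for (unsigned long i = 0; i < ITERS; i++)
		op();
	return (double)(__rdtsc() - start) / ITERS;
}

static void do_mfence(void)
{
	asm volatile("mfence" ::: "memory");
}

static void do_locked_bts(void)
{
	/* Locked RMW: what test_and_set_bit() boils down to on x86. */
	asm volatile("lock btsl $0, %0" : "+m"(word) : : "cc", "memory");
}

int main(void)
{
	printf("mfence:   %.1f cycles/op\n", cycles_per_op(do_mfence));
	printf("lock bts: %.1f cycles/op\n", cycles_per_op(do_locked_bts));
	return 0;
}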

				Linus