Message-ID: <alpine.LFD.1.00.0804021422570.14670@woody.linux-foundation.org>
Date:	Wed, 2 Apr 2008 14:31:46 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Mikulas Patocka <mikulas@...ax.karlin.mff.cuni.cz>
cc:	viro@...iv.linux.org.uk, linux-kernel@...r.kernel.org
Subject: Re: [PATCH]: Fix SMP-reordering race in mark_buffer_dirty

On Wed, 2 Apr 2008, Mikulas Patocka wrote:
> 
> So you're right, the gain from mfence is so small that you can remove
> it and use only test_set_buffer_dirty.

Well, I suspect that part of the issue is that quite often you end up 
with *both* barriers, because the buffer wasn't already dirty to begin 
with.
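
To make this concrete, the shape we're talking about is roughly this 
(hand-waving the details: the buffer_head helpers are the real ones, 
but this is a sketch, not the exact patch):

void mark_buffer_dirty(struct buffer_head *bh)
{
	/*
	 * Barrier so the dirty-bit test below cannot be reordered
	 * before the caller's earlier stores into the buffer (the
	 * SMP race this patch fixes).  On x86 this is mfence.
	 */
	smp_mb();
	if (buffer_dirty(bh))
		return;		/* already dirty: no locked RMW needed */

	/*
	 * Clean buffer: this path pays for *both* barriers, the
	 * smp_mb() above and the one implied by the locked RMW
	 * inside test_set_buffer_dirty().
	 */
	if (!test_set_buffer_dirty(bh))
		__set_page_dirty(bh->b_page, page_mapping(bh->b_page), 0);
}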

Re-dirtying a dirty buffer is pretty common for things like bitmap 
blocks etc., so it's probably a worthy optimization if it has no cost, 
and on Core 2 I suspect your version is worth it, but it's not 
necessarily going to be a 99% kind of case. I suspect quite a lot of 
the mark_buffer_dirty() calls are actually on clean buffers.

(Of course, a valid argument is that if it was already dirty, we'll skip 
the other expensive parts, so only the "already dirty" case is worth 
optimizing for. Maybe true. There might also be cases where it means one 
less dirty cacheline in memory.)

> I don't know if there are other architectures where smp_mb() would be 
> significantly faster than test_and_set_bit().

Probably none, since test_and_set_bit() implies an smp_mb(), and 
generally the bigger cost is in the barrier, not in the bit setting 
itself.
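
On x86 the "lock" prefix on a read-modify-write instruction is itself a 
full barrier, which is exactly why test_and_set_bit() doesn't need a 
separate fence. A userspace toy version (illustrative only, assuming 
x86-64 and gcc-style inline asm; this is not the kernel's actual 
implementation):

#include <stdio.h>

/*
 * Toy x86-64 test-and-set, illustrative only.  The "lock" prefix
 * makes the read-modify-write atomic *and* acts as a full memory
 * barrier, so no separate mfence is needed around it.
 */
static inline int tas_bit(volatile unsigned long *addr, int nr)
{
	unsigned char oldbit;

	asm volatile("lock btsl %2, %1\n\t"
		     "setc %0"
		     : "=r"(oldbit), "+m"(*addr)
		     : "Ir"(nr)
		     : "cc", "memory");
	return oldbit;
}

int main(void)
{
	volatile unsigned long word = 0;

	printf("first set:  %d\n", tas_bit(&word, 3)); /* 0: was clear */
	printf("second set: %d\n", tas_bit(&word, 3)); /* 1: already set */
	return 0;
}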

Core 2 is the outlier in having a noticeably faster "mfence" than 
atomic instructions (and judging by the noises Intel makes, Nehalem 
will undo that outlier).
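
If you want to eyeball the difference on a particular CPU, a rough 
userspace microbenchmark along these lines will do; this is just a 
sketch, assuming x86-64 with gcc/clang and __rdtsc() from x86intrin.h, 
and the absolute cycle counts will vary by microarchitecture:

#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>

#define ITERS 10000000UL

static volatile long word;

/* Time one operation, averaged over ITERS iterations, in TSC cycles. */
static double cycles_per_op(void (*op)(void))
{
	uint64_t start = __rdtsc();
	for (unsigned long i = 0; i < ITERS; i++)
		op();
	return (double)(__rdtsc() - start) / ITERS;
}

static void do_mfence(void)
{
	asm volatile("mfence" ::: "memory");
}

static void do_locked_bts(void)
{
	/* Locked RMW: what test_and_set_bit() boils down to on x86. */
	asm volatile("lock btsl $0, %0" : "+m"(word) : : "cc", "memory");
}

int main(void)
{
	printf("mfence:   %.1f cycles/op\n", cycles_per_op(do_mfence));
	printf("lock bts: %.1f cycles/op\n", cycles_per_op(do_locked_bts));
	return 0;
}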

				Linus