Message-ID: <4EFDF470.9050104@linux.vnet.ibm.com>
Date:	Fri, 30 Dec 2011 11:27:12 -0600
From:	Seth Jennings <sjenning@...ux.vnet.ibm.com>
To:	Dan Magenheimer <dan.magenheimer@...cle.com>
CC:	Greg Kroah-Hartman <gregkh@...e.de>,
	Brian King <brking@...ux.vnet.ibm.com>,
	devel@...verdev.osuosl.org, linux-kernel@...r.kernel.org,
	Konrad Wilk <konrad.wilk@...cle.com>,
	Nitin Gupta <ngupta@...are.org>
Subject: Re: [PATCH] staging: zcache: fix serialization bug in zv stats

On 12/30/2011 11:02 AM, Dan Magenheimer wrote:
>> From: Seth Jennings [mailto:sjenning@...ux.vnet.ibm.com]
>> Sent: Friday, December 30, 2011 9:42 AM
>> To: Greg Kroah-Hartman
>> Cc: Seth Jennings; Dan Magenheimer; Brian King; devel@...verdev.osuosl.org; linux-
>> kernel@...r.kernel.org
>> Subject: [PATCH] staging: zcache: fix serialization bug in zv stats
>>
>> In a multithreaded workload, the zv_curr_dist_counts
>> and zv_cumul_dist_counts statistics are being corrupted
>> because the increments and decrements in zv_create
>> and zv_free are not atomic.
>>
>> This patch converts these statistics and their corresponding
>> increments/decrements/reads to atomic operations.
>>
>> Based on v3.2-rc7
>>
>> Signed-off-by: Seth Jennings <sjenning@...ux.vnet.ibm.com>
> 
> I'm inclined to nack this change, at least unless it's inside an #ifdef DEBUG,
> as these counts are interesting to a developer but not useful to a normal
> end user, whereas the incremental cost of atomic_inc and atomic_dec is
> non-trivial.  I don't think any off-by-one in these counters could
> result in a bug and, before promotion from staging, they probably
> should just go away.  (They are fun to "watch -d" though ;-)

In my test, which hammers on a particular chunk size, the counters end
up off by hundreds :-/
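
The root cause is that the increments/decrements on the per-chunk
counters are plain read-modify-writes, so concurrent zv_create()/zv_free()
callers can lose updates.  The conversion in the patch follows the usual
pattern below (simplified sketch only; the helper names and array size
are illustrative, not the actual identifiers in the zcache code):

#include <linux/atomic.h>

#define SKETCH_NCHUNKS	17	/* illustrative size, not the real NCHUNKS */

/*
 * Before: a plain array along the lines of
 *	static unsigned long zv_curr_dist_counts[NCHUNKS];
 * bumped with zv_curr_dist_counts[chunks]++ / -- in zv_create()/zv_free().
 * That read-modify-write is not atomic, so concurrent callers lose updates.
 */

/* After: atomic counters, updated and read via the atomic_* helpers. */
static atomic_t zv_curr_dist_counts[SKETCH_NCHUNKS];

static inline void zv_curr_dist_inc(int chunks)
{
	atomic_inc(&zv_curr_dist_counts[chunks]);
}

static inline void zv_curr_dist_dec(int chunks)
{
	atomic_dec(&zv_curr_dist_counts[chunks]);
}

static inline unsigned long zv_curr_dist_read(int chunks)
{
	return (unsigned long)atomic_read(&zv_curr_dist_counts[chunks]);
}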

I too was worried about the performance impact; however, my tests showed
no degradation.  That's probably because there are bigger bottlenecks
elsewhere.

Perhaps we can commit this for now, so that the code is correct, and
revisit this when we try to replace zbud with zsmalloc.  I'm sure
we'll have to rethink the statistics at that time.

The only other option, IMO, is to remove the chunk stats altogether
until we can find a solution that is both fast and correct.
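
If the #ifdef DEBUG route you mention is preferred to removing them,
I'd picture something roughly like this (sketch only; the Kconfig
symbol and helper name are made up, and SKETCH_NCHUNKS is from the
sketch above):

#ifdef CONFIG_ZCACHE_DEBUG	/* made-up symbol, for illustration */
static atomic_t zv_curr_dist_counts[SKETCH_NCHUNKS];

static inline void zv_curr_dist_inc(int chunks)
{
	atomic_inc(&zv_curr_dist_counts[chunks]);
}
#else
static inline void zv_curr_dist_inc(int chunks)
{
	/* stats compiled out: no atomic op in the fast path */
}
#endif

That way the counters stay correct when they are built in, and cost
nothing when they are not.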

I think that continuing with incorrect stats, regardless of the degree
to which they are incorrect, isn't really a viable option.

--
Seth

