lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 4 Aug 2007 20:37:40 +0400
From:	Evgeniy Polyakov <>
To:	Daniel Phillips <>
	Peter Zijlstra <>
Subject: Re: Distributed storage.

On Fri, Aug 03, 2007 at 06:19:16PM -0700, Daniel Phillips ( wrote:
> It depends on the characteristics of the physical and virtual block 
> devices involved.  Slow block devices can produce surprising effects.  
> Ddsnap still qualifies as "slow" under certain circumstances (big 
> linear write immediately following a new snapshot). Before we added 
> throttling we would see as many as 800,000 bios in flight.  Nice to 

Mmm, sounds tasty to work with such a system :)

> know the system can actually survive this... mostly.  But memory 
> deadlock is a clear and present danger under those conditions and we 
> did hit it (not to mention that read latency sucked beyond belief). 
> Anyway, we added a simple counting semaphore to throttle the bio traffic 
> to a reasonable number and behavior became much nicer, but most 
> importantly, this satisfies one of the primary requirements for 
> avoiding block device memory deadlock: a strictly bounded amount of bio 
> traffic in flight.  In fact, we allow some bounded number of 
> non-memalloc bios *plus* however much traffic the mm wants to throw at 
> us in memalloc mode, on the assumption that the mm knows what it is 
> doing and imposes its own bound of in flight bios per device.   This 
> needs auditing obviously, but the mm either does that or is buggy.  In 
> practice, with this throttling in place we never saw more than 2,000 in 
> flight no matter how hard we hit it, which is about the number we were 
> aiming at.  Since we draw our reserve from the main memalloc pool, we 
> can easily handle 2,000 bios in flight, even under extreme conditions.
> See:
>     down(&info->throttle_sem);
> To be sure, I am not very proud of this throttling mechanism for various 
> reasons, but the thing is, _any_ throttling mechanism no matter how 
> sucky solves the deadlock problem.  Over time I want to move the 

make_request_fn is always called in process context, we can wait in it
for memory in mempool. Although that means we already in trouble.

I agree, any kind of high-boundary leveling must be implemented in
device itself, since block layer does not know what device is at the end
and what it will need to process given block request.

	Evgeniy Polyakov
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to
More majordomo info at

Powered by blists - more mailing lists