linux-kernel - Re: [Lsf] Postgresql performance problems with IO latency, especially during fsync()


Message-ID: <CALCETrWQhTodxYmmCPqqH0n3aD7dCj+_xOF-DL8SGGU0d4GpJg@mail.gmail.com>
Date:	Wed, 26 Mar 2014 16:28:18 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	David Lang <david@...g.hm>
Cc:	Andres Freund <andres@...quadrant.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Linux FS Devel <linux-fsdevel@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	lsf@...ts.linux-foundation.org,
	Wu Fengguang <fengguang.wu@...el.com>, rhaas@...razel.de
Subject: Re: [Lsf] Postgresql performance problems with IO latency, especially
 during fsync()

On Wed, Mar 26, 2014 at 4:11 PM, Andy Lutomirski <luto@...capital.net> wrote:
> On Wed, Mar 26, 2014 at 3:35 PM, David Lang <david@...g.hm> wrote:
>> On Wed, 26 Mar 2014, Andy Lutomirski wrote:
>>
>>>>> I'm not sure I understand the request queue stuff, but here's an idea.
>>>>>  The block core contains this little bit of code:
>>>>
>>>>
>>>> I haven't read enough of the code yet, to comment intelligently ;)
>>>
>>>
>>> My little patch doesn't seem to help.  I'm either changing the wrong
>>> piece of code entirely or I'm penalizing readers and writers too much.
>>>
>>> Hopefully some real block layer people can comment as to whether a
>>> refinement of this idea could work.  The behavior I want is for
>>> writeback to be limited to using a smallish fraction of the total
>>> request queue size -- I think that writeback should be able to enqueue
>>> enough requests to get decent sorting performance but not enough
>>> requests to prevent the io scheduler from doing a good job on
>>> non-writeback I/O.
>>
>>
>> The thing is that if there are no reads that are waiting, why not use every
>> bit of disk I/O available to write? If you can do that reliably with only
>> using part of the queue, fine, but aren't you getting fairly close to just
>> having separate queues for reading and writing with such a restriction?
>>
>
> Hmm.
>
> I wonder what the actual effect of queue length is on throughput.  I
> suspect that using half the queue gives you well over half the
> throughput as long as the queue isn't tiny.
>
> I'm not so sure I'd go so far as having separate reader and writer
> queues -- I think that small synchronous writes should also not get
> stuck behind large writeback storms, but maybe that's something that
> can be a secondary goal.  That being said, separate reader and writer
> queues might solve the immediate problem.  It won't help for the case
> where a small fsync blocks behind writeback, though, and that seems to
> be a very common cause of Firefox freezing on my system.
>
> Is there an easy way to do a proof-of-concept?  It would be great if
> there was a ten-line patch that implemented something like this
> correctly enough to see if it helps.  I don't think I'm the right
> person to do it, because my knowledge of the block layer code is
> essentially nil.

I think it's at least a bit more subtle than this.  cfq distinguishes
SYNC and ASYNC, but very large fsyncs are presumably SYNC.  Deadline
pays no attention to rw flags.
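
For anyone who wants to check which scheduler and queue depth their own
test box is using before comparing numbers, here is a minimal sketch.
It assumes the usual sysfs layout (/sys/block/<dev>/queue/scheduler,
with the active scheduler shown in brackets, and
/sys/block/<dev>/queue/nr_requests). The helper names are mine and the
device name is whatever you pass in.

```python
from pathlib import Path


def active_scheduler(text):
    """Return the bracketed (active) name from a sysfs scheduler line.

    Example input: "noop deadline [cfq]". Returns None if no name is
    bracketed.
    """
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


def queue_settings(dev):
    """Read the active scheduler and nr_requests for a block device.

    `dev` is a device name such as "sda" (an assumption; use your own).
    """
    queue = Path("/sys/block") / dev / "queue"
    return {
        "scheduler": active_scheduler((queue / "scheduler").read_text()),
        "nr_requests": int((queue / "nr_requests").read_text()),
    }
```

Calling queue_settings("sda") on a machine that has that device returns
a small dict with the two values; on any other machine it raises
FileNotFoundError.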

Anyway, it seems like there's basically nothing prioritizing what
happens when the number of requests exceeds the congestion thresholds.
I'd happily bet a beverage* that Postgres's slow requests are
spending an excessive amount of time waiting to get into the queue in
the first place.
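
To make the "waiting to get into the queue" point concrete, here is a
toy model, not a description of what the block layer actually does. It
assumes a queue with a fixed number of slots, a device that completes
one request per tick, and a writeback submitter that always refills the
queue before anyone else gets a chance (my pessimistic stand-in for
"nothing prioritizing"). The function name and parameters are made up
for illustration.

```python
def admission_wait(slots, wb_cap, backlog, arrival=0):
    """Ticks a single small request waits before it gets a queue slot.

    slots   -- total request slots in the (toy) queue
    wb_cap  -- most slots writeback is allowed to occupy at once
    backlog -- writeback requests waiting to be submitted at tick 0
    arrival -- tick at which the small request shows up
    """
    if slots < 1:
        raise ValueError("queue needs at least one slot")
    queued = 0  # writeback requests currently holding a slot
    t = 0
    while True:
        # Writeback refills first, up to its cap (assumption of the model).
        while backlog > 0 and queued < min(wb_cap, slots):
            queued += 1
            backlog -= 1
        # The small request is admitted as soon as any slot is free.
        if t >= arrival and queued < slots:
            return t - arrival
        # The device completes one queued request per tick.
        if queued > 0:
            queued -= 1
        t += 1
```

With 4 slots and a backlog of 10 writeback requests,
admission_wait(4, 4, 10) gives 7 ticks, while capping writeback at half
the queue, admission_wait(4, 2, 10), gives 0. In this model the wait
with an uncapped writeback grows linearly with the backlog, which is
the behavior I'm betting on; whether the real queue behaves this way is
exactly what needs measuring.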

* Since I'm back home now, any actual beverage transaction will be
rather delayed.