linux-kernel - Re: [GIT PULL] writeback changes for 3.5-rc1


Message-ID: <CA+55aFxHt8q8+jQDuoaK=hObX+73iSBTa4bBWodCX3s-y4Q1GQ@mail.gmail.com>
Date:	Mon, 28 May 2012 10:09:56 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Fengguang Wu <fengguang.wu@...el.com>
Cc:	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PULL] writeback changes for 3.5-rc1

Ok, pulled.

However, I have an independent question for you - have you looked at
any kind of per-file write-behind logic?

The reason I ask is that pretty much every time I write some big file
(usually when over-writing a harddisk), I tend to use my own hackish
model, which looks like this:

#define BUFSIZE (8*1024*1024ul)

        ...
        for (..) {
                ...
                if (write(fd, buffer, BUFSIZE) != BUFSIZE)
                        break;
                sync_file_range(fd, index*BUFSIZE, BUFSIZE,
                                SYNC_FILE_RANGE_WRITE);
                if (index)
                        sync_file_range(fd, (index-1)*BUFSIZE, BUFSIZE,
                                        SYNC_FILE_RANGE_WAIT_BEFORE |
                                        SYNC_FILE_RANGE_WRITE |
                                        SYNC_FILE_RANGE_WAIT_AFTER);
                ....

and it tends to be *beautiful* for both disk IO performance and for
system responsiveness while the big write is in progress.
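
For readers who want to try the pattern, here is a minimal self-contained
sketch of the same loop as a function. It is not the code from the message:
the name write_behind(), its parameters, and the decision to ignore errors
from sync_file_range() (treating it as a best-effort hint) are assumptions
made for illustration. It also assumes a Linux system with glibc and that
fd is positioned at offset 0 when the function is called.

```c
#define _GNU_SOURCE             /* exposes sync_file_range() in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and signature are illustrative only).
 * Writes up to `nblocks` copies of `buf` to `fd`, which is assumed to be
 * positioned at offset 0. After each block it starts writeout of that
 * block, then waits for the writeout of the block before it.
 * Returns the number of complete blocks written.
 */
static size_t write_behind(int fd, const void *buf, size_t bufsize,
                           size_t nblocks)
{
        size_t index;

        assert(buf != NULL && bufsize > 0);

        for (index = 0; index < nblocks; index++) {
                off_t start = (off_t)index * (off_t)bufsize;

                /* Stop on error or short write, as the original loop does. */
                if (write(fd, buf, bufsize) != (ssize_t)bufsize)
                        break;

                /* Start asynchronous writeout of the block just written. */
                (void)sync_file_range(fd, start, (off_t)bufsize,
                                      SYNC_FILE_RANGE_WRITE);

                /* Wait until the previous block has been written out. */
                if (index)
                        (void)sync_file_range(fd, start - (off_t)bufsize,
                                              (off_t)bufsize,
                                              SYNC_FILE_RANGE_WAIT_BEFORE |
                                              SYNC_FILE_RANGE_WRITE |
                                              SYNC_FILE_RANGE_WAIT_AFTER);
        }
        return index;
}
```

A caller would open the target, fill an 8MB buffer, and call something like
write_behind(fd, buffer, 8*1024*1024ul, count). At most two blocks' worth of
dirty data is then outstanding for the file at any time. Note that
sync_file_range() does not flush file metadata, so a caller that needs
durability would still want fsync() at the end.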

And I'm wondering if we couldn't expose this kind of write-behind
logic from the kernel. Sure, it only works for the "contiguous write
of a single large file" model, but that model isn't actually all
*that* unusual.

Right now all the write-back logic is based on the
balance_dirty_pages() model, which is more of a global dirty model.
Which obviously is needed too - this isn't an "either or" kind of
thing, it's more of a "maybe we could have a streaming detector *and*
the 'random writes' code". So I was wondering if anybody had ever been
looking more at an explicit write-behind model that uses the same kind
of "per-file window" that the read-ahead code does.
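
To make the "streaming detector" idea concrete, here is a rough user-space
sketch of per-file state that notices contiguous writes. Everything in it is
hypothetical: struct wb_window, wb_note_write(), and the 1MB threshold are
invented for illustration and do not correspond to any existing kernel
structure or interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical per-file window state; not a real kernel structure. */
struct wb_window {
        off_t  next_offset;     /* where a contiguous write would begin */
        size_t streak;          /* bytes written contiguously so far */
};

/* Arbitrary illustrative cutoff for calling a file "streaming". */
#define WB_STREAM_THRESHOLD (1024 * 1024ul)

/*
 * Record a write of `len` bytes at `offset`. Returns true once the
 * current run of contiguous writes has reached the threshold, i.e.
 * when write-behind for this file might be worth starting.
 */
static bool wb_note_write(struct wb_window *w, off_t offset, size_t len)
{
        assert(w != NULL);

        if (offset == w->next_offset)
                w->streak += len;       /* continues the current run */
        else
                w->streak = len;        /* random write: restart the run */

        w->next_offset = offset + (off_t)len;
        return w->streak >= WB_STREAM_THRESHOLD;
}
```

With a zero-initialized window, two back-to-back 512KB writes starting at
offset 0 reach the threshold, and a later write at an unrelated offset
resets the run, leaving such files to the global dirty-page logic.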

(The above code only works well for known streaming writes, but the
*model* of saying "ok, let's start writeout for the previous streaming
block, and then wait for the writeout of the streaming block before
that" really does tend to result in very smooth IO and minimal
disruption of other processes..)

                      Linus

On Mon, May 28, 2012 at 4:41 AM, Fengguang Wu <fengguang.wu@...el.com> wrote:
>
> Please pull the writeback changes, mainly from Jan Kara to avoid
> iput() in the flusher threads.