Message-ID: <CA+55aFyXdtVJ=SG9bU4qfggCR6DPvz4vOFYxKJZx_WdyFv+3Fw@mail.gmail.com>
Date:	Wed, 28 Nov 2012 12:03:23 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Mikulas Patocka <mpatocka@...hat.com>
Cc:	Jens Axboe <axboe@...nel.dk>,
	Jeff Chua <jeff.chua.linux@...il.com>,
	Lai Jiangshan <laijs@...fujitsu.com>, Jan Kara <jack@...e.cz>,
	lkml <linux-kernel@...r.kernel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH] Introduce a method to catch mmap_region (was: Recent
 kernel "mount" slow)

On Wed, Nov 28, 2012 at 11:50 AM, Mikulas Patocka <mpatocka@...hat.com> wrote:
>
> mmap_region() doesn't care about the block size. But a lot of
> page-in/page-out code does.

That seems a bogus argument.

mmap() is in *no* way special. The exact same thing happens for
regular read/write. Yet somehow the mmap code is special-cased, while
the normal read-write code is not.

I suspect it might be *easier* to trigger some issues with mmap, but
that still isn't a good enough reason to special-case it. We don't add
locking to one place just because that one place shows some race
condition more easily. We fix the locking.

So for example, maybe the code that *actually* cares about the buffer
size (the stuff that allocates buffers in fs/buffer.c) needs to take
that new percpu read lock. Basically, any caller of
"alloc_page_buffers()/create_empty_buffers()" or whatever.

I also wonder whether we need it *at*all*. I suspect that we could
easily have multiple block-sizes these days for the same block device.
It *used* to be (millions of years ago, when dinosaurs roamed the
earth) that the block buffers were global and shared with all users of
a partition. But that hasn't been true since we started using the page
cache, and I suspect that some of the block size changing issues are
simply entirely stale.

Yeah, yeah, there could be some coherency issues if people write to
the block device through different block sizes, but I think we have
those coherency issues anyway. The page-cache is not coherent across
different mapping inodes anyway.

So I really suspect that some of this is "legacy logic". Or at least
perhaps _should_ be.

                    Linus