Date:	Fri, 21 Nov 2008 17:10:19 +0000
From:	Catalin Marinas <catalin.marinas@....com>
To:	Russell King - ARM Linux <linux@....linux.org.uk>
Cc:	Nick Piggin <nickpiggin@...oo.com.au>,
	linux-fsdevel@...r.kernel.org, Naval Saini <navalnovel@...il.com>,
	linux-arch@...r.kernel.org,
	linux-arm-kernel@...ts.arm.linux.org.uk,
	linux-kernel@...r.kernel.org, naval.saini@....com
Subject: Re: O_DIRECT patch for processors with VIPT cache for mainline
	kernel (specifically arm in our case)

On Thu, 2008-11-20 at 09:19 +0000, Russell King - ARM Linux wrote:
> On Thu, Nov 20, 2008 at 05:59:00PM +1100, Nick Piggin wrote:
> > - The page is sent to the block layer, which stores into the page. Some
> >   block devices like 'brd' will potentially store via the kernel linear map
> >   here, and they probably don't do enough cache flushing. But a regular
> >   block device should go via DMA, which AFAIK should be OK? (the user address
> >   should remain invalidated because it would be a bug to read from the buffer
> >   before the read has completed)
> 
> This is where things get icky with lots of drivers - DMA is fine, but
> many PIO based drivers don't handle the implications of writing to the
> kernel page cache page when there may be CPU cache side effects.

And for PIO devices, what cache flushing function should be used when
the struct page isn't available (maybe only a pointer to a buffer is),
so flush_dcache_page can't be called? Should this be done at the
filesystem layer?
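
Just to illustrate what I mean (an untested sketch, assuming
<asm/cacheflush.h> and <asm/page.h>; the helper name is made up): if
all the driver has is a lowmem kernel virtual address, the page could
in principle be recovered with virt_to_page(), though that falls
apart for vmalloc/highmem buffers:

	/*
	 * Untested sketch, not a proposal: recover the struct page
	 * from a lowmem kernel virtual address so flush_dcache_page()
	 * can still be called.  virt_to_page() is only valid for the
	 * linear mapping, so vmalloc and highmem buffers are not
	 * covered - which is part of the problem.
	 */
	static void pio_flush_buffer(void *buf, size_t len)
	{
		unsigned long addr = (unsigned long)buf & PAGE_MASK;
		unsigned long end = (unsigned long)buf + len;

		for (; addr < end; addr += PAGE_SIZE)
			flush_dcache_page(virt_to_page((void *)addr));
	}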

> If the cache is in read allocate mode, then in this case there shouldn't
> be any dirty cache lines.  (That's not always the case though, esp. via
> conventional IO.)  If the cache is in write allocate mode, PIO data will
> sit in the kernel mapping and won't be visible to userspace.

This problem re-appeared in our tests when someone started using an
ext2 filesystem on mtd+slram with write-allocate caches. Basically,
the D-cache isn't flushed to ensure its coherency with the I-cache.

The very dirty workaround was to call flush_icache_range (I know,
it's meant for something completely different) in mtd_blkdevs.c.
slram.c doesn't have access to any struct page information either,
and I noticed that very few block devices call flush_dcache_page
directly. Should this be done by the filesystem layer instead?
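
To make the hack concrete, it sits in the read loop of
do_blktrans_request(), roughly like this (quoted from memory, so
treat the surrounding code as illustrative rather than an exact
hunk):

	case READ:
		for (; nsect > 0; nsect--, block++, buf += tr->blksize) {
			if (tr->readsect(dev, block, buf))
				return 0;
			/*
			 * The hack: flush_icache_range() on the just-
			 * filled buffer forces the D-cache data out and
			 * keeps it coherent with the I-cache, although
			 * the function is really meant for freshly
			 * written code, not page cache data.
			 */
			flush_icache_range((unsigned long)buf,
					   (unsigned long)buf +
					   tr->blksize);
		}
		return 1;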

It looks to me like some filesystems, cramfs for example, call
flush_dcache_page in their "readpage" functions, but ext2 doesn't.
My other, less dirty workaround for the mtd+slram problem is below.
It appears to solve the problem, though I'm not sure it's the right
solution (ext2 goes through mpage_readpages).

diff --git a/fs/mpage.c b/fs/mpage.c
index 552b80b..979a4a9 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -51,6 +51,7 @@ static void mpage_end_io_read(struct bio *bio, int err)
 			prefetchw(&bvec->bv_page->flags);
 
 		if (uptodate) {
+			flush_dcache_page(page);
 			SetPageUptodate(page);
 		} else {
 			ClearPageUptodate(page);
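
For reference, the cramfs pattern I mean is the tail of its readpage,
roughly (paraphrased from memory, not an exact quote of
fs/cramfs/inode.c):

	/* ...decompress the block into the kmap()'d page... */
	kunmap(page);
	flush_dcache_page(page);	/* filesystem flushes it itself */
	SetPageUptodate(page);
	unlock_page(page);
	return 0;

Doing it per-filesystem as cramfs does would also work for ext2, but
the mpage change above covers all mpage_readpages users at once.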


Thanks.

-- 
Catalin
