[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20061224212113.GA31813@mellanox.co.il>
Date: Sun, 24 Dec 2006 23:21:13 +0200
From: "Michael S. Tsirkin" <mst@...lanox.co.il>
To: Linus Torvalds <torvalds@...l.org>
Cc: Andrei Popa <andrei.popa@...eo.ro>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Andrew Morton <akpm@...l.org>,
Gordon Farquharson <gordonfarquharson@...il.com>,
Martin Michlmayr <tbm@...ius.com>,
Hugh Dickins <hugh@...itas.com>,
Nick Piggin <nickpiggin@...oo.com.au>,
Arjan van de Ven <arjan@...radead.org>,
openib-general@...nib.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: fix page_mkclean_one (was: 2.6.19 file content corruption on ext3)
> Quoting Linus Torvalds <torvalds@...l.org>:
> Subject: Re: [PATCH] mm: fix page_mkclean_one (was: 2.6.19 file content corruption on ext3)
>
> Peter, tell me I'm crazy, but with the new rules, the following condition
> is a bug:
>
> - shared mapping
> - writable
> - not already marked dirty in the PTE
>
> because that combination means that the hardware can mark the PTE dirty
> without us even realizing (and thus not marking the "struct page *"
> dirty).
Er.
Sorry about bumping in, and I'm not sure I understand all of the discussion,
but this reminded me of an old issue with COW that created what looks
like a vaguely similiar data corruption on infiniband. We solved this for
infiniband with MADV_DONTFORK, but I always wondered why does it not affect
other parts of kernel. Small reminder from that discussion:
down mmap sem
get user pages
up mmap sem
page becomes shared, and COW (e.g. fork)
process writes to first byte of page <----- gets a copy
Now we had a problem: struct page that we got from get user pages
does not point to a correct page in our process.
For example: if at some point we map this page for DMA, and
hardware writes to last byte of page -----> process does not
see this data.
So for infiniband, what we do is a combination of
- prevent page from becoming COW while hardware might DMA to this page, and
- ask users not to write to page if hardware might DMA to same page
(even if its using different bytes).
I just wandered - is there some chance something like this could be happening in
the fs code?
HTH,
--
MST
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists