Message-ID: <20190315033821.GC11334@mit.edu>
Date: Thu, 14 Mar 2019 23:38:21 -0400
From: "Theodore Ts'o" <tytso@....edu>
To: Lukas Czerner <lczerner@...hat.com>
Cc: linux-ext4@...r.kernel.org, Frank Sorenson <fsorenso@...hat.com>,
stable@...r.kernel.org
Subject: Re: [PATCH] ext4: Fix data corruption caused by unaligned direct AIO
On Wed, Mar 06, 2019 at 12:06:42PM +0100, Lukas Czerner wrote:
> Ext4 needs to serialize unaligned direct AIO because the zeroing of
> partial blocks of two competing unaligned AIOs can result in data
> corruption.
>
> However it decides not to serialize if the potentially unaligned aio is
> past i_size with the rationale that no pending writes are possible past
> i_size. Unfortunately if the i_size is not block aligned and the second
> unaligned write lands past i_size, but still into the same block, it has
> the potential of corrupting the previous unaligned write to the same
> block.
>
> This is a (very simplified) reproducer from Frank:
>
> // 41472 = (10 * 4096) + 512
> // 37376 = 41472 - 4096
>
> ftruncate(fd, 41472);
> io_prep_pwrite(iocbs[0], fd, buf[0], 4096, 37376);
> io_prep_pwrite(iocbs[1], fd, buf[1], 4096, 41472);
>
> io_submit(io_ctx, 1, &iocbs[0]);
> io_submit(io_ctx, 1, &iocbs[1]);
>
> io_getevents(io_ctx, 2, 2, events, NULL);
>
> Without this patch the 512B range from 40960 up to the start of the
> second unaligned write (41472) is going to be zeroed, overwriting the data
> written by the first write. This is a data corruption.
>
> 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> *
> 00009200 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
> *
> 0000a000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> *
> 0000a200 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31
>
> With this patch the data corruption is avoided because we will recognize
> the unaligned_aio and wait for the unwritten extent conversion.
>
> 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> *
> 00009200 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
> *
> 0000a200 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31
> *
> 0000b200
>
> Reported-by: Frank Sorenson <fsorenso@...hat.com>
> Signed-off-by: Lukas Czerner <lczerner@...hat.com>
> Fixes: e9e3bcecf44c ("ext4: serialize unaligned asynchronous DIO")
> Cc: <stable@...r.kernel.org>
Thanks, applied.
- Ted
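
For readers following the offsets in the quoted description, here is a
minimal sketch in plain C of the arithmetic involved. It is an in-memory
model only, not the ext4 code and not the patch itself: the helper names
(is_unaligned, shares_block_with_isize, write_with_partial_zeroing) are
hypothetical, and the 4096-byte block size is an assumption taken from the
reproducer's comments.

```c
#include <stdbool.h>
#include <string.h>

/* Assumption: 4 KiB filesystem blocks, as in the reproducer's comments. */
#define BLOCK_SIZE 4096UL

/* True if the write does not both start and end on a block boundary. */
bool is_unaligned(unsigned long off, unsigned long len)
{
    return (off % BLOCK_SIZE) != 0 || ((off + len) % BLOCK_SIZE) != 0;
}

/*
 * True if a write starting at or past i_size still begins inside the
 * block that holds the tail of the file. This is only possible when
 * i_size itself is not block aligned.
 */
bool shares_block_with_isize(unsigned long off, unsigned long i_size)
{
    return off >= i_size &&
           (i_size % BLOCK_SIZE) != 0 &&
           (off / BLOCK_SIZE) == (i_size / BLOCK_SIZE);
}

/*
 * Hypothetical model of an unserialized unaligned write: the head of
 * the first block, up to the write offset, is zeroed before the data
 * is copied in. "file" is an ordinary buffer standing in for the file.
 */
void write_with_partial_zeroing(unsigned char *file, unsigned long off,
                                unsigned long len, unsigned char fill)
{
    unsigned long block_start = off - (off % BLOCK_SIZE);

    memset(file + block_start, 0, off - block_start); /* zero partial head */
    memset(file + off, fill, len);                    /* then the payload  */
}
```

With the reproducer's numbers, both writes are unaligned and
shares_block_with_isize(41472, 41472) is true, since 41472 = 10 * 4096 + 512
lies inside block 10. Filling a buffer with '0' at 37376..41471 and then
calling write_with_partial_zeroing(buf, 41472, 4096, '1') zeroes bytes
40960..41471, which matches the 0000a000 run of zeroes in the first dump.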