Date:   Mon, 18 Oct 2021 11:43:49 -0700
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Zhengyuan Liu <liuzhengyuang521@...il.com>
Cc:     viro@...iv.linux.org.uk, tytso@....edu,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
        mysql@...ts.mysql.com, linux-ext4@...r.kernel.org,
        刘云 <liuyun01@...inos.cn>,
        Zhengyuan Liu <liuzhengyuan@...inos.cn>
Subject: Re: Problem with direct IO

On Mon, 18 Oct 2021 09:09:06 +0800 Zhengyuan Liu <liuzhengyuang521@...il.com> wrote:

> Ping.
> 
> I think this problem is serious, and others may also encounter it in
> the future.
> 
> 
> On Wed, Oct 13, 2021 at 9:46 AM Zhengyuan Liu
> <liuzhengyuang521@...il.com> wrote:
> >
> > Hi, all
> >
> > We are encountering the following MySQL crash while importing tables:
> >
> >     2021-09-26T11:22:17.825250Z 0 [ERROR] [MY-013622] [InnoDB] [FATAL]
> >     fsync() returned EIO, aborting.
> >     2021-09-26T11:22:17.825315Z 0 [ERROR] [MY-013183] [InnoDB]
> >     Assertion failure: ut0ut.cc:555 thread 281472996733168
> >
> > At the same time, dmesg showed the following message:
> >
> >     [ 4328.838972] Page cache invalidation failure on direct I/O.
> >     Possible data corruption due to collision with buffered I/O!
> >     [ 4328.850234] File: /data/mysql/data/sysbench/sbtest53.ibd PID:
> >     625 Comm: kworker/42:1
> >
> > At first we suspected that MySQL was interleaving direct IO and
> > buffered IO on the file, but after some checking we found that it
> > only does direct IO, using aio. The problem comes from the direct-IO
> > write path (__generic_file_write_iter) itself.
> >
> > ssize_t __generic_file_write_iter()
> > {
> > ...
> >         if (iocb->ki_flags & IOCB_DIRECT) {
> >                 loff_t pos, endbyte;
> >
> >                 written = generic_file_direct_write(iocb, from);
> >                 /*
> >                  * If the write stopped short of completing, fall back to
> >                  * buffered writes.  Some filesystems do this for writes to
> >                  * holes, for example.  For DAX files, a buffered write will
> >                  * not succeed (even if it did, DAX does not handle dirty
> >                  * page-cache pages correctly).
> >                  */
> >                 if (written < 0 || !iov_iter_count(from) || IS_DAX(inode))
> >                         goto out;
> >
> >                 status = generic_perform_write(file, from, pos = iocb->ki_pos);
> > ...
> > }
> >
> > From the code snippet above we can see that direct IO can fall back
> > to buffered IO under certain conditions, so even though MySQL only
> > did direct IO, it could interleave with buffered IO whenever the
> > fallback occurred. I currently have no idea why the FS (ext3) failed
> > the direct IO, but it is strange that __generic_file_write_iter
> > makes direct IO fall back to buffered IO; this seems to break the
> > semantics of direct IO.

That makes sense.
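
For reference, the fallback decision in the quoted snippet can be restated
as a small userspace model. This is only a sketch: falls_back_to_buffered()
and its parameters are hypothetical names, not kernel API, and the function
only mirrors the three conditions in the quoted `if`. It says nothing about
why the filesystem returned a short direct write in the first place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/*
 * Hypothetical userspace model of the branch in the quoted snippet:
 *
 *     if (written < 0 || !iov_iter_count(from) || IS_DAX(inode))
 *             goto out;
 *
 * written   - return value of the direct write (negative on error)
 * remaining - bytes still left in the iterator after the direct write
 * is_dax    - whether the inode is a DAX inode
 *
 * Returns true when the remainder would be handed to the buffered path.
 */
static bool falls_back_to_buffered(ssize_t written, size_t remaining,
				   bool is_dax)
{
	if (written < 0)	/* direct write failed: error is returned */
		return false;
	if (remaining == 0)	/* everything went out as direct IO */
		return false;
	if (is_dax)		/* DAX never falls back to buffered writes */
		return false;
	return true;		/* short direct write: rest is buffered */
}

/* Self-check covering each arm of the condition (illustrative sizes). */
static void check_fallback_model(void)
{
	assert(!falls_back_to_buffered(-5, 16384, false));
	assert(!falls_back_to_buffered(16384, 0, false));
	assert(!falls_back_to_buffered(0, 16384, true));
	assert(falls_back_to_buffered(0, 16384, false));
	assert(falls_back_to_buffered(4096, 12288, false));
}
```

In this model the only case that reaches the page cache is a non-negative
but short direct write on a non-DAX inode, which matches the situation
described in the report.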

> > The reproduction environment is:
> > Platform: Kunpeng 920 (arm64)
> > Kernel: V5.15-rc
> > PAGESIZE: 64K
> > MySQL: V8.0
> > innodb_page_size: default (16K)

This is all fairly mature code, I think.  Do you know if earlier
kernels were OK, and if so which versions?

Thanks.
