Message-ID: <20191211024137.GB61323@mit.edu>
Date:   Tue, 10 Dec 2019 21:41:37 -0500
From:   "Theodore Y. Ts'o" <tytso@....edu>
To:     Ming Lei <ming.lei@...hat.com>
Cc:     Andrea Vai <andrea.vai@...pv.it>,
        "Schmid, Carsten" <Carsten_Schmid@...tor.com>,
        Finn Thain <fthain@...egraphics.com.au>,
        Damien Le Moal <Damien.LeMoal@....com>,
        Alan Stern <stern@...land.harvard.edu>,
        Jens Axboe <axboe@...nel.dk>,
        Johannes Thumshirn <jthumshirn@...e.de>,
        USB list <linux-usb@...r.kernel.org>,
        SCSI development list <linux-scsi@...r.kernel.org>,
        Himanshu Madhani <himanshu.madhani@...ium.com>,
        Hannes Reinecke <hare@...e.com>,
        Omar Sandoval <osandov@...com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Greg KH <gregkh@...uxfoundation.org>,
        Hans Holmberg <Hans.Holmberg@....com>,
        Kernel development list <linux-kernel@...r.kernel.org>,
        linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: AW: Slow I/O on USB media after commit
 f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6

On Tue, Dec 10, 2019 at 04:05:50PM +0800, Ming Lei wrote:
> > > The path[2] is expected behaviour. Not sure path [1] is correct,
> > > given
> > > ext4_release_file() is supposed to be called when this inode is
> > > released. That means the file is closed 4358 times during 1GB file
> > > copying to usb storage.
> > > 
> > > [1] insert requests when returning to user mode from syscall
> > > 
> > >   b'blk_mq_sched_request_inserted'
> > >   b'blk_mq_sched_request_inserted'
> > >   b'dd_insert_requests'
> > >   b'blk_mq_sched_insert_requests'
> > >   b'blk_mq_flush_plug_list'
> > >   b'blk_flush_plug_list'
> > >   b'io_schedule_prepare'
> > >   b'io_schedule'
> > >   b'rq_qos_wait'
> > >   b'wbt_wait'
> > >   b'__rq_qos_throttle'
> > >   b'blk_mq_make_request'
> > >   b'generic_make_request'
> > >   b'submit_bio'
> > >   b'ext4_io_submit'
> > >   b'ext4_writepages'
> > >   b'do_writepages'
> > >   b'__filemap_fdatawrite_range'
> > >   b'ext4_release_file'
> > >   b'__fput'
> > >   b'task_work_run'
> > >   b'exit_to_usermode_loop'
> > >   b'do_syscall_64'
> > >   b'entry_SYSCALL_64_after_hwframe'
> > >     4358

I'm guessing that your workload is repeatedly truncating a file (or
calling open with O_TRUNC) and then writing data to it.  When you do
this, we assume at close time that, since you were replacing the
previous contents of the file with new contents, you would be unhappy
if the file ended up as a zero-length file after a crash.  That's
because ten years ago there were a *huge* number of crappy
applications that would replace a file by reading it into memory,
truncating it, and then writing out the new contents.  This could be
a high score file for a game, or a KDE or GNOME state file, etc.

So if someone does open, truncate, write, close, we still immediately
write out the data on the close, assuming that the programmer really
wanted open, truncate, write, fsync, close, but was too careless to
actually do the right thing.
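
For illustration, here is a minimal sketch of the sequence an application
would issue if it did the right thing itself (open, truncate, write,
fsync, close).  The function name replace_contents is made up for this
example, and the sketch only shows the userspace side; it says nothing
about how any particular file system implements its workaround.

```python
import os


def replace_contents(path: str, data: bytes) -> None:
    """Overwrite `path` in place and ask the kernel to flush it before close.

    Hypothetical helper, not taken from any real application.
    """
    # O_TRUNC discards the old contents as soon as the open succeeds.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Explicit flush, so the new contents do not depend on the
        # file system writing them out implicitly at close time.
        os.fsync(fd)
    finally:
        os.close(fd)
```

Between the truncate and the fsync there is still a window in which a
crash can leave a short or empty file; the explicit fsync only bounds
that window instead of leaving it to whenever writeback happens.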

Some workaround[1] like this is done by all of the major file systems,
and is fallout from the agreement that came out of the "O_PONIES"[2]
controversy.  This was discussed and agreed to at the 2009 LSF/MM
workshop[3].  (See the "rename, fsync, and ponies" section.)

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
[2] https://blahg.josefsipek.net/?p=364
[3] https://lwn.net/Articles/327601/

So if you're seeing a call to filemap_fdatawrite_range as the result
of a fput, that's why.

In any case, this behavior has been around for a decade, and it
appears to be incidental to your performance difficulties with your
USB thumbdrive and block-mq.

						- Ted
