lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 9 Jul 2020 08:16:30 -0400
From:   Sven Van Asbroeck <thesven73@...il.com>
To:     Al Viro <viro@...iv.linux.org.uk>
Cc:     Deepa Dinamani <deepa.kernel@...il.com>,
        David Howells <dhowells@...hat.com>,
        "Darrick J. Wong" <darrick.wong@...cle.com>,
        Janos Farkas <chexum+dev@...il.com>,
        Jeff Layton <jlayton@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1 2/2] romfs: address performance regression since v3.10

Hello Al,

You are the closest I could find to a romfs maintainer. get_maintainer.pl
doesn't appear to list any.

This attempted performance regression fix didn't generate much feedback
(to say the least). It's however a real issue for us when supporting a legacy
product where we don't have the luxury of switching to a better-supported
fs.

Is there anything I can do to further this? Is lkml
currently accepting bug / regression fixes to romfs, or is it obsolete?

Many thanks,
Sven

On Mon, Jun 22, 2020 at 8:45 PM Sven Van Asbroeck <thesven73@...il.com> wrote:
>
> Problem
> -------
> romfs sequential read performance has regressed very badly since
> v3.10. Currently, reading a large file inside a romfs image is
> up to 12x slower compared to reading the romfs image directly.
>
> Benchmarks:
> - use a romfs image which contains a single 250M file
> - calculate the md5sum of the romfs image directly (test 1)
>   $ time md5sum image.romfs
> - loop-mount the romfs image, and calc the md5sum of the file
>   inside it (test 2)
>   $ mount -o loop,ro image.romfs /mnt/romfs
>   $ time md5sum /mnt/romfs/file
> - drop caches in between
>   $ echo 3 > /proc/sys/vm/drop_caches
>
> imx6 (arm cortex a9) on emmc, running v5.7.2:
> (test 1)  5 seconds
> (test 2) 60 seconds (12x slower)
>
> Intel i7-3630QM on Samsung SSD 850 EVO (EMT02B6Q),
>     running Ubuntu with v4.15.0-106-generic:
> (test 1) 1.3 seconds
> (test 2) 3.3 seconds (2.5x slower)
>
> To show that a regression has occurred since v3.10:
>
> imx6 on emmc, running v3.10.17:
> (test 1) 16 seconds
> (test 2) 18 seconds
>
> Proposed Solution
> -----------------
> Increase the blocksize from 1K to PAGE_SIZE. This brings the
> sequential read performance close to where it was on v3.10:
>
> imx6 on emmc, running v5.7.2:
> (test 2 1K blocksize) 60 seconds
> (test 2 4K blocksize) 22 seconds
>
> Intel on Ubuntu running v4.15:
> (test 2 1K blocksize) 3.3 seconds
> (test 2 4K blocksize) 1.9 seconds
>
> There is a risk that this may increase latency on random-
> access workloads. But the test below suggests that this
> is not a concern:
>
> Benchmark:
> - use a 630M romfs image consisting of 9600 files
> - loop-mount the romfs image
>   $ mount -o loop,ro image.romfs /mnt/romfs
> - drop all caches
> - list all files in the filesystem (test 3)
>   $ time find /mnt/romfs > /dev/null
>
> imx6 on emmc, running v5.7.2:
> (test 3 1K blocksize) 9.5 seconds
> (test 3 4K blocksize) 9   seconds
>
> Intel on Ubuntu, running v4.15:
> (test 3 1K blocksize) 1.4 seconds
> (test 3 4K blocksize) 1.2 seconds
>
> Practical Solution
> ------------------
> Introduce a mount-option called 'largeblocks'. If present,
> increase the blocksize for much better sequential performance.
>
> Note that the Linux block layer can only support n-K blocks if
> the underlying block device length is also aligned to n-K. This
> may not always be the case. Therefore, the driver will pick the
> largest blocksize which the underlying block device can support.
>
> Cc: Al Viro <viro@...iv.linux.org.uk>
> Cc: Deepa Dinamani <deepa.kernel@...il.com>
> Cc: David Howells <dhowells@...hat.com>
> Cc: "Darrick J. Wong" <darrick.wong@...cle.com>
> Cc: Janos Farkas <chexum+dev@...il.com>
> Cc: Jeff Layton <jlayton@...nel.org>
> To: linux-kernel@...r.kernel.org
> Signed-off-by: Sven Van Asbroeck <TheSven73@...il.com>
> ---
>  fs/romfs/super.c | 62 ++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 57 insertions(+), 5 deletions(-)
>
> diff --git a/fs/romfs/super.c b/fs/romfs/super.c
> index 6fecdea791f1..93565aeaa43c 100644
> --- a/fs/romfs/super.c
> +++ b/fs/romfs/super.c
> @@ -65,7 +65,7 @@
>  #include <linux/slab.h>
>  #include <linux/init.h>
>  #include <linux/blkdev.h>
> -#include <linux/fs_context.h>
> +#include <linux/fs_parser.h>
>  #include <linux/mount.h>
>  #include <linux/namei.h>
>  #include <linux/statfs.h>
> @@ -460,6 +460,54 @@ static __u32 romfs_checksum(const void *data, int size)
>         return sum;
>  }
>
> +enum romfs_param {
> +       Opt_largeblocks,
> +};
> +
> +static const struct fs_parameter_spec romfs_fs_parameters[] = {
> +       fsparam_flag("largeblocks", Opt_largeblocks),
> +       {}
> +};
> +
> +/*
> + * Parse a single mount parameter.
> + */
> +static int romfs_parse_param(struct fs_context *fc, struct fs_parameter *param)
> +{
> +       struct fs_parse_result result;
> +       int opt;
> +
> +       opt = fs_parse(fc, romfs_fs_parameters, param, &result);
> +       if (opt < 0)
> +               return opt;
> +
> +       switch (opt) {
> +       case Opt_largeblocks:
> +               fc->fs_private = (void *) 1;
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +
> +       return 0;
> +}
> +
> +/*
> + * pick the largest blocksize which the underlying block device
> + * is a multiple of. Or fall back to legacy (ROMBSIZE).
> + */
> +static int romfs_largest_blocksize(struct super_block *sb)
> +{
> +       loff_t device_sz = i_size_read(sb->s_bdev->bd_inode);
> +       int blksz;
> +
> +       for (blksz = PAGE_SIZE; blksz > ROMBSIZE; blksz >>= 1)
> +               if ((device_sz % blksz) == 0)
> +                       break;
> +
> +       return blksz;
> +}
> +
>  /*
>   * fill in the superblock
>   */
> @@ -467,17 +515,19 @@ static int romfs_fill_super(struct super_block *sb, struct fs_context *fc)
>  {
>         struct romfs_super_block *rsb;
>         struct inode *root;
> -       unsigned long pos, img_size;
> +       unsigned long pos, img_size, dev_blocksize;
>         const char *storage;
>         size_t len;
>         int ret;
>
>  #ifdef CONFIG_BLOCK
> +       dev_blocksize = fc->fs_private ? romfs_largest_blocksize(sb) :
> +                                        ROMBSIZE;
>         if (!sb->s_mtd) {
> -               sb_set_blocksize(sb, ROMBSIZE);
> +               sb_set_blocksize(sb, dev_blocksize);
>         } else {
> -               sb->s_blocksize = ROMBSIZE;
> -               sb->s_blocksize_bits = blksize_bits(ROMBSIZE);
> +               sb->s_blocksize = dev_blocksize;
> +               sb->s_blocksize_bits = blksize_bits(dev_blocksize);
>         }
>  #endif
>
> @@ -573,6 +623,7 @@ static int romfs_get_tree(struct fs_context *fc)
>  static const struct fs_context_operations romfs_context_ops = {
>         .get_tree       = romfs_get_tree,
>         .reconfigure    = romfs_reconfigure,
> +       .parse_param    = romfs_parse_param,
>  };
>
>  /*
> @@ -607,6 +658,7 @@ static struct file_system_type romfs_fs_type = {
>         .owner          = THIS_MODULE,
>         .name           = "romfs",
>         .init_fs_context = romfs_init_fs_context,
> +       .parameters     = romfs_fs_parameters,
>         .kill_sb        = romfs_kill_sb,
>         .fs_flags       = FS_REQUIRES_DEV,
>  };
> --
> 2.17.1
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ