Date:   Wed, 20 Sep 2017 10:43:59 +0900
From:   Shunki Fujita <shunki-fujita@...ozu.co.jp>
To:     viro@...iv.linux.org.uk, andrew.patterson@....com
Cc:     linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [RFC PATCH] fs: don't flush pagecache when expanding block device

I have run into a problem with my service, and I have two questions and an RFC patch.

I run a web service with the following characteristics:

- Its data lives on a multipath device whose size is occasionally grown.
- It relies heavily on the page cache.
- Its response time is usually about 0.2 s.

When I grow the multipath device, the kernel flushes all of this device's
caches, and the response time becomes 20 s, 100 times slower than usual.

IIUC, this flushing is not necessary, for the reason described below.
However, judging from the comments on a past commit, the buffer cache is
intentionally flushed not only when the device shrinks but also when it grows.
(cf. https://github.com/torvalds/linux/commit/608aeef17a91747d6303de4df5e2c2e6899a95e8)

I have two questions about this:
1. My understanding is that the following race is the concern; is that correct?

    <Situation>
    On a 10 GB device, suppose the calls interleave as in the table below:
    CPU0 shrinks the device to 5 GB while CPU1 grows it to 15 GB.
    By the time CPU0 reaches (*), struct gendisk and struct block_device
    already report the same size, so the cache is not flushed for the shrink.
    (cf. https://github.com/torvalds/linux/blob/608aeef17a91747d6303de4df5e2c2e6899a95e8/fs/block_dev.c#L897)

    CPU0 (shrink)                CPU1 (grow)
    ==========================================
    set_capacity()
                                 set_capacity()
                                 revalidate_disk()
    revalidate_disk() ...(*)


2. If the answer to 1 is yes, the situation above does not seem to be able to
happen, since every set_capacity()/revalidate_disk() pair is protected by some
locking mechanism. If that is correct, how about avoiding the performance
problem I mentioned with the following patch?

Thanks,
Shunki
---
It's not necessary to flush caches for a device that is being grown.

Signed-off-by: Shunki Fujita <shunki-fujita@...ozu.co.jp>
---
fs/block_dev.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 44d4a1e..d17603c 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1078,7 +1078,14 @@ void check_disk_size_change(struct gendisk *disk, struct block_device *bdev)
                "%s: detected capacity change from %lld to %lld\n",
                name, bdev_size, disk_size);
         i_size_write(bdev->bd_inode, disk_size);
-        flush_disk(bdev, false);
+        if (bdev_size > disk_size) {
+            flush_disk(bdev, false);
+        } else {
+            if (!bdev->bd_disk)
+                return;
+            if (disk_part_scan_enabled(bdev->bd_disk))
+                bdev->bd_invalidated = 1;
+        }
     }
 }
EXPORT_SYMBOL(check_disk_size_change);
-- 
2.7.4
