Message-ID: <CAPhsuW5sv5D8vmiJxS9SqCcit1a05F8kw80Q1TV9+26+QJkEsA@mail.gmail.com>
Date:   Thu, 9 Dec 2021 09:39:34 -0800
From:   Song Liu <song@...nel.org>
To:     Li Jinlin <lijinlin3@...wei.com>
Cc:     Jens Axboe <axboe@...nel.dk>, Hannes Reinecke <hare@...e.de>,
        Jan Kara <jack@...e.cz>, Ming Lei <ming.lei@...hat.com>,
        Tejun Heo <tj@...nel.org>,
        Luis Chamberlain <mcgrof@...nel.org>,
        linux-raid <linux-raid@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>, linfeilong@...wei.com
Subject: Re: [PATCH] md: Fix unexpected behaviour in is_mddev_idle

On Tue, Nov 30, 2021 at 6:56 PM Li Jinlin <lijinlin3@...wei.com> wrote:
>
> The value of curr_events may be INT_MAX when mddev initializes IO event
> counters. Then, rdev->last_events will be set to INT_MAX.
> If all the rdevs of mddev are in this state,
> 'curr_events - rdev->last_events > 64' will always be false, and
> is_mddev_idle() will always return 1, which may cause non-sync IO to be
> very slow.
>
> Fix by using atomic64_t type for sync_io, and using long type for
> curr_events/last_events.
>
> Signed-off-by: Li Jinlin <lijinlin3@...wei.com>
> ---
>  drivers/md/md.c       | 6 +++---
>  drivers/md/md.h       | 4 ++--
>  include/linux/genhd.h | 2 +-
>  3 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 5111ed966947..f47035838c43 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -8429,14 +8429,14 @@ static int is_mddev_idle(struct mddev *mddev, int init)
>  {
>         struct md_rdev *rdev;
>         int idle;
> -       int curr_events;
> +       long curr_events;
>
>         idle = 1;
>         rcu_read_lock();
>         rdev_for_each_rcu(rdev, mddev) {
>                 struct gendisk *disk = rdev->bdev->bd_disk;
> -               curr_events = (int)part_stat_read_accum(disk->part0, sectors) -
> -                             atomic_read(&disk->sync_io);
> +               curr_events = (long)part_stat_read_accum(disk->part0, sectors) -
> +                             atomic64_read(&disk->sync_io);
>                 /* sync IO will cause sync_io to increase before the disk_stats
>                  * as sync_io is counted when a request starts, and
>                  * disk_stats is counted when it completes.
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 53ea7a6961de..3f8327c42b7b 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -50,7 +50,7 @@ struct md_rdev {
>
>         sector_t sectors;               /* Device size (in 512bytes sectors) */
>         struct mddev *mddev;            /* RAID array if running */
> -       int last_events;                /* IO event timestamp */
> +       long last_events;               /* IO event timestamp */

I think we need long long here to be safe on 32-bit systems.
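
To illustrate the concern, here is a minimal user-space sketch, not kernel code. It assumes a 32-bit target where long is 32 bits wide and models that width with fixed-size types so it behaves the same on any build host. The names IDLE_THRESHOLD, active_32() and active_64() are hypothetical; only the "> 64" comparison comes from the commit message above.

```c
#include <stdint.h>

/* Hypothetical constant mirroring the "> 64" check in the commit message. */
#define IDLE_THRESHOLD 64

/*
 * Models an event counter that is only 32 bits wide, as 'long' would be
 * on a 32-bit target. The upper 32 bits of the sector count are dropped
 * before the difference is taken.
 */
static int active_32(uint64_t curr_sectors, uint64_t last_sectors)
{
	uint32_t curr = (uint32_t)curr_sectors;
	uint32_t last = (uint32_t)last_sectors;

	return (int32_t)(curr - last) > IDLE_THRESHOLD;
}

/*
 * Models a 64-bit event counter, as 'long long' provides on both 32-bit
 * and 64-bit targets. The full difference is preserved.
 */
static int active_64(uint64_t curr_sectors, uint64_t last_sectors)
{
	return (int64_t)(curr_sectors - last_sectors) > IDLE_THRESHOLD;
}
```

With a difference of exactly 2^32 sectors, the 32-bit variant sees a delta of 0 and reports no activity, while the 64-bit variant still reports activity. For small deltas the two agree.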

Thanks,
Song
