Message-ID: <2356877.Yf5hrMSTGe@natalenko.name>
Date:   Tue, 20 Jul 2021 11:05:29 +0200
From:   Oleksandr Natalenko <oleksandr@...alenko.name>
To:     Ming Lei <ming.lei@...hat.com>
Cc:     linux-kernel@...r.kernel.org, Jens Axboe <axboe@...com>,
        Christoph Hellwig <hch@....de>,
        Sagi Grimberg <sagi@...mberg.me>,
        linux-nvme@...ts.infradead.org,
        David Jeffery <djeffery@...hat.com>,
        Laurence Oberman <loberman@...hat.com>,
        Paolo Valente <paolo.valente@...aro.org>,
        Jan Kara <jack@...e.cz>, Sasha Levin <sashal@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Keith Busch <kbusch@...nel.org>
Subject: Re: New warning in nvme_setup_discard

Hello, Ming.

On Monday, 19 July 2021 8:27:29 CEST Oleksandr Natalenko wrote:
> On Monday, 19 July 2021 3:40:40 CEST Ming Lei wrote:
> > On Sat, Jul 17, 2021 at 02:35:14PM +0200, Oleksandr Natalenko wrote:
> > > On Saturday, 17 July 2021 14:19:59 CEST Oleksandr Natalenko wrote:
> > > > On Saturday, 17 July 2021 14:11:05 CEST Oleksandr Natalenko wrote:
> > > > > On Saturday, 17 July 2021 11:35:32 CEST Ming Lei wrote:
> > > > > > > Maybe you need to check if the build is OK, I can't reproduce it
> > > > > > > in my VM, and BFQ is still builtin:
> > > > > > > 
> > > > > > > [root@...st-01 ~]# uname -a
> > > > > > > Linux ktest-01 5.14.0-rc1+ #52 SMP Fri Jul 16 18:56:36 CST 2021 x86_64 x86_64 x86_64 GNU/Linux
> > > > > > > [root@...st-01 ~]# cat /sys/block/nvme0n1/queue/scheduler
> > > > > > > [none] mq-deadline kyber bfq
> > > > > 
> > > > > I don't think this is an issue with the build… BTW, with
> > > > > `initcall_debug`:
> > > > > 
> > > > > ```
> > > > > [    0.902555] calling  bfq_init+0x0/0x8b @ 1
> > > > > [    0.903448] initcall bfq_init+0x0/0x8b returned -28 after 507
> > > > > usecs
> > > > > ```
> > > > > 
> > > > > > -ENOSPC? Why? Also re-tested with the latest git tip, same result :(.
> > > > 
> > > > OK, one extra pr_info, and I see this:
> > > > 
> > > > ```
> > > > [    0.871180] blkcg_policy_register: BLKCG_MAX_POLS too small
> > > > [    0.871612] blkcg_policy_register: -28
> > > > ```
> > > > 
> > > > What does it mean please :)? The value seems to be hard-coded:
> > > > 
> > > > ```
> > > > include/linux/blkdev.h
> > > > 60:#define BLKCG_MAX_POLS               5
> > > > ```
> > > 
> > > OK, after increasing this to 6 I've got my BFQ back. Please see [1].
> > > 
> > > [1]
> > > https://lore.kernel.org/linux-block/20210717123328.945810-1-oleksandr@na
> > > t
> > > alenko.name/
> > 
> > OK, after you fixed the issue in blkcg_policy_register(), can you
> > reproduce the discard issue on v5.14-rc1 with BFQ applied? If yes,
> > can you test the patch I posted previously?
> 
> Yes, the issue is reproducible with both v5.13.2 and v5.14-rc1. I haven't
> managed to reproduce it with v5.13.2+your patch. Now I will build
> v5.14-rc2+your patch and test further.
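
For readers following the -ENOSPC detour quoted above: the log lines suggest that registration fails once a fixed-size table of policy slots is full. Below is a minimal user-space sketch of that pattern, not the kernel's actual blkcg_policy_register(). The names `MAX_POLS`, `struct policy`, `policy_register()` and `policy_demo()` are hypothetical stand-ins; only the limit of 5 and the -28 (-ENOSPC) result come from the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the hard-coded BLKCG_MAX_POLS value (5). */
#define MAX_POLS 5

/* Hypothetical, simplified policy descriptor. */
struct policy {
    const char *name;
};

/* Fixed-size table of registered policies; NULL marks a free slot. */
static const struct policy *slots[MAX_POLS];

/* Claim the first free slot, or fail with -ENOSPC when the table is full. */
static int policy_register(const struct policy *pol)
{
    for (size_t i = 0; i < MAX_POLS; i++) {
        if (slots[i] == NULL) {
            slots[i] = pol;
            return 0;
        }
    }
    return -ENOSPC;
}

/* Register six policies into five slots and return the result of the sixth. */
static int policy_demo(void)
{
    static const struct policy pols[MAX_POLS + 1] = {
        {"p0"}, {"p1"}, {"p2"}, {"p3"}, {"p4"}, {"p5"},
    };

    memset(slots, 0, sizeof(slots));      /* start from an empty table */
    for (size_t i = 0; i < MAX_POLS; i++)
        assert(policy_register(&pols[i]) == 0);

    return policy_register(&pols[MAX_POLS]);
}
```

In this sketch the sixth registration is the one that fails, which matches the observation that raising the limit from 5 to 6 brought BFQ back.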

I'm still hammering v5.14-rc2 + your patch, and I cannot reproduce the issue.
Given that I do not have a reliable reproducer (I'm just firing up a kernel
build, and the issue pops up sooner or later, usually within the first couple
of tries), how long should I hammer it for your fix to be considered proven?
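
One rough way to put a number on this, as a sketch only: assume each try on an unfixed kernel triggers the issue independently with some probability p. Then n clean tries in a row happen by chance with probability (1 - p)^n, and one can ask for the smallest n that pushes this below a chosen threshold. Both p and the threshold are assumptions here, not measured values, and `clean_runs_needed()` is a hypothetical helper.

```c
/*
 * Smallest n such that (1 - p)^n <= alpha, where
 *   p     = assumed per-try probability of hitting the issue when unfixed,
 *   alpha = accepted chance that n clean tries are just luck.
 * Returns 0 for inputs outside the open interval (0, 1).
 */
static unsigned clean_runs_needed(double p, double alpha)
{
    if (p <= 0.0 || p >= 1.0 || alpha <= 0.0 || alpha >= 1.0)
        return 0;

    double miss = 1.0;   /* probability that all tries so far were clean */
    unsigned n = 0;

    while (miss > alpha) {
        miss *= 1.0 - p;
        n++;
    }
    return n;
}
```

For example, if the issue showed up in about half of the tries before the patch (p = 0.5), `clean_runs_needed(0.5, 0.01)` gives 7 clean tries for a 1% chance of a false "fixed" verdict. The independence assumption is a simplification, so this is a lower bound on effort rather than a proof.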

Thanks.

-- 
Oleksandr Natalenko (post-factum)

