netdev - Re: [net] 6922110d15: suspend-stress.fail

Message-ID: <20220608054553.GA7499@1wt.eu>
Date:   Wed, 8 Jun 2022 07:45:53 +0200
From:   Willy Tarreau <w@....eu>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     kernel test robot <oliver.sang@...el.com>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Geert Uytterhoeven <geert+renesas@...der.be>,
        Florian Fainelli <f.fainelli@...il.com>,
        LKML <linux-kernel@...r.kernel.org>, netdev@...r.kernel.org,
        lkp@...ts.01.org, lkp@...el.com, rui.zhang@...el.com,
        yu.c.chen@...el.com
Subject: Re: [net]  6922110d15: suspend-stress.fail

On Tue, Jun 07, 2022 at 05:47:30PM -0700, Jakub Kicinski wrote:
> On Sun, 5 Jun 2022 22:39:35 +0800 kernel test robot wrote:
> > Greeting,
> > 
> > FYI, we noticed the following commit (built with gcc-11):
> > 
> > commit: 6922110d152e56d7569616b45a1f02876cf3eb9f ("net: linkwatch: fix failure to restore device state across suspend/resume")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > in testcase: suspend-stress
> > version: 
> > with following parameters:
> > 
> > 	mode: freeze
> > 	iterations: 10
> > 
> > 
> > 
> > on test machine: 4 threads Ivy Bridge with 4G memory
> > 
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> > 
> > 
> > 
> > 
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot <oliver.sang@...el.com>
> > 
> > 
> > Suspend to freeze 1/10:
> > Done
> > Suspend to freeze 2/10:
> > network not ready
> > network not ready
> > network not ready
> > network not ready
> > network not ready
> > network not ready
> > network not ready
> > network not ready
> > network not ready
> > network not ready
> > network not ready
> > Done
> 
> What's the failure? I'm looking at this script:
> 
> https://github.com/intel/lkp-tests/blob/master/tests/suspend-stress
> 
> And it seems that we are not actually hitting any "exit 1" paths here.

I'm not sure how the test is meant to be interpreted, but one possible
interpretation is that the link really does take time to come back
after resume. Prior to the fix, the link was believed to still be up
because the event was silently lost during suspend. Now the link is
correctly reported as down, and something is waiting for it to come up
again, as it possibly should. So the fix may simply have revealed an
incorrect expectation in that test.
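
To make that interpretation concrete, here is a minimal sketch of the
kind of wait loop it assumes. It is not taken from the lkp-tests script.
The function names, the retry count and the delay are hypothetical, and
the "network not ready" message is only borrowed from the log above for
illustration. The only real interface used is the sysfs carrier file,
/sys/class/net/<iface>/carrier.

```python
import time
from pathlib import Path


def read_sysfs_carrier(iface):
    """Return True if /sys/class/net/<iface>/carrier currently reads "1"."""
    try:
        return Path(f"/sys/class/net/{iface}/carrier").read_text().strip() == "1"
    except OSError:
        # The read fails if the interface is missing or administratively
        # down; treat both cases as "no carrier".
        return False


def wait_for_carrier(read_carrier, attempts=10, delay=1.0):
    """Poll read_carrier() until it returns True.

    Returns the number of failed polls before the link came up, or None
    if it never came up within `attempts` polls. (Hypothetical helper.)
    """
    for failed in range(attempts):
        if read_carrier():
            return failed
        print("network not ready")
        time.sleep(delay)
    return None
```

With a correct link state after resume, something like
wait_for_carrier(lambda: read_sysfs_carrier("eth0")) would print the
message a few times while the link renegotiates and then return. With a
stale "up" state it would return immediately, which could look like a
faster and cleaner run. "eth0" is only a placeholder interface name.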

Willy