Message-ID: <20210105122328.3e5569a4@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Tue, 5 Jan 2021 12:23:28 -0800
From:   Jakub Kicinski <kuba@...nel.org>
To:     Alex Elder <elder@...aro.org>
Cc:     David Miller <davem@...emloft.net>,
        Network Development <netdev@...r.kernel.org>,
        Eric Dumazet <edumazet@...gle.com>
Subject: Re: Missed schedule_napi()?

On Mon, 4 Jan 2021 10:46:09 -0600 Alex Elder wrote:
> I have a question about whether it's possible to effectively
> miss a napi_schedule() call when a napi_disable() is underway.
> 
> I'm going to try to represent the code in question here
> in an interleaved way to explain the scenario; I hope
> it's clear.
> 
> Suppose the SCHED flag is clear.  And suppose two
> concurrent threads do things in the sequence below.
> 
> Disabling thread        | Scheduling thread
> ------------------------+----------------------
> void napi_disable(struct napi_struct *n)
> {                       | bool napi_schedule_prep(struct napi_struct *n)
>    might_sleep();       | {
>                         |   unsigned long val, new;
>                         |
>                         |   do {
>    set_bit(NAPI_STATE_DISABLE, &n->state);
>                         |     val = READ_ONCE(n->state);
>                         |     if (unlikely(val & NAPIF_STATE_DISABLE))
>                         |       return false;
>                         |     . . .
>    while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
>       msleep(1);        |
>       . . .             |
> 
> We start with the SCHED bit clear.  The disabling thread
> sets the DISABLE bit as it begins.  The scheduling thread
> checks the state and finds that it is disabled, so it
> simply returns false, and the napi_schedule() caller will
> *not* call __napi_schedule().
> 
> But even though NAPI is getting disabled, the scheduling thread
> wants it recorded that a NAPI poll should be scheduled, even
> if it happens later.  In other words, it seems like this
> case is essentially a MISSED schedule.
> 
> The disabling thread sets the SCHED bit, having found it was
> not set previously, and thereby disables NAPI processing until
> it is re-enabled.
> 
> Later, napi_enable() will clear the SCHED bit, allowing NAPI
> processing to continue, but there is no record that the
> scheduling thread indicated that a poll was needed.
> 
> Am I misunderstanding this?  If so, can someone please explain?
> It seems to me that the napi_schedule() call is "lost".

AFAICT your analysis is correct. At the same time the NAPI API does 
not (to the best of my knowledge) give any guarantees about NAPI
invocations matching the number of __napi_schedule() calls.
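
To make the interleaving concrete, here is a rough user-space model of it.
This is only a sketch: struct model_napi, the MODEL_* bits and every
model_*() helper are made up for illustration and are not kernel code. The
model collapses the sleep/retry loop in napi_disable() into a single
attempt, and it clears both bits in model_enable() purely as a
simplification.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for NAPI_STATE_SCHED and NAPI_STATE_DISABLE. */
enum {
	MODEL_SCHED   = 1u << 0,
	MODEL_DISABLE = 1u << 1,
};

/* Hypothetical model of the state word; not the kernel's napi_struct. */
struct model_napi {
	atomic_uint state;
	unsigned int polls_queued;	/* polls that actually got queued */
};

/* First step of the disabling thread: mark a disable as underway. */
static void model_disable_begin(struct model_napi *n)
{
	atomic_fetch_or(&n->state, MODEL_DISABLE);
}

/*
 * Second step: try to take the SCHED bit.  Returns true if it was clear.
 * The real code sleeps and retries; the model makes one attempt.
 */
static bool model_disable_take_sched(struct model_napi *n)
{
	return !(atomic_fetch_or(&n->state, MODEL_SCHED) & MODEL_SCHED);
}

/* Same shape as the quoted napi_schedule_prep(): bail out on DISABLE. */
static bool model_schedule_prep(struct model_napi *n)
{
	unsigned int val = atomic_load(&n->state);
	unsigned int new;

	do {
		if (val & MODEL_DISABLE)
			return false;
		new = val | MODEL_SCHED;
	} while (!atomic_compare_exchange_weak(&n->state, &val, new));

	return !(val & MODEL_SCHED);
}

/* Queue a poll only if the prep step succeeded. */
static bool model_schedule(struct model_napi *n)
{
	if (!model_schedule_prep(n))
		return false;
	n->polls_queued++;
	return true;
}

/* Simplification: drop both bits so scheduling can work again. */
static void model_enable(struct model_napi *n)
{
	atomic_fetch_and(&n->state, ~(unsigned int)(MODEL_SCHED | MODEL_DISABLE));
}

/* Replays the interleaving from the table, one step at a time. */
static unsigned int lost_schedule_demo(void)
{
	struct model_napi n = { .polls_queued = 0 };
	bool took_sched;

	atomic_init(&n.state, 0);		/* SCHED starts clear */
	model_disable_begin(&n);		/* disabling thread sets DISABLE */
	model_schedule(&n);			/* scheduler sees DISABLE, gives up */
	took_sched = model_disable_take_sched(&n);
	assert(took_sched);			/* SCHED was clear, as assumed */
	model_enable(&n);			/* later re-enable */

	return n.polls_queued;
}
```

In this model lost_schedule_demo() returns 0: the schedule request issued
during the disable leaves no trace once the model is enabled again.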

The expectation is that the communication channel will be "reset" 
after the napi_disable() call, processing or dropping all the events
which were outstanding after napi_disable().
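
As a rough illustration of that expectation, the sketch below models a
channel whose owner deals with all outstanding events itself around the
disable/enable pair instead of relying on a schedule request having been
recorded. Everything in it (struct model_chan, enum model_policy, the
model_chan_*() helpers) is hypothetical and stands in for whatever a given
driver actually does; the comments only mark where the real napi_disable()
and napi_enable() calls would sit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical channel model; not taken from any real driver. */
struct model_chan {
	bool poll_enabled;		/* false while "NAPI" is disabled */
	unsigned int outstanding;	/* events nobody has handled yet */
	unsigned int processed;
	unsigned int dropped;
};

enum model_policy { MODEL_PROCESS, MODEL_DROP };

/* Where a driver would call napi_disable(). */
static void model_chan_disable(struct model_chan *c)
{
	c->poll_enabled = false;
}

/* An event arrives; while disabled its notification goes unrecorded. */
static void model_chan_event(struct model_chan *c)
{
	c->outstanding++;
}

/*
 * "Reset" the channel: handle or discard everything outstanding, then
 * re-enable polling (where a driver would call napi_enable()).
 */
static void model_chan_reset_and_enable(struct model_chan *c,
					enum model_policy policy)
{
	if (policy == MODEL_PROCESS)
		c->processed += c->outstanding;
	else
		c->dropped += c->outstanding;
	c->outstanding = 0;
	c->poll_enabled = true;
}

/* Two events arrive during the disabled window, then the reset runs. */
static struct model_chan reset_demo(enum model_policy policy)
{
	struct model_chan c = { .poll_enabled = true };

	model_chan_disable(&c);
	model_chan_event(&c);
	model_chan_event(&c);
	model_chan_reset_and_enable(&c, policy);

	return c;
}
```

With either policy the channel comes back with outstanding == 0, so nothing
depends on the schedule attempts made while polling was disabled.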
