Date:   Mon, 4 Jan 2021 10:46:09 -0600
From:   Alex Elder <elder@...aro.org>
To:     David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>
Cc:     Network Development <netdev@...r.kernel.org>
Subject: Missed schedule_napi()?

I have a question about whether it's possible to effectively
miss a napi_schedule() call when a napi_disable() is underway.

I'm going to try to represent the code in question here
in an interleaved way to explain the scenario; I hope
it's clear.

Suppose the SCHED flag is clear.  And suppose two
concurrent threads do things in the sequence below.

Disabling thread        | Scheduling thread
------------------------+----------------------
void napi_disable(struct napi_struct *n)
{                       | bool napi_schedule_prep(struct napi_struct *n)
   might_sleep();       | {
                        |   unsigned long val, new;
                        |
                        |   do {
   set_bit(NAPI_STATE_DISABLE, &n->state);
                        |     val = READ_ONCE(n->state);
                        |     if (unlikely(val & NAPIF_STATE_DISABLE))
                        |       return false;
                        |       . . .
   while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
      msleep(1);        |
        . . .           |

We start with the SCHED bit clear.  The disabling thread
sets the DISABLE bit as it begins.  The scheduling thread
checks the state and finds that it is disabled, so it
simply returns false, and the napi_schedule() caller will
*not* call __napi_schedule().

But even though NAPI is getting disabled, the scheduling thread
wants it recorded that a NAPI poll should be scheduled, even
if it happens later.  In other words, it seems like this
case is essentially a MISSED schedule.

The disabling thread sets the SCHED bit, having found it was
not set previously, and thereby disables NAPI processing until
it is re-enabled.

Later, napi_enable() will clear the SCHED bit, allowing NAPI
processing to continue, but there is no record that the
scheduling thread indicated that a poll was needed,

Am I misunderstanding this?  If so, can someone please explain?
It seems to me that the napi_schedule() call is "lost".

Thanks.

					-Alex
