lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 1 Jul 2019 13:50:18 -0700
From:   Gerd Rausch <gerd.rausch@...cle.com>
To:     David Miller <davem@...emloft.net>
Cc:     santosh.shilimkar@...cle.com, netdev@...r.kernel.org
Subject: Re: [PATCH net-next 1/7] net/rds: Give fr_state a chance to
 transition to FRMR_IS_FREE

Hi David,

On 01/07/2019 11.27, David Miller wrote:
> From: Gerd Rausch <gerd.rausch@...cle.com>
> Date: Mon, 1 Jul 2019 09:39:44 -0700
> 
>> +			/* Memory regions make it onto the "clean_list" via
>> +			 * "rds_ib_flush_mr_pool", after the memory region has
>> +			 * been posted for invalidation via "rds_ib_post_inv".
>> +			 *
>> +			 * At that point in time, "fr_state" may still be
>> +			 * in state "FRMR_IS_INUSE", since the only place where
>> +			 * "fr_state" transitions to "FRMR_IS_FREE" is in
>> +			 * is in "rds_ib_mr_cqe_handler", which is
>> +			 * triggered by a tasklet.
>> +			 *
>> +			 * So in case we notice that
>> +			 * "fr_state != FRMR_IS_FREE" (see below), * we wait for
>> +			 * "fr_inv_done" to trigger with a maximum of 10msec.
>> +			 * Then we check again, and only put the memory region
>> +			 * onto the drop_list (via "rds_ib_free_frmr")
>> +			 * in case the situation remains unchanged.
>> +			 *
>> +			 * This avoids the problem of memory-regions bouncing
>> +			 * between "clean_list" and "drop_list" before they
>> +			 * even have a chance to be properly invalidated.
>> +			 */
>> +			frmr = &ibmr->u.frmr;
>> +			wait_event_timeout(frmr->fr_inv_done,
>> +					   frmr->fr_state == FRMR_IS_FREE,
>> +					   msecs_to_jiffies(10));
>> +			if (frmr->fr_state == FRMR_IS_FREE)
>> +				break;
> 
> If we see FRMR_IS_FREE after the timeout, what cleans this up?
> 

In that case, the memory-region is subjected to the
"rds_ib_free_frmr(ibmr, true)" call that follows:
In essence making it onto the "drop_list".

It's the same as if it wouldn't transition to FRMR_IS_FREE at all.
In both cases, the memory region should get dropped, and the application
would have been penalized by an extra 10msec wait-time (for having tried to invalidate it).

> Also, why 10msec?

It's empirical.
I had added some debugging code (not part of this submission) that traced
the return value of "wait_event_timeout" in order to see the out-lier in terms
of processing the "IB_WR_LOCAL_INV" request.

On my test-systems the majority of requests were done in less than 1msec.
I saw an outlier at almost 2msec once.
So I gave it an extra order-of-magnitude multiplier for extra buffer / paranoia.

> Why that specific value and not some other value?

I looked around to find what Mellanox or any other reference material had
so say about the expected turn--around time of an "IB_WR_LOCAL_INV" ought to be.
I wasn't able to find any.

Please note that even if there was an upper-bound specified, such as minutes:
It wouldn't necessarily be a good idea to penalize an application by wait-times
up to one minute, if the alternative is to just put this memory region on a
drop-list and pick another one (which is suggested here).

> Why not wait for however long it takes for the tasklet to run and clean it up?

Two reasons I can think of:
1) The penalty of long wait-times would go to the application
2) If there were a firmware-bug, the "wait_event" would not terminate

Thanks,

  Gerd

Powered by blists - more mailing lists