Message-ID: <01c251f4-c8f8-fcb8-bccc-341d4a3db90a@oracle.com>
Date:   Mon, 1 Jul 2019 13:55:27 -0700
From:   Gerd Rausch <gerd.rausch@...cle.com>
To:     santosh.shilimkar@...cle.com, netdev@...r.kernel.org
Cc:     David Miller <davem@...emloft.net>
Subject: Re: [PATCH net-next 3/7] net/rds: Wait for the FRMR_IS_FREE (or
 FRMR_IS_STALE) transition after posting IB_WR_LOCAL_INV

Hi Santosh,

On 01/07/2019 13.41, santosh.shilimkar@...cle.com wrote:
>> @@ -144,7 +146,29 @@ static int rds_ib_post_reg_frmr(struct rds_ib_mr *ibmr)
>>           if (printk_ratelimit())
>>               pr_warn("RDS/IB: %s returned error(%d)\n",
>>                   __func__, ret);
>> +        goto out;
>> +    }
>> +
>> +    if (!frmr->fr_reg)
>> +        goto out;
>> +
>> +    /* Wait for the registration to complete in order to prevent an invalid
>> +     * access error resulting from a race between the memory region already
>> +     * being accessed while registration is still pending.
>> +     */
>> +    wait_event_timeout(frmr->fr_reg_done, !frmr->fr_reg,
>> +               msecs_to_jiffies(100));
>> +
> Is there any logic behind the arbitrary timeout in this patch, as well
> as the one in patch 1/7 that Dave pointed out?
> 

It's empirical (see my response to David's question):
Memory registrations were observed to take longer than invalidations,
hence 100 msec here instead of the 10 msec used for invalidations.

> An MR registration command issued to hardware can at times take as
> long as the command timeout (e.g. 60 seconds in CX3), and up to that
> point it is still a legitimate operation and not necessarily a failure.
> We shouldn't add arbitrary timeouts in ULPs.

Where did you find the 60 seconds for CX3 you are referring to?
Is there a "generic" upper-bound that is not tied to a specific vendor / HCA?
Can you provide a pointer?

Thanks,

  Gerd
