Date:   Thu, 13 Jun 2019 12:32:24 -0400
From:   Doug Ledford <dledford@...hat.com>
To:     Bart Van Assche <bvanassche@....org>,
        Håkon Bugge <haakon.bugge@...cle.com>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Leon Romanovsky <leon@...nel.org>,
        Parav Pandit <parav@...lanox.com>,
        Steve Wise <swise@...ngridcomputing.com>
Cc:     linux-rdma@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] RDMA/cma: Make CM response timeout and # CM retries
 configurable

On Thu, 2019-06-13 at 08:28 -0700, Bart Van Assche wrote:
> On 6/13/19 7:25 AM, Doug Ledford wrote:
> > So, to revive this patch, what I'd like to see is some attempt to
> > actually quantify a reasonable timeout for the default backlog
> > depth, then the patch should actually change the default to that
> > reasonable timeout, and then put in the ability to adjust the
> > timeout with some sort of doc guidance on how to calculate a
> > reasonable timeout based on configured backlog depth.
> 
> How about following the approach of the SRP initiator driver? It
> derives the CM timeout from the subnet manager timeout. The
> assumption behind this is that in large networks the subnet manager
> timeout has to be set higher than its default to make communication
> work. See also srp_get_subnet_timeout().

Theoretically, the subnet manager needs a longer timeout in a bigger
network because it's handling more data as a single point of lookup for
the entire subnet.  Individual machines, on the other hand, have the
same backlog size (by default) regardless of the size of the network,
and there is no guarantee that an admin who increased the subnet
manager timeout also increased the backlog queue depth.  So, while I
like things that auto-tune like you are suggesting, the problem is that
the one setting does not directly correlate with the other.
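
For reference, the arithmetic behind deriving a CM timeout from the
subnet timeout can be sketched roughly as below.  This is an
illustration only, not code from any driver.  It relies on the
InfiniBand convention that timeouts are encoded as an exponent, with
the actual time being 4.096 us * 2^exponent.  The helper names and the
"+ 2" offset are assumptions made for the example.  Note that nothing
in the calculation looks at the listen backlog depth, which is the
mismatch described above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * IB timeouts are encoded as a 5-bit exponent.
 * Actual time = 4.096 us * 2^exp, i.e. 4096 ns << exp.
 */
static uint64_t ib_timeout_ns(uint8_t exp)
{
	assert(exp < 32);
	return 4096ULL << exp;
}

/*
 * Hypothetical policy (an assumption for this sketch): take the
 * subnet timeout exponent, add 2, and clamp to the 5-bit maximum.
 * The backlog depth is not an input here.
 */
static uint8_t cm_timeout_from_subnet(uint8_t subnet_timeout)
{
	unsigned int t = (unsigned int)subnet_timeout + 2;

	return t > 31 ? 31 : (uint8_t)t;
}
```

With a subnet timeout exponent of 18, ib_timeout_ns(18) is
1073741824 ns (about 1.07 s), and the derived CM exponent of 20
corresponds to 4294967296 ns (about 4.29 s).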

-- 
Doug Ledford <dledford@...hat.com>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
