Message-ID: <c0a09e5c0902281658v48084cd8w8f95871395020e20@mail.gmail.com>
Date:	Sat, 28 Feb 2009 16:58:25 -0800
From:	Andrew Grover <andy.grover@...il.com>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Andy Grover <andy.grover@...cle.com>, netdev@...r.kernel.org,
	rds-devel@....oracle.com, general@...ts.openfabrics.org
Subject: Re: [ofa-general] [PATCH 0/26] Reliable Datagram Sockets (RDS), take 2

On Sat, Feb 28, 2009 at 2:36 PM, Andi Kleen <andi@...stfloor.org> wrote:
>> The previous solution for IPC that Oracle was using was based on UDP,
>> which I think could be considered very close to using raw sockets --
>> each process is responsible for its own acks, retransmits, everything.
>> Doing this on a highly loaded machine resulted in a cascade where
>> performance got worse and worse.
>
> Could you describe that cascade in more detail?
> The problem was that the retransmits didn't have high enough priority?

I think the gist of it is:

Higher load -> more time before a process runs -> rcvbuf overflows ->
ACKs dropped -> timeouts -> more retransmissions -> even higher load.

Things are fine until they hit a point where everything goes to hell.
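
To make the feedback loop above concrete, here is a toy model in Python.
It is only a sketch under made-up assumptions, not a measurement of the
Oracle workload: the function name, the rates, the buffer size, and the
1/(1 - utilization) wait formula are all hypothetical choices for
illustration. Each tick, anything that does not fit in the receive
buffer while the process waits to run is dropped and comes back as a
retransmission on the next tick.

```python
def simulate(base_rate, capacity, rcvbuf, ticks):
    """Toy model of the load -> drops -> retransmits feedback loop.

    All parameters are hypothetical units (messages per tick, messages
    of buffer space). Returns the offered load seen at each tick.
    """
    retrans = 0.0
    history = []
    for _ in range(ticks):
        # New traffic plus whatever had to be retransmitted.
        offered = base_rate + retrans
        # Assumption: the wait before the process runs grows sharply
        # as utilization approaches 1 (capped to avoid dividing by 0).
        util = min(offered / capacity, 0.99)
        wait = 1.0 / (1.0 - util)
        # Messages that pile up in the buffer while the process waits.
        queued = offered * wait
        # Anything beyond the buffer is dropped (ACKs included), but we
        # cannot drop more than actually arrived.
        dropped = min(max(0.0, queued - rcvbuf), offered)
        # Dropped messages time out and are sent again next tick.
        retrans = dropped
        history.append(offered)
    return history


if __name__ == "__main__":
    print("moderate load:", simulate(40, 100, 100, 10))
    print("heavier load: ", simulate(60, 100, 100, 10))
```

With these made-up numbers the moderate case stays flat, while the
heavier case starts dropping on the first tick and the offered load
keeps climbing afterwards, which is the "fine until it isn't" shape
described above.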

>> Additionally, our interconnect is primarily Infiniband. It natively
>> implements a reliable datagram connection type so RDS leverages that.
> So perhaps it would make more sense to have a thin direct interface
> to that IB service? Or perhaps it already exists? (I admit I don't know
> the IB interfaces very well)

The most direct userspace API is uDAPL -- apps can create IB
connections (queue pairs) directly. This was tried but didn't work out
so well. A queue pair (QP) is a TX/RX ring -- a nontrivial amount of
memory. If each process needs a new QP to talk to every other process
then the number of RAM-hungry QPs becomes huge.

RDS is only slightly less direct -- apps don't create queue pairs,
they create RDS sockets. RDS uses only one QP for all traffic to each
remote node, so the number of QPs on a node is equal to the number of
remote nodes, as opposed to (number of local processes * number of
remote processes).
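
The difference in QP counts can be worked through with a few lines of
Python. The cluster size and process counts below are hypothetical
numbers picked for illustration, and the function names are mine; only
the two formulas come from the paragraph above.

```python
def qps_direct(local_procs, remote_procs):
    """QPs on one node when every local process connects directly
    to every remote process (one QP per process pair)."""
    return local_procs * remote_procs


def qps_rds(remote_nodes):
    """QPs on one node with RDS: one QP per remote node, shared by
    all sockets on this node."""
    return remote_nodes


if __name__ == "__main__":
    # Hypothetical cluster: 8 nodes, 50 processes on each node.
    nodes = 8
    procs_per_node = 50

    remote_nodes = nodes - 1
    remote_procs = remote_nodes * procs_per_node

    print("direct QPs per node:", qps_direct(procs_per_node, remote_procs))
    print("RDS QPs per node:   ", qps_rds(remote_nodes))
```

For this example the direct approach needs 50 * 350 = 17500 QPs on each
node, while RDS needs 7.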

Regards -- Andy