Date:	Sat, 28 Feb 2009 12:44:37 -0800
From:	Andrew Grover <andy.grover@...il.com>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Andy Grover <andy.grover@...cle.com>, netdev@...r.kernel.org,
	rds-devel@....oracle.com, general@...ts.openfabrics.org
Subject: Re: [ofa-general] [PATCH 0/26] Reliable Datagram Sockets (RDS), take 
	2

On Fri, Feb 27, 2009 at 9:56 PM, Andi Kleen <andi@...stfloor.org> wrote:
> On Fri, Feb 27, 2009 at 05:53:19PM -0800, Andrew Grover wrote:
>> On Fri, Feb 27, 2009 at 9:08 AM, Andi Kleen <andi@...stfloor.org> wrote:
>> >> This patchset against net-next adds support for RDS sockets. RDS is an
>> >> Oracle-originated protocol used to send IPC datagrams (up to 1MB)
>> >> reliably, and is used currently in Oracle RAC and Exadata products.
>> >
>> > Perhaps I missed it earlier, but what is the rationale for putting
>> > this as a socket type into the kernel? I assume they also work
>> > directly as implemented in user space using raw sockets or similar,
>> > don't they?
>>
>> You want me to implement my fancy protocol in userspace???
>
> I just asked why you're putting it in kernel space.
>
>> Do I even get to write it in C or do I need to use Ruby?
>
> Well normally people who add new subsystems to the kernel explain
> why they do that. Perhaps it's obvious to you, but at least to
> me it isn't.

Sure thing, sorry to be flippant :-)

The previous solution for IPC that Oracle was using was based on UDP,
which I think could be considered very close to using raw sockets --
each process was responsible for its own acks, retransmits, everything.
On a highly loaded machine this produced a cascade: the busier the box,
the more timeouts and retransmits, and the worse performance got.
Moving this into kernel code made a big difference.

Additionally, our interconnect is primarily InfiniBand, which natively
implements a reliable datagram connection type that RDS leverages. RDS
multiplexes all processes' traffic between two hosts over a single IB
connection. Since RDS manages IB connections at the host level (but
driven by socket traffic), it is also a more natural fit for kernel
code.
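
From the application's side, the model described above looks roughly like this hedged sketch (the address and port are invented; the AF_RDS/SOCK_SEQPACKET usage follows the in-kernel RDS socket interface, and the program simply bails out cleanly on machines without the rds module):

```c
/* rds_sketch.c -- sketch of using an RDS socket: one socket per
 * process, addressed with ordinary IP:port sockaddr_in pairs, while
 * the kernel multiplexes everything over one IB connection per host
 * pair. Address and port below are invented for illustration. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#ifndef AF_RDS
#define AF_RDS 21   /* value from linux/socket.h, for older libc headers */
#endif

int rds_sketch(void)
{
    /* RDS is datagram-style but reliable: SOCK_SEQPACKET, protocol 0. */
    int fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
    if (fd < 0) {
        /* Expected on machines without the rds module loaded. */
        perror("socket(AF_RDS)");
        return 0;
    }

    struct sockaddr_in sin;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = inet_addr("192.168.0.1");  /* invented */
    sin.sin_port = htons(4000);                      /* invented */

    /* One bind, then sendto()/recvfrom() datagrams (up to 1MB) to any
     * number of peers; the kernel, not the process, handles the acks
     * and retransmits over the shared per-host-pair IB connection. */
    if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0)
        perror("bind");

    close(fd);
    return 0;
}
```

The contrast with the UDP scheme is that none of the ack/retransmit machinery appears here at all -- the process just binds once and exchanges datagrams.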

Regards -- Andy
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
