lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPi7mHqV6Pgk5tSt+fJn_2aEZKZbO=aO78JXgw7MvDaW0musoQ@mail.gmail.com>
Date:	Thu, 18 Aug 2011 16:18:47 -0700
From:	San Mehat <san@...gle.com>
To:	Alan Cox <alan@...rguk.ukuu.org.uk>
Cc:	davem@...emloft.net, mst@...hat.com, rusty@...tcorp.com.au,
	linux-kernel@...r.kernel.org,
	virtualization@...ts.linux-foundation.org, netdev@...r.kernel.org,
	digitaleric@...gle.com, mikew@...gle.com, miche@...gle.com,
	maccarro@...gle.com
Subject: Re: [RFC 0/0] Introducing a generic socket offload framework

On Thu, Aug 18, 2011 at 3:57 PM, Alan Cox <alan@...rguk.ukuu.org.uk> wrote:
>> The Berkeley sockets coprocessor is a virtual PCI device which has the ability
>> to offload socket activity from an unmodified application at the BSD sockets
>
> Ok I think there is an important question here. Why is this being
> designed for a specific virtual interface. Unix has always had the notion
> that socket operations can be in part generic and that you can pass a
> properly designed program a socket without any notion of what it is for.

Sorry Alan if I wasn't clear, but I'm not quite sure what you're asking...

If you're asking 'why have you only spec'ed out a virtual interface
for this' then
my answer would be 'but of course you could design this in real hardware and
have a proper driver :)'. If you'd prefer that I call that out
specifically I'm happy to do so.

I have no desire to change the 'genericness' of sockets.. just the
opposite - i wish to
introduce the notion that sockets (can be) completely generic (when
offloaded) as far as
the guest is concerned.

>
>> Lastly, pushing socket processing back into the host allows for host-side
>> control of the network protocols used, which limits the potential congestion
>> problems that can arise when various guests are using their own congestion
>> control algorithms.
>
> Does that not depend which side does the congestion and who parcels out
> buffers ?

It does, and it does.

>
>> Since we wish to allow these paravirtualized sockets to coexist peacefully with
>> the existing Linux socket system, we've chosen to introduce the idea that a
>> socket can at some point transition from being managed by the O/S socket system
>> to a more enlightened 'hardware assisted' socket. The transition is managed by
>> a 'socket coprocessor' component which intercepts and gets first right of
>> refusal on handling certain global socket calls (connect, sendto, bind, etc...).
>> In this initial design, the policy on whether to transition a socket or not is
>> made by the virtual hardware, although we understand that further measurement
>> into operation latency is warranted.
>
> Q: whay happens about in process socket syscalls in another thread ?
> Thats always been the ugly in these cases either by intercepting or by
> swapping file operations on an object.
>
>>  * SOCK_HWASSIST
>>     Indicates socket operations are handled by hardware
>
> This guest only view means you can't use the abstraction for local
> sockets too.
>

To be honest, the way we're attempting to integrate is in such a way
that you *could*
offload AF_LOCAL sockets...  but that world gets a bit too much like
the 'Twilight Zone'
for my current linkings..

>> In order to support a variety of socket address families, addresses are
>> converted from their native socket family to an opaque string. Our initial
>> design formats these strings as URIs. The currently supported conversions are:
>
> That makes a lot of sense to me, because its a well understood
> abstraction and you can offload other stuff to this kind of generic
> socket including things like http protocol acceleration, SSL and so on.
>
> Plus its always been annoying that you can't open a socket, but a URI
> interface solves that...

Indeed.

>
>>  * We don't handle SOCK_SEQPACKET, SOCK_RAW, SOCK_RDM, or SOCK_PACKET sockets.
>
> But there is no reason SEQPACKET and RDM couldn't be added I assume?

No reason I can think of - we just did not have a specific requirement
for it at the time.

>
> Ok other questions
>
> Suppose instead you just add an abstracted socket interface of
>
>        AF_SOMETHING, PF_URI

Mike Waychison and I were saving the 'PF_URI' discussion for a future
date, but indeed
we're on the same wave-length :). Our initial requirements are for an
'extremely minimal
burden of support' on the userspace environments, so we decided to
open up a separate
discussion on PF_URI

>
> it would be easy to convert programs. It would be easier to write
> properly generic programs. It would be easy write some small helpers that
> are a good deal less insane than the existing inet ones. At that point
> you could turn the problem on its head. Instead of 'borrowing' sockets
> for a fairly specific concept of hw assist you ask the reverse question,
> who can accelerate this URI be it some kind of virtual machine interface,
> something funky like raw data over infiniband, or plain old 'use the
> TCP/IP stack'.

Completely agree.

>
> Your decision making code is going to be interesting but it only has to
> make the decision once in simple cases.

Yup.

>
> And yes there is still the complicated cases such as 'the routing table
> has changed from vitual host to via siberia now what' but I don't believe
> your proposal addresses that either.

Can you be more specific? If you mean solving the 'keeping your tcp connections
open to non virtual endpoints across a migration (or whatever)' then
no it doesn't :)

>
> Alan
>

Thanks man,

-san


-- 
San Mehat | Staff Software Engineer | san@...gle.com | 415-366-6172
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ