Date:	Tue, 07 Oct 2014 01:26:11 +0200
From:	Hannes Frederic Sowa <hannes@...essinduktion.org>
To:	John Fastabend <john.r.fastabend@...el.com>
Cc:	Daniel Borkmann <dborkman@...hat.com>,
	John Fastabend <john.fastabend@...il.com>,
	Jesper Dangaard Brouer <jbrouer@...hat.com>,
	"John W. Linville" <linville@...driver.com>,
	Neil Horman <nhorman@...driver.com>,
	Florian Westphal <fw@...len.de>, gerlitz.or@...il.com,
	netdev@...r.kernel.org, john.ronciak@...el.com, amirv@...lanox.com,
	eric.dumazet@...il.com, danny.zhou@...el.com,
	Willem de Bruijn <willemb@...gle.com>
Subject: Re: [net-next PATCH v1 1/3] net: sched: af_packet support for direct
 ring access

Hi John,

On Mon, Oct 6, 2014, at 22:37, John Fastabend wrote:
> > I find the six additional ndo ops a bit worrisome as we are adding more
> > and more subsystem-specific ndo ops to this struct. I would like to see
> > some unification here, but currently cannot make concrete proposals,
> > sorry.
> 
> I agree it seems like a bit much. One thought was to split the ndo
> ops into categories. Switch ops, MACVLAN ops, basic ops and with this
> userspace queue ops. This sort of goes along with some of the switch
> offload work which is going to add a handful more ops as best I can
> tell.

Thanks for your mail, you answered all of my questions.

Have you looked at <https://code.google.com/p/kernel/wiki/ProjectUnetq>?
Willem (also in Cc) used sysfs files which get mmapped to represent the
tx/rx descriptors. The representation was independent of the device and,
IIRC, the prototype used a write(fd, "", 1) to signal the kernel that it
should proceed with tx. I agree, it would be great to be syscall-free
here.

The semantics of the descriptors could also easily be exported as files
in sysfs. I am thinking of something like what tracepoints already do to
describe the layout of the data in the ring buffer for each event:

-- >8 --
# cat /sys/kernel/debug/tracing/events/net/net_dev_queue/format 
name: net_dev_queue
ID: 1006
format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;

	field:void * skbaddr;	offset:8;	size:8;	signed:0;
	field:unsigned int len;	offset:16;	size:4;	signed:0;
	field:__data_loc char[] name;	offset:20;	size:4;	signed:1;

print fmt: "dev=%s skbaddr=%p len=%u", __get_str(name), REC->skbaddr, REC->len
-- >8 --
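
To give an idea of how little a user space consumer would need to do
with such a file, here is a minimal Python sketch that parses the field
lines of the format shown above. The names Field and parse_format are
made up for illustration and are not taken from perf or any existing
library; the sketch only assumes the "field:...; offset:...; size:...;
signed:...;" syntax visible in the output.

```python
import re
from typing import List, NamedTuple


class Field(NamedTuple):
    """One entry of a tracepoint-style format description (hypothetical type)."""
    decl: str     # C declaration, e.g. "unsigned int len"
    offset: int   # byte offset inside the record
    size: int     # size in bytes
    signed: bool  # True if the value is signed


# Matches: field:<decl>; offset:<n>; size:<n>; signed:<0|1>;
FIELD_RE = re.compile(
    r"field:(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);\s*"
    r"size:(?P<size>\d+);\s*signed:(?P<signed>[01]);"
)


def parse_format(text: str) -> List[Field]:
    """Extract all field descriptions from the text of a format file."""
    # Collapse whitespace first so that wrapped lines still match.
    flat = " ".join(text.split())
    return [
        Field(m["decl"].strip(), int(m["offset"]), int(m["size"]),
              m["signed"] == "1")
        for m in FIELD_RE.finditer(flat)
    ]


SAMPLE = """
name: net_dev_queue
ID: 1006
format:
    field:unsigned short common_type;  offset:0;  size:2;  signed:0;
    field:unsigned char common_flags;  offset:2;  size:1;  signed:0;
    field:unsigned char common_preempt_count;  offset:3;  size:1;  signed:0;
    field:int common_pid;  offset:4;  size:4;  signed:1;

    field:void * skbaddr;  offset:8;  size:8;  signed:0;
    field:unsigned int len;  offset:16;  size:4;  signed:0;
    field:__data_loc char[] name;  offset:20;  size:4;  signed:1;
"""

fields = parse_format(SAMPLE)
for f in fields:
    print(f)
```

A descriptor format exported per queue could be consumed the same way,
with the device-specific part reduced to the contents of the file.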

Maybe the macros from tracing (TP_STRUCT__entry) are reusable, although
e.g. endianness would need to be added. Hopefully there is already a
user space parser somewhere in the perf sources. A binary representation
that is easier to parse could be added without much effort, and maybe
even something vDSO-like if people care about that.
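
As a rough illustration of why the byte order needs to be part of the
description, the following Python sketch decodes the fixed-size fields of
a record laid out like the net_dev_queue format. The LAYOUT table, the
decode helper and the sample record are assumptions made for this
example; the byte order is passed in explicitly because the current
format file does not state it.

```python
import struct

# (name, offset, size, signed) as listed in the net_dev_queue format file.
# The variable-length "name" field is left out of this sketch.
LAYOUT = [
    ("common_type", 0, 2, False),
    ("common_flags", 2, 1, False),
    ("common_preempt_count", 3, 1, False),
    ("common_pid", 4, 4, True),
    ("skbaddr", 8, 8, False),
    ("len", 16, 4, False),
]


def decode(record: bytes, byteorder: str = "little") -> dict:
    """Decode the fixed-size fields of one record using LAYOUT."""
    return {
        name: int.from_bytes(record[off:off + size], byteorder, signed=signed)
        for name, off, size, signed in LAYOUT
    }


# Made-up little-endian record: type 1006, pid 1234, 60 byte packet.
record = struct.pack("<HBBiQI", 1006, 0, 0, 1234, 0xFFFF880012345678, 60)

print(decode(record))          # fields read with the matching byte order
print(decode(record, "big"))   # same bytes misread with the wrong byte order
```

Without the byte order in the description, a consumer on the other side
has no way to tell which of the two interpretations is the intended one.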

Maybe this open/mmap per queue also kills some of the ndo_ops?

Bye,
Hannes