Message-ID: <20160627131158.GE2058@nanopsycho.orion>
Date:	Mon, 27 Jun 2016 15:11:58 +0200
From:	Jiri Pirko <jiri@...nulli.us>
To:	Roopa Prabhu <roopa@...ulusnetworks.com>
Cc:	Anuradha Karuppiah <anuradhak@...ulusnetworks.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"davem@...emloft.net" <davem@...emloft.net>,
	Nogah Frankel <nogahf@...lanox.com>,
	Ido Schimmel <idosch@...lanox.com>,
	Elad Raz <eladr@...lanox.com>,
	Yotam Gigi <yotamg@...lanox.com>,
	Or Gerlitz <ogerlitz@...lanox.com>,
	Nikolay Aleksandrov <nikolay@...ulusnetworks.com>,
	John Linville <linville@...driver.com>,
	Thomas Graf <tgraf@...g.ch>,
	Andy Gospodarek <gospo@...ulusnetworks.com>,
	Scott Feldman <sfeldma@...il.com>, sd@...asysnail.net,
	eranbe@...lanox.com, Alexei Starovoitov <ast@...mgrid.com>,
	Eric Dumazet <edumazet@...gle.com>,
	"hannes@...essinduktion.org" <hannes@...essinduktion.org>,
	Florian Fainelli <f.fainelli@...il.com>,
	David Ahern <dsa@...ulusnetworks.com>
Subject: Re: [patch net-next v5 0/4] return offloaded stats as default and
 expose original sw stats

Mon, Jun 27, 2016 at 03:02:31PM CEST, roopa@...ulusnetworks.com wrote:
>[resending ...my previous reply sent some non-text content]
>
>On Sun, Jun 26, 2016 at 11:51 PM, Jiri Pirko <jiri@...nulli.us> wrote:
>>
>> Mon, Jun 27, 2016 at 04:53:53AM CEST, roopa@...ulusnetworks.com wrote:
>> >On Sun, Jun 26, 2016 at 11:15 AM, Jiri Pirko <jiri@...nulli.us> wrote:
>> >> Sun, Jun 26, 2016 at 08:06:40PM CEST, roopa@...ulusnetworks.com wrote:
>> >>>On 6/26/16, 2:33 AM, Jiri Pirko wrote:
>> >>>> Sat, Jun 25, 2016 at 05:50:59PM CEST, roopa@...ulusnetworks.com wrote:
>> >>>>> On Thu, Jun 23, 2016 at 8:40 AM, Jiri Pirko <jiri@...nulli.us> wrote:
>> >>>>>> Thu, Jun 23, 2016 at 05:11:26PM CEST, anuradhak@...ulusnetworks.com wrote:
>> >>>>>>>>>>> we can't separate CPU and HW stats there. In some cases (or on some
>> >>>>>>>>>>> ASICs) HW counters do not include CPU-generated packets, so you will
>> >>>>>>>>>>> have to add CPU-generated packet counters to the hw counters for
>> >>>>>>>>>>> such virtual device stats.
>> >>>>>>>>>> Can you please provide an example of how that could happen?
>> >>>>>>>>> An example is the bridge vlan stats I mention below. These are
>> >>>>>>>>> usually counted by attaching hw virtual counter resources. And
>> >>>>>>>>> CPU-generated packets in some cases may be set up to bypass the ASIC
>> >>>>>>>>> pipeline because the CPU has already made the required decisions.
>> >>>>>>>>> So, they may not be counted by such hw virtual counters.
>> >>>>>>>> Bypass ASIC? How do the packets get on the wire?
>> >>>>>>>>
>> >>>>>>> Bypass the "forwarding pipeline" in the ASIC, that is. Obviously the
>> >>>>>>> ASIC ships the CPU-generated packet out of the switch/front-panel
>> >>>>>>> port. Continuing Roopa's example of vlan netdev stats: to get the
>> >>>>>>> HW stats, counters are typically tied to the ingress and egress vlan
>> >>>>>>> hw entries. All incoming packets are subject to the ingress vlan
>> >>>>>>> lookup, irrespective of whether they get punted to the CPU or
>> >>>>>>> forwarded to another front-panel port. In that case the ingress HW
>> >>>>>>> stats do represent all packets. However, for CPU-originated packets
>> >>>>>>> the egress vlan lookup is bypassed in the ASIC (this is a common
>> >>>>>>> forwarding option in most ASICs) and the packet is shipped as-is out
>> >>>>>>> of the front-panel port specified by the CPU. This means these
>> >>>>>>> packets will NOT be counted against the egress VLAN HW counter;
>> >>>>>>> hence the need for summation.
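
(To make the summation concrete: a minimal sketch, using hypothetical
foo_* helpers and cpu_* counters that are not from any real driver, of
how a driver could fold CPU-originated packets into the egress vlan hw
counters before reporting them:)

static void foo_vlan_get_tx_stats(const struct foo_vlan *vlan,
                                  struct rtnl_link_stats64 *stats)
{
        /* packets counted by the egress vlan hw entry */
        stats->tx_packets = foo_hw_vlan_tx_packets(vlan);
        stats->tx_bytes = foo_hw_vlan_tx_bytes(vlan);

        /* plus CPU-generated packets, which bypassed the egress vlan
         * lookup in the ASIC and are missing from the hw counter
         */
        stats->tx_packets += vlan->cpu_tx_packets;
        stats->tx_bytes += vlan->cpu_tx_bytes;
}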
>> >>>>>> The driver will know about this, and will provide the stats
>> >>>>>> accordingly to the core. Who else but the driver should resolve this?
>> >>>>>>
>> >>>>> The point was/is that there should be only two categories:
>> >>>>> 1) the base default stats: can contain 'only sw', 'only hw' or 'a
>> >>>>> summation of hw and sw' in some cases.
>> >>>>> The user does not care about the breakdown.
>> >>>>>
>> >>>>> 2) everything else falls into the second category: a driver-provided
>> >>>>> breakdown of stats for easier debugging.
>> >>>>> Today this is ethtool stats, and we can have an equivalent nested
>> >>>>> attribute for this in the new stats api.
>> >>>>> Let's call it IFLA_STATS_LINK_DRIVER, or you pick a name. Let's make
>> >>>>> it nested and extensible (like ethtool is), and the driver can expose
>> >>>>> any kind of stats there.
>> >>>>> i.e. let's move the stats you are proposing to this category of
>> >>>>> stats..... instead of introducing a third category, 'SW stats'.
>> >>>> What you are proposing is essentially what our patchset does. We expose
>> >>>> 2 sets of stats: hw and pure sw. hw includes all; the driver will take
>> >>>> care of it because it knows what is going on in hw.
>> >>>the splitting into hw and sw is causing some confusion with respect
>> >>
>> >> I still don't get why you are talking about a split :( I see no split.
>> >>
>> >>
>> >>>to existing stats and will be confusing for future stats. And I am not
>> >>>sure how many users would prefer the split this way.
>> >>>So, instead of doing the split, I think we should at this time introduce
>> >>>driver-specific stats (like ethtool) as a nested netlink attribute.
>> >>
>> >> The default netlink stats should be hw (or accumulated as you call it).
>> >> The reason is to avoid confusion for existing apps. Another attribute is
>> >> possible for more break-out stats - that is what this patchset is doing.
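
(For context, a minimal userspace sketch of the new stats api referenced
throughout this thread: RTM_GETSTATS with struct if_stats_msg, requesting
the default 64-bit counters of one device. Assumes <linux/if_link.h>,
<linux/rtnetlink.h>, an open NETLINK_ROUTE socket fd, and the ifindex of
the device of interest:)

struct {
        struct nlmsghdr nlh;
        struct if_stats_msg ifsm;
} req = {
        .nlh = {
                .nlmsg_len = NLMSG_LENGTH(sizeof(struct if_stats_msg)),
                .nlmsg_type = RTM_GETSTATS,
                .nlmsg_flags = NLM_F_REQUEST,
        },
        .ifsm = {
                .ifindex = ifindex,
                /* ask only for the default stats block; the reply
                 * carries an IFLA_STATS_LINK_64 attribute holding a
                 * struct rtnl_link_stats64
                 */
                .filter_mask = IFLA_STATS_FILTER_BIT(IFLA_STATS_LINK_64),
        },
};

send(fd, &req, req.nlh.nlmsg_len, 0);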
>> >>
>> >> Ethtool stats are wrong and useless for apps as they are driver-specific.
>> >
>> >Apps only care about overall stats. That's the aggregate stats
>> >provided by the default netlink netdev api to the user... which already
>> >exists.
>> >
>> >They don't care about your new breakdown either.
>>
>> Agreed. That is what our patchset is doing.
>>
>>
>> >
>> >Breakdowns of stats are used for debugging, and that's what ethtool
>> >stats provide.
>> >
>> >
>> >
>> >>
>> >>
>> >>>>
>> >>>> Btw, mirroring random string stats into netlink is not a good idea IMO.
>> >>>Any reason you say that? I am thinking it would be much easier with
>> >>>netlink. Keeping it simple, it is a nested attribute with stat-name and
>> >>>value pairs.
>> >>>
>> >>>#define STATS_NAME_LEN 32
>> >>>
>> >>>struct stat {
>> >>>    char stats_name[STATS_NAME_LEN];    /* null-terminated counter name */
>> >>>    __u64 stat;                         /* counter value */
>> >>>};
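
(A minimal kernel-side sketch of how such name/value pairs could be
nested, using the standard nla_* helpers. IFLA_STATS_LINK_DRIVER is the
hypothetical attribute name proposed above, and foo_fill_driver_stats is
equally hypothetical:)

static int foo_fill_driver_stats(struct sk_buff *skb,
                                 const struct stat *stats, int count)
{
        struct nlattr *nest;
        int i;

        nest = nla_nest_start(skb, IFLA_STATS_LINK_DRIVER);
        if (!nest)
                return -EMSGSIZE;

        /* one attribute per named counter */
        for (i = 0; i < count; i++)
                if (nla_put_64bit(skb, i + 1, sizeof(stats[i]),
                                  &stats[i], IFLA_STATS_UNSPEC))
                        goto nla_put_failure;

        nla_nest_end(skb, nest);
        return 0;

nla_put_failure:
        nla_nest_cancel(skb, nest);
        return -EMSGSIZE;
}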
>> >>
>> >> No, please. This should be a well-defined, generic group of stats.
>> >> Driver-specific names/stats are wrong.
>> >>
>> >
>> >They are meant for debugging. Are you saying the new stats api should
>> >not contain 'ethtool-like' stats?
>> >
>> >Ethtool stats are very valuable today. They are extensible.
>> >They cannot be made generic; they are specific to particular hardware or
>> >a use case.
>> >
>> >We use it for our switch port stats too. Base aggregate stats are
>> >summed up and provided as default netdev stats; via ethtool we provide
>> >a lot more hardware-specific breakdown.
>>
>> Leave it in ethtool then. I really think it is not a good idea to put
>> random named stats in netlink.
>
>
>
>The point of the stats api is to bring all stats together.
>Ethtool stats are very valuable, so there is no point leaving them out.
>They will be in here sooner or later.
>
>When there is already a discussion of moving ethtool to netlink, I am
>not sure I understand the hesitation on moving ethtool stats to netlink.

It's not well defined. That's the issue. It's "debugfs"...


>
>>
>>
>> This patchset uses well-defined values for slowpath (/sw/cpu) stats.
>> That is, I believe, the only way to do this.
>>
>
>It seems to be your preferred way. I don't think it is the only way to
>do it. It can very well be part of the nested breakdown stats I am
>talking about.

We are using an existing well-defined structure. Very easy and clean.
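
(For reference: the existing well-defined structure meant here is
presumably struct rtnl_link_stats64 from include/uapi/linux/if_link.h.
A minimal sketch, with hypothetical foo_*/cpu_* names, of a driver
reporting its slowpath/sw view in that same layout:)

/* Report only traffic that actually traversed the kernel
 * (slowpath/cpu). Reusing the standard layout means userspace
 * needs no driver-specific parsing.
 */
static void foo_get_sw_stats64(struct net_device *dev,
                               struct rtnl_link_stats64 *stats)
{
        struct foo_port *port = netdev_priv(dev);

        stats->rx_packets = port->cpu_rx_packets;
        stats->rx_bytes = port->cpu_rx_bytes;
        stats->tx_packets = port->cpu_tx_packets;
        stats->tx_bytes = port->cpu_tx_bytes;
}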
