Message-ID: <20160609123733.GB6305@linutronix.de>
Date:	Thu, 9 Jun 2016 14:37:33 +0200
From:	Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To:	Alison Chaiken <alison@...oton-tech.com>
Cc:	Steven Rostedt <rostedt@...dmis.org>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-rt-users <linux-rt-users@...r.kernel.org>,
	netdev <netdev@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Clark Williams <williams@...hat.com>,
	Eric Dumazet <eric.dumazet@...il.com>,
	David Miller <davem@...emloft.net>
Subject: Re: [PATCH][RT] netpoll: Always take poll_lock when doing polling

* Alison Chaiken | 2016-06-07 17:19:43 [-0700]:

>Sorry to be obscure; I had applied that patch to v4.1.6-rt5.

Using the latest is often not a bad choice compared to the random tree
you have here.

>> What I remember from testing the two patches on am335x was that before
>> them a ping flood on gbit froze the serial console, but with them the
>> ping flood was not noticed.
>
>I compiled a kernel from upstream d060a36 "Merge branch
>'ti-linux-4.1.y' of git.ti.com:ti-linux-kernel/ti-linux-kernel into
>ti-rt-linux-4.1.y" which is unpatched except for using a
>board-appropriate device-tree.    The serial console is responsive
>with all our RT userspace applications running alongside a rapid
>external ping.   However, our main event loop misses frequently as
>soon as ping faster than 'ping -i 0.0002' is run.    mpstat shows that
>the sum of the hard IRQ rates in a second is equal precisely to the
>NET_RX rate, which is ~3400/s.   Does the fact that 3400 < (1/0.0002)
>already mean that some packets are dropped?   ftrace shows that

Not necessarily. The ping command reports how many packets were
received. It is possible that the sender was not able to send that many
packets _or_ that the receiver was able to process more than one packet
during a single interrupt.
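
As a rough sanity check on the numbers quoted above, the arithmetic can be sketched as follows. This is only an illustration: it assumes the sender really achieves the requested 'ping -i 0.0002' interval (which may not be the case), and the function name is made up for this example.

```python
def packets_per_interrupt(ping_interval_s, irq_rate_per_s):
    """Average packets each interrupt must cover for nothing to be dropped.

    Assumption: the sender actually emits one packet every ping_interval_s
    seconds. If it cannot keep up, the real figure is lower.
    """
    requested_rate = 1.0 / ping_interval_s  # packets per second asked of ping
    return requested_rate / irq_rate_per_s


# Values taken from the quoted report: 'ping -i 0.0002' and ~3400 NET_RX/s.
ratio = packets_per_interrupt(0.0002, 3400)
print(f"about {ratio:.2f} packets per interrupt would be needed")
```

A ratio above 1 does not by itself mean loss: with NAPI a single interrupt can be followed by a poll that picks up several packets. The packet counts that ping prints at the end are the more direct indication.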

>cpsw_rx_poll() is called even when there is essentially no network
>traffic, so I'm not sure how to tell if NAPI is working as intended.

You should see an invocation of __raise_softirq_irqoff_ksoft() and then
cpsw's poll function should run in "ksoftirqd/" context instead of in
the context of the task it runs in now.
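
One way to check this in a saved trace is to count which task each cpsw_rx_poll entry is attributed to. The sketch below is a hypothetical helper, not an existing tool. It assumes the usual text layout of ftrace output, where each line starts with "TASK-PID [CPU]", and the sample lines are invented for illustration rather than captured from a real system.

```python
import re
from collections import Counter

# Start of a text-format ftrace line: "<comm>-<pid>   [<cpu>] ..."
# The greedy comm group copes with task names that contain '-'.
TRACE_LINE = re.compile(r"^\s*(?P<comm>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]")


def contexts_for(function_name, trace_lines):
    """Count, per task name, the trace lines that mention function_name."""
    counts = Counter()
    for line in trace_lines:
        if line.lstrip().startswith("#"):  # skip trace header comments
            continue
        if function_name not in line:
            continue
        match = TRACE_LINE.match(line)
        if match:
            counts[match.group("comm").strip()] += 1
    return counts


# Invented sample lines, only to show the expected shape of the input.
sample = [
    "# tracer: function",
    "     ksoftirqd/0-3     [000] ....   100.000001: cpsw_rx_poll <-net_rx_action",
    "     ksoftirqd/0-3     [000] ....   100.000150: cpsw_rx_poll <-net_rx_action",
    "  irq/40-example-77    [000] ....   100.000300: cpsw_rx_poll <-net_rx_action",
    "          <idle>-0     [000] ....   100.000400: schedule <-cpu_idle",
]

counts = contexts_for("cpsw_rx_poll", sample)
for comm, n in counts.most_common():
    print(f"{comm}: {n}")
```

If most of the hits land on a "ksoftirqd/N" task, the poll function is being deferred as described. Hits on other tasks mean it is still running in their context.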

>I tried running the wakeup_rt tracer, but it loads the system too
>much.     With ftrace capturing IRQ, scheduler and net events, we're
>writing out markers into the trace buffer when the event loop makes
>its deadline and then when it misses so that we can compare the normal
>and long-latency intervals, but there doesn't appear to be a smoking
>gun in the difference between the two.

You would need to figure out what adds the latency. My understanding is
that your RT application is doing CAN traffic and is not meeting the
deadline. So you drop CAN packets in the end?

>Thanks for all your help,
>Alison

Sebastian
