lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 8 Nov 2018 21:56:37 -0500
From:   Genevieve Bastien <gbastien@...satic.net>
To:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:     "David S. Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>, rostedt <rostedt@...dmis.org>,
        Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH] net: Add trace events for all receive exit points

On 2018-11-08 03:25 PM, Mathieu Desnoyers wrote:
> ----- On Nov 8, 2018, at 2:56 PM, Geneviève Bastien gbastien@...satic.net wrote:
>
>> Trace events are already present for the receive entry points, to indicate
>> how the reception entered the stack.
>>
>> This patch adds the corresponding exit trace events that will bound the
>> reception such that all events occurring between the entry and the exit
>> can be considered as part of the reception context. This greatly helps
>> for dependency and root cause analyses.
>>
>> Without this, it is impossible to determine whether a sched_wakeup
>> event following a netif_receive_skb event is the result of the packet
>> reception or a simple coincidence after further processing by the
>> thread.
> As a clarification: it is indeed not possible with tracepoint instrumentation,
> which I think is your point here. It might be possible to do so by using other
> mechanisms like kretprobes, but considering that the "entry" point was deemed
> important enough to have a tracepoint, it would be good to add the matching exit
> events.
>
> Being able to link the packet reception entry/exit with wakeups is key to
> allow tools to perform automated network stack latency analysis, so I think
> the use case justifies adding the missing "exit" events.
Thanks for the precision. Indeed so far, kretprobes have been used as a
workaround, but it is harder to setup and has quite more overhead. After
seeing those "entry" points, I thought corresponding "exits" would be
the best candidate to encapsulate the "network reception" context.

For an example of dependency and critical path analysis of 'wget', see
this screenshot here, arranged from Trace Compass:
https://framapic.org/DgSdNwiuymib/PDPdHJBGCJGR.png

The top view shows the dependency analysis without the "exit" events:
the wakeup is not associated with the packet reception, so the source is
considered to be the process that was running at that time, an IRQ
thread. So wget was blocked by the iwlwifi IRQ thread, who was himself
simply woken up by an hardware interrupt, while in fact, wget was
waiting for the network.

The bottom view shows the dependency analysis with the "exit" events:
because the wakeup happens between the "entry" and "exit", we know the
packet reception is the source of the wakeup and if we happen to know
where that packet came from, we can follow the dependency to the packet
source. So wget was blocked waiting for a network request to another
machine which sent back the reply and we clearly see the time spent on
the other machine and the network latency during this blockage.

I hope this kind of possibilities for analyses justify adding those
tracepoints.

Thanks,
Geneviève


>
> It might be great if you can provide a glimpse of the wakeup dependency chain
> analysis prototypes you have created, which rely on this new event, in order
> to justify adding it.
>
> Thanks,
>
> Mathieu
>
>> Signed-off-by: Geneviève Bastien <gbastien@...satic.net>
>> CC: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
>> CC: Steven Rostedt <rostedt@...dmis.org>
>> CC: Ingo Molnar <mingo@...hat.com>
>> CC: David S. Miller <davem@...emloft.net>
>> ---
>> include/trace/events/net.h | 59 ++++++++++++++++++++++++++++++++++++++
>> net/core/dev.c             | 30 ++++++++++++++++---
>> 2 files changed, 85 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/trace/events/net.h b/include/trace/events/net.h
>> index 00aa72ce0e7c..318307511018 100644
>> --- a/include/trace/events/net.h
>> +++ b/include/trace/events/net.h
>> @@ -117,6 +117,23 @@ DECLARE_EVENT_CLASS(net_dev_template,
>> 		__get_str(name), __entry->skbaddr, __entry->len)
>> )
>>
>> +DECLARE_EVENT_CLASS(net_dev_template_simple,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(void *,	skbaddr)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->skbaddr = skb;
>> +	),
>> +
>> +	TP_printk("skbaddr=%p", __entry->skbaddr)
>> +)
>> +
>> DEFINE_EVENT(net_dev_template, net_dev_queue,
>>
>> 	TP_PROTO(struct sk_buff *skb),
>> @@ -244,6 +261,48 @@ DEFINE_EVENT(net_dev_rx_verbose_template,
>> netif_rx_ni_entry,
>> 	TP_ARGS(skb)
>> );
>>
>> +DEFINE_EVENT(net_dev_template_simple, napi_gro_frags_exit,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb)
>> +);
>> +
>> +DEFINE_EVENT(net_dev_template_simple, napi_gro_receive_exit,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb)
>> +);
>> +
>> +DEFINE_EVENT(net_dev_template_simple, netif_receive_skb_exit,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb)
>> +);
>> +
>> +DEFINE_EVENT(net_dev_template_simple, netif_receive_skb_list_exit,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb)
>> +);
>> +
>> +DEFINE_EVENT(net_dev_template_simple, netif_rx_exit,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb)
>> +);
>> +
>> +DEFINE_EVENT(net_dev_template_simple, netif_rx_ni_exit,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb)
>> +);
>> +
>> #endif /* _TRACE_NET_H */
>>
>> /* This part must be outside protection */
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 0ffcbdd55fa9..e670ca27e829 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -4520,9 +4520,14 @@ static int netif_rx_internal(struct sk_buff *skb)
>>
>> int netif_rx(struct sk_buff *skb)
>> {
>> +	int ret;
>> +
>> 	trace_netif_rx_entry(skb);
>>
>> -	return netif_rx_internal(skb);
>> +	ret = netif_rx_internal(skb);
>> +	trace_netif_rx_exit(skb);
>> +
>> +	return ret;
>> }
>> EXPORT_SYMBOL(netif_rx);
>>
>> @@ -4537,6 +4542,7 @@ int netif_rx_ni(struct sk_buff *skb)
>> 	if (local_softirq_pending())
>> 		do_softirq();
>> 	preempt_enable();
>> +	trace_netif_rx_ni_exit(skb);
>>
>> 	return err;
>> }
>> @@ -5222,9 +5228,14 @@ static void netif_receive_skb_list_internal(struct
>> list_head *head)
>>  */
>> int netif_receive_skb(struct sk_buff *skb)
>> {
>> +	int ret;
>> +
>> 	trace_netif_receive_skb_entry(skb);
>>
>> -	return netif_receive_skb_internal(skb);
>> +	ret = netif_receive_skb_internal(skb);
>> +	trace_netif_receive_skb_exit(skb);
>> +
>> +	return ret;
>> }
>> EXPORT_SYMBOL(netif_receive_skb);
>>
>> @@ -5247,6 +5258,8 @@ void netif_receive_skb_list(struct list_head *head)
>> 	list_for_each_entry(skb, head, list)
>> 		trace_netif_receive_skb_list_entry(skb);
>> 	netif_receive_skb_list_internal(head);
>> +	list_for_each_entry(skb, head, list)
>> +		trace_netif_receive_skb_list_exit(skb);
>> }
>> EXPORT_SYMBOL(netif_receive_skb_list);
>>
>> @@ -5634,12 +5647,17 @@ static gro_result_t napi_skb_finish(gro_result_t ret,
>> struct sk_buff *skb)
>>
>> gro_result_t napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
>> {
>> +	gro_result_t ret;
>> +
>> 	skb_mark_napi_id(skb, napi);
>> 	trace_napi_gro_receive_entry(skb);
>>
>> 	skb_gro_reset_offset(skb);
>>
>> -	return napi_skb_finish(dev_gro_receive(napi, skb), skb);
>> +	ret = napi_skb_finish(dev_gro_receive(napi, skb), skb);
>> +	trace_napi_gro_receive_exit(skb);
>> +
>> +	return ret;
>> }
>> EXPORT_SYMBOL(napi_gro_receive);
>>
>> @@ -5753,6 +5771,7 @@ static struct sk_buff *napi_frags_skb(struct napi_struct
>> *napi)
>>
>> gro_result_t napi_gro_frags(struct napi_struct *napi)
>> {
>> +	gro_result_t ret;
>> 	struct sk_buff *skb = napi_frags_skb(napi);
>>
>> 	if (!skb)
>> @@ -5760,7 +5779,10 @@ gro_result_t napi_gro_frags(struct napi_struct *napi)
>>
>> 	trace_napi_gro_frags_entry(skb);
>>
>> -	return napi_frags_finish(napi, skb, dev_gro_receive(napi, skb));
>> +	ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb));
>> +	trace_napi_gro_frags_exit(skb);
>> +
>> +	return ret;
>> }
>> EXPORT_SYMBOL(napi_gro_frags);
>>
>> --
>> 2.19.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ