lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 29 Jun 2018 10:48:34 -0700
From:   Jesus Sanchez-Palencia <jesus.sanchez-palencia@...el.com>
To:     Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        jan.altenberg@...utronix.de,
        Vinicius Gomes <vinicius.gomes@...el.com>,
        kurt.kanzenbach@...utronix.de, Henrik Austad <henrik@...tad.us>,
        Richard Cochran <richardcochran@...il.com>,
        ilias.apalodimas@...aro.org, ivan.khoronzhuk@...aro.org,
        Miroslav Lichvar <mlichvar@...hat.com>,
        Willem de Bruijn <willemb@...gle.com>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Cong Wang <xiyou.wangcong@...il.com>,
        Jiří Pírko <jiri@...nulli.us>
Subject: Re: [PATCH v1 net-next 14/14] net/sched: Make etf report drops on
 error_queue

Hi Willem,


On 06/28/2018 07:27 AM, Willem de Bruijn wrote:

(...)

> 
>>  struct sock_txtime {
>>         clockid_t       clockid;        /* reference clockid */
>> -       u16             flags;          /* bit 0: txtime in deadline_mode */
>> +       u16             flags;          /* bit 0: txtime in deadline_mode
>> +                                        * bit 1: report drops on sk err queue
>> +                                        */
>>  };
> 
> If this is shared with userspace, should be defined in an uapi header.
> Same on the flag bits below. Self documenting code is preferable over
> comments.


Fixed for v2.


> 
>>  /*
>> diff --git a/include/net/sock.h b/include/net/sock.h
>> index 73f4404e49e4..e681a45cfe7e 100644
>> --- a/include/net/sock.h
>> +++ b/include/net/sock.h
>> @@ -473,6 +473,7 @@ struct sock {
>>         u16                     sk_clockid;
>>         u16                     sk_txtime_flags;
>>  #define SK_TXTIME_DEADLINE_MASK        BIT(0)
>> +#define SK_TXTIME_RECV_ERR_MASK        BIT(1)
> 
> Integer bitfields are (arguably) more readable. There is no requirement
> that the user interface be the same as the in-kernel implementation. Indeed
> if you can save bits in struct sock, that is preferable (but not so for the ABI,
> which cannot easily be extended).


Sure, changed for v2.

(...)


>> diff --git a/net/sched/sch_etf.c b/net/sched/sch_etf.c
>> index 5514a8aa3bd5..166f4b72875b 100644
>> --- a/net/sched/sch_etf.c
>> +++ b/net/sched/sch_etf.c
>> @@ -11,6 +11,7 @@
>>  #include <linux/kernel.h>
>>  #include <linux/string.h>
>>  #include <linux/errno.h>
>> +#include <linux/errqueue.h>
>>  #include <linux/rbtree.h>
>>  #include <linux/skbuff.h>
>>  #include <linux/posix-timers.h>
>> @@ -124,6 +125,35 @@ static void reset_watchdog(struct Qdisc *sch)
>>         qdisc_watchdog_schedule_ns(&q->watchdog, ktime_to_ns(next));
>>  }
>>
>> +static void report_sock_error(struct sk_buff *skb, u32 err, u8 code)
>> +{
>> +       struct sock_exterr_skb *serr;
>> +       ktime_t txtime = skb->tstamp;
>> +
>> +       if (!skb->sk || !(skb->sk->sk_txtime_flags & SK_TXTIME_RECV_ERR_MASK))
>> +               return;
>> +
>> +       skb = skb_clone_sk(skb);
>> +       if (!skb)
>> +               return;
>> +
>> +       sock_hold(skb->sk);
> 
> Why take an extra reference? The skb holds a ref on the sk.


Yes, the cloned skb holds a ref on the socket, but the documentation of
skb_clone_sk() makes this explicit suggestion:

(...)
 * When passing buffers allocated with this function to sock_queue_err_skb
 * it is necessary to wrap the call with sock_hold/sock_put in order to
 * prevent the socket from being released prior to being enqueued on
 * the sk_error_queue.
 */

which I believe is here just so we are protected against a possible race after
skb_orphan() is called from sock_queue_err_skb(). Please let me know if I'm
misreading anything.

And for v2 I will move the sock_hold() call to immediately before the
sock_queue_err_skb() to avoid any future confusion.



> 
>> +
>> +       serr = SKB_EXT_ERR(skb);
>> +       serr->ee.ee_errno = err;
>> +       serr->ee.ee_origin = SO_EE_ORIGIN_LOCAL;
> 
> I suggest adding a new SO_EE_ORIGIN_TXTIME as opposed to overloading
> the existing
> local origin. Then the EE_CODE can start at 1, as ee_code can be
> demultiplexed by origin.


OK, it looks better indeed. Fixed for v2.


> 
>> +       serr->ee.ee_type = 0;
>> +       serr->ee.ee_code = code;
>> +       serr->ee.ee_pad = 0;
>> +       serr->ee.ee_data = (txtime >> 32); /* high part of tstamp */
>> +       serr->ee.ee_info = txtime; /* low part of tstamp */
>> +
>> +       if (sock_queue_err_skb(skb->sk, skb))
>> +               kfree_skb(skb);
>> +
>> +       sock_put(skb->sk);
>> +}


Thanks,
Jesus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ