Message-ID: <875zavh1re.fsf@mellanox.com>
Date:   Fri, 10 Jul 2020 16:40:21 +0200
From:   Petr Machata <petrm@...lanox.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Jiri Pirko <jiri@...lanox.com>,
        Ido Schimmel <idosch@...lanox.com>
Subject: Re: [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks


Cong Wang <xiyou.wangcong@...il.com> writes:

> On Wed, Jul 8, 2020 at 5:13 PM Petr Machata <petrm@...lanox.com> wrote:
>>
>>
>> Petr Machata <petrm@...lanox.com> writes:
>>
>> > Cong Wang <xiyou.wangcong@...il.com> writes:
>> >
>> > I'll think about it some more. For now I will at least fix the lack of
>> > locking.
>>
>> I guess I could store smp_processor_id() that acquired the lock in
>> struct qdisc_skb_head. Do a trylock instead of lock, and on fail check
>> the stored value. I'll need to be careful about the race between
>> unsuccessful trylock and the test, and about making sure CPU ID doesn't
>> change after it is read. I'll probe this tomorrow.
>
> Like __netif_tx_lock(), right? Seems doable.

Good to see it actually used; I wasn't sure if the idea made sense :)
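
Roughly like this, patterned on __netif_tx_lock()'s xmit_lock_owner (a
sketch only; the "owner" field in struct qdisc_skb_head doesn't exist and
is made up for illustration):

    static bool sketch_root_trylock(struct Qdisc *q)
    {
            spinlock_t *root = qdisc_lock(q);
            int cpu = smp_processor_id();   /* BH is off on the enqueue path */

            if (spin_trylock(root)) {
                    WRITE_ONCE(q->q.owner, cpu);
                    return true;
            }
            if (READ_ONCE(q->q.owner) == cpu)
                    return false;   /* recursive entry via mirred: don't block */
            spin_lock(root);
            WRITE_ONCE(q->q.owner, cpu);
            return true;
    }

    /* The unlock side would reset q->q.owner (say, to -1) before
     * spin_unlock(root). */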

Unfortunately it is not enough.

Consider two threads (A, B) and two netdevices (eth0, eth1):

- "A" takes eth0's root lock and proceeds to classification
- "B" takes eth1's root lock and proceeds to classification
- "A" invokes mirror to eth1, waits on lock held by "B"
- "B" invakes mirror to eth0, waits on lock held by "A"
- Some say they are still waiting to this day.
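
Spelled out in terms of the root lock that __dev_xmit_skb() takes (just
the ordering, not real code):

    CPU A                                   CPU B
    spin_lock(qdisc_lock(eth0 root));       spin_lock(qdisc_lock(eth1 root));
    act_mirred -> dev_queue_xmit(eth1)      act_mirred -> dev_queue_xmit(eth0)
      spin_lock(qdisc_lock(eth1 root));       spin_lock(qdisc_lock(eth0 root));
      ... blocks on B ...                     ... blocks on A ...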

So one option that I see is to just stash the mirrored packet in a queue
instead of delivering it right away:

- s/netif_receive_skb/netif_rx/ in act_mirred

- Reuse the RX backlog queue for TX packets as well, differentiating the
  two by a bit in the SKB CB. Then process_backlog() would call either
  __netif_receive_skb() or dev_queue_xmit() (see the sketch after this
  list).

- Drop mirred_rec_level guard.
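
A rough sketch of the CB bit, with made-up names, and assuming the CB is
actually free for this at that point:

    struct mirred_backlog_cb {
            u8 tx;  /* 1 = queued by act_mirred for egress, 0 = ordinary RX */
    };

    static inline struct mirred_backlog_cb *mirred_backlog_cb(struct sk_buff *skb)
    {
            BUILD_BUG_ON(sizeof(struct mirred_backlog_cb) > sizeof(skb->cb));
            return (struct mirred_backlog_cb *)skb->cb;
    }

    /* act_mirred sets ->tx as appropriate and calls netif_rx() instead of
     * netif_receive_skb() / dev_queue_xmit(). process_backlog() then
     * dispatches on the bit: */
    static void sketch_backlog_deliver(struct sk_buff *skb)
    {
            if (mirred_backlog_cb(skb)->tx)
                    dev_queue_xmit(skb);            /* egress leg of the mirror */
            else
                    __netif_receive_skb(skb);       /* normal RX path */
    }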

This seems to work, but I might be missing something non-obvious, such
as CB actually being used for something already in that context. I would
really rather not introduce a second backlog queue just for mirred
though.

Since mirred_rec_level does not kick in anymore, the same packet can end
up being forwarded from the backlog queue, to the qdisc, and back to the
backlog queue, forever. But that seems OK: that's what the admin
configured, so that's what's happening.

If this is not a good idea for some reason, this might work as well:

- Convert the current root lock to an rwlock. Convert all current
  lockers to the write lock (which should be safe), except for enqueue,
  which will take the read lock. That will allow many concurrent threads
  to enter enqueue, or one thread several times, but it will exclude all
  other users.

  So this guards configuration access to the qdisc tree and makes sure
  qdiscs don't go away from under one's feet.

- Introduce another spin lock to guard the private data of the qdisc
  tree (counters etc.), things that even two concurrent enqueue
  operations shouldn't trample on. Enqueue takes this spin lock after
  read-locking the root lock. act_mirred drops it before injecting the
  packet and takes it again afterwards (see the sketch below).
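
For the enqueue path that would look roughly like this (a sketch only;
tree_lock and data_lock are hypothetical new fields, and the caller /
callback split is collapsed into one function for illustration):

    static int sketch_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                              struct sk_buff **to_free)
    {
            read_lock(&sch->tree_lock);     /* many enqueuers, or one recursively */
            spin_lock(&sch->data_lock);     /* private data: counters, queues */

            /* ... classification, qevents; act_mirred would do
             * spin_unlock(&sch->data_lock) before injecting the mirrored
             * packet and spin_lock(&sch->data_lock) again afterwards, so
             * only the read lock is ever taken recursively ... */

            spin_unlock(&sch->data_lock);
            read_unlock(&sch->tree_lock);
            return NET_XMIT_SUCCESS;
    }

Everything that currently takes the root lock for configuration (change,
dump, destroy, ...) would take write_lock(&sch->tree_lock) instead.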

Any opinions, y'all?
