[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150921182539.GB2291@nanopsycho>
Date: Mon, 21 Sep 2015 20:25:39 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: Vivien Didelot <vivien.didelot@...oirfairelinux.com>
Cc: Scott Feldman <sfeldma@...il.com>, Netdev <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Ido Schimmel <idosch@...lanox.com>, eladr@...lanox.com,
Florian Fainelli <f.fainelli@...il.com>,
Guenter Roeck <linux@...ck-us.net>
Subject: Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra
and for pre-commit split
Mon, Sep 21, 2015 at 07:13:58PM CEST, vivien.didelot@...oirfairelinux.com wrote:
>Hi Jiri, Scott,
>
>On Sep. Monday 21 (39) 10:09 AM, Jiri Pirko wrote:
>> Mon, Sep 21, 2015 at 09:23:24AM CEST, sfeldma@...il.com wrote:
>> >On Sat, Sep 19, 2015 at 5:29 AM, Jiri Pirko <jiri@...nulli.us> wrote:
>> >> Jiri Pirko (6):
>> >> switchdev: rename "trans" to "trans_ph".
>> >> switchdev: introduce transaction infrastructure for attr_set and
>> >> obj_add
>> >> rocker: switch to local transaction phase enum
>> >> switchdev: move transaction phase enum under transaction structure
>> >> rocker: use switchdev transaction queue for allocated memory
>> >> switchdev: split commit and prepare phase into two callbacks
>> >
>> >Patches compile, but first test bombs. Cut-and-paste of dump at end
>> >of this email.
>>
>> Told you :)
>>
>>
>> >
>> >I'm not sure I'm liking this patchset because it looks like a way for
>> >switchdev drivers to easily opt-out of the prepare-commit transaction
>> >model by simply not implementing the *_pre op. I would rather drivers
>> >explicitly handle the PREPARE phase in code, even if that means
>> >skipping it gracefully (in code) with a comment (in code) explaining
>> >why it does not matter for this device/operation. That's what DSA had
>> >done, mostly because it was a retro-fit.
>>
>> Each driver should handle this inside it. If it does not need prepare
>> state, it simply does not implement it. That is the same for all cb,
>> ndos, netdev notifiers, etc. It is much cleaner and nicer to have these as
>> separate callbacks. Implementing multiple callback in one is just ugly,
>> sorry.
>
>This is true, (in DSA) we don't have to implement the prepare phase if
>we fully support the feature in hardware.
>
>To give a real example, Marvell switch drivers currently implement all
>add/del/dump calls for VLAN FDB (where VID 0 means the port itself). No
>prepare phase needed.
>
>Now, I have local patches to enable strict 802.1Q mode in these switches
>(all the logic is based on the hardware VLAN table). But it does not use
>per-port FDB, so fdb_add with VID 0 doesn't make sense anymore. That's
>why we need to push the feature checking down to the drivers in DSA.
>
>I have another pending patch to add .port_fdb_pre_add, where mv88e6xxx
>code will return -EOPNOTSUPP if the given VID is 0.
>
>Another example: mv88e6xxx support tagged VLANs, so no hardware check
>needed. But the Broadcom Starfighter 2 only supports port-based VLANs
>(which is today wrongly implemented through "bridge_join/leave"). By
>implementing .port_vlan_pre_add (another pending patch for DSA), the
>driver will be able to return -EOPNOTSUPP if !BRIDGE_VLAN_INFO_PVID.
>
>Also, having logic in switchdev drivers to check SWITCHDEV_TRANS_NONE
>and SWITCHDEV_TRANS_ABORT is not really nice. Having switchdev handle
>the abort phase (calling each destructor) and getting rid of the
>SWITCHDEV_TRANS_* flags sounds better to me.
Agree, if pre/commit is going to be in one function, we should have
only prepare/commit enums. It can be carried around as a single bool
value in switchdev_trans structure. Will include that in my transaction
patchset.
>
>> >Also, the patchset removes the ABORT callback in case of a rollback
>> >due to a failed PREPARE. We can't make the assumption that it's just
>> >a memory list to destroy on ABORT. The driver, on PREPARE, may have
>> >reserved device space or staged an operation on the device which we'll
>> >need to undo on ABORT.
>>
>> Yep, just register an item with custom destructor, there you can do
>> whatever. Also, I believe much nicer comparing to current code.
>>
>>
>> >
>> >So we need ABORT back, and we need PREPARE to not be optional, so
>> >what's left list enqueue/dequeue helpers, which I'm not seeing much
>> >value in up-leveling as the driver can do list_add/del itself.
>>
>> Why would every driver do it itself, over and over when there can be a
>> clean infrastructure to do that. Including abort phase. Without the driver
>> needed to be involved.
>
>Maybe the term ".destructor" has a too strong meaning to deallocation,
>but you can indeed do whatever you need in this function.
It is a destructor. Don't know about a better name, suggestions?
>
>> >Am I missing something? I didn't see a motivation statement for the
>> >RFC so I'm not sure where you wanted to take this.
>>
>> I want to make current code much nicer, easier to read and implement in
>> other drivers. Look at rocker.c and how often there is == PREPARE there.
>> It's nearly impossible to followthe code, sorry.
>>
>> My next patchset is to un-mess rocker.c (that freaking ofdpa stuff is
>> everywhere)
>
>I'm basically for this patchset, but I do have some concerns about the
>general switchdev transaction design.
>
>I certainly don't have the perspective that you guys have, so please
>tell me if I'm totally wrong. From my point of view (with DSA drivers),
>the emphase should be done on asking the hardware if it can handle or
>not a given transaction (not simply an hardware feature!).
>
>The driver can handle any allocation and preparation itself, and also
>errors can actually occur *during* a commit phase.
But during commit phase, the odds that error is going to happen is a lot
smaller. Prepare phase should resolve all common fails like memory
allocations and resource limitation checks.
>
>Being able to return -EOPNOTSUPP from any add/del/dump function would be
>just perfect.
Looking at this, I agree that for switchdev_port_obj_add and
switchdev_port_attr_set if would make more sense to return 0 in case of
hw does not actuall support. Callers check for -EOPNOTSUP and treat
that as 0 anyway. Feel free to send a patch for this.
>
>I ideally imagine the following implementation in switchdev drivers:
>
> int foo_port_obj_add(struct net_device *dev, int id, void *obj)
> {
> struct foo *foo = netdev_priv(dev);
> struct switchdev_obj_port_fdb *fdb;
> struct switchdev_obj_port_vlan *vlan;
> int err;
>
> switch(id) {
> case SWITCHDEV_OBJ_PORT_FDB:
> fdb = obj;
> err = foo_port_fdb_add(foo, fdb);
> break;
> case SWITCHDEV_OBJ_PORT_VLAN:
> vlan = obj;
> err = foo_port_vlan_add(foo, vlan);
> break;
> default:
> err = -EOPNOTSUPP;
> break;
> }
>
> return err;
> }
>
>Where foo_port_{fdb,vlan}_add can still return -EOPNOTSUPP depending on
>their current state.
Yep, seems correct.
>
>Thanks,
>-v
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists