lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <20240906115015.GA16423@willie-the-truck> Date: Fri, 6 Sep 2024 12:50:15 +0100 From: Will Deacon <will@...nel.org> To: Robin Murphy <robin.murphy@....com> Cc: mark.rutland@....com, linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org, jialong.yang@...ngroup.cn Subject: Re: [PATCH v3 2/3] perf: Add driver for Arm NI-700 interconnect PMU On Wed, Sep 04, 2024 at 06:15:24PM +0100, Robin Murphy wrote: > On 2024-09-04 1:24 pm, Will Deacon wrote: > > On Mon, Sep 02, 2024 at 07:47:18PM +0100, Robin Murphy wrote: > > > On 02/09/2024 3:47 pm, Will Deacon wrote: > > > > > +static ssize_t arm_ni_format_show(struct device *dev, > > > > > + struct device_attribute *attr, char *buf) > > > > > +{ > > > > > + struct arm_ni_format_attr *fmt = container_of(attr, typeof(*fmt), attr); > > > > > + int lo = __ffs(fmt->field), hi = __fls(fmt->field); > > > > > + > > > > > + return sysfs_emit(buf, "config:%d-%d\n", lo, hi); > > > > > +} > > > > > > > > Nit: if you end up adding single-bit config fields in the future, this > > > > will quietly do the wrong thing. Maybe safe-guard the 'lo==hi' case (even > > > > if you just warn once and return without doing anything). > > > > > > The counter-argument is that I don't foresee having any reason to add > > > single-bit config fields here in future, nor indeed config1 or config2 > > > fields, so I intentionally pruned the would-be dead code while copy-pasting > > > this implementation from arm-cmn. Yes, if someone were to make an incomplete > > > change without paying attention or testing they could introduce a bug, but > > > when is that ever not true? > > > > I guess I'm just a little more wary when it comes to UAPI. Somebody starts > > relying on the broken message and then you're toast. It's also incredibly > > easy to avoid by construction and the dead code isn't hurting anybody. > > Even presuming some theoretical unreviewed broken patch did get merged and > into real-world kernels without ever being fixed, I still struggle to > imagine how userspace could somehow grow to *rely* on one PMU driver > displaying a format attribute in an unexpected manner inconsistent with > every other PMU driver, as opposed to the far more likely scenario of going > wrong trying to parse it. I think you lose the game when you try to imagine what userspace could do :) But c'mon, this is simple to address and then we don't have to imagine anything. > Anyway, after playing with some fun compile-time checks yesterday I've just > realised there is actually an even simpler solution for doing the right > thing in general, so I guess thanks for leaning on this :) Hurrah! Thanks. > > > > > + name = devm_kasprintf(ni->dev, GFP_KERNEL, "arm_ni_%d_cd_%d", ni->id, cd->id); > > > > > + if (!name) > > > > > + return -ENOMEM; > > > > > + > > > > > + err = cpuhp_state_add_instance(arm_ni_hp_state, &cd->cpuhp_node); > > > > > + if (err) > > > > > + return err; > > > > > > > > What happens if there's a CPU hotplug operation here? Can we end up calling > > > > perf_pmu_migrate_context() concurrently with perf_pmu_register()? > > > > > > Yes. Alternatively we could register the PMU before the hotplug handler, > > > then potentially miss a hotplug event and leave a user-visible PMU > > > associated with an invalid CPU. This is a known issue for all system PMU > > > drivers, and the conclusion 5 years ago was that it's impractical to close > > > this race from outside perf core itself[1][2]. > > > > Ok, I'm going to walk right into the trap you've set me... > > > > Why can't we prevent hotplug (e.g. with cpus_read_lock()) while we're > > setting this up? > > > > ... and climbing back out of that trap, is the conversation you had with > > Thomas written down anywhere? > > I think that must have been the time we talked in person in Cambridge. I > can't now remember if there is (or was) anything in perf_pmu_register() > that's actually incompatible with being called under cpus_read_lock(), but I > do recall that the overall concept of exporting more bits of the hotplug > machinery in order to copy-paste the same boilerplate bodge in every PMU > driver was... unpopular. Unpopular, maybe, but you only have to look at a few PMU drivers to see we're already in that situation for other aspects of the code. Consolidation would be welcome, but I'd sooner have boilerplate code than bugs. > > I don't want to block this patch, but if five years has passed with > > nobody looking at this then we probably need to address that at some > > point before adding more and more broken drivers. > > Looking at it again, is there actually a problem with the current state of > things? Following through the call path: > > perf_pmu_migrate_context() > __perf_pmu_remove() > perf_event_groups_first() > __group_cmp() > perf_event_groups_cmp() > > the "pmu" pointer only seems to be used as a key to match events in the > current CPU context, which obviously won't find anything at this point. > AFAICS it's never even dereferenced, unless any events *are* found for > __perf_pmu_install() to do something with, which necessarily implies an > initialised and registered PMU for them to have been opened in the first > place. You might well be right, but this could will change in future and I don't think it's been written to work safely under this race. I also pointed out that a similar race exists on the unregister path. Will
Powered by blists - more mailing lists