[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4e30cd3b7ad00fcb27a5f929c28dd69bdceb979c.camel@infradead.org>
Date: Tue, 02 Dec 2025 09:50:27 +0000
From: David Woodhouse <dwmw2@...radead.org>
To: Babis Chalios <bchalios@...zon.es>, "richardcochran@...il.com"
<richardcochran@...il.com>, "andrew+netdev@...n.ch"
<andrew+netdev@...n.ch>, "davem@...emloft.net" <davem@...emloft.net>,
"edumazet@...gle.com" <edumazet@...gle.com>, "kuba@...nel.org"
<kuba@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Cc: "Graf (AWS), Alexander" <graf@...zon.de>, "mzxreary@...inter.de"
<mzxreary@...inter.de>
Subject: Re: [PATCH 2/2] ptp: vmclock: support device notifications
On Mon, 2025-12-01 at 13:15 +0100, Babis Chalios wrote:
> On Fri, 2025-11-28 at 13:00 +0000, David Woodhouse wrote:
> > Generally looks sane to me, thanks.
> >
> > I haven't given much brain power to whether POLLHUP is the right
> > thing to return when poll() is invalid; I guess you have.
> >
>
> I was looking at the possible alternatives. The semantics of POLLHUP
> seem to me the correct ones, i.e.: the other end (device) "closed its
> end of the channel" which is exactly what it has happened.
Makes sense.
> > I also haven't looked hard into the locking on fst->seq which is
> > accessed during poll() and read(). Have you?
> >
>
> Hmmm, hadn“t considered it because I can't think of any reason why
> someone would be poll()ing and read()ing concurrently from different
> threads but you're right, there's a race. I wonder though what the
> correct semantics are here. Imagine thread A is waiting (blocked in
> poll()) and thread B is reading from the same file descriptor while a
> notification arrives. I can see a scenario that this causes thread A to
> loose a notification (and consequently a missed chance to act on
> disruption_marker and vm_generation_counter changes). What do you
> think?
There's a theoretical race condition even with read(), where you set
fst->seq to the seqno you've just read. As long as they're using
pread() and not serialised by f_pos_lock, I think that two threads can
be reading at the (about) same time, and see *different* seqnos, for
example if they happen just before/after a LM event. Then in all the
steal time and preemption it's random *which* one of them gets to set
fst->seq first, and which one comes afterwards.
So even the setting of fst->seq wants to use an atomic_t and/or cmpxchg
to ensure it's only ever moving forwards.
Then it's either a READ_ONCE() or atomic_read() in poll().
> > Your vmclock_setup_notification() function can return error, but you
> > ignore that. Which *might* have been intentional, to allow the device
> > to be used even without notifications if something goes wrong... but
> > then the condition for poll() returning POLLHUP is wrong, because that
> > only checks the flag that the hypervisor set, and not whether
> > notifications are *actually* working.
>
> This is just a bug. My intention is to mark the device probing as
> failed if the device has advertised support for notifications **and**
> setting up the notification handler failed.
OK.
> > In open() you simply read seq_count from the vmclock structure but it
> > might be odd at that point. Do we want to wait for it to be even,
> > like read() does? Or just initialise fst->seq to zero?
> >
>
> Good catch. I think the easiest thing is to set fst->seq to zero.
>
> > And is there really no devm-like helper which will free your
> > fp->private_data for you on release()? That seems surprising.
> >
>
> I'm not aware of any such mechanism. It seems weird to me though, how
> would it know what kind of data I put there and what exactly should it
> call to free them?
>
In fs/ alone there seem to be 11 examples of release() functions which
do nothing but kfree(file->private_data); return 0;.
Maybe there *should* be such a helper?
int simple_release(struct inode *inode, struct file *file)
{
kfree(file->private_data);
return 0;
}
EXPORT_SYMBOL(simple_release);
Download attachment "smime.p7s" of type "application/pkcs7-signature" (5069 bytes)
Powered by blists - more mailing lists