lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 17 Nov 2023 14:34:10 +0100
From: Andrew Lunn <andrew@...n.ch>
To: Alice Ryhl <aliceryhl@...gle.com>
Cc: fujita.tomonori@...il.com, benno.lossin@...ton.me,
	miguel.ojeda.sandonis@...il.com, netdev@...r.kernel.org,
	rust-for-linux@...r.kernel.org, tmgross@...ch.edu,
	wedsonaf@...il.com
Subject: Re: [PATCH net-next v7 1/5] rust: core abstractions for network PHY
 drivers

> And you would also update the struct invariant accordingly:
> 
> /// # Invariants
> ///
> /// Referencing a `phy_device` using this struct asserts that the X
> /// mutex is held.
> #[repr(transparent)]
> pub struct Device(Opaque<bindings::phy_device>);
> 
> 
> 
> 
> 
> > +// During the calls to most functions in [`Driver`], the C side (`PHYLIB`) holds a lock that is
> > +// unique for every instance of [`Device`]. `PHYLIB` uses a different serialization technique for
> > +// [`Driver::resume`] and [`Driver::suspend`]: `PHYLIB` updates `phy_device`'s state with
> > +// the lock held, thus guaranteeing that [`Driver::resume`] has exclusive access to the instance.
> > +// [`Driver::resume`] and [`Driver::suspend`] also are called where only one thread can access
> > +// to the instance.
> 
> I used "X mutex" as an example for the synchronization mechanism in the
> above snippets, but it sounds like its more complicated than that? Here
> are some possible alternatives I could come up with:

So X would be phy_device->lock.

Suspend and resume don't have this lock held. I don't actually
remember the details, but there is an email discussion with Russell
King which does explain the problem we had, and why we think it is
safe to not hold the lock.

> /// # Invariants
> ///
> /// Referencing a `phy_device` using this struct asserts that the X
> /// mutex is held, or that the reference has exclusive access to the
> /// entire `phy_device`.
> #[repr(transparent)]
> pub struct Device(Opaque<bindings::phy_device>);

You can never have exclusive access to the entire phy_device, because
it contains a mutex. Other threads can block on that mutex, which
involves changing the linked list in the mutex.

But that is also a pretty common pattern, put the mutex inside the
structure it protects. So when you say 'exclusive access to the entire
`phy_device`' you actually mean excluding mutex, spinlocks, atomic
variables, etc?

> /// Referencing a `phy_device` using this struct asserts exclusive
> /// access to the following fields: phy_id, state, speed, duplex. And
> /// read access to the following fields: link, autoneg_complete,
> /// autoneg.
> #[repr(transparent)]
> pub struct Device(Opaque<bindings::phy_device>);

For the Rust code, you can maybe do this. But the Rust code calls into
C helpers to get the real work done, and they expect to have access to
pretty much everything in phy_device.

> /// # Invariants
> ///
> /// Referencing a `phy_device` using this struct asserts that the user
> /// is inside a Y scope as defined in Documentation/foo/bar.
> #[repr(transparent)]
> pub struct Device(Opaque<bindings::phy_device>);

There is no such documentation that i know of, except it does get
repeated again and again on the mailling lists. Its tribal knowledge.

> But I don't know how these things are actually synchronized. Maybe
> it is some sixth option. I would be happy to help draft these safety
> comments once the actual synchronization mechanism is clear to me.

The model is: synchronisation is not the drivers problem.

With a few years of experience reviewing drivers, you notice that
there are a number of driver writers who have no idea about
locking. They don't even think about it. So where possible, its best
to hide all the locking from them in the core. The core is in theory
developed by developers who do mostly understand locking and have a
better chance of getting it right. Its also exercised a lot more,
since its shared by all drivers.

My experience with this one Rust driver so far is that Rust developers
have problems accepting that its not the drivers problem.

	Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ