This is a very strong explanation of what’s going on. And as a follow-up, I believe that ZeroTier present a single Ethernet broadcast domain, and so WoL tricks are more likely to work naturally there than with Wireguard. I haven’t used ZeroTier, and I do use Wireguard via Tailscale/Headscale. I’ve never missed the Ethernet features of ZeroTier and they CAN result in a very chatty wan if you’re not careful. But I think ZT would make this straightforward.
Though as other people note… the simplest/least-disruptive change is probably to expose some scripty thing on the rpi that can be triggered via be triggered over a routed protocol and then have the rpi emit the Ethernet broadcast packets from the physical network.
Another user posted the blog where they discuss their speedup techniques: https://tailscale.com/blog/more-throughput/
It’s likely that the kernel version can use similar techniques to surpass the performance of the userspace version that tailscale uses, but no one has put in the work to to make the kernel implementation as sophisticated as the userspace one.