Hi @MikeF @NinjaGaijin, I've got the 1.36 Utilities integration on a branch of my repo prepping a migration to the Expressive Avatar update. Just trying to get a feel for the differences and limitations of the new features. As far as I can tell, it's not yet possible to serialize and manipulate the cool mouth stuff for remote avatars, is that correct? The OvrAvatar.cs script itself seems to show that those features only come online when it's being driven by a local driver component.
I was curious how this was going to be achieved at scale since it's being driven by a fair bit of DSP, and it appears that for the time being it simply isn't being done at scale. Will it eventually replace the Mouth Vertex Animation system? I imagine that's probably just being left in for backward compatibility's sake. How does the team anticipate supporting this eventually, since it's implied that it must be done? I like the simplicity of sending a normalized float to the VoiceAmplitude member in the Mouth Vertex Animation system, but I figure you'd probably want an audio buffer being fed at regular intervals for the new system.
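For reference, the "simplicity" I mean is something like this minimal sketch of the old Mouth Vertex Animation path. It assumes the legacy OvrAvatar component with its public VoiceAmplitude float; the GetNormalizedMicLevel() helper is hypothetical, standing in for whatever level your VoIP client reports:

```csharp
using UnityEngine;

// Sketch only: drives the legacy Mouth Vertex Animation by feeding a
// normalized 0..1 value into OvrAvatar.VoiceAmplitude each frame.
public class MouthVertexDriver : MonoBehaviour
{
    public OvrAvatar avatar; // legacy Avatar SDK component with Mouth Vertex Animation enabled

    void Update()
    {
        // VoiceAmplitude expects a normalized value; clamp to be safe.
        avatar.VoiceAmplitude = Mathf.Clamp01(GetNormalizedMicLevel());
    }

    float GetNormalizedMicLevel()
    {
        // Hypothetical stand-in for your VoIP client's amplitude callback.
        return 0f;
    }
}
```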
Just curious if I could get some more detail here, and I'm looking for any corrections to my current understanding. Thanks!
So you are correct in thinking that by default the only way to receive facial blend information is to have a local avatar component, but this should also be included in the pose packets sent over to remote avatars. This does result in a bit of a delay between VoIP and the visuals that accompany it via lipsync. It took us quite a bit of time to get it working decently on our end, but we were using Photon Voice, so the two systems don't really speak to each other, which makes it harder to deal with (in fact they conflict with each other, since they both want control over the mic).
I think the intended implementation is to use the platform VoIP, but that comes along with some cross-platform limitations that might not work for everyone.
You might be able to roll your own system, though, since the blend shapes are completely exposed in the skinned mesh renderer. You could potentially drive it locally for remote avatars based on incoming audio alone instead of audio + pose packet.
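A rough sketch of that roll-your-own approach, assuming you can get a per-speaker amplitude out of your VoIP client: smooth the incoming level and map it onto a mouth blend shape on the exposed SkinnedMeshRenderer. The blendShapeIndex is an assumption here; inspect the mesh to find the right shape:

```csharp
using UnityEngine;

// Sketch only: drives a remote avatar's mouth blend shape from incoming
// audio amplitude, bypassing the pose-packet lipsync data entirely.
public class RemoteMouthDriver : MonoBehaviour
{
    public SkinnedMeshRenderer mouthMesh; // renderer exposing the avatar's blend shapes
    public int blendShapeIndex = 0;       // index of the mouth-open shape (assumed)
    public float smoothing = 12f;         // higher = snappier response

    float target;
    float current;

    // Call this from your VoIP client's callback with a 0..1 amplitude
    // for this remote speaker.
    public void SetVoiceAmplitude(float normalized)
    {
        target = Mathf.Clamp01(normalized);
    }

    void Update()
    {
        // Smooth toward the latest amplitude so the mouth doesn't flicker.
        current = Mathf.Lerp(current, target, Time.deltaTime * smoothing);
        mouthMesh.SetBlendShapeWeight(blendShapeIndex, current * 100f); // weights run 0..100
    }
}
```

This trades lipsync accuracy (it's amplitude-only, no visemes) for zero dependence on the pose-packet pipeline.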
OK, so if I understand you correctly: as long as the LocalAvatar that is getting serialized for network replication is the one that has CanOwnMicrophone enabled, and it's technically doing the blendshape work invisibly on a local first-person version, then that work gets serialized into the SDK Avatar packets?
The only reason I'm skeptical about that is because I have already been updating VoiceAmplitude locally on the previous incarnation of the Avatars SDK with the value I'm getting from our VoIP client with Mouth Vertex Animation enabled, and yet it's not getting picked up as part of the packet serialization process by Photon. I ended up having to send it as a separate observable component observed by my PhotonView in order to get remote avatars' mouth vertices moving. But I will give it a try here with the new tech and see what happens.
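For anyone following along, the workaround described above looks roughly like this in PUN 2, where the interface for observed components is IPunObservable. This is a sketch under the assumption that both ends have the legacy OvrAvatar with its VoiceAmplitude field, and that this component is added to the PhotonView's observed list:

```csharp
using Photon.Pun;
using UnityEngine;

// Sketch only: replicates the locally computed mouth amplitude to remote
// peers as an observed component on the PhotonView.
public class VoiceAmplitudeSync : MonoBehaviourPun, IPunObservable
{
    public OvrAvatar avatar; // legacy Avatar SDK component on this player's avatar

    public void OnPhotonSerializeView(PhotonStream stream, PhotonMessageInfo info)
    {
        if (stream.IsWriting)
        {
            // Owner: send the amplitude we computed from our VoIP client.
            stream.SendNext(avatar.VoiceAmplitude);
        }
        else
        {
            // Remote copy: apply the received amplitude to drive the mouth.
            avatar.VoiceAmplitude = (float)stream.ReceiveNext();
        }
    }
}
```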
Our stack uses PUN 2 for replication and Vivox for VoIP, so I'm glad to hear you were prototyping in-house without the bespoke Oculus VoIP. I'm okay with the latency you mention. Utilities-wise, we're only using Platform and Avatar (and I guess LipSync now too).
It's also a little tricky because our app has a mirror in it, so we'd need the blendshapes to be driven in that separate LocalAvatar mirror as well, and I can't set CanOwnMicrophone on it if I'm already spending all that CPU on the invisible version that's going out to everyone else. It would be good to have some flexibility on this moving forward (i.e. right now there can only be one LocalAvatar running the blendshape work, determined by CanOwnMicrophone).
Correct, that's how I have it set in my project and it works. The old Mouth Vertex Animation was totally separate from the pose packet system, so your previous setup makes sense, but with the new system everything gets thrown into the pose update.