Expressive Avatars: Lipsync, VoIP and Android Mic permissions

Level 7
Hey folks,
I'm sure you've seen the latest updates to Oculus Avatars, which introduced OVRLipsync driven mouth movement, eye gaze simulation and micro-expressions.
I wanted to flag something we came up against while working on this update, in case it is causing issues for folks building multiplayer Quest and Go experiences.

Android only allows access to the microphone from a single process. This wasn't an issue when networking avatars previously, as the mic input wasn't being used. But with the expressive update, we specifically need to run the mic through the OVRLipsync plugin to generate blend-shapes and drive the mouth shapes.

Trying to hook up the mic to both VoIP and Lipsync therefore causes an inevitable race condition. The loser gets a bunch of zeros.
So either there's no networked audio, or no blend-shapes. :disappointed:

Fortunately, Oculus VoIP and Photon both offer a workaround: each can relay the mic buffer, using SetMicrophoneFilterCallback() in the case of Oculus VoIP, as documented here:

We're in the process of documenting the specifics of how this can then be wired up to Lipsync and Avatars in more detail, but in the meantime, please refer to Social Starter in the Avatar SDK Unity samples, which has implemented the Avatar / Lipsync / VoIP stack correctly.
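
The relay idea can be illustrated with a small, language-neutral sketch (written in Python rather than Unity C#). It is not the Oculus or Photon API: `MicRelay`, `subscribe` and `on_mic_buffer` are hypothetical names, and the sketch only assumes that whichever component owns the microphone hands each PCM buffer to a callback, as SetMicrophoneFilterCallback() does for Oculus VoIP. The point is that only one component opens the mic, and every other consumer receives a copy of the same buffer.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical consumer signature: (16-bit PCM samples, sample rate in Hz, channels).
PcmConsumer = Callable[[Tuple[int, ...], int, int], None]


class MicRelay:
    """Fans one microphone buffer out to several consumers.

    Only the component that owns the microphone calls on_mic_buffer();
    everything else subscribes instead of opening the mic itself.
    """

    def __init__(self) -> None:
        self._consumers: List[PcmConsumer] = []

    def subscribe(self, consumer: PcmConsumer) -> None:
        self._consumers.append(consumer)

    def on_mic_buffer(self, pcm: Sequence[int], sample_rate: int, channels: int) -> None:
        # Take an immutable snapshot so one consumer cannot alter
        # what the next consumer receives.
        snapshot = tuple(pcm)
        for consumer in self._consumers:
            consumer(snapshot, sample_rate, channels)


# Usage: the lists stand in for the lip sync and VoIP inputs.
relay = MicRelay()
lipsync_frames: List[Tuple[int, ...]] = []
voip_frames: List[Tuple[int, ...]] = []
relay.subscribe(lambda pcm, rate, ch: lipsync_frames.append(pcm))
relay.subscribe(lambda pcm, rate, ch: voip_frames.append(pcm))
relay.on_mic_buffer([0, 120, -340, 560], 48000, 1)
```

After the single `on_mic_buffer` call, both lists hold the same samples, so neither consumer is left with a buffer of zeros.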

Level 5
I was just going to write this!
ETA for Unreal integration??

Level 4
I wanted to ask: on Quest, can Lipsync still perform well if there are, say, 6 people chatting, or would it start to show lag?

And does it only hook in to one's local mic, then transmit only the phoneme data result to others, or does everyone's local client analyze all (e.g. 6) audio streams?

- Anyland dev -

Level 2
We did indeed have some difficulty making lipsync and Photon Voice VoIP coexist in our project.

The closest we came to success was to activate the lipsync microphone capture, capture the MouthAnchor AudioSource clip and send it to the Photon Recorder.

It works, but there's an audible/visible delay between the VoIP audio and the avatar's lip movements (around 0.25 to 0.5 seconds).

I have 2 questions:

1) Is this kind of lag expected in the solution you will document in the future? Or is it our approach that causes the delay? (It works in the reverse order from the one you imply: in our implementation the microphone is used by the lipsync first, not by the VoIP.)

2) Prior to reading this topic, we thought this lag was normal, since the voice and the avatar mesh sync travel through different paths, so we tried to find another solution.
We did not capture the lip sync locally (the microphone is given entirely to the VoIP). Instead, we wanted the receiver side to forward the received audio to the avatar, so that it blends the avatar movements sent over the network with lip movements computed locally from the received audio.

We have not yet succeeded in making it work, so 2.1) do you think this is possible? And 2.2) I'll describe our hack in case we are just missing an important step 🙂

On the RemoteAvatar GameObject, we removed the remote driver component and added a local driver component (please bear with me, even if it sounds crazy 😉).
We also set CanOwnMicrophone to false on OvrAvatar. With these settings, an OvrLipSyncContext is created on the Mouth of the remote avatar.
To keep OvrAvatar working properly, we added a child GameObject to RemoteAvatar with a remote driver component on it, and used it to fill the driver field in the RemoteAvatar's OvrAvatar component.
It "works" because the OvrAvatar class checks the kind of driver (local/remote) on its own GameObject to initialize the audio part, but uses the driver field for everything else (I did say it was a hack 😉).
From there, we expected that sending the audio data (received through the network) to the OvrAvatar UpdateVoiceData method would do the job (we checked that it is properly sent, with a success result, to the OvrLipSyncContext). However, the lips do not move.
I wondered whether something extra must be done to blend the networked avatar description with the local lip sync, so I tried disabling the avatar movements to free the lip sync from a potential override by the avatar, but that did not work either.

Thanks in advance for any suggestions 🙂

Level 3
I managed to get the workaround using SetMicrophoneFilterCallback() working decently. I had to resample the audio (it's 48000 Hz but the lip sync component expects 16000 Hz) and amplify it a bit, but my Quest avatar is now animating its mouth fairly well (not as well as with the regular voice capture implementation, but better than nothing). I am also seeing a small delay between voice and animation, similar to gizmhail. I'm gonna try to tweak it, but I can at least verify that it is a viable workaround.
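
For anyone attempting the same thing, the resample-and-amplify step could look roughly like the sketch below. It is written in Python for brevity rather than Unity C#, and it is not the poster's actual code: the function name, the averaging filter and the default gain of 2.0 are all assumptions chosen for illustration. It assumes mono 16-bit PCM and a source rate that is an integer multiple of the target rate (48000 / 16000 = 3).

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def downsample_and_amplify(pcm: Sequence[int], src_rate: int = 48000,
                           dst_rate: int = 16000, gain: float = 2.0) -> List[int]:
    """Convert mono 16-bit PCM from src_rate to dst_rate and apply gain.

    Each output sample is the mean of `factor` consecutive input samples,
    which acts as a crude low-pass filter before decimation. The gain value
    is an illustrative assumption and would need tuning by ear.
    """
    if src_rate % dst_rate != 0:
        raise ValueError("src_rate must be an integer multiple of dst_rate")
    factor = src_rate // dst_rate
    out: List[int] = []
    # Drop a trailing partial group so every output covers `factor` inputs.
    usable = len(pcm) - len(pcm) % factor
    for start in range(0, usable, factor):
        mean = sum(pcm[start:start + factor]) / factor
        amplified = int(round(mean * gain))
        # Clamp so amplification cannot overflow the int16 range.
        out.append(max(INT16_MIN, min(INT16_MAX, amplified)))
    return out


# Usage: seven 48 kHz samples become two 16 kHz samples; the last one is dropped.
print(downsample_and_amplify([3, 6, 9, 30000, 30000, 30000, 5]))  # [12, 32767]
```

Averaging three samples is about the simplest filter that works; a proper resampler with a better low-pass stage should give cleaner input to the lip sync analysis, at the cost of more CPU per buffer.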

Level 2
Hey all, good to see that the problem is already "known". We are facing the same problem. We can't get Lipsync working together with Photon voice network due to the android limitation...
Can you please explain how to use the workaround together with Photon?
Thank you in advance!

Level 2
Hey !
I am facing the same issue as @worldofvrgmbh: I want to make Photon Voice work with the lip sync on the Oculus avatars for a Quest app. I tried SetMicrophoneFilterCallback and unchecking CanOwnMicrophone on the local and remote players, but it still doesn't work. I'm probably just doing something wrong, but since there is no documentation on how to do it, I can't be 100% sure.
Does anyone know if @Ross_Beef's documentation of "the specifics of how this can then be wired up to Lipsync and Avatars in more detail" exists yet?
Thanks in advance if you can help me on this matter !

PS: weird stuff, we found that the lip sync with Photon Voice works without any change on the Quest 2. Too bad we want it to work on Quest 1 too ^^

@ManzaVision any luck making it work with the Quest 1?