Speech Recognition in GEAR VR. Tips, Tools & Tricks? — Oculus

Speech Recognition in GEAR VR. Tips, Tools & Tricks?

SilkWebware Posts: 11
NerveGear
edited March 2016 in Oculus Go Development
Hi,

Has anyone tried using speech recognition in a GEAR VR app?
Is it even possible with the CPU constraints? My graphics are such that I can afford to drop the timeWarp vsync rate to support 30Hz if necessary.

I'm looking for tips, tricks and tools.

Thanks in advance.
Nick
Gear VR development blog at https://gearvrdevelopment.wordpress.com/
The first article is to help newcomers set up their development environment using Unity 5.3
https://www.linkedin.com/in/nickharrow Nick Harrow (I'm always open to new opportunities)

Comments

  • leftler Posts: 103
    Brain Burst
    I know the Samsung Internet Beta uses it as an alternative for the keyboard input.

    Also in the Oculus Sample Framework for Unity 5 there is a keyboard example, I have not had the time to look at it yet but it may contain a voice recognition part too (again, I have no idea if it does, this is just speculation).
  • SilkWebware Posts: 11
    NerveGear
    Thanks for your reply.

    Yes, I've seen the Samsung Internet Beta. Graphically it's not very intense.

    I guess I'll have a play and see what I can do. I'll report my findings back here.
  • SilkWebware Posts: 11
    NerveGear
    When I asked the original question, I thought this was going to be trivial.

    Five days of research and many dead ends. The lag when using Google speech recognition is way too long for a VR game, and the wrappers I've tried from the Unity Asset Store don't manage to cut out the "beep" sound when listening is activated.

    I've also been down the road of an asset called "Word Detection", which learns sound profiles for words and tries to match incoming audio against a preset dictionary. It's not accurate enough and performance is an issue.

    Really would like to know if anyone is working on a solution. All I really need is a solution that can quickly, efficiently, locally and accurately distinguish between words in a list without it needing to learn. (Generic English word distinguishing). For example "One, Two, Three, Four, Five"

    Maybe too much to ask for at the moment?

    Nick
  • 8bit Posts: 94 Oculus Start Member
    Hey @SilkWebware,

    Thanks for sharing your experiences in pursuing this. I also am interested in this and would like to hear of any solutions people have found for voice recognition on mobile.

    I have some experience pulling the sound data using FMOD in a native Android app on Gear VR. The polling was fast enough, and FMOD had some nice built-in FFT functionality that seemed efficient enough. Of course, the hard work is actually tracking and comparing the data to see what people have "said". I simply used the data to detect when users made a loud sound and how many loud sounds they made. This required some basic processing, but nothing like real voice detection. I was detecting when they "bark", and I needed clean, independent bark detection :)

    Anyway, it worked really well when users had a headset with a microphone, but not so well without a microphone plugged in (mostly due to outside noise). So from my experience I would say the hardware is capable of listening fast enough and of doing the FFT per frame to pull the data, but I haven't gone as far as doing the actual "smart" part of it. My basic understanding is that you need to do two analyses, one for consonants and one for vowels, and mush them together to do your speech recognition. Lol, do you like my official wording - mush!

    Sorry I could not be of more help but I do hope a solution is out there and accessible.

    Oh and if you're curious here's a blog about my last milestone where I implemented it.
    Probably not that useful but I don't know, maybe you'll get something from it...
    http://vrdevblog.com/2016/02/05/milesto ... game-flow/
    Since this time I have actually pivoted so probably not going to pursue speech recognition at the moment but totally interested for when I return to a project like this.

    Good luck!
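
A minimal sketch of the "count the loud sounds" idea from the post above. It is not the FMOD/FFT code the poster used: it swaps in a plain per-frame RMS level with two thresholds (hysteresis), and the function names, frame length, and threshold values are all assumptions chosen for illustration.

```python
import math


def frame_rms(samples, frame_len):
    """Split samples (floats in [-1, 1]) into whole frames and return the RMS of each."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return levels


def count_loud_events(samples, frame_len=160, on_level=0.3, off_level=0.1):
    """Count distinct loud bursts.

    A burst starts when the frame level rises to on_level or above and ends
    when it falls to off_level or below. Using two thresholds avoids counting
    one burst several times when the level hovers around a single threshold.
    All default values are hypothetical and would need tuning per microphone.
    """
    events = 0
    loud = False
    for level in frame_rms(samples, frame_len):
        if not loud and level >= on_level:
            loud = True
            events += 1
        elif loud and level <= off_level:
            loud = False
    return events


def tone(n, amp, freq=440.0, rate=16000):
    """Synthetic test signal: n samples of a sine wave with the given amplitude."""
    return [amp * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


# Quiet, loud, quiet, loud, quiet: two separate bursts.
signal = (tone(1600, 0.01) + tone(1600, 0.8) + tone(1600, 0.01)
          + tone(1600, 0.8) + tone(1600, 0.01))
print(count_loud_events(signal))
```

As the post notes, a detector like this is sensitive to background noise, so the thresholds would likely differ between a headset microphone and the phone's built-in one.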
  • SilkWebware Posts: 11
    NerveGear
    @8bit

    Thanks for the reply and sharing your experience too.
    Audio processing is not a strong point of mine, but it probably will be in two months' time... lol.

    I don't give up when I have a problem like this, so I'll be sure to update this thread as I discover any new avenues.

    Thanks for the link to your blog too. Little bits of inspiration, like that, really help.

    Regards
    Nick
  • leftler Posts: 103
    Brain Burst
    Oh, almost forgot. You may want to send an email to the devs who made "Daydream Blue"; the robot in it has basic voice recognition (but it only uses a word bank of about 6 words).
  • SilkWebware Posts: 11
    NerveGear
    @leftler

    Thanks. Yes, that's all I really need. The ability to distinguish between a set of words. Even if that means only distinguishing the first uttered phoneme at the start of a particular word.

    Nick
  • ogi Posts: 1
    My company is working on on-device ASR. We recently launched an iOS framework and plan to address Android and implement a number of optimizations later this year. More details at http://keenresearch.com

    If any of you are interested in using ASR, please contact me directly at ogi@keenresearch.com
  • delphinius81 Posts: 297
    Nexus 6
    For SR, pocketsphinx (the lightweight CMU Sphinx recognizer, which has an Android build) is a good choice. It's a pain to get built and integrated with Unity though - someone on a project I worked on at a previous job spent weeks getting it working, but he was a Unity novice. However, if you can make it through the build process for pocketsphinx on Android and know how to create a Unity plugin, it's rather powerful. Plus, it runs locally on the device, so you don't have to worry about network latency. Recognition times were pretty quick, even on mobile, but that is largely because you provide the (limited) grammar for it to recognize.

    Honestly, if you managed to get it built and working as a Unity plugin, I'd love to use it too! :)
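
To illustrate the "limited grammar" point in the post above: pocketsphinx can load grammars written in JSGF (the Java Speech Grammar Format). The sketch below only generates the grammar text for a small word list such as the "One, Two, Three, Four, Five" example from earlier in the thread. The helper name `build_jsgf` is hypothetical, and loading the result into a recognizer (and into Unity) is not shown.

```python
def build_jsgf(grammar_name, words):
    """Return JSGF text whose single public rule matches exactly one of `words`.

    Hypothetical helper for illustration. Each word must also exist in the
    recognizer's pronunciation dictionary, which this function does not check.
    """
    if not words:
        raise ValueError("at least one word is required")
    # JSGF alternatives are separated by '|'; words are lower-cased here on the
    # assumption that the pronunciation dictionary uses lower-case entries.
    alternatives = " | ".join(w.strip().lower() for w in words)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <{grammar_name}> = {alternatives};\n"
    )


print(build_jsgf("digits", ["One", "Two", "Three", "Four", "Five"]))
```

Keeping the rule this small is what makes recognition fast: the decoder only has to choose among the listed alternatives rather than search a full language model.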
  • play_edu Posts: 13
    NerveGear
    leftler said:
    I know the Samsung Internet Beta uses it as an alternative for the keyboard input.

    Also in the Oculus Sample Framework for Unity 5 there is a keyboard example, I have not had the time to look at it yet but it may contain a voice recognition part too (again, I have no idea if it does, this is just speculation).
    The Oculus Sample Framework for Unity 5 contains only a keyboard demo, not a voice API demo.
    I also tried many other demos but did not find any speech recognition demo. So I made an Android native voice API plugin, but the plugin works with Gear VR and not with Oculus Go. Does anybody have a solution for Oculus Go? If needed, I am ready to purchase a plugin. Sorry for my bad English.
  • rdmitchell1997 Posts: 53 Oculus Start Member
    I would definitely take a look at IBM Watson and what it has to offer as a service. It is incredibly versatile: it provides speech recognition and also uses machine learning to build a natural speech system. For example, if you programmed it to listen for the word 'Hi', the machine learning would also know that 'Hiya' and 'Hello' have the same meaning.

    It is quite hard to explain, but the IBM Watson Sandbox demo explains it best. Hope this helps!
    Sincerely,

    Robert Mitchell

    Email: rdmitchell@blueyonder.co.uk
    Website: http://vr.port.ac.uk/
  • play_edu Posts: 13
    NerveGear
    edited July 2018
    Hi,
    Thanks for the reply.
    IBM Watson and the Google voice service provide nice APIs, but they only work with an internet connection, not offline. The Android device speech engine works great, also works offline, and is free, but it works on Android devices (Gear VR), not on Oculus Go.

    Best regards,
    Play_edu
  • SmartDigitalNetworks Posts: 3
    NerveGear
    I wonder if part of the solution is buried in the Audio SDK, in the Oculus Lipsync code. It seems there is some code listening for 15 visemes. Maybe this function can be rewritten to do other things when it hears the 15 sounds, and/or maybe even to change what it is listening for. https://developer.oculus.com/documentation/audiosdk/latest/concepts/audio-ovrlipsync-sample-details/

    Other Oculus Lipsync Scripts

    OVRLipSyncMicInput is for use with a GameObject which has an AudioSource attached to it. It takes input from any attached microphone and pipes it through the AudioSource.

    We recommend looking at the other scripts and prefabs included with this integration. They will provide more insight as to what is possible with Oculus Lipsync. We include, for example, some helper scripts to facilitate easy on-screen (in VR) debugging.

  • motorsep Posts: 1,362 Oculus Start Member
    @SmartDigitalNetworks

    I don't believe it's supported in UE4 and it's only to do lipsync. It's not going to recognize voice.
  • Nishantbtps Posts: 2
    NerveGear
    Can anyone give a solution for using the Android speech engine on Gear VR?