It may sound a little nutty, but if the goal is to provide interconnected virtual worlds over the internet, then surely we need a common language? VRML was a fantastic failure in the 90s (I had a friend who worked for sub-club.com), but with the current focus on web VR and Facebook, it could be time for a revival.
"X3D is a royalty-free ISO standard XML-based file format for representing 3D computer graphics. It is successor to the Virtual Reality Modeling Language (VRML)."
HTTP can carry binary data just as well as text, so you don't have to use a text format. But even if you do, there are open formats such as Wavefront OBJ.
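To give a feel for how simple a text format like OBJ is, here is a rough sketch of a reader in Python. It assumes a tiny subset of the format (only `v` vertex lines and `f` face lines with positive indices); the `parse_obj` helper and the sample triangle are made up for illustration, not taken from any library.

```python
# Minimal sketch of a Wavefront OBJ reader.
# Assumption: only "v" (vertex) and "f" (face) lines matter, and face
# indices are positive. Real files can also contain normals, texture
# coordinates, negative (relative) indices, groups and materials.

SAMPLE_OBJ = """\
# a single triangle
v 0.0 0.0 0.0
v 1.0 0.0 0.0
v 0.0 1.0 0.0
f 1 2 3
"""


def parse_obj(text):
    """Return (vertices, faces) parsed from OBJ text (hypothetical helper)."""
    vertices = []
    faces = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and comments.
        if not parts or parts[0].startswith("#"):
            continue
        if parts[0] == "v":
            # Take x, y, z; an optional fourth (w) value is ignored.
            vertices.append(tuple(float(x) for x in parts[1:4]))
        elif parts[0] == "f":
            # Face entries look like "i", "i/t", "i//n" or "i/t/n";
            # keep the vertex index and convert from 1-based to 0-based.
            faces.append(tuple(int(p.split("/")[0]) - 1 for p in parts[1:]))
    return vertices, faces


vertices, faces = parse_obj(SAMPLE_OBJ)
print(len(vertices), "vertices,", len(faces), "faces")
```

The same bytes could be served over HTTP as-is; whether you then prefer a text format like this or a binary one is mostly a trade-off between readability and size.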