What makes a 3D video call realistic? What is the artificiality that can make our video calls uncomfortable? Is it the same as in 3D computer games or movies?
We think it is related to the “avatar approach” to 3D communication. This approach is a classic: every participant in a call has their own avatar, and only expressions, movements and sound are transmitted. These are then applied to the avatars and displayed to users through, for instance, an XR headset.
This approach can be perfected, of course, as in the “codec avatars” work (Facebook 2020 Research: Photorealistic Avatars & Full Body Tracking), where the description of expression and movement is extended to cover as much as possible.
Still, maybe the avatar approach itself is an obstacle to realism. When the image of a person as seen by the camera is translated into a description of expression and then applied back to the avatar, the reality of the moment travels through too many layers of complex transformations. Ultimate realism is the one with zero transformations.
Why do we complain when there is not enough realism in communication? Don’t we prefer to look beautiful rather than realistic? Don’t we need high image quality and resolution rather than realism? Actually, we want all of these and realism :-), as a realistic experience in low resolution can be better than an overly artificial one in high resolution!
To communicate comfortably and efficiently, we rely on every nonverbal cue that aids our message. We don’t know what all the elements of such nonverbal content are, but the more realistically we reproduce the remote person, the less nonverbal content will be lost. As humans, we have very powerful 3D perception. So, if we display the other person well in 3D, it should be very natural and comfortable for us to perceive her/him.
How to achieve realism in 3D video calls? In the end, the quality of the remote person’s reproduction is defined by how much information we collect, transfer and reproduce. Collection and reproduction are defined by the available cameras, depth sensors and 3D headsets. Transfer is defined by the information we lose in the process of packing, sending and unpacking between the camera of person A and the 3D headset of person B. We believe that transferring as much information as possible, while altering it as little as possible, is the key to realism. Instead of layered transformations, we try to locally recover the information that is broken or missing.
When will 3D video calls arrive? We are working hard to bring them to you soon. XR/VR headsets are becoming available, current smartphones include high-quality cameras, 3D sensors, and powerful processors and GPUs, and we have the vision and the technology to make 3D calls real.