The survey involved 5,616 American consumers aged 18-54. The results showed that 69% of people watch videos with the sound off in public places and 25% do so in private places. As many as 80% of consumers said they were more likely to watch a video to completion if captions were available. Half of the respondents also emphasized that subtitles were important to them because they usually watch videos with the sound turned off.
Adding subtitles to your videos does not have to be complicated or expensive. Thanks to Automatic Speech Recognition (ASR) technology, the online application Beey.io offers a fast and easy way to transcribe videos, interviews, podcasts and other audio or video files into text.
Immersive content has become a popular medium for storytelling. This type of content is typically accessed via a head-mounted visual display, within which the viewer is located at the center of the action with the freedom to look around and explore the scene. The criteria for subtitle position in immersive media still need to be defined. Guiding mechanisms are necessary for circumstances in which the speakers are not visible and viewers, lacking an audio cue, require visual information to guide them through the virtual scene. The aim of this reception study is to compare different subtitling strategies: always-visible versus fixed-position subtitles, and arrows versus radar as guiding mechanisms. To do this, feedback on preferences, immersion (using the IPQ questionnaire) and head movements was gathered from 40 participants (20 hearing and 20 hard of hearing). Results show that always-visible subtitles with arrows are the preferred option. Always-visible subtitles and arrows achieved higher scores in the IPQ questionnaire than fixed-position subtitles and radar. Head-movement patterns show that participants move more freely when subtitles are always visible than when they are in a fixed position, meaning that with always-visible subtitles the experience is more realistic, because viewers do not feel constrained by the implementation of the subtitles.
Subtitles have become an intrinsic part of audiovisual content, in what Díaz-Cintas (2013, 2014) has labeled the commoditization of subtitling. Nowadays, subtitles can be accessed with just one click on digital televisions, video-on-demand platforms, video games and so on. The number of subtitle consumers is also increasing, and their needs and reasons for using the service are varied: they are learning a new language, watching television at night and do not want to disturb their children, commuting on a train without headphones, have hearing loss and use subtitles to better understand the content, or, as non-native speakers, simply prefer to watch content in its original language with subtitles to assist.
Research on subtitles in immersive media is relatively recent. Researchers in this field have highlighted several of the challenges faced when designing and implementing subtitles in immersive media (Rothe et al., 2018; Brown et al., 2018). Firstly, the position of the subtitles needs to be defined. Subtitles on a traditional screen are already standardized and are usually located at the bottom-center of the screen, which is static. However, the field of view (FoV) in 360º content is dynamic, and viewers can decide where to look at any time during the scene. Therefore, the position of the subtitles needs to be carefully defined to avoid loss of content during the experience.
The subtitles need to be located in a comfortable field of view (CFoV), that is, a safe area that is guaranteed to be visible to users. If subtitles extend beyond the CFoV, they will be cropped and therefore unintelligible. In addition, a guiding mechanism needs to be included to enhance accessibility: if the speaker is outside the FoV, persons with hearing loss (for whom audio cues are not always helpful) need a guiding system to indicate where to look. As 360º content aims to provide an immersive experience, subtitles must be created in a way that does not disrupt this immersion. Finally, some viewers suffer from VR sickness or dizziness when watching VR content; the design of the subtitles should not worsen this negative effect, and the subtitles should be easy to read.
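The left/right decision such a guiding mechanism has to make can be reduced to the signed angular offset between the viewer's current yaw and the speaker's position in the sphere. The sketch below is a minimal illustration of this idea, not the ImAc implementation; the function name, the degree-based yaw convention and the 90º horizontal FoV are assumptions.

```python
def arrow_direction(viewer_yaw_deg, speaker_yaw_deg, fov_deg=90.0):
    """Decide which arrow (if any) to show, given the viewer's current yaw
    and the speaker's yaw in the 360-degree scene, both in degrees.

    The shortest signed angular difference is computed; if the speaker lies
    outside the horizontal field of view, an arrow points the shorter way.
    """
    # Wrap the difference into (-180, 180] so the arrow takes the short path
    offset = (speaker_yaw_deg - viewer_yaw_deg + 180.0) % 360.0 - 180.0
    if abs(offset) <= fov_deg / 2:
        return None  # speaker is visible: no arrow needed
    return "right" if offset > 0 else "left"
```

For example, a speaker at 170º relative to the viewer would trigger a right-pointing arrow, while one at 30º would show no arrow at all.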
In the Immersive Accessibility (ImAc) project, some preliminary studies have approached this topic to gather feedback from users before developing a solution (Agulló et al., 2018; Agulló & Matamala, 2019). In a focus group carried out in Spain, participants with hearing loss were asked how they would like to receive subtitles in 360º videos. They agreed that they would like them to be as similar as possible to those shown on traditional screens. They also stated that they would like the subtitles to be bottom-center of their vantage point and always in front of them. The participants also highlighted the importance of using the current Spanish standard for SDH (AENOR, 2003). Regarding directions, participants suggested the inclusion of arrows, text in brackets (to the left and to the right), and a compass or radar to indicate where the speakers are in the scene (Agulló & Matamala, 2019). In a different preliminary study, feedback from a limited number of users was gathered regarding the CFoV and two guiding mechanisms (icons representing arrows and a compass). Results showed that the users preferred the arrows as a guiding mechanism and the largest font in the CFoV because it was easier to read (Agulló et al., 2018).
Some reception studies have already been conducted on subtitles in immersive media (Rothe et al., 2018; Brown et al., 2018). The BBC Research & Development team proposed four behaviors for subtitles in 360º videos based on previous literature and design considerations.
The team tested the four options with 24 hearing participants, using six clips lasting 1 to 2 minutes each. The Static-Follow behavior was preferred by participants because the subtitles were considered easy to locate and gave participants the freedom to move around the scene. Some issues were highlighted, such as obstruction (a black background box was used) and VR sickness (Brown et al., 2018). Rothe et al. (2018) conducted another study with 34 hearing participants comparing two positions: (a) static subtitles, which were always visible in front of the viewer and followed their head movements; and (b) dynamic subtitles, which were placed near the speaker in a fixed position. Participants did not state a clear preference in this study; however, dynamic subtitles performed better in terms of workload, VR sickness, and immersion.
Current solutions implemented by broadcasters such as the NYT or the BBC mainly use burnt-in subtitles, repeated evenly around the 360º sphere every 120º. An example can be seen in the short documentary The Displaced by the NYT (The New York Times, Within et al., 2015). In other media, such as video games, subtitles that are always visible in front of the viewer are the most frequently used (Sidenmark et al., 2019). Different solutions are being tested and implemented by content creators, but a consensus on which option works best has not yet been reached.
This study aims to further clarify which subtitles are more suitable for immersive environments and for all kinds of users, including hearing participants and participants with hearing loss. With this aim, the current solutions implemented by the main broadcasters (fixed subtitles located every 120º) are compared to the solutions developed in the ImAc project (always-visible subtitles and guiding mechanisms). Please note that fixed-position subtitles in ImAc terminology are referred to as 120-Degree by Brown et al. (2017); in Rothe et al. (2018), dynamic subtitles are fixed in one position close to the speaker, so the implementation is different. Always-visible subtitles in ImAc terminology are equivalent to Static-Follow in Brown et al. (2017) and to static subtitles in Rothe et al. (2018). This study contains a higher number of participants (40) than previous research and includes a higher number of participants with hearing loss (20) in the sample. The subtitles were developed following SDH conventions (AENOR, 2003), and the testing of guiding mechanisms in 360º videos is, to the best of our knowledge, unprecedented in subtitling research. An additional contribution of this study is the use of longer content to better measure participants' preferences and immersion. In the following sections, the study and its results are presented.
The experiment was carried out in one session divided into two parts. In the first part, the position of the subtitles was tested; in the second part, the guiding methods were tested. A within-subject design was used to test the different conditions. Each participant was asked to watch an acclimation clip along with four other clips (see the Stimuli section). The clips were presented under different conditions (fixed-position, always-visible, arrow and radar). Both the clips (except in the guiding-methods part) and the conditions were randomized among participants to avoid a learning effect. The video used for the guiding-methods part was a short science-fiction movie called I, Philip. It lasted around 12 minutes and was cut into two parts in order to test the two conditions (arrows and radar). The order of these two clips was not altered, however, because they followed the narrative of the movie and participants would not otherwise have understood the story. For this reason, only the conditions were randomized. Twenty participants watched each clip/condition combination.
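The randomization scheme described above can be sketched as follows. This is an illustration only, not the study's actual test scripts: the clip labels and the seeding-by-participant choice are assumptions made for the example.

```python
import random

# Hypothetical labels; the study also used an acclimation clip, omitted here.
POSITION_CONDITIONS = ["fixed-position", "always-visible"]
GUIDING_CONDITIONS = ["arrow", "radar"]
POSITION_CLIPS = ["clip_1", "clip_2"]                  # interchangeable clips
GUIDING_CLIPS = ["i_philip_part1", "i_philip_part2"]   # narrative order is fixed

def assign_session(participant_id):
    """Build one participant's randomized session plan.

    Position part: both the clips and the conditions are shuffled.
    Guiding part: only the conditions are shuffled, because the two
    I, Philip segments must stay in narrative order.
    """
    rng = random.Random(participant_id)  # reproducible per participant
    clips = POSITION_CLIPS[:]
    conds = POSITION_CONDITIONS[:]
    rng.shuffle(clips)
    rng.shuffle(conds)
    position_part = list(zip(clips, conds))

    guiding_conds = GUIDING_CONDITIONS[:]
    rng.shuffle(guiding_conds)
    guiding_part = list(zip(GUIDING_CLIPS, guiding_conds))  # clip order kept

    return position_part + guiding_part

# Example: a full (clip, condition) plan for one participant
plan = assign_session(7)
```

Seeding the generator with the participant ID keeps each assignment reproducible, which simplifies auditing the counterbalancing afterwards.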
An Apache web server hosting the player resources and media assets (360º videos and subtitles) was set up on a PC for the evaluation. A Samsung Gear VR with a Samsung Galaxy S7 was used. The videos were accessible via a URL that pointed to the server resources.
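As a rough stand-in for the Apache setup, a minimal static file server in Python illustrates the arrangement: the headset requests the player resources and media assets over HTTP from the PC. The directory name is hypothetical, and port 0 (pick any free port) is used here only so the sketch runs anywhere; a fixed port would be used in practice.

```python
from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler

ASSET_DIR = "media_assets"  # hypothetical folder with 360º videos and subtitle files
PORT = 0                    # 0 = any free port; in practice a fixed port such as 8080

# Serve ASSET_DIR over HTTP, as Apache served the player resources in the study
handler = partial(SimpleHTTPRequestHandler, directory=ASSET_DIR)
server = HTTPServer(("0.0.0.0", PORT), handler)
# server.serve_forever()  # headset then loads e.g. http://<pc-ip>:<port>/clip.mp4
```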