USARI Technical Report xxxx
Joseph Psotka U.S. Army Research Institute
Sonya A. Lewis Howard University and the U.S. Army Research Institute
[Image: The experimental room and scene]
June 1995
___________________________
Army Project Number 0611102B74F
Education and Training Technology
NOT YET approved for public release; distribution is unlimited
As the Army modernizes, increasing quantities of digital information about the location of vehicles and personnel and the status of resources become available on the battlefield. Soldiers and leaders must be able to use novel visual displays of information via networked computers to communicate and make decisions. Moreover, these skills must often be used with minimal recent practice. The Army increasingly uses distributed interactive simulation (DIS) and virtual reality in training to improve these and other warfighting skills. Head-mounted displays and other novel computer-generated imagery are increasingly being relied upon for delivering crucial real-time information, as well as for training simulations. How well these displays are cognitively and perceptually interpreted is foundational knowledge for systematic deployment of these powerful new resources. Particularly when synthetic environments are created to mimic reality, as for mission rehearsal, the accuracy of visual perception and cognition will be critical for effective learning and transfer, and minimization of simulator sickness.
Although size and distance estimation are historically venerable areas of psychological research, the last few years of technological development have created unsurpassed novel tools for their exploration in ways that simply were not possible earlier. The development of head-mounted displays has created a new paradigm for research: the exploration of immersion into simulation environments. Immersion is the cognitive conviction that you are located inside the spatial framework of the visual display. The factors that help determine the accuracy of this self-location are the objects of the research reported here.
The results indicate that distance perception and self-location are substantially affected by the display field of view and the computed field of view of the synthetic environment. The errors in self-location may underlie the widespread findings of underestimation of distances in virtual worlds and computer-generated imagery. They may also contribute to a better understanding of the many findings of simulator sickness in realistic tank and helicopter trainers. These results offer a new understanding of the phenomena that may be exploited to create solutions for improving training and real-time use of computer-generated imagery.
This basic research advances several core technologies that will be transferred to more advanced research, such as simulation of dismounted infantry at PM-TRADE Field Unit, and helicopter navigation at STRATA. Future work will have direct bearing on distributed simulator design for "popped hatch" tank simulators, effective use of motion platforms in simulators, and spatial navigation training. Learning with these new media is significantly faster and more cost effective than with traditional simulation and training approaches.
EDGAR M. JOHNSON
Director
The authors gratefully acknowledge the programming assistance of several people, particularly Richard Gebhard, Jonus Gerrits, and especially Mark Pflaging. The comments and advice of Don King and Bob Seidel were very helpful.
The Army increasingly uses distributed interactive simulation (DIS) and virtual reality in training to improve mission rehearsal and other warfighting skills. Head-mounted displays and other novel computer-generated imagery are increasingly being relied upon for delivering crucial real-time information, as well as for training simulations. How well these displays are cognitively and perceptually interpreted is foundational knowledge for systematic deployment of these powerful new resources. Particularly when synthetic environments are created to mimic reality, as for mission rehearsal, the accuracy of visual perception and cognition will be critical for effective learning and transfer, and minimization of simulator sickness.
Although size and distance estimation are historically venerable areas of psychological research, the last few years of technological development have created unsurpassed novel tools for their exploration in ways that simply were not possible. The development of head-mounted displays has created a new paradigm for research: the exploration of immersion into simulation environments. Immersion is the cognitive conviction that you are located inside the spatial framework of the visual display. The factors that help determine the accuracy of this self-location are the objects of the research reported here.
The accurate location of one's (sometimes virtual) egocenter in a geometric space is of critical importance for immersion technologies. Self-location is a relatively unexplored component of size and distance estimation. This experiment was conducted to investigate the role of field of view (FOV) and observer eye station points (ESP) in the perception of the location of one's egocenter (the personal viewpoint) in virtual space. Fifty students binocularly viewed an animated 3D model, either of a room similar to the one where they sat or of a space of round orbs of unfamiliar size, from ESPs of 1/2, 1, 2, 3, 4, or 5 feet. The display was a 190 by 245 mm monitor, at a resolution of 320 by 200 pixels with 256 colors. They saw six models of both the room and the orbs, designed with six geometric field of view (FOVg) conditions of 18, 28, 37, 48, 86, and 140 degrees. They drew the apparent paths of the camera in each model of the room on a bitmap image of the room as seen from infinity above.
The results indicate that distance perception and self-location are substantially affected by the display field of view and the computed field of view of the synthetic environment. The errors in self-location may underlie the widespread findings of underestimation of distances in virtual worlds and computer-generated imagery. They may also contribute to a better understanding of the many findings of simulator sickness in realistic tank and helicopter trainers. These results offer a new understanding of the phenomena that may be exploited to create solutions for improving training and real-time use of computer-generated imagery.
Differences in the paths of the camera were seen as a function of both FOVg and ESP. The width of the perceived path became smaller with larger FOVg, and this relationship between FOVg and pathwidth became increasingly non-linear and changed slope with decreasing FOV (or increasing ESP). There were also non-linearities in the relationship between pathwidth and ESP, but these were only significant for the widest FOVg. These effects on the location of one's egocenter need to become better understood, both to develop a thorough understanding of perceptual size and distance, and to increase our understanding of immersion.
This basic research offers strong evidence that narrow field of view goggles, night vision systems, and helmet-mounted displays may result in substantial errors of self-location in simulated environments, which in turn may lead to underestimation of distances. The research advances several core technologies that will be transferred to more advanced research, such as simulation of dismounted infantry at PM-TRADE Field Unit, and helicopter navigation at STRATA. Future work will have direct bearing on distributed simulator design for "popped hatch" tank simulators, effective use of motion platforms in simulators, and spatial navigation training. Learning with these new media is significantly faster and more cost effective than with traditional simulation and training approaches.
Some work exists that may help in understanding the psychology of self-location, also discussed in terms of egocenters (e.g., Howard, 1982; Ono, 1981). Kubovy (1986) provides an insightful description of the techniques Renaissance artists used to manipulate the location of virtual egocenters, and thus manipulate attitudes and emotions. Franklin, Tversky, and Coon (1992) have conducted a long series of experiments examining the cues that control placement of point of view in spatial mental models derived from textual descriptions. The theory of off-sized perceptions (Gogel, 1990) is one of the few perceptual theories available to deal with cognitive modifications of perception in a way that emphasizes the importance of self-location. According to the phenomenal geometry underlying off-sized perceptions, the localization of objects in space requires a combination of perceived distance, perceived direction, and the perception of the position or motion of the self. Inter-relations like these will be examined in this paper.
While examining 3D CAD models, the senior author discovered some unusual illusions reported in Psotka (1994). By creating a computer model of an experimental room with computers and tables and bookshelves in it, he was able to compare his perceptions of the model displayed by the monitor with his perceptions of the real room, containing that computer displaying that model. One illusion was particularly striking. In it, he observed an animation of the model and tried to imagine where he was and locate his virtual egocenter in the space of the model. He found it was possible to determine the location of this virtual egocenter quite naturally and automatically. It appeared to be located within the space of the model, as if the model were being viewed from the approximate center of the room, turning steadily 360 degrees. It was his perception, confirmed by other observers' reports (Psotka, 1994), that the center of observation of the model was not a stationary point, but an orbit that varied in diameter with the computed or geometric Field of View (FOVg) of the model. This oval or circular orbit was experienced even when the observer knew that the model was constructed with a fixed computed eye station point (ESPg). When the FOVg was less than the FOV of the monitor, then the perceived virtual camera or eye station point (ESPp) seemed to be closer to the objects in the animation, and when the FOVg was greater than the monitor's, the ESPp seemed to be farther than the center from the objects in the animation. Asking students and colleagues to draw the apparent location of these virtual eye station points yielded circles and ovals about the center of the room, the location of the virtual "camera" in the animations.
In those experiments sixteen students and colleagues were asked to view the animations binocularly, with corrected vision, from two viewing station points of 300 mm and 800 mm from the monitor, and determine the location and path of the camera in each animation. The room was normally lit by recessed ceiling lights. They were told that the animation was of the experimental room where they sat. They were shown a bitmap hardcopy of the room from an overhead view and asked to trace the path of the camera on it. They were not specifically told that the geometric "camera" was mathematically or "theoretically" stationary in the animations.
In general, the observers had no difficulty describing the apparent paths of the virtual camera, which they saw as oval paths of varying eccentricity centered on the geometric center of the room, and there was good agreement about the size of the ovals. The diameters of the ovals varied with the focal length of the lens (another way of expressing FOVg). The radii of these ovals, in mm, for each animation and station point are given in Table 1. A negative number indicates that the virtual egocenter (ESPp) was closer to the objects than the center of the model, and a positive number indicates that the camera was seen as farther from the objects than the center of the model. A zero would indicate the geometric center of the room (ESPz). Both viewing station points yielded similar relationships between the radius of motion and the geometric FOVg of the animations (see Table 1).
The eye station points that yield a veridical percept of the "camera" or viewpoint inside the model (ESPp) as stationary can be interpolated from these data. These points (ESPz) are of special potential importance for Virtual Reality researchers, since all of the careful engineering of synthetic environments is really to create conditions for accurate perceptions. By interpolating these points, one can determine where the observers would have seen no camera motion (ESPz).
For the 800 mm ESP, the paths had 0 diameter with 60 degree FOVg or a geometric eye point (ESPg) of approximately 250 mm.
For the 300 mm ESP, the paths had 0 diameter with 80 degree FOVg or a geometric eye point (ESPg) of approximately 150 mm.
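The interpolation can be illustrated with a minimal sketch. It assumes simple linear interpolation between the two FOVg conditions that bracket a sign change in the observed radii of Table 1; the text does not say which interpolation method was actually used, and the function name `zero_crossing` is hypothetical. Under this assumption the zero crossings fall near 54 degrees (800 mm group) and 77 degrees (300 mm group), in the neighborhood of the approximate 60 and 80 degree values reported above.

```python
def zero_crossing(xs, ys):
    """Return the x at which y first crosses zero, by linear interpolation.

    Assumption: straight-line interpolation between adjacent data points.
    Returns None if the data never change sign.
    """
    for i in range(len(xs) - 1):
        x0, y0, x1, y1 = xs[i], ys[i], xs[i + 1], ys[i + 1]
        if y0 == 0:
            return float(x0)
        if y0 * y1 < 0:
            # Fraction of the interval at which the line reaches zero
            return x0 + (0 - y0) * (x1 - x0) / (y1 - y0)
    return float(xs[-1]) if ys[-1] == 0 else None


# Observed path radii (mm) from Table 1, by FOVg (degrees)
fov_g = [18, 48, 86, 140]
radius_300 = [-541.3, -278.7, 83.7, 912.5]   # Group 1, ESP = 300 mm
radius_800 = [-785.0, -77.5, 416.3, 538.8]   # Group 2, ESP = 800 mm

zero_300 = zero_crossing(fov_g, radius_300)
zero_800 = zero_crossing(fov_g, radius_800)
print(f"300 mm ESP: zero path radius near FOVg = {zero_300:.1f} degrees")
print(f"800 mm ESP: zero path radius near FOVg = {zero_800:.1f} degrees")
```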
These students and colleagues repeatedly remarked that they appeared to be using the frame of the monitor as the frame of reference of their retinal field. That is, they thought of the monitor frame as if it were a full 180 degrees. They also remarked that they were not so much seeing the "camera" as "actually being there", inside the model of the room, seeing the model with their own eyes from inside the model. When asked to describe what was happening, they said they appeared to be contracting their field of attention to the frame of the monitor, and then treating that cognitively as if it were their entire 180 degree visual field.
By taking their descriptions seriously, one can generate a set of predictions for these frame effects on self-location. If they were in fact creating these cognitive frame effects at a processing level, then the geometric eye point of the animation would not be determined by the size of the monitor, but by a cognitive process that somehow computed where the eyepoint would have to be to have the monitor fill the entire 180 degree natural FOV. This is a kind of off-sized perception. At this cognitive level, knowing where the ESP needs to be for a particular familiar object to subtend a given visual angle depends on many factors (Gogel, 1990). Explaining how or why this happens is beyond the scope of this paper. Yet the computations that result appear to fit the empirical data in ways that are difficult to explain any other way. The key appears to be, both phenomenologically and empirically, the ratio between the virtual size of the attended monitor in degrees and the natural full field of view, roughly 180 degrees. The cognitive frame process that "expands" the monitor to 180 degrees may also then expand the geometric eye point of the model by a similar ratio.
If this cognitive frame process were to occur, then the correct ESP for viewing a model is not the ESPg, but ESPg modified by the natural FOV, 180 degrees. In fact, if one proposed that the zero station point (ESPz) is determined by the product of the animation's geometric eye station point (ESPg) times the ratio of 180/FOVg, one could calculate the predicted station points for zero camera motion (ESPz) with:
ESPz = (180/FOVg)*ESPg
For this experiment these predictions are: 8000, 1100, 287, and 50 mm (see Table 1). For these FOVg of 18, 48, 86, and 140 degrees, the predictions are quite close to the empirical observations provided by ten observers who were simply asked to move back and forward to find the station point for least apparent camera motion: 9112, 1092, 291, and 53 mm (see Table 1).
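As a worked check of the formula ESPz = (180/FOVg)*ESPg, the short sketch below applies it to the FOVg and ESPg values listed in Table 1. The function name and the `natural_fov_deg` parameter are illustrative choices, not part of the original analysis. The computed values (8000, about 1088, about 293, and about 51 mm) agree with the rounded predictions quoted above to within a few percent.

```python
def zero_motion_station_point(fov_g_deg, esp_g_mm, natural_fov_deg=180.0):
    """ESPz = (natural FOV / FOVg) * ESPg, with distances in mm."""
    return (natural_fov_deg / fov_g_deg) * esp_g_mm


# FOVg (degrees) -> ESPg (mm), as listed in Table 1
table1_geometry = {18: 800, 48: 290, 86: 140, 140: 40}

predicted_espz = {
    fov_g: zero_motion_station_point(fov_g, esp_g)
    for fov_g, esp_g in table1_geometry.items()
}

for fov_g, esp_z in predicted_espz.items():
    print(f"FOVg {fov_g:>3} deg: predicted ESPz = {esp_z:.1f} mm")
```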
This cognitive frame process relationship, roughly dependent on the natural field of view of 180 degrees, seems to indicate that when the FOVg is approximately 180 degrees, the egocenter is located correctly (i.e., ESPg = ESPz), but when the FOVg is less than approximately 180 degrees, the egocenter is displaced proportionately.
This is very reminiscent of the proportional frame effects found throughout the perceptual literature (Rock, 1975). In these situations objects are shown in reduced vision situations, often monocularly in the dark with only the objects visible hanging in space (e.g. Beall, Loomis, Philbeck, and Fikes, 1995). Objects in smaller frames are judged to be proportionately smaller than objects in larger frames. In fact, there is a powerful tendency to base size judgements on a compromise between the absolute or physical size of an object and its proportional size in the frame. However, it is not only size that may be affected, but apparent distance (Gogel, 1990), or perhaps location of one's virtual egocenter (Kubovy, 1986).
| Computed Geometry of the Model | | | | |
|---|---|---|---|---|
| FOVg (degrees) | 18 | 48 | 86 | 140 |
| ESPg (mm) | 800 | 290 | 140 | 40 |
| Eye Station Point (path radius, mm) | | | | |
| Group 1 - 300 mm | -541.3 | -278.7 | 83.7 | 912.5 |
| Predicted | -770.0 | -210.0 | 3.3 | 193.3 |
| Group 2 - 800 mm | -785.0 | -77.5 | 416.3 | 538.8 |
| Predicted | 720.0 | -76.7 | 242.2 | 582.2 |
| ESPz Empirical (mm) | 9112 | 1092 | 291 | 53 |
| ESPz Predicted (mm) | 8000 | 1100 | 287 | 50 |
Understanding why observers did not see the virtual camera as stationary in the virtual room, but instead perceived it to move in an oval or circular path dependent on FOVg, requires a great deal of speculation. Illusions of this kind have not been much discussed. In imagining themselves to be the camera in the model animation, a great number of cognitive processes that have never been analyzed may come into play. How these observers were even able to make systematic judgments about the position of the virtual camera is not clear, and that these judgments were consistent and correlated with each other is quite mysterious. However, it is well known that people can easily imagine themselves as an allocentric participant in the space of movies or pictures (Kubovy, 1986), so the mere fact that these observers were able to do it should not be surprising.
Since we have very little data to work with, it seems appropriate to begin creating a theoretical framework from our own introspections. Introspectively, the perceived difference from the center of the room seemed to be proportional to the difference between the ESPz that leads to a veridical perception of the room model and the actual ESP. In the case of an ESP of 800, that actually was equivalent both to the centerpoint of the model room, and the virtual location of the camera in the model. The perceived difference from the center of the room also seemed proportional to the computed FOVg and the expected standard FOV of 180 degrees. If one naively takes these factors as proportional weights to predict the width of the paths drawn, one can use them quantitatively: (ESPz - ESP) * FOVg/180 to generate the predictions of width of path in Table 1 and Figure 1.
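The expression (ESPz - ESP) * FOVg/180 can be evaluated directly, as in the minimal sketch below. It assumes ESPz is first obtained from ESPz = (180/FOVg)*ESPg using the ESPg values of Table 1; the function name is hypothetical. The magnitudes it produces match the "Predicted" rows of Table 1 (for example 770.0 and 210.0 mm at the 300 mm ESP). Most of the table's entries carry the opposite sign from the expression as written, apparently following the convention that negative values mean "closer to the objects", so only magnitudes are compared here.

```python
def predicted_path_width(fov_g_deg, esp_g_mm, esp_mm, natural_fov_deg=180.0):
    """Evaluate (ESPz - ESP) * FOVg / 180 with ESPz = (180 / FOVg) * ESPg.

    All distances are in mm. The sign follows the expression as written
    in the text, which may differ from the sign convention of Table 1.
    """
    esp_z = (natural_fov_deg / fov_g_deg) * esp_g_mm
    return (esp_z - esp_mm) * fov_g_deg / natural_fov_deg


# FOVg (degrees) -> ESPg (mm), as listed in Table 1
table1_geometry = {18: 800, 48: 290, 86: 140, 140: 40}

widths = {
    esp: {
        fov_g: predicted_path_width(fov_g, esp_g, esp)
        for fov_g, esp_g in table1_geometry.items()
    }
    for esp in (300, 800)
}

for esp, row in widths.items():
    cells = ", ".join(f"{fov_g} deg: {abs(w):.1f}" for fov_g, w in row.items())
    print(f"ESP {esp} mm, predicted magnitude (mm): {cells}")
```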
The predictions in Table 1 appear to be a modestly good fit for the results, especially for the ESP of 800 mm. The fit for the 300 mm ESP group is not nearly as good, but since at this ESP observers were not actually sitting at the center of the room (the virtual location of the camera in the model) they may have had to do some additional computations cognitively and perceptually, so another factor may be in play in their perceptions. The fit appears to be good enough at least for the initial development of a theoretical framework and the generation of new predictions.
The implications of this illusion of size, distance and egocenter location are more easily understood when the FOVg and the monitor's FOV are correctly matched. For the 800 mm ESP, at the smallest FOVg in this experiment (18 degrees) objects in the model displayed on the viewing monitor had roughly a 1:1 size ratio with objects in the real world; yet the impression was not one of being the real world distance (800 mm) from them, but of being very close to them, 785 mm or 98% of the distance closer, in fact. To repeat, for example, at one point in the animation the model of the monitor came into view on the monitor that was being modelled. At a FOVg of 18 degrees, this was very near the actual FOV of the monitor viewed from 800 mm (20 degree FOV). Both the real monitor and the model appeared the same "size", yet the observers knew they were roughly .8 meter from the real monitor, and still they estimated that their virtual selves were only .005 meters from the virtual monitor. The paradox in this illusion is that they also knew that the virtual monitor was being shown on the real monitor, and so they knew that they were the same distance from the real and the virtual image. Why then did they not see their virtual egocenters at the same location as the real egocenter ESP? Modern psychological theory offers few readily acceptable explanations. Our reasoning about a cognitive frame process, frames and 180 degree natural FOV appears to apply to this situation quite well.
The familiar size of objects might be affecting this illusion of virtual egocenter placement (cf. Gogel, 1990). Objects like chairs and tables and monitors have roughly expected sizes or degrees of visual angle from every distance. Egocenter location could be computed from that information. It is possible to redo this experiment with objects that have no familiar size; and even to remove linear perspective cues by using orbs in a spherical room. In the experiment described here, this was carried out, but essentially the same findings occurred. This experiment also tested predictions of the Cognitive Frame theory by systematically varying FOVg (with different computed models) and FOV (by seating observers at different distances.)
A series of experiments by Ellis and colleagues (McGreevy and Ellis, 1986; Tharp and Ellis, 1990; Nemire and Ellis, 1991) probably reflects indirectly on virtual egocenters. McGreevy and Ellis (1986) discovered a systematic error in pointing to the direction of objects in a virtual display. The error was a function of the geometric FOV of the display. They developed a complex model that accurately predicted these errors on the basis of memory for the size and shape of objects and geometric "distortion" based on linear projections. Tharp and Ellis (1990) provided an explanation based on errors of estimation of the pitch and yaw of the viewing direction used to produce the perspective projection. They argued that people have acquired, through experience of observing the world, a way of determining the effects of viewpoint rotations and perspective transformations. People use this experience to build a "table" of perspective transformations relating target azimuth to projected angle. They then use the wrong table. This is a little like saying that people project themselves at the wrong point, and so it may be possible to find an effect on the location of virtual egocenters in these conditions.
The cues that produce these effects are unknown but may have something to do with the relationship of the actual FOV of the display and the computed geometric FOV of the display image (FOVg). When the ratio of FOV/FOVg is greater than 1, the observers may have located the virtual egocenter too near to the objects; and when the ratio of FOV/FOVg is less than 1, the observers may have located the virtual egocenter too far from the objects. It is not clear from their data which case held, but these relationships appear to be appropriate for their results.
The accurate location of one's virtual egocenter in a geometric space is of critical importance for immersion. Furness (1992) and Howlett (1990) report that immersion is only experienced when the field of view (FOV) is greater than 60 degrees, or at least in the 60 to 90 degree FOV range. Why this should be so is not understood, nor are there theoretical frameworks for beginning to understand this phenomenon.
The question of egocenter location is also important for dealing with simulation or motion sickness. Immersion environments are notorious for producing motion sickness, and an inaccurate location of virtual egocenters may be implicated in this noxious effect. Jex (1991) reports that simulator sickness is hardly ever felt with FOV less than 60 degrees (the complement of immersion FOV). Perhaps a key variable is the quality of immersion and the accuracy of self-localization. Informal comments by users of immersion environments have yielded many descriptions of surprising errors of self-localization (Henry and Furness, 1993). As a start this research begins to explore how egocenters are determined from perceptual arrays.
An accurate model of an office was constructed using 3D Studio on a 386 PC with VGA graphics. The model contained walls, floor, and ceiling, three tables with computers and displays, two bookshelves with empty shelves, and two wastebaskets in the room. It was rendered with Phong shading at 320 by 200 pixels with 256 colors, and looked like a reasonable cartoon of the actual office holding the equipment (see Psotka, 1994).
Animations of this model were then created from the point of view of a stationary camera located at the geometric center of the room panning slowly 360 degrees around the room. Six animations were created with six different computed lenses for the scene: 17, 28, 50, 65, 85, and 135 mm. The geometric field of view (FOVg) for each of these lenses was: 140, 86, 48, 38, 28 and 18 degrees, respectively, where 140 degrees is similar to a fish-eye lens and 18 degrees is a telephoto view. The animations were viewed on a flat screen Zenith monitor whose screen dimensions were 190 by 245 mm. Students viewed the animations from six ESP locations 15, 31, 62, 93, 124, and 156 cm from the screen. At those sites the screen subtended FOVs ranging from 9.3 to 78.5 degrees, approximately. FOV is calculated by 2 times atan(.5 width of monitor/distance of eye point). Although their heads were not restrained mechanically, the students were asked and generally managed to hold their positions reasonably well.
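The screen FOV formula, 2 times atan(.5 width of monitor/distance of eye point), can be sketched as follows. The sketch assumes the 245 mm dimension is the horizontal screen width, which is what reproduces the 78.5 degree figure at the nearest seat; the function name is illustrative. At the farthest seat (156 cm) this calculation gives roughly 9 degrees, close to the approximate 9.3 degrees quoted above.

```python
import math


def screen_fov_deg(screen_width_mm, eye_distance_mm):
    """Visual angle subtended by the screen: 2 * atan(0.5 * width / distance)."""
    return math.degrees(2 * math.atan(0.5 * screen_width_mm / eye_distance_mm))


SCREEN_WIDTH_MM = 245  # assumption: the 245 mm dimension is the screen width
eye_distances_mm = [150, 310, 620, 930, 1240, 1560]

screen_fov = {d: screen_fov_deg(SCREEN_WIDTH_MM, d) for d in eye_distances_mm}

for distance, fov in screen_fov.items():
    print(f"ESP {distance:>4} mm: screen FOV = {fov:.1f} degrees")
```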
The geometric eye point (ESPg) of each of these lenses with this particular monitor ranged from 40 to 800 mm in the room. These projection points are independent of the viewer's location. They are dependent on the actual size of the viewing screen and the geometry of the model. Thus only the nearest three viewing sites for the students fell approximately in the range of the geometric eye points computed for these scenes.
Fifty students from Psychology classes at Howard University were asked to view the animations binocularly, with corrected vision, and determine the location and path of the camera in each animation. They were seated in an 8 foot by 8 foot experimental room. The room was normally lit by recessed ceiling lights. They were first asked to estimate their seating position by placing an X on an overhead plan-view drawing of the room. This allocentric distance estimation procedure specifically asked them to look around the room to determine their position, especially relative to the monitor directly ahead of them, on which they would observe some animations. They were asked to observe moving images on a monitor directly ahead of them, and to draw the path the camera took on a sheet of paper on a clipboard. They were told that the animation was of another experimental room like the one where they sat. They were shown a bitmap hardcopy of the room from an overhead view and asked to mark the location and trace the path of the camera on it. They were not specifically told that the geometric "camera" was mathematically or "theoretically" stationary in the animations. They were asked to imagine that they were the camera as they watched the animation. They were asked to try to imagine the camera path that generated the animation, to ignore any up or down movement of the camera, and to concentrate on the camera's position in the scene as they would see it from above. They were then asked to draw the camera path, and they could check the path again by looking at the animation as they drew. They were also asked to mark an X where the camera was when it pointed at the monitor in front of them when the animation began.
They were asked to sit in a chair and hold their heads as steady as they could, in the same position, throughout the experiment. They were told that no head restraint would be used, but were asked to maintain the same position throughout the experiment.
There were 6 groups of 8 subjects. Each group sat at only one ESP throughout the experiment. One half saw all five computed fields of view of the Experimental Room (ER) twice before they saw the five views of the Orbs Space (OS) twice. The other half saw the OS first. Each sequence of observations followed a Latin square counterbalanced design of the presentation orders that put every FOV at every order of presentation once, and only once, at each viewing distance. It took an average of one hour to complete the experiment.
Figure 2 gives the allocentric distance estimates the observers made of their location in the experimental room, as a function of the FOV of the monitor they were observing. Their distance estimates were linear with FOV, and non-linear with distance from the monitor. As others have pointed out (e.g. Predebon, 1994; Gogel, 1976) visual angle in degrees is an accurate, linear index of size and distance.
The observers could not accurately locate the position of the camera when it was pointed at the monitor in the model. The measure was taken to provide a way of distinguishing when the apparent path of the "camera" was closer to the objects than the center of the room, from instances when it was perceived to be farther from the objects than the center of the room. However, the observers could not consistently distinguish between these instances. It is not that they responded haphazardly, but so many observers simply did not understand or respond to these instructions properly, that the whole question had to be eliminated from our analysis.
As in our earlier experiments, these observers too saw the path of the virtual camera as ovals whose width was dependent both on the computed FOVg of the model and on their seating position (or screen FOV).
An analysis of variance performed on the width of the drawn path of the room model for all six groups of seating positions and six FOVg within those six groups found significant effects of FOVg (5, 25 d.f., F = 70.181, p < .0001). Because of the non-linear relationship with FOVg, the effects of seating distance were not significant (5, 44 d.f., F = 2.116, p > .05). However, the interaction between seating distance and FOVg was significant (25, 220 d.f., F = 2.821, p < .0001). This is evident in the fan-shaped lines of Figure 3, and in Figure 6. Because of the experimental design, the lines in Figure 3 show within-subject differences, whereas those in Figure 6 are between-subject differences.
The results with the round room full of orbs were virtually identical (see Figures 5 and 7). An analysis of variance performed on the width of the drawn path of the orbs model for all six groups of seating positions and six FOVg within those six groups found significant effects of FOVg (5, 25 d.f., F = 104.913, p < .0001). Because of the non-linear relationship with FOVg, the effects of seating distance were not significant (5, 44 d.f., F = 1.588, p > .05). However, the interaction between seating distance and FOVg was significant (25, 220 d.f., F = 1.569, p < .05). Because of the experimental design, the lines in Figure 5 show within-subject differences, whereas those in Figure 7 are between-subject differences.
Table 2 provides the numeric results for the room model, shown in Figure 5, as well as the predictions based on an analysis of Cognitive Frame effects.
| Computed Geometry of the Model | ||||||
|---|---|---|---|---|---|---|
| FOVg (deg) | 18 | 28 | 38 | 48 | 86 | 140 |
| ESPg (mm) | 773.4 | 491 | 366 | 275.1 | 131.5 | 44.5 |
| Eye Station Point | ||||||
| 150 mm | 586.1 | 534.8 | 505.8 | 491.9 | 407.1 | 290.3 |
| Predicted | 758.4 | 467.7 | 335.0 | 235.1 | 59.8 | 72.1 |
| 310 mm | 560.5 | 488.3 | 457.4 | 448.0 | 295.9 | 317.8 |
| Predicted | 742.4 | 442.7 | 302.2 | 192.4 | 16.6 | 196.6 |
| 620 mm | 568.9 | 496.1 | 434.8 | 380.7 | 336.7 | 309.0 |
| Predicted | 711.4 | 394.5 | 238.5 | 109.7 | 164.7 | 437.7 |
| 930 mm | 559.4 | 455.6 | 458.1 | 482.8 | 392.8 | 386.9 |
| Predicted | 680.4 | 346.3 | 174.8 | 27.1 | 312.8 | 678.8 |
| 1240 mm | 558.4 | 510.1 | 487.5 | 488.9 | 410.8 | 396.9 |
| Predicted | 649.4 | 298.1 | 111.1 | 55.6 | 460.9 | 919.9 |
| 1560 mm | 531.0 | 515.6 | 471.1 | 426.6 | 367.8 | 469.0 |
| Predicted | 617.4 | 248.3 | 45.3 | 140.9 | 613.8 | 1168.8 |
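This section does not state the formula behind the Predicted rows of Table 2. As a reader's check only, the tabulated values appear to be consistent with taking the absolute difference between ESPg and the eye station point (converted to cm) scaled by FOVg/18. The sketch below encodes that inferred relationship; the function name, the reference value of 18 degrees, and the formula itself are assumptions read off the table, not the authors' stated method. With this rule the 18, 28, 48, 86, and 140 degree columns agree with the table to within about 0.1, while the 38 degree column matches only if a multiplier of 37/18 is used.

```python
# Inferred from the numbers in Table 2 (an assumption, not a formula given in the report):
#   predicted = | ESPg - eye_station_cm * FOVg / 18 |

REFERENCE_FOVG_DEG = 18.0  # smallest FOVg in Table 2; assumed to be the reference


def predicted_width(espg_cm, fovg_deg, eye_station_mm):
    """Return the inferred predicted value for one cell of Table 2."""
    eye_station_cm = eye_station_mm / 10.0  # seating distances are listed in mm
    scaled_distance = eye_station_cm * fovg_deg / REFERENCE_FOVG_DEG
    return abs(espg_cm - scaled_distance)


# ESPg (cm) for each FOVg (deg), copied from Table 2.
ESPG_CM = {18: 773.4, 28: 491.0, 38: 366.0, 48: 275.1, 86: 131.5, 140: 44.5}


def predicted_row(eye_station_mm):
    """Inferred predictions across all FOVg values for one seating distance."""
    return {
        fovg: round(predicted_width(espg, fovg, eye_station_mm), 1)
        for fovg, espg in ESPG_CM.items()
    }


if __name__ == "__main__":
    for distance_mm in (150, 310, 620, 930, 1240, 1560):
        print(distance_mm, predicted_row(distance_mm))
```

Under this reading, the sign change inside the absolute value is what produces the non-monotonic Predicted rows at the larger FOVg values: the prediction falls toward zero as the scaled seating distance approaches ESPg and rises again beyond it.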
The first-order predictions of Cognitive Frame Theory were verified. There were powerful effects of FOVg and screen FOV on the width of the apparent path of one's self-location in the virtual space. A comparison of Figures 3, 4, and 5 reveals that the quantitative predictions of the theory were not precisely upheld in this experiment. Some of the difficulty may have arisen because the modelled room was not exactly identical to the experimental room. In the earlier experiment, more precise predictions were obtained when viewers sat in the same place in the real room as the camera location in the model.
Some second-order, qualitative components of the predictions appear to be upheld. For instance, the theory predicts that only the widest FOVg will produce a monotonic change in the perceived width of the camera path of self-location, while the smallest FOVg will have the most curvilinear path. These predictions are upheld in the data. However, the data appear to be much noisier than the predictions, with reduced variation between conditions. Part of this may simply stem from the difficulty of the task, especially converting apparent distance estimates into plan-view components.
The relative effects of computed FOV versus real FOV can be examined by comparing Figures 3 and 5 with Figures 6 and 7. Although the ranges of the two kinds of FOV overlap considerably (18 to 140 degrees, a factor of 7.8, for computed FOVg; and 9 to 80 degrees, a factor of 8.9, for real FOV), and each spans roughly a tenfold range, their effects are noticeably different. Computed FOV (FOVg) has a pronounced main effect on the apparent path of the viewpoint, and a strong interaction effect at its upper range. Real FOV appears quite constant in the upper half of its range, or at least has a small and indeterminate effect, and interacts strongly with FOVg in the lower half. The strong interaction effects make it difficult to create a general rule to describe each of their isolated main effects. This is particularly true of the effects of real FOV, where, for small FOV, the apparent width of the path appears to change unpredictably with increasing FOV, for all levels of FOVg (cf. Figures 6 and 7).
Adopting an alternative viewpoint, as when one views a picture and enters the space of the picture, seems relatively straightforward, but it may involve some very difficult cognitive processes and transformations. In the initial part of this experiment, students judged their distance from the viewing monitor by placing a mark on an overhead view of the room in which they sat. The resulting scatterplot of judged distances as a function of real distance from the monitor shows very large dispersion for such a simple task (see Figure 1). Estimating the allocentric distance of objects in a plan-view form may involve even greater processing difficulties.
The kind of activity modelled in these animations, turning your head in place to scan a room, is a very frequent activity in real life and in virtual reality displays. It is far from certain that the findings of these experiments apply to HMD environments, but it is quite plausible to assume that they will. HMDs generally have FOVs that are far under 180 degrees, often only 60 or 90 degrees. FOVg is usually set to equal FOV, under the assumption that this setting will produce the most veridical perception. However, this experiment undermines that assumption. This research indicates that these conditions yield apparent motion of the viewpoint in eccentric paths that might be quite disturbing. It may of course be that people quickly adapt to these effects, by cognitively overriding them or by using other mechanisms of adaptation. But if such apparent head motion continued to exist, it would provide the essential conditions for simulator sickness.
Although no one became nauseated, everyone reported some degree of discomfort when viewing displays with FOVg larger than 60 degrees, especially the largest. Several people asked to look away from the 140-degree FOVg display to reorient themselves during the experiment.
After this research was developed, similar findings in another paradigm were made available (Wright, 1995). In that research, relatively experienced helicopter pilots created magnitude production estimates of forward and lateral distance, height, and speed by flying virtual helicopters in a simulated scene. The pilots were asked to fly specific distances (in meters) or relative distances to fixed objects (e.g., .25 of the distance to the first tower). Their feedback came from changes in first-person viewpoint. Using a high-resolution helmet, the pilots were offered a 65 degree vertical by 125 degree horizontal FOV. Although FOVg was not specified in the experimental report, it was perhaps also 125 degrees. Given this reduced FOV (roughly .69 of the normal 180-degree FOV), one would predict on the basis of these experiments that errors of self-location would make pilots think that the space of the simulation is compressed by a factor of .69. If this is the case, their movements should all be underestimates by this same factor of .69. In fact, the main effect was a drastic underestimation: .41 for forward distance and speed perception; .5 for lateral distance; and .72 for height. Only the results for height match the predictions of the Cognitive Frame hypothesis, and this may be related to the fact that only vertical motion results in no change in self-location from the reference distances of the landmark towers used in this experiment. These accuracies contrast with typical real-world accuracies in similar tasks (Denz, Palmer, and Ellis, 1980; Ungs and Sangal, 1990) of roughly .9. Even taking into account this real-world compression of .9, Wright's experimental findings for forward and lateral distance are more extreme than those predicted from the experiments reported here, but they are substantially in the predicted direction. The differences between forward and lateral motion and vertical motion are not explained.
The complicating factor that makes these predictions difficult is that pilots' motion and at least two self-locations are involved. More research will be needed to understand these complex interactions.
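The arithmetic behind the comparison with Wright (1995) can be laid out as a short sketch. It uses only the figures quoted above (a 125 degree horizontal FOV, a normal FOV of 180 degrees, a real-world accuracy of roughly .9, and the three observed ratios). Multiplying the FOV ratio by the real-world accuracy is one plausible reading of "taking into account" the real-world compression, not a procedure stated in the text, and the function and variable names are illustrative.

```python
NORMAL_FOV_DEG = 180.0  # normal horizontal FOV quoted in the text


def compression_factor(display_fov_deg, normal_fov_deg=NORMAL_FOV_DEG):
    """Predicted spatial compression: display FOV as a fraction of normal FOV."""
    return display_fov_deg / normal_fov_deg


# Values quoted from the discussion of Wright (1995).
HELMET_FOV_DEG = 125.0
REAL_WORLD_ACCURACY = 0.9
OBSERVED = {
    "forward distance and speed": 0.41,
    "lateral distance": 0.50,
    "height": 0.72,
}

predicted = compression_factor(HELMET_FOV_DEG)  # about 0.69
combined = predicted * REAL_WORLD_ACCURACY      # assumed way of combining the two factors

if __name__ == "__main__":
    print(f"predicted compression: {predicted:.2f}")
    print(f"with real-world accuracy applied: {combined:.3f}")
    for name, ratio in OBSERVED.items():
        # Positive shortfall: observed underestimation is more extreme than predicted.
        print(f"{name}: observed {ratio:.2f}, shortfall vs prediction {predicted - ratio:+.2f}")
```

On these numbers, only the height estimate lies near the predicted factor; the forward and lateral estimates fall below both the plain and the combined prediction.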
DISCLAIMER: This does not represent the official view of ARI.
Thanks to Kai-Mikael Jää-Aro (kai@nada.kth.se) for help in encoding this into HTML!