Artwork by Kevin Hughes, Enterprise Integration Technologies.
Initial Survey of Virtual Reality Modeling Language Access Issues (Draft 1b)
Christopher Serflek
- Introduction
- HyperText Markup Language
- Virtual Reality Modeling Language
- Previous Access Solutions
- Proposed Solutions
- Conclusion
- References
INTRODUCTION
There is great interest in the Augmentative Technology (AT) community in the potential benefits that virtual reality may offer [i]. There have been several instances in which a customized solution has been developed successfully to meet a specific need [ii]. Currently there is an emerging standard for a networked virtual environment named the Virtual Reality Modeling Language (VRML).
The focus of this paper is to explore the possible access issues of this emerging standard. I will demonstrate the access issues which cannot be met by existing access technology, and explicate the fundamental reasons why this technology cannot simply be modified for the new environment. Several potential access solutions will be discussed, along with their shortcomings and general inappropriateness. What is strongly required is a conceptual framework from which new access technology can be derived, and which can be presented to researchers outside the AT community to help them comprehend the needs of persons with disabilities. Following this, I will briefly outline conceptual problems with the general approach to virtual environments for persons with disabilities. Finally, I will propose a course of action, congruent with the proposed framework, for meeting the new challenges.
HYPERTEXT MARKUP LANGUAGE
Before proceeding to an introduction of VRML, the reader requires a basic understanding of HyperText Markup Language (HTML). HTML was developed in 1989 at CERN to facilitate the construction of hypertext documents on the Internet [iii]. HTML, in combination with the HyperText Transfer Protocol (HTTP), the Uniform Resource Locator (URL), and client software, forms a new way of accessing information on the Internet called the World Wide Web (WWW), or simply the Web [iv]. Subsequent extensions to HTML made provisions for the inclusion of images, sounds, and now tables.
The Web is very successful and popular as a means of distributing information worldwide on a network. The reasons for this are simple. First, the multimedia documents of the Web are aesthetically pleasing. Second, a simple point-and-click hypermedia document system, in combination with a simple Graphical User Interface (GUI) and a universal way of referring to information, the URL, makes navigating the Web simple and intuitive.
To date, the AT community has been successful in enabling access by either finding provisions within HTML or recommending provisions for future HTML versions. Further, there has been a reasonable degree of success in utilizing previous access technology to facilitate users' interaction with the client software and documents [v]. A thorough examination of the trends suggesting that this level of access may become more complex in the future is beyond the scope of this paper; however, some of these complications are directly related to the introduction of VRML.
VIRTUAL REALITY MODELING LANGUAGE
The Virtual Reality Modeling Language is related to HTML in that it shares HTTP, URLs, and an initial dependence on an HTML browser. Yet the construction of HTML documents and of VRML documents is separate and distinct. VRML is a subset of Silicon Graphics Inc.'s (SGI) Open Inventor, a three-dimensional (3D) graphics object-oriented programming language, with added network extensions [vi]. There are several reasons why such an undertaking was initiated. First, there is no provision within HTML for the description or inclusion of 3D objects or for the 3D representation of information. This omission from HTML excludes both the possibility of creating a standard 3D environment in which objects have behavioral attributes and the ability for users to interactively manipulate objects.
One project currently being developed is a museum which allows users to learn about the Holocaust [vii]. In another example, chemists are utilizing this emerging technology to view 3D chemical models remotely, eliminating the need for close geographic proximity for collaboration or learning [viii].
I will now proceed to a brief discussion of how VRML 1.0 is implemented, presenting only the most basic elements of the specification and the aspects that are directly relevant to the accessibility discussion. For a complete discussion, the reader is directed to the full VRML 1.0 specification.
The initial specification is intentionally simplistic; this should facilitate rapid implementation of the emerging standard and allow for the rapid development of client browser software. In keeping with this simplicity is the primary reliance on simple geometric objects for the construction of all objects: the primitive sphere, cube, and cone are combined to form more complex objects. Following is an example of the source code for creating a cube.
FILE FORMAT/DEFAULTS
     Cube {
          width   2     # SFFloat
          height  2     # SFFloat
          depth   2     # SFFloat
     }
From the VRML Version 1.0 Specification (Draft) [ix].
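To illustrate how these primitives are combined, the following fragment builds a crude table from one cube and two reused cylinders. This is my own sketch rather than an excerpt from the specification; the object names, dimensions, and comments are illustrative assumptions only (and only two of the table's four legs are shown).

     #VRML V1.0 ascii
     DEF Table Separator {                                # a named, self-contained object
          Material { diffuseColor 0.6 0.4 0.2 }           # brown surfaces
          DEF TableTop Cube { width 4 height 0.2 depth 2 }  # flat slab for the top
          Translation { translation -1.8 -1.1 0 }         # move to the first leg position
          DEF Leg Cylinder { radius 0.1 height 2 }        # one leg, defined once
          Translation { translation 3.6 0 0 }             # move across to the far side
          USE Leg                                         # reuse the same leg geometry
     }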
Various properties can be assigned to these objects, such as color and dullness. Graphic image files can be mapped onto the objects to form the visual appearance of textures. Further, various MIME types can be associated with the objects, such as inline graphics and sound files.
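As a sketch of how such properties appear in VRML 1.0 source, the fragment below applies a Material and a mapped image file to a cube. The colour values and the filename "brick.jpg" are hypothetical.

     #VRML V1.0 ascii
     Separator {
          Material {
               diffuseColor 0.8 0.1 0.1     # a reddish surface colour
               shininess    0.1             # a relatively dull finish
          }
          Texture2 { filename "brick.jpg" } # hypothetical image mapped onto the object
          Cube { width 2 height 2 depth 2 }
     }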
Initially, interaction with the objects is quite limited. First, the objects themselves will be static, such that the user is presented with only a two-dimensional image of a three-dimensional scene. The type of interaction the user has with the image is one of "point and click". For example, a user can click on the visual representation of an object, causing either a new image to be presented, a link to a new virtual environment to be initiated, or a link to an HTML document to be followed. Intended for the VRML 2.0 specification is the inclusion of behaviors for the 3D objects. This will be both a large step towards the creation of true virtual environments and a new challenge for access technology researchers. Before analyzing VRML-specific access concerns, it is beneficial to review previously developed access solutions.
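The point-and-click linking described above is carried by the WWWAnchor grouping node. The following sketch, with a hypothetical URL and description, attaches a link to a sphere so that clicking its rendered image retrieves the named document.

     #VRML V1.0 ascii
     WWWAnchor {
          name        "http://www.example.org/gallery.html"   # hypothetical link target
          description "Entrance to the gallery"               # text a browser may show for the link
          Separator {
               Material { diffuseColor 0 0 1 }
               Sphere { radius 1 }                             # clicking this sphere follows the link
          }
     }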
PREVIOUS ACCESS SOLUTIONS
Creating access solutions for DOS-based machines was simplified because all interactions with the system were based on ASCII codes and viewed in a serial fashion. This point will be demonstrated through several examples. To begin, the main method for inputting information was through a standard keyboard. If the user was not able, or if it was not beneficial to the user, to enter information via the keyboard, several alternative methods were created. For example, if the user was restricted by mobility impairments such that they could only operate a single switch reliably, the user could enter Morse code, which was translated into ASCII. To complement this approach, several word prediction schemes were developed to lessen the amount of work necessary. If entering functional command codes as keystroke combinations was troublesome, then the combination could be spread out over time, with single keystrokes entered in sequence acting as a unit.
Concerning output, other adaptive techniques were devised. For users with low vision or blindness, software and hardware combinations were developed to allow information to be presented in a linear, auditory fashion. This was accomplished relatively easily, for all information intended for presentation on a monitor was first written to the video buffer as ASCII text (accessible via the BIOS). Generally this solution worked quite well, especially in applications such as word processors, since generally the only information being presented on the screen was alphanumeric.
An example of where this process was impeded is the University of Toronto's library system, UTLink, where attempts were made to enhance the aesthetics by providing a pseudo menu system. When scanning the screen contents, screen readers would attempt to pronounce the non-alphanumeric characters used to draw these menus. The only solution for situations such as this was to create a profile for the offending system or application.
The Graphical User Interface was introduced as a general method for providing users with a more intuitive and easy way of interacting with a computer system. Unfortunately, the introduction of the GUI created a new set of engineering problems for the AT community. The major challenge was that all interaction was no longer based upon a standard form, such as ASCII, nor could all of the relevant functional information be apprehended.
Concerning input, the mouse now played a dominant role. For persons with mobility impairments, this device ranges from difficult to impossible to operate. Methods devised to allow blind users to operate this device are wanting at best, although methods for others with special needs have been relatively successful.
The transmission of information has also become a complex issue, for not all information is represented in a standard textual form, nor was text always associated in a fully descriptive form to complement the graphic symbols. Secondly, there was no standard, reliable way for programmers to access the information intended for screen presentation. This challenge is further complicated by software developers not following the standards devised by the operating system manufacturers.
The method developed to contend with this situation is to construct an Off-Screen Model (OSM). Basically, an OSM functions by creating a hierarchical textual representation of the menus, icons, windows, and documents. Using a separate keypad, a blind user can navigate a GUI in a cumbersome, serial, and cognitively taxing auditory fashion. More ideal is the possibility of presenting a tactile representation of the GUI. Efforts in this regard are promising; however, they still require a considerable degree of research.
There is little reason to believe that existing methods will function fully and completely in a VRML environment. To explore the reasons why this is so, I will now present three potential methods of making VRML accessible.
PROPOSED SOLUTIONS
THE TIRE-PATCH SOLUTION
The first potential solution to the accessibility issues is analogous to patching a hole in a tire. This approach would attempt to develop a cushion between VRML and existing access technology. In this model, a VRML browser would be developed to be as compatible with existing technology as possible. The emphasis of this approach is to provide functionality first, with enjoyment and ease of use as secondary concerns. The solution would be broken into two parts, namely access to the browser and access to the information, treated as separate and distinct components.
Concerning the browser, the software would be developed in such a manner as to facilitate its use with existing technology on each specific platform. For example, under a Windows environment, the navigation and other functional components of the browser would be created in such a manner as to facilitate the auditory presentation of that information by as many screen reader packages as possible. In essence, the browser would not aid adaptive technology; it would only be constructed such that it would not impede it. For the actual VRML information, the browser must be more proactive.
The reason for the above-stated necessity is obvious. Virtual reality offers a new way of experiencing information within an artificially constructed environment, and a new way to directly interact with that environment. This is more than just an extension of a 2D GUI to a 3D environment, particularly when dynamic, unpredictable properties of the environment are involved. Current access technology is simply not designed for this task. It is feasible, however, that a modified parser could act as an intermediary between VRML source code and existing access technology.
For users who are blind, there is obviously no reason to dedicate computational resources to the task of rendering 3D graphics. Instead, the parser could extract the textual description associated with a particular scene or object and present this to the user. Further, any comments associated with the source code would be presented to the user. Any links specified through the objects would be abstracted and presented in a linear form for assistive technology. Additionally, any qualitative information, such as color or the texture-mapped filenames associated with the objects, would be presented. These links could be filtered such that any referenced HTML document could be retrieved in part, allowing the heading and title of the document to be presented to the user to develop a contextual understanding of the particular link. This would serve to substitute for information that would otherwise have been conveyed by the visual presentation of a 3D object. Further, any direct reference to GIF, JPEG, or MPEG video would simply be mentioned, whereas audio or AVI/QuickTime files would be presented as a retrieval option.
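The fragment below, a constructed example with hypothetical names, files, and URL, indicates the kind of machine-readable text such a parser could extract and present linearly: the comment, the DEF name, the anchor description and its target, the diffuse colour values, and the texture filename.

     #VRML V1.0 ascii
     # Gallery entrance: a red wooden door leading to the second room
     DEF GalleryDoor WWWAnchor {
          name        "http://www.example.org/room2.wrl"   # hypothetical link target
          description "Door to the second gallery"
          Separator {
               Material { diffuseColor 0.8 0 0 }            # red
               Texture2 { filename "woodgrain.jpg" }        # hypothetical texture image
               Cube { width 1 height 2 depth 0.1 }
          }
     }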
Similar techniques could be created to lessen the work for persons with mobility impairments. Contained within the source code of VRML is a metric describing the location and size of the objects within 3D space. A modified parser could interpret this information into 2D coordinates for presentation to software such that it could act as a magnet for a mouse pointer. This would eliminate the fine-motor control necessary for positioning a mouse on an object.
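The position and size information such a parser would rely on is explicit in the source, as in this constructed fragment (the object name and values are hypothetical): the Translation field gives the object's location in 3D space and the primitive's fields give its dimensions, which a parser could project into the browser window's 2D coordinates.

     #VRML V1.0 ascii
     DEF ExitButton Separator {
          Translation { translation 2.5 1.0 -4.0 }   # location of the object in 3D space
          Cube { width 0.5 height 0.5 depth 0.1 }    # the object's dimensions
     }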
This is a workable solution that could be developed and distributed relatively quickly and inexpensively. This solution would indeed act as a tire-patch, allowing functionality, or for the wheels to keep spinning. However, there are harsh limitations and restrictions upon this approach.
PROBLEMS WITH THE TIRE-PATCH APPROACH
The first limitation is the lack of a standard method for the presentation of 2D or 3D text within the VRML 1.0 specification, despite general interest in such a standard. Fortunately, the omission of a predefined method will allow the AT community to be involved in its future specification.
A second problem is the reliance on presenting the comment text in the source code to blind users. It is likely that not all sites will transmit the comment information. Further, it is likely that not all of the text in comments will be contextually beneficial in helping the user apprehend information or navigate the environment.
The AT community could recommend to the VRML community that textual descriptions accompany all objects. This text should contain both the necessary functional information and information that would help to provide contextual clues regarding the environment and the various properties of the object. One possible way to ensure adherence to this provision is to exclude a file not containing this information from being a valid VRML file. This would be accomplished by requesting that all parsers refuse to generate the 3D images unless the textual counterparts are present. Additionally, as VRML authoring packages enter the marketplace, they would require users to include this information as a standard procedure.
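No such requirement exists in the VRML 1.0 specification; the following is only a sketch of what one possible convention could look like, assuming the existing Info node were adopted as the carrier for a mandatory per-object description. A parser enforcing the recommendation would refuse to render an object whose enclosing Separator lacked such a node.

     #VRML V1.0 ascii
     DEF Kiosk Separator {
          # One possible convention: a required Info node carrying the description
          Info { string "Information kiosk: select it to read today's exhibit schedule" }
          Material { diffuseColor 0.2 0.2 0.8 }
          Cube { width 1 height 1.5 depth 1 }
     }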
Even if the above were followed, this would do little to create any sense of a virtual environment. The above solution would not allow for the use of any truly beneficial metaphorical representation of information to ease comprehension. Nor does this approach draw upon any of the worldly skills or strategies that these users have previously developed.
THE TRAINING WHEELS APPROACH
The goal of this approach is to provide a means and method for allowing users to experience a new form of reality in a supportive and unencumbered manner. This approach would focus on maximally assisting users, while avoiding hindering them or adding unnecessary cognitive demands.
Presently, VRML and most virtual environment (VE) work is directed towards providing a comprehensive 3D graphical environment. Very little focus has been placed upon the development of 3D audio, and nearly no research has been performed on the remaining modalities. Obviously, persons who are blind have little use for visually presented information. Persons with low vision and learning disabilities benefit from multi-modal presentation of information. In fact, all users benefit from not having an overabundant amount of information presented visually [x].
Persons who are blind generally have strong abilities in using their senses of touch and hearing for the development of cognitive maps of environments and the creation of mental representations of external objects and concepts. These abilities are completely ignored in the standard interpretation of VRML source code. I believe that a multi-modal environment can be derived from the current incarnation of VRML despite its primary role as a description of visual environments. An adaptive parser can be utilized to generate an accessible and beneficial VE.
It is feasible that complementary auditory and tactile environments could be created from VRML source files. Through the combination of hand-position monitoring and force feedback, a 3D space could be rendered, accompanied by auditory metaphorical aids. This method would not initially be very intuitive for a user; the user would still be required to invest time in learning the various semantic meanings of the tactile and auditory representations. Further research must be conducted for developers to understand which methods of representing these spaces best benefit the user and correspondingly reduce the learning curve.
PROBLEMS WITH THE TRAINING WHEELS APPROACH
There are several challenges with this approach. First, a large part of the burden is placed on the hardware interface. For instance, there is currently no affordable, commercially available interface which sufficiently provides both hand-position monitoring and force feedback. Additionally, only gross variations in a texture can currently be represented to the fingertips.
At this point it is unclear what strategies could be employed to assist users in comprehending their environment. For example, research must be performed to understand what auditory information could be communicated from the VRML file to benefit the user in a contextually relevant and goal-achieving manner.
One possible method for contending with access issues would be to define MIME types which would support users in comprehending specific VEs. This would be the VRML equivalent of the profiles used by DOS and GUI screen readers. An example of another community proposing specific MIME types to express or communicate specific data is the chemistry research community. This provision allows for an international standard for communicating specific types of information. There is, however, one notable difference between the intent of a specific community, such as the chemistry researchers, proposing a MIME standard and that of a proposition by the AT community.
For the chemistry MIME type definition to be successful, it is only necessary that all chemistry information and sites utilize the standard. For an AT MIME type to be successful, all sites would necessarily have to include this information. Therefore, one of two outcomes would follow, only one of them desirable. Either the information contained within the MIME file would be generated automatically, or in some other way that requires little effort, and could thus be widely provided; or, if the file were difficult or expensive to construct, it is unlikely that any sites other than AT-specific and government sites would provide it. This underscores the need for a universal method for the multi-modal communication of semantically rich information that is accessible to all persons in a subjective and context-relative fashion. We need more than reality; we need intelligent and truly adaptive reality.
A HOLISTIC APPROACH
(This part is still very rough.)
Future VEs must be developed such that they have an equal focus on all modalities. Additionally, all information should be incorporated such that it could be apprehended via various modalities in varying degrees. This approach would benefit all populations, including persons with disabilities, persons with temporary injuries, and senior citizens.
The VRML browser and VRML files should be presented and largely treated as one. This would benefit the user by simplifying manipulation and navigation. This is not unreasonable given the large mix of both functional and informational aspects represented by both the browser and VRML environments.
Further research must be performed to quantify the ways in which various persons are able to navigate a natural environment. These abilities should be tapped by the interface for virtual environments. Additionally, these abilities should be assisted such that the work is lessened and goals are readily achieved.
As this information is collected, a browser can be developed which augments itself such that it is of maximum benefit to the user. An extension to VRML would necessarily have to be developed in order for these provisions to be realized. VRML and its extensions should be treated as a generative grammar, allowing for variation and enhancement in the manner in which VE information and functional controls are presented.
Finally, all modifications and extensions should not interfere with the standard interpretation of VRML files; they should merely complement it. All extensions developed should be presented to the VRML specification community for commentary and adoption into future VRML specifications.
CONCLUSION
To conclude, the prospects for inclusion in the current form of VRML are dim. There is a considerable amount of research that must be performed in the near future. It is of utmost importance that the AT community start to present access solutions and concerns to the VRML community before VRML becomes concrete. There are still excellent opportunities for involvement in the development process, as noted in the examples above.
REFERENCES
i Brodin, J., & Magnusson, A. 1994. Disability Applications In The Artificial World: State of the Art. In Press.
ii 1994. Virtual Reality Technologies and People with Disabilities. Presence: Teleoperators and Virtual Environments, Draft Version.
iii Pesce, M. 1995. Background on the Virtual Reality Modeling Language.
iv Graham, I. 1995. The HTML Sourcebook. Toronto, Canada: John Wiley & Sons, Inc.
v Treviranus, J., & Serflek, C. 1995. Alternative Access to the World Wide Web. CSUN Conference Proceedings, In Press.
vi Bell, G., Parisi, A., & Pesce, M. 1994. The Virtual Reality Modeling Language Version 1.0 Specification (Draft). http://www.eit.com/
vii Pesce, M. 1995.
viii Casher, O., & Rzepa, H.S. 1995. A Chemical Collaboratory using Explorer EyeChem and the Common Client Interface. http://www.ch.ic.ac.uk/rzepa/CG/CG.html
ix Bell, G., Parisi, A., & Pesce, M. 1994.