SVG for Navigating Digital News Video

Michael G. Christel and Chang Huang

CS Dept. and HCI Institute
Carnegie Mellon University
Pittsburgh, PA  15213
1-412-268-7799, 1-412-268-5684
christel@cs.cmu.edu, liz@cs.cmu.edu

Abstract

Scalable Vector Graphics (SVG) is a language for describing two-dimensional graphics in XML, specifically vector graphic shapes, images, and text. SVG is a new World Wide Web Consortium (W3C) Proposed Recommendation as of July 2001, and this paper describes how SVG provides an ideal framework for presenting manipulable, interactive summarizations into a multimedia information repository. Specifically, we present VIBE and map SVG interfaces into a digital news video library for delivery through web browsers. Pan-and-zoom visualizations of video through SVG are discussed.

Categories and Subject Descriptors

H.5.1 [Information Interfaces and Presentation]: Multimedia Information Systems - video. H.3.7 [Information Storage and Retrieval]: Digital Libraries - standards, dissemination, user issues.

General Terms

Design, Human Factors, Standardization.

Keywords

SVG, digital video library, surrogate.

Introduction

The Informedia Project at Carnegie Mellon University has created a multi-terabyte digital video library consisting of thousands of hours of video, segmented into over 50,000 stories, or documents. Since Informedia's inception in 1994, numerous interfaces have been developed and tested for accessing this library, including work on multimedia surrogates which represent a video document in an abbreviated manner [2, 3]. The utility and efficiency of these surrogates and their validation through usability methods have been reported in detail elsewhere [6, 8]. These interfaces are being re-implemented in HTML, XML, XSLT, XPath, and JavaScript, with the intent that use of W3C recommendations maximizes user flexibility in accessing the digital video library through a browser interface [4]. This paper discusses the use of Scalable Vector Graphics (SVG) for visualizing sets of digital news video. The SVG support for vector-based drawing offers a quick way to zoom into areas of interest and show those in greater detail without the need for communicating back to a web server.

HTML, XML, XSLT, and XPath are all W3C Recommendations, with published references available through the W3C [9]. As of July 2001, SVG is a "Proposed Recommendation," one step away from "Recommendation."

These W3C references are utilized in the implementation of the Informedia digital video library web interface. Extensible Markup Language (XML) is the universal format for structured documents and data on the Web. Extensible Stylesheet Language Transformations (XSLT) is a language for translating XML documents into other XML documents or into HTML. XSLT makes use of the expression language defined by the XML Path Language (XPath) for selecting elements for processing, for conditional processing and for generating text. Finally, SVG presents another way to display data besides HTML. SVG describes two-dimensional graphics in XML, specifically vector graphic shapes, images, and text. Figure 1 shows an architecture that maximizes presentation flexibility by sending XML to the client and relying on client-side script or transformations to convert the XML into HTML or SVG.

Box and lines diagram showing that SVG views can be generated with no Web server involvement

Figure 1. Client-side processing of XML, where user interaction can produce multiple HTML and SVG views without Web server involvement.

Multiple XSLT transformations, e.g., one for low bandwidth users, another for high bandwidth users, optional additional ones for specific languages, age groups, etc., allow the video data to be widely disseminated in different forms based on W3C standards. We will focus on different HTML and SVG presentations implementing Informedia surrogates and summaries. We make use of Microsoft's IE 5.5 browser and their XML Parser 3.0, an Internet Explorer add-on released in November 2000 supporting client-side XSLT. We use Adobe's SVG Viewer 2.0 plug-in for Internet Explorer released in April 2001.

For the HTML shown in Figure 2, an "ascending date thumbnail view" XSL transformation file was used to convert the XML, via the following JavaScript where xmldoc holds the XML data:

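// xslcontents is an XML DOM document object that holds the stylesheet.
// Load it synchronously so the transformation can run immediately afterward.
xslcontents.async = false;
xslcontents.load("http:ascdate.xsl");
try {
  // Apply the stylesheet to the XML data; the result is a string of HTML.
  res = xmldoc.documentElement.transformNode(
    xslcontents.documentElement);
} catch (exception) {
  res = HandleTranslationRuntimeError(exception);
}
// Render the transformed output only if the transformation produced any.
if (res != "") resultHTML.innerHTML = res;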

The property "innerHTML" is used to take the output of the transformation and render it as HTML within the document object "resultHTML" on the current HTML page. If a different transformation is used, say for example to order thumbnails by descending size, then a different XSLT file such as descsize.xsl becomes the parameter in this script. The first time a transformation is used it must be downloaded from the server, but after that it is cached in the browser just like other URLs.

7 thumbnail images, one brightly bordered to indicate it is the only match to both South Africa and Kenya

Figure 2. Thumbnails for region query (oldest 7 docs. shown).

Unfortunately, SVG has no counterpart to HTML's innerHTML property (no "innerSVG") that would read in plain text and convert it into an SVG document dynamically. However, the SVG Document Object Model (DOM) is compatible with the W3C DOM Level 2 Recommendation, and so the SVG document can be built through script calling DOM methods like cloneNode, setAttribute, and appendChild. Hence, for SVG output the client script navigates the XML using XPath and produces the appropriate SVG through SVG DOM methods.
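As a minimal sketch of this approach, the function below builds one SVG element per document by cloning a template element and setting its attributes. The function name appendDocumentBoxes, the template element, and the {id, x, y} record shape are assumptions made for illustration; they are not taken from the Informedia implementation, and the step that extracts these records from the XML is omitted.

```javascript
// Build one SVG box per document by cloning a template element.
// `parent` and `template` are DOM elements in the SVG document; `docs` is a
// hypothetical array of {id, x, y} records already pulled from the XML.
function appendDocumentBoxes(parent, template, docs) {
  for (var i = 0; i < docs.length; i++) {
    // A shallow clone copies the template's attributes (size, fill, etc.).
    var box = template.cloneNode(false);
    box.setAttribute("id", "doc" + docs[i].id);
    box.setAttribute("x", String(docs[i].x));
    box.setAttribute("y", String(docs[i].y));
    parent.appendChild(box);
  }
  return docs.length; // number of boxes added
}
```

Because only cloneNode, setAttribute, and appendChild are used, the same function should work against any DOM Level 2 compatible document.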

VIBE as SVG

Visualization by Example (VIBE) was developed to emphasize relationships of result documents to query words. For users unfamiliar or uncomfortable with Boolean logic, VIBE allows a visual plot to be manipulated to discover Boolean AND/OR relationships between query entities and documents. Entities are drawn as anchors that can be picked up and dragged, with documents plotted against the entity anchor positions based on the contribution of those entities to the documents' relevance scores. Manipulating anchor points lets the user resolve any ambiguities in the two-dimensional plot [7].

The VIBE plot for the full result set of Figure 2 is shown in Figure 3. Figure 3 is an SVG document, created through script from the same XML that was used via XSLT to generate Figure 2. We extended the traditional use of VIBE to address text queries into the domain of map region searches. For Figure 3, the anchors are gazetteer entries such as cities and countries, from within the query map region shown in Figure 4.

The VIBE visualization conveys semantics through positioning. In Figure 3 the green boxes represent video documents. The position of the box indicates which anchors are found in that document. For example, the absence of any document box at Namibia indicates that stories dealing with Namibia always discuss another entity in this African region. By dragging the anchor for Namibia and other terms and seeing how the green boxes immediately replot to the newly positioned anchors, the user can discover that all Namibian stories also deal with South Africa or South African cities.
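The positioning rule can be sketched as a weighted centroid: each document is placed at the average of its anchors' positions, weighted by how much each anchor contributes to the document's relevance score. This is one common reading of VIBE placement and is an assumption here, not a statement of the exact formula used in the Informedia interface; the function name plotDocument and the data shapes are likewise hypothetical.

```javascript
// Place a document among anchors by weighted centroid (assumed rule).
// anchors: {name: {x, y}} current anchor positions on the plot.
// weights: {name: score} contribution of each anchor to the document's score.
function plotDocument(anchors, weights) {
  var total = 0, x = 0, y = 0;
  for (var name in weights) {
    var known = Object.prototype.hasOwnProperty.call(anchors, name);
    var w = weights[name];
    if (!known || !(w > 0)) continue; // skip unknown anchors and zero weights
    total += w;
    x += w * anchors[name].x;
    y += w * anchors[name].y;
  }
  if (total === 0) return null; // document matches none of the anchors
  return { x: x / total, y: y / total };
}
```

Under this rule, a document that matches a single anchor lands exactly on that anchor, which is why an empty anchor such as Namibia is informative. Dragging an anchor amounts to updating its entry in anchors and recomputing each document's position.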

VIBE plot of documents to query anchors (African city/country names), implemented in SVG; South Africa and Kenya are highlighted, as is one box for the only document that mentions both those locations

Figure 3. VIBE plot for the result set of Figure 2, with focus on South Africa and Kenya.

A user can interact with the SVG VIBE plot to emphasize particular relationships. Just as Figure 2 uses yellow highlighting to show that only one document discusses both South Africa and Kenya, anchors in the VIBE plot can be highlighted, and the document points matching the highlighted anchors are highlighted as well. The one document matching both South Africa and Kenya is colored yellow in the plot of Figure 3. This VIBE interface can of course be enriched to overlay other information dimensions through size, shape, and color, as detailed elsewhere [2, 3].

The Adobe SVG Viewer provides default interface options to zoom in and out, pan, and return to the original view. When zooming in, the vector graphics are cleanly redrawn without pixelation. SVG allows VIBE plots to be generated quickly, rendered cleanly, and manipulated efficiently, while providing standard ways for zooming into and out of regions of interest.
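To illustrate why zooming needs no new data from the server, the sketch below computes a new SVG viewBox value for zooming about a point. The viewBox attribute ("min-x min-y width height") is part of the SVG specification; the function name zoomViewBox and its parameters are assumptions for illustration, and the Adobe viewer performs the equivalent operation internally through its own controls.

```javascript
// Compute a viewBox string that zooms by `factor` about the point (cx, cy).
// box: {x, y, width, height} of the current view in user coordinates.
// factor > 1 zooms in; factor < 1 zooms out.
function zoomViewBox(box, cx, cy, factor) {
  var w = box.width / factor;
  var h = box.height / factor;
  // Center the smaller (or larger) window on the point of interest.
  var x = cx - w / 2;
  var y = cy - h / 2;
  return [x, y, w, h].join(" ");
}
```

The same vector shapes are simply redrawn within the new window, so detail sharpens without fetching another image.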

Maps as SVG

A casual examination of the SVG examples currently available on the Web reveals a number of maps rendered as SVG. Maps are well suited to vector representations, since world scale overviews can be provided, with boundaries and coastlines sharpened as the map is zoomed down to smaller areas of countries, states, and cities. All of the different views are supported by the same SVG document, rather than individual bitmap raster files that need to be separately accessed from a Web server. We make use of gazetteer and geographic information from ESRI [5] for indexing video documents geographically and creating SVG map interfaces for query and display. Maps are used in the following ways for the Informedia web interface:

SVG map, with box showing African region query used to produce other figures; South Africa and Kenya are highlighted

Figure 4. Map as both query input interface and for feedback on focused areas.

The same result set viewed in different ways in Figures 2 and 3 can be viewed on a map as well, with countries colored if they are dealt with in one or more of the 74 documents returned by this query. Dynamic query sliders can be used to give the user control in setting a narrower focus [1]. Figure 5 shows a date slider setting the focus to the time period November 2 through November 9, 1999. The United States and predominantly the East African countries stay in focus, along with Yemen and Egypt, indicating a concentration of stories for those regions during this week of interest.

SVG map plot showing that East African countries are discussed during week of Nov. 2, 1999

Figure 5. SVG map as visualizer interface.
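The date-slider focusing shown in Figure 5 can be sketched as a client-side filter over the result set: keep only documents inside the selected date range, then count documents per country to decide which countries stay colored. The function name countriesInFocus and the {date, countries} record shape are hypothetical, chosen for illustration rather than taken from the Informedia XML.

```javascript
// Count, per country, the documents dated within [startDate, endDate].
// docs: hypothetical array of {date: "YYYY-MM-DD", countries: [names]}.
// ISO-format date strings compare correctly as plain strings.
function countriesInFocus(docs, startDate, endDate) {
  var counts = {};
  for (var i = 0; i < docs.length; i++) {
    var d = docs[i].date;
    if (d < startDate || d > endDate) continue; // outside the slider range
    var names = docs[i].countries;
    for (var j = 0; j < names.length; j++) {
      var seen = Object.prototype.hasOwnProperty.call(counts, names[j]);
      counts[names[j]] = (seen ? counts[names[j]] : 0) + 1;
    }
  }
  return counts; // countries absent from this object lose focus
}
```

Each slider movement reruns the filter on data already held in the browser, and the script then recolors the matching map shapes through setAttribute.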

We are investigating overlaying added detail, such as the Informedia video surrogates of storyboards, thumbnails, and titles, onto summary interfaces like those shown in Figures 3 and 5. When an SVG summary interface is zoomed to a point where only a few documents are left represented, video surrogates could be displayed, one per document, in the available screen space. A more ambitious goal is to summarize across document sets, so that a surrogate represents one or more video documents.

Future research work might even fold in temporal components like audio or video skims. SVG supports the SMIL Recommendation for synchronized presentation. The first use of animated SVG documents as Informedia web interface elements may be interactive maps: as video plays, its countries, states, and cities highlight during their period of discussion and are unhighlighted when they lose focus. A non-SVG implementation of interactive maps is described elsewhere [3], but SVG again offers the advantage of user-controlled browser display of data without the need to issue requests back to the server. Ideally, richer SVG summarization interfaces can be given temporal structure and be "played" to reveal additional information as temporal features, just as size, color, shape, and location can convey meaning in the current map and VIBE SVG interfaces.
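The interactive map behavior described above can be sketched as a lookup from playback time to the set of places currently under discussion. This is only an illustration of the idea under assumed data: the function name placesInFocus and the {place, start, end} records (times in seconds of video) are hypothetical, and a declarative SMIL animation could serve the same purpose without script.

```javascript
// Return the places being discussed at playback time t (in seconds).
// mentions: hypothetical array of {place, start, end}; a place is in focus
// while start <= t < end, and is unhighlighted otherwise.
function placesInFocus(mentions, t) {
  var seen = {}, result = [];
  for (var i = 0; i < mentions.length; i++) {
    var m = mentions[i];
    var active = t >= m.start && t < m.end;
    if (active && !Object.prototype.hasOwnProperty.call(seen, m.place)) {
      seen[m.place] = true; // report each place once even if mentions overlap
      result.push(m.place);
    }
  }
  return result.sort();
}
```

Calling this at intervals during playback and comparing successive results would tell the script which map shapes to highlight and which to restore.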

Acknowledgments

This material is based on work supported by the National Science Foundation (NSF) under Cooperative Agreement No. IRI 9817496. Partial support for this work also comes from NSF's National Science, Mathematics, Engineering, and Technology Education Digital Library Program under grant DUE-0085834.

References

  1. Ahlberg, C. and Shneiderman, B. Visual Information Seeking: Tight Coupling of Dynamic Query Filters with Starfield Displays, Proc. ACM Conf. on Human Factors in Computing Systems (Boston, MA, April 1994), 313-322.
  2. Christel, M.G. Visual Digests for News Video Libraries. In Proc. ACM Multimedia '99 (Orlando, FL, Nov. 1999), ACM Press, 303-311.
  3. Christel, M.G., Olligschlaeger, A.M., and Huang, C. Interactive Maps for a Digital Video Library. IEEE MultiMedia 7, 1 (Jan.-Mar. 2000), 60-67.
  4. Christel, M., Maher, B., and Begun, A. XSLT for Tailored Access to a Digital Video Library. Proc. Joint Conf. Digital Libraries (Roanoke, VA, June 2001), ACM Press, 290-299.
  5. Environmental Systems Research Institute (ESRI), Inc., home page, http://www.esri.com/.
  6. Informedia Project web site at Carnegie Mellon University, http://www.informedia.cs.cmu.edu.
  7. Olsen, K.A., Korfhage, R.R., Sochats, K.M., Spring, M.B., and Williams, J.G. Visualization of a Document Collection: The VIBE System. Info. Processing & Mgt. 29, 1 (1993), 69-81.
  8. Wactlar, H., Christel, M., Gong, Y., and Hauptmann, A. Lessons Learned from the Creation and Deployment of a Terabyte Digital Video Library. IEEE Computer 32, 2 (Feb. 1999), 66-73.
  9. World Wide Web Consortium (W3C) home page with links to W3C Recommendations, http://www.w3.org/.

Copyright © 2001 ACM.