Technical Report 2002-06-01
Internetworking and Media Communications Research Laboratories
Department of Math & Computer Science, Kent State University
http://medianet.kent.edu/technicalreports.html



Encoded Test Video Set from Dynamic Reflex Windowing



Oleg Komogortsev and Javed I. Khan
Internetworking and Media Communications Research Laboratories
Department of Computer Science
Kent State University, 233 MSB, Kent, OH 44242

Last Revised June 31, 2002


Abstract

Human vision offers tremendous scope for compressing data intended for human visual consumption. Only about 2 degrees of our roughly 140-degree visual span has sharp vision. A fascinating body of research in vision and psychology is geared towards understanding the human visual perception system. The possibility of eye-tracking-based perceptual compression has been anticipated for some time by many researchers. We have recently implemented one such system: a live eye-gaze integrated media streaming system. It integrates a streaming server, a real-time live media transcoder, and a high-speed eye tracker integrated with a live magnetic head tracker. The system takes in live perceptual information about the subject's eye position and head movement via an eye tracker (ET) and a magnetic head tracker (MHT) device. The media transcoder, placed between the server and the player in a networked environment, blends the perceptual information with the media and controls the spatio-temporal resolution of the presentation accordingly. The transcoder-mediated architecture decouples serving from network operation and thus can be used for the transmission of both stored and live media. Although the architecture is independent of any media type, the system currently handles ISO/IEC 13818-2 MPEG-2 streaming. This is one of the first live implementations of such a system. A unique challenge of this real-time perceptual streaming is how to reconcile the fast nature of human eye-gaze interaction with the relatively complex MPEG-2 rate transcoding scheme and the control loop delay associated with streaming in the network. We have designed a dynamic foveation windowing scheme based on live eye-gaze interaction to address this challenge.

This report contains the experiment clips used in testing the performance of this system. The videos are ISO/IEC 13818-2 MPEG-2 streams. The videos in the first column are the original high-bit-rate versions of the clips (4 MB). The videos in the second column show the visual windows estimated in the frames. The third column shows the perceptually encoded video with a 1 MB target bit rate. As can be seen, in the perceptually encoded clips the objects of perceptual focus are maintained at near input-stream quality.

*The technical details of the algorithms are not included here.
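
As a rough illustration of the "about 2 degrees" of sharp vision mentioned in the abstract, the sketch below converts a visual angle into an on-screen diameter in pixels. It is not the windowing algorithm used by the system, which this report does not describe; the function name, the viewing distance, and the pixel density are all hypothetical values chosen for the example.

```python
import math


def foveal_window_px(viewing_distance_cm, pixels_per_cm, visual_angle_deg=2.0):
    """Return the on-screen diameter, in pixels, subtended by a visual angle.

    Assumes the viewer looks at the screen head-on, so the diameter is
    2 * d * tan(angle / 2). All parameters are illustrative assumptions.
    """
    half_angle_rad = math.radians(visual_angle_deg) / 2.0
    diameter_cm = 2.0 * viewing_distance_cm * math.tan(half_angle_rad)
    return diameter_cm * pixels_per_cm


if __name__ == "__main__":
    # Hypothetical setup: viewer 60 cm from a display with 40 pixels per cm.
    diameter = foveal_window_px(viewing_distance_cm=60.0, pixels_per_cm=40.0)
    print(f"Approximate foveal diameter: {diameter:.1f} pixels")
```

Under these assumed viewing conditions the sharp-vision region covers only a small patch of the frame, which is the intuition behind spending most of the target bit rate inside a gaze-centered window.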


Video Samples:

Sample Name    Original Sample    Reflex Window with Eye-gaze    Perceptually Encoded

Car            car.m2v            car_rw_rg.m2v                  car_percept.m2v
Shamu          shamu.m2v          shamu_rw_rg.m2v                shamu_percept.m2v
Airplanes      airplanes.m2v      airplanes_rw_rg.m2v            airplanes_percept.m2v