AVT: Active Video Transcoder System

Architecture Document 
Internal Release Version 1.01 (DRAFT) 
Date: March 2000 
Comments/bug reports: perceptmedia@molokia.medianet.kent.edu

---

 
 

We are developing the Active Video Transcoder (AVT), a system built at the MEDIANET lab that can perform a full range of video transformations on an MPEG-2 video bit-stream. The AVT can be dynamically installed and activated on network splice points inside an Active Network, where it generates a new bit-stream matched to the network characteristics and receiver requirements downstream. 


 

What is the Active Video Transcoder system?

In recent years, the growth of backbone and periphery technology has been pushing up the upper limits of network speed. At the same time, the diversity in network capacity and the variability in the available quality of service among different parts of the Internet are also rapidly increasing. Indeed, the two most rapidly expanding areas of the Internet are bandwidth limited. Most projections indicate that the international front will experience the most aggressive growth in the second decade of the post-web Internet; the other area is the integration of wireless networks. While mainstream networking research has focused on ramping up bandwidth, very little attention has been paid to making systems operable across asymmetric network capacities, which is rapidly becoming a concern. 

Video communication is one of the most demanding applications on the Internet, and today's video technology is ill suited to cope with such variation. For example, in a video multicast distribution tree with receivers of varying capacity, there are only three options. Either the server serves a high-resolution version and the low-capacity receivers are cut off, or the server serves the minimum version, forcing the high-capacity clients to settle for the low-resolution version despite their local capacity. The third option is no better: the server serves multiple versions of the stream, resulting in redundant information flow.

We suggest that an optimal, more intelligent result can be achieved, though it will require two fundamental innovations in video and network technology. The first is to develop a video transmission technology that allows dynamic adaptation. The second is to find a new model of networking where such adaptive units can be implanted right inside the network splice points where networks and links of varying characteristics meet in a global-scale Internet, a federation of networks with varying capacities and characteristics. Interestingly, video is just one of the first pressing applications requiring such adaptation; almost all emerging network-based applications will need this ability to adapt.

---

Approach: 

In this research we are developing an Active Network based transcoder that can perform full transformation at an active junction point. Digital video computations are massively computation intensive. Correspondingly, in this two-part research our goal is to explore the current limits of the technology and at the same time propose viable models of in-stream transcoding. In the other part we are developing an Active Network node system, which will use the Active Video Transcoder as an example but will provide a means for launching and maintaining active transcoder elements for other applications. Our objective here is to find out how seamlessly large transcoding systems can be launched in a global Internet, not only to absorb the variations of the networks involved, but also to withstand the variations of the splice nodes themselves. This document briefly describes the architecture of the Active Video Transcoder (AVT) system.

---

Description of the Decoder and Encoder System: 

Fig-1

Modules: 

Below we first describe the two principal modules inside: the decoder and the encoder systems. The decoder reads an ISO/IEC 13818-2 stream. The processes involved are: VLD (Variable Length Decoding), inverse quantization, and inverse DCT computation. For P and B frames it also includes a feedback loop, where reference frames are looped through an inverse motion compensator. 
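The inverse DCT stage of the decoder can be sketched as follows. This is a naive Python stand-in for illustration only; the real decoder uses an optimized implementation, and the function name is ours:

```python
import math

def idct_2d(F):
    """Naive 8x8 inverse DCT, the IDCT stage of the decoder:
    f(x,y) = 1/4 * sum_u sum_v C(u)C(v) F(u,v)
             * cos((2x+1)u*pi/16) * cos((2y+1)v*pi/16)
    with C(0) = 1/sqrt(2) and C(k) = 1 otherwise."""
    def C(k):
        return 1.0 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for x in range(8):
        for y in range(8):
            s = 0.0
            for u in range(8):
                for v in range(8):
                    s += (C(u) * C(v) * F[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[x][y] = s / 4.0
    return out
```

For example, a block containing only a DC coefficient of 8 reconstructs to a flat block of 1s, since only the C(0)C(0)/4 term survives.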

The encoding process is the reverse, but more complicated. For I pictures, it works only in the forward loop, with DCT, quantization, and variable length encoding. For P and B picture computation, frames are buffered, decoded, and then subtracted from the current frame. During the macro-block-wise subtraction, the best-matching block to subtract is found by the motion compensation search. 
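The "best block to subtract" search can be illustrated with a minimal full-search motion estimation over a sum-of-absolute-differences (SAD) criterion. This is a Python sketch with made-up names, not the production search:

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y)
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

def block_at(frame, x, y, n):
    """Extract the n x n block whose top-left corner is (x, y)."""
    return [row[x:x + n] for row in frame[y:y + n]]

def best_match(ref, cur, x, y, n, search):
    """Full search in a +/-search window of the reference frame for the
    block that best predicts the current block at (x, y); returns the
    motion vector (dx, dy) with the lowest SAD."""
    target = block_at(cur, x, y, n)
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = x + dx, y + dy
            if rx < 0 or ry < 0 or rx + n > len(ref[0]) or ry + n > len(ref):
                continue  # candidate block falls outside the reference frame
            cost = sad(block_at(ref, rx, ry, n), target)
            if best_cost is None or cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best
```

The encoder then subtracts the matched reference block from the current macro-block and codes only the residual plus the motion vector.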


Fig-2

---

Description of the Transcoder: 

The Active Video Transcoder system consists of two parts: the decoder and the encoder. The decoder takes the input video bit stream and converts it into uncompressed video (in planar YUV format, with the components stored separately). The encoder takes the frame pictures, uses the modified parameters from the parameter file, and encodes the frames into a new bit stream. Full decoding and especially full encoding are expensive processes. The AVT, however, runs a reduced re-encoder: it contains an optimized version that reuses computations when the new transcoding parameters allow it to do so. (We are still developing the optimization methods.) 
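At the top level the transcoding path is simply decode-then-re-encode. A minimal sketch, where decode and encode are stand-ins for the real MPEG-2 stages and are passed in for illustration:

```python
def transcode(bitstream, params, decode, encode):
    """High-level AVT flow (sketch): fully decode the input bit stream
    into uncompressed YUV frames, then re-encode those frames with the
    modified parameters from the parameter file."""
    frames = decode(bitstream)      # bit stream -> uncompressed frames
    return encode(frames, params)   # frames -> new, re-targeted bit stream
```

In the real system the two stages run as concurrent processes connected by pipes, as described below, rather than as sequential function calls.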

Optimization Model: 

One of the prime design challenges here is to match the line rate for real-time processing. We divided the optimization problem into two stages. Suppose the incoming line bandwidth is B1 and the outgoing line bandwidth is B2 (presumably B2 < B1), and let T be any interval of time. The first objective is to reduce the incoming stream of size D1 = T×B1 into a reduced stream of size D2 = T×B2 = D1×(B2/B1) with minimum loss of multimedia quality. Typically, more computation power is needed to reduce the quality loss. 

The above steady-state optimization ensures that no incoming data is lost. However, it does not account for the delay introduced in the transcoding path. Therefore, the second optimization objective is to find the minimum interval T, Tmin, for which the above optimization holds. Typically, this requires finer-grain computability of the stream. On one hand, the grain of computability is constrained by the dependencies in the incoming bit-stream (we call this hard dependency); on the other hand, it depends on the concurrent computing power of the transcoder.
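The two objectives can be made concrete with a small worked example (the numbers, and the GOP-as-dependency-unit reading of Tmin, are our assumptions for illustration):

```python
def steady_state_target(d1_bits, b1, b2):
    """First objective: the D1 = T*B1 bits arriving in an interval must be
    reduced to D2 = T*B2 = D1*(B2/B1) bits leaving it."""
    return d1_bits * b2 / b1

def min_interval(gop_frames, fps):
    """Second objective (sketch): if the hard dependency in the incoming
    bit-stream is one GOP, the smallest independently re-codable unit spans
    gop_frames pictures, so Tmin is roughly gop_frames / fps seconds."""
    return gop_frames / fps
```

For instance, squeezing a 6 Mbit/s input onto a 1.5 Mbit/s line keeps only a quarter of the incoming bits in every interval, and a 15-frame GOP at 30 fps bounds Tmin at about half a second under this reading.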

Concurrent Architecture: 

Our objective is to meet the optimization objective through multi-processor transcoding. In this first AVT model, we have designed a concurrent transcoder where the decoder and the re-encoder work in tandem. The main process forks a decoder child process and an encoder child process and lets them run concurrently. This even enables us to accommodate the asymmetric decoding/encoding workloads by scheduling more than one processor for re-encoding.

Inter-process Communication: 

At initialization, the processes set up a set of pipes for inter-process communication. In the current version we use three pipes (as shown in Fig-3) for piping the Y, U, and V frames. 
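The fork/pipe layout can be sketched in a few lines (a POSIX Python stand-in for the C code; the names are ours). The child plays the decoder, writing one plane into each of the Y, U, and V pipes, and the parent plays the encoder, reading the planes back:

```python
import os

def spawn_decoder(frame):
    """Fork a 'decoder' child that writes the Y, U, V planes of one frame
    into three pipes; the parent ('encoder' side) reads one plane from
    each pipe and returns them."""
    pipes = [os.pipe() for _ in range(3)]   # (read_fd, write_fd) per plane
    pid = os.fork()
    if pid == 0:                            # child: the decoder side
        for (r, w), plane in zip(pipes, frame):
            os.close(r)                     # child only writes
            os.write(w, plane)
            os.close(w)                     # closing signals end-of-plane
        os._exit(0)
    planes = []
    for r, w in pipes:                      # parent: the encoder side
        os.close(w)                         # parent only reads
        data = b""
        while True:
            buf = os.read(r, 4096)
            if not buf:                     # EOF: decoder finished the plane
                break
            data += buf
        os.close(r)
        planes.append(data)
    os.waitpid(pid, 0)                      # reap the decoder child
    return planes
```

The real system keeps the two processes alive across the whole stream and loops this plane-by-plane exchange once per frame.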


Fig-3

Adaptation: 

In this first model we have focused on rate adaptation parameters at the macro-block level. This level of adaptation enables us to perform rate adjustment based on perceptual coding. The compression, or conversely the size of the output bitstream, can be changed through parameters in the parameter file, such as: bit_rate, frame_rate_code, d0i, d0p, d0b, intra_quant_matrix, and non_intra_quant_matrix. The level and stages of decoding and recoding are determined by the selection of adaptation parameters; deeper parameters require deeper probing. For this model we first need to decode the bitstream into video frames. In this first version of the Active Video Transcoder system, we implemented full decoding and optimized recoding for the specified parameters.
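A minimal reader for such a parameter file might look like this. The plain "name value" line format and the comment syntax are our assumptions; only the parameter names come from the list above:

```python
def load_params(text):
    """Parse 'name value' lines into a dict of strings; '#' starts a
    comment and blank lines are skipped (assumed file format)."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and padding
        if not line:
            continue
        name, _, value = line.partition(" ")
        params[name] = value.strip()
    return params
```

The re-encoder would then consult entries such as `bit_rate` or `frame_rate_code` to decide how deep the recoding has to go.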

 
 



Fig-4 Interface Diagram

When the system starts, it reads the parameter file, the input bit stream file, and the output bit stream file from the command line, then forks two processes: one is the decoder; the other is the encoder. These two processes run concurrently. The decoder takes the input bit stream, converts it, and puts the uncompressed video into the three pipes (which are the inputs of the encoder).

The encoder receives the video frames from the pipes and re-encodes them into the new compressed bit stream according to the parameter file. The pipe setup keeps the decoder and the encoder synchronized: the encoder waits until the decoder finishes a frame and puts the data into the pipe, and the decoder will not begin decoding the next frame until the encoder has removed the data from the pipe. Thus the system keeps processing the Y, U, and V frames frame by frame. We also change the output frame order of the decoder (from display order to coded order), so that the encoder receives the frames in coded order.
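The display-order to coded-order change can be illustrated as follows (a Python sketch): each I or P reference picture must be coded before the B pictures that precede it in display order, since those B pictures predict from it.

```python
def display_to_coded(frames):
    """Reorder pictures from display order to coded order. Each frame is a
    string like 'I0' or 'B1' whose first character is the picture type.
    Every I/P reference is emitted before the B pictures that precede it
    in display order."""
    coded, pending_b = [], []
    for f in frames:
        if f[0] == "B":
            pending_b.append(f)        # hold B pictures back
        else:                          # I or P: emit it, then its B pictures
            coded.append(f)
            coded.extend(pending_b)
            pending_b = []
    return coded + pending_b           # trailing B pictures, if any
```

For a display sequence I0 B1 B2 P3 B4 B5 P6 this yields the coded sequence I0 P3 B1 B2 P6 B4 B5, which is the order the encoder needs to consume frames in.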

---


Acknowledgement: The development of this research has been supported by a DARPA research grant.