Raymond R.-F. Liao, Pradubkiat Bouklee and Andrew T. Campbell
{liao, pradu, campbell}@comet.columbia.edu
COMET Group, Center for Telecommunications Research, Columbia University
Abstract
In this paper, we present the design, implementation and evaluation of a utility generator that plays an important role in a utility-based adaptation framework by dynamically generating bandwidth utility curves through a passive process of inspecting video content on-the-fly. The utility generator interacts with a network adaptation mechanism to periodically renegotiate bandwidth to match the bandwidth requested by the generated utility curves and comprises four elements: a profile selector, distortion meter, utility assembler and long-range utility predictor. Using a set of MPEG video traces we analyze the performance of the utility generator and discuss the process of long-range utility prediction, which represents a technique that keeps the resulting utility curves meaningful over longer periods approaching network adaptation time scales. We demonstrate through experimentation that the utility generator system represents a promising new approach to delivering scalable media over content-based networks.
The performance of distributed networked content (e.g., video streams) can be improved by exploiting the intrinsic scalability of content through rate control techniques coupled with effective media scaling [1] and periodic bandwidth renegotiation. This can result in a significant increase in the utilization of network capacity [2]. Media scaling can be performed efficiently inside the network for packet video by inspecting scaling information carried in packet headers and then selectively dropping packets on-the-fly. These techniques are also well suited to scaling content in response to the time-varying link capacity found in wireless mobile networks [15] and the Internet [8].
Bandwidth utility curves [3] can successfully characterize the range over which content-based applications can operate. Figure 1 illustrates a family of utility curves that are normalized over a 5-level mean-opinion-score (MOS) [4] representing different degrees of subjective video quality. Linear utility curves represent applications that are insensitive to bandwidth variation over any particular range, while piece-wise linear approximations of quadratic utility curves represent applications with diminishing utility gain over increasing bandwidth allocation. In contrast, discrete curves model applications that are discretely adaptive, e.g., multi-layered scalable video flows.
Studies of video quality measurement have placed emphasis on the design of accurate perceptual metrics [6, 7] while little attention has been paid to the generation of these perceptual quality metrics in the literature. In network economics research, utility curves are used solely as a theoretical abstraction of application demands in the area of network pricing [8], and optimization of resource allocation in wireline [5] and wireless [9, 3] networks. In this paper, we argue that the dynamic generation of bandwidth utility curves is an outstanding research challenge that needs to be addressed. We believe that the dynamic generation of utility curves will be a key enabler of utility-based adaptation potentially driving the resource management systems of next-generation "better than best effort" Internet and mobile networks. A major challenge to this vision is the formulation, generation and maintenance of bandwidth utility curves.
In this paper, we propose a utility generator that plays an important role in a utility-based adaptation framework by dynamically generating bandwidth utility curves through a passive process of inspecting video content on-the-fly. The utility-based adaptation framework comprises three modules: content scaler, bandwidth allocator and utility generator as illustrated in Figure 2. The content scaler performs content-based rate control interacting with the bandwidth allocator to realize utility-based bandwidth allocation [5, 3] in the network. The utility generator dynamically creates bandwidth utility curves on-demand on behalf of the content scaler and bandwidth allocator modules. In what follows, we present the design, implementation and evaluation of a utility generator system. We analyze the performance of the utility generator and discuss the process of long-range utility prediction, which keeps generated utility curves meaningful over longer time scales. We use the term utility duration to denote the duration over which a predicted utility curve is valid. By valid we mean that the predicted utility curve does not under-estimate the current resource allocation needs. The long-range utility prediction algorithm helps to increase the utility duration without sacrificing accuracy of the approach.
Figure 1: Bandwidth Utility Curves Example
Figure 2: Utility-Based Adaptation Framework
The structure of the paper is as follows. In Section 2 we discuss the multiple time scales that are active in a utility-based adaptation framework and the motivation for long-range utility prediction. This is followed in Section 3 by a detailed discussion of the utility generator elements. In Section 4, we evaluate our approach and analyze the process of utility curve generation and prediction. Following this, we discuss our experimental results in Section 5. Finally, in Section 6 we present some concluding remarks and future work.
Typically, utility curves are not generated once and then valid over the lifetime of a video stream. Rather, they are time-varying due to their sensitivity to changes in stream content. This implies that the generated utility curve may change dynamically over potentially very short time scales (e.g., tens of milliseconds) as the content changes, e.g., due to scene changes in a video stream. An important observation from the network resource management point of view is that network adaptation operates over a much longer time scale, potentially in the order of hundreds of milliseconds to seconds. This is a product of the signaling system efficiency, the network load imposed by dynamic resource management, and the round-trip delay between a source encoder and receiving decoder.
We observe that the "content variation time scale" may be several orders of magnitude faster than the "network adaptation time scale". This observation complicates the issue of utility curve generation. Fundamental tradeoffs are at play in the design choices: which time scale should we design to? If we choose to trigger the generation of utility curves based on the content variation time scale, we will have a very accurate representation of the bandwidth utility over that interval. However, supporting dynamic end-to-end bandwidth renegotiation on such a potentially fast time scale would degrade the performance of the network due to the increased capacity required to support the load on the signaling system. Alternatively, if we design for a slower time scale to limit the impact of renegotiation on the network and its signaling system, we are in danger of under- or over-estimating the resource needs of video content. Providing some intuition: assume that the content variation and network adaptation time scales are in the order of 50 ms and 500 ms, respectively. Triggering the generation of utility curves every 500 ms will reduce the signaling overhead in support of dynamic resource management but will likely introduce under- or over-estimation of bandwidth demand in the system. Curves generated at the beginning of a utility duration remain fixed over that period, e.g., 500 ms. However, the initially generated curve may be out of date at the end of the first content variation period (i.e., after 50 ms), rendering the approach ineffective.
In addition, one can observe that the two critical time scales (viz. content variation and network adaptation time scales) are themselves adaptive in nature. It is very likely that the content variation time scale is not periodic but asynchronously time-varying, e.g., changing whenever a scene changes. Similarly, if one selects a certain network adaptation time scale based on the round-trip delay, then this may be time-varying depending on the network architecture, as is the case with today's Internet. Determining what these time scales are and tracking their change is an important issue in building a utility-based adaptation framework.
The challenge then is to push the utility duration close to the network adaptation time scale to minimize the impact on the signaling system, while keeping the predicted utility curve "accurate" over longer time scales that will typically exceed the content variation time scale. We observe that these are opposing requirements. In this paper we refer to the process of extending the validity of utility curves over long time scales as long-range utility prediction.
The design of the utility generator depends on a number of factors including utility prediction, encoding techniques, video content and user preferences. A single generic solution to all these issues is therefore unlikely to be practical. Rather, the utility generator should be programmable and capable of accommodating a range of system considerations.
An important design decision that shapes our framework is the separation of the utility generator from the data path transport, as illustrated in Figure 2. In this case, the generation process computes utility curves on-the-fly in a passive manner without impacting the video transport operating on the data path. This separation principle allows the utility generator to compute utility curves independently of the actual video transport. By placing the utility generation process "out-of-band" from the data path, various techniques with differing computational complexity can be deployed, e.g., buffering of multiple pictures and transcoding in the case of subjective video quality [12].
Figure 3: Utility Generator Architecture
The utility generator architecture comprises a set of modules that constitute a four-stage process for generating utility curves. The architecture can be programmed to tailor the utility generation process based on the requirements of a video stream. These modules are executed in a sequential fashion (stages 1-4), thereby generating a predicted utility curve as illustrated in Figure 3, and comprise the profile selector, distortion meter, utility assembler and long-range utility predictor.
The profile selector plays two roles in the utility generation process. The profile selector first derives the range of scaling rates applicable to video streams and then determines the array of sampling rates and the corresponding scaling actions. The maximum scalable rate (R_{max}) can be passively obtained from a picture header (e.g., in the case of a constant-rate encoded MPEG stream) or actively determined by measuring the size of each picture and computing the raw bit rate. The minimum scalable rate (R_{min}) can be estimated from the picture header size because at the minimum scaling rate all pictures transmitted contain headers to maintain the necessary synchronization between an encoder and decoder.
Once the maximum and minimum scalable rates are derived, the profile selector chooses a number of scaling rate samples R_{i} between R_{max} and R_{min}, as well as the scaling methods (denoted by S_{i}) required to scale the video down to rate R_{i}. The selector's decision is influenced by the properties of each scaling technique, i.e., its effective rate range and the resulting rate reduction granularity.
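The rate-sampling step of the profile selector can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name is our own, and we assume the evenly spaced sampling used in the evaluation (Section 4, where 50 samples are taken):

```python
def select_scaling_rates(r_min: float, r_max: float, n: int = 50) -> list[float]:
    """Return n scaling-rate samples evenly spaced in the open range (r_min, r_max)."""
    step = (r_max - r_min) / (n + 1)
    return [r_min + step * (i + 1) for i in range(n)]
```

For the Chef trace's scalable region of 0.18 to 0.25 Mbps, `select_scaling_rates(0.18, 0.25)` would yield the 50 candidate rates at which the distortion meter is later exercised.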
Based on the resulting rate reduction class (viz. discrete or continuous), media scaling techniques fall into two classes: discrete-rate scaling methods, which drop discrete components of a stream (e.g., enhancement layers or frames), and continuous-rate scaling methods, which reduce the rate over a continuous range (e.g., by dropping DCT coefficients).
These scaling methods are instantiated inside the generator by service providers during the creation of a new service that demands a new type of scaling method. The sequence in which scaling methods are applied to a video stream is specified by a scaling profile that is made programmable. A scaling profile can be programmed statically or dynamically. In the static case, a scaling profile is instantiated by a service provider configuring the utility generator and content scaler (illustrated in Figure 2) accordingly, assuming that the content scaler is capable of supporting the scaling profile. In the dynamic case, a scaling profile is programmed using a profiling signaling channel.
The distortion meter computes utility values indirectly from distortion measurements. The theoretical foundation of video quality metrics is based on a distortion-rate function D(R) [11], where R is the scaled-down rate of the information source and D(.) is the distortion, measured by the "minimum distance" between the original data and all the scaled-down data flows whose rate is less than or equal to R. Given the distortion measure D(R), the utility value u(R) can be simply derived from: u(R) = u_{max} - D(R), where u_{max} is the utility at the maximum bandwidth requirement. In this sense the bandwidth utility metric is equivalent to the distortion metric. The distortion meter comprises an array of scalers. Given a set of scaling rates {R_{i}} and the corresponding set of scaling methods {S_{i}}, these scalers each apply a scaling method S_{i} to the video stream, computing the resulting distortion D(R_{i}).
The utility generator is designed to operate in the network in an online, dynamic manner. Therefore, subjective video quality metrics requiring off-line human intervention are unsuitable. Rather, utility metrics have to be objective, based on the signal-to-noise ratio (SNR) or correlated to subjective tests of the human visual system. The complexity of the distortion meter depends heavily on the type of utility metric used. The work reported in this paper adopts the SNR metric in support of low latency processing. We normalize u_{max} to 1 and choose err^{2}(R) / sig^{2} as the distortion measure so that u(R_{min}) = 0 and u(R_{max}) = 1. The derived utility metric then becomes:
u(R) = 1 - err^{2}(R) / sig^{2}, where sig^{2} = (1/N) SUM_{i=1..N} x_{i}^{2} and err^{2}(R) = (1/N) SUM_{i=1..N} (x_{i} - y_{i}(R))^{2}.
The sig^{2} component is the mean energy of each pixel in the original picture and err^{2} the mean square error between the distorted image and the original image. N is the number of pixels in the picture, and x_{i} and y_{i} are the ith pixels in the original and distorted images, respectively. err^{2}(R) is non-increasing as R increases in [R_{min}, R_{max}], where err^{2}(R_{min}) = sig^{2} and err^{2}(R_{max}) = 0. Therefore, u(R) is non-decreasing in the range [0, 1] as R increases in [R_{min}, R_{max}].
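The SNR-based metric above can be sketched directly over per-picture pixel arrays. This is a minimal illustration with names of our own choosing, not the paper's code:

```python
def utility(original: list[float], distorted: list[float]) -> float:
    """u(R) = 1 - err^2(R)/sig^2 computed over one picture's pixel values."""
    n = len(original)
    sig2 = sum(x * x for x in original) / n                       # mean pixel energy
    err2 = sum((x - y) ** 2 for x, y in zip(original, distorted)) / n  # mean square error
    return 1.0 - err2 / sig2
```

By construction, an undistorted picture gives u = 1 and a fully blanked one gives u = 0, matching the normalization u(R_{max}) = 1 and u(R_{min}) = 0 in the text.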
Because the DCT transform matrix is unitary, the calculation of err^{2} can be accomplished in DCT transform domain so that:
sig^{2} = (1/N) SUM_{i=1..N} X_{i}^{2} and err^{2} = (1/N) SUM_{i=1..N} (X_{i} - Y_{i})^{2},
where X_{i} and Y_{i} are the ith DCT coefficients in the original and distorted images, respectively. This property implies that the utility metric can be calculated directly over the encoded bit stream without the need for transcoding. Another benefit of the metric is that it is additive, i.e., err^{2} can be calculated in sequence from one macro-block to the next.
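The additivity noted above means err^{2} can be accumulated macro-block by macro-block in the DCT domain. A sketch, with coefficient arrays and the function name being illustrative assumptions:

```python
def accumulate_err2(blocks_orig: list[list[float]], blocks_dist: list[list[float]]) -> float:
    """Accumulate the mean square error over a sequence of DCT macro-blocks."""
    total, n = 0.0, 0
    for X, Y in zip(blocks_orig, blocks_dist):
        # Each block contributes its squared coefficient differences.
        total += sum((x - y) ** 2 for x, y in zip(X, Y))
        n += len(X)
    return total / n   # err^2 over all coefficients seen so far
```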
Next, we quantize the instantaneous utility curves, easing processing and reducing the number of states required, since quantization reduces the number of rate samples representing a utility curve. Even though quantization reduces the accuracy of utility curves, it can actually reduce a metric's sensitivity to minor content changes and hence increase the utility duration. We quantize and scale the utility metric into the 5 levels used in the MOS test. For utility levels 1, 2, . . . , 5, the corresponding discrete utility values are 0.1, 0.3, 0.5, 0.7 and 0.9, respectively. Let r_{k} denote the rate value corresponding to the kth utility level. r_{k} is calculated via linear interpolation to locate the rate at which u(R) crosses the kth discrete utility value. The resulting 5 sample points (r_{1}, 1), (r_{2}, 2), . . . , (r_{5}, 5), together with the two end points (R_{min}, 1) and (R_{max}, 5), support the construction of a piecewise-linear utility curve whose utility range is [1, 5].
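The interpolation step can be sketched as follows, assuming utility samples (R_i, u_i) with u non-decreasing over [0, 1]; the function name is our own:

```python
def quantize_to_mos(rates: list[float], utils: list[float]) -> list[tuple[float, int]]:
    """Find by linear interpolation the rate r_k where u(R) crosses each
    discrete MOS value 0.1, 0.3, ..., 0.9; returns [(r_1, 1), ..., (r_5, 5)]."""
    levels = [0.1, 0.3, 0.5, 0.7, 0.9]
    points = []
    for k, level in enumerate(levels, start=1):
        for (r0, u0), (r1, u1) in zip(zip(rates, utils), zip(rates[1:], utils[1:])):
            if u0 <= level <= u1:   # crossing lies in this segment
                frac = (level - u0) / (u1 - u0) if u1 != u0 else 0.0
                points.append((r0 + frac * (r1 - r0), k))
                break
    return points
```

The 5 returned points plus the two end points (R_{min}, 1) and (R_{max}, 5) then define the piecewise-linear curve described above.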
The utility assembler constructs a utility curve from the set of sample points (R_{i}, u(R_{i})) computed by the distortion meter discussed above. The shape of the constructed utility curve depends on the type of scaling methods applied to its generation. If all the scaling actions belong to the continuous-rate scaling class, the constructed curve is piece-wise linear, with straight-line segments connecting all sample points as illustrated in Figure 4. In contrast, if a scaling action S_{i} belongs to the discrete-rate scaling class, the curve connecting point (R_{i}, u(R_{i})) and its neighboring point (R_{i-1}, u(R_{i-1})) forms a step as illustrated in Figure 4. With a series of scaling actions, one can envision the shape of a utility curve as a concatenation of discrete steps and piecewise-linear segments.
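The assembled curve can be evaluated as sketched below. Each sample point is tagged with the scaling class of the action that produced it; "continuous" segments interpolate linearly while "discrete" segments form a step. This representation and the names are our own illustrative assumptions:

```python
def eval_utility(curve: list[tuple[float, float, str]], r: float) -> float:
    """Evaluate a utility curve at rate r.
    curve: list of (rate, utility, scaling_class), sorted by rate."""
    if r <= curve[0][0]:
        return curve[0][1]
    for (r0, u0, _), (r1, u1, cls) in zip(curve, curve[1:]):
        if r0 <= r <= r1:
            if cls == "discrete":
                return u1 if r >= r1 else u0   # step: utility jumps only at r1
            return u0 + (u1 - u0) * (r - r0) / (r1 - r0)  # linear segment
    return curve[-1][1]
```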
Under normal operation the assembler periodically generates instantaneous utility curves. There are several choices for the length of the content variation interval. Generating instantaneous utility curves for every video picture is not feasible because the resulting curve will depend heavily on the type of picture (e.g., I, P or B pictures in the case of MPEG). It is therefore necessary to consider multiple consecutive pictures (e.g., one Group of Pictures (GOP)) when computing instantaneous utility curves, even though this may require buffering multiple pictures.
Figure 4: Example of an Instantaneous Utility Curve
The utility assembler provides a triggering interface that activates the generation of instantaneous utility curves asynchronously on-demand as well as periodically based on the content variation time scale. This trigger interface is exposed and can be programmed in a number of ways, e.g., driven by an online scene detector [2]. In this case, the interface could be exercised when a change in scene is detected. Other drivers could be a source encoder when it changes the coding scheme, the network when it wants to force a refresh of the current utility curve, or an end user changing the preferences of the media scaling profile, e.g., preferring to drop color first instead of reducing the frame rate. The utility assembler provides a flexible programmable interface to accommodate these and other drivers.
As discussed in Section 2, prediction is a key challenge. The framework is not restricted to generating utility curves over longer time scales in the order of network adaptation time scales but can accommodate very fast time scales too, e.g., per-picture, per-GOP, per-scene, per-group of GOPs, etc. Our methodology is one of programmability and flexibility, and therefore the time scale at which utility curves are generated is open. This does not, however, detract from the arguments and tradeoffs discussed in Section 2. Taking those arguments into account, it makes little sense to generate utility curves on per-frame or per-GOP time scales and force an end-to-end bandwidth renegotiation: the network signaling overhead alone may be prohibitive. Therefore, we strive to generate utility curves that are valid over longer time scales close to the network adaptation time scale, hence minimizing the signaling overhead and perturbation of the network resource management system. In the final stage of the utility generation process we extend the lifetime of instantaneous utility curves through a process of long-range utility prediction.
Utility prediction is motivated by the following observation:
that relaxing the utility value measurement of utility curves does not typically violate the bandwidth requested by the actual curve because the derived curve tends to over-estimate the actual bandwidth demand.
The only drawback of this approach is the possible over-allocation of bandwidth that a predicted utility curve requests for a video stream. In the next section we attempt to quantify this over-allocation. The relaxation procedure can be viewed as a search for a lower bound on the utility values (or equivalently, an upper bound on bandwidth requests) for a group of instantaneous utility curves. Because "relaxed curves" typically fall below actual curves (e.g., in Figure 1 the piece-wise linear curve is a lower bound of the continuous quadratic curve), under-estimation of bandwidth demand at a given utility value is avoided. By predicting the lower bound accurately, all newly generated instantaneous utility curves above the previously computed lower bound can be omitted from the renegotiation of network bandwidth. This "prediction" process increases the utility duration over which utility curves are valid (as discussed in Section 2), with the downside of potential over-consumption of bandwidth or, in some cases, under-estimation of resources, which would raise a bandwidth violation in the dynamic resource management system. The tradeoff involved in utility prediction is illustrated in Section 4.4.
Next we describe the operation of the utility prediction algorithm. As utility curves are constructed from sample points, in the following description a curve is represented by its corresponding sample rate vector R. The prediction algorithm uses an over-allocation factor that "inflates" the bandwidth demand, hedging that the predicted utility curve remains a lower bound of the instantaneous utility curves during the next utility duration T. The detailed mechanisms of the prediction algorithm aim to dynamically adjust this "expanding factor" to closely track the dynamics measured in the instantaneous utility curves.
The prediction algorithm operates in normal or exception mode. In normal mode, the prediction algorithm generates one predicted utility curve every interval T. Here the duration T is defined as the "utility duration" and should ideally be in the order of the network adaptation time scale. During this interval, the prediction algorithm continuously uses instantaneous utility curves R^{k} to update its internal measurement R^{avg}, following a moving average formula: R^{avg} = a * R^{avg} + (1 - a) * R^{k}, where a is a control parameter. When the interval T expires, the algorithm multiplies R^{avg} by the expanding factor e to generate a new predicted utility curve for the next interval T. In addition, the expanding factor e is decreased by a small decrement e^{dec} until e reaches 1.
In exception mode, the prediction algorithm handles violations. A violation is defined when:
the previously predicted utility curve has to be changed to accommodate a newly generated instantaneous utility curve that exceeds the bandwidth demand represented by the previously predicted utility curve.
When violations occur the utility predictor transitions its state machine to exception mode. The algorithm then updates the internal measurement R^{avg} with respect to the utility curve causing the violation. The "expanding factor" is increased as well by an increment e^{inc}. The prediction algorithm then generates a new utility curve with the updated values and interacts with the bandwidth allocator (as illustrated in Figure 2) to renegotiate bandwidth on an end-to-end basis. The pseudo-code of the prediction algorithm can be found in the Appendix.
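The two modes described above can be sketched as follows. This is our own reading of the algorithm's rules, not the paper's pseudo-code; the class and method names are illustrative, and a curve is represented by its sample rate vector as in the text:

```python
class UtilityPredictor:
    def __init__(self, a: float = 0.8, e: float = 1.2,
                 e_inc: float = 0.1, e_dec: float = 0.01):
        self.a, self.e = a, e
        self.e_inc, self.e_dec = e_inc, e_dec
        self.r_avg = None        # internal measurement R^avg (rate vector)
        self.predicted = None    # currently predicted rate vector

    def observe(self, r_k: list[float]) -> bool:
        """Fold one instantaneous curve R^k into the moving average.
        Returns True if a violation occurred (exception mode)."""
        if self.r_avg is None:
            self.r_avg = list(r_k)
        else:
            self.r_avg = [self.a * ra + (1 - self.a) * rk
                          for ra, rk in zip(self.r_avg, r_k)]
        # Exception mode: the new curve exceeds the predicted bandwidth demand.
        if self.predicted and any(rk > rp for rk, rp in zip(r_k, self.predicted)):
            self.e += self.e_inc                          # expand more aggressively
            self.predicted = [self.e * ra for ra in self.r_avg]
            return True                                   # renegotiate bandwidth now
        return False

    def end_of_period(self) -> list[float]:
        """Normal mode: at the end of utility duration T, emit a new curve."""
        self.predicted = [self.e * ra for ra in self.r_avg]
        self.e = max(1.0, self.e - self.e_dec)            # decay e toward 1
        return self.predicted
```

The defaults mirror the parameter settings used in Section 4 (a = 0.8, e^{inc} = 0.1, e^{dec} = 0.01); a violation flagged by `observe` corresponds to the renegotiation with the bandwidth allocator described above.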
In the previous sections we have discussed a methodology for generating instantaneous utility curves and predicting long-range utility curves in content-based networks. In this section we present an evaluation of the implementation of the utility generator system detailed in Section 3. We show through experimentation and analysis that the operation of the four-stage generation process looks very promising. In particular, we evaluate the ability of the utility generator system to create instantaneous utility curves on the fly and to derive predicted long-range utility curves. Finally, we evaluate the performance of our system in terms of the observed network resource utilization and violations analyzing the measurement error introduced by the prediction algorithm.
The experiments used two MPEG-1 video traces, notably the Chef (1-minute TV interview) and TrueLies (3.5-minute action movie excerpt) video clips [16]. The Chef sequence has relatively slow scene changes whereas TrueLies has very fast scene changes.
In comparison to continuous-rate scaling methods, discrete-rate scaling is relatively simple to model, e.g., by measuring the rate of the dropped components and the drop in utility value based on the scaling profile. In the following experiments we are primarily concerned with how to predict long-range utility curves, and without loss of generality we focus on continuous-rate scaling methods, e.g., dropping DCT coefficients. We choose dynamic rate shaping [10] as a means of implementing the dropping of DCT coefficients because it optimizes dropping to best approach the ideal rate-distortion curve. For evaluation purposes, the scaling profile is statically set during experimentation, containing only the dynamic rate shaping scaling method. The profile selector's role is then reduced to measuring the maximum and minimum scalable rates R_{max} and R_{min} using the approach described in Section 3.2. In this case, the selector generates 50 scaling rate samples that are evenly spaced in the range (R_{min}, R_{max}).
The distortion meter is implemented from the dynamic rate shaping source code [10]. The data structure storing dynamic rate shaping execution states is extended to an array. In this case, each entry stores the distortion measure and scaling states corresponding to one scaling rate sample. This allows the continuous execution of dynamic rate shaping under one scaling rate sample independently of the executions under other rate samples. Such a structure is necessary if the scalers inside the distortion meter are implemented in parallel.
The utility assembler is simply a data parser that can accumulate distortion data across multiple pictures for per-GOP utility curve generation. The long-range predictor implements the prediction algorithm discussed in Section 3.5 and detailed in the Appendix. The utility duration T, the expanding factor e and its increment and decrement are control parameters of the experiments. The moving-average parameter a is set to 0.8 throughout the experiments discussed in the next two sections.
The first experiment is designed to investigate the similarity of utility curves generated for every MPEG picture using the Chef video trace, which is constant bit rate encoded with a scalable rate region between 0.18 and 0.25 Mbps. As illustrated in Figure 5a, the resulting utility curve varies significantly between pictures. Note that pictures 15 and 18 are I and P pictures, respectively, and that pictures 16, 17 and 19 are B pictures.
Next we investigated per-GOP generated utility curves. We observe that by averaging out fast inter-picture variation across a small number of pictures we can reduce the dependency on picture types. Smoothing over multiple pictures (e.g., a GOP) requires buffering the utility curves of every picture. Because the curves are constructed from a number of scaling rate samples (in our case 50), the memory required for this operation is not large for each stream.
Figure 5: Results of Instantaneous Utility Curves
Figures 5(b) and 5(c) illustrate a number of examples of the per-GOP (15 pictures) generated utility curves. To verify the effect of scene-based utility curve generation, we further group the resulting curves by the similarity of their shapes. From a total of 1809 pictures in the Chef video trace, we generate 120 per-GOP instantaneous utility curves. These were then grouped into 22 different scene groups. The largest scene group contains 20 GOPs, while the smallest group contains only one GOP. In Figure 5(b), GOPs 15 to 18 are associated with the same scene group and the corresponding utility curves are well matched. However, GOPs 19 and 20 (illustrated in Figure 5(c)) have quite different utility curve shapes. Note that scene changes occur in these two GOPs.
This experiment illustrates that instantaneous utility curves are very sensitive to picture types and scene changes, and that these two effects interact. Even with scene detection and prediction, the resulting utility curves may not be applicable over long time scales. In fact, for GOPs 19 and 20, the generated utility curve is only good for half a second (i.e., one GOP interval). In the Chef video trace, even the largest scene (with 20 GOPs) can only provide a "valid and accurate" curve for approximately 10 seconds. This may be a useful time scale for network adaptation to operate over. However, prediction over even longer periods would be desirable.
The prediction algorithm is designed to operate over per-GOP generated instantaneous utility curves. We first analyze the performance of the prediction algorithm with a fixed expanding factor.
4.3.1 The Effect of a Fixed Expanding Factor
We apply the prediction algorithm over the Chef and TrueLies traces with a = 0.8. We set e^{inc} and e^{dec} to zero to turn off the adjustment of e in this experiment. The number of violations under a constant expanding factor e is given in Table 1. The data was collected for experiments with e set from 1.1 to 1.3 and the utility duration T ranging from 10 seconds to 50 seconds.
Table 1: Number of Violations in Fixed-e Prediction Algorithm
                     Chef Clip                       TrueLies Clip
          T=10  T=20  T=30  T=40  T=50     T=10  T=20  T=30  T=40  T=50
  e=1.1     6     7     4     2     2        20    13    11    10     8
  e=1.2     2     2     0     2     0         1     5     2     5     1
  e=1.3     0     0     0     0     0         0     0     0     0     0
From Table 1 it is clear that the number of violations decreases as e increases from 1. However, simply increasing e is not a viable solution. The resulting utility curve degenerates into a step function (i.e., the utility value is 1 when the rate is below the peak rate and 5 otherwise). Such a curve is of little use for network adaptation since it is equivalent to peak-rate allocation and does not capture the scalability of video flows.
4.3.2 The Effect of an Adaptive Expanding Factor
One solution to reduce the number of violations is to dynamically increase or decrease the expanding factor according to the degree of "relaxation" of the measurement and the occurrence of violations. By adaptively adjusting e, we can reduce the number of violations substantially without significantly over-estimating the data rate. In the following experiments, we set e^{inc} = 0.1 and e^{dec} = 0.01 so that the expanding factor is slowly decreased when there are no violations and quickly increased when a violation occurs. Table 2 shows the performance of this adaptive mechanism for determining the expanding factor.
For e = 1.1 and 1.2, the algorithm generally performs well because the number of violations is much smaller than the results shown in Table 1. However, there are cases (e.g., e = 1.3, T = 10 sec for the TrueLies clip, and e = 1.2, T = 20 sec for the Chef clip) in which the "adaptive e" algorithm increases the number of violations by a small amount over the "fixed e" algorithm. These incidents are caused by the mechanism that decreases the expanding factor e. In Table 2, the number of violations does not change significantly as T varies. This indicates that the violations occur in bursts, which could be caused by a sequence of fast scene changes. We observe that the number of violations does not increase as T increases. One reason for this is that when the utility duration is large, the chance of reducing the expanding factor is smaller, because the expanding factor is only adjusted at the end of a utility duration in which no violation has been observed. This result implies that the algorithm will perform well for both large and small T.
Table 2: Number of Violations in Adaptive-e Prediction Algorithm
                 Chef Clip                       TrueLies Clip
        T=10  T=20  T=30  T=40  T=50  |  T=10  T=20  T=30  T=40  T=50
e=1.1     4     2     2     2     2   |    4     5     5     5     5
e=1.2     2     3     0     2     0   |    2     1     1     1     1
e=1.3     0     0     0     0     0   |    1     0     0     0     0
Figures 6(a) and 6(b) illustrate the predicted utility curves for T = 30 seconds, with the initial value of e set to 1.2. For the Chef trace, we are able to predict one utility curve every 30 seconds (60 GOPs) with no violations (as defined in Section 3.5). The resulting utility curves are illustrated in Figure 6(a); a total of 2 curves are generated, each derived from the measurement of 60 per-GOP instantaneous utility curves over the past 30 seconds. Because there are no violations over the whole video clip, each curve can be used for network adaptation for the next 30 seconds without any need to renegotiate a new utility curve with the bandwidth allocator between periodic updates.
Figure 6(b) illustrates the predicted utility curves for the TrueLies trace, which contains more frequent scene changes than the Chef trace. We observe that half of the curves do not have step shapes at the peak rate; rather, they span the whole range of scalable rates. This demonstrates that the derived utility curves track the scalability of the video streams rather well. The step-shape utility curves in Figure 6(b) result from a single violation at time 19 sec, due to bandwidth under-estimation in the utility prediction made at time 0, which in turn is caused by the lack of historical data for the very first utility prediction. When an increment is added to the expanding factor, the subsequent three curves are generated as step functions because of the large expanding factor. However, the prediction algorithm corrects itself by continuously reducing the expanding factor until there is no violation within a utility duration; observe that at time 139 sec, the generated curve becomes non-step shaped again.
Figure 6: Predicted Utility Curves with Initial Adaptive e = 1.2
To quantitatively analyze the measurement error introduced by the prediction algorithm, we define two error metrics that measure the maximum distance between a predicted utility curve u*(R) and an instantaneous utility curve u_{j}(R) generated during the subsequent utility duration of the predicted curve:
err^{+} = max_{i = 1, ... , 5 }{ u*^{-1}(i) - u_{j}^{-1}(i) , 0 }/ R_{max};
err^{-} = min_{i = 1, ... , 5 }{ u*^{-1}(i) - u_{j}^{-1}(i) , 0 }/ R_{max} .
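With the inverse utility curves represented as five rates (one per MOS level), the two metrics can be computed as in this sketch (the variable names and example rates are our own, purely illustrative):

```python
# err+ / err- sketch: u_star_inv[i] and u_j_inv[i] hold the rates needed
# to reach MOS level i+1 under the predicted and the instantaneous
# utility curve, respectively (same rate units for both).

def utility_errors(u_star_inv, u_j_inv, r_max):
    """Return (err_plus, err_minus) as fractions of the maximum rate."""
    diffs = [p - q for p, q in zip(u_star_inv, u_j_inv)]
    err_plus = max(max(diffs), 0.0) / r_max    # worst over-estimation
    err_minus = min(min(diffs), 0.0) / r_max   # worst under-estimation
    return err_plus, err_minus

# Illustrative rates in kb/s for MOS levels 1..5 (made-up numbers):
predicted = [100, 200, 350, 500, 700]
observed = [90, 210, 330, 480, 700]
err_plus, err_minus = utility_errors(predicted, observed, r_max=700)
```

A positive err+ means the predicted curve asked for more bandwidth than the content actually needed at some MOS level; a negative err- means a violation (under-allocation) occurred.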
Both metrics are defined as the percentage of over- or under-allocation relative to the maximum rate R_{max}. In Figures 7(a) and 7(b), we depict the measured errors after applying our prediction algorithm to the TrueLies clip, with an initial value of 1.2 for the adaptive expanding factor. The utility duration is set to 10 sec in Figure 7(a) and 50 sec in Figure 7(b). The top curve in both figures represents the over-estimation error, which oscillates frequently within the range of 5% to 20%. The 20% over-estimation results from the 1.2 expanding factor, which essentially inflates the rate estimate by 20%. The under-estimation error mostly remains zero, generating a negative spike of less than -3% only when violations are observed.
To determine the optimal performance, we run the long-range predictor in an offline mode. The offline predictor stores all the instantaneous utility curves generated during a target utility duration in order to compute the long-range utility curve for that duration exactly. The resulting system is noncausal, as it uses a posteriori data to predict the utility curve. Figure 7(c) illustrates the best estimation error achieved by the offline algorithm over the TrueLies clip with a 50 sec utility duration. The under-estimation error is constantly zero. The over-estimation error has a similar shape to that of Figure 7(b) but is shifted to the range between 0% and 15%. In this case the over-estimation error is predominantly caused by the intrinsic scene changes of the video content and not by the utility generation procedure.
Figure 7: Utility Estimation Error (TrueLies Clip)
Comparing the over-estimation errors in Figure 7, we observe that (1) the two over-estimation curves in Figures 7(a) and 7(b) have similar shapes, indicating that the proposed prediction algorithm is not sensitive to the length of the utility duration, and (2) the over-estimation error in Figure 7(c) has a similar shape to those in Figures 7(a) and 7(b) but is smaller by about 5%. This 5% corresponds to the difference between the 20% expanding factor and the 15% peak error shown in Figure 7(c). We conclude that the proposed prediction algorithm performs well in tracking the intrinsic scalability of video streams. The extra degree of over-estimation (e.g., 5% in Figure 7(a)) is necessary for a causal system and represents a tradeoff between the tightness of utility prediction and the number of violations observed in a utility duration.
The preceding evaluation results demonstrate that the proposed adaptive utility-prediction algorithm (described in Section 3.5 and the Appendix) can effectively predict utility curves over a utility duration comparable to the network adaptation time scale. These results show that two conflicting mechanisms (i.e., the incrementing and decrementing of the expanding factor) interplay in the proposed algorithm; together they embody the tradeoff between a large utility duration and an accurate measure of video scalability.
One may view the variation in video content scalability as composed of a spectrum of dynamics. The prediction algorithm filters out the intrinsic slowly time-varying dynamics that can be utilized efficiently by the network over time scales suited to its own dynamics, e.g., signaling. Even though the noncausal system discussed in the preceding section may operate offline for video-on-demand systems, it cannot be applied to network adaptation because the desired utility duration cannot be decided in advance of the offline operation: the network adaptation time scale is unknown before transmission and may vary during the lifetime of a video transport session due to variations in the round-trip delay and the capacity of the signaling/control system performing bandwidth renegotiation. In fact, a measurement mechanism is needed to dynamically detect the network adaptation time scale, and the utility generator in turn has to operate online to determine a suitable utility duration.
It is possible that content providers will encode pre-generated instantaneous utility curves and scaling profiles inside the video packet header or control stream (e.g., the MPEG-4 object descriptor) to leverage the adaptive encoding features of the source encoder and to control the way video content is scaled at content scalers inside the network. In this case, the first three modules of the utility generator system (Section 3.1) would be located inside the video encoder, since there would be no need to separately generate instantaneous utility curves on-the-fly. However, the utility predictor would still need to operate as a network module, predicting the long-range utility curves over the network adaptation time scale.
Although we do not experiment with other coding types, e.g., MPEG-2/4 and H.263+, we believe our findings are general: because of the similarity in bit-stream syntax and compression techniques across these coding schemes, the findings on predicting utility metrics can be applied to other types of video streams. A potential side-effect of media scaling inside the network is error propagation, i.e., dropping one frame may corrupt the remainder of the video stream when a very high compression ratio is used in encoding. However, with error-resilience coding techniques added to low bit-rate video transport (e.g., over wireless links) [13], the distortion caused by dropping frames and coefficients can be well contained.
In this paper, we have proposed a utility-based network-adaptation framework based on the notion of a passive and dynamic utility generator system that operates online and can efficiently create utility curves for packet video on-the-fly. The utility generator represents a new approach to generating utility curves through a passive inspection process; it interacts with a bandwidth allocator to periodically renegotiate the network resource needs of scalable video applications.
To match the network renegotiation time scale, we have experimented with an adaptive prediction algorithm that dynamically adjusts the extent of bandwidth over-estimation in utility measurements. Our experiments verify the effectiveness of this algorithm: we achieve long-range predicted utility curves without sacrificing measurement accuracy.
The utility metric used in this paper is based on an SNR-type video quality measure that does not take the properties of the human visual system into account; hence it does not correlate well with subjective video quality assessments [4, 12]. We plan to investigate the use of more sophisticated objective utility metrics that incorporate the human visual system. Two possible approaches are the Institute for Telecommunication Sciences (ITS) measure [6] and the Moving Picture Quality Metric (MPQM) [7]. In addition, we plan to exploit MPEG-4's object-level scalability and scene detection/prediction as mechanisms to complement the available media scaling techniques for utility-based adaptation.
We would like to thank our colleagues at the Columbia University Image Lab, especially Paul Bocheck and Shih-Fu Chang for discussions on the viability of utility curves in image processing and on the use of scene prediction, and Stephen Jacobs and Alexandros Eleftheriadis for explaining the details of the dynamic rate shaping algorithm. In addition, we would like to thank Olivier Verscheure of the Swiss Federal Institute of Technology (EPFL) for valuable discussions on video quality metrics incorporating the human visual system. Finally, we would like to thank the Intel Corporation for sponsoring this research with a grant for programmable mobile networking.
[1] N. Yeadon, F. Garcia, D. Hutchison and D. Shepherd, "Filters: QoS Support Mechanisms for Multipeer Communications", IEEE Journal on Selected Areas in Communications, Special Issue on Distributed Multimedia Systems and Technology, Vol. 14, No. 7, Sept. 1996, pp 1245-1262.
[2] P. Bocheck and S.-F. Chang, "Content Based Dynamic Resource Allocation for VBR Video in Bandwidth Limited Networks'', Proc. of the Sixth IEEE/IFIP International Workshop on Quality of Service (IWQoS'98), Napa, California, May 18-20, 1998.
[3] R. Liao and A. Campbell, "On Programmable Universal Mobile Channels", Proc. of the 4^{th} ACM Intl. Conf. on Mobile Computing and Networking (Mobicom'98), Dallas, TX, Oct. 1998.
[4] Recommendation ITU-R BT.500-7, "Methodology for the Subjective Assessment of the Quality of Television Pictures", ITU-R Recommendations, Geneva, Oct. 1995.
[5] D. Reininger, R. Izmailov, "Soft Quality of Service with VBR + Video", Proc. of Intl. Workshop on Audio-Visual Services over Packet Networks (AVSPN'97), Aberdeen, Scotland, UK, Sept. 15-16, 1997, pp. 207-211.
[6] A. Webster, C. Jones, M. Pison, S. Voran, and S. Wolf, "An Objective Video Quality Assessment System Based on Human Perception", SPIE Human Vision, Visual Processing, and Digital Display IV, Vol. 1913, pp. 15-26, San Jose, CA, February 1993.
[7] C. Lambrecht and O. Verscheure, "Perceptual Quality Measure Using a Spatio-Temporal Model of the Human Visual System", Proc. of IS&T/SPIE, San Jose, Feb. 1996.
[8] S. Schenker, "Some Fundamental Design Decisions for the Future Internet'', IEEE Journal on Selected Areas in Communications, Vol. 13, pp. 1176-1188, 1995.
[9] K. Lee, "Adaptive Network Support for Mobile Multimedia", Proc. of 1st ACM Intl. Conf. on Mobile Computing and Networking (Mobicom'95), Berkeley, CA, Nov. 1995.
[10] A. Eleftheriadis and D. Anastassiou, "Dynamic Rate Shaping of Compressed Digital Video", Proc. 2nd IEEE Intl. Conf. on Image Processing, Arlington, VA, Oct. 1995.
[11] T. Berger, Rate Distortion Theory: A Mathematical Basis for Data Compression. Prentice-Hall, 1971.
[12] A. Basso, I. Dalgic, F. Tobagi and C. Lambrecht, "Study of MPEG-2 Coding Performance based on a Perceptual Quality Metric", Proc. of Picture Coding Symposium, Melbourne, Australia, 1996.
[13] N. Farber, B. Girod, and J. Villasenor, "Extensions of ITU Recommendation H.324 for Error-Resilient Video Transmission", IEEE Communications Magazine, Special Issue on Wireless Video, pp. 120-128, June 1998.
[14] R. Koenen, ed., "MPEG-4 Overview", ISO/IEC JTC1/SC29/WG11 N2459, Atlantic City, October 1998.
[15] A. Campbell, M. Kounavis, and R. Liao, "Programmable Mobile Networks", Journal on Computer Networks and ISDN Systems, April 1999.
[16] Experiment traces: "Chef and TrueLies".
Definitions:
  R^{low} : predicted utility curve lower-bound
  R^{avg} : the averaged utility used to update R^{low} after timeout of the utility duration
  e       : the expanding factor
  e^{dec} : decrement on e when there is no violation in a utility duration
  e^{inc} : increment on e when there is a violation
  a       : control parameter for the moving average R^{avg}

Initiation: on arrival of R^{0}, the first GOP utility curve
  R^{avg} = R^{0};
  R^{low} = e * R^{0};

Loop: on arrival of R^{k}, the kth GOP utility curve
  R^{avg} = a * R^{avg} + (1-a) * R^{k};    // moving average, internal measurement
  if (R^{k} <= R^{low}) {                   // all R^{k}_{j} <= R^{low}_{j}
      if (timeout) {                        // periodic update
          e = max{1, e - e^{dec}};          // no violation, decrease e
          R^{low} = e * R^{avg};
          send R^{low} to network;
      }
  } else {                                  // violation! some R^{k}_{j} > R^{low}_{j}
      e = e + e^{inc};
      for j in {1, ..., 5} {                // search for the violating components
          if (R^{k}_{j} > R^{low}_{j}) {
              R^{low}_{j} = min{e * R^{k}_{j}, R^{low}_{j+1}};
              R^{avg}_{j} = R^{k}_{j};
          }
      }
      send R^{low} to network;
  }
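The Appendix pseudocode above can be rendered as a runnable sketch (in Python, not the authors' implementation; the moving-average weight a, the timeout test, and the handling of R^{low}_{j+1} at the top level j = 5 are our assumptions):

```python
# Runnable sketch of the adaptive utility-prediction loop. A utility curve
# is a list of five rates, one per MOS level; gop_curves is R^0, R^1, ...
A = 0.9            # moving-average weight a (assumed value)
E_INC, E_DEC = 0.1, 0.01

def predict_utility(gop_curves, e=1.2, timeout_gops=60):
    """Return the list of R^{low} curves that would be sent to the network."""
    sent = []
    r_avg = list(gop_curves[0])            # initiation on R^0
    r_low = [e * r for r in r_avg]
    for k, rk in enumerate(gop_curves[1:], start=1):
        r_avg = [A * x + (1.0 - A) * y for x, y in zip(r_avg, rk)]
        if all(r <= lo for r, lo in zip(rk, r_low)):
            if k % timeout_gops == 0:      # periodic update, no violation
                e = max(1.0, e - E_DEC)
                r_low = [e * x for x in r_avg]
                sent.append(list(r_low))
        else:                              # violation: some R^k_j > R^low_j
            e += E_INC
            for j in range(5):
                if rk[j] > r_low[j]:
                    # cap by the next level's bound; top level uncapped
                    cap = r_low[j + 1] if j < 4 else float("inf")
                    r_low[j] = min(e * rk[j], cap)
                    r_avg[j] = rk[j]
            sent.append(list(r_low))
    return sent
```

Feeding three per-GOP curves where the third violates the bound triggers an immediate resend of an expanded R^{low}, while violation-free durations only update R^{low} at the periodic timeout.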