Guest Editorial
Quality of Service in Distributed Systems
Delivering quality of service (QOS) guarantees in distributed systems is fundamentally an end-to-end issue, that is, from application to application. Consider, for example, the remote access and distribution of audio and video content from a web server: in the distributed system platform, quality of service assurances should apply to the complete flow of information from the remote server, across the network, to the point of delivery and playout. Generally, this requires end-to-end admission testing and resource reservation in the first instance, followed by careful coordination of disk and thread scheduling and flow control in the end-systems, packet/cell scheduling and congestion control in the network and, finally, active end-to-end monitoring and maintenance of the delivered quality of service. This is a complex and challenging area of research, and delivering end-to-end quality of service guarantees in distributed systems is still proving very difficult.
To date, most of the developments in the area of quality of service support have occurred in the context of individual research areas. There has been considerable progress in the separate areas of distributed object computing, operating systems, transport systems and multimedia networking. In end-systems, most of the progress has been made in the areas of scheduling, multimedia synchronization and transport support. In networks, research has focused on providing suitable traffic models and service disciplines as well as appropriate admission control and resource reservation protocols. Some progress has also been made toward the definition of QOS architectures that span the end-system and network.
Many of the advances in our understanding of how to build QOS controls into distributed systems have come from the networking community. About a decade ago, researchers realized that the combination of faster processors, fiber optic links, and packet switching would enable the development and deployment of a single infrastructure combining the capabilities of the telephone network, traditional data networks, and the radio and television broadcast networks. This early vision of an integrated services network led to the design of ATM networks and, in parallel, to the integration of voice and video transmission capabilities into the Internet. While full-scale deployment of integrated services networks is far from being realized, it seems clear that this vision will continue to play a leading role in network design and usage in the years to come.
At the heart of an integrated services network is the need to provide diverse applications with their desired qualities of service. We can approach this problem in one of three ways. First, we can overbuild capacity, so that even the most stringent application meets its requirements. This is unlikely to be cost-efficient. Second, we can require all applications to adapt to the current network state. This is impractical for a wide class of applications, such as those that depend on guarantees of delay and bandwidth to carry audio or video traffic. Thus, the diversity of requirements posed by networked applications is likely to be satisfied efficiently only by the third alternative, that is, by tailoring network service to the characteristics of the application. This alternative requires the application to specify its service requirement, the network to limit the number of simultaneous streams - so that individual service requests are met - and the network to police traffic streams so that no user oversteps its stated usage limits. These policy requirements must be implemented by appropriate scheduling, policing, signaling and re-negotiation mechanisms. Experience has shown that, with a judicious choice of mechanisms and policies, it is possible to build networks that do provide per-application end-to-end quality of service. However, moving solutions from small-scale testbeds to the field requires solving a number of significant problems.
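To make the policing element of this third alternative concrete, the short sketch below (in Python, with hypothetical class and parameter names) shows a minimal token-bucket policer that checks each packet against a declared rate and burst size; it illustrates the general mechanism only, and is not a description of any particular network's implementation.

```python
import time

class TokenBucketPolicer:
    """Minimal token-bucket policer: a packet conforms only if enough
    tokens (bytes of credit) have accumulated.  'rate' and 'burst'
    correspond to the usage limits a user declares to the network."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s      # declared long-term sending rate
        self.burst = burst_bytes          # declared maximum burst size
        self.tokens = burst_bytes         # bucket starts full
        self.last = time.monotonic()

    def conforms(self, packet_len):
        now = time.monotonic()
        # Accumulate tokens at the declared rate, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_len <= self.tokens:
            self.tokens -= packet_len
            return True      # packet is within the declared traffic profile
        return False         # packet exceeds the profile: drop, delay or mark it
```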
Perhaps the main practical barrier to widely deploying technologies that provide quality of service is that they are incompatible with the existing ubiquitous infrastructure, both in the telephone network and in the Internet. The telephone network provides only a single service quality, and the Internet essentially provides none. Today, all users recognize the wealth of information available on the Internet. However, each one is also aware that accessing multimedia content can be a frustrating experience. Currently, most multimedia information available on the Internet is low-quality audio accompanied by small, grainy, slow-motion video. This will change as new integrated services technology is phased in. How can we phase in the new infrastructure, particularly when an end-to-end guarantee requires quality of service mechanisms at every hop? The answer to this question will determine the pace at which integrated services networks are deployed in the field. Even if we were to come up with a good solution to this problem, other challenges remain.
First, how should an application describe its traffic and its performance requirements? The default answer, which is to use a leaky bucket, does not seem adequate for traffic that shows burstiness at multiple time scales. Moreover, with the rapid introduction of new networked applications, it is not clear which set of performance requirements ought to be supported by the network. Indeed, it is not even clear whether network operators should allow individual applications to customize their own performance parameters (leading to complex admission control algorithms) or whether applications must choose from a small, fixed menu of choices. Second, what support should an end-system provide to serve different streams with different qualities of service? Should we require distributed object computing and operating system designers to change their scheduling and buffer management algorithms? If so, what changes are necessary? Third, if applications are allowed to ask for highly customizable services, what can we say about admission control? How can we characterize the set of simultaneously admissible applications? Fourth, how can we assure quality of service in the presence of wireless networks, where link quality can rapidly degrade and connectivity is not guaranteed? Fifth, how can we scale our designs? How do we ensure that the amount of state at an intermediate system and the computation time for scheduling and admission algorithms stay within practical limits as the system grows? The closer we study the problem, the more this list of hard questions seems to grow without bound!
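To give a flavor of what a simple admission test looks like, the following sketch checks whether a single link can accept a flow described by a leaky-bucket pair (sigma, rho) together with a requested delay bound, using the well-known sigma/R delay approximation for rate-guaranteeing schedulers such as WFQ. The function name and parameters are illustrative assumptions; a real test would also account for packetization, propagation and multi-hop effects.

```python
def admit_flow(reserved_rates, link_capacity, sigma, rho, requested_delay):
    """Single-link admission test for a leaky-bucket (sigma, rho) flow
    asking for a worst-case queueing delay.

    Under a rate-guaranteeing scheduler, a flow served at rate R >= rho
    sees a queueing delay of roughly sigma / R (ignoring packetization
    and propagation), so the smallest rate meeting the request is
    sigma / requested_delay."""
    required_rate = max(rho, sigma / requested_delay)
    if sum(reserved_rates) + required_rate <= link_capacity:
        return required_rate      # admit: reserve this rate for the flow
    return None                   # reject: not enough spare capacity

# Example: 10 Mbit/s link with two flows already reserving 2 Mbit/s each;
# a new flow declares a 500 kbit burst, 1 Mbit/s rate, 100 ms delay target.
print(admit_flow([2e6, 2e6], 10e6, sigma=0.5e6, rho=1e6, requested_delay=0.1))
# -> 5000000.0 (admitted with a 5 Mbit/s reservation)
```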
To address these and other questions we have put together an issue of Computer Communications on "Quality of Service in Distributed Systems". The purpose of this issue is to bring together research and experience gained from building experimental systems and from analyzing quantitative models. Topics addressed in this issue include quality of service research in adaptive applications, distributed object computing, end-system QOS architectures, the Internet, broadband networks and routing. For this issue, we received over twenty papers, of which four were selected for publication. The remainder of the issue comprises two of the best papers from the 5th IFIP International Workshop on Quality of Service (IWQOS'97) and one invited contribution. While each of the papers covers a different aspect of the problem space, collectively they add to our understanding of how to realize end-to-end quality of service in distributed systems.
Many real-time application domains can benefit from flexible and open distributed object computing environments. These environments are typically well suited to conventional request/response applications, but lack the support needed by real-time applications. Schmidt, Levine and Mungee, in the invited paper entitled "The Design of the TAO Real-Time Object Request Broker", describe the design of a high-performance, real-time CORBA implementation called TAO. The platform runs on a range of operating systems and features a real-time scheduling service that can provide quality of service guarantees for deterministic real-time CORBA applications. Many conventional CORBA implementations suffer from priority inversion and non-determinism; TAO is designed to resolve both problems.
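As an indication of the kind of off-line guarantee a deterministic real-time scheduling service can offer, the sketch below applies the classical Liu and Layland rate-monotonic schedulability test to a set of periodic operations. It is a standard textbook test included purely for illustration; TAO's own scheduling service is described in the paper itself.

```python
def rate_monotonic_schedulable(tasks):
    """Liu-Layland sufficient schedulability test for rate-monotonic
    scheduling: a set of periodic tasks (execution_time, period) is
    guaranteed schedulable if total utilization does not exceed
    n * (2^(1/n) - 1)."""
    n = len(tasks)
    utilization = sum(c / t for c, t in tasks)
    bound = n * (2 ** (1.0 / n) - 1)
    return utilization <= bound, utilization, bound

# Example: three periodic operations (execution time, period) in ms.
print(rate_monotonic_schedulable([(10, 50), (15, 100), (20, 200)]))
# -> (True, 0.45, ~0.78): utilization 0.45 is below the bound, so the
#    operations can be guaranteed their deadlines under fixed priorities.
```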
To provide predictable quality of service in end-systems, we need resource management architectures in which applications and the operating system cooperate to adapt dynamically to variations in the applications' resource requirements and in the available network resources. In "Integrated CPU and Network I/O QOS Management in an End-System", Lakshman, Yavatkar and Finkel describe the implementation of an Adaptive Quality of Service Architecture (AQUA) on the Sun Solaris operating system. They examine how AQUA can manage the CPU to provide predictable quality of service to applications. A key notion of the work is managing CPU and network I/O resources in an integrated fashion.
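The sketch below illustrates the cooperative-adaptation idea in its simplest form: a feedback rule that nudges a task's CPU share up or down according to whether it is meeting its target processing rate. The function, thresholds and parameters are illustrative assumptions and do not reproduce the algorithms used in AQUA.

```python
def adapt_cpu_share(current_share, achieved_rate, target_rate,
                    step=0.05, min_share=0.01, max_share=0.9):
    """Illustrative feedback adaptation of a CPU share (fraction of the
    processor) for a media-processing task: if the task is falling
    behind its target frame/packet rate, request more CPU; if it is
    comfortably ahead, release some."""
    if achieved_rate < 0.95 * target_rate:
        return min(max_share, current_share + step)
    if achieved_rate > 1.05 * target_rate:
        return max(min_share, current_share - step)
    return current_share

# Example: a video decoder targeting 30 frames/s currently holds 20% of
# the CPU but only achieves 24 frames/s, so its share is nudged upward.
print(adapt_cpu_share(0.20, achieved_rate=24, target_rate=30))   # -> 0.25
```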
Quality of service concepts are as applicable in the end-system as they are in the network. Ott, Michelitsch, Reininger and Welling, in their paper "An Architecture for Adaptive QOS and its Application to Multimedia Systems Design", describe the implementation of an adaptive distributed multimedia system that generalizes the concept of QOS, applying it across all software architectural layers. The proposed QOS architecture is evaluated using a prototype system. Key components of the system are a graphical user interface that captures an application's QOS requirements, a dynamic network service that efficiently matches available network resources to user requirements, and a processor scheduler that schedules tasks according to their execution requirements. In related work, Alfano, in the paper "Design and Implementation of a Cooperative Multimedia Environment with QOS Control", discusses the development of a proof-of-concept cooperative multimedia system that manages multimedia services and the underlying resources in an integrated way. In this QOS architecture, particular emphasis is placed on realizing QOS mapping and QOS control mechanisms in an experimental environment that supports the cooperative access, manipulation and distribution of continuous media and data.
The Internet Engineering Task Force has described a class of applications that require hard real-time QOS guarantees from a future integrated services Internet. These applications need the network to guarantee that packets will arrive within a bounded delivery time and will not be discarded at intermediate routers due to queue overflow, provided that the flow's traffic characteristics remain within an agreed contract. In "Call Admission and Resource Reservation for Guaranteed QOS Services in the Internet", Verma, Pankaj and Leon-Garcia develop an efficient, distributed algorithm, based on a cost function, that divides the end-to-end guaranteed QOS requirement into local QOS requirements, which are in turn mapped to local resource requirements. The solution takes into account route selection and the actions of the receiving and sending nodes in the admission control and resource reservation process.
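The sketch below illustrates the flavor of the partitioning problem: dividing an end-to-end delay budget among the hops of a path, here using a simple cost-weighted split. The function and weights are illustrative assumptions only; the paper's algorithm derives its partition from an explicit cost function computed in a distributed fashion.

```python
def split_delay_budget(end_to_end_delay, propagation_delays, hop_costs):
    """Divide an end-to-end queueing-delay budget among the hops of a
    path, giving hops with a higher 'cost' (e.g. more heavily loaded
    links) a larger share of the budget.  Each local delay can then be
    mapped to a local resource (rate or buffer) requirement."""
    queueing_budget = end_to_end_delay - sum(propagation_delays)
    assert queueing_budget > 0, "infeasible: propagation alone exceeds the budget"
    total_cost = sum(hop_costs)
    return [queueing_budget * c / total_cost for c in hop_costs]

# Example: a 100 ms end-to-end budget over 3 hops with 10 ms total
# propagation delay, where the middle hop is the most congested.
print(split_delay_budget(0.100, [0.002, 0.005, 0.003], hop_costs=[1, 3, 1]))
# -> approximately [0.018, 0.054, 0.018] seconds of local queueing delay
```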
Gaining an understanding of the fundamental relationship between quality of service, connection cost and traffic smoothing for Variable Bit Rate (VBR) video over ATM networks is an important area of research. Zhang and Hui, in their paper "Applying Traffic Smoothing Techniques for QOS Service Control in VBR Video Transmissions", discuss how traffic shaping can improve the quality of service and, at the same time, reduce the network cost of delivering VBR video over ATM networks. The authors present an efficient traffic smoothing technique called "minimum polyline smoothing" that minimizes the connection cost for deterministic services and dynamically adjusts QOS parameters for statistical services in VBR video transmissions.
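As a simple illustration of why smoothing pays off, the sketch below computes the smallest constant rate at which a stored video can be transmitted without starving the client, and compares it with the unsmoothed peak rate. It is a deliberately simplified calculation (a single constant rate, no client buffer limit) and is not the minimum polyline smoothing algorithm presented in the paper.

```python
def minimum_constant_rate(frame_sizes, frame_period):
    """Smallest constant transmission rate that keeps a playback client
    from starving, assuming transmission starts one frame period before
    playback and the client buffer is unbounded.  Smoothing trades
    client buffering for a lower (and cheaper) reserved rate."""
    cumulative = 0.0
    rate = 0.0
    for i, size in enumerate(frame_sizes, start=1):
        cumulative += size
        # Data for the first i frames must arrive within i frame periods.
        rate = max(rate, cumulative / (i * frame_period))
    return rate

# Example: a bursty 5-frame sequence (bytes) played at 25 frames/s.
frames = [40000, 5000, 5000, 60000, 5000]
print(minimum_constant_rate(frames, 1 / 25.0))   # -> 1000000.0 bytes/s
print(max(frames) / (1 / 25.0))                  # -> 1500000.0 bytes/s peak
```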
QOS-based routing in the Internet and ATM networks is known to be an intractable problem. Recent work suggests that call-by-call source routing is well suited to QOS-based path selection within small networks, where it is easy to distribute relatively accurate topological information. Such an approach has a number of limitations as networks scale to larger sizes. In their paper, entitled "A Scalable QOS-based Inter-Domain Routing Scheme in a High Speed Wide Area Network", Kim, Lim and Kim discuss a suite of routing algorithms specifically designed to cater to large networks.
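For readers unfamiliar with QOS-based path selection, the sketch below shows its simplest form: prune links that cannot supply the requested bandwidth and run a shortest-path computation on what remains. The graph representation and function are illustrative assumptions; the paper's contribution lies in making such selection scale across domains in large networks.

```python
import heapq

def qos_route(graph, src, dst, bandwidth_needed):
    """Bandwidth-constrained shortest-path (Dijkstra) selection.

    graph: {node: [(neighbor, delay, available_bandwidth), ...]}"""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return list(reversed(path)), d
        if d > dist.get(u, float("inf")):
            continue
        for v, delay, bw in graph.get(u, []):
            if bw < bandwidth_needed:            # prune infeasible links
                continue
            if d + delay < dist.get(v, float("inf")):
                dist[v] = d + delay
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    return None, float("inf")                    # no feasible path

# Example: route a 5 Mbit/s request; the direct A-C link is too thin,
# so the longer but wider path through B is selected.
g = {"A": [("B", 2, 10e6), ("C", 1, 2e6)],
     "B": [("C", 2, 10e6)],
     "C": []}
print(qos_route(g, "A", "C", 5e6))               # -> (['A', 'B', 'C'], 4.0)
```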
The papers selected for this issue address many of the important technical barriers to building quality of service into distributed systems. After a decade of research in the field, many problems remain. Our only hope, then, may be to build small systems, study their behavior, and apply what has been learned toward the construction of larger, scalable distributed systems. We believe that this hands-on approach must be coupled with the analysis of well-founded models. The seven papers in this issue are examples of this thinking.
Enjoy!
Andrew T. Campbell, Guest Editor
S. Keshav, Guest Editor
Andrew T. Campbell is an Assistant Professor in the Department of Electrical Engineering and a member of the COMET Group at the Center for Telecommunications Research, Columbia University, New York. Dr. Campbell joined Columbia in January 1996 after completing his Ph.D. at Lancaster University, UK, where he conducted research in multimedia communications as a British Telecom Research Lecturer. Before joining Lancaster University, he worked for 10 years in industry on the design and development of network operating systems and communication protocols for packet-switched local area networks and packet radio networks. He is currently leading a new research effort at Columbia in Wireless Media Systems, focusing on the development of programmable QOS-aware middleware for mobile multimedia networks that comprise ad-hoc, broadband and next generation Internet technologies.
S. Keshav is an Associate Professor of Computer Science at Cornell University. Formerly a Member of Technical Staff at AT&T Bell Laboratories, Dr. Keshav received his Ph.D. in Computer Science from UC Berkeley in 1991. He is the author of numerous technical papers, two of which were selected by ACM SIGCOMM as being among the most influential papers in computer networking in the past twenty-five years. His book, "An Engineering Approach to Computer Networking", was published by Addison-Wesley in their Professional Computing Series in May 1997.
Order of Papers for Issue
1. Douglas Schmidt, David Levine and Sumedh Mungee, "The Design of the TAO Real-Time Object Request Broker", Washington University.
2. K. Lakshman, R. Yavatkar and R. Finkel, "Integrated CPU and Network I/O QoS Management in an End-System", Intel Architecture Labs and University of Kentucky.
3. M. Ott, G. Michelitsch, D. Reininger and G. Welling, "An Architecture for Adaptive QoS and its Application to Multimedia Systems Design", NEC USA Inc.
4. Marco Alfano, "Design and Implementation of a Cooperative Multimedia Environment with QoS Control", University of Palermo, Italy.
5. Sanjeev Verma, Rajesh K. Pankaj and Alberto Leon-Garcia, "Call Admission and Resource Reservation for Guaranteed QoS Services in the Internet", University of Toronto, Canada.
6. Junbiao Zhang and Joseph Y. Hui, "Applying Traffic Smoothing Techniques for QoS Service Control in VBR Video Transmissions", Rutgers University, Piscataway, New Jersey.
7. Seung-Hoon Kim, Kyungshik Lim and Cheeha Kim, "A Scalable QoS-based Inter-Domain Routing Scheme in a High Speed Wide Area Network", Pohang University of Science and Technology, Korea.