
Quality of Service (QoS) refers to the capability of a network to provide better service to selected network traffic over various technologies, including Frame Relay.

If you are supporting a large group of users and they are experiencing any of the problems mentioned below, you probably need to implement QoS. A small business with few users may not need QoS, but even there it should be helpful. QoS is a way to allow real-time network traffic that is sensitive to network delays (such as voice or video streams) to "cut in line" in front of traffic that is less sensitive (such as downloading a new app, where an extra second of download time isn't a big deal). QoS identifies and marks all packets in real-time streams, using Windows Group Policy Objects and a routing feature called Port-based Access Control Lists. The markings then help your network give voice, video, and screen share streams a dedicated portion of network bandwidth. Without some form of QoS, you might see the following quality issues in voice and video:

Jitter: media packets arriving at different rates, which can result in missing words or syllables in calls.

Packet loss: packets dropped, which can also result in lower voice quality and hard-to-understand speech.
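To make "marking" concrete, here is a minimal sketch of how an application could request a DSCP marking on its own UDP socket. It is an illustration only, not the Group Policy mechanism described above. The helper names are hypothetical, and the value 46 (Expedited Forwarding) is an assumption chosen for the example, not a value taken from this text. Some platforms ignore or override application-set markings, which is one reason policy-based marking is used in practice.

```python
import socket


def dscp_to_tos(dscp: int) -> int:
    """Return the IP TOS byte that carries a 6-bit DSCP value (ECN bits left at 0)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2  # DSCP occupies the upper six bits of the TOS byte


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Create a UDP socket whose outgoing packets request the given DSCP marking."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


# Assumed value for illustration: Expedited Forwarding, often used for voice.
ASSUMED_AUDIO_DSCP = 46
```

Network devices along the path still have to be configured to honor the marking; setting it on the socket does not by itself reserve bandwidth.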

It uses aggregate measurements i. Pacifici et al. The problem is formulated as a multivariate linear regression problem and accounts for multiple effects such as data aging. Also, several works have shown how combining the queueing theoretic formulas used by regression methods with the Kalman filter can enable continuous demand tracking [ 41 ],[ 42 ].
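The idea of continuous demand tracking can be sketched with a scalar Kalman filter. The sketch assumes the utilization law U = X * D as the observation model (X is throughput, D is the per-request demand) and a random-walk model for D. The function name, the noise variances `q` and `r`, and the initial values are all hypothetical choices for illustration, not the formulation used in the cited works.

```python
def track_demand(throughputs, utilizations, d0=0.0, p0=1.0, q=1e-6, r=1e-4):
    """Track a time-varying service demand D from (throughput, utilization) pairs.

    Assumed model: D_t = D_{t-1} + process noise (variance q),
                   U_t = X_t * D_t + measurement noise (variance r).
    Returns the list of demand estimates, one per observation.
    """
    d, p = d0, p0
    estimates = []
    for x, u in zip(throughputs, utilizations):
        p += q                        # predict: uncertainty grows under the random walk
        k = p * x / (x * x * p + r)   # Kalman gain for the observation u = x * d
        d += k * (u - x * d)          # correct the estimate with the innovation
        p *= 1.0 - k * x              # posterior variance after the update
        estimates.append(d)
    return estimates
```

A larger `q` lets the estimate follow demand changes faster at the cost of noisier estimates, which is the tuning tradeoff such trackers expose.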

Regression techniques have also been used to correlate the CPU demand placed by a request on multiple servers. For example, linear regression of average utilization measurements against throughput can correctly account for the visit count of requests to each resource [ 32 ]. Stepwise linear regression [ 43 ] can also be used to identify request flows between application tiers.
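As a minimal sketch of the regression idea, the function below fits average utilization against throughput with ordinary least squares for a single request class. Under the utilization law, the slope is the service demand per request (which already folds in the visit count to the resource), and the intercept is read here as background utilization. The function name and that reading of the intercept are assumptions made for illustration.

```python
def estimate_demand(throughputs, utilizations):
    """Least-squares fit of U = background + demand * X for one request class.

    Returns (demand, background).
    """
    n = len(throughputs)
    if n < 2 or n != len(utilizations):
        raise ValueError("need at least two paired samples")
    mean_x = sum(throughputs) / n
    mean_u = sum(utilizations) / n
    sxx = sum((x - mean_x) ** 2 for x in throughputs)
    if sxx == 0:
        raise ValueError("throughput samples must not all be equal")
    sxu = sum((x - mean_x) * (u - mean_u)
              for x, u in zip(throughputs, utilizations))
    demand = sxu / sxx                      # slope: seconds of CPU per request
    background = mean_u - demand * mean_x   # intercept: load not tied to requests
    return demand, background
```

With several request classes the same fit becomes a multivariate regression with one demand coefficient per class.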

The knowledge of request flow intensities provides throughputs that can be used in regression techniques. Explicit modeling of this logic, or part of it, for QoS prediction can help improve the effectiveness of QoS management. Several classes of models can be used to model QoS in cloud systems. Here we briefly review queueing models, Petri nets, and other specialized formalisms for reliability evaluation.

However, several other classes exist, such as stochastic process algebras, stochastic activity networks, stochastic reward nets [ 44 ], and models evaluated via probabilistic model checking [ 45 ]. A comparison of the pros and cons of some popular stochastic formalisms can be found in [ 46 ], where the authors point out that a given method can perform better on some system models than on others, which makes it difficult to recommend a single best model.

LQNs are used to better model key interactions between application mechanisms, such as finite connection pools, admission control mechanisms, or synchronous request calls. Modeling these features usually requires in-depth knowledge of the application behavior. On the other hand, while closed-form solutions exist for some classes of queueing systems and queueing networks, the solution of other models, including LQNs, relies on numerical methods.

Queueing Systems. Queueing theory is commonly used in system modeling to describe hardware or software resource contention.

Several analytical formulas exist, for example to characterize request mean waiting times, or waiting buffer occupancy probabilities in single queueing systems. In cloud computing, analytical queueing formulas are often integrated in optimization programs, where they are repeatedly evaluated across what-if scenarios.
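As an example of such formulas, the sketch below uses the standard M/M/1 results (Poisson arrivals, exponential service times, one server) and then evaluates them repeatedly in a small what-if loop. The choice of M/M/1 and the function names are assumptions made for illustration. The occupancy probability counts requests in the system, including the one in service.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Mean-value results for a stable M/M/1 queue."""
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative (service rate positive)")
    rho = arrival_rate / service_rate
    if rho >= 1:
        raise ValueError("queue is unstable: utilization must be below 1")
    response = 1.0 / (service_rate - arrival_rate)  # mean time in system
    waiting = rho * response                        # mean time spent queueing
    return {"utilization": rho,
            "mean_waiting_time": waiting,
            "mean_response_time": response}


def mm1_occupancy_probability(n, arrival_rate, service_rate):
    """Probability of finding exactly n requests in the system."""
    rho = mm1_metrics(arrival_rate, service_rate)["utilization"]
    return (1.0 - rho) * rho ** n


def smallest_service_rate(arrival_rate, target_response, candidates):
    """What-if scan: slowest candidate service rate that meets the response-time target."""
    for mu in sorted(candidates):
        if mu > arrival_rate and \
                mm1_metrics(arrival_rate, mu)["mean_response_time"] <= target_response:
            return mu
    return None  # no candidate meets the target
```

An optimization program would replace the linear scan with a solver, but the closed-form formula is evaluated in the same way at each candidate configuration.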

The authors use this model to investigate rejection probabilities and to help dimension cloud data centers. Other works that rely on queueing models to describe cloud resources include [ 53 ],[ 54 ]. The works in [ 53 ],[ 54 ] illustrate the formulation of basic queueing systems in the context of discrete-time control problems for cloud applications, where system properties such as arrival rates can change over time at discrete instants.

These works show an example where a non-stationary cloud system is modeled through queueing theory.
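The use of a queueing model for rejection probabilities and dimensioning, mentioned above, can be sketched with the Erlang B formula. This is a deliberately simpler stand-in (an M/M/m/m loss system with no waiting room) and not the model used in the cited work. The function names are hypothetical.

```python
def erlang_b(offered_load, servers):
    """Blocking probability of an M/M/m/m loss system.

    offered_load is arrival rate times mean service time (in Erlangs).
    Uses the numerically stable recursion on the number of servers.
    """
    if offered_load < 0 or servers < 0:
        raise ValueError("offered load and server count must be non-negative")
    blocking = 1.0  # with zero servers every request is rejected
    for m in range(1, servers + 1):
        blocking = offered_load * blocking / (m + offered_load * blocking)
    return blocking


def servers_needed(offered_load, max_rejection):
    """Smallest number of servers that keeps rejections at or below max_rejection."""
    if not 0 < max_rejection <= 1:
        raise ValueError("max_rejection must be in (0, 1]")
    m = 0
    while erlang_b(offered_load, m) > max_rejection:
        m += 1
    return m
```

For instance, `servers_needed(2.0, 0.1)` asks how many servers an offered load of 2 Erlangs needs to keep the rejection probability at or below 10%.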

Queueing Networks. A queueing network can be described as a collection of queues interacting through request arrivals and departures. Each queue represents either a physical resource e. Cloud applications are often tiered and queueing networks can capture the interactions between tiers. An example of cloud management solutions exploiting queueing network models is [ 55 ], where the cloud service center is modeled as an open queueing network of multiclass single-server queues.

PS scheduling is assumed at the resources to model CPU sharing. Each layer of queues represents the collection of applications supporting the execution of requests at each tier of the cloud service center. This model is used to provide performance guarantees when defining resource allocation policies in a cloud platform. Also, [ 56 ] uses a queueing network to represent a multi-tier application deployed in a cloud platform, and to derive an SLA-aware resource allocation policy.
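A minimal sketch of this kind of model follows: an open network in which each tier is a single processor-sharing queue visited by every request. It assumes a single request class (the cited models are multiclass) and a product-form network, so that each tier contributes D / (1 - U) to the mean response time. The function name and parameters are hypothetical.

```python
def tiered_response_time(arrival_rate, demands):
    """Mean end-to-end response time of an open network of PS queues.

    demands[k] is the mean service demand (seconds per request) at tier k.
    Returns (total_response_time, per_tier_response_times).
    """
    per_tier = []
    for demand in demands:
        utilization = arrival_rate * demand   # utilization law
        if utilization >= 1:
            raise ValueError("a tier is saturated: utilization must be below 1")
        per_tier.append(demand / (1.0 - utilization))
    return sum(per_tier), per_tier
```

A resource allocation policy can use such a function to check whether a candidate allocation, which changes the per-tier demands, keeps the predicted response time within an SLA threshold.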

Each node in the network has exponential processing times and a generalized PS policy to approximate the operating system scheduling.

Layered Queueing Networks. Layered queueing networks (LQNs) are an extension of queueing networks to describe layered software architectures. Compared to ordinary queueing networks, LQNs provide the ability to describe dependencies arising in a complex workflow of requests and the layering among hardware and software resources that process them.

Several evaluation techniques exist for LQNs [ 58 ]-[ 61 ]. LQNs have been applied to cloud systems in [ 62 ], where the authors explored the impact of the network latency on the system response time for different system deployments.

LQNs are useful here to handle the complexity of geo-distributed applications that include both transactional and streaming workloads. Jung et al. While this work is not specific to the cloud, it illustrates the application of LQNs to multi-tier applications that are commonly deployed in such environments. Bacigalupo et al.

LQNs are used to predict, based on historical data, the performance of an enterprise application deployed on the cloud with strict SLA requirements. The authors also discuss the pros and cons of LQNs, identifying a number of key limitations for their practical use in cloud systems. These include, among others, difficulties in modeling caching, a lack of methods to compute percentiles of response times, and the tradeoff between accuracy and speed.

Since then, evaluation techniques for LQNs that allow the computation of response time percentiles have been presented [ 61 ].

Hybrid models. Queueing models are also used together with machine learning techniques to achieve the benefits of both approaches. Queueing models use the knowledge of the system topology and infrastructure to provide accurate performance predictions.

However, a violation of the model assumptions, such as an unforeseen change in the topology, can invalidate the model predictions.

Machine learning algorithms, instead, are more robust with respect to dynamic changes of the system. The drawback is that they adopt a black-box approach, ignoring relevant knowledge of the system that could provide valuable insights into its performance. Desnoyers et al. Queueing theory is used to model different components of the system and data mining and machine learning approaches ensure dynamic adaptation of the model to work under system fluctuations.

The proposed approach is shown to achieve high accuracy in predicting workload and resource usage. Thereska et al. The performance models are based on queueing-network models abstracted from the system and enhanced by machine learning algorithms to correlate system workload attributes with performance attributes.

A queueing network approach is taken in [ 66 ] to provision resources for data-center applications. As the workload mix is observed to fluctuate over time, the queueing model is enhanced with a clustering algorithm that determines the workload mix. The approach is shown to reduce SLA violations due to under-provisioning in applications subject to non-stationary workloads.

Petri nets are a flexible and expressive modeling approach that allows general interactions between system components, including synchronization of event firing times.

They are also widely applied in performance analysis.

RBDs and Fault Trees aim at obtaining the overall system reliability from the reliability of the system components. The interactions between the components focus on how the faulty state of one or more components results in the possible failure of other components.

Petri nets. The suitability of Petri nets for performance and dependability analysis of computer systems has long been recognized. They have recently enjoyed a resurgence of interest in service-oriented systems to describe service orchestrations [ 67 ]. In the context of cloud computing, there are more application examples of Petri nets for dependability assessment than for performance modeling. Applications to cloud QoS modeling include the use of SPNs to evaluate the dependability of a cloud infrastructure [ 68 ], considering both reliability and availability.

SPNs provide a convenient way in this setting to represent energy flow and cooling in the infrastructure.

Wei et al. GSPNs are used to provide fine-grained detail on the inner VM behaviors, such as separation of privileged and non-privileged instructions and successive handling by the VM or the VM monitor. Petri nets are here used in combination with other methods, i.

Reliability Block Diagrams. Reliability block diagrams (RBDs) are a popular tool for reliability analysis of complex systems.

The system is represented by a set of inter-related blocks, connected by series, parallel, and k-out-of-N relationships. In [ 70 ], the authors propose a methodology to evaluate data center power infrastructures considering both reliability and cost.

RBDs are used to estimate and enforce system reliability. Dantas et al. An RBD is used to evaluate the impact of a redundant cloud architecture on its dependability. A case study shows how the redundant system obtains dependability improvements.
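The series, parallel, and k-out-of-N relationships of an RBD translate directly into probability formulas when block failures are assumed independent. The sketch below shows the three combinators and a small redundant layout. The layout and the reliability values in the usage note are hypothetical and are not taken from the cited studies.

```python
from math import comb, prod


def series(reliabilities):
    """All blocks must work: the reliabilities multiply."""
    return prod(reliabilities)


def parallel(reliabilities):
    """At least one block must work: the unreliabilities multiply."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


def k_out_of_n(k, n, reliability):
    """At least k of n identical, independent blocks must work."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    return sum(comb(n, i) * reliability ** i * (1.0 - reliability) ** (n - i)
               for i in range(k, n + 1))
```

For example, `series([parallel([0.95, 0.95]), 0.99])` describes two redundant power feeds followed by a single cooling unit, and comparing it with `series([0.95, 0.99])` shows the improvement gained from redundancy.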

Melo et al.

Fault Trees. Fault Trees are another formalism for reliability analysis. The system is represented as a tree of inter-related components. If a component fails, it assumes the logical value true, and the failure propagation can be studied via the tree structure. In cloud computing, Fault Trees have been used to evaluate dependencies of cloud services and their effect on application reliability [ 73 ].
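Failure propagation through a fault tree can be sketched as a recursive evaluation of AND and OR gates. The sketch assumes independent basic events that each appear only once in the tree. The tuple-based tree encoding and the function name are hypothetical choices for illustration.

```python
from math import prod


def failure_probability(node):
    """Probability that the event at this node occurs (evaluates to true).

    A node is ("basic", p), ("and", [children]) or ("or", [children]).
    Assumes independent basic events with no repetition across branches.
    """
    kind = node[0]
    if kind == "basic":
        return node[1]
    child_probs = [failure_probability(child) for child in node[1]]
    if kind == "and":   # the gate fails only if every input fails
        return prod(child_probs)
    if kind == "or":    # the gate fails if any input fails
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unknown node type: {kind!r}")
```

For example, a service that fails when both replicas fail or when a shared load balancer fails can be written as `("or", [("and", [("basic", 0.1), ("basic", 0.1)]), ("basic", 0.05)])`.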

Fault Trees and Markov models are used to evaluate the reliability and availability of fault tolerance mechanisms. Jhawar and Piuri [ 74 ] use Fault Trees and Markov models to evaluate the reliability and availability of a cloud system under different deployment contexts.

Kiran et al. Fault Trees are used to assess the probability of SLA violations.

The idea behind the methods reviewed in this section is to describe a service in terms of its response time, assuming the lack of any further information concerning its internal characteristics e. Non-parametric black-box service models include methods based on deterministic or average execution time values [ 77 ]-[ 81 ].

Several works instead adopt a description that includes standard deviations [ 76 ],[ 82 ],[ 83 ] or finite ranges of variability for the execution times [ 84 ],[ 85 ]. Parametric service models instead assume exponential or Markovian distributions [ 86 ],[ 87 ], Pareto distributions to capture heavy-tailed execution times [ 88 ], or general distributions with Laplace transforms [ 89 ]. Huang et al. Here, the authors explore QoS-aware service provisioning in cloud platforms by explicitly considering virtual network services.

A comparison with state-of-the-art QoS routing algorithms shows that the proposed algorithm is both cost-effective and lightweight. Klein et al.

The authors present a network model that allows estimating latencies between locations and propose a genetic algorithm to achieve network-aware and QoS-aware service provisioning. The work in [ 92 ] considers cloud service provisioning from the point of view of an end user.

An economic model based on discrete Bayesian Networks is presented to characterize end users' long-term behavior. In [ 36 ] an online resource demand estimation approach is presented. Experiments with different workloads show the importance of tuning the parameters, so the authors propose an online method to tune the regression parameters.
