UC-assessgrid

From EGI Knowledge Base


Use Case title:

Short description:

Actors involved:


In everyday life we book services and benefit from a broad public and commercial infrastructure that makes life more convenient. Usually several providers offer identical or similar services and products; for example, several carriers offer flights to Brussels. A potential passenger selects a carrier based on various parameters: price and availability are major criteria, but reputation, and thus the assumed quality of service, also plays a major role. Reputation expresses a long-standing tradition and customer trust. Young companies without this kind of reputation publish test and evaluation reports to prove their quality and attract passengers. Similar mechanisms exist in many other fields. Hotel guests look at the number of stars, relate it to the national standard, and try to get as much quality of service as possible within their budget. They also consult websites where other guests comment on their stay at a particular hotel and give subjective ratings in different categories. Auction buyers take a close look at a seller’s past evaluations and prefer well-ranked sellers, following the locality principle: if everything was fine in the past, it will presumably be satisfactory in my case, too. We send our children to schools with an excellent reputation and give money to research institutions with outstanding records, in the hope that the job, product or service will be delivered with the highest possible quality.

So far, the Grid community has not recognised the importance and impact of provider reputation. But does anyone really believe that Grid users will submit a job or mission into a cloud of resources without caring who executes the work, particularly when deadlines, project progress and ultimately their own careers are at stake? The complete virtualisation of resources is a powerful technological development, but it is not how we live today, at least not yet. A guest, customer or buyer needs information about the reputation and quality of the provider on the one hand, and a legal agreement (implicit or explicit) on the other, before being ready to assign a mission to a particular provider. While the Grid community has made significant efforts to develop SLAs and protocols for specifying and negotiating Quality of Service, it has completely ignored the need for reputation as a cornerstone for building trust in the new Grid technology. We see a direct consequence in the lack of commercial, SLA-bound providers and the lack of Grid users beyond the early adopters.

AssessGrid addresses reputation and trust for all groups of Grid participants. End-users from the broad public (today’s traditional Web users) will have a connection to a confidence center and thus access to reputation indicators suited to the specific job or mission. A teenager who wants to play an online game sees the prices and reputation of several Grid game providers. He benefits from past experiences published by other users and can be sure, to some extent, that he will get the expected quality of service. He will conclude the SLA with the provider that he believes, based on published rankings, offers the highest performance for his budget. Just as current search engines show a matching rate per hit, the Grid user will see a trust rating published beside each provider’s name. The importance of this information is evident from the success of certain Internet booksellers: although some books might be available earlier elsewhere, users stick to the shops they trust and prefer to wait longer.

Reputation also helps providers to improve their service and attract as many users as possible. But they need a platform for objective comparison and competition, which is likely to be located at the Grid brokers. Brokers serve as an interface to the Grid for users who are unable or unwilling to approach the Grid directly. Brokers handle a large number of submitted jobs, so over time they can learn a lot about the quality of a given provider and thus create and publish reputation indicators. These indicators express the risk of failure, or the risk of a job being executed at lower quality than agreed, and offer a platform for provider rankings and competition. Brokers acting as AssessGrid users will benefit in two areas. First, they will improve reliability by estimating the failure risk and selecting the best providers for their customers’ jobs, thereby building their own reputation. Second, they will enrich their service: every broker can find suitable resources and rank them by price, but AssessGrid-enabled brokers can additionally take the reputation and current risk estimate of a provider into account and show the user that the job is “in good hands”. A start-up offering risk- and trust-enhanced brokering services will have a significant competitive advantage over brokers using the existing technology.
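
As a rough illustration of how a broker might turn its job history into a published reputation indicator, the sketch below estimates a per-provider failure risk from past SLA outcomes. The record format, the function name `failure_risk` and the smoothing prior are assumptions made for this example; the text does not specify how AssessGrid computes its indicators.

```python
from collections import defaultdict

def failure_risk(history, prior_failures=1, prior_jobs=2):
    """Estimate per-provider SLA failure risk from (provider, sla_met) records.

    The prior (hypothetical values) keeps a provider with very few jobs
    from being rated as perfectly reliable or perfectly unreliable.
    """
    jobs = defaultdict(int)
    failures = defaultdict(int)
    for provider, sla_met in history:
        jobs[provider] += 1
        if not sla_met:
            failures[provider] += 1
    return {p: (failures[p] + prior_failures) / (jobs[p] + prior_jobs)
            for p in jobs}

# Hypothetical job history collected by a broker.
history = [("A", True), ("A", True), ("A", False),
           ("B", True), ("B", True), ("B", True), ("B", True)]
risk = failure_risk(history)
ranking = sorted(risk, key=risk.get)  # lowest estimated risk first
```

With this history, provider B (no violations in four jobs) ranks ahead of provider A (one violation in three jobs). A real broker would presumably also weight recent jobs more heavily and distinguish outright failures from degraded quality.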

Commercial providers of resources and services should be the driving force for large-scale Grid deployment. Although customers require SLAs and the necessary technology exists, providers are still cautious about adoption. An SLA is a business risk: in case of system failures or operator problems, an SLA might be violated, resulting in financial loss (penalty fee) and image damage (lowered reputation). The most significant problem, however, is that many providers lack any indicators for evaluating their infrastructure with respect to Grid jobs and current utilization. As a result, providers offer only a small number of SLAs with low penalties, which in turn are not attractive to users. Providers therefore need three things: objective indicators of the quality of their own infrastructure; risk estimates for different situations (lightly or heavily loaded resources, vacation time, overloaded network, etc.), which help them decide on incoming SLAs and set a penalty fee corresponding to the risk of failure; and decision support for system development, management and planning. Finally, self-organising fault tolerance mechanisms can use certain risk indicators as thresholds to increase reliability. If failures push the risk above the threshold, the business policy is adapted: for example, longer slack times are negotiated, the penalty fee is reduced, or SLAs are even rejected. Alternatively, spare resources or redundant processing can be activated.
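
The threshold-based policy described above can be sketched as a small decision function. The thresholds, the function name `decide_on_sla` and the break-even rule for capping the penalty are illustrative assumptions, not part of AssessGrid as described here.

```python
def decide_on_sla(risk, price, penalty, risk_threshold=0.2, reject_threshold=0.5):
    """Return (decision, penalty_to_offer) for an incoming SLA request.

    risk    -- estimated probability that the SLA will be violated
    price   -- fee the provider earns if the SLA is fulfilled
    penalty -- penalty fee requested by the customer
    Both thresholds are hypothetical values chosen for illustration.
    """
    if risk >= reject_threshold:
        return "reject", None
    if risk > risk_threshold:
        # Assumed rule: cap the penalty at the break-even point where the
        # expected penalty payment equals the expected revenue.
        max_penalty = (1 - risk) * price / risk
        return "renegotiate", min(penalty, max_penalty)
    return "accept", penalty

low = decide_on_sla(0.05, 100, 500)     # lightly loaded resources
medium = decide_on_sla(0.25, 100, 500)  # risk above the threshold
high = decide_on_sla(0.6, 100, 500)     # e.g. overloaded network
```

In this sketch the low-risk request is accepted as offered, the medium-risk request is renegotiated with a reduced penalty, and the high-risk request is rejected. Negotiating longer slack times or activating spare resources would be further branches of the same policy.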

End-user scenario

An end-user is defined here as a Grid participant from the broad public who approaches the Grid to use a specific service, much as a Web user visits a site or activates a service. An example is a Grid user looking for access to an online game. The user describes the desired game by giving its name, the anticipated game duration and the available bandwidth. Based on this information, the Grid middleware compiles a list of all providers offering the service. The user can now select one of the providers and conclude an SLA, but has no information about the quality of the provider; for example, the provider might react slowly because of overload or poor infrastructure and thus spoil the fun of playing. AssessGrid helps the end-user by taking the estimated risk into account, making it possible to select the most suitable provider according to price and availability. Moreover, by evaluating aggregated risk indicators and past events, the user gains a degree of confidence in the published risk indicators. Providers that promise high-quality service but fail a significant number of SLAs will be marked as less trustworthy. In this way, the end-user obtains an independent and objective evaluation of the provider’s quality without additional effort. The risk and trust indicators will be presented in a customized way, so that users with little background can also understand the results and make the best choice at that time.
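
A minimal sketch of the selection step in this scenario follows. It assumes each offer carries a price, a risk figure published by the provider and an observed failure rate from past SLAs; the `Offer` fields and the rule of trusting whichever figure is worse are hypothetical choices made for the example.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    provider: str
    price: float
    published_risk: float    # risk of SLA failure claimed by the provider
    observed_failure: float  # fraction of past SLAs the provider violated

def effective_risk(offer):
    # Assumed rule: if the track record is worse than the published
    # figure, trust the track record instead.
    return max(offer.published_risk, offer.observed_failure)

def choose_offer(offers, budget):
    """Pick the affordable offer with the lowest effective risk (ties: cheapest)."""
    affordable = [o for o in offers if o.price <= budget]
    return min(affordable, key=lambda o: (effective_risk(o), o.price), default=None)

# Hypothetical offers for an online game session.
offers = [
    Offer("X", 10, 0.02, 0.20),  # promises much, but often fails
    Offer("Y", 12, 0.05, 0.04),
    Offer("Z", 30, 0.01, 0.01),  # reliable, but over budget
]
best = choose_offer(offers, budget=15)
nothing = choose_offer(offers, budget=5)
```

Here provider X advertises the lowest risk among the affordable offers, but its past SLA violations make Y the better choice. If no offer fits the budget, the function returns `None`.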

Broker scenario

The broker acts as a matchmaker between Grid customers and Grid providers. He is in charge of finding suitable resources and services, which may be operated by an arbitrary number of providers, that match the demands of the requesting Grid customer. The broker’s goal is to drive this matchmaking process to the conclusion of a new SLA. Besides individual resources and services, the Grid customer may also ask the broker to find resources for an entire workflow. In that case, the broker has to decompose the workflow and find suitable resources and services for each workflow task, while respecting the customer’s demands for workflow-wide QoS guarantees. For the Grid customer, a major service of the broker is the preselection of Grid providers, comparable to an independent insurance agent who supports his customers by preselecting policies from a number of insurance companies. Hence, the Grid broker has a strong interest in offering a trustworthy service to his customers by offering only reliable resources and services.

We illustrate the benefit of AssessGrid with an example from the automotive industry. The design and development process for a new automobile is a tightly coupled cooperation between the automobile company and a multitude of component suppliers. To ensure overall compatibility and functionality, detailed specifications and periodic simulations are mandatory. These simulations are complex workflows that start with comprehensive data collection from all developing parties, continue with nested pre-processing steps and simulation, and conclude with evaluation, monitoring and filtered distribution of results. These results are then fed back into the development process of all parties. Given the short development cycles, delays in workflow execution have to be avoided, since they would delay all parties involved in the automobile development process.
Hence, the AssessGrid broker first has to analyse the structure of the workflow, identifying complexities, concurrencies and synchronization points. For each task the broker then has to find suitable resources and services that minimize the overall execution time of the workflow. In this mapping process, the AssessGrid broker has to consider the risk of resource offerings and avoid selecting risky offerings for directly consecutive workflow tasks. For workflow tasks with sufficient time buffers, the broker can minimize resource cost instead of resource risk, since a violation of the SLA bound to such a task would not affect the overall workflow execution time. While the workflow is running, the AssessGrid broker monitors its execution and adds this information to his internal provider assessment database. This database is then used for risk management considerations in future broker operations.
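
The mapping step can be illustrated with a small sketch that computes the time buffer (slack) of each workflow task and then selects the lowest-risk offer for tasks without a buffer and the cheapest offer for tasks with one. The task names, durations, offers and the `min_buffer` parameter are hypothetical, and the sketch does not cover the rule about directly consecutive risky offerings.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def task_slack(durations, deps):
    """Return the time buffer of each task in a workflow.

    durations -- {task: execution time}
    deps      -- {task: set of tasks that must finish first}
    """
    graph = {t: deps.get(t, set()) for t in durations}
    order = list(TopologicalSorter(graph).static_order())
    earliest_finish = {}
    for t in order:  # forward pass: predecessors come first
        start = max((earliest_finish[p] for p in graph[t]), default=0)
        earliest_finish[t] = start + durations[t]
    makespan = max(earliest_finish.values())
    latest_finish = {t: makespan for t in order}
    for t in reversed(order):  # backward pass: successors come first
        for p in graph[t]:
            latest_finish[p] = min(latest_finish[p], latest_finish[t] - durations[t])
    return {t: latest_finish[t] - earliest_finish[t] for t in order}

def pick_offers(slack, offers, min_buffer=0):
    """offers: {task: [(provider, cost, risk), ...]}; returns {task: provider}."""
    plan = {}
    for task, candidates in offers.items():
        if slack[task] > min_buffer:
            plan[task] = min(candidates, key=lambda o: o[1])[0]  # cheapest
        else:
            plan[task] = min(candidates, key=lambda o: o[2])[0]  # least risky
    return plan

# Hypothetical simulation workflow.
durations = {"collect": 2, "preprocess": 3, "simulate": 5, "report_prep": 1, "evaluate": 1}
deps = {"preprocess": {"collect"}, "simulate": {"preprocess"},
        "report_prep": {"collect"}, "evaluate": {"simulate", "report_prep"}}
slack = task_slack(durations, deps)
candidates = [("P1", 50, 0.02), ("P2", 30, 0.10)]
plan = pick_offers(slack, {"simulate": candidates, "report_prep": candidates})
```

In this example `simulate` lies on the critical path and is mapped to the low-risk provider P1, while `report_prep` has a buffer of seven time units and is mapped to the cheaper provider P2.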
