D4Science
From EGI Knowledge Base
The main objective of the D4Science (DIstributed colLaboratories Infrastructure on Grid ENabled Technology 4 Science) project is to deploy the e-Infrastructures built so far by the EGEE and DILIGENT projects so that they address the needs of several new scientific communities affiliated with the broad disciplines of Environmental Monitoring and Fisheries and Aquaculture Resources Management.
These e-Infrastructures, which offer mechanisms that concurrently exploit networks, grids, and data in a seamless fashion, will enable scientific communities to operate within a coherent pan-European model, regardless of the location of their research facilities. The project will progressively consolidate and expand these open e-Infrastructures to better address the needs of the two major target disciplines (which have challenging differences but also interesting commonalities). As an outcome of this project, thousands of scientists will obtain greater access to facilities for creating Virtual Research Environments, a.k.a. Collaboratories, based on shared computation, storage, and generic service resources offered by EGEE and DILIGENT at a European level, as well as on data and domain-specific service resources offered by large international organizations, such as the European Space Agency, the Food and Agriculture Organization of the United Nations and the WorldFish Center, supported by the Consultative Group on International Agricultural Research.
The envisioned D4Science e-Infrastructure will have a multiplicative benefit to many scientific fields and will also act as a catalyst for the kind of cooperation and cross-fertilization among multiple communities that is necessary for addressing many grand challenges of science and society.
| Project homepage | www.eu-egee.org |
|---|---|
| No. of Partners | 10 |
| No. of Countries | 7 |
| Start Date | 2008-01 |
| Duration (Months) | 24 |
| Cost (€ per Year) | 1,960,000 |
| EU Funding (€ per Year) | 1,575,000 |
| FTEs (per Year) | 15.4 |
EGI Functions Mapped onto D4Science Activities
Operation of a reliable Grid infrastructure
The overall operation of the D4Science production Infrastructure is guaranteed by the SA1 Infrastructure Operation work package. This work package is in charge of both defining and implementing the policies that govern the deployment of the infrastructure and guarantee its operation. To this end, the work package defined:
- An organisational structure consisting of management and resource centres in charge, respectively, of planning and organising the overall infrastructure (management centre) and of providing the infrastructure with actual resources by implementing the overall plan (resource centre, a.k.a. site). Each resource centre can contribute to the infrastructure through (i) gLite nodes, i.e. nodes offering computing and storage facilities through the gLite software; (ii) gCube nodes, i.e. nodes offering gCube-based facilities such as content management or search, as well as dynamic deployment of new functions, through the gCube software; and/or (iii) community nodes, i.e. nodes providing data sources and community-specific services through web services and the like.
- A deployment plan consisting of the milestones to be accomplished to implement the designed infrastructure. This plan makes it possible to monitor carefully the status of the infrastructure and its evolution, as well as to have a clear picture of the “size” of the resulting infrastructure.
- A set of procedures and tools governing the lifetime of the infrastructure. The systematic use of established procedures is fundamental to providing an efficient production-level infrastructure. These procedures range from those governing the installation and upgrade of the software through which sites provide their services to those governing monitoring, certification, security, etc. Details on the whole activity are reported on the D4Science Production Infrastructure web site (https://infrastructure.wiki.d4science.research-infrastructures.eu/).
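The organisational structure above can be pictured as a small data model. The following is only an illustrative sketch: the class names, the `infrastructure_size` helper, and the host names are hypothetical and are not part of gLite, gCube, or any D4Science tool. It shows how resource centres contributing the three kinds of node could be tallied to obtain a rough picture of the infrastructure "size" that the deployment plan refers to.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    """The three ways a resource centre can contribute, as listed above."""
    GLITE = "gLite"          # computing and storage through the gLite software
    GCUBE = "gCube"          # gCube-based facilities (content management, search, ...)
    COMMUNITY = "community"  # data sources and community-specific services


@dataclass
class Node:
    hostname: str  # hypothetical identifier, not taken from the source
    kind: NodeKind


@dataclass
class ResourceCentre:
    name: str
    nodes: list[Node] = field(default_factory=list)


def infrastructure_size(centres: list[ResourceCentre]) -> Counter:
    """Count nodes per kind across all resource centres."""
    return Counter(node.kind for centre in centres for node in centre.nodes)


# Made-up example data, for illustration only.
centres = [
    ResourceCentre("site-a", [Node("ce01.example.org", NodeKind.GLITE),
                              Node("gc01.example.org", NodeKind.GCUBE)]),
    ResourceCentre("site-b", [Node("data01.example.org", NodeKind.COMMUNITY),
                              Node("gc02.example.org", NodeKind.GCUBE)]),
]
size = infrastructure_size(centres)
```

A management centre could compare such a tally against the milestones of the deployment plan; how D4Science actually tracks this is described on the infrastructure wiki, not here.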
Coordination of middleware development and standardization
The software supporting the operation of the D4Science infrastructure mainly consists of two systems: (i) the gLite software through which the gLite nodes are implemented and (ii) the gCube software through which the gCube sites are implemented.
The gLite software was implemented by the EGEE team. In the context of D4Science, it constitutes the underlying middleware through which computing and storage facilities can be used seamlessly.
The gCube software was implemented in the context of the DILIGENT project and will be consolidated and enhanced during D4Science. It builds upon gLite with the goal of implementing an advanced software system:
- enabling cost-effective utilisation of computational and storage resources;
- offering a feature-rich platform for distributed hosting, management and retrieval of data and information of various genres and a framework for extending state-of-the-art indexing, selection, fusion, extraction, description, annotation, transformation, and presentation of content;
- eliminating manual deployment overheads, guaranteeing optimal placement of services within the infrastructure and opening novel opportunities for outsourcing state-of-the-art implementation.
The whole design of this advanced framework has been influenced by existing standards and best practices, among which the Web Services Resource Framework plays a fundamental role. A complete list of the standards gCube deals with is available at https://quality.wiki.d4science.research-infrastructures.eu/quality/index.php/Standards.
The overall activity concerning the consolidation and enhancement of the gCube software is coordinated by the JRA1 Overall Planning & Development Coordination work package and implemented by the rest of the JRA work packages. User communities will be directly involved in this software consolidation and enhancement activity by providing the developers with their requirements and use cases.
Development and operation of build and test systems
Components selection, validation, integration and deployment
Because of the quality of service that the D4Science infrastructure is required to deliver, an integrated workflow involving the various actors and work packages contributing to the functions above has been defined. More details are reported on the dedicated wiki page (https://integration.wiki.d4science.research-infrastructures.eu/). That page describes the tools selected to support the whole workflow and the roles and responsibilities of the different actors involved, and it clarifies the constituent activities by identifying guidelines.
Mechanisms for resource provisioning to Virtual Organisations
Application Support
Clearly, the cost of porting an application depends strictly on the level of engagement the target application is expected to achieve.
gCube drastically reduced the cost of porting existing applications to the Grid, e.g. by systematising web service deployment and management facilities and by providing a unifying interface for discovering heterogeneous resources.

