|
09/11/2006, 11:30 AM - 12:00 PM
Speaker: Enis Afgan, University of Alabama.
While Grid middleware provides the necessary virtualization of resources and thus enables cross-domain computation, several aspects of application deployment and execution still require a significant level of expertise. As scientific Grids multiply and their user base grows, there is an increasing need for users to perceive the Grid as a single, powerful, accessible, and user-friendly entity in which individual resources, instruments, algorithms, and software packages are abstracted away. One approach to achieving this is the customization and adaptation of applications already established in the wider community. Since these applications are mostly used by scientists who are not necessarily computer-savvy, there is a strict requirement not to introduce new aspects into application execution. Otherwise, users might be deterred from accepting the new infrastructure and continue to use the resources available at their local institution through well-established routines. As a step in this direction, we have developed an application conforming to these requirements for the Basic Local Alignment Search Tool (BLAST), and we are still working on a NAMD version. Both applications are in constant need of the surplus compute power the Grid can offer and are commonly used by biomedical scientists in gene discovery, categorization, and simulation. With this in mind, and observing the multitude of unused or underused workstations across a campus Grid, we built Dynamic BLAST, a master-worker application that focuses on harnessing the unused CPU power of distributed workstations. It relies on the well-established protocols and technologies of the Globus Toolkit for job management, and it communicates with the end user through a portal, a command line, or an API.
All the intricacies of resource acquisition, data distribution, and job submission are handled by Dynamic BLAST, allowing end users to keep their focus on application-specific parameters and data interpretation. Dynamic BLAST distributes multiple queries across many, possibly geographically distributed, resources and thus performs searches with shorter turnaround time. Beyond merely stealing unused CPU cycles, Dynamic BLAST tries to optimize execution on the underlying resources by employing the version of the existing BLAST algorithm most suitable for each particular resource. To accomplish this, it uses information from previous BLAST runs; then, depending on resource performance and availability, it selectively transfers an appropriate amount of work; and finally it invokes a sequential or parallel version of BLAST on the given resource. By appropriately granularizing the user's input, Dynamic BLAST is capable of fine-grained workload balancing, which is inherently necessary on the Grid because of the heterogeneity of its resources. The usability and benefit of Dynamic BLAST were tested on a set of resources available on UABGrid, a local campus Grid. Cumulative application execution times obtained on distributed, readily available workstations recruited by Dynamic BLAST were comparable to, or better than, the execution times of equivalent jobs on large, dedicated resources running standalone BLAST. As such, Dynamic BLAST has been shown to be able to provision Grid resources to the wider scientific community through a simple invocation while hiding unnecessary middleware particulars.
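
The workload balancing described above can be illustrated with a minimal Python sketch. It is not the Dynamic BLAST implementation, which the abstract does not detail. It assumes that each resource has a throughput figure recorded from previous runs, that queries arrive as multi-FASTA text, and that a resource with more than one available core should run a parallel BLAST variant. The names `Resource`, `past_rate`, `split_fasta`, `choose_variant`, and `partition` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str         # hypothetical resource label
    cores: int        # CPU cores currently available on the resource
    past_rate: float  # queries per second seen in earlier runs (assumed metric)


def split_fasta(text):
    """Split multi-FASTA text into a list with one record per query."""
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))
    return records


def choose_variant(resource):
    """Assumed rule: use a parallel BLAST only where several cores are free."""
    return "parallel" if resource.cores > 1 else "sequential"


def partition(queries, resources):
    """Assign queries to resources in proportion to their past throughput.

    Uses the largest-remainder method so that every query is assigned
    exactly once even when the proportional shares are fractional.
    """
    total = sum(r.past_rate for r in resources)
    shares = [len(queries) * r.past_rate / total for r in resources]
    counts = [int(s) for s in shares]
    leftover = len(queries) - sum(counts)
    by_remainder = sorted(
        range(len(resources)),
        key=lambda i: shares[i] - counts[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1

    plan, start = {}, 0
    for resource, count in zip(resources, counts):
        plan[resource.name] = queries[start:start + count]
        start += count
    return plan
```

A master process could call `partition` once per submission and then, for each resource, launch the variant returned by `choose_variant` on that resource's share of the queries. Splitting at the level of individual queries is what makes the balancing fine-grained: a slow workstation can be handed only a few queries while a faster one receives many.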


|