
POSTER SESSIONS
A Case Study on Workload Characterization (Phases Identification) Techniques and Adaptive Automatic Grid Reconfiguration
Baochuan Lu
University of Arkansas
The purpose of this study is to develop an adaptive model for an intra-grid environment. We present several phase identification techniques for characterizing grid workload and show how their results are used to identify hot spots for dynamic grid resource allocation. Our study shows that these techniques can serve as the intelligent component of a reactive mechanism that lets a grid adapt to changing environmental conditions through dynamic, automatic reconfiguration. The parameters of these techniques are specific to the workload characteristics exhibited by particular applications. Tuning these parameters is a separate task that can benefit immensely from system modeling and simulation. The poster is organized into six sections. The first gives a physical and logical description of the environment in which the research is conducted. The second describes the characteristics of the measured data. The third discusses various execution phase identification techniques. The fourth examines the relationship between the parameters of the phase identification techniques and phase lengths, using frequency count analysis of phase lengths. The fifth describes one implementation of a phase identification technique, a simulation study, and the results of deploying it in the target system. The sixth presents conclusions and plans for future work.
A Study on the Performance of Transport Protocols combining Explicit Router Feedback with Window Control Algorithms
Aarthi Narayanan, Venkatesh Sarangan
Computer Science Department
Oklahoma State University
In addition to flow control, TCP performs congestion control on transmitted data to provide reliable, efficient and committed network operation. Despite its many merits, it performs less efficiently in networks with high bandwidth and high delay. Two different approaches have been proposed in the literature to alleviate this problem. One is to have the routers provide explicit feedback about congestion in the network, as done by the eXplicit Control Protocol (XCP) [1]. XCP has been shown to perform better than TCP, especially in high bandwidth-delay networks, achieving higher utilization and almost no loss. The other is window-based congestion control, for which general rules have been proposed to increase the window size when probing for bandwidth and to back off when congestion is detected. Such schemes include traditional AIMD (Additive Increase/Multiplicative Decrease) [2], which is also used by XCP; the binomial algorithms [3], such as IIAD (Inverse Increase/Additive Decrease) and SQRT; and the recently proposed memory-based schemes such as SIMD (Square Increase/Multiplicative Decrease) [4]. As these two approaches are not mutually exclusive, combining them could improve transport layer performance in terms of smoothness, aggressiveness and responsiveness. The combination could also converge faster to fairness and efficiency, thereby achieving high link utilization with negligible packet loss. This poster will present ns simulation results that evaluate this conjecture.
References
[1] D. Katabi, M. Handley and C. Rohrs. Congestion Control for High Bandwidth-Delay Product Networks. In Proceedings of ACM SIGCOMM, August 2002.
[2] R. Jain, K. K. Ramakrishnan and D. Chiu. Congestion Avoidance in Computer Networks with a Connectionless Network Layer. Technical Report DEC-TR-506, Digital Equipment Corporation, August 1987.
[3] D. Bansal and H. Balakrishnan. Binomial Congestion Control Algorithms. In Proceedings of IEEE INFOCOM, April 2001.
[4] S. Jin, L. Guo, I. Matta and A. Bestavros. TCP-friendly SIMD Congestion Control and its Convergence Behaviour. In Proceedings of ICNP 2001, November 2001.
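The window-control rules named in the abstract above can be illustrated with the binomial family of [3], in which the window grows by alpha / w^k per round-trip time without loss and shrinks by beta * w^l on loss. The following is a minimal sketch, not the authors' ns simulation code: the function names and the values of alpha, beta and the starting window are illustrative assumptions, XCP router feedback is not modeled, and SIMD is omitted because it depends on window history.

```python
# Minimal sketch of binomial window updates (one step = one round-trip time).
# alpha, beta, min_window and the starting window are illustrative assumptions.

# (k, l) exponents for the schemes named in the abstract.
RULES = {
    "AIMD": (0.0, 1.0),   # additive increase, multiplicative decrease
    "IIAD": (1.0, 0.0),   # inverse increase, additive decrease
    "SQRT": (0.5, 0.5),   # square-root increase and decrease
}


def binomial_step(window, k, l, alpha=1.0, beta=0.5, loss=False, min_window=1.0):
    """Return the congestion window after one round-trip time."""
    if loss:
        new_window = window - beta * window ** l   # back off on congestion
    else:
        new_window = window + alpha / window ** k  # probe for bandwidth
    return max(new_window, min_window)


def trace(rule, loss_rounds, rounds=10, start=16.0):
    """Return the window history for a named rule and a set of lossy rounds."""
    k, l = RULES[rule]
    window = start
    history = [window]
    for rtt in range(rounds):
        window = binomial_step(window, k, l, loss=(rtt in loss_rounds))
        history.append(window)
    return history


if __name__ == "__main__":
    for name in RULES:
        print(name, [round(w, 2) for w in trace(name, loss_rounds={5})])
```

Running the script prints one window trace per rule, which makes the difference in smoothness between AIMD and the gentler IIAD and SQRT back-offs visible.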
Acxiom’s Capacity On Demand Framework
Doug Hoffman
Acxiom Corporation
For the past five years Acxiom Corporation has been transitioning its IT infrastructure from a large SMP server model to a highly parallel, distributed Grid model. Because Acxiom’s internal processing needs are somewhat different from those of traditional Grid applications, custom monitoring and control software, called the Apiary, was written in house. Initially a number of CORBA-based services were moved to arrays of nodes, numbering from 20 to 200 members each. These arrays, called Hives, are statically allocated from a Grid consisting of more than 6000 nodes. A joint research project involving the University of Arkansas and Acxiom has been investigating strategies for automatically allocating nodes to Hives based on processing load, in an attempt to create a more dynamic Apiary environment. To help test these strategies, Acxiom has added a new feature to its control software named the Capacity On Demand framework (COD). COD works by detecting highly loaded nodes within the Hives it monitors and creating duplicates of those nodes, a process we call cloning. A pool of idle nodes, available for use as clones, is kept in special Resource Hives that are shared among a number of Service Hives. To clone a node, an idle node is allocated from a Resource Hive, the application software and data present on the overloaded node are placed on the idle node, and the service is configured and brought online. Once a clone is online, workload is balanced across all of the nodes in the affected Hive. This process is repeated until the load on the Hive is brought within acceptable margins. When the load on a Hive drops below a programmed lower threshold, the cloned nodes are released back to the Resource Hive. We present a demonstration of dynamic node allocation using real nodes in the Acxiom Grid. This was accomplished using decision model code from UA and the COD software, driven by a suite of test applications, also written by Acxiom developers. 
The test suite consists of a client Hive that can impose arbitrary network and CPU loads on a server Hive. The client Hive allows a workload profile to be accurately reproduced, so that different allocation strategies can be tested under consistent conditions.
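The clone-and-release cycle described above can be sketched as a simple threshold loop. This is a hedged illustration only, not the COD software or the UA decision model: the class names, the `rebalance` function, the threshold values, and the assumption that load is spread evenly across all nodes in a Hive are all hypothetical.

```python
# Minimal sketch of threshold-driven cloning, assuming perfectly balanced load.
# ServiceHive, ResourceHive, rebalance and the thresholds are hypothetical names
# and values chosen for illustration; they are not part of Acxiom's software.
from dataclasses import dataclass, field


@dataclass
class ResourceHive:
    idle: list = field(default_factory=list)    # idle nodes available as clones


@dataclass
class ServiceHive:
    nodes: list                                  # statically allocated nodes
    clones: list = field(default_factory=list)   # nodes borrowed from the pool

    def load_per_node(self, total_load):
        return total_load / (len(self.nodes) + len(self.clones))


def rebalance(hive, pool, total_load, upper=0.8, lower=0.3):
    """Clone while overloaded; release all clones when load falls below `lower`."""
    if hive.load_per_node(total_load) > upper:
        # Repeat cloning until the load is acceptable or the pool is empty.
        while hive.load_per_node(total_load) > upper and pool.idle:
            hive.clones.append(pool.idle.pop())
    elif hive.load_per_node(total_load) < lower and hive.clones:
        # Return the cloned nodes to the shared Resource Hive.
        pool.idle.extend(hive.clones)
        hive.clones.clear()
    return hive.load_per_node(total_load)


if __name__ == "__main__":
    hive = ServiceHive(nodes=["n1", "n2"])
    pool = ResourceHive(idle=["r1", "r2", "r3"])
    print(rebalance(hive, pool, total_load=3.0), hive.clones)
    print(rebalance(hive, pool, total_load=1.0), hive.clones)
```

Releasing every clone at once follows the wording of the abstract; a production policy would likely release nodes gradually to avoid oscillating between the two thresholds.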
Adaptation of Globus Toolkit 3 Tutorials for Undergraduate Computer Science Students
Amy Apon
University of Arkansas
Grid computing has emerged as a mechanism for large scale collaboration and authorized sharing of resources on the Internet. Unlike the Internet, which provides access to data, Grid computing allows a user to log in one time and access multiple data sources, computing cycles, on-line devices, storage space, sensors, applications, and other resources. For example, physicists worldwide log in to a grid to design, create, operate, and analyze the products of a major detector at CERN, the European high energy physics laboratory. Grid computing provides the infrastructure for highly-controlled sharing of resources that enables coordinated problem solving. While the Globus Toolkit has been used very successfully by computing professionals in the development of a number of production grids and extended grid environments, only a few course materials have been available that are accessible to instructors who wish to teach Grid computing to undergraduate computer science students. The purpose of this project is to identify the basic topics that are appropriate for study by undergraduate computer science students and the prerequisites for these topics. The approach is to focus on and to adapt Globus Toolkit 3 tutorial materials. This work is motivated by the fact that the prerequisite knowledge required to study Grid computing is currently poorly understood. This work is helping to clarify the prerequisite knowledge for using Grid tools and protocols and to build a foundation for teaching these topics to undergraduates. The adaptation of existing instructional materials includes adding descriptions and links to background details, and developing new exercises that are appropriate for undergraduates. We have completed a preliminary assessment of the developed lecture materials and exercises, which were used in courses at the University of Arkansas during the fall semester of 2004. 
The course web site and other references provide a valuable starting point for instructors. The course web site is: http://csce.uark.edu/~aapon/courses/gridcomputing/ There are many important outcomes of this work. The development of appropriate instructional materials for undergraduates is vital to the development of an educated and strong national workforce. The results of our activities are being disseminated through web sites, articles in two conferences to date, and participation and leadership in international workshops and tutorials in Grid education. The proposed activity will enrich the scientific and engineering research capability in the GPN region. The effectiveness of the materials is being tested on undergraduates. The proposed activities will not only lead to a baseline set of laboratory exercises for Grid computing; by clarifying and making explicit the prerequisite knowledge for Grid software development, they will also lead to research in better tools, interfaces, and education for Grid computing. Future work includes the identification of the ACM/IEEE knowledge units that correspond to the materials and laboratory exercises. Future work also includes the modification of these materials to be consistent with new releases of the Globus Toolkit.
Advanced Optical Networks for Research & Education
Jim Archuleta
Ciena Corporation
CIENA will provide a graphical depiction of technologies that represent the latest advances in optical networks for the research and education community. The poster will highlight key features that drive the flexibility, scalability, manageability and cost effectiveness available with today's solutions. Network deployments will be highlighted with an illustration featuring newly introduced technology supporting R&E networks. Capabilities such as multi-protocol support, 10GbE, OADM technology, and automated provisioning and management will be illustrated within the context of research networking.
Creating A Virtual Organization For Data Sharing
Denis C. Hancock, Jr., Larry R. Sanders, Michael A. Woodson, Gordon K. Springer
University of Missouri
Using a set of agreed-upon eduPerson attributes, the Research Support Computing group at the University of Missouri has created a framework for a data repository shared among member institutions of the Great Plains Network. Shibboleth provides the means of authentication and authorization. A user attempts to access the data repository web page and is passed off to the InQueue federation WAYF (Where are you from?) server, which refers the user to his or her home institution for authentication. Upon success, the user is returned to the original repository host (the service provider), where the attributes released by the authentication host (the identity provider) are passed to the authorization routines of the service provider’s web server. If all criteria are met, the user is granted authorization to access the data repository. Authorization decisions are based upon the resource names to which the identity provider asserts that a user is entitled. The Entitlement field of the eduPerson schema is used to specify users' resource entitlements. Resources are specified via Uniform Resource Names (URNs) in the greatplains.net namespace that has been registered with MACE. The URN namespace provides unique, global, persistent names for resources shared among GPN institutions, e.g. "urn:mace:greatplains.net:repository" for the GPN shared data repository. Such a system allows access across institutional boundaries. We trust the authenticating party to verify the user’s identity and to assert his or her attributes. The identity provider determines what attributes can be released, in accordance with privacy and policy requirements. The service provider retains control of authorization through its attribute acceptance policy. A virtual organization develops when two or more entities agree on what is released and what is accepted.
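The authorization step described above, in which the service provider filters released attributes through its acceptance policy and then checks for the repository entitlement, can be sketched as follows. This is a simplified illustration, not Shibboleth's API or the Missouri implementation: the function names, the policy structure, and the identity provider host name are hypothetical, while `eduPersonEntitlement` is the eduPerson attribute that carries entitlement URNs and the repository URN is the one quoted in the abstract.

```python
# Minimal sketch of entitlement-based authorization at a service provider.
# accept_attributes, is_authorized, the policy layout and "idp.example.edu"
# are hypothetical and used for illustration only.

REPOSITORY_URN = "urn:mace:greatplains.net:repository"


def accept_attributes(identity_provider, released, acceptance_policy):
    """Keep only the attributes the service provider accepts from this provider."""
    accepted_names = acceptance_policy.get(identity_provider, set())
    return {name: values for name, values in released.items()
            if name in accepted_names}


def is_authorized(accepted, required_urn=REPOSITORY_URN):
    """Grant access only if the accepted entitlements include the resource URN."""
    return required_urn in accepted.get("eduPersonEntitlement", [])


if __name__ == "__main__":
    policy = {"idp.example.edu": {"eduPersonEntitlement"}}
    released = {
        "eduPersonEntitlement": [REPOSITORY_URN],
        "mail": ["user@example.edu"],
    }
    accepted = accept_attributes("idp.example.edu", released, policy)
    print(is_authorized(accepted))
```

Keeping the acceptance filter separate from the entitlement check mirrors the division of control in the abstract: the identity provider decides what is released, and the service provider decides what is accepted.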
Deploying Condor in Student PC Labs Using VMware
Henry Neeman
University of Oklahoma
Condor is a Grid scheduling technology for loosely coupled collections of compute resources, ideal for opportunistic use of idle desktop PCs. While Condor provides many useful features, some of these features are unavailable in the Windows version, and therefore on x86 PCs Condor runs best under Linux. However, desktop PC users typically demand that Windows be the operating system with which they interact. To address this issue, the University of Oklahoma is deploying Condor across student PC labs by using an inexpensive commercial software product, VMware, which allows Condor to run natively under Linux while providing Windows as the desktop environment.
Design of Sparse Grooming Networks for Transporting Dynamic Multi-granularity Sub-wavelength Traffic
Wang Yao, Mengke Li and Byrav Ramamurthy
University of Nebraska Lincoln
The widespread use of the Internet and the availability of
high-bandwidth services such as videoconferencing necessitate a
fundamental shift from copper- to optical fiber-based
networks. However, technologies such as Synchronous Optical Network
(SONET) and Asynchronous Transfer Mode (ATM), though employing optical
fiber, do not realize the full potential of the optical medium. They are
limited by the peak electronic speed of the network components (a few
tens of Gbps), whereas a single-mode fiber can carry data at speeds up
to three orders of magnitude higher! Wavelength division multiplexing
(WDM) is a promising technique to alleviate this opto-electronic
bandwidth mismatch. Using WDM, a single strand of fiber can carry
several independently-modulated optical channels on distinct
wavelengths each operating at peak electronic processing speed.
In an optical wavelength division multiplexing (WDM) network, the
optical channels each operate at a different wavelength and each carry
data at speeds up to 10 Gbps. However, customer traffic requirements are
likely to be for lower bitrate circuits (up to several Mbps). One way
to bridge this gap is using "traffic grooming", which lets carriers
transport low bitrate circuits on high-bandwidth wavelength
channels. The challenges are to incorporate this technique while
achieving objectives such as minimizing resource consumption,
increasing revenue and providing survivability of network traffic.
As telecommunication carriers face increasing pressure to
generate revenue within stringent budgets, a sparse grooming network
(SGN) is a practical and economical solution to meet this challenge by
efficiently provisioning multi-granularity sub-wavelength connections
at a reduced cost, which comes from the reduced amount of high-speed
electronic equipment. In this work, we address the design of SGNs
under dynamic multi-granularity traffic. Two grooming algorithms are
proposed to efficiently exploit the sparse grooming capability available
in the network. Moreover, an intelligent grooming node (G-node)
placement heuristic based on an analytical model is proposed to select
the set of G-nodes which can lead to the minimum blocking probability.
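The basic idea of traffic grooming, packing low-bitrate circuits onto high-bandwidth wavelength channels, can be illustrated with a toy first-fit packing on a single link. This sketch is not one of the two grooming algorithms proposed in the poster and does not model sparse G-node placement or blocking probability; the function name and the demand values are hypothetical, and the 10 Gbps channel capacity is taken from the abstract.

```python
# Toy first-fit grooming of sub-wavelength circuits onto wavelength channels
# of a single link. groom_first_fit and the demand values are hypothetical;
# this is a generic bin-packing heuristic, not the authors' algorithms.

def groom_first_fit(demands_mbps, channel_capacity_mbps=10_000):
    """Assign each demand to the first channel with enough residual capacity."""
    channels = []   # circuits carried by each wavelength channel
    residual = []   # unused capacity of each channel, in Mbps
    for demand in demands_mbps:
        if demand > channel_capacity_mbps:
            raise ValueError("demand exceeds the capacity of one wavelength")
        for index, free in enumerate(residual):
            if demand <= free:
                channels[index].append(demand)
                residual[index] -= demand
                break
        else:
            # No existing channel can carry the circuit: light a new wavelength.
            channels.append([demand])
            residual.append(channel_capacity_mbps - demand)
    return channels


if __name__ == "__main__":
    print(groom_first_fit([622, 155, 2488, 9000, 622]))
```

Without grooming, each of the five circuits in the example would occupy its own wavelength; with first-fit packing they share two.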
Grid Computation of Fast Fourier Transform Applications for Clinical Imaging Systems
Dee H. Wu, Ph.D., Yamini Sivashunmugam, M.S.E.E., Henry J. Neeman, Ph.D., Vincent Magnotta, Ph.D.
University of Oklahoma Health Sciences Center
Our research concerns the development and evaluation of Fourier transform (FT) algorithms for implementing medical imaging simulation on grid and cluster computing systems. Future MRI and CT scanners will use optimized image scan controls to reduce the time required to image a subject. Such methods require complex algorithm optimization for real-time reconstruction and interaction. To address these issues, Fourier transform, filtered back projection reconstruction, and Bloch equation modeling algorithms were evaluated on a grid cluster for simulating image acquisition. We compared multivariate optimization and standard FFTW algorithm simplifications on our grid computing network for these applications. We demonstrated that, where they are possible, FFT simplifications greatly speed up the calculations, by a factor of up to eighty-seven in some cases.
Internet2 at Wichita State University and Kansas
John Matrow
Wichita State University
When WSU joined Internet2 in 2001, NSF matching grant money was used to fund campus projects for two years. This enabled us to hold two Internet2 Day events on campus. In addition, WSU committed to a university-wide program called Global Learning, which uses Internet2 to collaborate with many countries via their corresponding research networks. WSU was showcased twice on Internet2 in the past year. Kan-ed, the new statewide broadband network for schools, hospitals and libraries, committed to Internet2 connectivity by joining KanREN, which is a Sponsored Education Group Participant of Internet2. Kan-ed-funded projects were the basis of the first Kansas Internet2 Day last November.
Potential Hydrologic and Environmental Applications For Grid-Enabled Cyberinfrastructure
Ralph Davis, Amy Apon, Indrajeet Chaubey
University of Arkansas
Water resources and environmental planners and managers increasingly rely on output from numerical models to support decision making. Examples include 1) the USDA Soil and Water Assessment Tool (SWAT), which is used to estimate nonpoint source loading to streams, 2) the USGS Modflow package, which is used to estimate groundwater levels under both pumping and non-pumping conditions, 3) stream-aquifer simulation models used to simulate impacts of aquifer production on stream flow, 4) wetlands delineation and assessment models, 5) lake modeling packages used to simulate the cycling of nutrients and contaminants in a lake, and 6) meteorological and climatological models used to simulate precipitation, drought, floods, etc. These models all rely on spatially distributed data as input and produce spatially distributed output data. Geographic information systems (GIS) are routinely used to manipulate and display model input and output through a graphical user interface. The usefulness of these tools is often limited today by the fact that they run on a single platform and can perform analysis on a single set of inputs for only one user at a time, and that their output products often reach only a very select user group. The models are used to simulate specific environmental stressors provided by local, state and federal agency personnel. The model output is generally delivered to the decision maker in the form of a report, table or static map, with little opportunity for the decision maker to alter the environmental stressor via alternate scenario testing. The emerging paradigm is the development of functional decision support models that provide a multi-level user interface for more direct interaction by decision makers with the modeling process. We envision a functional decision support model that provides real-time access to spatially distributed users at multiple levels. 
Level 1 users would have access to manipulate base model input parameters required for calibration and validation of the model. Level 2 users would have access to conduct scenario testing including production of new output products. Level 3 users would have access to view and download existing output products. We believe this can readily be achieved using existing grid technologies and available middleware. Development of this grid-enabled decision support system will prove to be an invaluable tool for the scientific, planning, and management communities, as well as the general public.
Subversion
Linh Ngo
University of Arkansas
Subversion is a new repository system that aims not only to offer the utilities of CVS but also to overcome some of CVS’s limitations. Although it offers a large number of features, Subversion relies mostly on a command line interface and a rugged authentication system. Most of the current web-accessible interfaces for Subversion only allow browsing and check out; on the other hand, the normal GUI applications for Subversion give users access to most of Subversion's interfaces. There are also tradeoffs between web interfaces and GUI applications: the former are more flexible than the latter with respect to operating systems and authentication systems. The main purpose of this project is to create a medium between Subversion and its users that allows a large number of choices while still retaining most of the helpful functionality. Furthermore, this project will examine the integration of the Shibboleth system into the medium in order to allow Subversion users to be authenticated by a trust federation.
Towards a Grid-based DBMS
Craig Thompson
University of Arkansas
In certain high-end data-centric applications, practitioners seem to be discovering that traditional relational database technology is not meeting their requirements for huge data sets, high transaction throughput and flexible workflow support – and so they are switching to grid technology. This paper compares the two architectures and considers how to preserve benefits from both kinds of architecture, suggesting that a grid-based DBMS architecture is a reasonable and interesting target for future research. We also observe that such an architecture can benefit from other software architectures including service-oriented, agent, and aspect-oriented architectures.
Using Shibboleth to Manage Access Control to Remote Cluster Computing Resources
Amy Apon, Kurt Landrus, Kathryn Huxtable
University of Arkansas & University of Kansas
A key issue in high-performance computing is the authentication and authorization of users to remote resources. Shibboleth, a project and software package of Internet2/MACE, is designed to authorize a user to a remote web-based resource through the use of the login and attribute information that is maintained at the home institution of the user. Shibboleth allows user privacy to be maintained and provides an architecture for trust relationships between users and resources. Shibboleth can be used to enforce levels of access based on user characteristics. For example, if members of a class or research group need access to a remote data or computing resource, Shibboleth can be used to grant this access based on group membership. This project is a prototype demonstration of the use of Shibboleth v1.2 for providing authorization and access control to remote cluster computing resources through a web job submission interface.
A Proposed Grid-based Framework for Collaborative Regional Community Numerical Weather Prediction
William Capehart
South Dakota School of Mines and Technology