4/14/09 - The Spring 2009 (PDF) edition of the CAC newsletter is available.

Newsletter Archive


The University of Florida, the University of Arizona, and Rutgers, the State University of New Jersey, have established a national research Center for Autonomic Computing (CAC).

This center is funded by the Industry/University Cooperative Research Center program of the National Science Foundation, CAC members from industry and government, and university matching funds.


10/01/09 - Grid Computing at Yahoo! (PDF)

Viraj Bhat, Grid Engineer, Yahoo!

The Yahoo! Grid Initiative is an effort underway to build a massive computing environment that supports and is augmented by an open-source software framework. The computing environment includes more than 30,000 nodes. The software framework is based on the Hadoop project from the Apache Software Foundation, an open-source implementation of the map/reduce programming model and a distributed file system that places the data close to the computations. When combined, the computing environment and software framework enable distributed, parallel processing of huge amounts of data. The Grid is used for a variety of research and development projects and for a growing number of production processes from across Yahoo!, including key components of search, advertising, data pipelines and user-facing properties.

The focus of this talk is on Apache Hadoop and its related sub-projects such as Pig and Zookeeper, which form the building blocks of the Grid infrastructure. Hadoop is a framework for running applications on large clusters built of commodity hardware. The Hadoop framework transparently provides applications with both reliability and data motion. Hadoop implements a computational paradigm named Map/Reduce, in which the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system (HDFS) that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both Map/Reduce and the distributed file system are designed so that node failures are automatically handled by the framework. Additional projects that augment Hadoop include Pig, a high-level data-flow language and execution framework, and Zookeeper, a high-performance coordination service for distributed applications.
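The map/reduce paradigm described above can be sketched in a few lines of plain Python. This is only an illustration of the programming model, not the Hadoop Java API itself; the word-count task, input fragments, and function names are assumptions chosen for clarity.

```python
# Minimal sketch of the map/reduce model: map each input fragment to
# (key, value) pairs, then reduce all pairs sharing a key into one result.
from collections import defaultdict

def map_phase(fragment):
    # Map: emit a (word, 1) pair for every word in an input fragment.
    return [(word, 1) for word in fragment.split()]

def reduce_phase(pairs):
    # Reduce: sum the counts for each distinct key.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# In a real cluster, each fragment would be mapped on a different node
# (re-executed elsewhere on failure), and the framework would shuffle
# pairs by key before the reduce step.
fragments = ["the grid runs hadoop", "the grid scales"]
pairs = [p for frag in fragments for p in map_phase(frag)]
result = reduce_phase(pairs)
print(result)
```

Because each map call depends only on its own fragment and each reduce key is independent, the framework is free to rerun any failed piece of work on another node, which is what makes node failures transparent to the application.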

3/25/09 - Achieving Predictability in Large-scale Distributed Systems (PDF)

Abhishek Chandra, Assistant Professor, Department of Computer Science and Engineering, University of Minnesota-Twin Cities

Large-scale distributed systems such as volunteer Grids, clouds, and P2P systems consist of a large number of loosely coupled nodes contributing computational, storage, and network resources for deploying large-scale applications. While these systems are attractive due to their scalability and low cost of deployment, they are inherently heterogeneous and unreliable, leading to several challenges in their predictable and reliable usage. In this talk, I will present two resource management techniques designed to make these systems more predictable for applications: reputation-based scheduling and resource bundles. Reputation-based scheduling is a scheduling technique that provides a desired reliability to applications, irrespective of the actual reliability of the underlying infrastructure. Resource bundles are an aggregation-based resource discovery mechanism designed to provide statistical guarantees on resource availability. I will present performance results for these techniques obtained through simulations as well as through experiments conducted on a live PlanetLab testbed.
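One common way to get a desired reliability out of unreliable nodes, in the spirit of the reputation-based scheduling described above, is to replicate a task until the combined success probability meets a target. The sketch below is an assumption-laden illustration of that idea, not the scheduler from the talk; the reliability figures are invented.

```python
# Hedged sketch: given a node's estimated success probability (its
# "reputation"), compute how many independent replicas of a task are
# needed so that at least one succeeds with the target probability.
def replicas_needed(node_reliability, target):
    failure = 1.0  # probability that all replicas so far have failed
    count = 0
    while 1.0 - failure < target:
        failure *= (1.0 - node_reliability)
        count += 1
    return count

# Example (illustrative numbers): nodes that succeed 70% of the time
# need 3 replicas to reach 95% overall reliability, since
# 1 - 0.3**3 = 0.973 >= 0.95.
replicas = replicas_needed(0.7, 0.95)
print(replicas)
```

The scheduler can thus trade extra work (more replicas on low-reputation nodes) for predictability, decoupling the reliability seen by the application from the reliability of any individual node.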

10/02/08 - On the Usefulness of Applying Application Utility (PDF)

Giuseppe (Peppo) Valetto, Assistant Professor, Department of Computer Science, Drexel University

Autonomic Computing systems should be able to adapt to unexpected circumstances. A major focus of our current research is how to engineer mechanisms that support that kind of self-adaptation, so that we can "design for the unexpected". In particular, the work presented in this talk adopts the idea of application-level utility as an abstract way to gauge the "health" of a running software system. Utility can be expressed as a function of the current state of the system, without necessarily engaging in deep analysis of the causes and significance of incidents and conditions that occur at run time and change that state. In our work, an application utility function is a means to map the trajectory of the application state over time to a simple value indicating whether the software is fulfilling some intended task. We present the following contributions: how to automate the elicitation of utility data from an application, based on the instrumentation of the corresponding software; how to synthesize application-level utility functions, by means of statistical analysis of such state data; and how to embed synthesized utility functions within the runtime code of the application, thus making it utility-aware. We describe this process by means of a number of examples, and discuss the principal traits of our engineering approach.
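The notion of mapping application state to a single "health" value can be sketched as follows. This is a hand-written toy example in the spirit of the talk, not the authors' synthesized functions: the metrics, weights, and thresholds are all invented assumptions (the talk's approach would derive such a function statistically from instrumented state data).

```python
# Illustrative application-level utility function: map a snapshot of
# application state to one score in [0, 1] indicating how well the
# software is fulfilling its task. All numbers here are assumptions.
def utility(state):
    # Higher throughput is better, up to an assumed nominal capacity
    # of 100 requests/sec.
    throughput_score = min(state["requests_per_sec"] / 100.0, 1.0)
    # Latency degrades utility once it exceeds an assumed 200 ms budget.
    latency_score = min(200.0 / max(state["latency_ms"], 1.0), 1.0)
    # Any error rate above an assumed 5% drives utility toward zero.
    error_score = max(1.0 - state["error_rate"] / 0.05, 0.0)
    # Weighted blend; a synthesized utility function would learn the
    # weights from observed state trajectories instead of hand-tuning.
    return 0.4 * throughput_score + 0.3 * latency_score + 0.3 * error_score

healthy = {"requests_per_sec": 100, "latency_ms": 150, "error_rate": 0.0}
score = utility(healthy)
print(round(score, 2))
```

Embedding such a function in the running application gives the adaptation machinery a cheap, continuously available signal, without requiring a causal diagnosis of every run-time incident.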

2/28/08 - Opportunities and Challenges of System Virtualization: The “Managed Virtual Systems” Project (PDF)

Karsten Schwan, Professor, Computer Science, Georgia Institute of Technology; Director, CERCS Research Center

Virtualization can provide degrees of freedom in resource access and use well beyond its known benefits for server consolidation in datacenters. Opportunities include (1) access to remote devices and runtime device migration without the need for expensive SAN hardware, (2) the provision of entirely new functionality via self-virtualized and logical devices, extending toward (3) the runtime composition of virtual computing platforms from physically distinct CPU/memory, disk, and device resources. The first part of this talk presents technical advances in system virtualization attained by our research, implemented for the Xen hypervisor and with experiments performed on server-class platforms. The second part of the talk addresses the "management" challenges caused by the improvements in flexibility created by system virtualization.