OGF21 Schedule
The 21st Open Grid Forum - OGF21
October 15-19, 2007
Seattle, Washington, USA

Tuesday, October 16
3:15 pm - 4:45 pm
Data/Compute Affinity - Focus on Data Caching (90 mins)
Chris Smith

As the data sets being processed grow, many organizations with production Compute Grids are experiencing problems arising from the way data is accessed within their computing environments. Some fundamental questions emerge, such as "Do I bring the compute to the data, or move the data to the compute?" and "How do I efficiently move data to the computing element without degrading the efficiencies I’ve developed in managing computing workloads on the Grid?"

These issues - the affinity of computing and data - are currently being explored by many practitioners of Grid computing. Several mechanisms have emerged that help people process their data efficiently, depending on the application pattern. For instance, more companies are making significant use of data cache technologies. This session will describe how data caching approaches address the data/compute affinity problem, provide use cases that illustrate how specific technology solutions deal with it, and frame current data caching issues and limitations and how they can be addressed in the future. We will also explore the role OGF can take in solving the data/compute affinity problem and discuss plans for future workshops at OGF22 and beyond.

Presentations:
Increase CPU utilization to near 100% - move your parallel tasks to data that is distributed across grid memory
Jags Namnarayan
Gemstone, Chief Architect

The primary goal of Grid computing is cost savings through increased CPU utilization. Often, however, parallel jobs require data that is stored in enterprise databases or provisioned in dispersed file systems, involve job flows with significant data traffic between jobs, and must publish intermediate and result data to enterprise data repositories. The data-intensive nature of Grid applications can therefore leave servers IO bound, sometimes reducing average CPU utilization to less than 50%. This presentation covers the use of a distributed main-memory cache that offers the scalability and elasticity required to operate in a grid and makes data available to compute applications at memory speeds. It introduces terms such as replicated caching, partitioned data management, hierarchical caching across compute nodes and main-memory data grid servers, and read-through and write-behind caching for synchronization with external data systems. To achieve maximum CPU utilization, how do you get the data close to the compute node? If the data is partitioned across the Grid nodes, what are the policies? We will discuss static versus dynamic partitioning of data, dynamic rebalancing of data across grid nodes driven by either data growth or access patterns, static configuration of redundant copies versus a dynamic increase in the number of data copies for parallel access, and techniques for distributed query processing.
To achieve optimal routing of a job or task to where the data is provisioned, the talk also covers the integration of a compute scheduling engine with a data grid.
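
As a rough illustration of the idea of routing a task to the node where its data is provisioned, the following Python sketch combines static hash partitioning with a trivial data-aware scheduler. It is not taken from the talk or from any product's API: the names owner_of, route_tasks, NODES, and TASKS are hypothetical, and a real data grid would also need to handle rebalancing and redundant copies.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Sequence


def owner_of(key: str, nodes: Sequence[str]) -> str:
    """Static hash partitioning: map a data key to the node assumed to hold it."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def route_tasks(task_keys: Dict[str, str], nodes: Sequence[str]) -> Dict[str, List[str]]:
    """Group tasks by the node that owns the data key each task reads."""
    plan = defaultdict(list)
    for task, key in task_keys.items():
        plan[owner_of(key, nodes)].append(task)
    return dict(plan)


# Hypothetical grid nodes and tasks (task name -> data key it needs).
NODES = ["node-a", "node-b", "node-c"]
TASKS = {
    "price-1": "portfolio:17",
    "risk-1": "portfolio:17",
    "price-2": "portfolio:42",
}

PLAN = route_tasks(TASKS, NODES)
for node, tasks in sorted(PLAN.items()):
    print(node, tasks)
```

Tasks that read the same key land on the same node, so the data does not have to move. Because the mapping depends on the number of nodes, adding or removing a node reassigns many keys, which is one reason the dynamic partitioning and rebalancing policies mentioned in the abstract matter.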

Data-Awareness and Low-Latency on the Enterprise Grid: Getting the Most out of Your Grid with Enterprise IMDG
Shay Hassidim
GigaSpaces, Deputy CTO

This presentation is based on the white paper "Data Awareness and Low Latency on the Enterprise Grid", which uses the Space-Based Architecture model to explain why a typical in-memory data grid cannot solve the data contention and latency challenges of enterprise grid-based applications.



Location: Leonesa III
 
 
    Slides:     Data-Awareness and Low-Latency on the Enterprise Grid: Getting the Most out of Your Grid with Enterprise IMDG
    Slides:     Data-Awareness and Low-Latency on the Enterprise Grid
    Slides:     Data/Compute Affinity - Focus on Data Caching
    Slides:     Optimize computations with Grid data caching OGF21


OGF(SM), Open Grid Forum(SM), Grid Forum(SM), and the OGF Logo are trademarks of OGF.