Grid Enabled Analysis
If you are looking for a link to the June 2003 GAE
workshop pages, please go
here.
Overview
The importance of a Grid
Analysis Environment (GAE) for the LHC experiments is
hard to overestimate. Whilst the utility of, and need for, Grids have been proven in
the production environment (by the LHC experiments as a whole), their
significance and critical role in the area of physics
analysis have yet to be realised.
The work on GAE at Caltech is a natural progression from our
completed projects
GIOD and ALDAP,
our recently funded project CAIGEE,
our collaboration in PPDG,
GriPhyN,
iVDGL and more recently the
GECSR project.
The development of the GAE is the acid test of the utility of Grid systems
for science. The GAE will be used by a large, diverse community. It will
need to support hundreds, even thousands, of analysis tasks with widely
varying requirements. It will need to employ priority schemes, and robust
authentication and security mechanisms. And, most challenging, it will
need to operate well in what we expect to be a severely resource limited
global computing system. So we believe that the GAE is the key to success
or failure of the Grid for physics, since it is where
the critical physics analysis gets done, where the Grid end-to-end services are
exposed to a very demanding clientele, and where the
physicists themselves
have to learn how to collaborate at large distances on challenging
analysis topics.
Published Works
A list of recent
publications related to GAE can be found here.
Scope and Architecture
In the following Figure we show a schematic representation
(author: Lothar Bauerdick, FNAL) of how the
dynamic process of supporting analysis tasks in a
worldwide community of physicists might look.

The diagram shows a "snapshot" in time of analysis
activities in the experiment. Groups of individuals,
separated by large geographic distances, are working on specific
analysis topics, Supersymmetry, for example. Resources in the Grid
system are being shared between the active groups. The dashed line
boundaries enclosing each of the groups move and change shape and size
as the composition or requirements of the groups change.
In the CAIGEE project, we have proposed a candidate
software architecture to support analysis in the Grid. The architecture is based
on the use of Web Services, or Portals, which mediate the access between the
analysis clients and the Grid resources. The most recent version of the CAIGEE
architecture is shown below.

The important features of the architecture are its support
for a wide variety of clients (both platforms and analysis software), and its
use of Web Services to optimise the user's view and use of the Grid resources.
Web Services
Web services are computing services offered via the Web. In a typical Web
services scenario, an end-user application sends a request to a service at a
given URL using the SOAP protocol over HTTP. The service receives the request,
processes it, and returns a response. An often-cited example of a Web service
from the business world is that of a stock quote service, in which the request
asks for the current price of a specified stock, and the response gives the
stock price. This is one of the simplest forms of a Web service in that the
request is satisfied almost immediately, with the request and response being
parts of the same method call.
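As a rough illustration of this request/response pattern, the Python sketch below builds a SOAP 1.1 envelope for a stock quote request and parses the matching response. The service namespace, the GetQuote method name, the in-process stand-in for the remote service and the price table are all assumptions made for this example; no real quote service is contacted.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
QUOTE_NS = "urn:example:stock-quote"  # hypothetical service namespace

# Made-up prices, used only by the stand-in service below.
PRICES = {"IBM": 84.5}


def q(namespace, tag):
    """Return a namespace-qualified tag name in ElementTree notation."""
    return "{%s}%s" % (namespace, tag)


def build_quote_request(symbol):
    """Build the SOAP request body that a client would POST over HTTP."""
    envelope = ET.Element(q(SOAP_NS, "Envelope"))
    body = ET.SubElement(envelope, q(SOAP_NS, "Body"))
    call = ET.SubElement(body, q(QUOTE_NS, "GetQuote"))
    ET.SubElement(call, q(QUOTE_NS, "symbol")).text = symbol
    return ET.tostring(envelope, encoding="utf-8")


def fake_quote_service(request_bytes):
    """Stand-in for the remote service: read the symbol, return its price."""
    root = ET.fromstring(request_bytes)
    path = "/".join([q(SOAP_NS, "Body"), q(QUOTE_NS, "GetQuote"), q(QUOTE_NS, "symbol")])
    symbol = root.find(path).text
    envelope = ET.Element(q(SOAP_NS, "Envelope"))
    body = ET.SubElement(envelope, q(SOAP_NS, "Body"))
    reply = ET.SubElement(body, q(QUOTE_NS, "GetQuoteResponse"))
    ET.SubElement(reply, q(QUOTE_NS, "price")).text = str(PRICES[symbol])
    return ET.tostring(envelope, encoding="utf-8")


def parse_quote_response(response_bytes):
    """Extract the price from the SOAP response."""
    root = ET.fromstring(response_bytes)
    path = "/".join([q(SOAP_NS, "Body"), q(QUOTE_NS, "GetQuoteResponse"), q(QUOTE_NS, "price")])
    return float(root.find(path).text)
```

In a real deployment the request bytes would be sent to the service URL over HTTP; here the request and response are exchanged in-process so that the message structure is easy to inspect.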
Providers and consumers of Web services are typically different
organizations with diverse software and hardware platforms, which makes Web services
a potentially viable way to provide distributed computing
connectivity. An institution or organization can be the provider of Web services
and also the consumer of other Web services. For example, a group of researchers
could be in the consumer role when it uses a Web service to read and analyze
data from other service providers and in the provider role when it supplies
other researchers with the final product of the analysis.
At Caltech, we have been developing, and proving the feasibility of,
Web Services for physics
analysis data access. Different types of data, ranging from detailed event
objects stored in Objectivity ORCA databases to Tag objects stored in
Objectivity Tag databases, have been converted into prototypical Web Services.
For the Objectivity ORCA databases, we developed a set of tools that allowed
lightweight access to detailed event objects through Web Services. These Web
Services provided access to data ranging in granularity from the Federation
metadata to the event hits and tracks. Accessed using this method, the data
could then be used by a variety of tools and software programs. In 2002 we
successfully provided distributed access, via a Web Service, to FNAL’s JetMet
Ntuple files produced from ORCA Trigger analysis. This Web Service was
implemented using both a SQLServer database backend running under Windows, and
using an Oracle9i database backend running under Linux. The user's view of the
interface was identical in both cases, demonstrating the ease with which we were able to
hide the heterogeneity and details of the data
persistence mechanism.
GAE Tools
Clarens
The Clarens project started in the spring of 2001 as a remote analysis server
and grew into a Grid-enabled web services layer, with
client and server functionality to support remote analysis by end-user
physicists. The project web page with mailing lists, source archives (CVS) and
bug tracking system is hosted at
clarens.sourceforge.net. Clarens is currently deployed at CMS sites in the
US, at CERN, and in Pakistan.
The server architecture was changed from a compiled CGI executable to an
interpreted Python framework running inside the Apache http server, improving
transaction throughput by a factor of ten. A Public Key Infrastructure (PKI)
security implementation was developed to authenticate clients using certificates
created by Certificate Authorities (CAs). All client/server communication still
takes place over commodity http/https protocols with authentication done at the
application level.
Authorization of web service requests is achieved
using a hierarchical system of access control lists for users and groups forming
part of a so-called Virtual Organization (VO). As a side-effect, Clarens offers
a distributed VO management system with administration delegation away from a
central all-powerful administrator.
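The idea of a hierarchical access control list can be sketched in a few lines of Python. This is not the Clarens implementation: the dotted method names, the ACL fields and the rule that the most specific matching entry decides are assumptions chosen to make the example concrete.

```python
from dataclasses import dataclass, field


@dataclass
class ACL:
    """Hypothetical ACL entry attached to a method name or a method prefix."""
    allow_users: set = field(default_factory=set)
    allow_groups: set = field(default_factory=set)
    deny_users: set = field(default_factory=set)


def is_authorized(method, user_dn, groups, acls):
    """Check access by walking from the full method name up to its parents.

    `groups` maps a VO group name to the set of member certificate DNs.
    `acls` maps a dotted name such as "file" or "file.write" to an ACL.
    The first ACL that explicitly denies or allows the user decides;
    if no entry mentions the user, access is refused.
    """
    user_groups = {name for name, members in groups.items() if user_dn in members}
    parts = method.split(".")
    for depth in range(len(parts), 0, -1):
        acl = acls.get(".".join(parts[:depth]))
        if acl is None:
            continue
        if user_dn in acl.deny_users:
            return False
        if user_dn in acl.allow_users or user_groups & acl.allow_groups:
            return True
    return False
```

With an entry on "file" that allows a whole group and an entry on "file.write" that denies one member, that member can still call "file.read" but not "file.write", while the rest of the group keeps both.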
Server-side applications made available through Clarens include the
obligatory file access methods, proxy certificate escrow, RDBMS data
access through SOCATS (see below), SDSC Storage
Resource Broker (SRB) access, VO administration and shell command execution.
Users on the Clarens-enabled servers are able to deploy their own web services
without system administrator involvement. Method documentation and
APIs are discoverable through a standard interface. Access to web service
methods is controlled individually through the ACL system mentioned above.
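To give a feel for what discoverable methods look like in practice, the sketch below uses the XML-RPC introspection support in the Python standard library. The choice of XML-RPC style introspection is an assumption (the text above does not name the interface), and the "file.size" method is a hypothetical stub rather than a Clarens method.

```python
from xmlrpc.server import SimpleXMLRPCDispatcher


def file_size(path):
    """Return the size in bytes of a file (stubbed for illustration)."""
    return 0


# Register one hypothetical service method plus the standard
# introspection methods (system.listMethods, system.methodHelp, ...).
dispatcher = SimpleXMLRPCDispatcher(allow_none=True)
dispatcher.register_function(file_size, "file.size")
dispatcher.register_introspection_functions()

# A client would normally call these over HTTP; here they are called
# directly on the dispatcher so that no server needs to be running.
methods = dispatcher.system_listMethods()
help_text = dispatcher.system_methodHelp("file.size")
```

The method list contains both the registered service method and the introspection methods themselves, and the help text is taken from the function's docstring.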
The services described above are available from within Python scripts, C++,
as well as standalone applications and web browser-based applets using Java. A
ROOT-based client was used to demonstrate
distributed analysis of CMS JetMET data at the Supercomputing 2002
conference in Baltimore, MD. Clarens was also selected to be part of the CMS
first data challenge (DC1) in 2004.
SOCATS
We have begun development of a general purpose tool to deliver large SQL
result sets in a binary optimized form. This project
is called SOCATS. SOCATS is an acronym for STL Optimized Caching and Transport
System. The main purpose of SOCATS is to deliver relational result
sets to C++ clients in binary optimized STL vectors and maps. The data
returned from the SOCATS server to the client will be described through standard
web service WSDL, but the data itself will be delivered in binary form. This will
save the overhead of parsing large amounts of XML tags for large datasets. It
will also reduce latency problems for WAN environments, in that large batches of
rows which efficiently fill the network pipe will be transferred together. We
intend to use Clarens as the RPC layer for this product.
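The saving from binary delivery can be illustrated with a small Python sketch that packs a batch of rows into a fixed-width binary payload and compares its size with an XML encoding of the same rows. The three-column row layout (run, event, et) and the length-prefixed batch format are assumptions for this example; SOCATS itself targets C++ STL containers, and its actual wire format is not described here.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical row layout: int32 run, int32 event, float64 et.
# "<" selects little-endian with no padding, so each row is 16 bytes.
ROW = struct.Struct("<iid")
HEADER = struct.Struct("<I")  # unsigned 32-bit row count


def pack_batch(rows):
    """Pack a batch of rows as a row count followed by fixed-width records."""
    return HEADER.pack(len(rows)) + b"".join(ROW.pack(*row) for row in rows)


def unpack_batch(payload):
    """Inverse of pack_batch: return the rows as a list of tuples."""
    (count,) = HEADER.unpack_from(payload, 0)
    return [ROW.unpack_from(payload, HEADER.size + i * ROW.size) for i in range(count)]


def rows_as_xml(rows):
    """Encode the same rows as XML, for a size comparison only."""
    root = ET.Element("rows")
    for run, event, et in rows:
        row = ET.SubElement(root, "row")
        ET.SubElement(row, "run").text = str(run)
        ET.SubElement(row, "event").text = str(event)
        ET.SubElement(row, "et").text = repr(et)
    return ET.tostring(root)
```

Sending many rows in one payload also means a single network round trip per batch, which is the latency argument made above for WAN environments.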
GroupMan
The GroupMan application was developed in response to a need for more
user-friendly administration of current-generation LDAP-based virtual
organizations.
GroupMan can be used to populate the LDAP server with the required data
structures and certificates downloaded from Certificate Authorities (CAs).
Certificates may also be imported from certificate files in the case of CAs that
do not offer certificate downloads. These certificates can then be used to
create and manage groups of users using a platform-independent graphical user
interface written in Python.
The VO data is stored in such a way that it can be extracted using standard
Grid-based tools to produce so-called gridmap files used by the Globus
toolkit. These files map host system usernames to individuals or systems
identified by their certificates, thereby providing a coarse-grained
authorization mechanism.
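A gridmap file is plain text in which each line pairs a quoted certificate subject name with a local account name. The Python sketch below writes and reads such a file; the subject names and account names are invented for illustration, and this is not how GroupMan itself produces the file.

```python
import shlex


def render_gridmap(entries):
    """Render (certificate subject, local account) pairs as gridmap lines."""
    return "".join('"%s" %s\n' % (subject, account) for subject, account in entries)


def parse_gridmap(text):
    """Parse gridmap text into a dict of certificate subject -> local account.

    Blank lines and lines starting with '#' are skipped. The subject is
    quoted because it may contain spaces, so shlex is used to split it
    from the account name.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        subject, account = shlex.split(line)
        mapping[subject] = account
    return mapping
```

Because the mapping is from a certificate to a single host account, anything finer than per-account permissions has to be handled elsewhere, which is why the mechanism is described as coarse-grained.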
Handheld Analysis Client
This GAE client, based on JAS and interfacing to Clarens,
has been developed in collaboration with computer scientists at NUST, Islamabad.
The picture shows an example histogram as it is displayed on the PDA.

Desktop GAE hardware for Collaborative Analysis
We have constructed a four-screen desktop analysis
setup that operates from a single server, using a
single graphics card. The 4-way graphics card used allows an affordable setup to
be built that offers enough screen space and pixels for most or all of:
- Traditional analysis tools (e.g. ROOT session)
- Software development windows: code, debug, execution, etc.
- Event displays (IGUANA)
- Grid "views": monitoring information like MonALISA displays or processed
monitoring results
- Persistent collaboration -- VRVS session(s); VNC sharing of others' desktops,
etc.
- Online event or detector monitoring information from CMS (possibly more
shared desktops)
- Web browsers, Email, etc.

Our prototype setup works on a desktop with four 20" displays for a cost of
about $6-7 k. One can imagine variations where this works in small and large
meeting rooms, by using different displays. The cost of such a setup is certain
to drop substantially in the future.
GAE Web Services
Our Web Services are being catalogued at the
Wooster UDDI Server (contact
julian@cacr.caltech.edu if you'd
like access to this server).
Example
"Tag" Web
Service (SQL Server): This Web service provides access to a small database of
approximately 180,000 events, stored in an SQLServer database at Caltech. Web Service
methods include those to fetch the total run and event count, issue arbitrary
queries on the databases, and receive Tag event objects as base64-encoded data
streams. Server technology is Microsoft .NET.
Example
"Tag" Web Service
(Oracle9i): This service provides identical methods
to the one above, but here the data are hosted in an Oracle9i database located
at CERN.
Example
nTuple Web Service: This service provides access to a large (~14 GB)
repository of JETMET data from the CMS experiment (the original PAW nTuple data
were obtained from the CMS JETMET team at Fermilab). The data are stored in a
SQLServer database. The largest table has 132,000,000 rows. Server technology is
Microsoft .NET.
GAE Activity and Progress Reports
Information on
GAE Middleware (SOCATS
work) (by
Eric Aslakson)
PPDG Report from the Caltech Group for
March-June 2002.
Visit of Tony Johnson, Joseph
Perl, Max Turri, Mark Donszelmann (SLAC) to discuss GAE and JAS
Measurements of
3ware RAID
array server performance.
Demonstrations of GAE Work In Progress
iGrid2002 Demonstration
(September 2002)
SuperComputing 2002
Demonstration (November 2002)
We broke the Internet2 Land Speed record while at
Supercomputing 2002.
Article
in WIRED.
Proposals, Architectural Documents, Significant Email
The NSF-funded
"CAIGEE"
Project (Caltech, UCSD, UCR, UCD)
Overview
of Current Work
Email from GAE
Participants.
Other GAE Links
CMS Grid Integration Page:
http://cmsdoc.cern.ch/cms/ccs/grid/
GAE Background Information
California
Tier2 Prototype (Background and History)
Caltech Tier2
Hardware Details (October 2002)
Overview and Status
- Presentation at the CMS ATLAS Review in Berkeley, Jan 2003
Presentations
to Mary Anne Scott (DoE/MICS Program Manager), May 2002
Acknowledgements
CAIGEE: This material is based upon work
supported by the National Science Foundation under Grant No. 0218937.
Any opinions, findings, and conclusions or recommendations expressed in
this material are those of the author(s) and do not necessarily reflect the
views of the National Science Foundation.