The Storage Resource Broker Project
"Enabling the Australian data grid"
What is the SRB?
The Storage Resource Broker (SRB) was developed by the San Diego
Supercomputer Center at UC San Diego.
"The SRB is client-server middleware
that provides a uniform interface for connecting to heterogeneous data
resources over a network and accessing replicated data sets. SRB, in
conjunction with the Metadata Catalog (MCAT), provides a way to access
data sets and resources based on their attributes rather than their
names or physical locations."
SDSC SRB web site
SRB handles data through a client-server architecture and allows users
to access location-independent data: the user does not need to know
where files are located or how they are stored. The SRB organises data
logically, rather than physically, into a single virtual file system,
much like the filesystem found on a stand-alone computer. The
use of stored metadata about each file facilitates the querying of
these distributed data.
Metadata – data about the data – allow users to quickly find a dataset,
what it contains, its format, when or where the data were collected or
created, etc. Metadata querying and browsing enable user communities
to have transparent access to each other’s data collections. Users can
store or replicate their data collections across several servers to
facilitate data preservation, while not losing local access control.
Access permissions on individual files can be set so that other users
can access the files. Users wishing to access these collections will
view a single collection through the SRB middleware and do not need to
know the physical location of the data.
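To make the idea of attribute-based discovery with per-file access control more concrete, here is a minimal sketch in Python. It is not the SRB client API: the DataObject class, the find function, the user names and the metadata fields are all hypothetical, chosen only to illustrate how a query can select files by what they contain rather than by where they live, while the owner still decides who may read them.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """Hypothetical record for a file described by metadata, not by location."""
    logical_name: str
    owner: str
    metadata: dict
    readers: set = field(default_factory=set)  # users the owner has granted read access

    def can_read(self, user: str) -> bool:
        return user == self.owner or user in self.readers


def find(collection, user, **criteria):
    """Return logical names of objects the user may read and whose metadata match."""
    return [
        obj.logical_name
        for obj in collection
        if obj.can_read(user)
        and all(obj.metadata.get(key) == value for key, value in criteria.items())
    ]


# Illustrative data only; these names do not come from any real collection.
collection = [
    DataObject("survey/site-a.csv", "alice", {"format": "csv", "year": 2005}),
    DataObject("survey/site-b.csv", "alice", {"format": "csv", "year": 2006}),
    DataObject("images/scan-01.tif", "bob", {"format": "tiff", "year": 2005}),
]

# alice grants bob read access to one of her files
collection[0].readers.add("bob")

print(find(collection, "bob", year=2005))
```

In this sketch bob sees his own 2005 file plus the one file alice shared with him, and never has to say which server holds either of them.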
The SRB servers that provide access to the archival resources, the
Metadata Catalogue (MCAT) and the SRB clients are the main elements of
an SRB domain. Files uploaded into the SRB are referenced by logical
file handles chosen by the user – a name that is meaningful to the
user. The MCAT maps these logical handles to the physical file
locations on individual resources, and stores the metadata associated
with the files, as well as information about the users and the physical
resources managed by the SRB. The MCAT is implemented in a relational
database such as Oracle or PostgreSQL and contains all the metadata and
information about the files and resources in the SRB domain. The MCAT
is usually associated with one SRB server (the MCAT-SRB server). A user
queries the catalogue about a specific file without knowing in which
system the file resides. SRB can be naturally integrated into a data
visualisation pipeline (Pailthorpe and Bordes 2000) for repeated
analysis of large, diverse, distributed data sets.
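The role of the MCAT described above, mapping a user-chosen logical handle to one or more physical locations and holding the associated metadata, can be sketched as a small in-memory catalogue. This is only an illustration under stated assumptions: the real MCAT is a relational database, and the Catalogue class, its method names, the resource names and the paths below are all hypothetical.

```python
class Catalogue:
    """Toy stand-in for the MCAT: logical handles mapped to physical replicas."""

    def __init__(self):
        self._replicas = {}  # logical handle -> list of (resource, physical path)
        self._metadata = {}  # logical handle -> attribute dictionary

    def register(self, handle, resource, path, **attributes):
        """Record one physical copy of a file and merge in any metadata given."""
        self._replicas.setdefault(handle, []).append((resource, path))
        self._metadata.setdefault(handle, {}).update(attributes)

    def locate(self, handle):
        """Return every known physical copy of the logical handle."""
        return list(self._replicas[handle])

    def attributes(self, handle):
        """Return the metadata stored for the logical handle."""
        return dict(self._metadata[handle])


mcat = Catalogue()
# The same logical file is stored on two hypothetical resources (a replica).
mcat.register("/home/alice/rainfall-2005", "disk-1", "/vault/0001/a9f3.dat",
              format="netcdf")
mcat.register("/home/alice/rainfall-2005", "tape-1", "/archive/77/a9f3.dat")

print(mcat.locate("/home/alice/rainfall-2005"))
print(mcat.attributes("/home/alice/rainfall-2005"))
```

The user only ever handles the logical name; which replica is read, and from which resource, is a decision the catalogue makes possible without exposing it.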
Data queries through the SRB
Figure 1 illustrates how the SRB works in a typical scenario (the
arrows show the typical flow of data).
A user needs a file stored in
‘Data Resource 2’, which could be a tape archive, a hard disk, etc.
However, the user does not have this information. All the user has to
do is log in and request the file by making a query via an SRB client.
What follows happens behind the scenes.
The SRB client contacts the local SRB Server A and requests the file (1).
Server A contacts the MCAT-enabled SRB Server B in order to find the
physical location of the file (2).
Server B looks up the information in the MCAT and passes it back to
Server A (3).
Server A redirects the request to SRB Server C because the file is
located on a resource maintained by SRB Server C (4).
Server C retrieves the file from Data Resource 2 (5 and 6).
Server C services the client directly (7).
The data held on these devices appear to the user as part of a single
file system. The full process is transparent to the user, who does not
need to know in which data resource the requested file is located.
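The numbered steps of this scenario can be traced with a small single-process simulation. It is a sketch only: the class and function names are hypothetical, no networking or real SRB calls are involved, and the data are returned through ordinary function calls, whereas in the scenario described here Server C answers the client directly.

```python
class McatServer:
    """Server B in the scenario: the server that can consult the catalogue."""

    def __init__(self, locations):
        self.locations = locations  # logical name -> name of the server holding it

    def lookup(self, logical_name):
        return self.locations[logical_name]


class StorageServer:
    """Server C: manages a data resource (a dict stands in for disk or tape)."""

    def __init__(self, resource):
        self.resource = resource  # logical name -> file contents

    def retrieve(self, logical_name, trace):
        trace.append("5-6: Server C retrieves the file from its data resource")
        data = self.resource[logical_name]
        trace.append("7: Server C services the client directly")
        return data


class LocalServer:
    """Server A: the client's entry point; it holds no data in this scenario."""

    def __init__(self, mcat_server, storage_servers):
        self.mcat_server = mcat_server
        self.storage_servers = storage_servers  # server name -> StorageServer

    def request(self, logical_name, trace):
        trace.append("2: Server A asks Server B where the file is")
        holder = self.mcat_server.lookup(logical_name)
        trace.append("3: Server B returns the location to Server A")
        trace.append(f"4: Server A redirects the request to {holder}")
        return self.storage_servers[holder].retrieve(logical_name, trace)


def client_get(local_server, logical_name):
    """The client names the file only; it never names a server or a resource."""
    trace = ["1: client asks Server A for the file"]
    data = local_server.request(logical_name, trace)
    return data, trace


# Hypothetical file name and contents, used only to drive the trace.
server_b = McatServer({"results/run-42.dat": "Server C"})
server_c = StorageServer({"results/run-42.dat": b"example bytes"})
server_a = LocalServer(server_b, {"Server C": server_c})

data, trace = client_get(server_a, "results/run-42.dat")
print("\n".join(trace))
```

Note that client_get receives only a logical name: everything about Server B, Server C and the data resource stays hidden behind Server A, which is the transparency the scenario describes.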
Figure 1: SRB components
Baru, C., R. Moore, A. Rajasekar and M. Wan 1998 The SDSC Storage Resource Broker. In
Proceedings of the 1998 Conference of
the Centre for Advanced Studies on Collaborative Research (CASCON),
Toronto, Canada, November 30-December 3, 1998, pp.5-17.
Moore, R., R. Marciano, M. Wan, T. Sherwin and R. Frost 1996 Towards the interoperability of web,
database, and mass storage technologies for petabyte archives.
In Proceedings of the Fifth
NASA GSFC Conference on Mass Storage Systems and Technologies, College
Park, Maryland, September 1996.
Moore, R. 2001 Data management
systems for scientific applications. In R. Boisvert and P. Tang
(eds), The Architecture of Scientific
Software, pp.273-284. Dordrecht: Kluwer Academic Publishers.
PostgreSQL Global Development Group
2006 PostgreSQL Global Development Group.
Rajasekar, A. and R. Moore 2001 Data
and metadata collections for scientific applications. In European High Performance Computing
Conference, Amsterdam, Holland, June 26 2001, pp.72-80.
Rajasekar, A., M. Wan and R. Moore 2002 MySRB and SRB - Components of a data grid.
In Proceedings of the 11th
International Symposium on High Performance Distributed Computing
(HPDC-11), Edinburgh, Scotland, July 24-26, 2002, pp.57-69.
University of California, San Diego 2006 The Storage
Wan, M., A. Rajasekar, R. Moore and P. Andrews 2003 A simple mass storage system for the SRB
data grid. In Proceedings of
the 20th IEEE/11th NASA Goddard Conference on Mass Storage Systems and
Technologies (MSST 2003), pp.20-25.