The Storage Resource Broker Project

"Enabling the Australian data grid"

What is the SRB?

The Storage Resource Broker (SRB) was developed by the San Diego Supercomputer Center at UC San Diego.

"The SRB is client-server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing replicated data sets. SRB, in conjunction with the Metadata Catalog (MCAT), provides a way to access data sets and resources based on their attributes rather than their names or physical locations."
SDSC SRB web site


SRB handles data through a client-server architecture and gives users location-independent access to data: the user does not need to know where files are located or how they are stored. Rather than exposing the physical layout, the SRB organises data logically into a single virtual file system, similar to the file system found on a stand-alone computer. Metadata stored about each file make it possible to query these distributed data.

Metadata – data about the data – allow users to quickly find a dataset and learn what it contains, its format, when or where the data were collected or created, etc. Metadata querying and browsing give user communities transparent access to each other’s data collections. Users can store or replicate their data collections across several servers to facilitate data preservation without losing local access control. Access permissions can be set on individual files so that other users can access them. Users wishing to access these collections view a single collection through the SRB middleware and do not need to know the physical location of the data.
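The idea of finding data by attribute while still respecting per-file permissions can be illustrated with a small Python sketch. This is a toy model only, not the SRB client API: the class DataObject, the function find_by_attributes, the user names and the file names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A toy stand-in for one file registered in a catalogue (hypothetical)."""
    logical_name: str
    owner: str
    metadata: dict = field(default_factory=dict)
    readers: set = field(default_factory=set)  # users granted read access

    def readable_by(self, user: str) -> bool:
        # The owner can always read; others need an explicit grant.
        return user == self.owner or user in self.readers


def find_by_attributes(objects, user, **wanted):
    """Return logical names of objects readable by `user` whose metadata
    match every requested attribute=value pair."""
    return [
        obj.logical_name
        for obj in objects
        if obj.readable_by(user)
        and all(obj.metadata.get(k) == v for k, v in wanted.items())
    ]


# Illustrative catalogue contents (made-up names and attributes).
catalogue = [
    DataObject("/home/alice/rainfall-2005.nc", "alice",
               {"format": "netcdf", "region": "NSW"}, readers={"bob"}),
    DataObject("/home/alice/rainfall-2006.nc", "alice",
               {"format": "netcdf", "region": "NSW"}),
    DataObject("/home/bob/survey.csv", "bob",
               {"format": "csv", "region": "NSW"}),
]

# Bob searches by attribute, not by file name or location.
bob_hits = find_by_attributes(catalogue, "bob", format="netcdf", region="NSW")
print(bob_hits)
```

In this sketch Bob sees only the one netCDF file that Alice has shared with him, even though two files match his query attributes.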

An SRB domain has three main elements: the SRB servers that provide access to the archival resources, the Metadata Catalogue (MCAT), and the SRB clients. Files uploaded into the SRB are referenced by logical file handles chosen by the user – names that are meaningful to the user. The MCAT maps these logical handles to the physical file locations on individual resources, and stores the metadata associated with the files, as well as information about the users and the physical resources managed by the SRB. The MCAT is implemented in a relational database such as Oracle or PostgreSQL, and is usually associated with one SRB server (the MCAT-SRB server). A user queries the catalogue about a specific file without knowing in which system the file resides. SRB can be naturally integrated into a data visualisation pipeline (Pailthorpe and Bordes 2000) for repeated analysis of large, diverse, distributed data sets.
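The mapping from logical handles to physical locations can be sketched as a simple lookup structure. The Python below is an in-memory illustration under stated assumptions, not the MCAT schema: the names Replica and ToyCatalogue, the server and resource names, and the paths are all hypothetical. A real MCAT keeps this information in relational tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One physical copy of a file (all field values below are made up)."""
    server: str         # server that manages the resource
    resource: str       # storage resource, e.g. a disk or a tape archive
    physical_path: str  # where the bytes actually live on that resource


class ToyCatalogue:
    """Maps user-chosen logical names to one or more physical replicas."""

    def __init__(self):
        self._replicas = {}  # logical name -> list of Replica

    def register(self, logical_name, replica):
        self._replicas.setdefault(logical_name, []).append(replica)

    def locate(self, logical_name):
        """Return all known replicas; raises KeyError for unknown names."""
        return list(self._replicas[logical_name])


mcat = ToyCatalogue()
mcat.register("/projects/climate/run42.dat",
              Replica("server-c", "data-resource-2", "/tape/vol7/0001f3a9"))
mcat.register("/projects/climate/run42.dat",
              Replica("server-a", "data-resource-1", "/disk0/srb/9c21"))

# The user only ever supplies the logical name.
replicas = mcat.locate("/projects/climate/run42.dat")
servers = [r.server for r in replicas]
print(servers)
```

The user-facing name stays the same however many replicas exist or wherever they are moved; only the catalogue entries change.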

Data queries through the SRB

Figure 1 illustrates how the SRB works in a typical scenario (the arrows show the typical flow of data).
A user needs a file stored in ‘Data Resource 2’, which could be a tape archive, a hard disk, etc. However, the user does not have this information. All the user has to do is log in and request the file by making a query via an SRB client. What follows happens behind the scenes.
The client contacts the local SRB Server A and requests the file (1).
Server A contacts the MCAT-enabled SRB Server B in order to find the physical location of the file (2).
Server B looks up the information in the MCAT and passes this information back to Server A (3).
Server A redirects the request to SRB Server C because the file is located on a resource maintained by SRB Server C (4).
Server C retrieves the file from Data Resource 2 (5 and 6).
Server C services the client directly (7).

The files held on these devices appear to the user as part of a single file system. The full process is transparent to the user, who does not need to know in which data resource the requested file is located.
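The numbered steps of this scenario can be traced in a short Python simulation. It is a minimal sketch assuming an in-memory model of the three servers: ToyServer, fetch, the server names and the file contents are hypothetical and do not correspond to real SRB calls or network protocols.

```python
class ToyServer:
    """Hypothetical in-memory model of an SRB server; not the real SRB API."""

    def __init__(self, name, resources=None, catalogue=None):
        self.name = name
        self.resources = resources or {}  # resource name -> {path: bytes}
        self.catalogue = catalogue        # set only on the MCAT-enabled server

    def lookup(self, logical_name):
        # Steps 2-3: the MCAT-enabled server resolves the logical name.
        return self.catalogue[logical_name]

    def read(self, resource, path):
        # Steps 5-6: fetch the bytes from the storage resource.
        return self.resources[resource][path]


def fetch(logical_name, local, mcat_server, servers, trace):
    """Follow steps 1-7 of the scenario, recording each hop in `trace`."""
    trace.append(f"1: client -> {local.name}")
    trace.append(f"2: {local.name} -> {mcat_server.name}")
    owner, resource, path = mcat_server.lookup(logical_name)
    trace.append(f"3: {mcat_server.name} -> {local.name}")
    trace.append(f"4: {local.name} -> {owner}")
    data = servers[owner].read(resource, path)
    trace.append(f"5-6: {owner} <-> {resource}")
    trace.append(f"7: {owner} -> client")
    return data


# Made-up catalogue entry: logical name -> (server, resource, physical path).
catalogue = {
    "/projects/climate/run42.dat":
        ("server-c", "data-resource-2", "/tape/vol7/0001f3a9"),
}
server_a = ToyServer("server-a")
server_b = ToyServer("server-b", catalogue=catalogue)
server_c = ToyServer(
    "server-c",
    resources={"data-resource-2": {"/tape/vol7/0001f3a9": b"temperature grid"}},
)
servers = {s.name: s for s in (server_a, server_b, server_c)}

trace = []
data = fetch("/projects/climate/run42.dat", server_a, server_b, servers, trace)
print(data.decode())
print("\n".join(trace))
```

The client code passes only the logical name; the server and resource that hold the file are discovered from the catalogue during the call.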


Figure 1: SRB components

References

Baru, C., R. Moore, A. Rajasekar and M. Wan 1998 The SDSC Storage Resource Broker. In Proceedings of the 1998 Conference of the Centre for Advanced Studies on Collaborative Research (CASCON), Toronto, Canada, November 30-December 3, 1998, pp.5-17.

Moore, R., R. Marciano, M. Wan, T. Sherwin and R. Frost 1996 Towards the interoperability of web, database, and mass storage technologies for petabyte archives. In Proceedings of the Fifth NASA GSFC Conference on Mass Storage Systems and Technologies, College Park, Maryland, September 1996.

Moore, R. 2001 Data management systems for scientific applications. In R. Boisvert and P. Tang (eds), The Architecture of Scientific Software, pp.273-284. Dordrecht: Kluwer Academic Publishers.

PostgreSQL Global Development Group 2006 PostgreSQL Global Development Group.

Rajasekar, A. and R. Moore 2001 Data and metadata collections for scientific applications. In European High Performance Computing Conference, Amsterdam, Holland, June 26 2001, pp.72-80.

Rajasekar, A., M. Wan and R. Moore 2002 MySRB and SRB - Components of a data grid. In Proceedings of the 11th International Symposium on High Performance Distributed Computing (HPDC-11), Edinburgh, Scotland, July 24-26, 2002, pp.57-69.

University of California, San Diego 2006 The Storage Resource Broker.

Wan, M., A. Rajasekar, R. Moore and P. Andrews 2003 A Simple Mass Storage System for the SRB Data Grid. In Proceedings of the 20th IEEE/11th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST 2003), pp.20-25.