Systems Seminar - CSE

Semantically-Smart Disk Systems

Remzi Arpaci-Dusseau

There is a knowledge gap in the world of storage. File systems have plenty of
high-level knowledge; for example, they know how files are laid out on disk,
which files are in which directories, and the current state of which blocks
are free and which are in use. Storage systems (e.g., RAIDs), in contrast,
have plenty of low-level knowledge; for example, they know about scheduling
algorithms, exact head positions, and the RAID scheme that is
employed. Unfortunately, many interesting classes of functionality demand both
high-level and low-level knowledge. In current systems, such knowledge is
difficult or impossible to obtain.

In this talk, I will present a new approach to unifying and exploiting these
disparate pieces of knowledge within what we call a "semantically-smart" disk
system (SDS). As opposed to a typical "smart" disk system, an SDS has
detailed knowledge of how the file system above is using the storage system,
including information about the file system on-disk data structures and
policies. An SDS exploits this knowledge to transparently improve performance
or enhance functionality beneath a standard block interface (e.g., SCSI).

I will discuss some of the important hurdles one must overcome in building an
SDS, including how an SDS can automatically obtain file-system-specific
information, and the overheads involved in doing so. I will then present a
number of case studies that demonstrate functionality or performance
improvements that would be difficult or impossible to implement within a more
traditional framework. In our initial work, we have found that a surprising
amount of functionality can be embedded within an SDS, hinting at a future
where disk manufacturers can compete on enhanced functionality and not simply
cost-per-byte and performance.

Remzi Arpaci-Dusseau is an assistant professor in the Department of Computer
Sciences at the University of Wisconsin, Madison. He received his B.S. in
Computer Engineering from the University of Michigan, Ann Arbor, and M.S. and
Ph.D. in Computer Science from the University of California, Berkeley. His
research interests include storage systems, operating systems, parallel and
distributed systems, high-performance applications, databases, and computer
architecture. Currently, Remzi is primarily interested in developing and
understanding the key technologies that will be crucial in building the next
generation of manageable, robust, and high-performance storage systems.

Sponsored by

SSL