Chapter 11: Distributed File Systems
- Introduction
- Case studies
  - NFS
  - Coda

Distributed File Systems
- A distributed file system enables clients to access files stored on one or more remote file servers
- A file service specifies what the file system offers
  - A file service is specified by a set of primitives or operations available to the user to access the service
- A file server is a process that implements the file service

File Service Models
- Remote access model
  - Work done at the server
  - Consistent sharing (+)
  - Server may be a bottleneck (-)
  - High communication cost (-)
- Upload/download model
  - Work done at the client
  - Low communication cost (+)
  - Consistency is harder to maintain (-)

Server Types
- Stateless servers
  - Server does not maintain any client state
  - Client must specify the location for each read/write and re-authenticate for each request
  - Can easily recover from failure (no need to restore any state)
- Stateful servers
  - Server provides open and close operations and maintains client state (e.g., files opened by each client, current read/write pointer for each file)
  - Client authenticates once at file open time and does not need to specify the location for read/write in the request message
  - Server must ensure that state can be recovered after a crash
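
The contrast drawn under "Server Types" can be sketched in a few lines of Python. This is a minimal illustration, not any real server's interface: the class names, the `credentials` string check, and the in-memory `files` dictionary are all assumptions made for the example.

```python
class StatelessServer:
    """Keeps no per-client state: every request is self-describing."""

    def __init__(self, files):
        self.files = files  # file name -> bytes

    def read(self, credentials, name, offset, count):
        # The caller is checked on every request (hypothetical check).
        if credentials != "alice":
            raise PermissionError(name)
        # The client supplies the location explicitly.
        return self.files[name][offset:offset + count]


class StatefulServer:
    """Tracks open files and a read pointer for each of them."""

    def __init__(self, files):
        self.files = files
        self.open_files = {}  # descriptor -> [file name, current position]
        self.next_fd = 0

    def open(self, credentials, name):
        # Authentication happens once, at open time.
        if credentials != "alice":
            raise PermissionError(name)
        fd = self.next_fd
        self.next_fd += 1
        self.open_files[fd] = [name, 0]
        return fd

    def read(self, fd, count):
        # No offset in the request: the server remembers the position.
        name, pos = self.open_files[fd]
        data = self.files[name][pos:pos + count]
        self.open_files[fd][1] = pos + len(data)
        return data

    def close(self, fd):
        del self.open_files[fd]
```

If the stateful server crashes, `open_files` is lost and must be rebuilt somehow; the stateless server has nothing to rebuild, which is the recovery advantage noted on the slide.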

Semantics of File Sharing
- UNIX semantics: a read operation returns the effect of the last write operation
  - Can be achieved in a distributed file system if there is only one file server and clients do not cache files
- Session semantics: changes to an open file are initially visible only to the process that modified the file. Only when the file is closed are the changes made visible to other processes.
  - Most distributed file systems make use of local caches and implement session semantics
  - What if two or more clients cache and modify the same file simultaneously?
    - The final result depends on whose close request is most recently processed by the server

Figure: (a) On a single machine, when a read follows a write, the value returned by the read is the value just written. (b) In a distributed system with caching, obsolete values may be returned.
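
The "whose close is processed last" outcome under session semantics can be illustrated with a small sketch. The `Server` and `Session` classes and the file name `notes` are hypothetical; whole-file copies stand in for a real client cache.

```python
class Server:
    def __init__(self):
        self.files = {"notes": "v0"}


class Session:
    """One client's open file under session semantics (illustrative only)."""

    def __init__(self, server, name):
        self.server = server
        self.name = name
        self.copy = server.files[name]  # private copy taken at open

    def write(self, data):
        self.copy = data  # visible only to this session

    def read(self):
        return self.copy

    def close(self):
        # Changes become visible to others only at close.
        self.server.files[self.name] = self.copy


server = Server()
a = Session(server, "notes")
b = Session(server, "notes")
a.write("from A")
b.write("from B")
b.close()
a.close()  # processed last, so A's version is what the server keeps
```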

Network File System (NFS)
- NFS is a distributed file system protocol developed by Sun Microsystems in 1984, allowing a user on a client computer to access files stored on a remote server much as local storage is accessed
- Client/server architecture
  - Client file system requests are forwarded to a remote server
  - File system requests are implemented as remote procedure calls (RPCs)
- NFS is OS independent: client and server implementations exist for almost all operating systems and platforms

NFS Architecture
- The virtual file system (VFS) layer is added to the UNIX kernel to allow applications to access different types of file systems in a uniform way
- VFS provides a standard file system interface and hides the difference between accessing local and remote file systems

Figure: The basic NFS architecture for UNIX systems.

NFS File System Model
- Files are hierarchically organized into a naming graph in which nodes represent directories and files
- A directory file contains the mappings between file names and file handles (i.e., unique file identifiers)
- To access a file, a client must first look up its name and obtain the associated file handle
- In NFSv3, servers are stateless
  - No open and close operations
  - Server must check permission on each read and write call
- In NFSv4, servers are stateful
  - Open and close operations are provided
  - Server checks permission at file open time

Table: An incomplete list of file system operations supported by NFS.
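
The name-to-handle lookup described under "NFS File System Model" can be sketched as follows. This is an assumption-laden toy: integer handles, the `objects` table, and the `resolve` helper are invented for illustration and do not reflect the actual NFS wire format.

```python
class Server:
    def __init__(self):
        # handle -> directory (name -> handle) or file contents (bytes)
        self.objects = {
            1: {"home": 2},
            2: {"readme.txt": 3},
            3: b"hello",
        }
        self.root_handle = 1

    def lookup(self, dir_handle, name):
        """Return the handle that `name` maps to inside a directory."""
        entry = self.objects[dir_handle]
        if not isinstance(entry, dict):
            raise NotADirectoryError(name)
        if name not in entry:
            raise FileNotFoundError(name)
        return entry[name]

    def read(self, handle, offset, count):
        return self.objects[handle][offset:offset + count]


def resolve(server, path):
    """Resolve a path one component at a time, as a client would."""
    handle = server.root_handle
    for component in filter(None, path.split("/")):
        handle = server.lookup(handle, component)
    return handle


server = Server()
handle = resolve(server, "/home/readme.txt")
data = server.read(handle, 0, 5)
```

Once the client holds the handle, later reads and writes name the file by handle rather than by path.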

RPCs in NFS
- In NFSv3, every operation is implemented as an RPC
- NFSv4 supports compound procedures by which several operations can be grouped into a single RPC
  - Better performance in wide area networks

Figure: (a) Reading data from a file in v3. (b) Reading data using a compound procedure in v4.

Naming in NFS
- NFS provides clients transparent access to a remote file system by letting a client mount (part of) a remote file system into its own local file system
  - A server can export a directory (i.e., make a directory and its entries available to clients)
  - An exported directory can be mounted into a client's local name space
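
The saving from compound procedures (see "RPCs in NFS") is in round trips, which a sketch can make concrete by counting them. Everything here is hypothetical: the method names, the tuple encoding of operations, and the restriction to LOOKUP and READ. Error handling is omitted.

```python
class Server:
    def __init__(self):
        self.rpc_count = 0  # number of request/reply round trips
        self.handles = {"data.txt": 7}
        self.contents = {7: b"abcdef"}

    def _lookup(self, name):
        return self.handles[name]

    def _read(self, handle, offset, count):
        return self.contents[handle][offset:offset + count]

    # v3 style: each operation is its own RPC.
    def rpc_lookup(self, name):
        self.rpc_count += 1
        return self._lookup(name)

    def rpc_read(self, handle, offset, count):
        self.rpc_count += 1
        return self._read(handle, offset, count)

    # v4 style: several operations travel in one RPC.
    def rpc_compound(self, ops):
        self.rpc_count += 1
        current = None  # the "current file handle" carried between operations
        results = []
        for op, *args in ops:
            if op == "LOOKUP":
                current = self._lookup(*args)
                results.append(current)
            elif op == "READ":
                results.append(self._read(current, *args))
            else:
                raise ValueError(op)
        return results


v3 = Server()
h = v3.rpc_lookup("data.txt")
v3_data = v3.rpc_read(h, 0, 3)  # two round trips in total

v4 = Server()
v4_results = v4.rpc_compound([("LOOKUP", "data.txt"), ("READ", 0, 3)])  # one round trip
```

On a wide area network, where each round trip is expensive, halving the number of messages is where the performance gain comes from.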

Synchronization in NFS
- NFS provides two ways of synchronizing access to shared files
  - Use locks
  - Share reservation

File Locking
- A client can request a read lock or a write lock for a specific range of bytes in a file
- Locks are granted for a specific time, i.e., they have an associated lease
  - When the lease on a lock expires, the server removes the lock (nonblocking)

Table: NFSv4 operations related to file locking.
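
Byte-range locks with leases, as described under "File Locking", can be sketched like this. The `LockManager` class, the half-open `[start, end)` ranges, and the explicit `now` argument (used instead of a real clock so the behavior is easy to follow) are assumptions of the example, not NFSv4 definitions.

```python
class LockManager:
    """Nonblocking byte-range locks that expire with a lease (sketch)."""

    def __init__(self, lease_seconds=30.0):
        self.lease_seconds = lease_seconds  # hypothetical default
        self.locks = []

    def _drop_expired(self, now):
        self.locks = [l for l in self.locks if l["expires"] > now]

    def lock(self, client, kind, start, end, now):
        """Try to lock bytes [start, end); return True if granted."""
        self._drop_expired(now)
        for held in self.locks:
            overlaps = start < held["end"] and held["start"] < end
            exclusive = "write" in (kind, held["kind"])
            if overlaps and exclusive and held["client"] != client:
                return False  # nonblocking: refuse instead of waiting
        self.locks.append({
            "client": client, "kind": kind, "start": start, "end": end,
            "expires": now + self.lease_seconds,
        })
        return True

    def renew(self, client, now):
        """Extend the lease on every unexpired lock held by `client`."""
        self._drop_expired(now)
        for held in self.locks:
            if held["client"] == client:
                held["expires"] = now + self.lease_seconds

    def unlock(self, client, start, end):
        self.locks = [
            l for l in self.locks
            if not (l["client"] == client and l["start"] == start and l["end"] == end)
        ]
```

Read locks on overlapping ranges coexist; a write lock conflicts with any overlapping lock held by another client. A client that stops renewing simply loses its locks when the lease runs out, so a crashed client cannot block others indefinitely.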

Share Reservation
- Share reservation is an implicit way to lock a file
- When a client opens a file, it specifies the type of access it requires and the type of access the server should deny other clients
- The open operation fails if the server cannot meet the client's requirements
- A share reservation is similar to a lock, except that its granularity is an entire file and its lifetime equals the duration of the file open

Figure: The result of an open operation with share reservations in NFSv4. (a) When the client requests shared access given the current denial state. (b) When the client requests a denial state given the current file access state.
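
The two checks an open must pass under share reservations can be written as bitmask tests. This is a sketch under assumed names: `READ`, `WRITE`, `NONE`, and the `SharedFile` class are illustrative, not the NFSv4 constants.

```python
READ, WRITE = 1, 2
NONE = 0

class SharedFile:
    def __init__(self):
        self.opens = []  # one (access, deny) pair per current open

    def open(self, access, deny):
        """Return True if the open succeeds, False if it must fail."""
        for held_access, held_deny in self.opens:
            # (a) The requested access is denied by an existing open.
            if access & held_deny:
                return False
            # (b) The requested denial conflicts with access already granted.
            if deny & held_access:
                return False
        self.opens.append((access, deny))
        return True

    def close(self, access, deny):
        # The reservation lasts exactly as long as the open.
        self.opens.remove((access, deny))


f = SharedFile()
first = f.open(READ, WRITE)   # read access, deny writers
second = f.open(READ, NONE)   # another reader is fine
third = f.open(WRITE, NONE)   # fails: writing is denied by the first open
fourth = f.open(READ, READ)   # fails: readers already have the file open
```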

Client-Side Caching
- Client can cache data (file data, file attributes, file handles, directories) previously read from the server
- In NFSv3, client-side caching is left to the implementation
  - Most implementations never guaranteed consistency (cached data could be stale for up to 30 seconds)
- NFSv4 supports two different approaches for caching file data, effectively implementing the upload/download model of file service
  - Implementing session semantics
  - Open delegation

Implementing Session Semantics
- After a client opens a file, it caches the data it obtains from the server as the result of various read operations
- Write operations can be carried out in the cache
- Modified data in the cache must be flushed back to the server when the file is closed
- When a client opens a previously closed file that has been (partly) cached, the client must revalidate the cached data by checking when the file was last modified
  - The cache is invalidated if it contains stale data
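
The open, write, close, and revalidate steps under "Implementing Session Semantics" can be sketched with a whole-file cache keyed by modification time. The classes, the logical `clock` counter used as a timestamp, and the `fetches` counter are assumptions made for illustration.

```python
class Server:
    def __init__(self):
        self.data = {"f": b"v1"}
        self.mtime = {"f": 1}
        self.clock = 1  # logical clock standing in for real timestamps

    def getattr(self, name):
        return self.mtime[name]

    def read(self, name):
        return self.data[name]

    def write(self, name, data):
        self.clock += 1
        self.data[name] = data
        self.mtime[name] = self.clock


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}   # name -> (data, mtime seen when cached)
        self.dirty = set()
        self.fetches = 0  # how often data was fetched from the server

    def open(self, name):
        # Revalidate: compare the cached mtime with the server's.
        current = self.server.getattr(name)
        cached = self.cache.get(name)
        if cached is None or cached[1] != current:
            self.fetches += 1
            self.cache[name] = (self.server.read(name), current)

    def read(self, name):
        return self.cache[name][0]

    def write(self, name, data):
        # Writes go to the cache only.
        self.cache[name] = (data, self.cache[name][1])
        self.dirty.add(name)

    def close(self, name):
        # Flush modified data back when the file is closed.
        if name in self.dirty:
            self.server.write(name, self.cache[name][0])
            self.cache[name] = (self.cache[name][0], self.server.getattr(name))
            self.dirty.discard(name)
```

A client that reopens an unchanged file pays only for the attribute check; a client whose copy is stale refetches the data.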

Open Delegation
- Delegation is a technique by which the server delegates the management of a file to a client
- At OPEN, the server may provide the client either a read or a write delegation for the file
  - If granted a read delegation, the client is assured that no other client has the ability to write to the file for the duration of the delegation (read delegations can be granted to multiple clients at the same time)
  - If granted a write delegation, the client is assured that no other client has read or write access to the file (a write delegation can be granted to only one client)
- While holding a delegation, the client handles various operations (OPEN, CLOSE, READ, WRITE, LOCK, LOCKU) locally without sending them to the server
  - This greatly reduces the interactions between the server and the client for delegated files, leading to better performance in wide area networks

Callbacks
- If another client requests access to the file that conflicts with the granted delegation, the server contacts the initial client and recalls the delegation
- Upon return of the delegation, the server will centrally manage various file operations
- Server uses a callback mechanism to recall the delegation
  - A callback is an RPC from the server to the client
  - Server must keep track of clients to which it has delegated a file
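
Granting and recalling delegations can be sketched as follows. The classes, the direct method call standing in for the callback RPC, and the policy of never re-delegating a file after a recall are simplifying assumptions, not the NFSv4 rules.

```python
class Client:
    def __init__(self, name):
        self.name = name
        self.delegations = {}  # file -> "read" or "write"

    def recall(self, file):
        """Callback from the server (an RPC in a real system)."""
        self.delegations.pop(file, None)


class Server:
    def __init__(self):
        self.delegated = {}   # file -> {client: kind}; the server tracks holders
        self.central = set()  # files now managed by the server itself

    def open(self, client, file, want):
        """Return the delegation granted ("read"/"write"), or None."""
        if file in self.central:
            return None
        holders = self.delegated.setdefault(file, {})
        # A conflict exists if another client holds a delegation and
        # either side involves writing.
        conflicts = [c for c, kind in holders.items()
                     if c is not client and "write" in (want, kind)]
        if conflicts:
            for holder in conflicts:
                holder.recall(file)
                del holders[holder]
            self.central.add(file)  # simplification: no re-delegation
            return None
        holders[client] = want
        client.delegations[file] = want
        return want


server = Server()
a, b, c = Client("a"), Client("b"), Client("c")
ra = server.open(a, "f", "read")   # read delegation granted
rb = server.open(b, "f", "read")   # a second reader also gets one
rc = server.open(c, "f", "write")  # conflict: both delegations are recalled
```

The `delegated` table is the bookkeeping the last bullet refers to: without it the server would not know whom to call back.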