Storage / SAN / NAS Jarle Bjørgeengen University of Oslo / USIT October 18, 2011
I'm available in room PS223 on Fridays, except in weeks when I have lectures on other weekdays (like this week).
Discuss topics related to:
- Storage
- Performance
- Unix/Linux
- Configuration mgmt
- Virtualization / Cloud
- Etc.
E-mail: jarle.bjorgeengen@usit.uio.no
Outline
- About USIT
- About data storage
- SAN introduction
- NAS introduction
- Types of NAS
Key points about USIT
- Approximately 50 000 file, print, mail and web-app users at UiO, with different privileges.
- Dev. and op. of the FS/Studentweb application used by most universities and colleges.
- Dev. and op. of Cerebrum, the glue that ties together all person/user/machine information.
- Dev. and op. of the national authentication service "Moria".
- Op. of mail and file backends for Classfronter for the Nordic countries.
- Data storage for the LHC at CERN.
- Op. of HPC clusters for research number-crunching.
About data storage
About data storage
- Computers need storage
- Early days: punch cards, then disk media (magnetic/optical)
  - increasing storage capacity and performance
  - decreasing physical size
- Now: high-density magnetic hard disks are (still) dominant
- Applications need performance (storage performance directly affects app. perf.)
- Applications create workload
  - Different applications create different workload types.
  - Storage needs to satisfy vastly varying workloads simultaneously.
- Cost reduction requires resource sharing (consolidation)
- Resource sharing introduces further risk (Why?)
Storage Area Network - SAN
[Figure labels: Consumers, QoS bridges, SAN, Virtual disks, Centralized storage pool, Shared physical resources]
Storage Area Network - SAN
- Flexible, sharable pool of block storage.
- Disk virtualization.
- Used for consolidation (centralization of resources).
- Clusters need shared disks.
- Shared disk introduces risk (Why?)
- Physically located outside of the server.
- Interconnected through a network medium (with switches).
- A protocol for block access on top (SCSI/ATA...)
Storage Area Network - SAN
- Uses RAID for disk redundancy and performance.
- Varying degree of component redundancy (cache, controller, buses, etc.)
- You get what you pay for.
- Cost increases exponentially when approaching 100% uptime while keeping performance.
- Intelligent applications can compensate for errors, hence cheaper, less reliable storage is possible (Hello Google).
- Cost of downtime vs. cost of insurance against downtime (redundancy).
- Cost vs. performance vs. availability.
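The trade-off between redundancy and uptime can be made concrete with a little arithmetic. The sketch below is only an illustration: it assumes component failures are independent, and the 99% per-controller availability is a made-up figure, not a number from any vendor.

```python
HOURS_PER_YEAR = 24 * 365


def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components.

    Assumes independent failures: the group is down only when
    every copy is down at the same time.
    """
    return 1.0 - (1.0 - component_availability) ** copies


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # 0.99 is a hypothetical availability for a single controller.
    for copies in (1, 2, 3):
        a = parallel_availability(0.99, copies)
        print(f"{copies} controller(s): availability {a:.6f}, "
              f"about {downtime_hours_per_year(a):.2f} h downtime/year")
```

Each extra copy removes most of the remaining downtime but costs as much as the first one, which is one way to see why the price climbs so steeply near 100% uptime.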
Typical (FC) SAN layout - simplified
[Figure labels: hosts, ZONE A, FC switches, ARRAY 1, ARRAY 2]
Storage Area Network - SAN?
- Slices of disk (virtual disks/LUNs)
- Host "sees" it as a local disk (/dev/sda, /dev/sdb and so on in Linux)
- Limit access between initiators
  - Login process in iSCSI
  - Present only to WWN1, WWN2, and so on
  - Zones in FC switches: which WWNs can see each other
- Risks introduced by lack of access control?
  - 2 or more hosts can see the same virtual disk
  - The hosts need to behave (coordinate writing, cluster SW)
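As a rough model of the two access-control layers mentioned here (zones in the switches, and presenting a LUN only to selected WWNs), the following sketch checks which initiators can reach a virtual disk. The class, the method names and the WWN strings are all hypothetical; real arrays and switches are configured through their own tools.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class Fabric:
    # zone name -> WWNs that are members of the zone
    zones: Dict[str, Set[str]] = field(default_factory=dict)
    # LUN id -> initiator WWNs the array presents the LUN to
    lun_masks: Dict[str, Set[str]] = field(default_factory=dict)

    def zoned_together(self, wwn_a: str, wwn_b: str) -> bool:
        """True if some zone contains both WWNs."""
        return any(wwn_a in members and wwn_b in members
                   for members in self.zones.values())

    def can_access(self, initiator: str, target: str, lun: str) -> bool:
        """Both the zone and the LUN mask must allow the initiator."""
        return (self.zoned_together(initiator, target)
                and initiator in self.lun_masks.get(lun, set()))

    def initiators_seeing(self, lun: str, target: str,
                          initiators: Iterable[str]) -> List[str]:
        """Initiators that can reach the LUN; more than one means the
        hosts must coordinate their writes (cluster software)."""
        return sorted(i for i in initiators
                      if self.can_access(i, target, lun))


if __name__ == "__main__":
    fabric = Fabric(
        zones={"ZONE_A": {"wwn-host1", "wwn-host2", "wwn-array1"}},
        lun_masks={"lun0": {"wwn-host1"},
                   "lun1": {"wwn-host1", "wwn-host2"}},
    )
    hosts = ["wwn-host1", "wwn-host2"]
    for lun in ("lun0", "lun1"):
        print(lun, fabric.initiators_seeing(lun, "wwn-array1", hosts))
```

In this toy setup `lun1` is visible to two hosts, which is exactly the situation where the hosts "need to behave".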
Host OS considerations?
- Stable and working driver for the HBA
  - Used to be a support/certification nightmare
  - Now HBA vendors make drivers available upstream
- Multiple paths (several approaches)
  - Built into the FC driver (only failover)
  - Separate MP driver on top (dm-multipath)
  - dm-multipath is mostly used now. Flexible and works well.
- Storage vendors push their own drivers and agents.
  - Advantages / disadvantages?
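The idea behind a multipath layer can be sketched in a few lines: spread I/O over the healthy paths and skip paths that have failed. This is a toy model only, not how dm-multipath is implemented; the class name and the path names are assumptions made for the example.

```python
class MultipathDevice:
    """Round-robin over healthy paths, with failover past failed ones."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.failed = set()
        self._next = 0  # index of the path to try next

    def fail(self, path):
        self.failed.add(path)

    def restore(self, path):
        self.failed.discard(path)

    def pick_path(self):
        """Return the next healthy path, or raise if none is left."""
        for _ in range(len(self.paths)):
            path = self.paths[self._next]
            self._next = (self._next + 1) % len(self.paths)
            if path not in self.failed:
                return path
        raise OSError("all paths to the device are down")


if __name__ == "__main__":
    dev = MultipathDevice(["sda", "sdb"])  # two paths to the same LUN
    print([dev.pick_path() for _ in range(3)])
    dev.fail("sda")                        # e.g. one FC switch goes away
    print([dev.pick_path() for _ in range(2)])
```

A failover-only driver would correspond to always returning the first healthy path instead of rotating.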
Hot topics in storage?
- SSD is used increasingly
  - Intelligent caching
  - Automatic tiering
    - Usually 3-4 tiers: SSD, FC (15k), SAS (10k), SATA (7.2k)
    - Different approaches regarding
      - Estimation of what needs to be moved
      - Granularity of workload profiling
- Distributed network file systems for linear scalability in capacity and performance
- Appliance bundling (Oracle Exadata, EMC Vblock, etc.)
- Thin provisioning
  - Thin write?
  - FS / application awareness?
Sub-LUN tiering / Autotiering
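One simple way to picture sub-LUN tiering: count I/O per extent, rank the extents, and fill the fastest tier first. The sketch below does only that. Real arrays differ precisely in how they estimate what to move and at what granularity, so the function name, the extent ids and the tier capacities are illustrative assumptions.

```python
def place_extents(access_counts, tier_capacities):
    """Assign extents to tiers, hottest extents to the fastest tier.

    access_counts:   dict mapping extent id -> observed I/O count
    tier_capacities: list of (tier name, capacity in extents),
                     ordered from fastest to slowest tier
    """
    # Hottest first; ties are broken by extent id to keep it deterministic.
    ranked = sorted(access_counts, key=lambda e: (-access_counts[e], e))
    placement = {}
    position = 0
    for tier_name, capacity in tier_capacities:
        for extent in ranked[position:position + capacity]:
            placement[extent] = tier_name
        position += capacity
    if position < len(ranked):
        raise ValueError("not enough capacity for all extents")
    return placement


if __name__ == "__main__":
    counts = {"e0": 900, "e1": 5, "e2": 120, "e3": 40}
    tiers = [("SSD", 1), ("FC 15k", 1), ("SATA", 2)]
    for extent, tier in sorted(place_extents(counts, tiers).items()):
        print(extent, "->", tier)
```

Smaller extents let the array move only the hot part of a LUN, at the cost of more bookkeeping per extent.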
Network Attached Storage - NAS
- Collective term for accessing files over an IP network.
- Using NAS involves "mounting" remote filesystems, and user authentication / authorization.
- Typical usage:
  - Making home directories available across many machines.
  - Group collaboration at file level.
  - File archive (WORM / policy-based retention)
- Untypical usage:
  - Shared storage for clustering (SAN is typical for that)
Types of NAS: NFS
- NFS (Network File System), developed by Sun
- Utilizes IP (TCP or UDP)
- Heavily based on RPC (remote procedure calls)
- Available on any Unix/Linux
- Versions 2, 3 and 4
  - V2 is old and insecure; UDP and synchronous writes only.
  - V3 adds support for asynchronous writes and TCP. Also insecure.
  - V4 is an IETF standard, secure, TCP only, has implementations for Windows and supports Kerberos auth. [1]
  - V4 consolidates a number of protocols.
[1] http://www.nuug.no/aktiviteter/20100413-kerberos/
NFS server / client
- Server has 3 daemons
  - mountd - authorization / rejection of client mount requests
  - nfsd - data transfer
  - lockd - file locking (advisory locking. What does it mean?)
- Client sends a mount request to the server
- If allowed, the client operates on it like any local FS.
- NB: User IDs must match.
  - The client root user is mapped to "nobody" on the server. (Why?)
  - This can be turned off with the no_root_squash option.
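To get a feel for what "advisory" means, the sketch below takes an exclusive lock on a local temporary file and then writes to the same file through a second handle that never asks for the lock. It uses flock() on a local file purely as an illustration (Linux/Unix only); it does not talk to lockd, and the function name is made up for the example.

```python
import fcntl
import tempfile


def advisory_lock_demo():
    """Return (second_lock_granted, file_content) for a local temp file."""
    with tempfile.NamedTemporaryFile() as tmp:
        holder = open(tmp.name, "r+")  # cooperating program: takes the lock
        other = open(tmp.name, "r+")   # second, independent open of the file
        fcntl.flock(holder, fcntl.LOCK_EX)
        try:
            # A well-behaved program asks for the lock first and is refused.
            fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
            second_lock_granted = True
        except BlockingIOError:
            second_lock_granted = False
        # A badly behaved program just writes; nothing stops it.
        other.write("written without holding the lock\n")
        other.flush()
        holder.seek(0)
        content = holder.read()
        fcntl.flock(holder, fcntl.LOCK_UN)
        holder.close()
        other.close()
        return second_lock_granted, content


if __name__ == "__main__":
    granted, content = advisory_lock_demo()
    print("second lock granted:", granted)
    print("file content:", content.strip())
```

The lock only protects against programs that bother to check it, which is why all writers to a shared file have to cooperate.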
Configuring NFS: Server
- Install nfs-common, nfs-server
- Start daemons (/etc/init.d/nfs-server start)
- Edit /etc/exports (man 5 exports)
- exportfs -a (check with exportfs or showmount)
- Statistics: nfsstat
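An /etc/exports entry is a path followed by one or more client specifications, each with its options in parentheses. The helper below just formats such a line so the structure is visible. The function name, the host name and the network are hypothetical; the options shown (rw, ro, sync, root_squash) are described in man 5 exports.

```python
def exports_line(path, clients):
    """Format one /etc/exports entry.

    path:    exported directory on the server
    clients: dict mapping a client spec (host, network, ...) to a
             list of export options for that client
    """
    parts = [path]
    for client, options in clients.items():
        # No space between the client and its option list: a space
        # there would change the meaning of the entry.
        parts.append(f"{client}({','.join(options)})")
    return " ".join(parts)


if __name__ == "__main__":
    print(exports_line(
        "/exported/fs",
        {
            "client1.example.org": ["rw", "sync", "root_squash"],
            "192.0.2.0/24": ["ro", "sync"],
        },
    ))
```

After adding a line like this to /etc/exports, `exportfs -a` makes the server pick it up.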
Configuring NFS: Client
- Install nfs-common, nfs-client, portmapper.
- mount server:/exported/fs /local/mount/point
- or edit fstab and run mount /local/mount/point or mount -a
- Check with df, mount, and try file operations (ls, touch, cat, vi, cp, rm, mv)
- Verify identical user IDs in /etc/passwd, or use centralized UID lookup (LDAP) (Other options?)
- Not working?
  - Portmapper running on the client?
  - Any firewalls in between, or local? (iptables -L on both)
  - SELinux / AppArmor
  - Log files on the server
  - tcpdump
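Checking that user IDs agree between client and server can be automated by comparing the two passwd files. The sketch below works on passwd-formatted text, so it can be tried without touching any real system; the function names and the sample accounts are made up for the example.

```python
def parse_passwd(text):
    """Return {username: uid} from passwd-formatted text."""
    uids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3:
            continue  # not a valid passwd entry
        uids[fields[0]] = int(fields[2])
    return uids


def uid_mismatches(client_passwd, server_passwd):
    """Users present on both sides whose UIDs differ.

    Returns {username: (client_uid, server_uid)}.
    """
    client = parse_passwd(client_passwd)
    server = parse_passwd(server_passwd)
    return {name: (client[name], server[name])
            for name in sorted(client.keys() & server.keys())
            if client[name] != server[name]}


if __name__ == "__main__":
    client = "alice:x:1000:1000::/home/alice:/bin/bash\n" \
             "bob:x:1001:1001::/home/bob:/bin/bash\n"
    server = "alice:x:1000:1000::/home/alice:/bin/bash\n" \
             "bob:x:1005:1005::/home/bob:/bin/bash\n"
    print(uid_mismatches(client, server))
```

With mismatched UIDs the files on the mounted filesystem appear to belong to the wrong user, since the plain UID numbers are what the server sees.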
Instances of NAS: SMB and CIFS
- Server Message Block protocol
- CIFS = SMB (network file access anyway)
- Originates from IBM. Modified by Microsoft.
- Commonly used for integrating Windows and Linux environments.
- SAMBA for Unix/Linux
  - Client and server
  - Can mount Windows shares
  - Can serve files to Windows clients.
  - Similar to AD when combined with MIT Kerberos and OpenLDAP.
- Built-in file/print service (file share) for Windows Server.
- Many other implementations (NetApp, FreeNAS, Veritas, EMC, etc.)
SAMBA Server
- The server provides 5 basic services
  - File sharing
  - Printer sharing
  - Authentication / authorization
  - Name resolution (through WINS)
  - Service announcement
- Behavior is defined in smb.conf
- Parallel user/pw database, managed by smbpasswd
SAMBA Client
- Windows: net use X: \\server\share
- Linux/Unix: smbmount / mount -t cifs / fstab + mount -a
- Must authenticate: use credentials=<file> (owner-only permissions, e.g. mode 600)
- Samba utilities:
  - smbstatus - info about smbd connections.
  - smbclient - display observed shares on the server.
  - smbtar - backup of shares.
  - SWAT - graphical (web GUI) configuration of SAMBA
    - inetd/xinetd service. Listens on port 901 by default.
- Local/remote UID does not matter.
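The credentials file used with mount -t cifs is a small text file with username= and password= lines, and it should be readable by its owner only. The sketch below writes such a file with mode 0600 (the file does not need to be executable). The function name, path and account details are placeholders.

```python
import os
import stat
import tempfile


def write_credentials(path, username, password):
    """Write a CIFS credentials file readable by the owner only."""
    # Create the file with restrictive permissions from the start,
    # so the password is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"username={username}\n")
        handle.write(f"password={password}\n")
    # Enforce the mode even if the file already existed.
    os.chmod(path, 0o600)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "smb-credentials")
        write_credentials(path, "exampleuser", "not-a-real-password")
        mode = stat.S_IMODE(os.stat(path).st_mode)
        print(f"{path}: mode {mode:o}")
```

The file is then passed to the mount with the credentials=<file> option, which keeps the password out of fstab and out of the process list.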
Configuring SAMBA
- Install the SAMBA server on the server machine
- Edit smb.conf (man 5 smb.conf)
- Run testparm to check syntax.
- Start/reload the service.
- Mount filesystems on the client(s)
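smb.conf is an INI-style file with a [global] section and one section per share. As a minimal sketch, the snippet below renders such a file with Python's configparser. The workgroup, share name, path and user names are placeholders, and a real configuration would normally need more settings; always check the result with testparm.

```python
import configparser
import io


def render_smb_conf(workgroup, share_name, share_path, valid_users):
    """Render a minimal smb.conf with one file share as a string."""
    # interpolation=None so that '%' (used by smb.conf variables)
    # is written through unchanged.
    conf = configparser.ConfigParser(interpolation=None)
    conf["global"] = {"workgroup": workgroup}
    conf[share_name] = {
        "path": share_path,
        "read only": "no",
        "valid users": " ".join(valid_users),
    }
    buffer = io.StringIO()
    conf.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_smb_conf("WORKGROUP", "projects",
                          "/srv/samba/projects", ["alice", "bob"]))
```

Users listed under `valid users` still need an entry in the separate Samba password database, managed with smbpasswd.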