Archiving and Managing Remote Sensing Data Using State-of-the-Art Storage Technologies
Ms B Lakshmi, C Chandrasekhar Reddy, SVSRK Kishore
SDAPSA, NRSC, Hyderabad
NRSC Functions
  - Remote sensing data acquisition, processing, dissemination and analysis
IT Infrastructure Consolidation
  - Storage consolidation and virtualization
  - Network consolidation
  - Server consolidation
  - Security
Considerations for Storage Consolidation
Types of storage use:
  - For online processing or archival
  - High performance vs. cost effectiveness
  - Write once/read many or continuous read/write
  - Frequently accessed or infrequently accessed
  - File data or RDBMS data
  - Heterogeneous platform access
  - Multiple users or single user/limited users
Data acquisition processes and data processing chains require high-performance storage suitable for online processing.
Archival data is mainly write-once, read-many.
Archival data from the past year may be accessed more frequently than data from earlier years.
Hence the need for multi-tiered, network-consolidated storage.
IMGEOS Architecture (diagram)
  - Data reception: Data Reception Facility (4 antenna systems), Data Electronics, Station Automation Systems
  - Servers on Gigabit Ethernet: ADP Servers (3), Value Addition Servers (2), Metadata Servers (4), EMC (2), Version Control Server, WFM Servers (3), NMS, Media Gen., DI Servers / DP Servers (11) (4), PQC Server, O2/C2 DP Servers (5), OCM, DQE Server
  - Storage: Storage Area Network, NAS, SAN Storage
  - Clients: Work Stations (41), Mgmt Consoles
  - Other systems: Data Exchange Gateway, FTP Server, RISAT-2 Chain, Test & Dev Systems (4)
  - External links through routers and firewall: NRSC Balanagar (20 Mbps), ISTRAC (2 Mbps), SAC (2), ISAC, 2 Mbps, 25 Mbps Ethernet
Storage Tiering
  - Tier-1, high-performance storage (~100 TB): workspace for the data processing chain, recent 3 months of archival data, reference data, RDBMS data
  - Tier-2, less expensive storage (~400 TB): past year's (4-15 months) archival data
  - Tier-3, tape library: pre-existing archival data
  - The tiered storage serves the data processing servers and other servers and subsystems
  - Shared file system implementation to facilitate efficient file sharing
Storage Sizing
  - Provide 75 TB of high-speed online FC disk storage for data from all satellites (latest three months) to facilitate faster generation of satellite data products.
  - Provide about 10 TB of high-speed disk space for data processing systems for generation of products.
  - Provide about 10 TB of high-speed disk space for databases such as GCPL, CartoDEM and TCPL.
Storage Sizing (contd.)
  - Provide cost-effective online disk storage (400 TB) for recently acquired data from all satellites (latest 4-15 months) to facilitate generation of satellite data products.
  - Two copies of online tape archival for all satellite data acquired.
  - Two copies of data backup: local site and DR (remote site).
  - The data available till 2011 was about 900 TB, held on off-line DLTs.
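The disk figures on the sizing slides can be cross-checked with simple arithmetic. The sketch below only sums the allocations quoted above; the dictionary keys and variable names are illustrative and do not come from the source.

```python
# High-speed (Tier-1) allocations quoted on the sizing slides, in TB.
tier1_allocations_tb = {
    "latest_3_months_satellite_data": 75,
    "data_processing_workspace": 10,
    "databases": 10,
}

# Cost-effective online disk (Tier-2) for 4-15 month old data, in TB.
tier2_tb = 400

tier1_total_tb = sum(tier1_allocations_tb.values())
online_disk_total_tb = tier1_total_tb + tier2_tb

print(f"Tier-1 high-speed disk: {tier1_total_tb} TB")
print(f"Total online disk:      {online_disk_total_tb} TB")
```

The three high-speed allocations add up to slightly under the roughly 100 TB quoted for Tier-1, which leaves a small margin.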
Data Acquisition Infrastructure (diagram)
  - Terminals 1-4 feed the data acquisition electronics and switching matrix
  - Acquisition rate: 2 x 320 Mbps
  - Acquisition Servers 1-4 write to the network storage system (for archival, data processing, etc.)
  - Storage network performance: 4/8 Gbps (switched)
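To relate the acquisition rate to storage load, the following sketch converts 2 x 320 Mbps into a data volume per pass and compares it with a single 4 Gbps link. The 10-minute pass duration is an assumption chosen only for illustration; it is not stated in the source.

```python
MBPS_PER_CHANNEL = 320   # from the slide
CHANNELS = 2             # from the slide
LINK_MBPS = 4000         # the 4 Gbps figure from the slide
PASS_MINUTES = 10        # assumption: illustrative pass duration


def pass_volume_gb(mbps_per_channel: float, channels: int, minutes: float) -> float:
    """Data volume of one pass in decimal gigabytes (1 GB = 1000 MB)."""
    megabits = mbps_per_channel * channels * minutes * 60
    return megabits / 8 / 1000


aggregate_mbps = MBPS_PER_CHANNEL * CHANNELS
volume_gb = pass_volume_gb(MBPS_PER_CHANNEL, CHANNELS, PASS_MINUTES)
link_utilisation = aggregate_mbps / LINK_MBPS

print(f"Aggregate acquisition rate: {aggregate_mbps} Mbps")
print(f"Volume for a {PASS_MINUTES}-minute pass: {volume_gb:.1f} GB")
print(f"Share of one 4 Gbps link: {link_utilisation:.0%}")
```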
IMGEOS Storage Architecture (diagram)
  - FC Storage: 75 TB
  - SATA Storage: 2 x 200 TB
  - Tape Library
  - Metadata Servers: Cluster 01 and Cluster 02
  - Highly available SAN switches
  - Highly available Gigabit switches
  - Clients: Linux servers, Windows workstations
Multi-Tiered Storage Products
  - FC Storage (EMC Symmetrix): 100 TB (Tier-1)
  - SATA Storage (EMC CLARiiON): 400 TB (Tier-2)
  - Online Tape Storage (SL 8500): 1.5 PB x 2 (Tier-3)
  - Vaulted Tape Capacity: 1.5 PB
  - Metadata Servers: Dell servers
  - SAN File System: Quantum StorNext
  - HSM/ILM Software: Quantum StorNext
IMGEOS Storage
  - FC storage consists of 295 x 450 GB 15k RPM disks and provides 75 TB usable capacity after RAID overheads.
  - Two SATA storage systems, each consisting of 265 x 1 TB SATA disks, together offer 400 TB usable capacity.
  - The tape library, with 4000 slots x 1.5 TB, provides 6 PB.
  - The storage systems are connected to the servers and the tape library through the SAN switches.
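The raw and usable capacities quoted above imply a usable fraction for each disk tier. The sketch below works this out from the slide's disk counts and sizes, assuming decimal units (1 TB = 1000 GB); the function and variable names are illustrative.

```python
def raw_capacity_tb(disk_count: int, disk_size_gb: int) -> float:
    """Raw capacity in decimal terabytes (1 TB = 1000 GB)."""
    return disk_count * disk_size_gb / 1000


# Disk counts, disk sizes and usable capacities are taken from the slide.
fc_raw_tb = raw_capacity_tb(295, 450)
sata_raw_tb = raw_capacity_tb(2 * 265, 1000)

fc_usable_fraction = 75 / fc_raw_tb
sata_usable_fraction = 400 / sata_raw_tb

# Tape library: 4000 slots at 1.5 TB per cartridge, expressed in PB.
tape_capacity_pb = 4000 * 1.5 / 1000

print(f"FC:   {fc_raw_tb:.2f} TB raw, {fc_usable_fraction:.1%} usable")
print(f"SATA: {sata_raw_tb:.2f} TB raw, {sata_usable_fraction:.1%} usable")
print(f"Tape: {tape_capacity_pb:.1f} PB")
```

The gap between raw and usable capacity reflects RAID overheads and, possibly, spares and formatting; the source does not give the breakdown.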
IMGEOS Storage: Three-Tiered Storage
  - FC Storage (Tier 1) for the first 3 months
  - SATA Storage (Tier 2) for data 4 to 15 months old
  - Tape Storage (Tier 3) for archival of the data
  - Data movement across the tiers uses the Hierarchical Storage Management (HSM) feature of the StorNext SAN File System.
  - The storage systems are virtualized by the StorNext SAN File System into 9 storage partitions and presented to the servers.
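The age-based tiering described above can be sketched as a small lookup. This is only an illustration of the rule, not StorNext's HSM logic: the function name is hypothetical, and the exact boundaries (3 and 15 months, inclusive) are assumptions based on the figures in the tiering and sizing slides. It describes where the disk-resident copy lives; tape copies are made independently by policy.

```python
def tier_for_age(age_months: float) -> str:
    """Return the tier expected to hold data of the given age.

    Boundaries are assumptions drawn from the slides:
    up to 3 months on FC disk, 4-15 months on SATA disk, older on tape only.
    """
    if age_months < 0:
        raise ValueError("age_months must be non-negative")
    if age_months <= 3:
        return "Tier-1 (FC disk)"
    if age_months <= 15:
        return "Tier-2 (SATA disk)"
    return "Tier-3 (tape)"


for age in (1, 6, 24):
    print(f"{age:>2} months old -> {tier_for_age(age)}")
```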
Network Connectivity
  - SAN: high-speed (FC, 4 Gbps) data transfer between storage, data-acquisition and processing systems
  - Metadata Network: metadata transfer (1 Gbps) for all SAN clients
  - Systems Network: data transfer (1 Gbps) among all acquisition and processing systems/computing nodes, workstations, peripherals and others
  - Connectivity for dissemination of satellite data products to users through the Internet
  - Data connectivity to NRSC-Balanagar for operations
  - Data connectivity to ISRO participating centers (SAC, ISAC, ISTRAC, NRSC-Balanagar, ADRIN) for software maintenance and testing
  - Connectivity to receive satellite data acquired at Svalbard, Norway and Matera, Italy
Leased Lines
Leased lines for operations:
  - One 10 Mbps leased line is used between the Shadnagar and Balanagar campuses for DP operations.
  - A second 10 Mbps leased line is used between the Shadnagar and Balanagar campuses for Intranet/Email/Internet operations.
  - The state vectors related to satellite pass schedules are provided by ISTRAC over Spacenet/ISROnet.
Leased lines for software maintenance and testing:
  - Software for IMGEOS will be provided by ISAC, SAC, ADRIN and NRSC. Hence one 2 Mbps leased line is provided to each centre.
Leased lines to receive data from Svalbard:
  - 45 Mbps connectivity exists to receive satellite data acquired at Svalbard for processing at Shadnagar.
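The leased-line bandwidths above translate directly into transfer times. The sketch below estimates how long a file takes to cross a link; the 10 GB file size is a hypothetical example, and the `efficiency` parameter is an assumed allowance for protocol overhead, neither of which comes from the source.

```python
def transfer_hours(size_gb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move size_gb (decimal GB) over a link of link_mbps.

    efficiency is the assumed fraction of nominal bandwidth actually achieved.
    """
    if size_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, bandwidth or efficiency")
    megabits = size_gb * 1000 * 8
    return megabits / (link_mbps * efficiency) / 3600


EXAMPLE_FILE_GB = 10  # assumption: illustrative file size

# Link speeds quoted on the slide, in Mbps.
for name, mbps in (("Svalbard link", 45), ("Operations link", 10), ("Software link", 2)):
    hours = transfer_hours(EXAMPLE_FILE_GB, mbps)
    print(f"{name} ({mbps} Mbps): {hours:.2f} h for {EXAMPLE_FILE_GB} GB")
```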
Systems Network (diagram)
  - Core: two chassis switches (UTP-96, FC-12 each)
  - Connected systems: Station Automation Systems (2), DAQLD Servers (4), ADP Servers (3), WFM System (2), Virtual Reality System, Version Control System, DP Servers (17), DQE, VADS, PQC
  - Links: 10 Gbps uplinks and GbE
  - Edge: three edge switches (UTP-24, 10G-2 each) connecting Work Stations (43)
  - Data Exchange Gateways, Product Delivery Systems, Test & Dev Systems (4)
  - Secure Appliance and Layer-3 switch for external links: leased lines for Balanagar operations, Sky Link for ISTRAC operations, leased lines to SAC, ISAC, ADRIN and NRSC (Balanagar)
Systems Network
  - This network provides Ethernet connectivity for the data acquisition and data processing nodes.
  - Two enterprise-class switches serve as core switches. They are redundant to one another for high availability.
  - Each server in the data centre is connected to both switches.
  - 10 Gbps uplinks are provided to the edge switches located in the other buildings for connecting workstations.
Servers
  - A total of 60 servers are deployed for various activities, from data ingest to data dissemination: 42 4-CPU servers and 18 2-CPU servers.
  - A total of 46 workstations and 15 thin clients, arranged in 72 cubicles in 7 rows.
  - Workstations have 1 Gbps links for SAN access.
  - Desktops are used for Internet access.
Connectivity for Data Dissemination
  - The products are disseminated using Web/FTP servers.
  - Connectivity between the Systems Network and the product delivery servers is through the Data Exchange Gateway (DEG).
  - Finished data products are transferred from the Systems Network to the Web/FTP servers using the DEG.
  - Internet leased line with Ethernet interface; the bandwidth of the leased line is 25 Mbps.
Network Security
Layered network security:
  - Firewall to protect the infrastructure from the public network
  - Intrusion prevention system for monitoring and preventing malicious activity
  - Data Exchange Gateway to transfer the data products from the Systems Network to the Web/FTP servers
  - Anti-virus solution
  - The leased lines from other ISRO centers will be connected to the Systems Network through a security appliance
Scalability
  - The Tier-1 FC disk system is envisaged to scale up to about 400 TB raw.
  - The Tier-2 SATA disk system being procured can be scaled up to about 800 TB raw.
  - The tape library can be expanded from the available 4000 slots up to 10000 slots.
Data Management
  - Providing data access for the servers/workstations
  - Data storing/archiving using defined policies
  - Data re-use
  - Data integrity
  - Data security
These activities are carried out using the SAN File System.
Storage Partitions Layout (100 TB T1 + 400 TB T2)
  - Level0_1 (163 TB) and Level0_2 (159 TB): input and output for Level 0 processing
  - Prodspace (6.4 TB): product space for DP and VADS output
  - FTP (6.4 TB): volume for FTP products
  - IPO (7.4 TB): volume for Initial Phase Operations
  - Workspace (5.1 TB): working space for all processes
  - OTS (6.4 TB): volume for OTS products
  - REF (11 TB): volume for all permanent files
Data Archival
  - Data is archived using storage policies.
  - Storage policies act as the primary channels through which data protection and data recovery operations are fulfilled.
  - The storage policies are created based on the following parameters:
      - Number of copies to create
      - Media type to use when storing data
      - Amount of time to store data after data is modified
      - Amount of time (in days) before relocating a file
      - Amount of time before truncating a file after the file is modified
  - Separate storage policies are created for each satellite, and these policies are monitored for compliance.
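The policy parameters listed above can be pictured as a simple record with basic sanity checks. The sketch below is a hypothetical data model for illustration only: the class, field names, units and the example values are all assumptions, and it does not represent the StorNext policy format or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical per-satellite storage policy (not the StorNext format)."""
    name: str
    copies: int                 # number of copies to create
    media_type: str             # media used when storing data
    store_after_minutes: int    # wait after last modification before storing
    relocate_after_days: int    # wait before relocating the file to another tier
    truncate_after_days: int    # wait after modification before truncating from disk

    def problems(self) -> List[str]:
        """Return a list of consistency problems; empty if the policy looks sane."""
        found = []
        if self.copies < 1:
            found.append("at least one copy is required")
        if min(self.store_after_minutes, self.relocate_after_days,
               self.truncate_after_days) < 0:
            found.append("time values must be non-negative")
        if self.truncate_after_days < self.relocate_after_days:
            found.append("truncation should not precede relocation")
        return found


# Example values are invented for illustration, not taken from the source.
example = StoragePolicy("RISAT-2", copies=2, media_type="tape",
                        store_after_minutes=5, relocate_after_days=90,
                        truncate_after_days=450)
print(example.name, "problems:", example.problems())
```

Keeping one such record per satellite mirrors the slide's point that policies are created and monitored separately for each mission.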
Data Flow Management
  - HSM/ILM software is used for movement of data across the storage tiers.
  - Data received from Level-0 is stored in high-performance disk storage (Tier-1), where it can be used for processing by the data processing servers.
  - The data in Tier-1 is copied automatically to the tape library (Tier-3) based on the set policies.
  - Two more copies of the same data are made on tape: one for online backup of the archive (which remains in the tape library) and the other for vaulting (placed in fire-safe vaults).
  - Based on the set (time-based) policies, a data file in Tier-1 is automatically moved to Tier-2 and then to Tier-3.
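The flow above can be laid out as a timeline for a single file. The sketch below is an illustration under stated assumptions: the function names are hypothetical, tape copies are shown on day 0 although the source only says they follow the set policies, and the 90-day and 450-day thresholds are approximations of the 3-month and 15-month tier boundaries.

```python
from typing import List, Tuple

Event = Tuple[int, str]  # (day since ingest, description)


def lifecycle_events(tier1_days: int = 90, tier2_days: int = 450) -> List[Event]:
    """Ordered events for one file; thresholds are assumed, not from StorNext."""
    if not 0 < tier1_days < tier2_days:
        raise ValueError("expected 0 < tier1_days < tier2_days")
    return [
        (0, "store on Tier-1 disk after Level-0 ingest"),
        (0, "copy to tape library (Tier-3) as archive copy"),
        (0, "copy to tape library as online backup copy"),
        (0, "copy to tape for vaulting in a fire-safe vault"),
        (tier1_days, "move disk copy from Tier-1 to Tier-2"),
        (tier2_days, "remove disk copy; data remains on Tier-3 tape"),
    ]


def events_up_to(day: int) -> List[Event]:
    """Events that have occurred on or before the given day."""
    return [event for event in lifecycle_events() if event[0] <= day]


for day, description in lifecycle_events():
    print(f"day {day:>3}: {description}")
```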
Data Protection
  - Two online tape copies and two vaulted copies are created: one vault copy at the primary site and the second vault copy at the remote site.
  - As tape is a magnetic medium with electro-mechanical components, its reliability is checked regularly by performing a random read on each tape.
  - SFS offers data integrity checks: a checksum is computed on files as they are written to tape and verified when they are restored to disk.
  - A few tapes are verified at random each day for data access.
  - The logs and admin alerts of the metadata servers, SFS and SM are continuously monitored for probable alerts.
  - Health checks are done periodically by running various diagnostic checks on the SFS.
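The write-time checksum and the daily random tape check described above can be illustrated with a small sketch. It is not the SFS implementation: the in-memory dictionary stands in for tape, SHA-256 is an assumed algorithm (the source does not name one), and all function names and tape identifiers are hypothetical.

```python
import hashlib
import random
from typing import Dict, List, Optional, Sequence, Tuple

Archive = Dict[str, Tuple[bytes, str]]  # file name -> (data, stored checksum)


def checksum(data: bytes) -> str:
    """SHA-256 digest; the algorithm choice is an assumption."""
    return hashlib.sha256(data).hexdigest()


def write_to_tape(archive: Archive, name: str, data: bytes) -> None:
    """Store the data together with a checksum computed at write time."""
    archive[name] = (data, checksum(data))


def restore_from_tape(archive: Archive, name: str) -> bytes:
    """Return the data only if it still matches the stored checksum."""
    data, stored = archive[name]
    if checksum(data) != stored:
        raise IOError(f"checksum mismatch while restoring {name}")
    return data


def pick_tapes_for_check(tape_ids: Sequence[str], count: int,
                         seed: Optional[int] = None) -> List[str]:
    """Choose a random subset of tapes for the daily read verification."""
    rng = random.Random(seed)
    return rng.sample(list(tape_ids), min(count, len(tape_ids)))


archive: Archive = {}
write_to_tape(archive, "pass_0001.raw", b"example payload")
print(restore_from_tape(archive, "pass_0001.raw") == b"example payload")
print(pick_tapes_for_check([f"T{i:04d}" for i in range(100)], count=5, seed=1))
```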
Thank You