9th ASTRI Consortium Meeting, Bologna
Universidade de São Paulo, Instituto de Astronomia, Geofísica e Ciências Atmosféricas
The ASTRI SST-2M ICT Infrastructure & The ASTRI MASS Software Test Bed
Fulvio Gianotti, INAF - IASF Bologna, for the ASTRI Collaboration & the CTA Consortium
The ASTRI SST-2M ICT Infrastructure
A complete, stand-alone Computer Centre has been designed and implemented at the SLN site. The design goal is a basic set of ICT equipment that can be scaled up for the ASTRI/CTA mini-array. Particular attention is given to the monitoring, control and alarm system. The Computer Centre and the Control Room are briefly described below.
[Diagram: Data Acquisition, Storage and Computing. Labels: astrivpn.com LAN; Network Servers and Services; INAF-OACT Network and INTERNET; RACK1, RACK2; Storage Server (24x3TB HD); GPU Server (2x NVIDIA K20); CPU Server (64 Core, 256GB RAM); Camera Server DAQ (12 Core, 32GB RAM, 24TB HD); Frontiera Server; Firewall, VPN, NAT; Temperature and Power Control; KVM; UPS1, UPS2; Telescope Control: OMC and MicroCloud]
Fulvio Gianotti - INAF-IASFBO - 9th ASTRI Collaboration Meeting - Bologna, 23-25 Feb 2015
The ASTRI SST-2M ICT Infrastructure
The astrivpn.com network is a private LAN built around 3 switches:
The Master Switch is the heart of the system; it connects all the computers, servers and network devices.
The Telescope Switch is located in the Telescope cabinet and connects all the Telescope devices to the Computer Room. Two optical fibers connect these two switches at 1 Gbit/s. A third fiber provides a point-to-point 1 Gbit/s connection between the Telescope Cherenkov Camera and the Camera Server, which is devoted to Camera data acquisition.
The IPMI Switch carries the out-of-band monitoring of the servers (IPMI: Intelligent Platform Management Interface).
The ASTRI MASS Software Test Bed
HW Schema:
3 virtualization computers, each with 2 CPUs of 6 cores => we can simulate up to 72 virtual processors working simultaneously.
1 SAN, RAID6, 8TB of disk space, with 2 controllers and a RAID system => high reliability.
1 server for the system central console => OVM Manager.
2 LANs: Management LAN and Data LAN.
[Diagram: IASFBO OVM system. Labels: AUTH. LDAP; HOME; DATA; OVMMAN; OVMS1; OVMS3 (NEW!); SAN HA iSCSI Storage (NEW!); astri01.iasfbo.inaf.it; astriacs01.giano.iasfbo ... astriacs0n.giano.iasfbo; san01.giano.iasfbo; Public LAN iasfbo.inaf.it; Management LAN; Priv. LAN giano.iasfbo]
SW Schema:
Each Virtual Machine (VM) replicates a default Linux server installation with ACS, as needed by the MASS software. With a suitable number of VMs we can build the Test Bed needed for the development and maintenance of the MASS software.
We adopted Oracle VM, a professional native (bare-metal) virtualization system that guarantees the necessary performance and reliability. Oracle VM provides a single control console from which multiple OVMS servers and dozens of Virtual Machines can be managed easily.
The system will be built on a private network with VPN, NAT and LDAP. The data will be stored on the SAN and made accessible to the VMs via NFS.
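The figure of 72 virtual processors can be checked with a short calculation. This is only a sketch: the server, CPU and core counts come from the slide, while the factor of two hardware threads per core is an assumption introduced here to reconcile 36 physical cores with the quoted 72; the slide does not state it.

```python
# Counts taken from the slide.
servers = 3
cpus_per_server = 2
cores_per_cpu = 6

# Assumption (not stated in the source): two hardware threads per core.
threads_per_core = 2

physical_cores = servers * cpus_per_server * cores_per_cpu
virtual_processors = physical_cores * threads_per_core

print(physical_cores)      # 36
print(virtual_processors)  # 72
```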
The ASTRI MASS Software Test Bed
OVM Manager Console
This is a screenshot of the OVM Manager Console and of the LDAP server graphical console that is active in the system. The OVM Manager Console allows us to create, destroy and duplicate VMs, and to perform Oracle VM system operations on them, such as start, stop, migration and resource definition. It is also possible to manage the storage spaces and the VM images, and to import and export them from and to other systems.
The ASTRI SST-2M ICT Infrastructure Monitoring
To ensure the necessary level of availability and reliability of the ICT infrastructure, a good monitoring, control and alarm system is needed. This system must be well designed, easy to use and always up to date. This is why we are trying to integrate the various tools provided by the technology adopted in the SLN Server Room into a single web interface.
With this interface we are able to monitor all the most important parameters, such as: CPU load, used memory, disk space, temperatures, active jobs and users, network traffic, etc.
In addition, we are developing a control interface that allows us to act remotely to solve the problems that raise an alarm.
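As an illustration of the kind of check such an interface might run, the sketch below collects two of the listed parameters (CPU load and disk space) on a single Unix host and compares them with alarm thresholds. It is not the ASTRI monitoring system: the function names, the metric names and the threshold values are all hypothetical, and only Python standard-library calls are used.

```python
import os
import shutil

# Hypothetical alarm thresholds; the source gives no values.
THRESHOLDS = {"cpu_load": 0.90, "disk_used": 0.85}

def collect_metrics(path="/"):
    """Return load and disk usage of the local host as fractions (Unix only)."""
    # 1-minute load average, normalised by the number of CPUs.
    load = os.getloadavg()[0] / (os.cpu_count() or 1)
    usage = shutil.disk_usage(path)
    return {"cpu_load": load, "disk_used": usage.used / usage.total}

def find_alarms(metrics, thresholds=THRESHOLDS):
    """Return the sorted names of metrics that exceed their threshold."""
    return sorted(
        name for name, value in metrics.items()
        if name in thresholds and value > thresholds[name]
    )

if __name__ == "__main__":
    print(find_alarms(collect_metrics()))
```

Keeping the collection step separate from the threshold comparison means the same comparison could be fed by other sources, for example temperature readings, without changing the alarm logic.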