Cyberinfrastructure Resources at Clemson University
Jill Gemmill and Galen Collier
Cyberinfrastructure Technology Integration (CITI)
November 2, 2011
Outline
- Vision: SC Cloud
- Sharing resources to build common cyberinfrastructure
- The Palmetto Cluster (HPC)
- Condor Pool (HTC)
- C-Light R&E Network
- Outreach, training, user support
- HUBzero collaborative software platform
Clemson Datacenter
- 16,000 sq. ft. of powered and cooled raised floor
- 50 tons of HPC equipment moved into the new area over the 2010 holidays
- Power upgraded from 2 MW to 4.5 MW
- "Computational Barn Raising," Dec. 27-30, 2010
The Palmetto Cluster
- Both shared-memory and distributed-memory systems
- Operates at over 115 teraflops (TF)
- #97 on the June 2011 Top 500 list; #2 among public academic institutions
- 1,616 compute nodes (14,168 cores)
- Operating system: Scientific Linux 6
- Myrinet 10G network interconnect
- Data storage: 115 TB scratch space (not backed up); 72 TB of backed-up storage
Distributed-Memory Systems

Name | Nodes | Count | Model | Processor | L2 Cache | Cores | Memory | Local Disk
compute node, phase 1 | node0001-0257 | 257 | Dell PE 1950 | Intel Xeon E5345 @ 2.33 GHz x 2 | 4 MB | 8 | 12 GB | 80 GB (SATA)
compute node, phase 2 | node0258-0515 | 258 | Dell PE 1950 | Intel Xeon E5410 @ 2.33 GHz x 2 | 6 MB | 8 | 12 GB | 80 GB (SATA)
compute node, phase 3 | node0516-0771 | 256 | Sun X2200 M2 x64 | AMD Opteron 2356 @ 2.3 GHz x 2 | 4 MB | 8 | 16 GB | 250 GB (SATA)
compute node, phase 4 | node0772-1023, node1108-1111 | 256 | IBM dx340 | Intel Xeon E5410 @ 2.33 GHz x 2 | 6 MB | 8 | 16 GB | 160 GB (SATA)
compute node, phase 4.1 | node1024-1107 | 84 | IBM dx340 | Intel Xeon E5410 @ 2.33 GHz x 2 | 6 MB | 8 | 16 GB | 160 GB (SATA)
compute node (former CCMS nodes) | node1112-1541 | 430 | Sun X6250 | Intel Xeon L5420 @ 2.5 GHz x 2 | 6 MB | 8 | 32 GB | 160 GB (SATA)
compute node, phase 6 | node1553-1622 | 70 | HP DL 165 G7 | AMD Opteron 6172 @ 2.1 GHz x 2 | 12 MB | 24 | 48 GB | 250 GB (SATA)
Shared-Memory Systems and GPUs

Name | Nodes | Count | Model | Processor | L2 Cache | Cores | Memory | Local Disk
regular large shared-memory systems | nodelm01-nodelm04 | 4 | HP DL 580 G7 | Intel Xeon 7542 @ 2.66 GHz x 4 | 18 MB | 24 | 512 GB | 146 GB (SAS)
math sciences large memory | nodemath | 1 | HP DL 980 G7 | Intel Xeon 7560 @ 2.66 GHz x 8 | - | 64 | 2 TB | -

GPUs: four AMD FirePro V7800 cards accessible via select compute nodes, for a total of 5,760 stream processors.
Who Uses Palmetto?
[Chart: Palmetto usage in core hours (millions), since October 1, 2010 (one year of data)]
Who Built Palmetto?
- Clemson Condominium Program: a faculty/university partnership
- CCIT provides system administration and HPC user support
- Cyberinfrastructure is a university strategic priority
- Collaborative opportunities:
  - Hosting of external research clusters is possible
  - External condo ownership is possible (faculty- and/or college-funded)
  - Collaborative instruction: Grid Classroom (NSF EPS-0919440)
Software Available to Users
- Numerous open-source and commercial packages available: Abaqus, ABINIT, AMPL, GROMACS, LAMMPS, MCNP, mpiBLAST, NAMD, PerfSuite, PETSc, R, TotalView, VisIt, and many more
- Palmetto-based software development projects: combustion modeling; data-intensive non-parametric estimation in economics; molecular dynamics force-field development; census and education-efficacy data analysis; manufacturing and scheduling optimization; finite element analysis; and much more
- Both serial and parallel program code can be used
PBS Professional 11.1
- PBS Pro 11.1 is the new resource management service used on the Palmetto Cluster
- Enables you to make more efficient use of your time by scripting computational tasks
- PBS takes care of running these tasks and returning the results
- If the cluster is full, PBS holds your tasks and runs them when resources become available
- PBS ensures fair sharing of cluster resources (policy enforcement)
- PBS ensures optimal and efficient use of available resources (a sample job script follows)
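As a concrete illustration, here is a minimal sketch of a PBS Pro batch script; my_mpi_app, input.dat, and the openmpi module name are placeholders, and the actual resource limits and module names on Palmetto may differ:

#!/bin/bash
# Minimal PBS Pro batch script (sketch).
# Request 2 nodes with 8 cores each, for 16 MPI ranks in total:
#PBS -N example_job
#PBS -l select=2:ncpus=8:mpiprocs=8
#PBS -l walltime=01:00:00
#PBS -j oe

cd $PBS_O_WORKDIR                     # run from the directory the job was submitted in
module load openmpi                   # hypothetical module name; check 'module avail'
mpirun -np 16 ./my_mpi_app input.dat  # my_mpi_app and input.dat are placeholders

Submit the script with qsub, monitor it with qstat -u $USER, and cancel it with qdel if needed.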
OrangeFS Filesystem
- High-performance parallel filesystem
- Supports very high I/O activity
- Specialized hardware and software
- Development team based at Clemson
- 115 TB of space, open to all users
- Temporary work directory for all jobs (see the usage sketch below)
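A common pattern for using the scratch filesystem from inside a job is sketched below; the /scratch/$USER path is an assumption standing in for wherever the OrangeFS scratch space is actually mounted, and my_app is a placeholder program:

# Stage data onto fast scratch, compute there, then copy results home.
SCRATCH=/scratch/$USER/$PBS_JOBID   # hypothetical mount point; check local docs
mkdir -p $SCRATCH
cp ~/project/input.dat $SCRATCH/    # stage input onto the parallel filesystem
cd $SCRATCH
./my_app input.dat > results.out    # my_app is a placeholder
cp results.out ~/project/           # copy results back to backed-up /home
rm -rf $SCRATCH                     # scratch is not backed up; clean up

Because scratch space is not backed up, anything worth keeping should be copied back to /home before the job ends.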
External Use of Palmetto
- NSF (EPSCoR Track 2) and DoE (CyberInstitute) funding has purchased 60+ nodes on the Palmetto cluster on behalf of SC Cloud members
- These users have condo-owner status
- Currently 80+ non-Clemson users of Palmetto
- Funding extends through the end of 2012
- SUSTAINABILITY MODEL: condo model shared across SC institutions
Easy Access to Palmetto
- Command-line interface via any Secure Shell (SSH) client
- Transfer files to and from the cluster using FileZilla (or scp), as in the examples below
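For example (the login hostname below is a placeholder; use whatever hostname the Palmetto documentation gives you):

ssh username@login.palmetto.clemson.edu                  # open an interactive session
scp data.tar.gz username@login.palmetto.clemson.edu:~/   # upload a file to your home directory
scp username@login.palmetto.clemson.edu:~/results.out .  # download a result file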
Additional Resources for External Users
- User accounts with 100 GB backed-up /home directories
- 10 Palmetto nodes + 6 TB of backed-up storage
- Priority job queue for EPSCoR users
- Special attention from cluster support staff
Desktop2Petascale.org
- Online resource for regularly updated training material and user guides
- Host site for community interaction
- Customizable virtual Linux environment
- Easy access to the Palmetto cluster and other resources via terminal
- Built using the HUBzero platform
External User Support
- Non-Clemson users simply contact Galen to have a Palmetto account created
- Galen serves as each new user's primary point of contact for all support issues
- Most users become independent after just a few e-mails or conversations
- Users have access to regularly updated user documentation (D2P hub)
External User Support
- A support-focused community of partners:
  - Galen Collier, Clemson
  - Barr von Oehsen, Clemson
  - Clayton McCauley, C of C
  - Starr Hazard, MUSC
  - Jerry Ebalunode, USC
  - Jacek Jakowski, UTK
  - Bhanu Rekepalli, UTK
Condor at Clemson
- High-throughput computing
- Typically, over 10,000 cores available
- Windows and Linux environments
- Available to the OSG community
- We can train you to do this at your institution (a minimal submit example follows)
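A minimal sketch of running a job through Condor, assuming a hypothetical program my_app; the submit-description options shown are standard Condor, but pool-specific requirements may apply:

cat > my_job.sub <<'EOF'
universe     = vanilla
executable   = my_app               # placeholder program
arguments    = input.dat
output       = my_job.out
error        = my_job.err
log          = my_job.log
requirements = (OpSys == "LINUX")   # target the Linux slots in the pool
queue 1
EOF
condor_submit my_job.sub            # hand the job description to Condor
condor_q                            # watch the job move through the queue

The same description with queue 100 (and $(Process) added to the output file names) would submit 100 instances, which is the high-throughput pattern the pool is designed for.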
SC has a network of networks
- In March 2007, all SC higher education was connected at 200 Mbps or less (a competitive disadvantage)
- Clemson has brought in over $30M in CI-related funding for South Carolina
C-Light Connectivity
- Connector to national and international R&E networks
- Provides the high throughput and greater bandwidth needed by higher education for research and education
- Some CIOs don't think this is needed; if you disagree, you should go talk with yours
- NSF EPSCoR $6M award
Cyberinfrastructure Ecosystem
- Small-college faculty needs go beyond the desktop
- An XSEDE (TG/XD) allocation requires a demonstration of success
- The "last mile" issue is being addressed
- Bridging human expertise: running jobs, scaling up, training, friendly interface development
- Campus Champions
- A campus-based regional facility and science gateway bridges campus researchers to national facilities
Science Outcomes/Impact
- Conference and journal publications
- Student training
- New collaborations leading to new capabilities
- Discovery
Opportunities for Collaboration

Collaboration in Learning:
- Grid Classroom
- CI Days workshops
- Online training resources
- Classroom guest teaching
- Training boot camps
Collaboration in Research:
- CI Days presentations
- CI Days poster sessions
- HUBzero platform

New Areas for Collaboration:
- Digital Humanities
- Social Media Listening Center
- Scientific Visualization
Contact Info
gemmill@clemson.edu
galen@clemson.edu