Server Protocol Compliance, Security, Performance and Scalability Testing - Implement RFC, Going Beyond POSIX Interop!
Raymond Wang, Tanmay Waghmare
Microsoft Corporation
Agenda

Key Learning Points
- Why the traditional approach of testing a server implementation through POSIX interop does not work very well
- How we test a server implementation with a packet-level test framework (synthetic client)
- Techniques used for protocol compliance, security, performance and scalability testing

Agenda
- Overview of Windows server
- Server test architecture
- RFC 5661 protocol compliance testing
- Security testing
- Session trunking performance
- Server limits testing
- Q&A
Overview and Test Challenges

Overview
- Compliant with all mandatory aspects of RFC 5661
- Highly available (Windows Failover Clustering)
- Identity mapping support
  - Password/group file mapping
  - Active Directory
  - AD LDS or 3rd-party LDAP stores (RFC 2307 compliant)
  - User Name Mapping (legacy)
- RPCSEC_GSS support (krb5, krb5i and krb5p)
- Multiprotocol access to the same share
- Volume mount point support
- Not currently implemented
  - ACLs
  - Delegations
  - Migration & replication
  - pNFS
  - RDMA
  - Other optional aspects of RFC 5661

Test Challenges
- Complex protocol: 600+ pages of RFC 5661
  - Stateful, 40+ operations, 50+ attributes
- Traditional coverage via POSIX APIs is not enough to test
  - Compounding, pseudo file system
  - Sessions, reply cache
  - Session trunking & client ID trunking
- Limited availability of stable clients
  - Clients do not implement all features
- Increased complexity: callbacks, delegations, ACLs, multi-server namespace
Test Strategy
[Diagram. Labels recovered from the slide, in extraction order; the grouping of items under each pillar is not recoverable from the text.]
Test Pillars: Protocol Compliance Test Framework; Server Functionality; Functionality (Session/State); Server Reliability; Server Security; Performance & Scalability
Items: ONCRPC Library; Procedure; Client ID & Session; File I/O Stress; ONC RPC fuzzing; File I/O performance with session trunking; Synthetic Client; Operations; EOS & reply cache; Pseudo FS Stress; Fuzzing; Limit Test; File Attributes; Lease management; RFC Reading; File IO; Session recovery; Error validation; Attack Surface Reduction; Pseudo State management; Test case repository; Code Coverage Feedback; Locking & Share reservation; Session Trunking; Interop Tests; Team Review
Test Architecture
[Diagram. Labels recovered from the slide; the layer assignment is approximate.]
Test API & Tests: Scalability; Protocol compliance (compound RPC, File I/O, File Attributes, PseudoFS, Lock & Share reservation, client ID & Session, EOS & reply cache, etc.); NLM fuzzing test; Portmap fuzzing test; NFSv2/v3 protocol tests; NFS/SMB interop (NFSv2/v3/v4.1); Shell Tool; Data integrity; File I/O Stress; Perf Tool; Fuzzing test
Libraries: NLM Lib; RPCBind (Portmap) Lib; NFSv2/v3 Lib; NFS4 Thin Client & File I/O APIs
Core Components: ONCRPC Client; ONC RPC Library (XDR encode/decode, auth_sys, RPC message routines, async I/O architecture, callback, auth_gss (krb5, krb5i, krb5p)); ONC RPC Fuzzing (Dumb Fuzzing Engine)
Test Architecture - ONCRPC Library

Features
- Supports AUTH_SYS and RPCSEC_GSS (krb5, krb5i and krb5p)
- High throughput
- Built-in ONCRPC fuzzing engine
- Supports callbacks
- Asynchronous I/O model
  - I/O Completion Ports
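To make the library's lowest layer concrete, here is a minimal sketch of what an ONC RPC library has to put on the wire: XDR encoding, an AUTH_SYS credential, the RPC call header, and the TCP record marker. It is not the Microsoft library from the slides. The function names and the machine name "testclient" are assumptions for illustration; the constants come from the ONC RPC and NFS specifications.

```python
import struct

NFS_PROGRAM = 100003   # ONC RPC program number assigned to NFS
NFS_V4 = 4             # NFSv4 and NFSv4.1 both use RPC version 4 of the program
PROC_NULL = 0          # procedure 0 is NULL; procedure 1 is COMPOUND


def xdr_opaque(data: bytes) -> bytes:
    """XDR variable-length opaque: 4-byte length, data, zero padding to 4 bytes."""
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def auth_sys_body(stamp, machine, uid, gid, gids=()):
    """Body of an AUTH_SYS credential: stamp, machine name, uid, gid, gid list."""
    body = struct.pack(">I", stamp) + xdr_opaque(machine.encode("ascii"))
    body += struct.pack(">II", uid, gid)
    body += struct.pack(">I", len(gids))
    body += b"".join(struct.pack(">I", g) for g in gids)
    return body


def rpc_call(xid, prog, vers, proc, cred_body, args=b""):
    """ONC RPC call message: msg_type CALL (0), RPC version 2, AUTH_SYS cred."""
    header = struct.pack(">6I", xid, 0, 2, prog, vers, proc)
    cred = struct.pack(">I", 1) + xdr_opaque(cred_body)   # flavor 1 = AUTH_SYS
    verf = struct.pack(">I", 0) + xdr_opaque(b"")         # flavor 0 = AUTH_NONE
    return header + cred + verf + args


def record_mark(fragment: bytes, last: bool = True) -> bytes:
    """Prefix a fragment with the TCP record marker (top bit = last fragment)."""
    if len(fragment) >= 0x80000000:
        raise ValueError("fragment too large for a 31-bit length")
    marker = (0x80000000 if last else 0) | len(fragment)
    return struct.pack(">I", marker) + fragment


# Hypothetical example values: a NULL call from uid 0 on "testclient".
message = rpc_call(0x1234, NFS_PROGRAM, NFS_V4, PROC_NULL,
                   auth_sys_body(0, "testclient", 0, 0))
frame = record_mark(message)
```

Building the bytes by hand like this is what lets a test client later send requests that no regular client would produce.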
Test Approach

Objectives
- Enable RFC 5661 compliance testing and protocol validation
- Simulate interesting client behaviors that can't be produced using regular clients
- Craft individual RPCs/ops; enable security and fault injection testing
- Test-developer friendly: hide complex protocol details behind simplified test APIs

Features
- Synthetic client built on top of the ONC RPC Library
- Provides client ID and session management
- Slot and sequence number management for sessions
- Automatic client/session recovery and lease renewal
- Hard/soft mount behavior (set by a policy)
- Network load-balancing across multiple connections in a session
- Allows overriding the default protocol validation logic using callbacks
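Slot and sequence number management is one of the details such a client hides from test authors. The sketch below is a hypothetical `SlotTable` class, not the framework's own code. It follows the RFC 5661 rule that the first request on a slot uses sequence ID 1, each new request on that slot uses the previous value plus one, and a retry reuses the same value.

```python
class SlotTable:
    """Tracks per-slot sequence IDs for one NFSv4.1 session (illustrative only)."""

    def __init__(self, num_slots):
        self._last_seqid = [0] * num_slots   # 0 means the slot was never used
        self._busy = [False] * num_slots

    def acquire(self):
        """Reserve the lowest free slot and return (slot, seqid) for a new request."""
        for slot, busy in enumerate(self._busy):
            if not busy:
                self._busy[slot] = True
                # Sequence IDs are 32-bit and wrap around.
                self._last_seqid[slot] = (self._last_seqid[slot] + 1) & 0xFFFFFFFF
                return slot, self._last_seqid[slot]
        raise RuntimeError("all slots are in use")

    def retry_seqid(self, slot):
        """A retransmission must reuse the seqid of the original request."""
        return self._last_seqid[slot]

    def release(self, slot):
        """Mark the slot free once the reply has been received."""
        self._busy[slot] = False


table = SlotTable(num_slots=2)
first = table.acquire()    # slot 0, first use
second = table.acquire()   # slot 1, first use
table.release(first[0])
third = table.acquire()    # slot 0 again, seqid advanced by one
```

A compliance test can deliberately break these rules, for example by sending a stale seqid, which is hard to do with a regular client.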
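Simplified Test APIs
Example: Getting the server's lease time
[Diagram comparing "Complete steps" with "Simplified steps"; which steps belong to the simplified column is not recoverable from the text.]
Steps shown: EXCHANGE_ID, CREATE_SESSION, ENCODE COMPOUND, RPC CALL, DECODE, DESTROY_SESSION, DESTROY_CLIENTID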
Example of Protocol Validation
Example: RFC compliance, current stateid test
- Compound(PUTROOTFH + LOOKUP + OPEN + READ + CLOSE + WRITE) using the special current stateid
- Expected result: WRITE should fail with error NFS4ERR_BAD_STATEID
- Actual result: CLOSE failed with error NFS4ERR_OLD_STATEID
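A check like the one on this slide can be expressed as a small validator over the per-operation results of a COMPOUND reply. This is a sketch under assumptions: the `check_compound` function and the `(op_name, status)` result shape are invented for illustration, while the numeric error codes are those defined by the NFSv4 specification. A server stops processing a COMPOUND at the first failing operation, so the last result is the one to inspect.

```python
NFS4_OK = 0
NFS4ERR_OLD_STATEID = 10024
NFS4ERR_BAD_STATEID = 10025


def check_compound(results, expected_index, expected_status):
    """Return (passed, message) for a list of (op_name, status) results.

    The test passes only if every operation before expected_index succeeded
    and the operation at expected_index failed with expected_status.
    """
    for index, (op, status) in enumerate(results):
        if status == NFS4_OK:
            continue
        if index == expected_index and status == expected_status:
            return True, f"{op} failed with {status} as expected"
        return False, f"{op} (op {index}) failed with {status}"
    return False, "all returned operations succeeded; expected a failure"


# The compound from the slide; WRITE is op 5 and should be the one to fail.
ops = ["PUTROOTFH", "LOOKUP", "OPEN", "READ", "CLOSE", "WRITE"]

# Results as described on the slide: CLOSE failed, so WRITE was never run.
actual = [("PUTROOTFH", NFS4_OK), ("LOOKUP", NFS4_OK), ("OPEN", NFS4_OK),
          ("READ", NFS4_OK), ("CLOSE", NFS4ERR_OLD_STATEID)]

passed, message = check_compound(actual, ops.index("WRITE"), NFS4ERR_BAD_STATEID)
```

With the results from the slide, `passed` is `False` and the message names CLOSE, which is the kind of discrepancy the test is meant to surface.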
Security Testing
ONCRPC fuzzing
- The ONCRPC fuzzing engine is built into the ONCRPC Library and can be enabled by the calling application
- It can fuzz the following areas
  - TCP record marker
  - RPC header
  - RPC credentials
    - AUTH_SYS structure
    - RPCSEC_GSS structure
  - RPC payload
- The ONCRPC fuzzer does not wait for the server's reply
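The idea of fuzzing one area of a message at a time can be sketched as a byte-level mutator that only touches a chosen region of the frame. This is not the engine from the slides: `REGIONS`, `fuzz_region`, and the sample frame (a NULL call with AUTH_NONE) are assumptions for illustration, and sending the mutated frame without waiting for a reply is left out.

```python
import random
import struct

# Sample frame: record marker + NULL call to NFS v4 with AUTH_NONE cred and verifier.
call = struct.pack(">6I", 1, 0, 2, 100003, 4, 0)   # xid, CALL, rpcvers 2, prog, vers, proc
call += struct.pack(">II", 0, 0)                   # credential: AUTH_NONE, length 0
call += struct.pack(">II", 0, 0)                   # verifier: AUTH_NONE, length 0
frame = struct.pack(">I", 0x80000000 | len(call)) + call

# Byte ranges (start, end) of each fuzzable area within this particular frame.
REGIONS = {
    "record_marker": (0, 4),
    "rpc_header": (4, 28),
    "credentials": (28, 36),
    "verifier": (36, 44),
}


def fuzz_region(data, start, end, rng, max_bytes=4):
    """Return a copy of data with 1..max_bytes bytes in [start, end) changed."""
    buf = bytearray(data)
    count = rng.randint(1, min(max_bytes, end - start))
    for pos in rng.sample(range(start, end), count):
        # XOR with a non-zero value guarantees the byte actually changes.
        buf[pos] ^= rng.randint(1, 255)
    return bytes(buf)


rng = random.Random(7)   # seeded so that a crash can be reproduced
mutated = fuzz_region(frame, *REGIONS["rpc_header"], rng)
```

Seeding the generator is a design choice made here so that any frame that triggers a server fault can be regenerated from the seed alone.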
Security Testing

Challenges
- Stateful protocol; a compound can contain any number of different operations
- To achieve high code penetration, the current FH, saved FH, stateid and SEQUENCE operation must be valid

Implementation
- Maintain a pool of valid file handles and stateids
- Dedicated session for fuzzing the SEQUENCE operation
- Dumb fuzzing (low code penetration)
  - Use a valid SEQUENCE and valid file handles
  - COMPOUND(SEQUENCE+PUTFH+SAVEFH+PUTFH+X)
  - Random values for operation X
- Smart fuzzing (higher code penetration)
  - Use a valid SEQUENCE, file handles and stateids
  - Craft a compound for each operation that manipulates a file handle or stateid
  - Ex. to fuzz OPEN: COMPOUND(PUTFH+OPEN+GETFH)
  - File handles and stateids created during fuzzing are saved and used in subsequent fuzzing
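The pool-based approach above can be sketched as follows. Everything here is hypothetical: `StatePool`, `build_fuzz_compound`, `harvest`, and the tuple shapes stand in for whatever the real framework uses, and operations are represented abstractly instead of being XDR-encoded. The point shown is that the surrounding operations stay valid while only the target operation's arguments are randomized, and that state returned by the server is fed back into the pool.

```python
import random


class StatePool:
    """Pool of file handles and stateids known to be valid on the server."""

    def __init__(self, rng):
        self.rng = rng
        self.file_handles = []
        self.stateids = []

    def add_file_handle(self, fh):
        self.file_handles.append(fh)

    def add_stateid(self, stateid):
        self.stateids.append(stateid)

    def pick_file_handle(self):
        return self.rng.choice(self.file_handles)


def build_fuzz_compound(pool, op_name, rng, arg_len=16):
    """Valid SEQUENCE and PUTFH, fuzzed arguments for op_name, then GETFH."""
    fuzzed_args = bytes(rng.randrange(256) for _ in range(arg_len))
    return [
        ("SEQUENCE", None),                    # filled in by session management
        ("PUTFH", pool.pick_file_handle()),    # valid current file handle
        (op_name, fuzzed_args),                # the operation under test
        ("GETFH", None),                       # capture any new file handle
    ]


def harvest(pool, results):
    """Save file handles and stateids from (op, status, value) results."""
    for op, status, value in results:
        if status != 0:
            break                              # the compound stopped here
        if op == "GETFH":
            pool.add_file_handle(value)
        elif op == "OPEN":
            pool.add_stateid(value)


rng = random.Random(1)
pool = StatePool(rng)
pool.add_file_handle(b"\x01root-fh")           # hypothetical seed handle
compound = build_fuzz_compound(pool, "OPEN", rng)
```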
Session Trunking Test

Objectives
- Verify the functionality of session trunking in Windows NFS Server
- Test the server's performance and scalability with session trunking
- Create a full end-to-end example of session trunking in action

Challenges
- Unavailability of industry clients doing session trunking
- Interface type, network speed, make/model, number of interfaces

- Utilizes the network load-balancing mechanism provided by the synthetic client
  - Tracks the number of pending I/Os per connection
- Maximized multiple 1/10GbE NICs with the synthetic test client and Windows NFS Server
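One plausible reading of "tracks the number of pending I/Os per connection" is a least-pending policy: each new request goes to the connection with the fewest outstanding I/Os. The sketch below assumes that policy; the slides do not state the exact algorithm, and the `Connection` and `TrunkedSession` names and the addresses are invented for illustration.

```python
class Connection:
    """One transport connection bound to the session (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.pending = 0   # I/Os sent but not yet completed


class TrunkedSession:
    """Distributes requests across the connections of one session."""

    def __init__(self, connections):
        self.connections = list(connections)

    def submit(self):
        """Pick the connection with the fewest pending I/Os (first one on ties)."""
        conn = min(self.connections, key=lambda c: c.pending)
        conn.pending += 1
        return conn

    def complete(self, conn):
        """Call when the reply for an I/O on conn has arrived."""
        conn.pending -= 1


session = TrunkedSession([Connection("nic-a"), Connection("nic-b")])
order = [session.submit().name for _ in range(4)]
```

Because a slow link accumulates pending I/Os, this policy naturally sends more work to faster interfaces without needing to know their speed.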
Extending RFC compliance - Server Scalability/Limits Testing

Objectives
- Find the limits of internal data structures
- Find the resource bottlenecks hindering performance and scalability

Challenges
- Simulate a huge number of clients/sessions using limited resources
- Prevent client ID/session leases from expiring
- Workload simulation

Matrices
- Number of client IDs
- Number of sessions per client ID
- Number of connections per session
- Number of opens and locks
- Number of pseudo file system nodes
- Load, no-load variations

Implementation
- Built on top of the synthetic client
- Simulates multiple clients using a single machine to simplify test execution
  - Uses a different client_owner for each NFS4_Client object
- Multiple threads prevent client IDs/sessions from expiring by simulating file access activity
- PASS or FAIL is dictated by latency and by the errors returned by the server
- APIs provided by the synthetic client simplified test development
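The two implementation ideas above, one client_owner per simulated client and background activity to keep leases alive, can be sketched like this. The names `SimulatedClient`, `make_clients`, and `due_for_renewal`, the owner prefix, and the renew-at-half-lease margin are assumptions, not details from the slides. In NFSv4.1 a successful SEQUENCE operation renews the client's lease, so "renewal" here stands for sending any request on one of the client's sessions.

```python
from dataclasses import dataclass


@dataclass
class SimulatedClient:
    owner_id: bytes        # unique client_owner identifier for EXCHANGE_ID
    last_renewed: float    # time of the last request that renewed the lease


def make_clients(count, now, prefix=b"scale-test-"):
    """Create `count` simulated clients, each with a distinct owner id."""
    return [SimulatedClient(prefix + str(i).encode(), now) for i in range(count)]


def due_for_renewal(clients, now, lease_seconds, margin=0.5):
    """Clients whose lease has aged past `margin` of the lease time."""
    threshold = lease_seconds * margin
    return [c for c in clients if now - c.last_renewed >= threshold]


clients = make_clients(3, now=0.0)
early = due_for_renewal(clients, now=30.0, lease_seconds=90)   # threshold is 45 s
late = due_for_renewal(clients, now=45.0, lease_seconds=90)

# Simulate renewing one client, then check again a little later.
clients[0].last_renewed = 45.0
remaining = due_for_renewal(clients, now=50.0, lease_seconds=90)
```

Passing `now` in explicitly keeps the logic testable; a real run would use worker threads and a monotonic clock.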
Takeaway
- Traditional testing using POSIX APIs is often not enough to test all server features
- The ability to craft individual packets can unleash great power for testing protocol compliance
  - Allows you to expand test scenarios beyond those limited by client implementations
  - Increases confidence in interoperability with future client implementations
- Rapid test scenario development with wider coverage can be achieved by implementing a test-developer-friendly framework
Questions?