
Doctoral Dissertation

Modeling and Simulation of Routing and Performance Evaluation of Multistage Interconnection Networks, Aiming at Optimizing the Operation of Distributed Applications

Dimitrios Vasileiadis

October 2009

Three-member Advisory Committee

Costas Vassilakis, Assistant Professor, Department of Computer Science and Technology, University of Peloponnese (Supervisor)
Panayiotis Georgiadis, Professor, Department of Informatics and Telecommunications, University of Athens
Spyridon Skiadopoulos, Assistant Professor, Department of Computer Science and Technology, University of Peloponnese

Seven-member Examination Committee

Costas Vassilakis, Assistant Professor, Department of Computer Science and Technology, University of Peloponnese (Supervisor)
Panayiotis Georgiadis, Professor, Department of Informatics and Telecommunications, University of Athens
Spyridon Skiadopoulos, Assistant Professor, Department of Computer Science and Technology, University of Peloponnese
Konstantinos Masselos, Associate Professor, Department of Computer Science and Technology, University of Peloponnese
Dimitrios Vlachos, Assistant Professor, Department of Computer Science and Technology, University of Peloponnese
Georgios Lepouras, Assistant Professor, Department of Computer Science and Technology, University of Peloponnese
Alexandros Kaloxylos, Lecturer, Department of Telecommunications Science and Technology, University of Peloponnese

Acknowledgements

I would like to express my sincere thanks and gratitude to my advisor, Assistant Professor Costas Vassilakis, for his helpful suggestions and advice. Collaborating with him has been a pleasant and memorable experience. I would also like to thank the members of my committee, Professor Panayiotis Georgiadis and Assistant Professor Spyridon Skiadopoulos. Last but by no means least, it gives me immense pleasure to express my gratitude to my family, which has always been an important source of encouragement and support. I owe many thanks to my three sons, Chris, Tasos, and John, who joined me while I was writing my doctoral thesis, for giving me unlimited happiness and pleasure.

Table of Contents

1 Introduction
    Classification of Multistage Interconnection Networks and Related Work
    Thesis Contribution
2 Internal Priority MINs
    Introduction
    Related Work
    Internal Priority MIN Description and Analytical Model
    Analysis of MINs
    Definitions of MINs
    Performance Evaluation Methodology of Internal Priority MINs
    Simulation and Performance Results of Internal Priority MINs
    Conclusions for Internal Priority MINs
3 Dual Priority MINs and Asymmetric-sized Buffer Queues
    Introduction
    Related Work
    Dual Priority MIN and Analytical Model
    State Notations for High Priority Queues
    Definitions for High Priority Queues
    Mathematical Analysis for High Priority Queues
    State Notations for Low Priority Queues
    Definitions for Low Priority Queues
    Mathematical Analysis for Low Priority Queues
    Performance Evaluation Methodology of Dual Priority MINs
    Simulation and Performance Results of Dual Priority MINs
    Dual Priority MINs vs. Single Priority Ones
    Dual Priority MINs with Asymmetric-sized Buffer Queues
    Conclusions for Dual Priority MINs
4 2-Class Priority Multi-Layer MINs under Hotspot Traffic
    Introduction
    Analysis of 2-Class Priority Multi-Layer MINs under Hotspot Environment
    Performance Evaluation Parameters and Methodology of 2-Class Priority MINs under Hotspot Environment
    Simulation and Performance Results of 2-Class Priority Multi-Layer MINs under Hotspot Environment
    Simulator Validation for 2-Class Priority MINs under Hotspot Environment
    2-Class Priority Single-Layer MINs Performance under Hotspot Environment
    2-Class Priority Multi-Layer MINs Performance under Hotspot Environment
    Conclusions for 2-Class Priority Multi-Layer MINs under Hotspot Environment
5 Multi Priority MINs
    Introduction
    Analysis of Multi-Priority MINs
    Performance Evaluation Parameters and Methodology of Multi-Priority MINs
    Simulation and Performance Results of Multi-Priority MINs
    Simulator Validation for Multi-Priority MINs
    Multi-Priority MINs Performance
    Conclusions for Multi-Priority MINs
6 Multi-layer, Multi-priority MINs under Multicast Environment
    Introduction
    Multi-layer, Multi-priority MIN Description
    Configuration and Operational Parameters of Multi-layer MINs
    Performance Evaluation Metrics for Multi-layer MINs
    Metrics for Single-layer Segment of MINs
    Metrics for Multi-layer MINs
    Simulation and Performance Results for Multi-layer, Multi-priority MINs
    Simulator Validation for Multicasting
    Multicasting on Single-priority, Single-layer MINs
    Multicasting on Dual-priority, Single-layer MINs
    Multicasting on Dual-priority Multi-layer MINs
    Multicasting on Multi-layer Segment of Dual-priority MINs
    Conclusions

List of Figures

- 1.1 A classification of MINs
- A cxc Switching Element
- A 3-stage Delta Network consisting of cxc SEs
- A state transition diagram of a SE(k) buffer
- Th of single-buffered, 6-stage, single- (or non-) priority MIN
- Th of double-buffered, n-stage, internal- vs. non-priority scheme
- Th of finite-buffered (b=4), n-stage, internal- vs. non-priority scheme
- Th of finite-buffered (b=8), n-stage, internal- vs. non-priority scheme
- D of finite-buffered, 6-stage, internal- vs. non-priority scheme
- Upf of finite-buffered, 6-stage, internal- vs. non-priority scheme
- 2-class priority for 3-stage MIN consisting of 2x2 SEs
- A state transition diagram of a high priority buffer of SE(k)
- A state transition diagram of a low priority buffer of SE(k)
- Th total of finite-buffered, 10-stage, dual- vs. single-priority MINs
- RTh(h) of finite-buffered, 10-stage, dual-priority MIN
- RTh(l) of finite-buffered, 10-stage, dual-priority MIN
- D(h) of finite-buffered, 10-stage, dual-priority MIN
- D(l) of finite-buffered, 10-stage, dual-priority MIN
- RTh(h) of finite-buffered, k-stage, dual-priority MIN
- RTh(l) of finite-buffered, k-stage, dual-priority MIN
- Upf(h) of finite-buffered, 10-stage, dual-priority MIN
- Upf(l) of finite-buffered, 10-stage, dual-priority MIN
- Th total of asymmetric-sized, 10-stage, dual-priority MIN
- RTh(h) of asymmetric-sized, 10-stage, dual-priority MIN
- RTh(l) of asymmetric-sized, 10-stage, dual-priority MIN
- D(h) of asymmetric-sized, 10-stage, dual-priority MIN
- D(l) of asymmetric-sized, 10-stage, dual-priority MIN
- Upf(h) of asymmetric-sized, 10-stage, dual-priority MIN
- Upf(l) of asymmetric-sized, 10-stage, dual-priority MIN
- An 8X8 delta-2 network with hotspot traffic
- A lateral view of an 8X8 multilayer MIN
- 4.3 Total Th of dual-priority, single-buffered, 6-stage MINs
- RTh of single-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic
- RTh of dual-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic
- D of single-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic
- D of dual-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic
- Upf of single-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic
- Upf of dual-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic
- RTh of dual-priority, double-buffered, 6-stage, multi-layer MIN under hotspot traffic
- D of dual-priority, double-buffered, 6-stage, multi-layer MIN under hotspot traffic
- Upf of dual-priority, double-buffered, 6-stage, multi-layer MIN under hotspot traffic
- An (NXN) Omega Network
- A Multi-Priority (2X2) Switching Element
- Total Th of multi-priority, single/double-buffered, 10-stage MINs
- RTh of multi-priority, single-buffered, 10-stage MIN
- RTh of multi-priority, double-buffered, 10-stage MIN
- D of multi-priority, single-buffered, 10-stage MIN
- D of multi-priority, double-buffered, 10-stage MIN
- Upf of multi-priority, single-buffered, 10-stage MIN
- Upf of multi-priority, double-buffered, 10-stage MIN
- 4X4 Single-priority multi-layer MIN
- 8X8 Single-priority multi-layer MIN
- Th of single-priority MINs
- D of single-priority MINs
- Upf of single-priority MINs
- Pl of single-priority MINs
- Th of single-layer MINs (m=0, 0.10)
- Th of single-layer MINs (m=0.50)
- D of single-layer MINs (m=0.50)
- D of single-layer MINs (m=0, 0.10)
- Upf of single-layer MINs
- Pl of single-layer MINs
- 6.13 Th of multi-layer MINs (m=0.50)
- Th of multi-layer MINs (m=0.10)
- D of multi-layer MINs (m=0.50)
- D of multi-layer MINs (m=0.10)
- Upf of multi-layer MINs (m=0.50)
- Upf of multi-layer MINs (m=0.10)
- Pl of multi-layer MINs (m=0.50)
- Th of multi-layer MINs (m=0.50)
- Th of multi-layer MINs (m=0.10)
- D of multi-layer MINs
- Pl of multi-layer MINs
- Upf of multi-layer MINs (m=0.50)
- Upf of multi-layer MINs (m=0.10)

List of Algorithms

- 2.1 Send-queue process for single- and internal-priority MINs
- Switching Element process for single-priority MINs
- Switching Element process for internal-priority MINs
- Send-queue process of high priority packets for dual-priority MINs
- Send-queue process of low priority packets for dual-priority MINs
- Switching Element process for dual-priority MINs
- Input-queue process for 2-class priority MINs under hotspot environment
- Unicast forwarding for multi-priority MINs
- Send-queue process for multi-priority MINs
- Unicast/Partial forwarding for multi-layer, multi-priority MINs
- Broadcast forwarding for multi-layer, multi-priority MINs
- Send-queue process for multi-layer, multi-priority MINs

Abstract

Multistage Interconnection Networks (MINs) have been widely used as efficient interconnection structures for parallel computer systems, as well as switching nodes for high-speed communication networks. Their performance is mainly determined by their communication throughput and their mean packet delay. Although MINs are fairly flexible in handling varieties of traffic loads, they tend to saturate quickly under either hotspot or multicast/broadcast traffic, especially as the size of the network increases. As a response to this issue, multi-priority and/or multi-layer MINs have been proposed; however, their performance prediction and evaluation has not been studied sufficiently so far. In this thesis we studied the performance of finite-buffered MINs by introducing schemes that natively support different priority classes (e.g. internal, dual or multi priority). The rationale behind introducing multiple-priority schemes is to provide different QoS (Quality of Service) guarantees to traffic from different applications, which is a highly desired feature for many IP network operators, and particularly for enterprise networks. Thus, we applied unicast and multicast routing under uniform or hotspot traffic conditions, both at single- and multi-layer Switching Elements (SEs), under various offered loads, using simulations. Moreover, different test-bed setups were used in order to investigate and analyze the performance of all priority-class traffic under different Quality of Service (QoS) configurations. Finally, we introduced and calculated a universal performance factor, which weighs the importance of each of the main performance factors, namely packet throughput and delay, and we found, for example, that the use of asymmetric-sized buffered systems leads to better exploitation of network capacity, while the increases in delay can be tolerated. Consequently, the findings of this performance evaluation can be used by network designers for devising optimal configurations while setting up MINs, so as to best meet the performance and cost requirements under the anticipated traffic load and quality of service specifications. The presented results also facilitate performance prediction for multi-layer MINs before actual network implementation, through which deployment cost and rollout time can be minimized.

Extended Summary

Multistage Interconnection Networks (MINs) based on crossbar 2X2 Switching Elements commonly form the basic interconnection architecture of both processors and memory modules, mainly in multiprocessor systems [8, 45, 1]. Over the last decade, MINs have been increasingly used in building both switching systems and high-capacity communication networks, such as ATM (Asynchronous Transfer Mode) switches, Gigabit Ethernet switches and Terabit routers [46, 42, 4]. The MIN architecture is already employed in many different types of applications, covering a broad spectrum from the internal communication buses of Very Large Scale Integration (VLSI) circuits to wide-area computer networks (e.g. as interconnection elements of distributed shared-memory multiprocessor systems, or as the building block of industrial-application networking). The performance of the interconnection elements (nodes, processors, memory modules, etc.) is one of the main factors affecting the overall performance of parallel and distributed systems. As a result, much research has focused on determining the factors that affect the performance of an interconnection network, and more generally of a parallel system, as well as on the prediction and estimation of that performance. The choice of the appropriate interconnection architecture, e.g. when building a modern parallel interconnection system, plays an important role in its overall performance and depends on many factors, among them the bandwidth requirements, scalability, expandability, partitionability, physical constraints, reliability, and repairability. In the literature, MINs are classified into two main categories: time division switches and space division switches. A typical classification of MINs, covering the most widespread network categories, is shown in Diagram 1. Crossbar networks have always attracted network designers because they exhibit no internal blocking (non-blocking switches).

Diagram 1: Classification of Multistage Interconnection Networks

Nevertheless, the MIN architecture based on crossbar networks exhibits extremely high complexity in both its interconnection paths and its crosspoints, of order O(N^2), where N is the number of input or output ports of the network, preventing its application to large networks. Consequently, crossbar technology is suitable for building medium-sized switching systems that exhibit no internal blocking while also supporting self-routing. Banyan-type networks [16], on the other hand, are characterized by the fact that there is exactly one unique path between each input and output port; they support self-routing, but they exhibit internal blocking when forwarding packets whose destination queue is full. The complexity of Banyan-type networks, however, grows as N*log2(N) and is considerably lower than that of crossbar networks, making this architecture the basis for building larger switching systems. Unfortunately, all subclasses of Banyan-type networks [16] exhibit internal blocking, and as a result their performance drops rapidly as their size grows, particularly under multicast or hotspot traffic. Nevertheless, there are some ways, such as applying priorities or a multi-layer architecture, to reduce the probability of internal packet blocking down to levels acceptable to the various types of applications, where the exact determination of each application's tolerance limits is directly related to the quality of the offered service. Thus, the subclasses of Banyan-type networks, such as Omega [26], Delta [35] and Generalized Cube [2] networks, are preferred over other network types, because they are cheaper to build and easier to control.

Diagram 2: A 3-stage Delta network built from cxc Switching Elements

Diagram 2 depicts an NXN Delta network that comprises n = log_c(N) stages and is built from cxc Switching Elements (SEs), where c is the degree of each SE. According to Diagram 2, each stage comprises N/c SEs, and hence the total number of SEs in a MIN is (N/c)*log_c(N); for example, an 8X8 Delta network of 2x2 SEs has n = 3 stages of 4 SEs each, i.e. 12 SEs in total. Diagram 3 depicts an 8X8 multi-layer Delta network, which consists of two parts: the initial part, which is single-layer and consists of the SEs of the first two stages, and the following part, which has two layers and consists of the SEs of the last stage. According to Diagram 3, forwarding a packet from the second to the third stage takes place without blocking, because the packets of the second stage do not compete with each other for the same buffer position of the next stage; consequently, every unicast or multicast transmission is collision-free in the final multi-layer segment of the network. More generally, if the replication degree of the next stage i+1, denoted by l_{i+1}, equals 2*l_i, transmission from the current to the next stage takes place without collisions.

Diagram 3: An 8x8 multi-layer Delta network

Consequently, if for an n-stage MIN there exists a number nb (1 <= nb < n) such that for all k: l_{k+1} = 2*l_k (nb <= k < n), then the MIN is considered to operate collision-free over its last (n - nb) stages. Moreover, according to [50], blocking could also occur at the output of the MIN if either the multiplexer or the output line had insufficient capacity. In this thesis, however, we assume that packets exit the MIN without collisions, because the replication degree of the MIN is chosen so as to achieve the maximum possible network performance at the lowest possible construction cost.

Diagram 4: An 8x8 Delta network with hotspot traffic

Finally, Diagram 4 depicts an (8X8) MIN with a single hotspot output, which in our example is taken to be output 0 and to which all inputs (0-7) send a significant part of their traffic. Thus, according to Diagram 4,

all the SEs of the MIN can be classified into two groups, Group-hst and Group-nt, where hst denotes the SEs that receive and forward hotspot traffic, while nt denotes the SEs that receive and forward only normal (non-hotspot) traffic. Based on the above diagram, we distinguish the following categories of outputs:

- Output 0, the single hotspot output.
- Output 1, the adjacent output. Packets routed to this output conflict with packets heading to the hotspot output throughout all intermediate stages, and are free of such conflicts only when traversing the output link.
- Outputs 2 and 3, which are free of conflicts with packets routed to the hotspot output during the last stage of the MIN. These outputs are called Cold-1, precisely because they are free of conflicts with hotspot traffic for only one stage.
- Outputs 4-7, which are free of conflicts with packets routed to the hotspot output during the last two stages of the MIN, and are called Cold-2.

Generalizing to an i-stage MIN, its output ports can be classified into the following (i+1) zones: hotspot, adjacent, and cold-j (1 <= j <= i-1).

All the Multistage Interconnection Networks examined in this thesis belong to the class of Banyan switches and operate under the following assumptions:

- Routing is performed as a pipelined process, in the sense that it is executed in parallel at every stage. The internal clocks are synchronized on a slotted time model [44], and the 2X2 Switching Elements (SEs) have a strictly deterministic packet service time.
- Packet arrivals at each network input form a simple Bernoulli process; that is, the probability of a packet arriving within a clock cycle is constant, and arrivals are mutually independent. Moreover, upon arrival of a packet at an input (i.e., at a queue of a first-stage SE), if the buffer of that queue is full the packet is discarded.
- A packet is considered blocked in a queue of the current stage if its destination buffer at the next stage is full.

- When two packets at a stage compete for the same buffer position of the destination queue at the next stage, because there is not enough space to store both packets, a collision occurs; when no priorities are supported, one of the two packets is accepted into the queue at random, while the other remains blocked at the current stage. Under the proposed internal-priority routing scheme, where priority is given to the packets of the queues with the larger population, precedence in collision situations is given to those packets originating from SEs whose transmission-queue occupancy is larger than that of the corresponding queues of the competing packets.

In network implementations that natively support multiple priorities during packet forwarding, these priorities reflect the quality of the offered service. The SEs of such MINs are built with a separate buffer queue per priority class, and each packet is tagged at the network input with the appropriate priority before being driven to the queue of the corresponding class. During conflict resolution the priority of each packet is taken into account, where obviously packets of the highest priority precede in routing those of lower priority.

When hotspot traffic is supported, an initial fraction f_hs of the total load at each MIN input is routed to a specific hotspot output, which in all our case studies is taken to be output 0. In a network supporting multiple priorities, hotspot traffic is characterized from the outset as low-priority traffic. Thus, in a MIN supporting two priority classes (dual-priority scheme), the remaining packets (a fraction 1 - f_hs of the load) may be characterized as either high or low priority and are distributed uniformly over all outputs. This means that any output of the MIN, except of course the hotspot one, has exactly the same probability of being a packet's destination address. Beyond the traffic characterized from the outset as hotspot, the remaining input rate (1 - f_hs of the load) is routed uniformly to all output ports, including the hotspot one; thus an additional load of (1 - f_hs)/N of the offered load is forwarded to the hotspot output, comprising both low- and high-priority packets.

When multicast traffic is supported, the header of each packet includes two equally sized fields, the Routing Address (RA) and the Multicast Mask (MM), each of which occupies n bits, where n is the total number of stages of the MIN. Thus, upon arrival of a packet at an SE of stage k, the corresponding bit of the MM is examined first, and if it is found equal to

1, the packet performs a multicast instead of a unicast transmission, copying itself to both outputs of the SE. Conversely, when that MM bit is found to be 0, then and only then is the corresponding RA bit examined, so that the packet follows the appropriate unicast transmission. Obviously, when all the MM bits of a packet are zero, the packet follows a simple (unicast) path, arriving at one specific output. At the other extreme, when all the MM bits are set to one, the packet is transmitted to all outputs of the MIN (broadcast). In all other cases the packet is forwarded to a group of output ports constituting its Multicast Group (MG). Finally, all packets at the input ports contain both the data to be transferred and the routing tags. As soon as packets reach their destination address they are immediately removed from the MIN, so that no packet blocking takes place while they traverse the last stage of the MIN.

To evaluate an (N X N) MIN consisting of n = log_c(N) intermediate stages of (cxc) SEs, we use the following metrics. Let T be a relatively large time interval divided into u discrete time slots (t_1, t_2, ..., t_u).

Average throughput Th_avg is the average number of packets accepted at all output destinations per network cycle. This metric is also referred to as bandwidth. Th_avg can be defined as

    Th_avg = lim_{u -> inf} [ sum_{k=1}^{u} n_a(k) ] / u    (1)

where n_a(k) denotes the number of packets that reach their destinations during the k-th time slot.

Normalized throughput Th is the ratio of the average throughput Th_avg to the total number N of output ports of the network. Thus Th can be expressed by the formula

    Th = Th_avg / N    (2)

Average packet delay D_avg is the average time a packet needs to traverse the network, and can be expressed by the formula

    D_avg = lim_{u -> inf} [ sum_{k=1}^{n_a(u)} t_d(k) ] / n_a(u)    (3)

where n_a(u) denotes the total number of packets accepted during the u time slots and t_d(k) represents the total delay of the k-th packet. In more detail, we consider t_d(k) = t_w(k) + t_tr(k), where t_w(k) denotes the total queueing delay incurred at all the queues the k-th packet passes through, while it waits until a

18 åíäéüìåóçò ìíþìçò ôïõ åðüìåíïõ óôáäßïõ ãéá ôçí ðñïþèçóç ôïõ åí ëüãù ðáêýôïõ. Ï äåýôåñïò üñïò t tr (k) äçëþíåé ôç óõíïëéêþ êáèõóôýñçóç ðïõ áðáéôåßôáé ãéá ôçí äéüäïóç ôïõ k th ðáêýôïõ óå êüèå åíäéüìåóï óôüäéï ôïõ äéêôýïõ êáé ôï ïðïßï éóïýôáé ìå n nc, üðïõ n = log 2 N åßíáé ï áñéèìüò üëùí ôùí åíäéüìåóùí óôáäßùí, åíþ ôï nc óõìâïëßæåé ôï ñüíï åíüò äéêôõáêïý êýêëïõ. Normalized packet delay D åßíáé ôï ðçëßêï ôïõ D avg ðñïò ôçí åëü éóôç êáèõóôýñçóç åíüò ðáêýôïõ ç ïðïßá èåùñåßôáé üôé åßíáé ç êáèõóôýñçóç äéüäïóçò n nc (ð.., üôáí ç êáèõóôýñçóç óå üëåò ôéò ïõñýò åßíáé ìçäåíéêþ). ôóé, ôï D ìðïñåß íá ïñéóôåß ùò D = D avg (4) n nc Universal performance factor Upf êáèïñßæåôáé áðü ìéá ó Ýóç ðïõ ðåñéëáìâüíåé ôïõò äõï ðáñáðüíù êýñéïõò ðáñüãïíôåò áðüäïóçò D êáé T h. Ç áðüäïóç åíüò ÐÓÄ èåùñåßôáé âýëôéóôç üôáí ï ðáñüãïíôáò D åëá éóôïðïéåßôáé êáé ï ðáñüãïíôáò T h ìåãéóôïðïéåßôáé, Ýôóé ç öüñìïõëá ãéá ôïí õðïëïãéóìü ôïõ Upf ìðïñåß íá åêöñáóôåß ìå ôçí áêüëïõèç ó Ýóç Upf = w d D 2 + w th 1 (5) Th 2 üðïõ ïé äåßêôåò w d êáé w th äçëþíïõí ôéò áíôßóôïé åò âáñýôçôåò ôùí óõíéóôùóþí ðáñáãüíôùí ôïõ Upf, êáèïñßæïíôáò Ýôóé åðáêñéâþò ôç óðïõäáéüôçôá ôçò êüèå óõíéóôþóáò ìå ôï ðåñéâüëëïí ëåéôïõñãßáò. Óõíåðþò ç áðüäïóç åíüò ÐÓÄ ìðïñåß íá åêöñáóôåß áðü ìéá óõíäõáóôéêþ ìïíüäá ìýôñçóçò üðïõ ðñïöáíþò üôáí ï ðáñüãïíôáò packet delay ãßíåôáé ìéêñüôåñïò Þ/êáé ï ðáñüãïíôáò throughput ãßíåôáé ìåãáëýôåñïò ïé ôéìýò ôïõ Upf ìéêñáßíïõí, Ýôóé þóôå ìéêñüôåñåò ôéìýò ãéá ôï Upf íá õðïäçëþíïõí êáëýôåñç óõíïëéêþ áðüäïóç ãéá ôï ÐÓÄ. ÅðåéäÞ ïé áíùôýñù óõíéóôþóåò D êáé T h Ý ïõí äéáöïñåôéêýò êëßìáêåò ôéìþí, ìå ôçí êáíïíéêïðïßçóþ ôïõò ðñïêýðôåé Ýíá êïéíü ðåäßï áíáöïñüò. Ç êáíïíéêïðïßçóç ðñáãìáôïðïéåßôáé ìå ôç äéáßñåóç ôçò ôéìþò ôïõ êüèå ðáñüãïíôá ìå ôçí áëãåâñéêþ ôïõ åëü éóôç Þ ìýãéóôç ôéìþ ðïõ ï êüèå ðáñüãïíôáò ëáìâüíåé. ôóé, ç åîßóùóç 2 ìðïñåß íá áíôéêáôáóôáèåß ìå ôçí áêüëïõèç ðáñüóôáóç: ( ) D D min 2 ( Th Upf = w d D min + w th max ) Th 2 (6) Th üðïõ D min åßíáé ç åëü éóôç ôéìþ ôïõ normalized packet delay (D) êáé Th max ç ìýãéóôç ôéìþ ôïõ normalized throughput. ÅðïìÝíùò, üôáí ï óõíäõáóôéêüò ðáñüãïíôáò universal performance factor Upf, ôåßíåé óôï 0, ç áðüäïóç ôïõ ÐÓÄ èåùñåßôáé âýëôéóôç, åíþ üôáí ï ðáñüãïíôáò Upf áõîüíåé ç óõíïëéêþ áðüäïóç ôïõ ÐÓÄ èåùñåßôáé üôé åéñïôåñýåé. ÔÝëïò, ëáìâüíïíôáò õðüøç üôé ïé äýï ôéìýò ôùí óõíéóôùóþí delay êáé throughput ðïõ óõììåôý ïõí óôïõò áíùôýñù ôýðïõò åßíáé êáíïíéêïðïéçìýíåò éó ýåé D min = Th max = 1, êáé åðïìýíùò ç ðáñáðüíù ðáñüóôáóç ìðïñåß íá áðëïðïéçèåß ìå ôçí ðáñáêüôù åîßóùóç Upf = w d (D 1) 2 + w th 15 ( ) 1 Th 2 (7) Th

All the above equations can be suitably adapted so as to hold also for MINs supporting priorities, hotspot traffic, multicast traffic, or even multi-layer SEs. In this thesis we compute both the constituent parameters and the overall performance of MINs by developing special-purpose simulation programs in C++, which can operate under different operating environments. The performance evaluation of a MIN is carried out mainly by applying simulation techniques rather than full mathematical modeling [13], owing to the high complexity arising from the combination of various parameters, such as the incorporation of multiple priorities, the support of multicast or hotspot traffic, and the use of multiple buffers in the SEs. The simulation programs support various transmission types: for example, full-multicast transmission, where a packet is forwarded if and only if both queues of the next stage have sufficient buffer space to store both copies, and partial-multicast transmission (see the forwarding algorithm below), where a packet may be serviced either fully, i.e. in both directions, or partially, i.e. in only one direction, remaining at the head of the transmission queue until it is serviced in the second direction as well. When simulating the operating environment of a MIN, several parameters are taken into account, such as the buffer length, the number of input and output ports, the number of stages, the offered load, the multicast ratio, the initial hotspot fraction, the number of priority classes and their ratios, and the number of layers.

Algorithm: broadcast forwarding of multi-priority packets in multi-layer MINs

Broadcast_Forwarding(cs_id, cl_id, nl_id, sq_id, uq_id, lq_id, pr_id, mp, bm)

Input: current stage (cs_id); layer of the current and the next stage (cl_id, nl_id) for the send and destination queues respectively; send queue (sq_id) of the current stage; next-stage queues of the upper and lower output port respectively (uq_id, lq_id); priority class (pr_id); multicast routing policy (mp); blocking mechanism (bm).

Output: population of the send and destination queues (Pop); total number of packets of each send queue that were serviced or blocked respectively (Serviced, Blocked); total number of delay cycles for each send queue (Delay); packet routing address RA at each buffer position; partial-service flag of multicast packets at the head of each send queue (PS).

    {
      if (Pop[uq_id][cs_id+1][nl_id][pr_id] == B) or (Pop[lq_id][cs_id+1][nl_id][pr_id] == B)
      // blocking state: at least one destination buffer (capacity B) is full
      {
        Blocked[sq_id][cs_id][cl_id][pr_id] = Blocked[sq_id][cs_id][cl_id][pr_id] + 1;
        if (mp == full) and (bm == blm)  // block-and-lost mechanism: drop the head packet
        {
          Pop[sq_id][cs_id][cl_id][pr_id] = Pop[sq_id][cs_id][cl_id][pr_id] - 1;
          for (bf_id = 1; bf_id <= Pop[sq_id][cs_id][cl_id][pr_id]; bf_id++)
            RA[sq_id][cs_id][cl_id][pr_id][bf_id] = RA[sq_id][cs_id][cl_id][pr_id][bf_id + 1];
          // where RA is the routing address of the packet held in the
          // (bf_id)-th buffer position of the send queue
        }
      }
      if (Pop[uq_id][cs_id+1][nl_id][pr_id] < B) and (Pop[lq_id][cs_id+1][nl_id][pr_id] < B)
      {
        // broadcast forwarding: copy the head packet to both next-stage queues
        Serviced[sq_id][cs_id][cl_id][pr_id] = Serviced[sq_id][cs_id][cl_id][pr_id] + 1;
        Pop[sq_id][cs_id][cl_id][pr_id] = Pop[sq_id][cs_id][cl_id][pr_id] - 1;
        Pop[uq_id][cs_id+1][nl_id][pr_id] = Pop[uq_id][cs_id+1][nl_id][pr_id] + 1;
        Pop[lq_id][cs_id+1][nl_id][pr_id] = Pop[lq_id][cs_id+1][nl_id][pr_id] + 1;
        RA[uq_id][cs_id+1][nl_id][pr_id][Pop[uq_id][cs_id+1][nl_id][pr_id]] = RA[sq_id][cs_id][cl_id][pr_id][1];
        RA[lq_id][cs_id+1][nl_id][pr_id][Pop[lq_id][cs_id+1][nl_id][pr_id]] = RA[sq_id][cs_id][cl_id][pr_id][1];
        for (bf_id = 1; bf_id <= Pop[sq_id][cs_id][cl_id][pr_id]; bf_id++)
          RA[sq_id][cs_id][cl_id][pr_id][bf_id] = RA[sq_id][cs_id][cl_id][pr_id][bf_id + 1];
        PS[sq_id][cs_id][cl_id][pr_id] = 1;  // re-initialize the partial-service flag
      }
      if (mp == partial)  // partial-multicast routing policy
      {
        if (Pop[uq_id][cs_id+1][nl_id][pr_id] < B) and (Pop[lq_id][cs_id+1][nl_id][pr_id] == B)
        {
          // partial-multicast service towards the upper port only
          Pop[uq_id][cs_id+1][nl_id][pr_id] = Pop[uq_id][cs_id+1][nl_id][pr_id] + 1;
          RA[uq_id][cs_id+1][nl_id][pr_id][Pop[uq_id][cs_id+1][nl_id][pr_id]] = RA[sq_id][cs_id][cl_id][pr_id][1];
          PS[sq_id][cs_id][cl_id][pr_id] = 0;  // head packet marked as partially serviced
        }
        if (Pop[uq_id][cs_id+1][nl_id][pr_id] == B) and (Pop[lq_id][cs_id+1][nl_id][pr_id] < B)
        {
          // partial-multicast service towards the lower port only
          Pop[lq_id][cs_id+1][nl_id][pr_id] = Pop[lq_id][cs_id+1][nl_id][pr_id] + 1;
          RA[lq_id][cs_id+1][nl_id][pr_id][Pop[lq_id][cs_id+1][nl_id][pr_id]] = RA[sq_id][cs_id][cl_id][pr_id][1];
          PS[sq_id][cs_id][cl_id][pr_id] = 1;
        }
      }
      Delay[sq_id][cs_id][cl_id][pr_id] = Delay[sq_id][cs_id][cl_id][pr_id] + Pop[sq_id][cs_id][cl_id][pr_id];
      return Pop, Serviced, Blocked, Delay, RA, PS;
    }

Internally, each SE contains an array of p non-shared buffer queue pairs, where p expresses the total number of priority classes; for each class's queue pair, one buffer is used exclusively by the upper and the other exclusively by the lower bank of queues. In all cases of buffer use, operation is based on the FCFS (First Come First Served) principle. Moreover, in cases of contention for a buffer position when multiple priorities are supported, packets of the highest priority class take precedence over all others. All the transmission case studies based on the simulation technique were executed by forwarding fixed-length packets within fixed time intervals, where each transmission cycle is taken to be the time required to forward one or two copies of a packet in the case of unicast or broadcast transmission respectively. During the experiments, values are collected for all the MIN performance measures under examination, specifically packet throughput, packet delay, and packet loss probability; their validity is ensured by extensive experiment runs in which statistics of all measures are collected over 10^5 clock cycles. This number of iterations was chosen so as to guarantee a stable operating environment for the MIN; during the initial phase of each experiment, specifically the first 10^3 cycles, no data are collected, because the network is considered not to have yet reached its steady state.

The Internet has become a common ground over which a huge volume of data moves. Next-generation network architectures attempt to achieve high bandwidth with low packet-transfer delays for end users. Although MINs can be used to build next-generation networks in a flexible and reliable way under various input loads, they nevertheless tend to saturate quickly under hotspot or multicast/broadcast traffic, especially as their size increases. As a solution to this problem, the use of multi-layer MINs is proposed, whose behavior and performance estimation have not been sufficiently studied.

22 ôóé, ç óõíåéóöïñü áõôþò ôçò äéáôñéâþò åßíáé äéðëþò óçìáóßáò. Áðü ôçí ìßá ðëåõñü ðáñý ïõìå Ýíá åõñý öüóìá óôáôéóôéêþí áíáëýóåùí êáé åêôéìþóåùí ôçò áðüäïóçò åíüò ÐÓÄ ôá ïðïßá ìðïñïýí íá ñçóéìïðïéçèïýí ùò âüóç áðü ôïõò ó åäéáóôýò äéêôýùí óôçí êáôáóêåõþ áðïôåëåóìáôéêþí äéêôõáêþí óõóôçìüôùí ìåéþíïíôáò ôï êüóôïò áíüðôõîþò ôïõò. Áðü ôçí Üëëç ðëåõñü, áíáðôýóóïõìå áðïäïôéêïýò áëãüñéèìïõò ãéá ôç ìåëýôç ÐÓÄ êüôù áðü äéáöïñåôéêýò óõíèþêåò ëåéôïõñãßáò üðùò hotspot öïñôßï, multicast êáé broadcast äñïìïëüãçóç óå ðïëëáðëþí ðñïôåñáéïôþôùí ó Þìáôá åöáñìüæïíôáò ðïëõåðßðåäåò óõóôïé ßåò ÓÄ. Ðéï óõãêåêñéìýíá óôï äåýôåñï êåöüëáéï ðñïôåßíåôáé Ýíá êáéíïôüìï ó Þìá åóùôåñéêþòðñïôåñáéüôçôáò (internal-priority) ãéá ôçí äñïìïëüãçóç ôùí ðáêýôùí ìýóá óôï ÐÓÄ. Ôï ðñïôåéíüìåíï ó Þìá ëáìâüíåé õðüøç ôïí ôñý ïíôá ðëçèõóìü ôùí ïõñþí áðïóôïëþò, äßíïíôáò ðñïôåñáéüôçôá óôá ðáêýôá åêåßíùí ôùí ïõñþí ìå ôï ìåãáëýôåñï ðëçèõóìü. Ôï óêåðôéêü áõôþò ôçò ðñïóýããéóçò âáóßæåôáé óôï ãåãïíüò üôé ìå ôçí áðïóõìöüñçóç ôùí ìåãáëýôåñùí ïõñþí óå ðëçèõóìü, ç ðéèáíüôçôá íá âñåèåß ãåìüôç ìéá ïõñü åëáôôþíåôáé, ìå áðïôýëåóìá íá áðïññßðôïíôáé ëéãüôåñá ðáêýôá åîáéôßáò ôçò Ýëëåéøçò þñïõ åíäéüìåóçò ìíþìçò áðïèþêåõóçò. ôóé, óå ðåñéðôþóåéò óõãêñïýóåùí ðáêýôùí ãéá íá ìðïñåß íá ëçöèåß õðüøç ìéá ôýôïéá áðüöáóç åßíáé áðáñáßôçôï êáôü ôç ëþøç ôùí ðáêýôùí óôïõò ÓÄ íá ãíùóôïðïéåßôáé ï ðëçèõóìüò ôçò áíôßóôïé çò ïõñüò åêðïìðþò, äçëáäþ íá ðáñý åôáé ìéá åðéðëýïí ðëçñïöïñßá ç ïðïßá äåí åßíáé äéáèýóéìç óå ìéá ôõðéêþ áñ éôåêôïíéêþ ÐÓÄ. ÅðïìÝíùò, ïé ÓÄ ðïõ ëåéôïõñãïýí ìå ôï ðñïôåéíüìåíï åóùôåñéêþò-ðñïôåñáéüôçôáò (internalpriority) ó Þìá åíóùìáôþíïõí êáôü ôçí áðïóôïëþ êüèå ðáêýôïõ ôçí áíôßóôïé ç ðëçñïöïñßá ãéá ôï ìýãåèïò ôçò ïõñüò áðïóôïëþò (length of their transmission packet queue) óå Ýíá ðñïïßìéï (preamble) ðåäßï óôçí áñ Þ ôïõ ðáêýôïõ. ôóé üôáí ï ÓÄ ôïõ åðüìåíïõ óôáäßïõ áíé íåýóåé êáôüóôáóç óýãêñïõóçò óõãêñßíåé ôá áíôßóôïé á ìåãýèç ôùí ïõñþí (queue sizes) åêðïìðþò êáé ðáñá ùñåß ðñïôåñáéüôçôá óôá ðáêýôá ìå ôçí õøçëüôåñç ôéìþ óôï ðñïïßìéï (preamble) ðåäßï. Ôï ãåãïíüò áõôü áíáìýíåôáé íá áõîþóåé ôçí óõíïëéêþ áðüäïóç ôïõ ÐÓÄ. Ãéá ôïõò äýï óçìáíôéêüôåñïõò ðáñüãïíôåò áðüäïóçò åíüò ÐÓÄ, äçëáäþ ôï åýñïò æþíçò (packet throughput) êáé ôï ìýóï ñüíï (mean time) ðïõ áðáéôåßôáé ãéá Ýíá ðáêýôï íá äéáó ßóåé ôï äßêôõï, óõãêåíôñþíïíôáé óõãêñéôéêü óôáôéóôéêü óôïé åßá ìåôáîý ôùí äýï áñ éôåêôïíéêþí, äçëáäþ ôïõ åóùôåñéêþò-ðñïôåñáéüôçôáò (internal-priority) êáé ôïõ áðëþò ðñïôåñáéüôçôáò (single-priority) ó Þìáôïò. ÔÝëïò, ðñïóäéïñßæïíôáé ðïóïôéêü êáé ðáñïõóéüæïíôáé ôá êýñäç ðïõ ðñïêýðôïõí áðü ôç ñþóç åóùôåñéêþò äñïìïëüãçóçò üóïí áöïñü ôüóï ôï åýñïò æþíçò (throughput) üóï êáé ôïí óõíïëéêü ðáñüãïíôá áðüäïóçò (combined performance indicator). Óôï ôñßôï êåöüëáéï åéóüãåôáé ìéá ðáñáëëáãþ ôùí ÐÓÄ äéðëþò åíäéüìåóçò ìíþìçò (double-buered) ìå ôç ñþóç áóýììåôñçò êáôáíïìþò ôçò óôïõò ÓÄ ôïõ ÐÓÄ (asymmetric buer sizes) ìå óêïðü ôçí ðáñï Þ äéáöïñåôéêþí ðáñå üìåíùí õðçñåóéþí (qualityof-service) óôá ðáêýôá áíüëïãá ìå ôçí ðñïôåñáéüôçôá ðïõ áõôü äéáèýôïõí, åíþ ðáñüëëçëá ðáñý åôáé ç ìýãéóôç äõíáôþ áðüäïóç ôïõ óõóôþìáôïò. Óçìåéþíåôáé üôé ôï óõãêåêñéìýíï ìýãåèïò åíäéüìåóçò ìíþìçò (buer size) åðéëý èçêå, äéüôé ìå âüóç ôá ðåéñüìáôü ìáò áðïäåß èçêå üôé ðáñý åé ôç âýëôéóôç óõíïëéêþ áðüäïóç. Ðéï óõãêåêñéìýíá ðáñáôçñþèçêå 19

that for smaller buffer sizes (buffer size = 1) the network throughput values drop owing to the increase in blocking probabilities, whereas for larger buffer sizes (buffer size = 4, 8) the packet delay increases considerably, along with the hardware cost of the SEs. The use of asymmetric buffer sizes was found to lead to better exploitation of the network's resources and of its capacity in general, because more buffer space is available to the lower-priority packets, whose population is considered to be much higher. More specifically, comparing the performance of an asymmetric-buffer-size MIN with a corresponding typical MIN built with equal-sized buffers in all SEs, it was found that the former architecture provides better overall throughput, considering all packets in aggregate, and, more importantly, better performance in both packet throughput and delay for the lower-priority packets. Conversely, the performance of the higher-priority packets was found not to change significantly. Thus, the use of asymmetric-sized buffers achieves significant gains in the overall performance of the MIN, because the shape of the total load matches this kind of asymmetric buffer distribution better.

In the fourth chapter, the performance of a dual-priority (2-class priority) MIN is examined under hotspot traffic conditions and different packet arrival rates at the input (offered load). In addition, we take into account the differences in the performance of the MIN's output ports under hotspot traffic, as specified by [37], according to which the performance of each output port depends on the number of overlaps of its path with the path to the hotspot output. In more detail, we present the statistical measurements emerging from extensive experiments for the two most important performance factors of the MIN, throughput and delay; we also compute and analyze the combined Universal performance factor using different weights for the two constituent parameters, thereby expressing the importance of each parameter in determining the overall performance of the MIN. Moreover, we establish that while MINs are fairly flexible in handling various input loads, their performance degrades considerably under hotspot traffic conditions, especially in larger networks. As a response to the tree saturation problem, applying priorities to the packets of the various applications provides better QoS to the higher-priority services. Finally, the reason a novel multi-layer MIN is proposed is the further improvement of the performance of the lower-priority packets, which is in fact severely degraded. Based on the statistical data resulting from the application of multi-layer MINs, and seeking to balance between offered performance and construction cost, we propose a novel 4-layer

MIN, which we found to dramatically improve the performance of both the hotspot traffic and the other low-priority packets in the various output zones.

In the fifth chapter we extend the application of priorities in a MIN by employing an architecture that natively supports multiple priorities during packet routing. In addition, we analyze the performance of such a MIN using, in the construction of the SEs, not only single-buffered queues but also double-buffered ones, aiming at optimizing the overall network performance and providing better quality of service (QoS). It is a fact that in recent years the number and range of applications running either on the Internet or on various enterprise IP networks has increased dramatically. The spectrum of applications includes interactive programs (e.g. telnet and instant messaging), bulk data transfers (e.g. ftp and P2P file downloads), corporate databases (e.g. database transactions), and real-time applications (e.g. voice and video streaming). The goal of this study is to provide network designers with useful data and insight on how applying multiple priorities to the packets of the various applications affects the quality of the corresponding class of offered service and, more generally, the overall performance of the network. Based on these results, network designers are in a position to determine more efficiently how to apply priorities to the above applications so as to best match the corporate operating environment, satisfy the requirements of end users, and exploit network resources to the maximum possible extent. Finally, estimating the performance of networks that natively support multiple priorities before constructing real network systems minimizes their development cost and rollout time.

In the sixth and final chapter, all previous studies estimating the performance of a MIN (e.g. [38]) are extended so as to include the use of multi-priority architectures combined with the application of multi-layer SEs in multicast traffic environments. It is a fact that MIN technology constitutes a prominent approach to implementing Next Generation Networks (NGNs), with a cost/performance ratio that is considered attractive. Nevertheless, it has been established that the performance of MINs degrades considerably when a substantial multicast load is introduced at the input, and for this reason the use of multi-layer SEs is proposed. Thus, in this chapter the performance of such systems is studied and analyzed thoroughly under various operating conditions, such as the input loads and the multicast packet ratios, applying as routing the partial multicast technique presented in [51], given that this technique provides better performance than the full multicast mechanism [56], where a packet is copied and transmitted to both of its outputs

25 åüí êáé ìüíï åüí åßíáé äéáèýóéìåò êáé ïé äýï åíäéüìåóåò ìíþìåò ðñïïñéóìïý ôïõ. ÔÝëïò, ç ñþóç ðïëëáðëþí ðñïôåñáéïôþôùí óå ÓÄ ðïõ äéáèýôïõí ïõñýò ìå åíäéüìåóç ìíþìç äýï èýóåùí (double-buered queues) ðáñý åé êáëýôåñç ðïéüôçôá õðçñåóéþí (QoS) óôá ðáêýôá óõãêñéíüìåíç ìå ôçí õëïðïßçóç ôïõ [51], üðïõ ñçóéìïðïéïýíôáé ïõñýò ìå åíäéüìåóç ìíþìç ìéáò èýóåùò (single-buered queues). Ôá áðïôåëýóìáôá áõôþò ôçò áíüëõóçò äýíáôáé íá ñçóéìïðïéçèïýí ãéá ôï ó åäéáóìü âýëôéóôùí ó çìüôùí áñ éôåêôïíéêþí ÐÓÄ, þóôå íá åðéôõã Üíåôáé ï êáëýôåñïò óõíäõáóìüò ôùí áðáéôþóåùí áðüäïóçò êáé êüóôïõò, êüôù áðü äéüöïñá ðñïóäïêþìåíá öïñôßá óôçí åßóïäï ôïõ äéêôýïõ, ðáñý ïíôáò ôçí áðáéôïýìåíç ðïéüôçôá óôéò ðñïóöåñüìåíåò õðçñåóßåò. Ç ðáñïýóá äéáôñéâþ ìåëåôü ôéò ÐñïçãìÝíåò Õðçñåóßåò Äéáäéêôýïõ êáé ôç Âåëôéóôïðïßçóç ôùí Ìç áíéóìþí Åðéêïéíùíßáò ôïõò, åóôéáæüìåíç êõñßùò óôï åðéêïéíùíéáêü õðüâáèñï ðïõ ðñýðåé áõôýò íá äéáèýôïõí ðñïêåéìýíïõ íá åßíáé åöéêôþ ç äéáóöüëéóç êáé ðñïóöïñü êáôüëëçëçò ðïéüôçôáò õðçñåóßáò óôéò åöáñìïãýò êáé óõíáêüëïõèá óôïõò ñþóôåò. Åðéðñüóèåôá, óôï ðëáßóéï áõôü åîåôüæåôáé êáé ç éêáíüôçôá áíôáðüêñéóçò ôùí õðïäïìþí óôçí êëéìüêùóç ôçò ñïþò ôùí äåäïìýíùí, ðïõ ãåíéêþò ïöåßëåôáé åßôå óôçí êáôáíüëùóç õðçñåóéþí õøçëüôåñçò ðïéüôçôáò (ð.. âßíôåï õøçëüôåñçò åõêñßíåéáò) åßôå óôçí áýîçóç ôïõ ðëþèïõò ôùí ôåëéêþí ñçóôþí. Ðéï óõãêåêñéìýíá, ç åí ëüãù äéáôñéâþ åóôéüóôçêå óôá áêüëïõèá èýìáôá: Óôçí êáôçãïñéïðïßçóç ôùí ðïëõâüèìéá óõíäåäåìýíùí äéêôýùí, ôå íïëïãßá ðïõ åõñýôáôá ñçóéìïðïéåßôáé óå õðïäïìýò õðïóôþñéîçò êáôáíåìçìýíùí åöáñìïãþí. Óôç ìåëýôç ôçò áðüäïóçò ôùí ðïëõâüèìéá óõíäåäåìýíùí äéêôýùí ìå åóùôåñéêþ ðñïôåñáéüôçôá. Óôç ìåëýôç ôçò áðüäïóçò ôùí ðïëõâüèìéá óõíäåäåìýíùí äéêôýùí ìå äýï êëüóåéò ðñïôåñáéïôþôùí êáé áóýììåôñç êáôáíïìþ ôùí åíäéüìåóùí ìíçìþí. Óôç ìåëýôç ôçò áðüäïóçò ôùí ðïëõâüèìéá óõíäåäåìýíùí äéêôýùí ìå äýï êëüóåéò ðñïôåñáéïôþôùí êáé óçìåßá Ýíôïíïõ êõêëïöïñéáêïý öüñôïõ (hotspots). Óôç ìåëýôç ôçò áðüäïóçò ôùí ðïëõâüèìéá óõíäåäåìýíùí äéêôýùí ìå ðïëëáðëýò êëüóåéò ðñïôåñáéïôþôùí. Óôç ìåëýôç ôçò áðüäïóçò ôùí ðïëõâüèìéá óõíäåäåìýíùí äéêôýùí ìå ðïëëáðëü åðßðåäá êáé ðïëëáðëýò êëüóåéò ðñïôåñáéïôþôùí, åéäéêüôåñá ãéá ôïí åéñéóìü ðïëëáðëþí åêðïìðþí. Ç âåëôßùóç ôçò áðüäïóçò ôùí ðïëõâüèìéá óõíäåäåìýíùí äéêôýùí, ç åöáñìïãþ êáéíïôüìùí áñ éôåêôïíéêþí ðïõ íá õðïóôçñßæïõí ðñùôïãåíþò ðïëëáðëýò êëüóåéò ðñïôåñáéïôþôùí, ç ìåëýôç ôçò óõìðåñéöïñüò ôïõò óå êáôáóôüóåéò êõñßùò Ýíôïíïõ êõêëïöïñéáêïý öüñôïõ Þ/êáé ðïëëáðëþí åêðïìðþí, êáèþò êáé ç ñþóç íýùí ó çìüôùí ìå ðïëëáðëü åðßðåäá óôïé åéùäþí äéáêïðôþí áðïôåëïýí ðëýïí ôéò áíáãêáßåò ðñïûðïèýóåéò ãéá ôçí åäñáßùóç ôïõò óôá êõñßáñ á äßêôõá åðüìåíçò ãåíéüò. ôóé, óôá ðëáßóéá áõôþò ôçò äéáôñéâþò 22

- Models of multistage interconnection networks are proposed that extend the existing ones with additional states and transitions, to further increase modeling accuracy.
- The asymmetric distribution of buffers in multi-priority networks is proposed, so that the buffer sizes correspond proportionally to the volumes of packets moved in each priority class.
- The use of an internal-priority mechanism in multistage interconnection networks is proposed, which can improve network performance.
- A new metric for evaluating the performance of communication infrastructures is proposed, which combines the two most important performance factors of these infrastructures, throughput and delay, and allows the designer to specify the importance of each individual factor within this single metric.
- Finally, extensive and detailed simulation-based evaluations are provided for the behavior and performance of all the categories of multistage interconnection networks mentioned above. These evaluations constitute a significant aid for infrastructure designers in appropriately dimensioning the necessary hardware, so as to cover their needs at the lowest possible cost.

Chapter 1

Introduction

1.1 Classification of Multistage Interconnection Networks and Related Work
1.2 Thesis Contribution

1.1 Classification of Multistage Interconnection Networks and Related Work

Multistage Interconnection Networks (MINs) with crossbar Switching Elements (SEs) are frequently employed in multiprocessor computer architectures for interconnecting processors and memory modules [8, 45, 1]. They are also increasingly used for implementing the switching fabric of high-capacity communication networks, such as Asynchronous Transfer Mode (ATM) switches, Gigabit Ethernet switches and Terabit routers [46, 42, 4]. MINs have already been used for many different applications, ranging from internal buses in Very Large Scale Integration (VLSI) circuits to wide area computer networks - e.g. as interconnection elements for distributed shared-memory multiprocessors or as network components for industrial applications. The performance of the communication infrastructure that interconnects the system's elements (nodes, processors, memory modules etc.) has been recognized as a critical factor for overall system performance, both in the context of parallel and of distributed systems. As a result, much research has been conducted aiming to identify the factors that affect the communication infrastructure's performance and to provide models for performance prediction and evaluation. The choice of the appropriate interconnection architecture for a modern parallel computing system plays a major role in its overall performance, and depends on many different factors, among them its performance requirements, scalability, incremental expandability, partitionability, distance span, physical constraints, reliability and repairability.

In the literature, Multistage Interconnection Networks (MINs) have mainly been classified into two classes: time division switches and space division ones. A simple classification of switching fabrics that includes most of the proposed approaches is illustrated in figure 1.1.

Figure 1.1: A classification of MINs

Crossbar switches have always been attractive to network designers because they are internally non-blocking. However, crossbar designs have a complexity in paths or crosspoints that grows as a function of N^2, where N is the number of input or output ports, so they do not scale well to large sizes. Consequently, they are chiefly useful for the construction of non-blocking, self-routing switching elements and switches of modest size. Banyan switches [16], on the other hand, are blocking, multistage, self-routing interconnection networks characterized by the fact that there is exactly one unique path from each input port to each output port. Because this kind of interconnection has a smaller complexity in paths and Switching Elements (SEs), of the order of N*log2(N), it is much more suitable than the crossbar structure for the construction of larger switching fabrics. Unfortunately, all network subclasses [26, 35, 2] with the Banyan [16] property are internally blocking networks, and thus their performance degrades rapidly as their size increases, especially under hotspot or multicast traffic. Nevertheless, there are several ways to reduce their internal blocking to levels acceptable to applications of different QoS requirements (e.g. using internal buffering in multi-priority schemes or applying multi-layer SEs). MINs with the Banyan [16] property, such as Omega Networks [26], Delta Networks [35], and Generalized Cube Networks [2], are generally preferred over non-Banyan MINs, since the latter are in general more expensive and more complex to control.
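To put the two growth rates side by side, the short C++ sketch below (illustrative only, not from the thesis) contrasts the roughly N^2 crosspoints of a crossbar with the (N/2)*log2(N) 2x2 SEs of a Banyan-class MIN, i.e. about 2*N*log2(N) crosspoints at four crosspoints per SE:

    #include <cmath>
    #include <cstdio>

    int main() {
        // Crosspoint counts for a crossbar vs. a Banyan-class MIN of 2x2 SEs.
        for (int N = 8; N <= 1024; N *= 4) {
            long long crossbar = 1LL * N * N;        // O(N^2) crosspoints
            int stages = int(std::log2(N));          // n = log2(N) stages
            long long ses = 1LL * (N / 2) * stages;  // (N/2)*log2(N) SEs
            long long banyan = 4 * ses;              // 4 crosspoints per 2x2 SE
            std::printf("N=%4d crossbar=%8lld banyan=%6lld\n", N, crossbar, banyan);
        }
        return 0;
    }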

Due to the advent of MINs, much research effort has been devoted to the investigation of their performance under various configurations and traffic conditions, and proposals have been made for the improvement of their performance. The main aspects considered in these works are the buffer size of switching elements (e.g. [19, 58]), MIN size (number of stages, e.g. [58, 59]), traffic patterns (including uniform vs. hotspot, e.g. [38, 39, 58, 6], and unicast vs. broadcast/multicast, e.g. [48, 34]), and packet priorities (e.g. [38, 39]). Performance evaluation has followed two distinct paths, the first employing analytical methods such as Markov chains, queuing theory and Petri nets, while the second uses simulation. Architectural issues (e.g. multilayer configurations [50] and wiring [29]) and routing algorithms (e.g. [54]) have also been considered in research efforts.

1.2 Thesis Contribution

The contribution of this thesis is twofold. On one hand, we provide a wide spectrum of performance measures, which can be valuable assets for designers of networks and parallel multiprocessor systems in order to minimize the overall deployment costs while delivering efficient systems. On the other hand, we develop efficient algorithms for investigating the behavior of MINs under various operational parameters such as hotspot traffic, multicast and broadcast routing, multi-priority schemes, and multi-layer configurations.

In chapter 2, a novel two-level internal priority scheme is proposed for performing routing within the MIN. The proposed scheme takes into account the queue lengths of the MIN Switching Elements (SEs), prioritizing packets in SEs having greater queue lengths. The rationale behind this approach is that by offloading large queues, the probability that buffers fill up decreases, so fewer packets will be dropped due to buffer shortage. In the proposed internal-priority scheme, if a conflict occurs it is resolved by examining the number of packets within the transmission queues of the SEs from which the contending packets originate. For such a decision to be taken, however, the receiving SE needs to have available the queue lengths of the transmitting SEs, a piece of information which is not available to the receiving SE in typical MINs. To make this information available, SEs operating under the internal priority scheme send the length of their transmission packet queue at the start of the packet header, as a preamble. When receiving SEs detect a conflict situation (i.e. two incoming transmissions and only one free buffer slot), they compare the queue sizes of the transmitting SEs and proceed to receive the packet preambled with the largest value for the queue size. This is expected to increase network performance, while fairness between packets is also promoted. We compare the proposed novel internal priority scheme against the single priority one by gathering metrics for the two most important network performance factors, namely packet throughput and the mean time a packet needs to traverse the network. Thus, we demonstrate and quantify the improvements in MIN performance stemming from the introduction of packet priorities, in terms of throughput as well as of a combined performance indicator which depicts overall performance.
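To make the conflict-resolution rule concrete, the following C++ sketch (not part of the thesis; the names are illustrative) shows how a receiving SE could arbitrate between two contending transmissions using the queue-length preamble described above, falling back to a random choice on ties, as in the single-priority baseline:

    #include <cstdlib>

    // Minimal view of an incoming transmission under the internal-priority
    // scheme: each packet is preambled with its sender's queue length.
    struct Incoming {
        int senderQueueLen;  // preamble: population of the transmitting queue
        int packetId;
    };

    // Returns the transmission to accept when only one buffer slot is free.
    // The packet from the longer (more congested) sender queue wins; ties
    // are broken at random.
    const Incoming& arbitrate(const Incoming& a, const Incoming& b) {
        if (a.senderQueueLen != b.senderQueueLen)
            return (a.senderQueueLen > b.senderQueueLen) ? a : b;
        return (std::rand() & 1) ? a : b;
    }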

In chapter 3, a variation of double-buffered SEs that uses asymmetric buffer sizes is introduced in order to offer different quality-of-service parameters to packets that have different priorities, while in parallel providing optimal overall network performance. We note here that the particular buffer sizes have been chosen since they have been reported to provide optimal overall network performance: indeed, it is observed that for smaller buffer sizes (1) the network throughput drops due to high blocking probabilities, whereas for higher buffer sizes (4 and 8) packet delay increases significantly (and the SE hardware cost also rises). The asymmetric-sized buffer configuration has been found to better exploit network resources and capacity, since available buffers are more fittingly allocated to the priority class needing them. More specifically, when comparing the asymmetric buffer size configuration against its equal-sized buffer counterpart, we found that the former provides better overall throughput and significantly better low-priority packet throughput and delay; for high-priority packets, on the other hand, the performance of the two schemes is almost identical, with the equal-sized buffer scheme having a small edge. The asymmetric-sized buffer configuration achieves these performance benefits because it better matches buffer allocation to the shape of network traffic.

In chapter 4, performance aspects of 2-class priority MINs are examined under hotspot traffic conditions, considering different rates of offered load. We additionally take into account the differences in the performance of the MIN outputs under hotspot traffic identified in [37], according to which the performance of each output depends on the amount of overlapping that the path to the specific output has with the path to the hotspot output. We present metrics for the most important network performance factors, namely throughput and delay, and we also calculate and present the performance in terms of the Universal performance factor introduced in previous chapters, which combines throughput and delay into a single metric, allowing the designer to express the perceived importance of each individual factor through weights. Although Multistage Interconnection Networks (MINs) are fairly flexible in handling varieties of traffic loads, their performance degrades considerably under hotspot traffic, especially as network size increases. As a response to the tree saturation problem, the proposed dual-priority MIN configuration can offer better quality-of-service to some applications by prioritizing their packets and providing better QoS to high priority ones. Finally, the rationale behind proposing a novel multi-layer MIN is to improve the performance of low priority packets. Thus, in an attempt to balance between MIN performance and cost, the proposed novel 4-layer MIN configuration was found to dramatically improve the handling of hotspot traffic, as well as the low priority traffic of all other zones.

In chapter 5, an extension of previous studies is applied by introducing MINs that natively support multi-class routing traffic. We also analyze the performance of multi-priority SEs that use not only single-buffered, but also double-buffered queues in order to offer better QoS, while in parallel providing better overall network performance. The past few years have witnessed a dramatic increase in the number and variety of applications running over the Internet and over enterprise IP networks. The spectrum includes

interactive (e.g. telnet and instant messaging), bulk data transfer (e.g. ftp and P2P file downloads), corporate (e.g. database transactions), and real-time applications (e.g. voice and video streaming). The goal of this study is to provide network designers with insight on how packet prioritization affects the QoS delivered to each priority class and the network performance in general. This insight can help network designers to assign packet priorities to various applications in a manner that will comply with the corporate policy, satisfy application requirements and maximize network utilization. The presented results also facilitate performance prediction for multi-priority networks before actual network implementation, through which deployment cost and rollout time can be minimized.

In chapter 6, all previous studies in the area of performance evaluation of MINs (e.g. [38]) are extended by including multi-priority, multi-layer MINs under multicast traffic. Multistage Interconnection Network technology is a prominent approach for implementing Next Generation Networks (NGNs), having an appealing cost/performance ratio. Multicasting, however, which is a core requirement for NGNs, has been found to significantly degrade MIN performance, and multi-layer MINs have been introduced to cope with traffic shapes involving multicasting. In this chapter, we extensively study the performance of multi-layer MINs operating under various overall input loads and multicast packet ratios. We applied the partial multicast policy [51], since it offers superior performance compared to the full multicast mechanism [56], where a packet is copied and transmitted only when both destination buffers are available. Finally, we extend all previous studies by considering Switching Elements (SEs) that natively support a multi-priority scheme, and also by considering double-buffered SEs, which have been reported to provide better QoS for packets, as compared to single-buffering configurations [51]. The findings of this performance evaluation can be used by network designers for drawing optimal configurations while setting up MINs, so as to best meet the performance and cost requirements under the anticipated traffic load and quality of service specifications.

Chapter 2

Internal Priority MINs

2.1 Introduction
2.2 Related Work
2.3 Internal Priority MIN Description and Analytical Model
2.3.1 Analysis of MINs
2.3.2 Definitions of MINs
2.4 Performance Evaluation Methodology of Internal Priority MINs
2.5 Simulation and Performance Results of Internal Priority MINs
2.6 Conclusions for Internal Priority MINs

2.1 Introduction

The Banyan switching fabric [16] has the advantages of simplicity in hardware, speed, easy VLSI implementation and ease of routing. Thus, this kind of blocking Multistage Interconnection Network (MIN) was mainly applied for interconnecting processors and memory modules in parallel multiprocessor systems [8, 45, 1]. MINs have also recently been identified as an efficient interconnection network for communication structures such as gigabit Ethernet switches, terabit routers, and ATM switches [46, 42, 4]. Significant advantages of Banyan-type [16] MINs include their appealing cost/performance ratio and their ability to route multiple communication tasks concurrently. Consequently, this type of MIN is frequently proposed to connect a large number of processors to establish a multiprocessor system, while it has also received considerable interest in the development of packet-switched networks. On the other hand, non-Banyan MINs are, in general, more expensive than Banyan networks and more complex to control.

In a parallel or distributed system, the performance of the network interconnecting the constituent elements (nodes, processors, memory modules etc.) is a critical factor for the overall system performance. Much research has therefore been conducted during the last decades in the area of investigating the performance of networks and communication facilities. In order to evaluate network performance, different methods have been used, mainly classified into two major categories. The first category includes analytical models based either on Markov models or on Petri nets, whereas the second category employs simulation to estimate network performance. Accurate performance estimation before network implementation is of the essence, since it allows network designers to adapt the network design and tune operational parameters to the specific requirements of the system under implementation, thus enabling the building of efficient systems, cost reduction and minimization of rollout times.

In this chapter we propose a novel two-level internal priority scheme for performing routing within the MIN. The proposed scheme takes into account the queue lengths of the MIN switching elements, prioritizing packets in SEs having greater queue lengths. The rationale behind this approach is that by offloading large queues, the probability that buffers fill up decreases, so fewer packets will be dropped due to buffer shortage. This is expected to increase network performance, while fairness between packets is also promoted. The performance of the proposed scheme is also evaluated and compared against that of single-priority MINs.

The remainder of this chapter is organized as follows: section 2.2 overviews related work in the area of network performance evaluation and priority schemes, while in section 2.3 we present the proposed priority scheme, which is termed internal priority, and give an analytical model for finite-buffered MINs with internal-priority and non-priority scheme SEs. The analytical model employs a novel 5-state buffer model. Subsequently, in section 2.4 we present the performance criteria and parameters related to the network. Section 2.5 presents the results of our performance analysis, which has been conducted through simulation experiments, while section 2.6 provides the concluding remarks.

2.2 Related Work

The principal methods for estimating network performance are analytical modeling and simulation. Markov chains, which fall in the analytical modeling category, have been extensively used by many researchers. In [5, 30] Markov chains are used in order to approximate the behavior of MINs under different buffering schemes. In [5], particularly, Markov chains are enhanced with elements from queuing theory. Petri nets [15, 17, 28] have also been used as modeling methods, either to complement Markov chains or as self-contained approaches. The studies reported in [18, 44] investigated MINs with uniform load traffic on inputs. Hotspot traffic performance was also examined by Jurczyk [24], while Turner [47] dealt with multicast in Clos networks, as a subclass of MINs. Atiquzzaman [3] focused only on non-uniform arriving traffic schemes. Furthermore, Lin

and Kleinrock [27] discuss approaches that examine the case of Poisson traffic on the inputs of a MIN. In the industry domain, Cisco has built its new CRS-1 router [9, 11] as a multistage switching fabric. The switching fabric that provides the communications path between line cards is a 3-stage, self-routed architecture.

Packet priority is a common issue in networks, arising when some packets need to be offered better quality of service than others. Packets with real-time requirements (e.g. from streaming media) vs. non real-time packets (e.g. file transfer), and out-of-band data vs. ordinary TCP traffic [43] are two examples of such differentiations. There are already several commercial switches which accommodate traffic priority schemes, such as [12, 20]. These switches consist internally of single-priority SEs and employ two priority queues for each input port, where packets are queued based on their priority level. Chen and Guerin [7] studied an (N × N) non-blocking packet switch with input queues, built using one-priority SEs. Ng and Dewar [32] introduced a simple modification to load-sharing replicated buffered Banyan networks to guarantee priority traffic transmission.

In this chapter, a different type of packet priority is employed. Contrary to other approaches where priority is defined at the application layer (e.g. real-time packets from streaming media vs. non real-time packets from file transfer; out-of-band data vs. ordinary TCP traffic [43] and so forth) or at the parallel system's architecture level (e.g. processor-memory traffic regarding operating system operations is prioritized against user processes' traffic), in the proposed architecture packet priority is computed dynamically and is directly proportional to the transmission queue length of the SE that the packet is currently stored in. This priority is used for resolving buffer contentions, which in typical MINs are resolved by randomly dropping one of the contending packets.

2.3 Internal Priority MIN Description and Analytical Model

Figure 2.1: A c×c Switching Element

A MIN can be defined as a network used to interconnect a group of N inputs to a

group of M outputs using several stages of small-size Switching Elements (SEs) followed (or preceded) by link stages. It is usually defined by, among others, its topology, routing algorithm, switching strategy and flow control mechanism. A MIN with the Banyan property is defined in [16] and is characterized by the fact that there is exactly one unique path from each source (input) to each sink (output). Banyan MINs are multistage self-routing switching fabrics. Thus, each SE of the k-th stage can decide to which output port to route a packet, depending on the corresponding k-th bit of the destination address.

Figure 2.2: A 3-stage Delta Network consisting of c×c SEs

An (N × N) MIN can be constructed by n = log_c N stages of (c×c) SEs, where c is the degree of the SEs. A typical SE is illustrated in figure 2.1. At each stage there are exactly N/c SEs; consequently, the total number of SEs of a MIN is (N/c) · log_c N. Thus, there are O(N log N) interconnections among all stages, as opposed to the crossbar network which requires O(N^2) links.
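As a concrete illustration of the sizing formulas and the bit-controlled self-routing just described, consider the following Python sketch (our own illustration, not part of the thesis; the MSB-first bit ordering is the usual destination-tag convention for delta networks and is an assumption here):

    import math

    def min_dimensions(n_ports, c=2):
        # n = log_c(N) stages, N/c SEs per stage, (N/c)*log_c(N) SEs in total.
        stages = int(round(math.log(n_ports, c)))
        per_stage = n_ports // c
        return stages, per_stage, per_stage * stages

    def routing_tag(dest, stages):
        # Stage k (k = 1..n) inspects the k-th bit of the destination
        # address: 0 selects the upper SE output, 1 the lower one.
        return [(dest >> (stages - k)) & 1 for k in range(1, stages + 1)]

    stages, per_stage, total = min_dimensions(8)
    print(stages, per_stage, total)   # 3 stages, 4 SEs per stage, 12 SEs
    print(routing_tag(5, stages))     # destination 5 = 101 -> [1, 0, 1]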

A typical configuration of an N×N delta network, one of the most widely used classes of Banyan MINs, which was proposed by Patel [35] and combines benefits of Omega [26] and Generalized Cube Networks [2] (destination routing, partitioning and expandability), is shown in figure 2.2. However, the performance evaluation is independent of the internal link permutations of the Banyan-type network, thus it can be applied to any class of such networks. In this chapter, we consider a Multistage Interconnection Network with the Banyan property that operates under the following assumptions:

- Routing is performed in a pipeline manner, meaning that the routing process occurs in every stage in parallel. Internal clocking results in synchronously operating switches in a slotted time model [44], and all SEs have deterministic service time.

- The arrival process at each input of the network is a simple Bernoulli process, i.e. the probability that a packet arrives within a clock cycle is constant and the arrivals are independent of each other. We will denote this probability as λ.

- A packet arriving at the first stage (k = 1) is discarded if the buffer of the corresponding SE is full.

- A packet is blocked at a stage if the destination buffer at the next stage is full.

- The packets are uniformly distributed across all the destinations and each queue uses a FIFO policy for all output ports.

- When two packets at a stage contend for a buffer at the next stage and there is not adequate free space for both of them to be stored (i.e. only one buffer position is available at the next stage), there is a conflict. In single-priority (or no-priority) scheme MINs, one packet will be accepted at random and the other will be blocked by means of upstream control signals. In the proposed internal-priority scheme, if a conflict occurs it is resolved by examining the number of packets within the transmission queue of the SEs from which the contending packets originate. For such a decision to be taken, however, the receiving SE needs to have available the queue lengths of the transmitting SEs, a piece of information which is not available to the receiving SE in typical MINs. To make this information available, SEs operating under the internal priority MIN scheme send the length of their transmission packet queue at the start of the packet header, as a preamble. When receiving SEs detect a conflict situation (i.e. two incoming transmissions and only one free buffer slot), they compare the queue sizes of the transmitting SEs and proceed to receive the packet preambled with the largest value for the queue size. The other packet will be blocked, and the transmitting SE will be notified by means of an upstream control signal during the next network cycle, as in the "typical" MIN operation. Since buffer sizes in SEs are usually in the range 1 to 16, the length of the preamble can vary from 1 to 4 bits (in our study the length of the preamble was set to 3), which is quite small compared to the packet length.
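The preamble width follows directly from the buffer capacity: to encode a queue length between 0 and b, ceil(log2(b+1)) bits suffice. The tiny Python check below is our own illustration (the inclusion of the zero-length case in the encoding is our assumption); it reproduces the widths used in the text:

    import math

    def preamble_bits(buffer_size):
        # Queue length ranges over 0..b, i.e. b+1 distinct values.
        return math.ceil(math.log2(buffer_size + 1))

    for b in (1, 2, 4, 8):
        print(b, preamble_bits(b))   # 1, 2, 3 and 4 bits respectively

For the buffer size b = 4 this gives the 3-bit preamble used in this study.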

The preamble need not be checksummed (which would increase its size), since any error in these bits would (in the worst case) simply lead to accepting the wrong (with respect to the priority policy) packet, a case that would only marginally affect the gains obtained by the introduction of the internal priority scheme.

Finally, all packets in input ports contain both the data to be transferred and the routing tag. As soon as packets reach a destination port they are removed from the MIN, so packets cannot be blocked at the last stage.

Our analysis uses a model which considers not only the current state of the associated buffer, but also the previous one. Based on this one-clock-history consideration, Mun's [31] three-state model is enhanced into a five-state buffer model, as described in the following paragraphs.

2.3.1 Analysis of MINs

State `00': Buffer was empty at the beginning of the previous clock cycle and it is also empty at the beginning of the current clock cycle (i.e. no new packet has been received during the previous clock cycle; the buffer remains empty).

State `01': Buffer was empty at the beginning of the previous clock cycle, while it contains a new packet at the current clock cycle (i.e. a new packet has been received during the previous clock cycle; the buffer is filled now).

State `10': Buffer had a packet at the previous clock cycle, while it contains no packet at the current clock cycle (i.e. a packet has been sent during the previous clock cycle, but no new packet has been received; the buffer is empty now).

State `11n': Buffer had a packet at the previous clock cycle and has a new packet at the current clock cycle (i.e. a packet has been sent during the previous clock cycle, and a new packet has also been received; the buffer is filled with a new packet now).

State `11b': Buffer had a packet at the previous clock cycle and has a blocked packet at the current clock cycle (i.e. no packet has been sent during the previous clock cycle due to blocking; the buffer is filled with the blocked packet now).

2.3.2 Definitions of MINs

The following variables are defined in order to develop an analytical model. In all definitions SE(k) denotes an SE at stage k of the MIN.

P00(k,t) is the probability that a buffer of SE(k) is empty at both the (t−1)-th and t-th network cycles.

P01(k,t) is the probability that a buffer of SE(k) is empty at the (t−1)-th network cycle and has a new packet at the t-th network cycle.

P10(k,t) is the probability that a buffer of SE(k) has a packet at the (t−1)-th network cycle and has no packet at the t-th network cycle.

P11n(k,t) is the probability that a buffer of SE(k) has a packet at the (t−1)-th network cycle and also has a new one at the t-th network cycle.

P11b(k,t) is the probability that a buffer of SE(k) has a packet at the (t−1)-th network cycle and has a blocked one at the t-th network cycle.

q(k,t) is the probability that a packet is ready to be accepted into a buffer of SE(k) at the t-th network cycle.

r01(k,t) is the probability that a packet in a buffer of SE(k) is ready to move forward during the t-th network cycle, given that the buffer is in the `01' state.

r11n(k,t) is the probability that a packet in a buffer of SE(k) is ready to move forward during the t-th network cycle, given that the buffer is in the `11n' state.

r11b(k,t) is the probability that a packet in a buffer of SE(k) is ready to move forward during the t-th network cycle, given that the buffer is in the `11b' state.

The following equations represent the evolution of the state probabilities as the clock cycles advance. These equations are derived from the state transition diagram of figure 2.3.

Figure 2.3: A state transition diagram of an SE(k) buffer

The probability that a buffer of SE(k) was empty at the (t−1)-th network cycle is P00(k,t−1) + P10(k,t−1). Therefore, the probability that a buffer of SE(k) is empty

both at the current t-th and previous (t−1)-th network cycles is the probability that the SE(k) buffer was empty at the previous (t−1)-th network cycle multiplied by the probability [1 − q(k,t−1)] that no packet was ready to be forwarded to SE(k) during the previous network cycle (the two facts are statistically independent, thus the probability that both are true is equal to the product of the individual probabilities). Formally, this probability P00(k,t) can be expressed by

P00(k,t) = [1 − q(k,t−1)] · [P00(k,t−1) + P10(k,t−1)]    (2.1)

The probability that a buffer of SE(k) was empty at the (t−1)-th network cycle and a new packet has arrived at the current t-th network cycle is the probability that the SE(k) buffer was empty at the (t−1)-th network cycle [which is equal to P00(k,t−1) + P10(k,t−1)] multiplied by the probability q(k,t−1) that a new packet was ready to be transmitted to SE(k) during the (t−1)-th network cycle. Formally, this probability P01(k,t) can be expressed by

P01(k,t) = q(k,t−1) · [P00(k,t−1) + P10(k,t−1)]    (2.2)

The case that a buffer of SE(k) was full at the beginning of the (t−1)-th network cycle but is empty at the t-th network cycle effectively requires the following two facts to be true: (a) a buffer of SE(k) was full at the (t−1)-th network cycle and the packet was successfully transmitted, and (b) no packet was received during the (t−1)-th network cycle to replace the transmitted packet in the buffer. The probability for fact (a) is equal to r01(k,t−1)·P01(k,t−1) + r11n(k,t−1)·P11n(k,t−1) + r11b(k,t−1)·P11b(k,t−1); this is computed by considering all cases in which the SE had a packet in its buffer during network cycle t−1, and multiplying the probability of each state by the corresponding probability that the packet was successfully transmitted. The probability of fact (b), i.e. that no packet was ready to be transmitted to SE(k) during the previous network cycle, is equal to 1 − q(k,t−1). Formally, the probability P10(k,t) can be computed by the following formula:

P10(k,t) = [1 − q(k,t−1)] · [r01(k,t−1)·P01(k,t−1) + r11n(k,t−1)·P11n(k,t−1) + r11b(k,t−1)·P11b(k,t−1)]    (2.3)

The probability that a buffer of SE(k) had a packet at the (t−1)-th network cycle and also has a new one (different from the previous; the case of having the same packet in the buffer is addressed in the next paragraph) at the t-th network cycle is the probability of having a packet ready to move forward at the previous (t−1)-th network cycle [which is equal to r01(k,t−1)·P01(k,t−1) + r11n(k,t−1)·P11n(k,t−1) + r11b(k,t−1)·P11b(k,t−1)] multiplied by q(k,t−1), i.e. the probability that a packet was ready to be transmitted to SE(k) during the previous network cycle. Formally, this probability

P11n(k,t) can be expressed by

P11n(k,t) = q(k,t−1) · [r01(k,t−1)·P01(k,t−1) + r11n(k,t−1)·P11n(k,t−1) + r11b(k,t−1)·P11b(k,t−1)]    (2.4)

The final case that should be considered is when a buffer of SE(k) had a packet at the (t−1)-th network cycle and still contains the same packet at the t-th network cycle. This occurs when the packet in the buffer of SE(k) was ready to move forward at the (t−1)-th network cycle, but it was blocked (not forwarded) during that cycle due to a blocking event: either the associated buffer of the next-stage SE was already filled due to another blocking, or it was occupied by a second packet of the current stage contending for the same buffer during the process of forwarding. The probability for this case can be formally defined as

P11b(k,t) = [1 − r01(k,t−1)] · P01(k,t−1) + [1 − r11n(k,t−1)] · P11n(k,t−1) + [1 − r11b(k,t−1)] · P11b(k,t−1)    (2.5)

Adding the equations (2.1)–(2.5), both the left- and right-hand sides are equal to 1, validating thus that all possible cases have been covered; indeed, P00(k,t) + P01(k,t) + P10(k,t) + P11n(k,t) + P11b(k,t) = 1 and P00(k,t−1) + P01(k,t−1) + P10(k,t−1) + P11n(k,t−1) + P11b(k,t−1) = 1.

The analytical model presented in the previous paragraphs extends the ones presented in other works (e.g. [17, 18, 44]) by considering the state and transitions occurring within an additional clock cycle. This enhancement improves the accuracy of the calculation of the performance parameters (throughput and delay). The simulation presented in section 2.5 takes into account all the above presented dependencies among the queues of each SE(k) of the MIN.
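To make the dynamics of equations (2.1)–(2.5) concrete, the following Python sketch iterates them to a steady state for a single buffer, under the simplifying assumption (ours, for illustration only) that the probabilities q, r01, r11n and r11b are constant over time, whereas in the full model they are coupled across stages:

    def step(p, q, r01, r11n, r11b):
        # One application of equations (2.1)-(2.5).
        # p = (P00, P01, P10, P11n, P11b) at cycle t-1.
        p00, p01, p10, p11n, p11b = p
        sent = r01 * p01 + r11n * p11n + r11b * p11b   # a packet moves forward
        return ((1 - q) * (p00 + p10),                 # (2.1)
                q * (p00 + p10),                       # (2.2)
                (1 - q) * sent,                        # (2.3)
                q * sent,                              # (2.4)
                (1 - r01) * p01 + (1 - r11n) * p11n + (1 - r11b) * p11b)  # (2.5)

    p = (1.0, 0.0, 0.0, 0.0, 0.0)   # start with an empty buffer
    for _ in range(200):            # iterate towards the fixed point
        p = step(p, q=0.5, r01=0.9, r11n=0.9, r11b=0.7)
    print([round(x, 4) for x in p], "sum =", round(sum(p), 4))

The printed sum remains 1 at every cycle, which mirrors the normalization argument given above: the five states partition all possible one-clock-history buffer configurations.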

2.4 Performance Evaluation Methodology of Internal Priority MINs

In order to evaluate the performance of an (N × N) MIN with n = log_c N intermediate stages of (c×c) SEs, we use the following metrics. Let T be a relatively large time period divided into u discrete time intervals (τ_1, τ_2, ..., τ_u).

Average throughput Th_avg is the average number of packets accepted by all destinations per network cycle. This metric is also referred to as bandwidth. Formally, Th_avg can be defined as

Th_avg = lim_{u→∞} [ Σ_{k=1}^{u} n_a(k) ] / u    (2.6)

where n_a(k) denotes the number of packets that reach their destinations during the k-th time interval.

Normalized throughput Th is the ratio of the average throughput Th_avg to the number of network outputs N. Formally, Th can be expressed by

Th = Th_avg / N    (2.7)

Normalized throughput is a good metric for assessing the MIN's cost effectiveness.

Average packet delay D_avg is the average time a packet spends passing through the network. Formally, D_avg is expressed by

D_avg = lim_{u→∞} [ Σ_{k=1}^{n_a(u)} t_d(k) ] / n_a(u)    (2.8)

where n_a(u) denotes the total number of packets accepted within u time intervals and t_d(k) represents the total delay for the k-th packet. We consider t_d(k) = t_w(k) + t_tr(k), where t_w(k) denotes the total queuing delay for the k-th packet, accumulated while waiting at each stage for the availability of a buffer at the next stage of the network. The second term t_tr(k) denotes the total transmission delay for the k-th packet across all stages of the network, which is just n·nc, where n = log_2 N is the number of intermediate stages and nc is the network cycle.

Normalized packet delay D is the ratio of D_avg to the minimum packet delay, which is simply the transmission delay n·nc (i.e. zero queuing delay). Formally, D can be defined as

D = D_avg / (n·nc)    (2.9)
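Expressed as code, the metrics of equations (2.6)–(2.9) amount to simple ratios over the simulator's counters. The sketch below is our own illustration (the counter names are hypothetical, not those of the thesis simulator):

    def network_metrics(accepted, total_delay, cycles, n_outputs, stages, nc=1):
        # accepted    : packets that reached their destinations
        # total_delay : sum of per-packet delays t_d(k) (queuing + transmission)
        # cycles      : measured network cycles u
        th_avg = accepted / cycles          # (2.6) average throughput
        th = th_avg / n_outputs             # (2.7) normalized throughput
        d_avg = total_delay / accepted      # (2.8) average packet delay
        d = d_avg / (stages * nc)           # (2.9) normalized packet delay
        return th, d

    # Example: a 6-stage 64x64 MIN delivering 40,000 packets in 100,000 cycles
    print(network_metrics(accepted=40_000, total_delay=360_000,
                          cycles=100_000, n_outputs=64, stages=6))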

Universal performance factor Upf is defined by a relation involving the two major normalized factors above, D and Th: the performance of a MIN is considered optimal when D is minimized and Th is maximized, so the formula for computing the universal factor is arranged so that the overall performance metric follows this rule. Formally, Upf can be expressed by

Upf = w_d · D^2 + w_th · (1 / Th^2)    (2.10)

where w_d and w_th denote the corresponding weights of the factors participating in the Upf, designating thus their importance for the corporate environment. Consequently, the performance of a MIN can be expressed in a single metric that is tailored to the needs that a specific MIN setup will serve. It is obvious that when the packet delay factor becomes smaller and/or the throughput factor becomes larger, the Upf becomes smaller; thus smaller Upf values indicate better overall MIN performance. Because the above factors (parameters) have different measurement units and scaling, we normalize them to obtain a reference value domain. Normalization is performed by dividing the value of each factor by the (algebraic) minimum or maximum value that this factor may attain. Thus, equation (2.10) can be replaced by:

Upf = w_d · [(D − D_min) / D_min]^2 + w_th · [(Th_max − Th) / Th]^2    (2.11)

where D_min is the minimum value of normalized packet delay (D) and Th_max is the maximum value of normalized throughput. Consistently with equation (2.10), when the universal performance factor Upf, as computed by equation (2.11), is close to 0, the performance of the MIN is considered optimal, whereas when the value of Upf increases, its performance deteriorates. Moreover, taking into account that the values of both delay and throughput appearing in equations (2.10) and (2.11) are normalized, D_min = Th_max = 1, thus the equation can be simplified to:

Upf = w_d · (D − 1)^2 + w_th · [(1 − Th) / Th]^2    (2.12)

In the remainder of this chapter we will consider both factors to be of equal importance, thus setting w_d = w_th = 1.

Finally, we list the major parameters affecting the performance of a MIN.

Buffer size b of a queue is the maximum number of packets that an input buffer of an SE can hold. In this study we consider a finite-buffered (b = 1, 2, 4, 8) MIN.

Offered load λ is the steady-state fixed probability of a packet arriving at each input queue. In our simulation it is assumed to be λ = 0.1, 0.2, ..., 0.9, 1.

Number of stages n is the number of stages of an (N × N) MIN, where n = log_2 N. In our simulation n is assumed to be n = 3, 6, 8, 10, which covers widely used MIN sizes.
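Equation (2.12) translates directly into a few lines of Python; the sketch below (ours, for illustration) reproduces the simplified universal performance factor with the equal weights used in the remainder of this chapter:

    def upf(d_norm, th_norm, w_d=1.0, w_th=1.0):
        # Universal performance factor, equation (2.12).
        # d_norm  : normalized packet delay D (>= 1)
        # th_norm : normalized throughput Th (0 < Th <= 1)
        # Smaller values indicate better overall performance.
        return w_d * (d_norm - 1.0) ** 2 + w_th * ((1.0 - th_norm) / th_norm) ** 2

    # An ideal network (D = 1, Th = 1) yields the optimal value 0;
    # degrading either factor increases Upf.
    print(upf(1.0, 1.0))   # 0.0
    print(upf(1.5, 0.5))   # 0.25 + 1.0 = 1.25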

2.5 Simulation and Performance Results of Internal Priority MINs

The performance of MINs is usually determined by modeling, using simulation [52] or mathematical methods [53]. In this chapter we estimated the network performance using simulation. We developed a generic simulator for MINs in a packet communication environment. The simulator can handle several switch types, inter-stage interconnection patterns, load conditions, switch operation policies, and priorities. We focused on an (N × N) Banyan network that consists of (2×2) SEs, using internal queuing. Each (2×2) SE in all stages of the MIN was modeled by two non-shared buffer queues, where the crossbar segment was located in front of the queues. Buffer operation was based on the FCFS principle. In the case of non-priority scheme MINs, when there was a contention between two packets, it was resolved randomly (algorithms 2.1 and 2.2). The performance of non-priority MINs was compared against the performance of internal priority MINs, where contentions were resolved by favoring the packet transmitted from the SE with the greatest transmission queue length (algorithms 2.1 and 2.3). The simulation was performed at packet level, assuming fixed-length packets transmitted in equal-length time slots, where the slot was the time required to forward a packet from one stage to the next.

The parameters of the packet traffic model were varied across simulation experiments to generate different offered loads and traffic patterns. Metrics such as packet throughput and packet delays were collected at the output ports. We performed extensive simulations to validate our results. All statistics were obtained from simulation runs of 10^5 clock cycles. The number of simulation runs was adjusted to ensure a steady-state operating condition for the MIN. A stabilization process allowed the network to reach a steady state: the first 10^3 network cycles were discarded before statistics collection began.

SendQueue_Process (cs_id, sq_id, bm)
Input: Current stage id (cs_id); send-queue id (sq_id) of current stage; blocking mechanism (bm).
Output: Population of send- and accept-queues (Pop); total number of serviced and blocked packets for the send-queue (Serviced, Blocked respectively); total number of packet delay cycles for the send-queue (Delay); routing address RA of each buffer position of the queue.
{
  if (Pop[sq_id][cs_id] > 0)  // send-queue is not empty
  {
    RA_bit = get_bit(RA[sq_id][cs_id][1], cs_id);
    // get the (cs_id)-th bit of the Routing Address (RA) of the leading packet of
    // the send-queue by a cyclic logical left shift; perfect shuffle algorithm
    if (RA_bit = 0)  // upper port routing
      aq_id = 2 * (sq_id % (N/2));      // upper link; perfect shuffle algorithm
    else  // lower port routing
      aq_id = 2 * (sq_id % (N/2)) + 1;  // lower link; perfect shuffle algorithm
    // where aq_id is the accept-queue id of the next stage
    if (Pop[aq_id][cs_id + 1] = B)  // where B is the buffer size
    {
      // blocking state
      Blocked[sq_id][cs_id] = Blocked[sq_id][cs_id] + 1;
      if (bm = blm)  // block-and-lost mechanism
      {
        Pop[sq_id][cs_id] = Pop[sq_id][cs_id] - 1;
        for (bf_id = 1; bf_id <= Pop[sq_id][cs_id]; bf_id++)
          RA[sq_id][cs_id][bf_id] = RA[sq_id][cs_id][bf_id + 1];
        // where RA is the Routing Address of the packet
        // located at the (bf_id)-th position of the send-queue
      }
    }
    else  // unicast forwarding
    {
      Serviced[sq_id][cs_id] = Serviced[sq_id][cs_id] + 1;

      Pop[sq_id][cs_id] = Pop[sq_id][cs_id] - 1;
      Pop[aq_id][cs_id + 1] = Pop[aq_id][cs_id + 1] + 1;
      RA[aq_id][cs_id + 1][Pop[aq_id][cs_id + 1]] = RA[sq_id][cs_id][1];
      for (bf_id = 1; bf_id <= Pop[sq_id][cs_id]; bf_id++)
        RA[sq_id][cs_id][bf_id] = RA[sq_id][cs_id][bf_id + 1];
    }
  }
  Delay[sq_id][cs_id] = Delay[sq_id][cs_id] + Pop[sq_id][cs_id];
  return Pop, Serviced, Blocked, Delay, RA;
}

Algorithm 2.1: Send-queue process for single- and internal-priority MINs

SinglePriority_SEs_Process (cs_id, use_id, bm)
Input: Current stage id (cs_id); Switching Element id (use_id) of upper segment; blocking mechanism (bm).
{
  lse_id = use_id + N/4;  // where N is the number of input/output ports
                          // and lse_id is the adjacent Switching Element (SE) of the lower segment
  r = random();           // where r ∈ [0..1)
  if (r < 0.5)  // upper segment is clocked first
  {
    // process the upper queues of the SEs
    SendQueue_Process (cs_id, 2 * use_id, bm);
    SendQueue_Process (cs_id, 2 * lse_id, bm);
    // process the lower queues of the SEs
    SendQueue_Process (cs_id, 2 * use_id + 1, bm);
    SendQueue_Process (cs_id, 2 * lse_id + 1, bm);
  }
  else  // lower segment is clocked first
  {
    // process the upper queues of the SEs
    SendQueue_Process (cs_id, 2 * lse_id, bm);
    SendQueue_Process (cs_id, 2 * use_id, bm);
    // process the lower queues of the SEs
    SendQueue_Process (cs_id, 2 * lse_id + 1, bm);
    SendQueue_Process (cs_id, 2 * use_id + 1, bm);
  }
}

Algorithm 2.2: Switching Element process for single-priority MINs
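The measurement procedure described above (Bernoulli arrivals, a 10^3-cycle warm-up discarded, statistics over 10^5 cycles) can be sketched as the following Python driver. This is a simplified stand-in for the thesis simulator, not its actual code: clock_all_stages is a hypothetical placeholder for the per-stage processing of algorithms 2.1–2.3.

    import random

    LAMBDA, WARMUP, TOTAL, N = 0.5, 1_000, 100_000, 64

    def clock_all_stages(state):
        # Placeholder: one network cycle of SE processing (algorithms 2.1-2.3)
        # would go here; it should return (packets_delivered, summed_delays).
        return 0, 0

    accepted = total_delay = 0
    state = {}
    rng = random.Random(1)
    for cycle in range(TOTAL):
        for port in range(N):
            if rng.random() < LAMBDA:   # Bernoulli arrival at each input port
                pass                    # enqueue at the first stage if its buffer is not full
        delivered, delays = clock_all_stages(state)
        if cycle >= WARMUP:             # discard the stabilization period
            accepted += delivered
            total_delay += delays

    measured_cycles = TOTAL - WARMUP
    print("Th =", accepted / measured_cycles / N)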

InternalPriority_SEs_Process (cs_id, use_id, bm)
Input: Current stage id (cs_id); Switching Element id (use_id) of upper segment; blocking mechanism (bm).
{
  lse_id = use_id + N/4;  // where N is the number of input/output ports
                          // and lse_id is the adjacent Switching Element (SE) of the lower segment
  r = random();           // where r ∈ [0..1)
  // process the upper queues of the SEs
  if (Pop[2 * use_id][cs_id] > Pop[2 * lse_id][cs_id]) or
     ((Pop[2 * use_id][cs_id] = Pop[2 * lse_id][cs_id]) and (r < 0.5))
  {
    // upper segment clocking takes precedence
    SendQueue_Process (cs_id, 2 * use_id, bm);
    SendQueue_Process (cs_id, 2 * lse_id, bm);
  }
  if (Pop[2 * use_id][cs_id] < Pop[2 * lse_id][cs_id]) or
     ((Pop[2 * use_id][cs_id] = Pop[2 * lse_id][cs_id]) and (r >= 0.5))
  {
    // lower segment clocking takes precedence
    SendQueue_Process (cs_id, 2 * lse_id, bm);
    SendQueue_Process (cs_id, 2 * use_id, bm);
  }
  // process the lower queues of the SEs
  if (Pop[2 * use_id + 1][cs_id] > Pop[2 * lse_id + 1][cs_id]) or
     ((Pop[2 * use_id + 1][cs_id] = Pop[2 * lse_id + 1][cs_id]) and (r < 0.5))
  {
    // upper segment clocking takes precedence
    SendQueue_Process (cs_id, 2 * use_id + 1, bm);
    SendQueue_Process (cs_id, 2 * lse_id + 1, bm);
  }
  if (Pop[2 * use_id + 1][cs_id] < Pop[2 * lse_id + 1][cs_id]) or
     ((Pop[2 * use_id + 1][cs_id] = Pop[2 * lse_id + 1][cs_id]) and (r >= 0.5))
  {
    // lower segment clocking takes precedence
    SendQueue_Process (cs_id, 2 * lse_id + 1, bm);
    SendQueue_Process (cs_id, 2 * use_id + 1, bm);
  }
}

Algorithm 2.3: Switching Element process for internal-priority MINs
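The core of algorithm 2.3 is the precedence decision: the send-queue with the larger population is clocked first, with a coin flip breaking ties. The following Python distillation (our own rephrasing, not the thesis simulator's code) captures that rule:

    import random

    def clock_order(pop_upper, pop_lower, rng=random):
        # Return the order in which two contending send-queues are clocked:
        # the larger queue goes first (internal priority), ties broken at random.
        if pop_upper > pop_lower:
            return ("upper", "lower")
        if pop_upper < pop_lower:
            return ("lower", "upper")
        return ("upper", "lower") if rng.random() < 0.5 else ("lower", "upper")

    print(clock_order(3, 1))   # ('upper', 'lower'): the fuller queue is offloaded first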

Figure 2.4 shows the normalized throughput of a single-buffered MIN with 6 stages as a function of the probability of arrivals for the three classical models [44, 31, 23] and our simulation. All models are very accurate at low loads. The accuracy is reduced as the input load increases. Especially when the input load approaches the network's maximum throughput, the accuracy of Jenq's model is insufficient. One of the reasons is the fact that many packets are blocked, mainly at the first network stages, at high traffic rates. Thus, Mun introduced a "blocked" state to his model to improve accuracy. The consideration of the dependencies between the two buffers of an SE in Theimer's model leads to further improvement. Our simulation was also tested by comparing the results of Theimer's model with those of our simulation experiments, which were found to be in close agreement (differences are less than 1%).

Figure 2.4: Th of a single-buffered, 6-stage, single- (or non-) priority MIN

Figure 2.5: Th of a double-buffered, n-stage MIN, internal- vs. non-priority scheme

Figure 2.5 illustrates the gains in normalized throughput of a MIN using an internal priority vs. a non-priority (or single priority) scheme. In the diagram, curve NPS[b][n] depicts the normalized throughput of an n-stage MIN, where n = 3, 6, 8, 10, constructed from 2×2 SEs, using queues of buffer length b, employing a non-priority scheme. Similarly, curve IPS[b][n] shows the corresponding normalized throughput of an n-stage MIN, where n = 3, 6, 8, 10, constructed from 2×2 SEs, using queues of buffer length b, employing an internal priority scheme. In this figure, all curves represent the normalized throughput of double-buffered MINs (b = 2) at different offered loads (λ = 0.1, 0.2, ..., 1). We can notice here that the gains in normalized throughput of a MIN using an internal priority vs. a non-priority scheme are 1.9%, 3.3%, 3.7%, and 4.0% of the optimal value, which is just Th_max = 1, for n = 3, 6, 8, 10 respectively, under full load traffic. It is obvious that the normalized throughput falls as the network size (bandwidth) increases. However, the gains in normalized throughput using the internal priority vs. the non-priority scheme become more considerable as the network size increases.

Figure 2.6 illustrates the gains in normalized throughput of a MIN using an internal priority scheme as compared to the single priority one in the case of buffer size b = 4. We can notice here that the gains in normalized throughput of a MIN using an internal priority vs. a non-priority scheme are 1.4%, 3.7%, 4.3%, and 4.7% of the optimal value, for n = 3, 6, 8, 10 respectively, under full load traffic. As seen in the diagram, the gains

Figure 2.6: Th of a finite-buffered (b=4), n-stage MIN, internal- vs. non-priority scheme

Figure 2.7: Th of a finite-buffered (b=8), n-stage MIN, internal- vs. non-priority scheme

in normalized throughput remain considerable for all network setups, especially in cases where n >= 6.

Figure 2.7 presents the case of a MIN with a large queue configuration, where the buffer size is b = 8. The results show that the gains in normalized throughput, when the buffer length is b = 8, are lower for all network setups (n = 3, 6, 8, 10), but still considerable. According to the above diagram, the gains of a MIN using an internal priority vs. a non-priority scheme are 0.8%, 2.7%, 3.0%, and 3.5% of the optimal value, for n = 3, 6, 8, 10 respectively, under full load traffic. It is worth remarking that the normalized throughput is improved for both single and internal priority MINs due to the increased buffer size (b = 8), which is more obvious in the case of heavy traffic (λ > 0.7) offered load.

Figure 2.8 presents the corresponding increases in normalized packet delay of internal priority vs. single priority packets in a 6-stage MIN, under different buffer size schemes (b = 1, 2, 4, 8), which are found to be negligible for all configuration setups. It emerges that when the buffer size of the MIN has the maximum value (b = 8), the normalized delay of internal priority packets under full load traffic increases, relative to the corresponding normalized delay of single priority packets, to 6.02, which is just the worst case. It is obvious that the corresponding single-buffered (b = 1) MINs have the same values for all performance factors under both the single and the internal priority scheme. The reason is that, when two packets at a stage contend for the same buffer at the next stage and there is not adequate free space for both to be stored, the algorithm resolving the contention is the same for both the single and the internal priority scheme: because all queues can hold only one packet, one of the contenders is selected randomly, independently of the priority scheme. It is also noteworthy that larger buffers introduce larger delays, because packets fill the buffers and stay in the network longer, thereby increasing queuing delays. Large packet delay values can adversely affect applications sensitive to packet delay or jitter,

Figure 2.8: D of a finite-buffered, 6-stage MIN, internal- vs. non-priority scheme

Figure 2.9: Upf of a finite-buffered, 6-stage MIN, internal- vs. non-priority scheme

such as streaming media traffic.

Figure 2.9 illustrates the relation of the combined performance indicator Upf of a 6-stage MIN to the offered load, under different buffer size configurations (b = 1, 2, 4, 8). Recall from section 2.4 that the combined performance indicator Upf depicts the overall performance of a MIN, considering the weights of each individual performance factor (throughput and packet delay) to be of equal importance. It is clear that the performance indicator Upf attains lower (better) values as the buffer length increases, but when the buffer size reaches the values b = 4, 8 the performance indicator Upf deteriorates significantly under moderate and heavy traffic (λ > 0.6), using either the internal or the single priority scheme.

2.6 Conclusions for Internal Priority MINs

In this chapter we have presented a novel MIN architecture employing an internal priority scheme to resolve contentions. The performance of the proposed scheme has been evaluated through simulation and compared against the performance of single-priority MINs, considering different offered loads, buffer lengths and numbers of stages. It has been found that the gains for MINs in terms of throughput using the internal priority scheme are considerable in all cases. Especially, the improvement in throughput is of great worth when the offered load consists mainly of data packets (vs. voice packets, which are more sensitive to packet delay), because throughput is the most important performance factor in the case of data packets. It is also worth noting that the corresponding increases in packet delay are negligible for all configuration setups. Moreover, the overall performance indicator Upf of a MIN, a metric combining both throughput and delay, is improved. In this chapter, when calculating the value of Upf, we have considered the individual

performance factors (throughput and packet delay) to be of equal importance. This is not necessarily true for all application classes; e.g. for batch data transfers throughput is more important, whereas for streaming media the delay must be optimized. In the next chapters we will consider such cases and will make efforts to provide MIN designers with metrics that will support them in choosing the best MIN setup, taking into account the applications that the MIN will support.

Chapter 3

Dual Priority MINs and Asymmetric-sized Buffer Queues

3.1 Introduction
3.2 Related Work
3.3 Dual Priority MIN and Analytical Model
3.3.1 State Notations for High Priority Queues
3.3.2 Definitions for High Priority Queues
3.3.3 Mathematical Analysis for High Priority Queues
3.3.4 State Notations for Low Priority Queues
3.3.5 Definitions for Low Priority Queues
3.3.6 Mathematical Analysis for Low Priority Queues
3.4 Performance Evaluation Methodology of Dual Priority MINs
3.5 Simulation and Performance Results of Dual Priority MINs
3.5.1 Dual Priority MINs vs. Single Priority Ones
3.5.2 Dual Priority MINs with Asymmetric-sized Buffer Queues
3.6 Conclusions for Dual Priority MINs

3.1 Introduction

During the last decades, much research has targeted the investigation of parallel and distributed systems' performance, particularly in the area of Multistage Interconnection Networks (MINs) and communications. The performance of the communication infrastructure that interconnects a system's elements (nodes, processors, memory modules etc.) has been recognized as a critical factor for overall system performance, both in the context of parallel and in the context of distributed systems. As a result, much research has been conducted, aiming to identify the factors that affect the communication infrastructure's performance and to provide models for performance prediction and evaluation. Two major directions have been taken to this end: the first employs analytical models based either on Markov models or on Petri nets, while the second uses simulation techniques. These works enable network designers to estimate network performance before the network is actually implemented, thus allowing network design tuning and adjustment of parameters. Using the insights from this procedure, network designers may craft efficient systems, tailored to the specific requirements of the system under implementation with minimal cost, since the actual implementation decisions are deferred until all operational parameters have been determined.

In this chapter we use a novel approach to model the operational behaviour of a 2-priority-class MIN, which takes into account the previous and the current state of both queues (high and low priority) of each switching element, leading thus to more accurate results. The modelling scheme is complemented with equations expressing the probability of each state transition, giving a complete analytic framework for the performance behaviour of 2-class priority MINs. Simulation experiments are also conducted to estimate the MIN performance under various traffic loads, buffer lengths, high/low priority traffic ratios and MIN sizes (numbers of stages). We also introduce a variation of double-buffered Switching Elements (SEs) that uses asymmetric buffer sizes for packets of different priorities, aiming to better exploit the network's hardware resources and capacity. Consequently, the findings of this chapter can be used by network designers to gain insight into the impact of each MIN design parameter on the overall MIN performance and to select the optimal MIN configuration for the needs of their environment.

The remainder of this chapter is organized as follows: section 3.2 overviews related work in the area of network performance evaluation and priority schemes, while in section 3.3 we present the proposed dual priority scheme, describe its operation and give an analytical model. The analytical model employs novel 5-state and 6-state buffer models for high and low priority queues, respectively. Subsequently, in section 3.4 we present the performance criteria and parameters related to the network. Section 3.5 presents the results of our performance analysis, which has been conducted through simulation experiments, while section 3.6 provides the concluding remarks.

3.2 Related Work

Single priority queuing systems in the context of MIN performance evaluation have been extensively studied and are reported in numerous publications. For example, [44, 31, 23] study the throughput and system delay of a MIN assuming the SEs have a single input buffer, whereas the performance of finite-buffered MINs is studied, among others, in [27]. Moreover, Chen and Guerin [7] studied an (N × N) non-blocking packet switch with input queues, built using one-priority SEs. Ng and Dewar [32] introduced a simple modification to load-sharing replicated buffered Banyan networks to guarantee priority traffic transmission.

A recent development in the MIN domain is the introduction of dual priority (or 2-class) queuing systems, which are able to offer different quality-of-service parameters to packets that have different priorities. Packet priority has been a common issue in networks, arising when some packets need to be offered better quality of service than others. Packets with real-time requirements (e.g. from streaming media) vs. non real-time packets (e.g. file transfer), and out-of-band data vs. ordinary TCP traffic [43] are two examples of such differentiations. Cases of different priorities may also arise in the context of parallel architectures, e.g. some CPUs may be running operating system processes, and traffic between these CPUs and memory modules can be prioritized against traffic from/to other CPUs. In all these cases, the communications infrastructure should include provisions to (a) allow applications or architectural components to designate packet priority and (b) offer better quality of service to the packets designated as "high priority" ones.

3.3 Dual Priority MIN and Analytical Model

Recall from the previous chapter that Multistage Interconnection Networks (MINs) are used to interconnect a group of N inputs to a group of M outputs using several stages of small-size Switching Elements (SEs) followed (or preceded) by link stages, and that all different types of MINs [35, 26, 2] with the Banyan property [16] are self-routing switching fabrics, characterized by the fact that there is exactly one unique path from each source (input) to each sink (output). In a dual priority scheme in particular, when a packet enters the MIN, its priority is specified by the application or the architectural module that has produced the packet. The priority is henceforth reflected in a bit in the packet header and is maintained throughout the lifetime of the packet within the MIN. In order to support priority handling, each SE has two transmission queues per link, accommodated in two (logical) buffers, with one queue dedicated to high priority packets and the other dedicated to low priority ones. During a single network cycle, the SE considers all its links, examining for each one of them firstly the high priority queue. If this is not empty, it transmits the first packet towards the next MIN stage; the low priority queue is checked only if the corresponding high priority queue is empty. Packets in all queues are transmitted on a first come, first served basis. In all cases, at most one

packet per link (upper or lower) of an SE will be forwarded for each pair of high and low priority queues to the next stage.

A typical configuration of a 3-stage MIN consisting of 2×2 SEs is depicted in figure 3.1. This configuration is based on the standard 8×8 delta network setup proposed by Patel [35], but has been extended to use different queues per input link (for high and low priority packets). In this chapter, we consider a Multistage Interconnection Network with the Banyan property that operates under the following assumptions:

Figure 3.1: 2-class priority 3-stage MIN consisting of 2×2 SEs

- Routing is performed in a pipeline manner, meaning that the routing process occurs in every stage in parallel. Internal clocking results in synchronously operating switches in a slotted time model [44], and all SEs have deterministic service time.

- The arrival process at each input of the network is a simple Bernoulli process, i.e. the probability that a packet arrives within a clock cycle is constant and the arrivals are independent of each other. We will denote this probability as λ. This probability can be further broken down into λ_h and λ_l, which represent the arrival probabilities for high and low priority packets, respectively. It holds that λ = λ_h + λ_l.

- A high/low priority packet arriving at the first stage (k = 1) is discarded if the high/low priority buffer of the corresponding SE is full, respectively.

- A high/low priority packet is blocked at a stage if the destination high/low priority buffer at the next stage is full, respectively.

- Both high and low priority packets are uniformly distributed across all destinations, and each high/low priority queue uses a FIFO policy for all output ports.

- When two packets at a stage contend for a buffer at the next stage and there is no adequate free space for both of them to be stored (i.e. only one buffer position is available at the next stage), there is a conflict. Conflict resolution in a single-priority mechanism operates under the following scheme: one packet will be accepted at random and the other will be blocked by means of upstream control signals. Under the dual priority scheme, the conflict resolution procedure takes into account the packet priority: if one of the received packets is a high-priority one and the other is a low-priority packet, the high-priority packet will be retained and the low-priority one will be blocked by means of upstream control signals; if both packets have the same priority, one packet is chosen randomly to be stored in the buffer, whereas the other packet is blocked. The priority of each packet is indicated through a priority bit in the packet header, thus it suffices for the SE to read the header in order to decide which packet to store and which to drop, as the sketch following this list illustrates.
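The dual-priority conflict rule reduces to a short decision function. The following Python sketch is our own illustration of the rule just described (the names are hypothetical, not the thesis simulator's):

    import random

    def resolve_conflict(prio_a, prio_b, rng=random):
        # Decide which of two contending packets wins the single free buffer
        # slot. Priorities: 1 = high, 0 = low. Returns 'a' or 'b'; the loser
        # is blocked upstream (or dropped if it arrives at the first stage).
        if prio_a != prio_b:
            return 'a' if prio_a > prio_b else 'b'   # the high priority packet wins
        return 'a' if rng.random() < 0.5 else 'b'    # same priority: random choice

    print(resolve_conflict(1, 0))   # 'a': the high-priority packet is stored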

Finally, all packets in input ports contain both the data to be transferred and the routing tag. As soon as packets reach a destination port they are removed from the MIN, so packets cannot be blocked at the last stage.

Our analysis again considers one clock cycle of history, enhancing Mun's [31] three-state model into a five-state buffer model, as described in the following paragraphs.

3.3.1 State Notations for High Priority Queues

State `00_h': High priority buffer was empty at the beginning of the previous clock cycle and it is also empty at the beginning of the current clock cycle (i.e. no new high priority packet has been received during the previous clock cycle; the high priority buffer remains empty).

State `01_h': High priority buffer was empty at the beginning of the previous clock cycle, while it contains a new high priority packet at the current clock cycle (i.e. a new high priority packet has been received during the previous clock cycle; the high priority buffer is filled now).

State `10_h': High priority buffer had a high priority packet at the previous clock cycle, while it contains no packet at the current clock cycle (i.e. a high priority packet has been sent during the previous clock cycle, but no new such packet has been received; the high priority buffer is empty now).

State `11n_h': High priority buffer had a high priority packet at the previous clock cycle and has a new one at the current clock cycle (i.e. a high priority packet has been sent during the previous clock cycle, and a new such packet has also been received; the high priority buffer is filled with a new high priority packet now).

State `11b_h': High priority buffer had a high priority packet at the previous clock cycle and has the packet blocked at the current clock cycle (i.e. no high priority packet has been sent during the previous clock cycle due to blocking; the high priority buffer is filled with a blocked high priority packet now).

3.3.2 Definitions for High Priority Queues

The following variables are defined in order to develop an analytical model. In all definitions SE(k) denotes an SE at stage k of the MIN.

P00(k,t)_h is the probability that a high priority buffer of SE(k) is empty at both the (t−1)-th and t-th network cycles.

P01(k,t)_h is the probability that a high priority buffer of SE(k) is empty at the (t−1)-th network cycle and has a new packet at the t-th network cycle.

P10(k,t)_h is the probability that a high priority buffer of SE(k) has a packet at the (t−1)-th network cycle and is empty at the t-th network cycle.

P11n(k,t)_h is the probability that a high priority buffer of SE(k) has a packet at the (t−1)-th network cycle and also has a new one at the t-th network cycle.

P11b(k,t)_h is the probability that a high priority buffer of SE(k) has a packet at the (t−1)-th network cycle and has a blocked one at the t-th network cycle.

q(k,t)_h is the probability that a high priority packet is ready to be sent to a high priority buffer of SE(k) at the t-th network cycle (i.e. a high-priority packet will be transmitted by an SE(k−1) to SE(k)).

r01(k,t)_h is the probability that a high priority packet in a buffer of SE(k) is ready to move forward during the t-th network cycle, given that the buffer is in the `01_h' state.

r11n(k,t)_h is the probability that a high priority packet in a buffer of SE(k) is ready to move forward during the t-th network cycle, given that the buffer is in the `11n_h' state.

r11b(k,t)_h is the probability that a high priority packet in a buffer of SE(k) is ready to move forward during the t-th network cycle, given that the buffer is in the `11b_h' state.

3.3.3 Mathematical Analysis for High Priority Queues

The following equations, which are derived from the state transition diagram of figure 3.2, represent the state transition probabilities as clock cycles advance.

Figure 3.2: A state transition diagram of a high priority buffer of SE(k)

The probability that a high priority buffer of SE(k) was empty at the (t−1)-th network cycle is P00(k,t−1)_h + P10(k,t−1)_h. Therefore, the probability that a high priority buffer of SE(k) is empty both at the current t-th and previous (t−1)-th network cycles is the probability that the SE(k) buffer was empty at the previous (t−1)-th network cycle multiplied by the probability [1 − q(k,t−1)_h] that no high priority packet was ready to be forwarded to SE(k) during the previous network cycle (the two facts are statistically independent, thus the probability that both are true is equal to the product of the individual probabilities). Formally, this probability P00(k,t)_h can be expressed by

P00(k,t)_h = [1 − q(k,t−1)_h] · [P00(k,t−1)_h + P10(k,t−1)_h]    (3.1)

The probability that a high priority buffer of SE(k) was empty at the (t−1)-th network cycle and a new high priority packet has arrived at the current t-th network cycle is the probability that the SE(k) buffer was empty at the (t−1)-th network cycle [which is equal to P00(k,t−1)_h + P10(k,t−1)_h] multiplied by the probability q(k,t−1)_h that a new high priority packet was ready to be transmitted to SE(k) during the (t−1)-th network cycle. Formally, this probability P01(k,t)_h can be expressed by

P01(k,t)_h = q(k,t−1)_h · [P00(k,t−1)_h + P10(k,t−1)_h]    (3.2)

The case that a high priority buffer of SE(k) was full at the (t−1)-th network cycle but is empty at the t-th network cycle effectively requires the following two facts to be true: (a) a high priority buffer of SE(k) was full at the (t−1)-th network cycle and the high priority packet was successfully transmitted, and (b) no high priority packet was received during the (t−1)-th network cycle to replace the transmitted high priority packet in the buffer. The probability for fact (a) is equal to [r01(k,t−1)_h · P01(k,t−1)_h + r11n(k,t−1)_h · P11n(k,t−1)_h + r11b(k,t−1)_h · P11b(k,t−1)_h]; this is computed by considering all cases in which, during network cycle t−1, the SE(k) had a high priority

57 packet in its buer and multiplying the probability of each state by the corresponding probability that the packet was successfully transmitted. The probability of fact (b), i.e. that no high priority packet was ready to be transmitted to SE(k) during the previous network cycle is equal to [1 q(k; t 1) h ]. Formally, the probability P 10(k; t) h can be computed by the following formula: P 10(k; t) h = [1 q(k; t 1) h ] [r01(k; t 1) h P 01(k; t 1) h +r11n(k; t 1) h P 11n(k; t 1) h +r11b(k; t 1) h P 11b(k; t 1) h ] (3.3) The probability that a high priority buer of SE(k) had a packet at the (t 1) th network cycle and has also a new one (dierent than the previous; the case of having the same packet in the buer is addressed in the next paragraph) at the t th network cycle is the probability of having a ready high priority packet to move forward at the previous (t 1) th network cycle [which is equal to r01(k; t 1) h P 01(k; t 1) h + r11n(k; t 1) h P 11n(k; t 1) h + r11b(k; t 1) h P 11b(k; t 1) h ] multiplied by q(k; t 1) h, i.e. the probability that a high priority packet was ready to be transmitted to SE(k) during the previous network cycle. Formally, this probability P 11n(k; t) h can be expressed by P 11n(k; t) h = q(k; t 1) h [r01(k; t 1) h P 01(k; t 1) h +r11n(k; t 1) h P 11n(k; t 1) h +r11b(k; t 1) h P 11b(k; t 1) h ] (3.4) The nal case that should be considered is when a high priority buer of SE(k) had a high priority packet at the (t 1) th network cycle and still contains the same packet at the t th network cycle. This occurs when the packet in the high priority buer of SE(k) was ready to move forward at the (t 1) th network cycle, but it was blocked (not forwarded) during that cycle, due to a blocking event - either (a) the associated high priority buer of the next stage SE(k) was already full due to another blocking, or (b) buer space was available at stage k + 1 but it was occupied by a second packet of the current stage contending for the same high priority buer during the process of forwarding. The probability for this case can be formally dened as P 11b(k; t) h = [1 r01(k; t 1) h ] P 01(k; t 1) h +[1 r11n(k; t 1) h ] P 11n(k; t 1) h +[1 r11b(k; t 1) h ] P 11b(k; t 1) h (3.5) Adding the equations (3.1) (3.5), both left and right-hand sides are equal to 1 validating thus that all possible cases have been covered; indeed, P 00(k; t) h +P 01(k; t) h + P 10(k; t) h +P11n(k; t) h +P11b(k; t) h = 1 and P 00(k; t 1) h +P01(k; t 1) h +P10(k; t 1) h + P 11n(k; t 1) h + P 11b(k; t 1) h = 1. Finally, in the marginal case, when l = 0 (or, equivalently, h = ), the system of equations (3.1) (3.5) eectively degenerate to the equation system for a single priority MIN. 54
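To illustrate how the system of equations (3.1)-(3.5) can be evaluated in practice, we give below a minimal C++ sketch that iterates the state-transition system of one stage until the probabilities settle to their steady-state values. The structure and function names are ours, and treating q and the r-probabilities as fixed inputs (rather than quantities computed from the neighbouring stages, as the full model requires) is an illustrative simplification.

#include <cstdio>

// State probabilities of a high priority buffer: 00, 01, 10, 11n, 11b.
struct HpState { double p00, p01, p10, p11n, p11b; };

// One application of equations (3.1)-(3.5); q, r01, r11n, r11b refer to
// the arrival/readiness probabilities of the previous network cycle.
HpState step(const HpState& s, double q, double r01, double r11n, double r11b) {
    double empty = s.p00 + s.p10;                               // buffer was empty at t-1
    double sent  = r01 * s.p01 + r11n * s.p11n + r11b * s.p11b; // packet forwarded at t-1
    HpState n;
    n.p00  = (1.0 - q) * empty;   // eq (3.1)
    n.p01  = q * empty;           // eq (3.2)
    n.p10  = (1.0 - q) * sent;    // eq (3.3)
    n.p11n = q * sent;            // eq (3.4)
    n.p11b = (1.0 - r01) * s.p01 + (1.0 - r11n) * s.p11n
           + (1.0 - r11b) * s.p11b;  // eq (3.5)
    return n;
}

int main() {
    HpState s{1.0, 0.0, 0.0, 0.0, 0.0};   // start from an empty buffer
    for (int t = 0; t < 10000; ++t)       // iterate until steady state
        s = step(s, /*q=*/0.2, /*r01=*/0.9, /*r11n=*/0.9, /*r11b=*/0.9);
    std::printf("steady state: %.4f %.4f %.4f %.4f %.4f (sum=%.4f)\n",
                s.p00, s.p01, s.p10, s.p11n, s.p11b,
                s.p00 + s.p01 + s.p10 + s.p11n + s.p11b);
}

Note that each application of step preserves the total probability mass of 1, mirroring the validation argument given above.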

3.3.4 State Notations for Low Priority Queues

Modelling of low priority queues needs one additional state, as compared to the high priority queue model, to accommodate the cases in which a low priority packet is blocked due to the existence of a high priority packet on the same link; thus the model for low priority queues includes six distinct buffer states, as follows:

State `00_l': Low priority buffer was empty at the beginning of the previous clock cycle and it is also empty at the beginning of the current clock cycle.

State `01_l': Low priority buffer was empty at the beginning of the previous clock cycle, while it contains a new low priority packet at the current clock cycle.

State `10_l': Low priority buffer had a low priority packet at the previous clock cycle, while it contains no packet at the current clock cycle.

State `11n_l': Low priority buffer had a low priority packet at the previous clock cycle and has a new one at the current clock cycle.

State `11b_l': Low priority buffer had a low priority packet at the previous clock cycle and has the packet blocked at the current clock cycle.

State `11w_l': Low priority buffer had a low priority packet at the previous clock cycle and has this packet waiting at the current clock cycle, because the corresponding high priority queue has a packet ready to be transmitted; recall that high priority packets have precedence over low priority ones in the transmission process.

3.3.5 Definitions for Low Priority Queues

Similarly to the variable definitions for high priority queues presented in subsection 3.3.2, we define here the necessary variables to develop an analytical model for low priority queues:

P_00(k,t)^l is the probability that a low priority buffer of SE(k) is empty at both the (t-1)-th and t-th network cycles.

P_01(k,t)^l is the probability that a low priority buffer of SE(k) is empty at the (t-1)-th network cycle and has a new low priority packet at the t-th network cycle.

P_10(k,t)^l is the probability that a low priority buffer of SE(k) has a low priority packet at the (t-1)-th network cycle and is empty at the t-th network cycle.

P_11n(k,t)^l is the probability that a low priority buffer of SE(k) has a packet at the (t-1)-th network cycle and also has a new one at the t-th network cycle.

P_11b(k,t)^l is the probability that a low priority buffer of SE(k) has a packet at the (t-1)-th network cycle and still has the same packet at the t-th network cycle, as the packet could not be transmitted due to blocking.

P_11w(k,t)^l is the probability that a low priority buffer of SE(k) has a packet at the (t-1)-th network cycle and still has the same packet at the t-th network cycle, as the packet could not be transmitted due to the existence of a high priority packet on the same link.

q(k,t)^l is the probability that a low priority packet is ready to be sent to a low priority buffer of SE(k) at the t-th network cycle (i.e. a low priority packet will be transmitted by an SE(k-1) to SE(k)).

r_01(k,t)^l is the probability that a low priority packet in a buffer of SE(k) is ready to move forward during the t-th network cycle, given that the buffer is in `01_l' state.

r_11n(k,t)^l is the probability that a low priority packet in a buffer of SE(k) is ready to move forward during the t-th network cycle, given that the buffer is in `11n_l' state.

r_11b(k,t)^l is the probability that a low priority packet in a buffer of SE(k) is ready to move forward during the t-th network cycle, given that the buffer is in `11b_l' state.

r_11w(k,t)^l is the probability that a low priority packet in a buffer of SE(k) is ready to move forward during the t-th network cycle, given that the buffer is in `11w_l' state.

3.3.6 Mathematical Analysis for Low Priority Queues

Similarly to subsection 3.3.3, the following equations, derived from the state transition diagram in figure 3.3, represent the state transition probabilities of low priority queues as clock cycles advance.

Figure 3.3: A state transition diagram of a low priority buffer of SE(k)

State probabilities for low priority queues can be formally defined as:

P_00(k,t)^l = [1 - q(k,t-1)^l] · [P_00(k,t-1)^l + P_10(k,t-1)^l]    (3.6)

P_01(k,t)^l = q(k,t-1)^l · [P_00(k,t-1)^l + P_10(k,t-1)^l]    (3.7)

P_10(k,t)^l = [1 - U(k,t-1)^h] · [1 - q(k,t-1)^l] · [r_01(k,t-1)^l · P_01(k,t-1)^l
              + r_11n(k,t-1)^l · P_11n(k,t-1)^l + r_11b(k,t-1)^l · P_11b(k,t-1)^l
              + r_11w(k,t-1)^l · P_11w(k,t-1)^l]    (3.8)

P_11n(k,t)^l = [1 - U(k,t-1)^h] · q(k,t-1)^l · [r_01(k,t-1)^l · P_01(k,t-1)^l
               + r_11n(k,t-1)^l · P_11n(k,t-1)^l + r_11b(k,t-1)^l · P_11b(k,t-1)^l
               + r_11w(k,t-1)^l · P_11w(k,t-1)^l]    (3.9)

P_11b(k,t)^l = [1 - U(k,t-1)^h] · {[1 - r_01(k,t-1)^l] · P_01(k,t-1)^l
               + [1 - r_11n(k,t-1)^l] · P_11n(k,t-1)^l
               + [1 - r_11b(k,t-1)^l] · P_11b(k,t-1)^l
               + [1 - r_11w(k,t-1)^l] · P_11w(k,t-1)^l}    (3.10)

P_11w(k,t)^l = U(k,t-1)^h · [P_01(k,t-1)^l + P_11n(k,t-1)^l
               + P_11b(k,t-1)^l + P_11w(k,t-1)^l]    (3.11)

where U(k,t-1)^h expresses the probability that a packet exists in the high priority queue of SE(k) and is ready for transmission during network cycle t-1; it is given by the following equation:

U(k,t-1)^h = r_01(k,t-1)^h · P_01(k,t-1)^h + r_11n(k,t-1)^h · P_11n(k,t-1)^h
             + r_11b(k,t-1)^h · P_11b(k,t-1)^h    (3.12)

The factor [1 - U(k,t-1)^h] appearing in equations (3.8)-(3.10) effectively manifests that the corresponding states may only be reached if the involved high priority queue has no packet ready for transmission: this holds because the pertinent states may be reached only if a packet is transmitted from a low priority queue, and a corresponding high priority queue with no ready packet is a prerequisite for such a transmission to occur.

Adding the equations (3.6)-(3.11), both left- and right-hand sides are equal to 1, validating thus that all possible cases are covered; indeed, P_00(k,t)^l + P_01(k,t)^l + P_10(k,t)^l + P_11n(k,t)^l + P_11b(k,t)^l + P_11w(k,t)^l = 1 and P_00(k,t-1)^l + P_01(k,t-1)^l + P_10(k,t-1)^l + P_11n(k,t-1)^l + P_11b(k,t-1)^l + P_11w(k,t-1)^l = 1.

Moreover, in the marginal case, when λ_h = 0 (or, equivalently, λ_l = λ), U(k,t-1)^h = 0 and thus P_11w(k,t)^l = 0. Consequently, in that case, the system of equations (3.6)-(3.10) is equivalent to the system of equations (3.1)-(3.5), which is identical to the equation set holding for a single priority MIN.
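A corresponding sketch for a low priority queue is given below; it applies equations (3.6)-(3.12) for one network cycle. As in the previous sketch, the naming is ours, and treating q^l and the r^l-probabilities as fixed inputs is an illustrative assumption; in the full model they are derived from the state of the neighbouring stages.

// State probabilities of a low priority buffer: 00, 01, 10, 11n, 11b, 11w.
struct LpState { double p00, p01, p10, p11n, p11b, p11w; };

// Equation (3.12): probability that a high priority packet of SE(k) was
// ready for transmission during the previous cycle; p01h, p11nh, p11bh
// are the non-empty state probabilities of the high priority queue.
double U(double p01h, double p11nh, double p11bh,
         double r01h, double r11nh, double r11bh) {
    return r01h * p01h + r11nh * p11nh + r11bh * p11bh;
}

// One application of equations (3.6)-(3.11); u is U(k,t-1)^h.
LpState stepLow(const LpState& s, double u, double q,
                double r01, double r11n, double r11b, double r11w) {
    double empty = s.p00 + s.p10;
    double sent  = r01 * s.p01 + r11n * s.p11n + r11b * s.p11b + r11w * s.p11w;
    double held  = (1.0 - r01) * s.p01 + (1.0 - r11n) * s.p11n
                 + (1.0 - r11b) * s.p11b + (1.0 - r11w) * s.p11w;
    double full  = s.p01 + s.p11n + s.p11b + s.p11w;  // buffer held a packet at t-1
    LpState n;
    n.p00  = (1.0 - q) * empty;            // eq (3.6)
    n.p01  = q * empty;                    // eq (3.7)
    n.p10  = (1.0 - u) * (1.0 - q) * sent; // eq (3.8)
    n.p11n = (1.0 - u) * q * sent;         // eq (3.9)
    n.p11b = (1.0 - u) * held;             // eq (3.10)
    n.p11w = u * full;                     // eq (3.11)
    return n;
}

Setting u = 0 reproduces the marginal case discussed above: the `11w_l' state is never entered and the update reduces to the single priority equation set.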

The analytical model presented in the previous paragraphs again extends the ones presented in other works (e.g. [44]) by considering the states and transitions occurring within an additional clock cycle. This enhancement improves the accuracy of the calculation of the performance parameters (throughput and delay). The dependencies among the queues of each SE(k) of the MIN and the state transitions presented above have been incorporated in the simulation logic of the experiments presented in section 3.5.

3.4 Performance Evaluation Methodology of Dual Priority MINs

In order to evaluate the performance of a dual-priority MIN the following metrics are used. Let Th and D be the normalized throughput and normalized delay of a MIN, as described in chapter 2.

Relative normalized throughput RTh(h) of high priority packets is the normalized throughput Th(h) of such packets divided by the corresponding ratio of offered load r_h:

RTh(h) = Th(h) / r_h    (3.13)

Similarly, relative normalized throughput RTh(l) of low priority packets can be expressed as the ratio of the normalized throughput Th(l) of such packets to the respective input ratio r_l:

RTh(l) = Th(l) / r_l    (3.14)

This extra normalization of both high and low priority traffic leads to a common value domain, needed for comparing their absolute performance values with those obtained by the corresponding single priority MINs. Thus, in the diagrams of the next section we will compare the relative normalized throughput of dual-priority MINs with the normalized throughput of single-priority ones.

Universal performance factor Upf(h) of high priority packets can be defined, as in the previous chapter, by a relation involving the two major factors above, D(h) and RTh(h). Because the performance of the high priority traffic of a MIN is considered optimal when D(h) is minimized and RTh(h) is maximized, the formula for computing the universal factor is arranged so that the overall performance metric follows that rule. Formally, Upf(h) can be expressed by

Upf(h) = \sqrt{ w_d · [D(h) - 1]^2 + w_th · [(1 - RTh(h)) / RTh(h)]^2 }    (3.15)

Similarly, the universal performance factor Upf(l) of low priority packets can be defined as

Upf(l) = \sqrt{ w_d · [D(l) - 1]^2 + w_th · [(1 - RTh(l)) / RTh(l)]^2 }    (3.16)

In the remainder of this chapter we will consider both component factors, throughput and delay, of equal importance, setting thus w_d = w_th = 1.

Finally, we list the major parameters affecting the performance of a MIN.

Buffer size b of a queue is the maximum number of packets that an input buffer of an SE can hold. In this study we consider a finite-buffered (b = 1, 2, 4, 8) MIN.

Offered load λ is the steady-state fixed probability of arriving packets at each queue on the inputs. In our simulation λ is assumed to be λ = 0.1, 0.2, ..., 0.9, 1.

Ratio of high priority offered load r_h is defined by r_h = λ_h / λ. In this chapter r_h is assumed to be r_h = 0.20, 0.30. Similarly, the ratio of low priority offered load r_l can be expressed by r_l = λ_l / λ. It is obvious that r_h + r_l = 1. Consequently, r_l is assumed to be r_l = 0.80, 0.70 respectively.

Number of stages n is the number of stages of an (N X N) MIN, where n = log_2 N. In our simulation n is assumed to be n = 6, 8, 10, which are widely used MIN sizes.

3.5 Simulation and Performance Results of Dual Priority MINs

For this chapter we developed a special simulator in C++, capable of handling 2-class priority MINs, where each (2X2) SE was modelled by four non-shared buffer queues; the first two for high priority packets, and the other two for low priority ones. We simulated two different configurations, where the buffer queues were located either in front of the crossbar Switching Element (SE) or behind it. In the following algorithms buffer queues are considered to be located after the crossbar segment of the SE. The simulator also has several other parameters, such as the buffer length, the number of input and output ports, the number of stages, the offered load, and the ratio of high priority packets.

The contention between two packets was resolved randomly, but when a 2-class priority mechanism was used, high priority packets had precedence (algorithms 3.1 and 3.3) over low priority ones (algorithms 3.2 and 3.3), and contentions were resolved by favoring the packet transmitted from the queue in which the high priority packets were stored (algorithm 3.3). Finally, the simulations were performed at packet level, assuming fixed-length packets transmitted in equal-length time slots, while the number of simulation runs was again adjusted to 10^5 clock cycles with an initial stabilization period of 10^3 network cycles, ensuring a steady-state operating condition.
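As a concrete illustration of how the metrics of section 3.4 are obtained from raw simulation counts, the following C++ sketch derives Th, D, RTh and Upf for one priority class. The counter names are placeholders (assumed to be produced by a simulator such as the one described above), and the Th and D derivations follow the chapter 2 definitions of normalized throughput and delay.

#include <cmath>

// Normalized throughput: packets delivered per network cycle per port,
// for 'delivered' packets over 'cycles' cycles on an N-port MIN.
double normalized_throughput(long delivered, long cycles, int N) {
    return static_cast<double>(delivered) / (static_cast<double>(cycles) * N);
}

// Normalized delay: average delay divided by the minimum (transmission-only)
// delay of n network cycles for an n-stage MIN.
double normalized_delay(double delay_cycles_sum, long delivered, int n) {
    return (delay_cycles_sum / delivered) / n;
}

// Relative normalized throughput, eqs (3.13)/(3.14): Th divided by the
// class's share r of the offered load (r_h or r_l).
double relative_throughput(double th, double r) { return th / r; }

// Universal performance factor, eqs (3.15)/(3.16); w_d and w_th weigh
// delay and throughput (both set to 1 in this chapter).
double upf(double D, double RTh, double w_d = 1.0, double w_th = 1.0) {
    double d_term  = w_d  * (D - 1.0) * (D - 1.0);
    double th_term = w_th * ((1.0 - RTh) / RTh) * ((1.0 - RTh) / RTh);
    return std::sqrt(d_term + th_term);
}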

Hp_SendQueue_Process (cs_id, sq_id, bm)

Input: Current stage id (cs_id); send-queue id (sq_id) of the current stage; blocking mechanism (bm).
Output: Population of the send- and accept-queues (Hp_Pop) of high priority packets; total number of serviced and blocked packets for the send-queue (Hp_Serviced, Hp_Blocked) of high priority packets respectively; total number of packet delay cycles for the send-queue (Hp_Delay) of high priority packets; routing address Hp_RA of each buffer position of the high priority queue.

{
  if (Hp_Pop[sq_id][cs_id] > 0)  // high priority send-queue is not empty
  {
    Hp_RA_bit = get_bit(Hp_RA[sq_id][cs_id][1], cs_id);
    // get the (cs_id)-th bit of the Routing Address (Hp_RA) of the leading
    // high priority packet of the send-queue by a cyclic logical left shift
    if (Hp_RA_bit == 0)  // upper port routing
      aq_id = 2 * (sq_id % (N/2));      // link for perfect shuffle algorithm
    else                 // lower port routing
      aq_id = 2 * (sq_id % (N/2)) + 1;  // link for perfect shuffle algorithm
    // where aq_id is the accept-queue id of the next stage
    if (Hp_Pop[aq_id][cs_id + 1] == Hp_B)  // blocking state
    // where Hp_B is the buffer size of high priority queues
    {
      Hp_Blocked[sq_id][cs_id] = Hp_Blocked[sq_id][cs_id] + 1;
      if (bm == blm)  // block and lost mechanism
      {
        Hp_Pop[sq_id][cs_id] = Hp_Pop[sq_id][cs_id] - 1;
        for (bf_id = 1; bf_id <= Hp_Pop[sq_id][cs_id]; bf_id++)
          Hp_RA[sq_id][cs_id][bf_id] = Hp_RA[sq_id][cs_id][bf_id + 1];
        // where Hp_RA is the Routing Address of the high priority
        // packet located at the (bf_id)-th position of the send-queue
      }
    }
    else  // unicast forwarding
    {
      Hp_Serviced[sq_id][cs_id] = Hp_Serviced[sq_id][cs_id] + 1;
      Hp_Pop[sq_id][cs_id] = Hp_Pop[sq_id][cs_id] - 1;
      Hp_Pop[aq_id][cs_id + 1] = Hp_Pop[aq_id][cs_id + 1] + 1;
      Hp_RA[aq_id][cs_id + 1][Hp_Pop[aq_id][cs_id + 1]] = Hp_RA[sq_id][cs_id][1];
      // where Hp_RA is the Routing Address of the high priority packet
      // located at the (Hp_Pop[aq_id][cs_id + 1])-th and 1st positions
      // of the accept- and send-queue respectively
      for (bf_id = 1; bf_id <= Hp_Pop[sq_id][cs_id]; bf_id++)
        Hp_RA[sq_id][cs_id][bf_id] = Hp_RA[sq_id][cs_id][bf_id + 1];
    }
  }
  Hp_Delay[sq_id][cs_id] = Hp_Delay[sq_id][cs_id] + Hp_Pop[sq_id][cs_id];
  return Hp_Pop, Hp_Serviced, Hp_Blocked, Hp_Delay, Hp_RA;
}

Algorithm 3.1: Send-queue process of high priority packets for dual-priority MINs

Lp_SendQueue_Process (cs_id, sq_id, bm)

Input: Current stage id (cs_id); send-queue id (sq_id) of the current stage; blocking mechanism (bm).
Output: Population of the send- and accept-queues (Lp_Pop) of low priority packets; total number of serviced and blocked packets for the send-queue (Lp_Serviced, Lp_Blocked) of low priority packets respectively; total number of packet delay cycles for the send-queue (Lp_Delay) of low priority packets; routing address Lp_RA of each buffer position of the low priority queue.

{
  if (Lp_Pop[sq_id][cs_id] > 0)  // low priority send-queue is not empty
  {
    Lp_RA_bit = get_bit(Lp_RA[sq_id][cs_id][1], cs_id);
    // get the (cs_id)-th bit of the Routing Address (Lp_RA) of the leading
    // low priority packet of the send-queue by a cyclic logical left shift
    if (Lp_RA_bit == 0)  // upper port routing
      aq_id = 2 * (sq_id % (N/2));      // link for perfect shuffle algorithm
    else                 // lower port routing
      aq_id = 2 * (sq_id % (N/2)) + 1;  // link for perfect shuffle algorithm
    // where aq_id is the accept-queue id of the next stage
    if (Lp_Pop[aq_id][cs_id + 1] == Lp_B)  // blocking state
    // where Lp_B is the buffer size of low priority queues
    {
      Lp_Blocked[sq_id][cs_id] = Lp_Blocked[sq_id][cs_id] + 1;
      if (bm == blm)  // block and lost mechanism
      {
        Lp_Pop[sq_id][cs_id] = Lp_Pop[sq_id][cs_id] - 1;
        for (bf_id = 1; bf_id <= Lp_Pop[sq_id][cs_id]; bf_id++)
          Lp_RA[sq_id][cs_id][bf_id] = Lp_RA[sq_id][cs_id][bf_id + 1];
        // where Lp_RA is the Routing Address of the low priority
        // packet located at the (bf_id)-th position of the send-queue
      }
    }
    else  // unicast forwarding
    {
      Lp_Serviced[sq_id][cs_id] = Lp_Serviced[sq_id][cs_id] + 1;
      Lp_Pop[sq_id][cs_id] = Lp_Pop[sq_id][cs_id] - 1;
      Lp_Pop[aq_id][cs_id + 1] = Lp_Pop[aq_id][cs_id + 1] + 1;
      Lp_RA[aq_id][cs_id + 1][Lp_Pop[aq_id][cs_id + 1]] = Lp_RA[sq_id][cs_id][1];
      // where Lp_RA is the Routing Address of the low priority packet
      // located at the (Lp_Pop[aq_id][cs_id + 1])-th and 1st positions
      // of the accept- and send-queue respectively
      for (bf_id = 1; bf_id <= Lp_Pop[sq_id][cs_id]; bf_id++)
        Lp_RA[sq_id][cs_id][bf_id] = Lp_RA[sq_id][cs_id][bf_id + 1];
    }
  }
  Lp_Delay[sq_id][cs_id] = Lp_Delay[sq_id][cs_id] + Lp_Pop[sq_id][cs_id];
  return Lp_Pop, Lp_Serviced, Lp_Blocked, Lp_Delay, Lp_RA;
}

Algorithm 3.2: Send-queue process of low priority packets for dual-priority MINs
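The self-routing decision used by both send-queue algorithms can be made concrete with a small sketch. Note that get_bit is only described by its comment above; the implementation below, which reads the (cs_id)-th most significant bit of the destination tag, is our interpretation of that comment, and the helper names are ours.

// Destination-tag routing for a delta network of 2x2 SEs with N = 2^n ports.
// At stage cs_id (1-based), the cs_id-th most significant bit of the
// destination address selects the upper (0) or lower (1) SE output, and the
// perfect-shuffle wiring maps send-queue sq_id of this stage to accept-queue
// aq_id of the next one.
int route_bit(int dest, int cs_id, int n) {
    return (dest >> (n - cs_id)) & 1;   // cs_id-th MSB of the routing tag
}
int next_queue(int sq_id, int bit, int N) {
    return 2 * (sq_id % (N / 2)) + bit; // perfect-shuffle link
}

For example, with N = 8 (n = 3), a packet at send-queue 5 of stage 1 destined for output 6 (binary 110) reads route bit 1 and therefore moves to accept-queue 2 * (5 % 4) + 1 = 3 of stage 2.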

DualPriority_SEs_Process (cs_id, use_id, bm)

Input: Current stage id (cs_id); Switching Element id (use_id) of the upper segment; blocking mechanism (bm).

{
  lse_id = use_id + N/4;
  // where lse_id is the adjacent Switching Element (SE) of the lower segment
  r = random();  // where r ∈ [0..1)
  if (r < 0.5)  // upper segment is clocked first
  {
    // process for upper queues of SEs
    if (Hp_Pop[2*use_id][cs_id] > 0)
      // high priority packets have precedence over low priority ones
      Hp_SendQueue_Process(cs_id, 2*use_id, bm);
    else  // high priority queue is empty
      Lp_SendQueue_Process(cs_id, 2*use_id, bm);
    if (Hp_Pop[2*lse_id][cs_id] > 0)
      Hp_SendQueue_Process(cs_id, 2*lse_id, bm);
    else
      Lp_SendQueue_Process(cs_id, 2*lse_id, bm);
    // process for lower queues of SEs
    if (Hp_Pop[2*use_id + 1][cs_id] > 0)
      Hp_SendQueue_Process(cs_id, 2*use_id + 1, bm);
    else
      Lp_SendQueue_Process(cs_id, 2*use_id + 1, bm);
    if (Hp_Pop[2*lse_id + 1][cs_id] > 0)
      Hp_SendQueue_Process(cs_id, 2*lse_id + 1, bm);
    else
      Lp_SendQueue_Process(cs_id, 2*lse_id + 1, bm);
  }
  else  // lower segment is clocked first
  {
    // process for upper queues of SEs
    if (Hp_Pop[2*lse_id][cs_id] > 0)
      // high priority packets have precedence over low priority ones
      Hp_SendQueue_Process(cs_id, 2*lse_id, bm);
    else  // high priority queue is empty
      Lp_SendQueue_Process(cs_id, 2*lse_id, bm);
    if (Hp_Pop[2*use_id][cs_id] > 0)
      Hp_SendQueue_Process(cs_id, 2*use_id, bm);
    else
      Lp_SendQueue_Process(cs_id, 2*use_id, bm);
    // process for lower queues of SEs
    if (Hp_Pop[2*lse_id + 1][cs_id] > 0)
      Hp_SendQueue_Process(cs_id, 2*lse_id + 1, bm);
    else
      Lp_SendQueue_Process(cs_id, 2*lse_id + 1, bm);
    if (Hp_Pop[2*use_id + 1][cs_id] > 0)
      Hp_SendQueue_Process(cs_id, 2*use_id + 1, bm);
    else
      Lp_SendQueue_Process(cs_id, 2*use_id + 1, bm);
  }
}

Algorithm 3.3: Switching Element process for dual-priority MINs
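To show how the three algorithms above fit together, we outline below a hypothetical driver loop for the simulator. The helper names and the empty stub bodies are ours, as is the choice to serve stages from the last towards the first (one straightforward way, in a sequential simulator, to honour the backpressure rule); only the run lengths (10^3 warm-up plus 10^5 measured cycles) and the call to algorithm 3.3 are taken from the text.

// Hypothetical hooks; in the real simulator these correspond to the
// routines of algorithms 3.1-3.3 plus packet injection/removal at the edges.
void reset_statistics() {}
void inject_packets() {}
void drain_outputs() {}
void DualPriority_SEs_Process(int cs_id, int use_id, char bm) {}

const int WARMUP = 1000, MEASURED = 100000;  // 10^3 + 10^5 network cycles

void run_simulation(int n_stages, int N, char bm /* blocking mechanism */) {
    for (int cycle = 0; cycle < WARMUP + MEASURED; ++cycle) {
        if (cycle == WARMUP) reset_statistics();  // discard warm-up counts
        inject_packets();                         // new arrivals at stage 1
        // Serve stages from last to first so that the buffer state of stage
        // k+1 is settled before stage k decides whether it may forward.
        for (int cs_id = n_stages; cs_id >= 1; --cs_id)
            for (int use_id = 0; use_id < N / 4; ++use_id)
                DualPriority_SEs_Process(cs_id, use_id, bm);
        drain_outputs();                          // sinks absorb packets
    }
}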

3.5.1 Dual Priority MINs vs. Single Priority Ones

In this chapter we address the performance evaluation of the 2-class priority scheme for MINs, aiming to gain insight into the effects of each factor on the overall performance of this MIN class. In this section, we present our findings and compare different configurations of 2-class priority MINs; we also compare the performance metrics of 2-class priority MINs against the single-priority MIN class.

Figure 3.4: Th_total of finite-buffered, 10-stage, dual- vs. single-priority MINs

Figure 3.4 illustrates the gains in total normalized throughput of a MIN using a 2-class priority scheme versus a single priority one. In the diagram, curve 2P[10]B[b]H[20] depicts the total normalized throughput of a 2-class priority, 10-stage MIN, under various buffer-length setups (b = 1, 2, 4), when the ratio of high priority packets is 20%. Similarly, curve 1P[10]B[b] shows the corresponding normalized throughput of a single priority, 10-stage MIN, under the same buffer-length setups (b = 1, 2, 4). In this figure, all curves represent the performance factor of normalized throughput at different offered loads (λ = 0.1, 0.2, ..., 1).

We can notice here that the gains in total normalized throughput of a 2-class priority scheme for a 10-stage MIN versus a single priority one are 23%, 12.6%, and 7.4%, when the buffer lengths are 1, 2, and 4 respectively, under a high-priority ratio of 20% and full load traffic conditions. The throughput gains can mainly be attributed to the exploitation of the extra buffer spaces available in the SEs of the 2-class priority MINs: recall that SEs in a single-buffered MIN supporting one priority class have a single buffer available per incoming link; in single-buffered MINs supporting two priorities, however, SEs have one buffer for high-priority packets and one buffer for low-priority packets per input link. The normalized throughput of single-buffered MINs supporting two priorities nevertheless appears inferior to that of double-buffered single-priority MINs in figure 3.4, because the extra buffer available in dual-priority MINs is exploited only for high-priority packets (20% of the total traffic), and thus remains unexploited when no high-priority packets are available. Contrary to that, double-buffered single-priority MINs can exploit the extra buffer space for any packet, with no restriction whatsoever.

In figure 3.4 we can finally notice that the input load at which the dual-priority MIN's performance starts to have an edge over its single-priority counterpart is smaller for single-buffered MINs (λ = 0.3) and larger for double-buffered (λ = 0.5) and quad-buffered MINs (λ = 0.6). These loads correspond to the points where the probability that the single-priority MIN buffers are full (leading thus to packet blockings) exceeds a certain threshold, having therefore observable effects.

Figure 3.5: RTh(h) of finite-buffered, 10-stage, dual-priority MIN
Figure 3.6: RTh(l) of finite-buffered, 10-stage, dual-priority MIN

Figure 3.5 depicts the metric of relative normalized throughput for high priority packets in a MIN using the 2-class priority scheme, and the (overall) relative normalized throughput for single-priority MINs. All measurements apply to a 10-stage MIN and, when packets of two priorities are considered, high priority packets account for 20% of the overall traffic; measurements have been collected for buffer lengths b = 1, 2, 4. It is worth noting that the relative normalized throughput of high priority packets is improved dramatically for all configuration setups, approaching the optimal value (Th_max = 1), especially when b >= 2, under full load traffic conditions. Practically, for b >= 2 and under the examined conditions, blocking events for high-priority packets were very rare.

Figure 3.6 illustrates the relative normalized throughput for low priority packets in a MIN using the 2-class priority scheme, and the (overall) relative normalized throughput for single-priority MINs. Considering the performance curves for MIN pairs (dual-priority and single-priority) with equal buffer sizes (b = 1, 2, 4), we can identify three segments:

An initial segment, where the performance of single-priority MINs is identical to that of their dual-priority counterparts. This segment corresponds to the load range in which the available buffer space in the single-priority MIN is adequate, and blockings are mostly due to packets in the same SE contending for the same output link, rather than due to buffer unavailability at the next MIN stage.

A middle segment, where the normalized throughput of the low-priority packets in the two-priority MIN is superior to the (overall) normalized throughput in a single-priority MIN. The beginning of this segment corresponds to the load point where blockings due to buffer unavailability begin to play a part in the MIN performance. In this segment, the gains obtained from the exploitation of the extra buffer space in the dual priority MINs are higher than the penalization incurred for low-priority packets due to the fact that they yield to high-priority ones.

An ending segment, where the normalized throughput of the low-priority packets in the two-priority MIN is inferior to the (overall) normalized throughput in a single-priority MIN. This corresponds to the load range where the yielding of low-priority packets incurs a higher penalty than the gains obtained due to the availability of the extra buffer space. Especially at loads close to 1, buffer space for low-priority packets is already saturated and low-priority packets are further delayed because high-priority packets are preferred for transmission, when present.

In all cases, the maximum deterioration recorded is 15.6% for b = 1, 13% for b = 2 and 8.48% for b = 4. This deterioration can be considered tolerable, especially considering the gains achieved for high-priority packets.

Figure 3.7: D(h) of finite-buffered, 10-stage, dual-priority MIN
Figure 3.8: D(l) of finite-buffered, 10-stage, dual-priority MIN

Figure 3.7 presents the corresponding decrease in normalized delay for high priority packets under the 2-class priority scheme vs. the single priority one for a 10-stage MIN, with high priority packets making up 20% of the offered load.

It is noteworthy that the improvement in high priority packet delays is considerable for all the above buffer-length configurations of the MIN. The normalized delay is reduced dramatically to D(h) = 1.07-1.09, approaching the optimal value D_min = 1. It also follows that the reduction of normalized delays for high-priority packets in a 2-class priority scheme is stronger at larger buffer-length configurations, where the packet delays have greater values in the corresponding single priority MINs.

Figure 3.8 illustrates the normalized delay for low priority packets in a MIN using the 2-class priority scheme vs. the single priority one. Similarly to the case of normalized throughput for low-priority packets, when examining the performance curves for MIN pairs (dual-priority and single-priority) with equal buffer sizes (b = 1, 2, 4), we can again identify three segments.

Figure 3.9: RTh(h) of finite-buffered, k-stage, dual-priority MIN
Figure 3.10: RTh(l) of finite-buffered, k-stage, dual-priority MIN

Figures 3.9 and 3.10 depict the relative normalized throughput for high and low priority packets respectively, in a k-stage MIN, where k = 6, 8, 10, using a 2-class priority scheme, under a high priority ratio of 30% and full load traffic conditions, versus the buffer length of the MIN. A high-priority packet ratio of 30% was used in these diagrams to make the effects of the introduction of priority handling more discernible, especially for low priority packets (results for high-priority packets for the 20% ratio case are similar, showing only a slight improvement for b = 1). We notice again that the relative normalized throughput of high priority packets is improved dramatically for all network size setups, approaching the optimal value (Th_max = 1), especially when b >= 2. On the other hand, the loss of normalized throughput for the corresponding low priority packets ranged from 9% to 24.6%, which is tolerable for all network size and buffer-length configurations.

We can also notice that the relative normalized throughput appears to drop as the number of MIN stages increases (for low-priority packets and for single-priority MINs): this happens because, although the overall number of packets traversing the network in the unit of time increases along with the number of stages, this increment is less than the theoretical growth of the MIN routing capacity, which the definition of the relative normalized throughput takes into account (recall that the normalized throughput metric divides the number of packets traversing the network in the unit of time by the network size, to express the extent to which the MIN's routing capacity is exploited). An equivalent reading of this phenomenon is that fewer packets per input source reach their destination per unit of time when the MIN size increases. This performance degradation is due to the fact that each extra MIN stage introduces an additional point where blockings may occur, mainly due to contentions for the same output link. This is especially true under the full load condition considered in figure 3.10, while for lighter MIN loads the drop is less observable. MIN designers should take this fact into account when they need to upsize their network installations, and take additional actions if they want to maintain the throughput per input source; two prominent approaches are the super-linear increase of the network size (leaving some inputs unconnected) and the addition of extra buffer space in the SEs.

Figure 3.11: Upf(h) of finite-buffered, 10-stage, dual-priority MIN
Figure 3.12: Upf(l) of finite-buffered, 10-stage, dual-priority MIN

Finally, figures 3.11 and 3.12 illustrate the relation of the combined performance indicator Upf of a 2-class, 10-stage MIN to the offered load, for high and low priority packets respectively, under different buffer size configurations (b = 1, 2, 4), when the ratio of high priority offered load is 20%. Recall from section 3.4 that the combined performance indicator Upf depicts the overall performance of a MIN, considering the weights of each individual performance factor (throughput and packet delay) to be of equal importance.

In figure 3.11 we notice that the value of the universal performance factor decreases (thus MIN performance improves) when the buffer size increases, except for the case of single-priority MINs with b = 4 operating under medium and high loads (λ >= 0.6), in which case the universal performance factor deteriorates. This holds because the delay in these cases increases rapidly, while the gains in throughput are very small.

In figure 3.12 we can observe the behaviour of the universal performance factor for low priority packets in dual-priority MINs, and the (overall) universal performance factor for single-priority MINs, when considering different offered loads. Consistently with the respective findings for normalized throughput and delay, three segments can be identified when examining the performance curves for MIN pairs (dual-priority and single-priority) with equal buffer sizes (b = 1, 2, 4): an initial segment with identical performance among pairs, a middle segment where the dual-priority MIN outperforms the single-priority one, and a final segment where the dual-priority MIN lags behind the single-priority one. This is to be expected, since the universal performance factor combines the individual metrics of normalized throughput and delay, and since these metrics exhibit a common behaviour, this behaviour is also exhibited in the combined metric.

3.5.2 Dual Priority MINs with Asymmetric-sized Buffer Queues

Figure 3.13: Th_total of asymmetric-sized, 10-stage, dual-priority MIN

In this subsection we introduce a variation of double-buffered SEs that uses asymmetric buffer sizes in order to offer different quality-of-service parameters to packets that have different priorities, while providing in parallel optimal overall network performance. We note here that the particular buffer sizes have been chosen since they have been reported in the previous subsection to provide optimal overall network performance: indeed, it is observed that for smaller buffer sizes (1) the network throughput drops due to high blocking probabilities, whereas for higher buffer sizes (4 and 8) packet delay increases significantly (and the SE hardware cost also rises).

In figure 3.13, curves 1P[10]B[b] depict the normalized throughput of a 10-stage MIN under a single priority mechanism, when the buffer length is b = 2, 4. Similarly, curves 2P[10]B[b_l,b_h]H[20] show the total normalized throughput of a 10-stage MIN under a 2-class priority mechanism, when the buffer length for low and high priority packets is b_l = 2, 3 and b_h = 2, 1 respectively, and the probability of high priority packet appearance is 20%. According to this figure, the gain in total normalized throughput of a double-buffered MIN employing a 2-class priority mechanism (curve 2P[10]B[2,2]H[20]) vs. the corresponding single priority one (curve 1P[10]B[2]) is 12.6% under full traffic load. Considering that the rate of high priority packets is relatively low, and configuring thus an asymmetric buffer-sized system (curve 2P[10]B[3,1]H[20]), the total normalized throughput is further improved, to 14.1%, approaching that of a single priority mechanism with buffer length b = 4, where all buffers serve all packets.

Figure 3.14: RTh(h) of asymmetric-sized, 10-stage, dual-priority MIN
Figure 3.15: RTh(l) of asymmetric-sized, 10-stage, dual-priority MIN

Figures 3.14 and 3.15 depict the relative normalized throughput of high and low priority packets respectively. According to figure 3.14, both curves employing the 2-class priority mechanism approach the optimal value Th_max = 1 of this performance factor. It is obvious that, when the buffer-length setup for high priority packets is b_h = 2 (curve 2P[10]B[2,2]H[20]), the relative normalized throughput appears further improved, but the gains are marginal. Figure 3.15 presents the case of low-priority packet throughput; in this figure we can observe that the relative normalized throughput of low priority packets is considerably better when the buffer-length setup for high priority packets is b_h = 1 (curve 2P[10]B[3,1]H[20]), as compared to the case of having equal-size buffers for high and low priority packets (curve 2P[10]B[2,2]H[20]). The performance difference between the two setups is approximately 20% for medium and high network loads (λ >= 0.5). We can also observe that the asymmetric-sized buffer setup offers superior service to the low-priority packets as compared to the single-priority scheme, mainly owing to the one additional buffer position available in the asymmetric setup to packets of this class. The performance improvement appears for medium and high network loads (λ >= 0.5) and ranges from 8% to 21%.

Figure 3.16: D(h) of asymmetric-sized, 10-stage, dual-priority MIN
Figure 3.17: D(l) of asymmetric-sized, 10-stage, dual-priority MIN

Figures 3.16 and 3.17 present the findings for the normalized delay performance metric. In figure 3.16 we observe that both 2-priority schemes (i.e. the equal-sized buffer and the asymmetric-sized buffer scheme) have a clear edge over the single-priority mechanism, which ranges from 18% at 30% load to over 96% at full load. The difference, however, between the performance of the equal-sized buffer scheme and the asymmetric-sized buffer scheme is very small, less than 4% in all cases. Conversely, when low priority packets are considered (figure 3.17), the equal-sized buffer scheme is found to have delays close to the single-priority scheme, with the worst case being a deterioration of 6.7% at offered load λ = 1. In the asymmetric-sized buffer setup, however, the deterioration is considerable, especially at high loads (13% at λ = 0.6, rising up to 24.4% as compared to the equal-sized buffer setup at λ = 1).

Figure 3.18: Upf(h) of asymmetric-sized, 10-stage, dual-priority MIN
Figure 3.19: Upf(l) of asymmetric-sized, 10-stage, dual-priority MIN

Figures 3.18 and 3.19 depict the behaviour of the universal performance factor metric for high- and low-priority packets, respectively, in correlation to the offered load.

We can observe in figure 3.18 that when the load of the network is relatively low (λ <= 0.4), all configurations have identical performance; however, when the network load increases, the overall performance of the single-priority configuration quickly deteriorates, as compared to the setups supporting two priorities. The asymmetric-sized buffer configuration shows almost identical performance to the equal-sized buffer configuration in this case, and both these performances are close to the optimal one.

Regarding low-priority packets, again the overall performance of all configurations is identical for light network loads (λ <= 0.4). Beyond this point, the single-priority setup exhibits the most stable behaviour, with the value of the universal performance factor for low-priority packets Upf(l) being close to 1.5; the single-priority setup has a clear advantage over the dual-priority schemes for offered loads λ >= 0.7. The configurations supporting two priorities exhibit a wider performance fluctuation, with the asymmetric-sized buffer configuration having a performance edge for network loads between 0.5 and 0.6, while for network loads λ >= 0.7 the performance advantage moves to the equal-sized buffer configuration side, not exceeding though 5.5% in any case.

3.6 Conclusions for Dual Priority MINs

In this chapter we have addressed the performance evaluation of dual priority MINs. We have presented an analytical model of their operation, employing a scheme that takes into account both the previous and the last state of the switching elements, providing thus better accuracy than schemes considering only the last state. We have also evaluated the performance of 2-class priority MINs under varying offered loads and buffer sizes, considering the high-priority and low-priority packet classes, as well as the cumulative performance of the MIN, and compared these metrics against the corresponding performance figures of single-priority MINs. In this chapter, we have taken into account the two most important network performance metrics, namely throughput and packet delay. The diagrams and discussions given may be used by network designers to tune the parameters of their installations so as to obtain optimal performance for the communication requirements of their environments.

Moreover, we have introduced an asymmetric buffer size configuration for MINs supporting two packet priority classes, and compared its performance against both the single-priority scheme and the equal-sized buffer configuration of two-priority MINs under different traffic loads. The asymmetric-sized buffer configuration has been found to better exploit network resources and capacity, since the available buffers are more fittingly allocated to the priority class needing them. More specifically, when comparing the asymmetric buffer size configuration against its equal-sized buffer counterpart, we found that the former provides better overall throughput and significantly better low-priority packet throughput; for high-priority packets, on the other hand, the performance of the two schemes is almost identical, with the equal-sized buffer scheme having a small edge. The asymmetric-sized buffer configuration achieves these performance benefits because it better matches buffer allocation to the shape of network traffic.

Chapter 4

2-Class Priority Multi-Layer MINs under Hotspot Traffic

4.1 Introduction
4.2 Analysis of 2-Class Priority Multi-Layer MINs under Hotspot Environment
4.3 Performance Evaluation Parameters and Methodology of 2-Class Priority MINs under Hotspot Environment
4.4 Simulation and Performance Results of 2-Class Priority Multi-Layer MINs under Hotspot Environment
4.4.1 Simulator Validation for 2-Class Priority MINs under Hotspot Environment
4.4.2 2-Class Priority Single-Layer MINs Performance under Hotspot Environment
4.4.3 2-Class Priority Multi-Layer MINs Performance under Hotspot Environment
4.5 Conclusions for 2-Class Priority Multi-Layer MINs under Hotspot Environment

4.1 Introduction

Multistage Interconnection Networks (MINs) with crossbar Switching Elements (SEs) are frequently proposed as an interconnection infrastructure in parallel multiprocessor systems and network systems alike. In the domain of parallel systems, MINs undertake processor-to-memory communication, whereas in network systems they are typically used in communication devices such as gigabit Ethernet switches, terabit routers, and ATM switches. The significant advantages of MINs are their good performance, their low cost/performance ratio and their ability to route multiple communication tasks concurrently. MINs with the Banyan [16] property, such as Omega Networks [26], Delta Networks [35], and Generalized Cube Networks [2], are generally preferred over non-Banyan MINs, since the latter are in general more expensive than Banyan MINs and more complex to control.

Due to the advent of MINs, much research has been devoted to the investigation of their performance under various configurations and traffic conditions, and proposals have been made for the improvement of their performance. The main aspects that have been considered in these works are the buffer size of switching elements (e.g. [19, 58]), MIN size (number of stages, e.g. [58, 59]), traffic patterns (including uniform vs. hotspot, e.g. [14, 38, 39, 58, 6], and unicast vs. broadcast/multicast, e.g. [48, 34]), and packet priorities (e.g. [38, 39]). Performance evaluation has followed two distinct paths, the first one employing analytical methods such as Markov chains, queuing theory and Petri nets, while the second path uses simulation. Architectural issues (e.g. multilayer configurations [50] and wiring [29]) and routing algorithms (e.g. [54]) have also been considered in research efforts.

MIN performance under hotspot traffic and multiple priorities is receiving increasing attention, due to their correspondence with traffic patterns in real-world systems. Packet priority is a common issue in networks, arising when some packets need to be offered better quality of service than others. Packets with real-time requirements (e.g. from streaming media) vs. non-real-time packets (e.g. file transfers), and out-of-band data vs. ordinary TCP traffic [43], are two examples of such differentiations. On the other hand, hotspot traffic is a typical situation when a server is deployed in some environment and clients access it frequently to obtain data and services, or when multiple network devices are interconnected via trunk ports. Insofar, however, the joint effect of packet priorities and hotspot traffic on the performance of MINs has not received adequate research attention. [39] is a work that has reported on this issue, but it discusses an extreme hotspot situation, where all inputs send traffic to a specific output link and, additionally, all high-priority traffic is sent by a single input. Moreover, the MINs considered in these works are single-buffered, while the previous chapters have shown that using double buffering or asymmetric buffering leads to elevated performance.

In this chapter we examine performance aspects of 2-class priority MINs under hotspot traffic conditions, considering different rates of offered load. We additionally take into account the differences in the performance of the MIN outputs under hotspot traffic identified in [37], according to which the performance of each output depends on the amount of overlapping that the path to the specific output has with the path to the hotspot output. We present metrics for the two most important network performance factors, namely throughput and delay, and we also calculate and present the performance in terms of the universal performance factor introduced in the previous chapters, which combines throughput and delay into a single metric, allowing the designer to express the perceived importance of each individual factor through weights.

The rest of this chapter is organized as follows: in section 4.2 we briefly analyze the operation of a Delta Network operating under hotspot traffic conditions and natively supporting 2-class routing traffic. Subsequently, in section 4.3 we introduce the performance criteria and parameters related to this network. Section 4.4 presents the results of our performance analysis, which has been conducted through simulation experiments, while section 4.5 concludes this chapter.

4.2 Analysis of 2-Class Priority Multi-Layer MINs under Hotspot Environment

Recall from the previous chapters that a Multistage Interconnection Network (MIN) is generally defined as a network interconnecting a group of N inputs to a group of M outputs using several stages of small-size Switching Elements (SEs). Each SE has a number of input and output links (this number is called the degree of the SE) and is followed (or preceded) by link states. MINs with the Banyan property are defined in [16] and are characterized by the fact that there is exactly one unique path from each source (input) to each sink (output). A Banyan MIN of size (N X N) (i.e. connecting N inputs to N outputs) can be constructed by n = log_c N stages of (cxc) SEs, where c is the degree of the SEs. At each stage there are exactly N/c SEs.

Figure 4.1: An 8X8 delta-2 network with hotspot traffic

An example MIN of size 8X8 is illustrated in figure 4.1. This MIN is assumed to natively support two priorities and to have a single hotspot output, namely output 0, to which all inputs (0-7) direct an increased share of the traffic they generate. Under this traffic scheme, all SEs can be classified into two different groups: Group-hst and Group-nt, where hst stands for those SEs which receive and forward hotspot traffic, while nt stands for those SEs which receive only normal traffic, i.e. they are free of hotspot traffic. In figure 4.1 we can distinguish the following categories of outputs:

output 0, which is the hotspot output.

output 1, which is the output adjacent to the hotspot output. Packets directed to this output have to contend with packets addressed to the hotspot output at all stages of the MIN, and they are free of such contention only when traversing the output link.

outputs 2 and 3, which are free of contention with packets addressed to the hotspot output when they traverse the last stage of the MIN. These outputs are termed Cold-1, since they are free of contention with hotspot traffic for one stage.

outputs 4-7, which are free of contention with packets addressed to the hotspot output when they traverse the last two stages of the network, and are thus termed Cold-2.

Generalizing, in an i-stage MIN, the output ports can be classified into the following (i + 1) zones: hotspot, adjacent, and cold-j (1 <= j <= i-1); a short sketch classifying output ports into these zones is given after the operating-conditions list below.

Regarding priority support for the MIN depicted in figure 4.1, we can observe that individual queues have been added for both high and low priority packets. Thus, each SE has two transmission queues per link, with one queue dedicated to high priority packets and the other dedicated to low priority ones. Each queue is assumed to have two buffer positions for incoming packets.

Figure 4.2: A lateral view of an 8X8 multi-layer MIN

In this chapter we also extend previous studies by considering multi-layer MINs; the lateral view of a typical configuration of an (8X8) multi-layer MIN is depicted in figure 4.2 and outlined below. The example network consists of two segments, an initial single-layer one and a subsequent multi-layer one (with 2 layers), operating under a hotspot traffic environment. According to figure 4.2, it is worth noting that packet forwarding from stage 2 to stage 3 is blocking-free, since packets in stage-2 SEs do not contend for the same output link. Note that according to [50], blocking can occur at the MIN outputs, where SE outputs are multiplexed, if either the multiplexer or the data sink does not have enough capacity; in this chapter, however, we will assume that both multiplexers and data sinks have adequate capacity.

Summarizing the above, a dual-priority, finite-buffered MIN is assumed to operate under the following conditions in a hotspot environment:

Routing is performed in a pipeline manner, meaning that the routing process occurs in every stage in parallel.

Internal clocking results in synchronously operating switches in a slotted time model [44], and all SEs have deterministic service time.

At each input of the network only one packet can be accepted within a time slot. All packets at input ports contain both the data to be transferred and the routing tag. The priority of each packet is indicated through a priority bit in the packet header. Under the 2-class priority mechanism, when applications enter a packet into the network they specify its priority, designating it either as high or low.

The offered load at all inputs of the network is uniform, all packets have the same size, and the arrivals are independent of each other.

There is a FIFO buffer in front of each SE, enabling the packets of a message to be stored until they can be forwarded to the succeeding stage of the network.

The backpressure mechanism deals with packets directed toward full buffers of the next stage, forcing them to stay in their current stage until the destination(s) become(s) available, so that no packets are lost inside the MIN.

Under the 2-class priority scheme, the SE considers all its links, examining for each one of them firstly the high priority queue. If this is not empty, it transmits the first packet towards the successive MIN stage; the low priority queue is checked only if the corresponding high priority queue is empty. In all cases, at most one packet per link (upper or lower) of a SE will be forwarded, for each pair of high and low priority queues, to the next stage.

Conflicts between packets are resolved randomly with equal probabilities.

There is an initial fraction f_hs of the total offered load λ that is routed to the single hotspot output port. This fraction is exclusively low-priority traffic. The remaining packets, i.e. [λ (1 - f_hs)], are both high- and low-priority packets and are uniformly distributed across all destinations. That means every output of the network, except for the hotspot, has an equal probability of being one of the destinations of a packet. Note also that this input rate [λ (1 - f_hs)] is addressed to all outputs, including the hotspot one; thus an additional load of [λ (1 - f_hs)]/N is routed towards the hotspot output (including high- and low-priority packets).

Packets are removed from their destinations immediately upon arrival, thus packets cannot be blocked at the last stage.
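The output-zone classification described above can be expressed compactly; the following sketch (the function name and string labels are ours) maps each output port of an n-stage MIN whose hotspot is output 0 to its zone, and implicitly gives the zone sizes, since zone cold-j contains the 2^j ports in [2^j, 2^(j+1)).

#include <string>

// Zone of an output port in an n-stage MIN with hotspot output 0:
// port 0 is the hotspot, port 1 the adjacent output, and ports in
// [2^j, 2^(j+1)) form zone cold-j (1 <= j <= n-1), whose path to the
// sink is free of contention with hotspot traffic for the last j stages.
std::string zone_of(int port) {
    if (port == 0) return "hotspot";
    if (port == 1) return "adjacent";
    int j = 0;
    while (port > 1) { port >>= 1; ++j; }   // j = floor(log2(port))
    return "cold-" + std::to_string(j);
}

For the 8X8 network of figure 4.1, zone_of maps ports 2 and 3 to cold-1 and ports 4-7 to cold-2, matching the enumeration above.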

4.3 Performance Evaluation Parameters and Methodology of 2-Class Priority MINs under Hotspot Environment

The following major parameters affect the performance of the test-bed 2-class priority MIN under a single hotspot environment.

Buffer size b of a high or low priority queue is the maximum number of such packets that the corresponding input buffer of an SE can hold. In this chapter we consider a double-buffered MIN, where b = 2. We note here that this particular buffer size has been chosen since it was reported in the previous chapters to provide optimal overall network performance: indeed, it is observed that for smaller buffer sizes (1) the network throughput drops due to high blocking probabilities, whereas for higher buffer sizes (4 and 8) packet delay increases significantly (and the SE hardware cost also rises).

Offered load λ is the steady-state fixed probability of arriving packets at each queue on the inputs. In our simulation λ is assumed to be λ = 0.1, 0.2, ..., 0.9, 1. This probability can be further broken down into λ_hs, λ_hp and λ_lp, which represent the arrival probability of the initial hotspot traffic, and of the high and low priority traffic of the remaining offered load, respectively. It holds that λ = λ_hs + λ_hp + λ_lp.

Number of stages n is the number of stages of an (N X N) MIN, where n = log_2 N. In our simulation n is assumed to be n = 6, which is a widely used MIN size.

Hotspot fraction f_hs is the fraction of the initial hotspot traffic, which is considered to be f_hs = 0.05. We fix f_hs to this value, since using a higher value for a network of this size would lead to quick saturation of the paths to the hotspot output.

Ratio of high priority packets r_hp is the ratio of high priority offered load for the normal traffic - i.e. excluding the traffic addressed to the initial hotspot - which is uniformly distributed among all output ports; it is assumed to be r_hp = 0.20. This ratio is generally adopted in works considering multiple priorities ([38, 39]).

Consequently,

λ_hs = λ · f_hs
λ_hp = λ · r_hp · (1 - f_hs)
λ_lp = λ · (1 - r_hp) · (1 - f_hs)

Aiming to analyze the performance of an (N X N) Delta Network with n = log_2 N intermediate stages of (cxc) SEs, the following metrics are used. Let T be a relatively large time period divided into u discrete time intervals (τ_1, τ_2, ..., τ_u).

Average throughput Th_avg(zone) of a specific output zone of the MIN, where zone = {hotspot, adjacent, cold-1, ..., cold-(n-1)}, is the average number of packets accepted by all destination ports of this zone per network cycle. Formally, Th_avg(zone) is defined as

Th_avg(zone) = lim_{u→∞} ( Σ_{k=1}^{u} n_zone(k) ) / u    (4.1)

where n_zone(k) denotes the total number of packets routed to this specific output zone that reach their destinations during the k-th time interval.

Normalized throughput Th(zone) of a specific output zone of the MIN is the ratio of the corresponding average throughput Th_avg(zone) to the total number of output ports N(zone) of the zone. Formally, Th(zone) can be expressed by

Th(zone) = Th_avg(zone) / N(zone)    (4.2)

where N(zone) = 1, 1, 2, ..., 2^(n-1) for zone = {hotspot, adjacent, cold-1, ..., cold-(n-1)} respectively, reflecting how effectively the network capacity of each output zone of the MIN is used.

Relative normalized throughput of hotspot traffic RTh_hs is the normalized throughput Th(hotspot) of the hotspot output port divided by the corresponding ratio of packets on all input ports which are routed to the single hotspot output port:

RTh_hs = Th(hotspot) / [N · λ · f_hs + λ · (1 - r_hp) · (1 - f_hs)]    (4.3)

Relative normalized throughput of high priority traffic RTh_hp is the normalized throughput Th_hp of high priority packets routed to all output zones divided by the corresponding ratio of high priority packets on the input ports:

RTh_hp = Th_hp / [λ · r_hp · (1 - f_hs)]    (4.4)

We do not report a different RTh_hp for each zone, since our experiments have shown that this parameter is not affected by the zone when the MIN operates under the parameter ranges listed above.

Relative normalized throughput of low priority traffic RTh_lp(zone) routed to a specific zone of output ports is the normalized throughput Th_lp(zone) of such packets divided by the corresponding ratio of low priority packets on the input ports:

RTh_lp(zone) = Th_lp(zone) / [λ · (1 - r_hp) · (1 - f_hs)]    (4.5)

Average packet delay D_avg(zone) of packets routed to a specific output zone of the MIN is the average time these packets spend to pass through the network. Formally, D_avg(zone) is expressed by

D_avg(zone) = lim_{u→∞} ( Σ_{k=1}^{n_a(zone,u)} t_d(zone,k) ) / n_a(zone,u)    (4.6)

where n_a(zone,u) denotes the total number of packets accepted within u time intervals, while t_d(zone,k) represents the total delay for the k-th packet to traverse from an input port to a port of the specific output zone. We consider t_d(zone,k) = t_w(zone,k) + t_tr(zone,k), where t_w(zone,k) denotes the total queuing delay for the k-th packet waiting at each stage for the availability of a buffer at the next stage of the network. The second term t_tr(zone,k) denotes the total transmission delay for the k-th packet at each stage of the network, which is simply n · nc, where n = log_2 N is the number of intermediate stages and nc is the network cycle.
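In terms of raw simulation counters, the zone-based normalization of equations (4.1) and (4.2) amounts to the small sketch below; it assumes a per-zone counter of delivered packets is maintained by the simulator, and the names are illustrative.

// Normalized per-zone throughput, eqs (4.1)-(4.2): packets delivered to
// the zone per network cycle, divided by the zone's port count N(zone),
// which is 1, 1, 2, 4, ..., 2^(n-1) for hotspot, adjacent, cold-1, ...,
// cold-(n-1) respectively.
double zone_throughput(long delivered_to_zone, long cycles, int zone_ports) {
    double th_avg = static_cast<double>(delivered_to_zone) / cycles;  // eq (4.1)
    return th_avg / zone_ports;                                       // eq (4.2)
}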

Normalized packet delay D(zone) is the ratio of D_avg(zone) to the minimum packet delay, which is simply the transmission delay n · nc (i.e. zero queuing delay). Formally, D(zone) can be defined as

D(zone) = D_avg(zone) / (n · nc)    (4.7)

Universal performance factor Upf(zone) is defined through a formula involving the two major normalized factors above, namely D(zone) and RTh(zone): the performance of a zone of the MIN is considered optimal when D(zone) is minimized and RTh(zone) is maximized, thus the formula for computing the universal factor is arranged so that the overall performance metric follows that rule. Formally, Upf(zone) can be expressed by

Upf(zone) = \sqrt{ D(zone)^2 + [1 / RTh(zone)]^2 }    (4.8)

Recall from the previous chapters that, obviously, when the packet delay factor of a specific zone becomes smaller and/or the throughput factor of this zone becomes larger, the corresponding universal performance factor Upf(zone) becomes smaller. Consequently, as the universal performance factor Upf(zone) becomes smaller, the overall performance of this specific zone of the MIN is considered to improve. Because the above factors (parameters) have different measurement units and scaling, we normalize them to obtain a reference value domain. Normalization is performed by dividing the value of each factor by the (algebraic) minimum or maximum value that this factor may attain. Thus, equation (4.8) can be replaced by:

Upf(zone) = \sqrt{ [(D(zone) - D(zone)_min) / D(zone)_min]^2 + [(RTh(zone)_max - RTh(zone)) / RTh(zone)]^2 }    (4.9)

where D(zone)_min is the minimum value of the normalized packet delay D(zone) and RTh(zone)_max is the maximum value of the relative normalized throughput. Consistently with equation (4.8), when the universal performance factor Upf(zone), as computed by equation (4.9), is close to 0, the performance of the specific zone of the MIN is considered optimal, whereas, when the value of Upf(zone) increases, its performance deteriorates. Finally, taking into account that the values of both delay and throughput appearing in equation (4.9) are normalized, D(zone)_min = RTh(zone)_max = 1, thus the equation can be simplified to:

Upf(zone) = \sqrt{ [D(zone) - 1]^2 + [(1 - RTh(zone)) / RTh(zone)]^2 }    (4.10)

4.4 Simulation and Performance Results of 2-Class Priority Multi-Layer MINs under Hotspot Environment

The overall network performance of finite-buffered MINs under a hotspot environment was evaluated, again, by developing a special-purpose simulator in C++, capable of handling dual-priority traffic. This type of modeling [52] using simulation experiments was applied due to the complexity of the mathematical model (e.g. [53, 13]), stemming from the combination of multi-priority with hotspot traffic. Several input parameters such as the buffer length, the number of input and output ports, the initial hotspot fraction, the ratio of high priority packets, and the number of layers were considered (e.g. algorithm 4.1). Internally, each SE was modelled by four non-shared buffer queues, the first two dedicated to high priority packets and the other two to low priority ones, where buffer operation was based on the FCFS principle. All simulation experiments were performed at packet level, assuming fixed-length packets transmitted in equal-length time slots, where the slot was the time required to forward a packet from one stage to the next. Contention between two packets was resolved by favoring the packet transmitted from the queue in which high priority packets were stored; contention between equal-priority packets was resolved by choosing one of the packets randomly for transmission, whereas the other packet was blocked.

InputQueue_Process (iq_id, f_hs, r_hp, p_ol, DA_hs)
Input: input-queue id (iq_id) of the first stage; fraction f_hs of the initial hotspot traffic; ratio r_hp of high priority offered load within the normal traffic; probability of offered load p_ol on the inputs; destination address DA_hs of the hotspot traffic.
Output: population of the high and low priority input queues (Hp_Pop, Lp_Pop) respectively; total number of arrived, accepted and lost packets on each input queue (Hs_Arrived, Hp_Arrived, Lp_Arrived, Hs_Accepted, Hp_Accepted, Lp_Accepted, Hs_Lost, Hp_Lost, Lp_Lost) for hotspot, high and low priority traffic respectively; routing address Hp_RA, Lp_RA of each buffer position of the high and low priority input queue respectively.
{
  rp_pg = random() ; // where rp_pg ∈ [0..1)
  if (rp_pg < p_ol) // a packet is generated on the input queue at the current time slot
  {
    rp_hs = random() ; // where rp_hs ∈ [0..1)
    if (rp_hs < f_hs) // the generated packet is designated as a hotspot packet
    {
      Hs_Arrived[iq_id] = Hs_Arrived[iq_id] + 1 ;
      if (Lp_Pop[iq_id][1][1] = Lp_B) // blocking state: the hotspot packet is lost
        // where Lp_B is the buffer size of the low priority queues;
        // the second array dimension expresses the stage id = 1,
        // while the third stands for layer id = 1
        Hs_Lost[iq_id] = Hs_Lost[iq_id] + 1 ;

      else // a hotspot packet is accepted at the low priority queue
      {
        Hs_Accepted[iq_id] = Hs_Accepted[iq_id] + 1 ;
        Lp_Pop[iq_id][1][1] = Lp_Pop[iq_id][1][1] + 1 ;
        Lp_RA[iq_id][1][1][Lp_Pop[iq_id][1][1]] = DA_hs ; // append at the tail of the FIFO
      }
    }
    else // the generated packet is designated as a normal packet
    {
      rp_hp = random() ; // where rp_hp ∈ [0..1)
      rp_DA = random() * N ; // where rp_DA ∈ [0..N-1] is a randomly selected
                             // destination address (DA)
      if (rp_hp < r_hp) // a high priority packet has arrived
      {
        Hp_Arrived[iq_id] = Hp_Arrived[iq_id] + 1 ;
        if (Hp_Pop[iq_id][1][1] = Hp_B) // a high priority packet is lost
          // where Hp_B is the buffer size of the high priority queues
          Hp_Lost[iq_id] = Hp_Lost[iq_id] + 1 ;
        else // a high priority packet is accepted
        {
          Hp_Accepted[iq_id] = Hp_Accepted[iq_id] + 1 ;
          Hp_Pop[iq_id][1][1] = Hp_Pop[iq_id][1][1] + 1 ;
          Hp_RA[iq_id][1][1][Hp_Pop[iq_id][1][1]] = rp_DA ; // append at the tail
        }
      }
      else // a low priority packet has arrived
      {
        Lp_Arrived[iq_id] = Lp_Arrived[iq_id] + 1 ;
        if (Lp_Pop[iq_id][1][1] = Lp_B) // a low priority packet is lost
          // where Lp_B is the buffer size of the low priority queues
          Lp_Lost[iq_id] = Lp_Lost[iq_id] + 1 ;
        else // a low priority packet is accepted
        {
          Lp_Accepted[iq_id] = Lp_Accepted[iq_id] + 1 ;
          Lp_Pop[iq_id][1][1] = Lp_Pop[iq_id][1][1] + 1 ;

          Lp_RA[iq_id][1][1][Lp_Pop[iq_id][1][1]] = rp_DA ; // append at the tail
        }
      }
    }
  }
  return Hp_Pop, Lp_Pop, Hs_Arrived, Hp_Arrived, Lp_Arrived, Hs_Accepted, Hp_Accepted, Lp_Accepted, Hs_Lost, Hp_Lost, Lp_Lost, Hp_RA, Lp_RA ;
}

Algorithm 4.1: Input-queue process for 2-class priority MINs under hotspot environment

Finally, the simulations were performed at packet level, assuming fixed-length packets transmitted in equal-length time slots, while the simulation run length was again set to 10^5 clock cycles with an initial stabilization phase of 10^3 network cycles, ensuring a steady-state operating condition.

4.4.1 Simulator Validation for 2-Class Priority MINs under Hotspot Environment

Figure 4.3: Total Th of dual-priority, single-buffered, 6-stage MINs

Since no other simulator/model supporting dual-priority traffic under a hotspot environment has been reported so far in the literature, we validated our simulator against those that have been made available, i.e. single-priority under a hotspot environment and dual-priority under uniform traffic conditions. In the case of the hotspot environment, the measurements reported in table 1 of [25] and those obtained by our simulator in the marginal case of single-priority traffic, where r_hp = 0, f_hs = 0.10, and N = 8, were found to be in close agreement (all differences were less than 2%). The priority mechanism, on the other hand, was tested under uniform traffic conditions; this was done by setting the parameter f_hs = 0. We compared our measurements against those obtained from Shabtai's model reported in [38], and found that both sets of results are in close agreement (the maximum difference was only 3.8%). Figure 4.3 illustrates

this comparison, involving the total normalized throughput for all packets (both high and low priority) of a dual-priority, single-buffered, 6-stage MIN vs. the ratio of high priority packets under full offered load.

4.4.2 2-Class Priority Single-Layer MINs Performance under Hotspot Environment

Figure 4.4: RTh of single-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic
Figure 4.5: RTh of dual-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic

In this chapter we extend the study of the hotspot environment in MINs by considering dual-priority SEs, in order to support varying quality of service for packets, and by using double-buffered queues to improve the overall network performance.

Figure 4.4 depicts the relative normalized throughput of a single-priority, double-buffered, 6-stage MIN for the single hotspot output port, as well as the cold-3 and cold-5 zones, in comparison with the normalized throughput of the corresponding MIN configuration under uniform traffic conditions, when the initial hotspot traffic is set to f_hs = 0.05. It is obvious that the non-uniform traffic causes a serious congestion problem not only at the single hotspot output port but also in the zones which are closer to it. According to figure 4.4, the performance degradation of both the hotspot and the cold-3 zone is approximately 58.5%, while the cold-5 zone exhibits improved performance, mainly owing to the fact that it carries a lighter load (recall that a ratio equal to f_hs is addressed to the hotspot output, and this is subtracted from the load of the other outputs).

As a response to the tree saturation problem, a dual-priority MIN configuration can offer better quality of service to some applications by prioritizing their packets.

Figure 4.6: D of single-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic
Figure 4.7: D of dual-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic

According to figure 4.5, the relative normalized throughput of high priority packets approaches the optimal value RTh_hp = 1 when the initial hotspot traffic is f_hs = 0.05 and the ratio of high priority packets is r_hp = 0.20. Recall from the previous section that the relative normalized throughput of high priority packets is evaluated by collecting measurements on all output ports, showing that the gain is higher for the single hotspot output port and the zones which are closer to it (since these zones exhibit the most acute performance deterioration under hotspot traffic). We can also notice that the throughput of low priority traffic for the hotspot and cold-3 zones is slightly improved compared with the respective performance in figure 4.4: this can be attributed to the introduction of the additional buffers in the SEs (recall that SEs have distinct buffers for high- and low-priority packets). The cold-5 zone, on the other hand, exhibits a slight deterioration towards the full input load when compared to figure 4.4, with the performance curve converging to the Single Priority/Uniform curve. This is owing to the fact that in this load range the network has many high-priority packets to serve, so the service offered to low priority packets is degraded.

Figures 4.6 and 4.7 present the findings for the normalized packet delay of single- and dual-priority MINs under a hotspot environment. Again we can observe that high-priority packets obtain service close to the optimal one at all offered load setups. It is worth noting that the normalized packet delay of hotspot traffic is effectively double the delay of the cold-3 zone, while the divergence between the two zones regarding relative normalized throughput is negligible in both configurations. Finally, the delay for packets routed to the cold-5 zone is significantly smaller than the delay of packets routed to the cold-3

zone. The small drop in the low priority packet delay towards the full load area in figure 4.7 is owing to the fact that, in that area, a number of low priority packets are not accepted for entrance into the network, due to buffer unavailability at the first MIN stage.

Figure 4.8: Upf of single-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic
Figure 4.9: Upf of dual-priority, double-buffered, 6-stage, single-layer MIN under hotspot traffic

Similarly, figures 4.8 and 4.9 depict the behavior of the universal performance factor of single- and dual-priority MINs under hotspot traffic conditions. We can observe that high-priority packets again obtain service close to the optimal zero, under full offered load. We can also notice that the difference in the delay factor between the hotspot and cold-3 zones is reflected in the universal performance factor (although both zones have the same throughput), and that the cold-5 zone exhibits considerably better performance at higher offered loads.

4.4.3 2-Class Priority Multi-Layer MINs Performance under Hotspot Environment

In this subsection, we extend all previous studies by presenting our findings for a 6-stage multi-layer MIN, where the number of layers at the last stage is equal to l = 4; i.e., the first four stages are single-layer and multiple layers are only used at the last two stages, in an attempt to balance MIN performance and cost. It is also worth noting that for the first 4 stages double-buffered SEs are considered, whereas at the last two stages (which are non-blocking) single-buffered SEs are used, as the absence of blockings removes the need for larger buffers.
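The stage/layer arrangement just described can be expressed compactly; the following C++ fragment (an illustrative sketch written for this text, not part of the thesis simulator) derives the number of layers per stage for a MIN whose trailing stages double their replication degree, matching the 6-stage, l = 4 configuration above:

#include <cstdio>

// Layers per stage for an n-stage MIN whose replication degree doubles at each
// of the last stages; stages 1..nb remain single-layer. For n = 6 and a
// final-stage replication of 4 this yields {1, 1, 1, 1, 2, 4}.
int layers_at_stage(int stage, int n, int final_layers)
{
    int nb = n;                        // last single-layer stage
    for (int l = final_layers; l > 1; l /= 2)
        --nb;                          // each doubling claims one trailing stage
    return (stage <= nb) ? 1 : 1 << (stage - nb);   // 2^(stage - nb) layers
}

int main()
{
    for (int s = 1; s <= 6; ++s)
        std::printf("stage %d: %d layer(s)\n", s, layers_at_stage(s, 6, 4));
}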

Figure 4.10: RTh of dual-priority, double-buffered, 6-stage, multi-layer MIN under hotspot traffic
Figure 4.11: D of dual-priority, double-buffered, 6-stage, multi-layer MIN under hotspot traffic

Figures 4.10 and 4.11 depict the relative normalized throughput and the normalized delay metric, respectively, for a dual-priority, double-buffered, 6-stage, multi-layer MIN versus a corresponding single-layer one, when the initial hotspot traffic is set to f_hs = 0.05, while the ratio of high priority packets is considered to be r_hp = 0.20. The curves represent the performance of low priority traffic for the single hotspot output port and the cold-5 zone, as well as the performance of high priority traffic routed to all output zones, since our experiments have shown that this parameter is not affected by the forwarding zone of such packets.

According to figure 4.10, the relative normalized throughput of hotspot traffic for the multi-layer MIN is found to be dramatically improved compared with the single-layer one. The relative normalized throughput reaches its peak performance RTh_hs = 0.575 when the offered load is λ = 0.8 (a throughput gain of 130%), indicating that the additional bandwidth offered by the multi-layer SEs is exploited to a great extent. It is also noticeable that the throughput gain for the cold-5 zone is considerable, i.e. 17.3% under full load traffic, while the performance of high priority packets remains optimal.

Although Multistage Interconnection Networks (MINs) are fairly flexible in handling a variety of traffic loads, their performance degrades considerably under hotspot traffic, especially as network size increases. As an alleviation of the tree saturation problem, the prioritization of packets, which was applied in the previous subsection through a scheme that natively supports dual-priority traffic, provided better QoS to high priority packets. It was noticed that both performance metrics for high priority packets, relative normalized throughput and normalized delay, approached their optimal values Th_hp^max = 1 and D_hp^min = 1 respectively under an r_hp = 0.20 ratio of high priority packets. The rationale

behind using multiple layers at the last two stages is to improve the performance of low priority packets as well. Thus, in an attempt to balance MIN performance and cost, in a 4-layer MIN configuration we found the second major performance metric, namely normalized delay, to be dramatically improved for hotspot traffic as well (figure 4.11); the peak value of this metric was reduced from D_hs = 5.64 to D_hs = 1.9. Finally, it is also observed that the reduction of the normalized delay for low priority packets of the cold-5 zone is considerable as well (e.g. 20% under full load traffic).

Figure 4.12: Upf of dual-priority, double-buffered, 6-stage, multi-layer MIN under hotspot traffic

Similarly, figure 4.12 depicts the behavior of the universal performance factor of a dual-priority multi-layer MIN vs. a single-layer one, under hotspot traffic conditions. We can notice that hotspot traffic exhibits much better overall performance (the values of the single-layer MIN are over three times those of the multi-layer one) at moderate and high input loads, while the performance of low priority traffic for the cold-5 zone is also considerably improved at higher offered loads. Finally, it is obvious that high-priority packets again obtain service close to the optimal zero, under full offered load.

4.5 Conclusions for 2-Class Priority Multi-Layer MINs under Hotspot Environment

In this chapter we have examined the performance of MINs natively supporting two priorities, when these operate under hotspot traffic conditions. Our findings show that

when the hotspot conditions are not extreme and the high priority packet ratio is moderate (20%), high priority packets receive almost optimal quality of service, whereas the QoS offered to low priority packets varies, depending on the zone they are addressed to. It is also interesting that while the throughput for some zones is found to be identical, the same zones exhibit variations in behavior regarding the delay metric. In all cases, the performance indicators of low-priority packets for zones that are "close" to the hotspot output appear to deteriorate quickly even for light loads (λ ≥ 0.3), whereas low-priority packets addressed to zones "far" from the hotspot output exhibit a performance similar to that of MINs under uniform input load. As an alleviation of the tree saturation problem for hotspot traffic, we also introduced multi-layer MINs, and we have found all performance metrics to be dramatically improved. Finally, the introduction of an adaptive scheme, altering buffer allocation to different priority classes according to the current traffic load and high/low priority ratios, can be investigated as well.

Chapter 5

Multi-Priority MINs

5.1 Introduction
5.2 Analysis of Multi-Priority MINs
5.3 Performance Evaluation Parameters and Methodology of Multi-Priority MINs
5.4 Simulation and Performance Results of Multi-Priority MINs
5.4.1 Simulator Validation for Multi-Priority MINs
5.4.2 Multi-Priority MINs Performance
5.5 Conclusions for Multi-Priority MINs

5.1 Introduction

In the context of both parallel and distributed systems, the performance of the communication network interconnecting the system elements (nodes, processors, memory modules, etc.) is recognized as a critical factor for overall system performance. During the last decades, much research has targeted the investigation of parallel and distributed systems' performance, particularly in the area of Multistage Interconnection Networks (MINs) and communications. Consequently, the need for communication infrastructure performance prediction and evaluation has arisen, and numerous research efforts have targeted this area, employing either analytical models (mainly based on Markov models and Petri nets) or simulation techniques.

The past few years have witnessed a dramatic increase in the number and variety of applications running over the Internet and over enterprise IP networks. The spectrum includes interactive (e.g. telnet and instant messaging), bulk data transfer (e.g. FTP and P2P file downloads), corporate (e.g. database transactions), and real-time applications

(e.g. voice and video streaming). These application classes have considerably different requirements from the communication infrastructure in terms of quality-of-service aspects, such as throughput, delay or jitter (e.g. bulk transfer applications need high throughput, interactive applications need minimal delays, while streaming applications require bounded jitter), and these requirements are typically expressed to the network layer in the form of packet priorities. Another source of packet priority differentiation is protocol intrinsics, such as TCP out-of-band/expedited data, which are normally prioritized over normal connection data [43].

In order to address these requirements, dual-priority (or 2-class) queuing systems have recently been introduced in MINs, providing the ability to offer different QoS parameters to packets that have different priorities. Several commercial switches have already accommodated traffic priority schemes, such as [12, 20]. Each switching element in these products' fabric has two queues at each input port, with one queue dedicated to high priority packets and the other dedicated to low priority ones. High priority packets are serviced first, while low-priority packets are only serviced when no high-priority packets contend for the same resources (output links). While it is obvious that high-priority packets will receive better quality of service than low-priority ones, the performance of dual-priority MINs has not so far been adequately investigated in order to quantify gains and losses under various traffic conditions, and only a few results (e.g. [38, 55]) have been published. The MINs used in the above studies employ single-buffered SEs, where one buffer position is dedicated to low priority packets and one buffer position is assigned to high priority traffic.

In corporate environments hosting a multitude of applications, however, a two-priority scheme is bound not to suffice for expressing the diversity of application-level requirements to the network layer. As identified in [36], besides the inherently different QoS requirements of different types of applications, priority classification is further refined by (a) the different relative importance of different applications to the enterprise (e.g., Oracle database transactions may be considered critical and therefore high priority, while traffic associated with browsing external web sites is generally less important) and (b) the desire to optimize the usage of existing network infrastructures under finite capacity and cost constraints, while ensuring good performance for important applications.

In this chapter we extend the previous studies by introducing MINs that natively support multi-class routing traffic. We also analyze the performance of multi-priority SEs that use not only single-buffered but also double-buffered queues, in order to offer better QoS while providing better overall network performance in parallel.

The remainder of this chapter is organized as follows: in section 5.2 we briefly analyze an Omega Network that natively supports multi-class routing traffic. Subsequently, in section 5.3 we introduce the performance criteria and parameters related to this network. Section 5.4 presents the results of our performance analysis, which has been conducted through simulation experiments, while section 5.5 provides the concluding remarks.

5.2 Analysis of Multi-Priority MINs

As we have already mentioned, there are three major classes of blocking Multistage Interconnection Networks (MINs): Baseline and Delta Networks, which were proposed by Patel [35], and Omega and Generalized Cube Networks, as described in [2, 26]. Recall from the previous chapters that all these multistage interconnection self-routing networks are characterized by the fact that there is exactly one unique path from each input port to each output port, which is just the Banyan property as defined in [16]. Internally, they are constructed from small-size Switching Elements (SEs) followed or preceded by links. Switching in these networks is termed "self-routing" because when an SE accepts a packet on one of its input ports, it can decide to which of its output ports the packet must be forwarded, depending only on the packet's destination address. A typical configuration of an (N×N) Omega Network is depicted in figure 5.1 and outlined below.

Figure 5.1: An (N×N) Omega Network
Figure 5.2: A multi-priority (2×2) Switching Element, with a bank of p class queues (class-1 … class-p) on each input

Omega Networks use the "perfect shuffle" routing algorithm, rotating the destination tag to the left. A variation of this algorithm is used by Baseline and Delta Networks, where an SE at stage k decides to which output port a packet is sent based on the k-th bit of the destination address and the "k-bit shuffle" algorithm, while in Generalized Cube Networks the routing tag is generated as the exclusive-OR of the source and destination labels.

Figure 5.2 illustrates the internal modeling of a multi-priority SE supporting p priority classes. Each SE is modeled as an array of p non-shared buffer queue pairs; within each pair, one buffer is dedicated to the upper queuing bank and the other to the lower bank. During a single network cycle, the SE considers all its input links, examining the buffer queues in the arrays in decreasing order of priority. If a queue is not empty, the first packet in it is extracted and transmitted towards the next MIN stage; packets in lower priority queues are forwarded to an SE's output link only if no packet in a higher priority queue is tagged to be forwarded to the same output link. Packets in all queues are transmitted on a first come, first served basis. In all cases, at most one packet per link (upper or lower) of an SE will be forwarded to the next stage. The priority of each packet is indicated through the appropriate priority bits in the packet header.
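The per-cycle queue scan described above can be sketched in C++ as follows (an illustrative fragment with hypothetical types and a hypothetical wants_link helper, not code from the thesis simulator; for brevity it ignores downstream buffer availability, i.e. backpressure): for each output link, the SE picks the head packet of the highest-priority non-empty queue targeting that link, breaking ties between equal-priority candidates at random.

#include <cstdlib>
#include <deque>
#include <optional>
#include <vector>

struct Packet { int dest; };    // routing tag; the priority is implied by the queue index

// queues[c] holds class-c packets, c = 0 (lowest) .. p-1 (highest); FCFS per queue.
using QueueBank = std::vector<std::deque<Packet>>;

// Hypothetical helper: does the packet's routing bit for this stage select `link`?
bool wants_link(const Packet& pk, int link) { return (pk.dest & 1) == link; }

// Select the packet to forward on one output link in the current network cycle.
std::optional<Packet> select_for_link(QueueBank& upper, QueueBank& lower, int link)
{
    const int p = static_cast<int>(upper.size());
    for (int c = p - 1; c >= 0; --c) {     // decreasing order of priority
        bool u = !upper[c].empty() && wants_link(upper[c].front(), link);
        bool l = !lower[c].empty() && wants_link(lower[c].front(), link);
        if (u && l) { if (std::rand() % 2) u = false; else l = false; }  // random tie-break
        if (u) { Packet pk = upper[c].front(); upper[c].pop_front(); return pk; }
        if (l) { Packet pk = lower[c].front(); lower[c].pop_front(); return pk; }
    }
    return std::nullopt;                   // no packet contends for this link
}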

Consequently, as the performance evaluation presented in this chapter is independent of the internal link permutations of a Banyan-type network, it can be applied to any class of such networks. In our study we used an Omega Network that is assumed to operate under the following conditions:

- The MIN operates in a slotted time model [44]. In each time slot two phases take place. In the first phase, control information passes through the network from the last stage to the first one. In the second phase, packets flow from the first stage towards the last, in accordance with the flow control information.

- At each input of every switch of the MIN only one packet can be accepted within a time slot; it is marked by a priority tag and routed to the appropriate class queue. The value of this special priority tag in the header field of the packet determines its i-class priority, where i = 1…p.

- The arrival process at each input of the network is a simple Bernoulli process, i.e. the probability that a packet arrives within a clock cycle is constant and the arrivals are independent of each other (a code sketch of this arrival process is given after this list).

- An i-class priority packet arriving at the first stage is discarded if the corresponding i-class priority buffer of the SE is full, where i = 1…p.

- A backpressure blocking mechanism is used, according to which an i-class priority packet is blocked at a stage if the corresponding i-class priority buffer at its destination SE in the next stage is full, where i = 1…p.

- All i-class priority packets are uniformly distributed across all destinations, and each i-class priority queue uses a FIFO policy for all output ports, where i = 1…p.

- The conflict resolution procedure of a multi-class priority MIN takes the packet priority into account: if one of the received packets is of higher priority and the other of lower priority, the higher-priority packet is maintained and the lower-priority one is blocked by means of upstream control signals; if both packets have the same priority, one packet is chosen at random to be stored in the buffer whereas the other packet is blocked. It suffices for the SE to read the incoming packets' headers in order to decide which packet to store and which to drop.

- All SEs have deterministic service time.

- Finally, all packets at the input ports contain both the data to be transferred and the routing tag. In order to achieve synchronously operating SEs, the MIN is internally clocked. As soon as packets reach a destination port they are removed from the MIN, so packets cannot be blocked at the last stage.
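The following C++ fragment illustrates the Bernoulli arrival model with multi-class priorities (a sketch written for this text, not the thesis simulator): a packet is generated with probability λ and, if generated, is assigned a priority class according to the ratios r(i) and a uniformly random destination.

#include <random>
#include <vector>

// One time slot at one input port: returns the priority class (1..p) of the
// generated packet and fills `dest`, or returns 0 when no packet arrives.
// lambda = total offered load; ratio[i-1] = r(i), with sum_i r(i) = 1.
int generate_arrival(double lambda, const std::vector<double>& ratio, int N,
                     std::mt19937& rng, int& dest)
{
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    if (uni(rng) >= lambda) return 0;          // Bernoulli trial: no arrival this slot

    dest = std::uniform_int_distribution<int>(0, N - 1)(rng);  // uniform destination

    double r = uni(rng), acc = 0.0;            // pick class i with probability r(i)
    for (std::size_t i = 0; i < ratio.size(); ++i) {
        acc += ratio[i];
        if (r < acc) return static_cast<int>(i) + 1;
    }
    return static_cast<int>(ratio.size());     // guard against rounding
}

With p = 3 and the normal-QoS ratios of this chapter, ratio would be {0.60, 0.30, 0.10}, with class 3 being the highest priority.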

5.3 Performance Evaluation Parameters and Methodology of Multi-Priority MINs

In order to evaluate the performance of a multi-priority MIN the following metrics are used. Let Th and D be the normalized throughput and normalized delay of a MIN, as described in chapter 2.

Relative normalized throughput RTh(i) of i-class priority packets, where i = 1…p, is the normalized throughput Th(i) of such packets divided by the corresponding ratio of offered load r_i:

RTh(i) = \frac{Th(i)}{r_i} \qquad (5.1)

Normalized packet delay D(i) of i-class priority traffic, where i = 1…p, is the ratio of D_avg(i) to the minimum packet delay, which is simply the transmission delay n \cdot nc (i.e. zero queuing delay), where n = log_2 N is the number of intermediate stages and nc is the network cycle. Formally, D(i) can be defined as

D(i) = \frac{D_{avg}(i)}{n \cdot nc} \qquad (5.2)

Universal performance factor Upf(i) of i-class priority traffic, where i = 1…p, can be defined, as in the previous chapters, by a relation involving the two major factors above, D(i) and RTh(i). Because the performance of i-class priority traffic of a MIN is considered optimal when D(i) is minimized and RTh(i) is maximized, the formula for computing the universal factor is arranged so that the overall performance metric follows that rule. Formally, Upf(i) can be expressed by

Upf(i) = \sqrt{w_d \, [D(i) - 1]^2 + w_{th} \left[\frac{1 - RTh(i)}{RTh(i)}\right]^2} \qquad (5.3)

In the remainder of this chapter we will consider both component factors, throughput and delay, to be of equal importance, thus setting w_d = w_th = 1.

Finally, we list the major parameters affecting the performance of a multi-class priority MIN.

Number of priority classes p is the number of different priority classes, where 1 represents the lowest packet class priority and p denotes the highest one. In our study the number of priority classes is assumed to be p = 3, where class 1 stands for low priority packets, while classes 2 and 3 stand for medium and high priority packets respectively.

Buffer size b(i) of an i-class priority queue, where i = 1…p, is the maximum number of such packets that the corresponding i-class input buffer of an SE can hold. In this chapter we consider symmetric single-buffered, b(i) = 1, or double-buffered, b(i) = 2, MINs. It is worth noting that a buffer size of b = 2 is considered because it has been reported in previous chapters to provide optimal overall network performance: indeed, for the smaller buffer size b(i) = 1 network throughput drops due to high blocking probabilities, whereas for higher buffer sizes, b(i) = 4 or 8, packet delay increases significantly (and the SE hardware cost also rises).

Offered load λ(i) of i-class priority traffic, where i = 1…p, is the steady-state fixed probability of such packets arriving at each input queue. It holds that λ = \sum_{i=1}^{p} λ(i), where λ represents the total arrival probability over all packet classes. In the simulations of this chapter λ is assumed to be λ = 0.1, 0.2, …, 0.9, 1.

Ratio of i-class priority offered load r(i), where i = 1…p, is expressed by r(i) = λ(i)/λ. It is obvious that \sum_{i=1}^{p} r(i) = 1. In the case of a normal-QoS setup the ratios of high, medium and low priority packets are assumed to be r(3) = 0.10, r(2) = 0.30 and r(1) = 0.60 respectively, while in the case of a high-QoS setup the corresponding ratios become r(3) = 0.20, r(2) = 0.40 and r(1) = 0.40 respectively.

Number of stages n is the number of stages of an (N×N) MIN, where n = log_2 N. In our simulation n is assumed to be n = 10.

5.4 Simulation and Performance Results of Multi-Priority MINs

A multi-priority simulator was constructed for evaluating the overall network performance of Omega-type MINs. This method of modeling [52] using simulation experiments was applied due to the complexity of the mathematical model [53]. For this purpose we developed a special multi-priority simulator in C++, capable of operating under different configuration schemes. It was based on several input parameters, such as the number of priority classes, the buffer lengths of the queues of all priority classes, the number of input and output ports, the number of stages, the offered load, and the ratios of all priority classes of packets.

Internally, each SE of a MIN supporting p priority classes was modeled as an array of p non-shared buffer pairs of queues, with each queue operating on a FCFS basis and one buffer from each pair dedicated to the upper queuing bank and the other to the lower queuing bank. We simulated two different configurations, in which the queuing banks were located either in front of the crossbar segment of the Switching Element (SE) or behind it. In the following algorithms (algorithms 5.1 and 5.2), however, the queuing banks are considered to be located after the crossbar segment of the SE. All packet contentions were resolved by favoring the packets transmitted from the higher priority queues (algorithm 5.2), while contention between two packets of the same priority class was resolved randomly. Metrics such as packet throughput and packet delay were collected by performing extensive simulations to validate our results.
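Before turning to the algorithms, a small worked example ties the parameters above together (sample values, not simulation output): under the high-QoS setup at total load λ = 0.8, the per-class loads are

λ(3) = r(3) \cdot λ = 0.20 \times 0.8 = 0.16, \quad λ(2) = 0.32, \quad λ(1) = 0.32,

and a class observed at D(i) = 1.5 and RTh(i) = 0.9 would score, by equation (5.3) with w_d = w_th = 1,

Upf(i) = \sqrt{(1.5 - 1)^2 + \left(\frac{1 - 0.9}{0.9}\right)^2} = \sqrt{0.25 + 0.0123} \approx 0.51.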

Unicast_Forwarding (cs_id, sq_id, aq_id, pr_id, bm)
Input: current stage id (cs_id); send-queue id (sq_id) of the current stage; accept-queue id (aq_id) of the next stage; priority id (pr_id); and blocking mechanism (bm).
Output: population of the send and accept queues (Pop); total number of serviced and blocked packets of the send-queue (Serviced, Blocked) respectively; total number of packet delay cycles of the send-queue (Delay); routing address RA of each buffer position of the queue.
{
  if (Pop[aq_id][cs_id + 1][pr_id] = B_pr_id) // blocking state
    // where B_pr_id is the buffer size of a priority pr_id-class queue
  {
    Blocked[sq_id][cs_id][pr_id] = Blocked[sq_id][cs_id][pr_id] + 1 ;
    if (bm = blm) // block-and-lost mechanism: the head packet is dropped
    {
      Pop[sq_id][cs_id][pr_id] = Pop[sq_id][cs_id][pr_id] - 1 ;
      for (bf_id = 1; bf_id <= Pop[sq_id][cs_id][pr_id]; bf_id++)
        RA[sq_id][cs_id][pr_id][bf_id] = RA[sq_id][cs_id][pr_id][bf_id + 1] ;
        // where RA is the routing address of the packet located at the
        // (bf_id)-th position of the send-queue; the remaining packets
        // are shifted one position towards the head
    }
  }
  else // unicast forwarding
  {
    Serviced[sq_id][cs_id][pr_id] = Serviced[sq_id][cs_id][pr_id] + 1 ;
    Pop[sq_id][cs_id][pr_id] = Pop[sq_id][cs_id][pr_id] - 1 ;
    Pop[aq_id][cs_id + 1][pr_id] = Pop[aq_id][cs_id + 1][pr_id] + 1 ;
    RA[aq_id][cs_id + 1][pr_id][Pop[aq_id][cs_id + 1][pr_id]] = RA[sq_id][cs_id][pr_id][1] ;
      // the head packet of the send-queue is appended at the tail of the
      // corresponding accept-queue of the next stage
    for (bf_id = 1; bf_id <= Pop[sq_id][cs_id][pr_id]; bf_id++)
      RA[sq_id][cs_id][pr_id][bf_id] = RA[sq_id][cs_id][pr_id][bf_id + 1] ;
  }
  Delay[sq_id][cs_id][pr_id] = Delay[sq_id][cs_id][pr_id] + Pop[sq_id][cs_id][pr_id] ;
    // every packet still queued accrues one cycle of waiting delay
  return Pop, Serviced, Blocked, Delay, RA ;
}

Algorithm 5.1: Unicast forwarding for multi-priority MINs
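A side note on the design: the array shift in algorithms 4.1 and 5.1 costs O(b) per dequeue, which is harmless for b ≤ 2 but would be wasteful for larger buffers; an implementation could use a circular buffer instead. A minimal C++ sketch of such a fixed-capacity FIFO (illustrative, not part of the thesis simulator) follows:

#include <array>

// Fixed-capacity FIFO of routing addresses; head/count indices replace the
// O(b) shift used in the pseudocode, making enqueue and dequeue O(1).
template <int B>
struct SeQueue {
    std::array<int, B> ra{};   // routing addresses of queued packets
    int head = 0, count = 0;

    bool full()  const { return count == B; }
    bool empty() const { return count == 0; }

    bool enqueue(int dest) {               // returns false on blocking state
        if (full()) return false;
        ra[(head + count) % B] = dest;
        ++count;
        return true;
    }
    int dequeue() {                        // caller checks empty() first
        int dest = ra[head];
        head = (head + 1) % B;
        --count;
        return dest;
    }
};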

SendQueue_Process (cs_id, sq_id, bm)
Input: current stage id (cs_id); send-queue id (sq_id) of the current stage; and blocking mechanism (bm).
{
  processor = 0 ;
  for (pr_id = P - 1; pr_id >= 0; pr_id--) // where P is the total number of priorities
    if (Pop[sq_id][cs_id][pr_id] > 0) and (processor = 0)
      // the pr_id-class send-queue is not empty and the output link has not
      // yet been claimed by a higher priority class
    {
      RA_bit = get_bit(RA[sq_id][cs_id][pr_id][1]) ;
        // get the (cs_id)-th bit of the routing address (RA) of the leading
        // packet of the pr_id-class send-queue by a cyclic logical left shift
      if (RA_bit = 0) // upper port forwarding
      {
        aq_id = 2 * (sq_id mod (N/2)) ; // link for the perfect shuffle algorithm
        Unicast_Forwarding(cs_id, sq_id, aq_id, pr_id, bm) ;
      }
      else if (RA_bit = 1) // lower port forwarding
      {
        aq_id = 2 * (sq_id mod (N/2)) + 1 ; // link for the perfect shuffle algorithm
        Unicast_Forwarding(cs_id, sq_id, aq_id, pr_id, bm) ;
      }
      processor = 1 ;
    }
}

Algorithm 5.2: Send-queue process for multi-priority MINs

Finally, the simulations were performed at packet level, assuming fixed-length packets transmitted in equal-length time slots, while the simulation run length was again set to 10^5 clock cycles with an initial stabilization phase of 10^3 network cycles, ensuring a steady-state operating condition.
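The measurement protocol just stated (10^5 measured cycles preceded by 10^3 stabilization cycles) can be made concrete with the following C++ skeleton; the phase helpers and the rear-to-front stage ordering are assumptions made for this sketch, standing in for the per-stage processes of algorithms 5.1 and 5.2:

// Hypothetical per-phase helpers, standing in for algorithms 5.1/5.2 and the
// arrival process; a real simulator would implement these over the SE queues.
void propagate_backpressure();
void process_stage(int stage, bool measuring);
void generate_arrivals(bool measuring);

const int n = 10;                     // number of stages (10-stage MIN)
const int WARMUP_CYCLES  = 1000;      // stabilization phase: statistics discarded
const int MEASURE_CYCLES = 100000;    // steady-state measurement window

void run_simulation()
{
    for (int cycle = 0; cycle < WARMUP_CYCLES + MEASURE_CYCLES; ++cycle) {
        const bool measuring = (cycle >= WARMUP_CYCLES);
        propagate_backpressure();             // phase 1: control info flows last stage -> first
        for (int stage = n; stage >= 1; --stage)
            process_stage(stage, measuring);  // phase 2: packets advance, rear stages first
        generate_arrivals(measuring);         // Bernoulli arrivals enter the first stage
        // throughput/delay counters are only accumulated while `measuring`
    }
}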

5.4.1 Simulator Validation for Multi-Priority MINs

Since no other simulator supporting more than two priorities has been reported so far in the literature, we validated our simulator against the single-priority and dual-priority simulators that have been made available. This was done by setting the parameter p (number of priority classes) in our simulator to 1 and 2, and comparing the results obtained from the simulation against results already published for single- and dual-priority MINs. For single-priority MINs, our results were found to be in close agreement with those produced by Theimer's model, which is considered to be the most accurate one [44]. For p = 2 (dual-priority MINs), as in the previous chapter, we compared our measurements of the total normalized throughput of both high and low priority packets of a single-buffered, 6-stage MIN vs. the ratio of high priority packets under full offered load conditions against those obtained from Shabtai's model reported in [38], and we found again (as in figure 4.3) that both sets of results are in close agreement (the maximum difference was only 3.8%).

5.4.2 Multi-Priority MINs Performance

Figure 5.3: Total Th of multi-priority, single/double-buffered, 10-stage MINs

In this chapter we extend our study by introducing multi-priority SEs that use not only single-buffered but also double-buffered queues, in order to offer better quality of service

while providing better overall network performance in parallel.

In figure 5.3, curves MP[10]B[b]R[h, m, l] represent the total normalized throughput of a 10-stage MIN under a multi-priority mechanism, when the buffer lengths of all priority-class SE queues are b(i) = 1, 2, where i = 1…p, expressing a symmetric single- or double-buffered MIN setup with the ratios of high, medium and low priority packets equal to r(3) = h, r(2) = m and r(1) = l respectively. Similarly, curves SP[10]B[b] depict the normalized throughput of a 10-stage MIN under a single priority mechanism, when the buffer length is b = 1 or 2.

According to this figure, the gains in total normalized throughput of a single-buffered MIN employing a multi-class priority mechanism (curves MP[10]B[1]R[h, m, l]) vs. the corresponding single priority one (curve SP[10]B[1]) are 37.6% and 41%, under a normal-QoS (h = 0.10, m = 0.30, l = 0.60) and a high-QoS (h = 0.20, m = 0.40, l = 0.40) setup, when λ = 1 and λ = 0.7 respectively. Similarly, the gains in total normalized throughput of a double-buffered MIN employing a multi-class priority mechanism (curves MP[10]B[2]R[h, m, l]) vs. the corresponding single priority one (curve SP[10]B[2]) are 22.5% and 26.4%, under a normal-QoS and a high-QoS setup, when λ = 1 and λ = 0.8 respectively. The improvement in the overall network throughput may be attributed to the exploitation of the additional buffer spaces available in the MIN, since each priority class now has distinct buffer spaces and thus blockings due to buffer space unavailability occur with decreased probability.

Figure 5.4: RTh of multi-priority, single-buffered, 10-stage MIN
Figure 5.5: RTh of multi-priority, double-buffered, 10-stage MIN

Figures 5.4 and 5.5 depict the relative normalized throughput of all priority-class traffic, where -HPT, -MPT, and -LPT stand for high, medium and low priority traffic respectively. According to these figures, the relative normalized throughput of high priority packets

approaches the optimal value of this performance metric in both cases of single-buffered MINs (RTh = 0.97 or 0.93 for a normal- and a high-QoS setup respectively), while in both cases of double-buffered configurations it is found to be further improved, reaching the maximum value (Th^max = 1). Medium-priority packets also achieve higher throughput compared with packets in a single-priority MIN, and it is worth noting that in a normal-QoS double-buffered MIN this throughput approaches the optimal value. Low-priority packets, finally, receive better throughput than packets in a single-priority MIN when the offered load is less than 0.6, while this service deteriorates when the load exceeds this value. This happens because under heavy loads the probability that packets with high or medium priorities are available at an SE increases, and these packets are chosen over low-priority ones for forwarding to the next MIN stage.

It is worth mentioning that, although the relative normalized throughput of all classes of traffic is better in the case of the normal-QoS setup (figures 5.4 and 5.5), the total normalized throughput is greater in the case of the high-QoS configuration (figure 5.3), because there are more high and medium priority packets at the input ports and thus the available buffers are better exploited.

Figure 5.6: D of multi-priority, single-buffered, 10-stage MIN
Figure 5.7: D of multi-priority, double-buffered, 10-stage MIN

Figures 5.6 and 5.7 present the findings for the normalized packet delay of single- and double-buffered MINs supporting multi-priority traffic. Again we can observe that high-priority packets obtain service close to the optimal one, especially in the case of a normal-QoS MIN. The delay for medium-priority packets is consistently smaller than the delay of packets in single-priority networks with equal load; the benefit obtained for this packet class is higher in normal-QoS setups than in high-QoS ones, which is expected since in the high-QoS setup (a) a considerable amount of network resources is consumed

by high-priority packets and (b) more medium-priority packets contend for the remaining network resources.

The packet delay for low-priority packets is smaller than the packet delay in single-priority MINs for load λ < 0.6 in the single-buffered case and λ < 0.5 for the double-buffered setup, but subsequently rises, since fewer network slots are available for serving low-priority packets (due to the higher probability that a high- or medium-priority packet exists at an SE). It is worth noting that for this load range all packet classes have smaller delays than the packets in single-priority MINs. This may seem to contradict our previous work [55], which reports that increasing buffer sizes leads to increased delays; we note, however, that in a multi-priority MIN packets with different priorities are stored in separate queues in the SEs, decreasing the number of blockings of low- and medium-priority packets due to unavailability of suitable buffer space in the destination SE. In the load range in question, the gains obtained from the avoidance of these blockings exceed the costs incurred from yielding to high priority packets, so the overall effect of introducing distinct buffers for each priority class on the delay is positive.

In the curves corresponding to high-QoS setups we can observe a drop in the delay at very high loads (λ > 0.8 for single-buffered MINs and λ > 0.9 for double-buffered MINs): this is due to a high number of blockings of low priority packets at the input of the MIN's first stage, which effectively preclude a considerable number of low priority packets from entering the MIN altogether. These packets are not accounted for in the computation of the delay metric, and this is the reason why this "improvement" in the performance indicator appears.

Figure 5.8: Upf of multi-priority, single-buffered, 10-stage MIN
Figure 5.9: Upf of multi-priority, double-buffered, 10-stage MIN

Figures 5.8 and 5.9 depict the behavior of the universal performance factor metric for

each priority class of single- and double-buffered MINs, respectively, in correlation with the offered load. The behavior of the universal performance factor follows the behavior of the individual performance indicators, showing that high- and medium-priority packets are consistently offered a better quality of service compared with packets in single-priority MINs, while for low-priority packets two areas may be identified: the first spans the "light load" segment of the x-axis, in which low-priority packets are offered a better quality of service than packets in single-priority MINs, and the second spans the medium- and high-load segment of the x-axis, in which the QoS offered to low-priority packets is inferior to the QoS offered to packets within single-priority MINs. As explained above, the ability to offer better quality of service to all packets in a certain load range is attributed to the existence of more buffers (which are specialized per priority class); the extra buffer availability leads in turn to fewer blockings, and thus increased throughput and smaller delays.

5.5 Conclusions for Multi-Priority MINs

In this chapter we have presented the modeling of multi-priority MINs and analyzed the performance of a MIN supporting three priority classes under various load conditions. Moreover, two different test-bed setups were used in order to investigate and analyze the performance of all priority-class traffic under different Quality of Service (QoS) configurations. In the considered environment, Switching Elements (SEs) that natively support multi-class priority routing traffic have been used for constructing the MIN. The rationale behind introducing a multiple-priority scheme is to provide different QoS guarantees to traffic from different applications, which is a highly desired feature for many IP network operators, and particularly for enterprise networks.

The goal of this chapter is to provide network designers with insight into how packet prioritization affects the QoS delivered to each priority class and the network performance in general. This insight can help network designers assign packet priorities to various applications in a manner that complies with corporate policy, satisfies application requirements and maximizes network utilization. The presented results also facilitate performance prediction for multi-priority networks before actual network implementation, through which deployment cost and rollout time can be minimized.

Chapter 6

Multi-layer, Multi-priority MINs under Multicast Environment

6.1 Introduction
6.2 Multi-layer, Multi-priority MIN Description
6.3 Configuration and Operational Parameters of Multi-layer MINs
6.4 Performance Evaluation Metrics for Multi-layer MINs
6.4.1 Metrics for Single-layer Segment of MINs
6.4.2 Metrics for Multi-layer Segment of MINs
6.5 Simulation and Performance Results for Multi-layer, Multi-priority MINs
6.5.1 Simulator Validation for Multicasting
6.5.2 Multicasting on Single-priority, Single-layer MINs
6.5.3 Multicasting on Dual-priority, Single-layer MINs
6.5.4 Multicasting on Dual-priority Multi-layer MINs
6.5.5 Multicasting on Multi-layer Segment of Dual-priority MINs
6.6 Conclusions

6.1 Introduction

Convergence in network technologies, services and terminal equipment underlies the change towards innovative offerings and new business models in the communications sector [33]. At the network level, this convergence must be supported by low-latency, high-throughput, QoS-aware, packet-switched communication infrastructures, and Multistage

Interconnection Networks (MINs) have been recognized as an infrastructure capable of delivering the above characteristics. MIN technology is a prominent candidate for the implementation of Next Generation Networks (NGNs), due to its ability to route multiple communication tasks concurrently and the appealing cost/performance ratio it achieves. MINs with the Banyan property [16] in particular, e.g. Delta Networks [35], Omega Networks [26], and Generalized Cube Networks [2], are more widely adopted, since non-Banyan MINs generally have higher cost and complexity. An example of a MIN-based NGN infrastructure is the Cisco CRS-1 router [9], whose switching fabric, providing the communications path between line cards, has been built as a 3-stage, self-routed multistage architecture. MIN-based solutions for NGNs have been widely deployed, as reported in [10].

Multicasting and broadcasting are two important functionalities of communication infrastructures in general and NGNs in particular. This has been recognized by international bodies such as the ITU, which has included multicasting in the NGNs' functional requirements [21] and outlined the framework for the multicast service in NGNs [22]. Existing performance analyses regarding MINs, however, have shown that they quickly saturate under broadcast and multicast traffic [48, 49]: within a MIN with the Banyan property, multicasting/broadcasting is implemented through packet cloning [34, 41, 40], and the MIN switching fabric is unable to efficiently handle the increased number of packets. As a response to this problem, the replication of the whole MIN network, or of certain stages of it, has been suggested, leading to multi-layer MINs [50]. The degree of replication L may be constant for all stages or vary across stages; in general, higher replication degrees should be employed towards the later stages of the MIN, to provide the increased switching capacity needed there due to the cloned packets. Since layer replication increases the hardware cost, the first MIN layers are either not replicated at all or replicated to a modest degree, to keep the MIN manufacturing cost at modest levels.

In this chapter, we extend previous studies in the area of performance evaluation of MINs (e.g. [38, 55]) by including multi-layer MINs under multicast traffic. We applied the partial multicast policy [51], since it offers superior performance compared to the full multicast mechanism [56], in which a packet is copied and transmitted only when both destination buffers are available. Furthermore, we extend the study presented in [56] by considering switching elements (SEs) that natively support a dual-priority scheme, and also by considering double-buffered SEs instead of only single-buffered ones; double-buffering has been reported to provide better QoS for packets, as compared to single-buffering [51].

The remainder of this chapter is organized as follows: in section 6.2 we briefly analyze a multi-layer MIN supporting multicast routing traffic. Subsequently, in section 6.3 we present the configuration and operational parameters considered in this chapter, whereas in section 6.4 we present the performance evaluation metrics that are collected. Section 6.5 presents the results of our performance analysis, which has been conducted through simulation experiments, while section 6.6 provides the concluding remarks.

6.2 Multi-layer, Multi-priority MIN Description

Recall from previous chapters that Multistage Interconnection Networks (MINs) are used to interconnect a group of N inputs to a group of M outputs using several stages of small-size Switching Elements (SEs) followed (or preceded) by links, and that all the different types of MINs [35, 26, 2] with the Banyan property [16] are self-routing switching fabrics, characterized by the fact that there is exactly one unique path from each source (input) to each sink (output).

Figure 6.1: 4×4 single-priority multi-layer MIN
Figure 6.2: 8×8 single-priority multi-layer MIN

In this chapter we consider multi-layer MINs; a typical configuration of a (4×4) or (8×8) multi-layer MIN is depicted in figure 6.1 or 6.2 respectively and outlined below. The example network consists of two segments, an initial single-layer one and a subsequent multi-layer one (with 2 layers). Internally, it is constructed from multi-priority SEs supporting p priority classes. Each SE is modeled as an array of p non-shared buffer queue pairs; within each pair, one buffer is dedicated to the upper queuing bank and the other to the lower bank. During a single network cycle, the SE considers all its input links, examining the buffer queues in the arrays in decreasing order of priority. If a queue is not empty, the first packet in it is extracted and transmitted towards the next MIN stage; packets in lower priority queues are forwarded to an SE's output link only if no

packet in a higher priority queue is tagged to be forwarded to the same output link.

According to figure 6.2, it is worth noting that packet forwarding from stage 2 to stage 3 is blocking-free, since packets in stage-2 SEs do not contend for the same output link; packets at this stage can also be "cloned" (i.e. forwarded to both subsequent SEs in the context of a multicast routing activity), again without any blocking. This is always possible in cases where the degree of replication of the succeeding stage i + 1 (which we will denote as l_{i+1}) is equal to 2·l_i. If, for some MIN with n stages, there exists some nb (1 ≤ nb < n) such that ∀k: l_{k+1} = 2·l_k (nb ≤ k < n), then the MIN operates in a non-blocking fashion over the last (n − nb) stages. Note that, according to [50], blocking can occur at the MIN outputs, where SE outputs are multiplexed, if either the multiplexer or the data sink does not have enough capacity; in this chapter, however, we will assume that both multiplexers and data sinks have adequate capacity.

In this chapter, we consider dual-priority, finite-buffered Multistage Interconnection Networks supporting multicast traffic, which operate under the following assumptions:

- Routing is performed in a pipelined manner, meaning that the routing process occurs in every stage in parallel. Internal clocking results in synchronously operating switches in a slotted time model [44], and all SEs have deterministic service time.

- The arrival process at each input of the network is a simple Bernoulli process, i.e. the probability that a packet arrives within a clock cycle is constant and the arrivals are independent of each other. We will denote this probability by λ. This probability can be further broken down into λ_h and λ_l, which represent the arrival probabilities for high and low priority packets, respectively. It holds that λ = λ_h + λ_l.

- At each input of the network only one packet can be accepted within a time slot. All packets at the input ports contain both the data to be transferred and the routing tag. The priority of each packet is indicated through a priority bit in the packet header. Under the dual-priority mechanism, when applications or architectural modules enter a packet into the network they specify its priority, designating it either as high or low. The criteria for priority selection may stem from the nature of the packet data (e.g. packets containing streaming media data can be designated as high-priority while FTP data can be characterized as low-priority), from protocol intrinsics (e.g. TCP out-of-band/expedited data vs. normal connection data) or from properties of the interconnected system architecture elements.

- The packet header contains two extra equal-length fields, the Routing Address (RA) and the Multicast Mask (MM), occupying n bits each, where n is the number of stages in the MIN. Upon reception of a packet, the SE at stage k first examines the k-th bit of the MM; if this is set to 1, then the packet makes a multicast instead of a unicast transmission, and the SE forwards the packet to both its output links. If the k-th bit of the MM is set to zero, then the k-th bit of the RA is examined, and routing is performed as in the case of unicast MINs. It is obvious that, when all bits of

the MM of a packet are set to zero, the packet follows a unicast path, reaching one specific network output port. At the other extreme, when all its bits are set to one, the packet is broadcast to all output ports of the network. In all other cases, the packet is forwarded to a group of output ports, which constitute the Multicast Group (MG).

- A high/low priority packet arriving at the first stage (k = 1) is discarded if the high/low priority buffer of the corresponding SE is full, respectively. A high/low priority packet is blocked at a stage if the destination high/low priority buffer at the next stage is full, respectively.

- An SE operates with either partial or full multicasting. Multicasting is performed by copying the packets within the 2×2 SEs. According to the partial mechanism (PM), if either of the destination buffers is not available, the packet is forwarded to the available destination and a copy remains at the present stage, in order to be later forwarded to the currently unavailable destination. When the full multicasting mechanism (FM) is employed, a packet is copied and transmitted only when both destination buffers are available.

- Both high and low priority packets are uniformly distributed across all destinations, and each high/low priority queue uses a FIFO policy for all output ports.

- When two packets at a stage contend for a buffer at the next stage and there is not adequate free space for both of them to be stored (i.e. only one buffer position is available at the next stage), there is a conflict. Conflict resolution in a single-priority mechanism operates under the following scheme: one packet is accepted at random and the other is blocked by means of upstream control signals. Under the 2-class priority scheme, the conflict resolution procedure takes the packet priority into account: if one of the received packets is a high-priority one and the other a low priority packet, the high-priority packet is maintained and the low-priority one is blocked by means of upstream control signals; if both packets have the same priority, one packet is chosen randomly to be stored in the buffer whereas the other packet is blocked. Since the priority of each packet is indicated through a priority bit in the packet header, it suffices for the SE to read the header in order to decide which packet to store and which to drop.

- Finally, packets are removed from their destinations immediately upon arrival, so packets cannot be blocked at the last stage.

6.3 Configuration and Operational Parameters of Multi-layer MINs

In this chapter we extend our study on the performance evaluation of MINs by comparing the performance of the dual-priority architecture with the single-priority one under multicast traffic. All presented MINs are constructed from either single- or multi-layer SEs. According

to figure 6.1, the proposed MINs consist of two segments, the first one having only single-layer SEs, and the second one being multi-layer. In the multi-layer segment each stage i + 1 has twice as many layers as the immediately preceding one, i, so this segment operates in a non-blocking manner. This stems from the fact that if stage i has n_i SEs of l_i layers each, at a certain MIN network cycle at most 2·n_i·l_i packets will be generated by this stage (all SEs at all layers process and clone a packet), and the subsequent stage has enough SEs to intercept and process these packets. Consequently, all SEs in the multi-layer segment are considered to have only the buffer space needed to store and forward a single packet. On the other hand, the SEs of the single-layer segment may employ different buffer sizes in order to improve the overall MIN performance. Under these considerations, the operational parameters of the MINs evaluated in this chapter are as follows:

Buffer size b of a queue is the maximum number of packets that an input buffer of an SE can hold. In this study both symmetric single- and double-buffered MINs (b = 1, 2) are considered. We note here that these particular buffer sizes have been chosen since they have been reported [57] to provide optimal overall network performance: indeed, [57] documents that for higher buffer sizes (b = 4, 8) packet delay increases significantly, while the SE hardware cost is also elevated. Furthermore, in the case of multi-layer MINs the balance between overall performance and cost is of crucial importance, since the addition of layers leads to a rising cost.

Offered load λ is the steady-state fixed probability of packets arriving at each input queue. In our simulation λ is assumed to be λ = 0.1, 0.2, …, 0.9, 1. This probability can be further broken down into λ_h and λ_l, which represent the arrival probabilities for high and low priority packets, respectively. It holds that λ = λ_h + λ_l.

Ratio of high priority offered load r_h is defined by r_h = λ_h/λ. In this chapter r_h is assumed to be r_h = 0.10. Similarly, the ratio of low priority offered load r_l can be expressed by r_l = λ_l/λ. It is obvious that r_h + r_l = 1. Consequently, r_l is assumed to be r_l = 0.90.

Number of stages n is the number of stages of an (N×N) MIN, where n = log_2 N. In our simulation n is assumed to be n = 6, which is a widely used MIN size.

Multicast ratio m of an SE at stage k, where k = 1…n, is the probability that a packet has the k-th bit of its multicast mask (MM) set to 1. Consequently, it effectively expresses the probability that this particular SE will perform a multicast by forwarding the packet to both its output links. In this chapter m is considered to be fixed at all SEs and is assumed to be m = 0, 0.1, 0.5, 1. It is obvious that, when m = 0 all input traffic is unicast, while if m = 1 all packets are broadcast (i.e. all packets reach all destinations). For intermediate values of m, the probability that a packet is unicast is equal to (1 − m)^n, i.e. the joint probability that all bits in the MM are equal to 0. The value m = 0.1 for the multicast ratio is considered since, for a MIN size of n = 6, it evaluates to (1 − 0.1)^6 = 0.9^6 ≈ 53.14%, thus giving approximately equal probabilities for unicast and multicast transmission within the MIN.
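The multicast-mask model above is straightforward to generate; the following C++ fragment (an illustrative sketch under the stated assumptions, not part of the thesis simulator) draws an n-bit MM with each bit independently set with probability m, so that the packet is unicast exactly when the mask is zero, with probability (1 − m)^n:

#include <cstdint>
#include <random>

// Draw an n-bit multicast mask; bit k-1 set means the SE at stage k clones the
// packet to both output links. For m = 0.1 and n = 6 the mask is all-zero
// (pure unicast) with probability 0.9^6, about 53.14%.
std::uint32_t draw_multicast_mask(int n, double m, std::mt19937& rng)
{
    std::bernoulli_distribution bit(m);
    std::uint32_t mask = 0;
    for (int k = 0; k < n; ++k)
        if (bit(rng)) mask |= (1u << k);
    return mask;
}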

6.4 Performance Evaluation Metrics for Multi-layer MINs

The two most important network performance factors, namely packet throughput and delay, are evaluated and analyzed in this section. The universal performance factor introduced in [57], which combines the above two metrics into a single one, is also applied. In this study, when calculating the value of this combined factor, we have considered the individual performance factors (packet throughput and delay) to be of equal importance. This is not necessarily true for all application classes; e.g. for batch data transfers throughput is more important, whereas for streaming media the delay must be optimized. Moreover, attention has been paid to the definition of throughput and delay for multi-layer MINs, since both the single-layer and multi-layer segments have to be considered. Finally, in this chapter packet loss probability is considered as a separate metric for multicast traffic.

6.4.1 Metrics for Single-layer Segment of MINs

In order to evaluate the performance of a multicasting, single-layer (N×N) MIN, we use the following metrics. Let T be a relatively large time period divided into u discrete time intervals (τ_1, τ_2, ..., τ_u).

Average throughput Th_avg is the average number of packets accepted by all destinations per network cycle. Formally, Th_avg (or bandwidth) is defined as

    Th_{avg} = \lim_{u \to \infty} \frac{\sum_{k=1}^{u} n_a(k)}{u}        (6.1)

where n_a(k) denotes the number of packets that reach their destinations during the k-th time interval.

Normalized throughput Th is the ratio of the average throughput Th_avg to the number of network outputs N. Formally, Th can be expressed as

    Th = \frac{Th_{avg}}{N}        (6.2)

and reflects how effectively the network capacity is used.

Relative normalized throughput RTh(h) of high priority packets is the normalized throughput Th(h) of such packets divided by the corresponding ratio of offered load r_h:

    RTh(h) = \frac{Th(h)}{r_h}        (6.3)

Similarly, the relative normalized throughput RTh(l) of low priority packets can be expressed as the ratio of the normalized throughput Th(l) of such packets to the corresponding ratio of offered load r_l:

    RTh(l) = \frac{Th(l)}{r_l}        (6.4)

This extra normalization of both high and low priority traffic leads to a common value domain, needed for comparing their absolute performance values with those obtained by the corresponding single-priority MINs.
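As an illustration, the C++ snippet below computes the metrics of equations (6.1)-(6.4) from raw simulation counters; the Counters struct and its field names are hypothetical stand-ins for whatever the simulator actually accumulates.

    #include <cstdint>

    // Hypothetical accumulators gathered over u simulated network cycles.
    struct Counters {
        std::uint64_t acceptedHigh;  // high-priority packets delivered
        std::uint64_t acceptedLow;   // low-priority packets delivered
        std::uint64_t cycles;        // u, number of time intervals
    };

    struct ThroughputMetrics {
        double th;    // normalized throughput, eq. (6.2)
        double rthH;  // relative normalized throughput, high class, eq. (6.3)
        double rthL;  // relative normalized throughput, low class, eq. (6.4)
    };

    ThroughputMetrics throughput(const Counters& c, int N, double rh, double rl) {
        // eq. (6.1): average packets delivered per network cycle, per class
        double thAvgH = static_cast<double>(c.acceptedHigh) / c.cycles;
        double thAvgL = static_cast<double>(c.acceptedLow) / c.cycles;
        double thH = thAvgH / N;  // eq. (6.2) applied per traffic class
        double thL = thAvgL / N;
        return { thH + thL, thH / rh, thL / rl };
    }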

Thus, in the diagrams of the next section we will compare the relative normalized throughput of dual-priority MINs with the normalized throughput of single-priority ones.

Average packet delay D_avg(p) is the average time a p-class priority packet needs to pass through the network, where p ∈ {h, l} denotes its traffic priority: high or low. Formally, D_avg(p) is expressed as

    D_{avg}(p) = \lim_{u \to \infty} \frac{\sum_{k=1}^{n_a(p,u)} t_d(p,k)}{n_a(p,u)}        (6.5)

where n_a(p,u) denotes the total number of p-class packets accepted within u time intervals and t_d(p,k) represents the total delay of the k-th p-class packet. We consider t_d(p,k) = t_w(p,k) + t_tr(p,k), where t_w(p,k) denotes the total queuing delay of the k-th p-class packet while waiting at each stage for the availability of a buffer at the next stage of the network. The second term t_tr(p,k) denotes the total transmission delay of the k-th p-class packet across all stages of the network, which is simply n·nc, where n = log_2 N is the number of intermediate stages and nc is the network cycle.

Normalized packet delay D(p) is the ratio of D_avg(p) to the minimum packet delay, which is simply the transmission delay n·nc (i.e. zero queuing delay). Formally, D(p) can be defined as

    D(p) = \frac{D_{avg}(p)}{n \cdot nc}        (6.6)

Universal performance factor Upf is defined by a relation involving the two major normalized factors above, D and Th: the performance of a MIN is considered optimal when D is minimized and Th is maximized, and the formula for computing the universal factor is arranged so that the overall performance metric follows this rule. Formally, Upf can be expressed as

    Upf = w_d \cdot D^2 + w_{th} \cdot \frac{1}{Th^2}        (6.7)

where w_d and w_th denote the corresponding weights of each factor participating in the Upf, designating thus its importance for the corporate environment. Consequently, the performance of a MIN can be expressed by a single metric that is tailored to the needs that a specific MIN setup will serve. It is obvious that when the packet delay factor becomes smaller and/or the throughput factor becomes larger, the Upf becomes smaller; thus smaller Upf values indicate better overall MIN performance. Because the above factors (parameters) have different measurement units and scaling, we normalize them to obtain a reference value domain. Normalization is performed by dividing the value of each factor by the (algebraic) minimum or maximum value that this factor may attain. Thus, equation (6.7) can be replaced by:

    Upf = w_d \left( \frac{D - D_{min}}{D_{min}} \right)^2 + w_{th} \left( \frac{Th_{max} - Th}{Th} \right)^2        (6.8)

where D_min is the minimum value of the normalized packet delay D and Th_max is the maximum value of the normalized throughput. Consistently with equation (6.7), when the universal performance factor Upf, as computed by equation (6.8), is close to 0, the performance of the MIN is considered optimal, whereas when the value of Upf increases, its performance deteriorates. Moreover, taking into account that the values of both delay and throughput appearing in equation (6.8) are normalized, D_min = Th_max = 1, thus the equation can be simplified to:

    Upf = w_d (D - 1)^2 + w_{th} \left( \frac{1 - Th}{Th} \right)^2        (6.9)

In the remainder of this chapter we will consider both factors of equal importance, thus setting w_d = w_th = 1. Finally, as in the evaluation of the relative normalized throughput for high and low priority traffic, the corresponding ratio of offered load also takes part in this metric. Consequently,

    Upf(p) = w_d (D(p) - 1)^2 + w_{th} \left( \frac{r_p - Th(p)}{Th(p)} \right)^2        (6.10)

where r_p ∈ {r_h, r_l} is the corresponding ratio of offered load for high and low priority traffic, respectively; a minimal computation of this per-class factor is sketched below.

Average packet loss probability Pl_avg(p) is the average number of p-class packets rejected by all input ports per network cycle. Formally, Pl_avg(p) is defined as

    Pl_{avg}(p) = \lim_{u \to \infty} \frac{\sum_{k=1}^{u} n_r(p,k)}{u}        (6.11)

where n_r(p,k) denotes the total number of p-class packets that are rejected at all queues of SEs at the first stage of the MIN during the k-th time interval.

Normalized packet loss probability Pl(p) is the ratio of the average packet loss probability Pl_avg(p) to the number of network input ports N. As in equations (6.3) and (6.4), the relative normalized loss probability Pl(p) can be formally expressed as

    Pl(p) = \frac{Pl_{avg}(p)}{N \cdot r_p}        (6.12)

Note that the packet loss probability in the case of unicast traffic is equal to (1 − Th), and this is the reason it does not appear in the Upf formula of [57] (that study considers only unicast traffic). In this work, we retain the definition of [57] for Upf, and we consider packet loss probability as a separate metric for multicast traffic.
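The helper below is a minimal C++ sketch of the per-class universal performance factor of equation (6.10), under the w_d = w_th = 1 assumption adopted in this chapter; the function name and argument conventions are ours, not the simulator's.

    // Universal performance factor for p-class traffic, eq. (6.10),
    // with both weights set to 1 as in this chapter.
    // dP : normalized packet delay D(p)      (>= 1, 1 = no queuing)
    // thP: normalized throughput Th(p)       (0 < thP <= rP)
    // rP : ratio of offered load of class p  (r_h = 0.10 or r_l = 0.90)
    double upf(double dP, double thP, double rP) {
        const double wd = 1.0, wth = 1.0;
        double delayTerm = (dP - 1.0) * (dP - 1.0);
        double thruTerm  = (rP - thP) / thP;   // relative throughput shortfall
        return wd * delayTerm + wth * thruTerm * thruTerm;
    }
    // Example: upf(1.0, 0.10, 0.10) == 0, i.e. optimal service for the
    // high-priority class when it is delivered at full load with no queuing.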

6.4.2 Metrics for multi-layer MINs

Recall from section 6.3 that the multi-layer (N×N) MINs considered in this chapter consist of two segments, as illustrated in figure 6.1: the first one is a single-layer segment and the second one is a multi-layer segment operating in a non-blocking fashion. Let l be the number of layers at the last (output) stage of the network. The number of multi-layer stages is then n_ml = log_2 l (since layers are doubled in consecutive stages of the multi-layer segment), while the number of single-layer stages is n_sl = n − log_2 l = log_2 N − log_2 l, where n = log_2 N is the total number of stages of the MIN.

Normalized throughput Th(p) of an l-layer MIN can consequently be expressed as

    Th(p) = Th(p, n - \log_2 l) \cdot (1 + m)^{1 + \log_2 l}        (6.13)

where Th(p, n − log_2 l) is the normalized throughput of p-class traffic at the last stage of the single-layer segment of the MIN. The multiplier (1 + m)^{1+log_2 l} in equation (6.13) effectively represents the cloning factor of a packet undergoing (1 + log_2 l) transmissions across stages, with the probability of being duplicated in each transmission being m. Note that equation (6.13) holds under the assumption that no blockings may occur in the last (1 + log_2 l) transmissions: the last one of the single-layer segment and all of the multi-layer segment.

Normalized delay D(p) of an l-layer MIN can similarly be evaluated based on the normalized packet delay D(p, n − log_2 l) of the single-layer segment of the MIN. Formally, D(p) can be defined as

    D(p) = \frac{D(p, n - \log_2 l) \cdot (n - \log_2 l) + \log_2 l}{n}        (6.14)

The normalized delay of the entire MIN transmission includes both the single- and the multi-layer segment. According to (6.6), the average delay of the single-layer segment can be expressed as D_avg(p, n − log_2 l) = D(p, n − log_2 l) · (n − log_2 l) · nc. Subsequently, the average delay D_avg(p) of the entire l-layer MIN is simply augmented by the transmission delay of the non-blocking multi-layer segment, which is log_2 l · nc. Thus, the normalized delay, just as expressed by equation (6.14), is computed by dividing D_avg(p) = [D(p, n − log_2 l) · (n − log_2 l) + log_2 l] · nc by the minimum packet delay, which is simply the transmission delay of all stages, i.e. n · nc.

Universal performance factor Upf(p) of an l-layer MIN can be expressed according to equation (6.8), taking into account that D_min = 1 and Th_max = 2l, by

    Upf(p) = w_d (D(p) - 1)^2 + w_{th} \left( \frac{2 l \cdot r_p - Th(p)}{Th(p)} \right)^2        (6.15)

The maximum normalized throughput occurs when the multicast ratio is m = 1, in which case the normalized throughput at the last stage of the single-layer segment is also Th(p, n − log_2 l) = 1. In this case, the second factor of equation (6.13) becomes 2^{1+log_2 l} = 2l, denoting that each queue of all layers within the non-blocking segment of the MIN forwards 2 packets at each time slot. The sketch below illustrates how equations (6.13) and (6.14) lift single-layer measurements to whole-MIN metrics.
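As a concrete illustration (a sketch only; the measurement inputs would come from the simulator), the C++ fragment below applies equations (6.13) and (6.14) to convert metrics observed at the end of the single-layer segment into metrics for the whole l-layer MIN.

    #include <cmath>

    // Lift single-layer-segment measurements to whole-MIN metrics for an
    // l-layer, n-stage MIN (eqs. 6.13 and 6.14). thSeg and dSeg are the
    // normalized throughput and delay observed at the last single-layer stage.
    struct MultiLayerMetrics { double th; double d; };

    MultiLayerMetrics liftToMultiLayer(double thSeg, double dSeg,
                                       int n, int l, double m) {
        double log2l = std::log2(static_cast<double>(l));
        int nSl = n - static_cast<int>(log2l);    // single-layer stages
        // eq. (6.13): cloning factor over the last (1 + log2 l) hops,
        // each hop duplicating the packet with probability m.
        double th = thSeg * std::pow(1.0 + m, 1.0 + log2l);
        // eq. (6.14): queuing happens only in the single-layer segment;
        // the multi-layer segment adds pure transmission delay.
        double d = (dSeg * nSl + log2l) / n;
        return { th, d };
    }
    // Example: n = 6, l = 4, m = 1, thSeg = 1 gives th = 2^3 = 8 = 2l.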

6.5 Simulation and Performance Results for multi-layer, multi-priority MINs

The overall network performance of multicasting store-and-forward MINs was evaluated by developing a special-purpose simulator in C++, capable of operating under different configuration schemes. Performance evaluation was conducted using simulation, rather than mathematical modelling, due to the high complexity of the latter [13], stemming from the combination of multicast traffic, multiple priorities and multiple buffer positions in the SEs.

The simulator implements two different kinds of multicast transmission: i) full-multicast transmission, where a packet is transmitted only when both queues of the next-stage SEs are able to accept it, and ii) partial-multicast transmission, where a packet can be serviced either fully in both directions or partially, being transmitted in one direction and remaining in the queue until the transmission towards the other direction is completed (algorithms 6.2 and 6.3). Several input parameters such as the buffer length, the number of input and output ports, the number of stages, the offered load, the multicast ratio, the number of priority classes, and the number of layers were considered. Internally, each SE was modeled by an array of p non-shared buffer queue pairs, where p is the number of priority classes; within each pair, one buffer was dedicated to the upper queuing bank and the other to the lower bank. We simulated two different configurations, where the queuing banks were located either in front of the crossbar Switching Element (SE) or behind it. In algorithm 6.3 below, the queuing banks are considered to be located after the crossbar segment of the SE. In all cases buffer operation was based on the FCFS principle.

In this study we considered both single- and dual-priority MINs. In the first case, contention between two packets was resolved randomly; when a dual-priority mechanism was used, high-priority packets had precedence over the low-priority ones (algorithm 6.3), and contentions were resolved by favouring the packet designated as "high priority", transmitted from the queue in which the high-priority packets were stored.

All simulation experiments were performed at packet level, assuming fixed-length packets transmitted in equal-length time slots, where the slot was the time required to forward one (in the case of unicast; algorithms 6.1 and 6.3) or two (in the case of broadcast; algorithms 6.2 and 6.3) packet(s) from one stage to the next. Metrics such as packet throughput, packet delay, and loss probability were collected. We performed extensive simulations to validate our results. All statistics were obtained from simulations running for 10^5 clock cycles. The number of simulation runs was adjusted to ensure a steady-state operating condition for the MIN. There was a stabilization phase allowing the network to reach a steady state: the data from the first 10^3 network cycles were discarded before metrics collection was initiated.
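The skeleton below sketches how such a measurement loop can separate the warm-up phase from metric collection; the stepOneCycle and recordMetrics hooks are hypothetical placeholders for the simulator's per-cycle logic, not its real interface.

    #include <cstdint>

    const std::uint32_t kWarmupCycles  = 1000;    // transient, discarded
    const std::uint32_t kMeasureCycles = 100000;  // 10^5 measured cycles

    // Hypothetical per-cycle hooks; in the real simulator these would
    // advance every SE by one network cycle and accumulate statistics.
    void stepOneCycle()  { /* route, resolve conflicts, move packets */ }
    void recordMetrics() { /* accumulate throughput, delay, loss */ }

    void runExperiment() {
        // Stabilization phase: let the MIN reach steady state; no metrics.
        for (std::uint32_t t = 0; t < kWarmupCycles; ++t)
            stepOneCycle();
        // Measurement phase: statistics gathered over 10^5 network cycles.
        for (std::uint32_t t = 0; t < kMeasureCycles; ++t) {
            stepOneCycle();
            recordMetrics();
        }
    }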

Unicast/Partial Forwarding (cs_id, cl_id, nl_id, sq_id, aq_id, pr_id, mp, bm)

Input: current stage id (cs_id); current- and next-stage layer ids (cl_id, nl_id) of the send- and accept-queue(s), respectively; send-queue id (sq_id) of the current stage; accept-queue id (aq_id) of the next stage; priority id (pr_id); multicast policy (mp) and blocking mechanism (bm).

Output: population of the send- and accept-queues (Pop); total numbers of serviced and blocked packets of the send-queue (Serviced, Blocked), respectively; total number of packet delay cycles of the send-queue (Delay); routing address (RA) of each buffer position of a queue; partial multicast service indicator of the head-of-line packet of the send-queue (PS), where PS = 0 means not partially serviced and PS = 1 / 2 means already serviced at the lower / upper port, respectively.

{
    if (Pop[aq_id][cs_id+1][nl_id][pr_id] = B)    // blocking state, where B is the buffer size
    {
        Blocked[sq_id][cs_id][cl_id][pr_id] = Blocked[sq_id][cs_id][cl_id][pr_id] + 1 ;
        if (mp = full) and (bm = blm)    // block-and-lost mechanism
        {
            Pop[sq_id][cs_id][cl_id][pr_id] = Pop[sq_id][cs_id][cl_id][pr_id] - 1 ;
            for (bf_id = 1; bf_id <= Pop[sq_id][cs_id][cl_id][pr_id]; bf_id++)
                RA[sq_id][cs_id][cl_id][pr_id][bf_id] = RA[sq_id][cs_id][cl_id][pr_id][bf_id+1] ;
                // RA is the routing address of the packet located at the
                // (bf_id)-th position of the send-queue
        }
    }
    else    // unicast forwarding
    {
        Serviced[sq_id][cs_id][cl_id][pr_id] = Serviced[sq_id][cs_id][cl_id][pr_id] + 1 ;
        Pop[sq_id][cs_id][cl_id][pr_id] = Pop[sq_id][cs_id][cl_id][pr_id] - 1 ;
        Pop[aq_id][cs_id+1][nl_id][pr_id] = Pop[aq_id][cs_id+1][nl_id][pr_id] + 1 ;
        RA[aq_id][cs_id+1][nl_id][pr_id][Pop[aq_id][cs_id+1][nl_id][pr_id]] = RA[sq_id][cs_id][cl_id][pr_id][1] ;
        for (bf_id = 1; bf_id <= Pop[sq_id][cs_id][cl_id][pr_id]; bf_id++)
            RA[sq_id][cs_id][cl_id][pr_id][bf_id] = RA[sq_id][cs_id][cl_id][pr_id][bf_id+1] ;
        PS[sq_id][cs_id][cl_id][pr_id] = 0 ;    // re-initialize the indicator:
        // the new head-of-line packet has not been partially serviced
    }
    Delay[sq_id][cs_id][cl_id][pr_id] = Delay[sq_id][cs_id][cl_id][pr_id] + Pop[sq_id][cs_id][cl_id][pr_id] ;
    return Pop, Serviced, Blocked, Delay, RA, PS ;
}

Algorithm 6.1: Unicast/Partial forwarding for multi-layer, multi-priority MINs

Broadcast Forwarding (cs_id, cl_id, nl_id, sq_id, uq_id, lq_id, pr_id, mp, bm)

Input: current stage id (cs_id); current- and next-stage layer ids (cl_id, nl_id) of the send- and accept-queue(s), respectively; send-queue id (sq_id) of the current stage; upper and lower output-port queue ids (uq_id, lq_id) of the next-stage accept-queues, respectively; priority id (pr_id); multicast policy (mp) and blocking mechanism (bm).

Output: population of the send- and accept-queues (Pop); total numbers of serviced and blocked packets of the send-queue (Serviced, Blocked), respectively; total number of packet delay cycles of the send-queue (Delay); routing address (RA) of each buffer position of a queue; partial multicast service indicator of the head-of-line packet of the send-queue (PS).

{
    if (Pop[uq_id][cs_id+1][nl_id][pr_id] = B) or (Pop[lq_id][cs_id+1][nl_id][pr_id] = B)    // blocking state
    {
        Blocked[sq_id][cs_id][cl_id][pr_id] = Blocked[sq_id][cs_id][cl_id][pr_id] + 1 ;
        if (mp = full) and (bm = blm)    // block-and-lost mechanism
        {
            Pop[sq_id][cs_id][cl_id][pr_id] = Pop[sq_id][cs_id][cl_id][pr_id] - 1 ;
            for (bf_id = 1; bf_id <= Pop[sq_id][cs_id][cl_id][pr_id]; bf_id++)
                RA[sq_id][cs_id][cl_id][pr_id][bf_id] = RA[sq_id][cs_id][cl_id][pr_id][bf_id+1] ;
                // RA is the routing address of the packet located at the
                // (bf_id)-th position of the send-queue
        }
    }
    if (Pop[uq_id][cs_id+1][nl_id][pr_id] < B) and (Pop[lq_id][cs_id+1][nl_id][pr_id] < B)
    {
        // broadcast forwarding
        Serviced[sq_id][cs_id][cl_id][pr_id] = Serviced[sq_id][cs_id][cl_id][pr_id] + 1 ;
        Pop[sq_id][cs_id][cl_id][pr_id] = Pop[sq_id][cs_id][cl_id][pr_id] - 1 ;
        Pop[uq_id][cs_id+1][nl_id][pr_id] = Pop[uq_id][cs_id+1][nl_id][pr_id] + 1 ;
        Pop[lq_id][cs_id+1][nl_id][pr_id] = Pop[lq_id][cs_id+1][nl_id][pr_id] + 1 ;
        RA[uq_id][cs_id+1][nl_id][pr_id][Pop[uq_id][cs_id+1][nl_id][pr_id]] = RA[sq_id][cs_id][cl_id][pr_id][1] ;
        RA[lq_id][cs_id+1][nl_id][pr_id][Pop[lq_id][cs_id+1][nl_id][pr_id]] = RA[sq_id][cs_id][cl_id][pr_id][1] ;
        for (bf_id = 1; bf_id <= Pop[sq_id][cs_id][cl_id][pr_id]; bf_id++)
            RA[sq_id][cs_id][cl_id][pr_id][bf_id] = RA[sq_id][cs_id][cl_id][pr_id][bf_id+1] ;
        PS[sq_id][cs_id][cl_id][pr_id] = 0 ;    // re-initialize the indicator
    }

    if (mp = partial)    // partial multicast forwarding is enabled
    {
        if (Pop[uq_id][cs_id+1][nl_id][pr_id] < B) and (Pop[lq_id][cs_id+1][nl_id][pr_id] = B)
        {
            // upper-port partial multicast service
            Pop[uq_id][cs_id+1][nl_id][pr_id] = Pop[uq_id][cs_id+1][nl_id][pr_id] + 1 ;
            RA[uq_id][cs_id+1][nl_id][pr_id][Pop[uq_id][cs_id+1][nl_id][pr_id]] = RA[sq_id][cs_id][cl_id][pr_id][1] ;
            PS[sq_id][cs_id][cl_id][pr_id] = 2 ;    // serviced at the upper port;
            // the lower-port transmission is still pending
        }
        if (Pop[uq_id][cs_id+1][nl_id][pr_id] = B) and (Pop[lq_id][cs_id+1][nl_id][pr_id] < B)
        {
            // lower-port partial multicast service
            Pop[lq_id][cs_id+1][nl_id][pr_id] = Pop[lq_id][cs_id+1][nl_id][pr_id] + 1 ;
            RA[lq_id][cs_id+1][nl_id][pr_id][Pop[lq_id][cs_id+1][nl_id][pr_id]] = RA[sq_id][cs_id][cl_id][pr_id][1] ;
            PS[sq_id][cs_id][cl_id][pr_id] = 1 ;    // serviced at the lower port;
            // the upper-port transmission is still pending
        }
    }
    Delay[sq_id][cs_id][cl_id][pr_id] = Delay[sq_id][cs_id][cl_id][pr_id] + Pop[sq_id][cs_id][cl_id][pr_id] ;
    return Pop, Serviced, Blocked, Delay, RA, PS ;
}

Algorithm 6.2: Broadcast forwarding for multi-layer, multi-priority MINs

SendQueue Process (cs_id, cl_id, nl_id, sq_id, mp, bm)

Input: current stage id (cs_id); current- and next-stage layer ids (cl_id, nl_id) of the send- and accept-queue(s), respectively; send-queue id (sq_id) of the current stage; multicast policy (mp) and blocking mechanism (bm).

{
    processor = 0 ;
    for (pr_id = P - 1; pr_id >= 0; pr_id--)    // P is the total number of priorities
        if (Pop[sq_id][cs_id][cl_id][pr_id] > 0) and (processor = 0)
        // the pr_id-class send-queue is not empty and the processor is still ready for forwarding
        {
            RA_bit = get_bit(RA[sq_id][cs_id][cl_id][pr_id][1]) ;    // Routing Address (RA)
            MM_bit = get_bit(MM[sq_id][cs_id][cl_id][pr_id][1]) ;    // Multicast Mask (MM)
            // get the (cs_id)-th bit of the RA and the MM of the leading packet
            // of the pr_id-class send-queue by a cyclic logical left shift, respectively

            if ((mp = full) and (MM_bit = 1)) or
               ((mp = partial) and (MM_bit = 1) and (PS[sq_id][cs_id][cl_id][pr_id] = 0))
            // broadcast forwarding; PS = 0 means the head-of-line packet
            // has not been partially serviced yet
            {
                uq_id = 2 * (sq_id mod (N/2)) ;        // upper link of the perfect shuffle
                lq_id = 2 * (sq_id mod (N/2)) + 1 ;    // lower link of the perfect shuffle
                Broadcast Forwarding (cs_id, cl_id, nl_id, sq_id, uq_id, lq_id, pr_id, mp, bm) ;
            }
            else    // unicast or partial multicast forwarding
            {
                if (RA_bit = 0) or ((mp = partial) and (MM_bit = 1) and (PS[sq_id][cs_id][cl_id][pr_id] = 1))
                    // upper-port forwarding
                    aq_id = 2 * (sq_id mod (N/2)) ;        // link of the perfect shuffle
                else if (RA_bit = 1) or ((mp = partial) and (MM_bit = 1) and (PS[sq_id][cs_id][cl_id][pr_id] = 2))
                    // lower-port forwarding
                    aq_id = 2 * (sq_id mod (N/2)) + 1 ;    // link of the perfect shuffle
                // PS = 1 / 2 means the packet has already been partially serviced
                // at the lower / upper port direction, respectively
                Unicast/Partial Forwarding (cs_id, cl_id, nl_id, sq_id, aq_id, pr_id, mp, bm) ;
            }
            processor = 1 ;
        }
}

Algorithm 6.3: Send-queue process for multi-layer, multi-priority MINs

6.5.1 Simulator Validation for Multicasting

To validate our simulator, we modeled single-layer MINs using it and compared the obtained results against the results reported in other works, selecting among them the ones considered most accurate, both under unicast and multicast traffic.

Figure 6.3: Th of single-priority MINs
Figure 6.4: D of single-priority MINs

In the case of unicast traffic (m = 0) we found that all results obtained by this simulator (figure 6.3, curve [FP]MB1 0) were in close agreement with the results reported in [55] (figure 2), and notably with Theimer's model [44], which is considered to be the most accurate one. In all subsequent diagrams, curves ZMBX Y denote the performance of a MIN whose SEs in the single-layer segment have buffer size equal to X, operating with multicast ratio m equal to Y. When Z is equal to F, the MIN in question operates under the FM policy, whereas when Z is equal to P the MIN operates under the PM policy. In the special case that m = 0, the multicast policy is irrelevant since no multicasting occurs; thus both curves coincide and are denoted as [FP]MBX Y. All curves refer to 6-stage MINs.

Moreover, for m = 0.5, in the case of using the partial multicasting policy on a single-layer, single-buffered (64×64) MIN, we compared our measurements (figure 6.3, curve PMB1 0.5) against those obtained from Tutsch's model reported in [48] (figure 8, solid curve), when all possible combinations of destination addresses for each packet entering the network were equally distributed, and we found that both results are in close agreement (normalized throughput is about 75%).

6.5.2 Multicasting on Single-priority, Single-layer MINs

In figure 6.3 we can observe that the partial multicasting policy offers better performance compared to full multicasting for m = 0.1 and m = 0.5, while no differences are observed for m = 1. We can also note that for high values of m (m >= 0.5) the network is saturated (reaches its peak performance) even at very small loads (λ < 0.05), while for m = 0.1 (in which case, we may recall, approximately half of the packets entering the network are unicast), the network saturates for offered loads λ >= 0.4.

Figure 6.4 illustrates the normalized delay of single-layer MINs. Again, the PM policy offers better performance than the FM policy for m = 0.1 and m = 0.5; for m = 1 (i.e. only broadcast packets enter the network), the situation is reversed and the FM policy has a performance edge. This is owing to the fact that if a broadcast packet is partially served, a packet copy remains in the queue, leading to partial servicing of subsequent packets (which are broadcast packets too), and this results in increased queuing delays. Finally, for m = 1 the delay values of both multicast policies are excessively high.

Figure 6.5: Upf of single-priority MINs
Figure 6.6: Pl of single-priority MINs

Figure 6.5 shows the universal performance factor Upf of single-layer MINs. For m = 1 the value of Upf is high (indicating poor MIN performance), and this is owing to the high delay values. For m = 0 and m = 0.1, the value of Upf drops (thus overall MIN performance increases) until the offered load reaches a value of 0.5 and 0.3, respectively. This is mainly owing to the variation of the throughput, which increases within the above ranges; the delay, on the other hand, exhibits considerably smaller variations. On the contrary, for m = 0.5 and m = 1 the value of Upf continuously increases, since the network is very quickly saturated (λ >= 0.10).

Figure 6.6 depicts the packet loss probability of single-layer MINs. We can notice that larger values of m lead to more lost packets; this is to be expected, since when m increases, more packets are generated as a result of packet cloning due to multicasting, and the increased packet population cannot be successfully serviced since the network is already saturated.

The PM policy again has a marginal edge over FM in this case.

6.5.3 Multicasting on Dual-priority, Single-layer MINs

In this section we only consider the PM policy, since it offers superior performance compared to the FM policy, as shown in the previous section. All SP-BX-M:Y curves in the subsequent diagrams denote the performance of a single-priority, 6-stage MIN whose SEs in the single-layer segment have buffer size equal to X, operating with multicast ratio m equal to Y%. Similarly, the HP-BX-M:Y and LP-BX-M:Y curves depict the performance of the high and low priority traffic of a dual-priority MIN, respectively.

Figures 6.7 and 6.8 present the relative normalized throughput of the dual- vs. the single-priority mechanism for a single-buffered, 6-stage, single-layer MIN in the cases of m = 0, 0.10 and m = 0.50, respectively. The introduction of a dual-priority scheme in a single-layer MIN has a significant impact on the quality of service offered to packets of different priorities. Figures 6.8 and 6.9 depict the normalized throughput and normalized delay of high- and low-priority packets in a "heavy multicasting" scenario [the probability that a packet is unicast is equal to (1 − 0.5)^6 = 0.016, i.e. 98.4% of the packets are multicast to at least two destinations]; the respective metrics for a single-priority scheme are also included for comparison.

Figure 6.7: Th of single-layer MINs (m = 0, 0.10)
Figure 6.8: Th of single-layer MINs (m = 0.50)

Due to the heavy multicasting, the network is quickly saturated, and we can observe that in the single-priority setup peak throughput is reached at λ = 0.1 and remains constant thereafter. In the dual-priority setup, high-priority packets (recall that these correspond to 10% of the overall traffic in the described experiments) are serviced at almost optimal QoS for loads λ <= 0.7. Low-priority packets exhibit a throughput drop, which is small for loads λ <= 0.3 and tolerable for loads λ <= 0.5, while for higher loads the performance drop is considerable.

Regarding the delay, we can observe in figure 6.9 that for loads λ <= 0.4 high-priority packets traverse the network with almost no blockings, while for loads λ >= 0.6 the effect of blockings on the delay of high-priority packets is noticeable. Low-priority packets have a high delay even at small loads, and beyond the point of λ = 0.5 the delay rises sharply.

Figure 6.9: D of single-layer MINs (m = 0.50)
Figure 6.10: D of single-layer MINs (m = 0, 0.10)

Respectively, figures 6.7 and 6.10 illustrate the normalized throughput and delay of high- and low-priority packets in a unicast (m = 0) and in a moderate multicasting scenario (m = 0.1, which in a 6-stage MIN results in approximately half of the packets being multicast to at least two destinations). Under both scenarios, high-priority packets receive an almost optimal quality of service, both in terms of throughput and delay, under all offered loads, while low-priority packets sustain some observable performance drop in the moderate multicasting scenario only; throughput appears to drop compared to the single-priority setup for very high overall loads (λ >= 0.8), while the increase in delay becomes apparent at a somewhat smaller load (λ >= 0.7). We should note here that the introduction of the dual-priority scheme effectively increases the buffer capacity of the SEs, since separate buffer queues are dedicated to the packets of each priority, and this is the reason why both the overall throughput and delay of the network increase (for a discussion on the effects of SE buffer size on overall MIN performance, the interested reader is referred to [57]).

Figure 6.11 depicts the MIN universal performance factor for the three above scenarios (unicast, moderate multicast and heavy multicast), for both the dual- and the single-priority setups. The findings reaffirm that high-priority packets enjoy an almost optimal quality of service, even at high loads. In the unicast and moderate multicast scenarios, low-priority packets reach their peak quality of service at λ = 0.4, and this slightly drops for higher loads; in the heavy multicast scenario, the quality of service offered to low-priority packets is initially tolerable, but deteriorates sharply as the load increases.

Packet loss probability is another metric that should be taken into account in order to assess the overall MIN operation.

Figure 6.11: Upf of single-layer MINs
Figure 6.12: Pl of single-layer MINs

As we can see in figure 6.12, under the unicast and moderate multicast scenarios high-priority packets are never or rarely dropped; under the heavy multicast scenario, however, for loads λ > 0.5 the network begins to drop packets, since there is not enough capacity to route all packets to their destinations (recall that packets can be cloned as they traverse the network, thus the overall number of packets increases). The probability that a low-priority packet is lost is comparable to that of packet loss in a single-priority setup, being observably higher under the heavy multicast scenario for loads λ >= 0.6. Note that under this scenario, and in the extreme case that λ = 1, the packet loss probability is close to

6.5.4 Multicasting on Dual-priority Multi-layer MINs

In this subsection we present our findings for a 6-stage MIN where the number of layers at the last stage, l, is equal to 4, i.e. the first four stages are single-layer and multiple layers are only used at the last two stages, in an attempt to balance between MIN performance and cost. For the first 4 stages, single- and double-buffered SEs are considered, whereas at the last two stages (which are non-blocking) single-buffered SEs are used, as the absence of blockings removes the need for larger buffers.

Figure 6.13 illustrates the normalized throughput for the heavy multicast scenario, considering SEs of buffer size 1 and 2; single-priority metrics are also included for reference. Under both scenarios, high-priority packets are served optimally, while low-priority packets enjoy better service when the buffer size b is equal to 2; this is consistent with the findings of other works (e.g. [57]), reporting that double buffers lead to increased throughput. Note that low-priority packets appear to enjoy better throughput even compared to the single-priority setup (except for loads λ >= 0.8 in the single-buffer configuration), and this is again owing to the additional buffers available in the SEs, due to the fact that different buffer queues are dedicated to packets of different priorities. The overall throughput of high-priority packets has increased by more than 50% in the high-load area (λ > 0.7), and the throughput of low-priority packets has effectively quadrupled, as compared to the single-layer MIN, at almost all loads.

Taking into account that introducing two stages of multi-layered SEs has led to quadrupling the routing capacity of the last two stages, we can conclude that this increase is exploited to an almost full extent.

Figure 6.14 depicts the normalized throughput of high- and low-priority packets under the moderate multicast scenario; single-priority metrics are also included for reference. High-priority packets are again served optimally (but no considerable difference can be seen against the single-layer case), while for low-priority packets the performance gains are approximately 100% when the buffer size is set to 2, and approximately 67% when the buffer size is set to 1 (both comparisons are against the single-layer, single-buffer configuration); thus buffer size plays an important role in this setup. Note also that the network reaches its peak throughput regarding low-priority packets at load λ = 0.6, as opposed to the single-layer case where peak performance is attained at load λ = 0.3, indicating that the additional bandwidth offered by the multi-layer SEs is exploited to some good extent.

Figure 6.13: Th of multi-layer MINs (m = 0.50)
Figure 6.14: Th of multi-layer MINs (m = 0.10)

Figures 6.15 and 6.16 show the normalized delay for the heavy- and medium-multicasting scenarios, respectively, considering both buffer sizes (1 and 2) and both priority schemes (dual- and single-priority). The normalized delay of high-priority packets is close to optimal in all cases. Expectedly, when the double-buffered configuration is considered, delays are increased for low-priority packets and for packets in the single-priority setup; the point beyond which the increment is considerable differs across the different scenarios, being located at λ = 0.3 for heavy multicasting, while for medium multicasting it has shifted to λ = 0.5. In the heavy multicasting scenario, for loads λ <= 0.3 low-priority packets have delays comparable to those of packets in the single-priority setup, indicating that in this load range the network has enough capacity to offer elevated quality of service to high-priority packets without harming the quality of service offered to low-priority packets. In the medium multicasting scenario, the delay metrics for these classes of packets present some observable deviation for loads λ >= 0.5.

Figure 6.15: D of multi-layer MINs (m = 0.50)
Figure 6.16: D of multi-layer MINs (m = 0.10)

Figure 6.17 illustrates the universal performance factor for the heavy multicasting scenario, considering both buffer sizes (1 and 2) and both priority schemes. We can observe that the Upf for high-priority packets continuously improves with the offered load under all configurations, indicating that the network has ample capacity to optimally service the increasing number of incoming high-priority packets. Regarding low-priority packets and packets in the single-priority setup, we can observe that initially the Upf improves in all setups, as more packets enter the network and therefore the normalized throughput increases. In the heavy multicasting scenario and for buffer size b = 2, the optimal point is reached at λ = 0.3, while beyond that point the sharp deterioration in the delay dominates over the small increments in the throughput, and thus the overall performance appears to drop from that point onwards. In the same curves (SP-B2-M:50 and LP-B2-M:50), we can notice that for loads λ <= 0.6 the low-priority packets enjoy a better overall quality of service compared to the packets in the single-priority setup: this is owing to the throughput gains obtained by the introduction of the additional buffer queues required for implementing the priority mechanism. For loads λ >= 0.8, however, the increments in delay diminish these gains, and the overall QoS of low-priority packets appears lower than the respective QoS of packets in the single-priority setup. Analogous remarks hold for the single-buffered configuration, with the overall performance drop point being located at λ = 0.5, and the point beyond which low-priority packets begin to be served worse than packets in the single-priority setup being located at λ = 0.7.

Figure 6.18 illustrates the Upf for the medium multicasting scenario. High-priority packets are again served optimally, with the Upf exhibiting a sharper improvement as the load increases: this is due to the fact that at low loads the network capacity is underutilized and hence the improvement potential is high. Regarding low-priority packets and packets in the single-priority scheme, we can notice here that double-buffered setups are consistently better than their single-buffered counterparts at all loads, indicating that the throughput gains obtained due to the introduction of additional buffer queues dominate over the deteriorations in the delay.

Beyond the point of λ = 0.7, only minor improvements can be observed in the Upf for these cases, indicating that at this point the network has been saturated.

Figure 6.17: Upf of multi-layer MINs (m = 0.50)
Figure 6.18: Upf of multi-layer MINs (m = 0.10)
Figure 6.19: Pl of multi-layer MINs (m = 0.50)

Finally, figure 6.19 illustrates the packet loss probability for the heavy multicast scenario. We can observe that while high-priority packets are never lost, low-priority packets and packets in the single-priority setup can be lost at particularly light loads (λ >= 0.2 for single-buffer configurations and λ >= 0.3 for double-buffered configurations). As anticipated, double-buffer configurations achieve a lower packet loss probability, since in these setups it is more likely that buffer space is available to accommodate an incoming packet. In this diagram it is worth noting that beyond the point of λ >= 0.6, the packet loss probability of the single-buffered dual-priority setup is lower than the packet loss probability of the double-buffered single-priority setup. This stems from the availability of the extra buffer queues in the dual-priority setup, which beyond that point are utilized at maximum capacity.

6.5.5 Multicasting on Multi-layer Segment of Dual-priority MINs

In this subsection we present our findings for a special operational mode of the multi-layer MIN, in which multicasting occurs only at the last log_2 l + 1 stages, i.e. packet cloning due to multicasting occurs only in the non-blocking segment. This mode of operation may be applied, for example, to cases of interconnected LANs, where multicasting/broadcasting can be performed within the limits of a single LAN but traffic across distinct LANs is always unicast. As an example, setting l = 16 in a (64×64) MIN produces a configuration that can serve two interconnected LANs of 32 nodes each (the stage dimensioning behind this example is sketched below). A MIN in this mode combines both the LAN switch and the network trunk functionalities.

Figure 6.20: Th of multi-layer MINs (m = 0.50)
Figure 6.21: Th of multi-layer MINs (m = 0.10)

In the diagrams, performance metrics are illustrated for different values of m (0.1, 0.5) and for l = 4; thus multicasting occurs only in the last 3 stages. Since these stages are non-blocking, both delay and loss probability are not affected by the value of m and are only related to the offered load λ and the buffer size of the SEs in the single-layer segment. Therefore, the variable m has been eliminated from the diagrams depicting packet delay (figure 6.22) and loss probability (figure 6.23), and these performance factors are analyzed with respect only to the offered load and the buffer size b. For both these performance factors, we can comment that their absolute values remain low, and are even lower than the corresponding metrics collected for unicast traffic in single-layer MINs (figures 6.10 and 6.12, respectively). This is owing (a) to the availability of extra buffer space due to the implementation of the dual-priority scheme and (b) to the fact that the probability of blockings in the last two stages drops to zero, whereas in figures 6.10 and 6.12 this does not hold. We can notice, however, that the normalized throughput in the heavy multicast scenario (figure 6.20) differs from the normalized throughput in the medium multicast scenario (figure 6.21): this is natural, since in the former case more packet clonings occur (due to the increased multicasting), thus more packets reach a destination port within any time unit.
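The dimensioning behind the interconnected-LAN example above can be checked with the toy C++ calculation below (a sketch; the variable names are ours): the last log_2 l + 1 stages of an n-stage MIN span 2^(log_2 l + 1) = 2l output ports, so l = 16 in a 64×64 MIN yields two groups of 32 ports.

    #include <cmath>
    #include <cstdio>

    int main() {
        const int N = 64;                               // 64x64 MIN
        const int n = static_cast<int>(std::log2(N));   // 6 stages
        const int l = 16;                               // layers at the output stage
        const int log2l = static_cast<int>(std::log2(l));
        int multicastStages = log2l + 1;                // stages where cloning may occur
        int lanSize = 1 << multicastStages;             // 2l = 32 ports per LAN
        std::printf("single-layer stages: %d, multicast stages: %d\n",
                    n - log2l, multicastStages);
        std::printf("LANs served: %d of %d nodes each\n", N / lanSize, lanSize);
        return 0;  // prints: single-layer stages: 2, multicast stages: 5
                   //         LANs served: 2 of 32 nodes each
    }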

Figure 6.22: D of multi-layer MINs
Figure 6.23: Pl of multi-layer MINs

Figure 6.23 illustrates the packet loss probability against the offered load, both for the dual- and the single-priority scenario. Again, high-priority packets are not lost under any circumstances, while double-buffered setups expectedly achieve a smaller loss probability, since it is more likely that an incoming packet can find available buffer space. Dual-priority configurations also exhibit a smaller packet loss probability compared to their single-priority counterparts, due to the existence of the additional queues.

Figure 6.24: Upf of multi-layer MINs (m = 0.50)
Figure 6.25: Upf of multi-layer MINs (m = 0.10)

Finally, figure 6.24 illustrates the universal performance factor for the heavy-multicast scenario. We can notice here a consistent improvement of the Upf until the point where the single-layer segment is saturated (λ = 0.7). Beyond that point, the small increments in throughput are counterbalanced by the increments in delay. Similar conclusions can be drawn from figure 6.25, which illustrates the Upf for the medium-multicast scenario. We can notice here that the absolute values of the Upf are considerably higher (thus the network performance is considered worse), mainly owing to the reduced values of throughput (fewer

packets traverse the network due to the reduced multicasting). Again, double-buffered setups appear to have a performance edge over their single-buffered counterparts, since they are able to attain higher throughput values, and the respective increase in delay, while existent, is not sufficient to cancel this advantage.

6.6 Conclusions

Multistage Interconnection Network technology is a prominent approach to implementing NGNs, having an appealing cost/performance ratio and high performance. Multicasting, however, which is a core requirement for NGNs, has been found to significantly degrade MIN performance, and multi-layer MINs have been introduced to cope with traffic shapes involving multicasting. In this chapter we extensively studied the performance of multi-layer MINs operating under various overall input loads and multicast packet ratios, considering also a dual-priority scheme. We have additionally taken into account different buffer size configurations for the SEs, and more specifically buffer sizes equal to 1 and 2, which have proven to be the most efficient ones. For all these configurations, we have drawn conclusions regarding the network throughput, the packet delay and the packet loss probability, and we have also computed the Universal Performance Factor, a metric combining throughput and delay. The findings of this performance evaluation can be used by network designers for drawing optimal configurations while setting up MINs, so as to best meet the performance and cost requirements under the anticipated traffic load and quality of service specifications. The presented results also facilitate performance prediction for multi-layer MINs before actual network implementation, through which deployment cost and rollout time can be minimized.

Bibliography

[1] Abandah G.A., Davidson E.S. Modeling the communication performance of the IBM SP2. In: Proceedings of the 10th International Parallel Processing Symposium (IPPS 96), Hawaii. IEEE Computer Society Press.

[2] Adams G.B., Siegel H.J. The extra stage cube: A fault-tolerant interconnection network for supersystems. IEEE Trans. on Computers, C-31(5), May 1982.

[3] Atiquzzaman M., Akhtar M.S. Effect of Non-Uniform Traffic on the Performance of Unbuffered Multistage Interconnection Networks. IEE Proceedings Part-E.

[4] Awdeh R.Y., Mouftah H.T. Survey of ATM switch architectures. Computer Networks and ISDN Systems, 27.

[5] Bolch G., Greiner S., Meer H., Trivedi K.S. Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications. John Wiley and Sons, New York.

[6] Chang H.K. Nonuniform memory reference of multistage interconnection networks. Computer Standards & Interfaces, 26.

[7] Chen J.S.C., Guerin R. Performance study of an input queueing packet switch with two priority classes. IEEE Trans. Commun., 39(1).

[8] Choi C., Kim S. Hierarchical multistage interconnection network for shared-memory multiprocessor systems. In: Proceedings of the 1997 ACM Symposium on Applied Computing.

[9] Cisco Systems. generation networks and the cisco carrier routing system overview.pdf (2004).

[10] Cisco Systems. Service Providers Worldwide Driving Video/IPTV with Cisco IP NGN b.html (2005).

[11] Cisco Systems. c1031/cdccont 0900aecd800f8118.pdf (2006).

[12] D-Link. DES-3250TG 10/100Mbps managed switch TG.htm (2006).

[13] Garofalakis J., Stergiou E. An analytical performance model for multistage interconnection networks with blocking. In: Procs. of CNSR 2008, May 2008.

[14] Garofalakis J., Stergiou E. Analytical Model for Performance Evaluation of Blocking Banyan Switches Supporting Double Priority. In: Proceedings of Communication Theory, Reliability, and Quality of Service (CTRQ 08), July.

[15] German R. Performance Analysis of Communication Systems. John Wiley and Sons.

[16] Goke L.R., Lipovski G.J. Banyan Networks for Partitioning Multiprocessor Systems. In: Procs. of the 1st Annual Symposium on Computer Architecture, 1973.

[17] Haas P.J. Stochastic Petri Nets. Springer Verlag.

[18] Hsiao S.H., Chen R.Y. Performance Analysis of Single-Buffered Multistage Interconnection Networks. In: 3rd IEEE Symposium on Parallel and Distributed Processing.

[19] Ilyas M., Syed M.A. An efficient multistage switching node architecture for broadband ISDNs. Telecommunication Systems, 1998.

[20] Intel Corporation. Intel Express 460T standalone switch (2006).

[21] International Telecommunication Union (ITU). Draft Recommendation Y.NGN-FRA R2, Functional requirements and architecture of the NGN of release 2, SG13, Sep.

[22] International Telecommunication Union (ITU). Draft Recommendation Y.ngnmcastsf, NGN Multicast Service Framework, SG13, Sep.

[23] Jenq Y.C. Performance analysis of a packet switch based on single-buffered banyan network. IEEE Journal on Selected Areas in Communications, 1983.

[24] Jurczyk M. Performance Comparison of Wormhole-Routing Priority Switch Architectures. In: Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'01), Las Vegas, 2001.

[25] Kim J., Shin T., Yang M. Analytical modeling of a Multistage Interconnection Network with Buffered a×a Switches under Hot-spot Environment. In: Proceedings of PACRIM 07.

[26] Lawrie D.H. Access and alignment of data in an array processor. IEEE Transactions on Computers, C-24(12), Dec. 1975.

[27] Lin T., Kleinrock L. Performance Analysis of Finite-Buffered Multistage Interconnection Networks with a General Traffic Pattern. In: Proceedings of the 1991 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, San Diego, California, United States.

[28] Lindemann C. Performance Modelling with Deterministic and Stochastic Petri Nets. John Wiley and Sons.

[29] Maggs B.M. Randomly-wired multistage networks. Statistical Science, 8(1), 1993.

[30] Merchant A. A Markov chain approximation for the analysis of Banyan networks. In: Proceedings of the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems.

[31] Mun H., Youn H.Y. Performance analysis of finite buffered multistage interconnection networks. IEEE Transactions on Computers.

[32] Ng S.L., Dewar B. Load sharing replicated buffered banyan networks with priority traffic. In: Connecting the System, Australian Telecommunication Networks and Applications Conference, Monash University, Clayton, Victoria.

[33] OECD. Convergence and Next Generation Networks (2007).

[34] Park J., Yoon H. Cost-effective algorithms for multicast connection in ATM switches based on self-routing multistage networks. Computer Communications, 21.

[35] Patel J.H. Processor-memory interconnections for multiprocessors. In: Procs. of the 6th Annual Symposium on Computer Architecture, New York.

[36] Roughan M., Sen S., Spatscheck O., Duffield N. Class-of-Service Mapping for QoS: A Statistical Signature-based Approach to IP Traffic Classification. In: Procs. of IMC 04, October 25-27, Taormina, Sicily, Italy.

[37] Saleh M., Atiquzzaman M. Analysis of shared buffer multistage networks with hot spot. In: IEEE First International Conference on Algorithms and Architectures for Parallel Processing, vol. 2, 1995.

[38] Shabtai G., Cidon I., Sidi M. Two priority buffered multistage interconnection networks. Journal of High Speed Networks.

[39] Shabtai G., Cidon I., Sidi M. Two Priority Buffered Multistage Interconnection Networks. In: IEEE High Performance Switching and Routing Conference (HPSR 04).

[40] Sharma N. Review of recent shared memory based ATM switches. Computer Communications, 22.

[41] Sivaram R., Panda D., Stunkel C. Efficient broadcast and multicast on multistage interconnection networks using multiport encoding. IEEE Transactions on Parallel and Distributed Systems, 9(10), October.

[42] Soumiya T., Nakamichi K., Kakuma S., Hatano T., Hakata A. The large capacity ATM backbone switch FETEX-150 ESP. Computer Networks, 31(6).

[43] Stevens W.R. TCP/IP Illustrated, Volume 1: The Protocols. 10th Edition, Addison-Wesley.

[44] Theimer T., Rathgeb E., Huber M. Performance Analysis of Buffered Banyan Networks. IEEE Transactions on Communications, 39(2).

[45] Torrellas J., Zhang Z. The Performance of the Cedar Multistage Switching Network. IEEE Transactions on Parallel and Distributed Systems, 8(4).

[46] Tse E.S.H. Switch fabric architecture analysis for a scalable bi-directionally reconfigurable IP router. Journal of Systems Architecture: the EUROMICRO Journal, 50(1).

[47] Turner J., Melen R. Multirate Clos Networks. IEEE Communications Magazine, 41(10).

[48] Tutsch D., Hommel G. Comparing Switch and Buffer Sizes of Multistage Interconnection Networks in Case of Multicast Traffic. In: Procs. of the High Performance Computing Symposium (HPC 2002), San Diego, SCS.

[49] Tutsch D., Hendler M., Hommel G. Multicast Performance of Multistage Interconnection Networks with Shared Buffering. In: Procs. of ICN 2001, LNCS 2093.

[50] Tutsch D., Hommel G. Multilayer Multistage Interconnection Networks. In: Procs. of the 2003 Design, Analysis, and Simulation of Distributed Systems conference (DASD 03), Orlando, USA.

[51] Tutsch D., Hommel G. Performance of buffered multistage interconnection networks in case of packet multicasting. In: Procs. of Advances in Parallel and Distributed Computing, Mar.

[52] Tutsch D., Brenner M. MIN Simulate: A Multistage Interconnection Network Simulator. In: 17th European Simulation Multiconference: Foundations for Successful Modelling & Simulation (ESM 03), Nottingham, SCS.

[53] Tutsch D., Hommel G. Generating Systems of Equations for Performance Evaluation of Buffered Multistage Interconnection Networks. Journal of Parallel and Distributed Computing, 62(2).

[54] Upfal E., Felperin S., Snir M. Randomized routing with shorter paths. In: Proceedings of the 5th ACM Symposium on Parallel Systems, 1993.

[55] Vasiliadis D., Rizos G., Vassilakis C., Glavas E. Performance evaluation of two-priority network schema for single-buffered Delta Network. In: Procs. of IEEE PIMRC 07, Sep.

[56] Vasiliadis D., Rizos G., Vassilakis C., Glavas E. Performance Evaluation of Multicast Routing over Multilayer Multistage Interconnection Network. In: Procs. of AICT 09, IEEE Press, 2009.

[57] Vasiliadis D., Rizos G., Vassilakis C. Performance Analysis of Blocking Banyan Switches. In: Procs. of CISSE 06, December.

[58] Zhou B., Atiquzzaman M. Performance of output-multibuffered multistage interconnection networks under general traffic patterns. In: IEEE INFOCOM, 1994.

[59] Zhou B., Atiquzzaman M. Impact of switch architectures on the performance of multistage interconnection networks. In: IEEE TENCON: Region 10 Ninth Annual International Conference, Singapore.

Author's Publications

Simulation for Multistage Interconnection Networks using relaxed blocking model, D.C. Vasiliadis and G.E. Rizos, in proceedings of the ICCMSE 2006 International Conference of Computational Methods in Sciences and Engineering, Chania, Greece, October 2006, vol. 7.

Performance Analysis of Multistage Interconnection Networks determining optimal parameter values for data intensive applications, D.C. Vasiliadis, G.E. Rizos, and C. Vassilakis, in proceedings of the IBIMA 2006 International Conference on Internet & Information Systems in the Digital Age, Brescia, Italy, December 2006.

Performance Analysis of blocking Banyan Switches, D.C. Vasiliadis, G.E. Rizos, and C. Vassilakis, in proceedings of the IEEE-sponsored International Joint Conference on Telecommunications and Networking TeNe 06, December 2006, Springer Press.

Performance Analysis of dual priority single-buffered blocking Multistage Interconnection Networks, D.C. Vasiliadis, G.E. Rizos, and C. Vassilakis, in the proceedings of the Third International Conference on Networking and Services (ICNS 07), IEEE Computer Society Press, posted in IEEE Digital Library, Athens, Greece, June 2007, art. no.

Performance evaluation of two-priority network schema for single-buffered Delta Networks, D.C. Vasiliadis, G.E. Rizos, C. Vassilakis, and E. Glavas, in the proceedings of the 18th Annual IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC 2007), IEEE Computer Society Press, posted in IEEE Digital Library, Athens, Greece, September 2007, art. no.

The role of priority mechanisms on performance metrics of double-buffered Switching Elements, D.C. Vasiliadis, G.E. Rizos, and C. Vassilakis, in the proceedings of the International Conference of the AIP (American Institute of Physics), December 2007, CP 963, vol. 2, part B.

Improving performance of finite-buffered blocking Delta Networks with 2-class priority routing through asymmetric-sized buffer queues, D.C. Vasiliadis, G.E. Rizos, and

C. Vassilakis, in the proceedings of the Fourth Advanced International Conference on Telecommunications (AICT 2008), IEEE Computer Society Press, posted in IEEE Digital Library, Athens, Greece, June 2008.

Routing and Performance Analysis of Double-Buffered Omega Networks Supporting Multi-Class Priority Traffic, D.C. Vasiliadis, G.E. Rizos, C. Vassilakis, and E. Glavas, in the proceedings of the Third International Conference on Systems and Networks Communications (ICSNC 2008), IEEE Computer Society Press, posted in IEEE Digital Library, Sliema, Malta, October 2008.

Performance Evaluation of Multicast Routing over Multilayer Multistage Interconnection Networks, D.C. Vasiliadis, G.E. Rizos, C. Vassilakis, and E. Glavas, in the proceedings of the Fifth Advanced International Conference on Telecommunications (AICT 2009), IEEE Computer Society Press, posted in IEEE Digital Library, Venice, Italy, May 2009.

Routing and Performance Evaluation of Dual Priority Delta Networks under Hotspot Environment, D.C. Vasiliadis, G.E. Rizos, and C. Vassilakis, in the proceedings of the First International Conference on Advances in Future Internet (AFIN 2009), IEEE Computer Society Press, posted in IEEE Digital Library, Athens, Greece, June 2009.

Modelling and performance evaluation of a novel internal priority routing scheme for finite-buffered multistage interconnection networks, D.C. Vasiliadis, G.E. Rizos, C. Vassilakis, and E. Glavas, in the Elsevier Journal of Mathematical and Computer Modelling, Elsevier Press (submitted).

Modelling and performance study of finite-buffered blocking Multistage Interconnection Networks supporting natively 2-class priority routing traffic, D.C. Vasiliadis, G.E. Rizos, and C. Vassilakis, in the Elsevier Journal of Mathematical and Computer Modelling, Elsevier Press (submitted).

Performance Analysis of Dual-Priority Multilayer Multistage Interconnection Networks under Multicast Environment, D.C. Vasiliadis, G.E. Rizos, and C. Vassilakis, in the Journal of Networks (submitted).

Short CV

Dimitris Vasiliadis was born on June 27, 1966 in Arta, Greece. In September 1984 he entered the Department of Computer Engineering and Informatics, School of Engineering, University of Patras, and he received his Diploma as a Computer Engineer in June. He is currently the System Administrator of the Network Operations Center (NOC) of the Technological Educational Institute (T.E.I.) of Epirus, a node of the Greek Research and Technology Network (GRNET). He has published several papers. His research interests include Performance Analysis of Networking and Computer Systems, Computer Networks and Protocols, Telematics, QoS and New Services.


More information

Voice of the Church. Fr. George L. Livanos, Protopresbyter [email protected]. Office Hours: Monday -Friday, 11:00 a.m. to 3:00 p.m.

Voice of the Church. Fr. George L. Livanos, Protopresbyter frgeorge@allsaintscbg.org. Office Hours: Monday -Friday, 11:00 a.m. to 3:00 p.m. Parish Web Site www.allsaintscbg.org Voice of the Church All Saints Greek Orthodox Church 601 W. McMurray Road Canonsburg, PA 15317-2437 Office: (724) 745-5205 FAX: (724) 746-0999 Hall: (724) 745-2249

More information

ORESTES VLASTOS Ltd.

ORESTES VLASTOS Ltd. ORESTES VLASTOS Ltd. Present: AUCTION SALE Nr. 267 Saturday March 13 th, 2010 at 19:00 At the Πρώην HOLIDAY INN - Ex HOLIDAY INN 40, MICHALAKOPOULOU Str. - ATHENS 115 28 - GREECE tel. +30-210.72.78.000

More information

Chapter 4 Multi-Stage Interconnection Networks The general concept of the multi-stage interconnection network, together with its routing properties, have been used in the preceding chapter to describe

More information

åëëçíéêç öáñìáêåõôéêç åôáéñåéá

åëëçíéêç öáñìáêåõôéêç åôáéñåéá ç 3 Ðåñßïäïò ï 1 Ôåý ïò ÉáíïõÜñéïò - Áðñßëéïò 2014 rd 3 Period st 1 ISSUE January - April 2014 åëëçíéêç Archeia Pharmakeftikis åëëçíéêç ÁÑ ÅÉÁ ÖÁÑÌÁÊÅÕÔÉÊÇÓ ARCHEIA PHARMAKEFTIKIS ÄÉÅÕÈÕÍÔÇÓ ÓÕÍÔÁÎÇÓ:

More information

HELIOS #19 CLEARWATER CHAPTER NEWS. Community and Church Leader

HELIOS #19 CLEARWATER CHAPTER NEWS. Community and Church Leader CHAPTER NEWS HELIOS #19 X CLEARWATER Louis A. Thomas (ΤΣΟΥΝΟΣ), Community and Church Leader Louis A. Thomas (Τσουνος), a community and Greek Orthodox Church leader, died in Clearwater, Fl, on Monday, November

More information

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere!

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere! Interconnection Networks Interconnection Networks Interconnection networks are used everywhere! Supercomputers connecting the processors Routers connecting the ports can consider a router as a parallel

More information

De novo membranous glomerulonephritis in a kidney graft recipient

De novo membranous glomerulonephritis in a kidney graft recipient HIPPOKRATIA 2002, 6, 4: 171-176 CASE REPORT De novo membranous glomerulonephritis in a kidney graft recipient Miserlis Gr, Vergoulas G, Leontsini M 1, Papagiannis A, Papanikolaou V, Gakis D, Takoudas D,

More information

FOLMAR A Corrosion Protection solution for drinking water distribution piping systems

FOLMAR A Corrosion Protection solution for drinking water distribution piping systems FOLMAR A Corrosion Protection solution for drinking water distribution piping systems www.mosslein.com FOLMAR A Corrosion Protection solution for drinking water distribution piping systems With and without

More information

How To Teach English To Young People

How To Teach English To Young People Contents Theodore Karstsiotis 3 For an Editorial Petros Zagliverinos 4 Events Linda Lonon Blanton & Camilla Ralls 9 Principles of English Language Teaching to Young Learners: a Schematic Description Hara

More information

Voice Over IP. MultiFlow 5048. IP Phone # 3071 Subnet # 10.100.24.0 Subnet Mask 255.255.255.0 IP address 10.100.24.171. Telephone.

Voice Over IP. MultiFlow 5048. IP Phone # 3071 Subnet # 10.100.24.0 Subnet Mask 255.255.255.0 IP address 10.100.24.171. Telephone. Anritsu Network Solutions Voice Over IP Application Note MultiFlow 5048 CALL Manager Serv # 10.100.27 255.255.2 IP address 10.100.27.4 OC-48 Link 255 255 25 IP add Introduction Voice communications over

More information

Behavior Analysis of Multilayer Multistage Interconnection Network With Extra Stages

Behavior Analysis of Multilayer Multistage Interconnection Network With Extra Stages Behavior Analysis of Multilayer Multistage Interconnection Network With Extra Stages Thesis submitted in partial fulfillment of the requirements for the award of degree of Master of Engineering in Computer

More information

DECO Led. BAG Led. 45 o. Äåí ðåñéý åôáé ôñïöïäïôéêü. Driver not included. 1 ôì. 74711471 power led 1x3 Watt white 6400Ê 45 nickel satin spot.

DECO Led. BAG Led. 45 o. Äåí ðåñéý åôáé ôñïöïäïôéêü. Driver not included. 1 ôì. 74711471 power led 1x3 Watt white 6400Ê 45 nickel satin spot. Äåí ðåñéý åôáé ôñöäôéêü. Driver nt included. 771171 pwer led 1x3 Watt white 600Ê 5 nickel satin spt. 771172 pwer led 1x3 Watt warm white 3500Ê 5 nickel satin spt. Luminus flux 75-85 lumens with 0,3m cable.

More information

Immunotherapy with an oral bacterial extract for urinary tract infections

Immunotherapy with an oral bacterial extract for urinary tract infections HIPPOKRATIA 2004, 8, 4: 161-165 HIPPOKRATIA 2004, 8, 4 161 ORIGINAL ARTICLE Immunotherapy with an oral bacterial extract for urinary tract infections Malliara M General Hospital of Katerini, Katerini,

More information

Hyper Node Torus: A New Interconnection Network for High Speed Packet Processors

Hyper Node Torus: A New Interconnection Network for High Speed Packet Processors 2011 International Symposium on Computer Networks and Distributed Systems (CNDS), February 23-24, 2011 Hyper Node Torus: A New Interconnection Network for High Speed Packet Processors Atefeh Khosravi,

More information

Transport and Network Layer

Transport and Network Layer Transport and Network Layer 1 Introduction Responsible for moving messages from end-to-end in a network Closely tied together TCP/IP: most commonly used protocol o Used in Internet o Compatible with a

More information

Instruction Manual Manuel d'instruction Âéâëéï Ïäçãéùí

Instruction Manual Manuel d'instruction Âéâëéï Ïäçãéùí Instruction Manual Manuel d'instruction Âéâëéï Ïäçãéùí When using an electrical appliance, basic safety should always be followed, including the following: Read all instructions before using this sewing

More information

ECE 358: Computer Networks. Solutions to Homework #4. Chapter 4 - The Network Layer

ECE 358: Computer Networks. Solutions to Homework #4. Chapter 4 - The Network Layer ECE 358: Computer Networks Solutions to Homework #4 Chapter 4 - The Network Layer P 4. Consider the network below. a. Suppose that this network is a datagram network. Show the forwarding table in router

More information

ORIGINAL PAPER. The relationship of serum ferritin level with cardiovascular risk factors in healthy men ... ...

ORIGINAL PAPER. The relationship of serum ferritin level with cardiovascular risk factors in healthy men ... ... Copyright Athens Medical Society FERRITIN AND CARDIOVASCULAR RISK FACTORS www.mednet.gr/archives 573 ARCHIVES OF HELLENIC MEDICINE: ISSN 11-05-3992 ORIGINAL PAPER ARCHIVES OF HELLENIC MEDICINE 2007, 24(6):573

More information

Network Simulation Traffic, Paths and Impairment

Network Simulation Traffic, Paths and Impairment Network Simulation Traffic, Paths and Impairment Summary Network simulation software and hardware appliances can emulate networks and network hardware. Wide Area Network (WAN) emulation, by simulating

More information

Chapter 2. Multiprocessors Interconnection Networks

Chapter 2. Multiprocessors Interconnection Networks Chapter 2 Multiprocessors Interconnection Networks 2.1 Taxonomy Interconnection Network Static Dynamic 1-D 2-D HC Bus-based Switch-based Single Multiple SS MS Crossbar 2.2 Bus-Based Dynamic Single Bus

More information

Service Definition. Internet Service. Introduction. Product Overview. Service Specification

Service Definition. Internet Service. Introduction. Product Overview. Service Specification Service Definition Introduction This Service Definition describes Nexium s from the customer s perspective. In this document the product is described in terms of an overview, service specification, service

More information

Configuring a Load-Balancing Scheme

Configuring a Load-Balancing Scheme Configuring a Load-Balancing Scheme Last Updated: October 5, 2011 This module contains information about Cisco Express Forwarding and describes the tasks for configuring a load-balancing scheme for Cisco

More information

Layer 3 Network + Dedicated Internet Connectivity

Layer 3 Network + Dedicated Internet Connectivity Layer 3 Network + Dedicated Internet Connectivity Client: One of the IT Departments in a Northern State Customer's requirement: The customer wanted to establish CAN connectivity (Campus Area Network) for

More information

How To Send Video At 8Mbps On A Network (Mpv) At A Faster Speed (Mpb) At Lower Cost (Mpg) At Higher Speed (Mpl) At Faster Speed On A Computer (Mpf) At The

How To Send Video At 8Mbps On A Network (Mpv) At A Faster Speed (Mpb) At Lower Cost (Mpg) At Higher Speed (Mpl) At Faster Speed On A Computer (Mpf) At The Will MPEG Video Kill Your Network? The thought that more bandwidth will cure network ills is an illusion like the thought that more money will ensure human happiness. Certainly more is better. But when

More information

System Interconnect Architectures. Goals and Analysis. Network Properties and Routing. Terminology - 2. Terminology - 1

System Interconnect Architectures. Goals and Analysis. Network Properties and Routing. Terminology - 2. Terminology - 1 System Interconnect Architectures CSCI 8150 Advanced Computer Architecture Hwang, Chapter 2 Program and Network Properties 2.4 System Interconnect Architectures Direct networks for static connections Indirect

More information

Improving Quality of Service

Improving Quality of Service Improving Quality of Service Using Dell PowerConnect 6024/6024F Switches Quality of service (QoS) mechanisms classify and prioritize network traffic to improve throughput. This article explains the basic

More information

Switch Fabric Implementation Using Shared Memory

Switch Fabric Implementation Using Shared Memory Order this document by /D Switch Fabric Implementation Using Shared Memory Prepared by: Lakshmi Mandyam and B. Kinney INTRODUCTION Whether it be for the World Wide Web or for an intra office network, today

More information

earworms 200+ essential words and phrases anchored into your long-term memory with great music Your personal audio language trainer Vol.

earworms 200+ essential words and phrases anchored into your long-term memory with great music Your personal audio language trainer Vol. DVD_Booklet_Greek.qxp 29.12.2005 10:17 earworms Seite 1 mbt Musical Brain Trainer Vol. 1 200+ essential words and phrases anchored into your long-term memory with great music Your personal audio language

More information

Leased Line + Remote Dial-in connectivity

Leased Line + Remote Dial-in connectivity Leased Line + Remote Dial-in connectivity Client: One of the TELCO offices in a Southern state. The customer wanted to establish WAN Connectivity between central location and 10 remote locations. The customer

More information

Introduce Quality of Service in your IP_to_IP W@N unreliable infrastructure

Introduce Quality of Service in your IP_to_IP W@N unreliable infrastructure Introduce Quality of Service in your IP_to_IP W@N unreliable infrastructure QV QoS Proxy server The QV-PROXY supplies two features : introduction of Error Correction mechanism improving the QoS and delivery

More information

TÓPICOS AVANÇADOS EM REDES ADVANCED TOPICS IN NETWORKS

TÓPICOS AVANÇADOS EM REDES ADVANCED TOPICS IN NETWORKS Mestrado em Engenharia de Redes de Comunicações TÓPICOS AVANÇADOS EM REDES ADVANCED TOPICS IN NETWORKS 2009-2010 Projecto de Rede / Sistema - Network / System Design 1 Hierarchical Network Design 2 Hierarchical

More information

Introduction to IP v6

Introduction to IP v6 IP v 1-3: defined and replaced Introduction to IP v6 IP v4 - current version; 20 years old IP v5 - streams protocol IP v6 - replacement for IP v4 During developments it was called IPng - Next Generation

More information

MULTISTAGE INTERCONNECTION NETWORKS: A TRANSITION TO OPTICAL

MULTISTAGE INTERCONNECTION NETWORKS: A TRANSITION TO OPTICAL MULTISTAGE INTERCONNECTION NETWORKS: A TRANSITION TO OPTICAL Sandeep Kumar 1, Arpit Kumar 2 1 Sekhawati Engg. College, Dundlod, Dist. - Jhunjhunu (Raj.), [email protected], 2 KIIT, Gurgaon (HR.), Abstract

More information

ÔÏÌÏÓ ÔÅÕ ÏÓ VOLUME 24 NUMBERIII ÉÏÕËÉÏÓ - ÓÅÐÔÅÌÂÑÉÏÓ JULY - SEPTEMBER

ÔÏÌÏÓ ÔÅÕ ÏÓ VOLUME 24 NUMBERIII ÉÏÕËÉÏÓ - ÓÅÐÔÅÌÂÑÉÏÓ JULY - SEPTEMBER ÔÏÌÏÓ ÔÅÕ ÏÓ VOLUME 24 NUMBERIII ÉÏÕËÉÏÓ - ÓÅÐÔÅÌÂÑÉÏÓ JULY - SEPTEMBER 2012 . STEV EVIA ww w. swee eete test ev ia.g.gr. ÖÁÑÌÁÊÅÕÔÉÊÇ Ôñéìçíéáßá Ýêäïóç ìå èýìáôá Öáñìáêåõôéêþí Åðéóôçìþí Ôüìïò 24, Ôåý

More information

CMA5000 SPECIFICATIONS. 5710 Gigabit Ethernet Module

CMA5000 SPECIFICATIONS. 5710 Gigabit Ethernet Module CMA5000 5710 Gigabit Ethernet Module SPECIFICATIONS General Description The CMA5710 Gigabit Ethernet application is a single slot module that can be used in any CMA 5000. The Gigabit Ethernet test module

More information

Voice over IP: RTP/RTCP The transport layer

Voice over IP: RTP/RTCP The transport layer Advanced Networking Voice over IP: /RTCP The transport layer Renato Lo Cigno Requirements For Real-Time Transmission Need to emulate conventional telephone system Isochronous output timing same with input

More information

Parallel Programming

Parallel Programming Parallel Programming Parallel Architectures Diego Fabregat-Traver and Prof. Paolo Bientinesi HPAC, RWTH Aachen [email protected] WS15/16 Parallel Architectures Acknowledgements Prof. Felix

More information

Communication Networks. MAP-TELE 2011/12 José Ruela

Communication Networks. MAP-TELE 2011/12 José Ruela Communication Networks MAP-TELE 2011/12 José Ruela Network basic mechanisms Introduction to Communications Networks Communications networks Communications networks are used to transport information (data)

More information

Architecture of distributed network processors: specifics of application in information security systems

Architecture of distributed network processors: specifics of application in information security systems Architecture of distributed network processors: specifics of application in information security systems V.Zaborovsky, Politechnical University, Sait-Petersburg, Russia [email protected] 1. Introduction Modern

More information

EXPLORER. TFT Filter CONFIGURATION

EXPLORER. TFT Filter CONFIGURATION EXPLORER TFT Filter Configuration Page 1 of 9 EXPLORER TFT Filter CONFIGURATION Thrane & Thrane Author: HenrikMøller Rev. PA4 Page 1 6/15/2006 EXPLORER TFT Filter Configuration Page 2 of 9 1 Table of Content

More information

IT4405 Computer Networks (Compulsory)

IT4405 Computer Networks (Compulsory) IT4405 Computer Networks (Compulsory) INTRODUCTION This course provides a comprehensive insight into the fundamental concepts in data communications, computer network systems and protocols both fixed and

More information

Ê.Ð.Ã. PRACTICE TEST B1& B2 ÊÑÁÔÉÊÏ ÐÉÓÔÏÐÏÉÇÔÉÊÏ ÃËÙÓÓÏÌÁÈÅÉÁÓ LEVEL ATTENTION ENGLISH LANGUAGE CERTIFICATION

Ê.Ð.Ã. PRACTICE TEST B1& B2 ÊÑÁÔÉÊÏ ÐÉÓÔÏÐÏÉÇÔÉÊÏ ÃËÙÓÓÏÌÁÈÅÉÁÓ LEVEL ATTENTION ENGLISH LANGUAGE CERTIFICATION Ê.Ð.Ã. ÊÑÁÔÉÊÏ ÐÉÓÔÏÐÏÉÇÔÉÊÏ ÃËÙÓÓÏÌÁÈÅÉÁÓ ENGLISH LANGUAGE CERTIFICATION LEVEL B1& B2 on the scale set by the Council of Europe PRACTICE TEST 1 l Mark your answers on your answer sheet. l Respond to all

More information

Chapter 5. Data Communication And Internet Technology

Chapter 5. Data Communication And Internet Technology Chapter 5 Data Communication And Internet Technology Purpose Understand the fundamental networking concepts Agenda Network Concepts Communication Protocol TCP/IP-OSI Architecture Network Types LAN WAN

More information

TÓPICOS AVANÇADOS EM REDES ADVANCED TOPICS IN NETWORKS

TÓPICOS AVANÇADOS EM REDES ADVANCED TOPICS IN NETWORKS Mestrado em Engenharia de Redes de Comunicações TÓPICOS AVANÇADOS EM REDES ADVANCED TOPICS IN NETWORKS 2008-2009 Exemplos de Projecto - Network Design Examples 1 Hierarchical Network Design 2 Hierarchical

More information

Computer Networks Vs. Distributed Systems

Computer Networks Vs. Distributed Systems Computer Networks Vs. Distributed Systems Computer Networks: A computer network is an interconnected collection of autonomous computers able to exchange information. A computer network usually require

More information

TRILL Large Layer 2 Network Solution

TRILL Large Layer 2 Network Solution TRILL Large Layer 2 Network Solution Contents 1 Network Architecture Requirements of Data Centers in the Cloud Computing Era... 3 2 TRILL Characteristics... 5 3 Huawei TRILL-based Large Layer 2 Network

More information

How To Configure InterVLAN Routing on Layer 3 Switches

How To Configure InterVLAN Routing on Layer 3 Switches How To Configure InterVLAN Routing on Layer 3 Switches Document ID: 41860 Contents Introduction Prerequisites Requirements Components Used Conventions Configure InterVLAN Routing Task Step by Step Instructions

More information

Scaling 10Gb/s Clustering at Wire-Speed

Scaling 10Gb/s Clustering at Wire-Speed Scaling 10Gb/s Clustering at Wire-Speed InfiniBand offers cost-effective wire-speed scaling with deterministic performance Mellanox Technologies Inc. 2900 Stender Way, Santa Clara, CA 95054 Tel: 408-970-3400

More information

Development of the FITELnet-G20 Metro Edge Router

Development of the FITELnet-G20 Metro Edge Router Development of the Metro Edge Router by Tomoyuki Fukunaga * With the increasing use of broadband Internet, it is to be expected that fiber-tothe-home (FTTH) service will expand as the means of providing

More information

Evolution of telecom network infrastructure for broadcast and interactive applications

Evolution of telecom network infrastructure for broadcast and interactive applications Evolution of telecom network infrastructure for broadcast and interactive applications Fabio Tassara Business Development Director Alcatel-Lucent Optics Networks 2007 To IP and beyond! European Broadcasting

More information

Unit of Learning # 2 The Physical Layer. Sergio Guíñez Molinos [email protected] 2-2009

Unit of Learning # 2 The Physical Layer. Sergio Guíñez Molinos sguinez@utalca.cl 2-2009 Unit of Learning # 2 The Physical Layer Sergio Guíñez Molinos [email protected] 2-2009 Local Area Network (LAN) Redes de Computadores 2 Historic topologies more used in LAN Ethernet Logical Bus and Physical

More information

Advanced Networking Voice over IP: RTP/RTCP The transport layer

Advanced Networking Voice over IP: RTP/RTCP The transport layer Advanced Networking Voice over IP: RTP/RTCP The transport layer Renato Lo Cigno Requirements For Real-Time Transmission Need to emulate conventional telephone system Isochronous output timing same with

More information

Architectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng

Architectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng Architectural Level Power Consumption of Network Presenter: YUAN Zheng Why Architectural Low Power Design? High-speed and large volume communication among different parts on a chip Problem: Power consumption

More information

Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip

Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip Ms Lavanya Thunuguntla 1, Saritha Sapa 2 1 Associate Professor, Department of ECE, HITAM, Telangana

More information

How To Provide Qos Based Routing In The Internet

How To Provide Qos Based Routing In The Internet CHAPTER 2 QoS ROUTING AND ITS ROLE IN QOS PARADIGM 22 QoS ROUTING AND ITS ROLE IN QOS PARADIGM 2.1 INTRODUCTION As the main emphasis of the present research work is on achieving QoS in routing, hence this

More information

Achieving Low-Latency Security

Achieving Low-Latency Security Achieving Low-Latency Security In Today's Competitive, Regulatory and High-Speed Transaction Environment Darren Turnbull, VP Strategic Solutions - Fortinet Agenda 1 2 3 Firewall Architecture Typical Requirements

More information

A NOVEL RESOURCE EFFICIENT DMMS APPROACH

A NOVEL RESOURCE EFFICIENT DMMS APPROACH A NOVEL RESOURCE EFFICIENT DMMS APPROACH FOR NETWORK MONITORING AND CONTROLLING FUNCTIONS Golam R. Khan 1, Sharmistha Khan 2, Dhadesugoor R. Vaman 3, and Suxia Cui 4 Department of Electrical and Computer

More information

Microsoft SQL Server 2012 on Cisco UCS with iscsi-based Storage Access in VMware ESX Virtualization Environment: Performance Study

Microsoft SQL Server 2012 on Cisco UCS with iscsi-based Storage Access in VMware ESX Virtualization Environment: Performance Study White Paper Microsoft SQL Server 2012 on Cisco UCS with iscsi-based Storage Access in VMware ESX Virtualization Environment: Performance Study 2012 Cisco and/or its affiliates. All rights reserved. This

More information

Fundamentals of MPLS for Broadcast Applications

Fundamentals of MPLS for Broadcast Applications Fundamentals of MPLS for Broadcast Applications Ron Clifton, P. Eng., BAS c, MAS c CliftonGroup International Limited Page: 1 The Paradigm Shift The connectivity and technology exist today to implement

More information

Quality of Service Routing Network and Performance Evaluation*

Quality of Service Routing Network and Performance Evaluation* Quality of Service Routing Network and Performance Evaluation* Shen Lin, Cui Yong, Xu Ming-wei, and Xu Ke Department of Computer Science, Tsinghua University, Beijing, P.R.China, 100084 {shenlin, cy, xmw,

More information

Virtual PortChannels: Building Networks without Spanning Tree Protocol

Virtual PortChannels: Building Networks without Spanning Tree Protocol . White Paper Virtual PortChannels: Building Networks without Spanning Tree Protocol What You Will Learn This document provides an in-depth look at Cisco's virtual PortChannel (vpc) technology, as developed

More information

IP SAN Best Practices

IP SAN Best Practices IP SAN Best Practices A Dell Technical White Paper PowerVault MD3200i Storage Arrays THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES.

More information

Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip

Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip Cristina SILVANO [email protected] Politecnico di Milano, Milano (Italy) Talk Outline

More information

ΤΕΙ Κρήτης, Παράρτηµα Χανίων

ΤΕΙ Κρήτης, Παράρτηµα Χανίων ΤΕΙ Κρήτης, Παράρτηµα Χανίων ΠΣΕ, Τµήµα Τηλεπικοινωνιών & ικτύων Η/Υ Εργαστήριο ιαδίκτυα & Ενδοδίκτυα Η/Υ Modeling Wide Area Networks (WANs) ρ Θεοδώρου Παύλος Χανιά 2003 8. Modeling Wide Area Networks

More information