Chapter 2: Representation of Multimedia Data
Chapter 3: Multimedia Systems - Communication Aspects and Services
Chapter 4: Multimedia Systems - Storage Aspects
  Optical Storage Media
  Multimedia File Systems
4.2: Multimedia File Systems
  Traditional File Systems
  Multimedia File Systems
  Disk Scheduling
Why Multimedia File Systems?
Heterogeneous data types, including digital audio, animations and video
Enormous storage space requirements
Media are delay-sensitive: when a user plays back or records a time-dependent multimedia data object, the system must consume or produce data at a constant rate
High demands on access to the hard disk
A new multimedia-enabled file system is needed in two respects:
  Organization of media content on the server
  Scheduling strategies for access to the data
Disk Layout
Lehrstuhl für Informatik 4
The layout of a disk determines
  the way in which content is addressed
  how much storage space on the media is actually addressable and usable
  the density of stored content on the media
Tracks and sectors
  A hard disk has one or more heads
  A hard disk is divided into tracks, and each track into sectors (512 bytes)
  The same track on all heads is called a cylinder
  Files are stored in units of sectors; unused space in a sector is wasted
  Easy mapping of file location information to head movement and disk rotation
  Constant angular velocity (CAV), i.e. the same access time to inner and outer tracks
  A sector is accessed by a movable disk arm
Disk Layout: Zone Bit Recording
In the conventional layout, a sector at an outer radius holds the same amount of data as an inner sector although it has more raw capacity. In principle, this space is lost.
The current solution is zone bit recording:
  Different read/write speeds depending on the radius, allowing a uniform sector size
  Place more popular media (movies) on outer tracks and less popular media on inner tracks to reduce the average seek time. This saves disk arm movements.
Now: how should files be placed on such a disk?
Use of Storage Medium
Important: reduce read and write times by
  fewer seek operations
  lower rotational delay (latency)
  a high actual data transfer rate (cannot be improved by placement)
Method: store data in a specific pattern
  Divide the file into blocks (single bytes or larger units)
  Store the blocks in certain patterns
Larger block size
  Fewer seek operations
  Smaller number of requests
  But higher loss of storage space due to internal fragmentation (on average, the last block of a file is only 50% used)
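The trade-off between block size and internal fragmentation can be made concrete with a small calculation. The following is a minimal sketch: the function name `blocks_and_waste` and the 1,000,000-byte file are assumptions chosen for illustration, not values from the lecture.

```python
def blocks_and_waste(file_size, block_size):
    """Return (number of blocks needed, unused bytes in the last block)."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks, blocks * block_size - file_size


FILE_SIZE = 1_000_000  # hypothetical file size in bytes

for block_size in (512, 4096, 65536):
    blocks, waste = blocks_and_waste(FILE_SIZE, block_size)
    print(f"block size {block_size:6d}: {blocks:5d} blocks, {waste:6d} bytes unused")
```

Larger blocks mean fewer requests for the same file, but the unused tail of the last block grows with the block size.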
Traditional File Systems - File Structure
How to place the records of a file?
  Contiguous placement
  Non-contiguous placement
[Figure: disk layout of the 1st, 2nd and 3rd file under contiguous and under non-contiguous placement]
Performance Consideration of File Structure
Contiguous placement:
  Disk access time for reading and writing is minimized
  Major disadvantage: file creation, deletion and size modification make this sequential storage difficult
Non-contiguous placement (two main approaches):
1. Linked allocation: pointers address the next block
  Fine for sequential access starting from the first pointer
  Random access is costly
  Long seek operations during playback
2. Indexed allocation: the links are stored in an index block
  Complex
  Performance depends on the index structure and the size of the file
[Figure: blocks 1 to 8 on disk under linked allocation (the first block points to the next block) and under indexed allocation (an index block lists the blocks of the file)]
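The difference in random-access cost between the two non-contiguous schemes can be sketched as follows. This is an illustration only: the block numbers, the dictionary of next-pointers and both function names are assumptions, and the sketch counts block reads instead of modelling a real disk.

```python
def linked_lookup(first_block, next_pointer, k):
    """Find the k-th block (0-based) of a file by following the chain of pointers.

    Returns (block number, number of pointers that had to be followed).
    """
    block, hops = first_block, 0
    for _ in range(k):
        block = next_pointer[block]  # each hop means reading one more block
        hops += 1
    return block, hops


def indexed_lookup(index_block, k):
    """Find the k-th block (0-based) with a single lookup in the index block."""
    return index_block[k], 1


# Hypothetical file occupying blocks 1, 2, 3, 8 and 6, in that order.
NEXT_POINTER = {1: 2, 2: 3, 3: 8, 8: 6}
INDEX_BLOCK = [1, 2, 3, 8, 6]

print(linked_lookup(1, NEXT_POINTER, 4))  # follows four pointers
print(indexed_lookup(INDEX_BLOCK, 4))     # one index access
```

With linked allocation the cost of reaching a block grows with its position in the file, whereas the index gives the same cost for every position as long as the index fits in one block.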
Traditional File Systems: Disk Management
Disk access is slow and costly: it is a major bottleneck
Techniques for reducing the overall disk access time:
  Block caches
    Keep blocks in memory for future use
    Reduces the number of disk accesses
  Reduce disk arm motion
    Blocks to be accessed in sequence are placed on the same cylinder
    Reduces the time for one disk access
  Take rotation into account by placing consecutive blocks in an interleaved manner
  Placement of mapping tables
    Mapping tables are placed in the middle of the disk
    Tables and the corresponding blocks are placed on the same cylinder
[Figure: interleaved storage and non-interleaved storage; heads may read in parallel]
Traditional File Systems: Disk Scheduling
In traditional file systems, efficient usage of storage capacity is the main goal.
The total time to service a request to a file in such a system consists of:
  Seek time: positioning the head on the appropriate track (diameter)
  Latency (rotation time): time to find the block in the track
  Actual data transfer time
Techniques to reduce the delay:
  Delay: seek operation. Technique: scheduling algorithms
  Delay: latency. Technique: file allocation methods
Next, we will consider strategies for minimizing the seek time, i.e. the time for positioning the head on the appropriate track.
Tracks are numbered 0, ..., N - 1. Here, 0 is the innermost and N - 1 the outermost track.
Disk Scheduling: First-Come-First-Served (FCFS)
Serve requests in order of arrival
Example scenario: tracks 0 to 198, head starts at track 51
Queue (order of successive requests): 108, 173, 31, 130, 13, 133, 63, 69
+ Easy to implement
+ Fair algorithm
- Not optimal: high average seek time
Overall movement for FCFS in this example scenario, counted in number of tracks visited: 676 (the sum of the eight seek distances 57 + 65 + 142 + 99 + 117 + 120 + 70 + 6)
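The FCFS movement can be recomputed directly from the example queue. This is a minimal sketch; the function name `fcfs_movement` is an assumption. Summing the individual seek distances for the queue as listed gives 676 tracks.

```python
def fcfs_movement(start, requests):
    """Total head movement (in tracks) when requests are served in arrival order."""
    total, position = 0, start
    for track in requests:
        total += abs(track - position)
        position = track
    return total


QUEUE = [108, 173, 31, 130, 13, 133, 63, 69]  # request queue of the example
HEAD_START = 51

print(fcfs_movement(HEAD_START, QUEUE))  # 676
```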
Disk Scheduling: Shortest-Seek-Time-First (SSTF)
SSTF = serve the nearest request
Example scenario: tracks 0 to 198, head starts at track 51
Queue: 108, 173, 31, 130, 13, 133, 63, 69
Optimal overall movement: 198 (from 51 down to 13, then up to 173)
SSTF movement: 234 (service order 63, 69, 31, 13, 108, 130, 133, 173)
+ Substantial improvement over FCFS
- Still not optimal
- Starvation (of some tracks, if there is always a track with a shorter seek time available)
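The greedy choice made by SSTF can be sketched in a few lines. The function name `sstf` is an assumption, and ties (which do not occur in this example) are broken in favour of the request that arrived first. For the queue as listed, the sketch yields a total movement of 234 tracks.

```python
def sstf(start, requests):
    """Serve the pending request nearest to the current head position each time.

    Returns (service order, total head movement in tracks).
    """
    pending = list(requests)
    position, order, total = start, [], 0
    while pending:
        nearest = min(pending, key=lambda track: abs(track - position))
        pending.remove(nearest)
        total += abs(nearest - position)
        position = nearest
        order.append(nearest)
    return order, total


QUEUE = [108, 173, 31, 130, 13, 133, 63, 69]  # request queue of the example

order, movement = sstf(51, QUEUE)
print(order)     # [63, 69, 31, 13, 108, 130, 133, 173]
print(movement)  # 234
```

The starvation problem is visible in the loop: a far-away request stays in `pending` for as long as closer requests keep arriving.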
Disk Scheduling: SCAN
SCAN = serve requests in one direction, then reverse the movement
Move from one end to the other, serving each request on the way
Example scenario: tracks 0 to 198, head starts at track 51
Queue: 108, 173, 31, 130, 13, 133, 63, 69
Overall movement SCAN: 224 (the head first moves down from 51 to track 0, then up to track 173)
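A sketch of SCAN that reproduces the 224 tracks of the example is given below. It rests on assumptions read off the figure: the head starts at track 51 moving downwards, it travels all the way to the edge before reversing, and the second sweep stops at the last pending request. The function name and parameters are hypothetical.

```python
def scan(start, requests, direction="down", lowest=0, highest=198):
    """SCAN: sweep in one direction to the edge, then reverse.

    Returns (service order, total head movement in tracks).
    """
    below = sorted((t for t in requests if t < start), reverse=True)
    above = sorted(t for t in requests if t >= start)
    if direction == "down":
        first, second, edge = below, above, lowest
    else:
        first, second, edge = above, below, highest
    order = first + second
    if second:
        # Go to the edge, then back to the farthest request of the second sweep.
        total = abs(start - edge) + abs(second[-1] - edge)
    elif first:
        # Assumption: with nothing to serve on the way back, stop at the last request.
        total = abs(start - first[-1])
    else:
        total = 0
    return order, total


QUEUE = [108, 173, 31, 130, 13, 133, 63, 69]  # request queue of the example

order, movement = scan(51, QUEUE, direction="down")
print(order)     # [31, 13, 63, 69, 108, 130, 133, 173]
print(movement)  # 224
```

Starting in the opposite direction changes the result considerably for this queue, which is why the initial direction has to be stated as part of the scenario.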
Disk Scheduling: Circular-SCAN (C-SCAN)
C-SCAN is similar to SCAN, but the head returns immediately to the beginning when the end is reached; there is one idle head movement from one edge to the other between two consecutive scans
Example scenario: tracks 0 to 198, head starts at track 51
Queue: 108, 173, 31, 130, 13, 133, 63, 69
+ Fair service: more uniform waiting time
+ Middle tracks do not get better service than edge tracks (as they do with SCAN or SSTF)
- Performance not as good as SCAN
Overall movement C-SCAN: 376 (from 51 up to 198, idle return to 0, then up to 31)
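The 376 tracks of the C-SCAN example can be reproduced with the sketch below. It assumes that the head sweeps upwards from track 51 to the last track (198), that the idle return to track 0 counts as movement, and that the next sweep ends at the last pending request. Names and default parameters are illustrative.

```python
def c_scan(start, requests, lowest=0, highest=198):
    """C-SCAN: serve requests on the upward sweep only, then jump back to the start.

    Returns (service order, total head movement in tracks, idle return included).
    """
    ahead = sorted(t for t in requests if t >= start)
    wrapped = sorted(t for t in requests if t < start)
    order = ahead + wrapped
    if wrapped:
        total = (highest - start) + (highest - lowest) + (wrapped[-1] - lowest)
    elif ahead:
        # Assumption: without wrapped requests the head stops at the last request.
        total = ahead[-1] - start
    else:
        total = 0
    return order, total


QUEUE = [108, 173, 31, 130, 13, 133, 63, 69]  # request queue of the example

order, movement = c_scan(51, QUEUE)
print(order)     # [63, 69, 108, 130, 133, 173, 13, 31]
print(movement)  # 376
```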
Multimedia File System
Requirements of continuous data:
  File size:
    Highly structured data units (e.g. video and associated audio)
    New organization policies for the data on disk
    Efficient usage of limited storage is necessary
  Multiple data streams:
    For example, retrieval of a movie requires the processing and synchronization of audio and video data
  File access:
    High, continuous throughput
    Short maximum (not average) response times
  Real-time characteristic:
    Stream play-out at a constant, gap-free rate; this requires additional buffers
Multimedia Disk Scheduling Algorithms
Restrictions of data placement: how to place media blocks?
Parameters
  $G$: size of a media block (granularity parameter)
  $S$: separation between successive blocks, in number of blocks (scattering parameter)
  $r_D$: data transfer rate from disk
  $r_C$: playback rate
Continuity requirement
  $$\frac{S + G}{r_D} \le \frac{G}{r_C}$$
  i.e. the time to skip over a gap and to read the next media block is smaller than or equal to the playback duration of a media block
Example: $G = 3$, $r_D = 2$ and $r_C = 0.5$ (rates in blocks per millisecond, so playing 3 blocks of data takes 6 milliseconds) results in
  $$\frac{S + 3}{2} \le \frac{3}{0.5} = 6 \;\Rightarrow\; S + 3 \le 12 \;\Rightarrow\; S \le 9$$
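Solving the continuity requirement for S gives S <= G * (r_D / r_C - 1). The sketch below checks the slide's numbers with exact rational arithmetic; the function names are assumptions introduced for illustration.

```python
from fractions import Fraction


def is_continuous(S, G, r_d, r_c):
    """Continuity requirement: (S + G) / r_d <= G / r_c."""
    return Fraction(S + G) / Fraction(r_d) <= Fraction(G) / Fraction(r_c)


def max_scattering(G, r_d, r_c):
    """Largest S that still satisfies the requirement: G * (r_d / r_c - 1)."""
    return Fraction(G) * (Fraction(r_d) / Fraction(r_c) - 1)


# Values from the example: G = 3, r_D = 2, r_C = 0.5
print(max_scattering(3, 2, Fraction(1, 2)))    # 9
print(is_continuous(9, 3, 2, Fraction(1, 2)))  # True
print(is_continuous(10, 3, 2, Fraction(1, 2))) # False
```

A faster disk (larger r_D) or a slower playback rate (smaller r_C) both allow the blocks of a stream to be scattered further apart.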
Disk Scheduling Algorithms
To fulfill the requirements of multimedia data, scheduling has a different focus than in traditional file systems.
Goals in traditional file systems:
  Reduce the cost of seek time (effective utilization of the disk arm)
  Achieve fair throughput
  Provide fair disk access
  Achieve short average response times
Goals in multimedia file systems are different:
  Meet the deadlines of all time-critical tasks
  Keep the necessary buffer space requirements low
  Find a balance between time constraints and efficiency
Disk Scheduling: Earliest Deadline First (EDF)
In EDF the block with the nearest deadline is read first. Requests with equal deadlines are served FCFS.
Example (requests in arrival order along the time axis t, written as deadline/track number):
  1/22, 2/40, 1/12, 1/45, 2/42, 3/50, 2/16, 3/30, 3/24
Tracks in the order in which EDF serves them: 22, 12, 45, 40, 42, 16, ...
Poor throughput due to excessive seek time: only deadlines are taken into account, not track numbers.
Very similar to FCFS: inefficient. Does not reflect the geographical position of tracks.
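EDF with FCFS tie-breaking amounts to a stable sort by deadline. The sketch below assumes that the requests of the example arrive in the order 1/22, 2/40, 1/12, 1/45, 2/42, 3/50, 2/16, 3/30, 3/24 (deadline/track), which is one reading of the figure that reproduces the service order 22, 12, 45, 40, 42, 16 shown there; the function name is hypothetical.

```python
def edf_order(requests):
    """Order (deadline, track) requests by deadline.

    Python's sort is stable, so requests with equal deadlines keep their
    arrival order, i.e. ties are broken FCFS.
    """
    return sorted(requests, key=lambda request: request[0])


# (deadline, track) in assumed arrival order
ARRIVALS = [(1, 22), (2, 40), (1, 12), (1, 45), (2, 42),
            (3, 50), (2, 16), (3, 30), (3, 24)]

tracks = [track for _, track in edf_order(ARRIVALS)]
print(tracks)  # [22, 12, 45, 40, 42, 16, 50, 30, 24]
```

The track number never enters the sort key, which is exactly why the head jumps back and forth (22, 12, 45, ...) and throughput suffers.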
Disk Scheduling: SCAN-EDF
SCAN-EDF is a combination of:
  Deadline scheduling (as in EDF, earlier deadlines are served first)
  Scanning (tasks with the same deadline are served according to the current scan direction)
Problem: SCAN (i.e. the use of the scanning direction as a tie-break among equal deadlines) does not make much sense if too many different deadlines exist
Thus: it has to be enforced that many requests have the same deadline
  To do so, all requests are collected into a few groups which can be scanned together
  We require that the deadlines $D_i$ are multiples of a common period $p$: $D_i = k_i \cdot p$ with $k_i \in \{1, 2, 3, \dots\}$
  Then requests with the same deadline can be grouped and served together by SCAN
Disk Scheduling: SCAN-EDF
Implementation of SCAN-EDF by perturbation of deadlines (in order to apply EDF)
Let $D_i$ be the deadline of task $i$ and $N_i$ its track number ($0 \le N_i < N_{max}$, e.g. $N_{max} = 100$)
Assume that $D_i \in \mathbb{N}$
Modify $D_i$ towards $D_i'$ (the perturbed deadline):
  $$D_i' = D_i + f(N_i)$$
$f(N_i)$ converts the track number of task $i$ into a small perturbation of the deadline such that for equal deadlines the scanning is applied automatically
If we choose (for example)
  $$f(N_i) = \frac{N_i}{N_{max}}, \qquad 0 \le f(N_i) < 1$$
then a task on track 42 with deadline 3 gets the perturbed deadline
  $$3 + \frac{42}{100} = 3.42$$
This deadline is given to the task at arrival time
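The perturbation can be written down in a few lines. This sketch uses exact fractions so that the comparison of perturbed deadlines is not affected by floating-point rounding; the function name and the three sample requests are illustrative assumptions.

```python
from fractions import Fraction

N_MAX = 100  # number of tracks, as in the example


def perturbed_deadline(deadline, track, n_max=N_MAX):
    """D_i' = D_i + N_i / N_max for an integer deadline D_i and 0 <= N_i < N_max."""
    if not 0 <= track < n_max:
        raise ValueError("track number out of range")
    return deadline + Fraction(track, n_max)


print(float(perturbed_deadline(3, 42)))  # 3.42

# Requests with the same deadline are ordered by track, i.e. in scan order.
same_deadline = [(2, 42), (2, 40), (2, 16)]
print(sorted(same_deadline, key=lambda r: perturbed_deadline(*r)))
# [(2, 16), (2, 40), (2, 42)]
```

Because the perturbation is always smaller than 1, it can never move a request past one with a later integer deadline.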
Disk Scheduling: SCAN-EDF
Example (entries written as perturbed deadline/track number; along the time axis t the rightmost entry arrived first):
  2.16/16, 3.50/50, 2.42/42, 1.45/45, 1.12/12, 2.40/40, 1.22/22
Queue contents and the track served in each step:
  1.12/12, 2.40/40, 1.22/22   served: 12
  1.45/45, 2.40/40, 1.22/22   served: 22
  2.42/42, 2.40/40, 1.45/45   served: 45
  3.50/50, 2.42/42, 2.40/40   served: 40
  2.16/16, 3.50/50, 2.42/42   served: 16
Deadline 1 covers the interval [1:2], deadline 2 the interval [2:3]
The request with the earliest deadline is served; among requests with the same deadline SCAN is applied
Sensible only for a large number of requests
The optimization only applies to requests with the same deadline before the decimal point; increase this probability by grouping the requests
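The example can be replayed with a small simulation. The arrival model is an assumption read off the figure: three requests are queued at the start and one further request arrives after each service. Under that assumption the sketch serves tracks 12, 22, 45, 40, 16 first, as in the example. Function names and the extra requests 3/30 and 3/24 (taken from the EDF example) are part of the assumption.

```python
import heapq
from fractions import Fraction

N_MAX = 100  # number of tracks


def perturbed(deadline, track):
    """Perturbed deadline D_i + N_i / N_max."""
    return deadline + Fraction(track, N_MAX)


def scan_edf(arrivals, initially_queued=3):
    """Serve the request with the smallest perturbed deadline.

    After every service one more request from `arrivals` enters the queue.
    Returns the tracks in service order.
    """
    queue = [(perturbed(d, t), t) for d, t in arrivals[:initially_queued]]
    heapq.heapify(queue)
    waiting = list(arrivals[initially_queued:])
    served = []
    while queue:
        _, track = heapq.heappop(queue)
        served.append(track)
        if waiting:
            d, t = waiting.pop(0)
            heapq.heappush(queue, (perturbed(d, t), t))
    return served


# (deadline, track) in assumed arrival order
ARRIVALS = [(1, 22), (2, 40), (1, 12), (1, 45), (2, 42),
            (3, 50), (2, 16), (3, 30), (3, 24)]

print(scan_edf(ARRIVALS))  # [12, 22, 45, 40, 16, 42, 24, 30, 50]
```

Track 40 is served before track 16 although both have deadline 2, because the request for track 16 had not yet arrived; this illustrates why the scheme is sensible only when many requests with the same deadline are queued at once.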
Disk Scheduling: EDF, SCAN-EDF
[Figure: comparison of SCAN-EDF and EDF for the example deadlines; axis from 0 to 50]
Disk Scheduling
A small variation of deadline perturbation: the actual deadline given to the task is refined by
  Taking into account the actual movement of the head at arrival time (i.e. upwards from 0 to $N_{max} - 1$ or downwards from $N_{max} - 1$ to 0)
  Considering the actual position $N$ of the head
The perturbed deadline for a task which resides on track $N_i$ is given by $D_i' = D_i + f(N_i)$, where:
  $$
  f(N_i) =
  \begin{cases}
  \dfrac{N_i - N}{N_{max}} & \text{if } N_i \ge N \text{ and the head moves upwards} \\
  \dfrac{N_{max} - N_i}{N_{max}} & \text{if } N_i < N \text{ and the head moves upwards} \\
  \dfrac{N_i}{N_{max}} & \text{if } N_i > N \text{ and the head moves downwards} \\
  \dfrac{N - N_i}{N_{max}} & \text{if } N_i \le N \text{ and the head moves downwards}
  \end{cases}
  $$
This allows new requests to be served as soon as possible
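The four cases can be checked with a short sketch. The function name, the boolean `moving_up` and the sample values (head on track 50, 100 tracks) are assumptions chosen for illustration.

```python
from fractions import Fraction


def perturbation(track, head, n_max, moving_up):
    """Direction-aware perturbation f(N_i) for a request on `track`.

    `head` is the current head position N, `moving_up` the current direction.
    """
    if moving_up:
        if track >= head:
            return Fraction(track - head, n_max)  # still ahead of the head
        return Fraction(n_max - track, n_max)     # reached after the reversal
    if track > head:
        return Fraction(track, n_max)             # reached after the reversal
    return Fraction(head - track, n_max)          # still ahead of the head


tracks = [40, 90, 60, 10]  # four requests with the same deadline

upwards = sorted(tracks, key=lambda t: perturbation(t, 50, 100, True))
downwards = sorted(tracks, key=lambda t: perturbation(t, 50, 100, False))
print(upwards)    # [60, 90, 40, 10]
print(downwards)  # [40, 10, 60, 90]
```

In both directions the tracks that the head reaches next on its current sweep receive the smallest perturbation, followed by the tracks served after the head has reversed.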
Group Sweeping Scheduling (GSS)
Requests are served in cycles in a round-robin manner
In one cycle, requests are divided into groups. A group is served according to SCAN
Service in a group may be in ascending or in descending order, depending on the other groups
Thus a smoothing buffer may be needed (to assure continuity)
Example cycle (requests written as deadline/track number, along the time axis t):
  3.4/24, 3.3/30, 2.0/16, 3.3/50, 2.2/42, 1.2/45, 1.4/12, 2.4/40, 1.1/22
  Group 1 (deadline 1.1): 1.4/12, 1.2/45, 1.1/22; SCAN 12, 22, 45 (ascending order) [in the next cycle: descending order]
  Group 2 (deadline 2.0): 2.0/16, 2.2/42, 2.4/40; SCAN 42, 40, 16 (descending order)
  Group 3 (deadline 3.3): 3.4/24, 3.3/30, 3.3/50; SCAN 24, 30, 50 (ascending order)
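One cycle of the example can be sketched as follows. The sketch assumes, as the example suggests, that the sweep direction simply alternates from one group to the next; the function name and the way groups are passed in are illustrative.

```python
def gss_cycle(groups, first_ascending=True):
    """Serve the groups in fixed order; inside a group apply SCAN.

    The sweep direction alternates between consecutive groups.
    Returns (list of per-group service orders, direction for the next cycle).
    """
    schedule, ascending = [], first_ascending
    for group in groups:
        schedule.append(sorted(group, reverse=not ascending))
        ascending = not ascending
    return schedule, ascending


# Track numbers of the three groups in the example cycle
GROUPS = [[12, 45, 22], [16, 42, 40], [24, 30, 50]]

schedule, next_ascending = gss_cycle(GROUPS)
print(schedule)        # [[12, 22, 45], [42, 40, 16], [24, 30, 50]]
print(next_ascending)  # False: group 1 is swept in descending order next cycle
```

With an odd number of groups the direction of each group flips from cycle to cycle, so the time between two services of the same stream varies; that variation is what the smoothing buffer has to absorb.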
Group Sweeping Scheduling (GSS)
A particular stream can be the first one in its group in a given cycle, but the last one in its group in the next cycle
This happens if the scan order is reversed, i.e. if we have an odd number of groups
Thus we need a smoothing buffer in order to achieve continuity of play-out
GSS is a trade-off between the optimization of buffer space and the optimization of arm movements
Group Sweeping Scheduling (GSS) - Mixed Strategy
The mixed strategy is a compromise between
  Shortest seek ("greedy")
  Balanced strategy
Data retrieved from disk are placed into buffers. Different queues are used for different data streams.
Shortest seek serves the stream whose data block is nearest
Balanced serves the stream which has the lowest buffer utilization (since this stream risks running out of data)
Group Sweeping Scheduling (GSS) - Mixed Strategy
The filling status of the buffers indicates when to switch from SSTF to Balanced and vice versa
Urgency criterion:
  $$\text{Urgency} = \sum_{\text{all streams } i} \frac{1}{\text{Fullness}_i}$$
If $\text{Fullness}_i$ is small, the urgency is high, and the balanced strategy should be used
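The switch between the two strategies can be sketched as below. The slides do not give a switching threshold, so `threshold` is a purely hypothetical parameter, as are the function names, the stream names and the representation of buffer fullness as a fraction between 0 (exclusive) and 1.

```python
def urgency(fullness):
    """Urgency = sum over all streams of 1 / Fullness_i (fullness must be > 0)."""
    return sum(1.0 / f for f in fullness.values())


def next_stream(head, next_track, fullness, threshold):
    """Pick the stream to serve next.

    head:       current head position (track number)
    next_track: stream name -> track of that stream's next block
    fullness:   stream name -> buffer fullness in (0, 1]
    threshold:  hypothetical urgency level above which Balanced is used
    """
    if urgency(fullness) > threshold:
        # Balanced: serve the stream whose buffer is emptiest.
        return min(fullness, key=fullness.get)
    # Shortest seek: serve the stream whose next block is nearest.
    return min(next_track, key=lambda s: abs(next_track[s] - head))


NEXT_TRACK = {"audio": 100, "video": 20}

print(next_stream(30, NEXT_TRACK, {"audio": 0.9, "video": 0.8}, threshold=5))  # video
print(next_stream(30, NEXT_TRACK, {"audio": 0.1, "video": 0.8}, threshold=5))  # audio
```

In the first call all buffers are well filled, so the nearest block wins; in the second call the nearly empty audio buffer drives the urgency above the threshold and the scheduler accepts the longer seek.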
Conclusion
Multimedia Systems
  is not only about the media format (MPEG, PCM, ...)
  also needs to consider how to store and access the media
  can be distributed: how to transmit the media over a network (this needs new user interfaces and programming concepts)
Multimedia in the future
  Distributed applications are becoming more important
  Need for portability and system independence
  Better support for user interactivity
  Variety of new (still undiscovered) application domains?