Managing Your Digital Assets John Zubrzycki BBC Research and Development
Your requirements
Why are you keeping Digital Assets? Re-use in new content productions? Playout / delivery to the public? Selling content professionally? Heritage preservation? Are you working with other organisations?
What are you keeping? Complete programmes Stock shots Original camera rushes Subtitles, audio description Documents, photos
Digital Library or Archive? Library: Managed storage for sharing with several/many users Archive: Long-term preservation Is there a distinction in your organisation?
Storage
Disk or Tape?
Disk: Quick access, slow read/write. Power on idle. Life: 5 years. Error rates:
  Desktop SATA: 1 sector in 10^14 bits
  Enterprise SATA: 1 sector in 10^15 bits
  Enterprise FC/SAS: 1 sector in 10^16 bits
Tape: Slow access, fast read/write. No power on idle. Life: 30 years. Error rates:
  LTO Tape: 1 bit in 10^17 bits
  Enterprise Tape: 1 bit in 10^19 bits
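The error rates on this slide can be turned into a rough expectation of how many unrecoverable errors to plan for when a whole collection is read back. The following is a minimal sketch only: the function and dictionary names are hypothetical, 1 TB is assumed to be 10^12 bytes, and errors are assumed to be independent and evenly spread, which real storage errors often are not.

```python
# Unrecoverable-error rates from the slide, expressed as
# "one error event per N bits read". For disk the event is a lost
# sector; for tape it is a single bit.
BITS_PER_ERROR = {
    "desktop_sata": 1e14,
    "enterprise_sata": 1e15,
    "enterprise_fc_sas": 1e16,
    "lto_tape": 1e17,
    "enterprise_tape": 1e19,
}


def expected_errors(terabytes_read: float, medium: str) -> float:
    """Expected number of error events when reading `terabytes_read` TB.

    Assumption: 1 TB = 10**12 bytes and errors are independent and
    evenly spread (a simplification).
    """
    bits_read = terabytes_read * 1e12 * 8
    return bits_read / BITS_PER_ERROR[medium]


if __name__ == "__main__":
    # Compare the media for a full read of a 100 TB collection.
    for medium in BITS_PER_ERROR:
        print(f"{medium:18s} {expected_errors(100, medium):.6f}")
```

Under these assumptions a full read of 100 TB from desktop SATA disks would be expected to hit several unreadable sectors, while the same read from tape would usually hit none.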
LTO tape Multivendor Filing system on tape LTFS LTO 5: 3 TB, 280 MB/s LTO 6: 6.2 TB, 400 MB/s Higher capacities and speeds: LTO 7: 16 TB, 788 MB/s LTO 8: 32 TB, 1180 MB/s
Optical disks Holographic storage has not lived up to its promise (yet!) 1 Petabyte DVD in the labs; 1.5 TB commercially available
Solid State No moving parts Charge migration
Digital film Based on film's proven longevity Write once Best for long-term heritage preservation
Storage systems
Types of system Your computer Disks under the desk NAS / SAS IT network Production network
Digital file-based production Desktop editing Craft video editing Work In Progress store Freesat camera Production ingest station With QC Production libraries Playout Freeview WWW R&D BBC MMXII
Adding preservation archive to digital production [Diagram labels: camera, production ingest station with QC, Work In Progress store, desktop editing, craft video editing, production libraries, playout (Freesat, Freeview, WWW), archive management, archives]
EBU Future Storage Systems Group: http://tech.ebu.ch/groups/fss Defining the requirement for storage systems designed for media: To advise users of media storage, and To advise the storage industry Investigating: Workflows for various production processes Categories of storage needed Suitable storage networks Range of storage systems available Developing methods to test storage network performance: BBC Meter: http://www.bbc.co.uk/rd/publications/whitepaper237 http://sourceforge.net/projects/msmeter/
Storage planning Your content collection will grow Build in scalability from Day 1 Expand the storage network just ahead of need: Storage costs falling Storage density rising Maximise the life of the technology used Consider best tape/disk mix Consider the importance of items in collection: Gold: Must not lose or degrade Silver: Need now, can reconsider retention Bronze: Quality or retention not important
Planning Tools PrestoPrime Storage Planning Tool A model for storage systems, to calculate Cost Risk Loss And compare what-if scenarios
Cloud storage Public 1 Renting storage on the open market Avoids management of the physical storage system Storage expansion simply by renting more Flexibility: Can move between providers for the best / cheapest tradeoff Needs fast public network for quick access
Cloud storage Public 2 Need to draw up an appropriate SLA* Security: Preventing unauthorised access Ownership / copyright Content may be stored abroad: Subject to foreign government access laws What happens if provider has system failures? Resilience policy Compensation policy What happens if company gets taken over / goes bust? (* Service Level Agreement)
Cloud storage - Private Dedicated facilities Security under your control Ease of management Needs fast private network for quick access: May need several sites to ease network traffic and delay You are responsible for operation, maintenance, expansion, etc. Responsible for content loss Outsource: To avoid management of the physical storage system
Content integrity
Is your content safe? "We've digitised all our content, it's safe." "I just need a place to store the floppy disks." Digital technology is evolving fast, creating obsolescence in only a few years! "Help! I got this message when I tried to open the file." Preserve content in an error-tolerant way.
Yes, if you: Keep multiple copies for disaster recovery Store at multiple sites How many copies? 2: then you have a backup
or better: How many copies? 3: then you can check whether at least two files are identical More: Lots of Copies Keep Stuff Safe (LOCKSS) Could use a filing system with inbuilt resilience such as ZFS Is all the content there? Incomplete or corrupted files do happen (bit rot) Sometimes there is no indication from the file system Need to verify the content, not just the file Monitor system reliability: Start migration to new storage before it's too late
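The three-copy check above can be sketched as a checksum comparison with a majority vote. This is an illustrative sketch rather than a description of any particular archive system: the function names are hypothetical, SHA-256 is an assumed choice of checksum, and the copies are assumed to be reachable as ordinary file paths.

```python
import hashlib
from collections import Counter
from pathlib import Path
from typing import List, Optional, Tuple


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def majority_copy(copies: List[Path]) -> Tuple[Optional[Path], List[Path]]:
    """Compare copies of one item by checksum.

    Returns a pair: a copy whose checksum is shared by a strict majority
    (or None if there is no majority), and the copies that disagree.
    """
    if not copies:
        raise ValueError("at least one copy is required")
    digests = {path: sha256_of(path) for path in copies}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        # No strict majority: cannot tell which copy is the good one.
        return None, list(copies)
    good = next(p for p, d in digests.items() if d == winner)
    suspect = [p for p, d in digests.items() if d != winner]
    return good, suspect
```

With two copies that differ, the function returns no trusted copy, which is the reason the slide prefers three: a disagreement can then be resolved by the two that match.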
Errors in coded content In 2007, CERN found an average error rate in digital storage of ~10^-7 Equivalent to 33 Gbytes in 330 PetaBytes 4,000 bit errors in a 1 hr HD programme (AVC-Intra) Errors were highly bunched, e.g. due to RAID failure Effect of errors on coded pictures: [example images not reproduced]
Do you know what you have? Are they all the same version? Lack of version control can lead to wasted storage or lost files Do you really know your file types? DROID file profiling tool and PRONOM * registry (* www.nationalarchives.gov.uk/pronom)
Content formats
Proliferation of digital formats
MXF: Material Exchange Format
Dealing with all these formats? Traditional archivist approach: Keep what you're given Conserve the original file format Minimal intervention Production approach: Convert to current production format Content readily available Often more than one format in use
DPP format: Digital Production Partnership BBC, ITV, Channel 4, Sky, Channel 5, S4C, UKTV and BT Sport Designed for completed programme deliveries Based on MXF with: AVC-Intra compression at 100 Mb/s for HD IMX at 50 Mb/s for SD Founded on a new AMWA * international standard, AS-11 Minimum Programme Editorial and Technical Metadata Reference to the European Broadcasting Union's EBUCore Metadata provided both within the MXF file and as an XML file (* Advanced Media Workflow Association)
Digitising legacy content Ingex digitising software: Open source http://ingex.sourceforge.net Many good commercial services available Do you specify the DPP file delivery format? Some older content formats are actually very high quality: 35mm and 16mm film, HDCAM SR tape Poor quality content can be difficult to compress (encode): Noisy videotape and grainy, scratched, dirty film Need a higher quality archive format: AMWA AS-07 MXF Archiving & Preservation
R&D BBC MMXIII
Future proofing: The 100-year view
Future access (1) Production approach: Convert archive to current production format No legacy
Relationship between production and archive [Diagram labels: Production, C1, Tape/film, Archive]
Traditional archive migration: not recommended for digital [Diagram in three builds against a time axis. Labels: Production, Archive, Upgrade 1 to Upgrade 3, C1 to C4, T1 to T3. At each production upgrade the archive appears to be transcoded to the new production codec: C1 to C2 (T1), C2 to C3 (T2), C3 to C4 (T3)]
Codec concatenation [Images: coded, 1st recode, 2nd recode, 3rd recode]
Future access (2) Archaeologist / Archivist approach: Conservation However, it's easier to keep an object than a file Obsolescence: Media: move file to new media Operating system: recompile transcoder, coder and decoder Codec: decode to uncompressed Can't afford the storage space? Recode with an archive codec
Archive codec Picture quality needs to be higher than any foreseen delivery format Low compression ratio: 2:1 - ~6:1 Lossless: ~2:1 Flexible: Code any video format (past and future) Intra-frame coding: Limit error propagation Allows partial decoding Simple: Easy to recompile codec software for new operating systems Open standard: Multivendor implementations Popular
Archive codec options Motion JPEG2000 Used in the film industry Lossless mode being used by media archivists Complex No native interlace mode Future video standards not accommodated (yet) VC2: SMPTE standard Simple Interlace handled Inbuilt capability for future video standards No commercial implementations available to archivists yet
Archive migration recommended for digital [Diagram against a time axis. Labels: Production with Upgrade 1 to Upgrade 3, C1 to C4, T1 to T3, TL, Tape/film, Archive held as uncompressed or C(Lite)]
Archive migration recommended (phase 2) [Diagram against a time axis. Labels: Production with Upgrade 1 to Upgrade 3, C2 to C4, T1 to T3, TL, Archive held as uncompressed or C(Lite)]
Restoration
Restoration is good, isn't it? Restoration can make content look like new again But what do we mean by "look like"? Any process carried out on content destroys information Future Ultra-HDTV systems may unveil restorations
Many beneficial processes Colour grading Noise reduction Scratch and dirt removal Field order correction Colour recovery Super-resolution
Recommended approach Store content at the highest quality possible with minimal processing Carry out restoration on copies for delivery to production What about Quality Checking and repair?
Archive/production system revisited
Adding preservation archive to digital production [Diagram labels: camera, production ingest station with QC, Work In Progress store, desktop editing, craft video editing, production libraries, restore, transcode, playout (Freesat, Freeview, WWW), archive management, transcode to C-Lite, migrate, archives]
Management
Digital Asset Management Need to define your requirements carefully: Highly scalable to match storage expansion Accessible anywhere (web APIs): Access control by user type (archivist, programme maker, etc.) Handle all types of digital object Can reference external metadata sources RDF * compatible Follow OAIS principles (* Resource Description Framework)
OAIS Reference Model A formal process to ensure nothing is forgotten in the operation of a digital archive, such as: Data integrity Security Administration Preservation planning Disaster recovery (http://public.ccsds.org/publications/archive/650x0b1.pdf)
Commercial and Open Source? Range of commercial products available Open Source options: P4 (PrestoCentre Tools) FedoraCommons Open Source systems can be supported commercially: Avoid proprietary systems Need to consider migration and disaster recovery from day 1 Libre Software Meeting 6-11 July 2013 / ULB, Brussels http://tech.ebu.ch/jahia/site/tech/events/opensource2013
I'm an archivist: Now get me out of here! Federation: New system needs to interwork with existing/future libraries/archives Open interfaces between systems Open standards for data exchange Migration: All systems come to the end of their useful life Migration eased by open standards and interfaces Disaster recovery: Physical damage to system through natural disaster Viruses, software problems, hardware failure
Sifting through the wreckage Can the surviving data be read on a completely independent system? Avoid proprietary solutions Does the media have a filing system? E.g. LTFS for LTO tape Can you find all the parts of the content? Was content spanned across tapes or disks? Can you find all the metadata associated with the content? Management systems need open standards/interfaces too!
Preservation metadata MPEG-A MP-AF * Provenance Context Reference Quality Integrity Authentication Fixity Rights Also Production metadata: EBUCore (http://tech.ebu.ch/metadataebucore) (* Multimedia Preservation Application Format)
Metadata in various databases Digitising job control Content catalogue Browse content Production processes Delivery database Operators database Quality Control database Storage performance Contracts Audio description Scripts Rights Subtitles Photos
Management coordination Digitising job control Content catalogue Production processes Contracts Delivery database Archive Management Rights Subtitles Operators database Audio description Quality Control database Storage performance Scripts Photos
Containers, Packages & Metadata
OAIS archive [Diagram labels: SIP, Processing, AIP, DIP] SIP: Submission Information Package AIP: Archival Information Package DIP: Dissemination Information Package
Container MXF, Quicktime Contains video and audio essence with essential technical metadata: Video resolution Frame Rate Sample structure Audio sampling rate Coding format Subtitles The minimum information needed to play the content
Package DPP, MPEG MP-AF, SMPTE AXF Content container plus associated descriptive metadata: MXF container file Production information Rights information Provenance Script All information needed for exchange, archive or sale of content DPP file exchange standard is a package: MXF AS-11 container for the content XML file for the descriptive metadata
Types of package (1 of 3) Manifest: XML (e.g. METS *) list of descriptive metadata and unique identifiers to the locations of the content and further metadata: Metadata can be managed on independent systems Risk of disassociation of content from metadata E.g. DPP file exchange standards [Diagram: manifest DW_S2_Episode3_METS_SIP.XML alongside DW_S2_Episode3.MXF, subtitles, DW_S2_E3.txt, DW_S2_E3.JPG and DW_S2_E3.DOC] (* Metadata Encoding and Transmission Standard)
Types of package (2 of 3) Monolithic: Content and metadata files gathered together within a single package file (similar in approach to creating a ZIP or TAR file): Security of a single file File system independent Difficult to update some of the metadata E.g. SMPTE AXF *, Front Porch AXF, MPEG-A PA-AF ** [Diagram: DW_S2_Episode3_SIP.PAF containing Header, Content, Preservation Description Information and three "Sub" items] (* Archive eXchange Format) (** Professional Archival Application Format)
Types of package (3 of 3) Folders: Content and metadata contained in sub-folders of a folder tree Human understandable Easy to update File system dependent Not secure [Diagram: folder tree with Package_metadata.txt, Package_manifest.txt, folders "data" and "sub", and files DW_S2_Episode3.MXF, DW_S2_E3.txt, DW_S2_E3.DOC, DW_S2_E3.JPG]
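Because a folder package is easy to update and "not secure", a checksum manifest is one way to detect files that have gone missing or changed. The sketch below is an assumption-laden illustration: only the file name Package_manifest.txt comes from the slide, while the manifest layout (one "checksum, two spaces, relative path" line per file), the choice of SHA-256 and the function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List

MANIFEST_NAME = "Package_manifest.txt"  # file name taken from the slide


def file_digest(path: Path) -> str:
    """SHA-256 of a whole file (read in one go; chunk this for large media)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_manifest(package_root: Path) -> Dict[str, str]:
    """Record a checksum for every file under the package folder."""
    manifest_path = package_root / MANIFEST_NAME
    entries: Dict[str, str] = {}
    for path in sorted(package_root.rglob("*")):
        if path.is_file() and path != manifest_path:
            entries[path.relative_to(package_root).as_posix()] = file_digest(path)
    lines = [f"{digest}  {name}" for name, digest in entries.items()]
    manifest_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return entries


def verify_manifest(package_root: Path) -> List[str]:
    """Return relative paths that are missing or whose checksum has changed."""
    problems: List[str] = []
    text = (package_root / MANIFEST_NAME).read_text(encoding="utf-8")
    for line in text.splitlines():
        if not line:
            continue  # tolerate blank lines
        digest, name = line.split("  ", 1)
        target = package_root / name
        if not target.is_file() or file_digest(target) != digest:
            problems.append(name)
    return problems
```

A typical use would be to call `write_manifest` when the package is created and `verify_manifest` after every copy or migration; an empty list means every listed file is present and unchanged.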
Unique identifiers ISAN (International Standard Audiovisual Number) EIDR (Entertainment Identifier Registry) CRID (Content Reference Identifier - TV Anytime) What's the best way to deal with content during production? File name should not be used as a unique identifier
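One possible way to avoid relying on file names during production is to assign an identifier at ingest and keep the file name as ordinary, changeable metadata. The sketch below assumes a locally generated UUID as a stand-in identifier; it is not an ISAN, EIDR or CRID, and the `Catalogue` and `Asset` classes are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Asset:
    asset_id: str
    file_names: List[str] = field(default_factory=list)  # name history


class Catalogue:
    """Toy in-memory catalogue keyed by identifier, not by file name."""

    def __init__(self) -> None:
        self._assets: Dict[str, Asset] = {}

    def register(self, file_name: str) -> str:
        """Assign a fresh identifier to newly ingested content."""
        asset_id = str(uuid.uuid4())
        self._assets[asset_id] = Asset(asset_id, [file_name])
        return asset_id

    def rename(self, asset_id: str, new_file_name: str) -> None:
        """A rename changes metadata only; the identifier is stable."""
        self._assets[asset_id].file_names.append(new_file_name)

    def current_name(self, asset_id: str) -> str:
        return self._assets[asset_id].file_names[-1]
```

Two files that happen to share a name receive different identifiers, and a renamed file keeps the identifier it was given at ingest.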
Finding stuff: If you can't find it, then you don't have it The Future
BBC World Service radio archive Uses speech recognition to create keyword metadata Keywords are time-aligned with the content Crowd-sourcing to correct mistakes http://www.bbc.co.uk/rd/projects/worldservice-archive-proto
Automatic Metadata Generation Extracting features from programme content to produce metadata Machine Learning (SVM) to produce high level metadata Affective metadata (mood) for tone and pace of programme Merging affective (mood) and semantic (objects) metadata Automatic programme summarisation using affective metadata
Affective (Mood) Classification
Affective (Mood) and Semantic (Object) - Next Programme Recommendation
Something informative: more similar semantic than affective
Something different: different affective and semantic
Something similar: similar semantic and affective
Something entertaining: more similar affective than semantic
Automatic Programme Summarisation (patent filed) Summarise the interesting bits of a programme Interesting bits are identified as extremes in the programme's mood Granular mood metadata along the programme's time-line Use case 1: reducing the number of archive programmes that need to be viewed Use case 2: new format or preview service for audiences
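The idea of picking out mood extremes along the time-line can be illustrated with a toy example. This is not the patented BBC method, whose details are not given here. It assumes a single mood score per time segment and defines "extreme" as distance from the programme's mean mood; both choices, and the function name, are assumptions made for illustration.

```python
from typing import List, Sequence


def pick_extremes(mood: Sequence[float], count: int) -> List[int]:
    """Return indices of the `count` segments whose mood lies furthest
    from the programme's mean mood, in time order."""
    if not mood or count <= 0:
        return []
    mean = sum(mood) / len(mood)
    # Rank segments by how far their mood is from the mean.
    ranked = sorted(range(len(mood)),
                    key=lambda i: abs(mood[i] - mean),
                    reverse=True)
    # Keep the top `count` and put them back into time order.
    return sorted(ranked[:count])


if __name__ == "__main__":
    # One hypothetical mood score per segment of a programme.
    scores = [0.0, 0.1, 0.9, 0.0, -0.8, 0.1]
    print(pick_extremes(scores, 2))
```

The selected segment indices could then be mapped back to time codes to assemble a preview.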
Programme Summarisation - Sherlock [Chart: mood along the programme time-line. Axes: Time (minutes); Fast Pace - Slow Pace; Humorous - Serious]
3D and Ultra-HD The Future
3D
Stereoscopic 3D: Captured with two cameras. Transmitted as a combined picture, e.g. Side-by-Side. Archiving: Should store the two views before they are combined.
Multiview 3D: Captured with two or more cameras, sometimes plus a depth camera. Various approaches to transmission, e.g. 2D + Depth. Archiving: Should store the video and depth camera views before they are combined.
[Figure: left-eye and right-eye views side by side in a 1920 x 1080 frame]
ITU-R standard for UHDTV: Rec. ITU-R BT.2020 Two versions of Ultra High Definition TV standardised by ITU 8k: 4320 lines with 7680 pixels / line (33 Mpixels) - SHV 4k: 2160 lines with 3840 pixels / line (8 Mpixels) (Digital Cinema 4k: 2160 x 4096) 16:9 aspect ratio
ITU-R standard for UHDTV: Rec. ITU-R BT.2020 24, 25, 30, 50, 60 or 120 frame/s progressive Plus drop-frame rates: 24/1.001, 30/1.001 or 60/1.001 progressive 4:4:4, 4:2:2 or 4:2:0 sampling lattices 10 or 12 bit quantisation Wide colour gamut Constant and non-constant luminance options
Super Hi-Vision Showcase Partners: NHK, BBC, OBS Associates: NTT, BT, Janet, Geant2, Internet2, Sinet4 3 Public View sites in UK 3 Public View sites in Japan 1 Private View site in USA 1 private site in the Olympic IBC
Super Hi-Vision viewing experience Picture: 7680 x 4320 100-200 inch screen: viewing distance 1-2 m 65 inch screen: viewing distance 65 cm A3 wide: viewing distance 25 cm Viewing distance: 0.75 x picture height Viewing angle: 100 degrees
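The relationship between screen size, viewing distance and viewing angle on this slide follows from simple geometry. The sketch below assumes a flat 16:9 screen, a viewer centred in front of it, and screen sizes quoted as diagonals; the function names are hypothetical.

```python
import math


def picture_height_cm(diagonal_inches: float,
                      aspect_w: int = 16, aspect_h: int = 9) -> float:
    """Picture height of a screen given its diagonal (1 inch = 2.54 cm)."""
    diagonal_cm = diagonal_inches * 2.54
    return diagonal_cm * aspect_h / math.hypot(aspect_w, aspect_h)


def viewing_distance_cm(diagonal_inches: float, heights: float = 0.75) -> float:
    """Viewing distance expressed as a multiple of picture height."""
    return heights * picture_height_cm(diagonal_inches)


def horizontal_viewing_angle_deg(heights: float = 0.75,
                                 aspect_w: int = 16, aspect_h: int = 9) -> float:
    """Horizontal angle subtended by the picture at the given distance."""
    half_width_in_heights = (aspect_w / aspect_h) / 2
    return math.degrees(2 * math.atan(half_width_in_heights / heights))


if __name__ == "__main__":
    for size in (65, 100, 200):
        print(f"{size} inch: {viewing_distance_cm(size):.0f} cm")
    print(f"angle: {horizontal_viewing_angle_deg():.1f} degrees")
```

For a 65 inch screen this gives a distance of roughly 61 cm and a horizontal angle just under 100 degrees, which is in line with the rounded figures on the slide.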
SHV listening experience
Capture at Olympic venues Production & play out at BBC TVC VIP and public presentation
UK PV Theatres: Broadcasting House, London; National Media Museum, Bradford; BBC Pacific Quay, Glasgow
Transmission parameters
Contribution links: 24 Gbit/s uncompressed video (8 x 3 Gbit/s HD-SDI); 36 Mbit/s audio (12 x 3 Mbit/s AES-EBU)
Distribution links: 280 Mbit/s AVC video coding bit rate; 384 kbit/s AAC audio per channel; 350 Mbit/s gross IP bit rate, incl. FEC 20% overhead; split across two IP streams (2 x 175 Mbit/s)
Recording
SHV production format: 16 x P2 recorders in parallel; 1.6 Gbit/s (16 x 100 Mbit/s AVC-Intra); equivalent to 720 GB/hr
Distribution format: 2 x ASI transport streams (2 x 216 Mbit/s); equivalent to 200 GB/hr
Experimental recording Recorded uncompressed SHV using PCs with HD-SDI cards: 24 Gbit/s (16 x 1.5 Gbit/s HD-SDI) Equivalent to 10.8 TB/hr 35 TB ~30 LTO4 tapes Experimenting with Cloud storage
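The storage figures quoted for the SHV recordings follow directly from the bit rates. A minimal sketch of the conversion, assuming constant bit rate and 1 GB = 10^9 bytes (the function and dictionary names are hypothetical):

```python
def gigabytes_per_hour(bits_per_second: float) -> float:
    """Storage used by one hour of a constant-bit-rate stream.

    Assumption: 1 GB = 10**9 bytes.
    """
    return bits_per_second * 3600 / 8 / 1e9


# Bit rates taken from the recording slides.
RATES = {
    "production (16 x 100 Mbit/s AVC-Intra)": 16 * 100e6,
    "distribution (2 x 216 Mbit/s ASI)": 2 * 216e6,
    "uncompressed (16 x 1.5 Gbit/s HD-SDI)": 16 * 1.5e9,
}

if __name__ == "__main__":
    for label, rate in RATES.items():
        print(f"{label}: {gigabytes_per_hour(rate):,.1f} GB/hr")
```

The results are 720 GB/hr for the production format, about 194 GB/hr for the distribution format (rounded to 200 GB/hr on the slide) and 10,800 GB/hr, or 10.8 TB/hr, for uncompressed recording.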
Sources of help The Future
PrestoCentre: www.prestocentre.org An online Competence Centre for the A/V archive community Born from the PrestoPrime project Much technical help on all aspects of A/V digitising and archiving Presto4U project: Communities of Practice:
Other sources of information AMWA: Advanced Media Workflow Association: http://www.amwa.tv EBU: European Broadcasting Union: http://tech.ebu.ch FIAT/IFTA: International Federation of Television Archives: http://fiatifta.org/ MPEG: Moving Picture Experts Group: http://mpeg.chiariglione.org SMPTE: Society of Motion Picture and Television Engineers: http://www.smpte.org