Paper

Deploying Adobe Experience Manager DAM: Architecture blueprints and best practices

Table of contents
Adobe DAM architecture blueprints
DAM best practices

Adobe Experience Manager digital asset management (DAM) is a feature-packed capability that allows you to centralize and index all your content, generate variants or renditions, control workflows, trigger actions, process metadata, and much more. This white paper describes some of the architectures and best practices that you can combine to meet various enterprise scenarios. It also addresses operational tasks around workflow management, resource optimization, performance testing, and content and access control modeling.

Adobe DAM architecture blueprints

These blueprints are categorized according to four major use cases: asset ingestion, authoring, repository, and added features.

Asset ingestion use cases

Blueprint 1: Asset ingestion, high processing

When authors perform bulk imports, such as 1,000 images at once, on a regular basis, you want to minimize the impact on the other authors. The critical resources for the author instance are CPU and memory.

Solution: Offload jobs to a farm of Experience Manager worker instances. You can offload entire workflows or just a few heavy steps by connecting an array of processing instances to the primary author instances via DAM proxy workers. The primary author instance thereby remains free to serve other users. DAM proxy workers are in charge of supervising remote tasks, gathering the results, and feeding them into the local workflow execution.

Solution: Dedicate processing instances for asset ingestion. You can provision an instance dedicated to asset ingestion that starts and executes all workflows, generates renditions, extracts metadata, and so on. The instance replicates generated content in its final form to the central author instances, where editors can continue the ongoing editing process.

[Figure: A farm of worker/processing instances connected to the author instance. CPU-intensive workflow steps are offloaded via replication, run on the workers, and the results are replicated back to the author instance running the project workflows.]

For both solutions, it is recommended to share the data store via a network file system to avoid unnecessary copying of binary payloads (see Blueprint 6).
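As a rough illustration of the kind of work both solutions move off the primary author instance, the following is a minimal sketch of a CPU-intensive custom workflow step written against the CQ workflow API. The class name and the rendition logic are placeholders; the offloading itself is achieved through proxy-worker and replication configuration, not through code.

import javax.jcr.Node;
import javax.jcr.RepositoryException;

import com.day.cq.workflow.WorkflowException;
import com.day.cq.workflow.WorkflowSession;
import com.day.cq.workflow.exec.WorkItem;
import com.day.cq.workflow.exec.WorkflowProcess;
import com.day.cq.workflow.metadata.MetaDataMap;

public class HeavyRenditionStep implements WorkflowProcess {

    @Override
    public void execute(WorkItem item, WorkflowSession workflowSession, MetaDataMap args)
            throws WorkflowException {
        // DAM asset workflows carry the JCR path of the asset as their payload.
        String payloadPath = item.getWorkflowData().getPayload().toString();
        try {
            Node assetNode = workflowSession.getSession().getNode(payloadPath);
            // Placeholder for the expensive part: decode the original binary, run the
            // transformation (resize, color conversion, transcoding, and so on), and
            // write the result back as a rendition node under the asset.
            generateCostlyRendition(assetNode);
        } catch (RepositoryException e) {
            throw new WorkflowException("Rendition generation failed for " + payloadPath, e);
        }
    }

    private void generateCostlyRendition(Node assetNode) {
        // Intentionally left as a stub; the point of the blueprint is that this work
        // runs on a worker instance instead of on the primary author instance.
    }
}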
Blueprint 2: Asset ingestion, high volume

Imagine you have a database of one million products with 10,000 modifications per day, picked up automatically from a back-end system. The repository becomes the bottleneck: while writes are happening, reads are blocked for consistency purposes. Other aspects to monitor are CPU utilization and repository read-cache thrashing.

To prevent this situation, segregate the import process onto a dedicated author instance with its own repository. At completion, replicate a full diff/delta to the author environment, with chained replication to the publish environment if necessary. Use a reserved replication queue so that important editorial changes are not delayed on their way to publication.

[Figure: The import process picks up changes from the back-end system, ingests them into a dedicated repository, and replicates the resulting diff to the author environment.]

Blueprint 3: Asset ingestion, high frequency

Because this use case normally applies to DAM deployments that support web content management (WCM) applications, it is explained only briefly here. The limiting factor tends to be the dispatcher cache, which is generally used only in conjunction with a publish environment; if the content is not structured appropriately, the result is excessive cache invalidation and flooding of the main replication queue. The proposed solution consists of separating the replication queue, storing highly volatile content under a separate tree structure, and using single-page flushes.

Authoring use cases

Blueprint 4: Many concurrent authors

Concurrent authors are users who are actively working on the system. Logged-in but inactive authors do not place additional load on the system, because Experience Manager is a stateless platform. Two types of authors are defined.

Author behavior | Typical system usage | Limiting factor
Producing authors | Heavyweight editing: upload assets, demand renditions, trigger workflows | CPU, memory, I/O
Consuming authors | Lightweight editing: preview or review content, search and download assets, modify metadata | CPU

In general, consuming authors do not present an issue, because they are mostly read-intensive. Forming a cluster of author instances with a dispatcher in front helps distribute the CPU load evenly.

With a large number of producing authors in active production, it is recommended to spin off each project into a separate author instance or environment in which the work in progress takes place. This technique is named content partitioning or sharding. After editing is finished, the results are pushed to the common author environment via content packages or replication, as shown in the following figure.

[Figure: Project A, Project B, and Project C author instances replicate their finished results to the central author environment.]
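As a minimal sketch of that handover, assuming a replication agent on the project instance is configured to target the central author environment (the class and method names are illustrative), a finished project tree can be activated programmatically with the Replicator service:

import javax.jcr.Session;

import com.day.cq.replication.ReplicationActionType;
import com.day.cq.replication.ReplicationException;
import com.day.cq.replication.Replicator;

public class ProjectHandover {

    // In a real component this would be injected as an OSGi service reference.
    private Replicator replicator;

    // Pushes a finished project path from the project author instance to the
    // central author environment via the configured replication agent.
    // Note: replication acts on the given path; for a whole tree you would
    // iterate over the assets or use a content package, as described above.
    public void pushProject(Session session, String projectRoot) throws ReplicationException {
        replicator.replicate(session, ReplicationActionType.ACTIVATE, projectRoot);
    }
}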
Blueprint 5: Geographically distributed authors

When providing for a worldwide team, two considerations are important: sizing the infrastructure in accordance with working hours around the globe, and tackling network limitations.

By growing and shrinking the author environment cluster and adding and removing Experience Manager worker instances at scheduled times, you can adapt the DAM infrastructure to handle the activity expected at any time of the day.

In terms of the network, it is advisable to carefully measure latencies and throughput between the different authors' locations and the DAM data center, because these variables affect the application's responsiveness and perceived speed. Put these techniques into practice to optimize the experience:

Place a dispatcher in front of the author environment.
Compress HTTP traffic between the browser and the dispatcher.
Leverage the Client Library Manager to minify, concatenate, and compress client libraries.
Cache in the browser all responses that are not under /content.

[Figure: Authors at site A and site B connect through a dispatcher to the data center. Browser caching minimizes requests, and HTTP responses are gzip-compressed.]

Repository use cases

Blueprint 6: Large DAM repositories

To deal with huge repositories, such as more than 5 million assets, 10 million nodes, and 10 TB of disk, in a cost-effective manner, split the persistent store and the data store (which is optimized for handling large binaries) onto different media. The persistent store requires very low-latency I/O, so local storage works best. For the data store, higher latency is acceptable; consider sharing a networked or cloud storage solution, such as network-attached storage (NAS) or Amazon S3, across all author instances. A shared data store, along with the Experience Manager DAM binary-less replication feature, reduces replication network traffic because all instances can read binaries directly from the data store. However, this approach requires running data store garbage collection (GC) on the instance that keeps references to all assets.
[Figure: Two author instances, each with its own TAR persistent store and DAM repository, share a common data store hosted on local storage, network-attached storage, or cloud storage.]

Additional considerations for asset size distribution

Large repositories with extreme asset-size distribution patterns have further implications. Two prominent cases are numerous small assets and fewer but more sizeable assets. Make sure to tune the TAR Persistence Manager and data store parameters for these scenarios, such as maximum file sizes, minimum record length, and so on.

In the case of numerous small assets, TAR optimization, which removes orphaned nodes, takes longer to run. Perform simulations under normal background load to gather baselines of duration, I/O load, and so on, and then plan the daily optimization schedule appropriately. Optionally, adjust the delay after optimizing each transaction to lessen the impact on the production load. Avoid adding too many children to a single node, because the cost of adding each new child rises proportionally with the number of existing child nodes. Plan a nested organization (for example, date-based or alphabetical) that keeps the number of child nodes of every folder to 1,000 at most.

For scenarios with fewer but larger assets, schedule manual data store GC more frequently to free disk space that would otherwise be occupied by large but orphaned assets. Also perform GC just before backups to minimize the size of the resulting archives.

Added features use cases

Blueprint 7: DAM with asset share

With the Experience Manager Asset Share or DAM Finder templates, you can build simple search, browse, and download interfaces that make the asset repository accessible to any user via self-service, promoting asset discovery and reuse as a result. Asset Share requires a publish environment and a dispatcher. The following figure shows the final scenario.

[Figure: A DAM-only deployment without Asset Share (author instances behind a dispatcher) compared with a DAM-only deployment with Asset Share (an additional publish environment and dispatcher serving the Asset Share interface to other users).]
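Asset Share delivers search out of the box, but as a hedged sketch of the kind of query such a self-service interface runs against the repository, the following uses the QueryBuilder API; the predicate values and the class name are illustrative assumptions.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

import javax.jcr.RepositoryException;
import javax.jcr.Session;

import com.day.cq.search.PredicateGroup;
import com.day.cq.search.Query;
import com.day.cq.search.QueryBuilder;
import com.day.cq.search.result.Hit;

public class AssetSearchSample {

    // Finds DAM assets whose full text matches the given term, 20 results per page.
    public void printMatches(QueryBuilder queryBuilder, Session session, String term)
            throws RepositoryException {
        Map<String, String> predicates = new HashMap<String, String>();
        predicates.put("path", "/content/dam");
        predicates.put("type", "dam:Asset");
        predicates.put("fulltext", term);
        predicates.put("p.limit", "20");

        Query query = queryBuilder.createQuery(PredicateGroup.create(predicates), session);
        List<Hit> hits = query.getResult().getHits();
        for (Hit hit : hits) {
            System.out.println(hit.getPath());
        }
    }
}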
Blueprint 8: DAM and Adobe Scene7

Scene7 is a hosted solution used to enhance, publish, and deliver dynamic marketing assets to web, mobile, social, email, and print. By merging creative content with consumer data, it presents a rich, immersive digital experience to each consumer in real time. Scene7 is offered as Software as a Service (SaaS), so extra architectural considerations are not required. The only requirement is sufficient upload bandwidth to work efficiently with Scene7 at author time.

Experience Manager embeds direct links to Scene7 servers in the HTML served to the client. Consequently, end users' browsers fetch asset renditions directly from Scene7, and the local publish environment is no longer burdened with serving the Scene7-enabled content, which also reduces the DAM storage footprint. Here's a bird's-eye view of how the Scene7 integration works in terms of asset upload and serve cycles.

[Figure: Scene7 upload and serve cycle. (1) The author uploads the Scene7 master asset, which is visually configured in Scene7 (SaaS), while the page content is replicated to the publish environment. (2) The publish environment serves HTML containing links to Scene7 resources through the dispatcher. (3) The end user's browser fetches renditions directly from Scene7.]

DAM best practices

Plan and act intelligently

Make sure that bulk asset uploads and scheduled processes are launched at off-peak times, or provision and connect worker Experience Manager instances via proxy workers when expecting peak loads during standard hours. When certain projects are under active development, spin them off temporarily onto a dedicated author environment so that their workflows run in isolation from the core infrastructure.

Tweak the default workflows

Ensure that workflows don't incur any unwanted overhead by applying these techniques:

Selective rendition generation. Prevent DAM from generating the costliest renditions under certain conditions, based on asset metadata properties.
Conditional workflows. Defer triggering workflows until the ongoing asset editing finishes.
Reduce asset activation delay. Disable redundant workflows on the publish environment (if you have one), because they have already been triggered on the author environment.

Align the limit of concurrent workflows with your CPU features

You can limit the number of concurrent workflows allowed in each Experience Manager DAM instance. Apply the following considerations based on the number of processing cores:

Introduce a fudge factor, such as 1, to leave room for non-workflow-related tasks.
Cater for multitasking features of your CPU, such as hyper-threading.
Consider I/O and other wait times that might leave processing cores idle.
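As a worked example of combining these considerations into a starting value (the fudge factor of 1 and the I/O-wait multiplier of 1.5 are illustrative assumptions to be tuned against your own measurements), a sketch might look like this:

public class WorkflowConcurrencyHint {

    // Suggests a starting point for the maximum number of concurrent DAM workflows
    // on this instance, based on the considerations listed above.
    public static int suggestedLimit() {
        // availableProcessors() already counts hyper-threaded logical cores.
        int logicalCores = Runtime.getRuntime().availableProcessors();

        int fudgeFactor = 1;            // leave room for non-workflow-related tasks
        double ioWaitMultiplier = 1.5;  // cores idle during I/O allow modest oversubscription

        int limit = (int) Math.floor((logicalCores - fudgeFactor) * ioWaitMultiplier);
        return Math.max(1, limit);
    }
}

For example, on an 8-core machine with hyper-threading (16 logical cores), this heuristic suggests a starting limit of 22 concurrent workflows, which you would then validate under representative load.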
Scale out to a private cloud

Customize Experience Manager DAM to dynamically spawn new CQ instances on a private cloud, either on a scheduled basis or on demand when ingestion peaks occur.

Gauge your setup with Adobe's Tough Day test

Leverage the Tough Day test to assess your installation, calculate a baseline, and measure the impact of adjustments in a reliable, quantifiable manner.

Access control management best practices

Model access control lists (ACLs) in terms of group permissions. Users come and go, but groups are long-lived. Groups also simplify the structure, provide a valuable overview of the user population, and facilitate inheritance of permissions. Make sure to also review and restrict the default users and groups included out of the box to avoid surprises and security vulnerabilities.

Complex ACL hierarchies have an impact on overall performance, because the system has to calculate access rights on every read. To optimize performance, keep it simple and follow a clean, hierarchical pattern rather than assigning ad hoc, one-off permissions along the tree.

Some DAM customers choose to follow a laxer ACL model, with the goal of promoting asset discovery and encouraging reuse of existing media, which allows them to cut media acquisition and procurement costs. Most assets are visible to anyone, but workflows are enforced to formally request usage and authorize modifications.

Encourage users to grant impersonation privileges to peers in their business unit so that business matters don't come to a standstill when key users are out of the office.

Sync with your central identity management

Most businesses today manage their user identities and roles in a central identity management application or directory, such as Active Directory or similar providers. You can configure Experience Manager DAM to sync with an external identity manager via industry-standard LDAP. Authentication is always validated against the LDAP provider (with a local cache), but the authorization model, or ACLs, is kept in Experience Manager. With this integration, DAM administrators are no longer a bottleneck for managing accounts as users come and go or are reassigned to different business units or projects within the organization.

Extend the access control logic with custom requirements

With Experience Manager DAM, you can tailor access control evaluation to your needs. DAM provides the hooks and tools to implement complex behavior, such as locking access to the original assets and granting access only to lower-resolution or watermarked renditions.

Collaborate with external parties via Creative Cloud

With Creative Cloud, you can work with external collaborators without giving them access to your infrastructure or network, because DAM author instances are normally located behind firewalls and subject to IT security policies. With the DAM Creative Cloud integration, assets uploaded by collaborators are immediately available for DAM authors to import with just a few clicks.

To learn more about the Adobe Experience Manager DAM capability, visit http://dev.day.com/docs/en/cq/current/dam/dam_documentation.html. To attend a DAM training, visit http://training.adobe.com/training/courses/aem-digital-asset-management.html.
Adobe Systems Incorporated
345 Park Avenue
San Jose, CA 95110-2704
USA
www.adobe.com

Adobe, the Adobe logo, Creative Cloud, Illustrator, InDesign, and Photoshop are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States and/or other countries. Java is a trademark or registered trademark of Oracle and/or its affiliates. All other trademarks are the property of their respective owners.

© 2013 Adobe Systems Incorporated. All rights reserved. Printed in the USA. 8/13