Hadoop Security Analysis NOTE: This is a working draft. Notes are being collected and will be edited for readability.
Introduction

This document describes the state of security in a Hadoop YARN cluster. First, it describes the following entities and their interactions in a secure Hadoop cluster: tokens, principals, authentication mechanisms, authenticating parties, authorization mechanisms, authorized parties, and the execution environment. Second, it examines these entities through the lens of adherence to commonly held security principles.

Tokens

In general, tokens are added to the current UGI (UserGroupInformation.java). This results in those tokens being added as credentials to the JAAS Subject associated with that UGI. An RPC call is then made in the context of a UGI.doAs, which pushes the Subject onto the thread context. When a connection to a server is created, an appropriate token is selected from a UGI created from the current JAAS Subject. This can be seen in the getProtocolProxy methods of the RPC class: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java. The selection is based on the type of server to which the connection is being established and the type of token it requires. (See the Connection constructor in hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java.) See the getProxy method in ClientServiceDelegate for an example of how tokens are set up by clients: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ClientServiceDelegate.java. Each service that generates tokens has a master key that is used to generate a Message Authentication Code (MAC) for the token. This MAC is also referred to as the token password. Servers store their master key in a SecretManager used with each RPC server. In several situations this master key is distributed between master and slave services by registration or heartbeat interactions initiated by the slave services.
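The pattern described above can be sketched as follows. This is an illustrative sketch only, not code from the Hadoop sources: the token acquisition step is a hypothetical placeholder, and the work inside doAs would normally be an RPC proxy call.

import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;

public class TokenDoAsSketch {
  public static void main(String[] args) throws Exception {
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();

    // A token obtained earlier (for example a delegation token) is attached to
    // the UGI; it becomes a credential on the JAAS Subject backing this UGI.
    Token<?> token = fetchTokenSomehow(); // hypothetical helper
    ugi.addToken(token);

    // RPC work is performed inside doAs so that the Subject (and its tokens) is
    // on the thread context when the connection is created and a token selected.
    ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
      // e.g. build a protocol proxy with RPC.getProtocolProxy(...) and call it
      return null;
    });
  }

  private static Token<?> fetchTokenSomehow() {
    // Placeholder for whatever token acquisition applies (NameNode, RM, etc.).
    return new Token<>();
  }
}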
2 Principals Here we define the entities that may be authenticated and granted rights within a Hadoop cluster. User Users are internally represented within Hadoop as simple strings. At the boundaries various mechanisms are used to derive these simple strings. For example these could be derived from a Kerberos principal exchanged via SASL or SPNego. They could also be derived from the OS user running the client process. This user information is collected in an object that is frequently referred to as the UGI (UserGroupInformation.java): hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ UserGroupInformation.java. Group information is also represented internally within Hadoop as a simple collection of strings. This group information is also carried with the UGI. There is a pluggable mechanism by which a user name is resolved into a list of groups. This plugin mechanism is called GroupMappingServiceProvider: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ GroupMappingServiceProvider.java. You can see how these plugins are utilized by looking at
the getGroups method of the Groups class: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java.

Core Services

For each of the core services within a Hadoop cluster there is an associated Kerberos UPN. These UPNs are utilized for server-server and client-server interactions. The expected UPN values for each service are stored in the Hadoop configuration files. The service credentials are stored in keytab files and protected by native filesystem permissions.

HDFS NameNode - Manages HDFS metadata. Central point of HDFS access.
HDFS Secondary NameNode - Stores a backup of HDFS metadata logs.
HDFS DataNode - Manages HDFS data blocks.
YARN ResourceManager - Schedules container execution within a cluster.
YARN NodeManager - Manages container execution and shuffle results on one machine.
MapReduce JobHistory Server - Manages job history across a cluster.

Container

At its core YARN provides the ability to execute user code across machines in a cluster. This user code is executed in an environment called a container. Each container has an identity (ContainerId) and an associated token (ContainerTokenIdentifier) created by the ResourceManager. A Container token is required to access the ContainerManager network interface, which is hosted in the NodeManager. These Container tokens are held by the ApplicationMaster that created the container. The master secret for token creation is controlled by the ResourceManager. The ResourceManager is the issuing party for these tokens.

Job

At this time MapReduce is the most commonly deployed YARN application. A job is one of the concepts that exists at the MR application level. A job identity (JobId) is created by the MapReduce ApplicationMaster. The JobId is contained within a Job token (JobTokenIdentifier). The Job token is used by all Job tasks for mutual authentication when communicating with the MapReduce ApplicationMaster.

Application
An application instance has an identity (ApplicationId) and a token (ApplicationToken) created by the ResourceManager. A client will utilize an application id to uniquely identify an application instance. The identity is requested by the client via ClientRMService.getNewApplicationId(). The ApplicationMaster may also identify itself to the system using this ID.

Localizer

A Localizer is a short-lived process launched by the NodeManager. It is responsible for setting up the execution environment for a task. This includes extracting the required job files from HDFS to the local file system. A Localizer's identity is represented by a LocalizerTokenIdentifier that allows the Localizer to report status back to the launching NodeManager.

Node

A NodeManager represents each node (computer) in a cluster. Each NodeManager has a unique Kerberos principal associated with it. The NodeManagers are always running and interact with the ResourceManager to provide information about the state of the node and with the ApplicationMasters to execute tasks.

Resources

TODO - Overview

Block

A block of data within HDFS is identified by a 64-bit integer. The NameNode allocates a block identifier when data is written to a file. The NameNode also stores metadata for this block identifier. The metadata tracks the association between a file and its blocks, and between blocks and the DataNodes on which they are stored (including replicas). The lifecycle of a block identifier is controlled by the owner of the file to which the block belongs. A block identifier becomes invalid once the file to which it belongs is deleted.

Shuffle Data (user/Job Identifier/Map Identifier)

Shuffle data is the output of the Map portion of a MapReduce job. It is generated by the Map task and stored on the local file system where the Map task was executed. When the Reduce task of a MapReduce job is run, this data is fetched by accessing a NodeManager component called the ShuffleHandler. This is necessary because Reduce tasks may not be run on the same node as the Map task. Shuffle data is identified by three pieces of information.
1. user - The user that submitted the job.
2. Job Identifier - The job identifier provided by the ResourceManager when the job was being defined.
3. Map Identifier - The identifier assigned by the MapReduce ApplicationMaster when the Map task was executed. This is an instance of TaskAttemptID.
The shuffle data is stored on the local file system of the Map tasks and is protected by OS privileges such that it is only accessible to the mapred user. The location
5 of this data on the local file system is partitioned using the identifiers above. See ShuffleHandler.sendMapOutput() for details. Authentication Mechanisms For each method of principal assertion: - Describe the mechanism - what principal is asserted Kerberos Kerberos is a technology developed by MIT in the 1980s as part of Project Athena and X Window System. Kerberos is designed to provide mutual authentication for two nodes (users or services) over an insecure network. Kerberos enables this by introducing a trusted third party called the Key Distribution Center or KDC. For more information, see Kerberos V4 and Kerberos V5. In Hadoop, Kerberos is used for mutual authentication in two cases: 1) mutual authentication between clients and services, and 2) mutual authentication between services. (In this paper, these two cases will be covered separately as their characteristics are different.). Authentication of services to clients and/or other services is to prevent trojan services from masquerading as Hadoop services. NOTE: Kerberos is not the only way that clients and services authenticate to one another. Hadoop has a token model that augments and complements Kerberos for many interactions. Typically Kerberos secured interfaces are used to acquire these tokens and then those tokens are used from that point forward for authentication. Client/Service Kerberos is required for client authentication with any service that issues delegation tokens (e.g. NameNode, ResourceManager, HistoryServer). (Note that the delegation tokens issued by these services are Hadoop-specific delegation tokens and have no relationship to the Kerberos delegation mechanism introduced in Kerberos V5.) Kerberos may also be used for client authentication with services that do not issue delegation tokens (e.g. Oozie). Some of these services (such as Oozie) may need to need to provide the client identity when communicating to core services. Kerberos delegation is not used in these cases. Instead, core services can be configured to allow the specification of a proxy user from the superuser principal. (For more information, see Secure Impersonation using UserGroupInformation.doAs.) <Diagram proxy and non-proxy> Two types of protocols support Kerberos authentication: RPC and HTTP. The RPC protocol uses SASL GSSAPI while the HTTP protocol uses SPNego. Service/Service
6 Kerberos is also used for authentication between Hadoop services that will, in turn, not act on behalf of a user. (When services will act on behalf of a client, tokens are used for authentication.) Authentication occurs on initial registration and all subsequent heartbeat interactions between DataNodes and NameNodes and between NodeManagers and ResourceManagers. To facilitate this, each service has a unique, host-specific Kerberos principal. To prevent this from becoming a configuration problem, host generic service principals (e.g. dn/[email protected]) are allowed (specified in configuration files). An important point to mention is that there is no real relationship between the Kerberos principal asserted by a given service and the OS user running the service. The only linkage is that the Keytab for the service must be owned and protected by the OS user running the service. Important Considerations All clients accessing a secure Hadoop cluster must have network access to the Kerberos infrastructure (e.g. KDC) used by the Hadoop cluster. This can sometimes make it difficult for clients to connect to a secure Hadoop cluster directly from their local computer. Kerberos is very sensitive to DNS configuration. On both client and service nodes, the fully qualified host name of each Hadoop service must resolve to the same IP address. This can make it difficult to setup Kerberos on a complex network. Trusted Proxy (doas) Utility services present a challenge from an authentication perspective. A utility service (such as Oozie) must authenticate a user and then perform operations with the user s identity. Delegation tokens cannot be used in this situation because they will only be issued for a request that was itself directly authenticated using Kerberos. For utility services to work they need to be able to assert a user s identity to the core services in a different, but trusted way. To facilitate this, Hadoop allows for superusers with Kerberos credentials to impersonate users without Kerberos credentials. Specification of which users can be impersonated (or from which groups users can be impersonated) and from which hosts impersonated users can connect is done in the core-site.xml configuration file. For example, if a user joe is configured as a trusted proxy user, then superuser oozie can connect to the NameNode via Kerberos, but request data using the identity joe (this is done by creating a proxy user ugi object for user ). The NameNode will check to make sure that the user joe can be impersonated and ensure the request originated from an allowed host. Once this has been verified, the identity joe will be trusted by the NameNode (i.e. Oozie can access data using the identity joe). (See hadoop.apache.org/docs/stable/secure_impersonation.html for more information). Block Access Token BlockAccessTokens (BATs) are used by HDFS clients (both users and Hadoop services) to perform operations on HDFS blocks. When a client requests access to a block, the NameNode makes an authorization decision and either issues a BAT to the client or fails the request (see HDFS File Permissions below). The client then presents the BAT to the appropriate DataNode. The DataNode checks the token s authenticity by verifying its signature using a shared secret (shared between the DataNode and the NameNode). The shared secret is a 64-bit random key
7 exchanged between the NameNode and each DataNode when DataNodes report status to the NameNode (heartbeats). The NameNode uses the shared secret key to sign all BATs with HmacSHA1. The DataNode implicitly trusts the authenticity of the client s identity asserted via the BlockAccessToken as a result of the signature validation. BATs have a limited lifetime, which is 10 hours by default (but is configurable via a setting in hdfs-site.xml configuration file). BATs cannot be renewed but a new one can be requested by client at any time. BATs are not (typically) persistently stored by applications and are never stored by a Hadoop service. BATs are present on the wire using two mechanisms: RPC/DTP protocol and HTTP protocol using the WebHDFS REST API. In both cases the entire BAT is sent to the DataNode along with the signature of the token that was generated with the shared secret key. The DataXceiver class contains methods writeblock() and readblock(), which are useful in understanding how the BAT is used. The location of this class in the Hadoop project is here: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ DataXceiver.java. HDFS NameNode Delegation Token HDFS NameNode delegation tokens are used to allow MapReduce jobs to access HDFS resources owned by the user that submitted the job. The containers running a job do not have the Kerberos credentials of the user that submitted the job, and thus need a delegation token. These delegation tokens are issued by the NameNode and are typically requested by the user during the definition of a MapReduce job and presented to the ResourceManager when a job is submitted. The renewer for a HDFS Delegation Token is typically the ResourceManager. In this way the ResourceManager can periodically refresh the token during the life of the application. As the renewer, the ResourceManager can also cancel the delegation token once the application is complete. YARN ResourceManager Delegation Token YARN ResourceManager Delegation Tokens are used to allow a MapReduce job to access the ResourceManager and submit new jobs. For example, a job can submit MapReduce jobs from a Task using a ResourceManager Delegation Token (and thus manage a job workflow). YARN ResourceManager Delegation Tokens are issued and renewed by the ResourceManager. YARN Application Token This token protects communication between an ApplicationMaster and the ResourceManager. The token is made available to the ApplicationMaster via the credentials provided by ResourceManager to the NodeManager and stored on disk in the containers private storage. When the ApplicationMaster is launched all of the tokens in the credentials are loaded into the UGI of the ApplicationMaster. The ApplicationToken is then selected whenever a connection is made by ApplicationMaster to the ResourceManager. This token is only valid for the lifetime of a particular ApplicationMaster instance. The token is used by the ApplicationMaster instance created as part of the application s execution to authenticate when communicating with the
ResourceManager. This application execution may result in multiple attempts to execute ApplicationMasters and Tasks. These are represented by ApplicationAttemptId and TaskAttemptId.

YARN Node Manager Container Token

This token protects communication between the ApplicationMaster and individual NodeManagers. Communication between the ApplicationMaster and the NodeManager is done to manage the life-cycle of Containers in which Tasks execute. This token is provided to the ApplicationMaster in response to the allocate request of the AMRMProtocol as part of a Container. The master secret for this token is propagated from the ResourceManager, which manages this secret, to each NodeManager via the registration (i.e. ResourceTracker.registerNodeManager) and heartbeat (i.e. ResourceTracker.nodeHeartbeat) APIs provided by the ResourceManager. The ApplicationMaster will present the NodeManager tokens when using the NMClient APIs. Expired container tokens are still valid for calls to stopContainer() and getContainerStatus(). There is no way to renew a container token.

YARN Localizer Token

This token is used to protect the communication between a ContainerLocalizer and the NodeManager. A ContainerLocalizer is launched by the NodeManager before the Task Container is launched and is responsible for setting up the local file system for Task execution. The ContainerLocalizer uses the LocalizationProtocol to send status updates to the NodeManager.

MapReduce Client Token

<Kyle: fix this section> The MapReduce Client Token is used to secure connections made by a job client to the MapReduce ApplicationMaster. This token is created by the ResourceManager when a job is submitted. The token is provided to the job client via the ApplicationReport returned from the getApplicationReport() API in the ClientRMProtocol interface (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ClientRMProtocol.java). Submitting a job is a three-step process using the ClientRMProtocol interface: getNewApplication(), submitApplication(), getApplicationReport(). A MapReduce Client Token is a unique ID based on the cluster timestamp, application id and application attempt number. This is important because a given MapReduce Client Token can only be used to access the specific job/application identified by that id. MapReduce Client Tokens do not expire and are not specifically destroyed. However, they can only be used when a specific job is running within the MapReduce ApplicationMaster. A MapReduce Client Token is presented and authenticated using Hadoop's RPC/SASL token handling mechanisms. This involves verification between the job client and the ApplicationMaster using a shared secret. This shared secret is unique for each submitted
application (i.e. job) but will be reused for all client communication about that job.

MapReduce Job Token

A MapReduce Job Token is used to protect communication between an executing Task Container and the MapReduce ApplicationMaster. A MapReduce Job Token is created by the ApplicationMaster for each job. When a Task Container is launched via the NMClient.startContainer() API, this token is added to the ContainerLaunchContext. This token is then propagated by the NodeManager to the Task Container via a token cache stored on the local disk. The Task Container will then present this token when connecting to the ApplicationMaster via the TaskUmbilicalProtocol API.

MapReduce Shuffle Secret

The MapReduce Shuffle Secret (a.k.a. shuffle key) is a secret that is shared between a ShuffleHandler running in the NodeManager and the ReduceTasks of a MapReduce job. It is used as part of a mutual authentication exchange between these parties to ensure that only the components of a particular MapReduce job can access shuffle data for that job. A MapReduce task computes an HMAC-SHA1 hash of the requested URL and the current timestamp using the shuffle key as the signing key. If no shuffle key was specified by the client, the job token secret is used. This hash and the request for shuffle data are sent to the ShuffleHandler in the UrlHash header of an HTTP GET request. The ShuffleHandler can verify that the fetcher (i.e. the ReduceTask) has the same shared shuffle key by computing the same hash. The ShuffleHandler in turn creates an HMAC-SHA1 of the value in the UrlHash HTTP request header using the shared shuffle token as the secret. This value is returned to the fetcher in the ReplyHash HTTP response header along with the shuffle data in the body. From this, the fetcher can verify that the ShuffleHandler shares the same shuffle token by computing the same hash. This mechanism is used instead of other mutual authentication mechanisms available in Hadoop for efficiency. Shuffle data requests typically contain small amounts of data and occur at high volume during MapReduce jobs. This HTTP-header-based mechanism avoids overhead found in other mutual authentication mechanisms by returning server auth data with the response instead of requiring a separate round trip between the client and service. Note that this HMAC-SHA1 hash exchange is only used for this purpose. The details of the service side MapReduce shuffle data exchange can be found in the following source path and class: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java. The verifyRequest() method of this class is particularly illustrative. The details of the client side MapReduce shuffle exchange can be found in the following source path and class:
10 hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/ java/org/apache/hadoop/mapreduce/task/reduce/fetcher.java. The copyfromhost() method is useful for reference. SASL auth with secret/digest (SASL/DIGEST-MD5) For RPC interfaces with Hadoop tokens the mechanism of authentication is SASL utilizing the DIGEST-MD5 mechanism. (see sasl-refguide). The quality of protection defaults to authentication only but can be increased to authentication plus integrity protection or authentication plus integrity and confidentiality protection with the hadoop.rpc.protection setting (in core-site.xml). Server authentication is always enabled. Each token kind (mentioned above) serializes a different set of fields as an identifier. This identifier is used to populate the SASL name by means of the name callback. The token secret (A MAC generated by the token signer) is utilized as the SASL password. Note that DIGEST- MD5 is considered historic. SASL auth with delegation token Hadoop delegation tokens can be used for most RPC interfaces that would otherwise require Kerberos authentication. This alternate authentication mechanism is used in situations where calls need to be made on behalf of a user at a later time. This is required for example when a MapReduce job needs to access HDFS files. The SASL auth with secret/digest mechanism described above is utilized to authenticate a client as having possession of a signed token. The token identifies a specific principal, renewer and expiration. These tokens are issued by a particular authority (i.e. NameNode, ResourceManager or MR HistoryServer) and are only accepted by issuing authority. These authorities will only issue a delegation token for the kerberos authenticated user requesting the token. At the time that these tokens are issued, the requestor may specify another principal that may request that the token be renewed. This refresh will only be performed when made via a request authenticated via Kerberos. By default, delegation tokens are valid for 24 hours and may be renewed up to a maximum of 7 days. Delegation tokens are typically stored in HDFS with job related metadata. These tokens can be automatically cancelled as part of job completion. While these delegation tokens are valid, their status is persistently maintained by the issuing authority (e.g. NameNode). SASL mutual auth with Kerberos/GSSAPI Authentication for RPC calls can be performed using a built in SASL plugin for Kerberos/ GSSAPI in environments that support this. This allows single sign on behavior to the Hadoop cluster. A user has to authenticate either using the Kerberos kinit command line tool or (possibly) his desktop login. From that point forward until the login has expired requests made by the Hadoop CLI will use the identity authenticated by Kerberos when connecting to services. This is intended to provide mutual authentication such that the service can trust the identity of the user and the user can trust the identity of the service.
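As a concrete illustration of the delegation-token flow described above, the sketch below shows one way a client might gather HDFS delegation tokens before handing work off to the cluster. It is a minimal sketch, not code from the Hadoop sources: the renewer name is a placeholder, and it assumes the client has already authenticated to the NameNode via Kerberos.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;

public class DelegationTokenSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Ask the NameNode for delegation tokens. The renewer ("yarn" here) is a
    // placeholder for whichever principal will renew them, typically the
    // ResourceManager.
    Credentials creds = new Credentials();
    Token<?>[] tokens = fs.addDelegationTokens("yarn", creds);
    if (tokens != null) {
      for (Token<?> t : tokens) {
        System.out.println("Obtained " + t.getKind() + " for " + t.getService());
      }
    }

    // Attach the gathered credentials to the current UGI so that later RPC
    // connections made under doAs can select them.
    UserGroupInformation.getCurrentUser().addCredentials(creds);
  }
}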
Authenticating Parties

For each authenticating party, list: - the authenticating party - the principals utilized/asserted/presented

User

Each user of a secure Hadoop cluster must have a Kerberos principal. Users use this principal to authenticate to the core Hadoop services: ResourceManager, NameNode and MR HistoryServer. There are other utility services, such as Oozie, that users will authenticate with using their Kerberos principal. When users interact with other Hadoop slave components they will typically use service-specific tokens. Many times these tokens, Block Access Tokens for example, will contain the user's principal information, which originates from their Kerberos principal.

HDFS NameNode

The Name Node and Secondary Name Node have a Kerberos principal of the form nn/<host>@<realm>. This principal is used for mutual authentication with clients that communicate with the NameNode over RPC. In addition, for HTTP/SPNego access they have the Kerberos principal HTTP/<host>@<realm>. This principal is used for mutual authentication with clients that communicate with the NameNode over HTTP/SPNego. Incoming requests are authenticated in one of five ways.
1. Kerberos via RPC - Incoming requests can be authenticated by using a SASL exchange to mutually authenticate client and service.
2. Kerberos via HTTP/SPNego - Incoming requests can be authenticated by using a SPNego exchange to mutually authenticate client and service.
3. HDFS Delegation Token - Incoming requests can present a delegation token for client authentication.
4. Proxy User - Requests from an authorized proxy can assert the user's identity for an incoming request from the proxy. This provides delegated client authentication only. (A minimal sketch of this pattern follows at the end of this section.)
5. Pseudo/Simple - If support is enabled, incoming requests can assert the user's identity explicitly. This provides client identity assertion only.
The Name Node does not initiate connections with any other service or user. The Name Node issues two types of tokens: Block Access Tokens and HDFS Delegation Tokens. Block Access Tokens are issued when a client attempts to access an HDFS block. These are only issued after a check to ensure the client principal is authorized to perform the desired operation on the resource. Delegation Tokens will be issued to any authenticated client.
Kerberos: nn/<host>@<realm>
SPNego: HTTP/<host>@<realm>
Accepts: Pseudo, Kerberos, DelegationToken
Issues: BlockAccessToken, HDFS DelegationToken
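The proxy-user (doAs) assertion in item 4 above can be sketched as follows. This is illustrative only, not code from the Hadoop sources: the superuser and end-user names are placeholders, and it assumes the superuser's impersonation privileges have already been granted in core-site.xml (hadoop.proxyuser.<superuser>.groups and hadoop.proxyuser.<superuser>.hosts).

import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class ProxyUserSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // The utility service (e.g. "oozie") has already authenticated via Kerberos.
    UserGroupInformation superUser = UserGroupInformation.getLoginUser();

    // Create a proxy UGI that asserts the end user's identity ("joe" is a
    // placeholder) while the superuser's credentials authenticate the call.
    UserGroupInformation proxyUgi =
        UserGroupInformation.createProxyUser("joe", superUser);

    // The NameNode verifies that the superuser may impersonate this user and
    // that the request comes from an allowed host before trusting the identity.
    proxyUgi.doAs((PrivilegedExceptionAction<Void>) () -> {
      FileSystem fs = FileSystem.get(conf);
      fs.listStatus(new Path("/tmp"));
      return null;
    });
  }
}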
HDFS DataNode

The Data Nodes each have a Kerberos principal of the form dn/<host>@<realm>. This principal is used for mutual authentication with the Name Node during heartbeat exchanges. Incoming requests are authenticated via BlockAccessTokens. The validity of BlockAccessTokens is verified by means of a secret key shared with the NameNode. The BlockAccessToken contains both the client identity and the allowed privileges. BlockAccessTokens are presented either as part of an RPC/SASL/Kerberos exchange or via HTTP query parameters when WebHDFS is enabled. Outgoing RPC requests to both the NameNode and other DataNodes are mutually authenticated via a SASL/Kerberos exchange. Outgoing requests to the NameNode are for heartbeats that report the status of each DataNode to the NameNode. In addition, the BlockAccessToken shared secret key is returned in the response. The RPC communication with other DataNodes is for block replication and balancing.
Kerberos: dn/<host>@<realm>
Accepts: BlockAccessToken to authenticate and authorize a user principal.
Issues: None

YARN Resource Manager

The ResourceManager initiates connections only to the NodeManager. This connection is made in order to start a container for the ApplicationMaster on a node. This connection is authenticated using the ResourceManager's Kerberos principal.

YARN Node Manager

The NodeManager only establishes connections to the ResourceManager to register and report status. This connection is authenticated using the NodeManager's Kerberos principal.

Map Reduce Application Master

The MapReduce ApplicationMaster manages creation of the Job Token, which contains the job identifier and a password. The MapReduce ApplicationMaster creates the token while creating and initializing JobImpl; JobImpl.setup() triggers the creation. JobTokenSecretManager.createPassword() performs a MAC on the job identifier with a random per-ApplicationMaster key. If no token has been set as the MapReduceShuffleToken then the job token will be reused for the shuffle secret. The token is valid for the lifetime of a job container and is not renewed or rotated. The general infrastructure for tokens is reused: in this case only one token is stored, but by reusing the existing JobTokenSecretManager the network authentication code can be used as well.

Map Reduce Task (Container)

MapReduce Tasks are the execution of the user's map and reduce tasks. As such they interact with many Hadoop services in similar ways to a real user. To accommodate this Hadoop
13 utilizes delegation tokens. These delegation tokens are used to authenticate to NameNode, ResourceManager and the MR HistoryServer. The delegation tokens are acquired by the user as part of job definition. These delegation tokens are submitted with the application as part of the YarnClient.submitApplication API via the ContainerLaunchContext. The tokens are stored in HDFS with the job metadata and then extracted for use by tasks. In addition to delegation tokens the tasks use the BlockAccessTokens to authenticate access to the DataNode just as would a normal client. Unique to task execution is the use of a JobToken and the Shuffle Secret. The JobToken is used to authenticate access to the ApplicationMaster for the purposes of status reporting. The Shuffle Secret is used to authenticate access to the Shuffle Service running within the NodeManager. Both of these basically authenticate the Job as opposed to the User for these services. Map Reduce History Server The MR History Server does not establish connections to other components. Authorization Mechanisms HDFS File Permissions HDFS authorizes access to files using POSIX style file permissions. Every file and directory in HDFS has an owner and a group associated with it. These are strings that are maintained by the NameNode. In addition each file has permissions for the owner, users in the same group and for all other users. At each directory level, read permission allows the file to be read or the directory listed. Write permission allows writing or appending to a file and creating or deleting files from a directory. Execution permission is required to access the child of a directory. There are two important mechanisms that affect how users and groups are determined for a given user. First there is a mapping between Kerberos principal and a simple username. This mapping is handled by mapping rules provided in core-site.xml configuration file for property hadoop.security.auth_to_local. The second is the mapping of a user to a set of groups. This is handled via a plugin implementation of the GroupMappingServiceProvider. For request that do not involve the actual content of the file, the NameNode will evaluate the user and group information and respond or fail based on those authorization checks. For interactions that affect the contents of files successful authentication will result in the generation of a BlockAccessTokens. These BlockAccessTokens are used to grant a specific principal specific capabilities on a specific block. The tokens are presented to the DataNodes when an access attempt is made for a specific block. The DataNode verifies the validity of the token using a secret key shared with the NameNode. The DataNode then checks the requested operation against the allowed capabilities enumerated within the BlockAccessToken. Block
Access Tokens are presented either as part of the Hadoop Data Transfer Protocol or via an HTTP query parameter.

Map Reduce Job Queue ACLs

Access to job queues can be restricted by using ACLs. The ACLs specify the users and groups that can access a specific queue. These ACLs can be found in the config file conf/mapred-queue-acls.xml. User and group information is determined as described in the HDFS File Permissions section.

Map Reduce Job ACLs

Access control for a particular job can be included in the configuration of that job via the job configuration properties. These properties are: mapreduce.job.acl-view-job and mapreduce.job.acl-modify-job. These contain lists of users and groups that will have either view or modify access to the job. By default the job owner and the superuser have full access to the job and this cannot be altered via these settings. (A short configuration sketch appears below.)

Service Endpoint ACLs

All RPC endpoints can have ACLs applied at the protocol layer. These ACLs can control the users and groups that can access a given service protocol. This is configured via the hadoop-policy.xml file that can be placed in the Hadoop conf directory. The same user mapping and group resolution mechanisms described in the HDFS File Permissions section are used.

Authorized Parties

User

Users are authorized for access to files in HDFS by the NameNode. Authorization decisions are made by comparing the client's principal and group membership to the file's ownership and privilege information. The result of authorization takes the form of either a failure or a BlockAccessToken being issued by the NameNode. Authorization is proven to the DataNode by presenting a BlockAccessToken.

Name Node

The NameNode is never authorized to perform any action via another service.

Data Node

The DataNode is never authorized to perform any action via another service.

Resource Manager

The ResourceManager only initiates connections to the NodeManager in order to start a container for the ApplicationMaster. This operation is authorized using the same ContainerToken that an ApplicationMaster would use.
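The job ACL properties mentioned above can be set directly on the job configuration. The sketch below is illustrative only; the user and group names are placeholders, and it assumes job ACLs are enabled on the cluster (mapreduce.cluster.acls.enabled).

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class JobAclSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // ACL values use the usual Hadoop ACL syntax: a comma-separated list of
    // users, a space, then a comma-separated list of groups.
    conf.set("mapreduce.job.acl-view-job", "alice,bob analysts");
    conf.set("mapreduce.job.acl-modify-job", "alice");

    Job job = Job.getInstance(conf, "acl-example");
    // ... configure mapper, reducer, input and output paths, then submit.
  }
}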
Node Manager

NodeManagers initiate connections only to the ResourceManager. Only NodeManagers with valid NodeManager Tokens (NMTokenIdentifier) are authorized to interact with the ResourceManager via the ResourceTracker API.

MapReduce Application Master

Application Masters initiate interactions with both the ResourceManager and the NodeManager. Only requests with valid Application Tokens (ApplicationTokenIdentifier) are authorized to utilize the AMRMProtocol. Similarly, only requests with valid Node Manager Tokens (NMTokenIdentifier) are authorized to make requests using the ContainerManager API.

Map Reduce Task

Tasks initiate requests to the Shuffle Handler within NodeManagers and to the Map Reduce Application Master. Interaction with the Shuffle Handler's protocol requires a valid shuffle secret, which is used to create a hash that the Shuffle Handler can verify. The Tasks' interaction with the ApplicationMaster's TaskUmbilicalProtocol API requires a valid Job Token (JobTokenIdentifier).

Secret transport storage

A container is launched by calling the startContainer() RPC call on the ContainerManager service. Parameters to this call include the set of tokens, which are read into the UserGroupInformation context of the container. The ContainerLaunch class writes these credentials to the local filesystem private container path with an extension of .tokens. The private container path contains elements of both the application id and container id. (A minimal reading sketch appears at the end of the Execution Environment section below.)

Execution Environment

TODO - Overview

Service Processes:

There are three broad classes of service processes in a Hadoop cluster: master service processes, slave service processes, and utility service processes.

Master Services

Master service processes are the ones that users will typically initially connect to. When secured, these require authentication via Kerberos. They may provide the location of slave and application processes along with tokens to access them. These service processes are typically always running at well-known locations in the cluster. They are run as the service user. For example, the NameNode will be running as user hdfs, and the ResourceManager will be running as user yarn.

Slave Services

Slave service processes make up the majority of the processes in a Hadoop cluster. The slave services are the DataNode and the NodeManager. These are typically run on most nodes of the cluster. These processes are also run as the user that represents the service: the DataNode as user hdfs and the NodeManager as user yarn. DataNodes do not use Kerberos to authenticate clients but rather BlockAccessTokens. Clients do not typically connect to NodeManagers and
connections to NodeManagers are protected by Kerberos authentication.

Utility Services

These are higher level services that utilize HDFS and YARN. An example of this is Oozie. These services run as service-specific users. For example, the Oozie service process runs as user oozie. They also typically have their own Kerberos principal. Clients will authenticate via Kerberos when connecting to these services. These services are typically configured as trusted proxies. This allows them to authenticate a user and then make requests to HDFS and YARN on behalf of that authenticated user.

Application Processes:

These are the processes in Hadoop that are actually run as the user that submitted the job. This includes the MapReduce ApplicationMaster and the map and reduce Tasks themselves. These processes are more transient in nature than the services. They are launched by Hadoop services and exit when they are complete. The MapReduce ApplicationMaster process will live only until the job it is managing is complete. A map or reduce Task process will live only until the task completes. In order to ensure that these processes have the correct OS level permissions they are run as the actual user that submitted the job. This is accomplished by utilizing a native (non-Java) process launcher (i.e. container-launcher). The executable file is installed to be owned by the root user and has a special bit set that allows it to perform operations as root when running. The container-launcher only uses these elevated privileges to launch the MapReduce ApplicationMaster and Tasks as the user that submitted the job.
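The per-container .tokens files described under Secret transport storage above can be read back with the Credentials API. The sketch below is illustrative only: the file path is a placeholder for whatever token file the NodeManager actually wrote, and it assumes the process has OS-level permission to read it.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;

public class TokenCacheSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // Placeholder path; the real file lives under the NodeManager's private
    // container directory and ends in ".tokens".
    Path tokenFile = new Path("file:///tmp/example_container.tokens");

    // Deserialize the credentials that ContainerLaunch wrote to local disk...
    Credentials creds = Credentials.readTokenStorageFile(tokenFile, conf);

    // ...and attach them to the current UGI, as a launched container would.
    UserGroupInformation.getCurrentUser().addCredentials(creds);

    for (Token<?> t : creds.getAllTokens()) {
      System.out.println(t.getKind() + " for service " + t.getService());
    }
  }
}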
17 Adherence to Security Principles: The principles described in this section are taken from Secure_Coding_Principles. This section analyses to what degree Hadoop adheres to each security principle.
Minimize attack surface area

By its design and purpose a Hadoop cluster presents several surface areas to authorized and unauthorized users.
Surface: Container node OS API - Role exposure: Users authorized to execute applications, jobs, queries.
Surface: Internal network endpoints - Role exposure: Users authorized to execute applications, jobs, queries.

Establish secure defaults

Hadoop security is not enabled by default. Deploying a secure cluster requires a combination of network, machine, Domain Controller and Hadoop setup.

Principle of Least privilege

<kyle: complete>

Principle of Defense in depth

The principle of defense in depth suggests that where one control would be reasonable, more controls that approach risks in different fashions are better. Controls, when used in depth, can make severe vulnerabilities extraordinarily difficult to exploit and thus unlikely to occur. With secure coding, this may take the form of tier-based validation, centralized auditing controls, and requiring users to be logged in on all pages. For example, a flawed administrative interface is unlikely to be vulnerable to anonymous attack if it correctly gates access to production management networks, checks for administrative user authorization, and logs all access.

Fail securely

Applications regularly fail to process transactions for many reasons. How they fail can determine if an application is secure or not. For example:

isAdmin = true;
try {
    codeWhichMayFail();
    isAdmin = isUserInRole("Administrator");
}
catch (Exception ex) {
    log.write(ex.toString());
}

If either codeWhichMayFail() or isUserInRole fails or throws an exception, the user is an admin by default. This is obviously a security risk.

Don't trust services

Many organizations utilize the processing capabilities of third party partners, who more than likely have differing security policies and posture than you. It is unlikely that you can influence or control any external third party, whether they are home users or major suppliers or partners. Therefore, implicit trust of externally run systems is not warranted. All external systems should be treated in a similar fashion. For example, a loyalty program provider provides data that is used by Internet Banking, providing the number of reward points and a small list of potential redemption items. However, the data should be checked to ensure that it is safe to display to end users, and that the reward points are a positive number, and not improbably large.

Separation of duties

A key fraud control is separation of duties. For example, someone who requests a computer cannot also sign for it, nor should they directly receive the computer. This prevents the user from requesting many computers, and claiming they never arrived. Certain roles have different levels of trust than normal users. In particular, administrators are different from normal users. In general, administrators should not be users of the application. For example, an administrator should be able to turn the system on or off and set password policy, but shouldn't be able to log on to the storefront as a super privileged user, such as being able to buy goods on behalf of other users.

Avoid security by obscurity

Security through obscurity is a weak security control, and nearly always fails when it is the only control. This is not to say that keeping secrets is a bad idea, it simply means that the security of key systems should not be reliant upon keeping details hidden. For example, the security of an application should not rely upon knowledge of the source code being kept secret. The security should rely upon many other factors, including reasonable password policies, defense in depth, business transaction limits, solid network architecture, and fraud and audit controls.
20 A practical example is Linux. Linux s source code is widely available, and yet when properly secured, Linux is a hardy, secure and robust operating system. Keep security simple Attack surface area and simplicity go hand in hand. Certain software engineering fads prefer overly complex approaches to what would otherwise be relatively straightforward and simple code. Developers should avoid the use of double negatives and complex architectures when a simpler approach would be faster and simpler. For example, although it might be fashionable to have a slew of singleton entity beans running on a separate middleware server, it is more secure and faster to simply use global variables with an appropriate mutex mechanism to protect against race conditions. Fix security issues correctly Once a security issue has been identified, it is important to develop a test for it, and to understand the root cause of the issue. When design patterns are used, it is likely that the security issue is widespread amongst all code bases, so developing the right fix without introducing regressions is essential. For example, a user has found that they can see another user s balance by adjusting their cookie. The fix seems to be relatively straightforward, but as the cookie handling code is shared among all applications, a change to just one application will trickle through to all other applications. The fix must therefore be tested on all affected applications.
A1. Kerberos Overview

The diagrams that follow, and their step-by-step descriptions, illustrate how Kerberos mutual authentication works. This is described at a fairly low level, but there is still significant detail about Kerberos that is hidden or abstracted. The intention here is to illustrate the important aspects of Kerberos as they apply to this document, not to provide complete coverage of Kerberos. To understand Kerberos, it is important to keep in mind that the KDC has access to the passwords for all users and services for which it will provide authentication. Note however that these passwords are never transmitted over the wire. Authentication is achieved by each party's ability to decrypt information that was encrypted with their password. Another important point to understand is that Hadoop's services are headless. This means that a user is not required to enter a password for these services to start and authenticate. These services are typically run as an OS user specific to the service. At installation time a password is generated for each service and stored in a Kerberos Keytab file unique to that user and service. The first diagram illustrates what occurs when a user logs in to a system either by executing kinit or by logging in to the desktop. A step-by-step description follows the diagram.
1. The user logs into the system. When done with kinit the system already knows the user-principal (i.e. account name). When done via a desktop login the user-principal will be collected along with the user-password.
2. The login process will contact the KDC Authentication Service to obtain a Ticket Granting Ticket or TGT for the user-principal. Here this TGT is called user-kdc-ticket. All Kerberos tickets have two layers. The outer layer is encrypted with the client's password. The inner layer is encrypted with the service's password. In this case the
22 outer layer of the TGT is encrypted with the user s password and the inner layer is encrypted with the KDC s password. AS-REQ and AS-REP are the Kerberos protocol level names for the messages. 3. If the login process can decrypt the outer layer of user-kdc-ticket (i.e. TGT) then it can trust that the password provided by the user matches the password for that user know by the KDC. 4. If the user successfully authenticated, user-kdc-ticket (i.e. TGT) will be stored locally in the user s ticket cache. The ticket cache file should be protected as the OS level by user only read/write file permissions. 5. The login process informs the user the result of their authentication attempt. The important point is that there will only be a TGT in the user s ticket cache if authentication was successful. These TGTs are by default configured to expire in 24 hours. The second diagram shows the interactions that occur when a user attempts to access a Hadoop service. In this example the assumption is that the user has logged in as described above but has never before accessed the service (i.e. Name Node). Again, a step by step description will follow the diagram.
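As an aside before the service-access walkthrough: the headless, keytab-based login described above looks roughly like the following in client code. This is a minimal sketch under stated assumptions; the principal name and keytab path are placeholders, not values from an actual deployment.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KeytabLoginSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Kerberos must be the configured authentication method for the login
    // below to actually use the keytab.
    conf.set("hadoop.security.authentication", "kerberos");
    UserGroupInformation.setConfiguration(conf);

    // Placeholder principal and keytab path for a headless service login.
    UserGroupInformation.loginUserFromKeytab(
        "dn/[email protected]",
        "/etc/security/keytabs/dn.service.keytab");

    System.out.println("Logged in as "
        + UserGroupInformation.getLoginUser().getUserName());
  }
}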
1. Initially the user executes the Hadoop CLI command to list a directory. For example: hadoop fs -ls /tmp
2. As the client is connecting to the Name Node using the RPC protocol, a SASL/Kerberos mutual authentication handshake occurs.
3. First the TGT (i.e. user-kdc-ticket) is loaded from the user's ticket cache. For this example the assumption is that the user has previously logged in and therefore the ticket cache contains a valid TGT.
4. The SASL/Kerberos implementation will then check the ticket cache looking for a Service Ticket that would provide access to the Name Node. This is not shown in this example because it is assumed that no such ticket previously exists. Therefore, a request for this ticket is made to the KDC. The result is a service ticket (i.e. user-nn-ticket). The outer layer of this ticket is encrypted with the user's password and the inner layer of the ticket is encrypted with the Name Node's password. The names TGS-REQ and TGS-REP are the Kerberos protocol level names for these messages.
5. The user's password is obtained from the user's Keytab.
6. It is this decryption of the ticket and access to the ticket content that serves as the basis for authenticating the user to the service. Once decrypted, a session-key is extracted from the service ticket. This same session-key is also present in the inner encrypted part of the ticket. This duplication will later be the basis for the authentication of the service.
7. If the service ticket can be decrypted it is known to be valid and it is stored, encrypted, in the user's ticket cache. A new service ticket will not be requested from the KDC for the same service until this service ticket expires.
8. The client prepares an authenticator (enc-user-auth) by encrypting a value with the session-key extracted from the service ticket. Specifically this is the client principal and a timestamp.
9. The client sends both the (still encrypted with the service's password) inner portion of the service ticket (user-nn-ticket) and the encrypted (with the session-key) authenticator (enc-user-auth) to the service.
10. The service must have access to its password in order to decrypt the inner portion of the service ticket received from the client. This password is loaded from the service's Keytab.
11. The service decrypts the inner portion of the service ticket (user-nn-ticket) with its password and extracts the session-key.
12. The service decrypts the authenticator (enc-user-auth) with that session-key. The service can then compare data (specifically the client principal) from the decrypted service ticket (user-nn-ticket) with data in the decrypted authenticator (dec-user-auth). If that data matches then the client is known to be authenticated by the KDC. This is true because the client has proven that they have access to the same session-key contained within the inner portion of the service ticket.
13. So far only client authentication has been established. In order for the service to authenticate to the client it must prove that it also has access to the same session-key. To do this, it will encrypt the authenticator (dec-user-auth) with the session-key extracted from the service ticket.
25 14. The service will then send the encrypted authenticator (enc-nn-auth) back to the client. AP-REP is the name of this message in the Kerberos protocol. 15. The client will attempt to decrypt the returned encrypted authenticator (enc-nn-auth) with the original session-key extract from the service ticket. 16. If the data within the authenticator matches then the client knows that the service was able to extract the same session-key from the service ticket. Therefore, it must have the same password as the KDC. 17. If mutual authentication was successful then the request is allowed through to the Name Node where it is processed and the results returned. A2. Token org.apache.hadoop.security.token.token The client side form of a token. This can be created either from an existing TokenIdentifier subclass instance or directly from the component parts. identifier byte[] The serialized form of the identity information in the TokenIdentifier password byte[] The serialized form of the password from the TokenIdentifier kind Text The kind of TokenIdentifier service Text The service to which the TokenIdentifier applies TokenRenewer renewer A plugin loaded via ServiceLoader for the TokenIdentifiers of this kind org.apache.hadoop.yarn.security.applicationtokenidentifier This token grants a specific ApplicationMaster instance access to the ResourceManager getkind() Text YARN_APPLICATION_TOKEN applicationattemptid ApplicationAttemptId Identifies the ApplicationMaster instance requesting access to the ResourceManager org.apache.hadoop.hdfs.security.token.block.blocktokenidentifier This token grants access to a specific HDFS block. getkind() Text HDFS_BLOCK_TOKEN expirydate long The time at which the token expires keyid int The ID of the key used to create the password userid String The principal of the authenticated user
26 blockpoolid String The block pool to which the block belongs blockid long The ID of the block to allow access EnumSet<AccessMode> modes The authorized access modes READ/WRITE/COPY/ REPLACE byte[] cache A cache of the serialized form of the token org.apache.hadoop.yarn.security.clienttokenidentifier This token grants a MapReduce client access to a particular MapReduce ApplicationMaster instance. getkind() Text YARN_CLIENT_TOKEN applicationattemptid ApplicationAttemptId The ApplicationMaster instance to which this token grants access org.apache.hadoop.mapreduce.security.token.jobtokenidentifier This token grants a specific Container access to the MapReduce ApplicationMaster to update the status of a particular job. getkind() Text mapreduce.job jobid Text The ID of the Job that this token can be used to access org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.s ecurity.localizertokenidentifier This token is used by a ContainerLocalizer to report status back to the NodeManager which executed it. getkind() Text Localizer org.apache.hadoop.hdfs.security.token.delegation.delegationtokenidentifie r This token is acquired by a user from the NameNode and is used to allow other Hadoop services to access HDFS on the user s behalf for a limited period of time. getkind() Text HDFS_DELEGATION_TOKEN
owner (Text) - The effective username of the token owner
renewer (Text) - The principal of the service that is authorized to renew this token
realUser (Text) - The real username of the token owner
issueDate (long) - The timestamp when this token was issued
maxDate (long) - The timestamp at which this token will expire
sequenceNumber (int) - Nonce to ensure generated token passwords are unique
masterKeyId (int) - The ID of the master secret used to create this token's password

org.apache.hadoop.mapreduce.v2.api.MRDelegationTokenIdentifier
This token is acquired by a user from the MapReduce HistoryServer and is used to allow other Hadoop services to access the MapReduce HistoryServer on the user's behalf for a limited period of time.
getKind() (Text) - MR_DELEGATION_TOKEN
owner (Text) - The effective username of the token owner
renewer (Text) - The principal of the service that is authorized to renew this token
realUser (Text) - The real username of the token owner
issueDate (long) - The timestamp when this token was issued
maxDate (long) - The timestamp at which this token will expire
sequenceNumber (int) - Nonce to ensure generated token passwords are unique
masterKeyId (int) - The ID of the master secret used to create this token's password

org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier
This token is acquired by a user from the ResourceManager and is used to allow other Hadoop services to access the ResourceManager on the user's behalf for a limited period of time.
getKind() (Text) - RM_DELEGATION_TOKEN
owner (Text) - The effective username of the token owner
renewer (Text) - The principal of the service that is authorized to renew this token
realUser (Text) - The real username of the token owner
28 issuedate long The timestamp when this token was issued maxdate long The timestamp at which this token will expire sequencenumber int Nonce to ensure generated token passwords are unique masterkeyid int The ID of the master secret used to create this token s password A2. Sequence Diagrams HDFS Bootstrap YARN Bootstrap Map Reduce Job Definition Map Reduce Job Submission Map Reduce Job Initiation Map Reduce Map Task Execution Map Reduce Reduce Task Execution Map Reduce Job Completion Map Reduce Job Monitoring
29 YARN Node Manager Token Flow
31 YARN Application Token Flow
32 YARN Container Token Flow
34 Map Reduce Client Token Flow Map Reduce Shuffle Secret Flow HDFS Block Access Token Flow
35 HDFS Name Node Delegation Token Flow YARN Resource Manager Delegation Token Flow Map Reduce History Server Delegation Token Flow Hadoop Process Launching Overview Hadoop SASL/DIGEST-MD5 Token Authentication Server Interactions
37 Generic Case:
MR-specific case:

Name | Protocol | Holder | Generator
MRDelegation | MRClient | User Agent |
RMDelegation | ClientRM | User Agent | Resource Manager
Delegation | | |
Application | AMRM | App Master | Resource Manager
Block | DataXceiver | Client, Data Node | Name Node
Client | ClientRM | User Agent | ResourceManager
Container | ContainerManager | AppMaster | ResourceManager
Job | | |
Shuffle | MRClientProtocolService | MRContainer | MRAppMaster
Localizer | Localization | N/A | N/A
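The Holder column above identifies the party that must present a given token; before opening a connection, a holder picks the appropriate token out of its credentials by kind (and usually by service). Inside Hadoop this selection is normally performed by TokenSelector implementations wired into the RPC client, but the effect is equivalent to the illustrative lookup below over the current UGI. The RM_DELEGATION_TOKEN kind is taken from the token tables earlier in this appendix; the class and method names here are assumptions based on the public Hadoop 2.x API, not code from the analyzed source.

    import java.util.Collection;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.security.UserGroupInformation;
    import org.apache.hadoop.security.token.Token;
    import org.apache.hadoop.security.token.TokenIdentifier;

    public class TokenByKind {

        // Return the first token of the given kind held by the current UGI, or null.
        static Token<? extends TokenIdentifier> findToken(Text kind) throws Exception {
            Collection<Token<? extends TokenIdentifier>> tokens =
                    UserGroupInformation.getCurrentUser().getTokens();
            for (Token<? extends TokenIdentifier> t : tokens) {
                if (t.getKind().equals(kind)) {
                    return t;
                }
            }
            return null;
        }

        public static void main(String[] args) throws Exception {
            // Example: the ResourceManager delegation token kind from the tables above.
            Token<? extends TokenIdentifier> rmToken = findToken(new Text("RM_DELEGATION_TOKEN"));
            System.out.println(rmToken == null
                    ? "no RM delegation token in the current UGI"
                    : "service=" + rmToken.getService());
        }
    }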