Showing content from https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-custom-ami.html below:

Using a custom AMI to provide more flexibility for Amazon EMR cluster configuration

When you use Amazon EMR 5.7.0 or higher, you can choose to specify a custom Amazon Linux AMI instead of the default Amazon Linux AMI for Amazon EMR. A custom AMI is useful if you want to do the following:

A custom AMI must exist in the same AWS Region where you create the cluster. It should also match the EC2 instance architecture. For example, an m5.xlarge instance has an x86_64 architecture. Therefore, to provision an m5.xlarge using a custom AMI, your custom AMI should also have x86_64 architecture. Similarly, to provision an m6g.xlarge instance, which has arm64 architecture, your custom AMI should have arm64 architecture. For more information about identifying a Linux AMI for your instance type, see Find a Linux AMI in the Amazon EC2 User Guide.
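
The architecture rule above can be checked before you launch anything. The following is a minimal sketch, not an official procedure: the function names `arch_matches` and `ami_fits_instance_type` are hypothetical, and the `AWS` variable is only an illustrative override so the lookup command can be swapped out. It assumes the AWS CLI is installed with credentials configured, and it uses the `describe-images` and `describe-instance-types` commands to read each side's architecture.

```shell
# Succeeds when the AMI architecture ($1) appears among the remaining arguments.
arch_matches() {
  ami_arch=$1
  shift
  for supported in "$@"; do
    if [ "$supported" = "$ami_arch" ]; then
      return 0
    fi
  done
  return 1
}

# Looks up the AMI architecture and the instance type's supported
# architectures, then compares them. Returns 2 if a lookup fails.
# $1: AMI ID (placeholder), $2: instance type such as m5.xlarge
ami_fits_instance_type() {
  ami_id=$1
  instance_type=$2
  ami_arch=$("${AWS:-aws}" ec2 describe-images --image-ids "$ami_id" \
    --query 'Images[0].Architecture' --output text) || return 2
  supported_archs=$("${AWS:-aws}" ec2 describe-instance-types \
    --instance-types "$instance_type" \
    --query 'InstanceTypes[0].ProcessorInfo.SupportedArchitectures' \
    --output text) || return 2
  # Word splitting is intentional: the CLI prints a whitespace-separated list.
  arch_matches "$ami_arch" $supported_archs
}
```

For example, `ami_fits_instance_type MyAmiId m6g.xlarge && echo "architectures match"` prints the message only when the AMI is an arm64 image.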

Important

EMR clusters that run Amazon Linux or Amazon Linux 2 Amazon Machine Images (AMIs) use default Amazon Linux behavior, and do not automatically download and install important and critical kernel updates that require a reboot. This is the same behavior as other Amazon EC2 instances that run the default Amazon Linux AMI. If new Amazon Linux software updates that require a reboot (such as kernel, NVIDIA, and CUDA updates) become available after an Amazon EMR release becomes available, EMR cluster instances that run the default AMI do not automatically download and install those updates. To get kernel updates, you can customize your Amazon EMR AMI to use the latest Amazon Linux AMI.

Creating a custom Amazon Linux AMI from a preconfigured instance

The basic steps for pre-installing software and performing other configurations to create a custom Amazon Linux AMI for Amazon EMR are as follows:

After you create the image based on your customized instance, you can copy that image to an encrypted target as described in Creating a custom AMI with an encrypted Amazon EBS root device volume.

Tutorial: Creating an AMI from an instance with custom software installed

To launch an EC2 instance based on the most recent Amazon Linux AMI
  1. Use the AWS CLI to run the following command, which creates an instance from an existing AMI. Replace MyKeyName with the key pair you use to connect to the instance and MyAmiId with the ID of an appropriate Amazon Linux AMI. For the most recent AMI IDs, see Amazon Linux AMI.

    Note

    Linux line continuation characters (\) are included for readability. They can be removed or used in Linux commands. For Windows, remove them or replace with a caret (^).

    
    aws ec2 run-instances --image-id MyAmiId \
    --count 1 --instance-type m5.xlarge \
    --key-name MyKeyName --region us-west-2
    

    The InstanceId output value is used as MyInstanceId in the next step.

  2. Run the following command:

    aws ec2 describe-instances --instance-ids MyInstanceId

    The PublicDnsName output value is used to connect to the instance in the next step.

To connect to the instance and install software
  1. Use an SSH connection that lets you run shell commands on your Linux instance. For more information, see Connecting to your Linux instance using SSH in the Amazon EC2 User Guide.

  2. Perform any required customizations. For example:

    sudo yum install MySoftwarePackage
    sudo pip install MySoftwarePackage
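
Once the instance is customized, the remaining step is to create an image from it. The following is a hedged sketch of that step using the `aws ec2 create-image` command: the function name `create_ami_from_instance` and the AMI name are hypothetical, and the `AWS` variable is only an illustrative override for the CLI executable. Note that `create-image` reboots the instance by default before taking the snapshot.

```shell
# Creates an AMI from a customized instance and prints the new image ID.
# $1: instance ID (MyInstanceId from the earlier step)
# $2: a name for the new AMI (hypothetical, choose your own)
create_ami_from_instance() {
  "${AWS:-aws}" ec2 create-image --instance-id "$1" --name "$2" \
    --query 'ImageId' --output text
}

# Usage (requires AWS credentials):
#   create_ami_from_instance MyInstanceId MyEmrCustomAmi
```

The printed image ID is the value you supply as the custom AMI ID when you create a cluster.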

How to use a custom AMI in an Amazon EMR cluster

You can use a custom AMI to provision an Amazon EMR cluster in two ways:

You can use only one of the two options when provisioning an EMR cluster, and you cannot change it once the cluster has started.

Considerations for using single versus multiple custom AMIs in an Amazon EMR cluster

| Consideration | Single custom AMI | Multiple custom AMIs |
|---|---|---|
| Use both x86 and Graviton2 processors with custom AMIs in the same cluster | Not supported | Supported |
| AMI customization varies across instance types | Not supported | Supported |
| Change custom AMIs when adding new task instance groups/fleets to a running cluster. Note: you cannot change the custom AMI of existing instance groups/fleets. | Not supported | Supported |
| Use AWS Console to start a cluster | Supported | Not supported |
| Use AWS CloudFormation to start a cluster | Supported | Supported |

Use a single custom AMI in an EMR cluster

To specify a custom AMI ID when you create a cluster, use one of the following:

Amazon EMR console
To specify a single custom AMI from the console
  1. Sign in to the AWS Management Console, and open the Amazon EMR console at https://console.aws.amazon.com/emr.

  2. Under EMR on EC2 in the left navigation pane, choose Clusters, and then choose Create cluster.

  3. Under Name and applications, find Operating system options. Choose Custom AMI, and enter your AMI ID in the Custom AMI field.

  4. Choose any other options that apply to your cluster.

  5. To launch your cluster, choose Create cluster.

AWS CLI
To specify a single custom AMI with the AWS CLI
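
A minimal sketch of such a command follows, assuming the `--custom-ami-id` option of `aws emr create-cluster`. The function name `create_cluster_with_custom_ami`, the instance type and count, and the default release label `emr-5.7.0` (the minimum release that supports custom AMIs) are illustrative assumptions, and the `AWS` variable is only an override for the CLI executable.

```shell
# Creates a cluster in which every node uses the same custom AMI.
# $1: cluster name
# $2: custom AMI ID (placeholder such as MyAmiId)
# $3: optional release label, defaults to emr-5.7.0
create_cluster_with_custom_ami() {
  "${AWS:-aws}" emr create-cluster \
    --name "$1" \
    --custom-ami-id "$2" \
    --release-label "${3:-emr-5.7.0}" \
    --use-default-roles \
    --instance-type m5.xlarge \
    --instance-count 2
}

# Usage (requires AWS credentials):
#   create_cluster_with_custom_ami "Cluster with My Custom AMI" MyAmiId
```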

Use multiple custom AMIs in an Amazon EMR cluster

To create a cluster using multiple custom AMIs, use one of the following:

The AWS Management Console currently does not support creating a cluster using multiple custom AMIs.

Example - Use the AWS CLI to create an instance group cluster using multiple custom AMIs

Using the AWS CLI version 1.20.21 or higher, you can assign a single custom AMI to the entire cluster, or you can assign custom AMIs individually to the instance groups in your cluster.

The following example shows a uniform instance group cluster created with two instance types (m5.xlarge and m6g.xlarge) used across node types (primary, core, task). Each instance group specifies its own custom AMI. The example illustrates several features of the multiple custom AMI configuration:

aws emr create-cluster --instance-groups \
InstanceGroupType=PRIMARY,InstanceType=m5.xlarge,InstanceCount=1,CustomAmiId=ami-123456 \
InstanceGroupType=CORE,InstanceType=m5.xlarge,InstanceCount=1,CustomAmiId=ami-234567 \
InstanceGroupType=TASK,InstanceType=m6g.xlarge,InstanceCount=1,CustomAmiId=ami-345678 \
InstanceGroupType=TASK,InstanceType=m5.xlarge,InstanceCount=1,CustomAmiId=ami-456789

Example - Use the AWS CLI version 1.20.21 or higher to add a task node to a running instance group cluster with multiple instance types and multiple custom AMIs

Using the AWS CLI version 1.20.21 or higher, you can specify a custom AMI for an instance group that you add to a running cluster. The CustomAmiId argument can be used with the add-instance-groups command as shown in the following example. Notice that the same custom AMI ID (ami-123456) is used for more than one node type.

aws emr create-cluster --instance-groups \
InstanceGroupType=PRIMARY,InstanceType=m5.xlarge,InstanceCount=1,CustomAmiId=ami-123456 \
InstanceGroupType=CORE,InstanceType=m5.xlarge,InstanceCount=1,CustomAmiId=ami-123456 \
InstanceGroupType=TASK,InstanceType=m5.xlarge,InstanceCount=1,CustomAmiId=ami-234567

{
    "ClusterId": "j-123456",
    ...
}

aws emr add-instance-groups --cluster-id j-123456 --instance-groups \
InstanceGroupType=TASK,InstanceType=m6g.xlarge,InstanceCount=1,CustomAmiId=ami-345678

Example - Use the AWS CLI version 1.20.21 or higher to create an instance fleet cluster with multiple custom AMIs, multiple instance types, an On-Demand primary node, On-Demand core nodes, and multiple core and task instance types

aws emr create-cluster --instance-fleets \
InstanceFleetType=PRIMARY,TargetOnDemandCapacity=1,InstanceTypeConfigs=['{InstanceType=m5.xlarge, CustomAmiId=ami-123456}'] \
InstanceFleetType=CORE,TargetOnDemandCapacity=1,InstanceTypeConfigs=['{InstanceType=m5.xlarge,CustomAmiId=ami-234567},{InstanceType=m6g.xlarge, CustomAmiId=ami-345678}'] \
InstanceFleetType=TASK,TargetSpotCapacity=1,InstanceTypeConfigs=['{InstanceType=m5.xlarge,CustomAmiId=ami-456789},{InstanceType=m6g.xlarge, CustomAmiId=ami-567890}']

Example - Use the AWS CLI version 1.20.21 or higher to add task nodes to a running cluster with multiple instance types and multiple custom AMIs

aws emr create-cluster --instance-fleets \
InstanceFleetType=PRIMARY,TargetOnDemandCapacity=1,InstanceTypeConfigs=['{InstanceType=m5.xlarge, CustomAmiId=ami-123456}'] \
InstanceFleetType=CORE,TargetOnDemandCapacity=1,InstanceTypeConfigs=['{InstanceType=m5.xlarge,CustomAmiId=ami-234567},{InstanceType=m6g.xlarge, CustomAmiId=ami-345678}']

{
    "ClusterId": "j-123456",
    ...
}

aws emr add-instance-fleet --cluster-id j-123456 --instance-fleet \
InstanceFleetType=TASK,TargetSpotCapacity=1,InstanceTypeConfigs=['{InstanceType=m5.xlarge,CustomAmiId=ami-234567},{InstanceType=m6g.xlarge, CustomAmiId=ami-345678}']

Managing AMI package repository updates

By default, Amazon Linux AMIs connect to package repositories on first boot to install security updates before other services start. Depending on your requirements, you may choose to disable these updates when you specify a custom AMI for Amazon EMR. The option to disable this feature is available only when you use a custom AMI. By default, Amazon Linux kernel updates and other software packages that require a reboot are not updated. Note that your networking configuration must allow HTTP and HTTPS egress to Amazon Linux repositories in Amazon S3; otherwise, security updates will not succeed.

Warning

We strongly recommend that you choose to update all installed packages on reboot when you specify a custom AMI. Choosing not to update packages creates additional security risks.

With the AWS Management Console, you can select the option to disable updates when you choose Custom AMI.

With the AWS CLI, you can specify --repo-upgrade-on-boot NONE along with --custom-ami-id when using the create-cluster command.

With the Amazon EMR API, you can specify NONE for the RepoUpgradeOnBoot parameter.
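
As a sketch of the CLI option described above, the function below passes `--repo-upgrade-on-boot` together with `--custom-ami-id` to `aws emr create-cluster`. The function name `create_cluster_with_repo_mode`, the release label, and the instance settings are illustrative assumptions, and the `AWS` variable is only an override for the CLI executable. The mode check reflects the two values the RepoUpgradeOnBoot parameter accepts, SECURITY and NONE.

```shell
# Creates a cluster from a custom AMI with an explicit repository update mode.
# $1: custom AMI ID (placeholder)
# $2: SECURITY (apply security updates on first boot) or NONE (skip them)
create_cluster_with_repo_mode() {
  ami_id=$1
  mode=${2:-SECURITY}
  case "$mode" in
    SECURITY|NONE) ;;
    *)
      echo "repo upgrade mode must be SECURITY or NONE" >&2
      return 1
      ;;
  esac
  "${AWS:-aws}" emr create-cluster \
    --custom-ami-id "$ami_id" \
    --repo-upgrade-on-boot "$mode" \
    --release-label emr-5.7.0 \
    --use-default-roles \
    --instance-type m5.xlarge \
    --instance-count 2
}

# Usage (requires AWS credentials):
#   create_cluster_with_repo_mode MyAmiId NONE
```

Defaulting to SECURITY when no mode is given keeps the safer behavior unless updates are disabled deliberately.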

Creating a custom AMI with an encrypted Amazon EBS root device volume

To encrypt the Amazon EBS root device volume of an Amazon Linux AMI for Amazon EMR, copy a snapshot image from an unencrypted AMI to an encrypted target. For information about creating encrypted EBS volumes, see Amazon EBS encryption in the Amazon EC2 User Guide. The source AMI for the snapshot can be the base Amazon Linux AMI, or you can copy a snapshot from an AMI derived from the base Amazon Linux AMI that you customized.

Note

Beginning with Amazon EMR version 5.24.0, you can use a security configuration option to encrypt EBS root device and storage volumes when you specify AWS KMS as your key provider. For more information, see Local disk encryption.

You can use an external key provider or an AWS KMS key to encrypt the EBS root volume. The service role that Amazon EMR uses (usually the default EMR_DefaultRole) must be allowed to encrypt and decrypt the volume, at minimum, for Amazon EMR to create a cluster with the AMI. When using AWS KMS as the key provider, this means that the following actions must be allowed:

The simplest way to do this is to add the role as a key user as described in the following tutorial. The following example policy statement is provided if you need to customize role policies.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EmrDiskEncryptionPolicy",
      "Effect": "Allow",
      "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:CreateGrant",
        "kms:GenerateDataKeyWithoutPlaintext",
        "kms:DescribeKey"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

Tutorial: Creating a custom AMI with an encrypted root device volume using a KMS key

The first step in this example is to find the ARN of a KMS key or create a new one. For more information about creating keys, see Creating keys in the AWS Key Management Service Developer Guide. The following procedure shows you how to add the default service role, EMR_DefaultRole, as a key user to the key policy. Write down the ARN value for the key as you create or edit it. You use the ARN later, when you create the AMI.

To add the Amazon EMR service role to the list of encryption key users with the console
  1. Sign in to the AWS Management Console and open the AWS Key Management Service (AWS KMS) console at https://console.aws.amazon.com/kms.

  2. To change the AWS Region, use the Region selector in the upper-right corner of the page.

  3. Choose the alias of the KMS key to use.

  4. On the key details page under Key Users, choose Add.

  5. In the Attach dialog box, choose the Amazon EMR service role. The name of the default role is EMR_DefaultRole.

  6. Choose Attach.

To create an encrypted AMI with the AWS CLI
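
A minimal sketch of the copy step follows, using `aws ec2 copy-image` with the `--encrypted` and `--kms-key-id` options. The function name `copy_ami_encrypted` and all argument values are hypothetical placeholders, and the `AWS` variable is only an illustrative override for the CLI executable. The copy is created in the Region the CLI is configured for, which should be the Region where you plan to create the cluster.

```shell
# Copies an unencrypted AMI to an encrypted target and prints the new image ID.
# $1: source AMI ID
# $2: Region of the source AMI, for example us-west-2
# $3: name for the encrypted AMI
# $4: ARN of the KMS key noted in the previous procedure
copy_ami_encrypted() {
  "${AWS:-aws}" ec2 copy-image \
    --source-image-id "$1" \
    --source-region "$2" \
    --name "$3" \
    --encrypted \
    --kms-key-id "$4" \
    --query 'ImageId' --output text
}

# Usage (requires AWS credentials):
#   copy_ami_encrypted MyAmiId us-west-2 MyEncryptedEMRAmi MyKmsKeyArn
```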

The output of the command provides the ID of the AMI that you created, which you can specify when you create a cluster. For more information, see Use a single custom AMI in an EMR cluster. You can also choose to customize this AMI by installing software and performing other configurations. For more information, see Creating a custom Amazon Linux AMI from a preconfigured instance.

Best practices and considerations

When you create a custom AMI for Amazon EMR, consider the following:

For more information, see Creating an Amazon EBS-backed Linux AMI in the Amazon EC2 User Guide.

