How can I use AWS DataSync to transfer data to or from a cross-account Amazon S3 location?


I want to use AWS DataSync to transfer data to or from a cross-account Amazon Simple Storage Service (Amazon S3) bucket.

Short description

To use DataSync for cross-account data transfer, do the following:

  1. Use the AWS Command Line Interface (AWS CLI) or an AWS SDK to create a cross-account Amazon S3 location in DataSync.
  2. Create a DataSync task that transfers data from the source bucket to the destination bucket.

Keep in mind the following limitations when you use DataSync to transfer data between buckets that are owned by different AWS accounts:

  • DataSync doesn't apply the bucket-owner-full-control access control list (ACL) when transferring data to a cross-account destination bucket. This leads to object ownership issues in the destination bucket.
  • For a cross-account S3 location, only a cross-account bucket in the same Region is supported. If you attempt to use an S3 location that's both cross-account and cross-Region, then you receive the GetBucketLocation or Unable to connect to S3 endpoint errors. So, if you create the task in the source account, then you must create the task in the same Region as the destination bucket. If you create the task in the destination account, then you must create the task in the same Region as the source bucket.
  • You can't use the cross-account pass role to access the cross-account S3 location.

To work around the preceding limitations, you can configure the DataSync task in the destination account to pull data from the source.

Resolution

Perform the required checks

Suppose that the source account has the cross-account source S3 bucket and the destination account has the destination S3 bucket and the DataSync task. Perform the following checks:

AWS Identity and Access Management (IAM) user/role: Check that the following IAM users or roles have the required permissions:

  • The user or role that you're using to create the cross-account S3 location
  • The role that you assigned to the S3 location

Source bucket policy: Be sure that the source bucket policy allows both IAM users/roles in the destination account to access the bucket. The following example policy grants both IAM users/roles access to the source bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::1111222233334444:role/datasync-config-role",
          "arn:aws:iam::1111222233334444:role/datasync-transfer-role"
        ]
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": [
        "arn:aws:s3:::example-source-bucket"
      ]
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::1111222233334444:role/datasync-config-role",
          "arn:aws:iam::1111222233334444:role/datasync-transfer-role"
        ]
      },
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:ListMultipartUploadParts",
        "s3:PutObjectTagging",
        "s3:GetObjectTagging",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::example-source-bucket/*"
      ]
    }
  ]
}

Be sure to replace the following values in the preceding policy:

  • example-source-bucket with the name of the source bucket
  • 1111222233334444 with the account ID of the destination account
  • datasync-config-role with the IAM role that's used for DataSync configuration (for example, to create the source S3 location and the task in DataSync)
    Note: You might also use an IAM user. This article considers the use of the IAM role.
  • datasync-transfer-role with the IAM role that's assigned when creating the source S3 location
    Note: DataSync uses this role to access the cross-account data.
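
If you manage several buckets or accounts, it can help to generate the policy from parameters instead of editing the JSON by hand. The following is a minimal sketch that reproduces the preceding example policy. The function name build_source_bucket_policy is hypothetical and isn't part of any AWS tool.

```python
import json


def build_source_bucket_policy(bucket, account_id, config_role, transfer_role):
    """Return the example source bucket policy from this article as a dict.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    principals = [
        f"arn:aws:iam::{account_id}:role/{config_role}",
        f"arn:aws:iam::{account_id}:role/{transfer_role}",
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions are granted on the bucket ARN itself.
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:ListBucketMultipartUploads",
                ],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Object-level actions are granted on every key in the bucket.
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                "Action": [
                    "s3:AbortMultipartUpload",
                    "s3:DeleteObject",
                    "s3:GetObject",
                    "s3:ListMultipartUploadParts",
                    "s3:PutObjectTagging",
                    "s3:GetObjectTagging",
                    "s3:PutObject",
                ],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder values copied from the example policy above.
    policy = build_source_bucket_policy(
        "example-source-bucket",
        "1111222233334444",
        "datasync-config-role",
        "datasync-transfer-role",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the source bucket's policy editor after you replace the placeholder values.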

Destination S3 location:

Use AWS CLI or SDK to create a cross-account source S3 location in DataSync

Note: Creating a cross-account S3 location is not supported in the AWS Management Console.

You can create the cross-account S3 location using either of the following methods:

  • Use a configuration JSON file.
  • Use the options in the AWS CLI command.

Use a configuration JSON file

1.    Create a JSON configuration file named input.template for the cross-account S3 location with the following parameters:

{
  "Subdirectory": "",
  "S3BucketArn": "arn:aws:s3:::[Source bucket]",
  "S3StorageClass": "STANDARD",
  "S3Config": {
    "BucketAccessRoleArn": "arn:aws:iam::1111222233334444:role/datasync-transfer-role"
  }
}

2.    Create an S3 location by running the following AWS CLI command:

aws datasync create-location-s3 --cli-input-json file://input.template --region example-DataSync-Region

Note: If you receive errors when running AWS CLI commands, make sure that you’re using the most recent version of the AWS CLI.

For more information, see create-location-s3.
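
As an illustration, the configuration file from step 1 can also be generated with a short script, which avoids hand-editing mistakes such as a missing quote. This is only a sketch: render_location_config is a hypothetical helper name, and the bucket and role values are the placeholders used in this article.

```python
import json
from pathlib import Path


def render_location_config(bucket, role_arn, subdirectory="", storage_class="STANDARD"):
    """Return the JSON text for the create-location-s3 input file.

    Hypothetical helper: it only mirrors the four parameters shown in step 1.
    """
    config = {
        "Subdirectory": subdirectory,
        "S3BucketArn": f"arn:aws:s3:::{bucket}",
        "S3StorageClass": storage_class,
        "S3Config": {"BucketAccessRoleArn": role_arn},
    }
    return json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    text = render_location_config(
        "example-source-bucket",
        "arn:aws:iam::1111222233334444:role/datasync-transfer-role",
    )
    # Write the file that step 2 passes to --cli-input-json.
    Path("input.template").write_text(text)
    print(text, end="")
```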

When the S3 location is created, you see the following output:

{
  "LocationArn": "arn:aws:datasync:example-Region:123456789012:location/loc-0f8xxxxxxxxe4821"
}

Note that 123456789012 is the account ID of the source account.
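
If you script these steps, you might want to pull the Region and account ID out of the returned LocationArn, for example to confirm that the location was created in the Region that you expected. The following sketch relies only on the general ARN layout (arn:partition:service:region:account-id:resource). The function name parse_location_arn is hypothetical.

```python
def parse_location_arn(arn):
    """Split a DataSync location ARN into its Region, account ID, and location ID.

    Hypothetical helper based on the general ARN layout:
    arn:partition:service:region:account-id:resource
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "datasync":
        raise ValueError(f"not a DataSync ARN: {arn!r}")
    resource_type, _, location_id = parts[5].partition("/")
    if resource_type != "location" or not location_id:
        raise ValueError(f"not a DataSync location ARN: {arn!r}")
    return {
        "region": parts[3],
        "account_id": parts[4],
        "location_id": location_id,
    }


if __name__ == "__main__":
    # Sample value copied from the example output above.
    print(parse_location_arn(
        "arn:aws:datasync:example-Region:123456789012:location/loc-0f8xxxxxxxxe4821"
    ))
```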

Use the options in the AWS CLI command

Run the following AWS CLI command with appropriate options:

aws datasync create-location-s3 --s3-bucket-arn arn:aws:s3:::example-source-bucket --s3-storage-class STANDARD --s3-config BucketAccessRoleArn="arn:aws:iam::1111222233334444:role/datasync-transfer-role" --region example-DataSync-Region

Be sure to replace the following values in the command:

  • example-source-bucket with the name of the source bucket
  • example-DataSync-Region with the Region where you create the DataSync task
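
The same location can be created through an AWS SDK instead of the AWS CLI. The following is a minimal sketch that assumes the AWS SDK for Python (Boto3), whose DataSync client provides create_location_s3. The wrapper name create_cross_account_s3_location is hypothetical, and the client is passed in as an argument so that the function doesn't assume how you load credentials.

```python
def create_cross_account_s3_location(datasync_client, bucket, role_arn,
                                     subdirectory="", storage_class="STANDARD"):
    """Create the S3 location and return its ARN.

    Hypothetical wrapper. datasync_client is assumed to be a Boto3 DataSync
    client created in the Region where you run the DataSync task.
    """
    response = datasync_client.create_location_s3(
        Subdirectory=subdirectory,
        S3BucketArn=f"arn:aws:s3:::{bucket}",
        S3StorageClass=storage_class,
        S3Config={"BucketAccessRoleArn": role_arn},
    )
    return response["LocationArn"]


# Example usage (assumes Boto3 is installed and credentials for the
# datasync-config-role are configured):
#
#   import boto3
#   client = boto3.client("datasync", region_name="example-DataSync-Region")
#   arn = create_cross_account_s3_location(
#       client,
#       "example-source-bucket",
#       "arn:aws:iam::1111222233334444:role/datasync-transfer-role",
#   )
```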

Create a DataSync task

Configure the DataSync task, and start the task from the DataSync console. For more information, see Starting your AWS DataSync task.
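
If you prefer to script this step instead of using the console, the task can also be created and started through an SDK. The following sketch assumes the AWS SDK for Python (Boto3), whose DataSync client provides create_task and start_task_execution. The function name, the task name, and the location ARNs are placeholders that you supply.

```python
def create_and_start_task(datasync_client, source_location_arn,
                          destination_location_arn, name):
    """Create a DataSync task between two locations and start one execution.

    Hypothetical wrapper. datasync_client is assumed to be a Boto3 DataSync
    client in the same Region as both locations.
    """
    task = datasync_client.create_task(
        SourceLocationArn=source_location_arn,
        DestinationLocationArn=destination_location_arn,
        Name=name,
    )
    execution = datasync_client.start_task_execution(TaskArn=task["TaskArn"])
    return task["TaskArn"], execution["TaskExecutionArn"]


# Example usage (assumes Boto3 is installed and credentials are configured):
#
#   import boto3
#   client = boto3.client("datasync", region_name="example-DataSync-Region")
#   task_arn, execution_arn = create_and_start_task(
#       client, source_location_arn, destination_location_arn, "cross-account-copy"
#   )
```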

Known errors and resolutions

Error: error creating DataSync Location S3: InvalidRequestException: Please provide a bucket in the xxx region where DataSync is currently used

If you receive this error, then confirm that the bucket and IAM policies include the following required permissions:

"Action": [
  "s3:GetBucketLocation",
  "s3:ListBucket",
  "s3:ListBucketMultipartUploads"
]

If you get this error when using a cross-account bucket, then be sure that the buckets are in the same Region as your DataSync task.
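
One way to compare Regions is to read the bucket's LocationConstraint (for example, with aws s3api get-bucket-location) and compare it with the Region of the DataSync task. The raw value needs normalizing first: Amazon S3 returns an empty value for buckets in us-east-1 and can return the legacy value EU for eu-west-1. The following sketch shows that comparison. The function names are hypothetical.

```python
def bucket_region(location_constraint):
    """Map a GetBucketLocation LocationConstraint value to a Region name."""
    if location_constraint in (None, ""):
        # Buckets in us-east-1 report an empty LocationConstraint.
        return "us-east-1"
    if location_constraint == "EU":
        # Legacy value that some older eu-west-1 buckets report.
        return "eu-west-1"
    return location_constraint


def same_region(location_constraint, datasync_region):
    """Return True if the bucket is in the Region where the DataSync task runs."""
    return bucket_region(location_constraint) == datasync_region


if __name__ == "__main__":
    print(same_region(None, "us-east-1"))
    print(same_region("eu-central-1", "us-east-1"))
```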

S3 object ownership issues

DataSync doesn't support using a cross-account bucket as the destination location. Therefore, you can't use the bucket-owner-full-control ACL. If the DataSync task runs from the source bucket account, then the objects uploaded to the destination bucket might have object ownership issues. To resolve this issue, if the destination bucket has no objects that use ACLs, consider disabling ACLs on the destination bucket. For more information, see Controlling ownership of objects and disabling ACLs for your bucket. Otherwise, it's a best practice to configure the DataSync task in the destination account to pull data from the source.
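
Disabling ACLs corresponds to setting the bucket's Object Ownership to BucketOwnerEnforced. The following is a minimal sketch of that change, assuming the AWS SDK for Python (Boto3), whose S3 client provides put_bucket_ownership_controls. The function name disable_bucket_acls and the bucket name are placeholders. Run it with credentials from the account that owns the destination bucket, and only after you confirm that nothing depends on ACLs for that bucket.

```python
def disable_bucket_acls(s3_client, bucket):
    """Set Object Ownership to BucketOwnerEnforced, which disables ACLs.

    Hypothetical wrapper. s3_client is assumed to be a Boto3 S3 client that
    uses credentials from the bucket owner's account.
    """
    ownership_controls = {"Rules": [{"ObjectOwnership": "BucketOwnerEnforced"}]}
    s3_client.put_bucket_ownership_controls(
        Bucket=bucket,
        OwnershipControls=ownership_controls,
    )
    return ownership_controls


# Example usage (assumes Boto3 is installed and credentials are configured):
#
#   import boto3
#   disable_bucket_acls(boto3.client("s3"), "example-destination-bucket")
```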


Related information

How to use AWS DataSync to migrate data between Amazon S3 buckets

AWS OFFICIAL
Updated a year ago