What are common issues that might occur when using native backup and restore in RDS for SQL Server?

I'm performing a native backup or restore for my Amazon Relational Database Service (Amazon RDS) for Microsoft SQL Server instance. What are common errors that I might encounter during this process?

Resolution

When using the RDS for SQL Server native backup and restore option, you might encounter validation errors. These errors are displayed immediately, and a task isn't created. The following are common errors and suggested fixes:

Error: Aborted the task because of a task failure or a concurrent RESTORE_DB request

This error occurs when the DB instance doesn't have enough free storage to restore a backup that was taken on Amazon Elastic Compute Cloud (Amazon EC2) or on-premises:

[2022-04-07 05:21:22.317] Aborted the task because of a task failure or a concurrent RESTORE_DB request.
[2022-04-07 05:21:22.437] Task has been aborted
[2022-04-07 05:21:22.440] There is not enough space on the disk to perform restore database operation.

To resolve this error, do the following:

Option 1:

1.    Run the following command on the source instance (EC2 or on-premises). This command returns the size of each database file, including the data file and the transaction log file. In the following example, replace [DB_NAME] with the name of your database.

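SELECT DB_NAME(database_id) AS DatabaseName,
       name AS Logical_Name,
       physical_name AS Physical_Name,
       -- size is stored in 8 KB pages; decimal division keeps small files from rounding down to 0
       CAST(size AS bigint) * 8 / 1024.0 / 1024.0 AS SizeGB
FROM sys.master_files
WHERE DB_NAME(database_id) = '[DB_NAME]';
GO

Database size = (data file size + transaction log file size)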

2.    Compare the source instance's database size with the available storage on the DB instance. Increase the available storage accordingly, and then restore the database.
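
The comparison in step 2 is simple arithmetic, and a minimal Python sketch of it might look like the following. It assumes that you copy the page counts from the size column of sys.master_files and the free storage figure from your own monitoring. The function names and the 10% headroom are hypothetical illustrations, not AWS requirements.

```python
from __future__ import annotations

PAGE_SIZE_KB = 8  # sys.master_files reports size in 8 KB pages


def pages_to_gb(pages: int) -> float:
    """Convert a page count from sys.master_files.size to gigabytes."""
    return pages * PAGE_SIZE_KB / 1024 / 1024


def fits_on_instance(file_pages: list[int], free_storage_gb: float,
                     headroom: float = 0.10) -> bool:
    """Return True if all database files, plus a safety margin, fit in free storage.

    headroom is a hypothetical safety margin chosen for this sketch.
    """
    required_gb = sum(pages_to_gb(p) for p in file_pages) * (1 + headroom)
    return required_gb <= free_storage_gb
```

For example, pass the page counts of the data file and the log file together with the free storage of the target DB instance. If the function returns False, increase the allocated storage before you retry the restore.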

Option 2:

Shrink the current DB log file on the source SQL Server to clear up unused space, and then perform the database backup.

Use the following command to shrink the log file.

DBCC SHRINKFILE (LogFileName, Desired Size in MB)

Error: Aborted the task because of a task failure or a concurrent RESTORE_DB request

The following error occurs when you have permission issues related to the AWS Identity and Access Management (IAM) role or policy associated with the SQLSERVER_BACKUP_RESTORE option:

[2020-12-15 08:56:22.143] Aborted the task because of a task failure or a concurrent RESTORE_DB request.
[2020-12-15 08:56:22.213] Task has been aborted
[2020-12-15 08:56:22.217] Access Denied

To resolve this error, do the following:

1.    Verify the restore query to make sure that the S3 bucket and the folder prefix are correct:

exec msdb.dbo.rds_restore_database
      @restore_db_name='database_name',
      @s3_arn_to_restore_from='arn:aws:s3:::bucket_name/file_name_and_extension';

2.    Verify that the IAM policy includes the following attributes:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::bucket_name"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObjectAttributes",
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListMultipartUploadParts",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::bucket_name/*"
    }
  ]
}

Note: Replace arn:aws:s3:::bucket_name with the ARN of your S3 bucket.

3.    Verify that the policy is correctly associated with the role given in the SQLSERVER_BACKUP_RESTORE option.

4.    Verify that the SQLSERVER_BACKUP_RESTORE option is in the option group associated with the DB instance, and that the following settings are correct:

S3 Bucket ARN
S3 folder prefix (Optional)

For more information, see How do I perform native backups of an Amazon RDS DB instance that's running SQL Server?
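
As a quick way to review the policy from step 2, the following Python sketch lists the required actions that a policy document doesn't grant on a bucket. It is only an illustration: the function name is hypothetical, it compares action and resource strings literally, and it doesn't evaluate wildcards, conditions, or Deny statements the way IAM does.

```python
import json

# Actions taken from the policy shown in step 2.
BUCKET_ACTIONS = {"s3:ListBucket", "s3:GetBucketLocation"}
OBJECT_ACTIONS = {
    "s3:GetObjectAttributes", "s3:GetObject", "s3:PutObject",
    "s3:ListMultipartUploadParts", "s3:AbortMultipartUpload",
}


def _as_list(value):
    """IAM allows a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def missing_actions(policy_json: str, bucket: str) -> set:
    """Return required actions that no Allow statement grants by exact match."""
    policy = json.loads(policy_json)
    needed = {(f"arn:aws:s3:::{bucket}", a) for a in BUCKET_ACTIONS}
    needed |= {(f"arn:aws:s3:::{bucket}/*", a) for a in OBJECT_ACTIONS}
    granted = set()
    for stmt in _as_list(policy["Statement"]):
        if stmt.get("Effect") != "Allow":
            continue
        for res in _as_list(stmt.get("Resource", [])):
            for act in _as_list(stmt.get("Action", [])):
                granted.add((res, act))
    return {f"{act} on {res}" for res, act in needed - granted}
```

An empty result means that every action from step 2 appears in the policy for the expected bucket ARN. Any entry in the result is a candidate cause of the Access Denied message.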

Error: Aborted the task because of a task failure or a concurrent RESTORE_DB request

This error is commonly associated with cross-account database restores.

Example:

  • Account A has an S3 bucket where the backup is stored.
  • Account B has an RDS DB instance where the restore needs to be done.

The error occurs when there are permission issues in the IAM role or policy associated with the SQLSERVER_BACKUP_RESTORE option, or in the bucket policy attached to the S3 bucket in the other account.

[2022-02-03 15:57:22.180] Aborted the task because of a task failure or a concurrent RESTORE_DB request.
[2022-02-03 15:57:22.260] Task has been aborted
[2022-02-03 15:57:22.263] Error making request with Error Code Forbidden and Http Status Code Forbidden. No further error information was returned by the service.

To resolve this error, do the following:

1.    Verify that the IAM policy in Account B (the account where the DB instance that you will be restoring to is located) includes the following attributes:

{
  "Version": "2012-10-17",
  "Statement":
    [
      {
        "Effect": "Allow",
        "Action":
          [
            "s3:ListBucket",
            "s3:GetBucketLocation"
          ],
        "Resource": "arn:aws:s3:::name_of_bucket_present_in_Account_A"
      },
      {
        "Effect": "Allow",
        "Action":
          [
            "s3:GetObject",
            "s3:PutObject",
            "s3:ListMultipartUploadParts",
            "s3:AbortMultipartUpload"
          ],
        "Resource": "arn:aws:s3:::name_of_bucket_present_in_Account_A/*"
      },
      {
        "Action": [
          "kms:DescribeKey",
          "kms:GenerateDataKey",
          "kms:Decrypt",
          "kms:Encrypt",
          "kms:ReEncryptTo",
          "kms:ReEncryptFrom"
        ],
        "Effect": "Allow",
        "Resource": [
          "arn:aws: PUT THE NAME OF THE KEY HERE",
          "arn:aws:s3:::name_of_bucket_present_in_Account_A/*"
        ]
        ]
      }
    ]
}

2.    Verify that the bucket policy associated with the S3 bucket in Account A includes the following attributes:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Permission to cross account",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::AWS-ACCOUNT-ID-OF-RDS:role/service-role/PUT-ROLE-NAME"
        ]
      },
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::PUT-BUCKET-NAME"
      ]
    },
    {
      "Sid": "Permission to cross account on object level",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::AWS-ACCOUNT-ID-OF-RDS:role/service-role/PUT-ROLE-NAME"
        ]
      },
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListMultipartUploadParts",
        "s3:AbortMultipartUpload"
      ],
      "Resource": [
        "arn:aws:s3:::PUT-BUCKET-NAME/*"
      ]
    }
  ]
}

Note: Replace AWS-ACCOUNT-ID-OF-RDS, PUT-ROLE-NAME, and PUT-BUCKET-NAME with your own values.
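
To check the Account A side of this setup, a small Python sketch such as the following can confirm that the bucket policy names the Account B role for a given action. The function name is hypothetical, and the check is a literal string comparison. It doesn't reproduce how Amazon S3 evaluates wildcards, conditions, or Deny statements.

```python
import json


def _as_list(value):
    """Policy elements may be a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def role_is_allowed(bucket_policy_json: str, role_arn: str, action: str) -> bool:
    """Return True if an Allow statement lists role_arn as a principal for action."""
    policy = json.loads(bucket_policy_json)
    for stmt in _as_list(policy["Statement"]):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        # Only the {"AWS": ...} principal form used in this article is handled.
        aws = _as_list(principal.get("AWS", [])) if isinstance(principal, dict) else []
        if role_arn in aws and action in _as_list(stmt.get("Action", [])):
            return True
    return False
```

If the function returns False for s3:GetObject or s3:ListBucket, compare the role ARN in the bucket policy character by character with the role attached to the SQLSERVER_BACKUP_RESTORE option in Account B.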

Error: Cannot find server certificate with thumbprint 'XXXXXX'

This error occurs when you try to restore a database with Transparent Data Encryption (TDE) from EC2 or on-premises to RDS for SQL Server:

[2022-06-15 11:55:22.280] Cannot find server certificate with thumbprint 'XXXXXXX'.
[2022-06-15 11:55:22.280] RESTORE FILELIST is terminating abnormally.
[2022-06-15 11:55:22.300] Aborted the task because of a task failure or a concurrent RESTORE_DB request.
[2022-06-15 11:55:22.333] Task has been aborted
[2022-06-15 11:55:22.337] Empty restore file list result retrieved.

This error indicates an attempt to restore a backup of a database that is encrypted using TDE to a SQL instance other than its original server. The TDE certificate of the original server must be imported to the destination server. For more information on importing server certificates and respective limitations, see Support for Transparent Data Encryption in SQL Server.

If you can't import the certificate, use one of the following two workarounds to prevent this error.

Option 1: The database backup is sourced from on-premises or an EC2 instance, but the target RDS for SQL Server instance is in a Multi-AZ deployment

1.    Create a backup of the source database with TDE turned on.

2.    Restore the backup as a new database on your on-premises server.

3.    Turn off TDE on the newly created database. Run the following command to turn off encryption on the database. In the following command, replace Databasename with the correct name for your database.

USE master;
GO
ALTER DATABASE [Databasename] SET ENCRYPTION OFF;
GO

After decryption completes, run the following command to drop the database encryption key (DEK). In the following command, replace Databasename with the correct name for your database.

USE [Databasename];
GO
DROP DATABASE ENCRYPTION KEY;
GO

4.    Create a native SQL Server backup and restore this new backup to the desired RDS instance. For more information, see How do I perform native backups of an Amazon RDS DB instance that's running SQL Server?

5.    Turn TDE back on in the new RDS database.

Option 2: The database is sourced from an RDS for SQL Server database that’s encrypted with TDE

1.    Use a snapshot from the source instance to restore the database into a new instance.

2.    Turn off TDE on the database created from the snapshot.

3.    Create a native SQL backup and restore this new backup to the desired RDS instance.

4.    Turn TDE back on in the new RDS database.
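
Both options turn off TDE before the native backup is taken, and decryption must finish before the backup is useful. SQL Server reports progress in the encryption_state column of the sys.dm_database_encryption_keys view. The following Python sketch only interprets a value that you read from that column. The helper name is hypothetical.

```python
# Meaning of encryption_state values in sys.dm_database_encryption_keys.
ENCRYPTION_STATES = {
    0: "no database encryption key present, no encryption",
    1: "unencrypted",
    2: "encryption in progress",
    3: "encrypted",
    4: "key change in progress",
    5: "decryption in progress",
    6: "protection change in progress",
}


def safe_to_drop_dek(encryption_state: int) -> bool:
    """Return True only when the database is fully decrypted (state 1)."""
    return encryption_state == 1


def describe_state(encryption_state: int) -> str:
    """Return a readable label, or 'unknown' for an unexpected value."""
    return ENCRYPTION_STATES.get(encryption_state, "unknown")
```

A value of 5 means that decryption is still running, so wait and check again before you drop the DEK and create the backup.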

Common errors observed for native backup on RDS for SQL Server

Error: Aborted the task because of a task failure or an overlap with your preferred backup window for RDS automated backup

The following error occurs when you have permission issues related to the IAM role or policy associated with the SQLSERVER_BACKUP_RESTORE option.

[2022-07-16 16:08:22.067] Task execution has started.
[2022-07-16 16:08:22.143] Aborted the task because of a task failure or an overlap with your preferred backup window for RDS automated backup.
[2022-07-16 16:08:22.147] Task has been aborted
[2022-07-16 16:08:22.150] Access Denied

To resolve this issue, do the following:

1.    Verify the backup query to make sure that the S3 bucket and the folder prefix are correct:

exec msdb.dbo.rds_backup_database
      @source_db_name='database_name',
      @s3_arn_to_backup_to='arn:aws:s3:::bucket_name/file_name_and_extension';

2.    Verify that the IAM policy includes the following attributes:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::bucket_name"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObjectAttributes",
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListMultipartUploadParts",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::bucket_name/*"
    }
  ]
}

Note: Replace arn:aws:s3:::bucket_name with the ARN of your S3 bucket.

3.    Verify that the policy is correctly associated with the role shown in the SQLSERVER_BACKUP_RESTORE option.

4.    Verify that the SQLSERVER_BACKUP_RESTORE option is in the option group associated with the DB instance, and that the following settings are correct:

S3 Bucket ARN
S3 folder prefix (Optional)

For more information, see How do I perform native backups of an Amazon RDS DB instance that's running SQL Server?

Error: Write on "XXX" failed, Unable to write chunks to S3, S3 write stream upload failed

This is a known issue with RDS for SQL Server. The database size is sometimes estimated incorrectly and causes the backup procedure to fail with the following error.

[2022-04-21 16:45:04.597] reviews_consumer/reviews_consumer_PostUpdate_042122.bak: Completed processing 100% of S3 chunks.
[2022-04-21 16:47:05.427] Write on "XXXX" failed: 995(The I/O operation has been aborted because of either a thread exit or an application request.) A nonrecoverable I/O error occurred on file "XXXX:" 995(The I/O operation has been aborted because of either a thread exit or an application request.). BACKUP DATABASE is terminating abnormally.
[2022-04-21 16:47:22.033] Unable to write chunks to S3 as S3 processing has been aborted.
[2022-04-21 16:47:22.040] reviews_consumer/reviews_consumer_PostUpdate_042122.bak: Aborting S3 upload, waiting for S3 workers to clean up and exit
[2022-04-21 16:47:22.053] Aborted the task because of a task failure or an overlap with your preferred backup window for RDS automated backup.
[2022-04-21 16:47:22.060] reviews_consumer/reviews_consumer_PostUpdate_042122.bak: Aborting S3 upload, waiting for S3 workers to clean up and exit
[2022-04-21 16:47:22.067] S3 write stream upload failed. Encountered an error while uploading an S3 chunk: Part number must be an integer between 1 and 10000, inclusive S3 write stream upload failed. Encountered an error while uploading an S3 chunk: Part number must be an integer between 1 and 10000, inclusive S3 write stream upload failed. Encountered an error while uploading an S3 chunk: Part number must be an integer between 1 and 10000, inclusive S3 write stream upload failed. Encountered an error while uploading an S3 chunk: Part number must be an integer between 1 and 10000, inclusive

The workaround for this error is to turn on backup compression. Compression reduces the size of the backup file, which in turn reduces the number of parts that must be uploaded to Amazon S3.

Run the following command to turn on backup compression:

exec rdsadmin..rds_set_configuration 'S3 backup compression', 'true';
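
The log above shows the upload failing because a part number fell outside the range 1 to 10000. The following Python sketch shows the arithmetic behind that limit. The part size that RDS uses isn't stated in this article, so part_bytes is a hypothetical input, and the function names are illustrative only.

```python
MAX_S3_PARTS = 10_000  # upper bound on part numbers, as quoted in the error message


def parts_needed(backup_bytes: int, part_bytes: int) -> int:
    """Return how many parts a file of backup_bytes needs at a given part size."""
    if part_bytes <= 0:
        raise ValueError("part_bytes must be positive")
    return -(-backup_bytes // part_bytes)  # integer ceiling division


def exceeds_part_limit(backup_bytes: int, part_bytes: int) -> bool:
    """Return True if the upload would need a part number above the limit."""
    return parts_needed(backup_bytes, part_bytes) > MAX_S3_PARTS
```

With a fixed part size, a smaller backup file needs fewer parts, which is why compression can keep the upload under the limit.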

3 Comments

Thanks for the detailed information. It helped me to troubleshoot an issue with one of my platinum customers who is using SQL Server workloads. In the end, the customer was able to resolve the issue. Thanks again.

AWS
replied 6 months ago

Our storage is set to 400GB with 1500GB threshold with auto-scaling but we get [2022-04-07 05:21:22.317] Aborted the task because of a task failure or a concurrent RESTORE_DB request. [2022-04-07 05:21:22.437] Task has been aborted [2022-04-07 05:21:22.440] There is not enough space on the disk to perform restore database operation.

So my question, does auto-scale not come into play with this type of action? I assume no, but I can't find any info.

db
replied 2 months ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied 2 months ago