
Copy Objects between S3 Buckets using AWS Lambda

In this article we will use the AWS Lambda service to copy objects/files from one S3 bucket to another. Below are the steps we will follow:

  1. Create two buckets in S3 for source and destination.
  2. Create an IAM role and policy that can read from the source bucket and write to the destination bucket.
  3. Create a Lambda function to copy the objects between buckets.
  4. Assign IAM role to the Lambda function.
  5. Create an S3 event trigger to execute the Lambda function. 

1. Create S3 Buckets:

Create two buckets in S3, one for the source and one for the destination. You can refer to my previous post for the steps to create an S3 bucket. I have created the buckets highlighted in blue below, which I will be using in this example:



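If you prefer to create the buckets with code rather than through the console, here is a minimal boto3 sketch. The bucket names are the ones used in this example; the region is an assumption, so change it to the one you use:

import boto3

# Region is an assumption for this sketch; change it to the region you use.
region = "us-east-2"
s3 = boto3.client("s3", region_name=region)

for bucket in ("source-bucket104", "dest-bucket104"):
    # Outside us-east-1, S3 requires an explicit LocationConstraint.
    s3.create_bucket(
        Bucket=bucket,
        CreateBucketConfiguration={"LocationConstraint": region},
    )
    print("Created bucket:", bucket)
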
2. Create IAM Policy and Role:

Now go to Services -> Security, Identity, & Compliance -> IAM (Identity and Access Management).
  • Click on Policies -> Create policy
  • Click on the JSON tab and enter the policy below. You will need to modify the two "Resource" lines with the ARNs of the source and destination buckets that you created.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::source-bucket104/*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::dest-bucket104/*"
        }
    ]
}




In the above policy, I am granting read permission ("s3:GetObject") on the source bucket "source-bucket104" and write permission ("s3:PutObject") on the destination bucket "dest-bucket104".
  • Click on Review policy.
  • Provide a name for your policy and click "Create Policy".
  • Now click on Roles -> Create role
  • Under "Select type of trusted entity", select "AWS Service"
  • Select the Lambda service (the role will be assumed by the Lambda function, so Lambda, not S3, must be the trusted service)
  • Click "Next: Permissions"
  • Now attach the policy that you created in the previous step to this role by selecting the checkbox next to the policy name, then click Next.

  • Click Next on the "Add tags" screen.
  • Provide a name for the role and click "Create role".
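If you would rather create the policy and role programmatically, the boto3 sketch below mirrors the console steps above. The policy and role names are example names I am using for this sketch (not from the walkthrough); the policy document and bucket ARNs match the ones shown earlier:

import json
import boto3

iam = boto3.client("iam")

# Same permissions as the JSON policy shown above.
policy_doc = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::source-bucket104/*"},
        {"Effect": "Allow", "Action": ["s3:PutObject"],
         "Resource": "arn:aws:s3:::dest-bucket104/*"},
    ],
}

# Trust policy: the Lambda service must be allowed to assume this role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow",
         "Principal": {"Service": "lambda.amazonaws.com"},
         "Action": "sts:AssumeRole"},
    ],
}

# "s3-copy-policy" and "s3-copy-lambda-role" are example names.
policy = iam.create_policy(
    PolicyName="s3-copy-policy",
    PolicyDocument=json.dumps(policy_doc),
)
iam.create_role(
    RoleName="s3-copy-lambda-role",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
iam.attach_role_policy(
    RoleName="s3-copy-lambda-role",
    PolicyArn=policy["Policy"]["Arn"],
)
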
3. Create Lambda Function:
  • Go to Services -> Compute -> Lambda
  • Click "Create function"
  • Provide a name for the function.
  • Select the runtime as "Python 3.8"
  • Under "Permissions", click on "Choose or create an execution role".
  • Select "Use an existing role".
  • From the drop-down list, choose the role that was created in the previous step.
  • Click "Create function"

  • Under the function code, enter the code below:
    import json
    import boto3

    # S3 client used to copy objects between buckets
    s31 = boto3.client("s3")

    def lambda_handler(event, context):
        # The destination bucket is hardcoded; the source bucket and object
        # key come from the S3 event that triggered this function.
        dest_bucket = 'dest-bucket104'
        src_bucket = event['Records'][0]['s3']['bucket']['name']
        filename = event['Records'][0]['s3']['object']['key']
        copy_source = {'Bucket': src_bucket, 'Key': filename}
        # Server-side copy from the source bucket to the destination bucket
        s31.copy_object(CopySource=copy_source, Bucket=dest_bucket, Key=filename)
        return {
            'statusCode': 200,
            'body': json.dumps('Files Copied')
        }
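
One caveat worth noting about the code above: S3 delivers the object key URL-encoded in the event, so a file name such as "my file.txt" arrives as "my+file.txt". If your file names can contain spaces or special characters, decoding the key first avoids copy failures. A minimal adjustment (an addition of mine, not part of the original function) would look like this:

    # At the top of the file, alongside the other imports:
    from urllib.parse import unquote_plus

    # Inside lambda_handler, decode the key before using it:
    filename = unquote_plus(event['Records'][0]['s3']['object']['key'])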



4. Create Trigger:
  • Click "Add Trigger"
  • In Trigger configuration, select S3.
  • Under "Bucket", select the source bucket.
  • Under "Event type", select "PUT"
  • Click "Add" 

  • Click Save
The Lambda function should now look like the screenshot below:


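If you prefer to wire up the trigger programmatically instead of through the console, the boto3 sketch below does the equivalent: it first grants S3 permission to invoke the function, then adds the PUT notification on the source bucket. The function name "s3-copy-function" is an example I am assuming for this sketch:

import boto3

lambda_client = boto3.client("lambda")
s3 = boto3.client("s3")

# "s3-copy-function" is an assumed example name; use your function's name.
function_name = "s3-copy-function"
function_arn = lambda_client.get_function(
    FunctionName=function_name)["Configuration"]["FunctionArn"]

# Allow the source bucket to invoke the Lambda function.
lambda_client.add_permission(
    FunctionName=function_name,
    StatementId="AllowS3Invoke",
    Action="lambda:InvokeFunction",
    Principal="s3.amazonaws.com",
    SourceArn="arn:aws:s3:::source-bucket104",
)

# Configure the PUT event notification on the source bucket.
s3.put_bucket_notification_configuration(
    Bucket="source-bucket104",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:Put"],
            }
        ]
    },
)
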
Test your Lambda function:

Test by uploading a file to the source S3 bucket. If all the configuration is correct, the file should be copied to the destination bucket.
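
A quick way to run this check is with a short boto3 script like the sketch below; "test.txt" is just an example file name, and the ten-second wait is an arbitrary allowance for the function to run:

import time
import boto3

s3 = boto3.client("s3")

# Upload an example file to the source bucket; this fires the PUT trigger.
s3.upload_file("test.txt", "source-bucket104", "test.txt")

# Give the Lambda function a few seconds to run, then verify the copy exists.
time.sleep(10)
response = s3.head_object(Bucket="dest-bucket104", Key="test.txt")
print("Copy found in destination bucket, last modified:", response["LastModified"])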
