4 Ways To Accelerate S3 Data Transfers

By: Daniel Marquard, Cloud Engineer III

Although Amazon does not guarantee bandwidth or latency with Amazon Web Services’ S3 offering, there are steps you can take to accelerate data transfers to and from S3 buckets.

S3 Transfer Acceleration

Transfer Acceleration is a premium S3 feature that provides upgraded bandwidth to and from S3 buckets. Starting at $0.04 per gigabyte on top of standard S3 data transfer rates, Transfer Acceleration moves data quickly to and from S3 buckets by routing it through AWS edge locations.

This can be useful for serving content to, and accepting content from, clients over the Internet, but it is less effective for speeding up the transfer of data between S3 buckets.
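As a rough sketch of how this might look from the AWS CLI, the script below enables Transfer Acceleration on a bucket and then uploads a file through the accelerate endpoint. The bucket name examplebucket and the file bigfile.bin are placeholders, and the run helper with its DRY_RUN switch is a hypothetical wrapper for illustration, not part of the AWS CLI. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: bucket and file names are placeholders.
BUCKET="${BUCKET:-examplebucket}"
FILE="${FILE:-bigfile.bin}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually call AWS

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

accelerated_upload() {
  # Turn Transfer Acceleration on for the bucket.
  run aws s3api put-bucket-accelerate-configuration \
    --bucket "$BUCKET" --accelerate-configuration Status=Enabled
  # Upload through the accelerate endpoint instead of the regular one.
  run aws s3 cp "$FILE" "s3://$BUCKET/" \
    --endpoint-url https://s3-accelerate.amazonaws.com
}

accelerated_upload
```

Passing the endpoint explicitly keeps the choice visible per command; acceleration charges apply only to transfers that go through that endpoint.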

Multithreading via AWS Command Line Interface

Using the AWS Command Line Interface (CLI), it is possible to run multiple cp, mv, or sync operations in parallel. There are multiple approaches to dividing the work, but in this example, we’ll copy files beginning with lowercase “a” through “n” with one command, and files beginning with lowercase “o” through “z” with a second command.

aws s3 cp s3://srcbucket/ s3://destbucket/ --recursive \
  --exclude "o*" --exclude "p*" --exclude "q*" --exclude "r*" \
  --exclude "s*" --exclude "t*" --exclude "u*" --exclude "v*" \
  --exclude "w*" --exclude "x*" --exclude "y*" --exclude "z*"

aws s3 cp s3://srcbucket/ s3://destbucket/ --recursive \
  --exclude "a*" --exclude "b*" --exclude "c*" --exclude "d*" \
  --exclude "e*" --exclude "f*" --exclude "g*" --exclude "h*" \
  --exclude "i*" --exclude "j*" --exclude "k*" --exclude "l*" \
  --exclude "m*" --exclude "n*"

When executed concurrently, the workload for this bucket-to-bucket transfer is split between two jobs, decreasing the time it takes to copy data from one S3 bucket to another.
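One way to launch both halves at the same time from a single shell is to start each copy as a background job and wait for both to finish. The sketch below assumes bash and reuses the srcbucket and destbucket names from the example; the copy_excluding helper and the DRY_RUN switch are hypothetical names introduced here. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: srcbucket/destbucket are the example bucket names.
SRC="s3://srcbucket/"
DEST="s3://destbucket/"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually call AWS

# Hypothetical helper: copy everything except keys that start with
# any of the letters passed as arguments.
copy_excluding() {
  local flags=() letter
  for letter in "$@"; do
    flags+=(--exclude "${letter}*")
  done
  if [ "$DRY_RUN" = "1" ]; then
    echo aws s3 cp "$SRC" "$DEST" --recursive "${flags[@]}"
  else
    aws s3 cp "$SRC" "$DEST" --recursive "${flags[@]}"
  fi
}

# Start both halves in the background so they run at the same time,
# then block until both have finished.
copy_excluding {o..z} &   # copies keys starting with a through n
copy_excluding {a..n} &   # copies keys starting with o through z
wait
```

Keys that begin with anything other than a lowercase letter (digits, uppercase letters, punctuation) match neither exclusion list, so both jobs will copy them. A real split would need to assign those keys to one job only.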

AWS Import/Export

For transfers exceeding 1 TB, Amazon Web Services’ Snowball offering can be used. Snowball is a petabyte-scale data transfer solution for securely transferring large amounts of data in and out of AWS.
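To judge whether a network transfer is practical at all, it can help to estimate how long it would take. The sketch below is plain arithmetic, not an AWS tool: transfer_hours is a hypothetical helper, the 100 Mbps link speed is an illustrative assumption, and the estimate assumes the link is fully used with decimal units.

```shell
#!/usr/bin/env bash
# Hypothetical helper: whole hours needed to move SIZE_TB terabytes over a
# link of LINK_MBPS megabits per second, assuming the link is fully used
# (decimal units: 1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second).
transfer_hours() {
  local size_tb="$1" link_mbps="$2"
  # bits = TB * 8 * 10^12, so seconds = TB * 8 * 10^6 / Mbps
  local seconds=$(( size_tb * 8 * 1000000 / link_mbps ))
  echo $(( seconds / 3600 ))
}

echo "1 TB at 100 Mbps:  about $(transfer_hours 1 100) hours"
echo "50 TB at 100 Mbps: about $(transfer_hours 50 100) hours"
```

Real throughput is usually lower than the nominal link speed, so these figures are best read as lower bounds.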

S3DistCp with Amazon Elastic MapReduce

When transfer time is the top priority in moving data between S3 buckets, S3DistCp can be used in conjunction with Amazon Web Services’ Elastic MapReduce (EMR). This approach requires running an EMR cluster and therefore comes at an additional cost, but it promises high speed and fault tolerance.
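As a minimal sketch, an S3DistCp copy can be submitted as a step to an EMR cluster that is already running. The cluster ID below is a placeholder, the bucket names are the ones used earlier in this post, and the use of command-runner.jar assumes an EMR release of 4.x or later. The submit_s3distcp_step function and DRY_RUN switch are hypothetical names; by default the script only prints the command.

```shell
#!/usr/bin/env bash
# Sketch only: CLUSTER_ID is a placeholder for an existing EMR cluster.
CLUSTER_ID="${CLUSTER_ID:-j-XXXXXXXXXXXXX}"
SRC="s3://srcbucket/"
DEST="s3://destbucket/"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually call AWS

submit_s3distcp_step() {
  # One EMR step that runs s3-dist-cp from the source bucket to the destination.
  local step="Type=CUSTOM_JAR,Name=S3DistCp,Jar=command-runner.jar,Args=[s3-dist-cp,--src=${SRC},--dest=${DEST}]"
  if [ "$DRY_RUN" = "1" ]; then
    echo aws emr add-steps --cluster-id "$CLUSTER_ID" --steps "$step"
  else
    aws emr add-steps --cluster-id "$CLUSTER_ID" --steps "$step"
  fi
}

submit_s3distcp_step
```

The copy itself runs on the cluster nodes, so throughput depends on the size of the cluster rather than on the machine that submits the step.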
