AWS S3: downloading a folder

In this article I will illustrate how to download all the files inside a directory in the AWS S3 object store. Amazon S3 does not have a folder hierarchy like the one you would see in a filesystem such as HDFS or NTFS; it has a flat structure. However, for the sake of organizational simplicity, Amazon S3 supports the folder concept as a means of grouping objects. It does this by using a shared name prefix for objects.
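To make the prefix idea concrete, here is a small sketch in plain Java with no AWS calls. The key names and the keysUnder helper are made up for illustration; the point is only that a "folder" is the set of keys that start with the same string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: models an S3 bucket as a flat list of object keys.
final class PrefixDemo {

    // Returns the keys that start with the given prefix, in their original order.
    static List<String> keysUnder(List<String> keys, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String key : keys) {
            if (key.startsWith(prefix)) {
                matches.add(key);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical keys; the slashes are ordinary characters in the key name.
        List<String> keys = Arrays.asList(
                "logs/2019/app.log",
                "logs/2019/error.log",
                "reports/summary.csv");
        System.out.println(keysUnder(keys, "logs/2019/"));
    }
}
```

Because the match is a plain string comparison, a prefix written without the trailing slash ("logs/2019") would also match a key such as "logs/2019-archive/x". Including the delimiter in the prefix avoids that.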

To download a set of files that share a common key prefix from Amazon S3, we use the downloadDirectory method of the TransferManager class (AWS SDK for Java 1.x). The method takes the Amazon S3 bucket name containing the objects you want to download, the object prefix shared by all of the objects, and a File object that represents the directory to download the files into on your local system. If the named directory doesn’t exist yet, it will be created.

Below is an example:


import java.io.File;
import com.amazonaws.AmazonServiceException;
import com.amazonaws.services.s3.transfer.MultipleFileDownload;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.wdc.ddp.objectstore.ObjectStoreApi;
import com.wdc.ddp.objectstore.ObjectStoreFactory;
import com.wdc.ddp.objectstore.ObjectStoreType;

public class CopyCommonKeyPrefix {

    public static void main(String[] args) {

        final String bucket_name = "Bucket_Name";
        final String key_prefix_upto_directory = "key_prefix";
        File local_directory_path = new File("local_path");

        // Build a TransferManager on top of the S3 client returned by the factory.
        final ObjectStoreApi objectStoreApiAws = ObjectStoreFactory.getDataExtractor(ObjectStoreType.AWS);
        TransferManager transferManagerAws = TransferManagerBuilder.standard()
                .withS3Client(objectStoreApiAws.getConnection()).build();
        try {
            // Start an asynchronous download of every object under the key prefix.
            MultipleFileDownload xferAws = transferManagerAws.downloadDirectory(bucket_name, key_prefix_upto_directory,
                    local_directory_path);

            // XferMgrProgress is a helper class that is not listed in this article.
            XferMgrProgress.showTransferProgress(xferAws);
            XferMgrProgress.waitForCompletion(xferAws);

        } catch (AmazonServiceException e) {
            System.err.println(e.getErrorMessage());
            System.exit(1);
        }
        // Release the TransferManager's threads and the underlying S3 client.
        transferManagerAws.shutdownNow();

    }

}

The waitForCompletion method blocks the calling thread until the transfer is complete. The shutdownNow method forcefully shuts down the TransferManager instance: transfers that are still executing will not be allowed to finish. By default it also shuts down the underlying Amazon S3 client.
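The same wait-then-shut-down ordering can be illustrated with a plain java.util.concurrent executor. This is only an analogy, not TransferManager code: the sleeping task stands in for a transfer, and the class and method names are invented for the sketch.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Analogy only: a thread pool stands in for TransferManager.
final class WaitThenShutdownDemo {

    // Runs a fake "transfer", blocks until it finishes, then shuts the pool down.
    static String runTransfer() throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> transfer = pool.submit(() -> {
                Thread.sleep(200); // pretend to copy bytes
                return "done";
            });
            // Blocks the calling thread until the task completes.
            return transfer.get();
        } finally {
            // Safe here because the work has finished (or failed);
            // calling it earlier would interrupt the running task.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runTransfer());
    }
}
```

The ordering is the point of the sketch: block on completion first, and only then force the shutdown, so that no work is cut off halfway.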

Let's write the dependent classes used in the class above.

The ObjectStoreFactory class returns the object store connection, which is useful if we have more than one object store implementation, such as S3 and Ceph.


import java.security.InvalidParameterException;

public class ObjectStoreFactory {

    // Returns the implementation that matches the requested object store type.
    public static ObjectStoreApi getDataExtractor(ObjectStoreType objectStoreType) {

        ObjectStoreApi objectStoreApi = null;
        if (objectStoreType.equals(ObjectStoreType.AWS)) {
            objectStoreApi = new AmazonAwsApi();
        } else if (objectStoreType.equals(ObjectStoreType.CEPH)) {
            // CephApi is not listed in this article.
            objectStoreApi = new CephApi();
        } else {
            throw new InvalidParameterException();
        }
        return objectStoreApi;
    }

}

ObjectStoreApi interface


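import java.io.Serializable;
import com.amazonaws.services.s3.AmazonS3;

public interface ObjectStoreApi extends Serializable {

    // Returns the S3 client used to talk to the object store.
    AmazonS3 getConnection();

}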

The AmazonAwsApi implementation class has a getConnection method that returns an AmazonS3 client. Here we use the CipherText class to decrypt the encrypted accessKey and secretKey loaded from the configuration file.


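import com.amazonaws.ClientConfiguration;
import com.amazonaws.Protocol;
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;

public class AmazonAwsApi implements ObjectStoreApi {

    private static final long serialVersionUID = 433434L;
    public static AmazonS3 conn;

    // The client is created once, when the class is first loaded.
    static {
        ConfigFile aws_credential = new ConfigFile(Constants.OBJECT_STORE_CREDENTIALS_CONFIG, FileType.property);
        String access_key_amazon = CipherText.decrypt(aws_credential.getString("accessKey.amazon"));
        String secret_key_amazon = CipherText.decrypt(aws_credential.getString("secretKey.amazon"));
        AWSCredentials credentials = new BasicAWSCredentials(access_key_amazon, secret_key_amazon);
        ClientConfiguration clientConfig = new ClientConfiguration();
        // Note: Protocol.HTTP sends requests unencrypted; the SDK default is HTTPS.
        clientConfig.setProtocol(Protocol.HTTP);
        conn = new AmazonS3Client(credentials, clientConfig);
    }

    public AmazonS3 getConnection() {
        return conn;
    }
}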
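
The ConfigFile class used above is not listed in this article. As a rough stand-in, the sketch below reads the same two property names with java.util.Properties. The class name, the load helper and the sample values are assumptions made for illustration, and in the real flow the values would still be passed through CipherText.decrypt.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical stand-in for the article's ConfigFile class.
final class CredentialConfigDemo {

    // Parses key=value pairs from any Reader (a FileReader in real use).
    static Properties load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder values; a real file would hold the encrypted strings.
        String sample = "accessKey.amazon=ENCRYPTED_ACCESS_KEY\n"
                + "secretKey.amazon=ENCRYPTED_SECRET_KEY\n";
        Properties props = load(new StringReader(sample));
        System.out.println(props.getProperty("accessKey.amazon"));
    }
}
```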

The ObjectStoreType enum lists all the supported object stores.


public enum ObjectStoreType {
    AWS, CEPH
}

Below is the CipherText class, which is used to decrypt the stored credentials.


import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;
import org.apache.commons.codec.binary.Base64;

/**
 * A simple text cipher to encrypt/decrypt a string.
 */
public class CipherText {
    private static byte[] linebreak = {};
    // Placeholder: replace with a real key. AES keys must be 16, 24 or 32 bytes long.
    private static String secret = "secret_hash_key";
    private static SecretKey key;
    private static Cipher cipher;
    private static Base64 coder;

    static {
        try {
            key = new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "AES");
            // ECB does not hide repeated plaintext blocks; consider AES/GCM for real credentials.
            cipher = Cipher.getInstance("AES/ECB/PKCS5Padding", "SunJCE");
            // URL-safe Base64 with an empty line separator.
            coder = new Base64(32, linebreak, true);
        } catch (GeneralSecurityException e) {
            // Fail at class-load time rather than leaving the fields null.
            throw new ExceptionInInitializerError(e);
        }
    }

    public static synchronized String encrypt(String plainText) {
        try {
            cipher.init(Cipher.ENCRYPT_MODE, key);
            byte[] cipherText = cipher.doFinal(plainText.getBytes(StandardCharsets.UTF_8));
            return new String(coder.encode(cipherText), StandardCharsets.US_ASCII);
        } catch (GeneralSecurityException e) {
            // Fail fast instead of continuing with a null result.
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    public static synchronized String decrypt(String codedText) {
        try {
            byte[] encrypted = coder.decode(codedText.getBytes(StandardCharsets.US_ASCII));
            cipher.init(Cipher.DECRYPT_MODE, key);
            byte[] decrypted = cipher.doFinal(encrypted);
            return new String(decrypted, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            // Fail fast instead of continuing with a null result.
            throw new IllegalStateException("Decryption failed", e);
        }
    }

}
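
As a quick check of the encrypt/decrypt idea, here is a self-contained round trip that uses only the JDK. It is a sketch rather than the class above: it assumes a made-up 16-byte key, uses java.util.Base64 in place of commons-codec, and creates a new Cipher per call instead of sharing a static one.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Illustration only: AES round trip with a placeholder key.
final class CipherRoundTripDemo {

    // Exactly 16 bytes, so it is a valid AES-128 key. Do not use this value for real data.
    private static final byte[] KEY = "0123456789abcdef".getBytes(StandardCharsets.UTF_8);

    // Runs one encrypt or decrypt pass, depending on the mode constant.
    private static byte[] run(int mode, byte[] input) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(mode, new SecretKeySpec(KEY, "AES"));
        return cipher.doFinal(input);
    }

    static String encrypt(String plainText) throws GeneralSecurityException {
        byte[] encrypted = run(Cipher.ENCRYPT_MODE, plainText.getBytes(StandardCharsets.UTF_8));
        return Base64.getUrlEncoder().encodeToString(encrypted);
    }

    static String decrypt(String codedText) throws GeneralSecurityException {
        byte[] encrypted = Base64.getUrlDecoder().decode(codedText);
        return new String(run(Cipher.DECRYPT_MODE, encrypted), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        String coded = encrypt("my-access-key");
        System.out.println(decrypt(coded).equals("my-access-key"));
    }
}
```

ECB mode is kept here only to mirror the class above. It encrypts equal blocks to equal ciphertext, so an authenticated mode such as AES/GCM is usually a better choice for real credentials.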