2013-12-22

Amazon Glacier in Python with Boto (Part 2: Checking Inventory)

Step 1: Request the inventory (the list of contents) of a vault.

from boto.glacier.layer1 import Layer1
import sys

access_key_id = "YOUR_ACCESS_KEY_ID"
secret_key = "YOUR_SECRET_KEY"

if len(sys.argv) < 2:
    print("Usage: python request_inventory.py VAULT_NAME")
    sys.exit(1)

target_vault_name = sys.argv[1]
 
glacier_layer1 = Layer1(aws_access_key_id=access_key_id, aws_secret_access_key=secret_key)
 
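print("operation starting...")

# initiate_job returns a response dict, not a bare ID; its "JobId" entry is the job ID that Step 3 needs.
response = glacier_layer1.initiate_job(target_vault_name, {"Description": "inventory-job", "Type": "inventory-retrieval", "Format": "JSON"})

print("inventory job response: %s" % (response,))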
 
print("Operation completed.")
The output may look like this:
{'Marker': None, u'RequestId': 'A_LONG_STRING', 'JobList': [{'CompletionDate': None, 'VaultARN': 'arn:aws:glacier:us-east-1:YOUR_AWS_ACCOUNT_ID:vaults/YOUR_VAULT_NAME', 'RetrievalByteRange': None, 'SHA256TreeHash': None, 'Completed': False, 'InventorySizeInBytes': None, 'JobId': 'A_LONG_STRING', 'ArchiveId': None, 'JobDescription': 'inventory-job', 'StatusMessage': None, 'StatusCode': 'InProgress', 'Action': 'InventoryRetrieval', 'ArchiveSHA256TreeHash': None, 'CreationDate': '2013-12-22T06:25:59.528Z', 'SNSTopic': None, 'ArchiveSizeInBytes': None}]}
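
Step 3 needs the job ID, so it can help to pull it out of the response instead of copying it by hand. The sketch below assumes the response is a plain dict shaped like the output above (a JobList of job entries). The helper name pending_job_ids is hypothetical, and the fallback for a top-level JobId key is an assumption about what a bare initiate_job response may contain.

```python
def pending_job_ids(response, action="InventoryRetrieval"):
    """Return the IDs of unfinished jobs of the given action type."""
    # Assumption: a response describing a single job carries a top-level "JobId".
    if "JobId" in response:
        return [response["JobId"]]
    # Otherwise expect a "JobList" of job descriptions, as in the output above.
    return [
        job["JobId"]
        for job in response.get("JobList", [])
        if job.get("Action") == action and not job.get("Completed")
    ]


# Made-up sample data in the same shape as the output above.
sample = {
    "JobList": [
        {"JobId": "JOB_A", "Action": "InventoryRetrieval", "Completed": False},
        {"JobId": "JOB_B", "Action": "ArchiveRetrieval", "Completed": False},
        {"JobId": "JOB_C", "Action": "InventoryRetrieval", "Completed": True},
    ]
}
print(pending_job_ids(sample))  # prints ['JOB_A']
```

Only JOB_A is returned: JOB_B is a different kind of job and JOB_C has already completed.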

Step 2 (optional): Make sure that the job is being processed.

The code below checks the status of all uncompleted jobs in a vault.

from boto.glacier.layer1 import Layer1
import sys

access_key_id = "YOUR_ACCESS_KEY_ID"
secret_key = "YOUR_SECRET_KEY"

if len(sys.argv) < 2:
    print("Usage: python check_jobs_in_a_vault.py VAULT_NAME")
    sys.exit(1)

target_vault_name = sys.argv[1]
 
glacier_layer1 = Layer1(aws_access_key_id=access_key_id, aws_secret_access_key=secret_key)
 
print("operation starting...")

print(glacier_layer1.list_jobs(target_vault_name, completed=False))  # only uncompleted jobs are listed

print("operation finished")
The result may look like this:
{'Marker': None, u'RequestId': 'A_LONG_STRING', 'JobList': [{'CompletionDate': None, 'VaultARN': 'arn:aws:glacier:us-east-1:YOUR_AWS_ACCOUNT_ID:vaults/YOUR_VAULT_NAME', 'RetrievalByteRange': None, 'SHA256TreeHash': None, 'Completed': False, 'InventorySizeInBytes': None, 'JobId': 'A_LONG_STRING', 'ArchiveId': None, 'JobDescription': 'inventory-job', 'StatusMessage': None, 'StatusCode': 'InProgress', 'Action': 'InventoryRetrieval', 'ArchiveSHA256TreeHash': None, 'CreationDate': 'A_TIME_STAMP', 'SNSTopic': None, 'ArchiveSizeInBytes': None}]}
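Note that the "StatusCode" of the job is InProgress. After a few hours, the job may be done. A finished job no longer matches completed=False, so list jobs with completed=True (or without the filter) to see it. Its status will then look like this: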
{'Marker': None, u'RequestId': 'A_LONG_STRING', 'JobList': [{'CompletionDate': 'A_TIME_STAMP', 'VaultARN': 'arn:aws:glacier:us-east-1:YOUR_AWS_ACCOUNT_ID:vaults/YOUR_VAULT_NAME', 'RetrievalByteRange': None, 'SHA256TreeHash': None, 'Completed': True, 'InventorySizeInBytes': SIZE_IN_BYTES, 'JobId': 'A_LONG_STRING', 'ArchiveId': None, 'JobDescription': 'inventory-job', 'StatusMessage': 'Succeeded', 'StatusCode': 'Succeeded', 'Action': 'InventoryRetrieval', 'ArchiveSHA256TreeHash': None, 'CreationDate': 'A_TIME_STAMP', 'SNSTopic': None, 'ArchiveSizeInBytes': None}]}
Note that the "StatusCode" is now Succeeded, which means the job output is ready for retrieval.
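
Rather than rerunning the script by hand, the status check can be put in a loop. This is a minimal sketch, not a boto feature: wait_for_job, poll_seconds, and max_polls are hypothetical names, and it assumes that Layer1.describe_job returns a dict with the same Completed and StatusCode fields as the job entries shown above.

```python
import time


def wait_for_job(glacier, vault_name, job_id,
                 poll_seconds=900, max_polls=48, sleep=time.sleep):
    """Poll describe_job until the job completes; return its description."""
    for _ in range(max_polls):
        # Assumption: the result has "Completed" and "StatusCode" keys.
        job = glacier.describe_job(vault_name, job_id)
        if job["Completed"]:
            if job["StatusCode"] != "Succeeded":
                raise RuntimeError(
                    "job %s ended with status %s" % (job_id, job["StatusCode"]))
            return job
        sleep(poll_seconds)  # wait before asking again
    raise RuntimeError(
        "job %s still not complete after %d polls" % (job_id, max_polls))


# Usage with the Layer1 object from the scripts above:
# job = wait_for_job(glacier_layer1, target_vault_name, job_id)
```

The defaults poll every 15 minutes for up to 12 hours. Since inventory jobs take hours, a short interval would only add requests without finishing any sooner.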

Step 3: Fetch the result of this job (i.e., an inventory request).

from boto.glacier.layer1 import Layer1
import sys

access_key_id = "YOUR_ACCESS_KEY_ID"
secret_key = "YOUR_SECRET_KEY"

if len(sys.argv) < 3:
    print("Usage: python fetch_inventory.py VAULT_NAME JOB_ID")
    sys.exit(1)

target_vault_name, job_id = sys.argv[1:3]

glacier_layer1 = Layer1(aws_access_key_id=access_key_id, aws_secret_access_key=secret_key)

print("operation starting...")

print(glacier_layer1.get_job_output(target_vault_name, job_id))
 
print("Operation completed.")
The result looks like this:
{u'TreeHash': None, 'VaultARN': 'arn:aws:glacier:us-east-1:YOUR_AWS_ACCOUNT_ID:vaults/MY_VAULT_NAME', u'ContentType': 'application/json', u'RequestId': 'A_LONG_STRING', u'ContentRange': None, 'InventoryDate': 'A_TIME_STAMP', 'ArchiveList': [{'ArchiveId': 'A_VERY_LONG_STRING', 'ArchiveDescription': 'THE_ARCHIVE_NAME', 'CreationDate': 'A_TIME_STAMP', 'SHA256TreeHash': 'A_LONG_HASH_VALUE', 'Size': A_BIG_NUMBER}]}

As you can see, I have only one archive in this vault.
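
For a vault with more archives, the raw dict gets hard to read. Here is a minimal sketch that summarizes it, assuming the inventory is a dict with the ArchiveList layout shown above. The function name summarize_inventory and the sample values are made up for illustration.

```python
def summarize_inventory(inventory):
    """Return (archive_count, total_bytes, rows) for an inventory dict."""
    archives = inventory.get("ArchiveList", [])
    # One row per archive: (ArchiveId, ArchiveDescription, Size).
    rows = [(a["ArchiveId"], a["ArchiveDescription"], a["Size"])
            for a in archives]
    total_bytes = sum(size for _, _, size in rows)
    return len(rows), total_bytes, rows


# Hypothetical inventory in the same shape as the result above.
inventory = {
    "InventoryDate": "A_TIME_STAMP",
    "ArchiveList": [
        {"ArchiveId": "ID_1", "ArchiveDescription": "backup-1.tar",
         "CreationDate": "A_TIME_STAMP", "SHA256TreeHash": "HASH_1",
         "Size": 1024},
        {"ArchiveId": "ID_2", "ArchiveDescription": "backup-2.tar",
         "CreationDate": "A_TIME_STAMP", "SHA256TreeHash": "HASH_2",
         "Size": 2048},
    ],
}

count, total_bytes, rows = summarize_inventory(inventory)
print("%d archive(s), %d bytes in total" % (count, total_bytes))
for archive_id, description, size in rows:
    print("%s  %d bytes  (id: %s)" % (description, size, archive_id))
```

The ArchiveId column is worth keeping, because it is what identifies an archive in later operations on it.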

Next time, I will talk about how to delete archives in a vault.
