S3 Objects

All functionality for dealing with AWS S3 objects.

Check

Check for objects in an S3 bucket.

s3_tools.objects.check.object_exists(bucket: str, key: str, aws_auth: Dict[str, str] = {}) → bool

Check if an object exists for a given bucket and key.

Parameters
  • bucket (str) – Bucket name where the object is stored.

  • key (str) – Full key for the object.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

True if the object exists, otherwise False.

Return type

bool

Raises

Exception – Raised for any problem with the request.

Example

>>> object_exists("myBucket", "myFiles/music.mp3")
True

Delete

Delete objects from an S3 bucket.

s3_tools.objects.delete.delete_keys(bucket: str, keys: List[str], dry_run: bool = True, aws_auth: Dict[str, str] = {}) → None

Delete all objects in the keys list from the S3 bucket.

Parameters
  • bucket (str) – AWS S3 bucket where the objects are stored.

  • keys (List[str]) – List of object keys.

  • dry_run (bool) – If True, no objects are deleted (dry run), by default True.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Examples

>>> delete_keys(
...     bucket="myBucket",
...     keys=[
...         "myData/myMusic/awesome.mp3",
...         "myData/myDocs/paper.doc"
...     ],
...     dry_run=False
... )
s3_tools.objects.delete.delete_object(bucket: str, key: str, aws_auth: Dict[str, str] = {}) → None

Delete a given object from an S3 bucket.

Parameters
  • bucket (str) – AWS S3 bucket where the object is stored.

  • key (str) – Key for the object that will be deleted.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Examples

>>> delete_object(bucket="myBucket", key="myData/myFile.data")
s3_tools.objects.delete.delete_prefix(bucket: str, prefix: str, dry_run: bool = True, aws_auth: Dict[str, str] = {}) → Optional[List[str]]

Delete all objects under the given prefix from the S3 bucket.

Parameters
  • bucket (str) – AWS S3 bucket where the objects are stored.

  • prefix (str) – Prefix where the objects are under.

  • dry_run (bool) – If True, no objects are deleted (dry run), by default True.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

List of S3 keys that would be deleted if dry_run is True, otherwise None.

Return type

Optional[List[str]]

Examples

>>> delete_prefix(bucket="myBucket", prefix="myData")
[
    "myData/myMusic/awesome.mp3",
    "myData/myDocs/paper.doc"
]
>>> delete_prefix(bucket="myBucket", prefix="myData", dry_run=False)

Download

Download S3 objects to files.

s3_tools.objects.download.download_key_to_file(bucket: str, key: str, local_filename: str, progress=None, task_id: int = -1, aws_auth: Dict[str, str] = {}) → bool

Retrieve one object from an AWS S3 bucket and store it on local disk.

Parameters
  • bucket (str) – AWS S3 bucket where the object is stored.

  • key (str) – Key where the object is stored.

  • local_filename (str) – Local file where the data will be downloaded to.

  • progress (rich.Progress) – Instance of a rich Progress bar, by default None.

  • task_id (int) – Task ID on the progress bar to be updated, by default -1.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

True if the local file exists after the download.

Return type

bool

Examples

>>> download_key_to_file(
...     bucket="myBucket",
...     key="myData/myFile.data",
...     local_filename="theFile.data"
... )
True
s3_tools.objects.download.download_keys_to_files(bucket: str, keys_paths: List[Tuple[str, str]], threads: int = 5, show_progress: bool = False, aws_auth: Dict[str, str] = {}) → List[Tuple[str, str, Any]]

Download list of objects to specific paths.

Parameters
  • bucket (str) – AWS S3 bucket where the objects are stored.

  • keys_paths (List[Tuple[str, str]]) – List of tuples, each with the S3 key to be downloaded and the local path where it will be stored, e.g. [(“S3_Key”, “Local_Path”), (“S3_Key”, “Local_Path”)].

  • threads (int) – Number of parallel downloads, by default 5.

  • show_progress (bool) – Show a progress bar on the console, by default False. (Requires the [progress] extra to be installed.)

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

A list of tuples with the “S3_Key”, the “Local_Path”, and the result of the download: True on success, or the error message on failure. Note that the output list may not preserve the input order.

Return type

List[Tuple[str, str, Any]]

Examples

>>> download_keys_to_files(
...     bucket="myBucket",
...     keys_paths=[
...         ("myData/myFile.data", "MyFiles/myFile.data"),
...         ("myData/myMusic/awesome.mp3", "MyFiles/myMusic/awesome.mp3"),
...         ("myData/myDocs/paper.doc", "MyFiles/myDocs/paper.doc")
...     ]
... )
[
    ("myData/myMusic/awesome.mp3", "MyFiles/myMusic/awesome.mp3", True),
    ("myData/myDocs/paper.doc", "MyFiles/myDocs/paper.doc", True),
    ("myData/myFile.data", "MyFiles/myFile.data", True)
]
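Because downloads run in parallel, the result list may not preserve the input order, so outcomes should be matched by key rather than position. A minimal post-processing sketch (the tuples below are hypothetical sample values, including a made-up error message):

```python
# Hypothetical result list in the shape returned by download_keys_to_files:
# (S3_Key, Local_Path, True on success or an error message on failure).
results = [
    ("myData/myMusic/awesome.mp3", "MyFiles/myMusic/awesome.mp3", True),
    ("myData/myDocs/paper.doc", "MyFiles/myDocs/paper.doc", "404 Not Found"),
]

# Split successes from failures regardless of ordering.
succeeded = [key for key, _, status in results if status is True]
failed = {key: status for key, _, status in results if status is not True}

print(succeeded)  # ['myData/myMusic/awesome.mp3']
print(failed)     # {'myData/myDocs/paper.doc': '404 Not Found'}
```

Comparing with `is True` (rather than truthiness) matters here, since a non-empty error string is also truthy.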
s3_tools.objects.download.download_prefix_to_folder(bucket: str, prefix: str, folder: str, search_str: Optional[str] = None, remove_prefix: bool = True, threads: int = 5, show_progress: bool = False, aws_auth: Dict[str, str] = {}) → List[Tuple[str, str, Any]]

Download objects to local folder.

Function to retrieve all files under a prefix on S3 and store them in a local folder.

Parameters
  • bucket (str) – AWS S3 bucket where the objects are stored.

  • prefix (str) – Prefix where the objects are under.

  • folder (str) – Local folder path where files will be stored.

  • search_str (str) – Basic search string to filter keys in the result (uses Unix shell-style wildcards), by default None. See the “fnmatch” package for details on the matching.

  • remove_prefix (bool) – If True, the prefix will be removed when writing to the local folder. The remaining “folders” in the key will be created inside the local folder.

  • threads (int) – Number of parallel downloads, by default 5.

  • show_progress (bool) – Show a progress bar on the console, by default False. (Requires the [progress] extra to be installed.)

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

A list of tuples with the “S3_Key”, the “Local_Path”, and the result of the download: True on success, or the error message on failure.

Return type

List[Tuple[str, str, Any]]

Examples

>>> download_prefix_to_folder(
...     bucket="myBucket",
...     prefix="myData",
...     folder="myFiles"
... )
[
    ("myData/myFile.data", "MyFiles/myFile.data", True),
    ("myData/myMusic/awesome.mp3", "MyFiles/myMusic/awesome.mp3", True),
    ("myData/myDocs/paper.doc", "MyFiles/myDocs/paper.doc", True)
]
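The interaction of remove_prefix and folder can be pictured with plain path arithmetic. This standalone sketch (not the library's internal code) shows how a key presumably maps to a local path when remove_prefix is True:

```python
from pathlib import PurePosixPath

def local_path_for(key: str, prefix: str, folder: str) -> str:
    # With remove_prefix=True the prefix is stripped and the remaining
    # "folders" of the key are recreated under the local folder.
    relative = PurePosixPath(key).relative_to(prefix)
    return str(PurePosixPath(folder) / relative)

print(local_path_for("myData/myMusic/awesome.mp3", "myData", "myFiles"))
# myFiles/myMusic/awesome.mp3
```

With remove_prefix=False, the full key (prefix included) would instead be appended under the folder.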

List

List S3 bucket objects.

s3_tools.objects.list.list_objects(bucket: str, prefix: str = '', search_str: Optional[str] = None, max_keys: int = 1000, aws_auth: Dict[str, str] = {}) → List[str]

Retrieve the list of objects from AWS S3 bucket under a given prefix and search string.

Parameters
  • bucket (str) – AWS S3 bucket where the objects are stored.

  • prefix (str) – Prefix where the objects are under.

  • search_str (str) – Basic search string to filter keys in the result (uses Unix shell-style wildcards), by default None. See the “fnmatch” package for details on the matching.

  • max_keys (int) – Maximum number of keys returned per paginated request, by default 1000.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

List of keys in the bucket under the given prefix, after filtering.

Return type

List[str]

Examples

>>> list_objects(bucket="myBucket", prefix="myData")
[
    "myData/myFile.data",
    "myData/myMusic/awesome.mp3",
    "myData/myDocs/paper.doc"
]
>>> list_objects(bucket="myBucket", prefix="myData", search_str="*paper*")
[
    "myData/myDocs/paper.doc"
]
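Since search_str follows fnmatch semantics, a pattern such as "*paper*" matches any key containing "paper". A local sketch of that filtering, assuming the library applies fnmatch to each listed key:

```python
from fnmatch import fnmatch

keys = [
    "myData/myFile.data",
    "myData/myMusic/awesome.mp3",
    "myData/myDocs/paper.doc",
]

# Unix shell-style wildcards: "*" matches any run of characters.
filtered = [key for key in keys if fnmatch(key, "*paper*")]
print(filtered)  # ['myData/myDocs/paper.doc']
```

Note that fnmatch patterns are not regular expressions; "?" matches a single character and "[...]" a character set.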

Move

Move S3 objects.

s3_tools.objects.move.move_keys(source_bucket: str, source_keys: List[str], destination_bucket: str, destination_keys: List[str], threads: int = 5, aws_auth: Dict[str, str] = {}) → None

Move a list of S3 objects from source bucket to destination.

Parameters
  • source_bucket (str) – S3 bucket where the objects are stored.

  • source_keys (List[str]) – S3 keys where the objects are referenced.

  • destination_bucket (str) – S3 destination bucket.

  • destination_keys (List[str]) – S3 destination keys.

  • threads (int, optional) – Number of parallel moves, by default 5.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Raises
  • IndexError – When source_keys and destination_keys have different lengths.

  • ValueError – When the keys list is empty.

Examples

>>> move_keys(
...     source_bucket='bucket',
...     source_keys=[
...         'myFiles/song.mp3',
...         'myFiles/photo.jpg'
...     ],
...     destination_bucket='bucket',
...     destination_keys=[
...         'myMusic/song.mp3',
...         'myPhotos/photo.jpg'
...     ]
... )
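Since source_keys and destination_keys must have equal lengths (an IndexError is raised otherwise), destination keys are often derived from the sources. A small sketch of building them by swapping a prefix (the helper below is illustrative, not part of s3_tools):

```python
def swap_prefix(key: str, old: str, new: str) -> str:
    # Replace the leading prefix of a key; leave non-matching keys untouched.
    return new + key[len(old):] if key.startswith(old) else key

source_keys = ["myFiles/song.mp3", "myFiles/photo.jpg"]
destination_keys = [swap_prefix(k, "myFiles/", "archive/") for k in source_keys]
print(destination_keys)  # ['archive/song.mp3', 'archive/photo.jpg']
```

Deriving the destination list from the source list this way keeps the two lists aligned by position.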
s3_tools.objects.move.move_object(source_bucket: str, source_key: str, destination_bucket: str, destination_key: str, aws_auth: Dict[str, str] = {}) → None

Move S3 object from source bucket and key to destination.

Parameters
  • source_bucket (str) – S3 bucket where the object is stored.

  • source_key (str) – S3 key where the object is referenced.

  • destination_bucket (str) – S3 destination bucket.

  • destination_key (str) – S3 destination key.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Examples

>>> move_object(
...    source_bucket='bucket',
...    source_key='myFiles/song.mp3',
...    destination_bucket='bucket',
...    destination_key='myMusic/song.mp3'
... )

Read

Read S3 objects into variables.

s3_tools.objects.read.read_object_to_bytes(bucket: str, key: str, aws_auth: Dict[str, str] = {}) → bytes

Retrieve one object from an AWS S3 bucket as a byte array.

Parameters
  • bucket (str) – AWS S3 bucket where the object is stored.

  • key (str) – Key where the object is stored.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

Object content as bytes.

Return type

bytes

Examples

>>> read_object_to_bytes(
...     bucket="myBucket",
...     key="myData/myFile.data"
... )
b"The file content"
s3_tools.objects.read.read_object_to_dict(bucket: str, key: str, aws_auth: Dict[str, str] = {}) → Dict[Any, Any]

Retrieve one object from an AWS S3 bucket as a dictionary.

Parameters
  • bucket (str) – AWS S3 bucket where the object is stored.

  • key (str) – Key where the object is stored.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

Object content as dictionary.

Return type

Dict[Any, Any]

Examples

>>> read_object_to_dict(
...     bucket="myBucket",
...     key="myData/myFile.json"
... )
{"key": "value", "1": "text"}
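Reading an object into a dictionary implies the stored content is JSON; the decoding step presumably amounts to json.loads over the raw object bytes, as in this standalone sketch:

```python
import json

# Raw bytes as they might come back from S3 for a JSON object.
raw = b'{"key": "value", "1": "text"}'

data = json.loads(raw)
print(data)  # {'key': 'value', '1': 'text'}
```

Content that is not valid JSON would make such a read fail, so use read_object_to_bytes or read_object_to_text for arbitrary objects.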
s3_tools.objects.read.read_object_to_text(bucket: str, key: str, aws_auth: Dict[str, str] = {}) → str

Retrieve one object from an AWS S3 bucket as a string.

Parameters
  • bucket (str) – AWS S3 bucket where the object is stored.

  • key (str) – Key where the object is stored.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

Object content as string.

Return type

str

Examples

>>> read_object_to_text(
...     bucket="myBucket",
...     key="myData/myFile.data"
... )
"The file content"

Upload

Upload files to an S3 bucket.

s3_tools.objects.upload.upload_file_to_key(bucket: str, key: str, local_filename: str, progress=None, task_id: int = -1, aws_auth: Dict[str, str] = {}) → str

Upload one file from local disk and store it in an AWS S3 bucket.

Parameters
  • bucket (str) – AWS S3 bucket where the object will be stored.

  • key (str) – Key where the object will be stored.

  • local_filename (str) – Local file from where the data will be uploaded.

  • progress (rich.Progress) – Instance of a rich Progress bar, by default None.

  • task_id (int) – Task ID on the progress bar to be updated, by default -1.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

The full S3 URL of the file.

Return type

str

Examples

>>> upload_file_to_key(
...     bucket="myBucket",
...     key="myFiles/music.mp3",
...     local_filename="files/music.mp3"
... )
http://s3.amazonaws.com/myBucket/myFiles/music.mp3
s3_tools.objects.upload.upload_files_to_keys(bucket: str, paths_keys: List[Tuple[str, str]], threads: int = 5, show_progress: bool = False, aws_auth: Dict[str, str] = {}) → List[Tuple[str, str, Any]]

Upload a list of files to specific objects.

Parameters
  • bucket (str) – AWS S3 bucket where the objects will be stored.

  • paths_keys (List[Tuple[str, str]]) – List of tuples, each with the local path to be uploaded and the destination S3 key, e.g. [(“Local_Path”, “S3_Key”), (“Local_Path”, “S3_Key”)].

  • threads (int, optional) – Number of parallel uploads, by default 5.

  • show_progress (bool) – Show a progress bar on the console, by default False. (Requires the [progress] extra to be installed.)

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

A list of tuples with the “Local_Path”, the “S3_Key”, and the result of the upload: True on success, or the error message on failure. Note that the output list may not preserve the input order.

Return type

List[Tuple[str, str, Any]]

Examples

>>> upload_files_to_keys(
...     bucket="myBucket",
...     paths_keys=[
...         ("MyFiles/myFile.data", "myData/myFile.data"),
...         ("MyFiles/myMusic/awesome.mp3", "myData/myMusic/awesome.mp3"),
...         ("MyFiles/myDocs/paper.doc", "myData/myDocs/paper.doc")
...     ]
... )
[
    ("MyFiles/myMusic/awesome.mp3", "myData/myMusic/awesome.mp3", True),
    ("MyFiles/myDocs/paper.doc", "myData/myDocs/paper.doc", True),
    ("MyFiles/myFile.data", "myData/myFile.data", True)
]
s3_tools.objects.upload.upload_folder_to_prefix(bucket: str, prefix: str, folder: str, search_str: str = '*', threads: int = 5, show_progress: bool = False, aws_auth: Dict[str, str] = {}) → List[Tuple[str, str, Any]]

Upload a local folder to an S3 prefix.

Function to upload all files in a given folder (recursively) and store them in an S3 bucket under a prefix. The local folder structure is replicated in S3.

Parameters
  • bucket (str) – AWS S3 bucket where the object will be stored.

  • prefix (str) – Prefix where the objects will be under.

  • folder (str) – Local folder path where files are stored. Prefer to use the full path for the folder.

  • search_str (str) – A match string to select the files to upload, by default “*”. The string follows the rglob pattern from the pathlib package.

  • threads (int, optional) – Number of parallel uploads, by default 5.

  • show_progress (bool) – Show a progress bar on the console, by default False. (Requires the [progress] extra to be installed.)

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

A list of tuples with the “Local_Path”, the “S3_Key”, and the result of the upload: True on success, or the error message on failure.

Return type

List[Tuple[str, str, Any]]

Examples

>>> upload_folder_to_prefix(
...     bucket="myBucket",
...     prefix="myFiles",
...     folder="/usr/files",
... )
[
    ("/usr/files/music.mp3", "myFiles/music.mp3", True),
    ("/usr/files/awesome.wav", "myFiles/awesome.wav", True),
    ("/usr/files/data/metadata.json", "myFiles/data/metadata.json", True)
]
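Because search_str follows pathlib's rglob pattern, the set of files selected for upload can be previewed locally before touching S3. This sketch (an illustrative helper, not library code) plans the (Local_Path, S3_Key) pairs that such an upload would presumably produce:

```python
import tempfile
from pathlib import Path

def plan_uploads(folder: str, prefix: str, search_str: str = "*"):
    # Each file matching search_str (recursive) maps to prefix + its
    # path relative to the folder, mirroring the local structure in S3.
    base = Path(folder)
    return [
        (str(path), f"{prefix}/{path.relative_to(base).as_posix()}")
        for path in sorted(base.rglob(search_str))
        if path.is_file()
    ]

# Build a throwaway folder tree to demonstrate the mapping.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data").mkdir()
    (Path(tmp) / "music.mp3").write_text("...")
    (Path(tmp) / "data" / "metadata.json").write_text("{}")
    keys = [key for _, key in plan_uploads(tmp, "myFiles")]

print(keys)  # ['myFiles/data/metadata.json', 'myFiles/music.mp3']
```

Swapping search_str for a pattern like "*.mp3" would restrict the plan to matching files only.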

Write

Write variables into S3 objects.

s3_tools.objects.write.write_object_from_bytes(bucket: str, key: str, data: bytes, aws_auth: Dict[str, str] = {}) → str

Upload bytes as an object to an AWS S3 bucket.

Parameters
  • bucket (str) – AWS S3 bucket where the object will be stored.

  • key (str) – Key where the object will be stored.

  • data (bytes) – The object data to be uploaded to AWS S3.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

The full S3 URL of the file.

Return type

str

Raises

TypeError – If data is not a bytes type.

Examples

>>> data = bytes("String to bytes", "utf-8")
>>> write_object_from_bytes(
...     bucket="myBucket",
...     key="myFiles/file.data",
...     data=data
... )
http://s3.amazonaws.com/myBucket/myFiles/file.data
s3_tools.objects.write.write_object_from_dict(bucket: str, key: str, data: Dict, aws_auth: Dict[str, str] = {}) → str

Upload a dictionary as an object to an AWS S3 bucket.

Parameters
  • bucket (str) – AWS S3 bucket where the object will be stored.

  • key (str) – Key where the object will be stored.

  • data (dict) – The object data to be uploaded to AWS S3.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

The full S3 URL of the file.

Return type

str

Raises

TypeError – If data is not a dict type.

Examples

>>> data = {"key": "value", "1": "text"}
>>> write_object_from_dict(
...     bucket="myBucket",
...     key="myFiles/file.json",
...     data=data
... )
http://s3.amazonaws.com/myBucket/myFiles/file.json
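Writing a dictionary presumably involves serializing it to JSON before upload. A standalone sketch of that round trip (serialize to bytes as an upload would require, then decode back):

```python
import json

data = {"key": "value", "1": "text"}

# Serialize to UTF-8 bytes, then decode back to verify the round trip.
payload = json.dumps(data).encode("utf-8")
restored = json.loads(payload)
print(restored == data)  # True
```

This is also why non-dict data raises TypeError here: only JSON-serializable dictionaries fit this path.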
s3_tools.objects.write.write_object_from_text(bucket: str, key: str, data: str, aws_auth: Dict[str, str] = {}) → str

Upload a string as an object to an AWS S3 bucket.

Parameters
  • bucket (str) – AWS S3 bucket where the object will be stored.

  • key (str) – Key where the object will be stored.

  • data (str) – The object data to be uploaded to AWS S3.

  • aws_auth (Dict[str, str]) – Contains AWS credentials; empty by default.

Returns

The full S3 URL of the file.

Return type

str

Raises

TypeError – If data is not a str type.

Examples

>>> data = "A very very not so long text"
>>> write_object_from_text(
...     bucket="myBucket",
...     key="myFiles/file.txt",
...     data=data
... )
http://s3.amazonaws.com/myBucket/myFiles/file.txt