Writing data on the fly to EDITO data storage

There are many ways to upload files to EDITO, and you can use your favorite tools to do it. This page focuses on writing data directly from Python code, without going through the local file system.

Uploading bytes

The following code uses the boto3 library to upload bytes directly to the bucket named bucket_name, under the key object_key. The endpoint and credentials are read from environment variables:

import boto3
import os

def get_s3_client():
    return boto3.client(
        "s3",
        endpoint_url=os.environ["S3_ENDPOINT"],
        aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
        aws_session_token=os.environ["AWS_SESSION_TOKEN"],
    )


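def save_bytes_to_s3(bucket_name: str, object_bytes, object_key: str):
    """Upload in-memory bytes to bucket_name under the key object_key."""
    response = get_s3_client().put_object(
        Bucket=bucket_name,
        Body=object_bytes,
        Key=object_key,
    )
    # A successful put_object call returns HTTP 200. Most failed requests
    # make boto3 raise botocore.exceptions.ClientError before reaching here.
    if response["ResponseMetadata"]["HTTPStatusCode"] == 200:
        print(f"Successfully uploaded bytes to S3 {bucket_name}/{object_key}")
    else:
        print(
            f"Failed to upload bytes to S3 {bucket_name}/{object_key}: {response}"
        )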
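The status check above can be exercised without credentials or network access. The following is a minimal sketch, not part of EDITO or boto3: FakeS3Client and upload_succeeded are hypothetical names introduced here, and the fake only mimics the shape of the put_object response used in the code above.

```python
from typing import Any


class FakeS3Client:
    """Hypothetical stand-in for a boto3 S3 client (assumption, for offline checks)."""

    def __init__(self, status_code: int = 200) -> None:
        self.status_code = status_code
        self.calls = []  # keyword arguments of each put_object call

    def put_object(self, **kwargs: Any) -> dict:
        # Record the call and return a response shaped like the one read above.
        self.calls.append(kwargs)
        return {"ResponseMetadata": {"HTTPStatusCode": self.status_code}}


def upload_succeeded(client: Any, bucket_name: str, object_bytes: bytes, object_key: str) -> bool:
    """Same call and status check as save_bytes_to_s3, with the client passed in."""
    response = client.put_object(
        Bucket=bucket_name,
        Body=object_bytes,
        Key=object_key,
    )
    return response["ResponseMetadata"]["HTTPStatusCode"] == 200


fake = FakeS3Client()
ok = upload_succeeded(fake, "my_bucket", b"hello", "path/to/hello.txt")
```

Passing the client as an argument is a design choice of this sketch; with a real client from get_s3_client(), the same function would perform an actual upload.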

How you get the bytes of an object depends on the library and the object type, but it is usually straightforward for images, text, videos, etc. The following section focuses on NetCDF content.
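As an illustration of what "getting the bytes" can look like for simple text-based content, here is a minimal sketch using only the Python standard library. The sample values (station name, temperature) are made up for the example.

```python
import csv
import io
import json

# Plain text: encode the string with an explicit encoding.
text_bytes = "Sea surface temperature report".encode("utf-8")

# JSON: serialize to a string, then encode.
json_bytes = json.dumps({"station": "A1", "temperature_c": 12.5}).encode("utf-8")

# CSV: write rows into an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["station", "temperature_c"])
writer.writerow(["A1", 12.5])
csv_bytes = buffer.getvalue().encode("utf-8")
```

Any of these values could then be passed as object_bytes to save_bytes_to_s3. Binary formats such as images typically follow the same pattern with an io.BytesIO buffer, provided the library can write to a file-like object.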

Uploading NetCDF files

The xarray library provides an easy way to get the NetCDF bytes of an xarray.Dataset: call its to_netcdf() method without a path argument, and the serialized content is returned in memory instead of being written to disk. For example, reusing save_bytes_to_s3 from the code above:

import xarray

dataset: xarray.Dataset = ...

save_bytes_to_s3(
    bucket_name="my_bucket",
    object_bytes=dataset.to_netcdf(),
    object_key="path/to/my_file.nc",
)