cache_s3_based
Note
TTL and automatic S3 cleanup
When ttl is set, an expired entry is deleted from S3 the first time it is read back after expiry; no S3 Lifecycle Rule or background job is required to keep the bucket clean.
Because TTLs are typically set to hours or days, expirations are
infrequent; the extra DeleteObject call on each expiry has negligible
overhead compared to the cost of recomputing the cached value.
S3-backed caching decorator.
- class core_aws.decorators.cache_s3_based._S3Backend(fcn_qualname: str, bucket: str, key_prefix: str = 'cache/', ttl: float | None = None, s3_kwargs: Dict[str, Any] | None = None)
Bases: L2Backend
L2Backend that stores cache entries as pickle files in S3.
Each entry is stored as {key_prefix}{qualname}/{md5_of_cache_key} and serialized as {"v": value, "t": timestamp}, matching the on-disk format used by _DiskBackend.
The S3 client is created lazily on the first cache access.
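The key layout and payload format described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names make_s3_key, encode_entry, and decode_entry are hypothetical, and hashing repr(cache_key) is a guess at how the MD5 digest might be derived; the backend may use a different byte representation.

```python
import hashlib
import pickle
import time
from typing import Any, Tuple


def make_s3_key(key_prefix: str, qualname: str, cache_key: Any) -> str:
    """Build {key_prefix}{qualname}/{md5_of_cache_key}."""
    # Assumption: the digest is taken over repr(cache_key); the real
    # backend may hash a different serialization of the key.
    digest = hashlib.md5(repr(cache_key).encode("utf-8")).hexdigest()
    return f"{key_prefix}{qualname}/{digest}"


def encode_entry(value: Any) -> bytes:
    """Pickle the entry as {"v": value, "t": timestamp}."""
    return pickle.dumps({"v": value, "t": time.time()})


def decode_entry(body: bytes) -> Tuple[Any, float]:
    """Return (value, timestamp) from a pickled entry."""
    entry = pickle.loads(body)
    return entry["v"], entry["t"]
```

Because the function's qualname is part of the key, two functions called with identical arguments still map to different S3 objects.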
- __init__(fcn_qualname: str, bucket: str, key_prefix: str = 'cache/', ttl: float | None = None, s3_kwargs: Dict[str, Any] | None = None) None
- Parameters:
fcn_qualname – __qualname__ of the decorated function; used to derive a unique S3 prefix so different functions never share entries.
bucket – S3 bucket where cached objects are stored.
key_prefix – Prefix prepended to every S3 key. Default: "cache/".
ttl – Time-to-live in seconds. Entries older than this are deleted from S3 and treated as a miss on load(). None means entries never expire. Because TTLs are typically set to hours or days, expirations are infrequent; the extra delete_object call on expiry has negligible overhead and keeps the bucket clean automatically, without requiring an S3 Lifecycle Rule.
s3_kwargs – Extra keyword arguments forwarded to S3Client.
- load(cache_key: Any) Any
Return the cached value, or _MISS when:
- the S3 object does not exist (NoSuchKey), or
- the entry is older than ttl seconds, in which case the stale object is deleted from S3 before returning _MISS, keeping the bucket clean without relying on S3 Lifecycle Rules.
Because TTLs are typically set to hours or days, these deletions are infrequent and the extra delete_object call has negligible overhead.
All other ClientError exceptions (e.g. AccessDenied) are re-raised.
- _abc_impl = <_abc._abc_data object>
- core_aws.decorators.cache_s3_based.cache_s3_based(*, bucket: str, key_prefix: str = 'cache/', maxsize: int | None = None, ttl: float | None = None, s3_kwargs: Dict[str, Any] | None = None) Callable
Write-through caching decorator: L1 is a bounded in-memory LRU (_CacheWrapper); the fallback is an S3 bucket (_S3Backend).
Every new result is written to both L1 and S3 immediately. When L1 is full, the least-recently-used entry is evicted from memory only; the S3 object is kept. A subsequent call with the same arguments (from the same or a different process/machine) reloads the value from S3 without invoking the wrapped function.
- Parameters:
bucket – S3 bucket where cached objects are stored.
key_prefix – Prefix prepended to every S3 key. Default: "cache/".
maxsize – Maximum number of entries kept in the in-memory L1 cache. None means unbounded.
ttl – Time-to-live in seconds applied symmetrically to both layers. Expired L1 entries are evicted from memory. Expired S3 entries are deleted from S3 and return _MISS on load, keeping the bucket clean automatically without S3 Lifecycle Rules. None (default) means entries never expire. TTLs are typically set to hours or days, so expirations are infrequent and the extra delete_object call has negligible overhead.
s3_kwargs – Extra keyword arguments forwarded to S3Client (e.g. region_name, endpoint_url).
- Returns:
The wrapped function.
Example
from core_aws.decorators import cache_s3_based

@cache_s3_based(bucket="my-cache-bucket", key_prefix="etl/", ttl=3600)
def fetch_reference_data(dataset: str) -> dict:
    ...
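To illustrate the write-through behaviour described for this decorator, here is a minimal sketch of a two-level cache. It makes several simplifying assumptions: two_level_cache is a hypothetical name, a plain dict stands in for the S3 bucket, only positional arguments are supported, and TTL handling is omitted. It is not how _CacheWrapper or _S3Backend are implemented.

```python
from collections import OrderedDict
from functools import wraps

_MISS = object()  # hypothetical sentinel for "not cached"


def two_level_cache(l2: dict, maxsize: int):
    """Bounded in-memory LRU (L1) in front of a shared store (L2)."""

    def decorator(fn):
        l1 = OrderedDict()

        def remember(key, value):
            l1[key] = value
            l1.move_to_end(key)
            if len(l1) > maxsize:
                # Evict the least-recently-used entry from memory only;
                # the copy in L2 is left untouched.
                l1.popitem(last=False)

        @wraps(fn)
        def wrapper(*args):
            key = (fn.__qualname__, args)
            if key in l1:
                l1.move_to_end(key)  # mark as most recently used
                return l1[key]
            value = l2.get(key, _MISS)
            if value is _MISS:
                value = fn(*args)
                l2[key] = value  # write-through: persist immediately
            remember(key, value)
            return value

        wrapper.l1 = l1  # exposed only so the sketch can be inspected
        return wrapper

    return decorator
```

With maxsize=1, calling the wrapped function with a second argument evicts the first result from L1, yet a later call with the first argument is served from the shared store without re-running the function.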