I am implementing an asynchronous remote logging handler (to AWS S3; in Python, but it doesn't really matter) supporting 4 modes:
- Immediate: messages are written to S3 immediately - I am trying to avoid this mode
- By queue size: messages are stored in a linked queue, and after it reaches a certain size (say 500 KB), the whole content is persisted all at once to a file in S3
- By delimiters: all the messages contained between 2 delimiters are treated as one and sent all at once, e.g. `message \n message2 \n` is sent as a single write
- By timer: every 10 seconds or so, the accumulated logs are sent to a file in S3.
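To make the discussion concrete, here is a minimal sketch of how I combine modes 2 (by queue size) and 4 (by timer) in a single handler. `upload` is a hypothetical callback standing in for the actual S3 write (e.g. a boto3 `put_object` call), and the class and parameter names are illustrative, not my real implementation:

```python
import logging
import threading

class BufferedS3Handler(logging.Handler):
    """Sketch combining mode 2 (by queue size) and mode 4 (by timer).

    `upload` is a hypothetical callback standing in for the actual
    S3 write; names here are illustrative only.
    """

    def __init__(self, upload, max_bytes=500_000, interval=10.0):
        super().__init__()
        self.upload = upload
        self.max_bytes = max_bytes
        self.interval = interval
        self._buffer = []        # formatted messages waiting to be shipped
        self._size = 0           # running byte estimate of the buffer
        self._buf_lock = threading.Lock()
        self._closed = False
        self._schedule()

    def _schedule(self):
        # Timer thread implementing mode 4: flush every `interval` seconds.
        self._timer = threading.Timer(self.interval, self._on_timer)
        self._timer.daemon = True
        self._timer.start()

    def _on_timer(self):
        self.flush()
        if not self._closed:
            self._schedule()

    def emit(self, record):
        msg = self.format(record)
        with self._buf_lock:
            self._buffer.append(msg)
            self._size += len(msg)
            full = self._size >= self.max_bytes   # mode 2 trigger
        if full:
            self.flush()

    def flush(self):
        with self._buf_lock:
            if not self._buffer:
                return
            batch, self._buffer, self._size = self._buffer, [], 0
        self.upload("\n".join(batch))   # one S3 object per batch

    def close(self):
        self._closed = True
        self._timer.cancel()
        self.flush()                    # don't lose the tail
        super().close()
```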
The reason I decided to implement remote logging is that the cluster could crash at any time, or get terminated after finishing its processing, and I would then lose the locally stored logs. It would be helpful to know "why" it crashed, of course, but also, very importantly, "when" it crashed, especially for long-running operations, so the processing could be resumed from that stage.
An idea would be to have a queue (ActiveMQ, Kafka, etc.) to which messages are published in real time, then probably aggregated before going to S3, but I thought it would be overkill to drag in a whole broker infrastructure for this use case.
My implementation works, but my questions are more conceptual and best-practice oriented:
When using modes "2" (by queue size) and "4" (by timer), how could I be notified of the end of the program's execution, so I can flush the content of the local queue and stop the timer thread? Currently I mark the logging thread as a daemon, but then I obviously miss the last messages. I am now looking for a better way: avoiding daemon threads while still getting notified that the main thread has finished, so the termination of these threads is delayed until everything has been pushed.
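For reference, the direction I am currently experimenting with: keep the worker as a daemon thread, but register an `atexit` hook that pushes a sentinel onto the queue and joins the worker, so the remaining messages are flushed before the interpreter exits. `upload` is again a hypothetical stand-in for the S3 write, and `AsyncUploader` is an illustrative name; note that `atexit` only fires on normal termination, not on a hard crash:

```python
import atexit
import queue
import threading

_SENTINEL = object()

class AsyncUploader:
    """Background log shipper that drains its queue at interpreter exit.

    The worker stays a daemon thread (so it never blocks shutdown on
    its own), but the atexit hook sends a sentinel and joins it, so
    the last messages are shipped before teardown.
    """

    def __init__(self, upload):
        self.upload = upload
        self.queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        atexit.register(self.stop)

    def put(self, msg):
        self.queue.put(msg)

    def _run(self):
        pending = []
        while True:
            item = self.queue.get()
            if item is _SENTINEL:
                break
            pending.append(item)
            # the size/timer flush conditions from the modes above
            # would go here
        if pending:                       # final flush on shutdown
            self.upload("\n".join(pending))

    def stop(self):
        self.queue.put(_SENTINEL)
        self._thread.join()
```

Calling `stop()` manually is also safe: the later atexit invocation just puts another sentinel and joins an already finished thread.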
Does my approach make sense, or am I reinventing the wheel in a hacky way?