Timestamp Targeting Strategies for MySQL Binary Log Archiving and PITR Automation

Point-in-Time Recovery is, at its core, a coordinate-resolution problem: an incident is reported as a wall-clock time (“the bad DELETE ran around 14:07 UTC”), but mysqlbinlog replays a byte stream keyed on a log file, a log position, and a set of GTIDs. This page defines how an automated pipeline turns a human timestamp into an exact, replayable stop coordinate, and why the obvious approach of passing --stop-datetime straight through silently over- or under-recovers in production.

Naive targeting fails for three concrete reasons. First, MySQL binary log event headers store time only to whole-second resolution, so two transactions committed in the same second are indistinguishable by datetime alone. Second, incident timestamps arrive in the operator’s local zone while binlog headers are always UTC epoch seconds, so an un-normalized conversion moves the recovery boundary by whole hours. Third, --stop-datetime is boundary-exclusive: an event stamped exactly at the argument is not replayed, which surprises operators mid-incident.

This page is a component of the broader Automated Binlog Archiving to Object Storage pipeline, and it assumes the archived chain it resolves against is the gap-free, checksum-verified stream produced by the AWS S3 & GCS Sync Pipelines stage.

Core Concept & Prerequisites

Every event written to a MySQL binary log carries a 19-byte common header whose first field is a 4-byte unsigned integer: timestamp, the number of seconds since the Unix epoch, always in UTC, regardless of time_zone, log_timestamps, or the session that produced the event. That single fact drives the whole strategy. The header has no sub-second component, so the finest boundary a datetime target can express is a one-second window — and any second in which more than one transaction committed is ambiguous. Resolving a timestamp to a unique recovery point therefore means two steps: narrow to the target second by header scan, then disambiguate within that second by log position or GTID.

The second foundational fact is that mysqlbinlog --stop-datetime is boundary-exclusive: it stops at the first event whose timestamp is equal to or later than the argument, so an event committed exactly at the target second is not applied. Operators who expect “recover up to and including 14:07:03” and pass --stop-datetime="... 14:07:03" lose every transaction stamped 14:07:03. mysqlbinlog also interprets the argument in the local time zone of the machine it runs on, so a UTC string is only correct when the tool runs with TZ=UTC. Precise recovery treats the datetime only as a coarse locator and pins the true boundary with a position or a GTID set.
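
To make the exclusive boundary concrete, here is a small sketch that turns "the last second to keep" into the value to pass. The helper name exclusive_stop_datetime is hypothetical. It assumes mysqlbinlog is run with TZ=UTC, since the tool reads the argument in its own local time zone.

```python
import datetime as dt


def exclusive_stop_datetime(last_kept_epoch: int) -> str:
    """Return a --stop-datetime value that still replays `last_kept_epoch`.

    mysqlbinlog stops at the first event stamped at or after the argument,
    so the argument has to be one second past the last second to keep.
    """
    stop = dt.datetime.fromtimestamp(last_kept_epoch + 1, dt.timezone.utc)
    return stop.strftime("%Y-%m-%d %H:%M:%S")


# Keep everything stamped up to and including 2026-07-04 14:07:03 UTC.
arg = exclusive_stop_datetime(1_783_174_023)
```

The one-second shift still cannot split two commits that share 14:07:03; it only decides whether that whole second is in or out.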

Prerequisites for the resolver described here:

MySQL 8.0.22+ for the performance_schema.log_status view used to read the live coordinate and gtid_executed in one snapshot. SHOW BINARY LOG STATUS, the modern spelling of SHOW MASTER STATUS, exists only from MySQL 8.2 onward; on 8.0 use SHOW MASTER STATUS.
GTID mode enabled — gtid_mode=ON with enforce_gtid_consistency=ON, so the resolver can emit a GTID stop set as the deterministic alternative to a fragile datetime boundary. The upstream guarantees come from a gap-free GTID Tracking & Enforcement pipeline.
binlog_format=ROW — datetime targeting is only meaningful when replay is deterministic; the reasoning is covered in ROW vs STATEMENT vs MIXED Formats.
Python 3.10+ with mysql-connector-python for pooled metadata reads, python-mysql-replication to stream and inspect event headers without shelling out to mysqlbinlog, and tenacity for retrying transient connection failures.

Timezone discipline is the last prerequisite and the one most likely to be skipped. Application logs, APM dashboards, and SHOW BINLOG EVENTS rendered in a session zone all present local time; the resolver must reject naive input or bind it to an explicit zone before converting to the UTC epoch the headers actually use. A one-hour daylight-saving error here silently recovers 3,600 seconds too much or too little transaction history.

Production-Grade Python Implementation

The module below resolves a target timestamp to an exact stop coordinate without invoking mysqlbinlog: it streams event headers directly with python-mysql-replication, tracking the last transaction boundary at or before the normalized UTC target and continuing one step past it to detect same-second ambiguity. It normalizes any input zone to UTC epoch seconds, uses a pooled connection for the metadata probe, retries transient failures with tenacity, and emits structured JSON so a single resolution can be traced end to end. A match statement selects the resolution mode — a bare datetime, a datetime plus position, or a GTID stop set — so the caller always receives an unambiguous coordinate.

#!/usr/bin/env python3
"""Resolve a human timestamp to an exact MySQL binlog stop coordinate.
Targets: MySQL 8.0.22+, Python 3.10+.
Requires: mysql-connector-python, python-mysql-replication, tenacity.
"""
from __future__ import annotations

import datetime as dt
import json
import logging
import zoneinfo
from dataclasses import dataclass, field
from enum import Enum

from mysql.connector import pooling
from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.event import GtidEvent, XidEvent
from tenacity import retry, stop_after_attempt, wait_exponential_jitter


# ---- structured logging -------------------------------------------------
class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "event": record.getMessage()}
        if isinstance(record.args, dict):
            payload |= record.args
        return json.dumps(payload)


_handler = logging.StreamHandler()
_handler.setFormatter(JsonFormatter())
logger = logging.getLogger("pitr_target_resolver")
logger.addHandler(_handler)
logger.setLevel(logging.INFO)


class TargetBefore(Exception):
    """Target precedes the earliest available (archived + local) event."""


class Mode(Enum):
    DATETIME = "stop-datetime"          # coarse: whole-second boundary only
    POSITION = "stop-position"          # exact: disambiguates same-second commits
    GTID = "include-gtids"              # deterministic: replay-order independent


@dataclass(slots=True, frozen=True)
class Coordinate:
    log_file: str
    log_pos: int
    last_gtid: str | None
    utc_epoch: int
    ambiguous_second: bool              # True => datetime alone is insufficient
    mode: Mode

    def stop_args(self) -> list[str]:
        # The mysqlbinlog flags a recovery run should actually use.
        match self.mode:
            case Mode.POSITION:
                return ["--stop-position", str(self.log_pos)]
            case Mode.GTID if self.last_gtid:
                # last_gtid is the last transaction to KEEP, so it is included,
                # not excluded. Assumes a single source UUID numbered 1..N;
                # widen the set for chains that span several source UUIDs.
                uuid, gno = self.last_gtid.rsplit(":", 1)
                return ["--include-gtids", f"{uuid}:1-{gno}"]
            case _:
                # --stop-datetime is exclusive, so stop one second after the
                # last second to keep. Run mysqlbinlog with TZ=UTC: it reads
                # this value in its own local time zone.
                stop = dt.datetime.fromtimestamp(self.utc_epoch + 1, dt.timezone.utc)
                return ["--stop-datetime", stop.strftime("%Y-%m-%d %H:%M:%S")]


def normalize_to_utc(target: str, *, assume_zone: str | None = None) -> int:
    """Parse an ISO-8601 target and return UTC epoch seconds.

    A naive input is bound to `assume_zone` and rejected when none is given
    (never guessed silently); an aware input is converted. Whole-second
    truncation matches header resolution so comparisons are exact.
    """
    if target.endswith("Z"):                    # fromisoformat rejects "Z" before 3.11
        target = target[:-1] + "+00:00"
    parsed = dt.datetime.fromisoformat(target)
    if parsed.tzinfo is None:
        if assume_zone is None:
            raise ValueError("naive timestamp: pass assume_zone explicitly")
        parsed = parsed.replace(tzinfo=zoneinfo.ZoneInfo(assume_zone))
    epoch = int(parsed.astimezone(dt.timezone.utc).timestamp())
    logger.info("normalized", {"input": target, "assume_zone": assume_zone,
                               "utc_epoch": epoch})
    return epoch


@retry(stop=stop_after_attempt(5), wait=wait_exponential_jitter(initial=1, max=20),
       reraise=True)
def _earliest_log(pool: pooling.MySQLConnectionPool) -> str:
    """The oldest binlog MySQL still holds locally (SHOW BINARY LOGS)."""
    conn = pool.get_connection()
    try:
        cur = conn.cursor()
        cur.execute("SHOW BINARY LOGS")            # MySQL 8.0+
        first = cur.fetchone()
        if first is None:
            raise TargetBefore("server reports no binary logs")
        return first[0]
    finally:
        conn.close()


def resolve(target_utc: int, pool: pooling.MySQLConnectionPool,
            source: dict, *, start_file: str | None = None) -> Coordinate:
    """Stream event headers and return the exact stop coordinate.

    Walks transaction boundaries (GTID -> XID/commit) tracking the last one
    at or before `target_utc`. Reads exactly one boundary past the target to
    detect a same-second collision that makes a datetime stop ambiguous.
    """
    start_file = start_file or _earliest_log(pool)
    last: Coordinate | None = None
    pending_gtid: str | None = None
    ambiguous = False

    stream = BinLogStreamReader(
        connection_settings=source,
        server_id=4_294_967_295,               # ephemeral reader id
        log_file=start_file,
        log_pos=4,                              # first event after magic number
        blocking=False,
        resume_stream=True,
    )
    try:
        for event in stream:
            match event:
                case GtidEvent():
                    pending_gtid = event.gtid
                case XidEvent():                # a committed transaction boundary
                    ts = event.timestamp
                    if ts <= target_utc:
                        # Recompute on every commit so an earlier collision
                        # does not stay flagged once the scan moves on.
                        ambiguous = last is not None and last.utc_epoch == ts
                        last = Coordinate(
                            log_file=stream.log_file, log_pos=stream.log_pos,
                            last_gtid=pending_gtid, utc_epoch=ts,
                            ambiguous_second=False, mode=Mode.DATETIME)
                    else:
                        break                   # first commit past the target
    finally:
        stream.close()

    if last is None:
        raise TargetBefore(f"no committed event <= {target_utc} from {start_file}")

    mode = Mode.GTID if last.last_gtid else (Mode.POSITION if ambiguous else Mode.DATETIME)
    resolved = Coordinate(last.log_file, last.log_pos, last.last_gtid,
                          last.utc_epoch, ambiguous, mode)
    logger.info("resolved", {"log_file": resolved.log_file, "log_pos": resolved.log_pos,
                             "gtid": resolved.last_gtid, "ambiguous": ambiguous,
                             "mode": mode.value})
    return resolved

Two decisions carry the guarantee. First, the resolver keys everything on the XID (commit) boundary rather than on individual row events, so the stop coordinate always lands on a transaction edge; replaying to the middle of a transaction would leave the recovered instance inconsistent. Second, it compares each commit’s second with the previous commit’s to set ambiguous_second, and it stops scanning at the first commit past the target. When two transactions share the boundary second it moves from a datetime stop to a position or GTID stop automatically, so the caller is not handed a boundary that recovers the wrong number of transactions. The GTID mode is the strongest output: a stop set is independent of replay order and survives a re-derived position after a log rotation, which is why it is preferred whenever gtid_mode=ON.
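
The boundary walk can be exercised without a server. The sketch below reproduces the same idea over an in-memory list of commits; Commit and pick_boundary are hypothetical names and the sample data is invented, so treat it as an illustration of the logic rather than a replacement for the streaming resolver.

```python
from typing import NamedTuple


class Commit(NamedTuple):
    timestamp: int   # header seconds, UTC
    end_pos: int     # position just after the commit event


def pick_boundary(commits: list[Commit], target_utc: int) -> tuple[Commit, bool]:
    """Return the last commit at or before the target and a same-second flag."""
    last: Commit | None = None
    shares_second = False
    for commit in commits:              # commits are expected in log order
        if commit.timestamp > target_utc:
            break                       # first commit past the target ends the scan
        shares_second = last is not None and last.timestamp == commit.timestamp
        last = commit
    if last is None:
        raise LookupError("target precedes every commit in the scan")
    return last, shares_second


# Hypothetical commits: two of them land in second 1_783_174_023.
log = [Commit(1_783_174_020, 400), Commit(1_783_174_023, 900),
       Commit(1_783_174_023, 1_350), Commit(1_783_174_030, 1_800)]
boundary, ambiguous = pick_boundary(log, 1_783_174_023)
```

With this data the boundary is the commit ending at position 1350 and the flag is set, so a caller would choose a position or GTID stop instead of a datetime.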

Configuration Reference

These server variables govern how a timestamp maps to a coordinate and how far back the mapping can reach. Set them before an incident, not during one.

Variable	Type	Default	Recommended	PITR impact
`time_zone`	string	`SYSTEM`	`+00:00`	Running the server in UTC removes an entire class of conversion error; session output and `NOW()` semantics align with the UTC epoch stored in event headers.
`log_timestamps`	enum	`UTC`	`UTC`	Keeps error-log and general-log timestamps in UTC so operator-facing incident times match binlog header time; a `SYSTEM` setting reintroduces local-zone drift into the evidence you target against.
`gtid_mode`	enum	`OFF`	`ON`	Enables the deterministic GTID stop set the resolver prefers over a whole-second datetime boundary.
`enforce_gtid_consistency`	enum	`OFF`	`ON`	Keeps GTID-unsafe statements out of the log so every resolved boundary is replayable.
`binlog_expire_logs_seconds`	integer (s)	`2592000` (30d)	`≥ 259200` (3d)	The local floor: a target older than this is unresolvable from disk and must be fetched from object storage first.
`binlog_rows_query_log_events`	boolean	`OFF`	`ON`	Writes the original SQL text as `Rows_query` events, so header scanning around the target second can show which statement to stop before during triage.
`max_binlog_size`	integer (bytes)	`1073741824` (1 GiB)	`104857600`–`268435456`	Smaller segments mean finer archived granularity and faster targeted fetches, at the cost of more objects.

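-- MySQL 8.0.22+ : make timestamp targeting deterministic
SET PERSIST time_zone                    = '+00:00';
SET PERSIST log_timestamps               = 'UTC';
SET PERSIST binlog_rows_query_log_events = 'ON';
-- gtid_mode changes one step at a time and needs enforce_gtid_consistency first.
SET PERSIST enforce_gtid_consistency     = 'ON';
SET PERSIST gtid_mode                    = 'OFF_PERMISSIVE';
SET PERSIST gtid_mode                    = 'ON_PERMISSIVE';
-- Wait until Ongoing_anonymous_transaction_count is 0 on every server, then:
SET PERSIST gtid_mode                    = 'ON';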

SET PERSIST writes to mysqld-auto.cnf, so the settings survive a restart without a my.cnf edit racing your configuration-management tool. Retention itself must be governed against replication lag and backup cadence, as detailed in Binlog Retention Boundaries — a retention floor that is shorter than your longest plausible detection-to-recovery interval turns timestamp targeting into an archive fetch every time.
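
One way to keep these settings from drifting is to audit them on a schedule. The sketch below compares a dictionary of observed values against the recommendations in the table above. The audit function and the sample values are hypothetical, and fetching the values (for example with SHOW GLOBAL VARIABLES) is left out.

```python
# Recommended values taken from the configuration table; keys are variable names.
RECOMMENDED = {
    "time_zone": "+00:00",
    "log_timestamps": "UTC",
    "gtid_mode": "ON",
    "enforce_gtid_consistency": "ON",
    "binlog_rows_query_log_events": "ON",
}
MIN_RETENTION_SECONDS = 259_200  # the 3-day floor from the table


def audit(variables: dict[str, str]) -> list[str]:
    """Return one finding per variable that departs from the recommendation."""
    findings = []
    for name, wanted in RECOMMENDED.items():
        actual = variables.get(name)
        if actual is None or actual.upper() != wanted:
            findings.append(f"{name}: expected {wanted}, found {actual}")
    retention = int(variables.get("binlog_expire_logs_seconds", "0"))
    if retention < MIN_RETENTION_SECONDS:
        findings.append(f"binlog_expire_logs_seconds: {retention} is below "
                        f"{MIN_RETENTION_SECONDS}")
    return findings


# Hypothetical server state, reduced to the rows of interest.
observed = {"time_zone": "SYSTEM", "log_timestamps": "UTC", "gtid_mode": "ON",
            "enforce_gtid_consistency": "ON", "binlog_rows_query_log_events": "OFF",
            "binlog_expire_logs_seconds": "2592000"}
findings = audit(observed)
```

An empty list means the server matches the table; anything else is worth fixing before an incident rather than during one.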

Validation & Verification Gates

A resolved coordinate is a hypothesis until it clears the gates a real recovery run depends on:

Base-backup floor check. Recovery always starts from a physical base backup, so the resolved coordinate must fall at or after the backup’s binlog_position. Cross-reference Percona XtraBackup’s xtrabackup_binlog_info (or the equivalent manifest) and reject any target that precedes it, surfacing the earliest valid recovery time instead of silently starting mid-chain. This is the contract shared with Base Backup Integration for PITR.

GTID contiguity around the boundary. Prove the archived chain has no hole between the base backup’s gtid_purged and the resolved stop set. An empty GTID_SUBTRACT means every transaction the recovery needs is present:

-- MySQL 8.0+ : empty result => the chain from base backup to target is gap-free
-- needed = stop set minus what the base backup already holds; missing = needed minus archive
SELECT GTID_SUBTRACT(
         GTID_SUBTRACT('<resolved stop gtid set>', '<base backup gtid_purged>'),
         '<archived union>'
       ) AS missing_gtids;
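
The same check can be run offline against manifest data. This sketch expands GTID sets into Python sets of sequence numbers and subtracts them; parse_gtid_set and missing_gtids are hypothetical helpers, and expanding every interval is only reasonable for illustration or small ranges, not for sets spanning billions of transactions.

```python
def parse_gtid_set(text: str) -> dict[str, set[int]]:
    """Expand 'uuid:1-5:7,uuid2:3' into {uuid: {1, 2, 3, 4, 5, 7}, uuid2: {3}}."""
    result: dict[str, set[int]] = {}
    for part in filter(None, (p.strip() for p in text.split(","))):
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            numbers.update(range(int(first), int(last or first) + 1))
    return result


def missing_gtids(stop_set: str, base_purged: str, archived: str) -> dict[str, set[int]]:
    """GTIDs the replay needs (stop set minus base backup) that the archive lacks."""
    needed, base, have = map(parse_gtid_set, (stop_set, base_purged, archived))
    gaps = {}
    for uuid, numbers in needed.items():
        lost = numbers - base.get(uuid, set()) - have.get(uuid, set())
        if lost:
            gaps[uuid] = lost
    return gaps


# Hypothetical sets: the archive is missing transactions 121 through 125.
U = "3e11fa47-71ca-11e1-9e33-c80aa9429562"
gaps = missing_gtids(f"{U}:1-200", f"{U}:1-100", f"{U}:101-120:126-200")
```

An empty dictionary corresponds to the empty missing_gtids result in the SQL form; a non-empty one names exactly which transactions the archive cannot supply.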

Same-second disambiguation. If the resolver flagged ambiguous_second, assert the output mode is POSITION or GTID, never DATETIME. A datetime stop on an ambiguous second is a rejectable result, not a warning.
Dry-run replay. Before touching a recovery target, run mysqlbinlog --verbose with the resolved stop_args() and no mysql client on the other end, to confirm the boundary event is exactly the last transaction you intend to keep. Dry-run reconciliation is the mandatory pre-flight; pairing it with the checksum-verified segments from the Compression & Encryption Workflows stage ensures the bytes you resolved against are the bytes you will replay.
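
The base-backup floor check listed among the gates above can be sketched as a tuple comparison. This assumes the first two whitespace-separated fields of xtrabackup_binlog_info are the log file and position, and that the backup and the resolver use the same binlog basename with a numeric suffix. The function names and sample values are hypothetical.

```python
def parse_binlog_info(text: str) -> tuple[str, int]:
    """Read the file name and position from xtrabackup_binlog_info content."""
    fields = text.split()
    return fields[0], int(fields[1])


def log_index(log_file: str) -> int:
    """Numeric suffix of a binlog name, e.g. 'binlog.000042' -> 42."""
    return int(log_file.rsplit(".", 1)[1])


def at_or_after_floor(resolved: tuple[str, int], floor: tuple[str, int]) -> bool:
    """True when the resolved coordinate does not precede the backup's."""
    return (log_index(resolved[0]), resolved[1]) >= (log_index(floor[0]), floor[1])


# Hypothetical manifest content and resolved coordinates.
floor = parse_binlog_info(
    "binlog.000041\t7731\t3e11fa47-71ca-11e1-9e33-c80aa9429562:1-880\n")
ok = at_or_after_floor(("binlog.000042", 512), floor)
too_early = at_or_after_floor(("binlog.000041", 4096), floor)
```

Comparing the file index before the position matters: position 512 in a later file is after position 7731 in an earlier one.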

When the target predates binlog_expire_logs_seconds, the segment is no longer on disk and the resolver must first pull it from object storage. A metadata index that maps timestamp_range -> object_key (mirrored in the manifest the sync pipeline commits) lets the fetch bypass a full-archive scan and land the one segment the target second lives in.
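
A metadata index of this kind can be as simple as a sorted list searched with bisect. The sketch below assumes each archived segment records the timestamps of its first and last events and that the list is in log order, so both columns are non-decreasing. Segment, find_segments, and the object keys are hypothetical.

```python
import bisect
from typing import NamedTuple


class Segment(NamedTuple):
    first_epoch: int   # timestamp of the first event in the segment
    last_epoch: int    # timestamp of the last event in the segment
    object_key: str


def find_segments(index: list[Segment], target_utc: int) -> list[str]:
    """Return the object keys whose timestamp range contains the target second.

    More than one key can come back, because a single second can straddle
    a log rotation.
    """
    starts = [seg.first_epoch for seg in index]
    ends = [seg.last_epoch for seg in index]
    lo = bisect.bisect_left(ends, target_utc)      # first segment ending at or after target
    hi = bisect.bisect_right(starts, target_utc)   # one past the last segment starting by target
    return [seg.object_key for seg in index[lo:hi]]


# Hypothetical index; second 200 is shared by two segments.
index = [Segment(100, 200, "binlog.000040.zst"),
         Segment(200, 300, "binlog.000041.zst"),
         Segment(301, 400, "binlog.000042.zst")]
keys = find_segments(index, 200)
```

An empty result means the target falls outside every archived range, which is the unrecoverable case that should surface the earliest valid time.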

Error Handling & Failure Modes

Timestamp resolution fails in a small set of well-defined ways; map each to a defined action rather than a retry loop.

ERROR 1236 (HY000): Could not find first log file name in binary log index file — the resolver referenced a segment MySQL already purged because the target predates the local retention floor. Action: fetch the archived segment from object storage and re-run the resolver against start_file set to the fetched log; if it is not archived either, the target is unrecoverable and the earliest valid time must be surfaced. The retry/backoff scaffolding for the fetch itself lives in Error Handling & Retry Logic.
ERROR 3546 (HY000): @@GLOBAL.GTID_PURGED cannot be changed during replay — the base backup’s gtid_purged does not line up with the archived chain’s starting GTID, so the resolved boundary sits on a discontinuous chain. Action: reconcile the base-backup and archiving retention windows and re-pin the starting object before replaying.
ERROR 1837 / ERROR 1840 on gtid_purged/gtid_executed — the recovery instance already has GTIDs that collide with the set you are trying to seed. Action: restore onto a clean instance whose gtid_executed is empty, then apply the resolved stop set.
Ambiguous second with no GTIDs — gtid_mode=OFF and two commits share the target second, so neither a datetime nor a portable GTID stop is available. Action: fall back to --stop-position against the exact log_pos the header scan returned, and treat the missing GTID mode as a finding to remediate before the next incident.
Naive timestamp with no explicit zone — the operator supplied 2026-07-04 14:07:03 with no offset. Action: refuse to guess; require assume_zone to be set explicitly (the resolver binds it, it never defaults silently to the server locale), because a wrong zone moves the boundary by whole hours.

Fail loud on every one of these rather than replaying a boundary that recovers the wrong window. Route unrecoverable-target cases to the same fallback path described in Fallback Routing Strategies.
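
Mapping each failure to a fixed action can be expressed as a lookup table rather than branching retry code. The sketch below uses the error numbers from the list above; the Action names and the default of failing loudly for anything unlisted are assumptions of this example.

```python
from enum import Enum


class Action(Enum):
    FETCH_ARCHIVE = "fetch the archived segment, then re-run the resolver"
    RECONCILE_CHAIN = "reconcile base backup and archive retention windows"
    CLEAN_INSTANCE = "restore onto an instance with empty gtid_executed"
    FAIL = "surface the error; do not replay"


# Error numbers come from the failure list; the mapping itself is illustrative.
ACTIONS = {
    1236: Action.FETCH_ARCHIVE,
    3546: Action.RECONCILE_CHAIN,
    1837: Action.CLEAN_INSTANCE,
    1840: Action.CLEAN_INSTANCE,
}


def action_for(errno: int) -> Action:
    """Pick the defined action for a MySQL error number, failing loudly otherwise."""
    return ACTIONS.get(errno, Action.FAIL)


chosen = action_for(1236)
```

Keeping the table explicit means an unknown error never falls into a silent retry; it lands on FAIL and reaches an operator.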

Observability & Alerting

Treat resolution correctness and archive reachability as first-class signals, not incident-time surprises. Export metrics that reveal a targeting problem before an operator needs the target:

pitr_resolution_seconds — histogram of resolve latency; a rising p99 usually means header scans are ranging across more segments than expected, i.e. the retention floor is too tight.
pitr_target_before_window_total — count of targets rejected as older than the earliest available event; a nonzero rate means your retention or archiving lag is shorter than your real detection time.
pitr_ambiguous_second_total — count of resolutions that required position/GTID disambiguation; not an error, but a useful measure of commit density around the times you recover to.
pitr_archive_fetch_total{result} and pitr_archive_miss_total — every miss is a segment that should have been archived and was not; page on it.

Anchor lag measurement to MySQL’s own authoritative position so exposure is measured, not guessed:

-- MySQL 8.0+ : live coordinate + gtid_executed in one snapshot
SELECT * FROM performance_schema.log_status\G
-- Compare the reported LOCAL binary_log_file/position against the newest
-- verified manifest row to compute how far the archive trails the live tail.
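
The comparison described in that comment can be sketched as follows. It assumes the LOCAL column is JSON with binary_log_file and binary_log_position keys and that binlog names end in a numeric suffix; archive_lag and the sample values are hypothetical.

```python
import json


def archive_lag(local_json: str, manifest_file: str, manifest_pos: int) -> tuple[int, int]:
    """Return (whole segments behind, byte offset difference).

    The byte difference is only meaningful when the segment count is 0.
    """
    local = json.loads(local_json)
    live_index = int(local["binary_log_file"].rsplit(".", 1)[1])
    archived_index = int(manifest_file.rsplit(".", 1)[1])
    return live_index - archived_index, local["binary_log_position"] - manifest_pos


# Hypothetical LOCAL value and newest verified manifest row.
local_column = json.dumps({
    "gtid_executed": "3e11fa47-71ca-11e1-9e33-c80aa9429562:1-900",
    "binary_log_file": "binlog.000042",
    "binary_log_position": 8192,
})
lag = archive_lag(local_column, "binlog.000040", 4096)
```

Exporting the first element as a gauge gives a direct reading of how many segments of history would be lost if the source disappeared right now.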

Emit structured fields (utc_epoch, log_file, log_pos, gtid, mode, ambiguous) as JSON, exactly as the resolver above does, so a single resolution can be replayed from an operator’s incident timestamp all the way to the coordinate a recovery run consumed. Where the target’s segment had to be pulled from storage, correlate against the ordered delivery guarantees of the Async Processing & Queue Management worker pool so a resolution delay can be attributed to fetch backpressure rather than to the scan itself.

Frequently Asked Questions

Why can't --stop-datetime alone give me exact recovery precision?

Binary log event headers store time as whole seconds of the UTC epoch; there is no sub-second field. If two transactions commit in the same second, --stop-datetime cannot separate them: it stops at the first event whose timestamp is equal to or later than the argument, so it either keeps both or excludes both. It is also boundary-exclusive, so a datetime equal to a commit’s second drops that commit. For a unique, inclusive boundary you must narrow to the second with a datetime, then pin the exact transaction with --stop-position or with a GTID set passed to --include-gtids or --exclude-gtids.

Should I target a timestamp or a GTID for recovery?

Resolve from a timestamp because that is how incidents are reported, but recover to a GTID stop set whenever gtid_mode=ON. A GTID boundary is independent of replay order and survives a log rotation or a re-derived byte position, whereas a raw log_pos is only valid against the specific log file it was read from. The resolver in this page emits a GTID stop set as its strongest output and falls back to a position only when GTIDs are unavailable.

My incident timestamp came from an app log in local time — what's the safe conversion?

Bind the naive local time to an explicit IANA zone (for example Europe/Berlin) with zoneinfo, then convert to UTC epoch seconds before any comparison — never let the server’s session zone guess for you. The most common failure is a daylight-saving offset that shifts the boundary by an hour and recovers 3,600 seconds too much or too little. Running the server itself with time_zone='+00:00' and log_timestamps=UTC removes the ambiguity from the evidence you target against in the first place.
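
A short sketch of that conversion, including the daylight-saving trap: it asks zoneinfo for both readings of a wall-clock time (fold=0 and fold=1) and reports every UTC epoch the input could mean. The utc_candidates helper is hypothetical, and the example relies on the IANA time zone database being installed.

```python
import datetime as dt
import zoneinfo


def utc_candidates(naive_local: str, zone: str) -> list[int]:
    """Return every UTC epoch a naive local time could mean in `zone`.

    One value is the normal case. Two values mean the wall-clock time was
    repeated (clocks went back) or skipped (clocks went forward), and an
    operator has to choose.
    """
    parsed = dt.datetime.fromisoformat(naive_local)
    if parsed.tzinfo is not None:
        raise ValueError("expected a naive timestamp")
    tz = zoneinfo.ZoneInfo(zone)
    epochs = {int(parsed.replace(tzinfo=tz, fold=fold).timestamp()) for fold in (0, 1)}
    return sorted(epochs)


# An ordinary summer time, and a time that occurred twice when clocks went back.
summer = utc_candidates("2026-07-04 14:07:03", "Europe/Berlin")
repeated = utc_candidates("2023-10-29 02:30:00", "Europe/Berlin")
```

When two candidates come back they are 3,600 seconds apart, which is exactly the one-hour error described above; the resolver should refuse to pick one silently.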

How do I target a timestamp older than binlog_expire_logs_seconds?

The segment is no longer on local disk, so the resolver must fetch it from object storage first. Use a metadata index that maps a timestamp_range to an object_key so the fetch pulls only the one segment the target second lives in rather than scanning the whole archive, verify the segment’s checksum, then run the resolver with start_file pointed at the fetched log. If the target predates even the oldest archived segment or the base backup, it is unrecoverable — surface the earliest valid recovery time instead of starting mid-chain.

Related

Base Backup Integration for PITR — pins the starting position the resolved coordinate must contiguously extend.
AWS S3 & GCS Sync Pipelines — the ordered, checksum-verified transport that produces the archived chain you resolve against.
Rotation Scheduling & Cron Automation — controls segment size and purge timing, which set the granularity and reach of timestamp targeting.
GTID Tracking & Enforcement — the gap-free GTID stream that makes a deterministic stop set possible.
The Binary Log — MySQL 8.0 Reference Manual — canonical documentation for event headers and mysqlbinlog targeting options.
