What Does Partial Resynchronization Mean in Redis?


Understanding Redis Partial Resynchronization (PSYNC)

Diagram illustrating Redis master-replica architecture with data flow

Explore how Redis uses Partial Resynchronization (PSYNC) to efficiently recover from master-replica disconnections, minimizing data transfer and improving replication performance.

Redis replication is a crucial feature for high availability and scalability. When a Redis replica disconnects from its master, it needs to resynchronize to catch up on any missed data. A full resynchronization transfers the entire dataset as an RDB snapshot, which can be resource-intensive. To mitigate this, Redis 2.8 introduced partial resynchronization through the PSYNC command, a mechanism that aims to transfer only the missing data. This article covers the mechanics of PSYNC, its benefits, and how it works under the hood.

The Need for Partial Resynchronization

In a Redis replication setup, the master continuously sends a stream of commands to its replicas. If a replica disconnects for a short period, it's likely that only a small amount of data has changed on the master. A full resynchronization in such scenarios would be inefficient, consuming significant network bandwidth and CPU cycles on both the master and the replica. This is where PSYNC comes into play, offering a more intelligent and performant way to bring replicas up to date.

flowchart TD
    A[Replica Disconnects] --> B{Can PSYNC be used?}
    B -->|Yes| C[Replica sends PSYNC command]
    C --> D{Replication ID matches and offset is in backlog?}
    D -->|Yes| E[Master sends missing commands]
    E --> F[Replica applies commands]
    F --> G[Replication Resumed]
    D -->|No| H[Master replies FULLRESYNC and sends RDB snapshot]
    H --> I[Replica performs full resynchronization]
    I --> G
    B -->|No| H

Redis Resynchronization Decision Flow

How PSYNC Works: Replication ID and Offset

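At the core of PSYNC are two key pieces of information: the replication ID (replid) and the replication offset. Every Redis master has a replication ID, a pseudo-random string that identifies one particular history of the dataset. When a master restarts, it generates a new replication ID. A replica promoted to master during a failover also starts a new replication ID, but since Redis 4.0 it remembers the previous one, so the other replicas can often continue with a partial resynchronization. The replication offset is a monotonically increasing counter that tracks how many bytes of the replication stream the master has produced for its replicas.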
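Both values can be inspected on a running server with the INFO replication command. The following is a minimal sketch that parses such output and pulls out the replication ID and offset. The sample text and all of its values are made up for illustration, and the helper name parse_info is hypothetical.

```python
# Illustrative INFO replication output; every value here is invented.
SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:1
master_replid:8c2f1e0d5b7a4c3e9f6a1b2c3d4e5f60718293a4
master_repl_offset:1523
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1523
"""


def parse_info(text: str) -> dict:
    """Turn 'key:value' lines into a dict, skipping blanks and '#' headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


info = parse_info(SAMPLE_INFO)
replid = info["master_replid"]            # 40-character replication ID
offset = int(info["master_repl_offset"])  # bytes of replication stream so far
print(replid, offset)
```

Comparing master_repl_offset on the master with the offset a replica reports gives a rough picture of how far behind that replica is.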

When a replica connects to a master, it stores the master's replication ID and its current replication offset. If the replica disconnects and then reconnects, it attempts to perform a PSYNC by sending its last known master replication ID and offset to the current master. The master then checks if it can satisfy the request using its replication backlog.

PSYNC <master_replid> <offset>

The PSYNC command sent by a replica to its master
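On the wire, PSYNC is sent like any other Redis command, as a RESP array of bulk strings. The sketch below builds those bytes. It assumes two details of Redis's behavior that are not stated above: a replica with no stored state sends "PSYNC ? -1", and a replica with state requests the first byte it does not yet have (its stored offset plus one). The function names are hypothetical.

```python
from typing import Optional


def encode_resp_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def build_psync(replid: Optional[str], applied_offset: Optional[int]) -> bytes:
    """Build the PSYNC request a replica would send.

    Assumption: with no saved state the replica asks for a full sync using
    '? -1'; otherwise it asks for the byte right after what it has applied.
    """
    if replid is None or applied_offset is None:
        return encode_resp_command("PSYNC", "?", "-1")
    return encode_resp_command("PSYNC", replid, str(applied_offset + 1))


print(build_psync(None, None))
print(build_psync("a" * 40, 99))
```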

The Replication Backlog Buffer

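The master maintains a fixed-size circular buffer called the replication backlog. This buffer holds the most recent portion of the replication stream, that is, the write commands the master has propagated to its replicas. Because the buffer has a fixed size, new data overwrites the oldest data once it is full. When a replica requests a PSYNC, the master checks whether the requested offset falls within the range of data currently available in its backlog.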
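To make the circular-buffer idea concrete, here is a small simulation. It is a teaching sketch and not Redis's actual implementation: the class name and attributes are hypothetical, and it assumes offsets are 1-based so that the first byte ever written has offset 1.

```python
from typing import Optional


class ReplicationBacklog:
    """Toy fixed-size circular buffer indexed by replication offset."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.buf = bytearray(size)
        self.idx = 0            # next write position inside buf
        self.histlen = 0        # number of valid bytes currently stored
        self.master_offset = 0  # total bytes ever appended

    def feed(self, data: bytes) -> None:
        """Append bytes, overwriting the oldest data when the buffer is full."""
        for b in data:
            self.buf[self.idx] = b
            self.idx = (self.idx + 1) % self.size
        self.master_offset += len(data)
        self.histlen = min(self.size, self.histlen + len(data))

    @property
    def first_byte_offset(self) -> int:
        """Offset of the oldest byte still held in the buffer."""
        return self.master_offset - self.histlen + 1

    def read_from(self, offset: int) -> Optional[bytes]:
        """Return all bytes from offset onward, or None if out of range."""
        if offset < self.first_byte_offset or offset > self.master_offset + 1:
            return None
        count = self.master_offset - offset + 1
        start = (self.idx - count) % self.size
        return bytes(self.buf[(start + i) % self.size] for i in range(count))


backlog = ReplicationBacklog(8)
backlog.feed(b"abcdef")
print(backlog.read_from(4))   # b'def'
backlog.feed(b"ghijkl")       # wraps around; 'abcd' is overwritten
print(backlog.read_from(5))   # b'efghijkl'
print(backlog.read_from(4))   # None: that byte is no longer in the backlog
```

The last call shows the failure case: once the requested offset has been overwritten, the missing data cannot be served from the backlog.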

If the replication ID matches and the requested offset is within the backlog, the master replies with +CONTINUE and sends only the data from that offset up to its current offset, resynchronizing the replica with minimal data transfer. If the requested offset is too old (i.e., the data has been overwritten in the backlog) or the replication ID doesn't match, the master replies with +FULLRESYNC followed by its replication ID and current offset, and the replica must perform a full resynchronization.
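The master's decision can be sketched as a single function. This is a simplified model under stated assumptions: it ignores the secondary replication ID that Redis 4.0 and later keep after a failover, it returns a bare +CONTINUE, and the MasterState type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    replid: str
    master_offset: int         # bytes of replication stream produced so far
    backlog_active: bool       # whether a backlog currently exists
    backlog_first_offset: int  # offset of the oldest byte in the backlog
    backlog_histlen: int       # number of valid bytes in the backlog


def psync_reply(master: MasterState, req_replid: str, req_offset: int) -> str:
    """Return the reply line a master would send for PSYNC (simplified)."""
    same_history = req_replid == master.replid
    in_backlog = (
        master.backlog_active
        and master.backlog_first_offset
        <= req_offset
        <= master.backlog_first_offset + master.backlog_histlen
    )
    if same_history and in_backlog:
        return "+CONTINUE"
    return f"+FULLRESYNC {master.replid} {master.master_offset}"


m = MasterState("a" * 40, 5000, True, 4001, 1000)
print(psync_reply(m, "a" * 40, 4500))  # offset still in the backlog
print(psync_reply(m, "a" * 40, 3000))  # offset already overwritten
print(psync_reply(m, "b" * 40, 4500))  # different replication ID
```

Only the first call yields a partial resynchronization; the other two fall back to a full one.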

Tip: The size of the replication backlog is configurable via the repl-backlog-size directive in redis.conf. A larger backlog increases the chances of successful PSYNCs, especially for replicas that might be disconnected for longer periods, but also consumes more memory on the master.

# Default replication backlog size is 1mb
# repl-backlog-size 1mb

# Example: increase the backlog size to 128mb
# (if the directive appears more than once, the last value takes effect)
repl-backlog-size 128mb

Configuring the replication backlog size in redis.conf
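One common rule of thumb for choosing a value is to multiply the master's write throughput by the longest disconnection that should still be recoverable, then add headroom. The sketch below does that arithmetic. The function names, the safety factor of 2, and the example numbers are all assumptions for illustration, not Redis recommendations.

```python
import math


def suggested_backlog_bytes(write_bytes_per_sec: float,
                            max_outage_sec: float,
                            safety_factor: float = 2.0) -> int:
    """Estimate backlog size: write rate x tolerated outage x headroom."""
    return math.ceil(write_bytes_per_sec * max_outage_sec * safety_factor)


def as_conf_line(size_bytes: int) -> str:
    """Format as a redis.conf line, rounding up to whole mb (1024 * 1024 bytes)."""
    mb = math.ceil(size_bytes / (1024 * 1024))
    return f"repl-backlog-size {mb}mb"


# Hypothetical workload: 500,000 bytes/s of writes, 60 s tolerated outage.
size = suggested_backlog_bytes(500_000, 60)
print(size)                # 60000000
print(as_conf_line(size))  # repl-backlog-size 58mb
```

The write rate can be estimated by sampling the master's replication offset twice and dividing the difference by the elapsed time.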

Benefits and Limitations of PSYNC

PSYNC significantly improves the efficiency of Redis replication by reducing the need for full resynchronizations. This leads to:

Reduced Network Bandwidth: Only the delta of changes is transferred.
Faster Recovery: Replicas catch up much quicker after short disconnections.
Lower Master Load: The master doesn't need to generate and transfer a full RDB file as frequently.

However, PSYNC has limitations:

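Master Restart/Failover: If the master restarts, it generates a new replication ID, so reconnecting replicas will usually need a full resynchronization. After a failover the outcome depends on the Redis version: since Redis 4.0 the promoted replica remembers the previous replication ID, so other replicas can often resynchronize partially, while older versions require a full resynchronization.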
Replication Backlog Size: If a replica is disconnected for too long and its requested offset falls outside the master's replication backlog, a full resynchronization will be triggered.
Replica Restart: If a replica restarts and loses its replication state, it will also need to perform a full resynchronization.

Diagram showing successful partial resynchronization versus full resynchronization

Comparison of data transfer during partial vs. full resynchronization