Guid to Base34 encoder/decoder

GUID to Base34 Encoding and Decoding in C#

Learn how to efficiently encode and decode GUIDs into a compact, URL-safe Base34 string representation using C#.

Globally Unique Identifiers (GUIDs), also known as UUIDs, are 128-bit numbers used to uniquely identify information in computer systems. While incredibly useful for their uniqueness, their standard string representation (e.g., 3f2504e0-4f89-11d3-9a0c-0305e82c3301) is 36 characters long, which is bulky in URLs, file names, and anything a person has to read or type. The hyphens themselves are URL-safe; the problem is length. This article explores a practical approach to converting GUIDs into a shorter, URL-safe Base34 string and back again using C#.

Why Base34 for GUIDs?

The standard string representation of a GUID is 36 characters long (32 hex digits + 4 hyphens). When used in URLs, file names, or other contexts where brevity matters, this can be unwieldy. Base64 is more compact (22 characters for 16 bytes once padding is stripped), but the standard Base64 alphabet includes +, /, and = characters, which require URL encoding, and it mixes upper and lower case. Base34, by contrast, can be built from digits and uppercase letters only. Since 0-9 plus A-Z is 36 characters, exactly two letters have to be dropped; I and O are the natural choices because they are easily confused with 1 and 0. (Dropping L and U as well, as Crockford's Base32 does, would leave only 32 symbols.) The result is inherently URL-safe and needs at most 26 characters for a GUID: a little longer than Base64, but ten characters shorter than the standard form.
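To put numbers on the length comparison, the sketch below counts how many digits are needed to represent every 128-bit value in a given radix. EncodedLength and DigitsFor are illustrative names chosen for this example, not part of any library.

```csharp
using System;
using System.Numerics;

public static class EncodedLength
{
    // Returns the smallest digit count n such that radix^n >= 2^bits,
    // i.e. enough digits to write any value of the given bit width.
    public static int DigitsFor(int radix, int bits = 128)
    {
        if (radix < 2) throw new ArgumentOutOfRangeException(nameof(radix));

        BigInteger limit = BigInteger.One << bits; // 2^bits
        BigInteger power = BigInteger.One;         // radix^digits
        int digits = 0;
        while (power < limit)
        {
            power *= radix;
            digits++;
        }
        return digits;
    }
}
```

With these definitions, DigitsFor(16) gives 32, DigitsFor(34) gives 26, and DigitsFor(64) gives 22.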

flowchart TD
    A["GUID (128-bit)"] --> B{Convert to byte array}
    B --> C[BigInteger representation]
    C --> D{Base34 Encoding}
    D --> E[Base34 String]
    E --> F{Base34 Decoding}
    F --> G[BigInteger representation]
    G --> H{Convert to byte array}
    H --> I["GUID (128-bit)"]

GUID to Base34 Encoding/Decoding Workflow

Implementing the Base34 Encoder/Decoder

The core idea is to treat the 128-bit GUID as a large integer. We can convert the GUID's byte array into a BigInteger, then perform base conversion on this BigInteger to our custom Base34 alphabet. The reverse process involves converting the Base34 string back to a BigInteger and then to a GUID byte array.

using System;
using System.Linq;
using System.Numerics;
using System.Text;

public static class GuidBase34Converter
{
    private const string Alphabet = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ"; // 34 characters: 0-9 and A-Z without I and O
    private static readonly BigInteger Base = new BigInteger(Alphabet.Length);

    public static string Encode(Guid guid)
    {
        // Guid.ToByteArray() returns 16 bytes in .NET's mixed-endian field layout.
        // BigInteger(byte[]) reads its input as little-endian two's complement,
        // so the array is reversed to make the first GUID byte the most significant one.
        byte[] guidBytes = guid.ToByteArray();
        Array.Reverse(guidBytes);

        // In little-endian order the LAST byte is the most significant. Leaving an
        // extra zero byte there keeps the value non-negative even when the top bit
        // of the GUID data is set.
        byte[] bigIntBytes = new byte[guidBytes.Length + 1];
        Buffer.BlockCopy(guidBytes, 0, bigIntBytes, 0, guidBytes.Length);
        BigInteger number = new BigInteger(bigIntBytes);

        StringBuilder result = new StringBuilder();
        while (number > 0)
        {
            number = BigInteger.DivRem(number, Base, out BigInteger remainder);
            result.Insert(0, Alphabet[(int)remainder]);
        }

        // Handle the case of a zero GUID
        if (result.Length == 0)
        {
            return Alphabet[0].ToString();
        }

        return result.ToString();
    }

    public static Guid Decode(string base34String)
    {
        if (string.IsNullOrEmpty(base34String))
        {
            throw new ArgumentException("Base34 string must not be null or empty.", nameof(base34String));
        }

        BigInteger number = BigInteger.Zero;
        foreach (char c in base34String)
        {
            int index = Alphabet.IndexOf(c);
            if (index == -1)
            {
                throw new ArgumentException($"Invalid character '{c}' in Base34 string.", nameof(base34String));
            }
            number = number * Base + index;
        }

        // ToByteArray() returns little-endian bytes of minimal length, plus a
        // trailing 0x00 sign byte when the top bit of the value is set.
        byte[] bigIntBytes = number.ToByteArray();

        // Anything beyond 16 bytes other than that single sign byte means the
        // input encodes a value that does not fit in 128 bits.
        if (bigIntBytes.Length > 17 || (bigIntBytes.Length == 17 && bigIntBytes[16] != 0))
        {
            throw new ArgumentException("Base34 string encodes a value larger than 128 bits.", nameof(base34String));
        }

        // Copy at most 16 bytes: shorter values are zero-padded at the high end,
        // and the sign byte, if present, is dropped.
        byte[] guidBytes = new byte[16];
        Buffer.BlockCopy(bigIntBytes, 0, guidBytes, 0, Math.Min(bigIntBytes.Length, 16));

        Array.Reverse(guidBytes); // Undo the reversal applied in Encode

        return new Guid(guidBytes);
    }
}
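The extra zero byte in Encode is easy to get wrong, so here is a minimal sketch that isolates it. It relies on the documented behaviour of the BigInteger(byte[]) constructor, which reads little-endian two's complement; BigIntegerByteOrderDemo and FromUnsignedLittleEndian are hypothetical names used only for illustration.

```csharp
using System;
using System.Numerics;

public static class BigIntegerByteOrderDemo
{
    // Interprets the bytes as an unsigned little-endian integer by leaving a
    // zero byte in the most significant (last) position, as Encode does.
    public static BigInteger FromUnsignedLittleEndian(byte[] bytes)
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));

        byte[] padded = new byte[bytes.Length + 1]; // final byte stays 0x00
        Buffer.BlockCopy(bytes, 0, padded, 0, bytes.Length);
        return new BigInteger(padded);
    }
}
```

Without the padding, new BigInteger(new byte[] { 0xFF }) is -1, whereas FromUnsignedLittleEndian(new byte[] { 0xFF }) is 255. The byte order shows up in FromUnsignedLittleEndian(new byte[] { 0x01, 0x02 }), which is 0x0201 (513). Newer .NET versions also offer a span-based constructor with an isUnsigned parameter that can replace the manual padding; check the documentation for your target framework before relying on it.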

Usage and Considerations

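Using the GuidBase34Converter is straightforward. You can encode any Guid into a Base34 string and decode a valid Base34 string back into a Guid. The resulting string is never longer than 26 characters, which is always enough for a 128-bit value, and most randomly generated GUIDs come out at 25 or 26 characters, a significant reduction from 36. Leading zero digits are not emitted, so the length varies (Guid.Empty encodes to the single character 0). The output contains only URL-safe characters from the defined alphabet.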
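If identifiers of varying length are inconvenient (for example in fixed-width columns or when sorting as text), the encoded string can be left-padded with the alphabet's zero digit. The sketch below assumes a width of 26 and '0' as the zero digit; Base34Padding and ToFixedWidth are hypothetical names, not part of the converter above.

```csharp
using System;

public static class Base34Padding
{
    // 26 digits are enough for any 128-bit value in base 34.
    public const int FixedWidth = 26;

    // Left-pads an encoded value with the zero digit so every ID has the same length.
    public static string ToFixedWidth(string encoded, char zeroDigit = '0')
    {
        if (encoded == null) throw new ArgumentNullException(nameof(encoded));
        if (encoded.Length > FixedWidth)
        {
            throw new ArgumentException("Encoded value is longer than the fixed width.", nameof(encoded));
        }
        return encoded.PadLeft(FixedWidth, zeroDigit);
    }
}
```

Because Decode accumulates digits from left to right, leading zero digits do not change the decoded value, so a padded string should decode to the same Guid as the unpadded one.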

using System;

public class Example
{
    public static void Main(string[] args)
    {
        Guid originalGuid = Guid.NewGuid();
        Console.WriteLine($"Original GUID: {originalGuid}");

        string encodedString = GuidBase34Converter.Encode(originalGuid);
        Console.WriteLine($"Encoded Base34: {encodedString} (Length: {encodedString.Length})");

        Guid decodedGuid = GuidBase34Converter.Decode(encodedString);
        Console.WriteLine($"Decoded GUID:  {decodedGuid}");

        Console.WriteLine($"Match: {originalGuid == decodedGuid}");

        // Example with a known GUID
        Guid specificGuid = new Guid("3f2504e0-4f89-11d3-9a0c-0305e82c3301");
        string specificEncoded = GuidBase34Converter.Encode(specificGuid);
        Console.WriteLine($"\nSpecific GUID: {specificGuid}");
        Console.WriteLine($"Specific Encoded: {specificEncoded}"); // At most 26 characters, all taken from the converter's alphabet

        // Test with a zero GUID
        Guid zeroGuid = Guid.Empty;
        string zeroEncoded = GuidBase34Converter.Encode(zeroGuid);
        Console.WriteLine($"\nZero GUID: {zeroGuid}");
        Console.WriteLine($"Zero Encoded: {zeroEncoded}");
        Console.WriteLine($"Zero Decoded: {GuidBase34Converter.Decode(zeroEncoded)}");
    }
}

1. Define your Base34 Alphabet

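Carefully select the 34 characters that will form your encoding alphabet. Exclude ambiguous characters to improve readability and reduce errors. Since 0-9 plus A-Z gives 36 characters, exactly two letters must be dropped to reach 34; I and O are the usual candidates because they resemble 1 and 0.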

2. Convert GUID to BigInteger

Obtain the byte array of the GUID using guid.ToByteArray(). BigInteger reads a byte array as little-endian, so reverse the array to make the first GUID byte the most significant one. Then append a zero byte at the end of the array (the most significant position) to ensure the BigInteger is always positive.

3. Encode BigInteger to Base34 String

Repeatedly divide the BigInteger by the base (alphabet length), taking the remainder as the index into your alphabet. Prepend the character to your result string until the BigInteger becomes zero.

4. Decode Base34 String to BigInteger

Iterate through the Base34 string. For each character, find its index in the alphabet. Multiply the current BigInteger by the base and add the character's index.

5. Convert BigInteger back to GUID

Convert the resulting BigInteger back to a byte array. Pad it with zeros or drop the trailing sign byte so that it is exactly 16 bytes long. Then reverse the array to restore the order produced by guid.ToByteArray() before constructing the Guid.
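Because the radix is taken from the alphabet's length, a miscounted alphabet silently changes the base. A small guard, sketched here under the assumption that you validate once at startup, can catch that early. AlphabetCheck and Validate are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;

public static class AlphabetCheck
{
    // Throws if the alphabet does not consist of exactly the expected number
    // of distinct characters; otherwise returns the alphabet unchanged.
    public static string Validate(string alphabet, int expectedLength)
    {
        if (alphabet == null) throw new ArgumentNullException(nameof(alphabet));
        if (alphabet.Length != expectedLength)
        {
            throw new ArgumentException(
                $"Alphabet has {alphabet.Length} characters, expected {expectedLength}.", nameof(alphabet));
        }
        if (new HashSet<char>(alphabet).Count != alphabet.Length)
        {
            throw new ArgumentException("Alphabet contains duplicate characters.", nameof(alphabet));
        }
        return alphabet;
    }
}
```

For example, Validate("0123456789ABCDEFGHJKLMNPQRSTUVWXYZ", 34) returns the string unchanged, while an alphabet that also omits L and U has only 32 characters and is rejected.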