Obtain a Blogger's blog ID from its friendly URL without screen scraping
How to Extract a Blogger Blog ID from its Friendly URL

Learn how to reliably obtain a Blogger blog's unique ID from its user-friendly URL, bypassing the need for screen scraping or complex API calls. This method is crucial for integrating with Blogger's Data API.
When working with Blogger's Data API, you often need the blog's unique ID to perform operations like fetching posts or updating settings. While the blog ID is readily available in the dashboard URL (e.g., https://www.blogger.com/blog/posts/BLOG_ID), it's not directly exposed in the public-facing 'friendly' URL (e.g., https://example.blogspot.com/). This article provides a robust PHP-based solution to extract this ID without resorting to unreliable screen scraping techniques.
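Because the dashboard URL already carries the ID, a small helper can pull it out whenever that URL is at hand. The sketch below is illustrative only: the function name blogIdFromDashboardUrl is hypothetical, and it assumes the /blog/posts/BLOG_ID path shape quoted above with a purely numeric ID.

```php
<?php
// Hypothetical helper (not part of any Blogger library): extracts the numeric
// ID from a dashboard URL shaped like https://www.blogger.com/blog/posts/BLOG_ID.
function blogIdFromDashboardUrl(string $dashboardUrl): ?string
{
    // parse_url() returns null when there is no path and false on a malformed URL.
    $path = parse_url($dashboardUrl, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }

    // Assumption: the ID is the run of digits directly after /blog/posts/.
    if (preg_match('#^/blog/posts/(\d+)#', $path, $matches)) {
        return $matches[1];
    }
    return null;
}
```

A friendly URL such as https://example.blogspot.com/ has no such path segment, so the helper returns null for it, which is exactly the gap the rest of this article addresses.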
Understanding Blogger's Blog ID Structure
Blogger assigns a unique numerical ID to each blog. This ID is a crucial identifier for programmatic access via the Blogger Data API. The challenge arises because the public URL, which is what users typically see and share, does not contain this ID. Instead, it uses a subdomain or a path-based structure (e.g., yourblogname.blogspot.com or www.yourcustomdomain.com). The key to obtaining the ID without screen scraping lies in understanding how Blogger redirects or serves content, which often involves a hidden reference to this ID.
flowchart TD
    A["Friendly URL (e.g., example.blogspot.com)"] --> B{Make HTTP Request}
    B --> C{Examine Response Headers}
    C --> D{Look for 'X-Blog-ID' Header}
    D -- Found --> E[Extract Blog ID]
    D -- Not Found --> F[Error: Blog ID not found]
Process for extracting Blogger Blog ID from a friendly URL
The 'X-Blog-ID' Header Method
The most reliable and efficient way to get a Blogger blog ID from its friendly URL is to make an HTTP request to the blog's URL and inspect the response headers. Blogger often includes a custom header, X-Blog-ID, which contains the exact blog ID you need. This method is superior to screen scraping because it relies on a structured piece of data provided by Blogger itself, making it less prone to breakage from layout changes.
<?php
/**
 * Returns the blog ID reported in the X-Blog-ID response header,
 * or null if the request fails or the header is absent.
 */
function getBloggerBlogId(string $blogUrl): ?string
{
    // Ensure the URL has a scheme
    if (!preg_match('#^https?://#i', $blogUrl)) {
        $blogUrl = 'http://' . $blogUrl;
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $blogUrl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true);          // Include headers in the output
    curl_setopt($ch, CURLOPT_NOBODY, true);          // Send a HEAD request: headers only, no body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // Follow redirects
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // For local testing only; use true in production

    $response = curl_exec($ch);
    if ($response === false) {
        // Request failed (DNS error, timeout, TLS failure, ...)
        curl_close($ch);
        return null;
    }
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    $headers = substr($response, 0, $headerSize);
    curl_close($ch);

    // Header names are case-insensitive, hence the "i" flag
    if (preg_match('/^X-Blog-ID:\s*(\d+)/mi', $headers, $matches)) {
        return $matches[1];
    }
    return null;
}

// Example usage:
$friendlyUrl = 'https://blogger.googleblog.com/'; // Official Blogger blog
$blogId = getBloggerBlogId($friendlyUrl);
if ($blogId !== null) {
    echo "Blog ID for '{$friendlyUrl}': {$blogId}\n";
} else {
    echo "Could not find Blog ID for '{$friendlyUrl}'\n";
}

$friendlyUrl2 = 'https://example.blogspot.com/'; // Replace with a real blog URL for testing
$blogId2 = getBloggerBlogId($friendlyUrl2);
if ($blogId2 !== null) {
    echo "Blog ID for '{$friendlyUrl2}': {$blogId2}\n";
} else {
    echo "Could not find Blog ID for '{$friendlyUrl2}'\n";
}
?>
Make sure your curl requests handle redirects (CURLOPT_FOLLOWLOCATION), as Blogger URLs might redirect to their canonical versions, especially with custom domains.
How the PHP Code Works
The provided PHP function getBloggerBlogId takes a friendly Blogger URL as input and returns the blog ID, or null if none is found. It uses cURL to make a HEAD request (setting CURLOPT_NOBODY to true switches cURL to the HEAD method). This is efficient because only the HTTP headers are fetched, not the entire page content. The CURLOPT_FOLLOWLOCATION option is crucial because Blogger often uses redirects, especially for custom domains, to point to the actual blog content. After retrieving the headers, a regular expression searches for the X-Blog-ID header and extracts the numerical ID.
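When redirects are followed with CURLOPT_HEADER enabled, cURL returns one header block per hop, separated by blank lines. As a sketch of the extraction step on its own, the code below parses only the final block into a lowercase name-to-value map and looks up the header there. The function names parseFinalHeaders and blogIdFromHeaders are hypothetical, and the X-Blog-ID header name is taken from this article rather than from any documented guarantee.

```php
<?php
// Sketch: parse the raw header text cURL returns when CURLOPT_HEADER is on.
// With redirects followed, the text holds one header block per hop.
function parseFinalHeaders(string $rawHeaders): array
{
    // Blocks are separated by a blank line; keep the last one (the final response).
    $blocks = preg_split('/\r?\n\r?\n/', trim($rawHeaders));
    $finalBlock = end($blocks);

    $headers = [];
    foreach (preg_split('/\r?\n/', $finalBlock) as $line) {
        // Skip the status line (e.g. "HTTP/1.1 200 OK") and anything without a colon.
        if (stripos($line, 'HTTP/') === 0 || strpos($line, ':') === false) {
            continue;
        }
        [$name, $value] = explode(':', $line, 2);
        // Header names are case-insensitive, so normalize them to lowercase.
        $headers[strtolower(trim($name))] = trim($value);
    }
    return $headers;
}

// Returns the numeric ID from the final response's X-Blog-ID header, or null.
function blogIdFromHeaders(string $rawHeaders): ?string
{
    $headers = parseFinalHeaders($rawHeaders);
    $value = $headers['x-blog-id'] ?? '';
    return preg_match('/^\d+$/', $value) === 1 ? $value : null;
}
```

Reading only the final block is a design choice: it ignores headers set by intermediate redirect responses, whereas a single regular expression over the whole header text returns the first match from any hop.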
Although CURLOPT_SSL_VERIFYPEER is set to false in the example for broader compatibility, it is highly recommended to set it to true in production environments and to ensure your system has up-to-date CA certificates for secure connections.
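For production use, the same header-only request can be configured with certificate verification switched on. The following is a minimal sketch, not a definitive configuration: the function name secureHeaderRequestOptions, the optional CA bundle path parameter, and the 10-second timeout are all assumptions made for illustration.

```php
<?php
// Sketch: cURL options for a header-only request with TLS verification enabled.
// Hypothetical helper; requires the cURL extension when it is called.
function secureHeaderRequestOptions(string $url, ?string $caBundlePath = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HEADER         => true,  // include headers in the returned string
        CURLOPT_NOBODY         => true,  // HEAD request: no body
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_SSL_VERIFYPEER => true,  // verify the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,     // verify the certificate matches the host name
        CURLOPT_TIMEOUT        => 10,    // assumed value; tune for your environment
    ];

    // Only needed when the system CA store is missing or outdated.
    if ($caBundlePath !== null) {
        $options[CURLOPT_CAINFO] = $caBundlePath;
    }
    return $options;
}
```

The returned array can be applied in one call with curl_setopt_array($ch, secureHeaderRequestOptions($blogUrl)) in place of the individual curl_setopt calls.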