Python Requests library redirect new url

Handling Redirects and Accessing New URLs with Python Requests

Figure: An HTTP redirect chain, showing an initial request, an intermediate redirect, and the final destination URL, with arrows indicating the flow of redirection.

Learn how to effectively manage HTTP redirects in Python using the requests library, including accessing the final URL and understanding redirect behavior.

When making HTTP requests, it's common to encounter redirects. A server might tell your client that the resource you're looking for has moved to a different URL. The Python requests library handles these redirects automatically by default, but understanding how to inspect and control this behavior is crucial for robust web scraping, API interactions, and general HTTP client development. This article will guide you through managing redirects, accessing the final URL after a redirect, and configuring requests to suit your needs.

Understanding HTTP Redirects

HTTP redirects are a standard mechanism for a web server to inform a client (like your browser or a Python script) that the resource it requested is now located at a different URI. This is often indicated by HTTP status codes in the 3xx range (e.g., 301 Moved Permanently, 302 Found, 307 Temporary Redirect, 308 Permanent Redirect).
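To make these status codes concrete, the following sketch uses the standard library's http.HTTPStatus to label a code and report whether it is one of the redirect codes listed above. REDIRECT_CODES and describe_status are illustrative names, not part of requests, and 303 See Other is included on the assumption that you will want to treat it as a redirect too.

```python
from http import HTTPStatus

# Illustrative set (not a requests API): the redirect codes named above,
# plus 303 See Other.
REDIRECT_CODES = {
    HTTPStatus.MOVED_PERMANENTLY,   # 301
    HTTPStatus.FOUND,               # 302
    HTTPStatus.SEE_OTHER,           # 303
    HTTPStatus.TEMPORARY_REDIRECT,  # 307
    HTTPStatus.PERMANENT_REDIRECT,  # 308
}


def describe_status(code: int) -> str:
    """Return a label such as '301 Moved Permanently (redirect)'."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    kind = "redirect" if status in REDIRECT_CODES else "not a redirect"
    return f"{status.value} {status.phrase} ({kind})"


for code in (200, 301, 302, 307, 308):
    print(describe_status(code))
```

Note that not every 3xx code is a redirect to a new location: 304 Not Modified, for example, is deliberately left out of the set.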

By default, requests follows these redirects automatically for every method except HEAD (requests.head() does not follow redirects unless you pass allow_redirects=True). When you request a URL that redirects, requests transparently follows the chain until it reaches the final destination or exceeds the configured redirect limit. The response object you receive is for the final URL in the chain.

Figure: Typical HTTP redirect flow. The client sends a GET request to URL A, the server responds with a 302 redirect to URL B, the client automatically sends a GET request to URL B, and the server responds with 200 OK for URL B.

Accessing the Final URL After Redirects

After requests has followed a redirect, the response object provides several attributes to inspect the redirect chain and the final URL. The most important attribute for getting the final URL is response.url.

Additionally, response.history is a list of the Response objects from the redirect chain, ordered from the oldest to the most recent. Each entry is an intermediate redirect response, so you can see the status code and URL of every step. The final response itself is not part of the list.

import requests

# Example URL that redirects (e.g., a shortened URL or an old page)
redirecting_url = 'http://httpbin.org/redirect-to?url=http://httpbin.org/get'

response = requests.get(redirecting_url)

print(f"Initial request URL: {redirecting_url}")
print(f"Final URL after redirects: {response.url}")
print(f"Final status code: {response.status_code}")

if response.history:
    print("\nRedirect history:")
    for resp in response.history:
        print(f"  - {resp.status_code} {resp.url}")
else:
    print("\nNo redirects occurred.")

Getting the final URL and redirect history
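If you need the whole chain as data rather than printed text, a small helper can combine response.history with the final response. This is a minimal sketch: redirect_chain and _fake_response are hypothetical names, and the responses are built by hand with example.com URLs only so the snippet runs without network access. In real code the response would come from requests.get().

```python
import requests


def redirect_chain(response):
    """Return [(status_code, url), ...] for every hop, ending with the final response."""
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops


def _fake_response(status_code, url, history=()):
    # Hand-built Response for offline demonstration only.
    resp = requests.Response()
    resp.status_code = status_code
    resp.url = url
    resp.history = list(history)
    return resp


first = _fake_response(301, "http://example.com/old")
second = _fake_response(302, "http://example.com/moved")
final = _fake_response(200, "http://example.com/new", history=[first, second])

for status, url in redirect_chain(final):
    print(status, url)
```

Returning plain tuples keeps the result easy to log or compare in tests.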

Disabling Automatic Redirects

There are scenarios where you might want to prevent requests from automatically following redirects. For instance, you might want to manually inspect the redirect response, extract the redirect URL, or handle specific redirect codes differently. You can disable automatic redirects by setting the allow_redirects parameter to False in your request call.

import requests

redirecting_url = 'http://httpbin.org/redirect-to?url=http://httpbin.org/get'

# Disable automatic redirects
response_no_redirect = requests.get(redirecting_url, allow_redirects=False)

print(f"Request URL (no redirects): {redirecting_url}")
print(f"Status code: {response_no_redirect.status_code}")
print(f"Location header (redirect target): {response_no_redirect.headers.get('Location')}")
print(f"Response URL (original): {response_no_redirect.url}")

if response_no_redirect.is_redirect:
    print("\nThis was a redirect response!")
    print(f"Redirect target: {response_no_redirect.headers['Location']}")
else:
    print("\nThis was not a redirect response.")

Disabling redirects and inspecting the redirect response
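Once automatic redirects are off, you can follow the chain yourself and decide what to do at each hop. The sketch below is one way to do that, under some simplifying assumptions: follow_manually, max_hops, and timeout are hypothetical names and defaults, every hop is re-sent as a GET, and none of the extra work requests does internally (such as adjusting the method or headers between hops) is reproduced.

```python
import requests
from urllib.parse import urljoin


def follow_manually(session, url, max_hops=10, timeout=10):
    """Follow redirects one hop at a time.

    Returns (final_response, hops), where hops is a list of
    (status_code, from_url, to_url) tuples.
    """
    hops = []
    for _ in range(max_hops + 1):
        response = session.get(url, allow_redirects=False, timeout=timeout)
        if not response.is_redirect:
            return response, hops
        # The Location header may be relative, so resolve it against the current URL.
        target = urljoin(response.url, response.headers["Location"])
        hops.append((response.status_code, response.url, target))
        url = target
    raise requests.exceptions.TooManyRedirects(f"Exceeded {max_hops} redirects.")
```

A call might look like follow_manually(requests.Session(), 'http://httpbin.org/redirect/2'). The loop is the place to add per-hop checks, for example refusing to follow a redirect to a different host.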

💡 When allow_redirects is False, response.url is the URL of the initial request, not the redirect target. To find the target, check the Location header of the response.
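One detail to watch for when reading the Location header yourself: the value may be a relative reference rather than a full URL. A minimal sketch of resolving it with the standard library's urljoin follows. resolve_redirect_target is a hypothetical helper name, and the header values are made-up examples based on the httpbin.org host used above.

```python
from urllib.parse import urljoin


def resolve_redirect_target(request_url, location):
    """Resolve a Location header value against the URL that was requested."""
    return urljoin(request_url, location)


base = "http://httpbin.org/redirect/3"

# A relative Location is resolved against the request URL.
print(resolve_redirect_target(base, "/relative-redirect/2"))

# An absolute Location is returned unchanged.
print(resolve_redirect_target(base, "http://httpbin.org/get"))
```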

Handling Too Many Redirects

By default, requests allows at most 30 redirects. If a chain is longer than that (which can happen with misconfigured servers or malicious redirect loops), requests raises a TooManyRedirects exception. You can change the limit by setting the max_redirects attribute on a Session object, though setting it very high is generally not recommended.

import requests
from requests.exceptions import TooManyRedirects

# This URL will redirect many times
long_redirect_chain_url = 'http://httpbin.org/redirect/35' # 35 redirects

try:
    response = requests.get(long_redirect_chain_url)
    print(f"Successfully reached final URL: {response.url}")
except TooManyRedirects as exc:
    print("Error: Too many redirects encountered!")
    # The exception carries the last response received before the limit was hit
    if exc.response is not None:
        print(f"Last URL before exception: {exc.response.url}")

# Example with a custom redirect limit using a Session
with requests.Session() as session:
    session.max_redirects = 5 # Set a lower limit for demonstration
    try:
        response = session.get(long_redirect_chain_url)
        print(f"Successfully reached final URL with custom limit: {response.url}")
    except TooManyRedirects as exc:
        print(f"Error: Too many redirects encountered with custom limit ({session.max_redirects})!")
        # A Session has no history attribute; the last response is on the exception
        if exc.response is not None:
            print(f"Last URL before exception: {exc.response.url}")

Handling and configuring redirect limits

⚠️ Be cautious when increasing max_redirects. A very high limit lets a redirect loop on a misconfigured server run for a long time and consume resources before requests gives up.
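To keep the limit and its error handling in one place, you could wrap them in two small helpers. This is a sketch under stated assumptions: make_limited_session and get_or_last_hop are hypothetical names, and the defaults of 5 redirects and a 10 second timeout are arbitrary. It relies on the response attribute that requests exceptions carry, which for TooManyRedirects holds the last response received before the limit was hit.

```python
import requests
from requests.exceptions import TooManyRedirects


def make_limited_session(max_redirects=5):
    """Create a Session that gives up after max_redirects redirects."""
    session = requests.Session()
    session.max_redirects = max_redirects
    return session


def get_or_last_hop(session, url, timeout=10):
    """Return (response, None) on success, or (None, last_url) if the limit was hit."""
    try:
        return session.get(url, timeout=timeout), None
    except TooManyRedirects as exc:
        last = exc.response  # may be None, so guard before reading .url
        return None, (last.url if last is not None else None)
```

A call might look like get_or_last_hop(make_limited_session(5), 'http://httpbin.org/redirect/35'), which would be expected to return None together with the URL of the last hop reached.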
