Python Requests library redirect new url


Handling Redirects and Accessing New URLs with Python Requests


Learn how to effectively manage HTTP redirects in Python using the Requests library, including how to retrieve the final URL after a redirect and control redirect behavior.

When making HTTP requests, it's common for a server to respond with a redirect, instructing your client to fetch content from a different URL. The Python requests library handles these redirects automatically by default, but understanding how to inspect the redirect chain and access the final URL is crucial for many web scraping, API interaction, and testing scenarios. This article will guide you through the requests library's redirect mechanisms, showing you how to retrieve the new URL, disable automatic redirects, and analyze the redirect history.

Understanding Automatic Redirects

By default, the requests library follows HTTP redirects (status codes 301, 302, 303, 307, 308) for every method except HEAD, where allow_redirects defaults to False. When you request a URL that redirects, requests transparently follows the chain until it reaches the final destination, encounters an error, or exceeds the redirect limit (30 by default, after which it raises requests.exceptions.TooManyRedirects). The returned Response object holds the content from the final URL in the chain, and response.url is that final URL.

import requests

# Example URL that redirects (e.g., a shortened URL or old domain)
redirect_url = "http://httpbin.org/redirect-to?url=https://www.example.com"

# A timeout keeps the call from hanging if any server in the chain stalls
response = requests.get(redirect_url, timeout=10)

print(f"Original URL requested: {redirect_url}")
print(f"Final URL after redirects: {response.url}")
print(f"Final status code: {response.status_code}")

Demonstrating automatic redirect following and retrieving the final URL.
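Because redirects are followed silently, it can be useful to record whether a request actually ended up somewhere else. The following is a minimal sketch of one way to do that. The helper names redirect_summary and fetch_and_summarise are hypothetical (they are not part of requests), and the 10-second timeout is an arbitrary choice.

```python
import requests
from urllib.parse import urlsplit


def redirect_summary(requested_url, response):
    """Describe where a request ended up relative to where it started.

    `response` is any object with `.url` and `.history`, such as a
    requests.Response.
    """
    return {
        "requested": requested_url,
        "final": response.url,
        # history is non-empty only if at least one redirect was followed
        "redirected": bool(response.history),
        "hops": len(response.history),
        "host_changed": (
            urlsplit(requested_url).hostname != urlsplit(response.url).hostname
        ),
    }


def fetch_and_summarise(url, timeout=10):
    """Hypothetical convenience wrapper: GET the URL, then summarise it."""
    response = requests.get(url, timeout=timeout)
    return redirect_summary(url, response)
```

Calling fetch_and_summarise on a URL that redirects should report redirected as True and give the destination under "final". The host_changed flag is a quick way to spot redirects that leave the original domain.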

Inspecting the Redirect History

Sometimes knowing just the final URL isn't enough, and you need the entire sequence of redirects that occurred. The response.history attribute is a list of the intermediate Response objects, one for each redirect that was followed, ordered from the oldest to the most recent. The final response is not in this list; it is the Response object you already hold.

import requests

redirect_url = "http://httpbin.org/redirect/3" # This URL redirects 3 times

response = requests.get(redirect_url, timeout=10)

print(f"Final URL: {response.url}")
print(f"Redirect history ({len(response.history)} redirects):")
for i, hop in enumerate(response.history, start=1):
    # Location is printed exactly as the server sent it and may be a relative path
    location = hop.headers.get("Location")
    print(f"  Redirect {i}: {hop.status_code} from {hop.url} to {location}")

Accessing the redirect history to see each step of the redirection chain.
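A server may send a relative path in the Location header, so the raw header value is not always a usable URL. The sketch below resolves each hop's target against the URL that issued it, using urljoin from the standard library. The function names redirect_hops and print_redirect_chain are hypothetical helpers introduced for illustration, and the timeout value is an assumption.

```python
import requests
from urllib.parse import urljoin


def redirect_hops(response):
    """Return one (status_code, from_url, to_url) tuple per redirect followed.

    Relative Location values are resolved against the URL that issued them.
    """
    hops = []
    for hop in response.history:
        location = hop.headers.get("Location")
        # urljoin leaves absolute URLs alone and resolves relative ones
        target = urljoin(hop.url, location) if location else None
        hops.append((hop.status_code, hop.url, target))
    return hops


def print_redirect_chain(url, timeout=10):
    """Hypothetical helper: fetch `url`, print every hop, then the final URL."""
    response = requests.get(url, timeout=timeout)
    for status, source, target in redirect_hops(response):
        print(f"{status}: {source} -> {target}")
    print(f"{response.status_code}: {response.url} (final)")
```

Keeping redirect_hops separate from the network call makes it easy to reuse on any Response you already have.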


HTTP Redirect Flow with Multiple Hops

Disabling Automatic Redirects

Sometimes you want to stop requests from following redirects automatically. For instance, you might want to inspect the redirect response itself, read its headers (such as Location), or handle particular redirect status codes differently. To do this, pass allow_redirects=False in your request. The Response you get back is then the redirect response itself, and response.url is the URL you originally requested.

import requests

redirect_url = "http://httpbin.org/redirect-to?url=https://www.example.com"

# Disable automatic redirects
response = requests.get(redirect_url, allow_redirects=False, timeout=10)

print(f"Status code: {response.status_code}")
print(f"Response URL (original request): {response.url}")

# is_redirect is True for a redirect status code that carries a Location header
if response.is_redirect:
    print(f"This was a redirect! New location: {response.headers['Location']}")
else:
    print("The server did not return a redirect.")

Disabling automatic redirects to manually inspect the redirect response.
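With automatic following turned off, you can walk the chain yourself, one hop at a time. The following is a simplified sketch, not a replacement for the library's own logic: it always reissues a GET, does not carry cookies between hops, and the names follow_manually, max_hops, and fetch are assumptions made for this example. The fetch parameter exists only so the loop can be exercised without network access.

```python
import requests
from urllib.parse import urljoin


def follow_manually(url, max_hops=5, fetch=requests.get, timeout=10):
    """Follow redirects one hop at a time.

    Returns (final_response, visited_urls). Raises TooManyRedirects if more
    than `max_hops` redirects are encountered.
    """
    visited = [url]
    for _ in range(max_hops + 1):
        response = fetch(url, allow_redirects=False, timeout=timeout)
        if not response.is_redirect:
            return response, visited
        # Location may be relative, so resolve it against the current URL
        url = urljoin(response.url or url, response.headers["Location"])
        visited.append(url)
    raise requests.exceptions.TooManyRedirects(f"More than {max_hops} redirects")
```

A call such as follow_manually("http://httpbin.org/redirect/3") would return the final response together with every URL visited along the way. The point between reading Location and issuing the next request is where per-hop checks or logging can go.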

Practical Applications and Best Practices

Understanding redirect behavior is vital for robust web interactions. Here are some common use cases and best practices:

  • Web Scraping: Always check response.url to ensure you're scraping the content from the intended page, especially when dealing with shortened URLs or sites with dynamic redirects.
  • API Interactions: Some APIs might use redirects for authentication flows or resource relocation. Knowing how to handle them ensures your application follows the correct path.
  • Link Validation: When validating external links, disable redirects so that you see the first response itself. A 301 or 308 status means the link has moved permanently, which usually indicates that the stored URL is outdated and should be updated.
  • Performance: Each hop in a redirect chain costs an extra round trip, so a long chain slows the request down. len(response.history) tells you how many hops were followed.
  • Security: Be cautious with redirects, especially when dealing with user-provided URLs, as they can be used in phishing or open redirect vulnerabilities. Always validate the Location header if you're manually processing redirects.
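The advice on validating the Location header can be sketched as a small check that runs before each manually followed hop. This is an illustration under stated assumptions rather than a complete defense: the function name is_allowed_redirect and the allowed_hosts allowlist are hypothetical, and a real application may need stricter rules (for example, restricting ports or requiring https).

```python
from urllib.parse import urljoin, urlsplit


def is_allowed_redirect(current_url, location, allowed_hosts):
    """Return True if following `location` stays on http(s) and an allowed host.

    `allowed_hosts` is a hypothetical allowlist of lowercase hostnames.
    """
    # Resolve relative and scheme-relative values before inspecting them
    target = urlsplit(urljoin(current_url, location))
    if target.scheme not in ("http", "https"):
        return False
    # urlsplit().hostname is already lowercased
    return target.hostname in allowed_hosts
```

When requesting with allow_redirects=False, a check like this can be applied to response.headers["Location"] before deciding whether to issue the next request.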