Random User Agent String Python Script
This is more of a helpful snippet than a full program, but it can be handy to keep some user agent strings around for web scraping.
Some websites check the user agent string and filter the results of a request based on it. It’s a very simple way to deter automated scraping, but it is just as easy to get around, because the client chooses which user agent string it sends. Spam filters can also check the user agent to help detect automated posting.
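To make the idea concrete, here is a minimal sketch of sending a request with a chosen user agent string. It assumes Python 3 and the standard library's urllib.request; the URL, the constant names, and the build_request helper are hypothetical and only there for illustration.

```python
import urllib.request

# Hypothetical values for illustration only.
EXAMPLE_URL = "http://example.com/"
EXAMPLE_AGENT = (
    "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.3) "
    "Gecko/20090913 Firefox/3.5.3"
)


def build_request(url, user_agent):
    """Return a Request object that will send the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    request = build_request(EXAMPLE_URL, EXAMPLE_AGENT)
    # urllib stores header names capitalized as "User-agent".
    print(request.get_header("User-agent"))
    # To actually fetch the page: urllib.request.urlopen(request).read()
```

The server only sees whatever string ends up in that header, which is why this kind of filtering is so easy to sidestep.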
A great resource for finding and understanding what user agent strings mean is UserAgentString.com.
This simple snippet reads a file containing the user agent strings you want to use, one per line, and returns a random one from the list.
Here’s my source file UserAgents.txt:
Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.3) Gecko/20090913 Firefox/3.5.3
Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 (.NET CLR 3.5.30729)
Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 (.NET CLR 3.5.30729)
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.1) Gecko/20090718 Firefox/3.5.1
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/532.1 (KHTML, like Gecko) Chrome/4.0.219.6 Safari/532.1
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; InfoPath.2)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 1.1.4322; .NET CLR 3.5.30729; .NET CLR 3.0.30729)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.2; Win64; x64; Trident/4.0)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; SV1; .NET CLR 2.0.50727; InfoPath.2)
Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)
Mozilla/4.0 (compatible; MSIE 6.1; Windows XP)
And here is the Python code that makes getting a random agent very simple:
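#!/usr/bin/env python
# -*- coding: utf-8 -*-
# (C) 2009 HalOtis Marketing
# written by Matt Warren
# http://halotis.com/

import random

SOURCE_FILE = 'UserAgents.txt'


def get():
    """Return one random user agent string from SOURCE_FILE."""
    return random.choice(getAll())


def getAll():
    """Return every user agent string in SOURCE_FILE, one per line."""
    # The with block closes the file even if reading fails.
    with open(SOURCE_FILE) as f:
        # Skip blank lines so get() never returns an empty string.
        return [line.strip() for line in f if line.strip()]


if __name__ == '__main__':
    # print(...) with a single argument works on both Python 2 and 3.
    for agent in getAll():
        print(agent)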
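The snippet above re-reads the file on every call, which is fine for occasional use. For a scraping loop it may be nicer to load the list once and pick a fresh agent per request. The following is only a sketch under some assumptions: it targets Python 3, uses the standard library's urllib.request, and the helper names load_agents and random_request are made up for this example rather than taken from the original script.

```python
import random
import urllib.request


def load_agents(path):
    """Read user agent strings from a file, one per line, skipping blanks."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def random_request(url, agents):
    """Build a Request for url with a randomly chosen User-Agent header."""
    headers = {"User-Agent": random.choice(agents)}
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    # Load the list once, then reuse it for every request.
    agents = load_agents("UserAgents.txt")
    # Hypothetical URLs; replace with the pages you want to fetch.
    for url in ["http://example.com/a", "http://example.com/b"]:
        request = random_request(url, agents)
        print(url, "->", request.get_header("User-agent"))
        # To actually fetch: urllib.request.urlopen(request).read()
```

Keeping the list in memory avoids one file read per request, and passing it in as an argument makes it easy to swap in a different set of agents.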
You can grab the source code for this, along with my other scripts, from the Bitbucket repository.