Tip: How to find a browser user-agent string
February 14, 2005 on 11:21 am | In General, PHP, Java, Python
This is a quick way to find the user-agent string of any browser that supports JavaScript.
Type the following in the address bar.
javascript: alert(navigator.userAgent);
TruStudio Plugin for Eclipse
January 25, 2005 on 9:39 am | In PHP, Java, Python
If you are looking to do PHP development in Eclipse, TruStudio from xored software is a very good candidate as an IDE/plugin.
I used it in the past, when it was fully open-source, and liked it. I have not had a chance to try it again since then, although their marketing e-mail sounds promising. The products page, when I finally looked at it, lists two versions of the plugin: foundation and professional. The features page lists all the capabilities of TruStudio in detail and marks which ones belong to the free foundation version and which to the professional one.
If I have to do further PHP coding, I will seriously consider this, despite my reservations about Eclipse.
Technorati Tags: PHP, coding, Eclipse, plugin
Google News Parser in Python
March 27, 2004 on 9:37 am | In Python | 1 comment
NOTE:
Use at your own risk. Google News may not tolerate parsing of its web site.
In this article, I present a Python script that grabs the HTML from Google News and extracts links from specified sections.
The script makes use of the urllib and sgmllib libraries, both of which are included in the standard Python distribution.
One quirk is that it subclasses FancyURLopener so that the user-agent can be masqueraded, since Google blocks attempts to parse its pages.
This is done in the following code:
    class MyURLOpener(urllib.FancyURLopener):
        def __init__(self, *args):
            # set the user-agent before the base class builds its default headers
            self.version = "Mozilla//5.001 (windows; U; NT4.0; en-us) Gecko//25250101"
            urllib.FancyURLopener.__init__(self, *args)
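On Python 3, where FancyURLopener is deprecated, the same idea can be sketched with urllib.request. This is only an illustration, not the script's own code: the user-agent value and the build_request helper are made up for the example, and no request is actually sent.

```python
import urllib.request

# Hypothetical user-agent value, for illustration only.
USER_AGENT = "Mozilla/5.0 (compatible; ExampleFetcher/0.1)"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request object that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request("http://example.com/")
# urllib stores header names capitalized, i.e. as "User-agent".
print(request.get_header("User-agent"))
```

Passing the resulting object to urllib.request.urlopen() would send the request with that header.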
Another neat addition is caching. The expiry time of the cache file is set by the constant CACHE_DELAY.
CACHE_DELAY = 600 # in seconds
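A freshness check of this kind could look roughly like the sketch below. The function name cache_is_fresh and its parameters are assumptions made for the example; the original script may organize its cache differently.

```python
import os
import time

CACHE_DELAY = 600  # in seconds


def cache_is_fresh(path, max_age=CACHE_DELAY, now=None):
    """Return True if the file at path exists and is younger than max_age seconds."""
    if not os.path.exists(path):
        return False
    if now is None:
        now = time.time()
    # Compare the file's last modification time against the allowed age.
    return (now - os.path.getmtime(path)) < max_age
```

A caller would download the page again only when cache_is_fresh() returns False, and read the cached file otherwise.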
The parsing code is abstracted in the class GoogleNewsParser, a subclass of SGMLParser that, as its name implies, allows parsing the HTML code. The concept is simple. There are various tag handlers that are executed when the parser comes across the corresponding tags. For example:
    def start_a(self, attrs):
        for k, v in attrs:
            # ignore all entries except those for the category we are looking for
            if k == "name" and v == self.currentCategory:
                self.categoryOn = 1
                break
            elif k == "name" and v != self.currentCategory and self.categoryOn == 1:
                # a different named anchor means our category has ended
                self.categoryOn = 0
                self.setnomoretags()
        if self.categoryOn == 1:
            for k, v in attrs:
                # look for external links. when we find one, we start reading its title
                if k == "href" and re.search(r"^#", v) is None and re.search(r"^/news?", v) is None and re.search(r"^\.\./", v) is None:
                    self.currentAnchorHREF = v
                    self.dataOn = 1
                else:
                    self.currentAnchorHREF = ""
                    self.dataOn = 0
And, later:
    def end_a(self):
        if self.categoryOn == 1 and self.dataOn == 1:
            self.addUrl()
            self.dataOn = 0
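Since sgmllib is not available in Python 3, here is a rough equivalent of the handler idea using html.parser. It is a simplified sketch rather than the script's code: the class name, the sample HTML and the rule "external means the href starts with http:// or https://" are all assumptions made for the example.

```python
from html.parser import HTMLParser


class SectionLinkParser(HTMLParser):
    """Collect (href, title) pairs for external links inside one named section."""

    def __init__(self, category):
        super().__init__()
        self.category = category
        self.category_on = False   # True while inside the wanted section
        self.current_href = None   # set while reading an external link's title
        self.title_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        name = attrs.get("name")
        if name is not None:
            # A named anchor marks the start of a section.
            self.category_on = (name == self.category)
        href = attrs.get("href")
        if self.category_on and href and href.startswith(("http://", "https://")):
            self.current_href = href
            self.title_parts = []

    def handle_data(self, data):
        # Only text inside an accepted link is kept as its title.
        if self.current_href is not None:
            self.title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            title = "".join(self.title_parts).strip()
            self.links.append((self.current_href, title))
            self.current_href = None


# Made-up sample markup with three named sections.
SAMPLE_HTML = """
<a name="WORLD"></a><a href="http://example.com/w1">World story</a>
<a name="REGION"></a><a href="http://example.com/r1">Regional story</a>
<a href="#top">Back to top</a>
<a name="SPORTS"></a><a href="http://example.com/s1">Sports story</a>
"""

parser = SectionLinkParser("REGION")
parser.feed(SAMPLE_HTML)
print(parser.links)
```

As in the handlers above, a flag is switched on when the wanted named anchor appears and off when the next one starts, and link text is only collected while a second flag is set.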
Because of this abstraction, a program needs only a few lines to use the parser:
    # MAIN PROGRAM
    parser = GoogleNewsParser()
    parser.process(GOOGLE_URL, "REGION")
    # Print the links
    links = parser.getHrefs()
    for k in links:
        print(k[0] + ">>" + k[1])
The complete source code is here.
wxPython on Panther
October 27, 2003 on 8:46 pm | In Python
wxPython can be installed on Mac OS X 10.3 (Panther) after installing MacPython.
However, the installation of wxPython will fail. After looking around and not finding any solution on the web, I found the following cause.
The MacPython for Mac OS X 10.3 installer does not include the Python core, but instead creates the framework directories required for Python to work correctly. It does this by creating directories under /System/Library/Frameworks/Python.framework. Which is good.
The wxPython installer, on the other hand, attempts to install the wx and wxPython modules in /Library/Frameworks/Python.framework. Which is not good.
The solution is to manually move the files from /Library/Frameworks/Python.framework to /System/Library/Frameworks/Python.framework when the installation fails. Note that the modules should go in the site-packages directory somewhere within the latter.
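To see which site-packages directory a given interpreter actually uses, it can be asked directly. The snippet below is a small sketch for a current Python; the sysconfig module it relies on does not exist in the Python that shipped with Panther, so treat it as an illustration of the idea only.

```python
import sysconfig

# "purelib" is the directory where pure-Python third-party packages are installed.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```

The printed path is where the interpreter looks for modules such as wx and wxPython.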
For the wxPython demo app to run, you must edit RunDemo.app to point to the correct location of Python. Here is what needs to be done:
$ cd /Applications/wxPythonOSX-2.4.2.4/RunDemo.app/Contents/MacOS
$ vim RunDemo    (and change the first line to #!/usr/bin/python)
$ ln -sf /usr/bin/python Python
Repeat for the other demo apps.
Note: You may need root privileges to edit these files.

