Tuesday, November 24, 2009

Django template urlencode unicode characters (Google App Engine)

I have just started work on my next Google App Engine project, and I have been using the Django template urlencode filter to urlencode strings before displaying them as components inside a URL, like this: <a href="/?{{ unsafe_string|urlencode }}" title="..">..</a>, to prevent XSS (Cross-Site Scripting) attacks, etc. This is all good as long as unsafe_string does not contain unicode characters. Underneath, the Django urlencode filter calls the urllib.quote method to do the encoding. Unfortunately, this method does not like unicode characters, and a unicode string containing non-ASCII characters has to be UTF-8 encoded before it can be put through urllib.quote (and hence the Django urlencode filter).
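To see why the encode step matters: Python 2's urllib.quote (the version on GAE here) raises an exception when handed a unicode string containing non-ASCII characters, but happily quotes the UTF-8 encoded byte string. A quick illustration of the encode-then-quote step (shown with Python 3's urllib.parse.quote for portability, which accepts the same byte input):

```python
from urllib.parse import quote  # on Python 2 this would be urllib.quote

s = u"caf\u00e9 & croissant"
# UTF-8 encode first so quote() only ever sees bytes; each non-ASCII byte
# becomes a %XX escape, and reserved characters like "&" are escaped too.
encoded = quote(s.encode("utf-8"))
print(encoded)  # caf%C3%A9%20%26%20croissant
```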

So I created a custom Django template filter to circumvent this issue:

Note: I have only tested it in Google App Engine (GAE) using its Django 0.96 template support. Also, the registration mechanism described below is specific to Google App Engine's own templating engine. However, there is no reason why this custom filter would not work in non-GAE Python apps using Django 0.96 or 1.0 templates. Please follow instructions here to configure your app if you are not using GAE.
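For reference, in a plain (non-GAE) Django 0.96/1.0 project the equivalent wiring is the standard templatetags convention rather than webapp's register_template_library: the module lives in a templatetags package inside an installed app, and the filter is pulled in with a load tag. A sketch, assuming a hypothetical app called myapp and keeping the module name xss:

```
myapp/
    templatetags/
        __init__.py
        xss.py      # contains register = template.Library() and the filter

{# in the template: #}
{% load xss %}
<a href="/?{{ unsafe_string|unicode_urlencode }}" title="..">..</a>
```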

The custom filter (which handles both unicode strings and plain byte strings) itself is very straightforward to create.

I created a module customtags.xss (you can name it something else) as follows:

1) under the root folder of my GAE application, I created a folder "customtags"

2) inside customtags, I created two empty files xss.py and __init__.py (__init__.py will remain empty)

3) inside xss.py, I put
import types
import urllib

from django import template

register = template.Library()

@register.filter
def unicode_urlencode(value):
    if type(value) is types.UnicodeType:
        return urllib.quote(value.encode("utf-8"))
    else:
        return urllib.quote(value)
So, the filter utf-8 encodes the string first if it is of type UnicodeType. (You may want to handle exceptions more gracefully though - at the moment, I am not catching exceptions...)

Let's look at how to register this template library so that you can use the filter inside your Django templates.

In your application script, for example, YOUR_APP_NAME.py, simply insert the line:
webapp.template.register_template_library("customtags.xss")
after the import statements at the top of the script. Obviously, if you are calling your module something else, you have to change the line above accordingly.

Now you can write <a href="/?{{ unsafe_string|unicode_urlencode }}" title="..">..</a> in your template!

Friday, November 13, 2009

GPolyline decoding in Python

I am currently working on providing GPX data for routes on http://zoomaroundtown.appspot.com. As the routes are stored as Google Maps GPolyline-encoded strings in the datastore, I need a way to decode them back into latitude and longitude points before I can create the respective GPX files. Mark McClure has an excellent page on this topic. Apart from his Javascript implementation, I have also found a PHP port of the decoder function. However, there does not seem to be any Python port, so I created a straightforward one:

You can download the Python file here.



def decode_line(encoded):
    """Decodes a polyline that was encoded using the Google Maps method.

    See http://code.google.com/apis/maps/documentation/polylinealgorithm.html

    This is a straightforward Python port of Mark McClure's JavaScript polyline decoder
    (http://facstaff.unca.edu/mcmcclur/GoogleMaps/EncodePolyline/decode.js)
    and Peter Chng's PHP polyline decoder
    (http://unitstep.net/blog/2008/08/02/decoding-google-maps-encoded-polylines-using-php/)
    """
    encoded_len = len(encoded)
    index = 0
    array = []
    lat = 0
    lng = 0

    while index < encoded_len:

        # Decode the latitude delta: 5-bit chunks, least significant first
        b = 0
        shift = 0
        result = 0

        while True:
            b = ord(encoded[index]) - 63
            index = index + 1
            result |= (b & 0x1f) << shift
            shift += 5
            if b < 0x20:
                break

        dlat = ~(result >> 1) if result & 1 else result >> 1
        lat += dlat

        # Decode the longitude delta in the same way
        shift = 0
        result = 0

        while True:
            b = ord(encoded[index]) - 63
            index = index + 1
            result |= (b & 0x1f) << shift
            shift += 5
            if b < 0x20:
                break

        dlng = ~(result >> 1) if result & 1 else result >> 1
        lng += dlng

        array.append((lat * 1e-5, lng * 1e-5))

    return array

if __name__ == "__main__":
    latlngs = decode_line("grkyHhpc@B[[_IYiLiEgj@a@q@yEoAGi@bEyH_@aHj@m@^qAB{@IkHi@cHcAkPSiMJqEj@s@CkFp@sDfB}Ex@iBj@S_AyIkCcUWgAaA_JUyAFk@{D_]~KiLwAeCsHqJmBlAmFuXe@{DcByIZIYiBxBwAc@eCcAl@y@aEdCcBVJpHsEyAeE")
    for latlng in latlngs:
        print str(latlng[0]) + "," + str(latlng[1])
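For anyone wanting to check the chunk-decoding step in isolation, here is the same 5-bit varint logic factored into a standalone helper (a hypothetical helper, not part of the file above), applied to "_p~iF" and "~ps|U" — the chunks Google's algorithm documentation gives for 38.5 and -120.2:

```python
def decode_signed(encoded, index=0):
    """Decode one signed value from a Google-encoded polyline,
    returning (value, next_index)."""
    shift = 0
    result = 0
    while True:
        b = ord(encoded[index]) - 63   # undo the +63 offset
        index += 1
        result |= (b & 0x1f) << shift  # accumulate 5-bit chunks, LSB first
        shift += 5
        if b < 0x20:                   # high bit clear marks the last chunk
            break
    # undo the zigzag step: the low bit holds the sign
    value = ~(result >> 1) if result & 1 else result >> 1
    return value, index

print(decode_signed("_p~iF"))  # (3850000, 5), i.e. latitude 38.5
```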


Hope you Python GMap coders out there will find this useful!

Wednesday, November 04, 2009

Javascript Tester iGoogle gadget

Have been looking at iGoogle gadgets for work (I know they are so 2008, but still a good idea...) Anyway I created a simple Javascript Tester gadget for myself, which some of you may find useful: (See this gadget in the iGoogle directory)

Tip: Handy for testing (parsing) JSON objects!

Being an iGoogle gadget, you can add it to your iGoogle homepage, but you can always just use the embedded gadget right here!

Monday, November 02, 2009

Twitter Search API and Google App Engine

So in my previous post, I talked about using Twitter as a commenting engine in http://zoomaroundtown.appspot.com/, and, towards the end, I briefly mentioned that the Twitter Search API (http://apiwiki.twitter.com/Twitter-API-Documentation) does not really work on Google App Engine (and most other cloud environments).

The reason why making Twitter Search API requests on Google App Engine often fails is simple. Twitter rate-limits requests per IP, and on Google App Engine, you share IPs with loads of other apps, a lot of which are probably trying to do the same thing as you, i.e. making Twitter Search requests. Unfortunately, the Twitter Search API is still not incorporated into the Twitter REST API, which means there is no way of identifying requests via your username, so Twitter can only really rate-limit according to IP.

To circumvent this, I created a simple PHP proxy on my other website (hosted on a "real" machine):

<?php
// Simple relay: fetch the requested URL and echo back the JSON response.
$url = $_GET['url'];
// Only relay Twitter Search requests, so this cannot be abused as an open proxy.
if (strpos($url, 'http://search.twitter.com/') !== 0) {
    header('HTTP/1.1 403 Forbidden');
    exit;
}
$session = curl_init($url);
curl_setopt($session, CURLOPT_HEADER, false);
curl_setopt($session, CURLOPT_RETURNTRANSFER, true);
$json = curl_exec($session);
curl_close($session);
header("Content-Type: application/json");
echo $json;
?>


and in my GAE Python app, instead of sending the requests straight to Twitter using URL Fetch, I send the requests to this proxy, which does a straightforward relay job for me.
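The only subtlety on the app side is double-encoding: the full Twitter Search URL must itself be percent-encoded when passed as the proxy's url parameter. A sketch of building the relayed request URL (the proxy location here is hypothetical; shown with Python 3's urllib.parse — on GAE's Python 2 runtime this would be urllib.quote/urllib.urlencode plus google.appengine.api.urlfetch):

```python
from urllib.parse import quote, urlencode

PROXY = "http://example.com/twitter_proxy.php"  # hypothetical proxy location

def proxied_search_url(query):
    """Build the proxy URL that relays a Twitter Search API request."""
    twitter_url = "http://search.twitter.com/search.json?" + urlencode({"q": query})
    # quote with safe="" so "/" and ":" in the inner URL are escaped too
    return PROXY + "?url=" + quote(twitter_url, safe="")

print(proxied_search_url("to:zoomaroundtown"))
```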

Of course, this is less than ideal, as it flies in the face of the philosophy of deploying in a scalable cloud environment such as Google App Engine. But until Twitter allows username-based identification in Twitter Search requests, or Google sorts something out with Twitter (and other service API providers, for that matter), this seems to be the best way to get round this problem!

(I have heard that you can get your own IP on Amazon AWS, but I don't know enough about Amazon AWS to comment on it. Anyway, no doubt more and more cloud infrastructures relying on shared IPs will be developed, and I can only see it heading that way...)

Using Twitter as a commenting engine for Zoom around Town

So with the number of routes in http://zoomaroundtown.appspot.com/ increasing daily, the time has come for me to start considering some kind of commenting functionality for the added cycle routes. I have not had the time to implement a user login system (I am still hoping to see more routes in the database first), let alone a commenting system. However, one day a simple idea came to me: what if people commented on routes by sending tweets, and all I had to do was display those tweets using Twitter Search? I was quite keen on exploring this idea, and I implemented the following in Zoom around Town:

For each route, I programmatically create a bit.ly link for its unique page URL e.g. (http://bit.ly/39oF3z for http://zoomaroundtown.appspot.com/findRoutes?id=24001). On the page, I then display a message telling people to comment on the route by sending a tweet starting with @zoomaroundtown http://bit.ly/39oF3z (I'm even including a link that automatically populates the starting text for the user: http://twitter.com/home?status=%40zoomaroundtown%20http%3A%2F%2Fbit.ly%2F39oF3z%20).
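That status-prefill link is just the tweet text percent-encoded into twitter.com's status parameter. A sketch of generating it (Python 3's urllib.parse shown for portability; on GAE's Python 2 runtime it would be urllib.quote):

```python
from urllib.parse import quote

def tweet_prefill_link(short_url):
    """Build a twitter.com link that pre-populates the status box."""
    status = "@zoomaroundtown " + short_url + " "  # trailing space for the user's comment
    return "http://twitter.com/home?status=" + quote(status, safe="")

print(tweet_prefill_link("http://bit.ly/39oF3z"))
# http://twitter.com/home?status=%40zoomaroundtown%20http%3A%2F%2Fbit.ly%2F39oF3z%20
```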

To display comments, I use the Search API (http://apiwiki.twitter.com/Twitter-API-Documentation) to search for all tweets sent to zoomaroundtown containing the string http://bit.ly/39oF3z i.e. using the search string to:zoomaroundtown http://bit.ly/39oF3z.

Whether this is a viable way of enabling commenting on Zoom around Town, only time will tell. However, there are obvious pros and cons:

PROs:

Extremely easy to implement (especially when I don't have a user login system yet)

Increased site exposure on Twitter

CONs:

Twitter Search only seems to return tweets no more than 2 weeks old

I have no control over content (but since these are public tweets, do I really care?)

Last but not least, something I unfortunately only managed to discover after deployment to the live cloud environment: Twitter Search's rate limiter does not seem to like Google App Engine. It is this point that will lead us to my next post, but for now it suffices to say that Twitter Search rate-limits requests per IP, so it is not surprising that in a cloud environment where one is sharing IPs with loads of other apps, some requests are bound to get rejected!