Google

Thursday, November 23, 2006

Google Local's AJAX Data Objects Demystified

JSON is slowly but steadily becoming an Internet standard (see RFC 4627). Not too long ago, the Google Local server changed its AJAX data object format from XML to a JSON-like notation. In this log, we will see how to tweak the Google Local data object to suit our JSON parser (a.k.a. eval()).

Background

The JSON standard requires the name in each name/value pair to be a string enclosed in double quotation marks. The Google Local server sends its AJAX data object in a JSON-like format in which the names are not quoted. For example:
{
  title: "from: ... to: 15939 ... - Google Maps",
  vartitle: "",
  url: "/maps?saddr=...&daddr=...&ie=UTF8",
  urlViewport: false,
  form:
  {
    selected: "d",
    q:
    {
      q: "from: ... to: ..."
    },
    .
    .
    .
}

This disparity between the actual data object and the standard can be eliminated by enclosing all the names in quotes. The simplest way to achieve this is a regular expression, which can find a name by searching for an alphanumeric word followed by a colon. Of course, care must be taken not to treat a word before a colon inside a string value as a name, which is why the pattern below also matches whole quoted strings and leaves them untouched. The following code snippet shows how to achieve this:
# Javascript constants not known to python
false = False
true = True
null = None

# Helper RegEx based function to enclose names in quotes
# (which is required in JSON) - Google server does not put
# quote around names
# E.g. {title: "From: Home To: Office"} is converted into
#      {"title": "From: Home To: Office"}
import re
pat = re.compile(r"""
 (?P<name>\w+)       # Name of JSON pair
 \s* :               # Whitespace, and a colon
 |                   # Or
 (".*?")             # Enclosed in double quotes
""", re.VERBOSE)
def subfunc(match):
    # If pattern was found enclosed in double quotes,
    # do not substitute
    if match.group(2):
        return match.group(2)
    else:
        return '"'+match.group(1)+'"'+':'

Running the response text through pat.sub() with subfunc as the replacement yields a "pre-processed" data object that is ready to be consumed by our JSON parser, giving us all the information we need from the Google Local server.
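The whole pre-processing step can be sketched end to end as follows. This is a minimal sketch written for a current desktop Python rather than PyS60; the sample string is made up for illustration, the string branch of the pattern is extended to tolerate escaped quotes, and the standard json module stands in for eval() so the result can be checked safely.

```python
import json
import re

# Match either a bare name followed by a colon, or a complete
# double-quoted string (which must be left untouched).
pat = re.compile(r"""
    (?P<name>\w+) \s* :      # bare name of a name/value pair, then a colon
    |                        # or
    ("(?:\\.|[^"\\])*")      # a double-quoted string, escaped quotes allowed
""", re.VERBOSE)

def quote_names(text):
    """Return text with every bare name wrapped in double quotes."""
    def subfunc(match):
        if match.group(2):
            # A quoted string: return it unchanged
            return match.group(2)
        return '"%s":' % match.group("name")
    return pat.sub(subfunc, text)

# Hypothetical sample in the JSON-like style described above
sample = '{title: "from: Home to: Office", urlViewport: false, form: {selected: "d"}}'
fixed = quote_names(sample)
data = json.loads(fixed)
print(fixed)
```

The words "from" and "to" inside the string value are not touched, because the regular expression consumes the whole quoted string before it can see them.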

Summary

This little log goes to show that with Python (even on an S60 handset) we can get the information we need in the format we want with minimal effort. I wasted much time thinking about parsing techniques before I stumbled across this simple solution. And there is unmistakable elegance in simplicity, don't you think?


Wednesday, November 22, 2006

JSON Parser Using Python


JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write, and easy for machines to parse and generate. More information about JSON can be found at http://www.json.org/. Yahoo's Douglas Crockford developed this format, and within a short period many web-based services have adopted JSON as their primary data interchange format. In this log, we will look at one such service.

Background

There are a bunch of JSON parsers out there.

An example JSON document appears as follows:

{
  "Image": {
    "Width": 800,
    "Height": 600,
    "Title": "View from 15th Floor",
    "Thumbnail": {
      "Url": "http://www.example.com/image/481989943",
      "Height": 125,
      "Width": "100"
    },
    "IDs": [116, 943, 234, 38793]
  }
}

For a Python developer, a JSON document is nothing but a dictionary in which each value is a scalar (string, number, or constant), a list, or another dictionary. We can use this fact to our advantage by simply using the eval() function provided by Python to not only parse this document, but to create a data structure in one statement!
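As a minimal sketch of that idea, the example document can be evaluated directly. The names dictionary is an assumption added here so that the JSON constants true, false and null resolve to Python values, and the empty __builtins__ entry only narrows what the expression can reach; it does not make eval() safe for untrusted input.

```python
doc = """
{
  "Image": {
    "Width": 800,
    "Height": 600,
    "Title": "View from 15th Floor",
    "Thumbnail": {
      "Url": "http://www.example.com/image/481989943",
      "Height": 125,
      "Width": "100"
    },
    "IDs": [116, 943, 234, 38793]
  }
}
"""

# JSON constants that Python does not know under these names
names = {"true": True, "false": False, "null": None}

# One statement turns the text into nested dictionaries and lists
data = eval(doc.strip(), {"__builtins__": {}}, names)

print(data["Image"]["Title"])
print(data["Image"]["IDs"])
```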

The following code snippet shows how to use the eval() function to parse a JSON object. This snippet is based on http://developer.yahoo.com/python/python-json.html:


# If needed set the proxy address
import os
os.environ['http_proxy'] = 'http://[proxy_host]:[proxy_port]/'
import urllib
APP_ID = 'YahooDemo' # Change this to your API key
SEARCH_BASE = 'http://api.search.yahoo.com/WebSearchService/V1/webSearch'

class YahooSearchError(Exception):
    pass

def search(query, results=20, start=1, **kwargs):
    kwargs.update({
        'appid': APP_ID,
        'query': query,
        'results': results,
        'start': start,
        'output': 'json'
    })
    url = SEARCH_BASE + '?' + urllib.urlencode(kwargs)
    f = urllib.urlopen(url)
    buff = f.read().replace('\\/', '/')
    f.close()
    result = eval(buff)
    if 'Error' in result:
        # An error occurred; raise an exception
        raise YahooSearchError, result['Error']
    return result['ResultSet']

info = search('json python')
results = info['Result']
for result in results:
    print result['Title'], result['Url']


Caveats

The eval() function is not safe from programmer errors: malformed input raises an exception. Putting a try...except around the function call fixes that problem. But there is another, more sinister problem with eval(), not unlike SQL injection: eval() will not distinguish between a JSON object and a malicious statement. Consider the following:

import os
.
.
eval("os.remove('something_very_important')")

eval() should be used only when the string to be evaluated comes from a trusted source. I consider the Yahoo! and Google servers trusted.
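Where the source is not trusted, a parser that never executes its input avoids the problem altogether. The sketch below assumes a Python that ships the standard json module (it is not part of the PyS60 runtime discussed in this log); the function name parse_json is made up for illustration.

```python
import json

def parse_json(text):
    """Parse JSON text without executing it; raises ValueError on bad input."""
    return json.loads(text)

doc = '{"Image": {"Width": 800, "IDs": [116, 943, 234, 38793]}}'
image = parse_json(doc)["Image"]

# The malicious string from the caveat is rejected instead of being run
try:
    parse_json("os.remove('something_very_important')")
    rejected = False
except ValueError:
    rejected = True

print(image["Width"], rejected)
```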

Summary

This article briefly touches upon using the eval() function to understand JSON documents. This will be elaborated further when we look at Google Local client-server interactions later. I hope you find this trick useful...

Sunday, November 19, 2006

Virtual Earth Maps on the Go

For a map client, it is necessary to get the maps for a given location. Most GPS navigation systems store the maps of the entire nation at all zoom levels in their memory. On handsets, this might not be the best strategy. We know that the Virtual Earth server has all the maps required for a map client, just an HTTP request away. All we need to know is how to get them.

Background

In the last log, we checked out some of the geocoding services provided by Virtual Earth and used them in our Python on S60 programs. In this one, we will see how to download the maps of the location of your choice. Our primary goal here is to find a mapping function from latitude, longitude, and zoom to the URL from which the map can be downloaded. Once we have that mapping function, Python will take care of all the basic functionality to give us a fully functional map client without having to store large amounts of data.

How Virtual Earth Stores Maps

The Virtual Earth map system, like many map systems (e.g. Google Maps), uses the Mercator projection to fit the Earth's curved surface onto a flat sheet (a screen, in the present case). The Mercator projection, as described in the Wikipedia article, is a mathematical scheme that converts latitude into a y value and longitude into an x value. These x and y values can then be used to store and retrieve maps from the server. When I started putting together a map client for Virtual Earth, I thought that once this mapping function was implemented, half my job would be done. But that wasn't to be, as the Virtual Earth server stores quad-tree encoded x, y values (a la the Google Maps satellite image server).

Mapping Function

The mapping function to map the latitude, longitude, zoom to map URL is as follows:

url = f(latitude, longitude, zoom)


import math
earthRadius = 6378137
earthCircum = earthRadius * 2.0 * math.pi
earthHalfCirc = earthCircum / 2.0

def LongitudeToXAtZoom(lon, zl):
    arc = earthCircum / ((1 << zl) * 256)
    metersX = earthRadius * DegToRad(lon)
    return int(round((metersX + earthHalfCirc) / arc))

def LatitudeToYAtZoom(lat, zl):
    arc = earthCircum / ((1 << zl) * 256)
    sinLat = math.sin(DegToRad(lat))
    metersY = earthRadius / 2 * math.log((1.0 + sinLat) / (1.0 - sinLat))
    return int(round((earthHalfCirc - metersY) / arc))

def DegToRad(d):
    return d * math.pi / 180.0

def RadToDeg(r):
    return r * 180.0 / math.pi

def TileToQuadKey(tx, ty, zl):
    quad = ''
    for i in range(zl, 0, -1):
        mask = 1 << (i - 1)
        cell = 0
        if (tx & mask) != 0:
            cell = cell + 1
        if (ty & mask) != 0:
            cell = cell + 2
        quad = quad + str(cell)
    return quad

def GetTileSpecs(lat, lon, zoom):
    tx = LongitudeToXAtZoom(lon, zoom) / 256
    cx = LongitudeToXAtZoom(lon, zoom) % 256
    ty = LatitudeToYAtZoom(lat, zoom) / 256
    cy = LatitudeToYAtZoom(lat, zoom) % 256
    server = ((tx & 1) + ((ty & 1) << 1)) % 4
    q = TileToQuadKey(tx, ty, zoom)
    filename = 'r%s.png' % q
    url = 'http://r%d.ortho.tiles.virtualearth.net/tiles/%s?g=22' % (server, filename)
    return (filename, url, cx, cy)

lon = -117.068092
lat = 32.9913528
zoom = 17
print GetTileSpecs(lat, lon, zoom)
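The quad-tree encoding is the least obvious part, so here is a small standalone sketch of it, written for a current Python rather than PyS60. It repeats the logic of TileToQuadKey and adds an inverse, quadkey_to_tile, which is not part of the original code and is included only to show that the key carries the tile x, y and the zoom level.

```python
def tile_to_quadkey(tx, ty, zoom):
    """Interleave the bits of tile x and y, most significant bit first."""
    digits = []
    for i in range(zoom, 0, -1):
        mask = 1 << (i - 1)
        cell = 0
        if tx & mask:
            cell += 1          # x bit selects the right-hand column
        if ty & mask:
            cell += 2          # y bit selects the bottom row
        digits.append(str(cell))
    return "".join(digits)

def quadkey_to_tile(quadkey):
    """Recover (tile x, tile y, zoom) from a quadkey string."""
    tx = ty = 0
    for digit in quadkey:
        cell = int(digit)
        tx = (tx << 1) | (cell & 1)
        ty = (ty << 1) | ((cell >> 1) & 1)
    return tx, ty, len(quadkey)

print(tile_to_quadkey(3, 5, 3))   # -> 213
print(quadkey_to_tile("213"))     # -> (3, 5, 3)
```

Each digit picks one of four quadrants at successive zoom levels, so a tile's key always starts with the key of the tile that contains it one level up.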



This Python code snippet is another example of how easily we can implement a mapping client in Python. Add a Python for S60 wrapper around it and you have a map client on the phone. I have done some groundwork to get you started. Available at:


Summary

Hopefully, this log will get you excited, not just about Python, but about mapping techniques as well. It is very interesting to learn how map service providers implement their services. I hope you can build upon the available code to add address search and direction finding using the routines in velocation.py. Happy mapping...

Saturday, November 18, 2006

Virtual Earth Geocoding Services - Python Style

In this log, I will take you through using Python to utilize the Virtual Earth geocoding services on your PyS60 handset.

Background

Geocoding covers techniques for analyzing geo-demographic data such as ZIP codes, counties, regions, etc. These techniques include, among others, translating a postal address into geographical coordinates. Web portals such as Google Maps and Virtual Earth allow users to do this through their pages. What if you had the power to do this from a handset, which might not be able to open these JavaScript-heavy web portals?

Virtual Earth Services

Virtual Earth is a Microsoft product that provides web-based map services, among others, to its users. You can find a US postal address on the map, find directions between two points, locate yourself on the map (e.g. based on your IP address), and a lot more. These services are consumed by the code-behind running on the website and are invisible to the user. But a simple network packet watcher shows the following:

Find Address:
http://local.live.com/search.ashx?b=[address]

Locate Me:
http://local.live.com/WiFiIPService/locate.ashx

Find Directions:
http://local.live.com/directions.ashx?start=[address1]&end=[address2]

The responses are in the form of C# code. For example, when searching for an address, you can expect:
/*1.3.0515*/
SetViewport(32.9500497939651,-117.086802489646,32.9307362060349,-117.117483510354);
VE_Scratchpad.AddLocation('12278 Scripps Summit Dr, San Diego, CA 92131-3697, United States', 32.940393, -117.102143, '');


Python Solution

Of course, you can use any of the available frameworks to achieve what we set out to do. Our solution is quite elegant, as you will see shortly. We make use of the fact that C# syntax resembles that of Python when it comes to arrays. If we take out the comments, function names, and class names, we are left with a tuple of arrays of tuples (see the example above).

The solution looks as follows:


# If needed set the proxy address
import os
os.environ['http_proxy'] = 'http://[proxy_host]:[proxy_port]/'

# Definitions for eval()
false = 0
true = 1

import urllib

# To remove the comments and non-py elements in the response buffer
import re

# Match either a /* ... */ comment or a double-quoted string
comment_pat = re.compile(r'(/\*.*?\*/)|(".*?")', re.S)
def subfunc(match):
   if match.group(2):
      return match.group(2)
   else:
      return ''

nonpy_pat = re.compile(r'(new [\w\s,.]+)|(".*?")', re.S)

def getAddress(address):
   encaddress = urllib.urlencode({'b': address})
   url = 'http://local.live.com/search.ashx?%s' % encaddress
   sock = urllib.urlopen(url)
   resp = comment_pat.sub(subfunc, sock.read())
   sock.close()
   lines = resp.split(';')
   for instruction in lines:
      if instruction.find('VE_Scratchpad.AddLocation') > -1:
         break
   locationstr = instruction.replace('VE_Scratchpad.AddLocation', '')
   (infostr, lat, lon, pad) = eval(locationstr)
   return (infostr, lat, lon)

def locateMe():
   url = 'http://local.live.com/WiFiIPService/locate.ashx'
   sock = urllib.urlopen(url)
   resp = comment_pat.sub(subfunc, sock.read())
   sock.close()
   locationstr = resp.replace('SetAutoLocateViewport', '')[:-1]
   (lat, lon, zoomradius, unknown, message) = eval(locationstr)
   return (lat, lon, zoomradius)

if __name__ == "__main__":
   result = getAddress('12278 scripps summit dr san diego ca')
   print result
   result = locateMe()
   print result
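The parsing step can be tried without a network connection by feeding it the sample response shown earlier. The sketch below is written for a current Python, not PyS60, and swaps eval() for ast.literal_eval, which accepts only literals; the function name parse_location is made up for illustration, and splitting on ';' assumes that no address contains a semicolon.

```python
import ast
import re

# Strips /* ... */ comments (simplified: assumes none occur inside strings)
comment_pat = re.compile(r"/\*.*?\*/", re.S)

SAMPLE = """/*1.3.0515*/
SetViewport(32.9500497939651,-117.086802489646,32.9307362060349,-117.117483510354);
VE_Scratchpad.AddLocation('12278 Scripps Summit Dr, San Diego, CA 92131-3697, United States', 32.940393, -117.102143, '');
"""

def parse_location(response):
    """Return (info, lat, lon) from the first AddLocation call, or None."""
    prefix = "VE_Scratchpad.AddLocation"
    body = comment_pat.sub("", response)
    for instruction in body.split(";"):
        instruction = instruction.strip()
        if instruction.startswith(prefix):
            # What remains after the function name is a plain tuple literal
            info, lat, lon, _pad = ast.literal_eval(instruction[len(prefix):])
            return info, lat, lon
    return None

result = parse_location(SAMPLE)
print(result)
```

Returning None when no AddLocation call is present makes the "address not found" case explicit instead of evaluating whatever instruction happened to come last.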



Summary

This example shows how simply Python's eval() allows parsing complex text. I have left a few issues to be addressed, such as ambiguous addresses. Moreover, you can explore the directions service as well. Hope you enjoy!

Wednesday, November 15, 2006

Text To Speech Using Python for S60

Recently, I came across a very interesting post on symbianexample.com by Artem: http://symbianexample.com/texttospeech. The article reveals a 'hidden' S60 feature: an MMF plugin that is able to play text! This plugin apparently exists only on S60 2.8+ phones; I have not been able to confirm that.

Imagine your program synthesizing any text... Who would not like their programs to speak intelligently to the user? I for one would for sure!!

Now, I am a die-hard PyS60 fan... I jumped on this opportunity to enable PyS60 scripts with TTS. The result was a Python extension, _ttsplayer.pyd.

_ttsplayer.pyd is a C++ Python extension DLL. This DLL wraps the CTtsPlayer class provided by Artem in his original post. The extension provides only one method:

playtext(text) # text is unicode


The usage of this module is very simple:

import ttsplayer
ttsplayer.playtext(u'Hello Mr. Anderson')



As per Artem's article, this MMF plugin synthesizes the speech using the Speaker INDependent (SIND) framework. As you will find out when you start using this module, the quality of the generated speech is poor. In more recent S60 devices, however (such as the N75 - cool phone! oh sorry, cool computer!), we see evidence of a High Quality TTS framework. I am not sure whether the SIND framework will eventually merge with this High Quality TTS framework. I can only hope that it does!

The package is published here: http://www2.cs.uh.edu/~nikhilv/PyS60/ttsplayer.zip

The package contains the following components:
  • Source package - CTtsPlayer class and Python module wrapper, MMP file, PKG file, bld.inf file. These files allow you to create the PYD extension module using your favorite S60 SDK.
  • Python module - This module wraps the C++ extension DLL loading etc.
  • Example python script - To show usage in a PyS60 script.
  • Unsigned SIS file S60 3.0 - You can sign this SIS using your favorite certificate and install it if you do not want to go through the trouble of compilation.


I hope you like this little extension module. There are countless uses for this module. One can imagine a PyS60 script that gets directions from one location to another (see Rantings of a Snake Charmer IV) and dictates the direction steps to the user using TTS!
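The direction-dictating idea could be sketched roughly as follows. The steps list is made up for illustration, and the fallback for machines without the ttsplayer module is an assumption added so the sketch also runs off the handset; only playtext() comes from the module described above.

```python
try:
    import ttsplayer              # wrapper module from the package; S60 only
except ImportError:
    ttsplayer = None              # assumption: fall back to text off the handset

def speak(text):
    """Speak text on the handset; elsewhere just hand the text back."""
    if ttsplayer is not None:
        ttsplayer.playtext(text)  # playtext expects a unicode string
    return text

def dictate(steps):
    """Announce each direction step in order and return what was said."""
    spoken = []
    for number, step in enumerate(steps, 1):
        spoken.append(speak(u"Step %d. %s" % (number, step)))
    return spoken

# Hypothetical direction steps, standing in for a real directions response
steps = [u"Head north on Main Street", u"Turn left at the second light"]
spoken = dictate(steps)
print(spoken)
```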

Thanks to Artem for finding this hidden feature in S60!