from __future__ import *
Apache X-Forwarded-For caveat
September 23, 2005 at 02:08 PM | categories: python, debugging
When using Apache's mod_proxy in a reverse proxying scenario, you'll usually want to take a look at the (seemingly undocumented) X-Forwarded-For header. This header contains whatever the client sent for X-Forwarded-For (if anything), plus the remote IP address of the client, appended as the last comma-separated entry.
So, if you're trying to do anything with this header, check for commas and pick out the last piece, because the client can send anything they want to, and you shouldn't ever trust the client:
def get_request_ip(request):
    """get the IP of a request in twisted.web-speak"""
    host = request.transport.getPeer().host
    # Twisted doesn't support IPv6 anyway :)
    if host != "127.0.0.1":
        return host
    header = request.received_headers.get('x-forwarded-for', None)
    if header is None:
        return host
    return header.split(',')[-1].strip()
I gleaned this from reading the Apache 2.0.54 source. I couldn't find any description of mod_proxy's behavior in Apache's docs, only how to configure it. I was surprised that it even preserved what the client sent!
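To make the trust issue concrete, here is a minimal sketch in modern Python 3 of why only the last comma-separated piece is usable. The helper name `last_forwarded_ip` and both sample addresses are made up for illustration; they are not part of Apache or Twisted.

```python
def last_forwarded_ip(header):
    """Return the last comma-separated entry of an X-Forwarded-For value.

    Only this last entry was appended by the proxy itself; everything
    before it is whatever the client chose to send.
    """
    return header.split(',')[-1].strip()


# A client at 203.0.113.7 that sends its own "X-Forwarded-For: 10.0.0.1"
# (both addresses are illustrative) shows up behind the proxy as:
spoofed = "10.0.0.1, 203.0.113.7"
print(last_forwarded_ip(spoofed))
```

Taking the first entry instead would hand back the value the client chose to send, which is exactly the piece that can't be trusted.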
Bookmarklet Based Debugging
July 03, 2005 at 03:04 PM | categories: debugging, AJAX, javascript, MochiKit
I haven't seen this discussed much, but one of the techniques I've found really useful for debugging JavaScript is what we've been calling BBD: Bookmarklet Based Debugging. The reason we've settled on this is that it's entirely unobtrusive. Our pages look exactly how they're supposed to look, with no extra "debugging" widgets or text sitting around on the page. However, if I need more information, I just whack a bookmarklet and I get all of the information I need. I'm aware that there are various bookmarklets out there that display various kinds of JavaScript information that might be useful for debugging, but not in such a way that you instrument your page for application specific BBD. Also, most of the JavaScript debugging bookmarklets out there only work in Mozilla, and I do most of my development in Safari.
The reason this works for us is because we use a common logging infrastructure across our whole web app: MochiKit.Logging. This, and the rest of MochiKit, will be liberally licensed open source (likely MIT) in less than two weeks. MochiKit.Logging is vaguely similar to Python's logging module or Apache's log4j, but without all of the headaches and mindless configuration sludge.
Typical usage for our Logger object looks like this:
// these are global functions for convenience:
log("some", "objects", "here", "at", "the", "INFO", "level");
logDebug("some", "objects", "here", "at", "the", "DEBUG", "level");
// or more generally:
logger.baseLog(logLevel, objects...);
When one of these log methods is called, two things happen:
- Any attached listeners to the logger are notified (which may or may not do something interesting with the log message)
- The global logger stuffs the log message away in a log buffer (of configurable size, by default it's infinite) for easy retrieval
Listeners are really just for future enhancements (e.g. a "real-time" log window by bookmarklet), but our current bookmarklet simply formats the last 40 messages or so as an alert message and spits it to the browser.
Currently, the bookmarklet URL for MochiKit.Logging's brain dump is simply: javascript:logger.debuggingBookmarklet(), which allows for future improvements (e.g. something better than alert(...)) without "upgrading" the bookmarklets in everyone's browser.
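The two-step behavior described above (notify listeners, then stash the message in a buffer) can be sketched in a few lines. This is a Python 3 illustration of the idea only, not MochiKit.Logging's code or API; the class and method names are invented for the example.

```python
from collections import deque


class Logger:
    """Toy logger: notifies listeners and keeps messages in a buffer."""

    def __init__(self, max_size=None):
        # max_size=None means the buffer is unbounded
        self.messages = deque(maxlen=max_size)
        self.listeners = []

    def add_listener(self, callback):
        self.listeners.append(callback)

    def base_log(self, level, *objects):
        message = (level, objects)
        # 1. tell every listener, 2. stash the message for later retrieval
        for listener in self.listeners:
            listener(message)
        self.messages.append(message)

    def log(self, *objects):
        self.base_log("INFO", *objects)

    def log_debug(self, *objects):
        self.base_log("DEBUG", *objects)

    def dump(self, how_many=40):
        """Format the most recent messages, oldest first."""
        recent = list(self.messages)[-how_many:]
        return "\n".join(
            "%s: %s" % (level, " ".join(str(o) for o in objs))
            for level, objs in recent
        )


logger = Logger()
seen = []
logger.add_listener(seen.append)
for i in range(50):
    logger.log("message", i)
print(logger.dump())
```

Because the buffer is filled whether or not anything is listening, a "brain dump" can be requested after the fact, which is what makes the bookmarklet approach unobtrusive.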
UPDATE: MochiKit is now available. Check it out at mochikit.com!
Crashtastic
July 01, 2005 at 12:08 PM | categories: debugging, AJAX, javascript
It's amazing that anyone trusts web browsers. On more than one occasion, I've written perfectly valid and correct JavaScript code that works fine in one browser, and crashes another. Crashes are bad.
The fixes for these issues are always the hardest because it's not your fault. Normally when I have a problem in my JavaScript code, I take it to Firefox (I normally develop in Safari), as its debugging tools and JavaScript exceptions are about twelve billion times better than Safari and IE combined. That option is useless when the code works fine in Firefox.
The most recent problem I had was with Safari. Admittedly, this is the first time I've had a crashfest with Safari, so that's good, but this particular bug was quite nasty. I managed to track it down to the JavaScript function it was happening in with a series of alert() statements (why browsers have modal dialogs available beats the hell out of me; can anyone say denial of service?). However, when I started instrumenting the crashing function with alert statements, it didn't crash anymore. Heisenbugs that aren't your fault make you wish very bad things to happen to whoever deserves it.
In the end, the workaround was to replace some perfectly valid code with different perfectly valid code that had the same overall macro effect, but was locally quite distinct. I filed a few crash reports, but as I wasn't able to reproduce the heisenbug in a less-than-2kloc-of-JavaScript scenario, I don't see a Radar happening any time soon.
iPod model detection
June 16, 2005 at 03:55 PM | categories: python, debugging, iPod, macosx
For various reasons, I have a need to determine the version of an iPod. Previously, I only needed to know if an iPod was "firmware version 2" or later (aka "Dock-connector"), and there's a whole slew of obvious ways to determine that (presence of a Notes folder would be the most obvious). There are also ways to grab iPod information with SPI (via COM on win32 or the private iPod.framework on Mac OS X), but I like to avoid that whenever possible.
So far, the most reliable method I've found is a little parsing of the SysInfo file. This file is located at "$(IPOD_VOLUME)/iPod_Control/Device/SysInfo". The iPod_Control folder is going to be hidden on either platform, so you may not have noticed it before. Many examples of SysInfo files can be found by googling for some of the keys, such as buildID, visibleBuildID, boardHwName, etc. The most complete resource I've found is on the iPodLinux forums.
As-is, the SysInfo files don't make any of the useful information very obvious. However, with a little reverse-engineering of the iPod Updater for Mac OS X, it was relatively easy to figure out what it was using to determine the iPod model.
In the Resources folder of any iPod Updater, you'll find a lot of interesting things. The most interesting bits are UpdaterVersions.plist and the Updates folder.
UpdaterVersions.plist is a typical XML property list with a single root key: Versions. Under this key you'll find a dictionary that maps updaterFamily to a dictionary of information about the iPods in that updaterFamily, not unlike the information you will find in a SysInfo. The dicts with a displayInAbout key with a true value are the most interesting; all of the other entries are simply variations on the theme (e.g. the iPod U2 Special Edition or the various colors of iPod Mini). When using this filter, you should have exactly one entry per unique iPodFamily value.
The Updates folder is interesting because it contains icons for each iPod, as well as the firmware images. The icons are named either DeviceIcon-$(iPodFamily).icns or DeviceIcon-$(iPodFamily)-$(iPodColor).icns. I'm relatively certain that the color can be determined from the last three characters of the pszSerialNumber, but I don't have enough iPods around to test that theory.
With all of this information, it would almost be easy to determine the model of an iPod. However, it's not quite that simple. Not all SysInfo files contain an iPodFamily key, and some that do incorrectly report 0! The only reliable identifier for an iPod family is the following (given a dict from either the UpdaterVersions.plist or from a SysInfo file):
def buildPair(dct):
    buildID = dct.get('buildID', 0)
    visibleBuildID = dct.get('visibleBuildID') or dct.get('VisibleBuildID', 0)
    # grab these bits: 0xFF000000L
    return (buildID >> 24, visibleBuildID >> 24)
What I'm doing here is pulling the "major" version out of the version keys. It appears that the versioning convention is: 0x0ABC8000 where the firmware version is pretty-printed as "A.B.C", trimming "C" and possibly "B" if they are 0. This is similar to the hex-version-convention you see in gestaltSystemVersion ('sysv') and other places. It would be a little less ugly if the UpdaterVersions.plist used the same case as the SysInfo files for the visibleBuildID key! The reason that we need both the buildID and the visibleBuildID is that the iPod Mini and "3G iPod" both report a buildID major version of 2, but the iPod Mini has a visibleBuildID major version of 1.
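As an illustration of the 0x0ABC8000 convention, here is a small Python 3 sketch that pretty-prints a build ID. The function name `format_version` is hypothetical, the sample values are made up rather than taken from a real device, and the trimming rule (drop trailing zero components, always keep "A") is my reading of the convention described above, so treat it as an assumption.

```python
def format_version(build_id):
    """Pretty-print a 0x0ABC8000-style build ID as "A.B.C"."""
    major = (build_id >> 24) & 0xFF  # same bits that buildPair extracts
    minor = (build_id >> 20) & 0xF
    patch = (build_id >> 16) & 0xF
    parts = [major, minor, patch]
    # Assumed rule: drop trailing zero components, but always keep "A".
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return ".".join(str(p) for p in parts)


# Illustrative values only
for value in (0x03118000, 0x02308000, 0x01008000):
    print(hex(value), "->", format_version(value))
```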
Parsing the SysInfo file in a manner that will grab the information in the right way is pretty trivial, but perhaps not so obvious:
import os

def infoForMountedPod(path):
    path = os.path.join(path, u'iPod_Control', u'Device', u'SysInfo')
    lines = file(path).read().splitlines()
    d = {}
    for line in lines:
        try:
            key, value = line.split(': ', 1)
        except ValueError:
            continue
        try:
            value = long(value.split(None, 1)[0], 0)
        except (ValueError, IndexError):
            pass
        d[key] = value
    return d
infoForMountedPod simply looks for all lines that look like a key-value pair (containing a ": "). For each of these lines, it will set the key-value pair in the returned dictionary. For lines whose first "word" (split on whitespace) is parseable as an integer (typically as hex), it will be set as an integer. Otherwise, it is set to the string value of that line (i.e. pszSerialNumber).
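For readers on a current interpreter, the same parsing rules can be sketched in Python 3 against an in-memory string, which also makes them easy to try without an iPod. The function name `parse_sysinfo` and the sample text are illustrative assumptions; the sample is not copied from a real SysInfo file.

```python
def parse_sysinfo(text):
    """Parse SysInfo-style "key: value" lines into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(': ')
        if not sep:
            continue  # not a key-value line
        words = value.split(None, 1)
        try:
            # base 0 lets int() accept hex such as "0x02308000"
            value = int(words[0], 0)
        except (ValueError, IndexError):
            pass  # not an integer (or empty): keep the original string
        info[key] = value
    return info


# Made-up sample in the general shape of a SysInfo file
SAMPLE = """\
pszSerialNumber: XX0000XXXXX
buildID: 0x02308000 (2.3)
visibleBuildID: 0x02308000 (2.3)
this line has no separator
"""
info = parse_sysinfo(SAMPLE)
print(info)
```

Only the first whitespace-separated word of each value is tried as an integer, so a trailing annotation such as "(2.3)" is ignored for numeric keys.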
Parsing UpdaterVersions.plist is pretty trivial too:
import plistlib

def parseUpdaterVersions(path):
    # this is Python 2.4 plistlib API
    # in 2.3 you can use plistlib.Plist.fromFile
    versions = plistlib.readPlist(path)
    buildPairs = {}
    families = {}
    for k,v in versions['Versions'].iteritems():
        # only pick out unique iPodFamily dicts
        if v['displayInAbout']:
            fam = v['iPodFamily']
            # skip pre-2.x iPods and iPod Shuffle.
            # This is specific to my use case, you may
            # not want this filter.
            if 1 < fam < 128:
                assert fam not in families
                families[fam] = v
                pair = buildPair(v)
                assert pair not in buildPairs
                buildPairs[pair] = v
    return buildPairs, families
Note that all of this code is cross-platform, but you'll need an UpdaterVersions.plist or equivalent from somewhere, and you'll need to pick up plistlib.py from Python's src/Lib/plat-mac dir if using it on some other platform. Alternatively, you could use ElementTree to parse the plist XML. Its iterparse documentation has an excellent example of how easily it can be used to parse these files.
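On a current Python 3, plistlib ships with the standard library on every platform, and plistlib.readPlist has been replaced by plistlib.load and plistlib.loads. The sketch below shows the lookup end to end under those assumptions. The function names, the entry names "familyA" and "familyA-variant", and all of the values are hypothetical stand-ins for a real UpdaterVersions.plist, and the family-range filter from the code above is left out for brevity.

```python
import plistlib


def build_pair(dct):
    """Major versions of buildID and visibleBuildID (either key spelling)."""
    build_id = dct.get('buildID', 0)
    visible = dct.get('visibleBuildID') or dct.get('VisibleBuildID', 0)
    return (build_id >> 24, visible >> 24)


def index_by_build_pair(plist_bytes):
    """Map build pairs to the displayInAbout entries of a Versions plist."""
    versions = plistlib.loads(plist_bytes)['Versions']
    return {
        build_pair(entry): entry
        for entry in versions.values()
        if entry.get('displayInAbout')
    }


# Hypothetical stand-in for UpdaterVersions.plist; real files have more keys.
sample = plistlib.dumps({'Versions': {
    'familyA': {'displayInAbout': True, 'iPodFamily': 4,
                'buildID': 0x02308000, 'VisibleBuildID': 0x01208000},
    'familyA-variant': {'iPodFamily': 4,
                        'buildID': 0x02308000, 'VisibleBuildID': 0x01208000},
}})
table = index_by_build_pair(sample)

# Look up a device by the pair computed from its parsed SysInfo values.
device = {'buildID': 0x02308000, 'visibleBuildID': 0x01208000}
print(table[build_pair(device)]['iPodFamily'])
```

Using `entry.get('displayInAbout')` rather than indexing is a deliberate choice in this sketch so that entries without the key are simply skipped.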
JavaScript sucks (volume 1)
June 02, 2005 at 05:17 PM | categories: debugging, AJAX, javascript
JavaScript Sucks:
- You can extend types by assigning stuff to their prototype, but that can easily break code since these new things can show up when iterating over the properties of an instance. There is no way to hide these new properties. Only magical built-in prototypes can define hidden properties.
- Arrays are braindead because you can't trust the built-in ways to compare them. Strangely enough it seems that == is broken in most implementations, but the other comparison operators seem to do the right thing. However, that is observed behavior, and I don't trust it to do the same thing everywhere.
- The only built-in mapping is the object, which means all keys must be strings.
- Everything is an object, except for the three or forty things that aren't. Ugh.
- Referencing names of things that don't exist is not an error, but returns undefined. Referencing a property of undefined is an error. It's a great way to get an exception very, very, far away from the code that was broken.
- Some implementations don't like extraneous commas, so you can't sanely write multi-line literal objects or arrays. When you have them, you'll get strange errors and it will take you a long time to find and fix them.
- There's no way to introspect the stack beyond the function that's currently running, and the function that called it. Your test failed, somewhere. Debugging JavaScript sucks.
- Exceptions raised by "C code" often carry no information about the last JavaScript line that was running (if you were especially fortunate, it would simply crash the browser, in IE's case anyway). Finding where an error happened basically requires a binary search of the new code that you introduced with debugging statements until you find it.
- Unit tests are good, the JavaScript unit test frameworks available aren't. The fact that exceptions are so hard to track down makes unit tests basically necessary in order to remain sane, so why the hell isn't there something as good as py.test? There's a port of Perl's TestSimple, which is usable, but there are plenty of things not to like about it right now (which is to be expected, at 0.03).
- There's no operator overloading.
- There's no sprintf (to format numbers, mostly).
- Generally, there are a lot of objects that behave like arrays, but aren't, so you either have to convert them to arrays or not expect to be able to use array methods everywhere. Fortunately, there isn't a quick and easy way to convert an array-like object to a real array. Awesome.
- No varargs syntax, but you can call functions with variable arguments and painfully extract additional arguments from the arguments local variable. So, it's kinda like Perl subs, except since arguments isn't a real array, you can't even shift off the arguments or otherwise use it as a regular array. Ugh.
- foo.bar() and var bar = foo.bar; bar(); do entirely different things. You can write a function that does return a "bound function", but then you have this: var bar = bind(foo.bar, foo);. Dumb!
IDN Spoofing Defense for Safari
February 07, 2005 at 05:51 PM | categories: python, debugging, macosx, py2app, PyObjC
UPDATE: Apple has properly resolved this issue with Safari, see About Safari International Domain Name Support.
Soon after I got home from ShmooCon, I saw that the Shmoo Group came up with a new domain spoofing exploit for which "no defense exists". It's pretty amazing that browsers actually implement IDN without any kind of protection, so I decided to quickly hack up a defense for Safari on Mac OS X 10.3 (and probably later).
Application: IDNSnitch.app.tgz (335k) USE AT YOUR OWN RISK. This will probably cause instability in Safari.
Source: http://svn.red-bean.com/pyobjc/trunk/pyobjc/Examples/Inject/IDNSnitch
The hack is implemented in two parts, an application and a plugin.
Application
IDNSnitch.py is a simple application that scans NSWorkspace for Safari instances, and registers for application launch notifications for new Safari instances. When it sees an instance of Safari, it uses objc.inject to load the plugin into the target pid.
Plugin
IDNSnitchPlugin.py is where all the magic happens. It swizzles NSURLRequest's designated initializer by creating a category after caching the original IMP. This swizzled initializer calls into another method that checks the NSURL and has an opportunity to return a different one. The implemented checker looks for the ACE (ASCII compatible encoding) prefix in the host of the given NSURL. If it sees an ACE prefix, it presents an alert panel to the user showing them the raw IDN URL, the "display" host name, and the unicode escaped host name. The user can then decide whether to allow or deny requests from this host, and this decision is cached for the rest of their Safari session, but not persisted. If they choose Deny, it simply returns about:blank rather than the original URL.
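The check itself is simple enough to sketch outside of Safari. The following Python 3 snippet is not the plugin's code: it uses only the standard library (urllib.parse and the built-in idna codec), and the function names are invented for the example. It looks for the ACE prefix "xn--" in each host label and produces the three views of the host that the alert panel shows.

```python
from urllib.parse import urlsplit

ACE_PREFIX = "xn--"


def ace_labels(url):
    """Return the host labels of url that carry the IDN ACE prefix."""
    host = urlsplit(url).hostname or ""
    return [label for label in host.split(".")
            if label.lower().startswith(ACE_PREFIX)]


def display_host(url):
    """Decode an ASCII host to its Unicode display form (idna codec)."""
    host = urlsplit(url).hostname or ""
    return host.encode("ascii").decode("idna")


url = "https://www.xn--pypal-4ve.com/"
if ace_labels(url):
    shown = display_host(url)
    print("raw URL:     ", url)
    print("display host:", shown)
    print("escaped host:", ascii(shown))
```

The escaped form is the useful one for a human: the display host can look identical to a familiar name, while the escapes make the non-ASCII character obvious.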
Discovery
In order to discover how to implement this hack, I attached gdb to Safari, like so:
% ps -auxwww|grep Safari
bob 21062 0.0 3.9 254720 40976 ?? S 5:23PM 0:57.19 /Applications/Safari.app/Contents/MacOS/Safari -psn_0_193331201
bob 21103 0.0 0.0 17052 8 std R+ 5:40PM 0:00.00 grep Safari
[meth:~] bob% gdb attach 21062
I then thought that I could easily pick out when Safari used URLs by putting a break point on NSURL's designated initializer:
(gdb) fb -[NSURL initWithString:relativeToURL:]
Breakpoint 1 at 0x90a2d51c
(gdb) c
After going to a few URLs, I noticed that it would often have the URL cached somehow. So then I looked at a NSURL backtrace and saw that NSURLRequest was probably used more often, so I put a break point on its designated initializer:
^C
Program received signal SIGINT, Interrupt.
0x900074c8 in mach_msg_trap ()
(gdb) fb -[NSURLRequest initWithURL:cachePolicy:timeoutInterval:]
Breakpoint 2 at 0x90a0b0b8
(gdb) c
NSURLRequest is indeed used all the time, so I took a look at what a spoofed URL looks like:
Breakpoint 2, 0x90a0b0b8 in -[NSURLRequest initWithURL:cachePolicy:timeoutInterval:] ()
(gdb) po $r5
https://www.xn--pypal-4ve.com/
At this point I had everything I needed to know, so I wrote the code.
UPDATE:
- Rewritten in pure Python (requires svn trunk of PyObjC), hopefully fixed threading bugs.
- Fixed some more bugs and made it smaller
- An alternate implementation of this is available in Saft v7.5.1 and later (have not tried it myself)
- One of the authors of the IDN standard writes about a more balanced solution to this issue. I had actually considered doing it this way, but I simply didn't have the time or interest in creating the custom dialogs required. This functionality should be in unicodedata, but it's not, though Blocks.txt would be trivial to parse.
- An up and coming Mozilla extension, TrustBar, attempts to solve this and other issues for Mozilla and Firefox
Advanced Debugging Techniques: ktrace
February 04, 2005 at 12:29 AM | categories: debugging, c, python, macosx, py2app, wxPython
I had spent the past few days on and off trying to help a py2app user with a very hairy problem: when wxPython 2.4.2.4 was bundled, the main menu didn't work. Running the application as an alias bundle or with pythonw worked just fine, but when built as standalone or semi-standalone, the menu no longer showed up.
Right off the bat I (correctly) assumed it was a problem with wxPython or his sample code, because py2app doesn't do anything that would cause this sort of behavior. It does link to Cocoa, but it doesn't call into any AppKit functionality unless it needs to display an error message. Somewhat trivially converting his source to work with wxPython 2.5 solved this issue, so I was rather stumped.
I had a hunch that perhaps there was something that wxPython 2.4.2.4 wants that didn't end up in the bundle, so I broke out ktrace. Using ktrace is rather simple:
% ktrace ./dist/test.app/Contents/MacOS/test
This will create a file in the current directory, ktrace.out, with a log of just about everything that happened. For efficiency, this file is in a binary format that you must process with kdump. The output of kdump is quite lengthy, but it looks like this:
15121 ktrace CALL execve(0xbffffd1b,0xbffffc84,0xbffffc8c)
15121 ktrace NAMI "dist/test.app/Contents/MacOS/test"
15121 ktrace NAMI "/usr/lib/dyld"
15121 test RET execve 0
CALL is the actual system call, NAMI is a name to inode translation that the system call used, and RET is the value returned to the application. If there was an error during the system call, kdump will gladly tell you everything you wanted to know:
15121 test CALL open(0xbfffe7f0,0,0x1b6)
15121 test NAMI "/Library/Preferences/org.pythonmac.unspecified.test.plist"
15121 test RET open -1 errno 2 No such file or directory
Since I suspected that wxPython was missing a file, I wanted to narrow down the output, so I naturally used grep on the kdump output to find errors:
% kdump | grep -B 2 errno | grep wx
15121 test NAMI "/Users/bob/Desktop/simple/dist/test.app/Contents/Resources/wxPython"
15121 test NAMI "/Users/bob/Desktop/simple/dist/test.app/Contents/Resources/wxPython.so"
15121 test NAMI "/Users/bob/Desktop/simple/dist/test.app/Contents/Resources/wxPythonmodule.so"
15121 test NAMI "/Users/bob/Desktop/simple/dist/test.app/Contents/Resources/wxPython.py"
....
Unfortunately, what we're looking at here (and for the next 150 or so lines) is primarily just Python doing its module import search. Since I knew that all of sys.path pointed to locations under Resources, I could just filter those lines out. It is extremely unlikely that wxPython wants anything from there:
% kdump | grep -B 2 errno | grep wx | grep -v Resources
15121 test NAMI "/Users/bob/Desktop/simple/dist/test.app/Contents/MacOS/../Frameworks/libwx_mac-2.4.0.rsrc"
15121 test NAMI "/libwx_mac-2.4.0.rsrc"
There it is! So now I just need to make the setup.py look like the following:
from distutils.core import setup
import py2app

setup(
    app = ['test.py'],
    data_files = [
        ('../Frameworks', ['/usr/local/lib/libwx_mac-2.4.0.rsrc']),
    ],
)
Now the application works as expected. py2app 0.1.9 will include a recipe for wxPython to make sure that this file ends up in the bundle automagically, among other new features.
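The grep pipeline used above is a rough filter: it relies on the failing NAMI line landing within two lines of the errno line. A slightly more structured take can be sketched in Python 3. This is an illustration written against the kdump record shapes quoted in this post (pid, command, record type, detail), not a general kdump parser, and the function name is made up.

```python
def failed_paths(kdump_text):
    """Yield NAMI paths belonging to system calls that returned an errno."""
    pending = []  # paths seen since the most recent CALL record
    for line in kdump_text.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        record, detail = fields[2], fields[3]
        if record == "CALL":
            pending = []
        elif record == "NAMI":
            pending.append(detail.strip('"'))
        elif record == "RET" and " errno " in detail:
            for path in pending:
                yield path
            pending = []


# The sample records quoted earlier in this post
SAMPLE = '''\
15121 ktrace CALL execve(0xbffffd1b,0xbffffc84,0xbffffc8c)
15121 ktrace NAMI "dist/test.app/Contents/MacOS/test"
15121 ktrace NAMI "/usr/lib/dyld"
15121 test RET execve 0
15121 test CALL open(0xbfffe7f0,0,0x1b6)
15121 test NAMI "/Library/Preferences/org.pythonmac.unspecified.test.plist"
15121 test RET open -1 errno 2 No such file or directory
'''
for path in failed_paths(SAMPLE):
    print(path)
```

To use it on a real trace, the text would come from kdump's output, for instance by reading sys.stdin in a script that kdump is piped into; further filtering (such as dropping everything under Resources) is then an ordinary list comprehension.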
Disabling a CPU with the CHUD framework
January 30, 2005 at 02:14 PM | categories: debugging, c, macosx
Xcode Tools has an optional component, CHUD Tools (Computer Hardware Understanding Development Tools), that consists of some useful performance tools and low-level hardware facilities. My Dual 2 GHz G5 has been having some serious stability problems lately, with what I believe is a dying CPU or logic board. When I saw errant CPU messages in the system log after experiencing inexplicable kernel panics and crashes, I decided to see what would happen if I toggled the second CPU off with the Hardware preference pane that ships with CHUD. It worked! My G5 is now usable (though I will of course still get it repaired, but it's not convenient to do so at this time).
Unfortunately, when I reboot the machine, this setting is lost and all bets are off as to whether I'll be able to disable the second CPU before the machine crashes, so I decided to look into what I could do. I opened up the Hardware preference pane nib with Interface Builder to see what message was sent to change the CPU count (unsurprisingly, setCPUCount:), then I used class-dump to find the implementation address of that message. I then did an otool disassembly of the Hardware preference pane (otool -tVv ...) so that I could see what the code looked like at that address. It called an inconspicuously named function chudSetNumProcessors from the CHUDCore.framework subframework of the umbrella CHUD.framework, which happens to ship with documented headers!
At first, I tried writing a simple C program that naively called right into chudSetNumProcessors, which returned an error code that I didn't expect (from the documentation): something about the kext not being loaded. I knew the kext was indeed loaded, because the Hardware preference pane works and I've used Shark recently, so I looked at the headers for initialization functions. Unsurprisingly, I needed to call chudInitialize before trying to talk to the CHUD kext, so I ended up with the following program:
/*
% cc -o setNumProcessors setNumProcessors.c -framework CHUD
*/
#include <stdio.h>
#include <unistd.h>
#include <CHUD/CHUD.h>

int main(int argc, char **argv) {
    int rval = 0;
    int status = chudInitialize();
    if (status != chudSuccess) {
        /* report the status we already have rather than initializing again */
        fprintf(stderr, "FATAL: Could not initialize chud, error %d\n", status);
        return -1;
    }
    if (argc == 2) {
        /* start at 0 so unparseable input fails the range check below */
        int cpuCount = 0;
        int physCPUCount = chudPhysicalProcessorCount();
        sscanf(argv[1], "%d", &cpuCount);
        if (cpuCount < 1 || cpuCount > physCPUCount) {
            fprintf(stderr, "CPU count of %d not acceptable, expecting between 1 and %d\n", cpuCount, physCPUCount);
            rval = -1;
        } else {
            int res;
            res = chudSetNumProcessors(cpuCount);
            if (res != chudSuccess) {
                fprintf(stderr, "Could not change CPU count to %d, error %d\n", cpuCount, res);
                rval = -1;
            }
        }
    } else if (argc > 2) {
        fprintf(stderr, "Must take zero or one arguments\n");
        rval = -1;
    }
    /* with no arguments, just report the current state */
    printf("CPU Count: %d of %d\n", chudProcessorCount(), chudPhysicalProcessorCount());
    chudCleanup();
    return rval;
}
Now I can call this setNumProcessors application early on in the boot process and increase my odds of being able to use my computer on reboot!
UPDATE: rentzsch commented with a better solution. It's also possible to disable multiprocessing even earlier by twiddling a setting in Open Firmware (QA1141).