Iteration in JavaScript
In JavaScript, there are basically two kinds of object iteration:
All objects support property enumeration:
for (var propName in someObject) { var value = someObject[propName]; }Some objects support the "Array protocol":
for (var i=0; i<someObject.length; i++) { var value = someObject[i]; }
These both suck.
Property enumeration is only really useful for debugging, since chances are the objects will have a lot of properties that you do not want them to have, and there is no standard way to have properties that are hidden from enumeration.
Array protocol enumeration sucks because:
- Writing for loops is so "C"
- Not every object that supports the "array protocol" is actually an array, so the "higher order" methods to manipulate the array such as Array.prototype.slice aren't going to be available everywhere. DOM NodeList objects and the special arguments array are particularly common examples of these array-like objects.
Fortunately, this doesn't stop you from writing your own iteration protocol. Here is an example of a JavaScript iterator in the style of Python's PEP 234 - Iterators:
StopIteration = function () {};
StopIteration.prototype = new Error();
StopIteration.name = 'StopIteration';
StopIteration.message = 'StopIteration';
arrayLikeIterator = function (arrayLike) {
var i = 0;
return {
'next': function () {
if (i >= arrayLike.length) {
throw StopIteration;
}
return arrayLike[i++];
}
};
};
var countToFive = arrayIterator([1, 2, 3, 4, 5]);
try {
while (true) {
var item = countToFive.next();
// do something with item here
}
} catch (e) {
if (e != StopIteration) {
throw e;
}
// pass
}As you can see, it's not pleasant to work directly in this style. We've actually made things worse, but we've gained two important things in the process:
The concept of an iterator separate from the object being iterated
This separation is very important. When you have iterators that are separate from the object being iterated, you can do crazy things like iterate in reverse without creating a new array, define an iteration over objects that aren't normally iterable (particularly important since JavaScript doesn't have operator overloading), or define an alternative kind of iteration over something (such as walking a tree in pre-order, or iterating over every even item in an array, etc.).
Iterators that don't need to know about length or index
This basically means that iterators are lazy. You can define infinite iterators, filter chains, etc. This can be extremely important when doing some kinds of programming in a functional style.
Unfortunately, as-is, we've made things too hard to use liberally. That's when you bring in the sledgehammers, er, functions that make this less painful. Basically, these fall into three groups:
- Functions that create iterators from an iterable object (not necessarily an iterator)
- Functions that transform an iterator into some kind of primitive object
- Functions that transform one iterator into another iterator
The most important function in the first group (following the Pythonisms) is iter(iterable[, sentinel]). iter's job is to determine if iterable is already an iterator, or match it up with the appropriate iterator factory to make one out of it. The optional sentinel argument is just an additional convenience (which I carried over into MochiKit's iterators).
Determining whether an iterable is already an iterator is trivial, simply check typeof(iterable.next) == 'function'. However, if it's not, then it has to do some grunt work. In Python style, our implementation of iter allows an object to implement an iter() method (like Python's __iter__) which returns an iterable. However, since we don't do any built-in object prototype hacking, we've also allowed for adapters (using the same methodology that's in Comparison and Adaptation in JavaScript) to be defined.
Just like registering comparators, registering an iterable is a matter of writing a function that detects the kind of object you want to iterate (check), and another function (wrap) that is the iterator factory for that kind of object. Note that we recommend adapting on how objects act, not their type. Since JavaScript's type system is sufficiently weak, adapting based on type is a non-starter. For example, many of the objects you may run across in JavaScript will behave similarly to an Array, but aren't quite. The DOM's NodeList is a particularly ugly example. It has a length property, and it supports the [``same syntax]`` as Arrays do, but they aren't Arrays. The arguments special object is similarly Array-like, but isn't a real Array.
In order to work around this, Duck Typing is usually the right answer. Instead of checking the constructor, instanceof, etc., you simply introspect the object to see if it looks like a duck, and quacks like one. For example, the registration for turning array-like objects into iterables looks like this:
isArrayLike = function (obj) {
// working around blog-bug with "and" due to ampersands
return !(typeof(obj) != 'object' || typeof(obj.length) != 'number');
}
arrayLikeIterator = function (arrayLike) {
var i = 0;
return {
'next': function () {
if (i >= arrayLike.length) {
throw StopIteration;
}
return arrayLike[i++];
}
};
};
registerIteratorFactory(
// name of the adapter
"arrayLike",
// check function
isArrayLike,
// wrap function
arrayLikeIterator
);With this, given a suitable definition of iter, we can write iter(someIterable) without having to worry so much about what someIterable is.
The second group of functions let you do the inverse: take something iterable, and make it primitive again. There are two basic examples of this, reduce and list. The implementations are as follows (the documentation is better in MochiKit, I shortened it for brevity):
list = function (iterable) {
/***
Create an array from an iterable.
***/
// fast-path if they passed in an Array already
if (typeof(iterable.slice) == 'function') {
return iterable.slice();
}
// it's good to do this for convenience,
// it's cheap if it's already iterable
iterable = iter(iterable);
var rval = [];
try {
while (true) {
rval.push(iterable.next());
}
} catch (e) {
if (e != StopIteration) {
throw e;
}
}
return rval;
}
reduce = function (func, iterable, /* optional */initial) {
/***
Create a value from an iterable by applying a function of two arguments
cumulatively to all of its elements.
***/
iterable = iter(iterable);
if (arguments.length < 3) {
try {
initial = iterable.next();
} catch (e) {
if (e == StopIteration) {
e = new TypeError("reduce() of empty sequence with no initial value");
}
throw e;
}
}
try {
while (true) {
initial = func(initial, iterable.next());
}
} catch (e) {
if (e != StopIteration) {
throw e;
}
}
return initial;
}The last group of functions completes the toolkit by allowing you to transform an iterator into another iterator. These functions are imap, ifilter, ifilterfalse, etc. Effectively what these let you do is compose something a bit more declarative. For example:
// reduce(operator.add, sequence, 0) is also sum(sequence) in MochiKit sumOfAllObjectsGreaterThanTen = reduce(operator.add, // all objects that are greater than 10 (10 is less than them) ifilter(partial(operator.lt, 10), // skip all objects that are undefined or null ifilterfalse(isUndefinedOrNull, allObjects) ), // use 0 as the initial value, i.e. (((0 + p0) + p1) + p2 ...) 0 );
The equivalent in idiomatic JavaScript looks more like this:
var sumOfAllObjectsGreaterThanTen = 0; for (var i = 0; i < allObjects.length; i++) { var o = allObjects[i]; // working around blog-bug with "and" due to ampersands if (!(isUndefinedOrNull(o) || o <= 10)) { sumOfAllObjectsGreaterThanTen += o; } }
Granted, this particular example doesn't give you much. However, if you're familiar with the functional style of programming you can easily see where this goes. Unfortunately, I'm not really in the mindset to teach at the moment, so if you don't get it yet, you're not going to get it from me (today).
- UPDATE:
- Check out the full implementation of this in MochiKit!
Is this putting too much Python into Javascript? For instance, what if you create a sentinal that is strictly for iteration. E.g. “EndIter = new Object();”. And then a loop looks like
for (iterable=iter(obj);; (i=iterable()) != EndIter) {...}. Though, I’m going to guess you are thinking about replacingforwith something likeforeach(obj, function(i) {...}). And looking at it, that looks quite a bit better than anything you can do withfor.I think it’s worth testing performance, though; both of using a sentinal vs. exceptions, how fast code blocks are, whether adapters work reasonably fast, etc. Javascript isn’t fast, and it feels like these are adding some substantial layers of abstraction. Also, Javascript just isn’t Python; not only are the layers of abstraction a miscognate (almost like Python but not quite), but it creates a Javascript-Python subset, which isolates the library from other libraries and non-Python users.
Comment by Ian Bicking — 2005-07-06 @ 8:47 pm
You can create a sentinel that’s strictly for iteration, but then you can’t include it in an iteration, which is dumb.
Yes there is a
forEach(obj, function(i) { ... }).Because performance does matter (and is terrible for a lot of things), there are separate functions that work directly with strictly array-like-objects. map(fn, p, q, …) takes and returns arrays, imap(fn, p, q, …) takes iterables and returns and iterator. Just like in Python. forEach, list, and a few other functions have fast-paths for when you pass them arrays.
I haven’t done any profiling other than running our app in browsers, and what we’ve found is that network latency and DOM manipulation dwarfs the hell out of anything else we’re doing.
As far as the JavaScript-Python miscognate stuff.. I don’t give a shit. JavaScript is hardly capable of doing anything before you throw a support library under it, and whatever support library you throw underneath is going to be some JavaScript+NotSuck subset that not everybody understands out of the womb. It’s simply NOT possible to do comparisons, iterations, deferred operations, etc. with JavaScript syntax alone because it simply doesn’t have the language features required to support them. You might as well code up solutions that are proven to work and might even be familiar to the people who will choose your library (in this case, probably mostly people who are familiar with Python).
Comment by Bob Ippolito — 2005-07-06 @ 9:05 pm
I feel sorry for anybody who has to read your JavaScript code after the fact. C style for loops are standard for a reason, they’re simple, flexible, and everybody knows how to use them. Thanks for making things harder.
Comment by Bryan — 2005-07-07 @ 2:38 pm
I feel sorry for anyone who thinks that C style for loops are the best way to do things. Sucks to be you, I guess.
Comment by Bob Ippolito — 2005-07-07 @ 2:42 pm
My intuition is that it is a mistake to throw StopIteration instead of simply returning StopIteration. I don’t know Python, so maybe exceptions are used for ordinary, non-exceptional conditions in that language. My take, however, is that the end of an interation is a normal occurrance, and having to catch an exception to detect it is a bad thing. Yes, you’ve hidden this will all the helper functions, but still.
I comment #2 above, you say that the problem with sentinels is that they cannot be returned by an iterator. Sure. That’s probably a good reason not to choose null as your sentinel (although there is a school of thought that would argue that collections should never contain null and that null does make a good sentinel). But if you have a unique object (or fuction like StopIteration), whose only purpose is to serve as your end-of-iteration sentinel, I can’t think of a reason why you’d ever want to include this in an iterator.
Having said all that, I do love the arrayLikeIterator() function. An elegant use of closures. MochiKit looks very interesting.
Comment by David Flanagan — 2005-07-21 @ 1:44 pm
Your intuition is wrong :) Exceptions are the right way to signal exceptional conditions.
You can’t think of any reason why you’d want to include StopIteration? What about interating over all of the property values in MochiKit.Iter?
Comment by Bob Ippolito — 2005-07-21 @ 2:18 pm
How about a compromise: give iterators a hasNext() method like they have in Java, but still throw your exception as you do now. In internal methods that process deep filter chains or whatever, you just ignore the hasNext() method and wait for the exception. But just plain folks who want to use an iterator themselves (instead of using a higher-level) function don’t have to catch an exception. They can just do while(iterator.hasNext()) { process(iterator.next()); }. hasNext() is just a way of querying whether a call to next() will throw an exception.
Comment by David Flanagan — 2005-07-23 @ 12:26 am
No.
forEach(iterator, function (obj) { process(obj); });or:
exhaust(imap(process, iterator));Comment by Bob Ippolito — 2005-07-23 @ 12:29 am
nifty. i think a better way to do
for (var i = 0; i < all.length; i++) { var obj = all[i]; ...might be
for (var i = 0, obj; obj = all[i]; i++) { …Comment by tim baker — 2005-07-25 @ 11:41 pm
oops, did not encode that “<”. i meant
for (var i = 0, obj; obj = all[i]; i++) { …might be better than
for (var i = 0; i < all.length; i++) { var obj = all[i]; …Comment by tim baker — 2005-07-25 @ 11:45 pm
Uh.. no. If all[i] happens to be a false value then you lose.
Comment by Bob Ippolito — 2005-07-25 @ 11:59 pm
I think that for loop would at least be better, if not great, if you did:
for (var i=0, obj; ! (obj=all[i]) === undefined; i++) {}In theory
undefinedis a more purely sentinal object thannullor the other false objects.If you wanted to make a purely sentinal object that would be less likely to leak into the environment, you could probably do:
function makeMakeSentinal() {
var sentinal = new Object();
function create() {
return sentinal;
}
}
makeSentinal = makeMakeSentinal();
This would give you a truly private object with a unique and persistent identity that could only be retrieved with a function call, and so would never appear in any enumeration of an object (so long as people are disciplined enough not to store the result of that function call anywhere except in private local variables).
But the only justification I see for one over the other is performance, which requires testing, which I am too lazy to do now, so I can’t complain…
Comment by Ian Bicking — 2005-07-28 @ 1:08 pm
Whoops,
makeMakeSentinal()should returncreate.Comment by Ian Bicking — 2005-07-28 @ 1:09 pm
Just thought I’d respond to this:
Property enumeration is only really useful for debugging, since chances are the objects will have a lot of properties that you do not want them to have, and there is no standard way to have properties that are hidden from enumeration.
This is a gross exageration. There’s an easy way to transform an array into an associative array such that you can iterate over it using the ‘in’ syntax. A simple example:
function arrayIter(array)
{
var obj = {};
for (var i = 0; i < array.length; i++)
{
obj[array[i]] = true;
}
return obj;
}
var myArray = [4,9,16];
for (elem in arrayIter(myArray))
{
document.writeln(elem);
}
Admittedly I don’t see a way of having ‘in’ work together with arbitrary iterators, representing infinite streams or the like.
Comment by Per Vognsen — 2005-08-05 @ 10:29 am
Actually, no, that won’t work unless the array contains only strings. Property names are strings, not arbitrary objects (though some implementations support that).
Comment by Bob Ippolito — 2005-08-05 @ 6:47 pm
First, regarding for loops and properly using them… ++i, not i++.
Second, now that Javascript 1.7 is available with built in Python-esque iterators, how does that compare with what you were trying to achieve?
Comment by NS — 2006-12-19 @ 8:48 am
There’s absolutely no difference whatsoever between ++i and i++ in this context. I don’t know why you wasted the effort to suggest one over the other.
JavaScript 1.7’s iterators *are* what I proposed here, plus other enhancements. They effectively tracked Python 2.5’s iterators.
Comment by bob — 2006-12-19 @ 9:06 am
Because ++i is faster. You seem concerned with having perfect code, and one should never use i++ unless it’s necessary.
I ask about the 1.7 iterators because I was curious if they captured all of what you are trying to do.
Do you know if it is possible with JS 1.7 to, for instance, apply a forEach() method to an HTML Collection object? This is the best I could come up with:
var nodes = Iterator(document.getElementsByTagName(”input”))
for ( n in nodes )
if ( n[1].getAttribute(”type”) == “text” )
n[1].value = null
But that doesn’t seem any different than:
var nodes = document.getElementsByTagName(”input”)
for ( let i = 0; i
Comment by NS — 2006-12-19 @ 10:27 am
++i isn’t meaningfully faster than i++ in most JS interpreters.
for each (n in nodes) is faster than for (i=0; …) though.
Comment by bob — 2006-12-19 @ 6:23 pm