“There is no silver bullet—but there are no werewolves.”
From No Name: Just Notes on Software Reuse by Robert Biddle, Angela Martin, and James Noble. In response to Frederick Brooks.
July 3, 2007
A very witty observation that could render my education worthless
June 29, 2007
Browsing gzipped tarballs in Emacs
Somehow the presence of a hundred billion .tar.gz files on the Internet prevents Google from giving me this very simple tip. Just opening a .tar file in Emacs will let you browse it as if it were a directory. If the file is gzipped, you need to enable auto-compression-mode: either do M-x auto-compression-mode or call (auto-compression-mode 1) (e.g., in your .emacs file). Setting the variable directly with setq does not turn the mode on.
Google Desktop for Linux!
Hooray! Beagle search for Gnome? Boo! Go away!
Google, if I give you all my data, do you promise to be gentle?
May 30, 2007
Multi-line Comments in Make
You see, this is why I hate Make. Did you know that a backslash at the end of a comment line extends the comment to the next line? For example:
# This is a comment \
and this is still a comment
This is all very nice and logical (a trailing backslash means the same thing no matter where it appears in a file), but it has all the niceness and logic just exactly backwards. The behavior of (line-based) comments in every other programming environment I know of is: a comment character (in this case ‘#’) introduces a comment that is terminated by the end of the line; if a line does not start with a comment character, it is not a comment.
This may seem harmless. But consider the following:
FILES = \
file1 \
file2 \
file3
Now suppose we decide to temporarily remove file1:
FILES= \
# file1 \
file2 \
file3
Does FILES equal "file2 file3"? No! FILES is empty. And that’s if you’re lucky and you didn’t get some weird syntax error.
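To see why FILES comes out empty, here is a rough Python model of the order of operations. It is only a sketch of the behavior described above, not Make's actual parser: the function names are made up for illustration, and it assumes nothing more than "join backslash-continued lines first, strip comments second".

```python
import re

def logical_lines(text):
    """Join physical lines that end in a backslash, roughly as Make does."""
    # Backslash-newline plus surrounding blanks collapses to one space.
    joined = re.sub(r"[ \t]*\\\n[ \t]*", " ", text)
    return joined.splitlines()

def strip_comment(line):
    """Drop everything from the first unescaped '#' to the end of the line."""
    return re.sub(r"(?<!\\)#.*", "", line).strip()

def parse_assignments(text):
    """Return {name: value} for simple 'NAME = value' lines (toy model)."""
    result = {}
    for line in logical_lines(text):
        line = strip_comment(line)
        if "=" in line:
            name, value = line.split("=", 1)
            result[name.strip()] = " ".join(value.split())
    return result

original = "FILES = \\\nfile1 \\\nfile2 \\\nfile3\n"
commented = "FILES= \\\n# file1 \\\nfile2 \\\nfile3\n"

print(parse_assignments(original))   # {'FILES': 'file1 file2 file3'}
print(parse_assignments(commented))  # {'FILES': ''}
```

Because the lines are glued together before the comment is stripped, the ‘#’ swallows file2 and file3 along with file1.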
May 23, 2007
The Annals of OCaml Compiler Errors
X is not a compilation unit description.
X is not a file type that the compiler expected to receive as input. For example, X is a .cma file and you’re running the compiler with the -a flag: .cma files are only expected on the final link; linking a library into a library doesn’t make any sense.
April 27, 2007
Ubuntu Names
Oddly enough, Ubuntu has a web page about the names. Apparently, it just didn’t occur to them to go in alphabetical order until after breezy. But why did they skip C?
I think Adjective Animal would be a great name for their 27th release.
Azureus
What’s the story with this piece of crap? Azureus seems to be every right-thinking Linux geek’s favorite BitTorrent client, but I have never once had it successfully download anything. It hangs and/or crashes pretty much immediately. This has been the case on multiple computers and four different versions of Ubuntu. Is it me? Is it SafePeer?
[UPDATE] There was a widespread, virulent bug in Azureus. I’ve been able to work around it, though I’m not quite sure how. It really is a great BitTorrent client…
Feisty Fawn
I upgraded my laptop from Ubuntu edgy to feisty last night and I’m pleased to report (after a rather unpleasant experience upgrading from dapper to edgy, documented here, here, and here) that the whole thing went off without a hitch. The upgrade process asked a lot of annoying questions about whether or not it should clobber various config files, but I just said “Yes” to all of them and have seen no ill effects. It even gracefully downgraded my nVidia driver to the latest version in the Ubuntu repository and Suspend still works. Who knew that was possible?
Are there any great benefits to upgrading? Um… not that I can tell. The version of Liferea is more recent and there’s a search button in the Panel. These are hardly blockbuster features. The main benefit to me is that “are you running feisty?” will not be the first and last response to every question I post to the support forums. (This will be replaced with “are you running gutsy?” within a month.*)
* Speaking of which: I am really fond of the names Breezy Badger, Dapper Drake**, Edgy Eft, and Feisty Fawn. But, Gutsy Gibbon? Yuck.
** Does anyone know the story behind the naming scheme? I.e., how Ubuntu went from Warty Warthog and Hoary Hedgehog to the current Sue Grafton-esque release-naming convention? And why they skipped A and C?
April 25, 2007
Profiling OCaml… Revealed!
Profiling OCaml code is kind of a hassle. The simplest thing is to use ocamlopt with the -p option, then apply gprof as usual. The problem here is that the debugging symbols produced by the OCaml compiler are of limited usefulness. For example, fun expressions show up with names like camlModule__fun_2397 (where 2397 has nothing to do with anything) and, I think, continuation-passing transformations in the back-end can lead to confusing call graph relationships where functions that shouldn’t be compute-intensive at all end up looking like hot spots.
Now, you may think this is all due to the conversion to C calling conventions and the corresponding loss of high-level information at execution time and therefore the solution would be to profile bytecode. So you might try to compile with ocamlcp, the profiling bytecode compiler. Along the way, you’ll figure out that ocamlcp doesn’t allow the -pp option… No problem—if your project is sufficiently small or your Makefiles are sufficiently modular—you can just run the preprocessor separately and pass the preprocessed files in to ocamlcp (just add pr_o.cmo to your camlp4 command, to dump the pretty-printed version of your code instead of the AST object).*
Then you’ll discover** that what ocamlprof gives you is not a data dump like the output of gprof, but a source file annotated with execution counts for each expression. And you’ll realize that this is in some ways useless—you really need time information to do effective profiling. For example, the polymorphic equality function (that’s, um, = for you non-functional programming types) is going to have a massive execution count in just about any program you write; that doesn’t mean you need to rip it out and hot-rod it.***
Now, here’s where I made an interesting discovery: the byte- and native-code compilers seemingly dismantle the source code in similar or identical ways. You can take the execution count for an anonymous fun expression from the gprof output and match it up with the execution count on the source expression from ocamlprof.
Here’s an example. gprof tells me the following:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 28.57      0.08     0.08  1064344     0.00     0.00  compare_val
 17.86      0.13     0.05    82370     0.00     0.00  camlAtp__itlist_116
 10.71      0.16     0.03  2284937     0.00     0.00  caml_apply2
  7.14      0.18     0.02  1657397     0.00     0.00  camlMlss__fun_1052
The far left column tells you what percentage of the execution time was spent in the function named on the far right. The column in the middle tells you how many times the function was called. The first three rows name built-in and generic functions; it’s not surprising that the program spends a lot of time comparing things, iterating over lists, and invoking functions. The fourth row names camlMlss__fun_1052, an anonymous function, which accounts for 7.14% of running time. Where is this function?
gprof outputs the following call graph information:
-----------------------------------------------
                               28217             camlMlss__fun_1046 [167]
                             1629180             camlAtp__itlist_116 [11]
[16]    12.0    0.02    0.01 1657397         camlMlss__fun_1052 [16]
In other words, camlMlss__fun_1052 is called from some other anonymous function and from a generic list iterator. That’s not very helpful.
But if we go over to the output of ocamlprof, we find this:
let saturation_rule2 f terms theta =
  (* 546 *) itlist
    (fun fm r ->
       (* 28217 *) try
         let g = f fm in
         itlist
           (fun fm r ->
              (* 1657397 *) try g fm @ r with
                Match_failure _ -> r)
           theta r
       with
         Match_failure _ -> r)
    theta []
The numbers in comments are invocation counts. The innermost fun expression is called 1,657,397 times. Does that number look familiar? Notice also that the next enclosing fun expression is called 28,217 times, which is exactly the number of calls attributed to the anonymous parent of camlMlss__fun_1052 in the gprof call graph data. We have found our hot (er, warm-ish) spot!
I’m not sure how reliably this works in general. The cited results were obtained by running ocamlcp with the -g and -p f arguments. It might be fun, if one could find some spare time, to write a utility that used the ocamlcp output to annotate the gprof data with (probable) line numbers.
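For what it’s worth, a first cut at such a utility might look like the Python sketch below. It assumes the two formats are exactly as in the excerpts above (a gprof flat profile with seven whitespace-separated columns, and ocamlprof counts written as (* N *) comments); the function names are invented for illustration, and matching on counts alone is a heuristic that will produce false hits whenever two expressions happen to run the same number of times.

```python
import re
from collections import defaultdict

# Flat-profile row: %time, cumulative s, self s, calls, self ms/call,
# total ms/call, name. Header lines do not match this pattern.
GPROF_ROW = re.compile(
    r"^\s*[\d.]+\s+[\d.]+\s+[\d.]+\s+(\d+)\s+[\d.]+\s+[\d.]+\s+(\S+)\s*$"
)
# ocamlprof writes execution counts as comments such as (* 1657397 *).
COUNT_COMMENT = re.compile(r"\(\*\s*(\d+)\s*\*\)")

def gprof_calls(flat_profile):
    """Map function name -> call count from gprof flat-profile text."""
    calls = {}
    for line in flat_profile.splitlines():
        m = GPROF_ROW.match(line)
        if m:
            calls[m.group(2)] = int(m.group(1))
    return calls

def ocamlprof_counts(annotated_source):
    """Map execution count -> list of 1-based source lines carrying it."""
    lines_by_count = defaultdict(list)
    for lineno, line in enumerate(annotated_source.splitlines(), start=1):
        for m in COUNT_COMMENT.finditer(line):
            lines_by_count[int(m.group(1))].append(lineno)
    return lines_by_count

def probable_locations(flat_profile, annotated_source):
    """Pair anonymous functions with source lines sharing their call count."""
    counts = ocamlprof_counts(annotated_source)
    return {
        name: counts[n]
        for name, n in gprof_calls(flat_profile).items()
        if "__fun_" in name and n in counts
    }
```

Feed it the text of the flat profile and the annotated source file, and it returns a dictionary from names like camlMlss__fun_1052 to candidate line numbers.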
* Note that the pre-processed file must have a .ml extension or the OCaml compiler will refuse to have anything to do with it. Note also that foo-pp.ml is not an option, because the filename must be a valid module identifier when the first letter is capitalized (i.e., your canonical [A-Za-z][A-Za-z0-9_]* identifier).****
** We assume throughout that you are a foolish person like me: that you only read the documentation for such things far enough to get them running and are consequently constantly surprised by what programs actually do, since you assume that they ought to do what they seem to be intended to do.
*** Although you may run into trouble if you use polymorphic equality on big, complicated (or, gasp, cyclic) data structures.
**** Note to self: is there a reason the OCaml convention is to use lowercase file names when the module system implicitly capitalizes the name for use as a module identifier and a capitalized file name is also accepted by the compiler? I.e., why don’t we match the case of file names and implicit module declarations? Oh me, oh my, why, why, why?
April 17, 2007
Unzip into a subdirectory!
Most folks know this. If you are one of those folks, just move along. I’ll wait…
Now. When you make a TAR or ZIP file, make it so the files will un-tar/zip into a subdirectory, damn it! Do you really think I want your files spread all over my /usr/local tree?
This is especially common amongst people who make ZIP files, so I’m usually on my guard for it. But tarballers: come on! Get with it! It’s easy: “cd ..; tar cvf foo.tar DIR”
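If you build your archives from a script rather than by hand, the same idea can be sketched in Python with the standard tarfile module. The two function names here are hypothetical, and the check is deliberately rough: it only asks whether every member shares one top-level directory.

```python
import os
import posixpath
import tarfile

def make_tarball(src_dir, archive_path):
    """Archive src_dir so every member sits under one top-level directory."""
    top = os.path.basename(os.path.abspath(src_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps the directory name as a prefix on every member.
        tar.add(src_dir, arcname=top)
    return top

def unpacks_into_subdirectory(archive_path):
    """Rough check: do all members share a single top-level directory?"""
    with tarfile.open(archive_path) as tar:
        names = [posixpath.normpath(n) for n in tar.getnames()]
    names = [n for n in names if n != "."]
    tops = {n.split("/", 1)[0] for n in names}
    # One shared top-level entry, with at least one member beneath it.
    return len(tops) == 1 and any("/" in n for n in names)
```

Running unpacks_into_subdirectory on a download before extracting it is a cheap way to find out whether you need to mkdir first.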




