354 ./configure --prefix=/home/chris/tools/builds/why 359 godi_console perform -build godi-lablgtk2 370 sudo apt-get install libgl-dev 371 sudo apt-get install libgl1-mesa-dev 372 godi_console perform -build godi-lablgtk2 380 sudo apt-get install libglu1-mesa-dev 381 godi_console perform -build godi-lablgtk2 384 sudo apt-get install freeglut3-dev 385 godi_console perform -build godi-lablgtk2 391 sudo apt-get install tcl-dev 392 godi_console perform -build godi-lablgtk2 394 sudo apt-get install tk-dev 395 godi_console perform -build godi-lablgtk2 398 sudo apt-get install libgtkgl2.0-dev 403 godi_console perform -build godi-lablgtk2 404 ./configure --prefix=/home/chris/tools/builds/why 408 make 409 make install
August 27, 2008
Portrait of Frustration in Eighteen Commands: An Abbreviated History
June 29, 2008
OCaml’s Unix module and ARGV
Be warned: the string array
argument to Unix.create_process
et al. represents the entire argument vector: the first element should be the command name. I didn’t expect this, since there is a separate prog
argument to create_process
, and ended up with weird behavior* like,
# open Unix;; # create_process "sleep" [|"10"|] stdin stdout stderr;; 10: missing operand Try `10 --help' for more information. - : int = 22513
This can be a bit insidious—in many cases skipping the first argument will only subtly change the behavior of the child process.
Note that the prog
argument is what matters in terms of invoking the sub-process—the first element of the argument vector is what just what is passed into the process. Hence,
# create_process "gcc" [|"foo";"--version"|] stdin stdout stderr;; - : int = 24364 foo (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
* Actually, this “weird behavior” is the test that finally made me realize what was going on. The emergent behavior of my app was much more mysterious…
January 24, 2008
The Triumphant Return of C-c C-t
The upgrade to Ubuntu gutsy and/or Emacs 22 broke my favorite feature of tuareg/ocaml-mode: C-c C-t
for “show type” in OCaml buffers (this requires compiling with -dtypes
, which generates type annotation files). I suffered without this for a length of time which is either embarrassing or impressive, depending on whether you consider poking around inside Emacs Lisp files a productive or unproductive use of time…
I finally broke down and fixed it today. The problem is simply that Emacs and OCaml packages aren’t cooperating properly. My solution, which may or may not be optimal, is as follows:
- Copy the directory
/usr/share/emacs/site-lisp/ocaml-mode
to a path of your choosing, say~/.emacs.d/emacs22/ocaml-mode
. Let’s call this directoryDIR
- (Optional) In Emacs 22, execute
C-u 0 M-x byte-recompile-directory
and chooseDIR
. - Add the following line to your
.emacs
file:(or (< emacs-major-version 22) (push "DIR" load-path))
The test for whether it worked is: load a .ml
file and type C-c C-t
. In the mini-buffer, you’ll either see “type: ...
“; “Point is not within a typechecked expression or pattern
“; or “No annotation file...
” If it says “C-c C-t is undefined
“, then you have failed.
May 23, 2007
The Annals of OCaml Compiler Errors
X is not a compilation unit description.
X
is not a file type that the compiler expected to receive as input. For example, X
is a .cma
file and you’r running the compiler with the -a
flag—.cma
files are only expected on the final link; linking a library into a library doesn’t make any sense.
April 25, 2007
Profiling OCaml… Revealed!
Profiling OCaml code is kind of a hassle. The simplest thing is to use ocamlopt
with the -p
option, then apply gprof
as usual. The problem here is that the debugging symbols produced by the OCaml compiler are of limited usefulness. For example, fun
expressions show up with names like camlModule__fun_2397
(where 2397
has nothing to do with anything) and, I think, continuation-passing transformations in the back-end can lead to confusing call graph relationships where functions that shouldn’t be compute-intensive at all end up looking like hot spots.
Now, you may think this is all due to the conversion to C calling conventions and the corresponding loss of high-level information at execution time and therefore the solution would be to profile bytecode. So you might try to compile with ocamlcp
, the profiling bytecode compiler. Along the way, you’ll figure out that ocamlcp
doesn’t allow the -pp
option… No problem—if your project is sufficiently small or your Makefiles are sufficiently modular—you can just run the preprocessor separately and pass the preprocessed files in to ocamlcp
(just add pr_o.cmo
to your camlp4
command, to dump the pretty-printed version of your code instead of the AST object).*
Then you’ll discover** that what ocamlprof
gives you is not a data dump like the output of gprof
, but a source file annotated with execution counts for each expression. And you’ll realize that this is in some ways useless—you really need time information to do effective profiling. For example, the polymorphic equality function (that’s, um, =
for you non-functional programming types) is going to have a massive execution count in just about any program you write; that doesn’t mean you need to rip it out and hot-rod it.***
Now, here’s where I made an interesting discovery: the byte- and native-code compilers seemingly dismantle the source code in similar or identical ways. You can take the execution count for an anonymous fun
expression from the gprof
output and match it up with the execution count on the source expression from ocamlprof
.
Here’s an example. gprof
tells me the following:
Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 28.57 0.08 0.08 1064344 0.00 0.00 compare_val 17.86 0.13 0.05 82370 0.00 0.00 camlAtp__itlist_116 10.71 0.16 0.03 2284937 0.00 0.00 caml_apply2 7.14 0.18 0.02 1657397 0.00 0.00 camlMlss__fun_1052
The far left column tells you what percentage of the execution time was spent in the function named on the far right. The column in the middle tells you how many time the function was called. The first three rows name built-in and generic functions—it’s not surprising that the program spends a lot of time comparing things, iterating over lists, and invoking functions. The fourth row names camlMlss__fun_1052
, an anonymous function, which accounts for 7.14% of running time. Where is this function?
gprof
outputs the following call graph information:
-----------------------------------------------
28217 camlMlss__fun_1046 [167]
1629180 camlAtp__itlist_116 [11]
[16] 12.0 0.02 0.01 1657397 camlMlss__fun_1052 [16]
In other words, camlMlss__fun_1052
is called from some other anonymous function and from a generic list iterator. That’s not very helpful.
But if we go over to the output of ocamlprof
, we find this:
let saturation_rule2 f terms theta = (* 546 *) itlist (fun fm r -> (* 28217 *) try let g = f fm in itlist (fun fm r -> (* 1657397 *) try g fm @ r with Match_failure _ -> r) theta r with Match_failure _ -> r) theta []
The numbers in comments are invocation counts. The innermost fun
expression is called 1,657,397 times. Does that number look familiar? Notice also that the next enclosing fun
expression is called 28,217 times, which is exactly the number of calls attributed to the anonymous parent of camlMlss__fun_1052
in the gprof
call graph data. We have found our hot (er, warm-ish) spot!
I’m not sure how reliably this works in general. The cited results were obtained by running ocamlcp
with the -g
and -p f
arguments. It might be fun, if one could find some spare time, to write a utility that used the ocamlcp
output to annotate the gprof
data with (probable) line numbers.
* Note that the pre-processed file must have a .ml
extension or the OCaml compiler will refuse to have anything to do with it. Note also that foo-pp.ml
is not an option, because the filename must be a valid module identifer when the first letter is capitalized (i.e., your canonical [A-Za-z][A-Za-z0-9_]*
identifier).****
** We assume throughout that you are a foolish person like me: that you only read the documentation for such things far enough to get them running and are consequently constantly surprised by what programs actually do, since you assume that they ought to do what they seem to be intended to do.
*** Although you may run into trouble if you use polymorphic equality on big, complicated (or, gasp, cyclic) data structures.
**** Note to self: is there a reason the OCaml convention is to use lowercase file names when the module system implicitly capitalizes it and for use as a module identifier and a capitalized file name is also accepted by the compiler? I.e., why don’t we match the case of file names and implicit module declarations? Oh me, oh my, why, why, why?