Procrastiblog

June 29, 2008

OCaml’s Unix module and ARGV

Filed under: OCaml, Tech — Chris @ 9:10 pm

Be warned: the string array argument to Unix.create_process et al. represents the entire argument vector: the first element should be the command name. I didn’t expect this, since there is a separate prog argument to create_process, and ended up with weird behavior* like,

# open Unix;;
# create_process "sleep" [|"10"|] stdin stdout stderr;;
10: missing operand
Try `10 --help' for more information.
- : int = 22513

This can be a bit insidious: in many cases skipping the first argument will only subtly change the behavior of the child process. The call I actually wanted was create_process "sleep" [|"sleep";"10"|] stdin stdout stderr.

Note that the prog argument is what matters in terms of invoking the sub-process; the first element of the argument vector is just what is passed into the process as its own name (argv[0]). Hence,

# create_process "gcc" [|"foo";"--version"|] stdin stdout stderr;;
- : int = 24364
foo (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
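
This is not really an OCaml quirk: the split between the program to run and argv[0] comes from the underlying exec call, so it shows up in other languages too. As a rough cross-check, here is a minimal Python sketch (Python only so it can be run as-is; the helper name argv0_seen_by_child is made up for illustration, and it assumes a POSIX system with a shell at /bin/sh). Python's subprocess.run takes the whole argument vector as its first argument, and its optional executable parameter plays the role of prog:

```python
import subprocess


def argv0_seen_by_child(argv0):
    """Run /bin/sh under a made-up argv[0] and return what the shell sees as $0."""
    result = subprocess.run(
        # The full argument vector: element 0 is the name the child is told it has.
        [argv0, "-c", 'printf "%s" "$0"'],
        # The program that actually gets executed, independent of argv[0].
        executable="/bin/sh",
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(argv0_seen_by_child("foo"))
```

Calling argv0_seen_by_child("foo") should give back "foo": the program that runs is /bin/sh, but it believes it is called foo, just as gcc did above.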

* Actually, this “weird behavior” is the test that finally made me realize what was going on. The emergent behavior of my app was much more mysterious…

June 24, 2008

Resetting a Terminal

Filed under: Linux, Tech — Chris @ 2:04 pm

You tried to cat a binary file and now your terminal displays nothing but gibberish? Just type reset (it may look like ⎼␊⎽␊├).

It has taken me more than 10 years to learn this.

[UPDATE] Interestingly, this doesn’t work in my (alas, ancient) Mac OS X 10.3.9 terminal. Any tips? Also, why did curl URL_TO_BINARY hose my terminal in the first place?

June 23, 2008

Ripping a Muxtape

Filed under: Tech — Chris @ 2:50 pm

So Muxtape is a pretty cool site, but a little frustrating. If a friend posts a really cool mixtape (maybe you know somebody who just barely entered the Aughties), it would be nice to be able to download it and save it, just like all those old cassette mixtapes sentimentally rotting underneath your bed.

Enter muxrip. This simple Ruby script takes the name of the mixtape, downloads it, and creates a playlist for you in M3U or iTunes format. (Acknowledgments: the script basically just adds some polish to this previous effort.)

PLEASE: Use this script responsibly. It would be a shame for Muxtape to get shut down.

ALSO: I wouldn’t be surprised if this suddenly stopped working. It depends on elements of the page layout and URL scheme that might (almost certainly will) change without notice.

June 21, 2008

An Open Letter to eMusic

Filed under: Music, Not Tech — Chris @ 5:08 pm

I regret to inform you I am canceling my eMusic subscription,
effective immediately. Although I admire the fact that you have
provided DRM-free music downloads since the pre-Napster era and try my
best to support small, independent businesses, my dissatisfaction with
your service has been too great for too long and the convenience and
selection offered by your competitors (e.g., Amazon’s MP3 store) is
too good to pass up. It pains me to see big players like Amazon and
Apple push companies like eMusic out of business, but if you are to
survive, you will have to be more innovative and customer-focused than
you have been in the time that I have subscribed. I hope that you will
re-think your business model, increase the value of your product, and
win me back as a customer in the future.

In that spirit, I want to offer some specific advice about how your
service could improve.

– Your site provides almost no information about what albums will be
available when. So far as I can tell, the only information provided
is a small “Coming Soon” box with no more than 8 artists—often
just the names of the artists without release dates—in the bottom
corner of the “New on eMusic” page. Albums that have been released
and are available for download elsewhere are not acknowledged on
the artist page, not even to say “this album will be available
soon.” For example, Sloan’s “Parallel Play” has been available on
Amazon since June 10. As of June 21, I can find no information on
your site about whether this album will ever be available, even
though you offer all of Sloan’s previous albums on the same label.

– If I want to download an album with more tracks than I have in my
monthly subscription, a pop-up asks me if I want to upgrade my
subscription (i.e., to permanently increase my monthly fee and
download allotment). Although there are “Booster Packs” allowing
the one-time download of 10 or 20 tracks, this option is not
presented in the pop-up, nor in the page presented when one clicks
on “More Options”—only a savvy and determined user will find
them. The Booster Packs should not only be made easily available at
this point, there should be an additional option that you do not
provide: to download as many tracks as I have available within my
subscription and queue up the remaining tracks for download when my
account refreshes. This doesn’t have to be the first option
presented; I understand the desire to nudge your users towards
spending more money on the site, but it should be available
(and one should not cross the line from nudging your customers to
misleading them and ripping them off).

These two points may seem inconsequential, but they have been a
constant source of annoyance for me. It is small matters like these
that build a customer relationship that survives a spotty selection
and waiting for the latest indie hits.

Best regards,
Chris

June 19, 2008

Proof that H Really Did Finish Her Ph.D.

Filed under: Not Tech — Chris @ 6:51 pm

[Photo: DSC02213.JPG]

June 8, 2008

Tweaking an RSS Feed in Python

Filed under: Python, Tech — Chris @ 8:00 pm

I’ve been teaching myself a bit of Python by the just-in-time learning method: start programming, wait for the interpreter to complain, and go check the reference manual; keep the API docs on your hard disk and sift through them when you need a probably-existing function. Recently, I wanted to write a very simple script to manipulate some XML (see below) and I was surprised (though it has been noted before) at the relatively confused state of the art in Python and XML.

First of all, the Python XML API documentation is more or less “go read the W3C standards.” Which is fine, but… make the easy stuff easy, people.

Secondly, the supposedly-standard PyXML library has been deprecated in some form or fashion such that some of the examples from the tutorial I was working with have stopped working (in particular, the xml.dom.ext module has gone somewhere. Where, I do not know).

So, in the interest of producing more and better code samples for future lazy programmers, here’s how I managed to solve my little problem.

The Problem: Twitter’s RSS feeds don’t provide clickable links.

The Solution: A script suitable for use as a “conversion filter” in Liferea (and maybe other feed readers too, who knows?). The script should:

  1. Read and parse an RSS/Atom feed from the standard input.
  2. Grab the text from the feed items and “linkify” it.
  3. Print the modified feed on the standard output.

Easy, right? Well, yeah. The only tricky bit was using the right namespace references for the Atom feed, but again that’s only because I refuse to read and comprehend the W3C specs for something so insignificant. I ended up using the lxml library, because it worked. (The script would be about 50% shorter if I hadn’t added a command-line option --strip-username to strip the username from the beginning of items in a single-user feed and a third shorter than that if it only handled RSS or Atom and not both.)
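
To make the namespace point concrete, here is a small sketch using the standard library's xml.etree.ElementTree instead of lxml (a swap made only so the example has no dependencies; the sample feed and the function name entry_contents are invented for illustration). Atom elements live in the http://www.w3.org/2005/Atom namespace, so a path has to name that namespace through a prefix mapping; the same path without prefixes matches nothing.

```python
import xml.etree.ElementTree as ET

# The prefix "n" is arbitrary; only the namespace URI has to match the feed.
ATOM_NS = {"n": "http://www.w3.org/2005/Atom"}

# A made-up, minimal Atom feed for illustration.
SAMPLE_ATOM = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>one</title><content>first post</content></entry>
  <entry><title>two</title><content>second post</content></entry>
</feed>"""


def entry_contents(xml_text):
    """Return the text of every Atom <content> element in the feed."""
    root = ET.fromstring(xml_text)  # root is the <feed> element
    return [node.text for node in root.findall("n:entry/n:content", ATOM_NS)]


if __name__ == "__main__":
    print(entry_contents(SAMPLE_ATOM))
    # Without the namespace prefix, the very same path finds no elements.
    print(ET.fromstring(SAMPLE_ATOM).findall("entry/content"))
```

RSS 2.0 elements such as item and description are not namespaced, which is why the RSS paths need no prefix mapping.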

Here’s the code, in toto. (You can download it here.)

#!/usr/bin/env python

import sys
from optparse import OptionParser
from re import sub

from lxml import etree

ATOM_NS = {'n': 'http://www.w3.org/2005/Atom'}

# Read and write bytes so lxml can honor the feed's own encoding
# declaration (on Python 3 the byte streams are sys.stdin.buffer and
# sys.stdout.buffer; on Python 2 the plain streams are already bytes).
stdin = getattr(sys.stdin, "buffer", sys.stdin)
stdout = getattr(sys.stdout, "buffer", sys.stdout)

doc = etree.parse(stdin)

def addlinks(path, namespaces=None):
    for node in doc.xpath(path, namespaces=namespaces):
        if node.text is None:
            continue  # empty element, nothing to linkify
        # Turn URLs into HREFs
        node.text = sub(r"((https?|s?ftp|ssh)://[^\"\s<>]*[^.,;'\">:\s<>)\]!])",
                        r'<a href="\1">\1</a>',
                        node.text)
        # Turn @ refs into links to the user page
        node.text = sub(r"\B@([_a-z0-9]+)",
                        r'@<a href="http://twitter.com/\1">\1</a>',
                        node.text)

def stripuser(path, namespaces=None):
    for node in doc.xpath(path, namespaces=namespaces):
        if node.text is None:
            continue  # empty element, nothing to strip
        node.text = sub(r"^[A-Za-z0-9_]+:\s*", "", node.text)

# The feed arrives on standard input; there are no positional arguments.
parser = OptionParser(usage="%prog [options] < FEED")
parser.add_option("-s", "--strip-username",
                  action="store_true",
                  dest="strip_username",
                  default=False,
                  help="Strip the username from item title and description")
(opts, args) = parser.parse_args()

# For RSS feeds
addlinks("//rss/channel/item/description")
# For Atom feeds
addlinks("//n:feed/n:entry/n:content", ATOM_NS)

if opts.strip_username:
    # RSS title/description
    stripuser("//rss/channel/item/title")
    stripuser("//rss/channel/item/description")
    # Atom title/description
    stripuser("//n:feed/n:entry/n:title", namespaces=ATOM_NS)
    stripuser("//n:feed/n:entry/n:content", namespaces=ATOM_NS)

doc.write(stdout)

If there are any Python programmers in the audience and I’m doing something stupid or terribly non-idiomatic, I’d be glad to know.

Thanks in part to Alan H whose Yahoo Pipe was almost good enough (it doesn’t handle authenticated feeds, as far as I can tell) and from whom I ripped off the regular expressions.
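
For anyone who wants to poke at those regular expressions without a feed or lxml, here is a standalone sketch of the two substitutions. The function name linkify and the sample text are mine, made up for illustration; the patterns are meant to be equivalent to the ones in the script, written as raw strings and minus some unnecessary backslashes.

```python
import re

# A URL: scheme, then anything up to whitespace/quote/angle bracket,
# ending on a character that is not trailing punctuation.
URL_RE = re.compile(r"((https?|s?ftp|ssh)://[^\"\s<>]*[^.,;'\">:\s<>)\]!])")

# An @ reference that does not directly follow a word character,
# so e-mail addresses are left alone.
MENTION_RE = re.compile(r"\B@([_a-z0-9]+)")


def linkify(text):
    """Wrap URLs and @ references in anchor tags, URLs first."""
    # URLs go first: the mention replacement inserts a twitter.com URL of
    # its own, which must not be linkified a second time.
    text = URL_RE.sub(r'<a href="\1">\1</a>', text)
    text = MENTION_RE.sub(r'@<a href="http://twitter.com/\1">\1</a>', text)
    return text


if __name__ == "__main__":
    print(linkify("hi @bob see http://example.com/x."))
```

Note that the sentence-ending period stays outside the link, and that something like foo@example.com is left untouched because the @ follows a word character.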

[UPDATE] Script changed per first commenter.

Top Chef and BSG Catch-Up

Filed under: Battlestar Galactica, Not Tech, Top Chef, TV — Chris @ 4:18 pm

I have been remiss in blogging Top Chef and Battlestar Galactica this year. Suffice it to say I’m watching and enjoying, but my ardor for both has somewhat dimmed.

Unlike previous seasons of Top Chef, I don’t have a real rooting interest in any of the cheftestants this year. If I were forced to choose I would guess Richard is probably going to win (he’s about as well-liked as Stephanie and more consistent). I—along with the rest of the world—loathe Lisa, but she’s just kind of a bad trip, not really a boo-hiss, lie-to-your-face villain in the Tiffani/Omarosa mold. An interesting bit of data, for those Lisa-haters who suspect they are suffering from an irrational aversion to her attitude, looks, and posture: she has—by far—the worst record of any cheftestant to appear in a Top Chef finale (1 Elimination win, 1 place, no Quickfire wins; she has been up for elimination or on the losing team in the last seven consecutive episodes (!)). Incidentally, Richard (3 Elimination wins, 5 places, and 2 Quickfire wins) and Stephanie (4 Elimination wins, 5 places, and 1 Quickfire win) have by far the best records of any previous cheftestant, period. (In comparison, the previous three winners (Harold, Ilan, and Hung) had only 4 Elimination wins total.)

On the other side, BSG has been doing a lot of the mythical flim-flam (I don’t really care where Earth is or whether they ever find it) and not so much of the intense post-9/11 fractured-mirror business that made the first three seasons so addictive. The characters have been getting pushed around the chessboard willy-nilly without much attention paid to consistency or plausibility (to wit: President Lee Adama), all in service of a presumed “mind-blowing” series finale (to arrive not before calendar year 2009, as I understand it) that I am quite certain will disappoint (I’m not going to be X-Files‘ed ever again).

So there’s your TV-blogging for the year. Back to work.

April 30, 2008

Linux Quickies

Filed under: Emacs, Linux, Tech — Chris @ 8:31 pm


The upgrade from Ubuntu Gutsy to Hardy Heron (cool logo, right?) was relatively uneventful. Some minor points…

  • I always thought the main Ubuntu servers would farm my downloads off to an appropriate mirror, but apparently that’s not the case. You’re likely to get better download times if you choose a mirror in System -> Administration -> Software Sources. If you choose “Other…”, there’s a “Select Best Server” feature. Oddly, my best response times were from New Zealand… maybe because they were all asleep when I tried it.
  • The “ugly fix” for the infamous hard disk annihilating bug stopped working after I upgraded. This new, different (but still ugly) fix worked for me. It would be really great if the Ubuntu team could find a way to make the OS stop trying to kill my hard disk by default.
  • My WiFi light stopped working after the upgrade. This is very easily fixed by installing the package linux-backports-modules-hardy.
  • etckeeper is a great idea: it puts all the config files in /etc under Git, Mercurial, or Bazaar source control and forces APT to commit before and after any upgrade, so it’s easy to isolate and revert changes. (As a side note, using Bazaar for a few weeks makes it physically painful to be forced to deal with CVS.)
  • Anti-aliased fonts in Emacs are really nice. On Ubuntu Hardy, install emacs-snapshot-gtk (on prior releases, download “Pretty Emacs”), then run emacs-snapshot instead of emacs (or run update-alternatives to set emacs-snapshot as the default). You should then be able to run, e.g., emacs --font "Monospace-10" and get pretty, pretty (lick-able, as they say) fonts. Other reasonable choices are "BitstreamVeraSansMono-X" or "LiberationMono-X", where X is your desired point size. You can also invoke M-x set-default-font and type your choice interactively, but for some reason the TrueType fonts above won’t tab-complete. If you type a non-existent font, Emacs will silently use the default system fixed-width font (see System -> Preferences -> Appearance -> Fonts). I’ve added the following to my .emacs:

    (if (>= emacs-major-version 23)
        (set-default-font "Monospace-10"))

    (The conditional is necessary if you may come into contact with earlier versions of Emacs, which will barf on TrueType fonts.)

  • In my experience, the fonts in your web browser will look better if you don’t use Microsoft’s gratis TrueType core fonts (package msttcorefonts in Ubuntu/Debian). In particular, the Trebuchet font (which crops up frequently, including at the top of this page) tends to look pretty bad with subpixel rendering turned on. Red Hat’s Liberation fonts (package ttf-liberation) are designed as drop-in replacements for the Microsoft fonts, but I haven’t seen much value in installing them.
  • The instructions I gave last month for hooking up to a projector aren’t complete, because they often won’t let you run the projector at a resolution greater than 640×480. This led to a rather embarrassing scene in front of a class of undergraduates, where OpenOffice.org simply refused to operate at such a pathetic resolution. This problem can be solved by the methods presented here, though it requires a bit of tweaking to get things just so. I haven’t yet discovered a minimal solution; first I need to crack the meaning of the X11 “MetaModes” option. When I do, you’ll be the first to know.

April 15, 2008

Only Thus Can It Be Unmade

Filed under: Linux, Tech — Chris @ 3:08 pm

The cleverer among you will espy the problem below immediately:

$ export DATE=`date`
$ echo $(DATE)
bash: DATE: command not found

In my half-caffeinated state, it took several minutes of frustration to figure out what was wrong: $(DATE) is a Make-style variable; in Bash, $(DATE) is the same as `DATE` (a command substitution). The correct token is $DATE.

$ echo $DATE
Tue Apr 15 11:08:38 EDT 2008

I apologize for inflicting my stupidity upon you.

April 3, 2008

On the Subject of Dementia

Filed under: Not Tech, Politics, YouTube — Chris @ 10:35 pm

Ladies and gentlemen, I give you Mike Gravel, former Democratic and current Libertarian candidate for president. (Via Matthew Yglesias, who needs the traffic.)
