hpr4607 :: UNIX Curio #3 - basename and dirname

Pulling apart filenames

Hosted by Vance on Tuesday, 2026-03-31. Flagged as Clean and released under a CC-BY-SA license.
Tags: unix curio, unix, filenames.

Duration: 00:13:08

This series is dedicated to exploring little-known—and occasionally useful—trinkets lurking in the dusty corners of UNIX-like operating systems.

Hopefully it doesn't seem like I'm picking on Linux Journal, but like UNIX Curio #1 (HPR4587), this column was inspired by an article of theirs [1]. The author was demonstrating a clever bash script that would take a filename and send the file to standard output or, if the filename ended in .gz, decompress it and send the result to standard output. Slightly rearranged, he had:

F=`echo $1 | perl -pe 's/\.gz$//'`
if [[ -f $F ]] ; then
  cat $F
elif [[ -f $F.gz ]] ; then
  gunzip -c $F
fi

He took some heat on the web site and in letters to the magazine for cranking up a whole Perl interpreter just to chop the .gz off the end of a filename. Our curio for today is a standard UNIX utility made for just this purpose, called basename [2]. Along with its brother dirname [3], it is used to pull apart pathnames to get the part you want. basename removes any leading directory path from the name given to it and, if a suffix is also specified, removes that as well. If a directory path with a trailing slash is given, it returns the last component without the slash. Here are some examples:

$ basename /bin/gzip
gzip
$ basename /bin/gzip .so
gzip
$ basename /usr/lib/libz.so .so
libz
$ basename /usr/lib/
lib

The counterpart, dirname, does essentially the opposite. It removes the last part of the pathname and returns a directory name (with no trailing slash):

$ dirname /usr/lib/libz.so
/usr/lib
$ dirname /usr/lib/
/usr
$ dirname file_in_this_dir
.
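
Taken together, the two utilities split a pathname into complementary halves. Here is a minimal sketch of that idea; split_path is a hypothetical helper name, and the quoting is an addition so that names containing spaces survive as a single argument:

```shell
# split_path is a hypothetical helper, not a standard utility.
split_path() {
  # Quoting "$1" keeps a name containing spaces as one argument.
  printf 'dir=%s base=%s\n' "$(dirname "$1")" "$(basename "$1")"
}

split_path /usr/lib/libz.so     # prints: dir=/usr/lib base=libz.so
split_path "/tmp/my notes.txt"  # prints: dir=/tmp base=my notes.txt
split_path /usr/lib/            # prints: dir=/usr base=lib
```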

So we can replace the first line of the script up top with F=`dirname $1`/`basename $1 .gz`, end up naming the same file (a bare name such as data.gz comes back as ./data), and be sure it will work on any UNIX-like system, no Perl necessary. The more observant among you may be thinking "sed could do that, too!" and you're right; F=`echo $1 | sed 's/\.gz$//'` also would work anywhere.
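
Put back into the whole script, the dirname/basename version might look like the sketch below. The function names stem_path and show_file are hypothetical, the variables are quoted, and the bash-only [[ ]] is swapped for the POSIX [ ] test so the sketch also runs under a plain sh; none of that comes from the original article.

```shell
# stem_path and show_file are hypothetical names for this sketch.

# Print the path with a trailing .gz removed, directory part kept.
stem_path() {
  printf '%s/%s\n' "$(dirname "$1")" "$(basename "$1" .gz)"
}

# Send a file to standard output, decompressing if only the .gz exists.
show_file() {
  F=$(stem_path "$1")
  if [ -f "$F" ]; then
    cat "$F"
  elif [ -f "$F.gz" ]; then
    gunzip -c "$F.gz"
  fi
}

stem_path /var/log/messages.gz   # prints: /var/log/messages
stem_path notes.txt.gz           # prints: ./notes.txt
```

Calling show_file with either notes.txt or notes.txt.gz should then print the same contents, whichever of the two files actually exists.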

One might suspect that as a general-purpose text processor, sed would be slower than basename and dirname. To see how they compared, we ran each method against a randomly generated list of 5,000 filenames. Turns out the critics were right, as Perl ran the longest at 59 seconds. Using basename and dirname took 44 seconds, a nice improvement, but sed blew past it at 34 seconds. Probably the fact that only one call to sed was needed versus two for basename and dirname made the difference.
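
The original test harness is not shown, so the following is only a rough sketch of how such a comparison could be set up. The helper names, the made-up /tmp/fileN.gz names, and the small loop count are all assumptions, and it makes no attempt to reproduce the numbers above.

```shell
# Hypothetical helpers: one function per method from the text.
via_basename() {
  printf '%s\n' "$(dirname "$1")/$(basename "$1" .gz)"
}
via_sed() {
  printf '%s\n' "$1" | sed 's/\.gz$//'
}

# run_all applies one method to N made-up names and discards the output,
# so that only the cost of starting the external utilities remains.
run_all() {
  method=$1
  n=$2
  i=0
  while [ "$i" -lt "$n" ]; do
    "$method" "/tmp/file$i.gz" >/dev/null
    i=$((i + 1))
  done
}

run_all via_basename 50
run_all via_sed 50
```

In bash or zsh, where time is a shell keyword, something like `time run_all via_sed 5000` would report the elapsed time for one method.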

Helpful suggestions in response to the article revealed a shell curio. You may have seen the brace syntax for parameters. For example, to show a filename $F with an "X" appended, you can't use echo $FX because that means a parameter named FX. Instead, you'd use echo ${F}X and the shell only interprets what's inside the braces as the parameter name.
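
A small illustration of the difference, with both values made up for the example:

```shell
F=report
FX='a different parameter'

echo "$FX"     # prints: a different parameter
echo "${F}X"   # prints: reportX
```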

Modifiers can also go inside the braces [4] and one of these, %, is just what we need to chop off that extension. This works in bash, zsh, and any shell conforming to the current POSIX standard, but not csh and friends or older implementations of the Bourne shell. We can rewrite the first line of the original script as simply F=${1%.gz} and forgo any outside utilities. Performance? Under half a second to process those 5,000 filenames. Not bad at all.
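
To see what the % modifier does to a few names, here is a minimal sketch; strip_gz is a hypothetical function name wrapped around the same expansion:

```shell
# strip_gz is a hypothetical wrapper around the ${1%.gz} expansion.
strip_gz() {
  # % removes the shortest trailing match of the pattern ".gz";
  # a name without that suffix is left unchanged.
  printf '%s\n' "${1%.gz}"
}

strip_gz /var/log/messages.gz   # prints: /var/log/messages
strip_gz /var/log/messages      # prints: /var/log/messages
strip_gz archive.gz.gz          # prints: archive.gz
```

The text after % is a shell pattern rather than a regular expression, so the dot is matched literally and needs no backslash, unlike in the sed version.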

References:

  1. Treating Compressed and Uncompressed Data Sources the Same https://www.linuxjournal.com/content/treating-compressed-and-uncompressed-data-sources-same
  2. Basename specification https://pubs.opengroup.org/onlinepubs/009695399/utilities/basename.html
  3. Dirname specification https://pubs.opengroup.org/onlinepubs/009695399/utilities/dirname.html
  4. Shell Command Language: Parameter Expansion https://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_02

This article was originally written in July 2010. The podcast episode was recorded in March 2026.

