-e to execute from command line

September 23rd, 2009

One of the most useful tools in the unix world is the command line. Often you may get a request like:

Give me a list of all the unique agencies specified in all the XML files in a given directory tree, where an agency is shown in an element <agency id="IJKXYZ">

This can be easily achieved (for pretty-printed XML documents, with each element on its own line) with a variety of unix tools, like so:

find . -name '*.xml' | xargs grep 'agency.id' | awk -F '"' '{print $2}' | sort | uniq -c

This finds all the *.xml files under the current directory, uses xargs to run grep over that list looking for the agency.id pattern (the ‘.’ matches any single character, in this case the space), uses awk, splitting on double quotes, to print out the IDs, then sorts them and runs them through uniq with the -c flag to get the number of occurrences of each agency ID. Yes, there are no doubt other ways of doing this, and different sorts of optimisations, but this is just the way my brain thinks and wires these commands together.
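To see each stage at work, here is a minimal sketch that runs the pipeline against a throwaway directory. The sample files and agency IDs are made up for illustration, find is given an explicit -name test, and grep's -h flag (supported by GNU and BSD grep) is added so file names are left out of the output.

```shell
# Build a throwaway tree with two pretty-printed XML files (made-up sample data).
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf '<booking>\n  <agency id="IJKXYZ">\n</booking>\n' > "$demo/a.xml"
printf '<booking>\n  <agency id="IJKXYZ">\n  <agency id="ABCDEF">\n</booking>\n' > "$demo/sub/b.xml"

# Stage 1: find lists the XML files, including those in subdirectories.
# Stage 2: grep keeps only the agency lines (-h drops the file name prefix).
# Stage 3: awk prints the text between the first pair of double quotes.
# Stage 4: sort groups identical IDs so that uniq -c can count them.
counts=$(find "$demo" -name '*.xml' | xargs grep -h 'agency.id' \
  | awk -F '"' '{print $2}' | sort | uniq -c)
printf '%s\n' "$counts"

# Clean up the sample data.
rm -rf "$demo"
```

With this sample data the result lists ABCDEF once and IJKXYZ twice.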

As already mentioned, this will only work if the XML is pretty printed, not if all the white space has been removed. There may also be many cases where the request is more complicated, but it still feels like opening up a text editor to write a program should not be necessary. This is where the -e option of many scripting languages comes in: it lets you run code in that language straight from the command line.
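For the simple case where several elements share a line, one possible workaround stays within the unix tools. This is only a sketch: count_agencies is a hypothetical helper name, and it assumes a grep that supports -o (print only the matching part), which GNU and BSD grep do but POSIX does not require.

```shell
# Hypothetical helper: count agency IDs even when several elements share one line.
# Assumes grep supports -o (print only the match) and -h (no file name prefix).
count_agencies() {
  find "${1:-.}" -name '*.xml' -exec grep -oh 'agency id="[^"]*"' {} + \
    | awk -F '"' '{print $2}' | sort | uniq -c
}

# Usage: count_agencies /path/to/xml/tree   (defaults to the current directory)
```

This is still text matching rather than XML parsing, so attributes in a different order or single-quoted values would be missed. Anything more involved than this is where a scripting language starts to pay off.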

The basic hello world in a few such languages:

groovy -e "println 'hi'"
perl -e 'print "hi\n"'
ruby -e 'puts "hi\n"'
jruby -e 'puts "hi\n"'

As a more complex example, you may want to find out the day of the week for a given date. This can be done like so:

$ groovy -e 'println Date.parse("MM/dd/yyyy", "12/13/1974").format("EEEE")'
Friday

Of course, again there may be a unix command out there which is more concise, or which you are more familiar with, in which case great, use that instead. If, on the other hand, the code solution comes to you immediately, then why not use it? If this is the way your brain is wired, and you are more competent in your programming language of choice, then why go reading man pages when you can do this simply with a scripting language and the -e flag?

Counter to that argument, I had a snippet in PowerShell, the Microsoft Windows shell language, for finding the day of the week for a given date, and it goes like this:

PS> (get-date "12/13/1974").DayOfWeek

Very concise indeed.
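On the unix side, the date command can be about as short. The sketch below assumes GNU date, whose -d option parses a date string; BSD and macOS date do not accept this form. The function name weekday_of is made up for illustration.

```shell
# Hypothetical helper: print the weekday name for a date given as YYYY-MM-DD.
# Assumes GNU date (-d parses the date string); LC_ALL=C forces English day names.
weekday_of() {
  LC_ALL=C date -d "$1" +%A
}

weekday_of 1974-12-13   # prints Friday
```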

Of course, the real power of scripting languages shows when you use them in conjunction with pipes and unix tools. In this case I want a histogram of how many files I modified on each day of the week in a given directory.

groovy -e 'new File(".").eachFile{file -> println new Date(file.lastModified()).format("EEEE")}' \
 | sort | uniq -c | sort -rn
 
19 Monday
15 Tuesday
10 Wednesday
5 Friday
4 Thursday

Given that it was a work directory, seeing only work days is understandable. Also, as I am running this on a Wednesday morning, that may skew the “last modified” results if I have modified most of the files. Still, it seems that Thursday and Friday are the low points of the week.
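For comparison, roughly the same histogram can be sketched with shell tools alone. This assumes GNU date, where -r FILE reads a file's modification time, and weekday_histogram is a hypothetical helper name. Unlike the Groovy version, the shell glob skips hidden files.

```shell
# Hypothetical helper: weekday histogram of last-modified times in a directory.
# Assumes GNU date, where `-r FILE` reports the file's modification time.
weekday_histogram() {
  for f in "${1:-.}"/*; do
    [ -e "$f" ] || continue        # skip the unexpanded pattern in an empty directory
    LC_ALL=C date -r "$f" +%A      # weekday name, forced to English
  done | sort | uniq -c | sort -rn
}

# Usage: weekday_histogram /path/to/dir   (defaults to the current directory)
```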

Feel free to add how you use scripting languages from the command line in the comments. Do be careful when cutting and pasting any code samples, and make sure you know what they do before running them.

I will look at a few examples I use regularly in future posts, and I hope to see more people using scripting languages from the command line to solve their problems.