Archive for September, 2009

ruby from the command line, taking it too far

September 24th, 2009

In my last post I looked at using -e to run scripting languages directly from the command line. Just as I was wondering what would make a good Ruby example, a colleague at work scoffed at an image manipulation suggestion of mine which I claimed would take only half an hour to code. If something takes 30 minutes to code, then surely it can be run directly from the command line?

The problem is this: given a company logo of a particular aspect ratio, create a new logo with a different aspect ratio and fill out the extra space with the predominant (most used) color from the original logo. Of course there are other ways around this, like setting a transparent background or defaulting to something neutral like white, but in this case I suggested that using the predominant color of the logo would maintain the branding of the company logo.

A search on the internet for “color quantization”, “color count” and “color histogram” quickly directed me to ImageMagick and in this case RMagick for ruby bindings to the library. More info can be found at http://www.imagemagick.org/RMagick/doc/

As a starting point I simply required the library, instantiated an image and printed its width and height:

ruby -e 'require "rubygems"; require "RMagick"; \
img = Magick::ImageList.new("filename.gif"); puts img.columns; puts img.rows'
160
30

OK, having to require rubygems and RMagick, as well as using the package name Magick, starts making the command line a little unwieldy. Considering that up to this moment I had not done anything, it was going to get worse. First, some code to return the histogram of colors and how many pixels use each one, sorted by pixel count so that the first entry is the main color, the one used by the most pixels:

main_color = img.color_histogram.sort{|a,b| b[1] <=> a[1]}.first
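To see what that line is doing without needing ImageMagick installed, here is a minimal plain-Ruby sketch. It assumes the histogram behaves like an ordinary Hash of color => pixel count; the color names and counts are made up for illustration. It also shows max_by as a possible alternative that avoids sorting the whole histogram.

```ruby
# Stand-in for img.color_histogram: color => number of pixels.
# The colors and counts here are invented for illustration.
histogram = { "white" => 120, "red" => 4310, "black" => 370 }

# Same approach as in the post: sort the [color, count] pairs by count,
# descending, and take the first pair.
by_sort = histogram.sort { |a, b| b[1] <=> a[1] }.first

# Alternative: max_by picks the pair with the largest count in one pass.
by_max = histogram.max_by { |_color, count| count }

puts by_sort.inspect
puts by_max.inspect
```

Both give a [color, count] pair, which is why the later code uses main_color[0] to get at the color itself.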

Then create a new, larger image (in this case 2 x width and 6 x height), fill its background with the main color, create a drawing object, composite the original image onto it so that it is centered on the new larger image, and finally draw the composite on the new larger image and write it to disk:

l_img = Magick::Image.new(img.columns*2, img.rows*6) {
  self.background_color = main_color[0]
};
gc = Magick::Draw.new;
gc.composite((l_img.columns - img.columns)/2,(l_img.rows - img.rows)/2,0,0,img);
gc.draw(l_img);
l_img.write("large_file.gif")
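The centering arithmetic in the composite call is easy to check on its own. This is a small sketch in plain Ruby; centered_offset is a hypothetical helper name, and the 160 x 30 size is the one printed for filename.gif earlier.

```ruby
# Hypothetical helper: top-left corner at which to place an image so that it
# sits centered on a larger canvas (integer division, as in the post's code).
def centered_offset(canvas_w, canvas_h, img_w, img_h)
  [(canvas_w - img_w) / 2, (canvas_h - img_h) / 2]
end

img_w, img_h = 160, 30        # size reported for filename.gif earlier
canvas_w = img_w * 2          # 2 x width
canvas_h = img_h * 6          # 6 x height

x, y = centered_offset(canvas_w, canvas_h, img_w, img_h)
puts "canvas #{canvas_w}x#{canvas_h}, logo placed at (#{x}, #{y})"
```

For that logo the canvas is 320 x 180 and the original lands at (80, 75), leaving equal margins on each side.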

To make the work worthwhile, wrap it in a find to look for all GIFs in a given directory tree and create a new large_<filename>.gif image for each:

ruby -e ' \
require "rubygems"; \
require "find"; \
require "RMagick"; \
include Magick; \
Find.find("images") {|file| \
  if file =~ /\.gif$/ then \
    img = ImageList.new(file); \
    main_color = img.color_histogram.sort{|a,b| b[1] <=> a[1]}.first; \
    l_img = Image.new(img.columns*2, img.rows*6) {self.background_color = main_color[0]}; \
    gc = Magick::Draw.new; \
    gc.composite((l_img.columns-img.columns)/2,(l_img.rows-img.rows)/2,0,0,img);  \
    gc.draw(l_img); \
    l_img.write(File.join(File.dirname(file), "large_"+File.basename(file))) \
  end \
}'
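The path handling in that loop can be tried without any images. The sketch below uses only File from the standard library; large_path and source_gif? are hypothetical helper names, and skipping files that already start with large_ is my own assumption about what you would want on a second run, not something the one-liner above does.

```ruby
# Hypothetical helper: same output path the one-liner builds,
# i.e. "large_" prefixed to the file name, in the same directory.
def large_path(file)
  File.join(File.dirname(file), "large_" + File.basename(file))
end

# Hypothetical helper: only treat real .gif files as input, and skip
# anything already produced by an earlier run (an added assumption).
def source_gif?(file)
  file.end_with?(".gif") && !File.basename(file).start_with?("large_")
end

["images/logos/acme.gif", "images/logos/large_acme.gif", "images/notes.txt"].each do |f|
  puts "#{f} -> #{large_path(f)}" if source_gif?(f)
end
```

Only images/logos/acme.gif passes the filter here, and it maps to images/logos/large_acme.gif.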

Or, if you just want to run this against an image directly from the internet, try this:

ruby -e 'require "rubygems"; require "RMagick"; include Magick; \
img = ImageList.new("http://cdn-0.nflximg.com/us/layout/signup/950/header/netflix_logo.gif"); \
main_color = img.color_histogram.sort{|a,b| b[1] <=> a[1]}.first; \
l_img = Image.new(img.columns*2, img.rows*6){self.background_color = main_color[0]}; \
gc = Magick::Draw.new; \
gc.composite((l_img.columns-img.columns)/2,(l_img.rows-img.rows)/2,0,0,img); \
gc.draw(l_img); img.write("original.gif"); l_img.write("large.gif")'

The results are written to original.gif and large.gif.

OK, so in this case I am taking command line programming a little too far. I did manage to do the whole task above without ever opening a text editor, but I had some clever command line tricks up my sleeve, which I will cover in a future post. Still, it is good to know that such power is available from the command line.

-e to execute from command line

September 23rd, 2009

One of the useful tools in the unix world is the command line. Often you may have a request like

give me a list of all the unique Agencies specified in all the XML files in a given directory tree where an agency is shown in an element <agency id="IJKXYZ">

This can be easily achieved (for pretty printed XML documents, with each element on its own line) with a variety of unix tools, like so:

find . -name "*.xml" | xargs grep agency.id | awk -F '"' '{print $2}' | sort | uniq -c

This finds all the *.xml files under the current directory, uses xargs to run grep over that list looking for the agency.id pattern (the ‘.’ matches any character, in this case a space), uses awk, delimiting on double quotes, to print out the IDs, then sorts them and runs them through uniq with the -c flag to get the count of occurrences for every agency ID. Yes, there are no doubt other ways of doing this, and different sorts of optimisations, but this is just the way my brain thinks and wires together these commands.

As already mentioned, this will only work if the XML is pretty printed and not if all the white space has been removed. There may also be many cases where the request is more complicated, yet opening up a text editor to write a program still feels unnecessary. This is where the -e option of many scripting languages comes in: it lets you run code in the language directly from the command line.
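As a rough illustration of the kind of logic that is awkward with grep and awk but short in a scripting language, here is a sketch in Ruby that counts agency IDs even when several elements share one line. The agency_counts helper and the sample XML are hypothetical, and a regular expression is still a shortcut here, not a real XML parser.

```ruby
# Hypothetical helper: count how often each agency id appears in a chunk of
# XML text, regardless of how the elements are spread across lines.
def agency_counts(xml)
  counts = Hash.new(0)
  xml.scan(/<agency\s+id="([^"]*)"/).flatten.each { |id| counts[id] += 1 }
  counts
end

# Made-up sample with all white space between elements removed.
sample = '<a><agency id="IJKXYZ"/><agency id="ABC"/></a><b><agency id="IJKXYZ"/></b>'

agency_counts(sample).sort.each { |id, n| puts "#{n} #{id}" }
```

The body of that helper is short enough to pass to ruby -e, reading the files named on the command line, which is the point of the -e flag.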

The basic hello world in a couple of such languages

groovy -e "println 'hi'"
perl -e 'print "hi\n"'
ruby -e 'puts "hi\n"'
jruby -e 'puts "hi\n"'

As a more complex example you may want to find out the day of the week for a given date. This can be done like so

$ groovy -e 'println Date.parse("MM/dd/yyyy", "12/13/1974").format("EEEE")'
Friday
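For comparison, a sketch of the same lookup in Ruby, using Date from the standard library. The weekday_name helper is a hypothetical name for illustration; the date format mirrors the MM/dd/yyyy pattern used in the Groovy example.

```ruby
require "date"

# Hypothetical helper: weekday name for a date written as MM/DD/YYYY.
def weekday_name(us_date)
  Date.strptime(us_date, "%m/%d/%Y").strftime("%A")
end

puts weekday_name("12/13/1974")   # prints "Friday"
```

As a one-liner this could be written as ruby -rdate -e 'puts Date.strptime("12/13/1974", "%m/%d/%Y").strftime("%A")'.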

Of course, there may again be a unix command out there which is more concise or which the user is more familiar with, in which case great, use that instead. If, on the other hand, the code solution comes to you immediately, then why not use it? If this is the way your brain is wired, and you are more competent in your programming language of choice, then why go reading man pages when you can do this simply with a scripting language and the -e flag?

As a counter to that argument, I had a snippet in PowerShell, the Microsoft Windows shell language, for finding the day of the week for a given date, and it goes like this:

PS> (get-date "12/13/1974").DayOfWeek

Very concise indeed.

Of course the real power of scripting languages is when you use them in conjunction with pipes and unix tools. In this case I want a histogram of how many files I modify for each day of the week for a given directory.

groovy -e 'new File(".").eachFile{file -> println new Date(file.lastModified()).format("EEEE")}' \
 | sort | uniq -c | sort -rn
 
19 Monday
15 Tuesday
10 Wednesday
5 Friday
4 Thursday

Given that it is a work directory, seeing only work days is understandable. Also, as I am running this on a Wednesday morning, that may skew the “last modified” results if I have modified most files. Still, it seems that Thursday and Friday are the low parts of the week.
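The same histogram can also be built entirely in Ruby, without the sort and uniq stages. This is a sketch under a couple of assumptions: weekday_histogram is a hypothetical helper name, and unlike the Groovy version it skips directories and only counts regular files.

```ruby
# Hypothetical helper: count the regular files in a directory by the weekday
# of their last modification, most common day first.
def weekday_histogram(dir)
  counts = Hash.new(0)
  Dir.foreach(dir) do |name|
    path = File.join(dir, name)
    next unless File.file?(path)          # skip ".", ".." and subdirectories
    counts[File.mtime(path).strftime("%A")] += 1
  end
  counts.sort_by { |_day, n| -n }
end

weekday_histogram(".").each { |day, n| puts "#{n} #{day}" }
```

Piping the output of a short -e script through sort and uniq -c, as above, is still less typing; the all-Ruby version mostly helps once the counting logic gets more involved.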

Feel free to add how you use scripting languages from the command line in the comments. Do be careful cutting and pasting any code samples and make sure you know what you are doing prior to doing so.

I will look at a few common examples I use regularly in future posts and I hope to see more people using scripting languages from the command line to solve their problems.