Wednesday, August 12, 2009

Rails Dump Format Error Mystery

One day your Rails app stops working, and the only clue in the console log is a 'Dump format error' exception. It starts when a logged-in user visits a certain page and receives the dreaded HTTP 500 internal error. After that, the user can no longer browse to any page of the app. What should you do?

The first thing I did was try to find out what caused the problem. Once the exception was thrown, I couldn't reload the page or visit any other path in the app, as it kept showing the 500 error. I knew it must have something to do with the session. So I closed the web browser and reopened it. That worked, but as soon as I visited the problematic page, the same problem happened again.

What does the dump format error mean? After a quick Google search, I found out that it is thrown when there is a problem marshalling or unmarshalling objects. Who did that? Why did Rails send objects over the wire? Then I looked at the controller's action and found this line:

flash[:object_with_error] = user

So someone put an object into the flash, which in turn was kept in the cookie-based session store. Since a cookie is limited to about 4K bytes, I suspect that the cookie was full and the marshalled data was cut off. That caused the dump format error when Rails tried to unmarshal it.

The flash hash usually contains a message to be displayed in a view. Why did someone put a binary object (in this case, an ActiveRecord model) in it? I found out later that it was because the action took a form, verified the data, and redirected to a different action. If there were validation errors, it stored the ActiveRecord object with its errors in the flash so that the action it redirected to could fill out the form and show the errors.

However, we should avoid storing binary objects in the flash (and in cookies), since the object graph can easily exceed the cookie size limit. Instead, if we really need to pass temporary data between requests, we should pass only string values. In this case, pass only the form data and error messages.
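Here is a minimal sketch of the difference, runnable without Rails. FakeUser is a hypothetical stand-in for the ActiveRecord model, and the 4096-byte limit is an approximation of the cookie size limit; the real session store adds its own encoding on top, so treat the numbers as illustrative only.

```ruby
# Hypothetical stand-in for an ActiveRecord model; this is not Rails code.
FakeUser = Struct.new(:name, :email, :bio, :errors)

user = FakeUser.new("somchai", "not-an-email", "x" * 5000, ["Email is invalid"])

# Storing the whole object: the marshalled payload carries every attribute.
whole_object = Marshal.dump({ :object_with_error => user })

# Storing only what the next action needs: form values and messages as strings.
strings_only = Marshal.dump({
  :form_data      => { "name" => user.name, "email" => user.email },
  :error_messages => user.errors
})

COOKIE_LIMIT = 4096 # assumed approximate cookie size limit, in bytes

puts "whole object: #{whole_object.size} bytes"
puts "strings only: #{strings_only.size} bytes"

# Returns true if the data can be unmarshalled, false if Marshal rejects it.
def loadable?(data)
  Marshal.load(data)
  true
rescue ArgumentError, TypeError
  false
end

# Simulate a payload that was cut off at the cookie limit.
truncated = whole_object[0, COOKIE_LIMIT]
puts "truncated payload loadable?    #{loadable?(truncated)}"
puts "strings-only payload loadable? #{loadable?(strings_only)}"
```

The strings-only payload stays well under the limit, while the whole object does not survive being cut off.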

Problem solved, lesson learned. Case closed.

Saturday, July 25, 2009

Sort your flickr albums in Ruby with Flickr_fu

I have tons of photos on Flickr, organized into photo sets. Each photo set name has a date prefix like '20090725 Chicago downtown'. I use the prefix to sort album folders on my local drive. However, if I don't upload them to Flickr in the same order, I have to use Flickr's Organize to manually drag and drop each set to the right place. It is tedious, and there is no quick and easy way to automate it.

I know that Flickr provides APIs that allow developers to manage photos programmatically in practically any modern language. Since I am a fan of Ruby, I chose a Ruby library, flickr_fu from commonthread.

So I took an hour to set it up and write a Ruby class to re-order my 1000+ photosets. The class name is FlickrPhotoSetSorter. Here are the steps I took; you can follow along, assuming you have Ruby 1.8+ installed.

1. Install flickr_fu
sudo gem install flickr-fu

You may have to install a required gem, xml-magic, separately.

2. Create flickr.yml to store your Flickr API key and secret key. Replace "your key" and "your secret" with your own. You may obtain them from Flickr API keys. The token_cache.yml is a file flickr_fu uses to store your session token (flickr calls it 'frob').
## YAML Template.
--- !map:Hash
key: "your key"
secret: "your secret"
token_cache: "token_cache.yml"
3. Authorize your program to read/write your albums. I defined a method, authorize, in my FlickrPhotoSetSorter class.

require 'flickr_fu'

class FlickrPhotoSetSorter
def initialize
@flickr = Flickr.new('flickr.yml')
end

def authorize
puts "visit the following url, then click once you have authorized:"
# request write permissions
puts @flickr.auth.url(:write)
gets
@flickr.auth.cache_token
end
end

FlickrPhotoSetSorter.new.authorize
The method generates and displays a flickr.com URL that you have to visit in order to grant the program access to your account. Here is an example of the result. I replaced all ids with a dummy 12345.

visit the following url, then click once you have authorized:
http://flickr.com/services/auth/?frob=12345&auth_token=12345&api_key=12345&perms=write&api_sig=12345
4. Once I authorized the application, I can start loading a list of my photosets using flickr_fu's Photosets class.

  def photosets_list
Flickr::Photosets.new(@flickr).get_list
end
5. Since I have a large collection of photosets, I don't want to load them from flickr.com every time I run the script. So I chose to store the result in a file that I can load later. I created a file, list.txt, and stored each photoset as one line with id, number of photos, title and description separated by a pipe symbol '|'.
  def store_photoset_list(file_name = 'list.txt')
File.open(file_name, "w") do |file|
photosets_list.map do |photoset|
file.puts("#{photoset.id}|#{photoset.num_photos}|#{photoset.title}|#{photoset.description}")
end
end
end

FlickrPhotoSetSorter.new.store_photoset_list
6. To load it back, I first created a method that reads a line and returns a hash containing the information for one photoset.
  def photoset_from(line)
id, num_photos, title, description = line.split('|')
return nil unless id.to_i > 0
{ :id => id,
:num_photos => num_photos,
:title => title,
:description => description }
end


Then, I defined a method to load the file and build a photoset array.

  def load_photoset_list(file_name = 'list.txt')
return @photosets_list if @photosets_list
@photosets_list = []
lines = IO.readlines(file_name)
lines.each do |line|
@photosets_list << photoset_from(line)
end
# compact in place so the cached list has no nil entries either
@photosets_list.compact!
@photosets_list
end
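The store and load steps can be tried without talking to Flickr at all. The sketch below is a standalone round trip through a temporary file. FakeSet and parse_photoset are hypothetical names used only for illustration, and the chomp and the split limit of 4 are additions of mine: they keep the trailing newline out of the description and allow a description to contain a '|' itself.

```ruby
require 'tempfile'

# Hypothetical stand-in for the photoset objects returned by flickr_fu.
FakeSet = Struct.new(:id, :num_photos, :title, :description)

sets = [
  FakeSet.new("101", 12, "20090725 Chicago downtown", "skyline | river walk"),
  FakeSet.new("102", 30, "20090101 New year", "")
]

# Parse one stored line into a hash, or return nil for lines without a valid id.
# chomp and the split limit of 4 are additions to the method shown above.
def parse_photoset(line)
  id, num_photos, title, description = line.chomp.split('|', 4)
  return nil unless id.to_i > 0
  { :id => id, :num_photos => num_photos, :title => title, :description => description }
end

# Store: one pipe-delimited line per photoset.
file = Tempfile.new('list')
sets.each do |s|
  file.puts("#{s.id}|#{s.num_photos}|#{s.title}|#{s.description}")
end
file.puts("") # a blank line, to show that it is skipped on load
file.close

# Load: parse every line and drop the ones that did not parse.
loaded = IO.readlines(file.path).map { |line| parse_photoset(line) }.compact
loaded.each { |set| p set }
file.unlink
```

Note that a title containing a '|' would still break this format; a different separator or a proper serializer would be needed for that case.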
7. Since flickr_fu does not have a method to sort or update the order of photosets, I have to do it myself. It is super easy since I already have everything I need. Flickr has an API to order photo sets, flickr.photosets.orderSets, and it requires a list of photoset ids separated by commas.

So I sort the photoset by title descending and collect ids and join them with ','. At the end, I use flickr_fu's send_request and pass the API name with the list of photoset ids using HTTP post.

  def order_photosets_by_title
ordered_list = load_photoset_list.sort { |a,b| (b[:title] || "") <=> (a[:title] || "") }
ordered_ids = ordered_list.map { |set| set[:id] }.join(',')
@flickr.send_request('flickr.photosets.orderSets', { :photoset_ids => ordered_ids }, :post)
end
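To see what the sort block and the join produce, here is the same logic applied to a small in-memory list. The ids and titles are made up for illustration, and nothing is sent to Flickr.

```ruby
# Made-up photosets; one has no title to show why the (|| "") fallback is there.
photosets = [
  { :id => "101", :title => "20090101 New year" },
  { :id => "102", :title => "20090725 Chicago downtown" },
  { :id => "103", :title => nil },
  { :id => "104", :title => "20081224 Christmas eve" }
]

# Comparing b to a (instead of a to b) gives descending order.
ordered = photosets.sort { |a, b| (b[:title] || "") <=> (a[:title] || "") }
ordered_ids = ordered.map { |set| set[:id] }.join(',')

puts ordered_ids # => 102,101,104,103
```

Because the titles start with a yyyymmdd prefix, plain string comparison puts the newest set first and the untitled set last.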
8. To run the script, just use:

FlickrPhotoSetSorter.new.order_photosets_by_title


That's it. Now my photoset list is sorted by title in descending order. I can upload older photosets, rerun the script to have them sorted again, and never worry about my photosets being out of order.

Here is the entire source code of the FlickrPhotoSetSorter class.
require 'flickr_fu'

class FlickrPhotoSetSorter
def initialize
@flickr = Flickr.new('flickr.yml')
end

def authorize
puts "visit the following url, then click once you have authorized:"
# request write permissions
puts @flickr.auth.url(:write)
gets
@flickr.auth.cache_token
end

def photosets_list
Flickr::Photosets.new(@flickr).get_list
end

def store_photoset_list(file_name = 'list.txt')
File.open(file_name, "w") do |file|
photosets_list.map do |photoset|
file.puts("#{photoset.id}|#{photoset.num_photos}|#{photoset.title}|#{photoset.description}")
end
end
end

def load_photoset_list(file_name = 'list.txt')
return @photosets_list if @photosets_list
@photosets_list = []
lines = IO.readlines(file_name)
lines.each do |line|
@photosets_list << photoset_from(line)
end
# compact in place so the cached list has no nil entries either
@photosets_list.compact!
@photosets_list
end

def photoset_from(line)
id, num_photos, title, description = line.split('|')
return nil unless id.to_i > 0
{ :id => id,
:num_photos => num_photos,
:title => title,
:description => description }
end

def order_photosets_by_title
ordered_list = load_photoset_list.sort { |a,b| (b[:title] || "") <=> (a[:title] || "") }
ordered_ids = ordered_list.map { |set| set[:id] }.join(',')
@flickr.send_request('flickr.photosets.orderSets', { :photoset_ids => ordered_ids }, :post)
end

end


Feel free to use the code for your needs. If you want to reorder by different criteria, update the sort block in the order_photosets_by_title method. Rename the method as appropriate.

Enjoy.

Saturday, March 14, 2009

Small quiz solutions in 12 languages

Last week I posted a short programming quiz in my Thai development community at narisa.com and asked members to come up with solutions in as many languages as possible. We finally got 12 solutions implemented in C, C#, Clojure, Erlang, Groovy, Haskell, Java, PHP, Python, Ruby, Scala, Visual Basic with LINQ. Some languages have more than one implementation.

The quiz is about parsing and sorting an input text file. The file contains records, one per line. Each record has many fields delimited by a pipe symbol '|'. The first field is a record number. We want to sort all the records by this number and display the sorted result one record per line. Each line displays the record number followed by the remaining fields in sorted order. (The original problem didn't have the numeric sort and first-column exclusion constraints.)

For example, an input.txt file:
13|hello|world|bangkok
4|1monkey|ant|dog|cow|cat
2|pink|yellow|red|magenta
1|earth|sun|jupiter|pluto


The expected result:
1 earth jupiter pluto sun
2 magenta pink red yellow
4 1monkey ant cat cow dog
13 bangkok hello world


Notice that the record number 13 is after record number 4. So we need to compare the number by value, not by string literal. Each record has to exclude the field number and sort the rest alphabetically.
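The difference between the two comparisons is easy to see on the record numbers from the example alone. This is only a small illustration of the constraint, not one of the submitted solutions.

```ruby
numbers = %w[13 4 2 1]

# Compared as strings, "13" sorts before "2" because '1' < '2'.
by_string = numbers.sort
p by_string # => ["1", "13", "2", "4"]

# Compared by numeric value, 13 comes last as expected.
by_value = numbers.sort_by { |n| n.to_i }
p by_value # => ["1", "2", "4", "13"]
```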

Here are the solutions. Yours truly provided 2 solutions, in Ruby and Scala. You can see from the variety of language choices that the Thai development community is quite vibrant. It looks like Groovy is among the most popular ones. You will also see that scripting languages are good for this kind of problem in terms of compactness and declarative style. Read the original post in Thai here.

C (gcc) by iWat
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct record
{
int number;

int num_fields;
char **fields;

char *buffer;
};

static int record_number_comparator(struct record *first, struct record *second)
{
return first->number - second->number;
}

static void bubble_sort(void **ar_p_struct, int count, int (*comparator)(void *, void *))
{
int i;
int j;

for (i = 0; i < count - 1; ++i)
{
for (j = 0; j < count - 1 - i; ++j)
{
if (comparator(ar_p_struct[j+1], ar_p_struct[j]) < 0) {
void *p_tmp;

p_tmp = ar_p_struct[j];
ar_p_struct[j] = ar_p_struct[j + 1];
ar_p_struct[j + 1] = p_tmp;
}
}
}
}

static void print_record(struct record *my_record)
{
int i;

for (i = 0; i < my_record->num_fields; ++i)
{
printf("%s ", my_record->fields[i]);
}

printf("\n");
}

static void parse_records(FILE *fp, struct record ***p_ar_p_records, int *p_num_records)
{
char buffer[256];

*p_num_records = 0;
*p_ar_p_records = NULL;

while (fgets(buffer, 256, fp) != NULL)
{
struct record *current_record = malloc(sizeof(struct record));

current_record->num_fields = 0;
current_record->fields = malloc(10 * sizeof(char *));
current_record->buffer = strdup(buffer);

char *start = current_record->buffer;
char *pointer = start;

while(1)
{
if (*pointer == '|' || *pointer == '\0')
{
current_record->fields[current_record->num_fields] = start;

start = pointer + 1;

if (current_record->num_fields == 0)
current_record->number = atoi(current_record->fields[0]);

++current_record->num_fields;

if (*pointer == '\0')
{
if (*(pointer - 1) == '\n')
*(pointer - 1) = '\0';
break;
}

*pointer = '\0';
}

++pointer;
}

++(*p_num_records);
*p_ar_p_records = (struct record **) realloc(*p_ar_p_records, sizeof(struct record *) * (*p_num_records));
(*p_ar_p_records)[(*p_num_records) - 1] = current_record;
}
}

int main(int argc, char **argv)
{
int i;
int num_records;
struct record **ar_p_records;

FILE *fp;

if (argc != 2)
{
printf("Usage: %s <input file>\n", argv[0]);
return 1;
}

fp = fopen(argv[1], "r");

if (fp == NULL)
{
printf("error reading file: %s\n", argv[1]);
return 2;
}

parse_records(fp, &ar_p_records, &num_records);

printf("===== parse result\n");

for (i = 0; i < num_records; ++i)
{
print_record(ar_p_records[i]);
}

bubble_sort((void **) ar_p_records, num_records, (int(*)(void *, void *)) record_number_comparator);

printf("\n===== sort result\n");

for (i = 0; i < num_records; ++i)
{
bubble_sort(
(void **) (ar_p_records[i]->fields + 1),
ar_p_records[i]->num_fields - 1,
(int(*)(void *, void *)) strcmp);
print_record(ar_p_records[i]);
}


fclose(fp);
}


C# with LINQ by bleak
using System;
using System.Linq;

class FunCode
{
static void Main(string[] args)
{
string[] records = System.IO.File.ReadAllLines("input.txt");
var rs = from r in records
let fields = r.Split('|')
orderby int.Parse(fields[0])
select string.Concat(fields[0], " ", string.Join(" ", fields.Skip(1).OrderBy(x => x).ToArray()));
foreach (var r in rs)
Console.WriteLine(r);
}
}


Clojure by pphetra
(defn outer-sort [xs]
(sort-by #(first %) xs))

(defn inner-sort [xs]
(let [fst (first xs)
rst (sort (rest xs))]
(cons fst rst)))

(defn solve [xs]
(outer-sort (map inner-sort xs)))

(defn read-file [file-name]
(let [raw-lines (seq (.split (slurp file-name) "\n"))
lines (map #(seq (.split % "\\|")) raw-lines)]
lines))

(defn pretty-print [xs]
(doseq [line xs]
(do (doseq [word line]
(print word ""))
(print "\n"))))

;; usage example
(pretty-print (solve (read-file "input.txt")))


Coldfusion by iporsut
<cfset recordset = QueryNew("id, list" , "Integer, VarChar")>
<cfset count = 1 />
<cfloop file="#Expandpath('/')#input.txt" index="line">
<cfscript>
line = listChangeDelims(line,",","|");
id = listfirst(line);
list = listsort(listrest(line) , "text", "asc");
list = listChangeDelims(list," ",",");

QueryAddRow(recordset,1);
QuerySetCell(recordset,"id",id,count);
QuerySetCell(recordset,"list",list,count);

count++;
</cfscript>
</cfloop>

<cfquery dbtype="query" name="qSortedRecord">
select id,list
from recordset
order by id asc
</cfquery>

<cfoutput query="qSortedRecord">
#qSortedRecord.id# #qSortedRecord.list# <br />
</cfoutput>


Erlang by pphetra
-module(s1).
-export([solve/1]).

-import(lists, [sort/2,sort/1]).
-import(string, [tokens/2, to_integer/1]).

% reading the file as a binary is said to perform better
readfile(FileName) ->
{ok, Binary} = file:read_file(FileName),
tokens(erlang:binary_to_list(Binary), "\n").

% split each line we got into tokens as well
parse(FileName) ->
[ tokens(Line, "|") || Line <- readfile(FileName)].

% sort by the head element, converting it to an integer before sorting
outer_sort(List) ->
sort(fun ([H1|_], [H2|_]) -> to_integer(H1) =< to_integer(H2) end, List).

solve(FileName) ->
outer_sort([ [H | sort(T)] || [H|T] <- parse(FileName)]).

1> c(s1).
{ok,s1}
2> s1:solve('input.txt').


Groovy #1 by cblue
def result=[:]
new File('input.txt').eachLine {
def line = it.tokenize('|')
result[line[0] as int] = line[1..-1].sort()
}
result.sort{ e1, e2 -> e1.key - e2.key }.each { k, v ->
println "$k ${v.join(' ')}"
}


Groovy #2 by cblue
def result = new File('input.txt').readLines().collect {
def line = it.tokenize('|')
[line.head() as int] + line.tail().sort()
}
result.sort { o1, o2 -> o1[0] <=> o2[0] }.each {
println it.join(' ')
}


Groovy #3 by xcaleber

new File("input.txt").readLines()*.tokenize("|").sort{one, another -> one[0] <=> another[0]}.each{ println "${it[0]} ${it[1..-1].sort().join(' ')}" }


Haskell #1 by iporsut
import List

compareFirst::[[Char]]->[[Char]]->Ordering
compareFirst a b | (read (head a)::Int) >= (read (head b)::Int) = GT
| otherwise = LT

transformPipeToSpace::[Char]->[Char]
transformPipeToSpace a = map transform a
where
transform b | b == '|' = ' '
| otherwise = b
inputToWords::[Char]->[[[Char]]]
inputToWords input = map words (map transformPipeToSpace (lines input))


wordsToOutput::[[[Char]]]->[Char]
wordsToOutput word = foldl (++) "" (map unwordsNewline word)
where
unwordsNewline w = (unwords w)++['\n']

sortRecords::[Char]->[[[Char]]]
sortRecords input = map sortWords (sortBy compareFirst (inputToWords input))
where
sortWords (x:xs) = x:(sort xs)
main = do
input <- readFile "input.txt"
putStrLn (wordsToOutput (sortRecords input))


Haskell #2 by pphetra
import List

split delim s
| [] <- rest = [token]
| otherwise = token : split (delim) (tail rest)
where (token,rest) = span (/= delim) s

compareFirst a b | (read (head a)::Int) >= (read (head b)::Int) = GT
| otherwise = LT

splitAndSort s = head xs : sort (tail xs)
where xs = split '|' s

main = do
input <- readFile "s1.txt"
putStrLn $ (unlines . map unwords . sortBy compareFirst . map splitAndSort . lines) input


Java #1 by comx
package javaapplication1;

import java.io.*;
import java.util.*;

public class Main {
public static void main(String[] args) throws Exception {
Reader r = new FileReader("c:/input.txt");
BufferedReader br = new BufferedReader(r);
SortedSet lineItemHolder = new TreeSet();
while (true) {
String line = br.readLine();
if (line == null) {
break;
}
lineItemHolder.add(new LineItem(line));
}
print(lineItemHolder);
}

public static void print(Collection items) {
for (Object object : items) {
System.out.println(object);
}
}
}


Java #2 by panther
import java.io.*;
import java.util.*;

class FooSort
{
public static void main(String[] args)throws Exception
{
BufferedReader br = new BufferedReader( new FileReader("c:/input.txt") );
Map<Integer, List<String>> tm = new TreeMap<Integer, List<String>>();

while( true ){

String line = br.readLine();

if( line == null ){
break;
}

String[] record = line.split("[|]");
List<String> list = Arrays.asList( record );

Collections.sort( list );

tm.put( Integer.parseInt( list.get( 0 ) ) , list );
}

Iterator<Map.Entry<Integer, List<String>>> it =
((Set<Map.Entry<Integer, List<String>>>)tm.entrySet()).iterator();

while( it.hasNext() ){
renderlist( it.next().getValue() );
}
}

public static void renderlist( List<String> list ){

Iterator<String> it = list.iterator();

while( it.hasNext() ){
System.out.print( it.next() + " " );
}

System.out.println();
}
}


PHP by Rux
<?php
$lines = file('input.txt');

foreach($lines as $line){
$key = substr($line, 0, strpos($line, '|'));
$ds[$key] = explode('|', $line);
}

ksort($ds);

foreach($ds as $rw){
sort($rw);
echo implode("&nbsp;", $rw);
echo "<br />";
}
?>


Ruby by yours truly
lines = IO.readlines('input.txt').map do |line|
items = line.chomp.split('|')
[items.shift] + items.sort
end
lines.sort { |one, another| one[0].to_i <=> another[0].to_i }.each do |line|
puts line.join(' ')
end


Ruby with UNIX commands by pphetra
ruby -F'\|' -nlae 'puts $F[1..-1].sort().unshift($F[0]).join(" ")' input.txt | sort -nk 1

Scala by yours truly
import scala.io.Source
import scala.util.Sorting

def show_sort(fileName: String) = {
def sort_record(line: String) = {
val fields = List.fromString(line.replace("\n", ""), '|')
fields.head :: fields.tail.sort((a,b) => (a compareTo b) < 0)
}
val tuples = Source.fromFile(fileName).getLines.map(sort_record).collect
val sort_tuples = Sorting.stableSort(tuples, (a:List[String], b:List[String]) => a.head.toInt < b.head.toInt)
sort_tuples.map(tuple => println(tuple.mkString("", " ", "")))
}
show_sort("input.txt")


VisualBasic.NET with LINQ by tuckclub
Module Module1

Sub Main()
Dim records = System.IO.File.ReadAllLines("input.txt")
Dim rs = From r In records _
Let ia = r.Split("|") _
Let ca = (From c In ia.Skip(1) Order By c) _
Order By ia(0) _
Select ia.Take(1).Union(ca).ToArray
For Each r In rs
Console.WriteLine(String.Join(" ", r))
Next
Console.ReadKey()
End Sub

End Module

Friday, February 20, 2009

irb with autocompletion support

The interactive Ruby shell (irb) is a handy tool for exploratory programming in Ruby. One of the lesser-known features of irb is tab autocompletion. Say you want to know the instance methods of a particular object: type the object followed by a dot and then hit tab. irb will display all instance methods on that object.

Below is an example of methods for a number 1. After you type 1., hit a tab key. (Oh yeah, 1 is an object in ruby)

$ irb -r irb/completion
irb(main):001:0> 1.
1.eql? 1.instance_variable_get 1.prec_f 1.taint
1.__id__ 1.equal? 1.instance_variable_set 1.prec_i 1.tainted?
1.__send__ 1.even? 1.instance_variables 1.pred 1.tap
1.abs 1.extend 1.integer? 1.private_methods 1.times
1.between? 1.fdiv 1.is_a? 1.protected_methods 1.to_a
1.ceil 1.floor 1.kind_of? 1.public_methods 1.to_enum
1.chr 1.freeze 1.method 1.quo 1.to_f
1.class 1.frozen? 1.methods 1.remainder 1.to_i
1.clone 1.hash 1.modulo 1.respond_to? 1.to_int
1.coerce 1.id 1.next 1.round 1.to_s
1.display 1.id2name 1.nil? 1.send 1.to_sym
1.div 1.inspect 1.nonzero? 1.singleton_method_added 1.truncate
1.divmod 1.instance_eval 1.object_id 1.singleton_methods 1.type
1.downto 1.instance_exec 1.odd? 1.size 1.untaint
1.dup 1.instance_of? 1.ord 1.step 1.upto
1.enum_for 1.instance_variable_defined? 1.prec 1.succ 1.zero?


You may scope down results by typing more, say, show all instance methods that start with 'i'.

irb(main):001:0> 1.i
1.id 1.instance_eval 1.instance_variable_defined? 1.instance_variables
1.id2name 1.instance_exec 1.instance_variable_get 1.integer?
1.inspect 1.instance_of? 1.instance_variable_set 1.is_a?
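If you want a similar list without tab completion, you can ask the object directly. This is only a rough equivalent done by hand; I am not claiming it is how irb builds its completion list, and the exact methods you get depend on your Ruby version.

```ruby
# Ask the object for its methods, then keep only the names starting with "i".
names = 1.methods.map { |m| m.to_s }.grep(/^i/).sort
puts names
```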


To enable tab completion, start irb with the -r irb/completion option. Or, if you are already in irb, run require 'irb/completion'.

As a bonus, require irb/completion works in rails script/console too.

Saturday, January 24, 2009

How to make Ruby and JRuby live peacefully

Just like different Ruby versions, Ruby and JRuby maintain their own gems separately. While we can use jruby and jirb to say specifically that we want the JRuby version, JRuby does not have jrake or jgem. How do we specify whether we want to execute rake/gem in Ruby or in JRuby?

A solution is to prefix the program with ruby -S and jruby -S like

ruby -S rake
jruby -S rake


or

ruby -S gem list
jruby -S gem list


This way, we explicitly tell which interpreter we want to use and it will automatically find the right program in the right path. Problem solved.
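A quick way to check which interpreter and which gem directory you ended up with is a tiny script like the one below, run once with ruby and once with jruby. The file name which.rb is just a suggestion; RUBY_ENGINE is not defined on older 1.8 interpreters, hence the fallback to 'ruby'.

```ruby
require 'rubygems'

# RUBY_ENGINE is missing on older interpreters, so fall back to plain 'ruby'.
engine = defined?(RUBY_ENGINE) ? RUBY_ENGINE : 'ruby'

puts "interpreter: #{engine} #{RUBY_VERSION}"
puts "gem dir:     #{Gem.dir}"
```

Running ruby which.rb and jruby which.rb should print different gem directories, which is why each interpreter needs its own gem install.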

If you want to install multiple Ruby versions on the same machine, use MultiRuby as described in Dr Nic's Future proofing your Ruby code. Ruby 1.9.1 is coming.