November 5, 2009

Groovy Goodness: Splitting Strings

In Java we can use the split() method of the String class or the StringTokenizer class to split strings. Groovy adds the methods split() and tokenize() to the String class, so we can invoke them directly on a string. The split() method returns a String[] instance and the tokenize() method returns a List. There is also a difference in the argument we can pass to the methods. The split() method takes a regular expression string, and the tokenize() method treats every character in its argument as a delimiter.

def s = '''\
username;language,like
mrhaki,Groovy;yes
'''

assert s.split() instanceof String[]
assert ['username;language,like', 'mrhaki,Groovy;yes'] == s.split()  // Default split on whitespace. ( \t\n\r\f)
assert ['username', 'language', 'like', 'mrhaki', 'Groovy', 'yes'] == s.split(/(;|,|\n)/)  // Split argument is a regular expression.

def result = []
s.splitEachLine(",") {
    result << it  // it is a list with the result of splitting the line on ,
}
assert ['username;language', 'like'] == result[0]
assert ['mrhaki', 'Groovy;yes'] == result[1]

assert s.tokenize() instanceof List
assert ['username;language,like', 'mrhaki,Groovy;yes'] == s.tokenize()  // Default tokenize on whitespace. ( \t\n\r\f)
assert ['username', 'language', 'like', 'mrhaki', 'Groovy', 'yes'] == s.tokenize("\n;,")  // Argument is a String with all delimiter characters we want to tokenize on.
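
The difference in how the two methods interpret their argument is easiest to see with adjacent delimiters and with a delimiter that is longer than one character. The following is a minimal sketch; the sample values csv and path are hypothetical and are not part of the script above.

```groovy
// Hypothetical sample values, chosen only to illustrate the difference.
def csv = 'a,,b'
def path = 'one::two:three'

// split() keeps the empty string between two adjacent delimiters...
def splitCsv = csv.split(',').toList()
assert splitCsv == ['a', '', 'b']

// ...while tokenize() skips empty tokens.
def tokenizedCsv = csv.tokenize(',')
assert tokenizedCsv == ['a', 'b']

// split() treats '::' as one regular expression, so a single ':' does not match.
def splitPath = path.split('::').toList()
assert splitPath == ['one', 'two:three']

// tokenize() treats each character of '::' as a delimiter, so every ':' separates tokens.
def tokenizedPath = path.tokenize('::')
assert tokenizedPath == ['one', 'two', 'three']
```

So split() is the better fit when empty fields matter or when the separator is a pattern, and tokenize() is convenient when we only want the non-empty pieces between any of a set of characters.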
