September 22, 2011

Groovy Goodness: Use Connection Parameters to Get Text From URL

For a long time we have been able to get the text from a URL in Groovy with very little code. Since Groovy 1.8.1 we can also set parameters on the underlying URLConnection that is used to get the content. The parameters are passed as a Map to the getText() method, or to the newReader() or newInputStream() methods of a URL.

We can set the following parameters:

  • connectTimeout in milliseconds
  • readTimeout in milliseconds
  • useCaches (boolean)
  • allowUserInteraction (boolean)
  • requestProperties, a Map with general request properties such as HTTP headers
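To make the role of each parameter concrete, here is a minimal sketch that applies the same settings by hand to a plain java.net.URLConnection. The helper name openConfigured is hypothetical and is not part of Groovy. It only illustrates the idea and is not the GDK source. A temporary local file stands in for a remote document, so the script runs without network access.

```groovy
// Hypothetical helper (not part of Groovy): applies the parameter names
// from the list above to a URLConnection by hand.
InputStream openConfigured(URL url, Map params = [:]) {
    URLConnection connection = url.openConnection()
    if (params.containsKey('connectTimeout')) {
        connection.connectTimeout = params.connectTimeout as int
    }
    if (params.containsKey('readTimeout')) {
        connection.readTimeout = params.readTimeout as int
    }
    if (params.containsKey('useCaches')) {
        connection.useCaches = params.useCaches as boolean
    }
    if (params.containsKey('allowUserInteraction')) {
        connection.allowUserInteraction = params.allowUserInteraction as boolean
    }
    // Each entry becomes one request property, for example an HTTP header.
    params.requestProperties?.each { name, value ->
        connection.setRequestProperty(name.toString(), value.toString())
    }
    connection.inputStream
}

// Assumption for illustration: a temporary file replaces the remote URL.
def sampleFile = File.createTempFile('url-sample', '.txt')
sampleFile.deleteOnExit()
sampleFile.setText('Simple test document\n', 'UTF-8')

def text = openConfigured(sampleFile.toURI().toURL(),
        [connectTimeout: 5000, readTimeout: 5000, useCaches: false,
         requestProperties: ['User-Agent': 'Groovy Sample Script']]).getText('UTF-8')

assert text == 'Simple test document\n'
```

With the Map support in getText(), newReader() and newInputStream() we do not need such a helper ourselves.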

// Contents of http://www.mrhaki.com/url.html:
// Simple test document
// for testing URL extensions
// in Groovy.

def url = "http://www.mrhaki.com/url.html".toURL()

// Simple Integer enhancement to make
// 10.seconds be 10 * 1000 ms.
Integer.metaClass.getSeconds = { ->
    delegate * 1000
}

// Get content of URL with parameters.
def content = url.getText(connectTimeout: 10.seconds, readTimeout: 10.seconds,
                          useCaches: true, allowUserInteraction: false,
                          requestProperties: ['User-Agent': 'Groovy Sample Script'])

assert content == '''\
Simple test document
for testing URL extensions
in Groovy.
'''

url.newReader(connectTimeout: 10.seconds, useCaches: true).withReader { reader ->
    assert reader.readLine() == 'Simple test document'
}
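The newInputStream() method accepts the same parameter Map, and getText() also has a variant that takes the Map together with a charset name. The following sketch shows both. As an assumption for illustration, it reads from a temporary local file instead of the remote URL, so it runs offline. The file name and the timeout values are arbitrary.

```groovy
// Assumption for illustration: a temporary file replaces the remote document.
def sampleFile = File.createTempFile('url-sample', '.txt')
sampleFile.deleteOnExit()
sampleFile.setText('Simple test document\nfor testing URL extensions\nin Groovy.\n', 'UTF-8')
def fileUrl = sampleFile.toURI().toURL()

// newInputStream takes the same parameters as getText and newReader.
def bytes = fileUrl.newInputStream(readTimeout: 10000, useCaches: false).withStream { stream ->
    stream.bytes
}
assert new String(bytes, 'UTF-8').startsWith('Simple test document')

// getText with a parameter Map and an explicit charset.
def lines = fileUrl.getText([connectTimeout: 10000], 'UTF-8').readLines()
assert lines.size() == 3
assert lines[0] == 'Simple test document'
```

Passing the charset explicitly avoids relying on the platform default encoding when we read the content.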