April 28, 2020

Clojure Goodness: Partition Collection Into Sequences

Clojure has the partition, partition-all and partition-by functions to transform a collection into a sequence of sequences. With partition and partition-all each resulting sequence has a fixed number of items, which we set with a number as the first argument. Any remaining elements that cannot fill a complete sequence are left out of the result when we use partition, but are included when we use partition-all. With the partition function we can also pass another collection as the third argument, and its values are used to fill up the last, incomplete sequence.
Optionally we can specify a step value as the second argument of both functions. This means each new partition starts the given number of items after the start of the previous one. Partitions overlap when the step is smaller than the partition size, and items are skipped when the step is larger.
Finally we can use a function to define when a new partition must start with the partition-by function. The function is applied to each element, and every time it returns a value that differs from the value for the previous element a new partition begins.
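
The function we pass to partition-by does not have to return a boolean value. As a small sketch, assuming a made-up vector named words and count as the function, we can see that results which are equal but not adjacent still end up in separate partitions:

```clojure
(require '[clojure.test :refer [is]])

;; Hypothetical sample data, only used for this illustration.
(def words ["a" "b" "cc" "dd" "e"])

;; count returns 1, 1, 2, 2, 1 for the words, so a new partition
;; starts at "cc" and again at "e".
(def runs (partition-by count words))

;; "e" gets its own partition, although count also returned 1
;; for "a" and "b" earlier in the collection.
(is (= [["a" "b"] ["cc" "dd"] ["e"]] runs))
```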

In the following example Clojure code we use all three functions with the arguments described above:

(ns mrhaki.core.partition
  (:require [clojure.test :refer [is]]))

;; Sample string (a sequence of characters).
(def letters "aBCdeFg")

;; First argument defines how many items are in each partition.
;; Any remainder is ignored.
(is (= [[\a \B] [\C \d] [\e \F]] (partition 2 letters)))

;; With partition-all the remainder is part of the result.
(is (= [[\a \B] [\C \d] [\e \F] [\g]] (partition-all 2 letters)))

;; The second argument is a step offset.
(is (= [[\a \B] [\d \e]] (partition 2 3 letters)))

(is (= [[\a \B] [\d \e] [\g]] (partition-all 2 3 letters)))

;; A step smaller than the partition size gives overlapping partitions.
(is (= [[\a \B \C] [\C \d \e] [\e \F \g]] (partition 3 2 letters)))

(is (= [[\a \B \C] [\C \d \e] [\e \F \g] [\g]] (partition-all 3 2 letters)))

;; The third argument is a padding collection that is used
;; to fill up the last partition if needed.
(is (= [[\a \B \C] [\d \e \F] [\g \! \?]] (partition 3 3 [\! \? \@] letters)))

(is (= [[\a \B \C] [\d \e \F] [\g \! \!]] (partition 3 3 (repeat \!) letters)))

;; When the padding collection does not have enough items, only what
;; is available is used and the last partition is shorter.
(is (= [[\a \B \C] [\d \e \F] [\g \!]] (partition 3 3 [\!] letters)))

;; Using partition-by we can use a function that performs the split
;; when the function returns a different value than for the previous item.
(is (= [[\a] [\B \C] [\d \e] [\F] [\g]]
       (partition-by #(Character/isUpperCase %) letters)))

(is (= [[1 2 3 4] [5] [6 7 8 9] [10] [11 12 13 14]]
       (partition-by #(= 0 (mod % 5)) (range 1 15))))
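
The step argument may be easier to picture with numbers. The following sketch assumes a made-up vector named nums. It shows that leaving out the step gives the same result as a step equal to the partition size, and that a step of 1 gives a sliding window over the collection:

```clojure
(require '[clojure.test :refer [is]])

;; Hypothetical sample data, only used for this illustration.
(def nums [1 2 3 4 5])

;; Without a step the partitions do not overlap,
;; which is the same as a step equal to the partition size.
(is (= (partition 2 2 nums) (partition 2 nums)))

;; With step 1 every item except the last one starts a new pair.
(def pairs (partition 2 1 nums))

(is (= [[1 2] [2 3] [3 4] [4 5]] pairs))
```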

Written with Clojure 1.10.1.