Wednesday, September 23, 2015

How It Works: Union

Here is a recursive version of the set-theoretic function, Union, which returns the common elements of two or more sets (represented as Lists in Mathematica).

There are two base cases: 1) The union of any set with the empty set is the set itself, and 2) The union of any set with a single element set is the single element set appended to the other set. The recursive case is: 3) The union of any two Lists is the union of the first List with the First element of the second List (i.e. the single element base case), which, calling itself, recurses down the Rest of the second List until it hits the empty set base case (an empty List), at which point it unwinds back out to produce the overall union.

Union produces a set of unique elements  — no duplicates — so we need to de-dupe the List. An easy way to do that is Sort it and use a replacement Rule (we could also use the built-in DeleteDuplicates or How It Works: DeleteDuplicates).

I show two ways to control the Precedence of the abbreviate operators for ReplaceRepeated vs Postfix. Since with Precedence of 110 ReplaceRepeated is a little 'stickier' than Postfix at 70 Sort will stick to the ReplaceRepeated function unless we do something. So we can either enclose the compound function in parentheses before applying ReplaceRepeated as in the 2nd base case or we can use Postfix with Slot (#) and pure Function (&) as typical in my extended Postfix syntax as in the recursive case.

Clear@myUnion;

myUnion[list1_List,{}]:=list1; (* 1st base case *)

myUnion[list1_List,element_]:=(If[MemberQ[list1,element],list1,list1/.{x___}->{x,element}] //Sort)//.{a___,b_,b_,c___}->{a,b,c}(* 2nd base case *)

myUnion[list1_List,list2_List]:=Module[{union},union=myUnion[myUnion[list1,First@list2],Rest@list2]//Sort//#//.{a___,b_,b_,c___}->{a,b,c}&
] (* recursive case *)


myUnion[{1,2,3,4},{}]

{1,2,3,4}

myUnion[{1,2,3,4},5]

{1,2,3,4,5}

myUnion[{1,2,3,4},2]

{1,2,3,4}

This shows the need to add a delete duplicates function:

myUnion[{1,2,2,3},2]

{1,2,2,3}

A simple replacement Rule does the job, but use ReplaceRepeated or only the first instance of a duplicate will be replaced:

{1,2,2,3,4,4}/.{a___,b_,b_,c___}->{a,b,c}

{1,2,3,4,4}

{4,4,1,3,3,2,2}//.{a___,b_,b_,c___}->{a,b,c}

This works without the delete duplicates function.

myUnion[{1,2,3,4},{3,4,5,6}]

{1,2,3,4,5,6}

But this would fail without it. First without delete duplicates.

myUnion[{1,2,2,3,4},{3,4,5}]

{1,2,2,3,4,5}

And now with delete duplicates.

myUnion[{1,2,2,3,4},{3,4,5}]

{1,2,3,4,5}

And the base case.

myUnion[{3,2,1,2,2,3},3]

{1,2,3}

And the recursive case.

myUnion[{1,2},{2,3,4}]

{1,2,3,4}