Tuesday, May 19, 2015

Does Pi Occur in E - Find a Digit Sequence in any Number

Recently Stephen Wolfram and my friend and lab head, neurosurgeon and computer scientist Jeff Arle, were talking about Carl Sagan's idea (see his novel Contact) that a message could be embedded in a transcendental number, and Stephen said, "Let's write a program to find a number sequence in the sequence of any given number." Jeff said it took him just a minute or two (mine below took over an hour :-( but I bet he had written this function before :-).

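Recall that a transcendental number is one that is not the root of any nonzero polynomial with integer coefficients. Every transcendental number is irrational, so its digit sequence in any base is infinite and never settles into a repeating cycle. Since I see nothing that can be parameterized in a transcendental number, I don't see how anyone, even God (in this case the deistic creator or source of the universe), could embed a message in a transcendental number. On the other hand, if the number is normal, meaning that every finite digit sequence occurs in it with the expected frequency, you could specify a start location in it for any message of any length. Being transcendental does not guarantee this (Liouville's constant is transcendental but its decimal digits are all 0s and 1s), but Pi and E are widely conjectured, though not proven, to be normal.

If that conjecture holds, the sequence of any approximation to Pi is contained somewhere in E, and vice versa. The following is my program to find any number sequence in any other sequence.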

Clear@findSequence;
findSequence[sequence_?NumericQ,sourceNumber_,maxDigits_Integer,base_Integer:10]:=

Module[
{stringSequence=StringReplace[ToString@sequence,"."->""],
numberDigitSequence,partitionLength,partitionedNumber,
sourceSequence={}(*initialize sourceSequence*),
startPositionOfSequence=1,
targetSequence,targetSequenceLength},

targetSequenceLength=StringLength@stringSequence;

targetSequence=RealDigits@ToExpression@stringSequence//First;
Print@targetSequence;

numberDigitSequence=RealDigits[N[sourceNumber,maxDigits],base]//First;
partitionedNumber=Partition[numberDigitSequence,targetSequenceLength,1];

(* The length of the partitioned sequence is the number of partitions against which to try to match the target sequence. startPositionOfSequence is the position in the source sequence of the first digit of the subsequence being matched. Note that using While means only the first match is found if there is more than one. *)

While[startPositionOfSequence<=Length@partitionedNumber&&sourceSequence=!=targetSequence,sourceSequence=partitionedNumber[[startPositionOfSequence]];startPositionOfSequence++];

If[sourceSequence==targetSequence,StringForm["Sequence starts at position ``.",startPositionOfSequence-1 (*subtract 1 since first comparison is between empty List {} and targetSequence*)],"Sequence not found."]
]
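The sliding-window search above can also be written more compactly. The following is only a sketch, not the function from the conversation: it assumes Mathematica 10.1 or later, where the built-in SequencePosition is available, and the name findDigits and its argument order are hypothetical. It takes the target as an explicit List of digits and returns the start and end positions of the first match, or an empty List if there is none.

```mathematica
(* Hypothetical helper: first occurrence of a digit List in the digits of a number *)
ClearAll@findDigits;
findDigits[target : {__Integer}, source_, maxDigits_Integer, base_Integer : 10] :=
 SequencePosition[First@RealDigits[N[source, maxDigits], base], target, 1]

(* Pi to 20 digits starts 3,1,4,1,5,9,... so {1,5,9} occupies positions 4 to 6 *)
findDigits[{1, 5, 9}, Pi, 20]
(* {{4, 6}} *)
```

Passing the target as a List of digits rather than a number also keeps any leading zeros in the target.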

In[151]:= Timing@findSequence[3.14159, E, 10000000]

During evaluation of In[151]:= {3,1,4,1,5,9}

Out[151]= {9.406860, StringForm[
 "Sequence starts at position ``.", 1436936]}

I thought Contact was a fun read and even better movie, but IMHO The Demon-Haunted World should be required reading for every scientist.

Tuesday, December 30, 2014

String Replacement Methods: StringTemplate

A String Replacement Overview is here.

StringTemplate

StringTemplate saves you the trouble of searching for a String subset within a String to replace or setting up your own marker to flag the StringPosition in the String at which to perform a replacement.

Further, good programming practice dictates that we use selectors and constructors – specialized, dedicated functions to extract a subset of a file or to change a subset of a file – and to always use those rather than ad hoc one liners scattered in our functions and programs [1,2]. StringTemplate conveniently formalizes and enforces the use of selectors and constructors.

StringForm is simpler to understand and use than StringTemplate, so I use StringForm when I need to output a message from a function. In the example below I don't end the command with a semicolon, so you can see the InputForm of the resulting TemplateObject, including its default Options.

stringTemplate1=StringTemplate@"The quick brown `` jumped over the lazy white ``."

TemplateObject[{"The quick brown ",TemplateSlot[1]," jumped over the lazy white ",TemplateSlot[2],"."},InsertionFunction->TextString,CombinerFunction->StringJoin]

You can directly Apply any StringTemplate as a function to a List of its arguments that fits its requirements, or use TemplateApply to do the same thing.

stringTemplate1@@{"mink","peccadillo"}

The quick brown mink jumped over the lazy white peccadillo.

Equivalently, here StringTemplate is used as a function as you would any other function – use it as the Head of an Expression with its arguments.

stringTemplate1["mink","peccadillo"]

The quick brown mink jumped over the lazy white peccadillo.

Equivalently, using TemplateApply:

TemplateApply[stringTemplate1,{"mink","peccadillo"}]

The quick brown mink jumped over the lazy white peccadillo.
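The slots above are positional. As a small sketch of a variation, StringTemplate also accepts named slots written between single backticks and filled from an Association; the names namedTemplate, animal1, and animal2 here are made up for illustration.

```mathematica
(* Named slots are filled by key, so the order of the arguments no longer matters *)
namedTemplate = StringTemplate["The quick brown `animal1` jumped over the lazy white `animal2`."];

namedTemplate[<|"animal2" -> "peccadillo", "animal1" -> "mink"|>]
(* The quick brown mink jumped over the lazy white peccadillo. *)
```

Named slots read much like the hand-made markers used with StringReplace, but without having to invent a marker syntax of your own.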

1. Maeder, Roman, Computer Science with Mathematica. Cambridge: Cambridge University Press, 2000. Chapter 5.3. Design of Abstract Data Types.

2. Maeder, Roman, M220: Programming in Mathematica  (course given by Wolfram Education Group, which I have taken twice and recommend).

String Replacement Methods: Overview

Here are String replacement methods that I have used in code from one-liners up to programs producing hundreds of thousands of text and html files. In general, use the simplest method or one that you understand clearly. Use StringForm to output messages from your functions and programs. For longer functions or programs, StringTemplate is the new best practice.

There is a function I don't discuss, StringInsert, which inserts a substring at a given StringPosition in a control String. I don't advocate its use since it's very brittle in that if you add or delete even one character before the StringPosition then the insertion point will be wrong.

StringForm

Literal Replacement, Markers, and Delimiters

String Replacement Methods: Literal Replacement, Markers, and Delimiters

A String Replacement Overview is here.

Note that the next three methods all use StringReplace. This is in keeping with my principle that the fastest way to learn Mathematica is to become a power user of its 70 or so core functions. In String processing, for instance, StringInsert is not a function you need to know. Instead learn to use the more powerful and robust function, StringReplace.

Literal Replacement

Literal replacement works by using StringReplace to find a literal substring within a String and substitute another substring for it. Literal replacement is very simple and easy to use.

string1="The quick brown fox jumped over the lazy white dog.";

StringReplace[string1,{"fox"->"mink","dog"->"peccadillo"}]

The quick brown mink jumped over the lazy white peccadillo.

Markers

Using markers to indicate the replacement position can improve code legibility. Use StringReplace to replace just the marked text.

string2="The quick brown <animal1> jumped over the lazy white <animal2>.";

StringReplace[string2,{"<animal1>"->"mink","<animal2>"->"peccadillo"}]

The quick brown mink jumped over the lazy white peccadillo.

Delimiters

Use StringReplace to replace text between the delimiters. This is very useful when you want to replace a lot of text in a document, especially in a long document. However, the new function StringTemplate is a superior method overall.

sitemapTemplate="<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">
<!-- put list of urls here with a line feed after each one -->
</urlset>";

urls="<url><loc>http://www.blah.net/page1.html</loc></url>
<url><loc>http://www.blah.net/page2.html</loc></url>";

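Note that you use StringExpression (shorthand "~~") to concatenate quoted Strings with patterns such as Blanks in the String to be found by StringReplace, but you must use StringJoin (shorthand "<>") if you concatenate different Strings in the replacement String. In the example below, the named pattern urlsList__ matches whatever text lies between the two quoted delimiters.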

sitemapTemplateWithURLs=StringReplace[sitemapTemplate,"<!-- put list"~~urlsList__~~"each one -->"->urls]

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>http://www.blah.net/page1.html</loc></url>
<url><loc>http://www.blah.net/page2.html</loc></url>
</urlset>

String Replacement Methods: StringForm

A String Replacement Overview is here.

StringForm

StringForm is a simple, elegant String template function. Use it in your functions where Print alone isn't enough because you need to fill in variables, such as values calculated on the fly. Inside a function, wrap it in Print as follows, since lines there end with an output-suppressing semicolon.

Print@StringForm["control string", variables];

Here the double backtick marks tell Mathematica where to fill in the blanks with arguments you give in the order in which they are inserted into the String. An argument can be a String or an Expression of unlimited complexity, which will be evaluated before insertion. If you don't want the inserted Expression to be evaluated, though, use HoldForm (example below).

StringForm["Use `` for relatively short and simple String templates, such as output messages in your functions. For example, the cube root of `` is ``.","StringForm",27,27^(1/3)]

Use StringForm for relatively short and simple String templates, such as output messages in your functions. For example, the cube root of 27 is 3.
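To tie this back to the Print form shown earlier, here is a minimal sketch of a function that reports a value computed on the fly. The function name reportMean is hypothetical and only there for illustration.

```mathematica
(* Hypothetical example: print a message with computed values, then return the result *)
ClearAll@reportMean;
reportMean[data_List] :=
 Module[{mean = Mean@data},
  Print@StringForm["The mean of the `` values is ``.", Length@data, mean];
  mean]

reportMean[{1, 2, 3, 4}]
(* prints: The mean of the 4 values is 5/2. *)
```

Because the Print line ends with a semicolon, the message appears as a side effect and the function still returns the mean as its value.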

If you're going to use an argument twice, switch the order, or use a number of arguments and you want to prevent mistakes, use numbered, rather than ordered, backticks. You often want a line break, for which the \n escape character is used within the quotation marks that are Mathematica's String delimiters.

StringForm["Flying or gliding mammals include `1`, `2`,\n`3`, `4`, `5`, `6`, and `7`.\nThe most common species are in the `3` family.","flying possums","greater glider","bats","flying squirrels","flying lemurs","flying monkeys","cats"]

Flying or gliding mammals include flying possums, greater glider,
bats, flying squirrels, flying lemurs, flying monkeys, and cats.
The most common species are in the bats family.

To prevent the inserted Expression from being evaluated, use HoldForm:

StringForm["For example, the sixth term of the Fibonacci sequence is the sum of the preceding two terms: ``.",HoldForm[3+5=8]]


For example, the sixth term of the Fibonacci sequence is the sum of the preceding two terms: 3+5=8.

Tuesday, December 23, 2014

The Easiest Way to Clear Memory

While Mathematica is designed to manage memory for you, under certain circumstances it can get bogged down, mainly because it keeps a record of all your inputs and outputs with In and Out. So if you're using functions that produce a lot of output, or working with large files, you may notice Mathematica slowing down.

There are a number of ways that you can manage memory in Mathematica. I have tried them. You can selectively clear Out cells. You can constrain the memory used by functions with MemoryConstrained. But by far the easiest method to clear memory is to use Quit[] in your Notebook or Quit Kernel → Local under the Evaluation menu.

Beginners hesitate to Quit the kernel, but there are few adverse consequences. Even if you haven't saved your Notebooks, they will not be affected and you can save them. There will be no blue screen and no one will show up at your door and work you over with a pair of pliers and a blowtorch. You will just completely flush the memory and be able to start fresh.

To facilitate re-starting your work after quitting the kernel, or in general when you open a Notebook to resume, use Initialization Cells. You can set Initialization in the menu under Cell → Cell Properties or by right-clicking on the cell and selecting Initialization Cell. A little downward tick mark appears in the upper right corner of the cell.

Then when you re-start the kernel by evaluating any cell, by selecting Evaluation → Evaluate Initialization Cells, or by re-opening the Notebook, all the Initialization cells are automatically re-evaluated. In this way you lose very little time by quitting the kernel and re-starting.

Memory-Management Commands to Use Frequently


Clear[] should be used for good housekeeping to make sure there is no pre-existing value for a symbol during the current session, but it can also clear the memory consumed by a symbol, such as one storing a large calculation or file.

Clear@spikeFile;

To remove the symbol entirely including memory it used, use Remove, which is also good housekeeping when you are done with a symbol. That said, to clear out all symbols (you guessed it) just Quit[].

Remove@spikeFile;

Memory-Management Commands to Use Occasionally


Memory currently used by the kernel:

In[157]:= MemoryInUse[]

Out[157]= 135450976

Memory currently used by the front end (all of your open Notebooks):

In[158]:= MemoryInUse@$FrontEnd

Out[158]= 543264768

The maximum memory used by the kernel during your current Mathematica session:

In[159]:= MaxMemoryUsed[]

Out[159]= 137155304

Clear a cell that consumed lots of memory in your session:

Unprotect[Out]; Out[537] =.;
Protect@Out;
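The commands above can be combined into a quick before-and-after check. This is only a sketch: bigList is a hypothetical stand-in for a large result, the figures you get will vary from session to session, and $HistoryLength is a built-in setting, not mentioned above, that limits how many In and Out entries the kernel keeps.

```mathematica
$HistoryLength = 10;            (* keep only the 10 most recent In/Out entries *)

bigList = Range[10^6];          (* hypothetical large value; the semicolon keeps it out of Out *)
ByteCount[bigList]              (* bytes needed to store the value itself *)

memoryBefore = MemoryInUse[];
Clear[bigList];                 (* release the value held by the symbol *)
memoryBefore - MemoryInUse[]    (* roughly the bytes released; the exact figure varies *)
```

If the difference is much smaller than the ByteCount, something else, typically an Out entry, is probably still holding a reference to the data.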

Easy Ways to Create Variations of Function


You will often need to create more than one version of a function. Here are two different methods: a piecewise definition and an Option. Either is fine, but for conciseness I generally use the piecewise approach when the function is short (say, fewer than a dozen lines) and an Option when the function is longer.

Here is the dataset for the examples. To see the 6 functions used to create arrays in Mathematica, see Ways to Create Arrays.

dataset=Array[List,{5,3}]

{{{1,1},{1,2},{1,3}},{{2,1},{2,2},{2,3}},{{3,1},{3,2},{3,3}},{{4,1},{4,2},{4,3}},{{5,1},{5,2},{5,3}}}

Here's the first version. It takes a List of data as a first argument and an Integer as its second argument. This simple example uses the index to pick out a Part from the data.

function1[data_List,index_Integer]:=data[[index]]

function1[dataset,4]

{{4,1},{4,2},{4,3}}

The Piecewise Approach

Now we need a second version to take a List of indices in the second argument. We can simply use datatyping by specifying the Head of the second argument. This is one way to do the "piecewise" method. The domain of possible inputs is split into pieces (sub-domains) and each variation of a function is designed to pick out the correct piece (sub-domain) for which it is designed.

function2[data_List,index_List]:=data[[index]]

function2[dataset,{2,4}]

{{{2,1},{2,2},{2,3}},{{4,1},{4,2},{4,3}}}
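The two versions above have different names. As a sketch of the same idea under a single name (pickRows is hypothetical), both definitions can be attached to one symbol, and Mathematica picks the one whose pattern matches the second argument.

```mathematica
dataset = Array[List, {5, 3}];

ClearAll@pickRows;
(* One symbol, two definitions: the Head of the second argument selects the piece *)
pickRows[data_List, index_Integer] := data[[index]]
pickRows[data_List, indices : {__Integer}] := data[[indices]]

pickRows[dataset, 4]
(* {{4,1},{4,2},{4,3}} *)

pickRows[dataset, {2, 4}]
(* {{{2,1},{2,2},{2,3}},{{4,1},{4,2},{4,3}}} *)
```

The caller then never has to remember which variant to use.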

The Options Approach

You can see that if there were a minor change in a long function, copying the function is not as concise as the following approach using Options. There is more overhead in this example to create the Option and handle it in the function, but in a long function that overhead is less than copying as in the Piecewise approach.

Note that for concise creation of Piecewise mathematical functions, the built-in function Piecewise should be used. For an explanation of how I use Options, see A Template for Optional Arguments [to be published].

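ClearAll@function3;
Options@function3={"indexOption"->"Integer"};

function3[data_List,index_,options___?OptionQ]:=
Module[{indexOption="indexOption"/.{options}/.Options@function3},

(*Which returns the value of the matching branch, so it must not end with a semicolon. Use the data argument, not the global dataset.*)
Which[
indexOption=="Integer",data[[index]],
(*Flatten lets index be a single Integer or a List of Integers*)
indexOption=="List",data[[Flatten@{index}]]]

(*endModule*)]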
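For comparison, here is a sketch of the same idea using the built-in OptionsPattern and OptionValue instead of a replacement rule on the option List. The function name pickPart and the option name "IndexType" are hypothetical; this is one possible arrangement, not the template referred to above.

```mathematica
dataset = Array[List, {5, 3}];

ClearAll@pickPart;
Options[pickPart] = {"IndexType" -> "Integer"};

(* OptionValue looks up the option given in the call, falling back to the default above *)
pickPart[data_List, index_, OptionsPattern[]] :=
 Switch[OptionValue["IndexType"],
  "Integer", data[[index]],
  "List", data[[Flatten@{index}]],
  _, $Failed]

pickPart[dataset, 4]
(* {{4,1},{4,2},{4,3}} *)

pickPart[dataset, {2, 4}, "IndexType" -> "List"]
(* {{{2,1},{2,2},{2,3}},{{4,1},{4,2},{4,3}}} *)
```

Switch returns the value of the matching branch, so the function returns the selected rows directly, and an unrecognized option value gives $Failed rather than silently returning nothing.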

Here is the function that prompted me to write this primer. It takes the membrane voltage time series from simulated spiking neuron cells and counts how many spikes occur, with cell range and time series range as arguments. I needed to expand it so I could specify counting spikes in several time series ranges. I didn't want to copy and vary it with the piecewise approach, so I used the options approach. The changes to the original function that implement the new behavior (italicized in the original post) are the parts involving the "MultipleTimeSeries" option.

ClearAll@batchMeanFiringRateTable;
Options@batchMeanFiringRateTable={"Export"->Off,"MultipleBatch"->Off,"MultipleTimeSeries"->False};

batchMeanFiringRateTable[spikeFileDirectory_String,cellRange_List,timeSeriesRange_List,leftColumnHeading_String:"Neural Group",rightColumnHeading_String:"Average Firing Rate",options___?OptionQ]:=
Module[{exportFileName,firingData,
spikeFiles=getSpikeFiles@spikeFileDirectory,

exportOption="Export"/.{options}/.Options@batchMeanFiringRateTable,
multipleBatchOption="MultipleBatch"/.{options}/.Options@batchMeanFiringRateTable,
multipleTimeSeriesOption="MultipleTimeSeries"/.{options}/.Options@batchMeanFiringRateTable},

(*With "Batch"\[Rule]On, meanFiringRateFromSpikeFile will just return the averageSpikeRate each time it's called*)

(*Need FileBaseName to identify cell group in left column if TableForm; need First@timeSeriesRange to remove the outer List for a single time series range*)

If[multipleTimeSeriesOption==False,firingData=Table[{FileBaseName@batchFile,meanFiringRateFromSpikeFile[batchFile,cellRange,First@timeSeriesRange,"Batch"->On]},{batchFile,spikeFiles}]  (*endIf*)];

If [multipleTimeSeriesOption==True,firingData=Table[{FileBaseName@batchFile,meanFiringRateFromSpikeFile[batchFile,cellRange,timeSeries,"Batch"->On]},{batchFile,spikeFiles},{timeSeries,timeSeriesRange}]/.{{x_,y_},{x_,z_}}->{x,y,z};(*Table will take each spike file and iterate over timeSeriesRange. Don't need First@timeSeriesRange here. *) (*endIf*)];

If[multipleBatchOption==Off,Print@"Multiple batch is off.";Print[firingData//TableForm[#,TableAlignments->Left,TableHeadings->{None,{leftColumnHeading,"Ave AP Rate: "<>ToString@timeSeriesRange}}]&],Return@firingData (*endIf*)];

If[exportOption==On&&multipleBatchOption==Off,exportFileName=FileNameJoin@{DirectoryName@spikeFiles[[1]],ToString@FileNameTake[DirectoryName@spikeFiles[[1]],-1]<>"-BatchFileTable.xlsx"};
Export[exportFileName,firingData];Print@StringForm["File exported to ``.",exportFileName]];
]


batchMeanFiringRateTable["C:\\Users\\Public\\Documents\\UNCuS16_09_2013\\DataFiles\\WDR-Abeta-14-12only\\WDR-Abeta-14-12only-CS8-9-20-RS0p28\\",{1,80},{{1,45},{46,85}},"Neural Group","Method"->"OverSpikingCells","MultipleBatch"->Off,"Export"->Off,"MultipleTimeSeries"->True]