Jekyll2018-06-19T14:59:42+00:00http://blog.element84.com/Element 84 BlogRaspberry Pi Office Art2017-10-11T07:00:00+00:002017-10-11T07:00:00+00:00http://blog.element84.com/Raspberry-Pi-Office-Art<p>As a fun project and a way to get familiar with the popular Raspberry Pi platform, our team decided to build a real life display that can hang on the wall and be used to display various team metrics such as pageload speeds, burndown progress, achievement counts, etc. Below you will find a tutorial on how to make your own display as well as a link to all the code used to run the project.</p>
<p><img src="img/arrow_front.jpg" alt="Art" /></p>
<p>We chose this project because microcontrollers can be a fun way to break out of the day-to-day programming patterns in web or systems projects. The real strength of working with microcontrollers is that the output can take the form of physical objects instead of pixels on a screen. This opens up quite a different world of possibilities and allows you to make very large displays without the cost of a large LCD screen. Our hope is that this project may get people interested enough in microcontrollers to make their own supersized display.</p>
<h3 id="parts-list">Parts List:</h3>
<p>We used a Raspberry Pi 3 for this project.
We also used a high torque servo that can be found here:</p>
<p><a href="https://www.adafruit.com/product/1142">https://www.adafruit.com/product/1142</a></p>
<p>You will need two power supplies: one for the Pi and one for the servo. For the servo, I recommend a 5 V supply rated above 1.5 A.</p>
<p><a href="https://www.adafruit.com/product/1995">https://www.adafruit.com/product/1995</a></p>
<p>Additional male/female header wires can be found on Adafruit.</p>
<h3 id="overview">Overview:</h3>
<p>The general architecture of the display is a Raspberry Pi receiving values via web requests and then translating those values into signals for a motor which controls the display. The Pi runs 2 programs in order to accomplish this task. One is a Node.js program to receive the values and store them in local memory. The other is a Python program which calibrates the motor, grabs the value from the Node.js server, then turns the arrow on the display.</p>
<h3 id="software-challenges-faced">Software Challenges Faced:</h3>
<p>The main software challenges faced in the project were learning how to run a Node.js Server off of a Raspberry Pi, and communicating with the servo motor without a dedicated microcontroller shield.</p>
<h3 id="setting-up-nodejs-server">Setting up Node.js Server:</h3>
<p>Luckily, installing and starting a Node.js server on a Raspberry Pi is a pretty straightforward process. Our Raspberry Pi 3 came with Node.js preinstalled. If Node.js is not found on your system, you can install it with the following command.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="err">$</span> <span class="n">apt</span><span class="o">-</span><span class="n">get</span> <span class="n">install</span> <span class="o">-</span><span class="n">y</span> <span class="n">nodejs</span></code></pre></figure>
<p>In our file ‘set_value_server.js’ you can see the code we use to store and retrieve the display value via web requests. After setting up a basic Node.js server, we include the set_value function, which parses each incoming web request to look for the ‘position’ parameter. If this parameter is found and holds a number from 0 to 100, our global value is updated. Every request then receives the current value in its response.</p>
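<p>For readers more comfortable with Python than JavaScript, the validation step can be sketched roughly as follows. This is only an illustration of the same idea in Python 3, not the code the display runs; the function name <code class="highlighter-rouge">parse_position</code> is our own invention for this sketch.</p>

```python
from urllib.parse import urlparse, parse_qs


def parse_position(path):
    """Return the 'position' query value as a float, or None if it is
    missing, not a number, or outside the 0-100 range."""
    query = parse_qs(urlparse(path).query)
    raw = query.get("position", [""])[0]
    try:
        value = float(raw)
    except ValueError:
        return None
    # Chained comparison is False for NaN, so NaN is rejected as well.
    if not 0.0 <= value <= 100.0:
        return None
    return value


if __name__ == "__main__":
    for path in ["/?position=42", "/?position=150", "/?position=", "/"]:
        print(path, "->", parse_position(path))
```

<p>A server built around this helper would keep the last non-<code class="highlighter-rouge">None</code> result and return it to every caller, which is what the global value does in the Node.js version.</p>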
<h3 id="controlling-the-motor-with-python">Controlling the Motor with Python:</h3>
<p>I should start by saying that our understanding of how the Pi communicates with the motor was based heavily on this tutorial from Adafruit. <a href="https://learn.adafruit.com/adafruits-raspberry-pi-lesson-8-using-a-servo-motor/software">https://learn.adafruit.com/adafruits-raspberry-pi-lesson-8-using-a-servo-motor/software</a></p>
<p>In short, the open source WiringPi library offers access to the Pi’s pins to control electrical reads and writes. By using the Pi’s internal pulse width modulation capabilities, we can output a repeating electrical signal with pulses of varying width, which the motor then translates to specific positions. Installing the WiringPi library is a bit tricky on the Pi. The following commands must be run in this order:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="err">$</span> <span class="n">apt</span><span class="o">-</span><span class="n">get</span> <span class="n">install</span> <span class="o">-</span><span class="n">y</span> <span class="n">python</span><span class="o">-</span><span class="n">dev</span>
<span class="err">$</span> <span class="n">apt</span><span class="o">-</span><span class="n">get</span> <span class="n">install</span> <span class="o">-</span><span class="n">y</span> <span class="n">python</span><span class="o">-</span><span class="n">setuptools</span>
<span class="err">$</span> <span class="n">pip</span> <span class="n">install</span> <span class="n">wiringpi</span></code></pre></figure>
<p>The wiring table found here is also very helpful for using WiringPi. <a href="https://projects.drogon.net/raspberry-pi/wiringpi/pins/">https://projects.drogon.net/raspberry-pi/wiringpi/pins/</a></p>
<p>In our ‘motor_control.py’ file you can see we start off with some boilerplate code to configure the Pi to communicate with our servo motor.</p>
<p>In these lines of code, we are directly setting the range for our motor’s electrical signals.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="c1">#configs for the servo range</span>
<span class="n">motorTop</span> <span class="o">=</span> <span class="mf">242.0</span>
<span class="n">motorBottom</span> <span class="o">=</span> <span class="mf">53.0</span>
<span class="n">motorDif</span> <span class="o">=</span> <span class="n">motorTop</span> <span class="o">-</span> <span class="n">motorBottom</span></code></pre></figure>
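<p>To get a feel for what these numbers mean physically, here is a small Python 3 sketch of the timing arithmetic. It combines the values above with the <code class="highlighter-rouge">pwmSetClock(192)</code> and <code class="highlighter-rouge">pwmSetRange(2000)</code> calls from the full listing at the end of this post. The 19.2 MHz base clock is an assumption on our part (it is the figure commonly quoted for the Pi’s PWM hardware), so treat the results as approximate.</p>

```python
# Assumption: the Pi's PWM hardware is driven from a 19.2 MHz base clock.
PWM_BASE_CLOCK_HZ = 19_200_000
CLOCK_DIVISOR = 192   # value passed to wiringpi.pwmSetClock
PWM_RANGE = 2000      # value passed to wiringpi.pwmSetRange


def tick_seconds():
    """Duration of one PWM count after the clock is divided down."""
    return CLOCK_DIVISOR / PWM_BASE_CLOCK_HZ


def frame_hz():
    """How many times per second the pulse repeats."""
    return PWM_BASE_CLOCK_HZ / CLOCK_DIVISOR / PWM_RANGE


def pulse_ms(counts):
    """Pulse width in milliseconds for a given pwmWrite value."""
    return counts * tick_seconds() * 1000


if __name__ == "__main__":
    print("frame rate: {:.0f} Hz".format(frame_hz()))
    print("motorBottom: {:.2f} ms".format(pulse_ms(53.0)))
    print("motorTop: {:.2f} ms".format(pulse_ms(242.0)))
```

<p>Under that assumption each count is 10 microseconds, the signal repeats 50 times per second, and our range of 53 to 242 corresponds to pulses of roughly 0.53 ms to 2.42 ms.</p>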
<p>Every servo motor is slightly different, so the range of pulse values that moves the motor between its maximum left and right turning angles will vary from motor to motor. If you change these values and then run the Python program, you can see whether the range is correct for your motor. The first thing the program does is tell the motor to move to the maximum and minimum values so that the motor can be calibrated. That code is below.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="c"># to calibrate top amount on start</span>
<span class="n">wiringpi</span><span class="o">.</span><span class="n">pwmWrite</span><span class="p">(</span><span class="mi">18</span><span class="p">,</span> <span class="nb">int</span><span class="p">(</span><span class="n">motorTop</span><span class="p">))</span>
<span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="c"># to calibrate bottom amount on start</span>
<span class="n">wiringpi</span><span class="o">.</span><span class="n">pwmWrite</span><span class="p">(</span><span class="mi">18</span><span class="p">,</span> <span class="nb">int</span><span class="p">(</span><span class="n">motorBottom</span><span class="p">))</span>
<span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span></code></pre></figure>
<p>Finally at the bottom of the program code, we can see that an endless loop is set up to check our set_value server and then move the motor with the <code class="highlighter-rouge">wiringpi.pwmWrite(18, int(pulse_amount))</code> command.</p>
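<p>The translation from a display value to a pulse amount is the heart of that loop, so here it is pulled out into a standalone Python 3 function. This is a sketch for experimenting without a motor attached; the function name <code class="highlighter-rouge">value_to_pulse</code> does not appear in our program, which does the same arithmetic inline.</p>

```python
MOTOR_TOP = 242.0     # same as motorTop above
MOTOR_BOTTOM = 53.0   # same as motorBottom above


def value_to_pulse(set_value):
    """Map a display value (0-100) onto the servo's pulse range.

    A value of 0 gives the top of the range and 100 gives the bottom,
    matching the direction our arrow turns. Negative values fall back
    to the bottom, and the result is always clamped to the motor's range.
    """
    if set_value < 0.0:
        return int(MOTOR_BOTTOM)
    span = MOTOR_TOP - MOTOR_BOTTOM
    pulse = int((100 - set_value) * span / 100.0 + MOTOR_BOTTOM)
    return max(int(MOTOR_BOTTOM), min(int(MOTOR_TOP), pulse))


if __name__ == "__main__":
    for value in [0, 50, 100, 150, -1]:
        print(value, "->", value_to_pulse(value))
```

<p>If your arrow moves in the opposite direction to what you expect, swapping <code class="highlighter-rouge">(100 - set_value)</code> for <code class="highlighter-rouge">set_value</code> is one way to reverse it.</p>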
<h3 id="setting-up-hardware">Setting up Hardware:</h3>
<p>The circuit needed for this hardware is actually very simple since there is only one outside component.
The diagram below was made with the excellent open source program Fritzing.</p>
<p><img src="img/robot_bb.jpg" alt="Circuit Diagram" /></p>
<p>The power and ground from the servo are connected to the 5v plug’s power and ground. For the power supply to the motor, you will need to strip off the USB connector and connect the wires directly to the motor. Be very careful that the power and ground wires never touch, as this will cause a short and sparking.</p>
<p>The servo’s control wire is connected to the Raspberry Pi’s pin 12. Additionally the common ground wire needs to be connected from the Pi’s pin 6 to the plug’s ground. The circuit will not work without the common ground connection.</p>
<h3 id="construction">Construction:</h3>
<p><img src="img/arrow_back.jpg" alt="Art" />
<img src="img/pi_front.jpg" alt="Art" /></p>
<h3 id="using-the-display">Using the Display:</h3>
<p>You can get all of our code onto the Raspberry Pi by cloning our GitHub repository from your Pi.
All the code is also available at the bottom of this blog post.
Starting the programs once installed onto the Pi is very simple. Open a terminal and navigate to the folder containing the programs. Then run the command:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="err">$</span> <span class="n">sudo</span> <span class="n">bash</span> <span class="n">startup_script</span><span class="p">.</span><span class="nf">sh</span></code></pre></figure>
<p>This is a very short script which starts the Node.js and Python programs together.</p>
<p>After both programs have been started, you should see your motor attempt to calibrate all the way left and right. Once this has finished the display is ready to receive inputs.</p>
<p>To communicate with the Pi you will need to identify its local IP address. Here is a short tutorial for that, <a href="https://learn.adafruit.com/adafruits-raspberry-pi-lesson-3-network-setup/finding-your-pis-ip-address">https://learn.adafruit.com/adafruits-raspberry-pi-lesson-3-network-setup/finding-your-pis-ip-address</a>.</p>
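<p>Alternatively, if you know the Pi is online, you can run the following command from any computer on your network without SSHing into the Pi. It filters the ARP table for the b8:27:eb prefix that the Pi’s MAC address starts with.</p>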
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="err">$</span> <span class="n">arp</span> <span class="o">-</span><span class="n">na</span> <span class="o">|</span> <span class="n">grep</span> <span class="o">-</span><span class="n">i</span> <span class="n">b8</span><span class="p">:</span><span class="mi">27</span><span class="ss">:eb</span></code></pre></figure>
<p>With the IP address in hand, you can now start moving the arrow on your display. On any computer connected to the same network as the Pi, open a web browser and navigate to the URL</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">http</span><span class="ss">:/</span><span class="o">/</span><span class="p">[</span><span class="no">IP_ADDRESS</span><span class="p">]:</span><span class="mi">3030</span><span class="p">?</span><span class="n">position</span><span class="o">=</span><span class="p">[</span><span class="no">VALUE</span><span class="p">]</span></code></pre></figure>
<p>where IP_ADDRESS is the ip address of your Pi and VALUE is a value from 0-100.</p>
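<p>A browser is fine for testing, but the point of the display is to let other programs drive it. As a rough Python 3 sketch of what such a client could look like: the helper names and the example address <code class="highlighter-rouge">192.168.1.50</code> are hypothetical, and only the port and the <code class="highlighter-rouge">position</code> parameter come from our server.</p>

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def position_url(host, value, port=3030):
    """Build the URL that asks the display to move to `value` (0-100)."""
    if not 0 <= value <= 100:
        raise ValueError("value must be between 0 and 100")
    return "http://{}:{}/?{}".format(host, port, urlencode({"position": value}))


def send_position(host, value, port=3030, timeout=5):
    """Send the value to the Pi and return the value the server reports back."""
    with urlopen(position_url(host, value, port), timeout=timeout) as response:
        return float(response.read().decode("utf-8"))


if __name__ == "__main__":
    # 192.168.1.50 is a placeholder; substitute your own Pi's address.
    print(position_url("192.168.1.50", 75))
```

<p>Calling <code class="highlighter-rouge">send_position</code> from a scheduled job that computes your metric is probably the simplest way to keep the arrow up to date.</p>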
<h3 id="conclusion">Conclusion</h3>
<p>This project’s scope changed several times during the course of its development, much like all side projects. The biggest change was to configure the ‘set_value_server.js’ program to accept any value from 0 to 100, regardless of its source. We added this feature so that any program can be hooked into the display as long as it outputs a number from 0 to 100. It is our hope that the team here at Element 84 can get creative with what metrics are displayed on a day-to-day basis. (We’ve already had some highly technical suggestions like “counting cups of coffee” and “days until the weekend”)</p>
<p>Overall this was a very fun project. When you are immersed in any type of programming it can be refreshing to take a break and see what another style is like. Along the way we learned a lot about the Raspberry Pi ecosystem, and set ourselves up for the next Raspberry Pi project in the future. Hope you enjoyed the tutorial and feel free to reach out if you get stuck while building your own Pi project.</p>
<h3 id="code-motor_controlpy">Code motor_control.py</h3>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="c"># This program controls an attached servo motor through the raspberry pi</span>
<span class="c"># By using the wiringpi library, we can communicate via the GPIO pins of the pi with connected motors</span>
<span class="c"># Much of the code below adapted from:</span>
<span class="c"># https://learn.adafruit.com/adafruits-raspberry-pi-lesson-8-using-a-servo-motor/overview</span>
<span class="c"># Servo Control For Raspberry Pi 3</span>
<span class="kn">import</span> <span class="nn">time</span>
<span class="kn">import</span> <span class="nn">urllib2</span>
<span class="kn">import</span> <span class="nn">wiringpi</span>
<span class="n">set_value_port</span> <span class="o">=</span> <span class="s">'3030'</span>
<span class="c"># use 'GPIO naming'</span>
<span class="c"># setting the wiringpi library to work with Pi's general purpose IO pins.</span>
<span class="n">wiringpi</span><span class="o">.</span><span class="n">wiringPiSetupGpio</span><span class="p">()</span>
<span class="c"># setting GPIO pin #18 to be a PWM output</span>
<span class="c"># GPIO pin #18 translates to pin #12 in regular Pi pin counting convention</span>
<span class="c"># GPIO pin #18 is one of the few pins that support hardware PWM</span>
<span class="c"># PWM is pulse width modulation, meaning it can turn on/off in timed increments very accurately</span>
<span class="c"># These on/off time increments are how we communicate with the motor, almost like morse code </span>
<span class="n">wiringpi</span><span class="o">.</span><span class="n">pinMode</span><span class="p">(</span><span class="mi">18</span><span class="p">,</span> <span class="n">wiringpi</span><span class="o">.</span><span class="n">GPIO</span><span class="o">.</span><span class="n">PWM_OUTPUT</span><span class="p">)</span>
<span class="c"># set the PWM mode to mark-space (MS) mode</span>
<span class="n">wiringpi</span><span class="o">.</span><span class="n">pwmSetMode</span><span class="p">(</span><span class="n">wiringpi</span><span class="o">.</span><span class="n">GPIO</span><span class="o">.</span><span class="n">PWM_MODE_MS</span><span class="p">)</span>
<span class="c"># divide down clock</span>
<span class="c"># These are more wiringpi settings to align the Pi's timer with the motor's timer</span>
<span class="n">wiringpi</span><span class="o">.</span><span class="n">pwmSetClock</span><span class="p">(</span><span class="mi">192</span><span class="p">)</span>
<span class="n">wiringpi</span><span class="o">.</span><span class="n">pwmSetRange</span><span class="p">(</span><span class="mi">2000</span><span class="p">)</span>
<span class="n">delay_period</span> <span class="o">=</span> <span class="mf">0.01</span>
<span class="c">#configs for the servo range</span>
<span class="n">motorTop</span> <span class="o">=</span> <span class="mf">242.0</span>
<span class="n">motorBottom</span> <span class="o">=</span> <span class="mf">53.0</span>
<span class="n">motorDif</span> <span class="o">=</span> <span class="n">motorTop</span> <span class="o">-</span> <span class="n">motorBottom</span>
<span class="c"># to calibrate top amount on start</span>
<span class="n">wiringpi</span><span class="o">.</span><span class="n">pwmWrite</span><span class="p">(</span><span class="mi">18</span><span class="p">,</span> <span class="nb">int</span><span class="p">(</span><span class="n">motorTop</span><span class="p">))</span>
<span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="c"># to calibrate bottom amount on start</span>
<span class="n">wiringpi</span><span class="o">.</span><span class="n">pwmWrite</span><span class="p">(</span><span class="mi">18</span><span class="p">,</span> <span class="nb">int</span><span class="p">(</span><span class="n">motorBottom</span><span class="p">))</span>
<span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
    <span class="n">set_value</span> <span class="o">=</span> <span class="nb">float</span><span class="p">(</span><span class="n">urllib2</span><span class="o">.</span><span class="n">urlopen</span><span class="p">(</span><span class="s">"http://localhost:"</span> <span class="o">+</span> <span class="n">set_value_port</span><span class="p">)</span><span class="o">.</span><span class="n">read</span><span class="p">())</span>
    <span class="k">if</span> <span class="n">set_value</span> <span class="o">>=</span> <span class="mf">0.0</span><span class="p">:</span>
        <span class="c">#getting the value from server and translating to motor range.</span>
        <span class="c">#so this takes any number 0 - 100 and translates it to a pulse_amount of 242 - 53</span>
        <span class="n">pulse_amount</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(((</span><span class="mi">100</span> <span class="o">-</span> <span class="n">set_value</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="n">motorTop</span> <span class="o">-</span> <span class="n">motorBottom</span><span class="p">)</span> <span class="o">/</span> <span class="mf">100.0</span><span class="p">)</span> <span class="o">+</span> <span class="n">motorBottom</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="c">#setting default to be lowest value if no set_value found or set_value is negative</span>
        <span class="n">pulse_amount</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">motorBottom</span><span class="p">)</span>
    <span class="c">#limiting the range of values to the motor's range.</span>
    <span class="c">#this makes sure we don't pass the motor a range that it can not reach if set_value > 100</span>
    <span class="k">if</span> <span class="n">pulse_amount</span> <span class="o">></span> <span class="n">motorTop</span><span class="p">:</span>
        <span class="n">pulse_amount</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">motorTop</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">pulse_amount</span> <span class="o"><</span> <span class="n">motorBottom</span><span class="p">:</span>
        <span class="n">pulse_amount</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">motorBottom</span><span class="p">)</span>
    <span class="c">#writing value to motor</span>
    <span class="n">wiringpi</span><span class="o">.</span><span class="n">pwmWrite</span><span class="p">(</span><span class="mi">18</span><span class="p">,</span> <span class="nb">int</span><span class="p">(</span><span class="n">pulse_amount</span><span class="p">))</span>
    <span class="c">#waiting 5 seconds to check the server again and update the arrow position</span>
    <span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span></code></pre></figure>
<h3 id="code-set_value_serverjs">Code set_value_server.js</h3>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="c1">//This program sets up the server for the Pi to listen to incoming values for the speedometer</span>
<span class="c1">//The global variable METER_VALUE is updated whenever a user issues a GET request to</span>
<span class="c1">//the pi's ip with a parameter of "position"</span>
<span class="c1">//Then the server returns the value of METER_VALUE for all GET requests to the pi's ip address</span>
<span class="c1">//command to set meter value between 0 - 100</span>
<span class="c1">//http://localhost:3030/?position=100</span>
<span class="kd">var</span> <span class="nx">METER_VALUE</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span>
<span class="c1">//SERVER SET UP</span>
<span class="kd">const</span> <span class="nx">http</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="s1">'http'</span><span class="p">);</span>
<span class="kd">var</span> <span class="nx">url</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="s1">'url'</span><span class="p">);</span>
<span class="kd">const</span> <span class="nx">port</span> <span class="o">=</span> <span class="mi">3030</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">requestHandler</span> <span class="o">=</span> <span class="p">(</span><span class="nx">req</span><span class="p">,</span> <span class="nx">res</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
  <span class="nx">set_value</span><span class="p">(</span><span class="nx">req</span><span class="p">.</span><span class="nx">url</span><span class="p">)</span>
  <span class="c1">//Send the html message</span>
  <span class="nx">res</span><span class="p">.</span><span class="nx">end</span><span class="p">(</span><span class="nx">METER_VALUE</span><span class="p">.</span><span class="nx">toString</span><span class="p">());</span>
  <span class="k">return</span>
<span class="p">};</span>
<span class="kd">const</span> <span class="nx">server</span> <span class="o">=</span> <span class="nx">http</span><span class="p">.</span><span class="nx">createServer</span><span class="p">(</span><span class="nx">requestHandler</span><span class="p">);</span>
<span class="nx">server</span><span class="p">.</span><span class="nx">listen</span><span class="p">(</span><span class="nx">port</span><span class="p">,</span> <span class="p">(</span><span class="nx">err</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
  <span class="k">if</span><span class="p">(</span><span class="nx">err</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s1">'something bad happened'</span><span class="p">,</span> <span class="nx">err</span><span class="p">);</span>
  <span class="p">}</span>
  <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`server is listening on </span><span class="p">${</span><span class="nx">port</span><span class="p">}</span><span class="s2">`</span><span class="p">);</span>
<span class="p">});</span>
<span class="c1">//parsing the url requests and obtaining/setting the position value</span>
<span class="kd">function</span> <span class="nx">set_value</span><span class="p">(</span><span class="nx">path</span><span class="p">)</span> <span class="p">{</span>
  <span class="kd">var</span> <span class="nx">queryData</span> <span class="o">=</span> <span class="nx">url</span><span class="p">.</span><span class="nx">parse</span><span class="p">(</span><span class="nx">path</span><span class="p">,</span> <span class="kc">true</span><span class="p">).</span><span class="nx">query</span><span class="p">;</span>
  <span class="c1">//declared with var so that value stays local to this function</span>
  <span class="kd">var</span> <span class="nx">value</span> <span class="o">=</span> <span class="nx">queryData</span><span class="p">.</span><span class="nx">position</span><span class="p">;</span>
  <span class="k">if</span> <span class="p">((</span><span class="nx">value</span> <span class="o">==</span> <span class="kc">undefined</span><span class="p">)</span> <span class="o">||</span> <span class="p">(</span><span class="nx">value</span> <span class="o">==</span> <span class="s2">""</span><span class="p">))</span> <span class="p">{</span>
    <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="nx">value</span> <span class="o">=</span> <span class="nb">parseFloat</span><span class="p">(</span><span class="nx">value</span><span class="p">);</span>
  <span class="k">if</span> <span class="p">((</span><span class="nx">value</span> <span class="o"><</span> <span class="mf">0.0</span><span class="p">)</span> <span class="o">||</span> <span class="p">(</span><span class="nx">value</span> <span class="o">></span> <span class="mf">100.0</span><span class="p">))</span> <span class="p">{</span>
    <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="nx">METER_VALUE</span> <span class="o">=</span> <span class="nx">value</span>
  <span class="k">return</span> <span class="kc">true</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
<h3 id="code-startup_scriptsh">Code startup_script.sh</h3>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="c1">#!/bin/bash</span>
<span class="n">sudo</span> <span class="n">node</span> <span class="n">lib</span><span class="o">/</span><span class="n">set_value_server</span><span class="p">.</span><span class="nf">js</span> <span class="o">&</span>
<span class="n">sudo</span> <span class="n">python</span> <span class="n">lib</span><span class="o">/</span><span class="n">motor_control</span><span class="p">.</span><span class="nf">py</span> <span class="o">&</span></code></pre></figure>
abaker
JDI Mind Tricks2016-03-31T07:00:00+00:002016-03-31T07:00:00+00:00http://blog.element84.com/debugging-clojure-with-jdi
<p>I have always used a REPL driven approach to Clojure development and this has been very
productive, but at times I have really missed the old school approach of setting
break points and stepping through code, examining variables along the way. While there
are some very capable solutions that get me part of the way there
(<a href="https://atom.io/packages/proto-repl">proto-repl</a>, etc.), I was curious to see if it
was possible to debug Clojure in a more traditional way. I have used
<a href="https://github.com/GeorgeJahad/debug-repl">debug-repl</a>, but I wanted more control.
I learned about <a href="https://github.com/clojure-emacs/cider">CIDER</a>,
but was unwilling to make the switch to Emacs (let’s just leave it at that) so I was
unaware of its debugging capabilities.</p>
<p>In the meantime I had been teaching myself Elixir (see my previous posts) and looking to
improve upon my Elixir REPL package, <a href="https://atom.io/packages/iex">iex</a>, for the Atom
editor. At the same time some minor annoyances with Atom caused me to resume my never-ending
quest for the perfect editor, which eventually led me to
<a href="https://code.visualstudio.com/">Visual Studio Code</a>. I quickly realized that this is a great
platform on which to build a debugger. Unlike Atom or Sublime Text, Visual Studio Code
is designed from the ground up to be an IDE, not just an editor. It has the lightweight
feel of an editor, but the debugging UI is built in with API hooks to make extension to
various languages relatively straightforward.</p>
<p>When <a href="https://cursive-ide.com/">Cursive</a> (the Clojure environment for IntelliJ)
was introduced I realized it <em>was</em> possible
to do traditional style debugging of Clojure code, and, armed
with that knowledge, I was determined to learn how to do this myself.
So I have recently been experimenting with various approaches to debugging Clojure code in
an attempt to build a Clojure debugger for VS Code (more on this in an upcoming post).
I learned a few things along the way that might be useful for
anyone headed down this path, so I’m writing this post to help them out.</p>
<p>I’ll talk about some of the basic concepts and APIs involved first and then I’ll present a simple
project with code to demonstrate what I have learned. The <a href="https://github.com/indiejames/clojure-debug-demo">project is available on GitHub</a>;
feel free to use the code as you see fit.</p>
<h3 id="desired-functionality">Desired Functionality</h3>
<p>Debugging is a complicated topic and there are many approaches that differ from
platform to platform, so we need to define what it is we hope to be able to do.
For my purposes, the minimum capabilities I need are</p>
<ul>
<li>Setting break points to stop a running JVM on a given line of code.</li>
<li>Examining local variables / function arguments (the stack frame) at that point.</li>
<li>Stepping over a line of code after a break point.</li>
<li>Stepping into function calls after a break point.</li>
<li>Resuming code execution after a break point.</li>
</ul>
<h3 id="java-debugging---the-java-debug-architecture-and-the-java-debug-interface">Java Debugging - the Java Debug Architecture and the Java Debug Interface</h3>
<p>It should come as no surprise that Java offers a huge and comprehensive architecture for
debugging. The <a href="http://docs.oracle.com/javase/8/docs/technotes/guides/jpda/">Java Platform Debug Architecture</a>
consists of two interfaces - the Java Virtual Machine Tools Interface (JVM TI) and
the Java Debug Interface (JDI) - as well as one communication protocol - the
Java Debug Wire Protocol (JDWP).
The JVM TI defines services that a VM implementation must provide to support debugging.
The JDI defines an interface for building debuggers. Finally, JDWP defines
the protocol for communication between debuggers and processes being debugged.
We will use Clojure’s Java interoperability to call the JDI from Clojure code to
perform our debugging.</p>
<p>The JDI defines a set of Java interfaces and classes for accessing and controlling another
virtual machine. The fundamental interface is <a href="https://docs.oracle.com/javase/7/docs/jdk/api/jpda/jdi/com/sun/jdi/Mirror.html"><code class="highlighter-rouge">Mirror</code></a>.
Mirrors are proxies used by a debugger to examine and manipulate the entities in
another virtual machine. Arguably the most important descendant of <code class="highlighter-rouge">Mirror</code> is the
<a href="https://docs.oracle.com/javase/7/docs/jdk/api/jpda/jdi/com/sun/jdi/VirtualMachine.html"><code class="highlighter-rouge">VirtualMachine</code></a>
interface. It provides access to the internal state of a
VM being debugged as well as methods to control that state.</p>
<p>You don’t instantiate a <code class="highlighter-rouge">VirtualMachine</code> mirror directly - one is returned for you when
you connect to another VM. You use the <a href="https://docs.oracle.com/javase/7/docs/jdk/api/jpda/jdi/com/sun/jdi/VirtualMachineManager.html#attachingConnectors()"><code class="highlighter-rouge">VirtualMachineManager</code></a>
interface to manage connections to
one or more VMs. The <code class="highlighter-rouge">VirtualMachineManager</code> gives you a list of connectors called
<a href="https://docs.oracle.com/javase/7/docs/jdk/api/jpda/jdi/com/sun/jdi/connect/AttachingConnector.html"><code class="highlighter-rouge">AttachingConnector</code>s</a>
that you can use to create an actual connection.</p>
<p>There are several options for establishing
the connection: the debugger can launch the target VM directly,
the debugger can connect to an existing VM, the target VM can attach to an existing
debugger, or the target VM can launch the debugger on its own. The first two options
are the most common.</p>
<p>In this example we will be debugging code running in one REPL by attaching
to it from a second REPL, as shown in diagram 1.</p>
<p><img src="img/clojure-debug.png" alt="Debugging using two nREPLs" /></p>
<p><strong>Diagram 1 - Debugging code from one nREPL using another nREPL.</strong></p>
<p>We will launch one instance of nREPL in debug mode (running in JVM 1). This is
the REPL in which we will run the demo code in the <code class="highlighter-rouge">debug-demo.core</code>
namespace.
We will launch
another instance of nREPL in normal mode (running in JVM 2) and use the
<code class="highlighter-rouge">debug-demo.debug</code> namespace functions to access and control JVM 1 via the
JDI.</p>
<p>The demo code is modified slightly from the sample code generated
by leiningen when creating a project with the default (library) template.
It consists of a namespace with two simple functions shown below:</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
</pre></td><td class="code"><pre><span class="p">(</span><span class="nf">ns</span><span class="w"> </span><span class="n">debug-demo.core</span><span class="w">
</span><span class="s">"Functions to use for demoing debugging."</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">bar</span><span class="w">
</span><span class="s">"Returns the square of a number."</span><span class="w">
</span><span class="p">[</span><span class="o">^</span><span class="nb">long</span><span class="w"> </span><span class="n">num</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nb">*</span><span class="w"> </span><span class="n">num</span><span class="w"> </span><span class="n">num</span><span class="p">))</span><span class="w">
</span><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">foo</span><span class="w">
</span><span class="s">"I don't do a whole lot."</span><span class="w">
</span><span class="p">[</span><span class="o">^</span><span class="nb">long</span><span class="w"> </span><span class="n">x</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="s">"Hello, World!"</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">y</span><span class="w"> </span><span class="mi">4</span><span class="w">
</span><span class="n">z</span><span class="w"> </span><span class="mi">10</span><span class="w">
</span><span class="n">w</span><span class="w"> </span><span class="p">(</span><span class="nf">bar</span><span class="w"> </span><span class="n">x</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"y = "</span><span class="w"> </span><span class="n">y</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"z = "</span><span class="w"> </span><span class="n">z</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"w = "</span><span class="w"> </span><span class="n">w</span><span class="p">)))</span></pre></td></tr></tbody></table></code></pre></figure>
<p>I set type hints on the arguments to both functions to get around a limitation
in my current implementation of printing local variables. I’ll go into more
detail when we look at that code.</p>
<h3 id="accessing-the-jdi-from-clojure">Accessing the JDI from Clojure</h3>
<p>Thanks to the Java interoperability provided by Clojure, we can access the JDI
as we would any other library.
The boilerplate setup code of connecting to a VM to create a
<code class="highlighter-rouge">VirtualMachine</code> can be captured with the following Clojure function:</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">setup-debugger</span><span class="w">
</span><span class="s">"Initialize the debugger."</span><span class="w">
</span><span class="p">[</span><span class="n">port</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">vm-manager</span><span class="w"> </span><span class="p">(</span><span class="nf">com.sun.jdi.Bootstrap/virtualMachineManager</span><span class="p">)</span><span class="w">
</span><span class="n">attachingConnectors</span><span class="w"> </span><span class="p">(</span><span class="nf">.attachingConnectors</span><span class="w"> </span><span class="n">vm-manager</span><span class="p">)</span><span class="w">
</span><span class="n">connector</span><span class="w"> </span><span class="p">(</span><span class="nb">some</span><span class="w"> </span><span class="p">(</span><span class="k">fn</span><span class="w"> </span><span class="p">[</span><span class="n">ac</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nb">when</span><span class="w"> </span><span class="p">(</span><span class="nb">=</span><span class="w"> </span><span class="s">"dt_socket"</span><span class="w">
</span><span class="p">(</span><span class="nb">-></span><span class="w"> </span><span class="n">ac</span><span class="w"> </span><span class="n">.transport</span><span class="w"> </span><span class="n">.name</span><span class="p">))</span><span class="w">
</span><span class="n">ac</span><span class="p">))</span><span class="w">
</span><span class="n">attachingConnectors</span><span class="p">)</span><span class="w">
</span><span class="n">params-map</span><span class="w"> </span><span class="p">(</span><span class="nb">when</span><span class="w"> </span><span class="n">connector</span><span class="w"> </span><span class="p">(</span><span class="nf">.defaultArguments</span><span class="w"> </span><span class="n">connector</span><span class="p">))</span><span class="w">
</span><span class="n">port-arg</span><span class="w"> </span><span class="p">(</span><span class="nb">when</span><span class="w"> </span><span class="n">params-map</span><span class="w"> </span><span class="p">(</span><span class="nb">get</span><span class="w"> </span><span class="n">params-map</span><span class="w"> </span><span class="s">"port"</span><span class="p">))</span><span class="w">
</span><span class="n">_</span><span class="w"> </span><span class="p">(</span><span class="nb">when</span><span class="w"> </span><span class="n">port-arg</span><span class="w"> </span><span class="p">(</span><span class="nf">.setValue</span><span class="w"> </span><span class="n">port-arg</span><span class="w"> </span><span class="n">port</span><span class="p">))]</span><span class="w">
</span><span class="p">(</span><span class="nb">when-let</span><span class="w"> </span><span class="p">[</span><span class="n">vm</span><span class="w"> </span><span class="p">(</span><span class="nb">when</span><span class="w"> </span><span class="n">port-arg</span><span class="w"> </span><span class="p">(</span><span class="nf">.attach</span><span class="w"> </span><span class="n">connector</span><span class="w"> </span><span class="n">params-map</span><span class="p">))]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Attached to process "</span><span class="w"> </span><span class="p">(</span><span class="nf">.name</span><span class="w"> </span><span class="n">vm</span><span class="p">))</span><span class="w">
</span><span class="n">vm</span><span class="p">)))</span></code></pre></figure>
<p>This function attaches to an existing VM on the given port (more about this
later) by asking the default <code class="highlighter-rouge">VirtualMachineManager</code> for a list of
<code class="highlighter-rouge">AttachingConnector</code>s and then finding the one that provides a transport named
“dt_socket”. This is the second connection option mentioned above.
It then uses this connector to connect to the target VM on
the port provided. On success it prints a diagnostic message to identify
the connected target VM and returns the <code class="highlighter-rouge">VirtualMachine</code> so we can use
it to make debug requests.</p>
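<p>Each connector describes the arguments it accepts through <code class="highlighter-rouge">defaultArguments</code>. As a sketch, the helper below (the name <code class="highlighter-rouge">socket-attach-arguments</code> is hypothetical) lists them for the socket connector, which is a quick way to confirm that a “port” argument exists before setting it. It only inspects the connector and does not attach to anything.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure">(import '(com.sun.jdi Bootstrap)
        '(com.sun.jdi.connect AttachingConnector Connector$Argument))

(defn socket-attach-arguments
  "Returns a map of argument name to {:value ... :required? ...} for the first
  attaching connector whose transport is dt_socket, or nil if there is none."
  []
  (when-let [^AttachingConnector connector
             (some (fn [^AttachingConnector ac]
                     (when (= "dt_socket" (.name (.transport ac))) ac))
                   (.attachingConnectors (Bootstrap/virtualMachineManager)))]
    ;; defaultArguments returns a java.util.Map of name to Connector.Argument.
    (into {}
          (for [[arg-name arg] (.defaultArguments connector)]
            [arg-name {:value     (.value ^Connector$Argument arg)
                       :required? (.mustSpecify ^Connector$Argument arg)}]))))

;; Print each argument name with its default value.
(doseq [[arg-name info] (socket-attach-arguments)]
  (println arg-name info))</code></pre></figure>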
<p>We need to add one more thing to this function to make it really useful,
however. The JDI relies on events to control and monitor a VM.
<a href="https://docs.oracle.com/javase/7/docs/jdk/api/jpda/jdi/com/sun/jdi/request/EventRequest.html"><code class="highlighter-rouge">EventRequest</code>s</a>
are made to initiate an action on the VM and <a href="https://docs.oracle.com/javase/7/docs/jdk/api/jpda/jdi/com/sun/jdi/event/Event.html"><code class="highlighter-rouge">Event</code>s</a>
are returned to indicate some action has taken place. So we need to listen for
<code class="highlighter-rouge">Event</code>s so we can be notified when something (like hitting a break point) has
happened.</p>
<p>We can start a new thread with the <code class="highlighter-rouge">core.async</code> <code class="highlighter-rouge">thread</code> macro in our setup function to listen to the event queue
of the <code class="highlighter-rouge">VirtualMachine</code>. Now our setup function looks like this:</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">setup-debugger</span><span class="w">
</span><span class="s">"Initialize the debugger."</span><span class="w">
</span><span class="p">[</span><span class="n">port</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">vm-manager</span><span class="w"> </span><span class="p">(</span><span class="nf">com.sun.jdi.Bootstrap/virtualMachineManager</span><span class="p">)</span><span class="w">
</span><span class="n">attachingConnectors</span><span class="w"> </span><span class="p">(</span><span class="nf">.attachingConnectors</span><span class="w"> </span><span class="n">vm-manager</span><span class="p">)</span><span class="w">
</span><span class="n">connector</span><span class="w"> </span><span class="p">(</span><span class="nb">some</span><span class="w"> </span><span class="p">(</span><span class="k">fn</span><span class="w"> </span><span class="p">[</span><span class="n">ac</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nb">when</span><span class="w"> </span><span class="p">(</span><span class="nb">=</span><span class="w"> </span><span class="s">"dt_socket"</span><span class="w">
</span><span class="p">(</span><span class="nb">-></span><span class="w"> </span><span class="n">ac</span><span class="w"> </span><span class="n">.transport</span><span class="w"> </span><span class="n">.name</span><span class="p">))</span><span class="w">
</span><span class="n">ac</span><span class="p">))</span><span class="w">
</span><span class="n">attachingConnectors</span><span class="p">)</span><span class="w">
</span><span class="n">params-map</span><span class="w"> </span><span class="p">(</span><span class="nb">when</span><span class="w"> </span><span class="n">connector</span><span class="w"> </span><span class="p">(</span><span class="nf">.defaultArguments</span><span class="w"> </span><span class="n">connector</span><span class="p">))</span><span class="w">
</span><span class="n">port-arg</span><span class="w"> </span><span class="p">(</span><span class="nb">when</span><span class="w"> </span><span class="n">params-map</span><span class="w"> </span><span class="p">(</span><span class="nb">get</span><span class="w"> </span><span class="n">params-map</span><span class="w"> </span><span class="s">"port"</span><span class="p">))</span><span class="w">
</span><span class="n">_</span><span class="w"> </span><span class="p">(</span><span class="nb">when</span><span class="w"> </span><span class="n">port-arg</span><span class="w"> </span><span class="p">(</span><span class="nf">.setValue</span><span class="w"> </span><span class="n">port-arg</span><span class="w"> </span><span class="n">port</span><span class="p">))]</span><span class="w">
</span><span class="p">(</span><span class="nb">when-let</span><span class="w"> </span><span class="p">[</span><span class="n">vm</span><span class="w"> </span><span class="p">(</span><span class="nb">when</span><span class="w"> </span><span class="n">port-arg</span><span class="w"> </span><span class="p">(</span><span class="nf">.attach</span><span class="w"> </span><span class="n">connector</span><span class="w"> </span><span class="n">params-map</span><span class="p">))]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Attached to process "</span><span class="w"> </span><span class="p">(</span><span class="nf">.name</span><span class="w"> </span><span class="n">vm</span><span class="p">))</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">evt-req-mgr</span><span class="w"> </span><span class="p">(</span><span class="nf">.eventRequestManager</span><span class="w"> </span><span class="n">vm</span><span class="p">)</span><span class="w">
</span><span class="n">evt-queue</span><span class="w"> </span><span class="p">(</span><span class="nf">.eventQueue</span><span class="w"> </span><span class="n">vm</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="nf">thread</span><span class="w"> </span><span class="p">(</span><span class="nf">listen-for-events</span><span class="w"> </span><span class="n">evt-queue</span><span class="w"> </span><span class="n">evt-req-mgr</span><span class="p">)))</span><span class="w">
</span><span class="n">vm</span><span class="p">)))</span></code></pre></figure>
<p>The <code class="highlighter-rouge">listen-for-events</code> function just logs the received event for now.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">listen-for-events</span><span class="w">
</span><span class="s">"Listen for events on the event queue and handle them."</span><span class="w">
</span><span class="p">[</span><span class="n">evt-queue</span><span class="w"> </span><span class="n">evt-req-mgr</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Listening for events...."</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="nb">loop</span><span class="w"> </span><span class="p">[</span><span class="n">evt-set</span><span class="w"> </span><span class="p">(</span><span class="nf">.remove</span><span class="w"> </span><span class="n">evt-queue</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Got an event............"</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="nf">recur</span><span class="w"> </span><span class="p">(</span><span class="nf">.remove</span><span class="w"> </span><span class="n">evt-queue</span><span class="p">))))</span></code></pre></figure>
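<p>The blocking take-and-loop shape of this listener can be tried without a target VM. The sketch below swaps in a <code class="highlighter-rouge">java.util.concurrent.LinkedBlockingQueue</code> for the JDI event queue and a plain Java thread for the <code class="highlighter-rouge">core.async</code> one. Both substitutions are assumptions made purely for illustration, and <code class="highlighter-rouge">start-listener</code> and the <code class="highlighter-rouge">:stop</code> sentinel are hypothetical names.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure">(import 'java.util.concurrent.LinkedBlockingQueue)

(defn start-listener
  "Starts a daemon thread that blocks on `queue` and passes every item it
  takes to `handler`. The loop ends when it takes the :stop sentinel.
  Returns the Thread so callers can join it."
  [^LinkedBlockingQueue queue handler]
  (let [^Runnable task (fn []
                         ;; .take blocks until an item is available, much as
                         ;; the listener above blocks on the event queue.
                         (loop [item (.take queue)]
                           (when-not (= :stop item)
                             (handler item)
                             (recur (.take queue)))))
        worker (Thread. task)]
    (.setDaemon worker true)
    (.start worker)
    worker))

;; Usage: feed two items and then stop the listener.
(let [queue (LinkedBlockingQueue.)
      ^Thread worker (start-listener queue #(println "Got an item:" %))]
  (.put queue :first)
  (.put queue :second)
  (.put queue :stop)
  (.join worker 2000))</code></pre></figure>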
<p>To test our code we start a REPL in our project directory and tell the VM
to listen for debugger connections. To do this we must set the environment
variable <code class="highlighter-rouge">JVM_OPTS</code> as follows:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>export JVM_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8030
</code></pre></div></div>
<p>Then when we launch the REPL we see the following:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>=> lein repl
Listening for transport dt_socket at address: 8030
nREPL server started on port 64012 on host 127.0.0.1 - nrepl://127.0.0.1:64012
REPL-y 0.3.7, nREPL 0.2.12
Clojure 1.8.0
Java HotSpot(TM) 64-Bit Server VM 1.8.0_74-b02
Docs: (doc function-name-here)
(find-doc "part-of-name-here")
Source: (source function-name-here)
Javadoc: (javadoc java-object-or-class-here)
Exit: Control+D or (exit) or (quit)
Results: Stored in vars *1, *2, *3, an exception in *e
user=>
</code></pre></div></div>
<p>The first line about “Listening for transport” is printed by the JVM itself, not
the REPL.</p>
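<p>If you would rather not export an environment variable, the same agent options can live in the project file. This is a sketch that assumes a Leiningen profile named <code class="highlighter-rouge">:debug</code>; the profile name, the version string and the abbreviated dependency list are all assumptions, not the demo project’s actual <code class="highlighter-rouge">project.clj</code>.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure">;; project.clj (sketch; names and versions are assumptions)
(defproject debug-demo "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]]
  ;; Only REPLs started with this profile listen for a debugger on port 8030.
  :profiles {:debug {:jvm-opts ["-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8030"]}})</code></pre></figure>
<p>With such a profile the target REPL would be started with <code class="highlighter-rouge">lein with-profile +debug repl</code>, while a plain <code class="highlighter-rouge">lein repl</code> starts a JVM without the debug agent.</p>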
<p>Now we can start a different REPL (without setting <code class="highlighter-rouge">JVM_OPTS</code>) and call our
setup function.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-> lein repl
Clojure 1.8.0
Java HotSpot(TM) 64-Bit Server VM 1.8.0_74-b02
Docs: (doc function-name-here)
(find-doc "part-of-name-here")
Source: (source function-name-here)
Javadoc: (javadoc java-object-or-class-here)
Exit: Control+D or (exit) or (quit)
Results: Stored in vars *1, *2, *3, an exception in *e
user=>
</code></pre></div></div>
<p>Notice the missing “Listening for transport” message.</p>
<p>We can then connect to JVM 1 by calling our setup function.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>user=> (use 'debug-demo.debug)
nil
user=> (def vm (setup-debugger 8030))
Attached to process Java HotSpot(TM) 64-Bit Server VM
#'user/vm
Listening for events....
</code></pre></div></div>
<p>We capture the <code class="highlighter-rouge">VirtualMachine</code> returned by <code class="highlighter-rouge">setup-debugger</code> in the <code class="highlighter-rouge">vm</code> var. We
see the “Listening for events…” message indicating that our event handler is
running. You may see the “Listening for events…” message commingled with the
other output since the listener is running on a separate thread.</p>
<p>Now that we can connect to our target VM, it’s time to tackle the first of
the capabilities on our requirements list, setting break points. This is
accomplished by issuing a <a href="https://docs.oracle.com/javase/7/docs/jdk/api/jpda/jdi/com/sun/jdi/request/BreakpointRequest.html"><code class="highlighter-rouge">BreakpointRequest</code></a>
to the <code class="highlighter-rouge">VirtualMachine</code>. The primary attribute of a <code class="highlighter-rouge">BreakpointRequest</code> is
a <a href="https://docs.oracle.com/javase/7/docs/jdk/api/jpda/jdi/com/sun/jdi/Location.html"><code class="highlighter-rouge">Location</code></a>.
Instances of <code class="highlighter-rouge">Location</code> encapsulate information about a position in the code: the
source file, the line, etc. So if we want to set a break point on a certain
line in a given file, we need to get its <code class="highlighter-rouge">Location</code>.</p>
<p>To do this
we first need to understand something about the relationship between Clojure
and Java. Clojure compiles to Java bytecode, but a line of Clojure may not
correspond directly to a line of Java. In fact, a line of Clojure may compile to <em>several</em>
lines of Java. This is to be expected as Clojure
is the more expressive of the two languages. So this raises the question,
“how can we tell the VM (which runs bytecode) that we
want to set a break point on a particular line of Clojure when that line may
correspond to several lines of Java?”</p>
<p>Fortunately, the Java designers realized some time ago (probably with the
advent of Groovy) that people were implementing other languages on the JVM, so
they came up with a way to support them called <em>strata</em>.</p>
<p>The JDI documentation has this to say about strata:</p>
<blockquote>
<p>The source information for a <code class="highlighter-rouge">Location</code> is dependent on the stratum which is used.
A stratum is a source code level within a sequence of translations. For example,
say the baz program is written in the programming language “Foo” then translated
to the language “Bar” and finally translated into the Java programming language.
The Java programming language stratum is named “Java”, let’s say the other strata
are named “Foo” and “Bar”. A given location (as viewed by the <code class="highlighter-rouge">sourceName()</code> and
<code class="highlighter-rouge">lineNumber()</code> methods) might be at line 14 of “baz.foo” in the “Foo” stratum, line
23 of “baz.bar” in the “Bar” stratum and line 71 of the “Java” stratum. Note that
while the Java programming language may have only one source file for a reference
type, this restriction does not apply to other strata - thus each <code class="highlighter-rouge">Location</code> should
be consulted to determine its source path.</p>
</blockquote>
<p>Which is a long-winded way of saying that the compilation process can preserve
information from the original source language like source file and line
number. Even better, when searching for a particular <code class="highlighter-rouge">Location</code>, we can specify
a particular stratum to use - in our case “Clojure”.</p>
<p>There is no method we can call to get the <code class="highlighter-rouge">Location</code> for a given source file and
line directly - we need to go through all the <code class="highlighter-rouge">Location</code>s for all the reference types
in our target VM and find the one that matches our source file and line number.
We can narrow the scope a bit by only finding the locations for the “Clojure” stratum.</p>
<p>Our high-level <code class="highlighter-rouge">set-breakpoint</code> function looks like this:</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">set-breakpoint</span><span class="w">
</span><span class="s">"Set a breakpoint"</span><span class="w">
</span><span class="p">[</span><span class="n">vm</span><span class="w"> </span><span class="n">src-path</span><span class="w"> </span><span class="n">line</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nb">when-let</span><span class="w"> </span><span class="p">[</span><span class="n">loc</span><span class="w"> </span><span class="p">(</span><span class="nf">find-loc-for-src-line</span><span class="w"> </span><span class="n">vm</span><span class="w"> </span><span class="n">src-path</span><span class="w"> </span><span class="n">line</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">evt-req-mgr</span><span class="w"> </span><span class="p">(</span><span class="nf">.eventRequestManager</span><span class="w"> </span><span class="n">vm</span><span class="p">)</span><span class="w">
</span><span class="n">breq</span><span class="w"> </span><span class="p">(</span><span class="nf">.createBreakpointRequest</span><span class="w"> </span><span class="n">evt-req-mgr</span><span class="w"> </span><span class="n">loc</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="nf">.setSuspendPolicy</span><span class="w"> </span><span class="n">breq</span><span class="w"> </span><span class="n">com.sun.jdi.request.BreakpointRequest/SUSPEND_ALL</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="nf">.enable</span><span class="w"> </span><span class="n">breq</span><span class="p">))</span><span class="w">
</span><span class="n">loc</span><span class="p">))</span></code></pre></figure>
<p>The first thing it does is call the <code class="highlighter-rouge">find-loc-for-src-line</code> function
to try to get the <code class="highlighter-rouge">Location</code> associated with the given source file and line number.
Then it uses the <a href="http://docs.oracle.com/javase/1.5.0/docs/guide/jpda/jdi/com/sun/jdi/request/EventRequestManager.html"><code class="highlighter-rouge">EventRequestManager</code></a>
for the <code class="highlighter-rouge">VirtualMachine</code> to create a
disabled <code class="highlighter-rouge">BreakpointRequest</code>. It sets the thread suspend policy on the request
to <code class="highlighter-rouge">SUSPEND_ALL</code>, which means stop all the threads in the VM when we hit
the break point. Alternatively we could use <code class="highlighter-rouge">SUSPEND_EVENT_THREAD</code> to just suspend
the thread that hit the break point. Finally, we enable the <code class="highlighter-rouge">BreakpointRequest</code>.</p>
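<p>The suspend policies are integer constants declared on the <code class="highlighter-rouge">EventRequest</code> interface, which <code class="highlighter-rouge">BreakpointRequest</code> extends. As a small sketch, they can be wrapped in a keyword lookup so that callers choose a policy by name; <code class="highlighter-rouge">suspend-policies</code> and <code class="highlighter-rouge">suspend-policy</code> are hypothetical names that do not appear in the demo project.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure">(import '(com.sun.jdi.request EventRequest))

(def suspend-policies
  "Keyword names for the JDI suspend policy constants."
  {:none         EventRequest/SUSPEND_NONE          ; suspend nothing
   :event-thread EventRequest/SUSPEND_EVENT_THREAD  ; suspend only the thread that hit the event
   :all          EventRequest/SUSPEND_ALL})         ; suspend every thread in the target VM

(defn suspend-policy
  "Returns the JDI constant for the keyword `k`, or throws if `k` is unknown."
  [k]
  (or (get suspend-policies k)
      (throw (IllegalArgumentException. (str "Unknown suspend policy: " k)))))</code></pre></figure>
<p>A request could then be configured with <code class="highlighter-rouge">(.setSuspendPolicy breq (int (suspend-policy :event-thread)))</code>.</p>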
<p>The <code class="highlighter-rouge">find-loc-for-src-line</code> function searches through all the reference types
on the VM to find the matching location as described above. This code
is rather long so I’m not going to cover it here. See the GitHub
project for the source code if you want to know the details.</p>
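<p>For orientation only, a stripped-down search might look like the sketch below. It is not the project’s implementation: the function names are hypothetical, and it assumes that the “Clojure” stratum reports a path that is a suffix of the absolute path passed in, which is why paths are compared with <code class="highlighter-rouge">endsWith</code>.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure">(import '(com.sun.jdi AbsentInformationException Location
                      ReferenceType VirtualMachine))

(defn path-matches?
  "True when the absolute `src-path` ends with the path reported by the VM.
  Assumption: the stratum path is a suffix of the path on disk."
  [^String src-path ^String stratum-path]
  (.endsWith src-path stratum-path))

(defn locations-at
  "Returns the Clojure-stratum Locations of `ref-type` on `line` whose source
  path matches `src-path`, or nil if the type carries no debug information."
  [^ReferenceType ref-type src-path line]
  (try
    ;; doall forces the lazy seq so exceptions surface inside the try.
    (doall
     (filter (fn [^Location loc]
               (path-matches? src-path (.sourcePath loc "Clojure")))
             (.locationsOfLine ref-type "Clojure" nil (int line))))
    (catch AbsentInformationException _ nil)))

(defn find-loc-sketch
  "Returns the first matching Location among the prepared types of `vm`."
  [^VirtualMachine vm src-path line]
  (some (fn [^ReferenceType ref-type]
          (when (.isPrepared ref-type)
            (first (locations-at ref-type src-path line))))
        (.allClasses vm)))</code></pre></figure>
<p>Walking every loaded type is slow on a large VM, so a real implementation would probably narrow the candidates first, for example by class name.</p>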
<p>To see our break point event we need to check for it in our event handler
function:</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">listen-for-events</span><span class="w">
</span><span class="s">"Listen for events on the event queue and handle them."</span><span class="w">
</span><span class="p">[</span><span class="n">evt-queue</span><span class="w"> </span><span class="n">evt-req-mgr</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Listening for events...."</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="nb">loop</span><span class="w"> </span><span class="p">[</span><span class="n">evt-set</span><span class="w"> </span><span class="p">(</span><span class="nf">.remove</span><span class="w"> </span><span class="n">evt-queue</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Got an event............"</span><span class="p">)</span><span class="w">
</span><span class="c1">;;</span><span class="w">
</span><span class="c1">;; New code to handle break point events</span><span class="w">
</span><span class="c1">;;</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">events</span><span class="w"> </span><span class="p">(</span><span class="nf">iterator-seq</span><span class="w"> </span><span class="p">(</span><span class="nf">.eventIterator</span><span class="w"> </span><span class="n">evt-set</span><span class="p">))]</span><span class="w">
</span><span class="p">(</span><span class="nb">doseq</span><span class="w"> </span><span class="p">[</span><span class="n">evt</span><span class="w"> </span><span class="n">events</span><span class="w">
</span><span class="no">:let</span><span class="w"> </span><span class="p">[</span><span class="n">evt-req</span><span class="w"> </span><span class="p">(</span><span class="nf">.request</span><span class="w"> </span><span class="n">evt</span><span class="p">)]]</span><span class="w">
</span><span class="p">(</span><span class="k">cond</span><span class="w">
</span><span class="p">(</span><span class="nb">instance?</span><span class="w"> </span><span class="n">BreakpointRequest</span><span class="w"> </span><span class="n">evt-req</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">tr</span><span class="w"> </span><span class="p">(</span><span class="nf">.thread</span><span class="w"> </span><span class="n">evt</span><span class="p">)</span><span class="w">
</span><span class="n">line</span><span class="w"> </span><span class="p">(</span><span class="nb">-></span><span class="w"> </span><span class="n">evt-req</span><span class="w"> </span><span class="n">.location</span><span class="w"> </span><span class="n">.lineNumber</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Thread: "</span><span class="w"> </span><span class="p">(</span><span class="nf">.name</span><span class="w"> </span><span class="n">tr</span><span class="p">))</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Breakpoint hit at line "</span><span class="w"> </span><span class="n">line</span><span class="p">))</span><span class="w">
</span><span class="no">:default</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Unknown event"</span><span class="p">))))</span><span class="w">
</span><span class="c1">;;</span><span class="w">
</span><span class="c1">;; End break point code</span><span class="w">
</span><span class="c1">;;</span><span class="w">
</span><span class="p">(</span><span class="nf">recur</span><span class="w"> </span><span class="p">(</span><span class="nf">.remove</span><span class="w"> </span><span class="n">evt-queue</span><span class="p">))))</span></code></pre></figure>
<p>This will print the name of the thread where the break point event occurred as well
as the line number in the source file. We will use the name of the thread later
when we look up local variables.</p>
<p>We can now set a break point in the <code class="highlighter-rouge">foo</code> function in our target REPL.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>user=> (set-breakpoint vm "/User/jnorton/Clojure/debug-demo/src/debug_demo/core.clj" 12)
Found location...............
#object[com.sun.tools.jdi.LocationImpl 0x67b220cf "debug_demo.core$foo:12"]
</code></pre></div></div>
<p>Our code found the location and made the break point request. Now if we run the
<code class="highlighter-rouge">foo</code> function we can see it pause and the event listener receives the
break point event (refer to the <code class="highlighter-rouge">debug-demo.core</code> listing above for the code being debugged).</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>REPL 1 (TARGET)
user=> (foo 4)
REPL 2
user=> Got an event............
Thread: nREPL-worker-2
Breakpoint hit at line 12
</code></pre></div></div>
<p>Now that we have our break points working, let’s move on to the next capability,
examining local variables. In order to do this we need to retrieve the stack
frame for the paused thread, but first we need to get the thread reference.
For this we create the following utility function:</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">get-thread-with-name</span><span class="w">
</span><span class="s">"Returns the ThreadReference with the given name"</span><span class="w">
</span><span class="p">[</span><span class="n">vm</span><span class="w"> </span><span class="nb">name</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nb">some</span><span class="w"> </span><span class="p">(</span><span class="k">fn</span><span class="w"> </span><span class="p">[</span><span class="n">thread-ref</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nb">when</span><span class="w"> </span><span class="p">(</span><span class="nb">=</span><span class="w"> </span><span class="nb">name</span><span class="w"> </span><span class="p">(</span><span class="nf">.name</span><span class="w"> </span><span class="n">thread-ref</span><span class="p">))</span><span class="w"> </span><span class="n">thread-ref</span><span class="p">))</span><span class="w">
</span><span class="p">(</span><span class="nf">.allThreads</span><span class="w"> </span><span class="n">vm</span><span class="p">)))</span></code></pre></figure>
<p>Here <code class="highlighter-rouge">name</code> is the name printed in our event handler.</p>
<p>We can get the <a href="https://docs.oracle.com/javase/7/docs/jdk/api/jpda/jdi/com/sun/jdi/StackFrame.html"><code class="highlighter-rouge">StackFrame</code></a>
object from the <code class="highlighter-rouge">ThreadReference</code> by calling
its <code class="highlighter-rouge">frame</code> method. We encapsulate this in the following Clojure function:</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">get-frame</span><span class="w">
</span><span class="s">"Get the frame at the given stack position for the named thread"</span><span class="w">
</span><span class="p">[</span><span class="n">vm</span><span class="w"> </span><span class="n">thread-name</span><span class="w"> </span><span class="n">stack-pos</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">thread-ref</span><span class="w"> </span><span class="p">(</span><span class="nf">get-thread-with-name</span><span class="w"> </span><span class="n">vm</span><span class="w"> </span><span class="n">thread-name</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="nf">.frame</span><span class="w"> </span><span class="n">thread-ref</span><span class="w"> </span><span class="n">stack-pos</span><span class="p">)))</span></code></pre></figure>
<p>Stack position refers to the position of the frame on the call stack. To get
the locals in scope at the break point we use stack position 0. Once we
get the <code class="highlighter-rouge">StackFrame</code>, we can list the local variables by calling its
<code class="highlighter-rouge">visibleVariables</code> method, or we can get a specific variable by calling
<code class="highlighter-rouge">visibleVariableByName(String name)</code>. We will use the first method to
get all the local variables.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">print-locals</span><span class="w">
</span><span class="s">"Print the local variables and their values for the given stack frame.
This function is not robust and converts all locals to strings to print them out.
A real API should interrogate the local to determine its type and handle it
accordingly."</span><span class="w">
</span><span class="p">[</span><span class="n">frame</span><span class="p">]</span><span class="w">
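</span><span class="p">(</span><span class="nb">doseq</span><span class="w"> </span><span class="p">[</span><span class="n">local</span><span class="w"> </span><span class="p">(</span><span class="nf">.visibleVariables</span><span class="w"> </span><span class="n">frame</span><span class="p">)</span><span class="w">
</span><span class="c1">;; getValue returns the mirrored value; (str local) would only describe the variable</span><span class="w">
</span><span class="no">:let</span><span class="w"> </span><span class="p">[</span><span class="n">value</span><span class="w"> </span><span class="p">(</span><span class="nf">.getValue</span><span class="w"> </span><span class="n">frame</span><span class="w"> </span><span class="n">local</span><span class="p">)]]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"TYPE:"</span><span class="w"> </span><span class="p">(</span><span class="nb">class</span><span class="w"> </span><span class="n">value</span><span class="p">))</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="p">(</span><span class="nf">.name</span><span class="w"> </span><span class="n">local</span><span class="p">)</span><span class="w"> </span><span class="s">"="</span><span class="w"> </span><span class="p">(</span><span class="nb">str</span><span class="w"> </span><span class="n">value</span><span class="p">))))</span></code></pre></figure>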
<p>As the docstring states, this function does not handle variables all that
well. In particular, reference objects (class instances, for example) do not
print well. They are mirrored by objects of type <a href="http://www.docjar.com/docs/api/com/sun/tools/jdi/ObjectReferenceImpl.html"><code class="highlighter-rouge">ObjectReferenceImpl</code></a>,
which has a default <code class="highlighter-rouge">toString</code> method that just prints “instance of Long”
for <code class="highlighter-rouge">Long</code> types, etc. This does not give us the actual value, so it’s not
much use. To make things worse, Clojure boxes function arguments that lack type
hints, so we see a lot of <code class="highlighter-rouge">ObjectReferenceImpl</code>. This is the reason
for the type hints in the <code class="highlighter-rouge">foo</code> and <code class="highlighter-rouge">bar</code> demo functions.</p>
<p>If we run the code, we get the following:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>user=> (def frame (get-frame vm "nREPL-worker-2" 0))
#'user/frame
user=> (print-locals frame)
TYPE: com.sun.tools.jdi.LongValueImpl
x = 4
nil
</code></pre></div></div>
<p>We see the type for <code class="highlighter-rouge">x</code> is a long (mirrored by <code class="highlighter-rouge">LongValueImpl</code>) and the value is 4.
Because we set our breakpoint before the subsequent <code class="highlighter-rouge">let</code> block we don’t see
<code class="highlighter-rouge">y</code>, <code class="highlighter-rouge">z</code>, or <code class="highlighter-rouge">w</code> yet.</p>
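<p>If only one variable is of interest, <code class="highlighter-rouge">visibleVariableByName</code> avoids looping over every local. The following is a minimal sketch rather than part of the original debugger: the name <code class="highlighter-rouge">get-local</code> is made up for illustration, and it assumes the thread is still suspended when it is called.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure">(defn get-local
  "Hypothetical helper: return the mirrored value of the named local in the
  given stack frame, or nil if no local with that name is visible."
  [frame local-name]
  ;; visibleVariableByName returns nil when the name is not in scope
  (when-let [local (.visibleVariableByName frame local-name)]
    ;; getValue reads the current value of that local from the frame
    (.getValue frame local)))</code></pre></figure>
<p>With the frame from the session above, <code class="highlighter-rouge">(str (get-local frame "x"))</code> should give the same value that was printed for <code class="highlighter-rouge">x</code>.</p>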
<p>Now if we want to step into or over code we need to create a
<a href="https://docs.oracle.com/javase/7/docs/jdk/api/jpda/jdi/com/sun/jdi/request/StepRequest.html"><code class="highlighter-rouge">StepRequest</code></a>.
This is done in the same manner as a <code class="highlighter-rouge">BreakpointRequest</code>. We use the event request
manager to create it, then configure it, and finally enable it.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">step</span><span class="w">
</span><span class="s">"Step into or over called functions. Depth must be either StepRequest.STEP_INTO or
StepRequest.STEP_OVER"</span><span class="w">
</span><span class="p">[</span><span class="n">vm</span><span class="w"> </span><span class="n">thread-name</span><span class="w"> </span><span class="n">depth</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">evt-req-mgr</span><span class="w"> </span><span class="p">(</span><span class="nf">.eventRequestManager</span><span class="w"> </span><span class="n">vm</span><span class="p">)</span><span class="w">
</span><span class="n">thread-ref</span><span class="w"> </span><span class="p">(</span><span class="nf">get-thread-with-name</span><span class="w"> </span><span class="n">vm</span><span class="w"> </span><span class="n">thread-name</span><span class="p">)</span><span class="w">
</span><span class="n">step-req</span><span class="w"> </span><span class="p">(</span><span class="nf">.createStepRequest</span><span class="w"> </span><span class="n">evt-req-mgr</span><span class="w"> </span><span class="n">thread-ref</span><span class="w"> </span><span class="n">StepRequest/STEP_LINE</span><span class="w"> </span><span class="n">depth</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="nf">.addCountFilter</span><span class="w"> </span><span class="n">step-req</span><span class="w"> </span><span class="mi">1</span><span class="p">)</span><span class="w"> </span><span class="c1">;; one step only</span><span class="w">
</span><span class="p">(</span><span class="nf">.setSuspendPolicy</span><span class="w"> </span><span class="n">step-req</span><span class="w"> </span><span class="n">com.sun.jdi.request.EventRequest/SUSPEND_EVENT_THREAD</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="nf">.enable</span><span class="w"> </span><span class="n">step-req</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="nf">.resume</span><span class="w"> </span><span class="n">vm</span><span class="p">)))</span></code></pre></figure>
<p><code class="highlighter-rouge">depth</code> should either be <code class="highlighter-rouge">StepRequest/STEP_OVER</code> (to step over code)
or <code class="highlighter-rouge">StepRequest/STEP_INTO</code> (to step into code). After we create and configure
our <code class="highlighter-rouge">StepRequest</code> we enable it and then call <code class="highlighter-rouge">resume</code> on the <code class="highlighter-rouge">VirtualMachine</code>.
This moves us by one step.</p>
<p>We can create a couple of utility functions to make stepping over and into
code easier.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">step-into</span><span class="w">
</span><span class="s">"Step into called functions"</span><span class="w">
</span><span class="p">[</span><span class="n">vm</span><span class="w"> </span><span class="n">thread-name</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nf">step</span><span class="w"> </span><span class="n">vm</span><span class="w"> </span><span class="n">thread-name</span><span class="w"> </span><span class="n">StepRequest/STEP_INTO</span><span class="p">))</span><span class="w">
</span><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">step-over</span><span class="w">
</span><span class="s">"Step over called functions"</span><span class="w">
</span><span class="p">[</span><span class="n">vm</span><span class="w"> </span><span class="n">thread-name</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nf">step</span><span class="w"> </span><span class="n">vm</span><span class="w"> </span><span class="n">thread-name</span><span class="w"> </span><span class="n">StepRequest/STEP_OVER</span><span class="p">))</span></code></pre></figure>
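<p>The JDI also defines a third depth, <code class="highlighter-rouge">StepRequest/STEP_OUT</code>, which runs until the current frame returns to its caller. It is not used elsewhere in this post, so treat the helper below as an untested sketch that follows the same pattern as the two functions above:</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure">(defn step-out
  "Step out of the current function, back to its caller"
  [vm thread-name]
  ;; Same request as step-into and step-over; only the depth differs.
  (step vm thread-name StepRequest/STEP_OUT))</code></pre></figure>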
<p>We also need to add code to our event handler to let us know when a step event
has occurred. The handler must also delete the completed step request, because
the JDI does not allow a second step request on a thread while one is still pending.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">listen-for-events</span><span class="w">
</span><span class="s">"Listen for events on the event queue and handle them."</span><span class="w">
</span><span class="p">[</span><span class="n">evt-queue</span><span class="w"> </span><span class="n">evt-req-mgr</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Listening for events...."</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="nb">loop</span><span class="w"> </span><span class="p">[</span><span class="n">evt-set</span><span class="w"> </span><span class="p">(</span><span class="nf">.remove</span><span class="w"> </span><span class="n">evt-queue</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Got an event............"</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">events</span><span class="w"> </span><span class="p">(</span><span class="nf">iterator-seq</span><span class="w"> </span><span class="p">(</span><span class="nf">.eventIterator</span><span class="w"> </span><span class="n">evt-set</span><span class="p">))]</span><span class="w">
</span><span class="p">(</span><span class="nb">doseq</span><span class="w"> </span><span class="p">[</span><span class="n">evt</span><span class="w"> </span><span class="n">events</span><span class="w">
</span><span class="no">:let</span><span class="w"> </span><span class="p">[</span><span class="n">evt-req</span><span class="w"> </span><span class="p">(</span><span class="nf">.request</span><span class="w"> </span><span class="n">evt</span><span class="p">)]]</span><span class="w">
</span><span class="p">(</span><span class="k">cond</span><span class="w">
</span><span class="p">(</span><span class="nb">instance?</span><span class="w"> </span><span class="n">BreakpointRequest</span><span class="w"> </span><span class="n">evt-req</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">tr</span><span class="w"> </span><span class="p">(</span><span class="nf">.thread</span><span class="w"> </span><span class="n">evt</span><span class="p">)</span><span class="w">
</span><span class="n">line</span><span class="w"> </span><span class="p">(</span><span class="nb">-></span><span class="w"> </span><span class="n">evt-req</span><span class="w"> </span><span class="n">.location</span><span class="w"> </span><span class="n">.lineNumber</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Thread: "</span><span class="w"> </span><span class="p">(</span><span class="nf">.name</span><span class="w"> </span><span class="n">tr</span><span class="p">))</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Breakpoint hit at line "</span><span class="w"> </span><span class="n">line</span><span class="p">))</span><span class="w">
</span><span class="c1">;;</span><span class="w">
</span><span class="c1">;; New code for step events</span><span class="w">
</span><span class="c1">;;</span><span class="w">
</span><span class="p">(</span><span class="nb">instance?</span><span class="w"> </span><span class="n">StepRequest</span><span class="w"> </span><span class="n">evt-req</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="k">let</span><span class="w"> </span><span class="p">[</span><span class="n">tr</span><span class="w"> </span><span class="p">(</span><span class="nf">.thread</span><span class="w"> </span><span class="n">evt</span><span class="p">)</span><span class="w">
</span><span class="n">frame</span><span class="w"> </span><span class="p">(</span><span class="nf">.frame</span><span class="w"> </span><span class="n">tr</span><span class="w"> </span><span class="mi">0</span><span class="p">)</span><span class="w">
</span><span class="n">loc</span><span class="w"> </span><span class="p">(</span><span class="nf">.location</span><span class="w"> </span><span class="n">frame</span><span class="p">)</span><span class="w">
</span><span class="n">src</span><span class="w"> </span><span class="p">(</span><span class="nf">.sourceName</span><span class="w"> </span><span class="n">loc</span><span class="p">)]</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"At location "</span><span class="w"> </span><span class="p">(</span><span class="nf">.lineNumber</span><span class="w"> </span><span class="n">loc</span><span class="p">))</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"File: "</span><span class="w"> </span><span class="n">src</span><span class="p">)</span><span class="w">
</span><span class="c1">;; Need to remove a step request or we won't be able to make another one.</span><span class="w">
</span><span class="p">(</span><span class="nf">.deleteEventRequest</span><span class="w"> </span><span class="n">evt-req-mgr</span><span class="w"> </span><span class="n">evt-req</span><span class="p">))</span><span class="w">
</span><span class="c1">;;</span><span class="w">
</span><span class="c1">;; End step event code</span><span class="w">
</span><span class="c1">;;</span><span class="w">
</span><span class="no">:default</span><span class="w">
</span><span class="p">(</span><span class="nb">println</span><span class="w"> </span><span class="s">"Unknown event"</span><span class="p">))))</span><span class="w">
</span><span class="p">(</span><span class="nf">recur</span><span class="w"> </span><span class="p">(</span><span class="nf">.remove</span><span class="w"> </span><span class="n">evt-queue</span><span class="p">))))</span></code></pre></figure>
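<p>If a step request is ever left behind (for example, because the listener thread died before it could delete it), the next call to <code class="highlighter-rouge">step</code> on that thread fails with a <code class="highlighter-rouge">DuplicateRequestException</code>. One defensive option is to clear stale requests before creating a new one. The sketch below is an assumption about how that could look; <code class="highlighter-rouge">clear-step-requests</code> is a hypothetical name and is not part of the code above.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure">(defn clear-step-requests
  "Hypothetical helper: delete any step requests still registered for the
  given ThreadReference."
  [evt-req-mgr thread-ref]
  ;; Copy the list with vec first, so that deleting requests does not
  ;; disturb the collection we are iterating over.
  (doseq [req (vec (.stepRequests evt-req-mgr))
          :when (= thread-ref (.thread req))]
    (.deleteEventRequest evt-req-mgr req)))</code></pre></figure>
<p>Calling this at the top of <code class="highlighter-rouge">step</code>, just before <code class="highlighter-rouge">createStepRequest</code>, would make stepping more forgiving, at the cost of silently discarding a request that may still have been wanted.</p>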
<p>Now when we issue a step over request in our debugger REPL we see the event
captured by our event listener and we see the target REPL output the result of
the <code class="highlighter-rouge">println</code> on line 12.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>REPL 2
user=> (step-over vm "nREPL-worker-2")
nil
Got an event............
At location 15
File: core.clj
REPL 1 (TARGET)
4 Hello, World!
</code></pre></div></div>
<p>This places us on line 15, the beginning of the call to <code class="highlighter-rouge">bar</code> in the assignment
to <code class="highlighter-rouge">z</code>, the last line of the <code class="highlighter-rouge">let</code> block. I’m not completely sure of the
behavior of step over when it comes to things like assignment blocks. I don’t know
if it treats them as one contiguous line, or if it just stepped until the
next function call (to <code class="highlighter-rouge">bar</code>). I need to get a better understanding of Clojure
internals.</p>
<p>We can then step into the function call.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>REPL 2
user=> (step-into vm "nREPL-worker-2")
nil
Got an event............
At location 4
File: core.clj
</code></pre></div></div>
<p>This places us at line 4, the beginning of the definition for the <code class="highlighter-rouge">bar</code>
function. I’m not quite sure why it stops there and not on line 7, but we
can proceed to the body of the function by executing another step over.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>user=> (step-over vm "nREPL-worker-2")
nil
Got an event............
At location 7
File: core.clj
</code></pre></div></div>
<p>Now we can retrieve the local variables for the current break point in the <code class="highlighter-rouge">bar</code>
function.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>user=> (def frame (get-frame vm "nREPL-worker-2" 0))
#'user/frame
user=> (print-locals frame)
TYPE: com.sun.tools.jdi.LongValueImpl
num = 4
nil
</code></pre></div></div>
<p>The last capability I needed was to be able to resume code execution
after a break point. This is actually the easiest feature to implement since
we are pausing all threads in the VM. The <code class="highlighter-rouge">VirtualMachine</code> interface specifies
a <code class="highlighter-rouge">resume</code> method that will resume any suspended threads. We wrap this in
a Clojure function like so:</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="k">defn</span><span class="w"> </span><span class="n">continue</span><span class="w">
</span><span class="s">"Resume execution of a paused VM."</span><span class="w">
</span><span class="p">[</span><span class="n">vm</span><span class="p">]</span><span class="w">
</span><span class="p">(</span><span class="nf">.resume</span><span class="w"> </span><span class="n">vm</span><span class="p">))</span></code></pre></figure>
<p>Now we can resume our paused code.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>REPL 2
user=> (continue vm)
nil
REPL 1 (TARGET)
y = 4
z = 10
w = 16
nil
</code></pre></div></div>
<h3 id="conclusion">Conclusion</h3>
<p>I still have some things to add to do proper Java style debugging,
but hopefully this is enough
to get you started. One thing to bear in mind is that Clojure is a functional
language, while step debugging is inherently imperative. So expect some oddities
when stepping through Clojure code. This continues to be a learning process for me, so please leave comments
as you learn more yourselves. The following is a list of things I hope to implement
next.</p>
<h4 id="to-do">To Do</h4>
<ul>
<li><strong>Better printing of reference types</strong>. Right now attempting to print a local
variable that is a reference type (class, interface) simply calls <code class="highlighter-rouge">.toString</code> on
the object, which defaults to the message “instance of Long”, etc. This is a
problem because without type hints, Clojure function arguments are passed as
reference types. So I need to figure out how to access the underlying value
of the reference type.</li>
<li><strong>Conditional break points</strong> (break points that include code to determine if the
running code should stop). There is no built in functionality for this in the
JDI as far as I know. The Eclipse debugger appears to store conditions as
<code class="highlighter-rouge">String</code>s that presumably get compiled and run after a breakpoint is hit to see
if the code should resume. A similar approach could possibly work with
Clojure code.</li>
<li><strong>Break on exception</strong>. Break points that trigger when an exception occurs.</li>
<li><strong>Moving up and down the call stack</strong>. Would allow examining local
variables in each stack frame. This is possible directly through the JDI, I just
need to implement it.</li>
<li><strong>Setting watch points on variables</strong>. Also possible directly with the JDI.</li>
<li><strong>Rebinding local variables</strong> (changing values) before resuming execution after
a break point. I have investigated this and the JDI does allow this, but it is
poorly documented and examples are difficult to find.</li>
</ul>
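<p>Three of the items above map fairly directly onto JDI calls, so here is a rough, untested sketch of how they might start out. All three function names are hypothetical. <code class="highlighter-rouge">mirror-&gt;str</code> assumes that boxed numbers keep their primitive in a field named <code class="highlighter-rouge">value</code> (true of <code class="highlighter-rouge">java.lang.Long</code> and friends), and <code class="highlighter-rouge">print-stack</code> assumes the thread is suspended and reuses <code class="highlighter-rouge">get-thread-with-name</code> from earlier.</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure">(import '(com.sun.jdi ObjectReference PrimitiveValue StringReference))

(defn mirror->str
  "Hypothetical helper: best-effort string for a mirrored value."
  [v]
  (cond
    (nil? v) "nil"
    ;; Strings expose their contents directly.
    (instance? StringReference v) (.value v)
    ;; Primitive mirrors (LongValueImpl etc.) already print as their value.
    (instance? PrimitiveValue v) (str v)
    ;; Assumption: boxed numbers store their primitive in a field named "value".
    (instance? ObjectReference v)
    (let [field (.fieldByName (.referenceType v) "value")
          inner (when field (.getValue v field))]
      (if (instance? PrimitiveValue inner)
        (str inner)
        (str v)))
    :else (str v)))

(defn print-stack
  "Hypothetical helper: print one line per frame of the suspended thread,
  innermost frame first."
  [vm thread-name]
  (let [thread-ref (get-thread-with-name vm thread-name)]
    (doseq [[pos frame] (map-indexed vector (.frames thread-ref))
            :let [loc (.location frame)]]
      (println pos (.name (.declaringType loc)) "line" (.lineNumber loc)))))

(defn break-on-exceptions
  "Hypothetical helper: pause the throwing thread when an exception occurs.
  Passing nil as the reference type asks for all exception classes."
  [vm notify-caught? notify-uncaught?]
  (let [evt-req-mgr (.eventRequestManager vm)
        ex-req (.createExceptionRequest evt-req-mgr nil
                                        (boolean notify-caught?)
                                        (boolean notify-uncaught?))]
    (.setSuspendPolicy ex-req com.sun.jdi.request.EventRequest/SUSPEND_EVENT_THREAD)
    (.enable ex-req)
    ex-req))</code></pre></figure>
<p>The position printed by <code class="highlighter-rouge">print-stack</code> is the same stack position that <code class="highlighter-rouge">get-frame</code> accepts, so the two can be combined to inspect locals further up the stack. Note that the event listener shown earlier would report exception events as “Unknown event” until its <code class="highlighter-rouge">cond</code> gains a branch for <code class="highlighter-rouge">ExceptionRequest</code>.</p>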
<h3 id="acknowledgements">Acknowledgements</h3>
<p>Thanks go to Colin Fleming (author of Cursive) for pointers that helped get me started,
and to Jason Gilman (author of proto-repl) for advice along the way. Also thanks to
Wayne Adams for his informative <a href="http://wayne-adams.blogspot.com/2011/12/examining-variables-in-jdi.html">blog post</a>
on using JDI to debug Java programs. I also learned a bit from reading about
<a href="http://www.lichteblau.com/cloak/cl-jdi/README.html">CL-JDI</a> and through many
JDI examples <a href="http://www.programcreek.com/java-api-examples/index.php?api=com.sun.jdi.StackFrame">here</a>.</p>
<p>Author: jnorton</p>
<p>Summary: I have always used a REPL driven approach to Clojure development and this has been very productive, but at times I have really missed the old school approach of setting break points and stepping through code, examining variables along the way. While there are some very capable solutions that get me part of the way there (proto-repl, etc.), I was curious to see if it was possible to debug Clojure in a more traditional way. I have used debug-repl, but I wanted more control. I learned about CIDER, but was unwilling to make the switch to EMACS (let’s just leave it at that) so I was unaware of its debugging capabilities.</p>
<p>Proto REPL - Updates to the Clojure REPL for Atom (published 2016-02-04T00:00:00+00:00, http://blog.element84.com/proto-repl-update)</p>
<p><a href="https://atom.io/packages/proto-repl">Proto REPL</a> is a Clojure REPL for the <a href="https://atom.io">Atom Editor</a> that I introduced in <a href="http://blog.element84.com/introducing-proto-repl.html">a blog post last October</a>. When I introduced Proto REPL, I wrote “The future of interactive development is going to be visual.” and that “ATOM is at its heart a web browser that means you can use the combination of HTML/CSS/JavaScript right in your editor for visualizations”. Proto REPL was only barely scratching the surface of what you could do to enhance interactivity or provide visualizations in your editor. A lot has changed since then to give Proto REPL a better Clojure development experience.</p>
<h2 id="fundamentals">Fundamentals</h2>
<p>Proto REPL now uses <a href="https://github.com/clojure/tools.nrepl">nREPL</a> for communication with the Clojure process. Originally, communication was entirely via standard input and output. It would send code to the Clojure process the same way you would type Clojure code into a terminal based Clojure REPL. It was very basic but it worked surprisingly well for the initial version. The change to use nREPL brings with it a few new features like the ability to connect to a remote Clojure process, interrupt long running commands, and capture the value of executed code and redirect it for other purposes. That last feature is especially important in fulfilling one of the original goals of integrating visualizations.</p>
<p>Another improvement to Proto REPL was making its REPL display area act more like a real REPL. Proto REPL took the initial shortcut of directing all output to an Atom text editor window. You could modify anything and there was no place to enter new code for evaluation. The REPL display portion of Proto REPL has been updated to behave more like a traditional REPL. You enter new forms for evaluation at the bottom area. The results are displayed above that. It supports a history of your past evaluations as well.</p>
<h2 id="visualizations">Visualizations</h2>
<p>Proto REPL has had some internal changes that make it easier to extend. Another package can now register itself with Proto REPL via <a href="https://github.com/jasongilman/proto-repl#code-execution-extensions">Code Execution Extensions</a>. This allows the creation of packages that extend Proto REPL and add visualizations or other new capabilities.</p>
<p><a href="https://github.com/jasongilman/proto-repl-charts">Proto REPL Charts</a> is a new Atom package that uses that feature to display tables and graphs of results from executed Clojure Code. Proto REPL Charts supports scatter graphs, line graphs, bar charts, other custom graphs, and tables of data. Once Proto REPL Charts is installed you can execute code like this …</p>
<figure class="highlight"><pre><code class="language-clojure" data-lang="clojure"><span class="p">(</span><span class="nf">prc/bar-chart</span><span class="w">
</span><span class="s">"GDP_By_Year"</span><span class="w">
</span><span class="p">{</span><span class="s">"2013"</span><span class="w"> </span><span class="p">[</span><span class="mi">16768</span><span class="w"> </span><span class="mi">9469</span><span class="w"> </span><span class="mi">4919</span><span class="w"> </span><span class="mi">3731</span><span class="p">]</span><span class="w">
</span><span class="s">"2014"</span><span class="w"> </span><span class="p">[</span><span class="mi">17418</span><span class="w"> </span><span class="mi">10380</span><span class="w"> </span><span class="mi">4616</span><span class="w"> </span><span class="mi">3859</span><span class="p">]}</span><span class="w">
</span><span class="p">{</span><span class="no">:labels</span><span class="w"> </span><span class="p">[</span><span class="s">"US"</span><span class="w"> </span><span class="s">"China"</span><span class="w"> </span><span class="s">"Japan"</span><span class="w"> </span><span class="s">"Germany"</span><span class="p">]})</span></code></pre></figure>
<p>… to display a bar chart like this inside Atom:</p>
<p><img src="https://raw.githubusercontent.com/jasongilman/proto-repl-charts/master/examples/bar_chart.png" alt="Bar Chart" /></p>
<h2 id="enhanced-interactivity">Enhanced Interactivity</h2>
<p>Proto REPL added some improvements to interactivity through integration with <a href="https://github.com/JunoLab/atom-ink">Atom Ink</a>, an Atom package for building interactive IDE components in Atom. Proto REPL currently uses it for displaying results of executed operations inline (next to the code that was executed). The pretty printed version of the result is available by expanding the inline display.</p>
<p><img src="https://raw.githubusercontent.com/jasongilman/proto-repl/master/images/inline_results.gif" alt="Inline Display" /></p>
<p>The inline display becomes very useful when combined with Automatic Evaluation Mode. Automatic Evaluation Mode evaluates every top level form in a specified file <em>as you type</em>. When you work in Clojure you typically move around a file making changes and sending various forms to be re-evaluated at the REPL. Automatic evaluation removes the need for the extra step of manually re-evaluating blocks of code. You just write the code and you immediately see the results. Most Clojure code can be stateless with zero side effects so it’s safe to run over and over. Code with side effects can be run in the traditional way.</p>
<p>See the following gif for a demo. You can even combine Proto REPL Charts with this so your graphs are immediately updated as you type.</p>
<p><img src="https://github.com/jasongilman/proto-repl/raw/master/images/autoeval.gif" alt="Automatic Evaluation" /></p>
<p>Inline display of results, automatic evaluation, and Proto REPL Charts are all about removing the impediments when coding. They’re attempts to make it easier to see what you’re building.</p>
<h2 id="community-feedback">Community Feedback</h2>
<p>I’ve been really pleased with how much positive feedback and support Proto REPL has received. The <a href="https://atom.io/packages/proto-repl">Proto REPL page at Atom</a> reports more than 2,400 downloads so far. There have been many suggestions for improvement, issues filed, and some quality improvements submitted as pull requests including the ability to remotely connect to another nREPL server. Here are some recent quotes from Proto REPL users.</p>
<blockquote>
<p><a href="https://github.com/jasongilman/proto-repl/issues/22#issuecomment-174398496">This is really great. Your repl plus parinfer just made Atom a serious Clojure/Clojurescript editor.</a></p>
<p><a href="https://twitter.com/wwwphilcom/status/695323237606096899">HUGE work has been done since I last tried out the proto-repl Clojure package for Atom. Amazing work!</a></p>
</blockquote>
<blockquote>
<p><a href="https://github.com/jasongilman/proto-repl/issues/22#issuecomment-179295249">This is really exciting! I already find it easier to use than light table, in terms of starting a repl and getting evaluation going.</a></p>
</blockquote>
<h2 id="whats-next">What’s next?</h2>
<p>This is still just the beginning. The foundation is there for being able to use Proto REPL to do real development work but there are still many areas in which the experience could be improved. These are some of the future possibilities.</p>
<ul>
<li>Increased integration with Atom Ink. Atom Ink is in the early stages of development. As Atom Ink improves so will Proto REPL.</li>
<li>Adding the ability to save and view values captured across the execution of an algorithm.</li>
<li>More visualizations and ways to visualize algorithms or different kinds of data.</li>
<li>New features like <a href="https://github.com/jasongilman/proto-repl/issues/30">refactoring support</a> or <a href="https://github.com/jasongilman/proto-repl/issues/28">improved code completion</a>.</li>
</ul>
<p>If you’re interested in helping to improve Proto REPL please take a look at the <a href="https://github.com/jasongilman/proto-repl/issues">issues</a>. You can volunteer or ask questions there about the best way to contribute.</p>
<p>Author: jgilman</p>
<p>Focus (published 2016-01-31T12:00:00+00:00, http://blog.element84.com/focus)</p>
<p>I’ve struggled with what I should write about for my first e84 blog post. You see, I’m not your typical high-tech employee. I don’t have a computer science degree. I’m not a math nut. I’ve never solved a <a href="http://gizmodo.com/this-robot-can-solve-a-rubiks-cube-in-one-second-1754882972" title="Rubik's Cube in One Second">Rubik’s Cube</a>. That being said, I’ve got experience ranging a wide gamut of disciplines that I’ve managed to combine into a skill set that fits fairly well into an extremely high tech workplace. If I could create my own title, it would be “Wildcard” or something else equally ambiguous. I believe there’s value in variety. One of my core approaches to almost every aspect of my life is to incorporate a principle learned in one area into other areas.</p>
<p>This typically results in some random sports analogy in the middle of a sprint planning meeting. Or possibly one of the <a href="http://www.agilemanifesto.org/principles.html" title="Agile Principles">twelve principles</a> of agile development spilling into how I train for my next triathlon. Or even worse, attempting to resolve a “disagreement” with my wife by calling upon the latest nugget I picked up from whatever documentary I last watched. That never works out well.</p>
<p>In that vein, I’m always trying to think of ways to ensure that my teams are functioning at the highest levels. I believe subtle changes can result in major shifts. It’s not productive to reorganize your company every 6 months, but incorporating a subtle shift in philosophy? Or introducing a principle from another area that you believe to be useful in yours? I’m game.</p>
<h3 id="a-championship-approach">A Championship Approach</h3>
<p>I recently listened to an interview with the play-by-play announcer for the Alabama Crimson Tide football team, Phil Savage. He is a former college football coach, NFL general manager (among other positions), and currently serves as the executive director of the Senior Bowl. To say that he has seen and understands what it takes to succeed in football is a pretty gross understatement. Savage was asked what makes Alabama’s football program so great? His answer was really surprising to me. He didn’t highlight Alabama’s <a href="http://sports.yahoo.com/footballrecruiting/football/recruiting/teamrank/2011/all/all" title="Team Recruiting Ranks">dominance in recruiting</a> (ranked #1 every year by <a href="https://rivals.n.rivals.com/" title="Rivals Network">rivals.com</a> with the exception of last year where they struggled to a #2 ranking) or wax on about their rich tradition. The truth is, there are many schools that have the talent that Alabama has and the tradition that Alabama has.</p>
<p>So what is it? What makes them great year in and year out under Nick Saban?</p>
<blockquote>
<p>He (Nick Saban) mentioned this on the show that we did a week ago. He emphasized the importance of every person in what he calls “the organization” knowing what their role is. I asked him, “You’re so driven and so disciplined yourself, how do you get that communicated to 120 players, 30 staff members, and the whole structure of your program?” He said, “Everyone has to know their role and that has to be their focus. And I picked that up from Bill Belichick.”</p>
</blockquote>
<p>There you have it. Two of the greatest coaches in the history of football believe that the key to success is for every person to know their role and make that their focus.</p>
<h3 id="focus--trust-applied">Focus & Trust Applied</h3>
<p>I was struck by the simplicity of that approach. What is your job? Focus on that. As I think back on previous experiences, that model is threatened when there is a lack of trust either up or down the chain. If the CEO doesn’t trust their team, that CEO is going to drive themselves insane by feeling like they have to be involved in everything. If the employees don’t trust their supervisor they’ll either constantly be pressing into areas that they are unqualified for or the tasks they are supposed to be executing will fall by the wayside, which then causes their supervisors to not trust them. And the cycle continues.</p>
<p>Know your role. Focus on that. Trust your team.</p>
<p>That’s one of the reasons I love Element 84. Our hiring process is fairly intense, so when someone has jumped through all of the hoops we know that we are getting the best of the best and there’s an awful lot of trust that comes with that. We are all afforded the opportunity not only to work on some really great projects, but <em>influence</em> our projects on a day to day basis. That’s fun.</p>
<p>So as you move forward with your tasks today, ask yourself… Are you focusing on <em>your job</em> to the best of your abilities? Do you trust your team? If yes, well done. That’s a great place to be.</p>mreeseI’ve struggled with what I should write about for my first e84 blog post. You see, I’m not your typical high-tech employee. I don’t have a computer science degree. I’m not a math nut. I’ve never solved a Rubik’s Cube. That being said, I’ve got experience ranging a wide gamut of disciplines that I’ve managed to combine into a skill set that fits fairly well into an extremely high tech workplace. If I could create my own title, it would be “Wildcard” or something else equally ambiguous. I believe there’s value in variety. One of my core approaches to almost every aspect of my life is to incorporate a principle learned in one area into other areas.Functional Programming for the Functionally Challenged (Like Me)2016-01-30T07:00:00+00:002016-01-30T07:00:00+00:00http://blog.element84.com/elixir-update-in<p>In the <a href="/elixir-get-in.html">previous installment</a> of our
introduction to functional programming we looked
at reading values from nested data structures.
In this final post we look at the flip side of working with nested data structures,
updating them. If you have not read the previous post yet and are not familiar
with Elixir, you might want to read it now, as this post builds on that one.</p>
<h3 id="the-challenge">The Challenge</h3>
<p>If navigating through a nested data structure to retrieve a value seems
challenging, updating a nested structure may appear hopeless, especially since
Elixir and FP languages in general have <em>immutable</em> data. In Java and other
languages that support mutable data structures we can change nested values <em>in-place</em>.
This usually means changing a nested value is no harder than reading it.
As long as we can navigate to the value we want to change we can do whatever we
want with it.</p>
<p>With immutable data we can’t simply change one value in a data structure. If
we want to update a value, we have to return a whole new data structure with
the value changed. This is not as bad as it sounds. First of all, since we know
the original data structure can’t be changed (immutable, remember?) our new
structure can be composed of the original structure plus the part that has changed.
Also, we don’t have to construct this ourselves; Elixir provides an API that makes
this, well, not easy, but easier than it would be without it.</p>
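<p>To make the immutability point concrete, here is a minimal sketch using a small flat map (the variable names <code class="highlighter-rouge">original</code> and <code class="highlighter-rouge">updated</code> are made up for illustration). It relies only on <code class="highlighter-rouge">Map.put/3</code> from the standard library, which returns a new map and leaves its argument untouched.</p>

```elixir
original = %{age: 30, weight: 170}

# Map.put/3 returns a new map; it never modifies its argument.
updated = Map.put(original, :weight, 175)

IO.inspect(original) # still holds weight: 170
IO.inspect(updated)  # holds weight: 175
```

<p>Both maps remain usable after the call, which is why code holding a reference to the original never sees a surprise change.</p>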
<p>Let’s start with a fairly simple nested map. Java has no map literals,
so we have to construct an empty map and use mutation right out of the gate.</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">Map</span><span class="o"><</span><span class="n">String</span><span class="o">,</span> <span class="n">Map</span><span class="o"><</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">>></span> <span class="n">myMap</span> <span class="o">=</span> <span class="k">new</span> <span class="n">HashMap</span><span class="o"><</span><span class="n">String</span><span class="o">,</span> <span class="n">Map</span><span class="o"><</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">>>();</span>
<span class="n">myMap</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="s">"Joe"</span><span class="o">,</span> <span class="k">new</span> <span class="n">HashMap</span><span class="o"><</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">>());</span>
<span class="n">myMap</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">"Joe"</span><span class="o">).</span><span class="na">put</span><span class="o">(</span><span class="s">"age"</span><span class="o">,</span> <span class="mi">30</span><span class="o">);</span>
<span class="n">myMap</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">"Joe"</span><span class="o">).</span><span class="na">put</span><span class="o">(</span><span class="s">"weight"</span><span class="o">,</span> <span class="mi">170</span><span class="o">);</span>
<span class="n">myMap</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="s">"Bob"</span><span class="o">,</span> <span class="k">new</span> <span class="n">HashMap</span><span class="o"><</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">>());</span>
<span class="n">myMap</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">"Bob"</span><span class="o">).</span><span class="na">put</span><span class="o">(</span><span class="s">"age"</span><span class="o">,</span> <span class="mi">32</span><span class="o">);</span>
<span class="n">myMap</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">"Bob"</span><span class="o">).</span><span class="na">put</span><span class="o">(</span><span class="s">"weight"</span><span class="o">,</span> <span class="mi">172</span><span class="o">);</span>
<span class="o">=></span> <span class="o">{</span><span class="n">Joe</span><span class="o">={</span><span class="n">weight</span><span class="o">=</span><span class="mi">170</span><span class="o">,</span> <span class="n">age</span><span class="o">=</span><span class="mi">30</span><span class="o">},</span> <span class="n">Bob</span><span class="o">={</span><span class="n">weight</span><span class="o">=</span><span class="mi">172</span><span class="o">,</span> <span class="n">age</span><span class="o">=</span><span class="mi">32</span><span class="o">}}</span></code></pre></figure>
<p>Elixir supports map literals, so initializing a map is simpler.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">my_map</span> <span class="o">=</span> <span class="p">%{</span><span class="sd">"</span><span class="s2">Joe"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">30</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">170</span><span class="p">},</span> <span class="sd">"</span><span class="s2">Bob"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">32</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">172</span><span class="p">}}</span>
<span class="o">=></span> <span class="p">%{</span><span class="sd">"</span><span class="s2">Bob"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">32</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">172</span><span class="p">},</span> <span class="sd">"</span><span class="s2">Joe"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">30</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">170</span><span class="p">}}</span></code></pre></figure>
<p>As an aside, notice that we use <em>atoms</em> (like symbols in Ruby) for the keys
that don’t vary (age, weight) and strings for the keys that do (name).</p>
<p>Let’s say Joe has gained a little weight and we want to update our map to reflect this.
In Java this is straightforward. In fact, it’s exactly what we did to
initialize our map.</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">myMap</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">"Joe"</span><span class="o">).</span><span class="na">put</span><span class="o">(</span><span class="s">"weight"</span><span class="o">,</span> <span class="mi">175</span><span class="o">);</span>
<span class="o">=></span> <span class="o">{</span><span class="n">Joe</span><span class="o">={</span><span class="n">weight</span><span class="o">=</span><span class="mi">175</span><span class="o">,</span> <span class="n">age</span><span class="o">=</span><span class="mi">30</span><span class="o">},</span> <span class="n">Bob</span><span class="o">={</span><span class="n">weight</span><span class="o">=</span><span class="mi">172</span><span class="o">,</span> <span class="n">age</span><span class="o">=</span><span class="mi">32</span><span class="o">}}</span></code></pre></figure>
<p>For maps, Elixir is almost as simple as Java. For this we have the
<a href="http://elixir-lang.org/docs/master/elixir/Kernel.html#put_in/3"><code class="highlighter-rouge">Kernel.put_in</code></a>
function. It takes a list of keys and uses them to navigate a data structure, in the
same way as <a href="http://elixir-lang.org/docs/master/elixir/Kernel.html#get_in/2"><code class="highlighter-rouge">get_in</code></a>
which we used in the <a href="/elixir-get-in.html">previous post</a>.
<code class="highlighter-rouge">get_in</code> and <code class="highlighter-rouge">put_in</code> are part of a family of functions that share the same
semantics for working with nested data structures.</p>
<p>For our nested map we have the following:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">new_map</span> <span class="o">=</span> <span class="n">put_in</span><span class="p">(</span><span class="n">my_map</span><span class="p">,</span> <span class="p">[</span><span class="sd">"</span><span class="s2">Joe"</span><span class="p">,</span> <span class="ss">:weight</span><span class="p">],</span> <span class="m">175</span><span class="p">)</span>
<span class="o">=></span> <span class="p">%{</span><span class="sd">"</span><span class="s2">Bob"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">32</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">172</span><span class="p">},</span> <span class="sd">"</span><span class="s2">Joe"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">30</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">175</span><span class="p">}}</span></code></pre></figure>
<p>We can use a short-cut syntax to get even closer to the Java style.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">new_map</span> <span class="o">=</span> <span class="n">put_in</span> <span class="n">my_map</span><span class="p">[</span><span class="sd">"</span><span class="s2">Joe"</span><span class="p">][</span><span class="ss">:weight</span><span class="p">],</span> <span class="m">175</span>
<span class="o">=></span> <span class="p">%{</span><span class="sd">"</span><span class="s2">Bob"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">32</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">172</span><span class="p">},</span> <span class="sd">"</span><span class="s2">Joe"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">30</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">175</span><span class="p">}}</span></code></pre></figure>
<p>Now suppose we know that Joe’s weight increased by ten percent, but we don’t know
the actual value. In Java we have</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="kt">int</span> <span class="n">joeWeight</span> <span class="o">=</span> <span class="o">(</span><span class="kt">int</span><span class="o">)</span> <span class="o">(</span><span class="n">myMap</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">"Joe"</span><span class="o">).</span><span class="na">get</span><span class="o">(</span><span class="s">"weight"</span><span class="o">)</span> <span class="o">*</span> <span class="mf">1.1</span><span class="o">);</span>
<span class="n">myMap</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">"Joe"</span><span class="o">).</span><span class="na">put</span><span class="o">(</span><span class="s">"weight"</span><span class="o">,</span> <span class="n">joeWeight</span><span class="o">);</span>
<span class="o">=></span> <span class="o">{</span><span class="n">Joe</span><span class="o">={</span><span class="n">weight</span><span class="o">=</span><span class="mi">187</span><span class="o">,</span> <span class="n">age</span><span class="o">=</span><span class="mi">30</span><span class="o">},</span> <span class="n">Bob</span><span class="o">={</span><span class="n">weight</span><span class="o">=</span><span class="mi">172</span><span class="o">,</span> <span class="n">age</span><span class="o">=</span><span class="mi">32</span><span class="o">}}</span></code></pre></figure>
<p>In Elixir when we want to update to some function of the
current value we use <a href="http://elixir-lang.org/docs/master/elixir/Kernel.html#update_in/3"><code class="highlighter-rouge">Kernel.update_in</code></a>.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">new_map</span> <span class="o">=</span> <span class="n">update_in</span> <span class="n">my_map</span><span class="p">[</span><span class="sd">"</span><span class="s2">Joe"</span><span class="p">][</span><span class="ss">:weight</span><span class="p">],</span> <span class="k">fn</span> <span class="n">w</span> <span class="o">-></span> <span class="n">w</span> <span class="o">*</span> <span class="m">1.1</span> <span class="k">end</span>
<span class="o">=></span> <span class="p">%{</span><span class="sd">"</span><span class="s2">Bob"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">32</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">172</span><span class="p">},</span> <span class="sd">"</span><span class="s2">Joe"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">30</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">187.00000000000003</span><span class="p">}}</span></code></pre></figure>
<p>The biggest difference in the two approaches is that with <code class="highlighter-rouge">update_in</code> we pass a
function in (remember <em>higher order functions</em> from the
<a href="/elixir-feelin-loopy.html">first post</a>?) instead
of a value. The function takes the current value and returns the new one.
The function definition is the <code class="highlighter-rouge">fn w -> w * 1.1 end</code> bit.</p>
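<p>Multiplying an integer by <code class="highlighter-rouge">1.1</code> produces a float in Elixir, so if the weight should stay an integer (as the Java cast to <code class="highlighter-rouge">int</code> does) the result has to be converted back. The sketch below is one possible way to do that, assuming rounding is acceptable; it also uses the capture operator <code class="highlighter-rouge">&</code> as a shorter way to write the anonymous function.</p>

```elixir
my_map = %{"Joe" => %{age: 30, weight: 170}, "Bob" => %{age: 32, weight: 172}}

# Same update written with the capture operator; round/1 converts the
# float produced by the multiplication back to an integer.
new_map = update_in(my_map["Joe"][:weight], &round(&1 * 1.1))

IO.inspect(new_map["Joe"][:weight])
```

<p>Note that <code class="highlighter-rouge">round/1</code> rounds to the nearest integer while the Java cast truncates, so the two can differ for other inputs.</p>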
<p>Because <code class="highlighter-rouge">put_in</code> and <code class="highlighter-rouge">update_in</code> follow the same semantics as <code class="highlighter-rouge">get_in</code> we have
a lot of flexibility for updating nested structures. Suppose a year has passed and
we need to update everyone’s age. In an imperative language we don’t have a lot of
options. Using a non-functional approach in Java we would have to update every
record separately, like this:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="k">for</span> <span class="o">(</span><span class="n">Map</span><span class="o">.</span><span class="na">Entry</span><span class="o"><</span><span class="n">String</span><span class="o">,</span> <span class="n">Map</span><span class="o"><</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">>></span> <span class="n">entry</span> <span class="o">:</span> <span class="n">myMap</span><span class="o">.</span><span class="na">entrySet</span><span class="o">())</span> <span class="o">{</span>
    <span class="n">entry</span><span class="o">.</span><span class="na">getValue</span><span class="o">().</span><span class="na">put</span><span class="o">(</span><span class="s">"age"</span><span class="o">,</span> <span class="n">entry</span><span class="o">.</span><span class="na">getValue</span><span class="o">().</span><span class="na">get</span><span class="o">(</span><span class="s">"age"</span><span class="o">)</span> <span class="o">+</span> <span class="mi">1</span><span class="o">);</span>
<span class="o">}</span></code></pre></figure>
<p>In Elixir we can pass a function saying “choose everything”
as the first key in our list of keys passed
to <code class="highlighter-rouge">update_in</code>. We’ll call this function <code class="highlighter-rouge">all</code>, meaning we want to work on all
the keys at the top level. This is just a style choice; we could call it <code class="highlighter-rouge">foo</code>
and <code class="highlighter-rouge">update_in</code> would not care.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">new_map</span> <span class="o">=</span> <span class="n">update_in</span> <span class="n">my_map</span><span class="p">,</span> <span class="p">[</span><span class="n">all</span><span class="p">,</span> <span class="ss">:age</span><span class="p">],</span> <span class="k">fn</span> <span class="n">age</span> <span class="o">-></span> <span class="n">age</span> <span class="o">+</span> <span class="m">1</span> <span class="k">end</span>
<span class="o">=></span> <span class="p">%{</span><span class="sd">"</span><span class="s2">Bob"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">33</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">172</span><span class="p">},</span> <span class="sd">"</span><span class="s2">Joe"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">31</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">170</span><span class="p">}}</span></code></pre></figure>
<p>Before defining the <code class="highlighter-rouge">all</code> function it’s helpful to understand how
keys work in <code class="highlighter-rouge">update_in</code> (or <code class="highlighter-rouge">get_in</code>, <code class="highlighter-rouge">put_in</code>, <code class="highlighter-rouge">get_and_update_in</code>).
As we traverse our nested data structure we can
think of each level in our traversal as a substructure. Each key
gets applied to the substructure for the current point in the traversal.
The key determines which portion of the substructure is used for the next key.</p>
<p>Given our original map, if we apply the key list <code class="highlighter-rouge">["Joe", :age]</code> then we start
with the map</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="p">%{</span><span class="sd">"</span><span class="s2">Bob"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">32</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">172</span><span class="p">},</span> <span class="sd">"</span><span class="s2">Joe"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">30</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">170</span><span class="p">}}</span></code></pre></figure>
<p>apply the key <code class="highlighter-rouge">"Joe"</code> to get the next substructure</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="p">%{</span><span class="ss">age:</span> <span class="m">30</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">170</span><span class="p">}</span></code></pre></figure>
<p>then apply the key <code class="highlighter-rouge">:age</code> to get the final substructure, in this case a single
value</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="m">30</span></code></pre></figure>
<p>With normal keys like strings or atoms this happens
automatically. With function keys, it’s a bit more complicated as our function
must handle some of the work of the traversal.</p>
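<p>The traversal for plain keys can be reproduced by hand, which may help when reasoning about what a function key will receive. The sketch below applies each key in turn using ordinary bracket access and then compares the result with <code class="highlighter-rouge">get_in</code>; the intermediate variable names are only for illustration.</p>

```elixir
my_map = %{"Joe" => %{age: 30, weight: 170}, "Bob" => %{age: 32, weight: 172}}

# Apply the first key to the whole map to get Joe's substructure.
joe = my_map["Joe"]

# Apply the second key to that substructure to get the final value.
age = joe[:age]

# get_in/2 performs the same two steps for us.
same_age = get_in(my_map, ["Joe", :age])
```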
<p>Every function used as a key to <code class="highlighter-rouge">update_in</code> must have the same signature,</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">fun</span><span class="p">(</span><span class="ss">:get_and_update</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">next</span><span class="p">)</span></code></pre></figure>
<p>The first argument is an atom, <code class="highlighter-rouge">:get_and_update</code>, meaning our function
only matches if someone calls it with this as the first argument.
This is the action that will
be passed by <code class="highlighter-rouge">update_in</code> to our function. The reason <code class="highlighter-rouge">update_in</code> passes the
action is because we sometimes want to write multiple function clauses to handle
different actions, such as <code class="highlighter-rouge">:get</code> for the <code class="highlighter-rouge">get_in</code> function.
You may be wondering why the action is <code class="highlighter-rouge">:get_and_update</code> instead of just
<code class="highlighter-rouge">:update</code>. The reason is that <code class="highlighter-rouge">update_in</code> actually defers to
<code class="highlighter-rouge">get_and_update_in</code> which expects this action.</p>
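<p>Calling <code class="highlighter-rouge">get_and_update_in</code> directly shows where the name of the action comes from. In this sketch the callback returns a two-element tuple: the value to hand back to the caller and the value to store. The choice to add 5 to the weight is arbitrary.</p>

```elixir
my_map = %{"Joe" => %{age: 30, weight: 170}, "Bob" => %{age: 32, weight: 172}}

# The callback returns {value_to_get, value_to_store}.
{old_weight, new_map} =
  get_and_update_in(my_map, ["Joe", :weight], fn w -> {w, w + 5} end)

IO.inspect(old_weight)
IO.inspect(new_map["Joe"][:weight])
```

<p>With <code class="highlighter-rouge">update_in</code> only the second half of that tuple is kept.</p>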
<p>The second argument, <code class="highlighter-rouge">data</code>, is the substructure for the current point of the
traversal. This is the only data our function has to care about - all the data
higher up in the traversal has already been handled. The final argument, <code class="highlighter-rouge">next</code>,
is a function that we must call on the elements of <code class="highlighter-rouge">data</code> that we want to
traverse.</p>
<p>So all key functions for <code class="highlighter-rouge">update_in</code> need to do the same thing,
<span data-pullquote="All key functions take a substructure and transform it by identifying the elements of interest and calling `next` on them.">
take a substructure and transform it by identifying the elements of interest
and calling <code class="highlighter-rouge">next</code> on them. Normally the key function would leave the
other elements alone, but this doesn’t <em>have</em> to be the case – we will use this
to our advantage later on.</span></p>
<p>The <code class="highlighter-rouge">all</code> function is given below:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="code"><pre><span class="n">all</span> <span class="o">=</span> <span class="k">fn</span> <span class="ss">:get_and_update</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">next</span> <span class="o">-></span>
<span class="n">data_list</span> <span class="o">=</span> <span class="no">Map</span><span class="o">.</span><span class="n">to_list</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
<span class="n">new_list</span> <span class="o">=</span> <span class="no">Enum</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">data_list</span><span class="p">,</span> <span class="k">fn</span> <span class="p">{</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="p">}</span> <span class="o">-></span>
<span class="p">{</span><span class="n">_</span><span class="p">,</span> <span class="n">updated</span><span class="p">}</span> <span class="o">=</span> <span class="n">next</span><span class="o">.</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
<span class="p">{</span><span class="n">key</span><span class="p">,</span> <span class="n">updated</span><span class="p">}</span>
<span class="k">end</span><span class="p">)</span>
<span class="p">{</span><span class="no">nil</span><span class="p">,</span> <span class="no">Enum</span><span class="o">.</span><span class="n">into</span><span class="p">(</span><span class="n">new_list</span><span class="p">,</span> <span class="p">%{})}</span>
<span class="k">end</span></pre></td></tr></tbody></table></code></pre></figure>
<p>This function is a bit complicated at first glance, so let’s break it down.</p>
<p>On line 1 starting with <code class="highlighter-rouge">fn</code> we define the signature for an anonymous function
that matches on three arguments. The first argument must be <code class="highlighter-rouge">:get_and_update</code>,
which is the action that <code class="highlighter-rouge">update_in</code> will pass when this function is called.
The other two arguments, <code class="highlighter-rouge">data</code> and <code class="highlighter-rouge">next</code>, are the current substructure and
the function we need to call on the selected elements.
We bind (assign) this to the variable <code class="highlighter-rouge">all</code> so we can pass it as one of the keys
to <code class="highlighter-rouge">update_in</code>.</p>
<p>The substructure passed to this function will actually be our whole map,
since <code class="highlighter-rouge">all</code> is the first key in our list of keys passed to <code class="highlighter-rouge">update_in</code>.
On line 2 we convert the map to a list. We do this because we want to process
every value in the map. Converting it to a list turns this</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="p">%{</span><span class="sd">"</span><span class="s2">Bob"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">32</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">172</span><span class="p">},</span> <span class="sd">"</span><span class="s2">Joe"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">30</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">170</span><span class="p">}}</span></code></pre></figure>
<p>into this</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="p">[{</span><span class="sd">"</span><span class="s2">Bob"</span><span class="p">,</span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">32</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">172</span><span class="p">}},</span> <span class="p">{</span><span class="sd">"</span><span class="s2">Joe"</span><span class="p">,</span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">30</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">170</span><span class="p">}}]</span></code></pre></figure>
<p>which is a list of two-element (key/value) tuples.</p>
<p>On lines 3-6 we transform this list into a new list by using <code class="highlighter-rouge">map</code> to call
the <code class="highlighter-rouge">next</code> function on every value. This gives us a new list that has
our original keys and transformed values.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="p">[{</span><span class="sd">"</span><span class="s2">Bob"</span><span class="p">,</span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">33</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">172</span><span class="p">}},</span> <span class="p">{</span><span class="sd">"</span><span class="s2">Joe"</span><span class="p">,</span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">31</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">170</span><span class="p">}}]</span></code></pre></figure>
<p>On line 7 we construct a return value for our <code class="highlighter-rouge">all</code> function that consists
of a two element tuple. The first element is <code class="highlighter-rouge">nil</code>. It could be anything
because <code class="highlighter-rouge">update_in</code> is going to ignore it. If we passed our <code class="highlighter-rouge">all</code> function
to <code class="highlighter-rouge">get_and_update_in</code> then we would want to return the original value,
<code class="highlighter-rouge">data</code>, instead of <code class="highlighter-rouge">nil</code>. The second element of the tuple is a new map
that we construct by calling <code class="highlighter-rouge">Enum.into(new_list, %{})</code>. Moving back and forth
between maps and lists is routine in Elixir.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="p">{</span><span class="no">nil</span><span class="p">,</span> <span class="p">%{</span><span class="sd">"</span><span class="s2">Bob"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">33</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">172</span><span class="p">},</span> <span class="sd">"</span><span class="s2">Joe"</span> <span class="o">=></span> <span class="p">%{</span><span class="ss">age:</span> <span class="m">31</span><span class="p">,</span> <span class="ss">weight:</span> <span class="m">170</span><span class="p">}}}</span></code></pre></figure>
<p>Again, our function key has one job: to transform the current substructure
by calling <code class="highlighter-rouge">next</code> on the elements we care about and leaving the other elements
alone.</p>
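<p>The round trip between a map and a list of key/value tuples can be tried on its
own. The following is a minimal sketch, not the function key itself: the variable
names are made up for illustration, and the age increment stands in for whatever
<code class="highlighter-rouge">next</code> would do to each value.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir">people = %{"Bob" => %{age: 32, weight: 172}, "Joe" => %{age: 30, weight: 170}}

# A map enumerates as a list of {key, value} tuples.
pairs = Enum.to_list(people)

# Transform each value while keeping its key (assumed transformation: age + 1).
older = Enum.map(pairs, fn {name, person} -> {name, %{person | age: person.age + 1}} end)

# Collect the tuples back into a map.
result = Enum.into(older, %{})
# result is %{"Bob" => %{age: 33, weight: 172}, "Joe" => %{age: 31, weight: 170}}</code></pre></figure>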
<p>Let’s put everything we have learned in one final example. Suppose we have a
company database represented by the following data structure:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">company</span> <span class="o">=</span> <span class="p">%{</span><span class="ss">total_revenue:</span> <span class="m">10_000_000</span><span class="p">,</span>
<span class="ss">sales:</span> <span class="p">[%{</span><span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Joe"</span><span class="p">,</span> <span class="ss">bonus:</span> <span class="m">0</span><span class="p">,</span> <span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">XYZ Inc."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">ABC Co."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">McDoogles"</span><span class="p">]},</span>
<span class="p">%{</span><span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Bob"</span><span class="p">,</span> <span class="ss">bonus:</span> <span class="m">0</span><span class="p">,</span> <span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">ACME"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Tarjet"</span><span class="p">]},</span>
<span class="p">%{</span><span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Jane"</span><span class="p">,</span> <span class="ss">bonus:</span> <span class="m">0</span><span class="p">,</span> <span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">Homely Depot"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Element 84"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Pear Computers"</span><span class="p">]},</span>
<span class="p">%{</span><span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Jill"</span><span class="p">,</span> <span class="ss">bonus:</span> <span class="m">0</span><span class="p">,</span> <span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">Four Guys"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Gas 'N Sip'"</span><span class="p">]}]}</span></code></pre></figure>
<p>We want to update the records for our salespeople to reflect their quarterly
bonuses. The formula for the bonus is given by</p>
<p>\begin{equation}
\text{bonus} = \$1000 \times \text{number of accounts}
\end{equation}</p>
<p>We could calculate everyone’s bonus by calling <code class="highlighter-rouge">get_in</code> for each employee to
get their accounts and using the count of accounts to compute their bonus. Then
we could call <code class="highlighter-rouge">update_in</code> to set their bonus using the values we calculated.
But that sounds like a lot of work. Besides, it means we would have to <em>know</em>
the names of all our sales people. Who has time for <em>that</em>?</p>
<p>Instead, we can update all the bonuses in one call to <code class="highlighter-rouge">update_in</code> by exploiting
what we know about how function keys work. Normally, a function key is just used
as a selector. It chooses which path(s) to follow and eventually when we reach the
final element selected by the final key, <code class="highlighter-rouge">update_in</code> applies the transformation
function we gave it and things bubble back up. So normally we end up modifying
the element selected by our last key and all the other elements in our nested
structure are left alone.</p>
<p>But remember the primary responsibility of the function key is to <em>transform</em>
the substructure that is passed to it. Normally it does this by returning
whatever bubbled up from subsequent keys, but it doesn’t <em>have</em> to do this.
In fact, we are free to modify the data structure however we want. In this
case, we can update the value for the <code class="highlighter-rouge">:bonus</code> key with whatever bubbled up
from the subsequent keys.</p>
<p>Our call to <code class="highlighter-rouge">update_in</code> will look like this</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">update_in</span> <span class="n">company</span><span class="p">,</span> <span class="p">[</span><span class="ss">:sales</span><span class="p">,</span> <span class="n">update_bonus</span><span class="p">,</span> <span class="ss">:accounts</span><span class="p">],</span> <span class="k">fn</span> <span class="n">accounts</span> <span class="o">-></span>
<span class="m">1000</span> <span class="o">*</span> <span class="no">Enum</span><span class="o">.</span><span class="n">count</span><span class="p">(</span><span class="n">accounts</span><span class="p">)</span>
<span class="k">end</span> </code></pre></figure>
<p>The first key, <code class="highlighter-rouge">:sales</code>, selects the sales substructure. The next key,
<code class="highlighter-rouge">update_bonus</code>, is a function key that will be responsible for returning
a new substructure with all the bonuses filled in. The last key, <code class="highlighter-rouge">:accounts</code>,
selects the list of accounts for each salesperson.</p>
<p>The last argument to <code class="highlighter-rouge">update_in</code> is the transformation function, which operates
on the last element in the traversal, in this case the list of accounts
for each salesperson. It simply computes a bonus by multiplying the number
of accounts by 1000.</p>
<p>The atom keys and the transformation function should be easy enough to
understand. The only tricky part is the <code class="highlighter-rouge">update_bonus</code> function key, which
is given here</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="code"><pre><span class="n">update_bonus</span> <span class="o">=</span> <span class="k">fn</span> <span class="ss">:get_and_update</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">next</span> <span class="o">-></span>
<span class="n">new_data</span> <span class="o">=</span> <span class="no">Enum</span><span class="o">.</span><span class="n">map</span> <span class="n">data</span><span class="p">,</span> <span class="k">fn</span> <span class="n">salesperson</span> <span class="o">-></span>
<span class="p">{</span><span class="n">_</span><span class="p">,</span> <span class="n">updated_salesperson</span><span class="p">}</span> <span class="o">=</span> <span class="n">next</span><span class="o">.</span><span class="p">(</span><span class="n">salesperson</span><span class="p">)</span>
<span class="n">bonus</span> <span class="o">=</span> <span class="n">updated_salesperson</span><span class="p">[</span><span class="ss">:accounts</span><span class="p">]</span>
<span class="n">put_in</span><span class="p">(</span><span class="n">salesperson</span><span class="p">,</span> <span class="p">[</span><span class="ss">:bonus</span><span class="p">],</span> <span class="n">bonus</span><span class="p">)</span>
<span class="k">end</span>
<span class="p">{</span><span class="no">nil</span><span class="p">,</span> <span class="n">new_data</span><span class="p">}</span>
<span class="k">end</span></pre></td></tr></tbody></table></code></pre></figure>
<p>Hopefully this is starting to make sense to you, but let’s break it down to
be sure. Line 1 is the function signature that we have examined earlier and
the assignment to a variable, <code class="highlighter-rouge">update_bonus</code>, so we can pass it to
<code class="highlighter-rouge">update_in</code>. The <code class="highlighter-rouge">data</code> passed to this function will be the list of salesperson maps
as selected by the <code class="highlighter-rouge">:sales</code> key. Line 2 is going to transform all the elements in
the list by mapping over them using <code class="highlighter-rouge">Enum.map</code>.</p>
<p>The function applied to each of the elements takes a single argument,
<code class="highlighter-rouge">salesperson</code>, which will be a map. The body of the function is given in lines
3-5. Line 3 calls the <code class="highlighter-rouge">next</code> function on the <code class="highlighter-rouge">salesperson</code> map and gets back
a two-element tuple whose second element is an updated salesperson map, with the
bonus computed by the transformation function. If we were to just
bubble this value back up we would have a mess since we would be replacing
our list of accounts with the bonus. Instead, on line 4 we extract the
computed bonus from <code class="highlighter-rouge">updated_salesperson</code> (it’s under <code class="highlighter-rouge">:accounts</code>) and on line
5 we set the original map’s
<code class="highlighter-rouge">:bonus</code> key to the calculated bonus. This new map replaces our <code class="highlighter-rouge">salesperson</code> map.</p>
<p>Finally, on line 7, we return our updated list of maps in a two element
tuple, using <code class="highlighter-rouge">nil</code> as the first value (remember, <code class="highlighter-rouge">update_in</code> will ignore this
value).</p>
<p>Putting it all together, we start with the first key, <code class="highlighter-rouge">:sales</code>, which points
to this substructure, a list of maps:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="p">[%{</span><span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Joe"</span><span class="p">,</span> <span class="ss">bonus:</span> <span class="m">0</span><span class="p">,</span> <span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">XYZ Inc."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">ABC Co."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">McDoogles"</span><span class="p">]},</span>
<span class="p">%{</span><span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Bob"</span><span class="p">,</span> <span class="ss">bonus:</span> <span class="m">0</span><span class="p">,</span> <span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">ACME"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Tarjet"</span><span class="p">]},</span>
<span class="p">%{</span><span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Jane"</span><span class="p">,</span> <span class="ss">bonus:</span> <span class="m">0</span><span class="p">,</span> <span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">Homely Depot"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Element 84"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Pear Computers"</span><span class="p">]},</span>
<span class="p">%{</span><span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Jill"</span><span class="p">,</span> <span class="ss">bonus:</span> <span class="m">0</span><span class="p">,</span> <span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">Four Guys"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Gas 'N Sip'"</span><span class="p">]}]</span></code></pre></figure>
<p>This list is the structure that is passed to our <code class="highlighter-rouge">update_bonus</code> function as the
<code class="highlighter-rouge">data</code> parameter.</p>
<p><code class="highlighter-rouge">update_bonus</code> then calls <code class="highlighter-rouge">Enum.map</code> on this list. The function passed to <code class="highlighter-rouge">Enum.map</code>
gets a map (<code class="highlighter-rouge">salesperson</code>) as its argument. It calls <code class="highlighter-rouge">next</code> on this map, which
moves on to the next key in our key list, <code class="highlighter-rouge">:accounts</code>. The substructure pointed
to by <code class="highlighter-rouge">:accounts</code> is the list of accounts. So for our first salesperson map, it
would be this</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="p">[</span><span class="sd">"</span><span class="s2">XYZ Inc."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">ABC Co."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">McDoogles"</span><span class="p">]</span></code></pre></figure>
<p>Since <code class="highlighter-rouge">:accounts</code> is the last key in the list passed to <code class="highlighter-rouge">update_in</code>, <code class="highlighter-rouge">update_in</code>
calls its transformation function on this account list, which looks like this</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="m">1000</span> <span class="o">*</span> <span class="no">Enum</span><span class="o">.</span><span class="n">count</span><span class="p">([</span><span class="sd">"</span><span class="s2">XYZ Inc."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">ABC Co."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">McDoogles"</span><span class="p">])</span>
<span class="o">=></span> <span class="m">3000</span></code></pre></figure>
<p>Now the computed value begins to bubble back up. So at this point we have</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="p">{</span><span class="n">_</span><span class="p">,</span> <span class="n">updated_salesperson</span><span class="p">}</span> <span class="o">=</span> <span class="p">{</span><span class="no">nil</span><span class="p">,</span> <span class="p">%{</span><span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Joe"</span><span class="p">,</span> <span class="ss">bonus:</span> <span class="m">0</span><span class="p">,</span> <span class="ss">accounts:</span> <span class="m">3000</span><span class="p">}}</span></code></pre></figure>
<p>on line 3 of our <code class="highlighter-rouge">update_bonus</code> function.</p>
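<p>To see the shape of that tuple without the rest of the traversal, we can call
<code class="highlighter-rouge">get_and_update_in</code> directly on a single salesperson map. This is only an
approximation of what <code class="highlighter-rouge">next</code> does for us; the variable <code class="highlighter-rouge">joe</code> is made up for
illustration, and returning <code class="highlighter-rouge">nil</code> as the first tuple element is an assumption
that mirrors the value <code class="highlighter-rouge">update_in</code> ignores.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir">joe = %{salesperson: "Joe", bonus: 0, accounts: ["XYZ Inc.", "ABC Co.", "McDoogles"]}

# The callback returns {value_to_get, new_value}; only the new value ends up in the map.
result =
  get_and_update_in(joe, [:accounts], fn accounts ->
    {nil, 1000 * Enum.count(accounts)}
  end)
# result is {nil, %{salesperson: "Joe", bonus: 0, accounts: 3000}}</code></pre></figure>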
<p>The computed bonus, 3000, is under the <code class="highlighter-rouge">:accounts</code> key, since this is the last
key in our key list. We simply extract it on line 4 with</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">bonus</span> <span class="o">=</span> <span class="n">updated_salesperson</span><span class="p">[</span><span class="ss">:accounts</span><span class="p">]</span>
<span class="o">=></span> <span class="m">3000</span></code></pre></figure>
<p>Now that we have the bonus, we just update the original <code class="highlighter-rouge">salesperson</code> map that
was passed to us, on line 5</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">put_in</span><span class="p">(</span><span class="n">salesperson</span><span class="p">,</span> <span class="p">[</span><span class="ss">:bonus</span><span class="p">],</span> <span class="n">bonus</span><span class="p">)</span>
<span class="o">=></span> <span class="p">%{</span><span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Joe"</span><span class="p">,</span> <span class="ss">bonus:</span> <span class="m">3000</span><span class="p">,</span> <span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">XYZ Inc."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">ABC Co."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">McDoogles"</span><span class="p">]}</span></code></pre></figure>
<p><code class="highlighter-rouge">update_bonus</code> does this for every salesperson map in the list it received, so
finally, we have</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">update_in</span> <span class="n">company</span><span class="p">,</span> <span class="p">[</span><span class="ss">:sales</span><span class="p">,</span> <span class="n">update_bonus</span><span class="p">,</span> <span class="ss">:accounts</span><span class="p">],</span> <span class="k">fn</span> <span class="n">accounts</span> <span class="o">-></span>
<span class="m">1000</span> <span class="o">*</span> <span class="no">Enum</span><span class="o">.</span><span class="n">count</span><span class="p">(</span><span class="n">accounts</span><span class="p">)</span>
<span class="k">end</span>
<span class="o">=></span> <span class="p">%{</span><span class="ss">sales:</span> <span class="p">[%{</span><span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">XYZ Inc."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">ABC Co."</span><span class="p">,</span> <span class="sd">"</span><span class="s2">McDoogles"</span><span class="p">],</span> <span class="ss">bonus:</span> <span class="m">3000</span><span class="p">,</span> <span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Joe"</span><span class="p">},</span>
<span class="p">%{</span><span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">ACME"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Tarjet"</span><span class="p">],</span> <span class="ss">bonus:</span> <span class="m">2000</span><span class="p">,</span> <span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Bob"</span><span class="p">},</span>
<span class="p">%{</span><span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">Homely Depot"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Element 84"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Pear Computers"</span><span class="p">],</span> <span class="ss">bonus:</span> <span class="m">3000</span><span class="p">,</span> <span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Jane"</span><span class="p">},</span>
<span class="p">%{</span><span class="ss">accounts:</span> <span class="p">[</span><span class="sd">"</span><span class="s2">Four Guys"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Gas 'N Sip'"</span><span class="p">],</span> <span class="ss">bonus:</span> <span class="m">2000</span><span class="p">,</span> <span class="ss">salesperson:</span> <span class="sd">"</span><span class="s2">Jill"</span><span class="p">}],</span>
<span class="ss">total_revenue:</span> <span class="m">10000000</span><span class="p">}</span></code></pre></figure>
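<p>For comparison, here is a sketch of the same update written without a hand-rolled
function key. It assumes an Elixir release that ships <code class="highlighter-rouge">Access.all/0</code>, which,
as far as I know, arrived after this post was written, and it uses a shortened
version of the company data. It is a different technique from the one above: the
transformation function receives each whole salesperson map rather than the
account list.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir">company = %{total_revenue: 10_000_000,
  sales: [%{salesperson: "Joe", bonus: 0, accounts: ["XYZ Inc.", "ABC Co.", "McDoogles"]},
          %{salesperson: "Bob", bonus: 0, accounts: ["ACME", "Tarjet"]}]}

# Access.all() visits every element of the list selected by :sales.
updated =
  update_in(company, [:sales, Access.all()], fn salesperson ->
    %{salesperson | bonus: 1000 * Enum.count(salesperson.accounts)}
  end)

bonuses = Enum.map(updated.sales, fn salesperson -> salesperson.bonus end)
# bonuses is [3000, 2000]</code></pre></figure>
<p>Writing the function key by hand, as we did above, is still the better way to
understand what happens as values bubble back up.</p>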
<h3 id="conclusion">Conclusion</h3>
<p>In this series we have seen that familiar tasks such as looping over an array
or working with nested data structures can be challenging for programmers moving
from an imperative language to a functional one like Elixir; challenging,
but not impossible. Once you start to think in terms of transforming
data, a lot of possibilities open up. Hopefully the
examples we have worked through here have begun to help you understand
how these kinds of problems can be solved with functional approaches.
And hopefully I have convinced you that functional
approaches are not reserved for academic problems, but are applicable to the
kinds of problems you encounter every day – and that you don’t have to be
a genius to use them.</p>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<script type="text/javascript" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>jnortonIn the previous installment of our introduction to functional programming we looked at reading values from nested data structures. In this final post we look at the flip side of working with nested data structures, updating them. If you have not read the previous post yet and are not familiar with Elixir, you might want to read it now, as this post builds on that one.Testing with VCR and Token Authentication2015-11-29T10:26:24+00:002015-11-29T10:26:24+00:00http://blog.element84.com/rails-vcr-tokens<p>If you are using <a href="https://github.com/vcr/vcr">VCR</a> to record/playback HTTP requests for your Rails tests, you may run into problems if your cassettes use tokens to authenticate with those services.</p>
<h3 id="the-error">The error</h3>
<p>After recording a cassette, you may see this familiar error upon running your tests:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="c"># Error</span>
An HTTP request has been made that VCR does not know how to handle</code></pre></figure>
<h3 id="the-cause">The cause</h3>
<p>VCR uses a RequestMatcher to determine whether an outgoing request (and its response) already exists in a cassette. Upon finding a match, the test will use the response pre-recorded in the cassette.</p>
<p>Two common ways to authenticate on an external request are:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="c"># Token Param</span>
https://example.com/service/data?user<span class="o">=</span>123&account<span class="o">=</span>johnsmith&token<span class="o">=</span>qW413vsj4Sv4
<span class="c"># HTTP Basic Header Auth</span>
https://johnsmith:qW413vsj4Sv4@example.com/service/data?user<span class="o">=</span>123</code></pre></figure>
<p>Because tokens typically expire, cassettes recorded at different times may contain different authentication tokens. This becomes a problem when running the whole test suite. Because the request/response to get a token will also be pre-recorded, VCR will return the same response (and auth token) to all of the tests. The first test will pass, but those using cassettes recorded at a different time will experience the error mentioned above. Which tests pass may vary on each run when the test runner uses a random seed to order the tests.</p>
<h3 id="breaking-down-a-request">Breaking down a request</h3>
<p>The next step is to understand which part of the request is failing to match, and what can be done about it.</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="c"># Sample Request</span>
https://example.com/service/data?user<span class="o">=</span>123&account<span class="o">=</span>johnsmith#time<span class="o">=</span>789
<span class="c"># Components</span>
<span class="k">*</span> Scheme <span class="o">(</span>http://, https://, etc<span class="o">)</span>
<span class="k">*</span> Host <span class="o">(</span>example.com<span class="o">)</span>
<span class="k">*</span> Path <span class="o">(</span>/service/data<span class="o">)</span>
<span class="k">*</span> Query <span class="o">(</span>?user<span class="o">=</span>123&account<span class="o">=</span>johnsmith<span class="o">)</span>
<span class="k">*</span> Fragment <span class="o">(</span><span class="c">#time=789)</span></code></pre></figure>
<h3 id="the-culprit">The culprit</h3>
<p>Depending on which authentication method is used, the culprit is in one of two places:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="c"># Token is in the Query</span>
https://example.com/service/data?user<span class="o">=</span>123&account<span class="o">=</span>johnsmith&token<span class="o">=</span>qW413vsj4Sv4
<span class="c"># Token is in the Header</span>
https://johnsmith:qW413vsj4Sv4@example.com/service/data?user<span class="o">=</span>123</code></pre></figure>
<h3 id="the-solution">The Solution</h3>
<p>VCR allows you to create a Custom RequestMatcher. With this, we can create a matcher that ignores the auth token (or other problematic/mutable param).</p>
<p><b>If you supply a Token Param…</b></p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="no">VCR</span><span class="p">.</span><span class="nf">configure</span> <span class="k">do</span> <span class="o">|</span><span class="n">config</span><span class="o">|</span>
<span class="o">...</span>
<span class="n">config</span><span class="p">.</span><span class="nf">default_cassette_options</span> <span class="o">=</span> <span class="p">{</span>
<span class="o">...</span>
<span class="ss">:match_requests_on</span> <span class="o">=></span> <span class="p">[</span><span class="ss">:method</span><span class="p">,</span> <span class="no">VCR</span><span class="p">.</span><span class="nf">request_matchers</span><span class="p">.</span><span class="nf">uri_without_param</span><span class="p">(</span><span class="ss">:token</span><span class="p">)]</span>
<span class="p">}</span>
<span class="k">end</span></code></pre></figure>
<p><b>If you use a Basic Auth Header…</b></p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="no">VCR</span><span class="p">.</span><span class="nf">configure</span> <span class="k">do</span> <span class="o">|</span><span class="n">config</span><span class="o">|</span>
<span class="o">...</span>
<span class="n">config</span><span class="p">.</span><span class="nf">default_cassette_options</span> <span class="o">=</span> <span class="p">{</span>
<span class="o">...</span>
<span class="ss">:match_requests_on</span> <span class="o">=></span> <span class="p">[</span><span class="ss">:method</span><span class="p">,</span> <span class="ss">:no_basic_auth</span><span class="p">]</span>
<span class="p">}</span>
<span class="n">config</span><span class="p">.</span><span class="nf">register_request_matcher</span> <span class="ss">:no_basic_auth</span> <span class="k">do</span> <span class="o">|</span><span class="n">req1</span><span class="p">,</span> <span class="n">req2</span><span class="o">|</span>
<span class="p">(</span><span class="no">URI</span><span class="p">(</span><span class="n">req1</span><span class="p">.</span><span class="nf">uri</span><span class="p">).</span><span class="nf">scheme</span> <span class="o">==</span> <span class="no">URI</span><span class="p">(</span><span class="n">req2</span><span class="p">.</span><span class="nf">uri</span><span class="p">).</span><span class="nf">scheme</span><span class="p">)</span> <span class="o">&&</span>
<span class="p">(</span><span class="no">URI</span><span class="p">(</span><span class="n">req1</span><span class="p">.</span><span class="nf">uri</span><span class="p">).</span><span class="nf">host</span> <span class="o">==</span> <span class="no">URI</span><span class="p">(</span><span class="n">req2</span><span class="p">.</span><span class="nf">uri</span><span class="p">).</span><span class="nf">host</span><span class="p">)</span> <span class="o">&&</span>
<span class="p">(</span><span class="no">URI</span><span class="p">(</span><span class="n">req1</span><span class="p">.</span><span class="nf">uri</span><span class="p">).</span><span class="nf">path</span> <span class="o">==</span> <span class="no">URI</span><span class="p">(</span><span class="n">req2</span><span class="p">.</span><span class="nf">uri</span><span class="p">).</span><span class="nf">path</span><span class="p">)</span> <span class="o">&&</span>
<span class="p">(</span><span class="no">URI</span><span class="p">(</span><span class="n">req1</span><span class="p">.</span><span class="nf">uri</span><span class="p">).</span><span class="nf">query</span> <span class="o">==</span> <span class="no">URI</span><span class="p">(</span><span class="n">req2</span><span class="p">.</span><span class="nf">uri</span><span class="p">).</span><span class="nf">query</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
<h3 id="wrap-up">Wrap Up</h3>
<p>While the ability to use <code class="highlighter-rouge">uri_without_param</code> is easy enough, creating a custom matcher is definitely a workaround at best. Unfortunately, requests for a <a href="https://github.com/vcr/vcr/issues/484">cleaner solution</a> to this problem seem to have been closed. Until that changes, we’ll have to keep using the custom matcher.</p>mbialasIf you are using VCR to record/playback HTTP requests for your Rails tests, you may run into problems if your cassettes use tokens to authenticate with those services.Functional Programming for the Functionally Challenged (Like Me)2015-11-25T07:00:00+00:002015-11-25T07:00:00+00:00http://blog.element84.com/elixir-get-in<p>In the <a href="/elixir-feelin-loopy.html">last post</a> we looked at
functional approaches to solving problems typically solved using loops
in imperative languages. These problems centered around list-like data structures
such as arrays or vectors. In this post we will look at more complicated
nested data structures.</p>
<h3 id="the-challenge">The Challenge</h3>
<p>Functional programming languages generally focus on transforming data rather than
working with object hierarchies. They tend to make extensive use of maps, lists,
and other fundamental data structures in lieu of
custom types like those used in object oriented languages. In FP it is
common to use highly nested data composed of many types of structures
(maps, lists, sets, etc.).</p>
<p>Working with these nested structures can be a challenge for programmers getting
started with FP. Modifying nested structures can seem particularly difficult
when working with immutable data. In this post, I will cover several common
chores encountered when working with nested data structures and explore
solutions using <a href="http://elixir-lang.org/">Elixir</a>.</p>
<h3 id="background---elixir-collection-types">Background - Elixir Collection Types</h3>
<p>Before diving into nested data structures, it might be helpful to introduce
the collection types Elixir provides, as these will be used to build more
complicated nested structures.</p>
<p>Perhaps the simplest collection type is the list. In Elixir we construct a list
using square brackets.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">list</span> <span class="o">=</span> <span class="p">[</span><span class="ss">:ok</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="sd">"</span><span class="s2">a"</span><span class="p">]</span>
<span class="o">=></span> <span class="p">[</span><span class="ss">:ok</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="sd">"</span><span class="s2">a"</span><span class="p">]</span></code></pre></figure>
<p>Elixir, like most functional languages, is highly tailored for working with lists.
The <a href="http://elixir-lang.org/docs/v1.1/elixir/List.html"><code class="highlighter-rouge">List</code></a>
module contains many functions that operate on lists. The <code class="highlighter-rouge">first</code> function
returns the first element of a list, or <code class="highlighter-rouge">nil</code> if the list is
empty:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">value</span> <span class="o">=</span> <span class="no">List</span><span class="o">.</span><span class="n">first</span><span class="p">(</span><span class="n">list</span><span class="p">)</span>
<span class="o">=></span> <span class="ss">:ok</span></code></pre></figure>
<p>One important thing to note is that in Elixir
<span data-pullquote="collections can store any type, including mixed types">
collections can store any type,
including mixed types</span>, whereas a Java collection is normally declared with a single
element type. This gives us a lot of flexibility, as we can
create highly nested structures in Elixir composed of different types at each level.</p>
<p>Because lists are commonly processed using functions that recursively split
a list into head and tail, Elixir provides a convenient shortcut that allows
us to bind variables to the head (first element) and tail (all the rest)
of a list using a pattern match.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="p">[</span><span class="n">head</span> <span class="o">|</span> <span class="n">tail</span><span class="p">]</span> <span class="o">=</span> <span class="n">list</span>
<span class="o">=></span> <span class="p">[</span><span class="ss">:ok</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="sd">"</span><span class="s2">a"</span><span class="p">]</span>
<span class="n">head</span>
<span class="o">=></span> <span class="ss">:ok</span>
<span class="n">tail</span>
<span class="o">=></span> <span class="p">[</span><span class="m">1</span><span class="p">,</span> <span class="sd">"</span><span class="s2">a"</span><span class="p">]</span></code></pre></figure>
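<p>As a minimal sketch of that recursive style, the hypothetical <code class="highlighter-rouge">Example.count</code> function below
(the module and function names are illustrative, not part of Elixir) walks a list
by matching on its head and tail until it reaches the empty list:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir">defmodule Example do
  # Base case: the empty list has no elements
  def count([]), do: 0

  # Recursive case: count one for the head, then recurse on the tail
  def count([_head | tail]), do: 1 + count(tail)
end

Example.count([:ok, 1, "a"])
# => 3</code></pre></figure>
<p>Each clause is selected by pattern matching on the shape of the argument, so
no index or loop counter is needed.</p>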
<p>Suppose we want to access the element at an arbitrary index in a list.
In Java this is no different than accessing the first element.
In Elixir things get a bit more complicated. Random access is not something
one usually does with a list, so the <code class="highlighter-rouge">List</code> module has no function for this.
However, because lists implement the <code class="highlighter-rouge">Enumerable</code> protocol, we can use the
<code class="highlighter-rouge">Enum</code> module’s <code class="highlighter-rouge">at</code> function to access the value at a given index.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">value</span> <span class="o">=</span> <span class="no">Enum</span><span class="o">.</span><span class="n">at</span><span class="p">(</span><span class="n">list</span><span class="p">,</span> <span class="m">1</span><span class="p">)</span>
<span class="o">=></span> <span class="m">1</span></code></pre></figure>
<p><em>Warning: Elixir lists are not arrays.</em> Random access to a value is a linear-time
(order N) operation, as the list must be walked from the start up to the index.
If you need array-like functionality, use a tuple. Tuples are constructed with
curly braces like this:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">tuple</span> <span class="o">=</span> <span class="p">{</span><span class="ss">:ok</span><span class="p">,</span> <span class="m">200</span><span class="p">}</span>
<span class="o">=></span> <span class="p">{</span><span class="ss">:ok</span><span class="p">,</span> <span class="m">200</span><span class="p">}</span></code></pre></figure>
<p>Tuples in Elixir provide
constant time random access. With a tuple we can access a random index using
the <code class="highlighter-rouge">Kernel</code> module’s <code class="highlighter-rouge">elem</code> function. <code class="highlighter-rouge">Kernel</code> functions are loaded by
default, so we don’t need to prefix them with the module name.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">value</span> <span class="o">=</span> <span class="n">elem</span><span class="p">(</span><span class="n">tuple</span><span class="p">,</span> <span class="m">1</span><span class="p">)</span>
<span class="o">=></span> <span class="m">200</span></code></pre></figure>
<p>Tuples and lists are useful, but often we need a map-like structure
to store and retrieve key-value pairs. We can use a map for this. Maps are
constructed using <code class="highlighter-rouge">%{}</code> notation like this:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">map</span> <span class="o">=</span> <span class="p">%{</span><span class="sd">"</span><span class="s2">foo"</span> <span class="o">=></span> <span class="sd">"</span><span class="s2">bar"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">fizz"</span> <span class="o">=></span> <span class="sd">"</span><span class="s2">buzz"</span><span class="p">}</span>
<span class="o">=></span> <span class="p">%{</span><span class="sd">"</span><span class="s2">fizz"</span> <span class="o">=></span> <span class="sd">"</span><span class="s2">buzz"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">foo"</span> <span class="o">=></span> <span class="sd">"</span><span class="s2">bar"</span><span class="p">}</span></code></pre></figure>
<p>Keys and values can be anything, not just strings. If the keys are
<a href="http://elixir-lang.org/getting-started/basic-types.html#atoms">atoms</a> then
we can use a shortcut notation like so:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">map</span> <span class="o">=</span> <span class="p">%{</span><span class="ss">foo:</span> <span class="sd">"</span><span class="s2">bar"</span><span class="p">,</span> <span class="ss">fizz:</span> <span class="sd">"</span><span class="s2">buzz"</span><span class="p">}</span>
<span class="o">=></span> <span class="p">%{</span><span class="ss">fizz:</span> <span class="sd">"</span><span class="s2">buzz"</span><span class="p">,</span> <span class="ss">foo:</span> <span class="sd">"</span><span class="s2">bar"</span><span class="p">}</span></code></pre></figure>
<p>We can use the <code class="highlighter-rouge">Map</code>
module’s <code class="highlighter-rouge">get</code> function to access a map value:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">value</span> <span class="o">=</span> <span class="no">Map</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">map</span><span class="p">,</span> <span class="ss">:foo</span><span class="p">)</span>
<span class="o">=></span> <span class="sd">"</span><span class="s2">bar"</span></code></pre></figure>
<p><code class="highlighter-rouge">get</code> returns <code class="highlighter-rouge">nil</code> if the map does not contain the key.
<code class="highlighter-rouge">get</code> can accept a third parameter that specifies
a default value if the map does not contain the key:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"> <span class="n">value</span> <span class="o">=</span> <span class="no">Map</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">map</span><span class="p">,</span> <span class="ss">:baz</span><span class="p">,</span> <span class="sd">"</span><span class="s2">default"</span><span class="p">)</span>
<span class="o">=></span> <span class="sd">"</span><span class="s2">default"</span></code></pre></figure>
<p>Because accessing map values is such a common operation, Elixir provides a
shortcut syntax:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"> <span class="n">value</span> <span class="o">=</span> <span class="n">map</span><span class="p">[</span><span class="ss">:fizz</span><span class="p">]</span>
<span class="o">=></span> <span class="sd">"</span><span class="s2">buzz"</span></code></pre></figure>
<p>Finally, if the keys to our map are atoms
(like symbols in Ruby or keywords in Clojure) then we can use dot notation
to retrieve the value for a given key:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"> <span class="n">value</span> <span class="o">=</span> <span class="n">map</span><span class="o">.</span><span class="n">fizz</span>
<span class="o">=></span> <span class="sd">"</span><span class="s2">buzz"</span></code></pre></figure>
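<p>These access styles differ in how they treat a missing key, which starts to matter
once data is nested. The following is a short sketch that reuses the atom-keyed map
from above; <code class="highlighter-rouge">:baz</code> is simply a key that the map does not contain:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir">map = %{foo: "bar", fizz: "buzz"}

# Bracket access (like Map.get) returns nil for a missing key
map[:baz]
# => nil

# Map.fetch makes the missing key explicit
Map.fetch(map, :baz)
# => :error
Map.fetch(map, :foo)
# => {:ok, "bar"}

# Dot access is strict: map.baz would raise a KeyError here</code></pre></figure>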
<h3 id="retrieving-a-value-from-a-nested-structure">Retrieving a Value from a Nested Structure</h3>
<p>So far so good, but we wanted to use nested data structures. Let’s say we
have a list of maps:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">data</span> <span class="o">=</span> <span class="p">[%{</span><span class="ss">foo:</span> <span class="sd">"</span><span class="s2">bar"</span><span class="p">,</span> <span class="ss">fizz:</span> <span class="sd">"</span><span class="s2">buzz"</span><span class="p">},</span> <span class="p">%{</span><span class="ss">foo:</span> <span class="sd">"</span><span class="s2">baz"</span><span class="p">,</span> <span class="ss">fizz:</span> <span class="sd">"</span><span class="s2">fez"</span><span class="p">}]</span></code></pre></figure>
<p>Now suppose we want to retrieve the value from the first map for the key
<code class="highlighter-rouge">:foo</code>. In Java we can chain our method calls like so to avoid using an
intermediate variable:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">String</span> <span class="n">value</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="mi">0</span><span class="o">).</span><span class="na">get</span><span class="o">(</span><span class="s">"foo"</span><span class="o">);</span></code></pre></figure>
<p>The one danger here is that if one of the <code class="highlighter-rouge">get</code>s in our chained method calls
returns <code class="highlighter-rouge">null</code> then the subsequent <code class="highlighter-rouge">get</code> will throw a <code class="highlighter-rouge">NullPointerException</code>.</p>
<p>In Elixir we can get close to this.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"> <span class="n">value</span> <span class="o">=</span> <span class="no">List</span><span class="o">.</span><span class="n">first</span><span class="p">(</span><span class="n">data</span><span class="p">)[</span><span class="ss">:foo</span><span class="p">]</span>
<span class="o">=></span> <span class="sd">"</span><span class="s2">bar"</span></code></pre></figure>
<p>Although this is concise and somewhat familiar looking, it doesn’t work well
with more complicated nesting. It’s better to use pipe
composition (see the <a href="/elixir-feelin-loopy.html">previous blog post</a>).</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"> <span class="n">value</span> <span class="o">=</span> <span class="n">data</span> <span class="o">|></span> <span class="no">List</span><span class="o">.</span><span class="n">first</span> <span class="o">|></span> <span class="no">Map</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="ss">:foo</span><span class="p">)</span>
<span class="o">=></span> <span class="sd">"</span><span class="s2">bar"</span></code></pre></figure>
<p>Here we pipe our data structure as the first argument to the <code class="highlighter-rouge">List.first</code>
function and then pipe the result of that as the first argument to the
<code class="highlighter-rouge">Map.get</code> function. In this way we can string together arbitrary paths
into a nested structure.</p>
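<p>For instance, a different path into the same list of maps only changes the
steps in the pipeline. This sketch reuses the <code class="highlighter-rouge">data</code> value defined earlier and
swaps in <code class="highlighter-rouge">Enum.at</code> to pick the second map:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir">data = [%{foo: "bar", fizz: "buzz"}, %{foo: "baz", fizz: "fez"}]

# Take the map at index 1, then look up its :fizz key
value = data |> Enum.at(1) |> Map.get(:fizz)
# => "fez"</code></pre></figure>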
<p>This approach gets a bit messy as our data gets more complicated. For these
cases we can rely on the <code class="highlighter-rouge">get_in</code> function.</p>
<p><code class="highlighter-rouge">get_in</code> takes a nested data structure and a path of keys (as a list) and returns the value
at that path.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">nested_data</span> <span class="o">=</span> <span class="p">%{</span><span class="ss">foo:</span> <span class="p">%{</span><span class="ss">bar:</span> <span class="sd">"</span><span class="s2">baz"</span><span class="p">}}</span>
<span class="o">=></span> <span class="p">%{</span><span class="ss">foo:</span> <span class="p">%{</span><span class="ss">bar:</span> <span class="sd">"</span><span class="s2">baz"</span><span class="p">}}</span>
<span class="n">get_in</span><span class="p">(</span><span class="n">nested_data</span><span class="p">,</span> <span class="p">[</span><span class="ss">:foo</span><span class="p">,</span> <span class="ss">:bar</span><span class="p">])</span>
<span class="o">=></span> <span class="sd">"</span><span class="s2">baz"</span></code></pre></figure>
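<p>Compared with the chained Java <code class="highlighter-rouge">get</code> calls shown earlier, a missing key along the
path does not blow up. In this small sketch (<code class="highlighter-rouge">:missing</code> is just a key that is not
present), <code class="highlighter-rouge">get_in</code> returns <code class="highlighter-rouge">nil</code> whether the gap is at the end or in the
middle of the path:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir">nested_data = %{foo: %{bar: "baz"}}

# The last key is missing
get_in(nested_data, [:foo, :missing])
# => nil

# An intermediate key is missing, so the remaining keys are looked up on nil
get_in(nested_data, [:missing, :bar])
# => nil</code></pre></figure>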
<p>Unfortunately <code class="highlighter-rouge">get_in</code> does not support numerical indices as keys, so the
following raises an error:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">nested_data</span> <span class="o">=</span> <span class="p">[%{</span><span class="ss">foo:</span> <span class="sd">"</span><span class="s2">bar"</span><span class="p">},</span> <span class="p">%{</span><span class="ss">foo:</span> <span class="sd">"</span><span class="s2">baz"</span><span class="p">}]</span>
<span class="o">=></span> <span class="p">[%{</span><span class="ss">foo:</span> <span class="sd">"</span><span class="s2">bar"</span><span class="p">},</span> <span class="p">%{</span><span class="ss">foo:</span> <span class="sd">"</span><span class="s2">baz"</span><span class="p">}]</span>
<span class="n">val</span> <span class="o">=</span> <span class="n">get_in</span><span class="p">(</span><span class="n">nested_data</span><span class="p">,</span> <span class="p">[</span><span class="m">0</span><span class="p">,</span> <span class="ss">:foo</span><span class="p">])</span>
<span class="o">=></span> <span class="o">**</span> <span class="p">(</span><span class="no">ArgumentError</span><span class="p">)</span> <span class="c1"># exact message varies by Elixir version</span></code></pre></figure>
<p><code class="highlighter-rouge">get_in</code> <em>does</em> support passing functions as keys, which gives us a great deal
of flexibility (at a cost of some complexity). These ‘selector’ functions are
passed three arguments: the operation (always <code class="highlighter-rouge">:get</code> for <code class="highlighter-rouge">get_in</code>), the data to be accessed,
and a function to be invoked next.</p>
<p>If we want to effectively use a numerical index as a key, we can implement it
as a function like this:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">index1_fun</span> <span class="o">=</span> <span class="k">fn</span> <span class="ss">:get</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">next</span> <span class="o">-></span>
<span class="n">next</span><span class="o">.</span><span class="p">(</span><span class="no">Enum</span><span class="o">.</span><span class="n">at</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="m">1</span><span class="p">))</span>
<span class="k">end</span>
<span class="o">=></span> <span class="c1">#Function<18.54118792/3 in :erl_eval.expr/5></span></code></pre></figure>
<p>Our selector function selects the element at index 1, that is, the second element
(using <code class="highlighter-rouge">Enum.at</code>), and then
passes this to the <code class="highlighter-rouge">next</code> function, effectively passing it to the next selector
in the chain.</p>
<p>We pass in this function as one of our keys like this:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">val</span> <span class="o">=</span> <span class="n">get_in</span><span class="p">(</span><span class="n">nested_data</span><span class="p">,</span> <span class="p">[</span><span class="n">index1_fun</span><span class="p">,</span> <span class="ss">:foo</span><span class="p">])</span>
<span class="o">=></span> <span class="sd">"</span><span class="s2">baz"</span></code></pre></figure>
<p>Because Elixir supports higher order functions, we can generalize our
selector function to take an index as an argument and return the function for
getting the item at that index.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">index_fun</span> <span class="o">=</span> <span class="k">fn</span> <span class="n">index</span> <span class="o">-></span>
<span class="k">fn</span> <span class="ss">:get</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">next</span> <span class="o">-></span>
<span class="n">next</span><span class="o">.</span><span class="p">(</span><span class="no">Enum</span><span class="o">.</span><span class="n">at</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">index</span><span class="p">))</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="o">=></span> <span class="c1">#Function<6.54118792/1 in :erl_eval.expr/5></span>
<span class="n">get_in</span><span class="p">(</span><span class="n">nested_data</span><span class="p">,</span> <span class="p">[</span><span class="n">index_fun</span><span class="o">.</span><span class="p">(</span><span class="m">0</span><span class="p">),</span> <span class="ss">:foo</span><span class="p">])</span>
<span class="o">=></span> <span class="sd">"</span><span class="s2">bar"</span>
<span class="n">get_in</span><span class="p">(</span><span class="n">nested_data</span><span class="p">,</span> <span class="p">[</span><span class="n">index_fun</span><span class="o">.</span><span class="p">(</span><span class="m">1</span><span class="p">),</span> <span class="ss">:foo</span><span class="p">])</span>
<span class="o">=></span> <span class="sd">"</span><span class="s2">baz"</span></code></pre></figure>
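<p>If you are on a newer Elixir release than the 1.1 documentation linked in this post,
you may not need to write this selector by hand. Assuming Elixir 1.4 or later, the
standard library’s <code class="highlighter-rouge">Access.at/1</code> returns an index selector that can be used in the
same position as our <code class="highlighter-rouge">index_fun</code>. A minimal sketch under that assumption:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir">nested_data = [%{foo: "bar"}, %{foo: "baz"}]

# Access.at(index) plays the role of index_fun.(index)
get_in(nested_data, [Access.at(0), :foo])
# => "bar"
get_in(nested_data, [Access.at(1), :foo])
# => "baz"</code></pre></figure>
<p>Writing the selector ourselves is still worthwhile, since it shows what such
helpers do under the hood.</p>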
<p>A selector function can do a lot more than just pick a particular index or
key; since we have access to all the data at the current selection level,
we can apply whatever logic we want to it.</p>
<p>Say we have the following data structure to describe our company:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">company</span> <span class="o">=</span> <span class="p">%{</span>
<span class="ss">earnings:</span> <span class="m">10_000_000</span><span class="p">,</span>
<span class="ss">employees:</span> <span class="p">[%{</span><span class="ss">name:</span> <span class="sd">"</span><span class="s2">Jack"</span><span class="p">,</span> <span class="ss">age:</span> <span class="m">25</span><span class="p">},</span>
<span class="p">%{</span><span class="ss">name:</span> <span class="sd">"</span><span class="s2">Jill"</span><span class="p">,</span> <span class="ss">age:</span> <span class="m">22</span><span class="p">},</span>
<span class="p">%{</span><span class="ss">name:</span> <span class="sd">"</span><span class="s2">John"</span><span class="p">,</span> <span class="ss">age:</span> <span class="m">20</span><span class="p">},</span>
<span class="p">%{</span><span class="ss">name:</span> <span class="sd">"</span><span class="s2">Joan"</span><span class="p">,</span> <span class="ss">age:</span> <span class="m">27</span><span class="p">}]</span>
<span class="p">}</span></code></pre></figure>
<p>Now suppose we want a list of names of all the employees that are over a
certain age. We can accomplish this as follows:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="n">age_filter</span> <span class="o">=</span> <span class="k">fn</span> <span class="n">age</span> <span class="o">-></span>
<span class="k">fn</span> <span class="ss">:get</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">next</span> <span class="o">-></span>
<span class="no">Enum</span><span class="o">.</span><span class="n">filter_map</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="k">fn</span> <span class="n">employee</span> <span class="o">-></span> <span class="n">employee</span><span class="o">.</span><span class="n">age</span> <span class="o">></span> <span class="n">age</span> <span class="k">end</span><span class="p">,</span> <span class="o">&</span><span class="p">(</span><span class="n">next</span><span class="o">.</span><span class="p">(</span><span class="nv">&1</span><span class="p">)))</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="o">=></span> <span class="c1">#Function<6.54118792/1 in :erl_eval.expr/5></span>
<span class="n">get_in</span><span class="p">(</span><span class="n">company</span><span class="p">,</span> <span class="p">[</span><span class="ss">:employees</span><span class="p">,</span> <span class="n">age_filter</span><span class="o">.</span><span class="p">(</span><span class="m">24</span><span class="p">),</span> <span class="ss">:name</span><span class="p">])</span>
<span class="o">=></span> <span class="p">[</span><span class="sd">"</span><span class="s2">Jack"</span><span class="p">,</span> <span class="sd">"</span><span class="s2">Joan"</span><span class="p">]</span></code></pre></figure>
<p>This function takes an age and creates a selector function that will
filter the current selection level to those employees that are over the
given age. It uses the <code class="highlighter-rouge">Enum.filter_map</code> function that first filters a
collection and then maps a function over the filtered results, in this case
the <code class="highlighter-rouge">next</code> function passed to our selector.</p>
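<p>One caveat for readers on newer releases: <code class="highlighter-rouge">Enum.filter_map/3</code> was deprecated in
Elixir 1.5. The sketch below rewrites the same selector with <code class="highlighter-rouge">Enum.filter/2</code>
followed by <code class="highlighter-rouge">Enum.map/2</code>, which should behave the same way for this example. The
last call assumes Elixir 1.6 or later, where <code class="highlighter-rouge">Access.filter/1</code> is available as a
ready-made filtering selector.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir">company = %{
  earnings: 10_000_000,
  employees: [
    %{name: "Jack", age: 25},
    %{name: "Jill", age: 22},
    %{name: "John", age: 20},
    %{name: "Joan", age: 27}
  ]
}

age_filter = fn age ->
  fn :get, data, next ->
    data
    # Keep only the employees older than the given age
    |> Enum.filter(fn employee -> employee.age > age end)
    # Hand each remaining employee to the next selector in the path
    |> Enum.map(next)
  end
end

get_in(company, [:employees, age_filter.(24), :name])
# => ["Jack", "Joan"]

# Assumes Elixir 1.6+: Access.filter/1 builds a comparable selector
get_in(company, [:employees, Access.filter(fn employee -> employee.age > 24 end), :name])
# => ["Jack", "Joan"]</code></pre></figure>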
<h3 id="conclusion">Conclusion</h3>
<p>We have seen how we can navigate through nested data structures using a variety of
approaches, culminating in writing custom selector functions for <code class="highlighter-rouge">get_in</code>.
Hopefully this has given you a better understanding of how we can use
these techniques to work with complex data.
In the next post we will look at the other side of the coin: updating nested
structures and dealing with immutability.</p>
jnorton
Proto REPL: A new Clojure REPL for the Atom Editor
2015-10-26T00:00:00+00:00
http://blog.element84.com/introducing-proto-repl
<p>I’d like to introduce a new Clojure REPL, <a href="https://atom.io/packages/proto-repl">Proto REPL</a>, that I created as a plugin to the <a href="https://atom.io">Atom</a> editor. Proto REPL lets you develop Clojure applications in Atom using an interactive REPL-driven development experience.</p>
<p><img src="https://raw.githubusercontent.com/jasongilman/proto-repl/master/front_image.png" alt="Proto REPL" title="Proto REPL" /></p>
<h2 id="proto-repl-features">Proto REPL Features</h2>
<ul>
<li>Send blocks of code or selections from an editor tab to the REPL for execution.</li>
<li>Display documentation or code of selected functions and namespaces.</li>
<li>Easy, fast reloading of all the Clojure code in a project.</li>
<li>Jump to the definition of any Clojure Var with a single key press. This even works with libraries and clojure.core functions.</li>
<li>Run a single test, all tests in a namespace, or all tests in a project.</li>
<li>A tool bar provides button click access to common REPL capabilities.</li>
</ul>
<p>Atom makes it super easy to install new packages. Just run <code class="highlighter-rouge">apm install proto-repl</code> after installing the Atom editor to try it out. You’ll need Java (a Clojure dependency) and <a href="http://leiningen.org">Leiningen</a> (a Clojure build tool) installed to use it.</p>
<p>Try it out with the <a href="https://github.com/jasongilman/proto-repl-demo">Proto REPL Demo project</a>.</p>
<h2 id="why-a-new-repl">Why a new REPL?</h2>
<p>The future of software development is going to be interactive. You’ll be connected to your running code in a way that makes it easy to understand what’s going on. Clojure gives me an experience like that today. Using a REPL lets you easily experiment with small bits of code and gives you immediate feedback. It makes you incredibly productive.</p>
<p>For various reasons I’ve decided to look for alternatives to my current Clojure development approach. I currently use <a href="https://github.com/jasongilman/SublimeClojureSetup">Sublime Text with Sublime REPL</a> to do Clojure development. Building Proto REPL in the Atom editor allowed me to replicate that experience and offered new capabilities I didn’t have with Sublime Text.</p>
<h3 id="visualizations-in-the-mist">Visualizations in the Mist</h3>
<p>The future of interactive development is going to be visual. The REPL of the future will look more like <a href="http://gorilla-repl.org">Gorilla REPL</a>, a notebook style REPL, than the terminal style REPLs we use today. Gorilla REPL is an interesting tool in the same vein as Mathematica Notebooks or IPython Notebooks. It gives you a futuristic REPL experience in which the output of a command might produce a number, a table, a bar chart, or even <a href="https://github.com/wiseman/leaflet-gorilla">a map of a spatial area</a>.</p>
<p>Gorilla REPL is great for creating documents but as a REPL for normal development it has limited interaction between your editor and the REPL. Gorilla REPL and a REPL in your editor can run on the same JVM and access code in your project but you can’t send code from your editor directly for display in Gorilla REPL. (Unless I missed this capability somehow.) Your editor and the browser window in which Gorilla REPL runs are two separate applications. This limits how close you can get to a single unified experience. Gorilla REPL is a very useful tool but would benefit from tighter editor integration. However, I’m convinced that Gorilla REPL’s approach of visual output in the web browser is the right one.</p>
<h3 id="the-browser-is-your-editor">The Browser is Your Editor</h3>
<p>Atom is a new-ish editor from GitHub built on top of Chromium, the open source web browser on which Google Chrome is based. To over-simplify things, it’s a text editor built on top of a web browser. You can write packages to extend it in JavaScript. You have access to the Chrome Developer Tools, so you can set breakpoints or profile your Atom plugin code. Because Atom is at its heart a web browser, you can use the combination of HTML/CSS/JavaScript right in your editor for visualizations or for displaying whatever you want.</p>
<p>My goal for the initial version of Proto REPL was to provide feature parity with my REPL setup in Sublime Text. It doesn’t take full advantage of the visual capabilities of Atom yet. Proto REPL barely dips its toes in the visualization and interactive capabilities. When you execute a block of Clojure code in Proto REPL the block is highlighted in yellow for a fraction of a second to give you visual feedback of what was executed. This is done by temporarily applying a background color change with CSS. This simple change was something I couldn’t easily do in Sublime Text but was trivial to do in Atom.</p>
<p>I’m planning to start experimenting with capturing output from the REPL and displaying the results visually in Atom. I’m hoping that a future version of Proto REPL can offer something a bit more visual.</p>
<p>Links</p>
<ul>
<li><a href="https://github.com/jasongilman/proto-repl">Proto REPL on GitHub</a></li>
<li><a href="https://atom.io/packages/proto-repl">Atom Proto REPL Page</a></li>
<li><a href="https://github.com/jasongilman/proto-repl-demo">Proto REPL Demo Project</a></li>
</ul>
jgilman
Functional Programming for the Functionally Challenged (Like Me)
2015-10-02T07:00:00+00:00
http://blog.element84.com/elixir-feelin-loopy
<p><em>This is the first post in a series dedicated to presenting solutions to
common challenges that developers encounter when moving from an imperative
programming approach to functional programming (FP). I will present a series
of problems and provide solutions in both Java and <a href="http://elixir-lang.org/">Elixir</a>,
a functional language running on the Erlang VM.</em></p>
<p><em>This is not intended as a Java vs. Elixir comparison. Nor is it intended
to be an Elixir primer. It is intended
to demonstrate how to solve certain types of problems when moving from an
imperative approach to a functional one.</em></p>
<p><em>Java was chosen for the imperative
solutions because it is familiar to most programmers and has traditionally
been used in an imperative style. It is worth noting that Java has begun to
adopt some functional aspects, for instance the introduction of the <a href="http://docs.oracle.com/javase/1.5.0/docs/guide/language/foreach.html">for-each loop</a>
way back in Java 5 and more recently
<a href="https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html">lambda expressions</a>
in Java 8.</em></p>
<h3>The Challenge</h3>
<p>One of the first challenges I encountered when starting with functional programming
was how to replace the looping constructs (for loops, while loops, etc.) that I
knew and understood from my imperative background. Loops are used all the time
in imperative programming for a variety of reasons:</p>
<ol>
<li>We want to execute some side effect(s) related to a collection (or data source),
like printing it out.</li>
<li>We want to find some element in a collection (and maybe do something
with it).</li>
<li>We want to transform a collection into a different collection (or collections).</li>
</ol>
<p>In a (mostly) imperative language like Java, there are only a few ways to do
these things, and they all involve loops of some sort. Let’s look at an example
of the first task by printing out a list.</p>
<h3>Printing a List</h3>
<p>Say we have a list of records and that those records have a “name” field.
Now suppose that we want to print out all the names from the records in our list.
Strictly speaking, printing a list relies on “side effects” and is not
purely functional. How we go about it, however, can follow the imperative
style or the FP style.</p>
<p>In Java we might
store our records as <code class="highlighter-rouge">Person</code> objects in an <code class="highlighter-rouge">ArrayList</code> and iterate over the list
to print out the names like this:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="code"><pre><span class="k">for</span><span class="o">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="o">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">list</span><span class="o">.</span><span class="na">size</span><span class="o">();</span> <span class="n">i</span><span class="o">++)</span> <span class="o">{</span>
<span class="n">Person</span> <span class="n">val</span> <span class="o">=</span> <span class="n">list</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="n">i</span><span class="o">);</span>
<span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="n">val</span><span class="o">.</span><span class="na">getName</span><span class="o">());</span>
<span class="o">}</span></pre></td></tr></tbody></table></code></pre></figure>
<p>This code is fairly straightforward, but it’s worth
mentioning that we must specify the loop’s initial and terminating
conditions ourselves. A mistake here can lead to
a missed name or an <code class="highlighter-rouge">IndexOutOfBoundsException</code>.</p>
<p>Alternatively we could use an iterator instead of a <code class="highlighter-rouge">for</code> loop, which
would mitigate some of the bound related concerns, but the code inside the
iteration would look similar to the code inside our <code class="highlighter-rouge">for</code> loop. Even better
would be to use a for-each loop, but then we would be moving into FP
territory and I don’t want to blur the line too much between the two approaches.</p>
<p>So how might we do something like this in Elixir given that we don’t have
looping control flow?<sup id="fnref:noloop"><a href="#fn:noloop" class="footnote">1</a></sup></p>
<p>Elixir makes this easy by providing functions for operating on
enumerable collections in the <code class="highlighter-rouge">Enum</code> module.
Several of these let us apply a function to all (or some) of the items in a collection.</p>
<p>In our case we can use the <code class="highlighter-rouge">each</code> function to print out our names. We store
our records as maps in a list and call <code class="highlighter-rouge">each</code> on the list.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
</pre></td><td class="code"><pre><span class="no">Enum</span><span class="o">.</span><span class="n">each</span><span class="p">(</span><span class="n">list</span><span class="p">,</span> <span class="k">fn</span> <span class="n">item</span> <span class="o">-></span> <span class="no">IO</span><span class="o">.</span><span class="n">puts</span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">]</span> <span class="k">end</span><span class="p">)</span></pre></td></tr></tbody></table></code></pre></figure>
<p>The <code class="highlighter-rouge">each</code> function takes two arguments, a collection and a function<sup id="fnref:higher-order"><a href="#fn:higher-order" class="footnote">2</a></sup>
to execute on each element of the collection.
In Elixir we define an <em>anonymous function</em><sup id="fnref:anonymous"><a href="#fn:anonymous" class="footnote">3</a></sup> using the
<code class="highlighter-rouge">fn arg(s) -> ... end</code> syntax. The code between the <code class="highlighter-rouge">-></code> and <code class="highlighter-rouge">end</code> defines
the function body. So our call to <code class="highlighter-rouge">fn</code> defines a function
that takes a list element as its <code class="highlighter-rouge">item</code> argument and prints the value for the
<code class="highlighter-rouge">item</code>’s “name”
key using the <code class="highlighter-rouge">puts</code> function from the <code class="highlighter-rouge">IO</code> module.</p>
<p>If you have explored FP at all you have probably heard of the <code class="highlighter-rouge">map</code> function,
which is similar to <code class="highlighter-rouge">each</code>. <code class="highlighter-rouge">Enum</code> also provides a <code class="highlighter-rouge">map</code> function, and we
could use it here as well. There is a difference between the two, however.
<code class="highlighter-rouge">each</code> is specifically designed
to execute side effects and merely returns the atom <code class="highlighter-rouge">:ok</code> when it completes,
whereas <code class="highlighter-rouge">map</code> returns a new list composed of the results of running
the given function on each item in the input collection.</p>
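<p>To make the difference in return values concrete, here is a minimal sketch. The sample <code class="highlighter-rouge">list</code> is my own assumption (the records above are only described, not shown), and the variable names are just for illustration.</p>

```elixir
# Hypothetical sample data: a list of maps with string keys.
list = [%{"name" => "Alice"}, %{"name" => "Bob"}]

# each runs the function purely for its side effect (printing)
# and returns the atom :ok.
each_result = Enum.each(list, fn item -> IO.puts(item["name"]) end)

# map collects the return value of the function for every item
# into a new list.
map_result = Enum.map(list, fn item -> item["name"] end)

IO.inspect(each_result) # :ok
IO.inspect(map_result)  # ["Alice", "Bob"]
```

<p>If all you want is the printing, <code class="highlighter-rouge">each</code> states that intent more clearly; reach for <code class="highlighter-rouge">map</code> when you need the resulting list.</p>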
<p>Notice that unlike the Java implementation, with Elixir we don’t
specify <em>how</em> to do what we want so much as declare <em>what</em> we
want to do (print out the “name” of each
item in our list). This <em>declarative</em> style is at the
heart of functional programming and is one of the fundamental differences
between imperative programming and FP.</p>
<p>Let’s try something slightly more complicated. Suppose that some of our
records don’t have “name” entries. We would like to avoid printing out
<code class="highlighter-rouge">null</code>s for those that don’t. We can modify our Java code slightly to
accomplish this.</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
</pre></td><td class="code"><pre><span class="k">for</span><span class="o">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="o">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">list</span><span class="o">.</span><span class="na">size</span><span class="o">();</span> <span class="n">i</span><span class="o">++)</span> <span class="o">{</span>
<span class="n">Person</span> <span class="n">val</span> <span class="o">=</span> <span class="n">list</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="n">i</span><span class="o">);</span>
<span class="n">String</span> <span class="n">name</span> <span class="o">=</span> <span class="n">val</span><span class="o">.</span><span class="na">getName</span><span class="o">();</span>
<span class="k">if</span><span class="o">(</span><span class="n">name</span> <span class="o">!=</span> <span class="kc">null</span><span class="o">)</span> <span class="o">{</span>
<span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="n">name</span><span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span></pre></td></tr></tbody></table></code></pre></figure>
<p>This code is the same as the previous example except that we must explicitly
check our values to avoid printing <code class="highlighter-rouge">null</code>s.</p>
<p>We could take a similar approach with Elixir and add a check in our
anonymous function to only print names that are not <code class="highlighter-rouge">nil</code> (Elixir’s version of <code class="highlighter-rouge">null</code>).
We have other options, however. <code class="highlighter-rouge">Enum</code> provides the <code class="highlighter-rouge">filter</code> function, which
iterates over a collection and applies a given function to each item. Any items
for which the given function returns a “truthy” (not <code class="highlighter-rouge">false</code> or <code class="highlighter-rouge">nil</code>) value
are returned in a list.</p>
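<p>As a rough illustration of what <code class="highlighter-rouge">filter</code> hands back, the sketch below uses a made-up list in which one record has no “name” entry. It relies on the fact that looking up a missing key with the <code class="highlighter-rouge">[]</code> syntax returns <code class="highlighter-rouge">nil</code>, which is not truthy.</p>

```elixir
# Hypothetical sample data: the second record has no "name" key.
list = [%{"name" => "Alice"}, %{"age" => 30}, %{"name" => "Bob"}]

# Accessing a missing key with [] returns nil rather than raising.
missing = Enum.at(list, 1)["name"]

# The function returns the name (truthy) or nil (falsy), so only
# records that have a name survive the filter.
named = Enum.filter(list, fn item -> item["name"] end)

IO.inspect(missing) # nil
IO.inspect(named)   # [%{"name" => "Alice"}, %{"name" => "Bob"}]
```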
<p>We can combine this with the <code class="highlighter-rouge">each</code> function (or <code class="highlighter-rouge">map</code>)
in a couple of different ways. One way is to capture the output of the <code class="highlighter-rouge">filter</code>
function in a variable and pass that to <code class="highlighter-rouge">each</code>:</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
</pre></td><td class="code"><pre><span class="n">filtered</span> <span class="o">=</span> <span class="no">Enum</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">list</span><span class="p">,</span> <span class="k">fn</span> <span class="n">item</span> <span class="o">-></span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">]</span> <span class="k">end</span><span class="p">)</span>
<span class="no">Enum</span><span class="o">.</span><span class="n">each</span><span class="p">(</span><span class="n">filtered</span><span class="p">,</span> <span class="k">fn</span> <span class="n">item</span> <span class="o">-></span> <span class="no">IO</span><span class="o">.</span><span class="n">puts</span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">]</span> <span class="k">end</span><span class="p">)</span></pre></td></tr></tbody></table></code></pre></figure>
<p>We can write this more succinctly as</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
</pre></td><td class="code"><pre><span class="no">Enum</span><span class="o">.</span><span class="n">each</span><span class="p">(</span><span class="no">Enum</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">list</span><span class="p">,</span> <span class="k">fn</span> <span class="n">item</span> <span class="o">-></span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">]</span> <span class="k">end</span><span class="p">),</span>
<span class="k">fn</span> <span class="n">item</span> <span class="o">-></span> <span class="no">IO</span><span class="o">.</span><span class="n">puts</span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">]</span> <span class="k">end</span><span class="p">)</span></pre></td></tr></tbody></table></code></pre></figure>
<p>but this is harder to follow (for me at least). Fortunately, Elixir supports
a different form of
<a href="https://en.wikipedia.org/wiki/Function_composition">function composition</a>
using the pipe operator, <code class="highlighter-rouge">|></code>, which takes the value of the
expression on its left and passes it as the first argument to the function on
its right. Thus we can write</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
</pre></td><td class="code"><pre><span class="n">list</span> <span class="o">|></span> <span class="no">Enum</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="k">fn</span> <span class="n">item</span> <span class="o">-></span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">]</span> <span class="k">end</span><span class="p">)</span> <span class="o">|></span> <span class="no">Enum</span><span class="o">.</span><span class="n">each</span><span class="p">(</span><span class="k">fn</span> <span class="n">item</span> <span class="o">-></span> <span class="no">IO</span><span class="o">.</span><span class="n">puts</span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">]</span> <span class="k">end</span><span class="p">)</span></pre></td></tr></tbody></table></code></pre></figure>
<p>You read this as “take the value of <code class="highlighter-rouge">list</code> and use it as the first argument
to <code class="highlighter-rouge">filter</code> with the given function, then take the output from <em>that</em> and
use it as the first argument to <code class="highlighter-rouge">each</code> with its given function.”
Notice that we don’t specify the first argument to the <code class="highlighter-rouge">filter</code> or <code class="highlighter-rouge">each</code>
functions as it
is assigned implicitly by the pipe operator. The pipe operator can be used
repeatedly to create arbitrarily long function compositions, piping the output
of each segment of the chain into the first argument of the next.</p>
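<p>Here is a small sketch of a longer chain, written both as nested calls and as a pipeline, to show that the two forms compute the same thing. The sample data and the extra <code class="highlighter-rouge">map</code> and <code class="highlighter-rouge">sort</code> steps are assumptions added for illustration.</p>

```elixir
# Hypothetical sample data: one record has no "name" key.
list = [%{"name" => "Carol"}, %{"age" => 30}, %{"name" => "Alice"}]

# Nested form: has to be read from the inside out.
nested =
  Enum.sort(
    Enum.map(
      Enum.filter(list, fn item -> item["name"] end),
      fn item -> item["name"] end
    )
  )

# Piped form: each step receives the previous result as its
# first argument, so it reads top to bottom.
piped =
  list
  |> Enum.filter(fn item -> item["name"] end)
  |> Enum.map(fn item -> item["name"] end)
  |> Enum.sort()

IO.inspect(piped)           # ["Alice", "Carol"]
IO.inspect(nested == piped) # true
```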
<p>This style is reminiscent of the Unix pipe operator and allows us to think
in terms of data flowing through a pipeline of operations. Contrast this with
the imperative approach in the Java example.</p>
<p>We have yet another option with the <code class="highlighter-rouge">filter_map</code> function. It
provides even tighter composition of the <code class="highlighter-rouge">filter</code> and <code class="highlighter-rouge">map</code> functionality.
<code class="highlighter-rouge">filter_map</code> takes as arguments a collection and two functions. Each item in
the collection is passed to the first function, and each item for which that
function returns a truthy value is passed to the second
function. The values of the second function for these items are composed into
a list.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
</pre></td><td class="code"><pre><span class="no">Enum</span><span class="o">.</span><span class="n">filter_map</span><span class="p">(</span><span class="n">list</span><span class="p">,</span> <span class="o">&</span><span class="p">(</span><span class="nv">&1</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">]),</span> <span class="o">&</span><span class="p">(</span><span class="no">IO</span><span class="o">.</span><span class="n">puts</span><span class="p">(</span><span class="nv">&1</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">])))</span></pre></td></tr></tbody></table></code></pre></figure>
<p>Here we take advantage of the second way of expressing anonymous functions
in Elixir. The <code class="highlighter-rouge">&(...)</code> syntax declares an anonymous function, with arguments
passed to it bound to <code class="highlighter-rouge">&1</code>, <code class="highlighter-rouge">&2</code>, etc. This short form is convenient when
writing concise functions like the ones we have here. I find the
longer form slightly more readable with its named arguments, so for the sake of
clarity I will stick to the longer form for the rest of this series.</p>
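<p>For comparison, here is a sketch that puts the long and short forms side by side. The sample data and variable names are hypothetical. As far as I know, newer Elixir releases deprecate <code class="highlighter-rouge">Enum.filter_map</code>, so this sketch sticks to <code class="highlighter-rouge">filter</code> and <code class="highlighter-rouge">map</code>, which should run on any version.</p>

```elixir
# Hypothetical sample data: one record has no "name" key.
list = [%{"name" => "Alice"}, %{"age" => 30}, %{"name" => "Bob"}]

# The same one-argument function written in both forms.
get_name_long = fn item -> item["name"] end
get_name_short = &(&1["name"])

long_names = list |> Enum.filter(get_name_long) |> Enum.map(get_name_long)
short_names = list |> Enum.filter(get_name_short) |> Enum.map(get_name_short)

# With more than one argument, &1 and &2 refer to the first and
# second argument respectively.
add = &(&1 + &2)
sum = add.(2, 3)

IO.inspect(long_names)                # ["Alice", "Bob"]
IO.inspect(long_names == short_names) # true
IO.inspect(sum)                       # 5
```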
<h3>Finding an Element in a List</h3>
<p>Moving on to our next challenge, how might we find elements in a list? Let’s
continue with our previous example of a collection containing records,
and say that we want to find the record with
the value “Bob” for the key “name”. Our Java implementation might look like
this:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="code"><pre><span class="n">Person</span> <span class="n">bob</span> <span class="o">=</span> <span class="kc">null</span><span class="o">;</span>
<span class="k">for</span><span class="o">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="o">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">list</span><span class="o">.</span><span class="na">size</span><span class="o">();</span> <span class="n">i</span><span class="o">++)</span> <span class="o">{</span>
<span class="n">Person</span> <span class="n">val</span> <span class="o">=</span> <span class="n">list</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="n">i</span><span class="o">);</span>
<span class="n">String</span> <span class="n">name</span> <span class="o">=</span> <span class="n">val</span><span class="o">.</span><span class="na">getName</span><span class="o">();</span>
<span class="k">if</span><span class="o">(</span><span class="n">name</span> <span class="o">!=</span> <span class="kc">null</span> <span class="o">&&</span> <span class="n">name</span><span class="o">.</span><span class="na">equals</span><span class="o">(</span><span class="s">"Bob"</span><span class="o">))</span> <span class="o">{</span>
<span class="n">bob</span> <span class="o">=</span> <span class="n">val</span><span class="o">;</span>
<span class="k">break</span><span class="o">;</span>
<span class="o">}</span>
<span class="o">}</span></pre></td></tr></tbody></table></code></pre></figure>
<p>Again, there are points worth mentioning about this code. The first point is that we must
declare a variable outside of our loop to keep track of our match and
explicitly assign to this variable when we find a match inside our loop.
When checking for a match inside the loop, we must test to make sure the
name is not <code class="highlighter-rouge">null</code> to avoid a <code class="highlighter-rouge">NullPointerException</code> when we call
<code class="highlighter-rouge">name.equals</code>. The final
point is that we must explicitly break out of our loop when we find a match to
avoid the overhead of processing unnecessary records.</p>
<p>How might we do something like this in Elixir given that we don’t have
mutable data<sup id="fnref:mutable"><a href="#fn:mutable" class="footnote">4</a></sup> to keep track of our match? Elixir makes
this easy by providing the <code class="highlighter-rouge">find</code> function in the <code class="highlighter-rouge">Enum</code> module.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
</pre></td><td class="code"><pre><span class="n">bob</span> <span class="o">=</span> <span class="no">Enum</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="n">list</span><span class="p">,</span> <span class="k">fn</span> <span class="n">item</span> <span class="o">-></span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">]</span> <span class="o">==</span> <span class="sd">"</span><span class="s2">Bob"</span> <span class="k">end</span><span class="p">)</span></pre></td></tr></tbody></table></code></pre></figure>
<p>The <code class="highlighter-rouge">find</code> function takes two arguments, the collection we want to search and
a function that determines if a collection item matches our query.
The body of the anonymous function,
<code class="highlighter-rouge">item["name"] == "Bob"</code>, simply evaluates to <code class="highlighter-rouge">true</code> if the value for the
“name” key of its <code class="highlighter-rouge">item</code> argument equals “Bob” and <code class="highlighter-rouge">false</code> if not.</p>
<p><code class="highlighter-rouge">find</code> iterates over the items in the collection,
passing each item to the anonymous function until that function returns
a truthy value. <code class="highlighter-rouge">find</code> returns the
first item in the collection that matches or <code class="highlighter-rouge">nil</code> if there was no match.</p>
<p>Note that we don’t have to deal with any of the issues from the Java
implementation. There are no bounds checks, no checking for nulls
(nils in Elixir) to avoid exceptions, and no need to terminate our search
when we find the item we want (<code class="highlighter-rouge">find</code> handles that).</p>
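<p>To see the behaviour of <code class="highlighter-rouge">find</code> on concrete data, here is a short sketch. The records are invented for illustration: there are two Bobs, and one record has no “name” at all.</p>

```elixir
# Hypothetical sample data: two Bobs and one record without a name.
list = [
  %{"name" => "Alice", "age" => 34},
  %{"age" => 19},
  %{"name" => "Bob", "age" => 25},
  %{"name" => "Bob", "age" => 61}
]

# Returns the first matching item; later matches are never examined.
bob = Enum.find(list, fn item -> item["name"] == "Bob" end)

# Returns nil when nothing matches. The record with no "name" just
# compares nil == "Zed", which is false, so nothing raises.
nobody = Enum.find(list, fn item -> item["name"] == "Zed" end)

IO.inspect(bob)    # %{"age" => 25, "name" => "Bob"}
IO.inspect(nobody) # nil
```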
<h3>Transforming a Collection</h3>
<p>Suppose we want to take the list from the previous example and create a new
list that contains the names for all the records for which the value stored for
the “age” key is greater than 21.</p>
<p>In Java we might use code like the following:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="code"><pre><span class="n">ArrayList</span><span class="o"><</span><span class="n">String</span><span class="o">></span> <span class="n">newList</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ArrayList</span><span class="o"><</span><span class="n">String</span><span class="o">>();</span>
<span class="k">for</span><span class="o">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="o">;</span> <span class="n">i</span><span class="o"><</span><span class="n">list</span><span class="o">.</span><span class="na">size</span><span class="o">();</span> <span class="n">i</span><span class="o">++)</span> <span class="o">{</span>
<span class="n">Person</span> <span class="n">person</span> <span class="o">=</span> <span class="n">list</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="n">i</span><span class="o">);</span>
<span class="k">if</span> <span class="o">(</span><span class="n">person</span><span class="o">.</span><span class="na">age</span> <span class="o">></span> <span class="mi">21</span><span class="o">)</span> <span class="o">{</span>
<span class="n">newList</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">person</span><span class="o">.</span><span class="na">getName</span><span class="o">());</span>
<span class="o">}</span>
<span class="o">}</span></pre></td></tr></tbody></table></code></pre></figure>
<p>The <code class="highlighter-rouge">Enum</code> module in Elixir provides many functions for transforming collections,
but in this case we probably want to use a <a href="http://elixir-lang.org/getting-started/comprehensions.html">list comprehension</a>.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
</pre></td><td class="code"><pre><span class="n">new_list</span> <span class="o">=</span> <span class="n">for</span> <span class="n">item</span> <span class="o"><-</span> <span class="n">list</span><span class="p">,</span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">age"</span><span class="p">]</span> <span class="o">></span> <span class="m">21</span> <span class="k">do</span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">]</span> <span class="k">end</span></pre></td></tr></tbody></table></code></pre></figure>
<p><code class="highlighter-rouge">for</code> takes a generator that draws items from a collection (<code class="highlighter-rouge">item <- list</code>), an optional filter expression that selects a subset
of the items (the test on the item’s age; note that this is a plain expression, not an anonymous function, because a function value is always truthy and would let every item through),
and a body expression that computes a value from each item in the subset;
in this case it simply selects the value for the “name” key.</p>
<p>Read this as “construct a new collection consisting of the names of each item
in the list that has a value for “age” greater than 21.”</p>
<p>It’s important to distinguish the <code class="highlighter-rouge">for</code> in Elixir from a <code class="highlighter-rouge">for</code> loop in Java.
<code class="highlighter-rouge">for</code> in Elixir is <em>not</em> a control flow construct. Although you <em>can</em> have
side effects in a <code class="highlighter-rouge">for</code>, its primary purpose is to construct a new collection,
and this collection is the return value from the <code class="highlighter-rouge">for</code>. This contrasts with
the Java implementation in which we have to declare our new collection
outside our loop and then append values to it piece by piece in the body of
the loop.</p>
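<p>The following sketch shows a comprehension returning its result as a value, first with one filter and then with two. The sample records and the second filter (a name-length test) are assumptions made up for this example.</p>

```elixir
# Hypothetical sample data.
list = [
  %{"name" => "Alice", "age" => 34},
  %{"name" => "Bob", "age" => 19},
  %{"name" => "Carol", "age" => 25},
  %{"name" => "Dan", "age" => 40}
]

# One generator, one filter expression; the comprehension itself
# evaluates to the new list.
names =
  for item <- list, item["age"] > 21 do
    item["name"]
  end

# Several filters can be listed; an item must pass all of them.
short_names =
  for item <- list, item["age"] > 21, String.length(item["name"]) <= 3 do
    item["name"]
  end

IO.inspect(names)       # ["Alice", "Carol", "Dan"]
IO.inspect(short_names) # ["Dan"]
```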
<p>Suppose we want to sort the names as well. In Java we can either sort the
new collection as we construct it (ugh) or call the <code class="highlighter-rouge">sort</code> method on the
<code class="highlighter-rouge">Collections</code> class.</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
</pre></td><td class="code"><pre><span class="n">Collections</span><span class="o">.</span><span class="na">sort</span><span class="o">(</span><span class="n">newList</span><span class="o">);</span></pre></td></tr></tbody></table></code></pre></figure>
<p>This almost has an FP feel to it, but it’s not, because this sorts our new list
<em>in place</em>, i.e., mutates it.</p>
<p>With Elixir we can similarly call <code class="highlighter-rouge">Enum.sort</code> on
<code class="highlighter-rouge">new_list</code>, but we want to start thinking about data transformations in FP,
so instead we pipe the output of our list comprehension directly to <code class="highlighter-rouge">sort</code>.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
</pre></td><td class="code"><pre><span class="n">new_list</span> <span class="o">=</span> <span class="n">for</span> <span class="n">item</span> <span class="o"><-</span> <span class="n">list</span><span class="p">,</span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">age"</span><span class="p">]</span> <span class="o">></span> <span class="m">21</span> <span class="k">do</span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">name"</span><span class="p">]</span> <span class="k">end</span> <span class="o">|></span> <span class="no">Enum</span><span class="o">.</span><span class="n">sort</span><span class="p">()</span></pre></td></tr></tbody></table></code></pre></figure>
<p>You might be objecting at this point by asking “But these pipelines are
transforming data - how is that different from changing our list in place
with Java’s <code class="highlighter-rouge">Collections.sort()</code>?” The difference is that collections transformed
by Elixir functions are copied - the original collection never changes. And
right about now you are probably thinking how inefficient that must be in
terms of memory and CPU cycles. I’m going to dodge this as outside the
scope of this series, but go read about
<a href="https://en.wikipedia.org/wiki/Persistent_data_structure">persistent data structures</a>
and have your mind blown.</p>
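<p>A quick way to convince yourself that the original collection is left alone is to sort a list and then look at both values. This is only a sketch with made-up names.</p>

```elixir
# Hypothetical sample data.
names = ["Carol", "Alice", "Bob"]

# Enum.sort returns a new, sorted list.
sorted = Enum.sort(names)

# The list bound to `names` is exactly what it was before.
IO.inspect(sorted) # ["Alice", "Bob", "Carol"]
IO.inspect(names)  # ["Carol", "Alice", "Bob"]
```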
<p>Let’s look at another example. Suppose we want to partition our collection into
two different collections, one collection containing all the items where
the value for the “age” key is less than 21 and one consisting of all the rest.</p>
<p>In Java this would not be much different from our previous example:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="code"><pre><span class="n">ArrayList</span><span class="o"><</span><span class="n">Person</span><span class="o">></span> <span class="n">underAge</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ArrayList</span><span class="o"><</span><span class="n">Person</span><span class="o">>();</span>
<span class="n">ArrayList</span><span class="o"><</span><span class="n">Person</span><span class="o">></span> <span class="n">legal</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ArrayList</span><span class="o"><</span><span class="n">Person</span><span class="o">>();</span>
<span class="k">for</span><span class="o">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="o">;</span> <span class="n">i</span><span class="o"><</span><span class="n">list</span><span class="o">.</span><span class="na">size</span><span class="o">();</span> <span class="n">i</span><span class="o">++)</span> <span class="o">{</span>
<span class="n">Person</span> <span class="n">person</span> <span class="o">=</span> <span class="n">list</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="n">i</span><span class="o">);</span>
<span class="k">if</span> <span class="o">(</span><span class="n">person</span><span class="o">.</span><span class="na">age</span> <span class="o"><</span> <span class="mi">21</span><span class="o">)</span> <span class="o">{</span>
<span class="n">underAge</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">person</span><span class="o">);</span>
<span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
<span class="n">legal</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">person</span><span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span></pre></td></tr></tbody></table></code></pre></figure>
<p>In Elixir, we have yet another higher order function from <code class="highlighter-rouge">Enum</code>, the
<code class="highlighter-rouge">partition</code> function. This function is similar to the <code class="highlighter-rouge">filter</code> function, but
instead of discarding items it splits them into two lists, one for the items
for which the given function returns a truthy value and one for everything
else.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
</pre></td><td class="code"><pre><span class="p">{</span><span class="n">under_age</span><span class="p">,</span> <span class="n">legal</span><span class="p">}</span> <span class="o">=</span> <span class="no">Enum</span><span class="o">.</span><span class="n">partition</span><span class="p">(</span><span class="n">list</span><span class="p">,</span> <span class="k">fn</span> <span class="n">item</span> <span class="o">-></span> <span class="n">item</span><span class="p">[</span><span class="sd">"</span><span class="s2">age"</span><span class="p">]</span> <span class="o"><</span> <span class="m">21</span> <span class="k">end</span><span class="p">)</span></pre></td></tr></tbody></table></code></pre></figure>
<p>Again, notice the declarative style here. We want to <em>partition</em> our list into
one list with items with “age” under 21 and one with the rest. Also note that
<code class="highlighter-rouge">partition</code> returns a tuple that we can match against (see <a href="http://elixir-lang.org/getting-started/pattern-matching.html">Elixir pattern matching</a>) to
bind our two lists directly to the variables <code class="highlighter-rouge">under_age</code> and <code class="highlighter-rouge">legal</code>.
Compare this to the Java implementation where we had to declare our
lists outside our loop and append items to each list explicitly using
conditional logic inside our loop.</p>
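<p>If you are on a newer Elixir release, my understanding is that <code class="highlighter-rouge">Enum.partition</code> is deprecated in favor of <code class="highlighter-rouge">Enum.split_with</code>, which takes the same arguments and returns the same kind of tuple. Here is a sketch using it with invented sample records.</p>

```elixir
# Hypothetical sample data.
list = [
  %{"name" => "Alice", "age" => 34},
  %{"name" => "Bob", "age" => 19},
  %{"name" => "Carol", "age" => 21}
]

# The first list holds items for which the function is truthy,
# the second holds everything else; original order is preserved.
{under_age, legal} = Enum.split_with(list, fn item -> item["age"] < 21 end)

under_age_names = Enum.map(under_age, fn item -> item["name"] end)
legal_names = Enum.map(legal, fn item -> item["name"] end)

IO.inspect(under_age_names) # ["Bob"]
IO.inspect(legal_names)     # ["Alice", "Carol"]
```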
<h3>To Infinity and Beyond</h3>
<p>All the previous
examples have one thing in common: they all work with collections that fit
into memory. Often we need to read and process a data source that
won’t fit into memory (like a <em>really</em> big file) or is effectively infinite
(like reading from a data stream). This is a classic use for loops in the
imperative approach.</p>
<p>So let’s compare how we handle this using a functional approach.
Let’s implement functionality similar to the Unix <a href="http://www.gnu.org/software/grep/manual/grep.html">grep</a>
command in which we will read in a file line by line and print out any lines that
contain a given string. We’ll make things more interesting by printing out the
line number with each matching line. To keep things simple we will
assume that there are no lines too big to fit into memory.</p>
<p>A straightforward
imperative<sup id="fnref:java-streams"><a href="#fn:java-streams" class="footnote">5</a></sup> Java implementation looks like this
(ignoring error handling):</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><table class="rouge-table"><tbody><tr><td class="code"><pre><span class="n">FileInputStream</span> <span class="n">fs</span> <span class="o">=</span> <span class="k">new</span> <span class="n">FileInputStream</span><span class="o">(</span><span class="n">filename</span><span class="o">);</span>
<span class="n">BufferedReader</span> <span class="n">br</span> <span class="o">=</span> <span class="k">new</span> <span class="n">BufferedReader</span><span class="o">(</span><span class="k">new</span> <span class="n">InputStreamReader</span><span class="o">(</span><span class="n">fs</span><span class="o">));</span>
<span class="n">String</span> <span class="n">line</span><span class="o">;</span>
<span class="kt">int</span> <span class="n">lineNum</span> <span class="o">=</span> <span class="mi">1</span><span class="o">;</span>
<span class="k">while</span> <span class="o">((</span><span class="n">line</span> <span class="o">=</span> <span class="n">br</span><span class="o">.</span><span class="na">readLine</span><span class="o">())</span> <span class="o">!=</span> <span class="kc">null</span><span class="o">)</span> <span class="o">{</span>
    <span class="k">if</span> <span class="o">(</span><span class="n">line</span><span class="o">.</span><span class="na">indexOf</span><span class="o">(</span><span class="n">query</span><span class="o">)</span> <span class="o">!=</span> <span class="o">-</span><span class="mi">1</span><span class="o">)</span> <span class="o">{</span>
        <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"%d - %s%n"</span><span class="o">,</span> <span class="n">lineNum</span><span class="o">,</span> <span class="n">line</span><span class="o">);</span>
    <span class="o">}</span>
    <span class="n">lineNum</span><span class="o">++;</span>
<span class="o">}</span>
<span class="n">br</span><span class="o">.</span><span class="na">close</span><span class="o">();</span></pre></td></tr></tbody></table></code></pre></figure>
<p>We use a <code class="highlighter-rouge">while</code> loop to read from the open file until there are no lines
left. Inside the loop we check the current line to see if it contains our
query term. If it does, we print out the line with its line number.
Fairly simple stuff.</p>
<p>So how do we handle data sources of indefinite size in Elixir? Not surprisingly,
Elixir provides a module for this. The <code class="highlighter-rouge">Stream</code> module provides lazy
counterparts to many of the functions in the <code class="highlighter-rouge">Enum</code> module: instead of building
a complete collection at each step, they produce <em>streams</em>. In Elixir, streams are enumerables that
generate items one by one during enumeration. We will examine a solution based
on the <code class="highlighter-rouge">Stream</code> module in a later post, but first let’s take a look at a more generic solution
based on a <a href="https://en.wikipedia.org/wiki/Recursion_(computer_science)">recursive function</a>.</p>
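<p>To make “one by one” concrete before we move on, here is a minimal sketch of stream laziness. It is not the <code class="highlighter-rouge">Stream</code>-based grep reserved for the later post; the variable names are made up for illustration, and only the standard <code class="highlighter-rouge">Stream.iterate/2</code>, <code class="highlighter-rouge">Stream.filter/2</code>, and <code class="highlighter-rouge">Enum.take/2</code> functions are used.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"># Stream.iterate/2 describes an infinite sequence: 1, 2, 3, ...
# Nothing is computed yet; each item is produced only when asked for.
naturals = Stream.iterate(1, fn n -> n + 1 end)

# Stream.filter/2 is also lazy, so this line still does no work.
evens = Stream.filter(naturals, fn n -> rem(n, 2) == 0 end)

# Enum.take/2 drives the enumeration and stops once it has three items,
# so only a handful of numbers are ever generated.
first_three = Enum.take(evens, 3)

IO.inspect(first_three)  # [2, 4, 6]</code></pre></figure>
<p>Because nothing runs until <code class="highlighter-rouge">Enum.take/2</code> asks for items, the same pipeline shape works for a source that never ends.</p>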
<p>With a recursive function call we can produce the same effect as
an iterative loop in an imperative language. There are two primary differences in the
implementation. The first is that instead of specifying a termination
condition in a loop predicate we must implement the logic for this in our
function. The second difference is that with an imperative approach we can
mutate external state with each loop iteration (such as incrementing the line counter),
whereas with the recursive approach we must pass any state into our function as
normal arguments.</p>
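<p>Both differences show up even in a tiny example. The sketch below (the <code class="highlighter-rouge">Counter</code> module and <code class="highlighter-rouge">count_matches</code> name are hypothetical, chosen only for illustration) counts the lines in a list that contain a query string. The empty-list clause plays the role of the loop’s termination condition, and the running count travels along as an argument instead of living in a mutable variable.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir">defmodule Counter do
  # Public entry point: start the running count at 0.
  def count_matches(lines, query), do: count_matches(lines, query, 0)

  # Termination: no lines left, so the accumulated count is the answer.
  defp count_matches([], _query, count), do: count

  # Recursive step: look at the first line, then recurse on the rest,
  # passing the (possibly updated) count along as an argument.
  defp count_matches([line | rest], query, count) do
    if String.contains?(line, query) do
      count_matches(rest, query, count + 1)
    else
      count_matches(rest, query, count)
    end
  end
end

IO.inspect(Counter.count_matches(["foo bar", "baz", "food"], "foo"))  # 2</code></pre></figure>
<p>Each recursive call is the last thing the function does, which is the shape that lets the runtime reuse the stack frame rather than grow the stack.</p>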
<p>Our Elixir implementation consists of two functions (in Elixir, functions
with the same name but different arity are considered different functions).
The first function is the actual recursive function that does most of the
work.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
</pre></td><td class="code"><pre><span class="k">def</span> <span class="n">grep</span><span class="p">(</span><span class="n">file</span><span class="p">,</span> <span class="n">query</span><span class="p">,</span> <span class="n">line_num</span><span class="p">)</span> <span class="k">do</span>
  <span class="k">case</span> <span class="no">IO</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="n">file</span><span class="p">,</span> <span class="ss">:line</span><span class="p">)</span> <span class="k">do</span>
    <span class="p">{</span><span class="ss">:error</span><span class="p">,</span> <span class="n">reason</span><span class="p">}</span> <span class="o">-></span>
      <span class="no">IO</span><span class="o">.</span><span class="n">puts</span> <span class="n">reason</span>
      <span class="no">File</span><span class="o">.</span><span class="n">close</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>

    <span class="ss">:eof</span> <span class="o">-></span>
      <span class="no">File</span><span class="o">.</span><span class="n">close</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>

    <span class="n">line</span> <span class="o">-></span>
      <span class="k">if</span> <span class="no">String</span><span class="o">.</span><span class="n">contains?</span><span class="p">(</span><span class="n">line</span><span class="p">,</span> <span class="n">query</span><span class="p">)</span> <span class="k">do</span>
        <span class="no">IO</span><span class="o">.</span><span class="n">puts</span> <span class="sd">"</span><span class="si">#{</span><span class="n">line_num</span><span class="si">}</span><span class="s2">: </span><span class="si">#{</span><span class="n">line</span><span class="si">}</span><span class="s2">"</span>
      <span class="k">end</span>

      <span class="n">grep</span><span class="p">(</span><span class="n">file</span><span class="p">,</span> <span class="n">query</span><span class="p">,</span> <span class="n">line_num</span> <span class="o">+</span> <span class="m">1</span><span class="p">)</span>
  <span class="k">end</span>
<span class="k">end</span>

<span class="k">def</span> <span class="n">grep</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="n">query</span><span class="p">)</span> <span class="k">do</span>
  <span class="p">{</span><span class="ss">:ok</span><span class="p">,</span> <span class="n">file</span><span class="p">}</span> <span class="o">=</span> <span class="no">File</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="p">[</span><span class="ss">:read</span><span class="p">,</span> <span class="ss">:utf8</span><span class="p">])</span>

  <span class="n">grep</span><span class="p">(</span><span class="n">file</span><span class="p">,</span> <span class="n">query</span><span class="p">,</span> <span class="m">1</span><span class="p">)</span>
<span class="k">end</span></pre></td></tr></tbody></table></code></pre></figure>
<p>The second function accepts a filename and a query string.
It opens the file and passes it to the first function, initializing the
<code class="highlighter-rouge">line_num</code> argument to 1.</p>
<p>The first function simply reads a line from the file and uses a <code class="highlighter-rouge">case</code> statement
to decide what to do. If the line read results in an error, then we simply
print the error and close the file. Similarly, if the output of the line read
is an end-of-file, then we just close the file.</p>
<p>The interesting bit is the last case, where we successfully read a line.
If the line contains our query string, we print it out along with the line
number. In any case, since we did not read an end-of-file, we call ourselves
again<sup id="fnref:tce"><a href="#fn:tce" class="footnote">6</a></sup> on line 15. We pass the same arguments we received, with the exception
of the line number, which we increase by one. In this way we keep reading
and processing lines until the file is exhausted.</p>
<p>Notice how we don’t keep track of the line number using an external variable.
That bit of state is maintained implicitly in the argument to the function call.
There is no mutated state since we are never assigning anything. <em>Every time
we want to change state we must pass the new state in as arguments to our
function</em>. This can be a tricky bit to understand at first, but it’s
fundamental, so take time to let it sink in.</p>
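<p>The same idea extends to any state, not just a counter. As a sketch (the <code class="highlighter-rouge">GrepCollect</code> module name and the in-memory <code class="highlighter-rouge">StringIO</code> input are assumptions made so the example runs without a file on disk), here is a variant that returns the matches instead of printing them. The list of matches is one more piece of state, so it becomes one more argument.</p>
<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir">defmodule GrepCollect do
  # Entry point: start at line 1 with an empty list of matches.
  def grep(device, query), do: grep(device, query, 1, [])

  defp grep(device, query, line_num, matches) do
    case IO.read(device, :line) do
      {:error, reason} ->
        {:error, reason}

      :eof ->
        # Matches were prepended, so reverse to restore file order.
        {:ok, Enum.reverse(matches)}

      line ->
        if String.contains?(line, query) do
          # New state is passed in as new arguments; nothing is reassigned.
          match = {line_num, String.trim_trailing(line)}
          grep(device, query, line_num + 1, [match | matches])
        else
          grep(device, query, line_num + 1, matches)
        end
    end
  end
end

# StringIO provides an in-memory device that IO.read/2 can read line by line.
{:ok, device} = StringIO.open("apple\nbanana\npineapple\n")
IO.inspect(GrepCollect.grep(device, "apple"))
# {:ok, [{1, "apple"}, {3, "pineapple"}]}</code></pre></figure>
<p>Returning the matches rather than printing them also makes the function easier to test, since the caller decides what to do with the result.</p>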
<h3>Conclusion</h3>
<p>We have seen how we can accomplish tasks in a functional language that are typically
solved using looping control flow in imperative languages. Hopefully you
are starting to get a feel for the declarative style that is pervasive in FP,
and hopefully you are starting to appreciate how it can help you write more
expressive, concise code. In the next post we will look at
our next challenge: working with nested data structures.</p>
<h3>Notes</h3>
<div class="footnotes">
<ol>
<li id="fn:noloop">
<p>Actually, Elixir does have a <code class="highlighter-rouge">for</code>, but it is not technically a loop construct; rather, it is what is known as a <em>list comprehension</em>. <a href="#fnref:noloop" class="reversefootnote">↩</a></p>
</li>
<li id="fn:higher-order">
<p>Yes, <em>function</em>; see <a href="https://en.wikipedia.org/wiki/Higher-order_function">higher order functions</a>. <a href="#fnref:higher-order" class="reversefootnote">↩</a></p>
</li>
<li id="fn:anonymous">
<p>Anonymous functions are simply functions that are not named. Technically they are also <a href="https://simple.wikipedia.org/wiki/Closure_(computer_science)">closures</a>. <a href="#fnref:anonymous" class="reversefootnote">↩</a></p>
</li>
<li id="fn:mutable">
<p>Although <em>data</em> is immutable, variables <em>can</em> be rebound to new data, so we could do something similar to the variable assignment happening in the Java loop. Higher order functions like <code class="highlighter-rouge">Enum.find</code> prevent a need for this, however. <a href="#fnref:mutable" class="reversefootnote">↩</a></p>
</li>
<li id="fn:java-streams">
<p>Java 8 introduced <a href="http://www.oracle.com/technetwork/articles/java/ma14-java-se-8-streams-2177646.html">streams</a>, which offer a more declarative approach to processing I/O. <a href="#fnref:java-streams" class="reversefootnote">↩</a></p>
</li>
<li id="fn:tce">
<p>At this point you may be concerned about a large number of recursive calls exceeding our stack limit. Fortunately Elixir, like many functional languages, provides <a href="https://en.wikipedia.org/wiki/Tail_call">tail call elimination</a>, so our recursive calls are effectively treated as jumps back to the top of the function. <a href="#fnref:tce" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
<p>Posted by jnorton</p>
<h2>Functional Programming for the Functionally Challenged (Like Me)</h2>
<p>Published 2015-09-09 at <a href="http://blog.element84.com/elixir-fp-prologue">http://blog.element84.com/elixir-fp-prologue</a></p>
<p>By now you have probably heard all the hype about functional programming (FP) and
may have even dipped your toe in the water by trying out <a href="http://clojure.org/">Clojure</a>
or one of the other Lisp dialects.</p>
<p>Maybe you have experimented with some of the functional
elements of <a href="http://www.scala-lang.org/">Scala</a>, <a href="https://www.ruby-lang.org/en/">Ruby</a>,
<a href="https://www.python.org/">Python</a>, or one of the newer languages
like <a href="https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/">Swift</a>. If you are a longtime imperative programmer like me,
your first exposure to functional programming may have left you feeling a bit
bewildered and not quite sure if you are <em>smart enough</em> for FP.</p>
<p>Maybe you have convinced yourself that FP is
only for academics or people who win Turing Awards. Maybe you have even convinced
yourself that you don’t <em>really</em> need to learn it at all, that
it’s just the latest craze and that if you wait long enough something else
will come along. After all, FP
languages don’t even make the top ten on the latest <a href="https://redmonk.com/sogrady/category/programming-languages/">RedMonk listings</a>.</p>
<p>First the bad news: functional programming isn’t going to disappear any time
soon. FP tenets like immutable data, <a href="https://en.wikipedia.org/wiki/Referential_transparency_(computer_science)">referential transparency</a>,
<a href="https://en.wikipedia.org/wiki/Higher-order_function">higher order functions</a>,
<a href="https://en.wikipedia.org/wiki/Function_composition">function composition</a>,
etc., lend themselves to building
highly scalable software that can take advantage of this distributed,
multi-core world in which we find ourselves. I’m not just repeating the
FP mantra here; this is based on personal observations made while spending the last two years rebuilding a legacy <a href="http://jruby.org/">JRuby</a> system in Clojure. Now the good news:
<span data-pullquote="you don't have to be a genius to be a functional programmer">
you don’t have to be a genius like <a href="https://twitter.com/richhickey?lang=en">Rich Hickey</a>
to be a functional programmer.</span></p>
<p>This is the prologue to a series of posts for those of us who aren’t
quite sure we are clever enough to be functional programmers. In this series
I’ll cover a set of example programming tasks that often seem challenging when
moving from an imperative style to FP. For each example I will first present
a solution in an imperative language, <a href="http://java.com/en/">Java</a>, and then present a solution using
a functional language, <a href="http://elixir-lang.org/">Elixir</a>. I’ll highlight the
differences in each approach and explain the rationale and benefits of the FP
solutions.</p>
<p>Many of you have probably never seen Elixir code (or even heard of Elixir), but
I think it makes
a great choice for learning FP for several reasons. First, it borrows a lot of
syntax from Ruby, so it may be more accessible to Rubyists than the Lisp
dialects. Second, it runs on the <a href="https://erlangcentral.org/tag/beam/">Erlang VM (BEAM)</a>,
which has quick startup and
generally makes for a snappier development experience than, say, JVM-based
languages. Finally, although it’s only a few years old, the documentation and
tooling around Elixir are excellent, including a powerful REPL (Read-Eval-Print Loop)
with good editor integration.</p>
<p>Hopefully by the end of this series you will realize that “thinking functionally” is not
strictly the domain of the intelligentsia, and that FP is not only within your
reach, but worth reaching for. In the <a href="/elixir-feelin-loopy.html">next article</a> we will look at a
common task, iterating over a collection, which should be familiar using an
imperative approach, but can be confusing at first using FP.</p>
<p>Posted by jnorton</p>