<?xml version="1.0" encoding="UTF-8" ?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Richard Palethorpe's software engineering articles</title>
  <id>https://richiejp.com</id>
  <updated>2025-05-02T08:28:51+00:00</updated>
  <subtitle>Systems engineering (C, Zig, Kernel, Databases), Full-Stack (TypeScript, Svelte, Go), Indiehacking</subtitle>
  <author>
    <name>Richard Palethorpe</name>
    <email>io@richiejp.com</email>
    <uri>https://richiejp.com</uri>
  </author>
  <link href="https://richiejp.com/atom.feed.xml" rel="self" />
  <icon>/favicon.svg</icon>
  <logo>/logo.svg</logo>
  <rights> © 2025 Richard Palethorpe</rights>
<entry>
  <title>Bitbanging 1D Reversible Automata</title>
  <id>https://richiejp.com/1d-reversible-automata</id>
  <published>2021-03-28T15:49:03+01:00</published>
  <updated>2025-01-07T21:33:31Z</updated>
  <link rel="alternate" href="https://richiejp.com/1d-reversible-automata" />
  <summary>Nearest-neighbor, one-dimensional, reversible, bitbanging
binary cell automata implementation.</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <h1 id="one-dimensional-reversible-automata">One Dimensional
            Reversible Automata</h1>
            <p>I created a <a
            href="https://github.com/gfxprim/automata">demo</a> for the
            <a href="http://gfxprim.ucw.cz/index.html">GFXPrim</a>
            library. It implements and displays a nearest-neighbor,
            one-dimensional, binary cell automata. Additionally it
            implements a reversible automata, which is almost identical
            except for a small change to make it reversible. The
            automata is displayed over time in two dimensions, with time
            running from top to bottom, although in the reversible case
            time could also be played backwards.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>There is now a <a
            href="1d-reversible-automata-webgpu">WebGPU version you can
            try out in your browser!</a></p>
            </div>
            </div>
            <div class="float">
            <img src="/https/richiejp.com/73r.png"
            alt="A reversible rendition of rule 73" />
            <div class="figcaption">A reversible rendition of rule
            73</div>
            </div>
            <p>The automata works as follows:</p>
            <ul>
            <li>Each cell has a state, which is on or off, black or
            white, boolean etc.</li>
            <li>At each time step, the state of a cell in the next step
            is chosen by a rule.</li>
            <li>The rule looks at a cell’s current value and the values
            of its left and right neighbors.</li>
            <li>There are <span
            class="math inline">2<sup>3</sup> = 8</span> possible state
            combinations (patterns) for 3 binary cells.</li>
            <li>A rule states which patterns result in a black cell in
            the next time step.</li>
            <li>There are <span
            class="math inline">2<sup>8</sup> = 256</span> possible
            rules. That is, 256 unique combinations of patterns.</li>
            </ul>
            <div class="float">
            <img src="/https/richiejp.com/105.png" alt="Rule 105" />
            <div class="figcaption">Rule 105</div>
            </div>
            <p>So a pattern is a 3 digit binary number, where each digit
            corresponds to a cell. The middle digit is the center cell,
            the high order bit the left cell, the low order bit the
            right cell. A rule can be displayed by showing a row of
            patterns and a row of next states.</p>
            <table>
            <thead>
            <tr class="header">
            <th align="center">111</th>
            <th align="center">110</th>
            <th align="center">101</th>
            <th align="center">100</th>
            <th align="center">011</th>
            <th align="center">010</th>
            <th align="center">001</th>
            <th align="center">000</th>
            </tr>
            </thead>
            <tbody>
            <tr class="odd">
            <td align="center">0</td>
            <td align="center">1</td>
            <td align="center">1</td>
            <td align="center">0</td>
            <td align="center">1</td>
            <td align="center">1</td>
            <td align="center">1</td>
            <td align="center">0</td>
            </tr>
            </tbody>
            </table>
            <p>Above is rule <code>110</code>, <code>0x6e</code> or
            <code>01101110</code>. It essentially says to match patterns
            <code>110</code>, <code>101</code>, <code>011</code>,
            <code>010</code>, <code>001</code>, where a pattern match
            results in the cell being set to 1 at the next time step. If
            no pattern is matched, or equivalently an inactive pattern
            is matched, then the cell will be set to 0.</p>
            <p>Again note that each pattern resembles a 3bit binary
            number. Also the values of the active patterns resemble an
            8bit binary number. We can use this to perform efficient
            matching of the patterns using binary operations.</p>
            <p>Let’s assume our CPU natively operates on 64bit integers
            (called <em>words</em>). We can pack a 64 cell automata into
            a single 64bit integer. Each bit corresponds to a cell. If a
            bit is <code>1</code> then it is a black cell and
            <code>0</code> for white. In this case we are using integers
            as bit fields. We don’t care about the integer number the
            bits can represent.</p>
            <div class="float">
            <img src="/https/richiejp.com/94r.png" alt="Rule 94 Reversible" />
            <div class="figcaption">Rule 94 Reversible</div>
            </div>
            <p>The CPU can perform bitwise operations on all 64bits in
            parallel and without branching. This means we can perform a
            single operation 64 times in parallel.<a href="#fn1"
            class="footnote-ref" id="fnref1"><sup>1</sup></a></p>
            <p>If we <em>rotate</em> (wrapped <code>&gt;&gt;</code><a
            href="#fn2" class="footnote-ref"
            id="fnref2"><sup>2</sup></a>) all bits to the right by one,
            then we get a new integer where the left neighbor of a bit
            is now in its position. Likewise if we shift all bits to the
            left, then we get an integer representing the right
            neighbors. This gives us 3 integers where the left, center
            and right bits are in the same position. For example, using
            only 8bits:</p>
            <table>
            <tbody>
            <tr class="odd">
            <td align="left">left:</td>
            <td>0100 1011</td>
            <td><code>&gt;&gt;</code></td>
            </tr>
            <tr class="even">
            <td align="left">center:</td>
            <td>1001 0110</td>
            <td></td>
            </tr>
            <tr class="odd">
            <td align="left">right:</td>
            <td>0010 1101</td>
            <td><code>&lt;&lt;</code></td>
            </tr>
            </tbody>
            </table>
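            <p>As a minimal sketch of the 8bit example above, the two
            neighbor fields can be produced with a pair of rotations.
            The helper names <code>rotr8</code> and <code>rotl8</code>
            are made up for illustration and are not part of the
            demo.</p>
<div class="sourceCode"><pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Rotate an 8bit field right by one: bit 0 wraps around to bit 7.
 * Each bit then holds the value of its left neighbor. */
static inline uint8_t rotr8(uint8_t v)
{
    return (uint8_t)((v >> 1) | (v << 7));
}

/* Rotate an 8bit field left by one: bit 7 wraps around to bit 0.
 * Each bit then holds the value of its right neighbor. */
static inline uint8_t rotl8(uint8_t v)
{
    return (uint8_t)((v << 1) | (v >> 7));
}
```
]]></code></pre></div>
            <p>With a center field of <code>0x96</code> (<code>1001
            0110</code>) these return <code>0x4b</code> and
            <code>0x2d</code>, which are the left and right rows of the
            table.</p>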
            <p>Each pattern can be represented as a 3bit number, plus a
            4th bit to say whether it is active in a given rule. As we
            want to operate on all 64bits at once in the left, right and
            center bit fields, we can generate 64bit long <em>masks</em>
            from the value of each bit in a given pattern.</p>
            <p>So if we have a pattern where the left cell should be
            one, then we can create a 64bit mask of <em>all</em> ones.
            If it should be zero, then all zeroes. Likewise for the
            center and right cells. The masks can be <em>xor’ed</em><a
            href="#fn3" class="footnote-ref"
            id="fnref3"><sup>3</sup></a> (<code>^</code>) with the
            corresponding cell fields to show if no match occurred. That
            is, if the pattern is one and the cell is zero or the cell
            is one and the pattern is zero. We can invert this
            (<code>~</code>) to give one when a match occurs.</p>
            <p>To see whether all components (left, right, center) of a
            pattern match we can bitwise <em>and</em>
            (<code>&amp;</code>) them together. We can then bitwise
            <em>or</em><a href="#fn4" class="footnote-ref"
            id="fnref4"><sup>4</sup></a> (<code>|</code>) the result of
            the pattern matches together to produce the final
            values.</p>
            <div class="float">
            <img src="/https/richiejp.com/193.jpg" alt="Rule 193, inverted rule 110" />
            <div class="figcaption">Rule 193, inverted rule 110</div>
            </div>
            <p>If we wish to operate on an automata larger than 64
            cells, then we can combine multiple integers into an array.
            After performing the left and right shifts, we take the high
            bit of the next integer and the low bit of the previous
            integer in the array, then use them to set the low bit of
            the right bit field and the high bit of the left bit field.
            In other words we chain the integers together using the end
            bits of the left and right bit fields.</p>
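            <p>The chaining could be sketched roughly as follows,
            assuming the high bit of each word is its leftmost cell and
            that the row wraps around at the ends. The names
            <code>left_field</code>, <code>right_field</code> and
            <code>cell_at</code> are hypothetical helpers for this
            sketch only. Using <code>|</code> or <code>^</code> to merge
            in the borrowed bit gives the same result, because the shift
            always leaves that end bit zero.</p>
<div class="sourceCode"><pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Left neighbor field for word i of a row of n words. The low bit of the
 * previous word is moved into the high bit. Word 0 borrows from word n-1. */
static inline uint64_t left_field(const uint64_t *row, size_t n, size_t i)
{
    assert(n > 0 && i < n);
    uint64_t c_prev = row[(i + n - 1) % n];

    return (row[i] >> 1) | (c_prev << 63);
}

/* Right neighbor field for word i. The high bit of the next word is moved
 * into the low bit. Word n-1 borrows from word 0. */
static inline uint64_t right_field(const uint64_t *row, size_t n, size_t i)
{
    assert(n > 0 && i < n);
    uint64_t c_next = row[(i + 1) % n];

    return (row[i] << 1) | (c_next >> 63);
}

/* Read cell x of a row, where cell 0 is the high bit of row[0] */
static inline unsigned cell_at(const uint64_t *row, size_t x)
{
    return (unsigned)((row[x >> 6] >> (63 - (x & 63))) & 1);
}
```
]]></code></pre></div>
            <p>When the row is a single word, the previous and next
            words are the word itself, so both functions degrade to
            plain rotations.</p>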
            <p>For illustration purposes, below is the <em>kernel</em>
            of the automata algorithm.</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="co">/* If bit n is 1 then make all bits 1 otherwise 0 */</span></span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a><span class="pp">#define BIT_TO_MAX</span><span class="op">(</span><span class="pp">b</span><span class="op">,</span><span class="pp"> n</span><span class="op">)</span><span class="pp"> </span><span class="op">(((</span><span class="pp">b </span><span class="op">&gt;&gt;</span><span class="pp"> n</span><span class="op">)</span><span class="pp"> </span><span class="op">&amp;</span><span class="pp"> </span><span class="dv">1</span><span class="op">)</span><span class="pp"> </span><span class="op">*</span><span class="pp"> </span><span class="op">~</span><span class="dv">0</span><span class="bu">UL</span><span class="op">)</span></span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a></span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a><span class="co">/* Numeric representation of the current update rule */</span></span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a><span class="dt">static</span> <span class="dt">uint8_t</span> rule <span class="op">=</span> <span class="dv">110</span><span class="op">;</span></span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a></span>
<span id="cb1-7"><a href="#cb1-7" tabindex="-1"></a><span class="co">/* Apply the current rule to a 64bit segment of a row */</span></span>
<span id="cb1-8"><a href="#cb1-8" tabindex="-1"></a><span class="dt">static</span> <span class="kw">inline</span> <span class="dt">uint64_t</span> ca1d_rule_apply<span class="op">(</span><span class="dt">uint64_t</span> c_prev<span class="op">,</span></span>
<span id="cb1-9"><a href="#cb1-9" tabindex="-1"></a>                                       <span class="dt">uint64_t</span> c<span class="op">,</span></span>
<span id="cb1-10"><a href="#cb1-10" tabindex="-1"></a>                                       <span class="dt">uint64_t</span> c_next<span class="op">,</span></span>
<span id="cb1-11"><a href="#cb1-11" tabindex="-1"></a>                                       <span class="dt">uint64_t</span> c_prev_step<span class="op">)</span></span>
<span id="cb1-12"><a href="#cb1-12" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb1-13"><a href="#cb1-13" tabindex="-1"></a>    <span class="dt">int</span> i<span class="op">;</span></span>
<span id="cb1-14"><a href="#cb1-14" tabindex="-1"></a>    <span class="co">/* These are wrapping shifts when c_prev == c or c_next == c */</span></span>
<span id="cb1-15"><a href="#cb1-15" tabindex="-1"></a>    <span class="dt">uint64_t</span> l <span class="op">=</span> <span class="op">(</span>c <span class="op">&gt;&gt;</span> <span class="dv">1</span><span class="op">)</span> <span class="op">^</span> <span class="op">(</span>c_prev <span class="op">&lt;&lt;</span> <span class="dv">63</span><span class="op">);</span></span>
<span id="cb1-16"><a href="#cb1-16" tabindex="-1"></a>    <span class="dt">uint64_t</span> r <span class="op">=</span> <span class="op">(</span>c <span class="op">&lt;&lt;</span> <span class="dv">1</span><span class="op">)</span> <span class="op">^</span> <span class="op">(</span>c_next <span class="op">&gt;&gt;</span> <span class="dv">63</span><span class="op">);</span></span>
<span id="cb1-17"><a href="#cb1-17" tabindex="-1"></a>    <span class="dt">uint64_t</span> c_next_step <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb1-18"><a href="#cb1-18" tabindex="-1"></a></span>
<span id="cb1-19"><a href="#cb1-19" tabindex="-1"></a>    <span class="cf">for</span> <span class="op">(</span>i <span class="op">=</span> <span class="dv">0</span><span class="op">;</span> i <span class="op">&lt;</span> <span class="dv">8</span><span class="op">;</span> i<span class="op">++)</span> <span class="op">{</span></span>
<span id="cb1-20"><a href="#cb1-20" tabindex="-1"></a>        <span class="dt">uint64_t</span> active <span class="op">=</span> BIT_TO_MAX<span class="op">(</span>rule<span class="op">,</span> i<span class="op">);</span></span>
<span id="cb1-21"><a href="#cb1-21" tabindex="-1"></a>        <span class="dt">uint64_t</span> left   <span class="op">=</span> BIT_TO_MAX<span class="op">(</span>i<span class="op">,</span> <span class="dv">2</span><span class="op">);</span></span>
<span id="cb1-22"><a href="#cb1-22" tabindex="-1"></a>        <span class="dt">uint64_t</span> center <span class="op">=</span> BIT_TO_MAX<span class="op">(</span>i<span class="op">,</span> <span class="dv">1</span><span class="op">);</span></span>
<span id="cb1-23"><a href="#cb1-23" tabindex="-1"></a>        <span class="dt">uint64_t</span> right  <span class="op">=</span> BIT_TO_MAX<span class="op">(</span>i<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb1-24"><a href="#cb1-24" tabindex="-1"></a></span>
<span id="cb1-25"><a href="#cb1-25" tabindex="-1"></a>        c_next_step <span class="op">|=</span></span>
<span id="cb1-26"><a href="#cb1-26" tabindex="-1"></a>            active <span class="op">&amp;</span> <span class="op">~(</span>left <span class="op">^</span> l<span class="op">)</span> <span class="op">&amp;</span> <span class="op">~(</span>center <span class="op">^</span> c<span class="op">)</span> <span class="op">&amp;</span> <span class="op">~(</span>right <span class="op">^</span> r<span class="op">);</span></span>
<span id="cb1-27"><a href="#cb1-27" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb1-28"><a href="#cb1-28" tabindex="-1"></a></span>
<span id="cb1-29"><a href="#cb1-29" tabindex="-1"></a>    <span class="co">/* The automata becomes reversible when we use c_prev_step */</span></span>
<span id="cb1-30"><a href="#cb1-30" tabindex="-1"></a>    <span class="cf">return</span> c_next_step <span class="op">^</span> c_prev_step<span class="op">;</span></span>
<span id="cb1-31"><a href="#cb1-31" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>To make the automata “reversible” an extra step can be
            added. We look at a cell’s previous value (in addition to
            its current, left and right values) and if it was one then
            <em>invert</em> the next value. This is equivalent to
            xor’ing the previous value with the next.</p>
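            <p>One way to convince yourself that the xor really makes
            the automata reversible is to run it forwards and then
            backwards. Below is a small sketch under simplifying
            assumptions: a single 64 cell row that wraps around on
            itself, and the rule passed as a parameter. The functions
            <code>rule_apply</code>, <code>step_reversible</code> and
            <code>round_trip</code> are illustrative and not taken from
            the demo.</p>
<div class="sourceCode"><pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Apply an elementary rule to one 64 cell row which wraps around on itself */
static uint64_t rule_apply(uint8_t rule, uint64_t c)
{
    uint64_t l = (c >> 1) | (c << 63);
    uint64_t r = (c << 1) | (c >> 63);
    uint64_t next = 0;

    for (int i = 0; i < 8; i++) {
        /* All ones if the relevant bit is set, otherwise all zeroes */
        uint64_t active = ((rule >> i) & 1) ? UINT64_MAX : 0;
        uint64_t left   = ((i >> 2) & 1) ? UINT64_MAX : 0;
        uint64_t center = ((i >> 1) & 1) ? UINT64_MAX : 0;
        uint64_t right  = (i & 1) ? UINT64_MAX : 0;

        next |= active & ~(left ^ l) & ~(center ^ c) & ~(right ^ r);
    }

    return next;
}

/* Reversible step: xor the rule's output with the row from one step back */
static uint64_t step_reversible(uint8_t rule, uint64_t prev, uint64_t cur)
{
    return rule_apply(rule, cur) ^ prev;
}

/* Run n steps forwards from (prev, cur), then n steps backwards.
 * Returns 1 if the two starting rows are recovered. */
static int round_trip(uint8_t rule, uint64_t prev, uint64_t cur, int n)
{
    uint64_t a = prev, b = cur, t;

    /* Forwards: (a, b) holds the rows at steps t-1 and t */
    for (int i = 0; i < n; i++) {
        t = step_reversible(rule, a, b);
        a = b;
        b = t;
    }

    /* Backwards: row t-1 is rule(row t) ^ row t+1, so swap the arguments */
    for (int i = 0; i < n; i++) {
        t = step_reversible(rule, b, a);
        b = a;
        a = t;
    }

    return a == prev && b == cur;
}
```
]]></code></pre></div>
            <p>The backwards loop uses exactly the same step function
            as the forwards loop, only with the roles of the two rows
            exchanged, because xor is its own inverse.</p>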
            <div class="float">
            <img src="/https/richiejp.com/193r.jpg" alt="Rule 193 again, but reversible" />
            <div class="figcaption">Rule 193 again, but reversible</div>
            </div>
            <p>It is not entirely clear to me what the mathematical
            implications are of being reversible. However it is
            important to physics and makes some really cool patterns
            which mimic nature. Also entropy and the second law of
            thermodynamics, yada, yada…</p>
            <p>The automata definition is taken from Stephen Wolfram’s
            “A New Kind of Science”. He proposes at least one
            <em>obvious</em><a href="#fn5" class="footnote-ref"
            id="fnref5"><sup>5</sup></a> C implementation using arrays
            of cells. He also provides a table of binary expressions for
            each rule. E.g. rule 90 reduces to just the <code>l^r</code>
            binary expression. It may be possible for the compiler to
            automatically reduce my implementation to these minimal
            expressions.</p>
            <p>To see why, let’s consider rule 90 for each pattern.</p>
            <div class="float">
            <img src="/https/richiejp.com/90r.png" alt="Rule 90 reversible" />
            <div class="figcaption">Rule 90 reversible</div>
            </div>
            <table>
            <thead>
            <tr class="header">
            <th align="center">111</th>
            <th align="center">110</th>
            <th align="center">101</th>
            <th align="center">100</th>
            <th align="center">011</th>
            <th align="center">010</th>
            <th align="center">001</th>
            <th align="center">000</th>
            </tr>
            </thead>
            <tbody>
            <tr class="odd">
            <td align="center">0</td>
            <td align="center">1</td>
            <td align="center">0</td>
            <td align="center">1</td>
            <td align="center">1</td>
            <td align="center">0</td>
            <td align="center">1</td>
            <td align="center">0</td>
            </tr>
            </tbody>
            </table>
            <p><span class="math inline">01011010 = 90</span></p>
            <p>First for pattern <code>000</code>.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a>  active <span class="op">&amp;</span> <span class="op">~(</span>left <span class="op">^</span> l<span class="op">)</span> <span class="op">&amp;</span> <span class="op">~(</span>center <span class="op">^</span> c<span class="op">)</span> <span class="op">&amp;</span> <span class="op">~(</span>right <span class="op">^</span> r<span class="op">);</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="op">=</span> <span class="dv">0</span> <span class="op">&amp;</span> <span class="op">~(</span><span class="dv">0</span> <span class="op">^</span> l<span class="op">)</span> <span class="op">&amp;</span> <span class="op">~(</span><span class="dv">0</span> <span class="op">^</span> c<span class="op">)</span> <span class="op">&amp;</span> <span class="op">~(</span><span class="dv">0</span> <span class="op">^</span> r<span class="op">);</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="op">=</span> <span class="fl">0.</span></span></code></pre></div>
            <p>Active is zero so the whole line reduces to zero. Now for
            <code>001</code>. Note that <code>1</code> here actually
            means <code>~0UL</code>, that is 64bit integer max.</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a>   <span class="dv">1</span> <span class="op">&amp;</span> <span class="op">~(</span><span class="dv">0</span> <span class="op">^</span> l<span class="op">)</span> <span class="op">&amp;</span> <span class="op">~(</span><span class="dv">0</span> <span class="op">^</span> c<span class="op">)</span> <span class="op">&amp;</span> <span class="op">~(</span><span class="dv">1</span> <span class="op">^</span> r<span class="op">);</span></span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a><span class="op">=</span> <span class="op">~</span>l <span class="op">&amp;</span> <span class="op">~</span>c <span class="op">&amp;</span> r<span class="op">.</span></span></code></pre></div>
            <p>As expected pattern <code>001</code> matches
            <code>l=0, c=0, r=1</code>. Let’s just list the remaining
            patterns or’ed together in their reduced state. Then reduce
            that further. Note that the <code>for</code> loop in
            <code>ca1d_rule_apply</code> will be <em>unrolled</em> by
            the compiler when optimising for performance. It’s also
            quite clear that <code>c_next_step</code> is dependent on an
            expression from the previous iteration or zero. So all the
            pattern match results will get or’ed together.</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a>  l <span class="op">&amp;</span> c <span class="op">&amp;</span> <span class="op">~</span>r <span class="op">|</span> l <span class="op">&amp;</span> <span class="op">~</span>c <span class="op">&amp;</span> <span class="op">~</span>r <span class="op">|</span> <span class="op">~</span>l <span class="op">&amp;</span> c <span class="op">&amp;</span> r <span class="op">|</span> <span class="op">~</span>l <span class="op">&amp;</span> <span class="op">~</span>c <span class="op">&amp;</span> r<span class="op">;</span></span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a><span class="op">=</span> l <span class="op">&amp;</span> <span class="op">~</span>r <span class="op">|</span> <span class="op">~</span>l <span class="op">&amp;</span> r<span class="op">;</span></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a><span class="op">=</span> l <span class="op">^</span> r<span class="op">.</span></span></code></pre></div>
            <p>See on the top row that
            <code>(l &amp; c &amp; ~r | l &amp; ~c &amp; ~r)</code> or’s
            together <code>c</code> and not <code>c</code>, so
            <code>c</code> has no effect and we can remove it. The same
            applies to the other pair of terms. Then we get an
            expression equivalent to xor’ing <code>l</code> and
            <code>r</code>.</p>
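            <p>The reduction can be checked mechanically with a truth
            table trick, sketched below. If bit <code>i</code> of three
            constants holds the left, center and right values of pattern
            <code>i</code>, then evaluating a candidate expression on
            those constants yields the number of the rule it implements.
            The constant and function names here are made up for the
            example.</p>
<div class="sourceCode"><pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Truth table columns: bit i of each constant is that cell's value in
 * pattern i. E.g. bit 6 (pattern 110) has l=1, c=1, r=0. */
#define TT_L 0xf0u
#define TT_C 0xccu
#define TT_R 0xaau

/* The four active patterns of rule 90 or'ed together, before reduction */
static inline uint8_t rule90_sum_of_products(uint8_t l, uint8_t c, uint8_t r)
{
    return (uint8_t)((l & c & ~r) | (l & ~c & ~r) |
                     (~l & c & r) | (~l & ~c & r));
}

/* After removing c from both pairs of terms */
static inline uint8_t rule90_without_c(uint8_t l, uint8_t r)
{
    return (uint8_t)((l & ~r) | (~l & r));
}

/* Fully reduced form */
static inline uint8_t rule90_xor(uint8_t l, uint8_t r)
{
    return (uint8_t)(l ^ r);
}
```
]]></code></pre></div>
            <p>Passing <code>TT_L</code>, <code>TT_C</code> and
            <code>TT_R</code> to each of the three functions gives 90
            every time, so all three expressions describe the same
            rule.</p>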
            <p>In theory at least, the compiler can see that
            <code>rule</code> only has 256 values and create a reduced
            version of <code>ca1d_rule_apply</code> for each value.
            Whether it actually does is not of much practical concern
            when the rendering code is the bottleneck. However it’s
            interesting to see if the compiler can deduce the best
            solution or whether anything trips it up.</p>
            <p>Judging from the disassembly from
            <code>gcc -O3 -mcpu=native -mtune=native</code>, it may
            actually do this. Additionally it <em>vectorizes</em> the
            code packing four 64bit ints at a time into 256bit registers
            and operating on those. I don’t know which part of the code
            it is vectorising or how. It’s possible that what I think is
            the rule being reduced is something related to
            vectorisation.</p>
            <div class="float">
            <img src="/https/richiejp.com/210.png" alt="Rule 210" />
            <div class="figcaption">Rule 210</div>
            </div>
            <p>To render the automata we take the approach of iterating
            over each pixel in the image. We calculate which cell the
            pixel falls inside and set the color of the pixel to that of
            the cell. That’s it.</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="co">/* Draws a pixel */</span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a><span class="dt">static</span> <span class="kw">inline</span> <span class="dt">void</span> shade_pixel<span class="op">(</span>gp_pixmap <span class="op">*</span>p<span class="op">,</span> gp_coord x<span class="op">,</span> gp_coord y<span class="op">,</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a>                               gp_pixel bg<span class="op">,</span> gp_pixel fg<span class="op">)</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a>    gp_pixel px<span class="op">;</span></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a>    <span class="dt">size_t</span> i <span class="op">=</span> <span class="op">(</span>x <span class="op">*</span> <span class="op">(</span><span class="dv">64</span> <span class="op">*</span> width<span class="op">))</span> <span class="op">/</span> p<span class="op">-&gt;</span>w<span class="op">;</span></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a>    <span class="dt">size_t</span> j <span class="op">=</span> <span class="op">(</span>y <span class="op">*</span> height<span class="op">)</span> <span class="op">/</span> p<span class="op">-&gt;</span>h<span class="op">;</span></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a>    <span class="dt">size_t</span> k <span class="op">=</span> <span class="dv">63</span> <span class="op">-</span> <span class="op">(</span>i <span class="op">&amp;</span> <span class="dv">63</span><span class="op">);</span></span>
<span id="cb5-9"><a href="#cb5-9" tabindex="-1"></a>    <span class="dt">uint64_t</span> c <span class="op">=</span> steps<span class="op">[</span>gp_matrix_idx<span class="op">(</span>width<span class="op">,</span> j<span class="op">,</span> i <span class="op">&gt;&gt;</span> <span class="dv">6</span><span class="op">)];</span></span>
<span id="cb5-10"><a href="#cb5-10" tabindex="-1"></a></span>
<span id="cb5-11"><a href="#cb5-11" tabindex="-1"></a>    c <span class="op">=</span> BIT_TO_MAX<span class="op">(</span>c<span class="op">,</span> k<span class="op">);</span></span>
<span id="cb5-12"><a href="#cb5-12" tabindex="-1"></a>    px <span class="op">=</span> <span class="op">(</span>fg <span class="op">&amp;</span> c<span class="op">)</span> <span class="op">|</span> <span class="op">(</span>bg <span class="op">&amp;</span> <span class="op">~</span>c<span class="op">);</span></span>
<span id="cb5-13"><a href="#cb5-13" tabindex="-1"></a></span>
<span id="cb5-14"><a href="#cb5-14" tabindex="-1"></a>    gp_putpixel_raw<span class="op">(</span>p<span class="op">,</span> x<span class="op">,</span> y<span class="op">,</span> px<span class="op">);</span></span>
<span id="cb5-15"><a href="#cb5-15" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>GFXPrim makes drawing very simple. The above code is fast
            enough for my purposes, but a significant improvement can
            be had. Integer division is much slower than floating point
            multiplication on most newer CPUs. It’s actually much faster
            (2x at least) on my CPU to calculate a pair of ratios in
            floating point, then convert them back to integers.</p>
            <p>However, you may ask why we are even drawing on the CPU
            in the first place? This is because GFXPrim targets embedded
            systems with no graphics processor. Additionally the CPU may
            not even support floating point natively. So integer
            division may actually be faster in this case. Still better
            would be to limit the size of the pixmap to be <span
            class="math inline">2<sup><em>x</em></sup></span> times
            larger than the dimensions of the automata, where <span
            class="math inline"><em>x</em> ∈ ℕ</span>; then we can use
            shifts instead of division.</p>
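            <p>As a rough sketch of the shift idea, assume the pixmap is
            exactly <code>1 &lt;&lt; scale_log2</code> times wider than
            the automata. The parameter <code>scale_log2</code> and both
            function names are assumptions for this example and do not
            exist in the demo or in GFXPrim.</p>
<div class="sourceCode"><pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Map pixel column x to a cell index when the pixmap is exactly
 * (1 << scale_log2) times wider than the automata */
static inline size_t pixel_to_cell_shift(size_t x, unsigned scale_log2)
{
    return x >> scale_log2;
}

/* General case using division, in the style of shade_pixel:
 * x * number of cells / pixmap width */
static inline size_t pixel_to_cell_div(size_t x, size_t cells, size_t pixmap_w)
{
    assert(pixmap_w > 0);
    return (x * cells) / pixmap_w;
}
```
]]></code></pre></div>
            <p>For a 128 cell automata drawn on a 512 pixel wide pixmap,
            <code>scale_log2</code> is 2 and both functions agree for
            every pixel column. The same applies to the rows.</p>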
            <div class="float">
            <img src="/https/richiejp.com/210r.png" alt="Rule 210 Reversible" />
            <div class="figcaption">Rule 210 Reversible</div>
            </div>
            <div class="float">
            <img src="/https/richiejp.com/105r.png" alt="Rule 105 Reversible" />
            <div class="figcaption">Rule 105 Reversible</div>
            </div>
            <div class="footnotes footnotes-end-of-document">
            <hr />
            <ol>
            <li id="fn1"><p>In fact on my CPU it is 256 cells at a time,
            as AVX can be used to operate on four 64bit words at once.<a
            href="#fnref1" class="footnote-back">↩︎</a></p></li>
            <li id="fn2"><p>We perform a rotating shift which moves the
            end bit to the start. This causes the automata to wrap
            around.<a href="#fnref2"
            class="footnote-back">↩︎</a></p></li>
            <li id="fn3"><p>Combined with exclusive <em>or</em>.<a
            href="#fnref3" class="footnote-back">↩︎</a></p></li>
            <li id="fn4"><p>Or we can use xor as the patterns are
            mutually exclusive, so only one may match at a time for each
            bit.<a href="#fnref4" class="footnote-back">↩︎</a></p></li>
            <li id="fn5"><p>That’s me being an arse not Wolfram. Of
            course whenever “obvious” is used in these contexts it’s
            never really correct.<a href="#fnref5"
            class="footnote-back">↩︎</a></p></li>
            </ol>
            </div>
    </div>
  </content>
</entry>
<entry>
  <title>WebGPU Bitbanging 1D Reversible Automata</title>
  <id>https://richiejp.com/1d-reversible-automata-webgpu</id>
  <published>2024-12-31T11:46:55Z</published>
  <updated>2025-01-07T21:33:31Z</updated>
  <link rel="alternate" href="https://richiejp.com/1d-reversible-automata-webgpu" />
  <summary>Running in the browser using a WebGPU WGSL compute
shader</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>This is a reimplementation of my <a
            href="/https/richiejp.com/1d-reversible-automata">Bitbanging 1D Reversible
            Automata</a> in WGSL so that it runs on GPUs inside a
            compute shader. If you are not familiar with 1D automata
            or bitbanging techniques then please see the original
            article.</p>
            <p>WebGPU is still not available on some browsers at the
            time of writing. On my Linux machine I had to set
            <code>--enable-unsafe-webgpu --enable-features=Vulkan</code>
            on Chromium to enable it. However it worked out-of-the-box
            with Chrome on my phone and ancient MacBook Air.</p>
            <p>I’ve included some screen-shots throughout the article if
            you can’t get it to work.</p>
            <h1 id="automata-viewer">Automata viewer</h1>
            <p><label for="device-type">Screen size:</label>
            <select id="device-type">
            <option value="mobile">Mobile</option>
            <option value="desktop">Desktop</option> </select></p>
            <figure>
            <canvas id="gpu-canvas" width="412" height="512">
            </canvas>
            </figure>
            <p><label for="pan-x">Pan X: <span
            id="pan-x-value">0.0</span></label> <input
            type="range"
            id="pan-x"
            min="0.0"
            max="1.0"
            step="0.001"
            value="0.0"
            /></p>
            <p><label for="pan-y">Pan Y: <span
            id="pan-y-value">0.5</span></label> <input
            type="range"
            id="pan-y"
            min="0.0"
            max="1.0"
            step="0.001"
            value="0.5"
            /></p>
            <p><label for="zoom">Zoom: <span
            id="zoom-value">2</span></label> <input
            type="range"
            id="zoom"
            min="1"
            max="8"
            value="2"
            /></p>
            <p><label for="rule">Rule: <span
            id="rule-value">105</span></label> <input
            type="range"
            id="rule"
            min="1"
            max="255"
            value="110"
            /></p>
            <p><label for="seed">Seed: <span
            id="seed-value">114322622</span></label> <input
            type="range"
            id="seed"
            min="-1073741824"
            max="1073741824"
            value="114322622"
            /> <span>Value: </span></p>
            <p><label> <input
                class="mr-2"
                type="checkbox"
                id="reversible"
                checked="checked"
            /> <span class="text-sm text-gray-600">Reversible</span>
            </label></p>
            <button id="export-img">
            Download Image
            </button>
            <p>If you find a particular rule doesn’t produce interesting
            output then try dragging the seed value to a negative value.
            This will populate all of the starting bit fields instead of
            just one.</p>
            <p>You can find the <a
            href="https://gitlab.com/Palethorpe/portfolio/-/blob/master/static/automata-webgpu.js">JavaScript
            source here</a> and the <a
            href="https://gitlab.com/Palethorpe/portfolio/-/blob/master/static/automata.wgsl">WGSL
            source here</a>. Alternatively, you can open your
            browser’s developer tools and read the files there. I
            didn’t use any build tools; it’s just vanilla JS.</p>
            <h1 id="webgpu-background">WebGPU background</h1>
            <p>For some time I have wanted to embed the original
            automata viewer in my website. The original is written in C,
            so I thought I would try compiling it to WASM, but this
            wasn’t very fun.</p>
            <p>At the same time I have been interested in learning to do
            something on the GPU. Even if the GPU is not particularly
            well suited to the calculations involved, it should be more
            than fast enough.</p>
            <p>Originally I tried using WebGL, but after some research I
            couldn’t see a way to perform the calculation in a vertex or
            fragment shader without multiple render calls. It may very
            well be possible, but it’s not obvious how to use the result
            of a previous calculation on a vertex or fragment without
            multiple stages to a pipeline.</p>
            <div class="float">
            <img src="178r-mobile.webp" alt="Rule 178 reversible" />
            <div class="figcaption">Rule 178 reversible</div>
            </div>
            <p>WebGPU, on the other hand, has compute shaders which
            are perfectly suited to this type of calculation. It’s very
            new, so I would have preferred to steer clear of it, but it
            seemed to be the path of least resistance and by far the
            most interesting.</p>
            <p>I could probably get JavaScript to run fast enough for
            1D automata. However,
            using WebGPU paves the way for multi-dimensional automata or
            just displaying 1D automata in 3D.</p>
            <p>WebGPU has its own Rust-flavoured shader language,
            called WGSL, due to some drama with Apple and perhaps other
            reasons. It’s tempting to talk about this, but it’s a waste
            of time for both me and you. The language itself
            seems pretty good, although I don’t have much to base this
            on.</p>
            <p>I learnt enough about WebGPU to make this from <a
            href="https://webgpufundamentals.org/">WebGPU
            Fundamentals</a>, <a
            href="https://www.w3.org/TR/webgpu/">the WebGPU
            specification</a> and <a
            href="https://www.w3.org/TR/WGSL">the WGSL spec</a>. I also
            found <a href="https://compute.toys/">Compute Toys</a>
            useful and noticed that someone had already uploaded <a
            href="https://compute.toys/view/1354">a 1D automata</a>
            albeit not a reversible one.</p>
            <h1 id="wgsl-automata-implementation">WGSL automata
            implementation</h1>
            <p>I didn’t think I would be able to achieve quite the same
            result as my original implementation. Indeed only 32-bit
            ints are supported by WebGPU and there is no vectorisation.
            However, none of that is needed on the GPU and we can still do
            bit banging.</p>
            <p>I don’t know if GPUs are limited in terms of integer
            arithmetic compared to floating point, but it makes no
            difference here. I’d have to line up more automata than
            could be displayed for performance to be an issue.</p>
            <div class="float">
            <img src="233r-mobile.webp" alt="Rule 233 reversible" />
            <div class="figcaption">Rule 233 reversible</div>
            </div>
            <p>Below is the compute shader, which calculates each new
            row 16 x 32 = 512 cells at a time. There
            are 32 bits in an integer and the <em>work group</em> size
            is 16 by default, meaning that the GPU will do 16 u32
            computations in parallel.</p>
            <p>Most GPUs can do at least 64 computations in parallel,
            but it is a bit pointless because we run out of pixels. We
            could write the automata activations to a much larger
            texture and sample multiple texels for each pixel, but I
            suspect it wouldn’t be all that interesting.</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode rust"><code class="sourceCode rust"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="kw">override</span> stride<span class="op">:</span> <span class="dt">u32</span> <span class="op">=</span> <span class="dv">16</span><span class="op">;</span></span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a></span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a><span class="kw">struct</span> Params <span class="op">{</span></span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a>  width<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a>  height<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a>  rule<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb1-7"><a href="#cb1-7" tabindex="-1"></a>  reversible<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb1-8"><a href="#cb1-8" tabindex="-1"></a>  zoom<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb1-9"><a href="#cb1-9" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb1-10"><a href="#cb1-10" tabindex="-1"></a></span>
<span id="cb1-11"><a href="#cb1-11" tabindex="-1"></a><span class="op">@</span>group(<span class="dv">0</span>) <span class="op">@</span>binding(<span class="dv">0</span>) var<span class="op">&lt;</span>storage<span class="op">,</span> read_write<span class="op">&gt;</span> cells<span class="op">:</span> array<span class="op">&lt;</span><span class="dt">u32</span><span class="op">&gt;;</span></span>
<span id="cb1-12"><a href="#cb1-12" tabindex="-1"></a><span class="op">@</span>group(<span class="dv">0</span>) <span class="op">@</span>binding(<span class="dv">1</span>) var<span class="op">&lt;</span>uniform<span class="op">&gt;</span> params<span class="op">:</span> Params<span class="op">;</span></span>
<span id="cb1-13"><a href="#cb1-13" tabindex="-1"></a></span>
<span id="cb1-14"><a href="#cb1-14" tabindex="-1"></a><span class="co">// Take bit n of m and if it is 1 return all 1s, if 0 return all 0s</span></span>
<span id="cb1-15"><a href="#cb1-15" tabindex="-1"></a><span class="op">@</span>must_use <span class="kw">fn</span> bit_to_max(m<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span> n<span class="op">:</span> <span class="dt">u32</span>) <span class="op">-&gt;</span> <span class="dt">u32</span> <span class="op">{</span></span>
<span id="cb1-16"><a href="#cb1-16" tabindex="-1"></a>  <span class="kw">let</span> z<span class="op">:</span> <span class="dt">u32</span> <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb1-17"><a href="#cb1-17" tabindex="-1"></a></span>
<span id="cb1-18"><a href="#cb1-18" tabindex="-1"></a>  <span class="cf">return</span> ((m <span class="op">&gt;&gt;</span> n) <span class="op">&amp;</span> <span class="dv">1</span>) <span class="op">*</span> <span class="op">~</span>z<span class="op">;</span></span>
<span id="cb1-19"><a href="#cb1-19" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb1-20"><a href="#cb1-20" tabindex="-1"></a></span>
<span id="cb1-21"><a href="#cb1-21" tabindex="-1"></a><span class="op">@</span>compute <span class="op">@</span>workgroup_size(stride) <span class="kw">fn</span> cmp(</span>
<span id="cb1-22"><a href="#cb1-22" tabindex="-1"></a>  <span class="op">@</span>builtin(global_invocation_id) id<span class="op">:</span> vec3u</span>
<span id="cb1-23"><a href="#cb1-23" tabindex="-1"></a>) <span class="op">{</span></span>
<span id="cb1-24"><a href="#cb1-24" tabindex="-1"></a>  <span class="kw">let</span> rule <span class="op">=</span> params<span class="op">.</span>rule<span class="op">;</span></span>
<span id="cb1-25"><a href="#cb1-25" tabindex="-1"></a>  <span class="kw">let</span> i <span class="op">=</span> id<span class="op">.</span>x<span class="op">;</span></span>
<span id="cb1-26"><a href="#cb1-26" tabindex="-1"></a>  <span class="kw">let</span> cols <span class="op">=</span> stride<span class="op">;</span></span>
<span id="cb1-27"><a href="#cb1-27" tabindex="-1"></a>  <span class="kw">let</span> rows <span class="op">=</span> arrayLength(<span class="op">&amp;</span>cells) <span class="op">/</span> cols<span class="op">;</span></span>
<span id="cb1-28"><a href="#cb1-28" tabindex="-1"></a></span>
<span id="cb1-29"><a href="#cb1-29" tabindex="-1"></a>  <span class="co">// Wrap the u32 bitfields; if we are on the first or last index</span></span>
<span id="cb1-30"><a href="#cb1-30" tabindex="-1"></a>  <span class="kw">let</span> left_i <span class="op">=</span> select(i <span class="op">-</span> <span class="dv">1</span><span class="op">,</span> cols <span class="op">-</span> <span class="dv">1</span><span class="op">,</span> i <span class="op">==</span> <span class="dv">0</span>)<span class="op">;</span></span>
<span id="cb1-31"><a href="#cb1-31" tabindex="-1"></a>  <span class="kw">let</span> right_i <span class="op">=</span> select(i <span class="op">+</span> <span class="dv">1</span><span class="op">,</span> <span class="dv">0</span><span class="op">,</span> i <span class="op">==</span> cols <span class="op">-</span> <span class="dv">1</span>)<span class="op">;</span></span>
<span id="cb1-32"><a href="#cb1-32" tabindex="-1"></a></span>
<span id="cb1-33"><a href="#cb1-33" tabindex="-1"></a>  <span class="co">// The cells before the current ones, used for reversible automata</span></span>
<span id="cb1-34"><a href="#cb1-34" tabindex="-1"></a>  var prev<span class="op">:</span> <span class="dt">u32</span> <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb1-35"><a href="#cb1-35" tabindex="-1"></a></span>
<span id="cb1-36"><a href="#cb1-36" tabindex="-1"></a>  <span class="cf">for</span> (var j<span class="op">:</span> <span class="dt">u32</span> <span class="op">=</span> <span class="dv">1</span><span class="op">;</span> j <span class="op">&lt;</span> rows<span class="op">;</span> j<span class="op">++</span>) <span class="op">{</span></span>
<span id="cb1-37"><a href="#cb1-37" tabindex="-1"></a>    <span class="kw">let</span> row_off <span class="op">=</span> (j <span class="op">-</span> <span class="dv">1</span>) <span class="op">*</span> cols<span class="op">;</span></span>
<span id="cb1-38"><a href="#cb1-38" tabindex="-1"></a>    <span class="kw">let</span> center <span class="op">=</span> cells[row_off <span class="op">+</span> i]<span class="op">;</span></span>
<span id="cb1-39"><a href="#cb1-39" tabindex="-1"></a>    <span class="co">// move the cell-bits on the left and right into the center</span></span>
<span id="cb1-40"><a href="#cb1-40" tabindex="-1"></a>    <span class="co">// the bits at the edge are taken from the neighboring bitfields</span></span>
<span id="cb1-41"><a href="#cb1-41" tabindex="-1"></a>    <span class="kw">let</span> left  <span class="op">=</span> (center <span class="op">&gt;&gt;</span> <span class="dv">1</span>) <span class="op">|</span> (cells[row_off <span class="op">+</span> left_i ] <span class="op">&lt;&lt;</span> <span class="dv">31</span>)<span class="op">;</span></span>
<span id="cb1-42"><a href="#cb1-42" tabindex="-1"></a>    <span class="kw">let</span> right <span class="op">=</span> (center <span class="op">&lt;&lt;</span> <span class="dv">1</span>) <span class="op">|</span> (cells[row_off <span class="op">+</span> right_i] <span class="op">&gt;&gt;</span> <span class="dv">31</span>)<span class="op">;</span></span>
<span id="cb1-43"><a href="#cb1-43" tabindex="-1"></a></span>
<span id="cb1-44"><a href="#cb1-44" tabindex="-1"></a>    var result<span class="op">:</span> <span class="dt">u32</span> <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb1-45"><a href="#cb1-45" tabindex="-1"></a></span>
<span id="cb1-46"><a href="#cb1-46" tabindex="-1"></a>    <span class="co">// for each of the 8 3-bit patterns...</span></span>
<span id="cb1-47"><a href="#cb1-47" tabindex="-1"></a>    <span class="cf">for</span> (var k<span class="op">:</span> <span class="dt">u32</span> <span class="op">=</span> <span class="dv">0</span><span class="op">;</span> k <span class="op">&lt;</span> <span class="dv">8</span><span class="op">;</span> k<span class="op">++</span>) <span class="op">{</span></span>
<span id="cb1-48"><a href="#cb1-48" tabindex="-1"></a>      <span class="co">// is the cell (de)activated for this pattern?</span></span>
<span id="cb1-49"><a href="#cb1-49" tabindex="-1"></a>      <span class="kw">let</span> on <span class="op">=</span> bit_to_max(rule<span class="op">,</span> k)<span class="op">;</span></span>
<span id="cb1-50"><a href="#cb1-50" tabindex="-1"></a>      <span class="co">// check if the left, center and right cell-bits match the pattern</span></span>
<span id="cb1-51"><a href="#cb1-51" tabindex="-1"></a>      <span class="co">// it&#39;s useful to remember that we are working on 32 cells at once</span></span>
<span id="cb1-52"><a href="#cb1-52" tabindex="-1"></a>      <span class="kw">let</span> l <span class="op">=</span> <span class="op">~</span>(bit_to_max(k<span class="op">,</span> <span class="dv">2</span>) <span class="op">^</span> left)<span class="op">;</span></span>
<span id="cb1-53"><a href="#cb1-53" tabindex="-1"></a>      <span class="kw">let</span> c <span class="op">=</span> <span class="op">~</span>(bit_to_max(k<span class="op">,</span> <span class="dv">1</span>) <span class="op">^</span> center)<span class="op">;</span></span>
<span id="cb1-54"><a href="#cb1-54" tabindex="-1"></a>      <span class="kw">let</span> r <span class="op">=</span> <span class="op">~</span>(bit_to_max(k<span class="op">,</span> <span class="dv">0</span>) <span class="op">^</span> right)<span class="op">;</span></span>
<span id="cb1-55"><a href="#cb1-55" tabindex="-1"></a></span>
<span id="cb1-56"><a href="#cb1-56" tabindex="-1"></a>      <span class="co">// set cell-bits to active if pattern is active and left,</span></span>
<span id="cb1-57"><a href="#cb1-57" tabindex="-1"></a>      <span class="co">// right and center cell-bits all matched</span></span>
<span id="cb1-58"><a href="#cb1-58" tabindex="-1"></a>      result <span class="op">|=</span> l <span class="op">&amp;</span> c <span class="op">&amp;</span> r <span class="op">&amp;</span> on<span class="op">;</span></span>
<span id="cb1-59"><a href="#cb1-59" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb1-60"><a href="#cb1-60" tabindex="-1"></a>    <span class="co">// for reversible automata...</span></span>
<span id="cb1-61"><a href="#cb1-61" tabindex="-1"></a>    result <span class="op">^=</span> prev<span class="op">;</span></span>
<span id="cb1-62"><a href="#cb1-62" tabindex="-1"></a></span>
<span id="cb1-63"><a href="#cb1-63" tabindex="-1"></a>    cells[j <span class="op">*</span> cols <span class="op">+</span> i] <span class="op">=</span> result<span class="op">;</span></span>
<span id="cb1-64"><a href="#cb1-64" tabindex="-1"></a>    workgroupBarrier()<span class="op">;</span></span>
<span id="cb1-65"><a href="#cb1-65" tabindex="-1"></a></span>
<span id="cb1-66"><a href="#cb1-66" tabindex="-1"></a>    prev <span class="op">=</span> select(center<span class="op">,</span> <span class="dv">0</span><span class="op">,</span> params<span class="op">.</span>reversible <span class="op">==</span> <span class="dv">0</span>)<span class="op">;</span></span>
<span id="cb1-67"><a href="#cb1-67" tabindex="-1"></a>  <span class="op">}</span></span>
<span id="cb1-68"><a href="#cb1-68" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
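            <p>To check the bit logic without a GPU, the same row update
            can be sketched on the CPU in JavaScript. This is not part of
            the viewer’s source; <code>bitToMax</code> and
            <code>nextRow</code> are illustrative names that mirror the
            shader, with a <code>Uint32Array</code> standing in for one
            row of the storage buffer and an optional <code>prev</code>
            row standing in for the reversible flag.</p>
            <div class="sourceCode"><pre
            class="sourceCode javascript"><code class="sourceCode javascript">// Broadcast bit n of m to all 32 bits: 0xFFFFFFFF if set, otherwise 0.
function bitToMax(m, n) {
  return (-((m &gt;&gt;&gt; n) &amp; 1)) &gt;&gt;&gt; 0;
}

// Compute the row after `row` for the given rule number (0-255).
// `row` is a Uint32Array where each element holds 32 cells. `prev`, if
// given, is the row before `row` and makes the update reversible.
function nextRow(row, rule, prev) {
  const cols = row.length;
  const next = new Uint32Array(cols);

  for (let i = 0; i &lt; cols; i++) {
    // wrap around at the first and last bit field
    const leftField = row[(i + cols - 1) % cols];
    const rightField = row[(i + 1) % cols];
    const center = row[i];
    // line up each cell's left and right neighbour with the cell itself
    const left = (center &gt;&gt;&gt; 1) | (leftField &lt;&lt; 31);
    const right = (center &lt;&lt; 1) | (rightField &gt;&gt;&gt; 31);

    let result = 0;
    // for each of the 8 3-bit patterns, set the cells that match it
    // when the rule activates that pattern
    for (let k = 0; k &lt; 8; k++) {
      const on = bitToMax(rule, k);
      const l = ~(bitToMax(k, 2) ^ left);
      const c = ~(bitToMax(k, 1) ^ center);
      const r = ~(bitToMax(k, 0) ^ right);
      result |= l &amp; c &amp; r &amp; on;
    }

    // XOR with the row before `row` for reversible automata
    next[i] = result ^ (prev ? prev[i] : 0);
  }

  return next;
}</code></pre></div>
            <p>For example, with rule 90 and a single row containing only
            bit 16, <code>nextRow</code> returns a row with bits 15 and 17
            set. Passing the rows in the opposite order,
            <code>nextRow(current, rule, next)</code>, steps a reversible
            automaton backwards, because XOR undoes itself.</p>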
            <div class="float">
            <img src="82r-mobile.webp" alt="Rule 82 reversible" />
            <div class="figcaption">Rule 82 reversible</div>
            </div>
            <p>The automata are drawn to the screen with a vertex and
            fragment shader pair. The colour gradient in the
            background is the result of the vertex colours being blended
            together. I didn’t start out with the intention of there
            being a gradient, but putting colours on the vertices was in
            the tutorial and I kept it.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode rust"><code class="sourceCode rust"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="kw">struct</span> Verts <span class="op">{</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a>  <span class="op">@</span>builtin(position) pos<span class="op">:</span> vec4f<span class="op">,</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a>  <span class="op">@</span>location(<span class="dv">0</span>) color<span class="op">:</span> vec4f<span class="op">,</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a><span class="op">@</span>vertex <span class="kw">fn</span> vs(<span class="op">@</span>builtin(vertex_index) i <span class="op">:</span> <span class="dt">u32</span>) <span class="op">-&gt;</span> Verts <span class="op">{</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a>  <span class="kw">let</span> pos <span class="op">=</span> array(</span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a>    vec2f(<span class="op">-</span><span class="dv">1</span><span class="op">,</span> <span class="dv">1</span>)<span class="op">,</span></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a>    vec2f(<span class="op">-</span><span class="dv">1</span><span class="op">,-</span><span class="dv">1</span>)<span class="op">,</span></span>
<span id="cb2-10"><a href="#cb2-10" tabindex="-1"></a>    vec2f( <span class="dv">1</span><span class="op">,-</span><span class="dv">1</span>)<span class="op">,</span></span>
<span id="cb2-11"><a href="#cb2-11" tabindex="-1"></a>    vec2f(<span class="op">-</span><span class="dv">1</span><span class="op">,</span> <span class="dv">1</span>)<span class="op">,</span></span>
<span id="cb2-12"><a href="#cb2-12" tabindex="-1"></a>    vec2f( <span class="dv">1</span><span class="op">,</span> <span class="dv">1</span>)<span class="op">,</span></span>
<span id="cb2-13"><a href="#cb2-13" tabindex="-1"></a>    vec2f( <span class="dv">1</span><span class="op">,-</span><span class="dv">1</span>)<span class="op">,</span></span>
<span id="cb2-14"><a href="#cb2-14" tabindex="-1"></a>  )<span class="op">;</span></span>
<span id="cb2-15"><a href="#cb2-15" tabindex="-1"></a>  <span class="kw">let</span> col <span class="op">=</span> array(</span>
<span id="cb2-16"><a href="#cb2-16" tabindex="-1"></a>    vec4f(<span class="dv">0.1</span><span class="op">,</span> <span class="dv">0.5</span><span class="op">,</span> <span class="dv">1</span><span class="op">,</span> <span class="dv">1</span>)<span class="op">,</span></span>
<span id="cb2-17"><a href="#cb2-17" tabindex="-1"></a>    vec4f(<span class="dv">0.1</span><span class="op">,</span> <span class="dv">1</span><span class="op">,</span> <span class="dv">0.5</span><span class="op">,</span> <span class="dv">1</span>)<span class="op">,</span></span>
<span id="cb2-18"><a href="#cb2-18" tabindex="-1"></a>    vec4f(<span class="dv">0.1</span><span class="op">,</span> <span class="dv">0.5</span><span class="op">,</span> <span class="dv">1</span><span class="op">,</span> <span class="dv">1</span>)<span class="op">,</span></span>
<span id="cb2-19"><a href="#cb2-19" tabindex="-1"></a>  )<span class="op">;</span></span>
<span id="cb2-20"><a href="#cb2-20" tabindex="-1"></a></span>
<span id="cb2-21"><a href="#cb2-21" tabindex="-1"></a>  <span class="cf">return</span> Verts(vec4f(pos[i]<span class="op">,</span> <span class="dv">0.0</span><span class="op">,</span> <span class="dv">1.0</span>)<span class="op">,</span> col[i <span class="op">%</span> <span class="dv">3</span>])<span class="op">;</span></span>
<span id="cb2-22"><a href="#cb2-22" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-23"><a href="#cb2-23" tabindex="-1"></a></span>
<span id="cb2-24"><a href="#cb2-24" tabindex="-1"></a><span class="op">@</span>fragment <span class="kw">fn</span> fs(verts<span class="op">:</span> Verts) <span class="op">-&gt;</span> <span class="op">@</span>location(<span class="dv">0</span>) vec4f <span class="op">{</span></span>
<span id="cb2-25"><a href="#cb2-25" tabindex="-1"></a>  <span class="kw">let</span> fields <span class="op">=</span> stride<span class="op">;</span></span>
<span id="cb2-26"><a href="#cb2-26" tabindex="-1"></a>  <span class="kw">let</span> cols <span class="op">=</span> (<span class="dv">32</span> <span class="op">*</span> fields) <span class="op">/</span> params<span class="op">.</span>zoom<span class="op">;</span></span>
<span id="cb2-27"><a href="#cb2-27" tabindex="-1"></a>  <span class="kw">let</span> col_width <span class="op">=</span> <span class="dt">f32</span>(params<span class="op">.</span>height) <span class="op">/</span> <span class="dt">f32</span>(cols)<span class="op">;</span></span>
<span id="cb2-28"><a href="#cb2-28" tabindex="-1"></a>  <span class="kw">let</span> rows <span class="op">=</span> (arrayLength(<span class="op">&amp;</span>cells) <span class="op">/</span> fields) <span class="op">/</span> params<span class="op">.</span>zoom<span class="op">;</span></span>
<span id="cb2-29"><a href="#cb2-29" tabindex="-1"></a>  <span class="kw">let</span> row_height <span class="op">=</span> <span class="dt">f32</span>(params<span class="op">.</span>width) <span class="op">/</span> <span class="dt">f32</span>(rows)<span class="op">;</span></span>
<span id="cb2-30"><a href="#cb2-30" tabindex="-1"></a>  <span class="kw">let</span> i_off <span class="op">=</span> ((<span class="dv">32</span> <span class="op">*</span> fields) <span class="op">-</span> cols) <span class="op">/</span> <span class="dv">2</span><span class="op">;</span></span>
<span id="cb2-31"><a href="#cb2-31" tabindex="-1"></a></span>
<span id="cb2-32"><a href="#cb2-32" tabindex="-1"></a>  <span class="co">// get the cell&#39;s column index</span></span>
<span id="cb2-33"><a href="#cb2-33" tabindex="-1"></a>  <span class="kw">let</span> i <span class="op">=</span> <span class="dt">u32</span>(floor(verts<span class="op">.</span>pos<span class="op">.</span>y <span class="op">/</span> col_width)) <span class="op">+</span> i_off<span class="op">;</span></span>
<span id="cb2-34"><a href="#cb2-34" tabindex="-1"></a>  <span class="co">// get the field index; i / 32</span></span>
<span id="cb2-35"><a href="#cb2-35" tabindex="-1"></a>  <span class="kw">let</span> f <span class="op">=</span> i <span class="op">&gt;&gt;</span> <span class="dv">5</span><span class="op">;</span></span>
<span id="cb2-36"><a href="#cb2-36" tabindex="-1"></a>  <span class="co">// get the bit index; i % 32</span></span>
<span id="cb2-37"><a href="#cb2-37" tabindex="-1"></a>  <span class="kw">let</span> b <span class="op">=</span> <span class="dv">31</span> <span class="op">-</span> (i <span class="op">&amp;</span> <span class="dv">31</span>)<span class="op">;</span></span>
<span id="cb2-38"><a href="#cb2-38" tabindex="-1"></a>  <span class="kw">let</span> j <span class="op">=</span> <span class="dt">u32</span>(floor(verts<span class="op">.</span>pos<span class="op">.</span>x <span class="op">/</span> row_height))<span class="op">;</span></span>
<span id="cb2-39"><a href="#cb2-39" tabindex="-1"></a>  <span class="kw">let</span> a <span class="op">=</span> <span class="dv">0.2</span> <span class="op">+</span> <span class="dv">0.8</span> <span class="op">*</span> <span class="dt">f32</span>(((cells[j <span class="op">*</span> fields <span class="op">+</span> f]) <span class="op">&gt;&gt;</span> b) <span class="op">&amp;</span> <span class="dv">1</span>)<span class="op">;</span></span>
<span id="cb2-40"><a href="#cb2-40" tabindex="-1"></a></span>
<span id="cb2-41"><a href="#cb2-41" tabindex="-1"></a>  <span class="cf">return</span> a <span class="op">*</span> verts<span class="op">.</span>color<span class="op">;</span></span>
<span id="cb2-42"><a href="#cb2-42" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>The vertex shader arranges the vertices into a rectangle
            made out of two triangles. Each fragment then gets a colour
            value that is interpolated between the vertices of the
            triangle the fragment falls on.</p>
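            <p>The interpolation itself happens in fixed-function
            hardware, but it can be sketched in JavaScript as a weighted
            average. This assumes plain linear (barycentric)
            interpolation, which I believe is what the default
            interpolation amounts to here because every vertex has a
            <code>w</code> of 1. The names <code>interpolateColor</code>
            and <code>vertexColors</code> are only for illustration.</p>
            <div class="sourceCode"><pre
            class="sourceCode javascript"><code class="sourceCode javascript">// Weighted average of three RGBA vertex colours. The weights are the
// fragment's barycentric coordinates and are expected to sum to 1.
function interpolateColor(weights, colors) {
  const out = [0, 0, 0, 0];

  for (let v = 0; v &lt; 3; v++) {
    for (let ch = 0; ch &lt; 4; ch++) {
      out[ch] += weights[v] * colors[v][ch];
    }
  }

  return out;
}

// The three vertex colours from the vertex shader
const vertexColors = [
  [0.1, 0.5, 1, 1],
  [0.1, 1, 0.5, 1],
  [0.1, 0.5, 1, 1],
];</code></pre></div>
            <p>A fragment sitting on a vertex has a weight of 1 for that
            vertex and gets its colour unchanged, while a fragment halfway
            along an edge gets an even mix of the two end points.</p>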
            <div class="float">
            <img src="82r-alt.webp"
            alt="Rule 82r again, but with different params" />
            <div class="figcaption">Rule 82r again, but with different
            params</div>
            </div>
            <p>The automata are drawn by scaling each fragment’s colour
            and alpha depending on whether it falls inside an active
            cell. When the number of fragments equals the number of
            cells, there is one cell per fragment and therefore each
            fragment displays exactly one bit.</p>
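            <p>The index arithmetic in the fragment shader can be
            sketched in JavaScript as follows. The helper names
            <code>cellAt</code> and <code>visibleColumns</code> are
            hypothetical; the first mirrors the field and bit lookup and
            the second mirrors how the zoom level picks a centred window
            of columns, using <code>Math.floor</code> in place of WGSL’s
            integer division.</p>
            <div class="sourceCode"><pre
            class="sourceCode javascript"><code class="sourceCode javascript">// Returns 1 if the cell at (row, col) is active, otherwise 0.
// `cells` is a Uint32Array with `fields` bit fields per row.
function cellAt(cells, fields, row, col) {
  const f = col &gt;&gt;&gt; 5;        // index of the bit field; col / 32
  const b = 31 - (col &amp; 31);  // column 0 is the most significant bit
  return (cells[row * fields + f] &gt;&gt;&gt; b) &amp; 1;
}

// Number of visible columns and the first visible column for a zoom
// level, so that the visible window is centred.
function visibleColumns(fields, zoom) {
  const total = 32 * fields;
  const cols = Math.floor(total / zoom);
  const offset = Math.floor((total - cols) / 2);
  return { cols, offset };
}</code></pre></div>
            <p>With the default 16 fields and a zoom of 2, this gives 256
            visible columns starting at column 128.</p>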
            <h1 id="conclusion">Conclusion</h1>
            <p>I am excited to get something working with WebGPU and,
            while it is complicated to get compute shaders running,
            I think there is a lot of potential there. When I get the chance
            I’d like to look into higher-dimensional automata and how
            these could be computed and displayed with WebGPU. I seem to
            remember that adding dimensions makes finding interesting
            rules far more difficult, but it would be cool to see
            something in 3D.</p>
            <div class="float">
            <img src="250r-desktop.png" alt="Rule 250 reversible" />
            <div class="figcaption">Rule 250 reversible</div>
            </div>
            <div class="float">
            <img src="58r-desktop.png" alt="Rule 58 reversible" />
            <div class="figcaption">Rule 58 reversible</div>
            </div>
            <div class="float">
            <img src="18r-desktop.png" alt="Rule 18 reversible" />
            <div class="figcaption">Rule 18 reversible</div>
            </div>
    </div>
  </content>
</entry>
<entry>
  <title>List of software recruiters</title>
  <id>https://richiejp.com/agent-list</id>
  <published>2023-11-09T10:14:23Z</published>
  <updated>2023-11-09T10:14:23Z</updated>
  <link rel="alternate" href="https://richiejp.com/agent-list" />
  <summary>Some agents for finding jobs and contracts</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <h1 id="avantirc.com">Avantirc.com</h1>
            <ul>
            <li>Has embedded systems roles</li>
            <li>Some contract roles?</li>
            <li>Some remote roles</li>
            </ul>
            <h1 id="ioassociates.co.uk">ioassociates.co.uk</h1>
            <ul>
            <li>Has contract roles</li>
            <li>Mainly hybrid</li>
            <li>Seem confused</li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>Winning a rare data race</title>
  <id>https://richiejp.com/a-rare-data-race</id>
  <published>2020-06-29T19:30:22+01:00</published>
  <updated>2021-11-24T08:50:57Z</updated>
  <link rel="alternate" href="https://richiejp.com/a-rare-data-race" />
  <summary>How to reproduce a rare data race with the Fuzzy Sync race
exposition library.</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p><em>Don’t like reading? <a
            href="https://youtu.be/P1lstl-NwWQ">Video
            accompaniment</a>.</em></p>
            <h1 id="fuzzy-sync">Fuzzy Sync</h1>
            <p>Fuzzy Sync is a library we have developed for the Linux
            Test Project (LTP). It allows us to synchronise events in
            time, thus triggering bugs which are dependent on the
            outcome of one or more data races. Additionally it can
            introduce randomised delays (the fuzzy part) to achieve
            outcomes that would otherwise be extremely rare.</p>
            <p>Cyril Hrubis <a
            href="https://people.kernel.org/metan/how-to-trigger-races-reliably">previously
            introduced the library</a> on the kernel.org blog along with
            one of the tests we created it for. In this article I will
            present a contrived data race which demonstrates why a delay
            is sometimes needed in order to synchronise the critical
            sections.</p>
            <p><strong>Update</strong>: I extracted Fuzzy Sync from the
            LTP into a <a
            href="https://gitlab.com/Palethorpe/fuzzy-sync">standalone
            library</a>.</p>
            <p>For reference the Fuzzy Sync library is just a single
            header file which you can see <a
            href="https://github.com/linux-test-project/ltp/blob/master/include/tst_fuzzy_sync.h">here</a>.
            There is more documentation than code. It is entirely a
            user-land library, not tied to Linux in any way, although it
            does use a few other parts of the LTP library so it is not
            currently stand-alone. It certainly does not inject errors
            or delays into the kernel or instrument it.</p>
            <p>I think this is an obvious thing to do if you are writing
            an exploit or regression tests involving a data race, yet I
            couldn’t find a library which already did it. I have since
            found various tools with something similar embedded, though;
            for example, the Syzkaller executor has <em>system call
            collision</em>.</p>
            <p>Note that when I say ‘rare’, I mean rare in my local
            context. In the wild someone will accidentally (or
            deliberately for that matter) discover a way to trigger it
            on a regular basis. There are a lot of systems in the wild
            running a lot of different software.</p>
            <h1 id="the-race">The Race</h1>
            <p>We have a simple race between two threads
            <strong>A</strong> and <strong>B</strong>, trying to read
            and write to the same variable.</p>
            <p>Let’s look at the race code. Most of the boilerplate is
            omitted; only the main loops for threads
            <strong>A</strong> and <strong>B</strong> are shown, along
            with some variable definitions. You can see <a
            href="https://github.com/richiejp/ltp/blob/fzsync-demo/lib/newlib_tests/tst_fuzzy_sync_demo.c">the
            full source here</a>.</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="dt">static</span> <span class="kw">struct</span> tst_fzsync_pair pair<span class="op">;</span></span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a><span class="dt">static</span> <span class="dt">volatile</span> <span class="dt">char</span> winner<span class="op">;</span></span></code></pre></div>
            <p>The data race revolves around the <code>winner</code>
            global variable which we access in both threads. <em>Roughly
            speaking</em>, marking it as volatile tells the compiler
            both that the value can change at any time and that setting
            the value may have side effects. In this case, it prevents
            the compiler from removing the <code>if</code> statement in
            the code below. Static analysis will show that
            <code>winner</code> is never equal to <code>'B'</code>
            unless the variable can change at any time.</p>
            <p><code>pair</code> contains the Fuzzy Sync library’s state
            and is passed to most library functions. It is accessed from
            both threads as well, but is protected with memory barriers
            and the <a
            href="https://github.com/richiejp/ltp/blob/fzsync-demo/include/tst_atomic.h">atomic
            API</a>, which is far preferable to marking it as volatile.
            Generally speaking, you shouldn’t just mark stuff as
            volatile and then access it from multiple threads like any
            other variable. However it makes the demo code easier to
            read and I’m only running it on x86 so…</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="cf">while</span> <span class="op">(</span>tst_fzsync_run_a<span class="op">(&amp;</span>pair<span class="op">))</span> <span class="op">{</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a>    winner <span class="op">=</span> <span class="ch">&#39;A&#39;</span><span class="op">;</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a>    tst_fzsync_start_race_a<span class="op">(&amp;</span>pair<span class="op">);</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>winner <span class="op">==</span> <span class="ch">&#39;A&#39;</span> <span class="op">&amp;&amp;</span> winner <span class="op">==</span> <span class="ch">&#39;B&#39;</span><span class="op">)</span></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a>        winner <span class="op">=</span> <span class="ch">&#39;A&#39;</span><span class="op">;</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a>    tst_fzsync_end_race_a<span class="op">(&amp;</span>pair<span class="op">);</span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>This is the loop for thread <strong>A</strong>.
            <code>tst_fzsync_run_a()</code> returns <code>true</code> if
            the test should continue and <code>false</code> otherwise.
            <code>tst_fzsync_start_race_a()</code> and
            <code>tst_fzsync_end_race_a()</code> delineate the block of
            code where we think the race can happen. Usually this is
            just a single statement, such as a function or system call,
            which is external to our reproducer code. Usually we would
            have some more complex setup between
            <code>...run_a/b()</code> and
            <code>...start_race_a/b()</code>; if we didn’t, it might
            be better to dispense with Fuzzy Sync and just use a pair of
            plain loops. That is actually the case here, but for now
            let’s just pretend something complex and time consuming
            needs to be done before each race.</p>
            <p>Before the race begins we set <code>winner</code> to
            <code>'A'</code>, then during the race we branch on the
            seemingly impossible condition:</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a>winner <span class="op">==</span> <span class="ch">&#39;A&#39;</span> <span class="op">&amp;&amp;</span> winner <span class="op">==</span> <span class="ch">&#39;B&#39;</span></span></code></pre></div>
            <p>Clearly if <code>winner</code> is <code>'A'</code> then
            it is not <code>'B'</code>, unless the value of
            <code>winner</code> changes halfway through the condition.
            It can, because the condition is actually two separate
            comparisons in C, each of which reads <code>winner</code>,
            and it is compiled to a number of assembly instructions.
            In other words it is not <em>atomic</em>.</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode nasm"><code class="sourceCode nasm"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a>0x00000000004055d2 &lt;+1026&gt;: movzbl <span class="bn">0x1b297</span><span class="op">(%</span><span class="kw">rip</span><span class="op">),%</span><span class="kw">eax</span>        # <span class="bn">0x420870</span> <span class="op">&lt;</span>winner<span class="op">&gt;</span></span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a>0x00000000004055d9 &lt;+1033&gt;: <span class="kw">cmp</span>    <span class="op">$</span><span class="bn">0</span>x41<span class="op">,%</span><span class="kw">al</span></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a>0x00000000004055db &lt;+1035&gt;: <span class="cf">je</span>     <span class="bn">0x4057e8</span> <span class="op">&lt;</span>run<span class="op">+</span><span class="dv">1560</span><span class="op">&gt;</span></span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a></span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a>... &lt;500+ lines of fuzzy sync assembly redacted<span class="op">&gt;</span> <span class="fu">...</span></span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a></span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a>0x00000000004057e8 &lt;+1560&gt;: movzbl <span class="bn">0x1b081</span><span class="op">(%</span><span class="kw">rip</span><span class="op">),%</span><span class="kw">eax</span>        # <span class="bn">0x420870</span> <span class="op">&lt;</span>winner<span class="op">&gt;</span></span>
<span id="cb4-8"><a href="#cb4-8" tabindex="-1"></a>0x00000000004057ef &lt;+1567&gt;: <span class="kw">cmp</span>    <span class="op">$</span><span class="bn">0</span>x42<span class="op">,%</span><span class="kw">al</span></span>
<span id="cb4-9"><a href="#cb4-9" tabindex="-1"></a>0x00000000004057f1 &lt;+1569&gt;: <span class="cf">jne</span>    <span class="bn">0x4055e1</span> <span class="op">&lt;</span>run<span class="op">+</span><span class="dv">1041</span><span class="op">&gt;</span></span>
<span id="cb4-10"><a href="#cb4-10" tabindex="-1"></a>0x00000000004057f7 &lt;+1575&gt;: movb   <span class="op">$</span><span class="bn">0</span>x41<span class="op">,</span><span class="bn">0x1b072</span><span class="op">(%</span><span class="kw">rip</span><span class="op">)</span>        # <span class="bn">0x420870</span> <span class="op">&lt;</span>winner<span class="op">&gt;</span></span>
<span id="cb4-11"><a href="#cb4-11" tabindex="-1"></a>0x00000000004057fe &lt;+1582&gt;: jmpq   <span class="bn">0x4055e1</span> <span class="op">&lt;</span>run<span class="op">+</span><span class="dv">1041</span><span class="op">&gt;</span></span></code></pre></div>
            <p>This is the x86_64 assembly which checks and sets
            <code>winner</code>. The first two instructions load
            <code>winner</code> and compare it to <code>'A'</code>; the
            third then jumps to the second half of the condition if
            it is equal to <code>'A'</code>. The second half is much
            like the first. Importantly we have two loads of
            <code>winner</code> (the two <code>movzbl</code>s) which
            <em>chronologically</em> happen a few instructions
            apart.</p>
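            <p>To make the gap between the two loads concrete, here is a
            minimal single-threaded sketch. It is not the demo’s code:
            the names <code>write_b()</code>, <code>a_wins()</code> and
            <code>between_loads</code> are hypothetical, and the hook
            merely stands in for thread <strong>B</strong>’s store
            landing between the two <code>movzbl</code>s.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stdbool.h&gt;
#include &lt;stddef.h&gt;

static volatile char winner;

/* Hypothetical stand-in for thread B's store. */
static void write_b(void)
{
    winner = 'B';
}

/*
 * Thread A's condition with the two loads of winner made explicit.
 * between_loads (an assumption of this sketch) runs in the gap
 * between them; pass NULL to leave winner undisturbed.
 */
static bool a_wins(void (*between_loads)(void))
{
    winner = 'A';

    if (winner != 'A')        /* first load and compare */
        return false;

    if (between_loads != NULL)
        between_loads();      /* B's store would have to land here */

    return winner == 'B';     /* second load and compare */
}</code></pre></div>
            <p>With no interleaving, <code>a_wins(NULL)</code> returns
            <code>false</code>; with <code>a_wins(write_b)</code> the
            store lands in the gap and the ‘impossible’ condition holds.
            In the real demo nothing forces that interleaving, which is
            why the outcome is rare.</p>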
            <p>Interestingly GCC has decided to move the second half of
            the statement towards the end of the program. I guess this
            is related to some CPU instruction cache optimisation, hint
            to branch prediction, register juggling or something else.
            If the behavior of our program is dependent on a data race,
            then this <em>kind</em> of optimisation can have a
            significant effect on the program behavior. A data race
            outcome which is highly improbable if the kernel is compiled
            with one compiler version may be highly probable when
            compiled with another. In this particular case, it won’t
            make a difference because the timings are dominated by
            <code>nanosleep</code>.</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="kw">struct</span> timespec delay <span class="op">=</span> <span class="op">{</span> <span class="dv">0</span><span class="op">,</span> <span class="dv">1</span> <span class="op">};</span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a><span class="cf">while</span> <span class="op">(</span>tst_fzsync_run_b<span class="op">(&amp;</span>pair<span class="op">))</span> <span class="op">{</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a>    tst_fzsync_start_race_b<span class="op">(&amp;</span>pair<span class="op">);</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a>    nanosleep<span class="op">(&amp;</span>delay<span class="op">,</span> NULL<span class="op">);</span></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a>    winner <span class="op">=</span> <span class="ch">&#39;B&#39;</span><span class="op">;</span></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a>    tst_fzsync_end_race_b<span class="op">(&amp;</span>pair<span class="op">);</span></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>We have a similar loop for thread <strong>B</strong>. The
            big difference is that we make a system call and then just set
            <code>winner</code> to <code>'B'</code>.
            <code>nanosleep</code> here will attempt to sleep for one
            nanosecond, which basically means it won’t sleep at all as
            context switching takes far longer than a nanosecond, so by
            the time it gets into kernel land it will have already spent
            too long.</p>
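            <p>If you want to check this on your own machine, a rough
            sketch like the following times a single
            <code>nanosleep</code> request with
            <code>clock_gettime</code>. The helper name
            <code>measure_nanosleep()</code> is hypothetical and the
            figure it reports will vary with hardware and kernel
            configuration.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#define _POSIX_C_SOURCE 200809L
#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;
#include &lt;time.h&gt;

/*
 * Hypothetical helper: returns the nanoseconds that elapse around a
 * single nanosleep() request of request_ns nanoseconds (which must be
 * below one second), or -1 if the clock cannot be read.
 */
static int64_t measure_nanosleep(long request_ns)
{
    struct timespec delay = { 0, request_ns };
    struct timespec before, after;

    if (clock_gettime(CLOCK_MONOTONIC, &amp;before) != 0)
        return -1;

    nanosleep(&amp;delay, NULL);

    if (clock_gettime(CLOCK_MONOTONIC, &amp;after) != 0)
        return -1;

    return (int64_t)(after.tv_sec - before.tv_sec) * 1000000000LL
        + (after.tv_nsec - before.tv_nsec);
}</code></pre></div>
            <p>Printing <code>measure_nanosleep(1)</code> a few times
            should show how much longer than the requested nanosecond
            the call really takes.</p>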
            <p>In a real test we usually have one or more syscalls in
            <strong>A</strong> and <strong>B</strong>. The race happens
            somewhere deep in the kernel and all we can do is ‘collide’
            the system calls. One thread’s system calls may
            take much longer to reach the race critical section than the
            other. This is what we are simulating here; we have a very
            large difference in the time it takes <strong>A</strong> to
            reach the race critical section compared to
            <strong>B</strong>.</p>
            <p>To make matters worse the race window is very short
            relative to the time it takes to context switch in and out
            of the kernel. We could move the sleep to before
            <code>tst_fzsync_start_race_b</code>, but often in a real
            reproducer the ideal place to demarcate the race window is
            somewhere inside the kernel.</p>
            <h1 id="visualising-the-race">Visualising the race</h1>
            <p>The race has been run 100,000 times and timestamps taken
            of when <strong>A</strong> and <strong>B</strong> start and
            finish. The differences between <strong>A</strong> and
            <strong>B</strong> are shown below. The red circle indicates
            where <strong>A</strong> won the race, that is, set
            <code>winner = 'A'</code>. In this case <strong>A</strong>
            only won once.</p>
            <p>Each time difference is represented by a small black
            dot.</p>
            <div class="float">
            <img src="/https/richiejp.com/start_difference.png" style="width:100.0%"
            alt="Difference Between Starts" />
            <div class="figcaption">Difference Between Starts</div>
            </div>
            <p>That is the time difference between when
            <strong>A</strong> exited
            <code>tst_fzsync_start_race_a()</code> and when
            <strong>B</strong> exited
            <code>tst_fzsync_start_race_b()</code>. Initially both
            <strong>A</strong> and <strong>B</strong> start at about the
            same time; after about 13000 loops the random delay of
            <strong>A</strong> is introduced.</p>
            <p>Strictly speaking, the possible delay range includes a
            small delay to <strong>B</strong> as well, so that
            <strong>A</strong> ends as <strong>B</strong> starts, but
            <strong>A</strong> is so short we can’t see it on this
            graph.</p>
            <p>You can see that when the start of <strong>B</strong> is
            delayed enough we may hit the race condition. It is close to
            the upper bound of the delay range.</p>
            <div class="float">
            <img src="/https/richiejp.com/end_difference.png" style="width:100.0%"
            alt="Difference Between Ends" />
            <div class="figcaption">Difference Between Ends</div>
            </div>
            <p>On the left of this graph we can see that
            <strong>B</strong> is consistently ~55000 nanoseconds behind
            <strong>A</strong> by the finish. If you look very closely
            you will find outliers where this is not the case, but we
            would have to rerun the race for a long time for
            <strong>A</strong> to win (on my setup).</p>
            <p>The delay is calculated once a minimum number of loops
            have been completed and the mean deviation of various
            timings settles down. On the left you can see there is some
            <em>natural</em> jitter, but not much relative to the
            overall difference in end times. So the delay is calculated
            after relatively few loops. If the <em>natural</em> jitter
            were relatively large, then we wouldn’t need to artificially
            introduce more.</p>
            <p>Once the delay is introduced we can see that
            <strong>A</strong> and <strong>B</strong> are far more
            likely to finish at the same time. The race window for
            <strong>A</strong> to win is almost precisely at a
            difference of 0.</p>
            <h1 id="visualising-the-race-2">Visualising the race 2</h1>
            <p>Below is a more abstract (and patently not-to-scale)
            visualisation of the two major epochs of Fuzzy Sync
            execution.</p>
            <div class="float">
            <img src="/https/richiejp.com/race-time-diagrams.svg"
            alt="Race Time Diagrams" />
            <div class="figcaption">Race Time Diagrams</div>
            </div>
            <p>In the first case we are just using spin locks which act
            as synchronising barriers for entering and exiting the race
            (see <code>tst_fzsync_pair_wait()</code> in the code).</p>
            <p>When one thread reaches the end of the race before the
            other, it counts the number of times it spins waiting. We
            use this value later to calculate the delay range.</p>
            <p>The second diagram shows the ideal delay for
            <strong>A</strong> to win the race; however, we actually
            calculate a delay range and randomly pick values from it. In
            this example we know exactly where the race occurs so we
            could dispense with some of the stochasticity and narrow the
            delay range to just attempt the synchronisation shown.</p>
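            <p>As a rough illustration of picking a delay from a range,
            here is a minimal sketch. It is not Fuzzy Sync’s actual
            calculation: <code>pick_delay()</code> is a hypothetical
            helper, and the units are assumed to be spin-loop
            iterations.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stdlib.h&gt;

/*
 * Hypothetical helper, not part of the Fuzzy Sync API: pick a delay
 * from the inclusive range [min_delay, max_delay]. min_delay must not
 * exceed max_delay and the range is assumed to be narrower than
 * RAND_MAX. The modulo adds a slight bias, which is fine for a sketch.
 */
static long pick_delay(long min_delay, long max_delay)
{
    unsigned long span = (unsigned long)(max_delay - min_delay) + 1;

    return min_delay + (long)((unsigned long)rand() % span);
}</code></pre></div>
            <p>Narrowing the range to a single known-good value, for
            example <code>pick_delay(42, 42)</code>, removes the
            randomness entirely, which is the trade-off described
            above.</p>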
            <p>However, for most tests, we don’t have this level of
            knowledge about the race condition on all possible kernel
            versions and configurations where the bug exists. In fact we
            sometimes don’t know much at all about the call tree and
            timings within the kernel, we just know that calling syscall
            X and Y at the same time makes bad things happen.</p>
            <p>For an attacker focusing on a particular system though,
            you can assume that they would be able to narrow the delay
            range down to a value which allows them to trigger the race
            in far fewer iterations. Fuzzy Sync allows this in some
            cases using <code>tst_fzsync_pair_add_bias()</code> which I
            may cover in a future article.</p>
            <h1 id="reproducers-using-fuzzy-sync">Reproducers using
            Fuzzy Sync</h1>
            <p>At the time of writing there are several LTP tests using
            Fuzzy Sync:</p>
            <ul>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/cve/cve-2014-0196.c">cve-2014-0196</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/cve/cve-2016-7117.c">cve-2016-7117</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/cve/cve-2017-2671.c">cve-2017-2671</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/pty/pty03.c">pty03
            (cve-2020-14416)</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sound/snd_seq01.c">snd_seq01
            (cve-2018-7566)</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sound/snd_timer01.c">snd_timer01
            (cve-2017-1000380)</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/bind/bind06.c">bind06
            (cve-2018-18559)</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/inotify/inotify09.c">inotify09</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/ipc/shmctl/shmctl05.c">shmctl05</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/sendmsg/sendmsg03.c">sendmsg03
            (cve-2017-17712)</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/setsockopt/setsockopt06.c">setsockopt06
            (cve-2016-8655)</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/setsockopt/setsockopt07.c">setsockopt07
            (cve-2017-1000111)</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/timerfd/timerfd_settime02.c">timerfd_settime02
            (cve-2017-10661)</a></li>
            <li><a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/network/packet/fanout01.c">fanout01
            (cve-2017-15649)</a></li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>Ayup! Tightening the remote build and deploy loop</title>
  <id>https://richiejp.com/ayup-announcement</id>
  <published>2024-08-12T14:05:52+01:00</published>
  <updated>2024-08-12T14:39:36+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/ayup-announcement" />
  <summary>Ayup is a new open source project I have been working on that
allows you to deploy from source on a remote or local machine.</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <h1
            id="from-minimal-vms-to-aiml-gpu-rigs-with-kubernetes">From
            minimal VMs to AI/ML GPU rigs with Kubernetes</h1>
            <p>A recurring theme throughout my career is frustration at
            not being able to easily utilise a remote machine or a local
            VM.</p>
            <p>In some cases this was because I had a local VM running a
            stripped down <a href="https://github.com/richiejp/m">kernel
            with very little userland</a> in it. Creating such an
            environment was useful for quickly building and redeploying
            the Linux kernel, but left very little in userland to work
            with. It moved a lot of stuff outside the machine where the
            software is being deployed.</p>
            <p>In others I had a full blown Linux distro on a remote
            machine, but quickly getting updated versions of my software
            on to there was still a pain, often involving SSH’ing into
            the machine and doing a Git pull or opening a file in
            Vim/Emacs and tweaking it.</p>
            <p>Even worse are occasions where I had to upload a Docker
            image to a repository just so I can run the software in a
            Kubernetes cluster. In this case there are a bunch of tools
            to help (e.g. Skaffold). However the mind boggles at how
            complicated anything involving Kubernetes can get.</p>
            <p>In pretty much all cases I have a local source repository
            with some code in. That could be a Linux Test Project test
            or an LLM example for the Prem Kubernetes operator. I then
            need to get that software built and into a remote machine
            which has the appropriate hardware or OS level software.</p>
            <p>There are plenty of CI/CD options out there, but they
            lack the kind of tight feedback loop I want in any given
            situation. I certainly don’t want to involve GitHub or
            GitLab when tweaking one line of code in a series of
            experiments.</p>
            <p>There are some tools that do a very good job, but they
            typically require a fair amount of setup, to the extent that
            I think it’s more effort than it’s worth and resort to some
            variation of logging in over serial and doing things
            manually.</p>
            <h1 id="ayup-a-solution-for-some-of-this">Ayup: A solution
            for some of this</h1>
            <p>What I generally want in these cases is a tool I can
            easily install and connect to on these systems that does the
            full CI/CD in a very rapid manner. That means it has to be
            statically compiled with just the kernel as its dependency,
            it has to have an easy, secure connection mechanism, and
            it needs to cache builds.</p>
            <p>Importantly it needs to do the good stuff by default.
            It’s amazing what Nix, Docker or Kubernetes can do. The
            tools and the features are all there, it’s just they’re not
            put together in the right way for the user experience I
            envision.</p>
            <p>So here is <a
            href="https://github.com/premAI-io/Ayup">Ayup</a>, a build
            and deployment tool based on Buildkit and Containerd. Its
            initial focus is on AI/ML projects, but in theory there’s
            nothing stopping it from deploying an LTP test.</p>
            <p>Presently it doesn’t bundle Buildkit or Containerd, but
            that’s what a bunch of Kubernetes distros do and it results
            in a 200MB executable. This can be dumped as a single
            executable on a system and off we go.</p>
            <p>There are a bunch of other problems that Ayup is trying
            to tackle that you can see on <a
            href="https://blog.premai.io/open-source-release-ayup-facing-the-deployment-nightmare/">Prem’s
            blog</a>.</p>
    </div>
  </content>
</entry>
<entry>
  <title>A barely HTTP/2 server in Zig</title>
  <id>https://richiejp.com/barely-http2-zig</id>
  <published>2023-05-03T17:26:46+01:00</published>
  <updated>2023-06-16T14:33:24+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/barely-http2-zig" />
  <summary>How does Zig fare against HTTP/2 and where is the simple
binary HTTP replacement?</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <h1 id="intro">Intro</h1>
            <p>Don’t get me wrong, HTTP/2 is a clear improvement on
            HTTP/1.x, but where is the entry level version? It’s
            certainly not HTTP/3, that’s for sure.</p>
            <p>The general wisdom amongst any new technology group, such
            as Zig, is to stick with text based HTTP and hide behind a
            proxy. This is the logical thing to do because HTTP/1.1 has a
            lower barrier to entry, primarily because HTTP/2 requires
            HPACK, which we’ll get into in a moment.</p>
            <p>To see why HTTP/1.1 is an attractive option, take a look
            at my article on <a href="/https/richiejp.com/zig-vs-c-mini-http-server">Zig Vs
            C - Minimal HTTP server</a>. It’s relatively easy to slap
            together a non-compliant HTTP server. My static file server
            is an extreme example, but practically you can aim for
            partial compliance then hide behind a normalising HTTP
            proxy.</p>
            <p>In theory and ostensibly in practice the proxy takes
            whatever awful HTTP is thrown at you from the internet, then
            converts it into some manageable subset. It can also handle
            TLS termination, so you can leave all that bad jazz to a
            third party.</p>
            <p>There are two problems with this: the less important
            is performance and the more important is security.</p>
            <p>In my semi-sophisticated opinion, both issues have the
            same root cause. Essentially the length of an HTTP message is
            not known until you parse a variable length list of variable
            length headers.</p>
            <p>In theory you should then know how long the body is.
            However there is confusion over what specifies the body
            length which leads to <a
            href="https://portswigger.net/web-security/request-smuggling">request
            smuggling</a>.</p>
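            <p>The classic case of this confusion is a request carrying
            both a <code>Content-Length</code> and a
            <code>Transfer-Encoding</code> header, where a proxy and the
            server behind it may each trust a different one. Below is a
            minimal sketch of spotting that, assuming the header block
            has already been separated from the request line. The names
            <code>has_header()</code> and
            <code>framing_is_ambiguous()</code> are hypothetical and
            this is nowhere near a compliant parser.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stdbool.h&gt;
#include &lt;string.h&gt;
#include &lt;strings.h&gt;

/*
 * Returns true if headers (CRLF-separated header lines, without the
 * request line) contains a field called name, ignoring case.
 */
static bool has_header(const char *headers, const char *name)
{
    size_t name_len = strlen(name);
    const char *line = headers;

    while (line != NULL &amp;&amp; *line != '\0') {
        if (strncasecmp(line, name, name_len) == 0 &amp;&amp;
            line[name_len] == ':')
            return true;

        /* Move to the start of the next line, if there is one. */
        line = strstr(line, "\r\n");
        if (line != NULL)
            line += 2;
    }

    return false;
}

/* Two competing sources for the body length: refuse rather than guess. */
static bool framing_is_ambiguous(const char *headers)
{
    return has_header(headers, "Content-Length") &amp;&amp;
           has_header(headers, "Transfer-Encoding");
}</code></pre></div>
            <p>A server hiding behind a proxy could reject any request
            for which <code>framing_is_ambiguous()</code> returns
            <code>true</code> instead of picking one interpretation.</p>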
            <p>Request smuggling and desync attacks are enabled by the
            presence of proxies. Allegedly the problem gets worse with
            the <a
            href="https://portswigger.net/research/http2">introduction
            of HTTP/2</a>. However at the end of the linked article it
            states:</p>
            <blockquote>
            <p>If you’re setting up a web application, avoid HTTP/2
            downgrading - it’s the root cause of most of these
            vulnerabilities. Instead, use HTTP/2 end to end.</p>
            </blockquote>
            <p>OK, so how hard can it be to implement HTTP/2 then? This
            is something I was excited to find out about. Not least
            because it is an excuse to try out Zig for implementing
            network protocols. With the eventual goal of doing some
            security research into crusty old tech using exciting new
            tech (a vague plan of mine).</p>
            <p>Zig isn’t just exciting in my opinion; the traffic to my
            blog indicates it is the most interesting thing. I also
            believe it is practical. Surprisingly my original HTTP
            server still compiles and runs on the latest Zig from Git.
            It seems the build API has changed quite a bit, but for an
            unstable language it is quite stable if you stay away from
            new features.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>The source code for the parser and a static file server
            is here: <a
            href="https://github.com/richiejp/barely-http2">github.com/richiejp/barely-http2</a></p>
            </div>
            </div>
            <p>Here is some sample output for a request from curl; it’s
            not pretty:</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="ex">zig</span> run src/self-serve2.zig <span class="at">--</span> ~/portfolio/public</span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a><span class="ex">info:</span> Listening on 127.0.0.1:9001<span class="kw">;</span> <span class="ex">press</span> Ctrl-C to exit...</span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a><span class="ex">info:</span> Accepted Connection from: 127.0.0.1:48274</span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&lt;&lt;&lt;</span> Got preface!</span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&gt;&gt;&gt;</span> Sending server preface</span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&lt;&lt;&lt;</span> http2.FrameHdr{ .length = 18, .type = http2.FrameType.settings, .flags = http2.FrameFlags{ .settings = http2.SettingsFlags{ .ack = false, .unused = 0 } }, .r = false, .id = 0 } http2.Payload{ .settings = http2.SettingsPayload{ .settings = { 0, 3, 0, 0, 0, 100, 0, 4, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0 } } }</span>
<span id="cb1-7"><a href="#cb1-7" tabindex="-1"></a><span class="ex">info:</span>     http2.Setting{ .maxConcurrentStreams = 100 }</span>
<span id="cb1-8"><a href="#cb1-8" tabindex="-1"></a><span class="ex">info:</span>     http2.Setting{ .initialWindowSize = 33554432 }</span>
<span id="cb1-9"><a href="#cb1-9" tabindex="-1"></a><span class="ex">info:</span>     http2.Setting{ .enablePush = false }</span>
<span id="cb1-10"><a href="#cb1-10" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&lt;&lt;&lt;</span> http2.FrameHdr{ .length = 4, .type = http2.FrameType.windowUpdate, .flags = http2.FrameFlags{ .unused = 0 }, .r = false, .id = 0 } http2.Payload{ .windowUpdate = http2.WindowUpdatePayload{ .r = false, .windowSizeIncrement = 33488897 } }</span>
<span id="cb1-11"><a href="#cb1-11" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&lt;&lt;&lt;</span> http2.FrameHdr{ .length = 44, .type = http2.FrameType.headers, .flags = http2.FrameFlags{ .headers = http2.HeadersFlags{ .endStream = true, .unused1 = false, .endHeaders = true, .padded = false, .unused2 = false, .priority = false, .unused3 = 0 } }, .r = false, .id = 1 } http2.Payload{ .headers = http2.HeadersPayload{ .headerBlockFragment = { 130, 4, 141, 98, 49, 216, 90, 61, 45, 58, 83, 88, 150, 246, 105, 191, 134, 65, 138, 160, 228, 29, 19, 157, 9, 184, 248, 0, 31, 122, 136, 37, 182, 80, 195, 203, 129, 112, 255, 83, 3, 42, 47, 42 }, .hdec = hpack.Decoder{ .from = { ... }, .to = { ... }, .table = hdrIndx.Table{ ... } } } }</span>
<span id="cb1-12"><a href="#cb1-12" tabindex="-1"></a><span class="ex">info:</span>     :method =<span class="op">&gt;</span> GET</span>
<span id="cb1-13"><a href="#cb1-13" tabindex="-1"></a><span class="ex">info:</span>     :path =<span class="op">&gt;</span> /barely-http2-zig</span>
<span id="cb1-14"><a href="#cb1-14" tabindex="-1"></a><span class="ex">info:</span>     :scheme =<span class="op">&gt;</span> http</span>
<span id="cb1-15"><a href="#cb1-15" tabindex="-1"></a><span class="ex">info:</span>     :authority =<span class="op">&gt;</span> localhost:9001</span>
<span id="cb1-16"><a href="#cb1-16" tabindex="-1"></a><span class="ex">info:</span>     user-agent =<span class="op">&gt;</span> curl/8.0.1</span>
<span id="cb1-17"><a href="#cb1-17" tabindex="-1"></a><span class="ex">info:</span>     accept =<span class="op">&gt;</span> <span class="pp">*</span>/<span class="pp">*</span></span>
<span id="cb1-18"><a href="#cb1-18" tabindex="-1"></a><span class="ex">info:</span> <span class="pp">***</span> Opening barely-http2-zig.html</span>
<span id="cb1-19"><a href="#cb1-19" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&gt;&gt;&gt;</span> Sending OK headers</span>
<span id="cb1-20"><a href="#cb1-20" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&gt;&gt;&gt;</span> Sending DATA http2.FrameHdr{ .length = 16384, .type = http2.FrameType.data, .flags = http2.FrameFlags{ .data = http2.DataFlags{ .endStream = false, .unused = 0, .padded = false } }, .r = false, .id = 1 }</span>
<span id="cb1-21"><a href="#cb1-21" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&gt;&gt;&gt;</span> Sending DATA http2.FrameHdr{ .length = 16384, .type = http2.FrameType.data, .flags = http2.FrameFlags{ .data = http2.DataFlags{ .endStream = false, .unused = 0, .padded = false } }, .r = false, .id = 1 }</span>
<span id="cb1-22"><a href="#cb1-22" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&gt;&gt;&gt;</span> Sending DATA http2.FrameHdr{ .length = 16384, .type = http2.FrameType.data, .flags = http2.FrameFlags{ .data = http2.DataFlags{ .endStream = false, .unused = 0, .padded = false } }, .r = false, .id = 1 }</span>
<span id="cb1-23"><a href="#cb1-23" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&gt;&gt;&gt;</span> Sending DATA http2.FrameHdr{ .length = 16384, .type = http2.FrameType.data, .flags = http2.FrameFlags{ .data = http2.DataFlags{ .endStream = false, .unused = 0, .padded = false } }, .r = false, .id = 1 }</span>
<span id="cb1-24"><a href="#cb1-24" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&gt;&gt;&gt;</span> Sending DATA http2.FrameHdr{ .length = 16384, .type = http2.FrameType.data, .flags = http2.FrameFlags{ .data = http2.DataFlags{ .endStream = false, .unused = 0, .padded = false } }, .r = false, .id = 1 }</span>
<span id="cb1-25"><a href="#cb1-25" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&gt;&gt;&gt;</span> Sending DATA http2.FrameHdr{ .length = 12857, .type = http2.FrameType.data, .flags = http2.FrameFlags{ .data = http2.DataFlags{ .endStream = true, .unused = 0, .padded = false } }, .r = false, .id = 1 }</span>
<span id="cb1-26"><a href="#cb1-26" tabindex="-1"></a><span class="ex">info:</span> <span class="op">&gt;&gt;&gt;</span> Sent 94777 file bytes</span></code></pre></div>
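            <p>Every <code>http2.FrameHdr</code> line in that log is
            decoded from the fixed 9-byte header that precedes each
            HTTP/2 frame. A minimal sketch of that step follows. It
            assumes a simplified <code>FrameHdr</code> with a
            hypothetical <code>frame_type</code> field and
            <code>parseFrameHdr</code> function; the struct in the
            repository is richer, with typed flags as the log
            shows.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import("std");

/// Simplified stand-in for the frame header (illustration only).
const FrameHdr = struct {
    length: u32, // 24 bits on the wire
    frame_type: u8,
    flags: u8,
    id: u32, // 31 bits on the wire; the top bit is reserved
};

/// Decode the fixed 9-byte frame header (RFC 7540, section 4.1).
/// All multi-byte fields are big-endian.
fn parseFrameHdr(b: *const [9]u8) FrameHdr {
    return .{
        .length = (@as(u32, b[0]) &lt;&lt; 16) | (@as(u32, b[1]) &lt;&lt; 8) | @as(u32, b[2]),
        .frame_type = b[3],
        .flags = b[4],
        // Mask off the reserved bit before assembling the stream id.
        .id = (@as(u32, b[5] &amp; 0x7f) &lt;&lt; 24) | (@as(u32, b[6]) &lt;&lt; 16) |
            (@as(u32, b[7]) &lt;&lt; 8) | @as(u32, b[8]),
    };
}

test "SETTINGS frame header like the first one in the log" {
    // length = 18, type = 0x4 (SETTINGS), flags = 0, stream id = 0
    const raw = [9]u8{ 0, 0, 18, 4, 0, 0, 0, 0, 0 };
    const hdr = parseFrameHdr(&amp;raw);

    try std.testing.expectEqual(@as(u32, 18), hdr.length);
    try std.testing.expectEqual(@as(u8, 4), hdr.frame_type);
    try std.testing.expectEqual(@as(u8, 0), hdr.flags);
    try std.testing.expectEqual(@as(u32, 0), hdr.id);
}</code></pre></div>
            <p>Because the length comes first and is fixed width, a
            receiver knows how many payload bytes to read before it
            interprets anything else, which is exactly what the HTTP/1
            header list does not offer.</p>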
            <h1 id="hpack">HPACK</h1>
            <p>It turned out that the main obstacle to slapping together
            a half-arsed HTTP/2 parser, and writing a victory blog post
            declaring it took me only <code>N</code> hours, was the
            header compression scheme.</p>
            <p>This, in addition to the stream dependency and encryption
            talk, is probably what frightens people away from
            HTTP/2.</p>
            <p>There is no way that you can skip over HPACK decoding
            unless the other side is equally lazy. Initially I
            thought I had found a way of doing it by setting the decoder
            table size to zero. However, this is scuppered by an
            allowance for the client to start sending frames before it
            has received any settings (because latency).</p>
            <p>In fact, even if you could avoid implementing the decoder
            table, you would still have to deal with Huffman-encoded
            strings and variable-length integers. These things are
            standard computer science though, so there is plenty of
            good material to fall back on.</p>
            <p>I did, however, fuss over avoiding memory allocations and
            implemented all the data structures myself. It’s worth
            pointing out that the Zig standard library has some nice
            data structures and functions for doing stuff like this.</p>
            <p>Let’s take a look at the thing which I tried hardest to
            avoid, <a
            href="https://github.com/richiejp/barely-http2/blob/main/src/hdrIndx.zig">the
            decoder table</a>. This thing provides some nice compression
            for site specific headers. If you are lost as to what I am
            talking about then do a search on HPACK. There are some
            articles waxing lyrical about the greatness of HPACK.
            Essentially though it remembers headers, in full or part, so
            that they don’t have to be resent for multiple requests.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="co">/// The content of the table entries; A FIFO buffer.</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="co">/// It&#39;s a buffer with capacity 3x the size of the minimum table size</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="co">/// required by HPACK. This allows us to keep adding entries in</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a><span class="co">/// contiguous chunks with only an occasional copy.</span></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a><span class="co">/// New entries are added before start and their length subtracted</span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a><span class="co">/// from start.  When start gets below the minimum table size,</span></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a><span class="co">/// everything is shifted backwards.</span></span>
<span id="cb2-10"><a href="#cb2-10" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb2-11"><a href="#cb2-11" tabindex="-1"></a><span class="co">/// Only 2X the table size would be needed except that a new entry can</span></span>
<span id="cb2-12"><a href="#cb2-12" tabindex="-1"></a><span class="co">/// reference the index of an entry which is about to be removed.</span></span>
<span id="cb2-13"><a href="#cb2-13" tabindex="-1"></a><span class="at">const</span> HdrData <span class="op">=</span> <span class="kw">struct</span> {</span>
<span id="cb2-14"><a href="#cb2-14" tabindex="-1"></a>    <span class="co">/// The start of the first entry (most recently added)</span></span>
<span id="cb2-15"><a href="#cb2-15" tabindex="-1"></a>    start<span class="op">:</span> <span class="dt">u16</span> <span class="op">=</span> <span class="dv">2</span> <span class="op">*</span> <span class="dv">4096</span><span class="op">,</span></span>
<span id="cb2-16"><a href="#cb2-16" tabindex="-1"></a>    <span class="co">/// Where the start was at the previous copy to shift everything backwards.</span></span>
<span id="cb2-17"><a href="#cb2-17" tabindex="-1"></a>    <span class="co">/// Needed to correct indexes for copied items.</span></span>
<span id="cb2-18"><a href="#cb2-18" tabindex="-1"></a>    prevStart<span class="op">:</span> <span class="dt">u16</span> <span class="op">=</span> <span class="dv">3</span> <span class="op">*</span> <span class="dv">4096</span><span class="op">,</span></span>
<span id="cb2-19"><a href="#cb2-19" tabindex="-1"></a>    <span class="co">/// The length of the current entries</span></span>
<span id="cb2-20"><a href="#cb2-20" tabindex="-1"></a>    len<span class="op">:</span> <span class="dt">u16</span> <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb2-21"><a href="#cb2-21" tabindex="-1"></a>    <span class="co">/// The data</span></span>
<span id="cb2-22"><a href="#cb2-22" tabindex="-1"></a>    vec<span class="op">:</span> [<span class="dv">3</span> <span class="op">*</span> <span class="dv">4096</span>]<span class="dt">u8</span> <span class="op">=</span> <span class="cn">undefined</span><span class="op">,</span></span>
<span id="cb2-23"><a href="#cb2-23" tabindex="-1"></a>};</span>
<span id="cb2-24"><a href="#cb2-24" tabindex="-1"></a></span>
<span id="cb2-25"><a href="#cb2-25" tabindex="-1"></a><span class="co">/// An entry in the table data. Sort of like a slice, but using</span></span>
<span id="cb2-26"><a href="#cb2-26" tabindex="-1"></a><span class="co">/// 16-bit indexes.</span></span>
<span id="cb2-27"><a href="#cb2-27" tabindex="-1"></a><span class="at">const</span> HdrPtr <span class="op">=</span> <span class="kw">struct</span> {</span>
<span id="cb2-28"><a href="#cb2-28" tabindex="-1"></a>    start<span class="op">:</span> <span class="dt">u16</span><span class="op">,</span></span>
<span id="cb2-29"><a href="#cb2-29" tabindex="-1"></a>    nameLen<span class="op">:</span> <span class="dt">u16</span><span class="op">,</span></span>
<span id="cb2-30"><a href="#cb2-30" tabindex="-1"></a>    valueLen<span class="op">:</span> <span class="dt">u16</span><span class="op">,</span></span>
<span id="cb2-31"><a href="#cb2-31" tabindex="-1"></a>};</span>
<span id="cb2-32"><a href="#cb2-32" tabindex="-1"></a></span>
<span id="cb2-33"><a href="#cb2-33" tabindex="-1"></a><span class="co">/// An inner table indexing the table&#39;s entries. Needed because the</span></span>
<span id="cb2-34"><a href="#cb2-34" tabindex="-1"></a><span class="co">/// entries are uneven.</span></span>
<span id="cb2-35"><a href="#cb2-35" tabindex="-1"></a><span class="at">const</span> HdrIndx <span class="op">=</span> <span class="kw">struct</span> {</span>
<span id="cb2-36"><a href="#cb2-36" tabindex="-1"></a>    start<span class="op">:</span> <span class="dt">u8</span> <span class="op">=</span> <span class="dv">127</span><span class="op">,</span></span>
<span id="cb2-37"><a href="#cb2-37" tabindex="-1"></a>    len<span class="op">:</span> <span class="dt">u8</span> <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb2-38"><a href="#cb2-38" tabindex="-1"></a>    vec<span class="op">:</span> [<span class="dv">256</span>]HdrPtr <span class="op">=</span> <span class="cn">undefined</span><span class="op">,</span></span>
<span id="cb2-39"><a href="#cb2-39" tabindex="-1"></a>};</span>
<span id="cb2-40"><a href="#cb2-40" tabindex="-1"></a></span>
<span id="cb2-41"><a href="#cb2-41" tabindex="-1"></a><span class="co">/// The encapsulating Table struct because I forgot that files in Zig</span></span>
<span id="cb2-42"><a href="#cb2-42" tabindex="-1"></a><span class="co">/// are structs.</span></span>
<span id="cb2-43"><a href="#cb2-43" tabindex="-1"></a><span class="kw">pub</span> <span class="at">const</span> Table <span class="op">=</span> <span class="kw">struct</span> {</span>
<span id="cb2-44"><a href="#cb2-44" tabindex="-1"></a>    data<span class="op">:</span> HdrData <span class="op">=</span> HdrData{}<span class="op">,</span></span>
<span id="cb2-45"><a href="#cb2-45" tabindex="-1"></a>    indx<span class="op">:</span> HdrIndx <span class="op">=</span> HdrIndx{}<span class="op">,</span></span>
<span id="cb2-46"><a href="#cb2-46" tabindex="-1"></a>    size<span class="op">:</span> <span class="dt">u16</span> <span class="op">=</span> <span class="dv">4096</span><span class="op">,</span></span>
<span id="cb2-47"><a href="#cb2-47" tabindex="-1"></a></span>
<span id="cb2-48"><a href="#cb2-48" tabindex="-1"></a>    <span class="kw">fn</span> capacity(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Table) <span class="dt">u16</span> {</span>
<span id="cb2-49"><a href="#cb2-49" tabindex="-1"></a>        <span class="cf">return</span> <span class="va">self</span><span class="op">.</span>data<span class="op">.</span>vec<span class="op">.</span>len <span class="op">/</span> <span class="dv">3</span>;</span>
<span id="cb2-50"><a href="#cb2-50" tabindex="-1"></a>    }</span>
<span id="cb2-51"><a href="#cb2-51" tabindex="-1"></a></span>
<span id="cb2-52"><a href="#cb2-52" tabindex="-1"></a>    <span class="co">/// Get an entry from the table. The returned struct is borrowed</span></span>
<span id="cb2-53"><a href="#cb2-53" tabindex="-1"></a>    <span class="co">/// and needs to be copied if it is to be used after it has been</span></span>
<span id="cb2-54"><a href="#cb2-54" tabindex="-1"></a>    <span class="co">/// evicted from the table.</span></span>
<span id="cb2-55"><a href="#cb2-55" tabindex="-1"></a>    <span class="kw">pub</span> <span class="kw">fn</span> get(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Table<span class="op">,</span> i<span class="op">:</span> <span class="dt">u8</span>) <span class="op">!</span>HdrConst {</span>
<span id="cb2-56"><a href="#cb2-56" tabindex="-1"></a>        <span class="at">const</span> data <span class="op">=</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>data;</span>
<span id="cb2-57"><a href="#cb2-57" tabindex="-1"></a>        <span class="at">const</span> indx <span class="op">=</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>indx;</span>
<span id="cb2-58"><a href="#cb2-58" tabindex="-1"></a>        <span class="at">const</span> slen <span class="op">=</span> STATIC_INDX<span class="op">.</span>len;</span>
<span id="cb2-59"><a href="#cb2-59" tabindex="-1"></a></span>
<span id="cb2-60"><a href="#cb2-60" tabindex="-1"></a>        <span class="cf">if</span> (i <span class="op">==</span> <span class="dv">0</span>)</span>
<span id="cb2-61"><a href="#cb2-61" tabindex="-1"></a>            <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>InvalidIndexZero;</span>
<span id="cb2-62"><a href="#cb2-62" tabindex="-1"></a></span>
<span id="cb2-63"><a href="#cb2-63" tabindex="-1"></a>        <span class="cf">if</span> (i <span class="op">&lt;</span> slen)</span>
<span id="cb2-64"><a href="#cb2-64" tabindex="-1"></a>            <span class="cf">return</span> STATIC_INDX[i];</span>
<span id="cb2-65"><a href="#cb2-65" tabindex="-1"></a></span>
<span id="cb2-66"><a href="#cb2-66" tabindex="-1"></a>        <span class="cf">if</span> (i <span class="op">-</span> slen <span class="op">&gt;=</span> indx<span class="op">.</span>len) {</span>
<span id="cb2-67"><a href="#cb2-67" tabindex="-1"></a>            <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>IndexTooBig;</span>
<span id="cb2-68"><a href="#cb2-68" tabindex="-1"></a>        }</span>
<span id="cb2-69"><a href="#cb2-69" tabindex="-1"></a></span>
<span id="cb2-70"><a href="#cb2-70" tabindex="-1"></a>        <span class="at">const</span> hdr <span class="op">=</span> indx<span class="op">.</span>vec[indx<span class="op">.</span>start <span class="op">+</span> (i <span class="op">-</span> slen)];</span>
<span id="cb2-71"><a href="#cb2-71" tabindex="-1"></a>        <span class="at">const</span> start <span class="op">=</span> <span class="cf">if</span> (hdr<span class="op">.</span>start <span class="op">&lt;</span> data<span class="op">.</span>start)</span>
<span id="cb2-72"><a href="#cb2-72" tabindex="-1"></a>            <span class="dv">2</span> <span class="op">*</span> <span class="va">self</span><span class="op">.</span>capacity() <span class="op">+</span> (hdr<span class="op">.</span>start <span class="op">-</span> data<span class="op">.</span>prevStart)</span>
<span id="cb2-73"><a href="#cb2-73" tabindex="-1"></a>        <span class="cf">else</span></span>
<span id="cb2-74"><a href="#cb2-74" tabindex="-1"></a>            hdr<span class="op">.</span>start;</span>
<span id="cb2-75"><a href="#cb2-75" tabindex="-1"></a></span>
<span id="cb2-76"><a href="#cb2-76" tabindex="-1"></a>        <span class="at">const</span> value_start <span class="op">=</span> start <span class="op">+</span> hdr<span class="op">.</span>nameLen;</span>
<span id="cb2-77"><a href="#cb2-77" tabindex="-1"></a></span>
<span id="cb2-78"><a href="#cb2-78" tabindex="-1"></a>        <span class="cf">return</span> <span class="op">.</span>{</span>
<span id="cb2-79"><a href="#cb2-79" tabindex="-1"></a>            <span class="op">.</span>name <span class="op">=</span> data<span class="op">.</span>vec[start<span class="op">..</span>value_start]<span class="op">,</span></span>
<span id="cb2-80"><a href="#cb2-80" tabindex="-1"></a>            <span class="op">.</span>value <span class="op">=</span> data<span class="op">.</span>vec[value_start <span class="op">..</span> value_start <span class="op">+</span> hdr<span class="op">.</span>valueLen]<span class="op">,</span></span>
<span id="cb2-81"><a href="#cb2-81" tabindex="-1"></a>        };</span>
<span id="cb2-82"><a href="#cb2-82" tabindex="-1"></a>    }</span>
<span id="cb2-83"><a href="#cb2-83" tabindex="-1"></a></span>
<span id="cb2-84"><a href="#cb2-84" tabindex="-1"></a>    <span class="co">/// The length of the table according to the HPACK spec</span></span>
<span id="cb2-85"><a href="#cb2-85" tabindex="-1"></a>    <span class="kw">fn</span> nominalLen(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Table<span class="op">,</span> name<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> value<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span>) <span class="dt">usize</span> {</span>
<span id="cb2-86"><a href="#cb2-86" tabindex="-1"></a>        <span class="at">const</span> estimated_overhead <span class="op">=</span> <span class="dv">32</span> <span class="op">*</span> (<span class="dv">1</span> <span class="op">+</span> <span class="bu">@as</span>(<span class="dt">usize</span><span class="op">,</span> <span class="va">self</span><span class="op">.</span>indx<span class="op">.</span>len));</span>
<span id="cb2-87"><a href="#cb2-87" tabindex="-1"></a></span>
<span id="cb2-88"><a href="#cb2-88" tabindex="-1"></a>        <span class="cf">return</span> <span class="va">self</span><span class="op">.</span>data<span class="op">.</span>len <span class="op">+</span> name<span class="op">.</span>len <span class="op">+</span> value<span class="op">.</span>len <span class="op">+</span> estimated_overhead;</span>
<span id="cb2-89"><a href="#cb2-89" tabindex="-1"></a>    }</span>
<span id="cb2-90"><a href="#cb2-90" tabindex="-1"></a></span>
<span id="cb2-91"><a href="#cb2-91" tabindex="-1"></a>    <span class="co">/// Add an entry to the table. The name and value arguments can</span></span>
<span id="cb2-92"><a href="#cb2-92" tabindex="-1"></a>    <span class="co">/// point to an existing entry which will evict itself.</span></span>
<span id="cb2-93"><a href="#cb2-93" tabindex="-1"></a>    <span class="kw">pub</span> <span class="kw">fn</span> add(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Table<span class="op">,</span> name<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> value<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span>) <span class="op">!</span><span class="dt">void</span> {</span>
<span id="cb2-94"><a href="#cb2-94" tabindex="-1"></a>        <span class="at">const</span> data <span class="op">=</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>data;</span>
<span id="cb2-95"><a href="#cb2-95" tabindex="-1"></a>        <span class="at">const</span> indx <span class="op">=</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>indx;</span>
<span id="cb2-96"><a href="#cb2-96" tabindex="-1"></a></span>
<span id="cb2-97"><a href="#cb2-97" tabindex="-1"></a>        <span class="cf">while</span> (<span class="va">self</span><span class="op">.</span>nominalLen(name<span class="op">,</span> value) <span class="op">&gt;</span> <span class="va">self</span><span class="op">.</span>size) {</span>
<span id="cb2-98"><a href="#cb2-98" tabindex="-1"></a>            <span class="cf">if</span> (indx<span class="op">.</span>len <span class="op">==</span> <span class="dv">0</span>)</span>
<span id="cb2-99"><a href="#cb2-99" tabindex="-1"></a>                <span class="cf">return</span>;</span>
<span id="cb2-100"><a href="#cb2-100" tabindex="-1"></a></span>
<span id="cb2-101"><a href="#cb2-101" tabindex="-1"></a>            <span class="at">const</span> last <span class="op">=</span> indx<span class="op">.</span>vec[indx<span class="op">.</span>start <span class="op">+</span> indx<span class="op">.</span>len <span class="op">-</span> <span class="dv">1</span>];</span>
<span id="cb2-102"><a href="#cb2-102" tabindex="-1"></a></span>
<span id="cb2-103"><a href="#cb2-103" tabindex="-1"></a>            data<span class="op">.</span>len <span class="op">-=</span> last<span class="op">.</span>nameLen <span class="op">+</span> last<span class="op">.</span>valueLen;</span>
<span id="cb2-104"><a href="#cb2-104" tabindex="-1"></a>            indx<span class="op">.</span>len <span class="op">-=</span> <span class="dv">1</span>;</span>
<span id="cb2-105"><a href="#cb2-105" tabindex="-1"></a>        }</span>
<span id="cb2-106"><a href="#cb2-106" tabindex="-1"></a></span>
<span id="cb2-107"><a href="#cb2-107" tabindex="-1"></a>        <span class="cf">if</span> (indx<span class="op">.</span>start <span class="op">==</span> <span class="dv">0</span>) {</span>
<span id="cb2-108"><a href="#cb2-108" tabindex="-1"></a>            mem<span class="op">.</span>copy(HdrPtr<span class="op">,</span> indx<span class="op">.</span>vec[<span class="dv">128</span><span class="er">.</span><span class="op">.</span>]<span class="op">,</span> indx<span class="op">.</span>vec[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="dv">128</span>]);</span>
<span id="cb2-109"><a href="#cb2-109" tabindex="-1"></a>            indx<span class="op">.</span>start <span class="op">=</span> <span class="dv">128</span>;</span>
<span id="cb2-110"><a href="#cb2-110" tabindex="-1"></a>        }</span>
<span id="cb2-111"><a href="#cb2-111" tabindex="-1"></a></span>
<span id="cb2-112"><a href="#cb2-112" tabindex="-1"></a>        indx<span class="op">.</span>len <span class="op">+=</span> <span class="dv">1</span>;</span>
<span id="cb2-113"><a href="#cb2-113" tabindex="-1"></a>        indx<span class="op">.</span>start <span class="op">-=</span> <span class="dv">1</span>;</span>
<span id="cb2-114"><a href="#cb2-114" tabindex="-1"></a></span>
<span id="cb2-115"><a href="#cb2-115" tabindex="-1"></a>        <span class="at">const</span> hdr <span class="op">=</span> <span class="op">&amp;</span>indx<span class="op">.</span>vec[indx<span class="op">.</span>start];</span>
<span id="cb2-116"><a href="#cb2-116" tabindex="-1"></a>        hdr<span class="op">.</span>nameLen <span class="op">=</span> <span class="bu">@truncate</span>(<span class="dt">u16</span><span class="op">,</span> name<span class="op">.</span>len);</span>
<span id="cb2-117"><a href="#cb2-117" tabindex="-1"></a>        hdr<span class="op">.</span>valueLen <span class="op">=</span> <span class="bu">@truncate</span>(<span class="dt">u16</span><span class="op">,</span> value<span class="op">.</span>len);</span>
<span id="cb2-118"><a href="#cb2-118" tabindex="-1"></a></span>
<span id="cb2-119"><a href="#cb2-119" tabindex="-1"></a>        data<span class="op">.</span>start <span class="op">-=</span> hdr<span class="op">.</span>nameLen;</span>
<span id="cb2-120"><a href="#cb2-120" tabindex="-1"></a>        data<span class="op">.</span>start <span class="op">-=</span> hdr<span class="op">.</span>valueLen;</span>
<span id="cb2-121"><a href="#cb2-121" tabindex="-1"></a>        hdr<span class="op">.</span>start <span class="op">=</span> data<span class="op">.</span>start;</span>
<span id="cb2-122"><a href="#cb2-122" tabindex="-1"></a></span>
<span id="cb2-123"><a href="#cb2-123" tabindex="-1"></a>        data<span class="op">.</span>len <span class="op">+=</span> hdr<span class="op">.</span>nameLen;</span>
<span id="cb2-124"><a href="#cb2-124" tabindex="-1"></a>        data<span class="op">.</span>len <span class="op">+=</span> hdr<span class="op">.</span>valueLen;</span>
<span id="cb2-125"><a href="#cb2-125" tabindex="-1"></a></span>
<span id="cb2-126"><a href="#cb2-126" tabindex="-1"></a>        <span class="at">const</span> value_start <span class="op">=</span> hdr<span class="op">.</span>start <span class="op">+</span> hdr<span class="op">.</span>nameLen;</span>
<span id="cb2-127"><a href="#cb2-127" tabindex="-1"></a></span>
<span id="cb2-128"><a href="#cb2-128" tabindex="-1"></a>        mem<span class="op">.</span>copy(<span class="dt">u8</span><span class="op">,</span> data<span class="op">.</span>vec[hdr<span class="op">.</span>start<span class="op">..</span>value_start]<span class="op">,</span> name);</span>
<span id="cb2-129"><a href="#cb2-129" tabindex="-1"></a>        mem<span class="op">.</span>copy(<span class="dt">u8</span><span class="op">,</span> data<span class="op">.</span>vec[value_start <span class="op">..</span> value_start <span class="op">+</span> hdr<span class="op">.</span>valueLen]<span class="op">,</span> value);</span>
<span id="cb2-130"><a href="#cb2-130" tabindex="-1"></a></span>
<span id="cb2-131"><a href="#cb2-131" tabindex="-1"></a>        <span class="cf">if</span> (data<span class="op">.</span>start <span class="op">&lt;</span> <span class="va">self</span><span class="op">.</span>capacity()) {</span>
<span id="cb2-132"><a href="#cb2-132" tabindex="-1"></a>            mem<span class="op">.</span>copyBackwards(</span>
<span id="cb2-133"><a href="#cb2-133" tabindex="-1"></a>                <span class="dt">u8</span><span class="op">,</span></span>
<span id="cb2-134"><a href="#cb2-134" tabindex="-1"></a>                data<span class="op">.</span>vec[<span class="dv">2</span> <span class="op">*</span> <span class="va">self</span><span class="op">.</span>capacity() <span class="op">..</span>]<span class="op">,</span></span>
<span id="cb2-135"><a href="#cb2-135" tabindex="-1"></a>                data<span class="op">.</span>vec[data<span class="op">.</span>start <span class="op">..</span> data<span class="op">.</span>start <span class="op">+</span> data<span class="op">.</span>len]<span class="op">,</span></span>
<span id="cb2-136"><a href="#cb2-136" tabindex="-1"></a>            );</span>
<span id="cb2-137"><a href="#cb2-137" tabindex="-1"></a>            data<span class="op">.</span>prevStart <span class="op">=</span> data<span class="op">.</span>start;</span>
<span id="cb2-138"><a href="#cb2-138" tabindex="-1"></a>            data<span class="op">.</span>start <span class="op">=</span> <span class="dv">2</span> <span class="op">*</span> <span class="va">self</span><span class="op">.</span>capacity();</span>
<span id="cb2-139"><a href="#cb2-139" tabindex="-1"></a>        }</span>
<span id="cb2-140"><a href="#cb2-140" tabindex="-1"></a>    }</span>
<span id="cb2-141"><a href="#cb2-141" tabindex="-1"></a>};</span></code></pre></div>
            <p>This stores the full name and value of each header in a
            buffer that takes up 3x the space of the nominal table size.
            This is not great memory usage, but it’s relatively simple
            and hopefully cache efficient, because the memory accesses
            are likely to be close together.</p>
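            <p>The table has a fixed size of 4096 bytes, which is the
            initial size HTTP/2 assumes and therefore the minimum a
            decoder has to support before its own settings take effect.
            Zig would allow this to be changed easily in a variety of
            ways. I just haven’t bothered to do it.</p>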
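            <p>It helps to know how HPACK counts against that limit. Per
            RFC 7541 section 4.1, each entry is charged its name length
            plus its value length plus 32 bytes of nominal overhead,
            which is what <code>nominalLen</code> above accounts for. A
            small sketch of the arithmetic, where
            <code>entrySize</code> is a hypothetical helper and not part
            of the repository:</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import("std");

/// Size HPACK charges for one dynamic table entry (RFC 7541, section 4.1):
/// the name and value lengths plus 32 bytes of nominal overhead.
fn entrySize(name: []const u8, value: []const u8) usize {
    return name.len + value.len + 32;
}

test "entry size from the RFC 7541 appendix C.2.1 example" {
    // 10 + 13 + 32
    try std.testing.expectEqual(@as(usize, 55), entrySize("custom-key", "custom-header"));
}

test "a 4096 byte table holds at most 128 entries" {
    // Even an entry with an empty name and value costs 32 bytes.
    try std.testing.expectEqual(@as(usize, 128), 4096 / entrySize("", ""));
}</code></pre></div>
            <p>That bound of 128 entries is presumably why the index
            above has 256 slots and shifts its contents in halves of
            128.</p>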
            <p>In the end it’s not a lot of code, although it’s the type
            of code which can be a pain to debug. Zig’s built-in tests
            helped with that. You can run them with:</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="ex">$</span> zig test src/hdrIndx.zig</span></code></pre></div>
            <p>Each file in the <code>src</code> directory has its own
            tests. Continuing with HPACK, let’s look at integer decoding.
            Each header field name and value is preceded by a
            variable-length integer giving the field’s length.</p>
            <p>What’s more, this integer encoding allows <a
            href="https://www.rfc-editor.org/rfc/rfc7541#section-5.1">some
            bits</a> in the first byte of the encoding to be used for
            flags or whatever. So the integer value actually starts part
            way through the first byte. The most significant bit of each
            remaining byte (if any) is a continuation flag: it is set on
            every byte except the last.</p>
            <p>This allows for arbitrarily large integers, but I don’t
            allow that because I’m a spoilsport.</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a><span class="co">/// Get the unsigned type big enough to count the bits in T. Needed</span></span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a><span class="co">/// because Zig constrains the right hand side of a shift to an</span></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a><span class="co">/// integer only big enough to perform a full shift. Which is only u3</span></span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a><span class="co">/// for u8 (for example).</span></span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a><span class="co">/// Meanwhile I don&#39;t know a way to specify this type other than to</span></span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a><span class="co">/// construct it like this.</span></span>
<span id="cb4-8"><a href="#cb4-8" tabindex="-1"></a><span class="kw">fn</span> ShiftSize(<span class="at">comptime</span> T<span class="op">:</span> <span class="dt">type</span>) <span class="dt">type</span> {</span>
<span id="cb4-9"><a href="#cb4-9" tabindex="-1"></a>    <span class="at">const</span> ShiftInt <span class="op">=</span> Type{</span>
<span id="cb4-10"><a href="#cb4-10" tabindex="-1"></a>        <span class="op">.</span>Int <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb4-11"><a href="#cb4-11" tabindex="-1"></a>            <span class="op">.</span>signedness <span class="op">=</span> std<span class="op">.</span>builtin<span class="op">.</span>Signedness<span class="op">.</span>unsigned<span class="op">,</span></span>
<span id="cb4-12"><a href="#cb4-12" tabindex="-1"></a>            <span class="op">.</span>bits <span class="op">=</span> <span class="at">comptime</span> std<span class="op">.</span>math<span class="op">.</span>log2_int(<span class="dt">u16</span><span class="op">,</span> <span class="bu">@bitSizeOf</span>(T))<span class="op">,</span></span>
<span id="cb4-13"><a href="#cb4-13" tabindex="-1"></a>        }<span class="op">,</span></span>
<span id="cb4-14"><a href="#cb4-14" tabindex="-1"></a>    };</span>
<span id="cb4-15"><a href="#cb4-15" tabindex="-1"></a></span>
<span id="cb4-16"><a href="#cb4-16" tabindex="-1"></a>    <span class="cf">return</span> <span class="bu">@Type</span>(ShiftInt);</span>
<span id="cb4-17"><a href="#cb4-17" tabindex="-1"></a>}</span>
<span id="cb4-18"><a href="#cb4-18" tabindex="-1"></a></span>
<span id="cb4-19"><a href="#cb4-19" tabindex="-1"></a><span class="kw">fn</span> decodeInt(<span class="at">comptime</span> T<span class="op">:</span> <span class="dt">type</span><span class="op">,</span> <span class="at">comptime</span> n<span class="op">:</span> u3<span class="op">,</span> buf<span class="op">:</span> <span class="op">*</span>[]<span class="at">const</span> <span class="dt">u8</span>) <span class="op">!</span>T {</span>
<span id="cb4-20"><a href="#cb4-20" tabindex="-1"></a>    <span class="at">const</span> prefix <span class="op">=</span> (<span class="dv">1</span> <span class="op">&lt;&lt;</span> n) <span class="op">-</span> <span class="dv">1</span>;</span>
<span id="cb4-21"><a href="#cb4-21" tabindex="-1"></a>    <span class="at">var</span> b <span class="op">=</span> buf<span class="op">.*</span>[<span class="dv">0</span>];</span>
<span id="cb4-22"><a href="#cb4-22" tabindex="-1"></a>    <span class="at">var</span> i<span class="op">:</span> T <span class="op">=</span> b <span class="op">&amp;</span> prefix;</span>
<span id="cb4-23"><a href="#cb4-23" tabindex="-1"></a></span>
<span id="cb4-24"><a href="#cb4-24" tabindex="-1"></a>    <span class="cf">if</span> (i <span class="op">&lt;</span> prefix) {</span>
<span id="cb4-25"><a href="#cb4-25" tabindex="-1"></a>        buf<span class="op">.*</span> <span class="op">=</span> buf<span class="op">.*</span>[<span class="dv">1</span><span class="er">.</span><span class="op">.</span>];</span>
<span id="cb4-26"><a href="#cb4-26" tabindex="-1"></a>        <span class="cf">return</span> i;</span>
<span id="cb4-27"><a href="#cb4-27" tabindex="-1"></a>    }</span>
<span id="cb4-28"><a href="#cb4-28" tabindex="-1"></a></span>
<span id="cb4-29"><a href="#cb4-29" tabindex="-1"></a>    <span class="at">var</span> j<span class="op">:</span> ShiftSize(T) <span class="op">=</span> <span class="dv">1</span>;</span>
<span id="cb4-30"><a href="#cb4-30" tabindex="-1"></a>    <span class="cf">while</span> ((j <span class="op">-</span> <span class="dv">1</span>) <span class="op">*</span> <span class="dv">7</span> <span class="op">&lt;</span> <span class="bu">@bitSizeOf</span>(T)) <span class="op">:</span> (j <span class="op">+=</span> <span class="dv">1</span>) {</span>
<span id="cb4-31"><a href="#cb4-31" tabindex="-1"></a>        b <span class="op">=</span> buf<span class="op">.*</span>[j];</span>
<span id="cb4-32"><a href="#cb4-32" tabindex="-1"></a></span>
<span id="cb4-33"><a href="#cb4-33" tabindex="-1"></a>        i <span class="op">+=</span> <span class="bu">@as</span>(T<span class="op">,</span> (b <span class="op">&amp;</span> <span class="dv">0</span><span class="er">x7f</span>)) <span class="op">&lt;&lt;</span> (<span class="dv">7</span> <span class="op">*</span> (j <span class="op">-</span> <span class="dv">1</span>));</span>
<span id="cb4-34"><a href="#cb4-34" tabindex="-1"></a></span>
<span id="cb4-35"><a href="#cb4-35" tabindex="-1"></a>        <span class="cf">if</span> (b <span class="op">&lt;</span> <span class="dv">0</span><span class="er">x80</span>)</span>
<span id="cb4-36"><a href="#cb4-36" tabindex="-1"></a>            <span class="cf">break</span>;</span>
<span id="cb4-37"><a href="#cb4-37" tabindex="-1"></a>    } <span class="cf">else</span> {</span>
<span id="cb4-38"><a href="#cb4-38" tabindex="-1"></a>        <span class="cf">return</span> UnpackError<span class="op">.</span>IntTooBig;</span>
<span id="cb4-39"><a href="#cb4-39" tabindex="-1"></a>    }</span>
<span id="cb4-40"><a href="#cb4-40" tabindex="-1"></a></span>
<span id="cb4-41"><a href="#cb4-41" tabindex="-1"></a>    buf<span class="op">.*</span> <span class="op">=</span> buf<span class="op">.*</span>[j <span class="op">+</span> <span class="dv">1</span> <span class="op">..</span>];</span>
<span id="cb4-42"><a href="#cb4-42" tabindex="-1"></a>    <span class="cf">return</span> i;</span>
<span id="cb4-43"><a href="#cb4-43" tabindex="-1"></a>}</span></code></pre></div>
            <p>It’s been a while since I wrote this and boy am I glad I
            wrote that stuff about the shift size. This function takes
            <code>comptime</code> parameters which allow different
            functions to be generated depending on the type it returns
            and how many bits of the first byte belong to the integer
            (<code>n</code>); the rest are ignored.</p>
            <p>Interestingly, it seems that Zig forces you to use the
            minimum sized type for the shift amount, which I think
            caught a few mistakes. The <code>ShiftSize</code> function
            constructs the type of the right size to shift the
            <code>T</code> that was passed into
            <code>decodeInt</code>.</p>
            <p>So indeed, Zig allows constructing arbitrary types from
            ordinary code at <code>comptime</code>.</p>
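            <p>For comparison, the standard library also has
            <code>std.math.Log2Int</code>, which should yield the same
            type as <code>ShiftSize</code>. A minimal sketch, assuming a
            reasonably recent Zig; the <code>shiftLeft</code> helper is
            made up for illustration and is not part of the project:</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// Hypothetical helper, not from the project. std.math.Log2Int(T) is the
// unsigned type just wide enough for any valid shift amount of T.
fn shiftLeft(comptime T: type, value: T, amount: std.math.Log2Int(T)) T {
    return value &lt;&lt; amount;
}

test &quot;Log2Int gives the shift amount type&quot; {
    try std.testing.expect(std.math.Log2Int(u8) == u3);
    try std.testing.expect(std.math.Log2Int(u16) == u4);
    try std.testing.expect(shiftLeft(u16, 1, 7) == 0x80);
}</code></pre></div>
            <p>Building the type by hand with <code>@Type</code> is
            still a nice demonstration of what <code>comptime</code> can
            do.</p>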
            <p>Something else to note about this code is that
            <code>buf</code> is a pointer to a slice. The syntax
            <code>buf.*</code> dereferences the pointer. I update the
            slice to chop off the bytes that were used decoding the
            integer. I’m not sure this is an advisable thing to do.</p>
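            <p>One alternative to mutating the caller’s slice is to
            return how many bytes were consumed and let the caller
            advance. The sketch below is not the project’s code:
            <code>decodePrefixInt</code> and <code>Decoded</code> are
            hypothetical names, the result is fixed to <code>u32</code>,
            and it uses checked multiplication in place of shifts. The
            test decodes the 1337 example from appendix C.1.2 of RFC
            7541.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// Hypothetical result type: the value plus the number of bytes read.
const Decoded = struct {
    value: u32,
    used: usize,
};

/// Decode an RFC 7541 section 5.1 integer whose prefix occupies the low
/// `prefix_bits` bits of bytes[0]. A sketch, not the project&#39;s decodeInt.
fn decodePrefixInt(prefix_bits: u5, bytes: []const u8) !Decoded {
    if (prefix_bits == 0 or prefix_bits &gt; 8) return error.BadPrefix;
    if (bytes.len == 0) return error.EndOfData;

    // All ones in the prefix, e.g. 31 for a 5-bit prefix.
    const max_prefix: u32 = (@as(u32, 1) &lt;&lt; prefix_bits) - 1;
    var value: u32 = @as(u32, bytes[0]) &amp; max_prefix;

    // Anything below the all-ones prefix fits in the first byte.
    if (value &lt; max_prefix) return Decoded{ .value = value, .used = 1 };

    // Each following byte adds 7 bits, least significant group first.
    var scale: u32 = 1;
    var used: usize = 1;
    while (used &lt; bytes.len) {
        const b = bytes[used];
        used += 1;
        const part = try std.math.mul(u32, b &amp; 0x7f, scale);
        value = try std.math.add(u32, value, part);
        // A clear top bit marks the last byte.
        if ((b &amp; 0x80) == 0) return Decoded{ .value = value, .used = used };
        scale = try std.math.mul(u32, scale, 128);
    }
    return error.EndOfData;
}

test &quot;RFC 7541 C.1.2: 1337 with a 5-bit prefix&quot; {
    const r = try decodePrefixInt(5, &amp;[_]u8{ 0x1f, 0x9a, 0x0a });
    try std.testing.expect(r.value == 1337);
    try std.testing.expect(r.used == 3);
}</code></pre></div>
            <p>Here 0x1f fills the prefix (31), 0x9a contributes 26 and
            0x0a contributes 10 * 128, giving 31 + 26 + 1280 = 1337. The
            caller would then continue from
            <code>bytes[r.used..]</code>.</p>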
            <p>Now let’s look at string encodings, which display the
            other type of compression employed by HPACK.</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="co">/// Decode a string which if it is not Huffman encoded is fairly</span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a><span class="co">/// straightforward.</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a><span class="co">/// If it is Huffman encoded then we have to deal with the fact</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a><span class="co">/// Huffman codes are not byte aligned and are variable length.</span></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a><span class="co">/// We could put the huffman codes in a binary tree and lookup one bit</span></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a><span class="co">/// at a time. However I doubt this is the right place to start on</span></span>
<span id="cb5-9"><a href="#cb5-9" tabindex="-1"></a><span class="co">/// common CPUs.</span></span>
<span id="cb5-10"><a href="#cb5-10" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb5-11"><a href="#cb5-11" tabindex="-1"></a><span class="co">/// So instead we shift (at most) the next four bytes into a</span></span>
<span id="cb5-12"><a href="#cb5-12" tabindex="-1"></a><span class="co">/// buffer. Then compare the first bits of the first byte to the</span></span>
<span id="cb5-13"><a href="#cb5-13" tabindex="-1"></a><span class="co">/// shortest huffman codes. If it doesn&#39;t match any, then move on to</span></span>
<span id="cb5-14"><a href="#cb5-14" tabindex="-1"></a><span class="co">/// longer codes until we are comparing all four bytes.</span></span>
<span id="cb5-15"><a href="#cb5-15" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb5-16"><a href="#cb5-16" tabindex="-1"></a><span class="co">/// I haven&#39;t done any research into the fastest methods of Huffman</span></span>
<span id="cb5-17"><a href="#cb5-17" tabindex="-1"></a><span class="co">/// decoding. This is just a first approximation.</span></span>
<span id="cb5-18"><a href="#cb5-18" tabindex="-1"></a><span class="kw">fn</span> decodeStr(from<span class="op">:</span> <span class="op">*</span>[]<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> to<span class="op">:</span> <span class="op">*</span>[]<span class="dt">u8</span>) <span class="op">!</span>[]<span class="at">const</span> <span class="dt">u8</span> {</span>
<span id="cb5-19"><a href="#cb5-19" tabindex="-1"></a>    <span class="at">const</span> huffman <span class="op">=</span> from<span class="op">.*</span>[<span class="dv">0</span>] <span class="op">&amp;</span> <span class="dv">0</span><span class="er">x80</span> <span class="op">==</span> <span class="dv">0</span><span class="er">x80</span>;</span>
<span id="cb5-20"><a href="#cb5-20" tabindex="-1"></a>    <span class="at">const</span> len <span class="op">=</span> <span class="cf">try</span> decodeInt(<span class="dt">u16</span><span class="op">,</span> <span class="dv">7</span><span class="op">,</span> from);</span>
<span id="cb5-21"><a href="#cb5-21" tabindex="-1"></a>    <span class="at">const</span> str <span class="op">=</span> from<span class="op">.*</span>[<span class="dv">0</span><span class="er">.</span><span class="op">.</span>len];</span>
<span id="cb5-22"><a href="#cb5-22" tabindex="-1"></a></span>
<span id="cb5-23"><a href="#cb5-23" tabindex="-1"></a>    from<span class="op">.*</span> <span class="op">=</span> from<span class="op">.*</span>[len<span class="op">..</span>];</span>
<span id="cb5-24"><a href="#cb5-24" tabindex="-1"></a></span>
<span id="cb5-25"><a href="#cb5-25" tabindex="-1"></a>    <span class="cf">if</span> (<span class="op">!</span>huffman)</span>
<span id="cb5-26"><a href="#cb5-26" tabindex="-1"></a>        <span class="cf">return</span> str;</span>
<span id="cb5-27"><a href="#cb5-27" tabindex="-1"></a></span>
<span id="cb5-28"><a href="#cb5-28" tabindex="-1"></a>    <span class="at">var</span> i<span class="op">:</span> <span class="dt">u16</span> <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb5-29"><a href="#cb5-29" tabindex="-1"></a>    <span class="at">var</span> j<span class="op">:</span> <span class="dt">u32</span> <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb5-30"><a href="#cb5-30" tabindex="-1"></a>    <span class="at">var</span> k<span class="op">:</span> <span class="dt">u16</span> <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb5-31"><a href="#cb5-31" tabindex="-1"></a>    <span class="at">var</span> c <span class="op">=</span> [_]<span class="dt">u8</span>{<span class="dv">0</span>} <span class="op">**</span> <span class="dv">5</span>;</span>
<span id="cb5-32"><a href="#cb5-32" tabindex="-1"></a></span>
<span id="cb5-33"><a href="#cb5-33" tabindex="-1"></a>    all<span class="op">:</span> <span class="cf">while</span> (i <span class="op">&lt;</span> len) {</span>
<span id="cb5-34"><a href="#cb5-34" tabindex="-1"></a>        mem<span class="op">.</span>copy(<span class="dt">u8</span><span class="op">,</span> <span class="op">&amp;</span>c<span class="op">,</span> str[i<span class="op">..</span>std<span class="op">.</span>math<span class="op">.</span>min(i <span class="op">+</span> <span class="dv">5</span><span class="op">,</span> str<span class="op">.</span>len)]);</span>
<span id="cb5-35"><a href="#cb5-35" tabindex="-1"></a></span>
<span id="cb5-36"><a href="#cb5-36" tabindex="-1"></a>        <span class="at">const</span> j_rem <span class="op">=</span> <span class="bu">@truncate</span>(u3<span class="op">,</span> j);</span>
<span id="cb5-37"><a href="#cb5-37" tabindex="-1"></a></span>
<span id="cb5-38"><a href="#cb5-38" tabindex="-1"></a>        <span class="at">var</span> l<span class="op">:</span> u3 <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb5-39"><a href="#cb5-39" tabindex="-1"></a>        <span class="cf">while</span> (j_rem <span class="op">&gt;</span> <span class="dv">0</span> <span class="kw">and</span> l <span class="op">&lt;</span> c<span class="op">.</span>len <span class="op">-</span> <span class="dv">1</span>) <span class="op">:</span> (l <span class="op">+=</span> <span class="dv">1</span>) {</span>
<span id="cb5-40"><a href="#cb5-40" tabindex="-1"></a>            c[l] <span class="op">&lt;&lt;=</span> j_rem;</span>
<span id="cb5-41"><a href="#cb5-41" tabindex="-1"></a>            c[l] <span class="op">|=</span> c[l <span class="op">+</span> <span class="dv">1</span>] <span class="op">&gt;&gt;</span> <span class="bu">@truncate</span>(u3<span class="op">,</span> <span class="dv">8</span> <span class="op">-</span> <span class="bu">@as</span>(u4<span class="op">,</span> j_rem));</span>
<span id="cb5-42"><a href="#cb5-42" tabindex="-1"></a>        }</span>
<span id="cb5-43"><a href="#cb5-43" tabindex="-1"></a></span>
<span id="cb5-44"><a href="#cb5-44" tabindex="-1"></a>        <span class="at">var</span> glen<span class="op">:</span> u5 <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb5-45"><a href="#cb5-45" tabindex="-1"></a>        <span class="at">const</span> dehuff <span class="op">=</span> decode<span class="op">:</span> <span class="cf">for</span> (DEHUFF) <span class="op">|</span>group<span class="op">|</span> {</span>
<span id="cb5-46"><a href="#cb5-46" tabindex="-1"></a>            glen <span class="op">=</span> <span class="dv">1</span> <span class="op">+</span> ((group<span class="op">.</span>len <span class="op">-</span> <span class="dv">1</span>) <span class="op">&gt;&gt;</span> <span class="dv">3</span>);</span>
<span id="cb5-47"><a href="#cb5-47" tabindex="-1"></a></span>
<span id="cb5-48"><a href="#cb5-48" tabindex="-1"></a>            <span class="cf">if</span> (glen <span class="op">!=</span> group<span class="op">.</span>codes[<span class="dv">0</span>]<span class="op">.</span>code<span class="op">.</span>len)</span>
<span id="cb5-49"><a href="#cb5-49" tabindex="-1"></a>                <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>GlenWrongLen;</span>
<span id="cb5-50"><a href="#cb5-50" tabindex="-1"></a></span>
<span id="cb5-51"><a href="#cb5-51" tabindex="-1"></a>            <span class="at">const</span> bits_left <span class="op">=</span> str<span class="op">.</span>len <span class="op">*</span> <span class="dv">8</span> <span class="op">-</span> j;</span>
<span id="cb5-52"><a href="#cb5-52" tabindex="-1"></a>            <span class="cf">if</span> (bits_left <span class="op">&lt;</span> group<span class="op">.</span>len) {</span>
<span id="cb5-53"><a href="#cb5-53" tabindex="-1"></a>                <span class="cf">if</span> (glen <span class="op">&gt;</span> <span class="dv">1</span> <span class="kw">or</span> j_rem <span class="op">&lt;</span> <span class="dv">1</span>)</span>
<span id="cb5-54"><a href="#cb5-54" tabindex="-1"></a>                    <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>HuffNoMatchInputEndedEarly;</span>
<span id="cb5-55"><a href="#cb5-55" tabindex="-1"></a></span>
<span id="cb5-56"><a href="#cb5-56" tabindex="-1"></a>                <span class="at">const</span> pad_mask <span class="op">=</span> <span class="bu">@truncate</span>(<span class="dt">u8</span><span class="op">,</span> <span class="bu">@as</span>(<span class="dt">u16</span><span class="op">,</span> <span class="dv">0</span><span class="er">xff00</span>) <span class="op">&gt;&gt;</span> <span class="bu">@truncate</span>(u4<span class="op">,</span> bits_left));</span>
<span id="cb5-57"><a href="#cb5-57" tabindex="-1"></a>                <span class="cf">if</span> (c[<span class="dv">0</span>] <span class="op">&amp;</span> pad_mask <span class="op">==</span> pad_mask)</span>
<span id="cb5-58"><a href="#cb5-58" tabindex="-1"></a>                    <span class="cf">break</span> <span class="op">:</span>all</span>
<span id="cb5-59"><a href="#cb5-59" tabindex="-1"></a>                <span class="cf">else</span></span>
<span id="cb5-60"><a href="#cb5-60" tabindex="-1"></a>                    <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>HuffInvalidPadding;</span>
<span id="cb5-61"><a href="#cb5-61" tabindex="-1"></a>            }</span>
<span id="cb5-62"><a href="#cb5-62" tabindex="-1"></a></span>
<span id="cb5-63"><a href="#cb5-63" tabindex="-1"></a>            <span class="at">const</span> glen_rem <span class="op">=</span> <span class="bu">@truncate</span>(u3<span class="op">,</span> group<span class="op">.</span>len);</span>
<span id="cb5-64"><a href="#cb5-64" tabindex="-1"></a>            <span class="at">const</span> last_mask <span class="op">=</span> <span class="cf">if</span> (glen_rem <span class="op">==</span> <span class="dv">0</span>)</span>
<span id="cb5-65"><a href="#cb5-65" tabindex="-1"></a>                <span class="dv">0</span><span class="er">xff</span></span>
<span id="cb5-66"><a href="#cb5-66" tabindex="-1"></a>            <span class="cf">else</span></span>
<span id="cb5-67"><a href="#cb5-67" tabindex="-1"></a>                <span class="bu">@truncate</span>(<span class="dt">u8</span><span class="op">,</span> <span class="bu">@as</span>(<span class="dt">u16</span><span class="op">,</span> <span class="dv">0</span><span class="er">xff00</span>) <span class="op">&gt;&gt;</span> glen_rem);</span>
<span id="cb5-68"><a href="#cb5-68" tabindex="-1"></a>            <span class="at">const</span> last <span class="op">=</span> c[glen <span class="op">-</span> <span class="dv">1</span>];</span>
<span id="cb5-69"><a href="#cb5-69" tabindex="-1"></a></span>
<span id="cb5-70"><a href="#cb5-70" tabindex="-1"></a>            <span class="cf">for</span> (group<span class="op">.</span>codes) <span class="op">|</span>huff<span class="op">|</span> {</span>
<span id="cb5-71"><a href="#cb5-71" tabindex="-1"></a>                <span class="cf">if</span> (huff<span class="op">.</span>code[glen <span class="op">-</span> <span class="dv">1</span>] <span class="op">!=</span> last_mask <span class="op">&amp;</span> last)</span>
<span id="cb5-72"><a href="#cb5-72" tabindex="-1"></a>                    <span class="cf">continue</span>;</span>
<span id="cb5-73"><a href="#cb5-73" tabindex="-1"></a></span>
<span id="cb5-74"><a href="#cb5-74" tabindex="-1"></a>                <span class="cf">if</span> (mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> huff<span class="op">.</span>code[<span class="dv">0</span> <span class="op">..</span> glen <span class="op">-</span> <span class="dv">1</span>]<span class="op">,</span> c[<span class="dv">0</span> <span class="op">..</span> glen <span class="op">-</span> <span class="dv">1</span>]))</span>
<span id="cb5-75"><a href="#cb5-75" tabindex="-1"></a>                    <span class="cf">break</span> <span class="op">:</span>decode huff;</span>
<span id="cb5-76"><a href="#cb5-76" tabindex="-1"></a>            }</span>
<span id="cb5-77"><a href="#cb5-77" tabindex="-1"></a>        } <span class="cf">else</span> {</span>
<span id="cb5-78"><a href="#cb5-78" tabindex="-1"></a>            <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>HuffNoMatch;</span>
<span id="cb5-79"><a href="#cb5-79" tabindex="-1"></a>        };</span>
<span id="cb5-80"><a href="#cb5-80" tabindex="-1"></a></span>
<span id="cb5-81"><a href="#cb5-81" tabindex="-1"></a>        j <span class="op">+=</span> dehuff<span class="op">.</span>len;</span>
<span id="cb5-82"><a href="#cb5-82" tabindex="-1"></a>        i <span class="op">=</span> <span class="bu">@truncate</span>(<span class="dt">u16</span><span class="op">,</span> j <span class="op">/</span> <span class="dv">8</span>);</span>
<span id="cb5-83"><a href="#cb5-83" tabindex="-1"></a></span>
<span id="cb5-84"><a href="#cb5-84" tabindex="-1"></a>        to<span class="op">.*</span>[k] <span class="op">=</span> dehuff<span class="op">.</span>sym;</span>
<span id="cb5-85"><a href="#cb5-85" tabindex="-1"></a>        k <span class="op">+=</span> <span class="dv">1</span>;</span>
<span id="cb5-86"><a href="#cb5-86" tabindex="-1"></a>    }</span>
<span id="cb5-87"><a href="#cb5-87" tabindex="-1"></a></span>
<span id="cb5-88"><a href="#cb5-88" tabindex="-1"></a>    <span class="at">const</span> ret <span class="op">=</span> to<span class="op">.*</span>[<span class="dv">0</span><span class="er">.</span><span class="op">.</span>k];</span>
<span id="cb5-89"><a href="#cb5-89" tabindex="-1"></a>    to<span class="op">.*</span> <span class="op">=</span> to<span class="op">.*</span>[k<span class="op">..</span>];</span>
<span id="cb5-90"><a href="#cb5-90" tabindex="-1"></a></span>
<span id="cb5-91"><a href="#cb5-91" tabindex="-1"></a>    <span class="cf">return</span> ret;</span>
<span id="cb5-92"><a href="#cb5-92" tabindex="-1"></a>}</span></code></pre></div>
            <p>The first few lines of the function handle the case where
            the string is not Huffman encoded. HPACK is actually
            relatively simple in the sense that the Huffman encoding is
            static. So I didn’t have to spend long reading my algorithms
            textbook and could put it back under the table leg, thus
            restoring stability.</p>
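            <p>To make the alignment problem concrete, here is a
            deliberately incomplete sketch that only knows the ten
            symbols whose code in appendix B of RFC 7541 is 5 bits long.
            It reads one bit at a time for clarity, which is not how
            <code>decodeStr</code> works, and every name in it is made
            up for illustration.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// Symbols with 5-bit codes in RFC 7541 appendix B. The code value is the
// index into this string, e.g. &#39;t&#39; is 0b01001.
const five_bit_syms = &quot;012aceiost&quot;;

const bit_masks = [8]u8{ 0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01 };

// True if bit i of src is set, counting from the most significant bit
// of src[0].
fn bitSet(src: []const u8, i: usize) bool {
    return (src[i / 8] &amp; bit_masks[i % 8]) != 0;
}

fn allOnes(src: []const u8, from: usize, to: usize) bool {
    var i = from;
    while (i &lt; to) : (i += 1) {
        if (!bitSet(src, i)) return false;
    }
    return true;
}

/// Hypothetical decoder limited to the 5-bit codes above.
fn decodeFiveBitOnly(src: []const u8, dst: []u8) ![]u8 {
    const total = src.len * 8;
    var pos: usize = 0;
    var out: usize = 0;

    while (pos &lt; total) {
        const left = total - pos;
        // Fewer than 8 trailing bits, all set, is valid padding.
        if (left &lt; 8 and allOnes(src, pos, total)) break;
        if (left &lt; 5) return error.InvalidPadding;

        // Collect 5 bits, which may straddle a byte boundary.
        var code: u8 = 0;
        var n: usize = 0;
        while (n &lt; 5) : (n += 1) {
            code *= 2;
            if (bitSet(src, pos + n)) code += 1;
        }
        // Values of 10 and above start longer codes we do not handle.
        if (code &gt;= five_bit_syms.len) return error.UnsupportedCode;
        if (out &gt;= dst.len) return error.NoSpaceLeft;

        dst[out] = five_bit_syms[code];
        out += 1;
        pos += 5;
    }
    return dst[0..out];
}

test &quot;three 5-bit codes plus one padding bit&quot; {
    var buf: [8]u8 = undefined;
    // 01001 00101 00011 1 is &#39;t&#39;, &#39;e&#39;, &#39;a&#39; and a single padding bit.
    const got = try decodeFiveBitOnly(&amp;[_]u8{ 0x49, 0x47 }, &amp;buf);
    try std.testing.expect(std.mem.eql(u8, &quot;tea&quot;, got));
}</code></pre></div>
            <p>The second symbol already crosses from the first byte
            into the second, which is the problem the shifting window in
            <code>decodeStr</code> deals with for every code length.</p>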
            <p>There is a big static table in <a
            href="https://github.com/richiejp/barely-http2/blob/main/src/huff.zig#L34"><code>src/huff.zig</code></a>
            which I sort at <code>comptime</code>. From a Zig point of
            view, <code>decodeStr</code> makes good use of loop labels
            and labelled <code>break</code> statements. It also uses the
            <code>@truncate</code> builtin a lot. Possibly this is wrong
            in some cases and it should be <code>@intCast</code> or
            something, the latter having a safety check I believe.</p>
            <p>Zig prevents some bit banging techniques common to C
            because they result in undefined behaviour. The end result
            is lots of uses of builtins instead of plain shifts and
            masking. Of course you can still use the wrong builtins.</p>
            <p>Finally, for HPACK, let’s look at the
            <code>Decoder</code> struct which provides an iterator
            interface for incoming headers. I’ll show how it’s used
            first:</p>
            <div class="sourceCode" id="cb6"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb6-1"><a href="#cb6-1" tabindex="-1"></a><span class="kw">fn</span> serveFiles(h2c<span class="op">:</span> <span class="op">*</span>http2<span class="op">.</span>NetConnection<span class="op">,</span> dir<span class="op">:</span> fs<span class="op">.</span>Dir) <span class="op">!</span><span class="dt">void</span> {</span>
<span id="cb6-2"><a href="#cb6-2" tabindex="-1"></a></span>
<span id="cb6-3"><a href="#cb6-3" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb6-4"><a href="#cb6-4" tabindex="-1"></a></span>
<span id="cb6-5"><a href="#cb6-5" tabindex="-1"></a>                <span class="at">var</span> path_buf<span class="op">:</span> [fs<span class="op">.</span>MAX_PATH_BYTES]<span class="dt">u8</span> <span class="op">=</span> <span class="cn">undefined</span>;</span>
<span id="cb6-6"><a href="#cb6-6" tabindex="-1"></a>                <span class="at">var</span> path<span class="op">:</span> <span class="op">?</span>[]<span class="at">const</span> <span class="dt">u8</span> <span class="op">=</span> <span class="cn">null</span>;</span>
<span id="cb6-7"><a href="#cb6-7" tabindex="-1"></a></span>
<span id="cb6-8"><a href="#cb6-8" tabindex="-1"></a>                <span class="cf">while</span> (headers<span class="op">.</span>next()) <span class="op">|</span>h<span class="op">|</span> {</span>
<span id="cb6-9"><a href="#cb6-9" tabindex="-1"></a>                    std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;    {s} =&gt; {s}&quot;</span><span class="op">,</span> h);</span>
<span id="cb6-10"><a href="#cb6-10" tabindex="-1"></a></span>
<span id="cb6-11"><a href="#cb6-11" tabindex="-1"></a>                    <span class="cf">if</span> (mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> <span class="st">&quot;:path&quot;</span><span class="op">,</span> h<span class="op">.</span>name) <span class="kw">and</span> h<span class="op">.</span>value<span class="op">.</span>len <span class="op">&lt;=</span> path_buf<span class="op">.</span>len) {</span>
<span id="cb6-12"><a href="#cb6-12" tabindex="-1"></a>                        mem<span class="op">.</span>copy(<span class="dt">u8</span><span class="op">,</span> <span class="op">&amp;</span>path_buf<span class="op">,</span> h<span class="op">.</span>value);</span>
<span id="cb6-13"><a href="#cb6-13" tabindex="-1"></a>                        path <span class="op">=</span> path_buf[<span class="dv">0</span><span class="er">.</span><span class="op">.</span>h<span class="op">.</span>value<span class="op">.</span>len];</span>
<span id="cb6-14"><a href="#cb6-14" tabindex="-1"></a>                    }</span>
<span id="cb6-15"><a href="#cb6-15" tabindex="-1"></a>                } <span class="cf">else</span> <span class="op">|</span>err<span class="op">|</span> {</span>
<span id="cb6-16"><a href="#cb6-16" tabindex="-1"></a>                    <span class="cf">if</span> (err <span class="op">!=</span> <span class="kw">error</span><span class="op">.</span>EndOfData) <span class="cf">return</span> err;</span>
<span id="cb6-17"><a href="#cb6-17" tabindex="-1"></a>                }</span>
<span id="cb6-18"><a href="#cb6-18" tabindex="-1"></a></span>
<span id="cb6-19"><a href="#cb6-19" tabindex="-1"></a>                <span class="cf">if</span> (path) <span class="op">|</span>p<span class="op">|</span></span>
<span id="cb6-20"><a href="#cb6-20" tabindex="-1"></a>                    <span class="cf">try</span> sendFile(h2c<span class="op">,</span> dir<span class="op">,</span> frame<span class="op">.</span>hdr<span class="op">.</span>id<span class="op">,</span> p)</span>
<span id="cb6-21"><a href="#cb6-21" tabindex="-1"></a>                <span class="cf">else</span></span>
<span id="cb6-22"><a href="#cb6-22" tabindex="-1"></a>                    <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>DidntFindThePathHeader;</span>
<span id="cb6-23"><a href="#cb6-23" tabindex="-1"></a>    <span class="op">...</span></span></code></pre></div>
            <p>Above we can see that <code>headers.next()</code> is
            called and if we find the path header we copy it. Most HTTP
            implementations I have seen will allocate strings for every
            header and pass them up to a higher-level framework. We
            could do that too, but I think it’s important to keep the
            option of just skipping over stuff we don’t care about.</p>
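            <p>The loop above relies on Zig’s <code>while</code> with an
            error union: the body runs for each successful value and the
            <code>else</code> branch receives the error that ends the
            iteration. A minimal, self-contained sketch of the same
            pattern, using a made-up <code>Counter</code> iterator in
            place of the real <code>Decoder</code>:</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// Hypothetical iterator: counts down and signals the end with an error,
// in the same style as the EndOfData check in serveFiles.
const Counter = struct {
    left: u8,

    fn next(self: *Counter) error{EndOfData}!u8 {
        if (self.left == 0) return error.EndOfData;
        self.left -= 1;
        return self.left;
    }
};

test &quot;while over an error union&quot; {
    var it = Counter{ .left = 3 };
    var sum: u32 = 0;

    while (it.next()) |v| {
        sum += v; // sees 2, 1, 0
    } else |err| {
        // EndOfData is the normal way out; anything else propagates.
        if (err != error.EndOfData) return err;
    }

    try std.testing.expect(sum == 3);
}</code></pre></div>
            <p>Nothing is allocated per item, and the caller decides
            which values to keep.</p>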
            <p>Also, some of the headers we do care about could be dealt
            with during parsing without performing a copy. Below is the
            implementation, which corresponds to <a
            href="https://www.rfc-editor.org/rfc/rfc7541#section-6.2">this
            part of the spec</a>.</p>
            <div class="sourceCode" id="cb7"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb7-1"><a href="#cb7-1" tabindex="-1"></a><span class="co">/// An iterator that takes I/O buffers, a decoding table and returns</span></span>
<span id="cb7-2"><a href="#cb7-2" tabindex="-1"></a><span class="co">/// header entries. The table is mutated so you can&#39;t run the iterator</span></span>
<span id="cb7-3"><a href="#cb7-3" tabindex="-1"></a><span class="co">/// twice.</span></span>
<span id="cb7-4"><a href="#cb7-4" tabindex="-1"></a><span class="kw">pub</span> <span class="at">const</span> Decoder <span class="op">=</span> <span class="kw">struct</span> {</span>
<span id="cb7-5"><a href="#cb7-5" tabindex="-1"></a>    <span class="co">/// buffer containing the encoded data</span></span>
<span id="cb7-6"><a href="#cb7-6" tabindex="-1"></a>    from<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span></span>
<span id="cb7-7"><a href="#cb7-7" tabindex="-1"></a>    <span class="co">/// A scratch buffer for decoded headers</span></span>
<span id="cb7-8"><a href="#cb7-8" tabindex="-1"></a>    to<span class="op">:</span> []<span class="dt">u8</span><span class="op">,</span></span>
<span id="cb7-9"><a href="#cb7-9" tabindex="-1"></a>    table<span class="op">:</span> hdrIndx<span class="op">.</span>Table <span class="op">=</span> hdrIndx<span class="op">.</span>Table{}<span class="op">,</span></span>
<span id="cb7-10"><a href="#cb7-10" tabindex="-1"></a></span>
<span id="cb7-11"><a href="#cb7-11" tabindex="-1"></a>    <span class="kw">pub</span> <span class="kw">fn</span> init(from<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> to<span class="op">:</span> []<span class="dt">u8</span>) Decoder {</span>
<span id="cb7-12"><a href="#cb7-12" tabindex="-1"></a>        <span class="cf">return</span> <span class="op">.</span>{</span>
<span id="cb7-13"><a href="#cb7-13" tabindex="-1"></a>            <span class="op">.</span>from <span class="op">=</span> from<span class="op">,</span></span>
<span id="cb7-14"><a href="#cb7-14" tabindex="-1"></a>            <span class="op">.</span>to <span class="op">=</span> to<span class="op">,</span></span>
<span id="cb7-15"><a href="#cb7-15" tabindex="-1"></a>        };</span>
<span id="cb7-16"><a href="#cb7-16" tabindex="-1"></a>    }</span>
<span id="cb7-17"><a href="#cb7-17" tabindex="-1"></a></span>
<span id="cb7-18"><a href="#cb7-18" tabindex="-1"></a>    <span class="kw">pub</span> <span class="kw">fn</span> newData(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Decoder<span class="op">,</span> from<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> to<span class="op">:</span> []<span class="dt">u8</span>) <span class="dt">void</span> {</span>
<span id="cb7-19"><a href="#cb7-19" tabindex="-1"></a>        <span class="va">self</span><span class="op">.</span>from <span class="op">=</span> from;</span>
<span id="cb7-20"><a href="#cb7-20" tabindex="-1"></a>        <span class="va">self</span><span class="op">.</span>to <span class="op">=</span> to;</span>
<span id="cb7-21"><a href="#cb7-21" tabindex="-1"></a>    }</span>
<span id="cb7-22"><a href="#cb7-22" tabindex="-1"></a></span>
<span id="cb7-23"><a href="#cb7-23" tabindex="-1"></a>    <span class="co">/// Get the next header. The contents of the header may be</span></span>
<span id="cb7-24"><a href="#cb7-24" tabindex="-1"></a>    <span class="co">/// borrowed from the scratch buffer or the table&#39;s buffer.</span></span>
<span id="cb7-25"><a href="#cb7-25" tabindex="-1"></a>    <span class="kw">pub</span> <span class="kw">fn</span> next(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Decoder) <span class="op">!</span>HdrConst {</span>
<span id="cb7-26"><a href="#cb7-26" tabindex="-1"></a>        <span class="cf">if</span> (<span class="va">self</span><span class="op">.</span>from<span class="op">.</span>len <span class="op">&lt;</span> <span class="dv">1</span>)</span>
<span id="cb7-27"><a href="#cb7-27" tabindex="-1"></a>            <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>EndOfData;</span>
<span id="cb7-28"><a href="#cb7-28" tabindex="-1"></a></span>
<span id="cb7-29"><a href="#cb7-29" tabindex="-1"></a>        <span class="at">const</span> table <span class="op">=</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>table;</span>
<span id="cb7-30"><a href="#cb7-30" tabindex="-1"></a>        <span class="at">const</span> tag <span class="op">=</span> <span class="va">self</span><span class="op">.</span>from[<span class="dv">0</span>];</span>
<span id="cb7-31"><a href="#cb7-31" tabindex="-1"></a>        <span class="at">const</span> repr <span class="op">=</span> <span class="cf">try</span> HdrFieldRepr<span class="op">.</span>from(tag);</span>
<span id="cb7-32"><a href="#cb7-32" tabindex="-1"></a></span>
<span id="cb7-33"><a href="#cb7-33" tabindex="-1"></a>        <span class="cf">switch</span> (repr) {</span>
<span id="cb7-34"><a href="#cb7-34" tabindex="-1"></a>            <span class="op">.</span>indexed <span class="op">=&gt;</span> {</span>
<span id="cb7-35"><a href="#cb7-35" tabindex="-1"></a>                <span class="at">const</span> i <span class="op">=</span> <span class="cf">try</span> decodeInt(<span class="dt">u8</span><span class="op">,</span> <span class="dv">7</span><span class="op">,</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>from);</span>
<span id="cb7-36"><a href="#cb7-36" tabindex="-1"></a>                <span class="cf">return</span> table<span class="op">.</span>get(i);</span>
<span id="cb7-37"><a href="#cb7-37" tabindex="-1"></a>            }<span class="op">,</span></span>
<span id="cb7-38"><a href="#cb7-38" tabindex="-1"></a>            <span class="op">.</span>indexedNameAddValue <span class="op">=&gt;</span> {</span>
<span id="cb7-39"><a href="#cb7-39" tabindex="-1"></a>                <span class="at">const</span> i <span class="op">=</span> <span class="cf">try</span> decodeInt(<span class="dt">u8</span><span class="op">,</span> <span class="dv">6</span><span class="op">,</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>from);</span>
<span id="cb7-40"><a href="#cb7-40" tabindex="-1"></a>                <span class="at">const</span> ihdr <span class="op">=</span> <span class="cf">try</span> table<span class="op">.</span>get(i);</span>
<span id="cb7-41"><a href="#cb7-41" tabindex="-1"></a>                <span class="at">const</span> hdr <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb7-42"><a href="#cb7-42" tabindex="-1"></a>                    <span class="op">.</span>name <span class="op">=</span> ihdr<span class="op">.</span>name<span class="op">,</span></span>
<span id="cb7-43"><a href="#cb7-43" tabindex="-1"></a>                    <span class="op">.</span>value <span class="op">=</span> <span class="cf">try</span> decodeStr(<span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>from<span class="op">,</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>to)<span class="op">,</span></span>
<span id="cb7-44"><a href="#cb7-44" tabindex="-1"></a>                };</span>
<span id="cb7-45"><a href="#cb7-45" tabindex="-1"></a></span>
<span id="cb7-46"><a href="#cb7-46" tabindex="-1"></a>                <span class="cf">try</span> table<span class="op">.</span>add(hdr<span class="op">.</span>name<span class="op">,</span> hdr<span class="op">.</span>value);</span>
<span id="cb7-47"><a href="#cb7-47" tabindex="-1"></a></span>
<span id="cb7-48"><a href="#cb7-48" tabindex="-1"></a>                <span class="cf">return</span> hdr;</span>
<span id="cb7-49"><a href="#cb7-49" tabindex="-1"></a>            }<span class="op">,</span></span>
<span id="cb7-50"><a href="#cb7-50" tabindex="-1"></a>            <span class="op">.</span>indexedNameLitValue <span class="op">=&gt;</span> {</span>
<span id="cb7-51"><a href="#cb7-51" tabindex="-1"></a>                <span class="at">const</span> i <span class="op">=</span> <span class="cf">try</span> decodeInt(<span class="dt">u8</span><span class="op">,</span> <span class="dv">4</span><span class="op">,</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>from);</span>
<span id="cb7-52"><a href="#cb7-52" tabindex="-1"></a>                <span class="at">const</span> ihdr <span class="op">=</span> <span class="cf">try</span> table<span class="op">.</span>get(i);</span>
<span id="cb7-53"><a href="#cb7-53" tabindex="-1"></a>                <span class="at">const</span> hdr <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb7-54"><a href="#cb7-54" tabindex="-1"></a>                    <span class="op">.</span>name <span class="op">=</span> ihdr<span class="op">.</span>name<span class="op">,</span></span>
<span id="cb7-55"><a href="#cb7-55" tabindex="-1"></a>                    <span class="op">.</span>value <span class="op">=</span> <span class="cf">try</span> decodeStr(<span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>from<span class="op">,</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>to)<span class="op">,</span></span>
<span id="cb7-56"><a href="#cb7-56" tabindex="-1"></a>                };</span>
<span id="cb7-57"><a href="#cb7-57" tabindex="-1"></a></span>
<span id="cb7-58"><a href="#cb7-58" tabindex="-1"></a>                <span class="cf">return</span> hdr;</span>
<span id="cb7-59"><a href="#cb7-59" tabindex="-1"></a>            }<span class="op">,</span></span>
<span id="cb7-60"><a href="#cb7-60" tabindex="-1"></a>            <span class="op">.</span>addNameAddValue<span class="op">,</span> <span class="op">.</span>litNameLitValue <span class="op">=&gt;</span> {</span>
<span id="cb7-61"><a href="#cb7-61" tabindex="-1"></a>                <span class="va">self</span><span class="op">.</span>from <span class="op">=</span> <span class="va">self</span><span class="op">.</span>from[<span class="dv">1</span><span class="er">.</span><span class="op">.</span>];</span>
<span id="cb7-62"><a href="#cb7-62" tabindex="-1"></a></span>
<span id="cb7-63"><a href="#cb7-63" tabindex="-1"></a>                <span class="at">const</span> hdr<span class="op">:</span> HdrConst <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb7-64"><a href="#cb7-64" tabindex="-1"></a>                    <span class="op">.</span>name <span class="op">=</span> <span class="cf">try</span> decodeStr(<span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>from<span class="op">,</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>to)<span class="op">,</span></span>
<span id="cb7-65"><a href="#cb7-65" tabindex="-1"></a>                    <span class="op">.</span>value <span class="op">=</span> <span class="cf">try</span> decodeStr(<span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>from<span class="op">,</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>to)<span class="op">,</span></span>
<span id="cb7-66"><a href="#cb7-66" tabindex="-1"></a>                };</span>
<span id="cb7-67"><a href="#cb7-67" tabindex="-1"></a></span>
<span id="cb7-68"><a href="#cb7-68" tabindex="-1"></a>                <span class="cf">if</span> (repr <span class="op">==</span> <span class="op">.</span>addNameAddValue)</span>
<span id="cb7-69"><a href="#cb7-69" tabindex="-1"></a>                    <span class="cf">try</span> table<span class="op">.</span>add(hdr<span class="op">.</span>name<span class="op">,</span> hdr<span class="op">.</span>value);</span>
<span id="cb7-70"><a href="#cb7-70" tabindex="-1"></a></span>
<span id="cb7-71"><a href="#cb7-71" tabindex="-1"></a>                <span class="cf">return</span> hdr;</span>
<span id="cb7-72"><a href="#cb7-72" tabindex="-1"></a>            }<span class="op">,</span></span>
<span id="cb7-73"><a href="#cb7-73" tabindex="-1"></a>        }</span>
<span id="cb7-74"><a href="#cb7-74" tabindex="-1"></a>    }</span>
<span id="cb7-75"><a href="#cb7-75" tabindex="-1"></a>};</span></code></pre></div>
            <p>Couldn’t they have just made an HTTP/1.2 that encoded the
            headers in <a
            href="https://redis.io/docs/reference/protocol-spec/#resp-bulk-strings">RESP
            strings</a> or something?</p>
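            <p>To give a rough idea of what the prefix sizes passed to
            <code>decodeInt</code> mean, below is a minimal sketch of the
            integer decoding described in <a
            href="https://www.rfc-editor.org/rfc/rfc7541#section-5.1">section
            5.1 of the spec</a>. The name <code>decodePrefixInt</code> and
            its signature are assumptions made for this example, not the
            helper used above; it takes the largest value the prefix can
            hold (127 for 7 bits, 15 for 4 bits) instead of a bit
            count.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import("std");

/// Hypothetical helper: decode an HPACK prefix integer.
/// `prefix_max` is the largest value that fits in the prefix bits.
/// On success the consumed bytes are dropped from the front of `from`.
fn decodePrefixInt(prefix_max: u8, from: *[]const u8) !u32 {
    const buf = from.*;
    if (buf.len &lt; 1) return error.EndOfData;

    // The low bits of the first byte hold the value, or all ones if
    // the value continues in the following bytes.
    var value: u32 = buf[0] &amp; prefix_max;
    var used: usize = 1;

    if (value == prefix_max) {
        var mult: u32 = 1;
        while (true) {
            if (used &gt;= buf.len) return error.EndOfData;
            const b = buf[used];
            used += 1;

            // Each continuation byte adds 7 bits, lowest group first.
            const part = try std.math.mul(u32, @as(u32, b &amp; 0x7f), mult);
            value = try std.math.add(u32, value, part);

            // A clear top bit marks the last byte.
            if ((b &amp; 0x80) == 0) break;
            mult = try std.math.mul(u32, mult, 128);
        }
    }

    from.* = buf[used..];
    return value;
}

test "1337 with a 5-bit prefix (RFC 7541, appendix C.1.2)" {
    const bytes = [_]u8{ 0x1f, 0x9a, 0x0a, 0xff };
    var rest: []const u8 = bytes[0..];
    const value = try decodePrefixInt(31, &amp;rest);
    try std.testing.expect(value == 1337);
    // Only the three bytes of the integer were consumed.
    try std.testing.expect(rest.len == 1);
}</code></pre></div>
            <p>The overflow checks are there because nothing in the
            encoding itself limits how many continuation bytes a peer can
            send.</p>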
            <h1 id="frames">Frames</h1>
            <p>An HTTP/2 connection is made up of a series of frames with
            various types of payload. The <a
            href="https://www.rfc-editor.org/rfc/rfc7540#section-4.1">frame
            header</a> is always the same and not too complicated.
            Initially I tried decoding the header by declaring a packed
            struct for it and doing a <code>@ptrCast</code> or
            <code>@bitCast</code>.</p>
            <p>If it worked, it would look something like this:</p>
            <div class="sourceCode" id="cb8"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb8-1"><a href="#cb8-1" tabindex="-1"></a><span class="at">const</span> HdrData <span class="op">=</span> <span class="kw">packed</span> <span class="kw">struct</span> {</span>
<span id="cb8-2"><a href="#cb8-2" tabindex="-1"></a>    length<span class="op">:</span> u24<span class="op">,</span></span>
<span id="cb8-3"><a href="#cb8-3" tabindex="-1"></a>    <span class="dt">type</span><span class="op">:</span> FrameType<span class="op">,</span> <span class="co">// or u8</span></span>
<span id="cb8-4"><a href="#cb8-4" tabindex="-1"></a>    flags<span class="op">:</span> FrameFlags<span class="op">,</span></span>
<span id="cb8-5"><a href="#cb8-5" tabindex="-1"></a>    r<span class="op">:</span> <span class="dt">bool</span><span class="op">,</span></span>
<span id="cb8-6"><a href="#cb8-6" tabindex="-1"></a>    id<span class="op">:</span> u31<span class="op">,</span></span>
<span id="cb8-7"><a href="#cb8-7" tabindex="-1"></a>};</span>
<span id="cb8-8"><a href="#cb8-8" tabindex="-1"></a></span>
<span id="cb8-9"><a href="#cb8-9" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb8-10"><a href="#cb8-10" tabindex="-1"></a>    <span class="at">var</span> hdr <span class="op">=</span> <span class="bu">@ptrCast</span>(HdrData<span class="op">,</span> buf[n<span class="op">..</span>n<span class="op">+</span><span class="dv">9</span>]);</span>
<span id="cb8-11"><a href="#cb8-11" tabindex="-1"></a>    <span class="co">// Then switch the endianness</span></span></code></pre></div>
            <p>The major issue is that the endianness needs switching
            from network byte order to native for length and perhaps id.
            There are functions to help with reading in integers of
            different endianness. However, it didn’t seem worth combining
            those with a pointer cast.</p>
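            <p>To illustrate what the byte order fix-up involves, here is
            a minimal sketch that assembles the big-endian fields by
            shifting bytes into place, which gives the same result on any
            host. The helper names <code>readU24Be</code> and
            <code>readStreamId</code> are made up for this example, the
            stream ID is returned as a plain <code>u32</code> to keep the
            sketch free of casts, and the sample bytes are an assumed
            HEADERS frame header.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import("std");

/// Read a 24-bit integer stored in network byte order (big-endian).
fn readU24Be(b: *const [3]u8) u24 {
    return (@as(u24, b[0]) &lt;&lt; 16) | (@as(u24, b[1]) &lt;&lt; 8) | @as(u24, b[2]);
}

/// Read the stream ID, masking off the reserved top bit.
fn readStreamId(b: *const [4]u8) u32 {
    const raw = (@as(u32, b[0]) &lt;&lt; 24) | (@as(u32, b[1]) &lt;&lt; 16) |
        (@as(u32, b[2]) &lt;&lt; 8) | @as(u32, b[3]);
    return raw &amp; 0x7fffffff;
}

test "frame header fields are big-endian" {
    // Assumed sample: length 16, type 0x1, flags 0x4, stream 1.
    const hdr = [9]u8{ 0x00, 0x00, 0x10, 0x01, 0x04, 0x00, 0x00, 0x00, 0x01 };
    try std.testing.expect(readU24Be(hdr[0..3]) == 16);
    try std.testing.expect(readStreamId(hdr[5..9]) == 1);
}</code></pre></div>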
            <p>Then there is <code>FrameFlags</code>, which is
            convenient to define as a tagged union for printing. To my
            knowledge, Zig doesn’t define where the tag is stored or allow
            you to specify it. We can’t tell it that a packed union has its
            tag immediately before the union field content.</p>
            <p>So I feel this code could be cleaner, but also that I am
            fussing over minor details. Below is how a frame header is
            decoded and encoded.</p>
            <div class="sourceCode" id="cb9"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb9-1"><a href="#cb9-1" tabindex="-1"></a><span class="co">/// All HTTP/2 traffic is made up of frames with a fixed sized header</span></span>
<span id="cb9-2"><a href="#cb9-2" tabindex="-1"></a><span class="co">/// of the same format. Each frame specifies its type and payload</span></span>
<span id="cb9-3"><a href="#cb9-3" tabindex="-1"></a><span class="co">/// length. On an abstract level this makes parsing HTTP/2 traffic</span></span>
<span id="cb9-4"><a href="#cb9-4" tabindex="-1"></a><span class="co">/// easy.</span></span>
<span id="cb9-5"><a href="#cb9-5" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb9-6"><a href="#cb9-6" tabindex="-1"></a><span class="co">/// The only complication in Zig being the interaction between</span></span>
<span id="cb9-7"><a href="#cb9-7" tabindex="-1"></a><span class="co">/// endianness and packed u24. Otherwise we could declare the flags as</span></span>
<span id="cb9-8"><a href="#cb9-8" tabindex="-1"></a><span class="co">/// u8, mark the struct as packed then do a single @ptrCast. I tried</span></span>
<span id="cb9-9"><a href="#cb9-9" tabindex="-1"></a><span class="co">/// something like this, but got in a mess and settled on the below.</span></span>
<span id="cb9-10"><a href="#cb9-10" tabindex="-1"></a><span class="kw">pub</span> <span class="at">const</span> FrameHdr <span class="op">=</span> <span class="kw">struct</span> {</span>
<span id="cb9-11"><a href="#cb9-11" tabindex="-1"></a>    <span class="co">/// Length of the frame&#39;s payload</span></span>
<span id="cb9-12"><a href="#cb9-12" tabindex="-1"></a>    length<span class="op">:</span> u24<span class="op">,</span></span>
<span id="cb9-13"><a href="#cb9-13" tabindex="-1"></a>    <span class="co">/// How we should interpret everything that follows</span></span>
<span id="cb9-14"><a href="#cb9-14" tabindex="-1"></a>    <span class="dt">type</span><span class="op">:</span> FrameType<span class="op">,</span></span>
<span id="cb9-15"><a href="#cb9-15" tabindex="-1"></a>    flags<span class="op">:</span> FrameFlags<span class="op">,</span></span>
<span id="cb9-16"><a href="#cb9-16" tabindex="-1"></a>    <span class="co">/// A reserved bit, which we can set to 1 as an act of rebellion.</span></span>
<span id="cb9-17"><a href="#cb9-17" tabindex="-1"></a>    r<span class="op">:</span> <span class="dt">bool</span> <span class="op">=</span> <span class="cn">false</span><span class="op">,</span></span>
<span id="cb9-18"><a href="#cb9-18" tabindex="-1"></a>    <span class="co">/// The stream ID or zero if this frame applies to the connection.</span></span>
<span id="cb9-19"><a href="#cb9-19" tabindex="-1"></a>    id<span class="op">:</span> u31<span class="op">,</span></span>
<span id="cb9-20"><a href="#cb9-20" tabindex="-1"></a></span>
<span id="cb9-21"><a href="#cb9-21" tabindex="-1"></a>    <span class="kw">pub</span> <span class="kw">fn</span> from(buf<span class="op">:</span> <span class="op">*</span><span class="at">const</span> [<span class="dv">9</span>]<span class="dt">u8</span>) FrameHdr {</span>
<span id="cb9-22"><a href="#cb9-22" tabindex="-1"></a>        <span class="at">const</span> ftype <span class="op">=</span> <span class="bu">@intToEnum</span>(FrameType<span class="op">,</span> buf[<span class="dv">3</span>]);</span>
<span id="cb9-23"><a href="#cb9-23" tabindex="-1"></a></span>
<span id="cb9-24"><a href="#cb9-24" tabindex="-1"></a>        <span class="cf">return</span> <span class="op">.</span>{</span>
<span id="cb9-25"><a href="#cb9-25" tabindex="-1"></a>            <span class="op">.</span>length <span class="op">=</span> mem<span class="op">.</span>readIntBig(u24<span class="op">,</span> buf[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="dv">3</span>])<span class="op">,</span></span>
<span id="cb9-26"><a href="#cb9-26" tabindex="-1"></a>            <span class="op">.</span><span class="dt">type</span> <span class="op">=</span> ftype<span class="op">,</span></span>
<span id="cb9-27"><a href="#cb9-27" tabindex="-1"></a>            <span class="op">.</span>flags <span class="op">=</span> <span class="cf">switch</span> (ftype) {</span>
<span id="cb9-28"><a href="#cb9-28" tabindex="-1"></a>                <span class="op">.</span>headers <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>headers <span class="op">=</span> <span class="bu">@bitCast</span>(HeadersFlags<span class="op">,</span> buf[<span class="dv">4</span>]) }<span class="op">,</span></span>
<span id="cb9-29"><a href="#cb9-29" tabindex="-1"></a>                <span class="op">.</span>settings <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>settings <span class="op">=</span> <span class="bu">@bitCast</span>(SettingsFlags<span class="op">,</span> buf[<span class="dv">4</span>]) }<span class="op">,</span></span>
<span id="cb9-30"><a href="#cb9-30" tabindex="-1"></a>                <span class="op">.</span>windowUpdate <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>unused <span class="op">=</span> buf[<span class="dv">4</span>] }<span class="op">,</span></span>
<span id="cb9-31"><a href="#cb9-31" tabindex="-1"></a>                <span class="cf">else</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>unknown <span class="op">=</span> buf[<span class="dv">4</span>] }<span class="op">,</span></span>
<span id="cb9-32"><a href="#cb9-32" tabindex="-1"></a>            }<span class="op">,</span></span>
<span id="cb9-33"><a href="#cb9-33" tabindex="-1"></a>            <span class="op">.</span>r <span class="op">=</span> <span class="bu">@bitCast</span>(<span class="dt">bool</span><span class="op">,</span> <span class="bu">@truncate</span>(u1<span class="op">,</span> buf[<span class="dv">5</span>] <span class="op">&gt;&gt;</span> <span class="dv">7</span>))<span class="op">,</span></span>
<span id="cb9-34"><a href="#cb9-34" tabindex="-1"></a>            <span class="op">.</span>id <span class="op">=</span> <span class="bu">@intCast</span>(u31<span class="op">,</span> mem<span class="op">.</span>readIntBig(<span class="dt">u32</span><span class="op">,</span> buf[<span class="dv">5</span><span class="er">.</span><span class="op">.</span><span class="dv">9</span>]) <span class="op">&amp;</span> <span class="dv">0</span><span class="er">x7fffffff</span>)<span class="op">,</span></span>
<span id="cb9-35"><a href="#cb9-35" tabindex="-1"></a>        };</span>
<span id="cb9-36"><a href="#cb9-36" tabindex="-1"></a>    }</span>
<span id="cb9-37"><a href="#cb9-37" tabindex="-1"></a></span>
<span id="cb9-38"><a href="#cb9-38" tabindex="-1"></a>    <span class="kw">pub</span> <span class="kw">fn</span> to(<span class="va">self</span><span class="op">:</span> FrameHdr<span class="op">,</span> buf<span class="op">:</span> []<span class="dt">u8</span>) []<span class="at">const</span> <span class="dt">u8</span> {</span>
<span id="cb9-39"><a href="#cb9-39" tabindex="-1"></a>        mem<span class="op">.</span>writeIntBig(u24<span class="op">,</span> buf[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="dv">3</span>]<span class="op">,</span> <span class="va">self</span><span class="op">.</span>length);</span>
<span id="cb9-40"><a href="#cb9-40" tabindex="-1"></a>        buf[<span class="dv">3</span>] <span class="op">=</span> <span class="bu">@enumToInt</span>(<span class="va">self</span><span class="op">.</span><span class="dt">type</span>);</span>
<span id="cb9-41"><a href="#cb9-41" tabindex="-1"></a>        buf[<span class="dv">4</span>] <span class="op">=</span> <span class="cf">switch</span> (<span class="va">self</span><span class="op">.</span>flags) {</span>
<span id="cb9-42"><a href="#cb9-42" tabindex="-1"></a>            <span class="op">.</span>data <span class="op">=&gt;</span> <span class="op">|</span>flags<span class="op">|</span> <span class="bu">@bitCast</span>(<span class="dt">u8</span><span class="op">,</span> flags)<span class="op">,</span></span>
<span id="cb9-43"><a href="#cb9-43" tabindex="-1"></a>            <span class="op">.</span>headers <span class="op">=&gt;</span> <span class="bu">@bitCast</span>(<span class="dt">u8</span><span class="op">,</span> <span class="va">self</span><span class="op">.</span>flags<span class="op">.</span>headers)<span class="op">,</span></span>
<span id="cb9-44"><a href="#cb9-44" tabindex="-1"></a>            <span class="op">.</span>settings <span class="op">=&gt;</span> <span class="bu">@bitCast</span>(<span class="dt">u8</span><span class="op">,</span> <span class="va">self</span><span class="op">.</span>flags<span class="op">.</span>settings)<span class="op">,</span></span>
<span id="cb9-45"><a href="#cb9-45" tabindex="-1"></a>            <span class="cf">else</span> <span class="op">=&gt;</span> <span class="kw">unreachable</span><span class="op">,</span></span>
<span id="cb9-46"><a href="#cb9-46" tabindex="-1"></a>        };</span>
<span id="cb9-47"><a href="#cb9-47" tabindex="-1"></a>        <span class="co">// r is always 0</span></span>
<span id="cb9-48"><a href="#cb9-48" tabindex="-1"></a>        mem<span class="op">.</span>writeIntBig(<span class="dt">u32</span><span class="op">,</span> buf[<span class="dv">5</span><span class="er">.</span><span class="op">.</span><span class="dv">9</span>]<span class="op">,</span> <span class="bu">@intCast</span>(<span class="dt">u32</span><span class="op">,</span> <span class="va">self</span><span class="op">.</span>id));</span>
<span id="cb9-49"><a href="#cb9-49" tabindex="-1"></a></span>
<span id="cb9-50"><a href="#cb9-50" tabindex="-1"></a>        <span class="cf">return</span> buf[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="dv">9</span>];</span>
<span id="cb9-51"><a href="#cb9-51" tabindex="-1"></a>    }</span>
<span id="cb9-52"><a href="#cb9-52" tabindex="-1"></a>};</span></code></pre></div>
            <p>The frame types are listed below. The trailing
            <code>_</code> makes the enum non-exhaustive, so type codes
            we don’t know about can still be represented.</p>
            <div class="sourceCode" id="cb10"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb10-1"><a href="#cb10-1" tabindex="-1"></a><span class="at">const</span> FrameType <span class="op">=</span> <span class="kw">enum</span>(<span class="dt">u8</span>) {</span>
<span id="cb10-2"><a href="#cb10-2" tabindex="-1"></a>    data<span class="op">,</span></span>
<span id="cb10-3"><a href="#cb10-3" tabindex="-1"></a>    headers<span class="op">,</span></span>
<span id="cb10-4"><a href="#cb10-4" tabindex="-1"></a>    priority<span class="op">,</span></span>
<span id="cb10-5"><a href="#cb10-5" tabindex="-1"></a>    rstStream<span class="op">,</span></span>
<span id="cb10-6"><a href="#cb10-6" tabindex="-1"></a>    settings<span class="op">,</span></span>
<span id="cb10-7"><a href="#cb10-7" tabindex="-1"></a>    pushPromise<span class="op">,</span></span>
<span id="cb10-8"><a href="#cb10-8" tabindex="-1"></a>    ping<span class="op">,</span></span>
<span id="cb10-9"><a href="#cb10-9" tabindex="-1"></a>    goAway<span class="op">,</span></span>
<span id="cb10-10"><a href="#cb10-10" tabindex="-1"></a>    windowUpdate<span class="op">,</span></span>
<span id="cb10-11"><a href="#cb10-11" tabindex="-1"></a>    continuation<span class="op">,</span></span>
<span id="cb10-12"><a href="#cb10-12" tabindex="-1"></a>    _<span class="op">,</span></span>
<span id="cb10-13"><a href="#cb10-13" tabindex="-1"></a>};</span></code></pre></div>
            <p>These mostly have different payloads. The structure of
            a payload varies depending on which flags are set in
            the frame header. Most are not too complicated in
            terms of layout, though. The exception, of course, is the
            headers.</p>
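            <p>For the simpler frame types the layout is fixed enough
            that the payload length can be sanity checked before looking
            at the payload itself. Below is a minimal sketch of such a
            check. It is not part of the code in this article:
            <code>payloadLenOk</code> is a hypothetical helper, it takes
            the raw type code rather than <code>FrameType</code>, and the
            sizes are taken from section 6 of RFC 7540.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import("std");

/// Hypothetical check: is `len` a plausible payload length for a
/// frame with the given raw type code?
fn payloadLenOk(frame_type: u8, len: u24) bool {
    return switch (frame_type) {
        0x2 =&gt; len == 5, // PRIORITY
        0x3 =&gt; len == 4, // RST_STREAM
        0x4 =&gt; len % 6 == 0, // SETTINGS: 2 byte ID and 4 byte value per entry
        0x6 =&gt; len == 8, // PING
        0x8 =&gt; len == 4, // WINDOW_UPDATE
        else =&gt; true, // variable length or unknown type
    };
}

test "payload length checks" {
    try std.testing.expect(payloadLenOk(0x4, 12));
    try std.testing.expect(!payloadLenOk(0x4, 7));
    try std.testing.expect(payloadLenOk(0x6, 8));
    try std.testing.expect(payloadLenOk(0x0, 100));
}</code></pre></div>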
            <p>Let’s look at the settings payload decoder. It would be
            possible to ignore the settings completely. However, I was
            curious to see what Curl would send.</p>
            <div class="sourceCode" id="cb11"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb11-1"><a href="#cb11-1" tabindex="-1"></a><span class="at">const</span> SettingId <span class="op">=</span> <span class="kw">enum</span>(<span class="dt">u16</span>) {</span>
<span id="cb11-2"><a href="#cb11-2" tabindex="-1"></a>    headerTableSize <span class="op">=</span> <span class="dv">0</span><span class="er">x1</span><span class="op">,</span></span>
<span id="cb11-3"><a href="#cb11-3" tabindex="-1"></a>    enablePush<span class="op">,</span></span>
<span id="cb11-4"><a href="#cb11-4" tabindex="-1"></a>    maxConcurrentStreams<span class="op">,</span></span>
<span id="cb11-5"><a href="#cb11-5" tabindex="-1"></a>    initialWindowSize<span class="op">,</span></span>
<span id="cb11-6"><a href="#cb11-6" tabindex="-1"></a>    maxFrameSize<span class="op">,</span></span>
<span id="cb11-7"><a href="#cb11-7" tabindex="-1"></a>    maxHeaderListSize<span class="op">,</span></span>
<span id="cb11-8"><a href="#cb11-8" tabindex="-1"></a>};</span>
<span id="cb11-9"><a href="#cb11-9" tabindex="-1"></a></span>
<span id="cb11-10"><a href="#cb11-10" tabindex="-1"></a><span class="at">const</span> Setting <span class="op">=</span> <span class="kw">union</span>(SettingId) {</span>
<span id="cb11-11"><a href="#cb11-11" tabindex="-1"></a>    headerTableSize<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb11-12"><a href="#cb11-12" tabindex="-1"></a>    enablePush<span class="op">:</span> <span class="dt">bool</span><span class="op">,</span></span>
<span id="cb11-13"><a href="#cb11-13" tabindex="-1"></a>    maxConcurrentStreams<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb11-14"><a href="#cb11-14" tabindex="-1"></a>    initialWindowSize<span class="op">:</span> u31<span class="op">,</span></span>
<span id="cb11-15"><a href="#cb11-15" tabindex="-1"></a>    maxFrameSize<span class="op">:</span> u24<span class="op">,</span></span>
<span id="cb11-16"><a href="#cb11-16" tabindex="-1"></a>    maxHeaderListSize<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb11-17"><a href="#cb11-17" tabindex="-1"></a>};</span>
<span id="cb11-18"><a href="#cb11-18" tabindex="-1"></a></span>
<span id="cb11-19"><a href="#cb11-19" tabindex="-1"></a><span class="co">/// Settings limit what we can send to the other side. This is</span></span>
<span id="cb11-20"><a href="#cb11-20" tabindex="-1"></a><span class="co">/// essentially an iterator which returns a tagged u32</span></span>
<span id="cb11-21"><a href="#cb11-21" tabindex="-1"></a><span class="at">const</span> SettingsPayload <span class="op">=</span> <span class="kw">struct</span> {</span>
<span id="cb11-22"><a href="#cb11-22" tabindex="-1"></a>    settings<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span></span>
<span id="cb11-23"><a href="#cb11-23" tabindex="-1"></a>    used<span class="op">:</span> <span class="dt">usize</span><span class="op">,</span></span>
<span id="cb11-24"><a href="#cb11-24" tabindex="-1"></a></span>
<span id="cb11-25"><a href="#cb11-25" tabindex="-1"></a>    <span class="kw">pub</span> <span class="kw">fn</span> init(buf<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span>) SettingsPayload {</span>
<span id="cb11-26"><a href="#cb11-26" tabindex="-1"></a>        <span class="cf">return</span> <span class="op">.</span>{ <span class="op">.</span>settings <span class="op">=</span> buf<span class="op">,</span> <span class="op">.</span>used <span class="op">=</span> <span class="dv">0</span> };</span>
<span id="cb11-27"><a href="#cb11-27" tabindex="-1"></a>    }</span>
<span id="cb11-28"><a href="#cb11-28" tabindex="-1"></a></span>
<span id="cb11-29"><a href="#cb11-29" tabindex="-1"></a>    <span class="kw">pub</span> <span class="kw">fn</span> next(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>SettingsPayload) <span class="op">!</span>Setting {</span>
<span id="cb11-30"><a href="#cb11-30" tabindex="-1"></a>        <span class="cf">if</span> (<span class="va">self</span><span class="op">.</span>settings<span class="op">.</span>len <span class="op">-</span> <span class="va">self</span><span class="op">.</span>used <span class="op">==</span> <span class="dv">0</span>)</span>
<span id="cb11-31"><a href="#cb11-31" tabindex="-1"></a>            <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>EndOfData;</span>
<span id="cb11-32"><a href="#cb11-32" tabindex="-1"></a></span>
<span id="cb11-33"><a href="#cb11-33" tabindex="-1"></a>        <span class="cf">if</span> (<span class="va">self</span><span class="op">.</span>settings<span class="op">.</span>len <span class="op">-</span> <span class="va">self</span><span class="op">.</span>used <span class="op">&lt;</span> <span class="dv">6</span>)</span>
<span id="cb11-34"><a href="#cb11-34" tabindex="-1"></a>            <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>UnexpectedEndOfData;</span>
<span id="cb11-35"><a href="#cb11-35" tabindex="-1"></a></span>
<span id="cb11-36"><a href="#cb11-36" tabindex="-1"></a>        <span class="at">const</span> buf <span class="op">=</span> <span class="va">self</span><span class="op">.</span>settings[<span class="va">self</span><span class="op">.</span>used<span class="op">..</span>][<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="dv">6</span>];</span>
<span id="cb11-37"><a href="#cb11-37" tabindex="-1"></a>        <span class="va">self</span><span class="op">.</span>used <span class="op">+=</span> <span class="dv">6</span>;</span>
<span id="cb11-38"><a href="#cb11-38" tabindex="-1"></a></span>
<span id="cb11-39"><a href="#cb11-39" tabindex="-1"></a>        <span class="at">const</span> id <span class="op">=</span> mem<span class="op">.</span>readIntBig(<span class="dt">u16</span><span class="op">,</span> buf[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="dv">2</span>]);</span>
<span id="cb11-40"><a href="#cb11-40" tabindex="-1"></a>        <span class="at">const</span> val <span class="op">=</span> mem<span class="op">.</span>readIntBig(<span class="dt">u32</span><span class="op">,</span> buf[<span class="dv">2</span><span class="er">.</span><span class="op">.</span>]);</span>
<span id="cb11-41"><a href="#cb11-41" tabindex="-1"></a></span>
<span id="cb11-42"><a href="#cb11-42" tabindex="-1"></a>        <span class="cf">return</span> <span class="cf">switch</span> (id) {</span>
<span id="cb11-43"><a href="#cb11-43" tabindex="-1"></a>            <span class="dv">1</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>headerTableSize <span class="op">=</span> val }<span class="op">,</span></span>
<span id="cb11-44"><a href="#cb11-44" tabindex="-1"></a>            <span class="dv">2</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>enablePush <span class="op">=</span> <span class="bu">@bitCast</span>(<span class="dt">bool</span><span class="op">,</span> <span class="bu">@intCast</span>(u1<span class="op">,</span> val)) }<span class="op">,</span></span>
<span id="cb11-45"><a href="#cb11-45" tabindex="-1"></a>            <span class="dv">3</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>maxConcurrentStreams <span class="op">=</span> val }<span class="op">,</span></span>
<span id="cb11-46"><a href="#cb11-46" tabindex="-1"></a>            <span class="dv">4</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>initialWindowSize <span class="op">=</span> <span class="bu">@intCast</span>(u31<span class="op">,</span> val) }<span class="op">,</span></span>
<span id="cb11-47"><a href="#cb11-47" tabindex="-1"></a>            <span class="dv">5</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>maxFrameSize <span class="op">=</span> <span class="bu">@intCast</span>(u24<span class="op">,</span> val) }<span class="op">,</span></span>
<span id="cb11-48"><a href="#cb11-48" tabindex="-1"></a>            <span class="dv">6</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>maxHeaderListSize <span class="op">=</span> val }<span class="op">,</span></span>
<span id="cb11-49"><a href="#cb11-49" tabindex="-1"></a>            <span class="cf">else</span> <span class="op">=&gt;</span> <span class="kw">error</span><span class="op">.</span>NoIdeaWhatThatSettingIs<span class="op">,</span></span>
<span id="cb11-50"><a href="#cb11-50" tabindex="-1"></a>        };</span>
<span id="cb11-51"><a href="#cb11-51" tabindex="-1"></a>    }</span>
<span id="cb11-52"><a href="#cb11-52" tabindex="-1"></a>};</span></code></pre></div>
            <p>It is another iterator. Again it returns a tagged union
            which is nice because we can switch on it and it is
            formatted for printing automatically.</p>
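            <p>To make the switching concrete, here is a minimal sketch
            of draining such an iterator. It is not the real code:
            <code>Setting</code> is cut down to two members,
            <code>SettingsIter</code> and
            <code>error.UnknownSetting</code> are stand-ins made up for
            the example, and the big-endian decoding is done by hand so
            that the sketch does not lean on any particular version of
            the standard library.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

/// Cut-down, hypothetical stand-in for the Setting union.
const Setting = union(enum) {
    headerTableSize: u32,
    enablePush: bool,
};

/// Hypothetical iterator with the same shape as SettingsPayload.
const SettingsIter = struct {
    settings: []const u8,
    used: usize = 0,

    fn next(self: *SettingsIter) !Setting {
        const left = self.settings[self.used..];
        if (left.len == 0) return error.EndOfData;
        if (left.len &lt; 6) return error.UnexpectedEndOfData;
        self.used += 6;

        // Big-endian decode by hand: 2 byte ID, then 4 byte value.
        const id = (@as(u16, left[0]) &lt;&lt; 8) | left[1];
        const val = (@as(u32, left[2]) &lt;&lt; 24) | (@as(u32, left[3]) &lt;&lt; 16) |
            (@as(u32, left[4]) &lt;&lt; 8) | left[5];

        return switch (id) {
            1 =&gt; Setting{ .headerTableSize = val },
            2 =&gt; Setting{ .enablePush = val != 0 },
            else =&gt; error.UnknownSetting,
        };
    }
};

test &quot;drain the iterator and switch on each setting&quot; {
    const payload = [_]u8{
        0x00, 0x01, 0x00, 0x00, 0x10, 0x00, // headerTableSize = 4096
        0x00, 0x02, 0x00, 0x00, 0x00, 0x00, // enablePush = false
    };
    var it = SettingsIter{ .settings = &amp;payload };
    var table_size: u32 = 0;
    var push = true;

    // The else branch receives the error that ended the loop.
    while (it.next()) |setting| {
        // The tagged union can be handed straight to the formatter.
        std.debug.print(&quot;{any}\n&quot;, .{setting});
        switch (setting) {
            .headerTableSize =&gt; |size| table_size = size,
            .enablePush =&gt; |enabled| push = enabled,
        }
    } else |err| {
        // EndOfData is the normal way out; anything else is a failure.
        if (err != error.EndOfData) return err;
    }

    try std.testing.expectEqual(@as(u32, 4096), table_size);
    try std.testing.expect(!push);
}</code></pre></div>
            <p>The <code>else |err|</code> branch is where the loop ends
            up once <code>next</code> returns an error, so
            <code>EndOfData</code> doubles as the normal stop
            condition.</p>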
            <p>By this point I was no longer passing around pointers to
            slices. Instead I had (re)discovered that functions in a Zig
            struct can begin with a <code>self</code> argument. I could
            then just keep a slice and a <code>used</code> field on the
            struct. It occurred to me while writing this that I don’t
            even need <code>used</code>.</p>
            <div class="sourceCode" id="cb12"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb12-1"><a href="#cb12-1" tabindex="-1"></a><span class="at">const</span> SettingsPayload <span class="op">=</span> <span class="kw">struct</span> {</span>
<span id="cb12-2"><a href="#cb12-2" tabindex="-1"></a>    settings<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span></span>
<span id="cb12-3"><a href="#cb12-3" tabindex="-1"></a></span>
<span id="cb12-4"><a href="#cb12-4" tabindex="-1"></a>    <span class="kw">pub</span> <span class="kw">fn</span> init(buf<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span>) SettingsPayload {</span>
<span id="cb12-5"><a href="#cb12-5" tabindex="-1"></a>        <span class="cf">return</span> <span class="op">.</span>{ <span class="op">.</span>settings <span class="op">=</span> buf };</span>
<span id="cb12-6"><a href="#cb12-6" tabindex="-1"></a>    }</span>
<span id="cb12-7"><a href="#cb12-7" tabindex="-1"></a></span>
<span id="cb12-8"><a href="#cb12-8" tabindex="-1"></a>    <span class="kw">pub</span> <span class="kw">fn</span> next(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>SettingsPayload) <span class="op">!</span>Setting {</span>
<span id="cb12-9"><a href="#cb12-9" tabindex="-1"></a>        <span class="cf">if</span> (<span class="va">self</span><span class="op">.</span>settings<span class="op">.</span>len <span class="op">==</span> <span class="dv">0</span>)</span>
<span id="cb12-10"><a href="#cb12-10" tabindex="-1"></a>            <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>EndOfData;</span>
<span id="cb12-11"><a href="#cb12-11" tabindex="-1"></a></span>
<span id="cb12-12"><a href="#cb12-12" tabindex="-1"></a>        <span class="cf">if</span> (<span class="va">self</span><span class="op">.</span>settings<span class="op">.</span>len <span class="op">&lt;</span> <span class="dv">6</span>)</span>
<span id="cb12-13"><a href="#cb12-13" tabindex="-1"></a>            <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>UnexpectedEndOfData;</span>
<span id="cb12-14"><a href="#cb12-14" tabindex="-1"></a></span>
<span id="cb12-15"><a href="#cb12-15" tabindex="-1"></a>        <span class="at">const</span> buf <span class="op">=</span> <span class="va">self</span><span class="op">.</span>settings[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="dv">6</span>];</span>
<span id="cb12-16"><a href="#cb12-16" tabindex="-1"></a></span>
<span id="cb12-17"><a href="#cb12-17" tabindex="-1"></a>        <span class="va">self</span><span class="op">.</span>settings <span class="op">=</span> <span class="va">self</span><span class="op">.</span>settings[<span class="dv">6</span><span class="er">.</span><span class="op">.</span>];</span>
<span id="cb12-18"><a href="#cb12-18" tabindex="-1"></a></span>
<span id="cb12-19"><a href="#cb12-19" tabindex="-1"></a>        <span class="at">const</span> id <span class="op">=</span> mem<span class="op">.</span>readIntBig(<span class="dt">u16</span><span class="op">,</span> buf[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="dv">2</span>]);</span>
<span id="cb12-20"><a href="#cb12-20" tabindex="-1"></a>        <span class="at">const</span> val <span class="op">=</span> mem<span class="op">.</span>readIntBig(<span class="dt">u32</span><span class="op">,</span> buf[<span class="dv">2</span><span class="er">.</span><span class="op">.</span>]);</span>
<span id="cb12-21"><a href="#cb12-21" tabindex="-1"></a></span>
<span id="cb12-22"><a href="#cb12-22" tabindex="-1"></a>        <span class="cf">return</span> <span class="cf">switch</span> (id) {</span>
<span id="cb12-23"><a href="#cb12-23" tabindex="-1"></a>            <span class="dv">1</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>headerTableSize <span class="op">=</span> val }<span class="op">,</span></span>
<span id="cb12-24"><a href="#cb12-24" tabindex="-1"></a>            <span class="dv">2</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>enablePush <span class="op">=</span> <span class="bu">@bitCast</span>(<span class="dt">bool</span><span class="op">,</span> <span class="bu">@intCast</span>(u1<span class="op">,</span> val)) }<span class="op">,</span></span>
<span id="cb12-25"><a href="#cb12-25" tabindex="-1"></a>            <span class="dv">3</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>maxConcurrentStreams <span class="op">=</span> val }<span class="op">,</span></span>
<span id="cb12-26"><a href="#cb12-26" tabindex="-1"></a>            <span class="dv">4</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>initialWindowSize <span class="op">=</span> <span class="bu">@intCast</span>(u31<span class="op">,</span> val) }<span class="op">,</span></span>
<span id="cb12-27"><a href="#cb12-27" tabindex="-1"></a>            <span class="dv">5</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>maxFrameSize <span class="op">=</span> <span class="bu">@intCast</span>(u24<span class="op">,</span> val) }<span class="op">,</span></span>
<span id="cb12-28"><a href="#cb12-28" tabindex="-1"></a>            <span class="dv">6</span> <span class="op">=&gt;</span> <span class="op">.</span>{ <span class="op">.</span>maxHeaderListSize <span class="op">=</span> val }<span class="op">,</span></span>
<span id="cb12-29"><a href="#cb12-29" tabindex="-1"></a>            <span class="cf">else</span> <span class="op">=&gt;</span> <span class="kw">error</span><span class="op">.</span>NoIdeaWhatThatSettingIs<span class="op">,</span></span>
<span id="cb12-30"><a href="#cb12-30" tabindex="-1"></a>        };</span>
<span id="cb12-31"><a href="#cb12-31" tabindex="-1"></a>    }</span>
<span id="cb12-32"><a href="#cb12-32" tabindex="-1"></a>};</span></code></pre></div>
            <p>It’s not like I haven’t used a language with slices
            before. However, there does seem to be some kind of mental
            block, perhaps because they look similar to arrays.</p>
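            <p>Part of the confusion may be that slicing with bounds
            known at compile time does not give a slice at all, but a
            pointer to a fixed size array, while an open-ended slice of
            a runtime slice stays a slice. A small sketch of the
            difference, where <code>Split</code> and
            <code>splitRecord</code> are hypothetical names used only
            for illustration:</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

/// Hypothetical helper type: the first 6 bytes of a buffer and
/// whatever follows them.
const Split = struct {
    head: *const [6]u8,
    rest: []const u8,
};

fn splitRecord(s: []const u8) ?Split {
    if (s.len &lt; 6) return null;

    return Split{
        // Both bounds are known at compile time, so the result is a
        // pointer to an array and the length is part of the type.
        .head = s[0..6],
        // Open ended, so this stays a slice with a runtime length.
        .rest = s[6..],
    };
}

test &quot;fixed bounds give an array pointer, open bounds give a slice&quot; {
    const data = [_]u8{ 0x00, 0x04, 0x00, 0x00, 0xff, 0xff, 0xaa };
    const split = splitRecord(&amp;data).?;

    try std.testing.expect(@TypeOf(split.head) == *const [6]u8);
    try std.testing.expect(@TypeOf(split.rest) == []const u8);
    try std.testing.expectEqual(@as(usize, 1), split.rest.len);
    try std.testing.expectEqual(@as(u8, 0xaa), split.rest[0]);

    // Too short to hold a whole record.
    try std.testing.expect(splitRecord(data[0..3]) == null);
}</code></pre></div>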
            <h1 id="connection">Connection</h1>
            <p>HTTP/2 organises things into connections and streams
            within connections. Some things are connection level and
            other things are stream level.</p>
            <p>Also a connection maps to a TCP/TLS connection… I think.
            At least for the current purposes a stream is just an ID
            number we need to remember when responding to a headers
            frame containing a GET method.</p>
            <p>There are no explicit request/response objects. However,
            a request can be thought of as a sequence of headers,
            continuation and data frames on a single stream. There seems
            to be some leeway here, though, which could be another
            barrier to adoption.</p>
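            <p>As a rough sketch of that idea, the following tracks a
            single request on a single stream through its headers,
            continuation and data frames. The <code>Frame</code>,
            <code>State</code> and <code>Request</code> types are
            simplified stand-ins invented for this example rather than
            the types used elsewhere in this article, and it ignores
            trailers, padding and every other frame type.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// Hypothetical, simplified types for illustration only.
const Kind = enum { headers, continuation, data };

const Frame = struct {
    kind: Kind,
    stream_id: u32,
    end_headers: bool = false, // END_HEADERS flag (0x4)
    end_stream: bool = false, // END_STREAM flag (0x1)
};

const State = enum { idle, headers, headers_then_done, body, done };

/// State after a HEADERS or CONTINUATION frame. end_stream is only
/// ever set on the HEADERS frame, so it has to be carried along
/// until the header block is complete.
fn afterHeaders(end_headers: bool, end_stream: bool) State {
    if (!end_headers)
        return if (end_stream) State.headers_then_done else State.headers;

    return if (end_stream) State.done else State.body;
}

/// Follows one request on one stream.
const Request = struct {
    stream_id: u32,
    state: State = .idle,

    fn feed(self: *Request, f: Frame) !void {
        // Frames for other streams are not part of this request.
        if (f.stream_id != self.stream_id) return;

        self.state = switch (self.state) {
            .idle =&gt; switch (f.kind) {
                .headers =&gt; afterHeaders(f.end_headers, f.end_stream),
                else =&gt; return error.UnexpectedFrame,
            },
            .headers, .headers_then_done =&gt; switch (f.kind) {
                .continuation =&gt; afterHeaders(
                    f.end_headers,
                    self.state == .headers_then_done,
                ),
                else =&gt; return error.UnexpectedFrame,
            },
            .body =&gt; switch (f.kind) {
                .data =&gt; if (f.end_stream) State.done else State.body,
                else =&gt; return error.UnexpectedFrame,
            },
            .done =&gt; return error.UnexpectedFrame,
        };
    }
};

test &quot;a GET is one frame, a request with a body spans several&quot; {
    var get = Request{ .stream_id = 1 };
    try get.feed(.{ .kind = .headers, .stream_id = 1, .end_headers = true, .end_stream = true });
    try std.testing.expectEqual(State.done, get.state);

    var post = Request{ .stream_id = 3 };
    try post.feed(.{ .kind = .headers, .stream_id = 3 });
    try post.feed(.{ .kind = .continuation, .stream_id = 3, .end_headers = true });
    try std.testing.expectEqual(State.body, post.state);

    // A frame on another stream is ignored by this request.
    try post.feed(.{ .kind = .data, .stream_id = 1, .end_stream = true });
    try post.feed(.{ .kind = .data, .stream_id = 3, .end_stream = true });
    try std.testing.expectEqual(State.done, post.state);

    try std.testing.expectError(
        error.UnexpectedFrame,
        post.feed(.{ .kind = .data, .stream_id = 3 }),
    );
}</code></pre></div>
            <p>For the current purposes only the first case matters: a
            GET arrives as a single headers frame with both flags set,
            and the stream ID is all that has to be remembered for the
            response.</p>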
            <p>For now though, we just have a Connection object which
            provides a low level interface for getting frames.</p>
            <div class="sourceCode" id="cb13"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb13-1"><a href="#cb13-1" tabindex="-1"></a><span class="co">/// HTTP/2&#39;s idea of a connection which wraps around the underlying</span></span>
<span id="cb13-2"><a href="#cb13-2" tabindex="-1"></a><span class="co">/// stream.  Usually the underlying stream will be a TCP connection,</span></span>
<span id="cb13-3"><a href="#cb13-3" tabindex="-1"></a><span class="co">/// but could be anything which provides the Reader and Writer</span></span>
<span id="cb13-4"><a href="#cb13-4" tabindex="-1"></a><span class="co">/// interfaces e.g. a file or buffer with captured frame data in.</span></span>
<span id="cb13-5"><a href="#cb13-5" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb13-6"><a href="#cb13-6" tabindex="-1"></a><span class="co">/// This is essentially an iterator interface to the underlying</span></span>
<span id="cb13-7"><a href="#cb13-7" tabindex="-1"></a><span class="co">/// data, which recurses into iterators for the various types of frame</span></span>
<span id="cb13-8"><a href="#cb13-8" tabindex="-1"></a><span class="co">/// payload.</span></span>
<span id="cb13-9"><a href="#cb13-9" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb13-10"><a href="#cb13-10" tabindex="-1"></a><span class="co">/// It only allocates memory during init and we can reuse the</span></span>
<span id="cb13-11"><a href="#cb13-11" tabindex="-1"></a><span class="co">/// connection object by calling reinit. This preempts the No. 1</span></span>
<span id="cb13-12"><a href="#cb13-12" tabindex="-1"></a><span class="co">/// performance issue I have seen in most open source libraries.</span></span>
<span id="cb13-13"><a href="#cb13-13" tabindex="-1"></a><span class="co">///</span></span>
<span id="cb13-14"><a href="#cb13-14" tabindex="-1"></a><span class="co">/// One should assume that any pointers it returns are to buffers it</span></span>
<span id="cb13-15"><a href="#cb13-15" tabindex="-1"></a><span class="co">/// allocated at init time. So their lifetime is only until the next</span></span>
<span id="cb13-16"><a href="#cb13-16" tabindex="-1"></a><span class="co">/// call to nextFrame[Hdr]. Long lived data therefore needs to be</span></span>
<span id="cb13-17"><a href="#cb13-17" tabindex="-1"></a><span class="co">/// copied. Whether this is an issue depends on the use case.</span></span>
<span id="cb13-18"><a href="#cb13-18" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> Connection(<span class="at">comptime</span> Reader<span class="op">:</span> <span class="dt">type</span><span class="op">,</span> <span class="at">comptime</span> Writer<span class="op">:</span> <span class="dt">type</span>) <span class="dt">type</span> {</span>
<span id="cb13-19"><a href="#cb13-19" tabindex="-1"></a>    <span class="at">const</span> preface <span class="op">=</span> [_]<span class="dt">u8</span>{</span>
<span id="cb13-20"><a href="#cb13-20" tabindex="-1"></a>        <span class="dv">0</span><span class="er">x50</span><span class="op">,</span> <span class="dv">0</span><span class="er">x52</span><span class="op">,</span> <span class="dv">0</span><span class="er">x49</span><span class="op">,</span> <span class="dv">0</span><span class="er">x20</span><span class="op">,</span> <span class="dv">0</span><span class="er">x2a</span><span class="op">,</span> <span class="dv">0</span><span class="er">x20</span><span class="op">,</span> <span class="dv">0</span><span class="er">x48</span><span class="op">,</span> <span class="dv">0</span><span class="er">x54</span><span class="op">,</span> <span class="dv">0</span><span class="er">x54</span><span class="op">,</span> <span class="dv">0</span><span class="er">x50</span><span class="op">,</span> <span class="dv">0</span><span class="er">x2f</span><span class="op">,</span></span>
<span id="cb13-21"><a href="#cb13-21" tabindex="-1"></a>        <span class="dv">0</span><span class="er">x32</span><span class="op">,</span> <span class="dv">0</span><span class="er">x2e</span><span class="op">,</span> <span class="dv">0</span><span class="er">x30</span><span class="op">,</span> <span class="dv">0</span><span class="er">x0d</span><span class="op">,</span> <span class="dv">0</span><span class="er">x0a</span><span class="op">,</span> <span class="dv">0</span><span class="er">x0d</span><span class="op">,</span> <span class="dv">0</span><span class="er">x0a</span><span class="op">,</span> <span class="dv">0</span><span class="er">x53</span><span class="op">,</span> <span class="dv">0</span><span class="er">x4d</span><span class="op">,</span> <span class="dv">0</span><span class="er">x0d</span><span class="op">,</span> <span class="dv">0</span><span class="er">x0a</span><span class="op">,</span></span>
<span id="cb13-22"><a href="#cb13-22" tabindex="-1"></a>        <span class="dv">0</span><span class="er">x0d</span><span class="op">,</span> <span class="dv">0</span><span class="er">x0a</span><span class="op">,</span></span>
<span id="cb13-23"><a href="#cb13-23" tabindex="-1"></a>    };</span>
<span id="cb13-24"><a href="#cb13-24" tabindex="-1"></a></span>
<span id="cb13-25"><a href="#cb13-25" tabindex="-1"></a>    <span class="cf">return</span> <span class="kw">struct</span> {</span>
<span id="cb13-26"><a href="#cb13-26" tabindex="-1"></a>        <span class="at">const</span> Self <span class="op">=</span> <span class="bu">@This</span>();</span>
<span id="cb13-27"><a href="#cb13-27" tabindex="-1"></a></span>
<span id="cb13-28"><a href="#cb13-28" tabindex="-1"></a>        allocator<span class="op">:</span> std<span class="op">.</span>mem<span class="op">.</span>Allocator<span class="op">,</span></span>
<span id="cb13-29"><a href="#cb13-29" tabindex="-1"></a>        frame_in<span class="op">:</span> []<span class="dt">u8</span><span class="op">,</span></span>
<span id="cb13-30"><a href="#cb13-30" tabindex="-1"></a>        have<span class="op">:</span> <span class="dt">usize</span> <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-31"><a href="#cb13-31" tabindex="-1"></a>        used<span class="op">:</span> <span class="dt">usize</span> <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-32"><a href="#cb13-32" tabindex="-1"></a></span>
<span id="cb13-33"><a href="#cb13-33" tabindex="-1"></a>        headers_in<span class="op">:</span> []<span class="dt">u8</span><span class="op">,</span></span>
<span id="cb13-34"><a href="#cb13-34" tabindex="-1"></a>        hdec<span class="op">:</span> hpack<span class="op">.</span>Decoder<span class="op">,</span></span>
<span id="cb13-35"><a href="#cb13-35" tabindex="-1"></a>        frame_out<span class="op">:</span> []<span class="dt">u8</span><span class="op">,</span></span>
<span id="cb13-36"><a href="#cb13-36" tabindex="-1"></a></span>
<span id="cb13-37"><a href="#cb13-37" tabindex="-1"></a>        reader<span class="op">:</span> Reader<span class="op">,</span></span>
<span id="cb13-38"><a href="#cb13-38" tabindex="-1"></a>        writer<span class="op">:</span> Writer<span class="op">,</span></span>
<span id="cb13-39"><a href="#cb13-39" tabindex="-1"></a></span>
<span id="cb13-40"><a href="#cb13-40" tabindex="-1"></a>        <span class="kw">pub</span> <span class="kw">fn</span> init(</span>
<span id="cb13-41"><a href="#cb13-41" tabindex="-1"></a>            a<span class="op">:</span> std<span class="op">.</span>mem<span class="op">.</span>Allocator<span class="op">,</span></span>
<span id="cb13-42"><a href="#cb13-42" tabindex="-1"></a>            buf_len<span class="op">:</span> <span class="dt">usize</span><span class="op">,</span></span>
<span id="cb13-43"><a href="#cb13-43" tabindex="-1"></a>        ) <span class="op">!</span>Self {</span>
<span id="cb13-44"><a href="#cb13-44" tabindex="-1"></a>            <span class="cf">return</span> <span class="op">.</span>{</span>
<span id="cb13-45"><a href="#cb13-45" tabindex="-1"></a>                <span class="op">.</span>allocator <span class="op">=</span> a<span class="op">,</span></span>
<span id="cb13-46"><a href="#cb13-46" tabindex="-1"></a>                <span class="op">.</span>frame_in <span class="op">=</span> <span class="cf">try</span> a<span class="op">.</span>alloc(<span class="dt">u8</span><span class="op">,</span> buf_len)<span class="op">,</span></span>
<span id="cb13-47"><a href="#cb13-47" tabindex="-1"></a>                <span class="op">.</span>headers_in <span class="op">=</span> <span class="cf">try</span> a<span class="op">.</span>alloc(<span class="dt">u8</span><span class="op">,</span> buf_len)<span class="op">,</span></span>
<span id="cb13-48"><a href="#cb13-48" tabindex="-1"></a>                <span class="op">.</span>hdec <span class="op">=</span> <span class="op">.</span>{ <span class="op">.</span>from <span class="op">=</span> <span class="cn">undefined</span><span class="op">,</span> <span class="op">.</span>to <span class="op">=</span> <span class="cn">undefined</span> }<span class="op">,</span></span>
<span id="cb13-49"><a href="#cb13-49" tabindex="-1"></a>                <span class="op">.</span>frame_out <span class="op">=</span> <span class="cf">try</span> a<span class="op">.</span>alloc(<span class="dt">u8</span><span class="op">,</span> buf_len)<span class="op">,</span></span>
<span id="cb13-50"><a href="#cb13-50" tabindex="-1"></a>                <span class="op">.</span>reader <span class="op">=</span> <span class="cn">undefined</span><span class="op">,</span></span>
<span id="cb13-51"><a href="#cb13-51" tabindex="-1"></a>                <span class="op">.</span>writer <span class="op">=</span> <span class="cn">undefined</span><span class="op">,</span></span>
<span id="cb13-52"><a href="#cb13-52" tabindex="-1"></a>            };</span>
<span id="cb13-53"><a href="#cb13-53" tabindex="-1"></a>        }</span>
<span id="cb13-54"><a href="#cb13-54" tabindex="-1"></a></span>
<span id="cb13-55"><a href="#cb13-55" tabindex="-1"></a>        <span class="kw">pub</span> <span class="kw">fn</span> reinit(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Self<span class="op">,</span> r<span class="op">:</span> Reader<span class="op">,</span> w<span class="op">:</span> Writer) <span class="dt">void</span> {</span>
<span id="cb13-56"><a href="#cb13-56" tabindex="-1"></a>            <span class="va">self</span><span class="op">.</span>have <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb13-57"><a href="#cb13-57" tabindex="-1"></a>            <span class="va">self</span><span class="op">.</span>used <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb13-58"><a href="#cb13-58" tabindex="-1"></a>            <span class="va">self</span><span class="op">.</span>reader <span class="op">=</span> r;</span>
<span id="cb13-59"><a href="#cb13-59" tabindex="-1"></a>            <span class="va">self</span><span class="op">.</span>writer <span class="op">=</span> w;</span>
<span id="cb13-60"><a href="#cb13-60" tabindex="-1"></a></span>
<span id="cb13-61"><a href="#cb13-61" tabindex="-1"></a>            mem<span class="op">.</span>set(<span class="dt">u8</span><span class="op">,</span> <span class="va">self</span><span class="op">.</span>frame_in<span class="op">,</span> <span class="dv">0</span>);</span>
<span id="cb13-62"><a href="#cb13-62" tabindex="-1"></a>            mem<span class="op">.</span>set(<span class="dt">u8</span><span class="op">,</span> <span class="va">self</span><span class="op">.</span>frame_out<span class="op">,</span> <span class="dv">0</span>);</span>
<span id="cb13-63"><a href="#cb13-63" tabindex="-1"></a>            mem<span class="op">.</span>set(<span class="dt">u8</span><span class="op">,</span> <span class="va">self</span><span class="op">.</span>headers_in<span class="op">,</span> <span class="dv">0</span>);</span>
<span id="cb13-64"><a href="#cb13-64" tabindex="-1"></a>        }</span>
<span id="cb13-65"><a href="#cb13-65" tabindex="-1"></a></span>
<span id="cb13-66"><a href="#cb13-66" tabindex="-1"></a>        <span class="kw">pub</span> <span class="kw">fn</span> deinit(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Self) <span class="dt">void</span> {</span>
<span id="cb13-67"><a href="#cb13-67" tabindex="-1"></a>            <span class="va">self</span><span class="op">.</span>allocator<span class="op">.</span>free(<span class="va">self</span><span class="op">.</span>frame_in);</span>
<span id="cb13-68"><a href="#cb13-68" tabindex="-1"></a>            <span class="va">self</span><span class="op">.</span>allocator<span class="op">.</span>free(<span class="va">self</span><span class="op">.</span>frame_out);</span>
<span id="cb13-69"><a href="#cb13-69" tabindex="-1"></a>            <span class="va">self</span><span class="op">.</span>allocator<span class="op">.</span>free(<span class="va">self</span><span class="op">.</span>headers_in);</span>
<span id="cb13-70"><a href="#cb13-70" tabindex="-1"></a>        }</span>
<span id="cb13-71"><a href="#cb13-71" tabindex="-1"></a></span>
<span id="cb13-72"><a href="#cb13-72" tabindex="-1"></a>        <span class="kw">fn</span> read(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Self<span class="op">,</span> needed<span class="op">:</span> <span class="dt">usize</span>) <span class="op">!</span>[]<span class="at">const</span> <span class="dt">u8</span> {</span>
<span id="cb13-73"><a href="#cb13-73" tabindex="-1"></a>            <span class="at">var</span> have <span class="op">=</span> <span class="va">self</span><span class="op">.</span>have <span class="op">-</span> <span class="va">self</span><span class="op">.</span>used;</span>
<span id="cb13-74"><a href="#cb13-74" tabindex="-1"></a>            <span class="at">const</span> in <span class="op">=</span> <span class="va">self</span><span class="op">.</span>frame_in[<span class="va">self</span><span class="op">.</span>used<span class="op">..</span>];</span>
<span id="cb13-75"><a href="#cb13-75" tabindex="-1"></a></span>
<span id="cb13-76"><a href="#cb13-76" tabindex="-1"></a>            <span class="at">const</span> len <span class="op">=</span> <span class="cf">if</span> (have <span class="op">&lt;</span> needed)</span>
<span id="cb13-77"><a href="#cb13-77" tabindex="-1"></a>                <span class="cf">try</span> <span class="va">self</span><span class="op">.</span>reader<span class="op">.</span>readAtLeast(in[have<span class="op">..</span>]<span class="op">,</span> needed <span class="op">-</span> have)</span>
<span id="cb13-78"><a href="#cb13-78" tabindex="-1"></a>            <span class="cf">else</span></span>
<span id="cb13-79"><a href="#cb13-79" tabindex="-1"></a>                <span class="dv">0</span>;</span>
<span id="cb13-80"><a href="#cb13-80" tabindex="-1"></a></span>
<span id="cb13-81"><a href="#cb13-81" tabindex="-1"></a>            have <span class="op">+=</span> len;</span>
<span id="cb13-82"><a href="#cb13-82" tabindex="-1"></a>            <span class="va">self</span><span class="op">.</span>have <span class="op">+=</span> len;</span>
<span id="cb13-83"><a href="#cb13-83" tabindex="-1"></a></span>
<span id="cb13-84"><a href="#cb13-84" tabindex="-1"></a>            <span class="cf">if</span> (have <span class="op">==</span> <span class="dv">0</span>)</span>
<span id="cb13-85"><a href="#cb13-85" tabindex="-1"></a>                <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>EndOfData;</span>
<span id="cb13-86"><a href="#cb13-86" tabindex="-1"></a></span>
<span id="cb13-87"><a href="#cb13-87" tabindex="-1"></a>            <span class="cf">if</span> (have <span class="op">&lt;</span> needed)</span>
<span id="cb13-88"><a href="#cb13-88" tabindex="-1"></a>                <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>UnexpectedEndOfData;</span>
<span id="cb13-89"><a href="#cb13-89" tabindex="-1"></a></span>
<span id="cb13-90"><a href="#cb13-90" tabindex="-1"></a>            <span class="va">self</span><span class="op">.</span>used <span class="op">+=</span> needed;</span>
<span id="cb13-91"><a href="#cb13-91" tabindex="-1"></a></span>
<span id="cb13-92"><a href="#cb13-92" tabindex="-1"></a>            <span class="cf">return</span> in[<span class="dv">0</span><span class="er">.</span><span class="op">.</span>needed];</span>
<span id="cb13-93"><a href="#cb13-93" tabindex="-1"></a>        }</span>
<span id="cb13-94"><a href="#cb13-94" tabindex="-1"></a></span>
<span id="cb13-95"><a href="#cb13-95" tabindex="-1"></a>        <span class="co">/// Start the HTTP/2 connection as the server and assuming</span></span>
<span id="cb13-96"><a href="#cb13-96" tabindex="-1"></a>        <span class="co">/// &quot;prior knowledge&quot;. This is a simple case of reading in the</span></span>
<span id="cb13-97"><a href="#cb13-97" tabindex="-1"></a>        <span class="co">/// magic string (preface) sent by the client and sending a</span></span>
<span id="cb13-98"><a href="#cb13-98" tabindex="-1"></a>        <span class="co">/// settings frame.</span></span>
<span id="cb13-99"><a href="#cb13-99" tabindex="-1"></a>        <span class="kw">pub</span> <span class="kw">fn</span> start(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Self) <span class="op">!</span><span class="dt">void</span> {</span>
<span id="cb13-100"><a href="#cb13-100" tabindex="-1"></a>            <span class="at">const</span> pface <span class="op">=</span> <span class="cf">try</span> <span class="va">self</span><span class="op">.</span>read(preface<span class="op">.</span>len);</span>
<span id="cb13-101"><a href="#cb13-101" tabindex="-1"></a></span>
<span id="cb13-102"><a href="#cb13-102" tabindex="-1"></a>            <span class="cf">if</span> (mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> <span class="op">&amp;</span>preface<span class="op">,</span> pface))</span>
<span id="cb13-103"><a href="#cb13-103" tabindex="-1"></a>                std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;&lt;&lt;&lt; Got preface!&quot;</span><span class="op">,</span> <span class="op">.</span>{})</span>
<span id="cb13-104"><a href="#cb13-104" tabindex="-1"></a>            <span class="cf">else</span> {</span>
<span id="cb13-105"><a href="#cb13-105" tabindex="-1"></a>                std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;&lt;&lt;&lt; Expected preface, but got</span><span class="sc">\n</span><span class="st">: {s}&quot;</span><span class="op">,</span> <span class="op">.</span>{pface});</span>
<span id="cb13-106"><a href="#cb13-106" tabindex="-1"></a>                <span class="cf">return</span> <span class="kw">error</span><span class="op">.</span>InvalidPreface;</span>
<span id="cb13-107"><a href="#cb13-107" tabindex="-1"></a>            }</span>
<span id="cb13-108"><a href="#cb13-108" tabindex="-1"></a></span>
<span id="cb13-109"><a href="#cb13-109" tabindex="-1"></a>            std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;&gt;&gt;&gt; Sending server preface&quot;</span><span class="op">,</span> <span class="op">.</span>{});</span>
<span id="cb13-110"><a href="#cb13-110" tabindex="-1"></a>            <span class="at">const</span> empty_settings <span class="op">=</span> FrameHdr{</span>
<span id="cb13-111"><a href="#cb13-111" tabindex="-1"></a>                <span class="op">.</span>length <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-112"><a href="#cb13-112" tabindex="-1"></a>                <span class="op">.</span><span class="dt">type</span> <span class="op">=</span> <span class="op">.</span>settings<span class="op">,</span></span>
<span id="cb13-113"><a href="#cb13-113" tabindex="-1"></a>                <span class="op">.</span>flags <span class="op">=</span> <span class="op">.</span>{ <span class="op">.</span>settings <span class="op">=</span> <span class="op">.</span>{ <span class="op">.</span>ack <span class="op">=</span> <span class="cn">false</span> } }<span class="op">,</span></span>
<span id="cb13-114"><a href="#cb13-114" tabindex="-1"></a>                <span class="op">.</span>id <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-115"><a href="#cb13-115" tabindex="-1"></a>            };</span>
<span id="cb13-116"><a href="#cb13-116" tabindex="-1"></a>            <span class="cf">try</span> <span class="va">self</span><span class="op">.</span>writer<span class="op">.</span>writeAll(empty_settings<span class="op">.</span>to(<span class="va">self</span><span class="op">.</span>frame_out));</span>
<span id="cb13-117"><a href="#cb13-117" tabindex="-1"></a>        }</span>
<span id="cb13-118"><a href="#cb13-118" tabindex="-1"></a></span>
<span id="cb13-119"><a href="#cb13-119" tabindex="-1"></a>        <span class="co">/// Lower level iterator which returns just the frame</span></span>
<span id="cb13-120"><a href="#cb13-120" tabindex="-1"></a>        <span class="co">/// header. Potentially this can be used to skip over</span></span>
<span id="cb13-121"><a href="#cb13-121" tabindex="-1"></a>        <span class="co">/// uninteresting frames.</span></span>
<span id="cb13-122"><a href="#cb13-122" tabindex="-1"></a>        <span class="kw">pub</span> <span class="kw">fn</span> nextFrameHdr(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Self) <span class="op">!</span>FrameHdr {</span>
<span id="cb13-123"><a href="#cb13-123" tabindex="-1"></a>            <span class="cf">return</span> FrameHdr<span class="op">.</span>from((<span class="cf">try</span> <span class="va">self</span><span class="op">.</span>read(<span class="dv">9</span>))[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="dv">9</span>]);</span>
<span id="cb13-124"><a href="#cb13-124" tabindex="-1"></a>        }</span>
<span id="cb13-125"><a href="#cb13-125" tabindex="-1"></a></span>
<span id="cb13-126"><a href="#cb13-126" tabindex="-1"></a>        <span class="co">/// Returns a slightly higher level Frame payload iterator and</span></span>
<span id="cb13-127"><a href="#cb13-127" tabindex="-1"></a>        <span class="co">/// frame header object. Still pretty low level. We&#39;d probably</span></span>
<span id="cb13-128"><a href="#cb13-128" tabindex="-1"></a>        <span class="co">/// want to abstract this into streams and abstract streams</span></span>
<span id="cb13-129"><a href="#cb13-129" tabindex="-1"></a>        <span class="co">/// into requests and responses.</span></span>
<span id="cb13-130"><a href="#cb13-130" tabindex="-1"></a>        <span class="kw">pub</span> <span class="kw">fn</span> nextFrame(<span class="va">self</span><span class="op">:</span> <span class="op">*</span>Self) <span class="op">!</span>Frame {</span>
<span id="cb13-131"><a href="#cb13-131" tabindex="-1"></a>            <span class="at">const</span> hdr <span class="op">=</span> <span class="cf">try</span> <span class="va">self</span><span class="op">.</span>nextFrameHdr();</span>
<span id="cb13-132"><a href="#cb13-132" tabindex="-1"></a></span>
<span id="cb13-133"><a href="#cb13-133" tabindex="-1"></a>            <span class="cf">return</span> Frame<span class="op">.</span>init(</span>
<span id="cb13-134"><a href="#cb13-134" tabindex="-1"></a>                hdr<span class="op">,</span></span>
<span id="cb13-135"><a href="#cb13-135" tabindex="-1"></a>                <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>hdec<span class="op">,</span></span>
<span id="cb13-136"><a href="#cb13-136" tabindex="-1"></a>                <span class="cf">try</span> <span class="va">self</span><span class="op">.</span>read(hdr<span class="op">.</span>length)<span class="op">,</span></span>
<span id="cb13-137"><a href="#cb13-137" tabindex="-1"></a>                <span class="va">self</span><span class="op">.</span>headers_in<span class="op">,</span></span>
<span id="cb13-138"><a href="#cb13-138" tabindex="-1"></a>            );</span>
<span id="cb13-139"><a href="#cb13-139" tabindex="-1"></a>        }</span>
<span id="cb13-140"><a href="#cb13-140" tabindex="-1"></a></span>
<span id="cb13-141"><a href="#cb13-141" tabindex="-1"></a>        <span class="op">...</span></span>
<span id="cb13-142"><a href="#cb13-142" tabindex="-1"></a>}</span></code></pre></div>
            <p>There is a simplified listener in the main library file
            which can be run with <code>$ zig run src/http2.zig</code>.
            It uses the connection as follows.</p>
            <div class="sourceCode" id="cb14"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb14-1"><a href="#cb14-1" tabindex="-1"></a><span class="kw">pub</span> <span class="at">const</span> NetConnection <span class="op">=</span> Connection(std<span class="op">.</span>net<span class="op">.</span>Stream<span class="op">,</span> std<span class="op">.</span>net<span class="op">.</span>Stream);</span>
<span id="cb14-2"><a href="#cb14-2" tabindex="-1"></a></span>
<span id="cb14-3"><a href="#cb14-3" tabindex="-1"></a><span class="kw">fn</span> serve(h2c<span class="op">:</span> <span class="op">*</span>NetConnection) <span class="op">!</span><span class="dt">void</span> {</span>
<span id="cb14-4"><a href="#cb14-4" tabindex="-1"></a>    <span class="cf">try</span> h2c<span class="op">.</span>start();</span>
<span id="cb14-5"><a href="#cb14-5" tabindex="-1"></a></span>
<span id="cb14-6"><a href="#cb14-6" tabindex="-1"></a>    <span class="cf">while</span> (h2c<span class="op">.</span>nextFrame()) <span class="op">|</span>frame<span class="op">|</span> {</span>
<span id="cb14-7"><a href="#cb14-7" tabindex="-1"></a>        std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;&lt;&lt;&lt; {} {}&quot;</span><span class="op">,</span> frame);</span>
<span id="cb14-8"><a href="#cb14-8" tabindex="-1"></a></span>
<span id="cb14-9"><a href="#cb14-9" tabindex="-1"></a>        <span class="at">var</span> payload <span class="op">=</span> frame<span class="op">.</span>payload;</span>
<span id="cb14-10"><a href="#cb14-10" tabindex="-1"></a></span>
<span id="cb14-11"><a href="#cb14-11" tabindex="-1"></a>        <span class="cf">switch</span> (payload) {</span>
<span id="cb14-12"><a href="#cb14-12" tabindex="-1"></a>            <span class="op">.</span>settings <span class="op">=&gt;</span> <span class="op">|*</span>settings<span class="op">|</span> {</span>
<span id="cb14-13"><a href="#cb14-13" tabindex="-1"></a>                <span class="cf">while</span> (settings<span class="op">.</span>next()) <span class="op">|</span>setting<span class="op">|</span> {</span>
<span id="cb14-14"><a href="#cb14-14" tabindex="-1"></a>                    std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;    {}&quot;</span><span class="op">,</span> <span class="op">.</span>{setting});</span>
<span id="cb14-15"><a href="#cb14-15" tabindex="-1"></a>                } <span class="cf">else</span> <span class="op">|</span>err<span class="op">|</span> {</span>
<span id="cb14-16"><a href="#cb14-16" tabindex="-1"></a>                    <span class="cf">if</span> (err <span class="op">!=</span> <span class="kw">error</span><span class="op">.</span>EndOfData) <span class="cf">return</span> err;</span>
<span id="cb14-17"><a href="#cb14-17" tabindex="-1"></a>                }</span>
<span id="cb14-18"><a href="#cb14-18" tabindex="-1"></a>            }<span class="op">,</span></span>
<span id="cb14-19"><a href="#cb14-19" tabindex="-1"></a>            <span class="op">.</span>headers <span class="op">=&gt;</span> <span class="op">|*</span>headers<span class="op">|</span> {</span>
<span id="cb14-20"><a href="#cb14-20" tabindex="-1"></a>                <span class="cf">while</span> (headers<span class="op">.</span>next()) <span class="op">|</span>h<span class="op">|</span> {</span>
<span id="cb14-21"><a href="#cb14-21" tabindex="-1"></a>                    std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;    {s} =&gt; {s}&quot;</span><span class="op">,</span> h);</span>
<span id="cb14-22"><a href="#cb14-22" tabindex="-1"></a>                } <span class="cf">else</span> <span class="op">|</span>err<span class="op">|</span> {</span>
<span id="cb14-23"><a href="#cb14-23" tabindex="-1"></a>                    <span class="cf">if</span> (err <span class="op">!=</span> <span class="kw">error</span><span class="op">.</span>EndOfData) <span class="cf">return</span> err;</span>
<span id="cb14-24"><a href="#cb14-24" tabindex="-1"></a>                }</span>
<span id="cb14-25"><a href="#cb14-25" tabindex="-1"></a></span>
<span id="cb14-26"><a href="#cb14-26" tabindex="-1"></a>                std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;&gt;&gt;&gt; Sending 200 OK and end stream&quot;</span><span class="op">,</span> <span class="op">.</span>{});</span>
<span id="cb14-27"><a href="#cb14-27" tabindex="-1"></a></span>
<span id="cb14-28"><a href="#cb14-28" tabindex="-1"></a>                <span class="cf">try</span> h2c<span class="op">.</span>sendHeaders(<span class="op">.</span>{</span>
<span id="cb14-29"><a href="#cb14-29" tabindex="-1"></a>                    <span class="op">.</span>end_stream <span class="op">=</span> <span class="cn">true</span><span class="op">,</span></span>
<span id="cb14-30"><a href="#cb14-30" tabindex="-1"></a>                    <span class="op">.</span>stream_id <span class="op">=</span> frame<span class="op">.</span>hdr<span class="op">.</span>id<span class="op">,</span></span>
<span id="cb14-31"><a href="#cb14-31" tabindex="-1"></a>                }<span class="op">,</span> <span class="op">&amp;.</span>{<span class="op">.</span>{</span>
<span id="cb14-32"><a href="#cb14-32" tabindex="-1"></a>                    <span class="op">.</span>indexed <span class="op">=</span> <span class="op">.</span>status200<span class="op">,</span></span>
<span id="cb14-33"><a href="#cb14-33" tabindex="-1"></a>                }});</span>
<span id="cb14-34"><a href="#cb14-34" tabindex="-1"></a>            }<span class="op">,</span></span>
<span id="cb14-35"><a href="#cb14-35" tabindex="-1"></a>            <span class="cf">else</span> <span class="op">=&gt;</span> {}<span class="op">,</span></span>
<span id="cb14-36"><a href="#cb14-36" tabindex="-1"></a>        }</span>
<span id="cb14-37"><a href="#cb14-37" tabindex="-1"></a>    } <span class="cf">else</span> <span class="op">|</span>err<span class="op">|</span> {</span>
<span id="cb14-38"><a href="#cb14-38" tabindex="-1"></a>        <span class="cf">if</span> (err <span class="op">!=</span> <span class="kw">error</span><span class="op">.</span>EndOfData)</span>
<span id="cb14-39"><a href="#cb14-39" tabindex="-1"></a>            <span class="cf">return</span> err;</span>
<span id="cb14-40"><a href="#cb14-40" tabindex="-1"></a>    }</span>
<span id="cb14-41"><a href="#cb14-41" tabindex="-1"></a>}</span></code></pre></div>
            <p>So again it is an iterator that goes over the incoming
            frames from any stream. I have omitted the
            <code>sendHeaders()</code> function, but it just sends a
            headers frame back with <code>:status =&gt; 200</code> and
            <code>END_STREAM</code> set. This closes the stream and Curl
            seems to go away happy.</p>
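            <p>To make that response concrete, here is a minimal sketch
            of the bytes such a frame would put on the wire for stream
            1. It is not the omitted <code>sendHeaders()</code>
            implementation. The names <code>status200_frame</code> and
            <code>payloadLen</code> are made up for illustration, and it
            assumes <code>END_HEADERS</code> is set alongside
            <code>END_STREAM</code> because the whole header block fits
            in one frame.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

/// Hypothetical constant, not part of barely-http2: a complete
/// &quot;200 OK, end of stream&quot; response for stream 1.
pub const status200_frame = [_]u8{
    // 9 byte frame header (RFC 7540, section 4.1)
    0x00, 0x00, 0x01, // length: 1 byte of payload
    0x01, // type: HEADERS
    0x05, // flags: END_STREAM (0x1) | END_HEADERS (0x4)
    0x00, 0x00, 0x00, 0x01, // reserved bit + 31 bit stream ID (1)
    // HPACK payload (RFC 7541, section 6.1)
    0x88, // indexed field: 0x80 | static table index 8 (:status 200)
};

/// Reads the 24 bit big endian payload length from a frame header.
pub fn payloadLen(frame: []const u8) u32 {
    return (@as(u32, frame[0]) &lt;&lt; 16) |
        (@as(u32, frame[1]) &lt;&lt; 8) |
        @as(u32, frame[2]);
}</code></pre></div>
            <p>The single payload byte works because
            <code>:status 200</code> is entry 8 in the HPACK static
            table, so no header name or value has to be sent as a
            string.</p>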
            <h1 id="zelf-zerve-2">Zelf Zerve 2</h1>
            <p>Finally I <a
            href="https://github.com/richiejp/barely-http2/blob/main/src/self-serve2.zig">converted
            my static site server</a> to use barely HTTP/2. It’s
            currently pretty useless due to the lack of TLS, which I’ll
            come to in a minute. To my knowledge browsers refuse to use
            HTTP/2 without TLS. So all we can do is use Curl.</p>
            <p>I suppose one part that might be interesting is the use
            of <code>std.os.sendfile</code> to populate data
            frames.</p>
            <div class="sourceCode" id="cb15"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb15-1"><a href="#cb15-1" tabindex="-1"></a>    <span class="cf">while</span> (send_total <span class="op">&lt;</span> file_len) {</span>
<span id="cb15-2"><a href="#cb15-2" tabindex="-1"></a>        <span class="at">const</span> len_left <span class="op">=</span> file_len <span class="op">-</span> send_total;</span>
<span id="cb15-3"><a href="#cb15-3" tabindex="-1"></a>        <span class="at">const</span> frame_len <span class="op">=</span> std<span class="op">.</span>math<span class="op">.</span>min(max_frame_len<span class="op">,</span> len_left);</span>
<span id="cb15-4"><a href="#cb15-4" tabindex="-1"></a>        <span class="at">const</span> data_hdr <span class="op">=</span> http2<span class="op">.</span>FrameHdr{</span>
<span id="cb15-5"><a href="#cb15-5" tabindex="-1"></a>            <span class="op">.</span>length <span class="op">=</span> <span class="bu">@intCast</span>(u24<span class="op">,</span> frame_len)<span class="op">,</span></span>
<span id="cb15-6"><a href="#cb15-6" tabindex="-1"></a>            <span class="op">.</span><span class="dt">type</span> <span class="op">=</span> <span class="op">.</span>data<span class="op">,</span></span>
<span id="cb15-7"><a href="#cb15-7" tabindex="-1"></a>            <span class="op">.</span>flags <span class="op">=</span> <span class="op">.</span>{ <span class="op">.</span>data <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb15-8"><a href="#cb15-8" tabindex="-1"></a>                <span class="op">.</span>endStream <span class="op">=</span> len_left <span class="op">==</span> frame_len<span class="op">,</span></span>
<span id="cb15-9"><a href="#cb15-9" tabindex="-1"></a>                <span class="op">.</span>padded <span class="op">=</span> <span class="cn">false</span><span class="op">,</span></span>
<span id="cb15-10"><a href="#cb15-10" tabindex="-1"></a>            } }<span class="op">,</span></span>
<span id="cb15-11"><a href="#cb15-11" tabindex="-1"></a>            <span class="op">.</span>id <span class="op">=</span> stream_id<span class="op">,</span></span>
<span id="cb15-12"><a href="#cb15-12" tabindex="-1"></a>        };</span>
<span id="cb15-13"><a href="#cb15-13" tabindex="-1"></a>        <span class="at">var</span> data_buf<span class="op">:</span> [<span class="dv">9</span>]<span class="dt">u8</span> <span class="op">=</span> <span class="cn">undefined</span>;</span>
<span id="cb15-14"><a href="#cb15-14" tabindex="-1"></a></span>
<span id="cb15-15"><a href="#cb15-15" tabindex="-1"></a>        std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;&gt;&gt;&gt; Sending DATA {}&quot;</span><span class="op">,</span> <span class="op">.</span>{data_hdr});</span>
<span id="cb15-16"><a href="#cb15-16" tabindex="-1"></a>        <span class="cf">try</span> h2c<span class="op">.</span>writer<span class="op">.</span>writeAll(data_hdr<span class="op">.</span>to(<span class="op">&amp;</span>data_buf));</span>
<span id="cb15-17"><a href="#cb15-17" tabindex="-1"></a></span>
<span id="cb15-18"><a href="#cb15-18" tabindex="-1"></a>        <span class="at">var</span> send_len<span class="op">:</span> <span class="dt">usize</span> <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb15-19"><a href="#cb15-19" tabindex="-1"></a>        <span class="cf">while</span> (send_len <span class="op">&lt;</span> frame_len) {</span>
<span id="cb15-20"><a href="#cb15-20" tabindex="-1"></a>            send_len <span class="op">+=</span> <span class="cf">try</span> std<span class="op">.</span>os<span class="op">.</span>sendfile(</span>
<span id="cb15-21"><a href="#cb15-21" tabindex="-1"></a>                h2c<span class="op">.</span>writer<span class="op">.</span>handle<span class="op">,</span></span>
<span id="cb15-22"><a href="#cb15-22" tabindex="-1"></a>                body_file<span class="op">.</span>handle<span class="op">,</span></span>
<span id="cb15-23"><a href="#cb15-23" tabindex="-1"></a>                send_total <span class="op">+</span> send_len<span class="op">,</span></span>
<span id="cb15-24"><a href="#cb15-24" tabindex="-1"></a>                frame_len <span class="op">-</span> send_len<span class="op">,</span></span>
<span id="cb15-25"><a href="#cb15-25" tabindex="-1"></a>                zero_iovec<span class="op">,</span></span>
<span id="cb15-26"><a href="#cb15-26" tabindex="-1"></a>                zero_iovec<span class="op">,</span></span>
<span id="cb15-27"><a href="#cb15-27" tabindex="-1"></a>                <span class="dv">0</span><span class="op">,</span></span>
<span id="cb15-28"><a href="#cb15-28" tabindex="-1"></a>            );</span>
<span id="cb15-29"><a href="#cb15-29" tabindex="-1"></a>        }</span>
<span id="cb15-30"><a href="#cb15-30" tabindex="-1"></a></span>
<span id="cb15-31"><a href="#cb15-31" tabindex="-1"></a>        send_total <span class="op">+=</span> send_len;</span>
<span id="cb15-32"><a href="#cb15-32" tabindex="-1"></a>    }</span></code></pre></div>
            <p>This is a bit more complicated than the <a
            href="https://gitlab.com/Palethorpe/portfolio/-/blob/master/src/self-serve.zig#L125">HTTP1.1
            version</a>, but that is only because we split the data
            across multiple frames. It would probably be worse in
            HTTP1.1 if we started using chunked encoding.</p>
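            <p>The splitting arithmetic can be looked at on its own. The
            following is a minimal sketch with hypothetical helper names
            (<code>nextFrameLen</code> and <code>frameCount</code> are
            not in the repository), assuming
            <code>max_frame_len</code> is greater than zero.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

/// Length of the next DATA frame given how much of the body is left.
/// Hypothetical helper; the loop above computes this inline.
pub fn nextFrameLen(len_left: usize, max_frame_len: usize) usize {
    return if (len_left &lt; max_frame_len) len_left else max_frame_len;
}

/// Number of DATA frames needed for a body of `file_len` bytes.
/// Assumes max_frame_len &gt; 0, otherwise the loop never ends.
pub fn frameCount(file_len: usize, max_frame_len: usize) usize {
    var sent: usize = 0;
    var frames: usize = 0;
    while (sent &lt; file_len) {
        // The last frame is the one where the remainder fits,
        // which is where END_STREAM would be set.
        sent += nextFrameLen(file_len - sent, max_frame_len);
        frames += 1;
    }
    return frames;
}</code></pre></div>
            <p>With the protocol’s default maximum frame size of 16384
            bytes, a 40000 byte file would go out as three frames of
            16384, 16384 and 7232 bytes. An empty file produces no DATA
            frames at all, so in that case <code>END_STREAM</code> would
            presumably have to be sent some other way, such as on the
            headers frame.</p>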
            <p>Because we are working at a low level, we can send the
            frame header and then have the kernel copy the file (or page
            cache) data straight to the socket buffer. This means we
            never have to buffer the file data in user land.</p>
            <p>This may even be possible with TLS, if we can use a Linux
            crypto socket. Speaking of which…</p>
            <h1 id="tls-todo">TLS (TODO)</h1>
            <p>I was quite shocked that Zig already has a TLS
            implementation in the standard library. Presently though
            only the client is implemented. There is an <a
            href="https://github.com/ziglang/zig/issues/14171">issue
            open to create the server</a>.</p>
            <p>HTTP/2 connections are meant to be established with TLS
            and the ALPN extension. I suppose some clients may support
            HTTP/2 but not ALPN; however, I imagine Chrome and Firefox
            support both. So hopefully I can find time to implement that
            or someone else will.</p>
            <h1 id="related">Related</h1>
            <ul>
            <li><a href="/https/richiejp.com/zig-vs-c-mini-http-server">Zig Vs C - Minimal
            HTTP server</a></li>
            <li><a href="/https/richiejp.com/zig-cross-compile-ltp-ltx-linux">Minimal Linux
            VM cross compiled with Clang and Zig</a></li>
            <li><a href="/https/richiejp.com/zig-ld-preload-trick">Override libc’s malloc
            with Zig</a></li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>Why I am selling my Crypto</title>
  <id>https://richiejp.com/block-chain</id>
  <published>2022-01-03T23:50:45Z</published>
  <updated>2022-01-26T10:51:59Z</updated>
  <link rel="alternate" href="https://richiejp.com/block-chain" />
  <summary>In the unlikely event block-chain solves a real problem, no
crypto token holders will profit from it. Purchasing tokens is
gambling.</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p><em>This is an article primarily for friends and family;
            a warning from someone you know.</em></p>
            <p><em>TLDR; in the unlikely event block-chain solves a real
            problem, no crypto token holders will profit from it.
            Purchasing tokens is gambling.</em></p>
            <p>On a basic technical level Bitcoin is awful and has been
            superseded by Monero<a href="#fn1" class="footnote-ref"
            id="fnref1"><sup>1</sup></a>. The only practical use of
            Bitcoin was criminal activity. This is past tense because
            the authorities are now aware of how easily traceable it is.
            I have not owned any significant amount of Bitcoin for a
            long time. The electricity and silicon costs alone are
            reason enough for me to dump it. It’s depressing even having
            to talk about it.</p>
            <p>So this article is not about Bitcoin. I have opinions
            about Bitcoin, but I’m an expert on moving bytes around,
            checking bits are in the correct order, that sort of
            thing. I don’t know about whatever it is that keeps Bitcoin
            going. For that you should consult Nassim Taleb’s Bitcoin
            “Black paper”.</p>
            <p>This isn’t about other pure crypto “currencies”, like
            Monero or Reserve, either. Perhaps these can function as a
            <em>medium of exchange</em> or a <em>store of value</em><a
            href="#fn2" class="footnote-ref"
            id="fnref2"><sup>2</sup></a>. It is doubtful, but that is
            not something I can talk about authoritatively. So this is
            about block-chains and their associated tokens,
            e.g. Ethereum, Solana etc.</p>
            <p>The narrative is that these are going to revolutionise
            the world, much like the internet has done and even the
            movement of compute resources to the <em>Cloud</em>. This is
            going to start in places like Africa for things like
            tracking the distribution of people and goods.</p>
            <p>Of course, people living in Europe have difficulty
            verifying anything that happens in Africa. This is due to
            basic physical and cultural barriers. It is difficult to
            know that something is really happening in a remote part of
            the world. This is true even of a place such as New York,
            which has an excellent internet connection and is easily
            accessible.</p>
            <p>There is an obvious disconnect between sensory data being
            recorded digitally and what is actually taking place. Not to
            mention how one chooses to interpret complex data. I have
            not the faintest idea how block-chain can help with these
            issues. It doesn’t sit at the interface between the physical
            and the virtual.</p>
            <p>It’s not clear if block-chain can help with
            <em>recording</em> data; it’s a distributed data
            <em>storage</em> and <em>transformation</em> technology.
            There is a whole load of hardware and software in between
            the physical world and the block-chain.</p>
            <p>Recording data requires a physical interaction with the
            world followed by a series of transmissions. Eventually the
            data is stored somewhere, which could be a block-chain, a
            combination of block-chain and other technologies or just a
            regular old database with some cryptography thrown in.</p>
            <p>Transmission may require a number of hops, with
            transformations applied to the data along the way. This
            could be because the source data is too large and needs
            reducing. Perhaps some kind of electronic tampering can be
            detected using cryptography. Perhaps multiple parties need
            to vote on something to agree it happened. This is not only
            plausible, it already exists and it doesn’t require a
            block-chain.</p>
            <p>Block-chains, by their very nature, require a large
            network to operate. Otherwise they are not distributed.
            Public block-chains require the public internet. By the time
            data reaches the public internet, there has been plenty of
            opportunity to tamper with it. If the local networks are not
            to be trusted for some reason, then the data must already be
            protected cryptographically.</p>
            <p>Still it’s possible there is a place in the world for
            slower semi-validated distributed databases with a fee
            structure built in. Which is essentially the nature of
            block-chains. I haven’t ruled it out as a possibility. Nor
            do I need to, I can sit back and wait for the technology to
            mature, not giving it too much time or attention.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Since writing this and thinking about what <a
            href="ways-to-help-your-project-fail">causes projects to
            fail</a>, it’s becoming increasingly clear that block-chain
            breaks all the rules of robust engineering. This still
            doesn’t rule out that a use will be found for it. However, it
            makes the technology, and any project based on it, incredibly
            fragile.</p>
            </div>
            </div>
            <p>What I can rule out is that owning crypto tokens is an
            investment. At best it is speculative gambling, something I
            am not above. However, I much prefer to speculate on things
            which have some kind of connection to reality. This is just
            a pure game of betting on which crypto coin has the best
            narrative to sell or will be pumped by a big player
            next.</p>
            <p>What I don’t see is a reason why the end users of a
            block-chain would pay sky-high network fees instead of
            starting a new network. Public block-chains are necessarily
            limited in the volume of transactions they can perform.
            Validation and synchronisation are necessarily expensive
            operations in terms of computer resources. Therefore a high
            fee has to be paid for each transaction to pay back those
            who bought into the network as well as those who provided
            the computational resources.</p>
            <p>Spinning up some “validator”, compute and storage nodes
            is relatively trivial. Just select some cloud vendors and
            fire up some preconfigured virtual machine images.</p>
            <p>All of the software is Open Source and made so that
            independent parties can run it. Third party services, such
            as oracles, that use the block-chain are made to be
            interoperable. There is no lock-in to a particular
            block-chain or even to block-chain at all.</p>
            <p>If you don’t trust cloud vendors, you can buy some
            preconfigured boxes and stick them wherever. If you can’t
            afford that then you also can’t afford to pay token owners
            enough to justify massive valuations. If you are not
            technical enough, you are not technical enough to protect
            your <em>private keys</em>. You will have to trust a third
            party somewhere down the line.</p>
            <p>To cut a long story short, I can’t see a logical reason
            why a group of end users, people who are <em>getting shit
            done</em>, are going to pay for some tokens. If they need
            that technology, they will find the cheapest way of
            obtaining it. Paying back token holders is a useless expense
            and is easily avoidable.</p>
            <p>There are no technical “network effects” that would
            ensure one block-chain network becomes dominant. It’s
            relatively easy for users to access multiple block-chains at
            once. It’s easy to transfer tokens between chains and even
            convert “smart contracts” from one chain to another.</p>
            <p>Possibly one chain could become dominant if network fees
            stayed the same as the number of users increased. In such a
            case, though, there have to be losers and it’s going to be
            passive token holders.</p>
            <p>Some of these arguments could be falsified by an instance
            of a block-chain application with stable profitability.
            However, I cannot find a single non-circular block-chain
            project with significant end users paying for the
            service.</p>
            <p>Neither can I see a reason why I would personally use
            block-chain or web3 except as a free promotion. Even then I
            wouldn’t trust it with sensitive data. I’d ensure I can run
            my infra on regular cloud or bare metal as well.</p>
            <p>This pains me, because I absolutely love the idea of
            “commoditizing” cloud services and giving AWS a kick in the
            teeth. I really wanted web3 to be viable, but looking at it
            pragmatically, it appears to be an unnecessary
            complication.</p>
            <p>Ironically block-chains are transparent, which I really
            like. You can peer in at everything happening through a
            block-chain explorer. Look at the programs being executed
            and what they do. Usually this is just maintaining the
            network itself or some Dex/NFT/game nonsense. All stuff that
            circles back around to block-chain.</p>
            <p>At this point the only way to invest in block-chain is to
            learn the tech and try to use it. I started on that path and
            found a small amount of tech mixed with huge amounts of
            bullshit. There is always some bullshit with new tech, but
            the ratio here is not good.</p>
            <p>The environment appears very conducive to making money as
            a freelance software developer. The problem is that the level
            of horse shit crosses over from exaggeration and enthusiasm
            into outright fraud. Operating in such an environment while
            maintaining some sense of ethics is more than my little
            heart could bear.</p>
            <p>Frankly we would not be talking about block-chain if it
            weren’t for the speculative mania around Cryptos. Friends
            and family don’t ask me about
            conflict-free-replicated-data-types or cryptographically
            verifiable computations in general.</p>
            <p>NFTs are the purest form of bollocks imaginable. There is
            nothing technically preventing anyone from copying them.
            Their value is a pure social construct. Frankly not too
            dissimilar to physical paintings and other types of art.
            There is nothing concrete for me to say about them.</p>
            <p>However they probably show what block-chain really is, a
            social construct. The manifestation of belief in technology.
            NFTs are obviously bollocks and proud of it. This is fine by
            me. Where other types of block-chain token fall foul is they
            claim to be something other than art.</p>
            <p>The closer I look at such claims, the more they appear to
            be chimerical. Fantasy beasts made up of parts that don’t
            make sense. I can’t rule out in absolute terms that some of
            this technology will come to fruition. Occasionally “putting
            wings on a horse and seeing if it flies”, works.</p>
            <p>The probability though is small and the probability that
            token holders will reap the profits is zero. I have made
            quite a profit on Crypto myself, but my interest is rapidly
            waning as I continue to fail to find anything concrete at
            its base.</p>
            <p>So I’m selling up and sharing my findings as fair
            warning. Be careful giving any of those fuckers your
            money.</p>
            <div class="footnotes footnotes-end-of-document">
            <hr />
            <ol>
            <li id="fn1"><p>Which is probably also old hat by now, but
            it’s a while since I investigated what criminals on the dark
            web are up to<a href="#fnref1"
            class="footnote-back">↩︎</a></p></li>
            <li id="fn2"><p>In my opinion, reserve currencies require
            PoM (Proof of Military) and it is no coincidence the dollar
            is backed by the world’s best military<a href="#fnref2"
            class="footnote-back">↩︎</a></p></li>
            </ol>
            </div>
    </div>
  </content>
</entry>
<entry>
  <title>Supporting both Linux CGroup APIs</title>
  <id>https://richiejp.com/cgroup-compat-layer</id>
  <published>2021-08-17T16:38:17+01:00</published>
  <updated>2021-08-17T16:38:17+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/cgroup-compat-layer" />
  <summary>Linux Control Groups API underwent a major revision. We now
have a legacy V1 interface and the current V2. For now, both must be
supported.</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>I have spent considerable time trying to create a
            “CGroup” compatibility layer, in addition to figuring out
            why a counter in the memory CGroup was underflowing. I now
            know for sure that I do not know what Linux Control Groups
            are. Before I thought maybe I did, but after a much deeper
            investigation, it’s clear I do not.</p>
            <p>With a little bit of luck this article will help tip you
            over the edge. Perhaps it is time for a career change? Have
            you ever considered making things out of wood?</p>
            <p>Joking aside, Linux Control Groups are trees of groups. A
            group can have processes or other groups inside it.
            <em>Normally</em> it can’t have both unless it is the root
            group. Internally the Kernel provides an interface to the
            CGroup hierarchy (or tree(s)).</p>
            <p>So we have a generic hierarchy of groups and processes.
            This is taken advantage of by <em>Controllers</em>. Usually
            these encapsulate control of various resources. In
            particular the memory and CPU controllers allow one to
            restrict the amount of memory and CPU time groups can
            use.</p>
            <p>Internally the kernel provides a standard interface for
            developing controllers. So for each controller we get a
            roughly similar interface. Both internally and in user land.
            In user land we get a file based interface, usually located
            at <code>/sys/fs/cgroup</code>.</p>
            <p>Each controller has wildly different knobs, represented
            by files. Each file can produce and consume arbitrary data.
            Although all the files I have seen are text based.</p>
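            <p>To make the file based interface concrete, below is a
            minimal sketch of reading and writing a knob with plain
            stdio. The helper names <code>knob_read_line</code> and
            <code>knob_write</code> are made up for illustration and are
            not part of the kernel or the LTP. The caller supplies the
            full path to the file.</p>
            <pre><code><![CDATA[
```c
#include <stdio.h>
#include <string.h>

/* Read the first line of a control file into buf, dropping the trailing
 * newline. Returns 0 on success, -1 if the file can't be opened or is empty. */
int knob_read_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;

    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }

    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Write a text value to a control file. Returns 0 on success, -1 on error. */
int knob_write(const char *path, const char *val)
{
    FILE *f = fopen(path, "w");

    if (!f)
        return -1;

    if (fputs(val, f) == EOF) {
        fclose(f);
        return -1;
    }

    /* stdio buffers the data, so a rejected value may only show up as an
     * error when the stream is flushed by fclose() */
    return fclose(f) == 0 ? 0 : -1;
}
```
]]></code></pre>
            <p>Assuming a V2 hierarchy at the usual location and a group
            called <code>ltp</code> (both assumptions), calling
            <code>knob_write("/sys/fs/cgroup/ltp/memory.max", "max")</code>
            would lift the memory limit on that group.</p>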
            <p>Something to keep in mind: there are many controller
            specific details. Some resources require mutual exclusion
            while others can be “over committed”. Details such as these
            can break abstractions. Linux does not hold back features
            because they have “leaky” abstractions. This is a reason for
            its success and a source of confusion.</p>
            <p>Linux also has the following maxim: <em>do not break user
            land</em>. This creates an interesting scenario when a
            regrettable interface is introduced. Which appears to be
            what happened with CGroups V1.</p>
            <p>The first interface allowed each controller to have its
            own hierarchy. In V2 this was simplified to a single
            hierarchy. Many other things were changed as well. Including
            many details of the controller interfaces.</p>
            <p>Because the kernel can’t break user land, it now must
            support both. It also supports hybrid configurations, where
            both V1 and V2 are active at once. This is possibly because
            controllers are being migrated to V2 piecemeal. So some
            controllers are missing altogether from V2. Meanwhile others
            are missing features in V2.</p>
            <p>Each controller must be exclusively mounted as V1 or V2.
            However we may have a mixture of different V1 and V2
            controllers. It’s not clear to me whether anyone
            <em>needs</em> a hybrid setup at this point. However it is
            being used by various distributions.</p>
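            <p>One way to see which setup a system has is to look at the
            filesystem types in <code>/proc/self/mounts</code>. V1
            hierarchies are mounted with type <code>cgroup</code> and
            the V2 hierarchy with type <code>cgroup2</code>. The sketch
            below is only that; the names <code>cg_mount_line_ver</code>
            and <code>cg_scan_mounts</code> are invented here and the
            LTP library does considerably more work than this.</p>
            <pre><code><![CDATA[
```c
#include <stdio.h>
#include <string.h>

enum cg_ver { CG_NONE = 0, CG_V1 = 1, CG_V2 = 2 };

/* Classify one line in /proc/self/mounts format:
 * "device mount-point fs-type options dump pass" */
enum cg_ver cg_mount_line_ver(const char *line)
{
    char fstype[32];

    /* Skip the first two fields and keep the third, the fs type */
    if (sscanf(line, "%*s %*s %31s", fstype) != 1)
        return CG_NONE;

    if (!strcmp(fstype, "cgroup2"))
        return CG_V2;
    if (!strcmp(fstype, "cgroup"))
        return CG_V1;

    return CG_NONE;
}

/* Scan a mounts file and return a bitmask of the versions found.
 * CG_V1 | CG_V2 indicates a hybrid setup. Returns -1 if the file
 * can't be opened. */
int cg_scan_mounts(const char *mounts_path)
{
    char line[4096];
    int found = 0;
    FILE *f = fopen(mounts_path, "r");

    if (!f)
        return -1;

    while (fgets(line, sizeof(line), f))
        found |= cg_mount_line_ver(line);

    fclose(f);
    return found;
}
```
]]></code></pre>
            <p>If <code>cg_scan_mounts("/proc/self/mounts")</code>
            returns <code>CG_V1 | CG_V2</code>, then both kinds of
            hierarchy are mounted. It says nothing about which
            controllers are attached where, which is the harder
            part.</p>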
            <p>The Linux Test Project has many tests which rely on
            CGroups. A lot of these were, and many still are, limited to
            CGroups V1. Perhaps foolishly, we decided to create a
            compatibility layer. Below is what the author wrote in <a
            href="https://github.com/linux-test-project/ltp/blob/master/include/tst_cgroup.h"><code>tst_cgroup.h</code></a>.</p>
            <blockquote>
            <p>The LTP CGroups API tries to present a consistent
            interface to the many possible CGroup configurations a
            system could have.</p>
            <p>You may ask; “Why don’t you just mount a simple CGroup
            hierarchy, instead of scanning the current setup?”. The
            short answer is that it is not possible unless no CGroups
            are currently active and almost all of our users will have
            CGroups active. Even if unmounting the current CGroup
            hierarchy is a reasonable thing to do to the system manager,
            it is highly unlikely the CGroup hierarchy will be
            destroyed. So users would be forced to remove their CGroup
            configuration and reboot the system.</p>
            </blockquote>
            <p>This perhaps deserves some emphasis. We need to test with
            specific CGroup controls, but we also have to play nice with
            <code>init</code>. There is
            <code>unshare(CLONE_CGROUP_NAMESPACE)</code> which may help,
            but it requires root.</p>
            <blockquote>
            <p>The core library tries to ensure an LTP CGroup exists on
            each hierarchy root. Inside the LTP group it ensures a
            ‘drain’ group exists and creates a test group for the current
            test. In the worst case we end up with a set of hierarchies
            like the following, where existing system-manager-created
            CGroups have been omitted.</p>
            <pre><code>      (V2 Root)       (V1 Root 1)     ...     (V1 Root N)
          |                |                      |
        (ltp)            (ltp)        ...        (ltp)
       /     \          /     \                  /    \
  (drain) (test-n) (drain)  (test-n)  ...     (drain)  (test-n)</code></pre>
            <p>V2 CGroup controllers use a single unified hierarchy on a
            single root. Two or more V1 controllers may share a root or
            have their own root. However there may exist only one
            instance of a controller. So you can not have the same V1
            controller on multiple roots.</p>
            <p>It is possible to have both a V2 hierarchy and V1
            hierarchies active at the same time. Which is what is shown
            above. Any controllers attached to V1 hierarchies will not
            be available in the V2 hierarchy. The reverse is also
            true.</p>
            <p>Note that a single hierarchy may be mounted multiple
            times. Allowing it to be accessed at different locations.
            However subsequent mount operations will fail if the mount
            options are different from the first.</p>
            <p>The user may pre-create the CGroup hierarchies and the
            ltp CGroup, otherwise the library will try to create them.
            If the ltp group already exists and has appropriate
            permissions, then admin privileges will not be required to
            run the tests.</p>
            <p>Because the test may not have access to the CGroup
            root(s), the drain CGroup is created. This can be used to
            store processes which would otherwise block the destruction
            of the individual test CGroup or one of its descendants.</p>
            <p>The test author may create child CGroups within the test
            CGroup using the CGroup Item API. The library will create
            the new CGroup in all the relevant hierarchies.</p>
            <p>There are many differences between the V1 and V2 CGroup
            APIs. If a controller is on both V1 and V2, it may have
            different parameters and control files. Some of these
            control files have a different name, but similar
            functionality. In this case the Item API uses the V2 names
            and aliases them to the V1 name when appropriate.</p>
            <p>Some control files only exist on one of the versions or
            they can be missing due to other reasons. The Item API
            allows the user to check if the file exists before trying to
            use it.</p>
            <p>Often a control file has almost the same functionality
            between V1 and V2. Which means it can be used in the same
            way most of the time, but not all. For now this is handled
            by exposing the API version a controller is using to allow
            the test author to handle edge cases. (e.g. V2
            memory.swap.max accepts “max”, but V1
            memory.memsw.limit_in_bytes does not).</p>
            </blockquote>
            <p>So what does this API look like? Below is an example
            taken from <a
            href="https://github.com/linux-test-project/ltp/wiki/C-Test-API#136-using-control-group">the
            docs</a>.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="pp">#include </span><span class="im">&quot;tst_test.h&quot;</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="pp">#include </span><span class="im">&quot;tst_cgroup.h&quot;</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="dt">static</span> <span class="dt">const</span> <span class="kw">struct</span> tst_cgroup_group <span class="op">*</span>cg<span class="op">;</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> run<span class="op">(</span><span class="dt">void</span><span class="op">)</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a>    <span class="co">// do test under cgroup</span></span>
<span id="cb2-10"><a href="#cb2-10" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb2-11"><a href="#cb2-11" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-12"><a href="#cb2-12" tabindex="-1"></a></span>
<span id="cb2-13"><a href="#cb2-13" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> setup<span class="op">(</span><span class="dt">void</span><span class="op">)</span></span>
<span id="cb2-14"><a href="#cb2-14" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-15"><a href="#cb2-15" tabindex="-1"></a>    tst_cgroup_require<span class="op">(</span><span class="st">&quot;memory&quot;</span><span class="op">,</span> NULL<span class="op">);</span></span>
<span id="cb2-16"><a href="#cb2-16" tabindex="-1"></a>    cg <span class="op">=</span> tst_cgroup_get_test_group<span class="op">();</span></span>
<span id="cb2-17"><a href="#cb2-17" tabindex="-1"></a>    SAFE_CGROUP_PRINTF<span class="op">(</span>cg<span class="op">,</span> <span class="st">&quot;cgroup.procs&quot;</span><span class="op">,</span> <span class="st">&quot;</span><span class="sc">%d</span><span class="st">&quot;</span><span class="op">,</span> getpid<span class="op">());</span></span>
<span id="cb2-18"><a href="#cb2-18" tabindex="-1"></a>    SAFE_CGROUP_PRINTF<span class="op">(</span>cg<span class="op">,</span> <span class="st">&quot;memory.max&quot;</span><span class="op">,</span> <span class="st">&quot;</span><span class="sc">%lu</span><span class="st">&quot;</span><span class="op">,</span> MEMSIZE<span class="op">);</span></span>
<span id="cb2-19"><a href="#cb2-19" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>SAFE_CGROUP_HAS<span class="op">(</span>cg<span class="op">,</span> <span class="st">&quot;memory.swap.max&quot;</span><span class="op">))</span></span>
<span id="cb2-20"><a href="#cb2-20" tabindex="-1"></a>        SAFE_CGROUP_PRINTF<span class="op">(</span>cg<span class="op">,</span> <span class="st">&quot;memory.swap.max&quot;</span><span class="op">,</span> <span class="st">&quot;</span><span class="sc">%zu</span><span class="st">&quot;</span><span class="op">,</span> memsw<span class="op">);</span></span>
<span id="cb2-21"><a href="#cb2-21" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-22"><a href="#cb2-22" tabindex="-1"></a></span>
<span id="cb2-23"><a href="#cb2-23" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> cleanup<span class="op">(</span><span class="dt">void</span><span class="op">)</span></span>
<span id="cb2-24"><a href="#cb2-24" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-25"><a href="#cb2-25" tabindex="-1"></a>    tst_cgroup_cleanup<span class="op">();</span></span>
<span id="cb2-26"><a href="#cb2-26" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-27"><a href="#cb2-27" tabindex="-1"></a></span>
<span id="cb2-28"><a href="#cb2-28" tabindex="-1"></a><span class="kw">struct</span> tst_test test <span class="op">=</span> <span class="op">{</span></span>
<span id="cb2-29"><a href="#cb2-29" tabindex="-1"></a>    <span class="op">.</span>setup <span class="op">=</span> setup<span class="op">,</span></span>
<span id="cb2-30"><a href="#cb2-30" tabindex="-1"></a>    <span class="op">.</span>test_all <span class="op">=</span> run<span class="op">,</span></span>
<span id="cb2-31"><a href="#cb2-31" tabindex="-1"></a>    <span class="op">.</span>cleanup <span class="op">=</span> cleanup<span class="op">,</span></span>
<span id="cb2-32"><a href="#cb2-32" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb2-33"><a href="#cb2-33" tabindex="-1"></a><span class="op">};</span></span></code></pre></div>
            <p>This works quite nicely for the memory CGroup. Most of
            the time we can just translate V2 names to V1 names. However
            there are things V2 accepts which V1 does not. For example V2
            allows “max” to be written to <code>memory.max</code>, but
            V1 does not allow it to be written to
            <code>memory.limit_in_bytes</code>.</p>
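            <p>The name translation can be pictured as a lookup table
            from V2 names to V1 names. What follows is a simplified
            sketch using only the pairs mentioned in this article. It is
            not the LTP implementation and <code>knob_file_name</code>
            is a hypothetical helper.</p>
            <pre><code><![CDATA[
```c
#include <stddef.h>
#include <string.h>

struct knob_alias {
    const char *v2_name;
    const char *v1_name;
};

/* Only the pairs named in this article; a real table would be longer */
static const struct knob_alias aliases[] = {
    { "memory.max", "memory.limit_in_bytes" },
    { "memory.swap.max", "memory.memsw.limit_in_bytes" },
    { "cpu.max", "cpu.cfs_quota_us" },
};

/* Return the file name to use for a V2 knob name on the given API version.
 * Names without an alias (e.g. cgroup.procs) are returned unchanged. */
const char *knob_file_name(const char *v2_name, int api_ver)
{
    size_t i;

    if (api_ver != 1)
        return v2_name;

    for (i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++) {
        if (!strcmp(aliases[i].v2_name, v2_name))
            return aliases[i].v1_name;
    }

    return v2_name;
}
```
]]></code></pre>
            <p>A table like this only renames files. It does nothing
            about values which one version accepts and the other
            rejects, so the test author still needs a way to ask which
            version is in use.</p>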
            <p>For other CGroups we are looking at some bigger issues.
            The following function sets the “bandwidth” of a CPU CGroup
            in <a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sched/cfs-scheduler/cfs_bandwidth01.c">cfs_bandwidth01</a>.
            The bandwidth is the amount of CPU time used in a given
            period. So it is a two dimensional value.</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> set_cpu_quota<span class="op">(</span><span class="dt">const</span> <span class="kw">struct</span> tst_cgroup_group <span class="op">*</span><span class="dt">const</span> cg<span class="op">,</span></span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a>              <span class="dt">const</span> <span class="dt">float</span> quota_percent<span class="op">)</span></span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">unsigned</span> <span class="dt">int</span> period_us <span class="op">=</span> <span class="dv">10000</span><span class="op">;</span></span>
<span id="cb3-5"><a href="#cb3-5" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">unsigned</span> <span class="dt">int</span> quota_us <span class="op">=</span> <span class="op">(</span>quota_percent <span class="op">/</span> <span class="dv">100</span><span class="op">)</span> <span class="op">*</span> <span class="op">(</span><span class="dt">float</span><span class="op">)</span>period_us<span class="op">;</span></span>
<span id="cb3-6"><a href="#cb3-6" tabindex="-1"></a></span>
<span id="cb3-7"><a href="#cb3-7" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>TST_CGROUP_VER<span class="op">(</span>cg<span class="op">,</span> <span class="st">&quot;cpu&quot;</span><span class="op">)</span> <span class="op">!=</span> TST_CGROUP_V1<span class="op">)</span> <span class="op">{</span></span>
<span id="cb3-8"><a href="#cb3-8" tabindex="-1"></a>        SAFE_CGROUP_PRINTF<span class="op">(</span>cg<span class="op">,</span> <span class="st">&quot;cpu.max&quot;</span><span class="op">,</span></span>
<span id="cb3-9"><a href="#cb3-9" tabindex="-1"></a>                   <span class="st">&quot;</span><span class="sc">%u</span><span class="st"> </span><span class="sc">%u</span><span class="st">&quot;</span><span class="op">,</span> quota_us<span class="op">,</span> period_us<span class="op">);</span></span>
<span id="cb3-10"><a href="#cb3-10" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb3-11"><a href="#cb3-11" tabindex="-1"></a>        SAFE_CGROUP_PRINTF<span class="op">(</span>cg<span class="op">,</span> <span class="st">&quot;cpu.cfs_period_us&quot;</span><span class="op">,</span></span>
<span id="cb3-12"><a href="#cb3-12" tabindex="-1"></a>                  <span class="st">&quot;</span><span class="sc">%u</span><span class="st">&quot;</span><span class="op">,</span> period_us<span class="op">);</span></span>
<span id="cb3-13"><a href="#cb3-13" tabindex="-1"></a>        <span class="co">/* Actually cpu.cfs_quota_us, but we translate it */</span></span>
<span id="cb3-14"><a href="#cb3-14" tabindex="-1"></a>        SAFE_CGROUP_PRINTF<span class="op">(</span>cg<span class="op">,</span> <span class="st">&quot;cpu.max&quot;</span><span class="op">,</span></span>
<span id="cb3-15"><a href="#cb3-15" tabindex="-1"></a>                   <span class="st">&quot;</span><span class="sc">%u</span><span class="st">&quot;</span><span class="op">,</span> quota_us<span class="op">);</span></span>
<span id="cb3-16"><a href="#cb3-16" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb3-17"><a href="#cb3-17" tabindex="-1"></a></span>
<span id="cb3-18"><a href="#cb3-18" tabindex="-1"></a>    tst_res<span class="op">(</span>TINFO<span class="op">,</span> <span class="st">&quot;Set &#39;</span><span class="sc">%s</span><span class="st">/cpu.max&#39; = &#39;</span><span class="sc">%u</span><span class="st"> </span><span class="sc">%u</span><span class="st">&#39;&quot;</span><span class="op">,</span></span>
<span id="cb3-19"><a href="#cb3-19" tabindex="-1"></a>        tst_cgroup_group_name<span class="op">(</span>cg<span class="op">),</span> quota_us<span class="op">,</span> period_us<span class="op">);</span></span>
<span id="cb3-20"><a href="#cb3-20" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>Note that we must branch on the CGroup Controller
            version. In V1 two files were used to represent the two
            values representing the bandwidth. In V2 these were
            combined. Presently our translation layer can’t handle
            something like this. It’s not entirely clear if it is
            needed. Often the extra complication to the library code is
            not worth saving some lines in the tests.</p>
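            <p>If we did want the library to hide this, the translation
            would need to split or join values as well as rename files.
            Below is a rough sketch of what that might involve. The
            function names are hypothetical and none of this exists in
            the LTP.</p>
            <pre><code><![CDATA[
```c
#include <stdio.h>

/* Split a V2 style "quota period" value into the two numbers V1 expects in
 * cpu.cfs_quota_us and cpu.cfs_period_us. Returns 0 on success, -1 if two
 * integers can't be parsed (e.g. the quota is "max"). */
int cpu_max_split(const char *v2_val, unsigned int *quota_us,
                  unsigned int *period_us)
{
    if (sscanf(v2_val, "%u %u", quota_us, period_us) != 2)
        return -1;

    return 0;
}

/* The reverse: build the V2 value from the two V1 values.
 * Returns 0 on success, -1 if buf is too small. */
int cpu_max_join(char *buf, size_t len, unsigned int quota_us,
                 unsigned int period_us)
{
    int n = snprintf(buf, len, "%u %u", quota_us, period_us);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```
]]></code></pre>
            <p>Even this small case needs a decision about what to do
            with “max”, which hints at why the extra library code may
            not pay for itself.</p>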
            <p>Likely there are many other corner cases. On the plus
            side, we are now able to run some tests on way more setups.
            Practically speaking the change from V1 to V2 did break user
            land. At least it broke LTP. Although to be fair LTP is not
            a real user. Still some tests stopped working because of the
            introduction of V2. Well technically the adoption of V2
            configs by init systems like Systemd broke LTP…</p>
    </div>
  </content>
</entry>
<entry>
  <title>A review of tools for rolling your own C static
analysis</title>
  <id>https://richiejp.com/custom-c-static-analyses-tools</id>
  <published>2021-08-09T20:52:28+01:00</published>
  <updated>2025-01-28T07:30:23Z</updated>
  <link rel="alternate" href="https://richiejp.com/custom-c-static-analyses-tools" />
  <summary>Why we chose Sparse to develop the Linux Test Project
“compile time” checks. Our experience with a few Open Source tools.
Including sample code for Sparse, Coccinelle and libclang.</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>So you have a large C code base. It has its own
            libraries, rules and context. You see the same mistakes
            being repeated. Not general C coding mistakes, but errors
            unique to your project.</p>
            <p>They look like the kind of mistakes which could be
            detected at “compile time”. In fact, they may only be
            detectable at compile time. Put another way, we think we can
            find bugs in the source code without running it. This is
            commonly referred to as “static analysis”.</p>
            <p><a href="https://github.com/linux-test-project">The Linux
            Test Project</a> is a large and eccentric C code base. It
            has its own library and rules. We most definitely see the
            same mistakes, again and again. Despite having spent years
            developing the LTP, the present author still makes those
            mistakes.</p>
            <p>Beyond simple regular expressions and syntactic checks,
            writing static analysis tools is hard. Some languages come
            equipped with self analysis or reflection. Needless to say,
            this is not the case for C.</p>
            <p>While we can justify a significant investment in
            developing checks, it can not be on the order of creating a
            new compiler. Much code review, feedback delays and bugs can
            be eliminated with checks. However the effort saved by
            static analysis should not all be spent on developing
            static analysis.</p>
            <p>Luckily C does have tooling available to develop semantic
            checks and perform various types of analysis. For the LTP we
            investigated a range of tools. This article will review
            those tools and explain our choices.</p>
            <p>For the time being we have chosen <a
            href="https://sparse.docs.kernel.org/en/latest/">Sparse</a>
            to power our primary tool. This is not the most powerful,
            nor arguably the most user friendly. It is the most
            self-contained and small enough to vendor in.</p>
            <h1 id="background">Background</h1>
            <h2 id="program-representation">Program representation</h2>
            <p>There are many ways to represent a computer program. With
            different abstractions and structures. One may also
            partially evaluate a program with given assumptions.
            Theoretically there is no difference between transforming a
            program and running it. They are both computations.</p>
            <p>For the sake of this review and the simple minds of
            quality assurance engineers, we just need to roughly
            identify what level each tool works on.</p>
            <ol style="list-style-type: decimal">
            <li>Text</li>
            <li>Abstract Syntax Tree (AST)</li>
            <li>Linearised Intermediate Representation (IR)</li>
            <li>Exploded Graph or IR with state tracking</li>
            </ol>
            <p>The first level is the raw source code. Secondly we have
            the AST which is fairly well known. Then we have IR and IR
            with state. IR is a generic term which we shall take some
            liberties with.</p>
            <p>C compilers usually convert the AST into a collection of
            linear instruction blocks (basic blocks). These blocks are
            linked together into a graph or network (Control Flow
            Graph). The links (graph edges) represent function calls,
            jumps or branches.</p>
            <p>The instructions within a block are sequential. Meanwhile
            one may go between blocks in whatever order the logic
            allows. Usually compilers also convert the IR into Static
            Single Assignment (SSA) form. Meaning each variable is only
            assigned to once. New variable names are generated for each
            additional assignment.</p>
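            <p>As a rough illustration, the first function below is
            ordinary C and the second is the same logic written by hand
            so that every variable is assigned exactly once. Real SSA
            lives in the compiler’s IR rather than in C source, and the
            conditional expression is only standing in for what is
            usually called a phi node. The function names are made up
            for the example.</p>
            <pre><code><![CDATA[
```c
/* Ordinary C: x is assigned up to three times */
int clamp_double(int a, int b)
{
    int x = a + b;

    if (x > 100)
        x = 100;

    x = x * 2;
    return x;
}

/* The same logic with a fresh name for every assignment */
int clamp_double_ssa(int a, int b)
{
    const int x1 = a + b;
    const int x2 = 100;
    /* phi(x2, x1): pick the version from whichever path was taken */
    const int x3 = (x1 > 100) ? x2 : x1;
    const int x4 = x3 * 2;

    return x4;
}
```
]]></code></pre>
            <p>With one name per assignment, asking “where did this
            value come from?” has a single answer for each name, which
            is much of the appeal for analysis.</p>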
            <p>In this article we use IR to mean approximately the
            above. If you look in the compiler of a functional language
            you may find something utterly different. Also most
            compilers tend to use a different IR for assembly
            generation. We are not concerned with that.</p>
            <p>IR with state is where we are given a range of possible
            values for each variable. This means we may also be given
            multiple versions of the code for each branch which sets a
            variable differently. Thus resulting in an “exploded”
            CFG.</p>
            <p>If that makes no sense to you, then imagine being able to
            see multiple parallel universes. With the caveat that each
            universe could be further split into more exact
            possibilities.</p>
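            <p>A small, made up example may help. The comments describe
            the value ranges an analyser could plausibly track on each
            path. They are my annotations, not the output of any
            particular tool.</p>
            <pre><code><![CDATA[
```c
#include <stddef.h>

static const char *const names[] = { "zero", "one", "two", "three" };

const char *lookup(int n)
{
    int idx;

    if (n < 0)
        idx = -1;      /* path A: n in [INT_MIN, -1], idx == -1 */
    else
        idx = n % 4;   /* path B: n in [0, INT_MAX], idx in [0, 3] */

    /* A plain CFG merges both paths here, so idx is in [-1, 3].
     * An exploded graph keeps a separate copy of this node per path. */
    if (idx < 0)
        return NULL;   /* only reachable on path A */

    return names[idx]; /* only reachable on path B, so idx in [0, 3] */
}
```
]]></code></pre>
            <p>Keeping the paths apart is what would let a tool conclude
            the array access is in bounds, at the cost of having more
            states to track.</p>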
            <p>For many checks, the AST is not an ideal level of
            abstraction. Even just figuring out if a memory location is
            modified from the AST can be difficult. There are lots of
            syntactic constructs which will cause a store instruction to
            be issued.</p>
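            <p>For instance, each function below modifies the same
            global in a syntactically different way. In the AST these
            are different kinds of node, whereas in a linearised IR each
            would typically end up as a store to the same location (the
            <code>memset</code> call may stay a call, depending on the
            compiler). <code>fake_ret</code> is a hypothetical stand-in
            for a variable such as <code>TST_RET</code>.</p>
            <pre><code><![CDATA[
```c
#include <string.h>

long fake_ret; /* hypothetical stand-in for a global such as TST_RET */

/* Plain assignment */
void store_assign(void)   { fake_ret = 1; }

/* Compound assignment: a load, an add and a store */
void store_compound(void) { fake_ret += 2; }

/* Postfix increment */
void store_incr(void)     { fake_ret++; }

/* Store through a pointer; the variable's name is not on the left side */
void store_ptr(void)      { long *p = &fake_ret; *p = 10; }

/* Store hidden inside a library call */
void store_memset(void)   { memset(&fake_ret, 0, sizeof(fake_ret)); }
```
]]></code></pre>
            <p>A check written against the AST has to recognise every
            one of these forms, and this list is far from complete.</p>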
            <p>The reason compilers have “linearized” IR, is to cut out
            a bunch of syntactic details. Making it easier to perform
            optimisations and machine code generation. Sometimes the AST
            is the best place to do certain types of analysis, but often
            not.</p>
            <h2 id="ltp-requirements">LTP Requirements</h2>
            <p>The review of each tool is heavily influenced by the
            LTP’s requirements. With a different project in mind, one
            may come to very different conclusions.</p>
            <p>The LTP project can not tolerate more barriers to
            development. We have contributors from many different
            organisations. All using different Linux distros. There are
            even some downstream forks of LTP on other operating
            systems.</p>
            <p>The test API is very large, there is a lot to learn for a
            new contributor. Not to mention that the thing we are
            testing is exceedingly complicated.</p>
            <p>There are many things we want to create checks for.
            However we have begun by trying to enforce <a
            href="https://github.com/linux-test-project/ltp/wiki/LTP-Library-API-Writing-Guidelines#22-ltp-002-tst_ret-and-tst_err-are-not-modified">the
            following rule</a>:</p>
            <pre><code>The test author is guaranteed that the test API will not modify the
TST_ERR and TST_RET. This prevents silent errors where the return
value and errno are overwritten before the test has chance to
check them.</code></pre>
            <p>These global variables are used by the <a
            href="https://github.com/linux-test-project/ltp/wiki/C-Test-API#12-basic-test-interface"><code>TEST()</code>
            macro</a> and similar. These are intended for exclusive use
            by the test author. However they were being written to by
            library code.</p>
            <p>Possibly there is some way to define these variables as
            <code>const</code> in library code, but not in test code. In
            general though we want an extensible way of performing
            checks. This seemed as good a place as any to start.</p>
            <p>The goal is to enforce these checks and for contributors
            to run them locally. Both to save reviewer time and to save
            contributor time. It is much less expensive to correct
            mistakes before sending a patch.</p>
            <p>If checks are not mandatory, then they tend to be
            forgotten about. Or it becomes one person’s job to run and
            maintain them. This then means the checks have to be robust.
            We need to avoid false positives and allow checks to be
            disabled sometimes.</p>
            <h1 id="tool-overview">Tool overview</h1>
            <p>We took a look at the following tools.</p>
            <ul>
            <li>GCC
            <ul>
            <li><a href="#GCC-Analyzer">Analyzer</a></li>
            <li><a href="#GCC-Plugins">Plugins</a></li>
            </ul></li>
            <li><a href="">Smatch</a></li>
            <li>Clang
            <ul>
            <li><a href="#Clang-Plugins">Clang Plugins</a></li>
            <li><a href="#Clang-Analyzer">Clang analyzer</a></li>
            <li><a href="#Clang-Tidy">Clang Tidy</a></li>
            <li><a href="#Clang-LibTooling">Clang LibTooling</a></li>
            <li><a href="#libclang">libclang</a></li>
            </ul></li>
            <li><a href="#Coccinelle">Coccinelle</a></li>
            <li><a href="#Sparse">Sparse</a></li>
            <li><a href="#Tree-sitter">Tree-sitter</a></li>
            </ul>
            <p>There are more out there. These are not even the only
            ones we found. We just have limited time and resources.</p>
            <p>We did not investigate any proprietary tools. It is
            expected that LTP developers can freely download, modify and
            run any mandatory development tools.</p>
            <p>The amount of time and effort assigned to each tool was
            not equal. They have been listed roughly in order of the
            amount of progress made before giving up. Tools such as GCC
            I quickly abandoned. This is as much due to the nature of
            the LTP as the tool in question.</p>
            <h1 id="gcc">GCC</h1>
            <p>The GCC compiler is available on practically every Linux
            distribution and desktop OS. It is the main compiler used to
            build LTP. We don’t need to worry about parsing problems and
            non-standard C when using GCC.</p>
            <p>Some of the older parts of the LTP are quite disgusting.
            They are not compliant with any C standard, but GCC has
            accepted them. We of course want to remove this code, but in
            the meantime we have to deal with it.</p>
            <h2 id="gcc-analyzer">GCC Analyzer</h2>
            <p>The <a
            href="https://gcc.gnu.org/onlinedocs/gcc/Static-Analyzer-Options.html">GCC
            Analyzer</a> appears to be powerful, if inaccurate. It tracks
            program control and data flow, including between procedures,
            which is often absent in analysers of this type. This means
            you can see an approximation of what values a variable may
            take at any given point in the program.</p>
            <p>At the time we investigated it, there did not seem to be
            any way to extend it. So we couldn’t use it to develop LTP
            specific checks without forking GCC.</p>
            <p>It did find some general errors, such as null pointer
            dereferences. It also reported some false positives and
            missed other errors.</p>
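            <p>To make the inter-procedural point concrete, here is a
            small hypothetical example; none of these names come from
            the LTP. The NULL originates in <code>lookup()</code> and is
            only dereferenced in a caller, so a checker that looks at
            one function at a time has nothing to report. Compiling with
            <code>gcc -fanalyzer</code> is the kind of invocation that
            can flag the unchecked path, although whether it actually
            does will depend on the GCC version.</p>
<pre><![CDATA[
```c
#include <stddef.h>
#include <string.h>

struct entry {
    const char *key;
    const char *name;
};

static const struct entry table[] = {
    { "ltp", "Linux Test Project" },
};

/* Returns NULL when the key is unknown. */
static const struct entry *lookup(const char *key)
{
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (!strcmp(table[i].key, key))
            return &table[i];
    }

    return NULL;
}

/*
 * Deliberately unchecked: the NULL comes from lookup(), the dereference
 * happens here. Only safe to call with a key that is known to exist.
 */
size_t name_len_unchecked(const char *key)
{
    return strlen(lookup(key)->name);
}

/* Checked variant: no path dereferences a NULL entry. */
size_t name_len_checked(const char *key)
{
    const struct entry *e = lookup(key);

    return e ? strlen(e->name) : 0;
}
```
]]></pre>
            <p>Both functions behave the same for a known key. Only the
            checked variant is safe for an unknown one, and seeing that
            requires following the value across the call boundary.</p>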
            <h2 id="gcc-plugins">GCC Plugins</h2>
            <p><a
            href="http://gcc.gnu.org/onlinedocs/gccint/Plugins.html">GCC
            has plugins</a> which allow one to interfere with various
            compilation passes. They provide access to more than one
            type of intermediate representation used by GCC.</p>
            <p>The access is both read and write. So we could also
            create our own instrumentation, that is, insert runtime
            checks.</p>
            <p>Primarily there are two reasons for discarding this
            option. Firstly GCC’s code base is nearly opaque. Secondly
            plugins appear to be version dependent. Possibly GCC’s
            internal representation does not change much. However even
            small changes would create issues when the resident compiler
            “expert” is not around.</p>
            <p>This means we would have a high up-front cost and ongoing
            maintenance. This is a shame because we can always rely on
            LTP developers to have GCC.</p>
            <h1 id="smatch">Smatch</h1>
            <p>The <a href="https://repo.or.cz/w/smatch.git">Smatch
            analyser</a> is in the same league as the GCC Analyzer and
            Clang Analyzer. It can do inter-procedural control flow and
            state tracking. At least if you can figure out how to
            operate it.</p>
            <p>To get the full might of Smatch one needs to generate a
            database. This required quite some time and fiddling for the
            LTP. It was never quite clear if this was fully working. On
            the other hand, it found some general bugs without any false
            positives.</p>
            <p>To extend Smatch we could either fork it or submit LTP
            specific tests upstream. It is clear how a new check is
            added to Smatch. It is less obvious how to construct the
            check logic.</p>
            <p>Smatch now uses Sparse to parse C into an AST. Otherwise
            it seems to be its own beast. It is possibly the only
            analyser of this type which can find bugs in the Linux
            kernel without producing huge amounts of noise.</p>
            <p>We discarded it for now because of the high level of
            friction. In the future we may be able to swap our checks from
            Sparse to Smatch.</p>
            <h1 id="clang">Clang</h1>
            <p>The <a href="https://llvm.org/">LLVM</a> C frontend.
            Clang is essentially part of LLVM. All of the tools below
            are in the LLVM mono repository. It appears that they all
            get wrapped up into LLVM releases.</p>
            <p>While LLVM and Clang are supported by every major Linux
            distribution, it is often a much older version on stable
            releases. We also found that compiling against LLVM on
            multiple distributions is inconvenient.</p>
            <p>LLVM comes with the <code>llvm-config</code> utility to
            help figure out what compiler flags are needed and such.
            This itself comes in multiple versions on some
            distributions. There isn’t necessarily a default version
            either.</p>
            <p>Unsurprisingly, building LLVM and Clang from source is
            quite time consuming. So we cannot sidestep distribution
            package issues by vendoring it into the LTP.</p>
            <p>Clang can output LLVM IR. We could read this and perform
            checks on it. We did not see an easy way to do this. So it
            was not properly investigated.</p>
            <h2 id="clang-plugins">Clang Plugins</h2>
            <p>Like GCC, Clang has plugins. These appear to be based on
            the same interface(s) as Clang Tidy and LibTooling described
            below.</p>
            <h2 id="clang-analyzer">Clang Analyzer</h2>
            <p>The <a href="https://clang-analyzer.llvm.org/">Clang
            Analyzer</a> is another powerful analyser capable of
            tracking state. It is comparable to GCC’s analyser described
            earlier. Less so to Smatch which is far more self-contained.
            Out of the analysers, Clang appears to produce the most
            false positives.</p>
            <p>Unlike GCC it has an <a
            href="https://clang-analyzer.llvm.org/checker_dev_manual.html">extension
            mechanism</a>. It’s not clear how well supported or popular
            this is. It appears that analyser extensions do not get
            inter-procedural state information.</p>
            <p>It was dismissed primarily for the same reasons as
            LibTooling and Clang Tidy, although it provides an exploded
            graph instead of AST matching. The analyser appears to be
            accessible through Clang Tidy and LibTooling.</p>
            <p>This is a very attractive option for those already
            invested in the LLVM ecosystem.</p>
            <h2 id="clang-tidy">Clang Tidy</h2>
            <p>Checks developed with the <code>clang-tidy</code> command
            need to be added to the LLVM mono repository. Other project
            specific checks have been added to upstream. So perhaps LTP
            specific checks would also be accepted.</p>
            <p>The issue for us is the time between a check being
            accepted into LLVM upstream and the check being available to
            all LTP contributors. Considering the frequency of LLVM
            releases and stable distribution releases, it could be years
            before we can demand test developers run the checker.</p>
            <p>Demanding our contributors download and compile LLVM is
            not reasonable. So we can dismiss the Clang Tidy
            approach.</p>
            <h2 id="clang-libtooling">Clang LibTooling</h2>
            <p>Clang has an unstable C++ interface and a stable C
            interface. LibTooling represents the C++ interface.</p>
            <p>As the C++ interface is not stable, any checks written
            with it will need to be adapted for each LLVM release.
            Although Clang and LLVM are much less opaque than GCC, we
            still can’t afford that kind of maintenance.</p>
            <p>The LTP is also written in C not C++. This is only a
            minor point, but it does save some effort to use C
            throughout.</p>
            <h2 id="libclang">libclang</h2>
            <p>This is the stable C interface. It is a wrapper for the
            C++ interface. The main advantage is that functions are only
            added, not changed or removed.</p>
            <p>It appears that the interface’s primary clients are text
            editors, specifically to enable features like
            auto-completion. The primary header is even called
            <code>Index.h</code>.</p>
            <p>It does provide some access to the AST. This is done
            through a relatively simple and well documented API.
            Combined with its stability promises, we decided this was
            enough to <a
            href="https://patchwork.ozlabs.org/project/ltp/list/?series=&amp;submitter=73518&amp;state=*&amp;q=libclang">give
            it a serious try</a>.</p>
            <p>Below is the code which performs the check in version
            three of the patch series. Code for printing errors and such
            has been removed.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;clang-c/Index.h&gt;</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="co">/* The rules for test, library and tool code are different */</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="kw">enum</span> ltp_tu_kind <span class="op">{</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a>    LTP_LIB<span class="op">,</span></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a>    LTP_OTHER<span class="op">,</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a><span class="co">/* Holds information about the TU which we gathered on the first pass */</span></span>
<span id="cb2-10"><a href="#cb2-10" tabindex="-1"></a><span class="dt">static</span> <span class="kw">struct</span> <span class="op">{</span></span>
<span id="cb2-11"><a href="#cb2-11" tabindex="-1"></a>    <span class="kw">enum</span> ltp_tu_kind tu_kind<span class="op">;</span></span>
<span id="cb2-12"><a href="#cb2-12" tabindex="-1"></a><span class="op">}</span> tu_info<span class="op">;</span></span>
<span id="cb2-13"><a href="#cb2-13" tabindex="-1"></a></span>
<span id="cb2-14"><a href="#cb2-14" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> cursor_cmp_spelling<span class="op">(</span><span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> spelling<span class="op">,</span> CXCursor cursor<span class="op">)</span></span>
<span id="cb2-15"><a href="#cb2-15" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-16"><a href="#cb2-16" tabindex="-1"></a>    CXString cursor_spelling <span class="op">=</span> clang_getCursorSpelling<span class="op">(</span>cursor<span class="op">);</span></span>
<span id="cb2-17"><a href="#cb2-17" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> ret <span class="op">=</span> strcmp<span class="op">(</span>spelling<span class="op">,</span> clang_getCString<span class="op">(</span>cursor_spelling<span class="op">));</span></span>
<span id="cb2-18"><a href="#cb2-18" tabindex="-1"></a></span>
<span id="cb2-19"><a href="#cb2-19" tabindex="-1"></a>    clang_disposeString<span class="op">(</span>cursor_spelling<span class="op">);</span></span>
<span id="cb2-20"><a href="#cb2-20" tabindex="-1"></a></span>
<span id="cb2-21"><a href="#cb2-21" tabindex="-1"></a>    <span class="cf">return</span> ret<span class="op">;</span></span>
<span id="cb2-22"><a href="#cb2-22" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-23"><a href="#cb2-23" tabindex="-1"></a></span>
<span id="cb2-24"><a href="#cb2-24" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> cursor_type_cmp_spelling<span class="op">(</span><span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> spelling<span class="op">,</span> CXCursor cursor<span class="op">)</span></span>
<span id="cb2-25"><a href="#cb2-25" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-26"><a href="#cb2-26" tabindex="-1"></a>    CXType ctype <span class="op">=</span> clang_getCursorType<span class="op">(</span>cursor<span class="op">);</span></span>
<span id="cb2-27"><a href="#cb2-27" tabindex="-1"></a>    CXString ctype_spelling <span class="op">=</span> clang_getTypeSpelling<span class="op">(</span>ctype<span class="op">);</span></span>
<span id="cb2-28"><a href="#cb2-28" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> ret <span class="op">=</span> strcmp<span class="op">(</span>spelling<span class="op">,</span> clang_getCString<span class="op">(</span>ctype_spelling<span class="op">));</span></span>
<span id="cb2-29"><a href="#cb2-29" tabindex="-1"></a></span>
<span id="cb2-30"><a href="#cb2-30" tabindex="-1"></a>    clang_disposeString<span class="op">(</span>ctype_spelling<span class="op">);</span></span>
<span id="cb2-31"><a href="#cb2-31" tabindex="-1"></a></span>
<span id="cb2-32"><a href="#cb2-32" tabindex="-1"></a>    <span class="cf">return</span> ret<span class="op">;</span></span>
<span id="cb2-33"><a href="#cb2-33" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-34"><a href="#cb2-34" tabindex="-1"></a></span>
<span id="cb2-35"><a href="#cb2-35" tabindex="-1"></a><span class="co">/*</span></span>
<span id="cb2-36"><a href="#cb2-36" tabindex="-1"></a><span class="co"> * Check if the </span><span class="al">TEST</span><span class="co">() macro is used inside the library.</span></span>
<span id="cb2-37"><a href="#cb2-37" tabindex="-1"></a><span class="co"> *</span></span>
<span id="cb2-38"><a href="#cb2-38" tabindex="-1"></a><span class="co"> * This check takes an AST node which should already be known to be a</span></span>
<span id="cb2-39"><a href="#cb2-39" tabindex="-1"></a><span class="co"> * macro expansion kind.</span></span>
<span id="cb2-40"><a href="#cb2-40" tabindex="-1"></a><span class="co"> *</span></span>
<span id="cb2-41"><a href="#cb2-41" tabindex="-1"></a><span class="co"> * If the TU appears to be a test executable then the test does not</span></span>
<span id="cb2-42"><a href="#cb2-42" tabindex="-1"></a><span class="co"> * apply. So in that case we return.</span></span>
<span id="cb2-43"><a href="#cb2-43" tabindex="-1"></a><span class="co"> *</span></span>
<span id="cb2-44"><a href="#cb2-44" tabindex="-1"></a><span class="co"> * If the macro expansion AST node is spelled </span><span class="al">TEST</span><span class="co">, then we emit an</span></span>
<span id="cb2-45"><a href="#cb2-45" tabindex="-1"></a><span class="co"> * error. Otherwise do nothing.</span></span>
<span id="cb2-46"><a href="#cb2-46" tabindex="-1"></a><span class="co"> */</span></span>
<span id="cb2-47"><a href="#cb2-47" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> check_TEST_macro<span class="op">(</span>CXCursor macro_cursor<span class="op">)</span></span>
<span id="cb2-48"><a href="#cb2-48" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-49"><a href="#cb2-49" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>tu_info<span class="op">.</span>tu_kind <span class="op">!=</span> LTP_LIB<span class="op">)</span></span>
<span id="cb2-50"><a href="#cb2-50" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb2-51"><a href="#cb2-51" tabindex="-1"></a></span>
<span id="cb2-52"><a href="#cb2-52" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>cursor_cmp_spelling<span class="op">(</span><span class="st">&quot;TEST&quot;</span><span class="op">,</span> macro_cursor<span class="op">))</span> <span class="op">{</span></span>
<span id="cb2-53"><a href="#cb2-53" tabindex="-1"></a>        emit_check_error<span class="op">(</span>macro_cursor<span class="op">,</span></span>
<span id="cb2-54"><a href="#cb2-54" tabindex="-1"></a>               <span class="st">&quot;TEST() macro should not be used in library&quot;</span><span class="op">);</span></span>
<span id="cb2-55"><a href="#cb2-55" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb2-56"><a href="#cb2-56" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-57"><a href="#cb2-57" tabindex="-1"></a></span>
<span id="cb2-58"><a href="#cb2-58" tabindex="-1"></a><span class="co">/* Recursively visit each AST node and run checks based on node kind */</span></span>
<span id="cb2-59"><a href="#cb2-59" tabindex="-1"></a><span class="dt">static</span> <span class="kw">enum</span> CXChildVisitResult check_visitor<span class="op">(</span>CXCursor cursor<span class="op">,</span></span>
<span id="cb2-60"><a href="#cb2-60" tabindex="-1"></a>                         attr_unused CXCursor parent<span class="op">,</span></span>
<span id="cb2-61"><a href="#cb2-61" tabindex="-1"></a>                         attr_unused CXClientData client_data<span class="op">)</span></span>
<span id="cb2-62"><a href="#cb2-62" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-63"><a href="#cb2-63" tabindex="-1"></a>    CXSourceLocation loc <span class="op">=</span> clang_getCursorLocation<span class="op">(</span>cursor<span class="op">);</span></span>
<span id="cb2-64"><a href="#cb2-64" tabindex="-1"></a></span>
<span id="cb2-65"><a href="#cb2-65" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>clang_Location_isInSystemHeader<span class="op">(</span>loc<span class="op">))</span></span>
<span id="cb2-66"><a href="#cb2-66" tabindex="-1"></a>        <span class="cf">return</span> CXChildVisit_Continue<span class="op">;</span></span>
<span id="cb2-67"><a href="#cb2-67" tabindex="-1"></a></span>
<span id="cb2-68"><a href="#cb2-68" tabindex="-1"></a>    <span class="cf">switch</span> <span class="op">(</span>clang_getCursorKind<span class="op">(</span>cursor<span class="op">))</span> <span class="op">{</span></span>
<span id="cb2-69"><a href="#cb2-69" tabindex="-1"></a>    <span class="cf">case</span> CXCursor_MacroExpansion<span class="op">:</span></span>
<span id="cb2-70"><a href="#cb2-70" tabindex="-1"></a>        check_TEST_macro<span class="op">(</span>cursor<span class="op">);</span></span>
<span id="cb2-71"><a href="#cb2-71" tabindex="-1"></a>        <span class="cf">break</span><span class="op">;</span></span>
<span id="cb2-72"><a href="#cb2-72" tabindex="-1"></a>    <span class="cf">default</span><span class="op">:</span></span>
<span id="cb2-73"><a href="#cb2-73" tabindex="-1"></a>        <span class="cf">break</span><span class="op">;</span></span>
<span id="cb2-74"><a href="#cb2-74" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb2-75"><a href="#cb2-75" tabindex="-1"></a></span>
<span id="cb2-76"><a href="#cb2-76" tabindex="-1"></a>    <span class="cf">return</span> CXChildVisit_Recurse<span class="op">;</span></span>
<span id="cb2-77"><a href="#cb2-77" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-78"><a href="#cb2-78" tabindex="-1"></a></span>
<span id="cb2-79"><a href="#cb2-79" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> collect_info_from_args<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> argc<span class="op">,</span> <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> <span class="op">*</span><span class="dt">const</span> argv<span class="op">)</span></span>
<span id="cb2-80"><a href="#cb2-80" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-81"><a href="#cb2-81" tabindex="-1"></a>    <span class="dt">int</span> i<span class="op">;</span></span>
<span id="cb2-82"><a href="#cb2-82" tabindex="-1"></a></span>
<span id="cb2-83"><a href="#cb2-83" tabindex="-1"></a>    <span class="cf">for</span> <span class="op">(</span>i <span class="op">=</span> <span class="dv">0</span><span class="op">;</span> i <span class="op">&lt;</span> argc<span class="op">;</span> i<span class="op">++)</span> <span class="op">{</span></span>
<span id="cb2-84"><a href="#cb2-84" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>strcmp<span class="op">(</span><span class="st">&quot;-DLTPLIB&quot;</span><span class="op">,</span> argv<span class="op">[</span>i<span class="op">]))</span> <span class="op">{</span></span>
<span id="cb2-85"><a href="#cb2-85" tabindex="-1"></a>            tu_info<span class="op">.</span>tu_kind <span class="op">=</span> LTP_LIB<span class="op">;</span></span>
<span id="cb2-86"><a href="#cb2-86" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb2-87"><a href="#cb2-87" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb2-88"><a href="#cb2-88" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-89"><a href="#cb2-89" tabindex="-1"></a></span>
<span id="cb2-90"><a href="#cb2-90" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> argc<span class="op">,</span> <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> <span class="op">*</span><span class="dt">const</span> argv<span class="op">)</span></span>
<span id="cb2-91"><a href="#cb2-91" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-92"><a href="#cb2-92" tabindex="-1"></a>    CXIndex cindex <span class="op">=</span> clang_createIndex<span class="op">(</span><span class="dv">0</span><span class="op">,</span> <span class="dv">1</span><span class="op">);</span></span>
<span id="cb2-93"><a href="#cb2-93" tabindex="-1"></a>    CXTranslationUnit tu<span class="op">;</span></span>
<span id="cb2-94"><a href="#cb2-94" tabindex="-1"></a>    CXCursor tuc<span class="op">;</span></span>
<span id="cb2-95"><a href="#cb2-95" tabindex="-1"></a>    <span class="kw">enum</span> CXErrorCode ret<span class="op">;</span></span>
<span id="cb2-96"><a href="#cb2-96" tabindex="-1"></a></span>
<span id="cb2-97"><a href="#cb2-97" tabindex="-1"></a>    tu_info<span class="op">.</span>tu_kind <span class="op">=</span> LTP_OTHER<span class="op">;</span></span>
<span id="cb2-98"><a href="#cb2-98" tabindex="-1"></a>    collect_info_from_args<span class="op">(</span>argc<span class="op">,</span> argv<span class="op">);</span></span>
<span id="cb2-99"><a href="#cb2-99" tabindex="-1"></a></span>
<span id="cb2-100"><a href="#cb2-100" tabindex="-1"></a>    ret <span class="op">=</span> clang_parseTranslationUnit2<span class="op">(</span></span>
<span id="cb2-101"><a href="#cb2-101" tabindex="-1"></a>        cindex<span class="op">,</span></span>
<span id="cb2-102"><a href="#cb2-102" tabindex="-1"></a>        <span class="co">/*source_filename=*/</span>NULL<span class="op">,</span></span>
<span id="cb2-103"><a href="#cb2-103" tabindex="-1"></a>        argv <span class="op">+</span> <span class="dv">1</span><span class="op">,</span> argc <span class="op">-</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb2-104"><a href="#cb2-104" tabindex="-1"></a>        <span class="co">/*unsaved_files=*/</span>NULL<span class="op">,</span> <span class="co">/*num_unsaved_files=*/</span><span class="dv">0</span><span class="op">,</span></span>
<span id="cb2-105"><a href="#cb2-105" tabindex="-1"></a>        CXTranslationUnit_DetailedPreprocessingRecord<span class="op">,</span></span>
<span id="cb2-106"><a href="#cb2-106" tabindex="-1"></a>        <span class="op">&amp;</span>tu<span class="op">);</span></span>
<span id="cb2-107"><a href="#cb2-107" tabindex="-1"></a></span>
<span id="cb2-108"><a href="#cb2-108" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>ret <span class="op">!=</span> CXError_Success<span class="op">)</span> <span class="op">{</span></span>
<span id="cb2-109"><a href="#cb2-109" tabindex="-1"></a>        emit_error<span class="op">(</span><span class="st">&quot;Failed to parse translation unit!&quot;</span><span class="op">);</span></span>
<span id="cb2-110"><a href="#cb2-110" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb2-111"><a href="#cb2-111" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb2-112"><a href="#cb2-112" tabindex="-1"></a></span>
<span id="cb2-113"><a href="#cb2-113" tabindex="-1"></a>    tuc <span class="op">=</span> clang_getTranslationUnitCursor<span class="op">(</span>tu<span class="op">);</span></span>
<span id="cb2-114"><a href="#cb2-114" tabindex="-1"></a></span>
<span id="cb2-115"><a href="#cb2-115" tabindex="-1"></a>    clang_visitChildren<span class="op">(</span>tuc<span class="op">,</span> check_visitor<span class="op">,</span> NULL<span class="op">);</span></span>
<span id="cb2-116"><a href="#cb2-116" tabindex="-1"></a></span>
<span id="cb2-117"><a href="#cb2-117" tabindex="-1"></a>    <span class="co">/* Stop leak sanitizer from complaining */</span></span>
<span id="cb2-118"><a href="#cb2-118" tabindex="-1"></a>    clang_disposeTranslationUnit<span class="op">(</span>tu<span class="op">);</span></span>
<span id="cb2-119"><a href="#cb2-119" tabindex="-1"></a>    clang_disposeIndex<span class="op">(</span>cindex<span class="op">);</span></span>
<span id="cb2-120"><a href="#cb2-120" tabindex="-1"></a></span>
<span id="cb2-121"><a href="#cb2-121" tabindex="-1"></a>    <span class="cf">return</span> error_flag<span class="op">;</span></span>
<span id="cb2-122"><a href="#cb2-122" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>The above code uses Clang to create an AST from a C file
            (Translation Unit; TU). Then recurses into the AST, checking
            the type (kind) of each node (cursor). If we find a node of
            a kind we can check, then we call a checking function on
            it.</p>
            <p>The LTP build system passes the same flags it would pass
            to the compiler. In addition we add
            <code>-resource-dir $(shell $(CLANG) -print-resource-dir)</code>,
            because libclang cannot find the compiler’s resource
            directory by itself.</p>
            <p>The resource directory contains some compiler specific
            headers and libraries. The <code>clang</code> command is
            able to find it automatically. The code which performs this
            search is not in the Clang library.</p>
            <p>We search the arguments for <code>-DLTPLIB</code> which
            tells us if we are compiling the test library. The
            <code>TEST()</code> macro check only applies to the test
            library. In a previous version we looked at the code itself
            to decide whether the file was a test.</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="co">/* If we find `struct tst_test = {...}` then record that this TU is a test */</span></span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> info_ltp_tu_kind<span class="op">(</span>CXCursor cursor<span class="op">)</span></span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a>    CXCursor initializer<span class="op">;</span></span>
<span id="cb3-5"><a href="#cb3-5" tabindex="-1"></a></span>
<span id="cb3-6"><a href="#cb3-6" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>clang_Cursor_hasVarDeclGlobalStorage<span class="op">(</span>cursor<span class="op">)</span> <span class="op">!=</span> <span class="dv">1</span><span class="op">)</span></span>
<span id="cb3-7"><a href="#cb3-7" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb3-8"><a href="#cb3-8" tabindex="-1"></a></span>
<span id="cb3-9"><a href="#cb3-9" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>cursor_cmp_spelling<span class="op">(</span><span class="st">&quot;test&quot;</span><span class="op">,</span> cursor<span class="op">))</span></span>
<span id="cb3-10"><a href="#cb3-10" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb3-11"><a href="#cb3-11" tabindex="-1"></a></span>
<span id="cb3-12"><a href="#cb3-12" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>cursor_type_cmp_spelling<span class="op">(</span><span class="st">&quot;struct tst_test&quot;</span><span class="op">,</span> cursor<span class="op">))</span></span>
<span id="cb3-13"><a href="#cb3-13" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb3-14"><a href="#cb3-14" tabindex="-1"></a></span>
<span id="cb3-15"><a href="#cb3-15" tabindex="-1"></a>    initializer <span class="op">=</span> clang_Cursor_getVarDeclInitializer<span class="op">(</span>cursor<span class="op">);</span></span>
<span id="cb3-16"><a href="#cb3-16" tabindex="-1"></a></span>
<span id="cb3-17"><a href="#cb3-17" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>clang_Cursor_isNull<span class="op">(</span>initializer<span class="op">))</span></span>
<span id="cb3-18"><a href="#cb3-18" tabindex="-1"></a>        tu_info<span class="op">.</span>tu_kind <span class="op">=</span> LTP_TEST<span class="op">;</span></span>
<span id="cb3-19"><a href="#cb3-19" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>Apart from being more complicated, the problem here was
            <code>clang_Cursor_getVarDeclInitializer</code>. This
            function was only introduced in LLVM 12. Meanwhile stable
            Ubuntu was on LLVM 10. It’s not clear how to achieve the
            same thing without this function.</p>
            <p>There is another problem with our <code>TEST()</code>
            check. The actual requirement is to ensure the variables
            <code>TST_ERR</code> and <code>TST_RET</code> are not
            written to. Determining from the AST if a variable is
            written to is awkward enough. In libclang’s case it seems to
            be impossible. The necessary information is not exposed.</p>
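            <p>To illustrate why writes to these variables matter, here
            is a self-contained sketch. The <code>TEST()</code> macro,
            <code>TST_RET</code> and <code>TST_ERR</code> below are
            simplified stand-ins written for this example rather than
            the real LTP definitions, and both helper functions are
            hypothetical.</p>
<pre><![CDATA[
```c
#include <errno.h>

/* Simplified stand-ins for the LTP globals and macro (assumptions). */
long TST_RET;
int TST_ERR;

#define TEST(SCALL) do { \
    errno = 0;           \
    TST_RET = (SCALL);   \
    TST_ERR = errno;     \
} while (0)

/* Hypothetical call that always fails with EINVAL. */
static long failing_call(void)
{
    errno = EINVAL;
    return -1;
}

/* Bad: a library helper using TEST() overwrites the test's results. */
void lib_helper_clobbers(void)
{
    TEST(failing_call());
}

/* Good: the helper keeps its results in local variables. */
long lib_helper_local(int *err)
{
    long ret;

    errno = 0;
    ret = failing_call();
    *err = errno;

    return ret;
}
```
]]></pre>
            <p>A test which stores a result in <code>TST_RET</code> and
            then calls <code>lib_helper_clobbers()</code> before reading
            it back sees <code>-1</code> instead of its own value.
            Matching on the spelling <code>TEST</code> is only a proxy
            for detecting that kind of write.</p>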
            <p>The amount of friction simply integrating with libclang
            is probably enough for us to have dismissed it. Even if that
            were not the case though, there is too much stuff missing
            for it to be useful.</p>
            <p>If you can use LLVM at all, it is better to use the C++
            interface.</p>
            <h1 id="coccinelle">Coccinelle</h1>
            <p>Also known as the <code>spatch</code> command. It is
            described as a semantic patch tool. It implements a pattern
            matching language which looks somewhat like a C code
            “diff”.</p>
            <p>These patterns match against the syntax, semantics and
            control flow of C code. Under the hood Coccinelle operates
            on one or more IRs of the C program. However the user is not
            exposed to that. We are given a quirky language which looks
            like a Git commit to some C code.</p>
            <p>Apparently a Coccinelle semantic patch is compiled into
            Computation Tree Logic (CTL) and this is matched against
            some representation of the C code. This is perhaps analogous
            to how a regular expression is compiled into an automaton
            and the automaton matches the input text.</p>
            <p>As far as we can tell, Coccinelle does not track state
            automatically. It does understand control flow however.
            Limited state tracking can be added using Python or OCaml
            snippets. These may be attached at certain points in the
            matching process.</p>
            <p>All in all, you can be forgiven for thinking it works by
            magic. The tool has <a
            href="https://coccinelle.gitlabpages.inria.fr/website/papers.html">multiple
            papers and presentations</a>. There is quite a bit of <a
            href="https://coccinelle.gitlabpages.inria.fr/website/documentation.html">documentation</a>.
            Still it is difficult to grasp. One suspects this is due to
            some misconceptions and communication issues. Perhaps the
            notes below will help.</p>
            <ol style="list-style-type: decimal">
            <li><p>There is no plain text or C code in a semantic patch.
            It all has meaning specified by the domain specific
            language. It looks like C code mixed with some special
            symbols, but it is not.</p></li>
            <li><p>Matching takes the control flow into consideration.
            You can specify that all branches must match, or that at
            least one match exists.</p></li>
            <li><p>You can match against the spelling of variables and
            other syntactic details. However it is primarily matching
            against the deeper structure of the program.</p></li>
            </ol>
            <p>With these things in mind you may have more of a chance
            of understanding the documentation.</p>
            <p><code>spatch</code> does not have helpful error messages.
            The implementation is also opaque to us (more on that
            later). So the process of writing a semantic patch is often
            blind trial and error, mixed with reading the docs and
            examples.</p>
            <p>That said it is a truly wonderful tool. We made a lot of
            progress in a short time. Below is a semantic patch which
            both finds and (almost) fixes <code>TEST()</code> macro
            usages.</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a><span class="co">// Find and fix violations of rule LTP-002</span></span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a><span class="co">// Set with -D fix</span></span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a>virtual fix</span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a></span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a><span class="co">// Find all positions where </span><span class="al">TEST</span><span class="co"> is _used_.</span></span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a>@ depends on <span class="op">!</span>fix exists @</span>
<span id="cb4-8"><a href="#cb4-8" tabindex="-1"></a>@@</span>
<span id="cb4-9"><a href="#cb4-9" tabindex="-1"></a></span>
<span id="cb4-10"><a href="#cb4-10" tabindex="-1"></a><span class="op">*</span> TEST<span class="op">(...);</span></span>
<span id="cb4-11"><a href="#cb4-11" tabindex="-1"></a></span>
<span id="cb4-12"><a href="#cb4-12" tabindex="-1"></a><span class="co">// Below are rules which will create a patch to replace </span><span class="al">TEST</span><span class="co"> usage</span></span>
<span id="cb4-13"><a href="#cb4-13" tabindex="-1"></a><span class="co">// It assumes we can use the ret var without conflicts</span></span>
<span id="cb4-14"><a href="#cb4-14" tabindex="-1"></a></span>
<span id="cb4-15"><a href="#cb4-15" tabindex="-1"></a><span class="co">// Fix all references to the variables </span><span class="al">TEST</span><span class="co"> modifies when they occur in a</span></span>
<span id="cb4-16"><a href="#cb4-16" tabindex="-1"></a><span class="co">// function where </span><span class="al">TEST</span><span class="co"> was used.</span></span>
<span id="cb4-17"><a href="#cb4-17" tabindex="-1"></a>@ depends on fix exists @</span>
<span id="cb4-18"><a href="#cb4-18" tabindex="-1"></a>@@</span>
<span id="cb4-19"><a href="#cb4-19" tabindex="-1"></a></span>
<span id="cb4-20"><a href="#cb4-20" tabindex="-1"></a> TEST<span class="op">(...)</span></span>
<span id="cb4-21"><a href="#cb4-21" tabindex="-1"></a></span>
<span id="cb4-22"><a href="#cb4-22" tabindex="-1"></a> <span class="op">...</span></span>
<span id="cb4-23"><a href="#cb4-23" tabindex="-1"></a></span>
<span id="cb4-24"><a href="#cb4-24" tabindex="-1"></a><span class="op">(</span></span>
<span id="cb4-25"><a href="#cb4-25" tabindex="-1"></a><span class="op">-</span> TST_RET</span>
<span id="cb4-26"><a href="#cb4-26" tabindex="-1"></a><span class="op">+</span> ret</span>
<span id="cb4-27"><a href="#cb4-27" tabindex="-1"></a><span class="op">|</span></span>
<span id="cb4-28"><a href="#cb4-28" tabindex="-1"></a><span class="op">-</span> TST_ERR</span>
<span id="cb4-29"><a href="#cb4-29" tabindex="-1"></a><span class="op">+</span> errno</span>
<span id="cb4-30"><a href="#cb4-30" tabindex="-1"></a><span class="op">|</span></span>
<span id="cb4-31"><a href="#cb4-31" tabindex="-1"></a><span class="op">-</span> TTERRNO</span>
<span id="cb4-32"><a href="#cb4-32" tabindex="-1"></a><span class="op">+</span> TERRNO</span>
<span id="cb4-33"><a href="#cb4-33" tabindex="-1"></a><span class="op">)</span></span>
<span id="cb4-34"><a href="#cb4-34" tabindex="-1"></a></span>
<span id="cb4-35"><a href="#cb4-35" tabindex="-1"></a><span class="co">// Replace </span><span class="al">TEST</span><span class="co"> in all functions where it occurs only at the start. It</span></span>
<span id="cb4-36"><a href="#cb4-36" tabindex="-1"></a><span class="co">// is slightly complicated by adding a newline if a statement appears</span></span>
<span id="cb4-37"><a href="#cb4-37" tabindex="-1"></a><span class="co">// on the line after </span><span class="al">TEST</span><span class="co">(). It is not clear to me what the rules are</span></span>
<span id="cb4-38"><a href="#cb4-38" tabindex="-1"></a><span class="co">// for matching whitespace as it has no semantic meaning, but this</span></span>
<span id="cb4-39"><a href="#cb4-39" tabindex="-1"></a><span class="co">// appears to work.</span></span>
<span id="cb4-40"><a href="#cb4-40" tabindex="-1"></a>@ depends on fix @</span>
<span id="cb4-41"><a href="#cb4-41" tabindex="-1"></a>identifier fn<span class="op">;</span></span>
<span id="cb4-42"><a href="#cb4-42" tabindex="-1"></a>expression tested_expr<span class="op">;</span></span>
<span id="cb4-43"><a href="#cb4-43" tabindex="-1"></a>statement st<span class="op">;</span></span>
<span id="cb4-44"><a href="#cb4-44" tabindex="-1"></a>@@</span>
<span id="cb4-45"><a href="#cb4-45" tabindex="-1"></a></span>
<span id="cb4-46"><a href="#cb4-46" tabindex="-1"></a>  fn <span class="op">(...)</span></span>
<span id="cb4-47"><a href="#cb4-47" tabindex="-1"></a>  <span class="op">{</span></span>
<span id="cb4-48"><a href="#cb4-48" tabindex="-1"></a><span class="op">-</span>   TEST<span class="op">(</span>tested_expr<span class="op">);</span></span>
<span id="cb4-49"><a href="#cb4-49" tabindex="-1"></a><span class="op">+</span>   <span class="dt">const</span> <span class="dt">long</span> ret <span class="op">=</span> tested_expr<span class="op">;</span></span>
<span id="cb4-50"><a href="#cb4-50" tabindex="-1"></a><span class="op">(</span></span>
<span id="cb4-51"><a href="#cb4-51" tabindex="-1"></a><span class="op">+</span></span>
<span id="cb4-52"><a href="#cb4-52" tabindex="-1"></a>    st</span>
<span id="cb4-53"><a href="#cb4-53" tabindex="-1"></a><span class="op">|</span></span>
<span id="cb4-54"><a href="#cb4-54" tabindex="-1"></a></span>
<span id="cb4-55"><a href="#cb4-55" tabindex="-1"></a><span class="op">)</span></span>
<span id="cb4-56"><a href="#cb4-56" tabindex="-1"></a>    <span class="op">...</span> when <span class="op">!=</span> TEST<span class="op">(...)</span></span>
<span id="cb4-57"><a href="#cb4-57" tabindex="-1"></a>  <span class="op">}</span></span>
<span id="cb4-58"><a href="#cb4-58" tabindex="-1"></a></span>
<span id="cb4-59"><a href="#cb4-59" tabindex="-1"></a><span class="co">// Replace </span><span class="al">TEST</span><span class="co"> in all functions where it occurs at the start</span></span>
<span id="cb4-60"><a href="#cb4-60" tabindex="-1"></a><span class="co">// Functions where it *only* occurs at the start were handled above</span></span>
<span id="cb4-61"><a href="#cb4-61" tabindex="-1"></a>@ depends on fix @</span>
<span id="cb4-62"><a href="#cb4-62" tabindex="-1"></a>identifier fn<span class="op">;</span></span>
<span id="cb4-63"><a href="#cb4-63" tabindex="-1"></a>expression tested_expr<span class="op">;</span></span>
<span id="cb4-64"><a href="#cb4-64" tabindex="-1"></a>statement st<span class="op">;</span></span>
<span id="cb4-65"><a href="#cb4-65" tabindex="-1"></a>@@</span>
<span id="cb4-66"><a href="#cb4-66" tabindex="-1"></a></span>
<span id="cb4-67"><a href="#cb4-67" tabindex="-1"></a>  fn <span class="op">(...)</span></span>
<span id="cb4-68"><a href="#cb4-68" tabindex="-1"></a>  <span class="op">{</span></span>
<span id="cb4-69"><a href="#cb4-69" tabindex="-1"></a><span class="op">-</span>   TEST<span class="op">(</span>tested_expr<span class="op">);</span></span>
<span id="cb4-70"><a href="#cb4-70" tabindex="-1"></a><span class="op">+</span>   <span class="dt">long</span> ret <span class="op">=</span> tested_expr<span class="op">;</span></span>
<span id="cb4-71"><a href="#cb4-71" tabindex="-1"></a><span class="op">(</span></span>
<span id="cb4-72"><a href="#cb4-72" tabindex="-1"></a><span class="op">+</span></span>
<span id="cb4-73"><a href="#cb4-73" tabindex="-1"></a>    st</span>
<span id="cb4-74"><a href="#cb4-74" tabindex="-1"></a><span class="op">|</span></span>
<span id="cb4-75"><a href="#cb4-75" tabindex="-1"></a></span>
<span id="cb4-76"><a href="#cb4-76" tabindex="-1"></a><span class="op">)</span></span>
<span id="cb4-77"><a href="#cb4-77" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb4-78"><a href="#cb4-78" tabindex="-1"></a>  <span class="op">}</span></span>
<span id="cb4-79"><a href="#cb4-79" tabindex="-1"></a></span>
<span id="cb4-80"><a href="#cb4-80" tabindex="-1"></a><span class="co">// Add ret var at the start of a function where </span><span class="al">TEST</span><span class="co"> occurs and there</span></span>
<span id="cb4-81"><a href="#cb4-81" tabindex="-1"></a><span class="co">// is not already a ret declaration</span></span>
<span id="cb4-82"><a href="#cb4-82" tabindex="-1"></a>@ depends on fix exists @</span>
<span id="cb4-83"><a href="#cb4-83" tabindex="-1"></a>identifier fn<span class="op">;</span></span>
<span id="cb4-84"><a href="#cb4-84" tabindex="-1"></a>@@</span>
<span id="cb4-85"><a href="#cb4-85" tabindex="-1"></a></span>
<span id="cb4-86"><a href="#cb4-86" tabindex="-1"></a>  fn <span class="op">(...)</span></span>
<span id="cb4-87"><a href="#cb4-87" tabindex="-1"></a>  <span class="op">{</span></span>
<span id="cb4-88"><a href="#cb4-88" tabindex="-1"></a><span class="op">+</span>   <span class="dt">long</span> ret<span class="op">;</span></span>
<span id="cb4-89"><a href="#cb4-89" tabindex="-1"></a>    <span class="op">...</span> when <span class="op">!=</span> <span class="dt">long</span> ret<span class="op">;</span></span>
<span id="cb4-90"><a href="#cb4-90" tabindex="-1"></a></span>
<span id="cb4-91"><a href="#cb4-91" tabindex="-1"></a>    TEST<span class="op">(...)</span></span>
<span id="cb4-92"><a href="#cb4-92" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb4-93"><a href="#cb4-93" tabindex="-1"></a>  <span class="op">}</span></span>
<span id="cb4-94"><a href="#cb4-94" tabindex="-1"></a></span>
<span id="cb4-95"><a href="#cb4-95" tabindex="-1"></a><span class="co">// Replace any remaining occurrences of </span><span class="al">TEST</span></span>
<span id="cb4-96"><a href="#cb4-96" tabindex="-1"></a>@ depends on fix @</span>
<span id="cb4-97"><a href="#cb4-97" tabindex="-1"></a>expression tested_expr<span class="op">;</span></span>
<span id="cb4-98"><a href="#cb4-98" tabindex="-1"></a>@@</span>
<span id="cb4-99"><a href="#cb4-99" tabindex="-1"></a></span>
<span id="cb4-100"><a href="#cb4-100" tabindex="-1"></a><span class="op">-</span>   TEST<span class="op">(</span>tested_expr<span class="op">);</span></span>
<span id="cb4-101"><a href="#cb4-101" tabindex="-1"></a><span class="op">+</span>   ret <span class="op">=</span> tested_expr<span class="op">;</span></span></code></pre></div>
            <p>This semantic patch has been merged into the LTP.
            However, we determined that Coccinelle cannot be forced upon
            LTP contributors. Despite Coccinelle being stable and having
            been around for years, we ran into distribution issues. It
            seems that at least the Gentoo package is lacking a
            maintainer.</p>
            <p>We suspect this has little to do with Coccinelle itself.
            The issue is that it is written in OCaml. Package
            maintainers struggle with OCaml projects. It is easy to see
            why, as our attempts to learn the basics of OCaml were
            fraught with issues.</p>
            <p>For a tool as good as Coccinelle, some of us are willing
            to learn a new language. If it were Haskell, for example,
            we’d not have a problem fixing the <em>occasional</em>
            issue.</p>
            <p>However, everyone’s patience ran out with OCaml. Being
            functional when we are primarily working on C does not help.
            The main issue, though, is that many distributions are not
            maintaining the packages properly. So it is often difficult
            just to get the REPL and compiler running.</p>
            <p>I personally have no opinion on whether it is a good
            language. I didn’t get far enough to decide that. It seems
            to be the case though that people are not interested in it.
            Meanwhile we want to get some static analysis done, not
            revive a struggling language.</p>
            <p>We still merged the Coccinelle scripts into the LTP. They
            provide a useful example of how to automate changes with
            <code>spatch</code>. We haven’t found another option for
            making these kinds of changes. Using Clang Tidy is extremely
            laborious compared to writing a semantic patch.</p>
            <p>Sadly it has to be dismissed as our primary checker due
            to the OCaml ecosystem.</p>
            <h1 id="sparse">Sparse</h1>
            <p>Sparse is a standalone C parser library. It produces an
            AST and a linearised IR consisting of basic blocks. In fact
            it can produce executable x86 code or LLVM IR. So it is
            essentially a compiler.</p>
            <p>Unlike most C compilers however it is very simple. It is
            not designed to produce fast code, nor can it parse
            everything GCC can. It only parses C and is not concerned
            with C++.</p>
            <p>Sparse itself can be compiled relatively quickly and has
            few dependencies. It doesn’t take long to clone it with Git
            either. This meant we were able to vendor it in as a Git
            submodule.</p>
            <p>Some disapprove of vendoring and Git submodules.
            Unfortunately the Sparse package available on most
            distributions is not useful to us. Sparse is linked
            statically and the package only contains an executable for
            use with the Linux kernel. There is no dynamic library. Of
            course someone can change that, but it would take time to
            propagate downstream.</p>
            <p>The IR is relatively easy to traverse and write checks
            against. The documentation is maybe a little sparse. However
            with some knowledge about compilers, it’s not too hard to
            understand the code. It is written in a similar style to the
            kernel and LTP, albeit with some quirks.</p>
            <p>Below is the full checker program sans some
            boilerplate.</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="co">/* The rules for test, library and tool code are different */</span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a><span class="kw">enum</span> ltp_tu_kind <span class="op">{</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a>    LTP_LIB<span class="op">,</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a>    LTP_OTHER<span class="op">,</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a><span class="dt">static</span> <span class="kw">enum</span> ltp_tu_kind tu_kind <span class="op">=</span> LTP_OTHER<span class="op">;</span></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a></span>
<span id="cb5-9"><a href="#cb5-9" tabindex="-1"></a><span class="co">/* Check for LTP-002</span></span>
<span id="cb5-10"><a href="#cb5-10" tabindex="-1"></a><span class="co"> *</span></span>
<span id="cb5-11"><a href="#cb5-11" tabindex="-1"></a><span class="co"> * Inspects the destination symbol of each store instruction. If it is</span></span>
<span id="cb5-12"><a href="#cb5-12" tabindex="-1"></a><span class="co"> * TST_RET or TST_ERR then emit a warning.</span></span>
<span id="cb5-13"><a href="#cb5-13" tabindex="-1"></a><span class="co"> */</span></span>
<span id="cb5-14"><a href="#cb5-14" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> check_lib_sets_TEST_vars<span class="op">(</span><span class="dt">const</span> <span class="kw">struct</span> instruction <span class="op">*</span>insn<span class="op">)</span></span>
<span id="cb5-15"><a href="#cb5-15" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-16"><a href="#cb5-16" tabindex="-1"></a>    <span class="dt">static</span> <span class="kw">struct</span> ident <span class="op">*</span>TST_RES_id<span class="op">,</span> <span class="op">*</span>TST_ERR_id<span class="op">;</span></span>
<span id="cb5-17"><a href="#cb5-17" tabindex="-1"></a></span>
<span id="cb5-18"><a href="#cb5-18" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>TST_RES_id<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-19"><a href="#cb5-19" tabindex="-1"></a>        TST_RES_id <span class="op">=</span> built_in_ident<span class="op">(</span><span class="st">&quot;TST_RET&quot;</span><span class="op">);</span></span>
<span id="cb5-20"><a href="#cb5-20" tabindex="-1"></a>        TST_ERR_id <span class="op">=</span> built_in_ident<span class="op">(</span><span class="st">&quot;TST_ERR&quot;</span><span class="op">);</span></span>
<span id="cb5-21"><a href="#cb5-21" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb5-22"><a href="#cb5-22" tabindex="-1"></a></span>
<span id="cb5-23"><a href="#cb5-23" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>insn<span class="op">-&gt;</span>opcode <span class="op">!=</span> OP_STORE<span class="op">)</span></span>
<span id="cb5-24"><a href="#cb5-24" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb5-25"><a href="#cb5-25" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>insn<span class="op">-&gt;</span>src<span class="op">-&gt;</span>ident <span class="op">!=</span> TST_RES_id <span class="op">&amp;&amp;</span></span>
<span id="cb5-26"><a href="#cb5-26" tabindex="-1"></a>        insn<span class="op">-&gt;</span>src<span class="op">-&gt;</span>ident <span class="op">!=</span> TST_ERR_id<span class="op">)</span></span>
<span id="cb5-27"><a href="#cb5-27" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb5-28"><a href="#cb5-28" tabindex="-1"></a></span>
<span id="cb5-29"><a href="#cb5-29" tabindex="-1"></a>    warning<span class="op">(</span>insn<span class="op">-&gt;</span>pos<span class="op">,</span></span>
<span id="cb5-30"><a href="#cb5-30" tabindex="-1"></a>        <span class="st">&quot;LTP-002: Library should not write to TST_RET or TST_ERR&quot;</span><span class="op">);</span></span>
<span id="cb5-31"><a href="#cb5-31" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-32"><a href="#cb5-32" tabindex="-1"></a></span>
<span id="cb5-33"><a href="#cb5-33" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> do_basicblock_checks<span class="op">(</span><span class="kw">struct</span> basic_block <span class="op">*</span>bb<span class="op">)</span></span>
<span id="cb5-34"><a href="#cb5-34" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-35"><a href="#cb5-35" tabindex="-1"></a>    <span class="kw">struct</span> instruction <span class="op">*</span>insn<span class="op">;</span></span>
<span id="cb5-36"><a href="#cb5-36" tabindex="-1"></a></span>
<span id="cb5-37"><a href="#cb5-37" tabindex="-1"></a>    FOR_EACH_PTR<span class="op">(</span>bb<span class="op">-&gt;</span>insns<span class="op">,</span> insn<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-38"><a href="#cb5-38" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>bb_reachable<span class="op">(</span>insn<span class="op">-&gt;</span>bb<span class="op">))</span></span>
<span id="cb5-39"><a href="#cb5-39" tabindex="-1"></a>            <span class="cf">continue</span><span class="op">;</span></span>
<span id="cb5-40"><a href="#cb5-40" tabindex="-1"></a></span>
<span id="cb5-41"><a href="#cb5-41" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>tu_kind <span class="op">==</span> LTP_LIB<span class="op">)</span></span>
<span id="cb5-42"><a href="#cb5-42" tabindex="-1"></a>            check_lib_sets_TEST_vars<span class="op">(</span>insn<span class="op">);</span></span>
<span id="cb5-43"><a href="#cb5-43" tabindex="-1"></a>    <span class="op">}</span> END_FOR_EACH_PTR<span class="op">(</span>insn<span class="op">);</span></span>
<span id="cb5-44"><a href="#cb5-44" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-45"><a href="#cb5-45" tabindex="-1"></a></span>
<span id="cb5-46"><a href="#cb5-46" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> do_entrypoint_checks<span class="op">(</span><span class="kw">struct</span> entrypoint <span class="op">*</span>ep<span class="op">)</span></span>
<span id="cb5-47"><a href="#cb5-47" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-48"><a href="#cb5-48" tabindex="-1"></a>    <span class="kw">struct</span> basic_block <span class="op">*</span>bb<span class="op">;</span></span>
<span id="cb5-49"><a href="#cb5-49" tabindex="-1"></a></span>
<span id="cb5-50"><a href="#cb5-50" tabindex="-1"></a>    FOR_EACH_PTR<span class="op">(</span>ep<span class="op">-&gt;</span>bbs<span class="op">,</span> bb<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-51"><a href="#cb5-51" tabindex="-1"></a>        do_basicblock_checks<span class="op">(</span>bb<span class="op">);</span></span>
<span id="cb5-52"><a href="#cb5-52" tabindex="-1"></a>    <span class="op">}</span> END_FOR_EACH_PTR<span class="op">(</span>bb<span class="op">);</span></span>
<span id="cb5-53"><a href="#cb5-53" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-54"><a href="#cb5-54" tabindex="-1"></a></span>
<span id="cb5-55"><a href="#cb5-55" tabindex="-1"></a><span class="co">/* Compile the AST into a graph of basicblocks */</span></span>
<span id="cb5-56"><a href="#cb5-56" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> process_symbols<span class="op">(</span><span class="kw">struct</span> symbol_list <span class="op">*</span>list<span class="op">)</span></span>
<span id="cb5-57"><a href="#cb5-57" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-58"><a href="#cb5-58" tabindex="-1"></a>    <span class="kw">struct</span> symbol <span class="op">*</span>sym<span class="op">;</span></span>
<span id="cb5-59"><a href="#cb5-59" tabindex="-1"></a></span>
<span id="cb5-60"><a href="#cb5-60" tabindex="-1"></a>    FOR_EACH_PTR<span class="op">(</span>list<span class="op">,</span> sym<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-61"><a href="#cb5-61" tabindex="-1"></a>        <span class="kw">struct</span> entrypoint <span class="op">*</span>ep<span class="op">;</span></span>
<span id="cb5-62"><a href="#cb5-62" tabindex="-1"></a></span>
<span id="cb5-63"><a href="#cb5-63" tabindex="-1"></a>        expand_symbol<span class="op">(</span>sym<span class="op">);</span></span>
<span id="cb5-64"><a href="#cb5-64" tabindex="-1"></a>        ep <span class="op">=</span> linearize_symbol<span class="op">(</span>sym<span class="op">);</span></span>
<span id="cb5-65"><a href="#cb5-65" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>ep <span class="op">||</span> <span class="op">!</span>ep<span class="op">-&gt;</span>entry<span class="op">)</span></span>
<span id="cb5-66"><a href="#cb5-66" tabindex="-1"></a>            <span class="cf">continue</span><span class="op">;</span></span>
<span id="cb5-67"><a href="#cb5-67" tabindex="-1"></a></span>
<span id="cb5-68"><a href="#cb5-68" tabindex="-1"></a>        do_entrypoint_checks<span class="op">(</span>ep<span class="op">);</span></span>
<span id="cb5-69"><a href="#cb5-69" tabindex="-1"></a></span>
<span id="cb5-70"><a href="#cb5-70" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>dbg_entry<span class="op">)</span></span>
<span id="cb5-71"><a href="#cb5-71" tabindex="-1"></a>            show_entry<span class="op">(</span>ep<span class="op">);</span></span>
<span id="cb5-72"><a href="#cb5-72" tabindex="-1"></a>    <span class="op">}</span> END_FOR_EACH_PTR<span class="op">(</span>sym<span class="op">);</span></span>
<span id="cb5-73"><a href="#cb5-73" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-74"><a href="#cb5-74" tabindex="-1"></a></span>
<span id="cb5-75"><a href="#cb5-75" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> collect_info_from_args<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> argc<span class="op">,</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> <span class="op">*</span><span class="dt">const</span> argv<span class="op">)</span></span>
<span id="cb5-76"><a href="#cb5-76" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-77"><a href="#cb5-77" tabindex="-1"></a>    <span class="dt">int</span> i<span class="op">;</span></span>
<span id="cb5-78"><a href="#cb5-78" tabindex="-1"></a></span>
<span id="cb5-79"><a href="#cb5-79" tabindex="-1"></a>    <span class="cf">for</span> <span class="op">(</span>i <span class="op">=</span> <span class="dv">0</span><span class="op">;</span> i <span class="op">&lt;</span> argc<span class="op">;</span> i<span class="op">++)</span> <span class="op">{</span></span>
<span id="cb5-80"><a href="#cb5-80" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>strcmp<span class="op">(</span><span class="st">&quot;-DLTPLIB&quot;</span><span class="op">,</span> argv<span class="op">[</span>i<span class="op">]))</span></span>
<span id="cb5-81"><a href="#cb5-81" tabindex="-1"></a>            tu_kind <span class="op">=</span> LTP_LIB<span class="op">;</span></span>
<span id="cb5-82"><a href="#cb5-82" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb5-83"><a href="#cb5-83" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-84"><a href="#cb5-84" tabindex="-1"></a></span>
<span id="cb5-85"><a href="#cb5-85" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">int</span> argc<span class="op">,</span> <span class="dt">char</span> <span class="op">**</span>argv<span class="op">)</span></span>
<span id="cb5-86"><a href="#cb5-86" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-87"><a href="#cb5-87" tabindex="-1"></a>    <span class="kw">struct</span> string_list <span class="op">*</span>filelist <span class="op">=</span> NULL<span class="op">;</span></span>
<span id="cb5-88"><a href="#cb5-88" tabindex="-1"></a>    <span class="dt">char</span> <span class="op">*</span>file<span class="op">;</span></span>
<span id="cb5-89"><a href="#cb5-89" tabindex="-1"></a></span>
<span id="cb5-90"><a href="#cb5-90" tabindex="-1"></a>    <span class="co">/* ... Disable a bunch of inbuilt checks ... */</span></span>
<span id="cb5-91"><a href="#cb5-91" tabindex="-1"></a></span>
<span id="cb5-92"><a href="#cb5-92" tabindex="-1"></a>    do_output <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb5-93"><a href="#cb5-93" tabindex="-1"></a></span>
<span id="cb5-94"><a href="#cb5-94" tabindex="-1"></a>    collect_info_from_args<span class="op">(</span>argc<span class="op">,</span> argv<span class="op">);</span></span>
<span id="cb5-95"><a href="#cb5-95" tabindex="-1"></a></span>
<span id="cb5-96"><a href="#cb5-96" tabindex="-1"></a>    process_symbols<span class="op">(</span>sparse_initialize<span class="op">(</span>argc<span class="op">,</span> argv<span class="op">,</span> <span class="op">&amp;</span>filelist<span class="op">));</span></span>
<span id="cb5-97"><a href="#cb5-97" tabindex="-1"></a>    FOR_EACH_PTR<span class="op">(</span>filelist<span class="op">,</span> file<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-98"><a href="#cb5-98" tabindex="-1"></a>        process_symbols<span class="op">(</span>sparse<span class="op">(</span>file<span class="op">));</span></span>
<span id="cb5-99"><a href="#cb5-99" tabindex="-1"></a>    <span class="op">}</span> END_FOR_EACH_PTR<span class="op">(</span>file<span class="op">);</span></span>
<span id="cb5-100"><a href="#cb5-100" tabindex="-1"></a></span>
<span id="cb5-101"><a href="#cb5-101" tabindex="-1"></a>    report_stats<span class="op">();</span></span>
<span id="cb5-102"><a href="#cb5-102" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb5-103"><a href="#cb5-103" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>Unlike the Clang and Coccinelle checks, this actually
            checks the variables themselves. We traverse the IR and look
            for writes to them. This will catch some additional cases
            where we write to the variables without using the
            macros.</p>
            <p>It may be possible to fool it somehow. It does have the
            issue that library header files are considered part of the
            test code. We have only just begun to use Sparse so this
            will likely be modified over time.</p>
            <p>For now we don’t try to do anything with the AST, we just
            look at the IR. Unlike Clang, Sparse does not save
            information about macro expansions. They do not show up as
            nodes in the AST. It appears that preprocessing is performed
            without saving any details. We may need to change this.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Note that since writing this article I have added more
            checks, including some which operate on the “AST”. You can
            see more <a
            href="https://github.com/linux-test-project/ltp/blob/master/tools/sparse/sparse-ltp.c">here</a>.
            Frankly I find the AST in Sparse horribly confusing.</p>
            </div>
            </div>
            <p>Sparse has many built-in checks and warnings. We have
            disabled most of them for now. In some cases they are kernel
            specific. In other cases they have been adopted by GCC and
            Clang which produce prettier warnings. Mostly though we just
            need to clean up the LTP, then we can enable them.</p>
            <p>Sparse also introduces some attributes
            (e.g. <code>__attribute__((address_space(name)))</code>) which
            may be useful or not. Attributes are a way of extending C
            which does not interfere with compilers that do not support
            them. The kernel uses them to prevent functions and
            variables being used in certain ways.</p>
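            <p>As a rough sketch of how this tends to be done, the
            attribute is wrapped in a macro which expands to nothing
            unless the code is being checked. Sparse defines
            <code>__CHECKER__</code> when it runs. The
            <code>attr_userptr</code> macro, the address space number
            and the function below are made up for illustration, they
            are not taken from the LTP or the kernel.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#ifdef __CHECKER__
/* Only Sparse sees the attributes */
# define attr_userptr __attribute__((noderef, address_space(1)))
#else
/* Ordinary compilers see a plain pointer */
# define attr_userptr
#endif

/*
 * Hypothetical helper. Comparing the pointer is fine, but Sparse
 * should complain if the function body dereferenced `ptr`.
 */
int is_null_userptr(const char attr_userptr *ptr)
{
    return !ptr;
}</code></pre></div>
            <p>Compilers which know nothing about Sparse build this as
            if the annotation was never there.</p>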
            <h1 id="tree-sitter">Tree-sitter</h1>
            <p>Since writing this article and adopting Sparse I
            discovered <a
            href="https://tree-sitter.github.io/tree-sitter/">Tree-sitter</a>
            thanks to <a
            href="https://github.com/googleprojectzero/weggli">Weggli</a>.
            Weggli is a very fast “semantic search tool” inspired by
            Coccinelle amongst others. I’d say it’s more of an AST
            matcher. As far as I know it doesn’t have the control flow
            analysis features of Coccinelle. On the plus side I can see
            myself contributing to it as it is written in Rust. It’s
            also much faster than Coccinelle and easy to install.</p>
            <p>Just go and try it, it should only take 5 minutes if you
            are willing to install it with Rust’s <code>cargo</code>
            command. I often use it now for searching the Linux tree as
            it tends to find things that <code>clangd</code> doesn’t
            because <code>compile_commands.json</code> doesn’t have some
            files in it due to the build configuration.</p>
            <p>However for the purposes of the LTP checker, it’s
            Tree-sitter that is really interesting. Tree-sitter ticks a
            lot of the boxes in our requirements: it generates zero
            dependency parsers written in C. These can easily be
            vendored in.</p>
            <p>It supports many languages including C and Bash. We also
            have many tests written in Shell and a Shell test API. So
            there is also the possibility of producing LTP specific
            checks for Shell as well.</p>
            <p>It only operates at the AST level, but is vastly easier
            to understand than Sparse’s AST. For one thing it has a nice
            CLI for interactively inspecting ASTs and even a <a
            href="https://tree-sitter.github.io/tree-sitter/playground">web
            based playground</a>. The C API appears to have some proper
            documentation and looks straightforward compared to
            Sparse.</p>
            <p>The problem is the lack of linearization. Some things are
            just much easier with some IR, that’s why it exists. There
            is also the fact we already have something working in
            Sparse. Still I would not rule out us using Tree-sitter.</p>
            <h1 id="conclusion">Conclusion</h1>
            <p>Going forwards we will continue to develop Sparse as our
            main tool. We may still need to abandon it. Perhaps the
            checks we really want will be too difficult. Personally, I
            will also continue to use Coccinelle, especially for
            “evolutionary development”.</p>
            <p>There is a huge amount of great software here, which took
            a lot of hard work by smart people. As usual with open
            source it is rough around the edges. In the end we chose the
            solution which we are most likely to be able to fix
            ourselves. It is also the solution least likely to need
            fixing once implemented.</p>
            <p>Depending on how things progress, I will be back to write
            about using Sparse and Coccinelle. Please send any
            suggestions, praise or insults via the contact details
            below.</p>
    </div>
  </content>
</entry>
<entry>
  <title>A review of tools for rolling your own C static
analysis</title>
  <id>https://richiejp.com/custom-c-static-analysis-tools</id>
  <published>2021-11-24T08:50:57Z</published>
  <updated>2021-11-24T08:50:57Z</updated>
  <link rel="alternate" href="https://richiejp.com/custom-c-static-analysis-tools" />
  <summary>Why we chose Sparse to develop the Linux Test Project
“compile time” checks. Our experience with a few Open Source tools.
Including sample code for Sparse, Coccinelle and libclang.</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>So you have a large C code base. It has its own
            libraries, rules and context. You see the same mistakes
            being repeated. Not general C coding mistakes, but errors
            unique to your project.</p>
            <p>They look like the kind of mistakes which could be
            detected at “compile time”. In fact, they may only be
            detectable at compile time. Put another way, we think we can
            find bugs in the source code without running it. This is
            commonly referred to as “static analysis”.</p>
            <p><a href="https://github.com/linux-test-project">The Linux
            Test Project</a> is a large and eccentric C code base. It
            has its own library and rules. We most definitely see the
            same mistakes, again and again. Despite having spent years
            developing the LTP, the present author still makes those
            mistakes.</p>
            <p>Beyond simple regular expressions and syntactic checks,
            writing static analysis tools is hard. Some languages come
            equipped with self analysis or reflection. Needless to say,
            this is not the case for C.</p>
            <p>While we can justify a significant investment in
            developing checks, it can not be on the order of creating a
            new compiler. Much code review, feedback delays and bugs can
            be eliminated with checks. However the effort saved by
            static analysis should not all be spent on developing
            static analysis.</p>
            <p>Luckily C does have tooling available to develop semantic
            checks and perform various types of analysis. For the LTP we
            investigated a range of tools. This article will review
            those tools and explain our choices.</p>
            <p>For the time being we have chosen <a
            href="https://sparse.docs.kernel.org/en/latest/">Sparse</a>
            to power our primary tool. This is not the most powerful,
            nor arguably the most user friendly. It is the most
            self-contained and small enough to vendor in.</p>
            <h1 id="background">Background</h1>
            <h2 id="program-representation">Program representation</h2>
            <p>There are many ways to represent a computer program, with
            different abstractions and structures. One may also
            partially evaluate a program with given assumptions.
            Theoretically there is no difference between transforming a
            program and running it. They are both computations.</p>
            <p>For the sake of this review and the simple minds of
            quality assurance engineers, we just need to roughly
            identify what level each tool works on.</p>
            <ol style="list-style-type: decimal">
            <li>Text</li>
            <li>Abstract Syntax Tree (AST)</li>
            <li>Linearised Intermediate Representation (IR)</li>
            <li>Exploded Graph or IR with state tracking</li>
            </ol>
            <p>The first level is the raw source code. Secondly we have
            the AST which is fairly well known. Then we have IR and IR
            with state. IR is a generic term which we shall take some
            liberties with.</p>
            <p>C compilers usually convert the AST into a collection of
            linear instruction blocks (basic blocks). These blocks are
            linked together into a graph or network (Control Flow
            Graph). The links (graph edges) represent function calls,
            jumps or branches.</p>
            <p>The instructions within a block are sequential. Meanwhile
            one may go between blocks in whatever order the logic
            allows. Usually compilers also convert the IR into Static
            Single Assignment (SSA) form, meaning each variable is only
            assigned to once. New variable names are generated for each
            additional assignment.</p>
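            <p>As a rough illustration, here is a small function
            annotated with the basic blocks it might be split into. It
            is followed by a hand-written version where every name is
            assigned once, which approximates SSA form. This is my own
            sketch with made-up function names, not the output of any
            particular compiler.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">/* Original: `r` is assigned in three places */
int clamp_add(int a, int b, int max)
{
    int r = a + b;      /* block 0: entry, ends with a branch */

    if (r &gt; max)
        r = max;        /* block 1: only reached when r &gt; max */

    r = r * 2;          /* block 2: both paths join here */
    return r;
}

/* Hand-written approximation of SSA: every name is assigned once */
int clamp_add_ssa(int a, int b, int max)
{
    const int r0 = a + b;
    const int r1 = max;
    /* Stands in for phi(r0, r1): take the value from the path we came from */
    const int r2 = (r0 &gt; max) ? r1 : r0;
    const int r3 = r2 * 2;

    return r3;
}</code></pre></div>
            <p>Both functions compute the same result. The second just
            makes it explicit which assignment each use refers to.</p>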
            <p>In this article we use IR to mean approximately the
            above. If you look in the compiler of a functional language
            you may find something utterly different. Also most
            compilers tend to use a different IR for assembly
            generation. We are not concerned with that.</p>
            <p>IR with state is where we are given a range of possible
            values for each variable. This means we may also be given
            multiple versions of the code for each branch which sets a
            variable differently, resulting in an “exploded”
            CFG.</p>
            <p>If that makes no sense to you, then imagine being able to
            see multiple parallel universes. With the caveat that each
            universe could be further split into more exact
            possibilities.</p>
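            <p>A small sketch may help. The comments below were written
            by hand to show the sort of per-path state an analyser
            might track. They are not the output of a real tool and the
            function is invented for the example.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">static const int table[4] = { 10, 20, 30, 40 };

/* Hypothetical function used only to illustrate state tracking */
int pick(int x)
{
    int i;

    if (x &lt; 0)
        i = 0;          /* path A: x is negative, i == 0 */
    else
        i = 3;          /* path B: x is zero or more, i == 3 */

    /*
     * A plain CFG has a single node here. An exploded graph keeps
     * one node per state, so each path can be judged separately:
     * i is 0 on path A and 3 on path B, both inside the table.
     */
    return table[i];
}</code></pre></div>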
            <p>For many checks, the AST is not an ideal level of
            abstraction. Even just figuring out if a memory location is
            modified from the AST can be difficult. There are lots of
            syntactic constructs which will cause a store instruction to
            be issued.</p>
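            <p>To make this concrete, the sketch below modifies the same
            global in several syntactically different ways. The names
            are made up for this example. An AST check which only looks
            for a plain assignment to <code>counter</code> would miss
            most of them, and the last one hides the write inside a
            called function.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;string.h&gt;

int counter;

void touch_counter(void)
{
    int *alias = &amp;counter;
    const int six = 6;

    counter = 1;                                /* plain assignment */
    counter += 1;                               /* compound assignment */
    counter++;                                  /* increment */
    *alias = counter + 1;                       /* store through a pointer */
    alias[0] += 1;                              /* array syntax on the pointer */
    memcpy(&amp;counter, &amp;six, sizeof(six));        /* write hidden inside a call */
}</code></pre></div>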
            <p>The reason compilers have “linearized” IR is to cut out
            a bunch of syntactic details, making it easier to perform
            optimisations and machine code generation. Sometimes the AST
            is the best place to do certain types of analysis, but often
            not.</p>
            <h2 id="ltp-requirements">LTP Requirements</h2>
            <p>The review of each tool is heavily influenced by the
            LTP’s requirements. With a different project in mind, one
            may come to very different conclusions.</p>
            <p>The LTP project can not tolerate more barriers to
            development. We have contributors from many different
            organisations. All using different Linux distros. There are
            even some downstream forks of LTP on other operating
            systems.</p>
            <p>The test API is very large, there is a lot to learn for a
            new contributor. Not to mention that the thing we are
            testing is exceedingly complicated.</p>
            <p>There are many things we want to create checks for.
            However we have begun by trying to enforce <a
            href="https://github.com/linux-test-project/ltp/wiki/LTP-Library-API-Writing-Guidelines#22-ltp-002-tst_ret-and-tst_err-are-not-modified">the
            following rule</a>:</p>
            <pre><code>The test author is guaranteed that the test API will not modify the
TST_ERR and TST_RET. This prevents silent errors where the return
value and errno are overwritten before the test has chance to
check them.</code></pre>
            <p>These global variables are used by the <a
            href="https://github.com/linux-test-project/ltp/wiki/C-Test-API#12-basic-test-interface"><code>TEST()</code>
            macro</a> and similar. These are intended for exclusive use
            by the test author. However they were being written to by
            library code.</p>
            <p>Possibly there is some way to define these variables as
            <code>const</code> in library code, but not in test code. In
            general though we want an extensible way of performing
            checks. This seemed as good a place as any to start.</p>
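            <p>To illustrate the kind of bug the rule is meant to
            prevent, here is a self-contained sketch. The
            <code>TEST()</code> macro and the two globals are simplified
            stand-ins for the LTP’s real definitions. The functions
            <code>fake_syscall()</code>, <code>fake_ok()</code> and
            <code>lib_helper()</code> are hypothetical and exist only
            for this example.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;errno.h&gt;

long TST_RET;
int TST_ERR;

/* Simplified stand-in for the LTP macro */
#define TEST(SCALL) do { \
    errno = 0; \
    TST_RET = (SCALL); \
    TST_ERR = errno; \
} while (0)

/* Hypothetical system call wrapper which always fails */
static long fake_syscall(void)
{
    errno = EINVAL;
    return -1;
}

/* Hypothetical system call wrapper which always succeeds */
static long fake_ok(void)
{
    return 0;
}

/* Hypothetical library function which breaks the rule by using TEST() */
void lib_helper(void)
{
    TEST(fake_ok());
}

/* Test code: returns 1 if the failure recorded by TEST() is still visible */
int failure_still_visible(int call_helper)
{
    TEST(fake_syscall());

    if (call_helper)
        lib_helper();   /* silently overwrites TST_RET and TST_ERR */

    return TST_RET == -1 &amp;&amp; TST_ERR == EINVAL;
}</code></pre></div>
            <p>When the helper is called between <code>TEST()</code> and
            the check, the test sees a success which never
            happened.</p>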
            <p>The goal is to enforce these checks and for contributors
            to run them locally. Both to save reviewer time and to save
            contributor time. It is much less expensive to correct
            mistakes before sending a patch.</p>
            <p>If checks are not mandatory, then they tend to be
            forgotten about. Or it becomes one person’s job to run and
            maintain them. This then means the checks have to be robust.
            We need to avoid false positives and allow checks to be
            disabled sometimes.</p>
            <h1 id="tool-overview">Tool overview</h1>
            <p>We took a look at the following tools.</p>
            <ul>
            <li>GCC
            <ul>
            <li><a href="#GCC-Analyzer">Analyzer</a></li>
            <li><a href="#GCC-Plugins">Plugins</a></li>
            </ul></li>
            <li><a href="#smatch">Smatch</a></li>
            <li>Clang
            <ul>
            <li><a href="#Clang-Plugins">Clang Plugins</a></li>
            <li><a href="#Clang-Analyzer">Clang analyzer</a></li>
            <li><a href="#Clang-Tidy">Clang Tidy</a></li>
            <li><a href="#Clang-LibTooling">Clang LibTooling</a></li>
            <li><a href="#libclang">libclang</a></li>
            </ul></li>
            <li><a href="#Coccinelle">Coccinelle</a></li>
            <li><a href="#Sparse">Sparse</a></li>
            <li><a href="#Tree-sitter">Tree-sitter</a></li>
            </ul>
            <p>There are more out there. These are not even the only
            ones we found. We just have limited time and resources.</p>
            <p>We did not investigate any proprietary tools. It is
            expected that LTP developers can freely download, modify and
            run any mandatory development tools.</p>
            <p>The amount of time and effort assigned to each tool was
            not equal. They have been listed roughly in order of the
            amount of progress made before giving up. Some tools, such
            as GCC, I abandoned quickly. This is as much due to the
            nature of the LTP as the tool in question.</p>
            <h1 id="gcc">GCC</h1>
            <p>The GCC compiler is available on practically every Linux
            distribution and desktop OS. It is the main compiler used to
            build LTP. We don’t need to worry about parsing problems and
            non-standard C when using GCC.</p>
            <p>Some of the older parts of the LTP are quite disgusting.
            They are not compliant with any C standard, but GCC has
            accepted them. We of course want to remove this code, but in
            the meantime we have to deal with it.</p>
            <h2 id="gcc-analyzer">GCC Analyzer</h2>
            <p>The <a
            href="https://gcc.gnu.org/onlinedocs/gcc/Static-Analyzer-Options.html">GCC
            Analyzer</a> appears to be powerful if inaccurate. It tracks
            program control and data flow, including between procedures,
            which is often absent in analysers of this type. This means
            you can see an approximation of what values a variable may
            take at any given point in the program.</p>
            <p>At the time we investigated it, there did not seem to be
            any way to extend it. So we couldn’t use it to develop LTP
            specific checks without forking GCC.</p>
            <p>It did find some general errors. Such as null pointer
            dereferences. It also found some false positives and missed
            other errors.</p>
            <h2 id="gcc-plugins">GCC Plugins</h2>
            <p><a
            href="http://gcc.gnu.org/onlinedocs/gccint/Plugins.html">GCC
            has plugins</a> which allow one to interfere with various
            compilation passes. They provide access to more than one
            type of intermediate representation used by GCC.</p>
            <p>The access is both read and write. So we could also
            create our own instrumentation, that is, insert runtime
            checks.</p>
            <p>Primarily there are two reasons for discarding this
            option. Firstly GCC’s code base is nearly opaque. Secondly
            plugins appear to be version dependent. Possibly GCC’s
            internal representation does not change much. However even
            small changes would create issues when the resident compiler
            “expert” is not around.</p>
            <p>This means we would have a high up-front cost and ongoing
            maintenance. This is a shame because we can always rely on
            LTP developers to have GCC.</p>
            <h1 id="smatch">Smatch</h1>
            <p>The <a href="https://repo.or.cz/w/smatch.git">Smatch
            analyser</a> is in the same league as the GCC Analyzer and
            Clang Analyzer. It can do inter-procedural control flow and
            state tracking. At least if you can figure out how to
            operate it.</p>
            <p>To get the full might of Smatch one needs to generate a
            database. This required quite some time and fiddling for the
            LTP. It was never quite clear if this was fully working. On
            the other hand, it found some general bugs without any false
            positives.</p>
            <p>To extend Smatch we could either fork it or submit LTP
            specific tests upstream. It is clear how a new check is
            added to Smatch. It is less obvious how to construct the
            check logic.</p>
            <p>Smatch now uses Sparse to parse the C AST. Otherwise it
            seems to be its own beast. Possibly the only analyser of
            this type which can find bugs in the Linux kernel without
            producing huge amounts of noise.</p>
            <p>We discarded it for now because of the high level of
            friction. In the future we may be able to swap our checks from
            Sparse to Smatch.</p>
            <h1 id="clang">Clang</h1>
            <p>The <a href="https://llvm.org/">LLVM</a> C frontend.
            Clang is essentially part of LLVM. All of the tools below
            are in the LLVM mono repository. It appears that they all
            get wrapped up into LLVM releases.</p>
            <p>While LLVM and Clang are supported by every major Linux
            distribution, it is often a much older version on stable
            releases. We also found that compiling against LLVM on
            multiple distributions is inconvenient.</p>
            <p>LLVM comes with the <code>llvm-config</code> utility to
            help figure out what compiler flags are needed and such.
            This itself comes in multiple versions on some
            distributions. There isn’t necessarily a default version
            either.</p>
            <p>Unsurprisingly building LLVM and Clang from source is
            quite time consuming. So we can not sidestep distribution
            package issues by vendoring it in to the LTP.</p>
            <p>Clang can output LLVM IR. We could read this and perform
            checks on it. We did not see an easy way to do this. So it
            was not properly investigated.</p>
            <h2 id="clang-plugins">Clang Plugins</h2>
            <p>Like GCC, Clang has plugins. These appear to be based on
            the same interface(s) as Clang Tidy and LibTooling described
            below.</p>
            <h2 id="clang-analyzer">Clang Analyzer</h2>
            <p>The <a href="https://clang-analyzer.llvm.org/">Clang
            Analyzer</a> is another powerful analyser capable of
            tracking state. It is comparable to GCC’s analyser described
            earlier. Less so to Smatch which is far more self-contained.
            Out of the analysers, Clang appears to produce the most
            false positives.</p>
            <p>Unlike GCC it has an <a
            href="https://clang-analyzer.llvm.org/checker_dev_manual.html">extension
            mechanism</a>. It’s not clear how well supported or popular
            this is. It appears that analyser extensions do not get
            inter-procedural state information.</p>
            <p>It was dismissed primarily for the same reasons as
            LibTooling and Clang Tidy. Although it provides an exploded
            graph instead of AST matching. The analyser appears to be
            accessible through Clang Tidy and LibTooling.</p>
            <p>This is a very attractive option for those already
            invested in the LLVM ecosystem.</p>
            <h2 id="clang-tidy">Clang Tidy</h2>
            <p>Checks developed with the <code>clang-tidy</code> command
            need to be added to the LLVM mono repository. Other project
            specific checks have been added to upstream. So perhaps LTP
            specific checks would also be accepted.</p>
            <p>The issue for us is the time between a check being
            accepted into LLVM upstream and the check being available to
            all LTP contributors. Considering the frequency of LLVM
            releases and stable distribution releases, it could be years
            before we can demand test developers run the checker.</p>
            <p>Demanding our contributors download and compile LLVM is
            not reasonable. So we can dismiss the Clang Tidy
            approach.</p>
            <h2 id="clang-libtooling">Clang LibTooling</h2>
            <p>Clang has an unstable C++ interface and a stable C
            interface. LibTooling represents the C++ interface.</p>
            <p>As the C++ interface is not stable, any checks written
            with it will need to be adapted for each LLVM release.
            Although Clang and LLVM are much less opaque than GCC, we
            still can’t afford that kind of maintenance.</p>
            <p>The LTP is also written in C not C++. This is only a
            minor point, but it does save some effort to use C
            throughout.</p>
            <h2 id="libclang">libclang</h2>
            <p>This is the stable C interface. It is a wrapper for the
            C++ interface. The main advantage is that functions are only
            added, not changed or removed.</p>
            <p>It appears that the interface’s primary clients are text
            editors. Specifically to allow features like auto
            completion. The primary header is even called
            <code>Index.h</code>.</p>
            <p>It does provide some access to the AST. This is done
            through a relatively simple and well documented API.
            Combined with its stability promises, we decided this was
            enough to <a
            href="https://patchwork.ozlabs.org/project/ltp/list/?series=&amp;submitter=73518&amp;state=*&amp;q=libclang">give
            it a serious try</a>.</p>
            <p>Below is the code which performs the check in version
            three of the patch series. Code for printing errors and such
            has been removed.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;clang-c/Index.h&gt;</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="co">/* The rules for test, library and tool code are different */</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="kw">enum</span> ltp_tu_kind <span class="op">{</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a>    LTP_LIB<span class="op">,</span></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a>    LTP_OTHER<span class="op">,</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a><span class="co">/* Holds information about the TU which we gathered on the first pass */</span></span>
<span id="cb2-10"><a href="#cb2-10" tabindex="-1"></a><span class="dt">static</span> <span class="kw">struct</span> <span class="op">{</span></span>
<span id="cb2-11"><a href="#cb2-11" tabindex="-1"></a>    <span class="kw">enum</span> ltp_tu_kind tu_kind<span class="op">;</span></span>
<span id="cb2-12"><a href="#cb2-12" tabindex="-1"></a><span class="op">}</span> tu_info<span class="op">;</span></span>
<span id="cb2-13"><a href="#cb2-13" tabindex="-1"></a></span>
<span id="cb2-14"><a href="#cb2-14" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> cursor_cmp_spelling<span class="op">(</span><span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> spelling<span class="op">,</span> CXCursor cursor<span class="op">)</span></span>
<span id="cb2-15"><a href="#cb2-15" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-16"><a href="#cb2-16" tabindex="-1"></a>    CXString cursor_spelling <span class="op">=</span> clang_getCursorSpelling<span class="op">(</span>cursor<span class="op">);</span></span>
<span id="cb2-17"><a href="#cb2-17" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> ret <span class="op">=</span> strcmp<span class="op">(</span>spelling<span class="op">,</span> clang_getCString<span class="op">(</span>cursor_spelling<span class="op">));</span></span>
<span id="cb2-18"><a href="#cb2-18" tabindex="-1"></a></span>
<span id="cb2-19"><a href="#cb2-19" tabindex="-1"></a>    clang_disposeString<span class="op">(</span>cursor_spelling<span class="op">);</span></span>
<span id="cb2-20"><a href="#cb2-20" tabindex="-1"></a></span>
<span id="cb2-21"><a href="#cb2-21" tabindex="-1"></a>    <span class="cf">return</span> ret<span class="op">;</span></span>
<span id="cb2-22"><a href="#cb2-22" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-23"><a href="#cb2-23" tabindex="-1"></a></span>
<span id="cb2-24"><a href="#cb2-24" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> cursor_type_cmp_spelling<span class="op">(</span><span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> spelling<span class="op">,</span> CXCursor cursor<span class="op">)</span></span>
<span id="cb2-25"><a href="#cb2-25" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-26"><a href="#cb2-26" tabindex="-1"></a>    CXType ctype <span class="op">=</span> clang_getCursorType<span class="op">(</span>cursor<span class="op">);</span></span>
<span id="cb2-27"><a href="#cb2-27" tabindex="-1"></a>    CXString ctype_spelling <span class="op">=</span> clang_getTypeSpelling<span class="op">(</span>ctype<span class="op">);</span></span>
<span id="cb2-28"><a href="#cb2-28" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> ret <span class="op">=</span> strcmp<span class="op">(</span>spelling<span class="op">,</span> clang_getCString<span class="op">(</span>ctype_spelling<span class="op">));</span></span>
<span id="cb2-29"><a href="#cb2-29" tabindex="-1"></a></span>
<span id="cb2-30"><a href="#cb2-30" tabindex="-1"></a>    clang_disposeString<span class="op">(</span>ctype_spelling<span class="op">);</span></span>
<span id="cb2-31"><a href="#cb2-31" tabindex="-1"></a></span>
<span id="cb2-32"><a href="#cb2-32" tabindex="-1"></a>    <span class="cf">return</span> ret<span class="op">;</span></span>
<span id="cb2-33"><a href="#cb2-33" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-34"><a href="#cb2-34" tabindex="-1"></a></span>
<span id="cb2-35"><a href="#cb2-35" tabindex="-1"></a><span class="co">/*</span></span>
<span id="cb2-36"><a href="#cb2-36" tabindex="-1"></a><span class="co"> * Check if the </span><span class="al">TEST</span><span class="co">() macro is used inside the library.</span></span>
<span id="cb2-37"><a href="#cb2-37" tabindex="-1"></a><span class="co"> *</span></span>
<span id="cb2-38"><a href="#cb2-38" tabindex="-1"></a><span class="co"> * This check takes an AST node which should already be known to be a</span></span>
<span id="cb2-39"><a href="#cb2-39" tabindex="-1"></a><span class="co"> * macro expansion kind.</span></span>
<span id="cb2-40"><a href="#cb2-40" tabindex="-1"></a><span class="co"> *</span></span>
<span id="cb2-41"><a href="#cb2-41" tabindex="-1"></a><span class="co"> * If the TU appears to be a test executable then the test does not</span></span>
<span id="cb2-42"><a href="#cb2-42" tabindex="-1"></a><span class="co"> * apply. So in that case we return.</span></span>
<span id="cb2-43"><a href="#cb2-43" tabindex="-1"></a><span class="co"> *</span></span>
<span id="cb2-44"><a href="#cb2-44" tabindex="-1"></a><span class="co"> * If the macro expansion AST node is spelled </span><span class="al">TEST</span><span class="co">, then we emit an</span></span>
<span id="cb2-45"><a href="#cb2-45" tabindex="-1"></a><span class="co"> * error. Otherwise do nothing.</span></span>
<span id="cb2-46"><a href="#cb2-46" tabindex="-1"></a><span class="co"> */</span></span>
<span id="cb2-47"><a href="#cb2-47" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> check_TEST_macro<span class="op">(</span>CXCursor macro_cursor<span class="op">)</span></span>
<span id="cb2-48"><a href="#cb2-48" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-49"><a href="#cb2-49" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>tu_info<span class="op">.</span>tu_kind <span class="op">!=</span> LTP_LIB<span class="op">)</span></span>
<span id="cb2-50"><a href="#cb2-50" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb2-51"><a href="#cb2-51" tabindex="-1"></a></span>
<span id="cb2-52"><a href="#cb2-52" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>cursor_cmp_spelling<span class="op">(</span><span class="st">&quot;TEST&quot;</span><span class="op">,</span> macro_cursor<span class="op">))</span> <span class="op">{</span></span>
<span id="cb2-53"><a href="#cb2-53" tabindex="-1"></a>        emit_check_error<span class="op">(</span>macro_cursor<span class="op">,</span></span>
<span id="cb2-54"><a href="#cb2-54" tabindex="-1"></a>               <span class="st">&quot;TEST() macro should not be used in library&quot;</span><span class="op">);</span></span>
<span id="cb2-55"><a href="#cb2-55" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb2-56"><a href="#cb2-56" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-57"><a href="#cb2-57" tabindex="-1"></a></span>
<span id="cb2-58"><a href="#cb2-58" tabindex="-1"></a><span class="co">/* Recursively visit each AST node and run checks based on node kind */</span></span>
<span id="cb2-59"><a href="#cb2-59" tabindex="-1"></a><span class="dt">static</span> <span class="kw">enum</span> CXChildVisitResult check_visitor<span class="op">(</span>CXCursor cursor<span class="op">,</span></span>
<span id="cb2-60"><a href="#cb2-60" tabindex="-1"></a>                         attr_unused CXCursor parent<span class="op">,</span></span>
<span id="cb2-61"><a href="#cb2-61" tabindex="-1"></a>                         attr_unused CXClientData client_data<span class="op">)</span></span>
<span id="cb2-62"><a href="#cb2-62" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-63"><a href="#cb2-63" tabindex="-1"></a>    CXSourceLocation loc <span class="op">=</span> clang_getCursorLocation<span class="op">(</span>cursor<span class="op">);</span></span>
<span id="cb2-64"><a href="#cb2-64" tabindex="-1"></a></span>
<span id="cb2-65"><a href="#cb2-65" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>clang_Location_isInSystemHeader<span class="op">(</span>loc<span class="op">))</span></span>
<span id="cb2-66"><a href="#cb2-66" tabindex="-1"></a>        <span class="cf">return</span> CXChildVisit_Continue<span class="op">;</span></span>
<span id="cb2-67"><a href="#cb2-67" tabindex="-1"></a></span>
<span id="cb2-68"><a href="#cb2-68" tabindex="-1"></a>    <span class="cf">switch</span> <span class="op">(</span>clang_getCursorKind<span class="op">(</span>cursor<span class="op">))</span> <span class="op">{</span></span>
<span id="cb2-69"><a href="#cb2-69" tabindex="-1"></a>    <span class="cf">case</span> CXCursor_MacroExpansion<span class="op">:</span></span>
<span id="cb2-70"><a href="#cb2-70" tabindex="-1"></a>        check_TEST_macro<span class="op">(</span>cursor<span class="op">);</span></span>
<span id="cb2-71"><a href="#cb2-71" tabindex="-1"></a>        <span class="cf">break</span><span class="op">;</span></span>
<span id="cb2-72"><a href="#cb2-72" tabindex="-1"></a>    <span class="cf">default</span><span class="op">:</span></span>
<span id="cb2-73"><a href="#cb2-73" tabindex="-1"></a>        <span class="cf">break</span><span class="op">;</span></span>
<span id="cb2-74"><a href="#cb2-74" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb2-75"><a href="#cb2-75" tabindex="-1"></a></span>
<span id="cb2-76"><a href="#cb2-76" tabindex="-1"></a>    <span class="cf">return</span> CXChildVisit_Recurse<span class="op">;</span></span>
<span id="cb2-77"><a href="#cb2-77" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-78"><a href="#cb2-78" tabindex="-1"></a></span>
<span id="cb2-79"><a href="#cb2-79" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> collect_info_from_args<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> argc<span class="op">,</span> <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> <span class="op">*</span><span class="dt">const</span> argv<span class="op">)</span></span>
<span id="cb2-80"><a href="#cb2-80" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-81"><a href="#cb2-81" tabindex="-1"></a>    <span class="dt">int</span> i<span class="op">;</span></span>
<span id="cb2-82"><a href="#cb2-82" tabindex="-1"></a></span>
<span id="cb2-83"><a href="#cb2-83" tabindex="-1"></a>    <span class="cf">for</span> <span class="op">(</span>i <span class="op">=</span> <span class="dv">0</span><span class="op">;</span> i <span class="op">&lt;</span> argc<span class="op">;</span> i<span class="op">++)</span> <span class="op">{</span></span>
<span id="cb2-84"><a href="#cb2-84" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>strcmp<span class="op">(</span><span class="st">&quot;-DLTPLIB&quot;</span><span class="op">,</span> argv<span class="op">[</span>i<span class="op">]))</span> <span class="op">{</span></span>
<span id="cb2-85"><a href="#cb2-85" tabindex="-1"></a>            tu_info<span class="op">.</span>tu_kind <span class="op">=</span> LTP_LIB<span class="op">;</span></span>
<span id="cb2-86"><a href="#cb2-86" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb2-87"><a href="#cb2-87" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb2-88"><a href="#cb2-88" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-89"><a href="#cb2-89" tabindex="-1"></a></span>
<span id="cb2-90"><a href="#cb2-90" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> argc<span class="op">,</span> <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> <span class="op">*</span><span class="dt">const</span> argv<span class="op">)</span></span>
<span id="cb2-91"><a href="#cb2-91" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-92"><a href="#cb2-92" tabindex="-1"></a>    CXIndex cindex <span class="op">=</span> clang_createIndex<span class="op">(</span><span class="dv">0</span><span class="op">,</span> <span class="dv">1</span><span class="op">);</span></span>
<span id="cb2-93"><a href="#cb2-93" tabindex="-1"></a>    CXTranslationUnit tu<span class="op">;</span></span>
<span id="cb2-94"><a href="#cb2-94" tabindex="-1"></a>    CXCursor tuc<span class="op">;</span></span>
<span id="cb2-95"><a href="#cb2-95" tabindex="-1"></a>    <span class="kw">enum</span> CXErrorCode ret<span class="op">;</span></span>
<span id="cb2-96"><a href="#cb2-96" tabindex="-1"></a></span>
<span id="cb2-97"><a href="#cb2-97" tabindex="-1"></a>    tu_info<span class="op">.</span>tu_kind <span class="op">=</span> LTP_OTHER<span class="op">;</span></span>
<span id="cb2-98"><a href="#cb2-98" tabindex="-1"></a>    collect_info_from_args<span class="op">(</span>argc<span class="op">,</span> argv<span class="op">);</span></span>
<span id="cb2-99"><a href="#cb2-99" tabindex="-1"></a></span>
<span id="cb2-100"><a href="#cb2-100" tabindex="-1"></a>    ret <span class="op">=</span> clang_parseTranslationUnit2<span class="op">(</span></span>
<span id="cb2-101"><a href="#cb2-101" tabindex="-1"></a>        cindex<span class="op">,</span></span>
<span id="cb2-102"><a href="#cb2-102" tabindex="-1"></a>        <span class="co">/*source_filename=*/</span>NULL<span class="op">,</span></span>
<span id="cb2-103"><a href="#cb2-103" tabindex="-1"></a>        argv <span class="op">+</span> <span class="dv">1</span><span class="op">,</span> argc <span class="op">-</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb2-104"><a href="#cb2-104" tabindex="-1"></a>        <span class="co">/*unsaved_files=*/</span>NULL<span class="op">,</span> <span class="co">/*num_unsaved_files=*/</span><span class="dv">0</span><span class="op">,</span></span>
<span id="cb2-105"><a href="#cb2-105" tabindex="-1"></a>        CXTranslationUnit_DetailedPreprocessingRecord<span class="op">,</span></span>
<span id="cb2-106"><a href="#cb2-106" tabindex="-1"></a>        <span class="op">&amp;</span>tu<span class="op">);</span></span>
<span id="cb2-107"><a href="#cb2-107" tabindex="-1"></a></span>
<span id="cb2-108"><a href="#cb2-108" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>ret <span class="op">!=</span> CXError_Success<span class="op">)</span> <span class="op">{</span></span>
<span id="cb2-109"><a href="#cb2-109" tabindex="-1"></a>        emit_error<span class="op">(</span><span class="st">&quot;Failed to parse translation unit!&quot;</span><span class="op">);</span></span>
<span id="cb2-110"><a href="#cb2-110" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb2-111"><a href="#cb2-111" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb2-112"><a href="#cb2-112" tabindex="-1"></a></span>
<span id="cb2-113"><a href="#cb2-113" tabindex="-1"></a>    tuc <span class="op">=</span> clang_getTranslationUnitCursor<span class="op">(</span>tu<span class="op">);</span></span>
<span id="cb2-114"><a href="#cb2-114" tabindex="-1"></a></span>
<span id="cb2-115"><a href="#cb2-115" tabindex="-1"></a>    clang_visitChildren<span class="op">(</span>tuc<span class="op">,</span> check_visitor<span class="op">,</span> NULL<span class="op">);</span></span>
<span id="cb2-116"><a href="#cb2-116" tabindex="-1"></a></span>
<span id="cb2-117"><a href="#cb2-117" tabindex="-1"></a>    <span class="co">/* Stop leak sanitizer from complaining */</span></span>
<span id="cb2-118"><a href="#cb2-118" tabindex="-1"></a>    clang_disposeTranslationUnit<span class="op">(</span>tu<span class="op">);</span></span>
<span id="cb2-119"><a href="#cb2-119" tabindex="-1"></a>    clang_disposeIndex<span class="op">(</span>cindex<span class="op">);</span></span>
<span id="cb2-120"><a href="#cb2-120" tabindex="-1"></a></span>
<span id="cb2-121"><a href="#cb2-121" tabindex="-1"></a>    <span class="cf">return</span> error_flag<span class="op">;</span></span>
<span id="cb2-122"><a href="#cb2-122" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>The above code uses Clang to create an AST from a C file
            (Translation Unit; TU). It then recurses into the AST,
            checking the type (kind) of each node (cursor). If we find a
            node of a kind we can check, then we call a checking
            function on it.</p>
            <p>The LTP build system passes the same flags it would pass
            to the compiler. In addition we add
            <code>-resource-dir $(shell $(CLANG) -print-resource-dir)</code>
            because libclang cannot find the compiler’s resource
            directory by itself.</p>
            <p>The resource directory contains some compiler specific
            headers and libraries. The <code>clang</code> command is
            able to find it automatically. The code which performs this
            search is not in the Clang library.</p>
            <p>We search the arguments for <code>-DLTPLIB</code> which
            tells us if we are compiling the test library. The
            <code>TEST()</code> macro check only applies to the test
            library. In a previous version we looked at the code itself
            to decide whether the file was a test.</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="co">/* If we find `struct tst_test = {...}` then record that this TU is a test */</span></span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> info_ltp_tu_kind<span class="op">(</span>CXCursor cursor<span class="op">)</span></span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a>    CXCursor initializer<span class="op">;</span></span>
<span id="cb3-5"><a href="#cb3-5" tabindex="-1"></a></span>
<span id="cb3-6"><a href="#cb3-6" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>clang_Cursor_hasVarDeclGlobalStorage<span class="op">(</span>cursor<span class="op">)</span> <span class="op">!=</span> <span class="dv">1</span><span class="op">)</span></span>
<span id="cb3-7"><a href="#cb3-7" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb3-8"><a href="#cb3-8" tabindex="-1"></a></span>
<span id="cb3-9"><a href="#cb3-9" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>cursor_cmp_spelling<span class="op">(</span><span class="st">&quot;test&quot;</span><span class="op">,</span> cursor<span class="op">))</span></span>
<span id="cb3-10"><a href="#cb3-10" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb3-11"><a href="#cb3-11" tabindex="-1"></a></span>
<span id="cb3-12"><a href="#cb3-12" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>cursor_type_cmp_spelling<span class="op">(</span><span class="st">&quot;struct tst_test&quot;</span><span class="op">,</span> cursor<span class="op">))</span></span>
<span id="cb3-13"><a href="#cb3-13" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb3-14"><a href="#cb3-14" tabindex="-1"></a></span>
<span id="cb3-15"><a href="#cb3-15" tabindex="-1"></a>    initializer <span class="op">=</span> clang_Cursor_getVarDeclInitializer<span class="op">(</span>cursor<span class="op">);</span></span>
<span id="cb3-16"><a href="#cb3-16" tabindex="-1"></a></span>
<span id="cb3-17"><a href="#cb3-17" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>clang_Cursor_isNull<span class="op">(</span>initializer<span class="op">))</span></span>
<span id="cb3-18"><a href="#cb3-18" tabindex="-1"></a>        tu_info<span class="op">.</span>tu_kind <span class="op">=</span> LTP_TEST<span class="op">;</span></span>
<span id="cb3-19"><a href="#cb3-19" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>Apart from being more complicated, the problem here was
            <code>clang_Cursor_getVarDeclInitializer</code>. This
            function was only introduced in LLVM 12. Meanwhile stable
            Ubuntu was on LLVM 10. It’s not clear how to achieve the
            same thing without this function.</p>
            <p>There is another problem with our <code>TEST()</code>
            check. The actual requirement is to ensure the variables
            <code>TST_ERR</code> and <code>TST_RET</code> are not
            written to. Determining from the AST if a variable is
            written to is awkward enough. In libclang’s case it seems to
            be impossible. The necessary information is not exposed.</p>
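            <p>To see why the real requirement is about writes to these
            variables, consider the minimal sketch below. The
            <code>TEST()</code> definition is a simplified stand-in for
            the LTP macro, and <code>fake_syscall()</code>,
            <code>lib_helper()</code> and <code>run_test()</code> are
            hypothetical names invented for this illustration.</p>

```c
#include <assert.h>
#include <errno.h>

/* Stand-ins for the LTP globals; assumed here, not the real definitions */
long TST_RET;
int TST_ERR;

/* Simplified assumption of what TEST() does: run the call, then record
 * its return value and errno in the globals. */
#define TEST(SCALL) \
    do { \
        errno = 0; \
        TST_RET = (SCALL); \
        TST_ERR = errno; \
    } while (0)

/* Hypothetical call under test which always fails with EINVAL */
long fake_syscall(void)
{
    errno = EINVAL;
    return -1;
}

/* Hypothetical library helper which (wrongly) uses TEST() internally */
void lib_helper(void)
{
    TEST(0L);
}

/* A pretend test body; returns what the test sees in TST_RET at the end */
long run_test(void)
{
    TEST(fake_syscall());
    /* At this point the globals hold the result we care about */
    assert(TST_RET == -1 && TST_ERR == EINVAL);

    /* The test author calls an innocent looking library function... */
    lib_helper();

    /* ...and the result of the call under test has been overwritten */
    return TST_RET;
}
```

            <p>In this sketch <code>run_test()</code> returns
            <code>0</code> rather than <code>-1</code>. The helper
            clobbered the globals. Spotting the macro name catches this
            particular case, but not a library function which assigns to
            <code>TST_RET</code> directly.</p>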
            <p>The amount of friction simply integrating with libclang
            is probably enough for us to have dismissed it. Even if that
            were not the case though, there is too much stuff missing
            for it to be useful.</p>
            <p>If you can use LLVM at all, it is better to use the C++
            interface.</p>
            <h1 id="coccinelle">Coccinelle</h1>
            <p>Also known as the <code>spatch</code> command. It is
            described as a semantic patch tool. It implements a pattern
            matching language which looks somewhat like a C code
            “diff”.</p>
            <p>These patterns match against the syntax, semantics and
            control flow of C code. Under the hood Coccinelle operates
            on one or more IRs of the C program. However the user is not
            exposed to that. We are given a quirky language which looks
            like a Git commit to some C code.</p>
            <p>Apparently a Coccinelle semantic patch is compiled into
            Computation Tree Logic (CTL) and this is matched against
            some representation of the C code. This is perhaps analogous
            to how a regular expression is compiled into an automaton
            and the automaton matches the input text.</p>
            <p>As far as we can tell, Coccinelle does not track state
            automatically. It does understand control flow however.
            Limited state tracking can be added using Python or OCaml
            snippets. These may be attached at certain points in the
            matching process.</p>
            <p>All in all, you can be forgiven for thinking it works by
            magic. The tool has <a
            href="https://coccinelle.gitlabpages.inria.fr/website/papers.html">multiple
            papers and presentations</a>. There is quite a bit of <a
            href="https://coccinelle.gitlabpages.inria.fr/website/documentation.html">documentation</a>.
            Still it is difficult to grasp. One suspects this is due to
            some misconceptions and communication issues. Perhaps the
            notes below will help.</p>
            <ol style="list-style-type: decimal">
            <li><p>There is no plain text or C code in a semantic patch.
            It all has meaning specified by the domain specific
            language. It looks like C code mixed with some special
            symbols, but it is not.</p></li>
            <li><p>Matching takes the control flow into consideration.
            You can specify that all branches must match, or that at
            least one matching path exists.</p></li>
            <li><p>You can match against the spelling of variables and
            other syntactic details. However it is primarily matching
            against the deeper structure of the program.</p></li>
            </ol>
            <p>With these things in mind you may have more of a chance
            of understanding the documentation.</p>
            <p><code>spatch</code> does not have helpful error messages.
            The implementation is also opaque to us (more on that
            later). So the process of writing a semantic patch is often
            blind trial and error, mixed with reading the docs and
            examples.</p>
            <p>That said it is a truly wonderful tool. We made a lot of
            progress in a short time. Below is a semantic patch which
            both finds and (almost) fixes <code>TEST()</code> macro
            usages.</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a><span class="co">// Find and fix violations of rule LTP-002</span></span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a><span class="co">// Set with -D fix</span></span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a>virtual fix</span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a></span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a><span class="co">// Find all positions where </span><span class="al">TEST</span><span class="co"> is _used_.</span></span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a>@ depends on <span class="op">!</span>fix exists @</span>
<span id="cb4-8"><a href="#cb4-8" tabindex="-1"></a>@@</span>
<span id="cb4-9"><a href="#cb4-9" tabindex="-1"></a></span>
<span id="cb4-10"><a href="#cb4-10" tabindex="-1"></a><span class="op">*</span> TEST<span class="op">(...);</span></span>
<span id="cb4-11"><a href="#cb4-11" tabindex="-1"></a></span>
<span id="cb4-12"><a href="#cb4-12" tabindex="-1"></a><span class="co">// Below are rules which will create a patch to replace </span><span class="al">TEST</span><span class="co"> usage</span></span>
<span id="cb4-13"><a href="#cb4-13" tabindex="-1"></a><span class="co">// It assumes we can use the ret var without conflicts</span></span>
<span id="cb4-14"><a href="#cb4-14" tabindex="-1"></a></span>
<span id="cb4-15"><a href="#cb4-15" tabindex="-1"></a><span class="co">// Fix all references to the variables </span><span class="al">TEST</span><span class="co"> modifies when they occur in a</span></span>
<span id="cb4-16"><a href="#cb4-16" tabindex="-1"></a><span class="co">// function where </span><span class="al">TEST</span><span class="co"> was used.</span></span>
<span id="cb4-17"><a href="#cb4-17" tabindex="-1"></a>@ depends on fix exists @</span>
<span id="cb4-18"><a href="#cb4-18" tabindex="-1"></a>@@</span>
<span id="cb4-19"><a href="#cb4-19" tabindex="-1"></a></span>
<span id="cb4-20"><a href="#cb4-20" tabindex="-1"></a> TEST<span class="op">(...)</span></span>
<span id="cb4-21"><a href="#cb4-21" tabindex="-1"></a></span>
<span id="cb4-22"><a href="#cb4-22" tabindex="-1"></a> <span class="op">...</span></span>
<span id="cb4-23"><a href="#cb4-23" tabindex="-1"></a></span>
<span id="cb4-24"><a href="#cb4-24" tabindex="-1"></a><span class="op">(</span></span>
<span id="cb4-25"><a href="#cb4-25" tabindex="-1"></a><span class="op">-</span> TST_RET</span>
<span id="cb4-26"><a href="#cb4-26" tabindex="-1"></a><span class="op">+</span> ret</span>
<span id="cb4-27"><a href="#cb4-27" tabindex="-1"></a><span class="op">|</span></span>
<span id="cb4-28"><a href="#cb4-28" tabindex="-1"></a><span class="op">-</span> TST_ERR</span>
<span id="cb4-29"><a href="#cb4-29" tabindex="-1"></a><span class="op">+</span> errno</span>
<span id="cb4-30"><a href="#cb4-30" tabindex="-1"></a><span class="op">|</span></span>
<span id="cb4-31"><a href="#cb4-31" tabindex="-1"></a><span class="op">-</span> TTERRNO</span>
<span id="cb4-32"><a href="#cb4-32" tabindex="-1"></a><span class="op">+</span> TERRNO</span>
<span id="cb4-33"><a href="#cb4-33" tabindex="-1"></a><span class="op">)</span></span>
<span id="cb4-34"><a href="#cb4-34" tabindex="-1"></a></span>
<span id="cb4-35"><a href="#cb4-35" tabindex="-1"></a><span class="co">// Replace </span><span class="al">TEST</span><span class="co"> in all functions where it occurs only at the start. It</span></span>
<span id="cb4-36"><a href="#cb4-36" tabindex="-1"></a><span class="co">// is slightly complicated by adding a newline if a statement appears</span></span>
<span id="cb4-37"><a href="#cb4-37" tabindex="-1"></a><span class="co">// on the line after </span><span class="al">TEST</span><span class="co">(). It is not clear to me what the rules are</span></span>
<span id="cb4-38"><a href="#cb4-38" tabindex="-1"></a><span class="co">// for matching whitespace as it has no semantic meaning, but this</span></span>
<span id="cb4-39"><a href="#cb4-39" tabindex="-1"></a><span class="co">// appears to work.</span></span>
<span id="cb4-40"><a href="#cb4-40" tabindex="-1"></a>@ depends on fix @</span>
<span id="cb4-41"><a href="#cb4-41" tabindex="-1"></a>identifier fn<span class="op">;</span></span>
<span id="cb4-42"><a href="#cb4-42" tabindex="-1"></a>expression tested_expr<span class="op">;</span></span>
<span id="cb4-43"><a href="#cb4-43" tabindex="-1"></a>statement st<span class="op">;</span></span>
<span id="cb4-44"><a href="#cb4-44" tabindex="-1"></a>@@</span>
<span id="cb4-45"><a href="#cb4-45" tabindex="-1"></a></span>
<span id="cb4-46"><a href="#cb4-46" tabindex="-1"></a>  fn <span class="op">(...)</span></span>
<span id="cb4-47"><a href="#cb4-47" tabindex="-1"></a>  <span class="op">{</span></span>
<span id="cb4-48"><a href="#cb4-48" tabindex="-1"></a><span class="op">-</span>   TEST<span class="op">(</span>tested_expr<span class="op">);</span></span>
<span id="cb4-49"><a href="#cb4-49" tabindex="-1"></a><span class="op">+</span>   <span class="dt">const</span> <span class="dt">long</span> ret <span class="op">=</span> tested_expr<span class="op">;</span></span>
<span id="cb4-50"><a href="#cb4-50" tabindex="-1"></a><span class="op">(</span></span>
<span id="cb4-51"><a href="#cb4-51" tabindex="-1"></a><span class="op">+</span></span>
<span id="cb4-52"><a href="#cb4-52" tabindex="-1"></a>    st</span>
<span id="cb4-53"><a href="#cb4-53" tabindex="-1"></a><span class="op">|</span></span>
<span id="cb4-54"><a href="#cb4-54" tabindex="-1"></a></span>
<span id="cb4-55"><a href="#cb4-55" tabindex="-1"></a><span class="op">)</span></span>
<span id="cb4-56"><a href="#cb4-56" tabindex="-1"></a>    <span class="op">...</span> when <span class="op">!=</span> TEST<span class="op">(...)</span></span>
<span id="cb4-57"><a href="#cb4-57" tabindex="-1"></a>  <span class="op">}</span></span>
<span id="cb4-58"><a href="#cb4-58" tabindex="-1"></a></span>
<span id="cb4-59"><a href="#cb4-59" tabindex="-1"></a><span class="co">// Replace </span><span class="al">TEST</span><span class="co"> in all functions where it occurs at the start</span></span>
<span id="cb4-60"><a href="#cb4-60" tabindex="-1"></a><span class="co">// Functions where it *only* occurs at the start were handled above</span></span>
<span id="cb4-61"><a href="#cb4-61" tabindex="-1"></a>@ depends on fix @</span>
<span id="cb4-62"><a href="#cb4-62" tabindex="-1"></a>identifier fn<span class="op">;</span></span>
<span id="cb4-63"><a href="#cb4-63" tabindex="-1"></a>expression tested_expr<span class="op">;</span></span>
<span id="cb4-64"><a href="#cb4-64" tabindex="-1"></a>statement st<span class="op">;</span></span>
<span id="cb4-65"><a href="#cb4-65" tabindex="-1"></a>@@</span>
<span id="cb4-66"><a href="#cb4-66" tabindex="-1"></a></span>
<span id="cb4-67"><a href="#cb4-67" tabindex="-1"></a>  fn <span class="op">(...)</span></span>
<span id="cb4-68"><a href="#cb4-68" tabindex="-1"></a>  <span class="op">{</span></span>
<span id="cb4-69"><a href="#cb4-69" tabindex="-1"></a><span class="op">-</span>   TEST<span class="op">(</span>tested_expr<span class="op">);</span></span>
<span id="cb4-70"><a href="#cb4-70" tabindex="-1"></a><span class="op">+</span>   <span class="dt">long</span> ret <span class="op">=</span> tested_expr<span class="op">;</span></span>
<span id="cb4-71"><a href="#cb4-71" tabindex="-1"></a><span class="op">(</span></span>
<span id="cb4-72"><a href="#cb4-72" tabindex="-1"></a><span class="op">+</span></span>
<span id="cb4-73"><a href="#cb4-73" tabindex="-1"></a>    st</span>
<span id="cb4-74"><a href="#cb4-74" tabindex="-1"></a><span class="op">|</span></span>
<span id="cb4-75"><a href="#cb4-75" tabindex="-1"></a></span>
<span id="cb4-76"><a href="#cb4-76" tabindex="-1"></a><span class="op">)</span></span>
<span id="cb4-77"><a href="#cb4-77" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb4-78"><a href="#cb4-78" tabindex="-1"></a>  <span class="op">}</span></span>
<span id="cb4-79"><a href="#cb4-79" tabindex="-1"></a></span>
<span id="cb4-80"><a href="#cb4-80" tabindex="-1"></a><span class="co">// Add ret var at the start of a function where </span><span class="al">TEST</span><span class="co"> occurs and there</span></span>
<span id="cb4-81"><a href="#cb4-81" tabindex="-1"></a><span class="co">// is not already a ret declaration</span></span>
<span id="cb4-82"><a href="#cb4-82" tabindex="-1"></a>@ depends on fix exists @</span>
<span id="cb4-83"><a href="#cb4-83" tabindex="-1"></a>identifier fn<span class="op">;</span></span>
<span id="cb4-84"><a href="#cb4-84" tabindex="-1"></a>@@</span>
<span id="cb4-85"><a href="#cb4-85" tabindex="-1"></a></span>
<span id="cb4-86"><a href="#cb4-86" tabindex="-1"></a>  fn <span class="op">(...)</span></span>
<span id="cb4-87"><a href="#cb4-87" tabindex="-1"></a>  <span class="op">{</span></span>
<span id="cb4-88"><a href="#cb4-88" tabindex="-1"></a><span class="op">+</span>   <span class="dt">long</span> ret<span class="op">;</span></span>
<span id="cb4-89"><a href="#cb4-89" tabindex="-1"></a>    <span class="op">...</span> when <span class="op">!=</span> <span class="dt">long</span> ret<span class="op">;</span></span>
<span id="cb4-90"><a href="#cb4-90" tabindex="-1"></a></span>
<span id="cb4-91"><a href="#cb4-91" tabindex="-1"></a>    TEST<span class="op">(...)</span></span>
<span id="cb4-92"><a href="#cb4-92" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb4-93"><a href="#cb4-93" tabindex="-1"></a>  <span class="op">}</span></span>
<span id="cb4-94"><a href="#cb4-94" tabindex="-1"></a></span>
<span id="cb4-95"><a href="#cb4-95" tabindex="-1"></a><span class="co">// Replace any remaining occurrences of </span><span class="al">TEST</span></span>
<span id="cb4-96"><a href="#cb4-96" tabindex="-1"></a>@ depends on fix @</span>
<span id="cb4-97"><a href="#cb4-97" tabindex="-1"></a>expression tested_expr<span class="op">;</span></span>
<span id="cb4-98"><a href="#cb4-98" tabindex="-1"></a>@@</span>
<span id="cb4-99"><a href="#cb4-99" tabindex="-1"></a></span>
<span id="cb4-100"><a href="#cb4-100" tabindex="-1"></a><span class="op">-</span>   TEST<span class="op">(</span>tested_expr<span class="op">);</span></span>
<span id="cb4-101"><a href="#cb4-101" tabindex="-1"></a><span class="op">+</span>   ret <span class="op">=</span> tested_expr<span class="op">;</span></span></code></pre></div>
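            <p>As a rough illustration of what the fix rules aim for,
            here is a hand-written before and after. It is not actual
            <code>spatch</code> output. <code>TEST()</code> is again a
            simplified stand-in for the LTP macro, and
            <code>fake_syscall()</code>, <code>check_before()</code>,
            <code>check_after()</code> and
            <code>check_equivalent()</code> are hypothetical.</p>

```c
#include <assert.h>
#include <errno.h>

/* Stand-ins for the LTP globals; assumed here, not the real definitions */
long TST_RET;
int TST_ERR;

/* Simplified assumption of what TEST() does */
#define TEST(SCALL) \
    do { \
        errno = 0; \
        TST_RET = (SCALL); \
        TST_ERR = errno; \
    } while (0)

/* Hypothetical call: returns arg if it is non-negative, otherwise
 * fails with EINVAL. */
long fake_syscall(int arg)
{
    if (arg < 0) {
        errno = EINVAL;
        return -1;
    }

    return arg;
}

/* Before: library code using TEST() and the globals it writes.
 * Returns 0 on success, otherwise the errno value. */
int check_before(int arg)
{
    TEST(fake_syscall(arg));

    if (TST_RET == -1)
        return TST_ERR;

    return 0;
}

/* After: TEST() is the first and only use in the function, so it
 * becomes a const local; TST_RET becomes ret and TST_ERR becomes errno. */
int check_after(int arg)
{
    const long ret = fake_syscall(arg);

    if (ret == -1)
        return errno;

    return 0;
}

/* Both versions should report the same result for a given input */
void check_equivalent(int arg)
{
    assert(check_before(arg) == check_after(arg));
}
```

            <p>Note that the stand-in <code>TEST()</code> clears
            <code>errno</code> before the call, which the rewritten
            function does not do. Differences like this are one reason
            to review the generated patch by hand.</p>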
            <p>This has been merged into the LTP. However we determined
            that Coccinelle can not be forced upon LTP contributors.
            Despite the fact Coccinelle is stable and has been around
            for years, we ran into distribution issues. It seems that at
            least the Gentoo package is lacking a maintainer.</p>
            <p>We suspect this has little to do with Coccinelle itself.
            The issue is that it is written in OCaml. Package
            maintainers struggle with OCaml projects. It is easy to see
            why, as our attempts to learn the basics of OCaml were
            fraught with issues.</p>
            <p>For a tool as good as Coccinelle, some of us are willing
            to learn a new language. If it were Haskell, for example,
            we’d not have a problem fixing the <em>occasional</em>
            issue.</p>
            <p>However everyone’s patience ran out with OCaml. Being
            functional when we are primarily working on C does not help.
            However the main issue is that many distributions are not
            maintaining the packages properly. So it is often difficult
            just to get the REPL and compiler running.</p>
            <p>I personally have no opinion on whether it is a good
            language. I didn’t get far enough to decide that. It seems
            to be the case though that people are not interested in it.
            Meanwhile we want to get some static analysis done, not
            revive a struggling language.</p>
            <p>We still merged the Coccinelle scripts into the LTP. They
            provide a useful example of how to automate changes with
            <code>spatch</code>. We haven’t found another option for
            making these kinds of changes. Using Clang Tidy is extremely
            laborious compared to writing a semantic patch.</p>
            <p>Sadly it has to be dismissed as our primary checker due
            to the OCaml ecosystem.</p>
            <h1 id="sparse">Sparse</h1>
            <p>Sparse is a standalone C parser library. It produces an
            AST and linearised IR consisting of basic blocks. In fact it
            can produce executable x86 code or LLVM IR. So it is
            essentially a compiler.</p>
            <p>Unlike most C compilers however it is very simple. It is
            not designed to produce fast code, nor can it parse
            everything GCC can. It only parses C and is not concerned
            with C++.</p>
            <p>Sparse itself can be compiled relatively quickly and has
            few dependencies. It doesn’t take long to clone it with Git
            either. This meant we were able to vendor it in as a Git
            submodule.</p>
            <p>Some disapprove of vendoring and Git submodules.
            Unfortunately the Sparse package available on most
            distributions is not useful to us. Sparse is linked
            statically and the package only contains an executable for
            use with the Linux kernel. There is no dynamic library. Of
            course someone can change that, but it would take time to
            propagate downstream.</p>
            <p>The IR is relatively easy to traverse and write checks
            against. The documentation is maybe a little sparse. However
            with some knowledge about compilers, it’s not too hard to
            understand the code. It is written in a similar style to the
            kernel and LTP. Albeit with some quirks.</p>
            <p>Below is the full checker program sans some
            boilerplate.</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="co">/* The rules for test, library and tool code are different */</span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a><span class="kw">enum</span> ltp_tu_kind <span class="op">{</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a>    LTP_LIB<span class="op">,</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a>    LTP_OTHER<span class="op">,</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a><span class="dt">static</span> <span class="kw">enum</span> ltp_tu_kind tu_kind <span class="op">=</span> LTP_OTHER<span class="op">;</span></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a></span>
<span id="cb5-9"><a href="#cb5-9" tabindex="-1"></a><span class="co">/* Check for LTP-002</span></span>
<span id="cb5-10"><a href="#cb5-10" tabindex="-1"></a><span class="co"> *</span></span>
<span id="cb5-11"><a href="#cb5-11" tabindex="-1"></a><span class="co"> * Inspects the destination symbol of each store instruction. If it is</span></span>
<span id="cb5-12"><a href="#cb5-12" tabindex="-1"></a><span class="co"> * TST_RET or TST_ERR then emit a warning.</span></span>
<span id="cb5-13"><a href="#cb5-13" tabindex="-1"></a><span class="co"> */</span></span>
<span id="cb5-14"><a href="#cb5-14" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> check_lib_sets_TEST_vars<span class="op">(</span><span class="dt">const</span> <span class="kw">struct</span> instruction <span class="op">*</span>insn<span class="op">)</span></span>
<span id="cb5-15"><a href="#cb5-15" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-16"><a href="#cb5-16" tabindex="-1"></a>    <span class="dt">static</span> <span class="kw">struct</span> ident <span class="op">*</span>TST_RES_id<span class="op">,</span> <span class="op">*</span>TST_ERR_id<span class="op">;</span></span>
<span id="cb5-17"><a href="#cb5-17" tabindex="-1"></a></span>
<span id="cb5-18"><a href="#cb5-18" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>TST_RES_id<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-19"><a href="#cb5-19" tabindex="-1"></a>        TST_RES_id <span class="op">=</span> built_in_ident<span class="op">(</span><span class="st">&quot;TST_RET&quot;</span><span class="op">);</span></span>
<span id="cb5-20"><a href="#cb5-20" tabindex="-1"></a>        TST_ERR_id <span class="op">=</span> built_in_ident<span class="op">(</span><span class="st">&quot;TST_ERR&quot;</span><span class="op">);</span></span>
<span id="cb5-21"><a href="#cb5-21" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb5-22"><a href="#cb5-22" tabindex="-1"></a></span>
<span id="cb5-23"><a href="#cb5-23" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>insn<span class="op">-&gt;</span>opcode <span class="op">!=</span> OP_STORE<span class="op">)</span></span>
<span id="cb5-24"><a href="#cb5-24" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb5-25"><a href="#cb5-25" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>insn<span class="op">-&gt;</span>src<span class="op">-&gt;</span>ident <span class="op">!=</span> TST_RES_id <span class="op">&amp;&amp;</span></span>
<span id="cb5-26"><a href="#cb5-26" tabindex="-1"></a>        insn<span class="op">-&gt;</span>src<span class="op">-&gt;</span>ident <span class="op">!=</span> TST_ERR_id<span class="op">)</span></span>
<span id="cb5-27"><a href="#cb5-27" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb5-28"><a href="#cb5-28" tabindex="-1"></a></span>
<span id="cb5-29"><a href="#cb5-29" tabindex="-1"></a>    warning<span class="op">(</span>insn<span class="op">-&gt;</span>pos<span class="op">,</span></span>
<span id="cb5-30"><a href="#cb5-30" tabindex="-1"></a>        <span class="st">&quot;LTP-002: Library should not write to TST_RET or TST_ERR&quot;</span><span class="op">);</span></span>
<span id="cb5-31"><a href="#cb5-31" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-32"><a href="#cb5-32" tabindex="-1"></a></span>
<span id="cb5-33"><a href="#cb5-33" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> do_basicblock_checks<span class="op">(</span><span class="kw">struct</span> basic_block <span class="op">*</span>bb<span class="op">)</span></span>
<span id="cb5-34"><a href="#cb5-34" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-35"><a href="#cb5-35" tabindex="-1"></a>    <span class="kw">struct</span> instruction <span class="op">*</span>insn<span class="op">;</span></span>
<span id="cb5-36"><a href="#cb5-36" tabindex="-1"></a></span>
<span id="cb5-37"><a href="#cb5-37" tabindex="-1"></a>    FOR_EACH_PTR<span class="op">(</span>bb<span class="op">-&gt;</span>insns<span class="op">,</span> insn<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-38"><a href="#cb5-38" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>bb_reachable<span class="op">(</span>insn<span class="op">-&gt;</span>bb<span class="op">))</span></span>
<span id="cb5-39"><a href="#cb5-39" tabindex="-1"></a>            <span class="cf">continue</span><span class="op">;</span></span>
<span id="cb5-40"><a href="#cb5-40" tabindex="-1"></a></span>
<span id="cb5-41"><a href="#cb5-41" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>tu_kind <span class="op">==</span> LTP_LIB<span class="op">)</span></span>
<span id="cb5-42"><a href="#cb5-42" tabindex="-1"></a>            check_lib_sets_TEST_vars<span class="op">(</span>insn<span class="op">);</span></span>
<span id="cb5-43"><a href="#cb5-43" tabindex="-1"></a>    <span class="op">}</span> END_FOR_EACH_PTR<span class="op">(</span>insn<span class="op">);</span></span>
<span id="cb5-44"><a href="#cb5-44" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-45"><a href="#cb5-45" tabindex="-1"></a></span>
<span id="cb5-46"><a href="#cb5-46" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> do_entrypoint_checks<span class="op">(</span><span class="kw">struct</span> entrypoint <span class="op">*</span>ep<span class="op">)</span></span>
<span id="cb5-47"><a href="#cb5-47" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-48"><a href="#cb5-48" tabindex="-1"></a>    <span class="kw">struct</span> basic_block <span class="op">*</span>bb<span class="op">;</span></span>
<span id="cb5-49"><a href="#cb5-49" tabindex="-1"></a></span>
<span id="cb5-50"><a href="#cb5-50" tabindex="-1"></a>    FOR_EACH_PTR<span class="op">(</span>ep<span class="op">-&gt;</span>bbs<span class="op">,</span> bb<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-51"><a href="#cb5-51" tabindex="-1"></a>        do_basicblock_checks<span class="op">(</span>bb<span class="op">);</span></span>
<span id="cb5-52"><a href="#cb5-52" tabindex="-1"></a>    <span class="op">}</span> END_FOR_EACH_PTR<span class="op">(</span>bb<span class="op">);</span></span>
<span id="cb5-53"><a href="#cb5-53" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-54"><a href="#cb5-54" tabindex="-1"></a></span>
<span id="cb5-55"><a href="#cb5-55" tabindex="-1"></a><span class="co">/* Compile the AST into a graph of basicblocks */</span></span>
<span id="cb5-56"><a href="#cb5-56" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> process_symbols<span class="op">(</span><span class="kw">struct</span> symbol_list <span class="op">*</span>list<span class="op">)</span></span>
<span id="cb5-57"><a href="#cb5-57" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-58"><a href="#cb5-58" tabindex="-1"></a>    <span class="kw">struct</span> symbol <span class="op">*</span>sym<span class="op">;</span></span>
<span id="cb5-59"><a href="#cb5-59" tabindex="-1"></a></span>
<span id="cb5-60"><a href="#cb5-60" tabindex="-1"></a>    FOR_EACH_PTR<span class="op">(</span>list<span class="op">,</span> sym<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-61"><a href="#cb5-61" tabindex="-1"></a>        <span class="kw">struct</span> entrypoint <span class="op">*</span>ep<span class="op">;</span></span>
<span id="cb5-62"><a href="#cb5-62" tabindex="-1"></a></span>
<span id="cb5-63"><a href="#cb5-63" tabindex="-1"></a>        expand_symbol<span class="op">(</span>sym<span class="op">);</span></span>
<span id="cb5-64"><a href="#cb5-64" tabindex="-1"></a>        ep <span class="op">=</span> linearize_symbol<span class="op">(</span>sym<span class="op">);</span></span>
<span id="cb5-65"><a href="#cb5-65" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>ep <span class="op">||</span> <span class="op">!</span>ep<span class="op">-&gt;</span>entry<span class="op">)</span></span>
<span id="cb5-66"><a href="#cb5-66" tabindex="-1"></a>            <span class="cf">continue</span><span class="op">;</span></span>
<span id="cb5-67"><a href="#cb5-67" tabindex="-1"></a></span>
<span id="cb5-68"><a href="#cb5-68" tabindex="-1"></a>        do_entrypoint_checks<span class="op">(</span>ep<span class="op">);</span></span>
<span id="cb5-69"><a href="#cb5-69" tabindex="-1"></a></span>
<span id="cb5-70"><a href="#cb5-70" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>dbg_entry<span class="op">)</span></span>
<span id="cb5-71"><a href="#cb5-71" tabindex="-1"></a>            show_entry<span class="op">(</span>ep<span class="op">);</span></span>
<span id="cb5-72"><a href="#cb5-72" tabindex="-1"></a>    <span class="op">}</span> END_FOR_EACH_PTR<span class="op">(</span>sym<span class="op">);</span></span>
<span id="cb5-73"><a href="#cb5-73" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-74"><a href="#cb5-74" tabindex="-1"></a></span>
<span id="cb5-75"><a href="#cb5-75" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> collect_info_from_args<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> argc<span class="op">,</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> <span class="op">*</span><span class="dt">const</span> argv<span class="op">)</span></span>
<span id="cb5-76"><a href="#cb5-76" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-77"><a href="#cb5-77" tabindex="-1"></a>    <span class="dt">int</span> i<span class="op">;</span></span>
<span id="cb5-78"><a href="#cb5-78" tabindex="-1"></a></span>
<span id="cb5-79"><a href="#cb5-79" tabindex="-1"></a>    <span class="cf">for</span> <span class="op">(</span>i <span class="op">=</span> <span class="dv">0</span><span class="op">;</span> i <span class="op">&lt;</span> argc<span class="op">;</span> i<span class="op">++)</span> <span class="op">{</span></span>
<span id="cb5-80"><a href="#cb5-80" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>strcmp<span class="op">(</span><span class="st">&quot;-DLTPLIB&quot;</span><span class="op">,</span> argv<span class="op">[</span>i<span class="op">]))</span></span>
<span id="cb5-81"><a href="#cb5-81" tabindex="-1"></a>            tu_kind <span class="op">=</span> LTP_LIB<span class="op">;</span></span>
<span id="cb5-82"><a href="#cb5-82" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb5-83"><a href="#cb5-83" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-84"><a href="#cb5-84" tabindex="-1"></a></span>
<span id="cb5-85"><a href="#cb5-85" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">int</span> argc<span class="op">,</span> <span class="dt">char</span> <span class="op">**</span>argv<span class="op">)</span></span>
<span id="cb5-86"><a href="#cb5-86" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-87"><a href="#cb5-87" tabindex="-1"></a>    <span class="kw">struct</span> string_list <span class="op">*</span>filelist <span class="op">=</span> NULL<span class="op">;</span></span>
<span id="cb5-88"><a href="#cb5-88" tabindex="-1"></a>    <span class="dt">char</span> <span class="op">*</span>file<span class="op">;</span></span>
<span id="cb5-89"><a href="#cb5-89" tabindex="-1"></a></span>
<span id="cb5-90"><a href="#cb5-90" tabindex="-1"></a>    <span class="co">/* ... Disable a bunch of inbuilt checks ... */</span></span>
<span id="cb5-91"><a href="#cb5-91" tabindex="-1"></a></span>
<span id="cb5-92"><a href="#cb5-92" tabindex="-1"></a>    do_output <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb5-93"><a href="#cb5-93" tabindex="-1"></a></span>
<span id="cb5-94"><a href="#cb5-94" tabindex="-1"></a>    collect_info_from_args<span class="op">(</span>argc<span class="op">,</span> argv<span class="op">);</span></span>
<span id="cb5-95"><a href="#cb5-95" tabindex="-1"></a></span>
<span id="cb5-96"><a href="#cb5-96" tabindex="-1"></a>    process_symbols<span class="op">(</span>sparse_initialize<span class="op">(</span>argc<span class="op">,</span> argv<span class="op">,</span> <span class="op">&amp;</span>filelist<span class="op">));</span></span>
<span id="cb5-97"><a href="#cb5-97" tabindex="-1"></a>    FOR_EACH_PTR<span class="op">(</span>filelist<span class="op">,</span> file<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-98"><a href="#cb5-98" tabindex="-1"></a>        process_symbols<span class="op">(</span>sparse<span class="op">(</span>file<span class="op">));</span></span>
<span id="cb5-99"><a href="#cb5-99" tabindex="-1"></a>    <span class="op">}</span> END_FOR_EACH_PTR<span class="op">(</span>file<span class="op">);</span></span>
<span id="cb5-100"><a href="#cb5-100" tabindex="-1"></a></span>
<span id="cb5-101"><a href="#cb5-101" tabindex="-1"></a>    report_stats<span class="op">();</span></span>
<span id="cb5-102"><a href="#cb5-102" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb5-103"><a href="#cb5-103" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>Unlike the Clang and Coccinelle checks, this actually
            checks the variables themselves. We traverse the IR and look
            for writes to them. This will catch some additional cases
            where we write to the variables without using the
            macros.</p>
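            <p>To make the difference concrete, here is a minimal sketch
            of the two kinds of write. The globals and the
            <code>TEST()</code> macro are simplified stand-ins declared
            locally so that the file compiles alone, and both helper
            functions are hypothetical. In both cases a store to
            <code>TST_RET</code> ends up in the IR, which is what the
            check looks for. A check which only matches the macro would
            presumably miss the second helper.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;errno.h&gt;
#include &lt;stdio.h&gt;

/* Stand-ins for the LTP globals, declared here so the sketch builds alone */
long TST_RET;
int TST_ERR;

/* Simplified stand-in for the LTP TEST() macro */
#define TEST(SCALL) \
    do { \
        errno = 0; \
        TST_RET = (SCALL); \
        TST_ERR = errno; \
    } while (0)

/* Hypothetical library helper: writes the variables through the macro */
static long lib_helper_macro(void)
{
    TEST(42L);
    return TST_RET;
}

/* Hypothetical library helper: writes the variables directly, no macro */
static long lib_helper_direct(void)
{
    TST_RET = 7;
    TST_ERR = 0;
    return TST_RET;
}

int main(void)
{
    printf(&quot;%ld\n&quot;, lib_helper_macro());
    printf(&quot;%ld\n&quot;, lib_helper_direct());
    return 0;
}
</code></pre></div>
            <p>Once the preprocessor has run, both helpers contain a
            plain assignment to the same global, so a check on the store
            instruction does not need to know how the assignment was
            spelled in the source.</p>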
            <p>It may be possible to fool it somehow. It does have the
            issue that library header files are considered part of the
            test code. We have only just begun to use Sparse so this
            will likely be modified over time.</p>
            <p>For now we don’t try to do anything with the AST, we just
            look at the IR. Unlike Clang, Sparse does not save
            information about macro expansions. They do not show up as
            nodes in the AST. It appears that preprocessing is performed
            without saving any details. We may need to change this.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Note that since writing this article I have added more
            checks. Including some which operate on the “AST”. You can
            see more <a
            href="https://github.com/linux-test-project/ltp/blob/master/tools/sparse/sparse-ltp.c">here</a>.
            Frankly I find the AST in Sparse horribly confusing.</p>
            </div>
            </div>
            <p>Sparse has many built-in checks and warnings. We have
            disabled most of them for now. In some cases they are kernel
            specific. In other cases they have been adopted by GCC and
            Clang which produce prettier warnings. Mostly though we just
            need to clean up the LTP, then we can enable them.</p>
            <p>Sparse also introduces some attributes
            (e.g. <code>__attribute__((address_space(name)))</code>) which
            may be useful or not. Attributes are a way of extending C
            which does not interfere with compilers that do not support
            them. The kernel uses them to prevent functions and
            variables being used in certain ways.</p>
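            <p>As a rough illustration, the sketch below is modelled on
            how the kernel wraps these attributes in macros such as
            <code>__user</code> and <code>__force</code>. Sparse defines
            <code>__CHECKER__</code> while parsing, so only it sees the
            attributes and other compilers see empty macros. The macro
            and function names are hypothetical and the numeric address
            space is just an example value. As far as I understand it,
            Sparse would complain if the annotated pointer was
            dereferenced directly or cast without the force
            attribute.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stdio.h&gt;
#include &lt;string.h&gt;

/*
 * Hypothetical macros modelled on the kernel's __user and __force.
 * Only Sparse defines __CHECKER__, any other compiler gets empty macros.
 */
#ifdef __CHECKER__
# define example_user  __attribute__((noderef, address_space(1)))
# define example_force __attribute__((force))
#else
# define example_user
# define example_force
#endif

/* Hypothetical accessor: the one place allowed to strip the annotation */
static void copy_from_example_user(char *dst, const char example_user *src,
                                   size_t len)
{
    memcpy(dst, (example_force const char *)src, len);
}

int main(void)
{
    static const char backing[] = &quot;hello&quot;;
    /* Mark the pointer as belonging to the annotated address space */
    const char example_user *uptr =
        (example_force const char example_user *)backing;
    char buf[sizeof(backing)];

    copy_from_example_user(buf, uptr, sizeof(backing));
    puts(buf);
    return 0;
}
</code></pre></div>
            <p>With GCC or Clang this is ordinary C. The annotations
            only carry meaning when the file is run through Sparse.</p>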
            <h1 id="tree-sitter">Tree-sitter</h1>
            <p>Since writing this article and adopting Sparse I
            discovered <a
            href="https://tree-sitter.github.io/tree-sitter/">Tree-sitter</a>
            thanks to <a
            href="https://github.com/googleprojectzero/weggli">Weggli</a>.
            Weggli is a very fast “semantic search tool” inspired by
            Coccinelle amongst others. I’d say it’s more of an AST
            matcher. As far as I know it doesn’t have the control flow
            analysis features of Coccinelle. On the plus side I can see
            myself contributing to it as it is written in Rust. It’s
            also much faster than Coccinelle and easy to install.</p>
            <p>Just go and try it, it should only take 5 minutes if you
            are willing to install it with Rust’s <code>cargo</code>
            command. I often use it now for searching the Linux tree as
            it tends to find things that <code>clangd</code> doesn’t
            because <code>compile_commands.json</code> doesn’t have some
            files in it due to the build configuration.</p>
            <p>However for the purposes of the LTP checker, it’s
            Tree-sitter that is really interesting. Tree-sitter ticks a
            lot of the boxes in our requirements: it generates zero
            dependency parsers written in C. These can easily be
            vendored in.</p>
            <p>It supports many languages including C and Bash. We also
            have many tests written in Shell and a Shell test API. So
            there is also the possibility of producing LTP specific
            checks for Shell as well.</p>
            <p>It only operates at the AST level, but is vastly easier
            to understand than Sparse’s AST. For one thing it has a nice
            CLI for interactively inspecting ASTs and even a <a
            href="https://tree-sitter.github.io/tree-sitter/playground">web
            based playground</a>. The C API appears to have some proper
            documentation and looks straightforward compared to
            Sparse.</p>
            <p>The problem is the lack of linearization. Some things are
            just much easier with some IR, that’s why it exists. There
            is also the fact we already have something working in
            Sparse. Still I would not rule out us using Tree-sitter.</p>
            <h1 id="conclusion">Conclusion</h1>
            <p>Going forwards we will continue to develop Sparse as our
            main tool. We may still need to abandon it. Perhaps the
            checks we really want will be too difficult. Personally, I
            will also continue to use Coccinelle, especially for
            “evolutionary development”.</p>
            <p>There is a huge amount of great software here, which took
            a lot of hard work by smart people. As usual with open
            source it is rough around the edges. In the end we chose the
            solution which we are most likely to be able to fix
            ourselves. It is also the solution least likely to need
            fixing once implemented.</p>
            <p>Depending on how things progress, I will be back to write
            about using Sparse and Coccinelle. Please send any
            suggestions, praise or insults via the contact details
            below.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Software freelancing list</title>
  <id>https://richiejp.com/freelance-list</id>
  <published>2023-05-16T21:29:06+01:00</published>
  <updated>2023-07-14T13:20:29+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/freelance-list" />
  <summary>A small list of platforms for freelancing as a software
developer</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>There are two equally terrible strategies for doing
            anything significant.</p>
            <ol style="list-style-type: decimal">
            <li>Jump straight in head first (active)</li>
            <li>Wallow in research and talk forever (passive)</li>
            </ol>
            <p>I don’t recall anyone saying to do (2), but it’s often
            said to be worse than (1). In my opinion this is because the
            vast graveyard of people who do (1) and fail are mostly
            silent. Meanwhile the successful 1’ers are noisy and
            visible.</p>
            <p>So what do you do? Passively research what you are about
            to do while actively probing the water. This is where I am
            now. Anyway below is a list of freelance platforms I have
            compiled for my own purposes.</p>
            <p>For the most part I rely on what the companies themselves
            say and cross reference it with information from places like
            Crunchbase, IndieHackers and Reddit.</p>
            <h1 id="general">General</h1>
            <p>Basically they all provide some billing services and
            whatnot. These things aren’t that interesting so I won’t
            cover them.</p>
            <h1 id="gun.io">Gun.io</h1>
            <p>I generally get a good feeling from Gun.io. My perception
            is that they would provide me with more value than simply an
            introduction to clients.</p>
            <ul>
            <li>Senior only</li>
            <li>In depth vetting process, but no tests</li>
            <li>Charge clients a retainer</li>
            <li>Some VC funding</li>
            <li>Some hand holding and matching</li>
            <li>Fast</li>
            </ul>
            <div class="message is-info">
            <div class="message-body">
            <p>Update: Also web developer focused. However after
            speaking to them I think my feeling was correct.</p>
            </div>
            </div>
            <h1 id="lemon.io">Lemon.io</h1>
            <p>I also get a good feeling from these guys. Although they
            don’t seem to be into the crusty C code market.</p>
            <ul>
            <li>Middle and senior upwards</li>
            <li>Web developer focused</li>
            <li>Accept teams</li>
            <li>Targeted at startups (i.e. lower rates)</li>
            <li>0% commission (I didn’t get to the bottom of how they
            earn)</li>
            <li>Only seed funding</li>
            </ul>
            <h1 id="toptal">Toptal</h1>
            <p>Definitely the gorilla in the room throwing its weight
            around. According to third-party sources they add a margin
            on top of what the freelancer charges. They don’t say
            anywhere how much and I tend to think this is because it is
            variable with a high lower bound. I also think though that
            if you need them then you need them.</p>
            <ul>
            <li>They have the biggest brand</li>
            <li>Fast</li>
            <li>Opaque</li>
            <li>Make you do a Codility test and fake project</li>
            <li>Only seed funding</li>
            <li>They match you with clients</li>
            </ul>
            <h1 id="upwork">Upwork</h1>
            <p>It is what it is. My brother-in-law says that you have to
            put a lot of work in to get work, but it is a good platform
            if you put in the time.</p>
            <ul>
            <li>Free-for-all</li>
            <li>Significant upfront effort</li>
            <li>Freelancer reviews</li>
            <li>Low commission</li>
            <li>Lots of jobs to bid on</li>
            <li>Bidding on jobs requires tokens</li>
            <li>Plus membership available</li>
            </ul>
            <p>I initially did not realise they have a token system. It
            costs money to buy tokens, but you get some free. This makes
            spamming costly and from having hired someone on one of
            these sites before, I can tell you this is a good thing.</p>
            <h1 id="contra">Contra</h1>
            <p>Kind of an odd one, they charge for fancy portfolios not
            commission. How long this will last is debatable as they
            will need a lot of portfolio customers to pay off their
            funding (assuming such things matter again).</p>
            <ul>
            <li>0% commission</li>
            <li>Free-for-all</li>
            <li>Significant VC funding</li>
            <li>More design focused perhaps</li>
            </ul>
            <h1 id="codementor.io">Codementor.io</h1>
            <ul>
            <li>On demand code review and pair programming</li>
            <li>Also freelance projects</li>
            </ul>
            <h1 id="arc">Arc</h1>
            <ul>
            <li>Same company as Codementor, but more vetting(?)</li>
            <li>Remote jobs and freelance</li>
            </ul>
            <h1 id="braintrust">Braintrust</h1>
            <p>OK, this is where I start to lose it. Basically all of
            the companies that do more than provide a platform do some
            obfuscation. Most likely because not all clients and
            talent are created equal. Publishing a flat fee would not be
            wise if some engagements can and should have a higher fee
            than others.</p>
            <p>Braintrust look a bit like the other vetted platforms.
            What’s really interesting though is that they are a
            non-profit which took $23M in VC funding and then did a
            $123M initial coin offering.</p>
            <p>They charge clients a fee on top of the amount they pay
            to the freelancer. So they then say that the freelancer gets
            to keep 100% of what they make. This of course assumes that
            the freelancer would charge the same on platform and off
            it.</p>
            <p>It gets better, because apparently this fee goes towards
            buying back their token. Not to their employees or investors
            who appear to be substantial. If you are familiar with
            blockchain shenanigans then I’m sure you get the point. If
            you are not, then perhaps you don’t and that is the point of
            blockchain.</p>
            <ul>
            <li>Vetting and you can earn tokens for vetting new
            candidates</li>
            <li>Blockchain</li>
            <li>Small fee</li>
            <li>Blockchain</li>
            <li>You can earn tokens for referrals</li>
            <li>Blockchain</li>
            <li>They did work for NASA; it’s going to the Moooon!</li>
            </ul>
            <h1 id="richiejp.com">richiejp.com</h1>
            <p>Unlikely to produce work for now, but it is nearly a
            free option. I’m going to do the work regardless of whether
            I freelance, create another SaaS or work on an Open Source
            project.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Generating type specific deserialisers for BSON.jl</title>
  <id>https://richiejp.com/generating-type-specific-deserialisers-for-bson</id>
  <published>2020-06-30T16:49:58+01:00</published>
  <updated>2020-06-30T16:49:58+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/generating-type-specific-deserialisers-for-bson" />
  <summary></summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p><em>This was originally posted <a
            href="https://discourse.julialang.org/t/generating-type-specific-deserialisers-for-bson-jl/25720">here</a></em></p>
            <p>I have spent quite some time trying to optimise the
            excellent <a
            href="https://github.com/JuliaIO/BSON.jl">BSON.jl</a>
            library as we make heavy use of it. The result of this so
            far is some updates to the original library and a fork
            called <a
            href="https://github.com/richiejp/BSONqs.jl">BSONqs</a>
            which has more radical changes (more on that at the
            bottom).</p>
            <p>I like working with Julia’s structs and organise most of
            my data into structs. Even if there is no need to do so for
            performance reasons and it introduces some rigidity, I still
            prefer it over working with generic data structures like
            Dicts or tables. I have a bad memory so I will forget what
            properties the data has if it is not written down
            somewhere.</p>
            <p>After trying a few different methods which will serialise
            arbitrary Julia structs in a reliable way I found that
            BSON.jl worked the best for what I am doing. I won’t go into
            any more details on that because at the time many libraries
            were simply broken in Julia 1.0 or didn’t exist.</p>
            <p>We store several GBs of very boring Linux test results in
            Redis and load large chunks of it into RAM. This can take up
            to 30s and a lot of that time is spent by BSON.jl. So I
            decided to try optimising its read performance as much as
            possible.</p>
            <p>Like most (de)serialisers BSON.jl first takes the raw
            BSON document and parses that into an intermediate format,
            which originally was a set of nested Dicts. The dictionaries
            mirror the raw BSON very closely. Below you can see how that
            looks.</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode julia"><code class="sourceCode julia"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a>julia<span class="op">&gt;</span> <span class="kw">struct</span> Foo</span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a>       bar<span class="op">::</span><span class="dt">Set{String}</span></span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a>       <span class="kw">end</span></span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a></span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a>julia<span class="op">&gt;</span> io <span class="op">=</span> <span class="fu">IOBuffer</span>();</span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a></span>
<span id="cb1-7"><a href="#cb1-7" tabindex="-1"></a>julia<span class="op">&gt;</span> BSON.<span class="fu">bson</span>(io, <span class="fu">Dict</span>(<span class="op">:</span>foo <span class="op">=&gt;</span> <span class="fu">Foo</span>(<span class="fu">Set</span>([<span class="st">&quot;baz&quot;</span>]))))</span>
<span id="cb1-8"><a href="#cb1-8" tabindex="-1"></a></span>
<span id="cb1-9"><a href="#cb1-9" tabindex="-1"></a>julia<span class="op">&gt;</span> <span class="fu">seek</span>(io, <span class="fl">0</span>);</span>
<span id="cb1-10"><a href="#cb1-10" tabindex="-1"></a></span>
<span id="cb1-11"><a href="#cb1-11" tabindex="-1"></a>julia<span class="op">&gt;</span> BSON.<span class="fu">parse</span>(io)</span>
<span id="cb1-12"><a href="#cb1-12" tabindex="-1"></a><span class="dt">Dict</span>{<span class="dt">Symbol</span>,<span class="dt">Any</span>} with <span class="fl">1</span> entry<span class="op">:</span></span>
<span id="cb1-13"><a href="#cb1-13" tabindex="-1"></a>  <span class="op">:</span>foo <span class="op">=&gt;</span> <span class="fu">Dict</span><span class="dt">{Symbol,Any}</span>(<span class="op">:</span>tag<span class="op">=&gt;</span><span class="st">&quot;struct&quot;</span>,<span class="op">:</span><span class="kw">type</span><span class="op">=&gt;</span><span class="fu">Dict</span><span class="dt">{Symbol,Any}</span>(<span class="op">:</span>tag<span class="op">=&gt;</span><span class="st">&quot;datatype&quot;</span>,<span class="op">:</span>params<span class="op">=&gt;</span><span class="dt">Any</span>[],<span class="op">:</span>name<span class="op">=&gt;</span><span class="dt">Any</span>[<span class="st">&quot;Main&quot;</span>, <span class="st">&quot;Foo&quot;</span>]),<span class="op">:</span>data<span class="op">=…</span></span>
<span id="cb1-14"><a href="#cb1-14" tabindex="-1"></a></span>
<span id="cb1-15"><a href="#cb1-15" tabindex="-1"></a>julia<span class="op">&gt;</span> <span class="kw">ans</span>[<span class="op">:</span>foo]</span>
<span id="cb1-16"><a href="#cb1-16" tabindex="-1"></a><span class="dt">Dict</span>{<span class="dt">Symbol</span>,<span class="dt">Any</span>} with <span class="fl">3</span> entries<span class="op">:</span></span>
<span id="cb1-17"><a href="#cb1-17" tabindex="-1"></a>  <span class="op">:</span>tag  <span class="op">=&gt;</span> <span class="st">&quot;struct&quot;</span></span>
<span id="cb1-18"><a href="#cb1-18" tabindex="-1"></a>  <span class="op">:</span><span class="kw">type</span> <span class="op">=&gt;</span> <span class="fu">Dict</span><span class="dt">{Symbol,Any}</span>(<span class="op">:</span>tag<span class="op">=&gt;</span><span class="st">&quot;datatype&quot;</span>,<span class="op">:</span>params<span class="op">=&gt;</span><span class="dt">Any</span>[],<span class="op">:</span>name<span class="op">=&gt;</span><span class="dt">Any</span>[<span class="st">&quot;Main&quot;</span>, <span class="st">&quot;Foo&quot;</span>])</span>
<span id="cb1-19"><a href="#cb1-19" tabindex="-1"></a>  <span class="op">:</span>data <span class="op">=&gt;</span> <span class="dt">Any</span>[<span class="fu">Dict</span><span class="dt">{Symbol,Any}</span>(<span class="op">:</span>tag<span class="op">=&gt;</span><span class="st">&quot;struct&quot;</span>,<span class="op">:</span>type<span class="op">=&gt;</span><span class="fu">Dict</span><span class="dt">{Symbol,Any}</span>(<span class="op">:</span>tag<span class="op">=&gt;</span><span class="st">&quot;datatype&quot;</span>,<span class="op">:</span>params<span class="op">=&gt;</span><span class="dt">Any</span>[<span class="fu">Dict</span><span class="dt">{Symbol,Any}</span>(<span class="op">:</span>tag<span class="op">=&gt;</span><span class="st">&quot;dataty…</span></span></code></pre></div>
            <p>After doing a few basic optimisations, the profiler
            showed that a lot of time was being spent simply allocating
            dictionaries for the intermediate format. From the above you
            can see the dictionaries have very few members and these are
            always the same for serialised Julia data structures.</p>
            <p>So instead of using <code>Dicts</code> I defined some
            structs for use in the intermediate layer.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode julia"><code class="sourceCode julia"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="st">&quot;&quot;&quot;Type class which represents a tagged dictionary</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="st">Tagged dictionaries are used to represent complex Julia types. Using a struct</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="st">instead of an actual Dictionary requires less memory allocation and allows us</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a><span class="st">to use multiple dispatch on the resulting tree structure.</span></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a><span class="st">It inherits abstract dict just for show.&quot;&quot;&quot;</span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a><span class="kw">abstract type</span> Tagged <span class="op">&lt;:</span><span class="dt"> AbstractDict{Symbol, Any} </span><span class="kw">end</span></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a></span>
<span id="cb2-10"><a href="#cb2-10" tabindex="-1"></a><span class="st">&quot;Type class for types which can occupy the &#39;type&#39; field in a struct&quot;</span></span>
<span id="cb2-11"><a href="#cb2-11" tabindex="-1"></a><span class="kw">abstract type</span> TaggedStructType <span class="op">&lt;:</span><span class="dt"> Tagged </span><span class="kw">end</span></span>
<span id="cb2-12"><a href="#cb2-12" tabindex="-1"></a></span>
<span id="cb2-13"><a href="#cb2-13" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb2-14"><a href="#cb2-14" tabindex="-1"></a></span>
<span id="cb2-15"><a href="#cb2-15" tabindex="-1"></a><span class="kw">mutable struct</span> TaggedType <span class="op">&lt;:</span><span class="dt"> TaggedStructType</span></span>
<span id="cb2-16"><a href="#cb2-16" tabindex="-1"></a>  name<span class="op">::</span><span class="dt">Vector{String}</span></span>
<span id="cb2-17"><a href="#cb2-17" tabindex="-1"></a>  params<span class="op">::</span><span class="dt">Vector{TaggedParam}</span></span>
<span id="cb2-18"><a href="#cb2-18" tabindex="-1"></a><span class="kw">end</span></span>
<span id="cb2-19"><a href="#cb2-19" tabindex="-1"></a></span>
<span id="cb2-20"><a href="#cb2-20" tabindex="-1"></a><span class="kw">mutable struct</span> TaggedStruct <span class="op">&lt;:</span><span class="dt"> TaggedStructType</span></span>
<span id="cb2-21"><a href="#cb2-21" tabindex="-1"></a>  ttype<span class="op">::</span><span class="dt">TaggedStructType</span></span>
<span id="cb2-22"><a href="#cb2-22" tabindex="-1"></a>  data<span class="op">::</span><span class="dt">BSONArray</span></span>
<span id="cb2-23"><a href="#cb2-23" tabindex="-1"></a><span class="kw">end</span></span></code></pre></div>
            <p>This resulted in an annoying and complex type system
            (maybe necessarily, maybe not), but saved a huge amount of
            memory. Initially the performance was worse, which was
            pretty upsetting, but I figured out this was because I had
            increased the number of type-unstable functions, resulting
            in more dynamic dispatch. After fixing some of that I got a
            100% speedup.</p>
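            <p>As a minimal sketch of what type instability means (the
            functions <code>unstable</code> and <code>stable</code> are
            made up for illustration and are not part of BSON.jl), the
            first function below can return either an <code>Int</code>
            or a <code>Float64</code>, so the compiler cannot settle on
            a single return type for its callers, while the second
            always returns an <code>Int</code>.</p>
            <div class="sourceCode"><pre
            class="sourceCode julia"><code class="sourceCode julia">using Test

# Hypothetical example: the return type depends on a runtime value, so
# inference can only conclude that it is a Union of Int and Float64.
unstable(x::Int) = x &gt; 0 ? x : 0.0

# The return type is always Int, whatever the value of x.
stable(x::Int) = x &gt; 0 ? x : 0

# Base.return_types reports what inference concludes for the given
# argument types.
unstable_types = Base.return_types(unstable, (Int,))
stable_types = Base.return_types(stable, (Int,))

# Test.@inferred throws an error if the actual return type differs from the
# inferred one, which makes it a handy check while hunting instabilities.
@inferred stable(1)</code></pre></div>
            <p>The <code>@code_warntype</code> macro gives a more
            detailed view of the same information for a single
            call.</p>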
            <p>At this point I decided to go in a completely different
            direction and try to cut out the intermediate layer
            altogether. This basically means we now have three BSON
            parsers:</p>
            <ol style="list-style-type: decimal">
            <li>Using Dicts for the intermediate layer (in BSON.jl)</li>
            <li>Using dedicated types for the intermediate layer (in
            BSONqs.jl with load_compat)</li>
            <li>No intermediate layer and type specific parsing (in
            BSONqs.jl with load)</li>
            </ol>
            <p>One could argue that the third parser still has an
            intermediate data structure, but one that is allocated on
            the stack in the form of a finite state machine. The wisdom
            of removing the intermediate layer is debatable as it makes
            a lot of things harder to debug and reason about. The
            intermediate layer may also have better cache efficiency.</p>
            <p>On the other hand, with no intermediate layer we can copy
            data directly from the input stream/buffer to the native
            Julia data structures. In theory we could even make it
            zero-copy if Julia’s memory system allows it. This could be
            extremely useful when parsing massive, contiguous arrays of
            numerical data from a memory-mapped file, as we would not
            even need to read most of the data until it is used in a
            computation.</p>
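            <p>To give a rough idea of the memory-mapped case, here is a
            sketch using the <code>Mmap</code> standard library. It
            assumes a file that contains nothing but raw
            <code>Int64</code> values starting at offset zero, which is
            not how an array is laid out inside a real BSON document, so
            treat it as an illustration of the idea rather than
            something BSONqs.jl does.</p>
            <div class="sourceCode"><pre
            class="sourceCode julia"><code class="sourceCode julia">using Mmap

# Write 1000 Int64 values to a temporary file as raw bytes.
path, io = mktemp()
write(io, collect(Int64, 1:1000))
close(io)

# Map the file as a Vector{Int64}. Nothing is copied into a Julia-owned
# buffer; the operating system loads pages when elements are first accessed.
arr = open(path) do f
    Mmap.mmap(f, Vector{Int64}, 1000)
end

# Only the page holding this element needs to be read.
x = arr[500]</code></pre></div>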
            <p>The third parser is by far the quickest for large sets of
            structs. However, while I have differentiated the parsers by
            the intermediate layer or lack thereof, the third parser is
            also more aware of the resultant Julia type of the data
            being parsed.</p>
            <p>In many cases we know in advance what type the data
            should be that we are parsing. For example take the
            following structure.</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode julia"><code class="sourceCode julia"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="kw">struct</span> Foo</span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a>  bar<span class="op">::</span><span class="dt">Vector{Int}</span></span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a><span class="kw">end</span></span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a></span>
<span id="cb3-5"><a href="#cb3-5" tabindex="-1"></a><span class="fu">bson</span>(io, <span class="fu">Dict</span>(<span class="op">:</span>foo <span class="op">=&gt;</span> <span class="fu">Foo</span>(<span class="fu">rand</span>(<span class="dt">Int</span>, <span class="fl">1000</span>))))</span></code></pre></div>
            <p>When we come to parse the <code>:foo</code> entry, we
            will have to determine its type (which is <code>Foo</code>)
            first. However after that point we can switch to a parser
            specifically generated to parse Foo. For Foo’s only member
            bar, we don’t even need to parse its type because we already
            know what it is from the containing struct’s field
            definition. In theory we can call a generated method which
            does the bare minimum necessary for parsing a
            <code>Vector{Int}</code> and is type stable.</p>
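            <p>A small sketch of that idea follows. The
            <code>read_specific</code> function and the length-prefixed
            layout are hypothetical and much simpler than the real BSON
            array encoding; the point is only that
            <code>fieldtype</code> gives us the field’s declared type,
            which then selects a reader through dispatch.</p>
            <div class="sourceCode"><pre
            class="sourceCode julia"><code class="sourceCode julia">struct Foo
  bar::Vector{Int}
end

# Hypothetical type specific reader: an Int32 length followed by that many
# raw Int values. This is NOT the real BSON representation of an array.
function read_specific(io::IO, ::Type{Vector{Int}})
  n = read(io, Int32)
  out = Vector{Int}(undef, n)
  read!(io, out)
  out
end

# Build some input in the made-up format.
io = IOBuffer()
write(io, Int32(3))
write(io, Int[10, 20, 30])
seekstart(io)

# The declared field type tells us which reader to call, so no type
# information has to be read from the input for this field.
FT = fieldtype(Foo, :bar)
foo = Foo(read_specific(io, FT))</code></pre></div>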
            <p>In practice, this is pretty close to how the third parser
            actually works, although there is definitely room for
            improvement. It achieves this through the use of two
            strategies. The first is multiple dispatch on concrete
            types, which allows the Julia compiler to work its magic,
            creating type-specific code from generic methods.</p>
            <p>In places where that doesn’t work, it uses
            <code>@generated</code> methods, which allow us
            to explicitly generate code based on the types of the method
            arguments. I doubt the below code will make much sense taken
            out of context (or in context for that matter), but
            hopefully it can give you an idea of what this looks
            like.</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode julia"><code class="sourceCode julia"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a><span class="co"># This is called when we know what type &#39;T&#39; we want. There are many</span></span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a><span class="co"># definitions of parse_specific, this one is the most generic and so</span></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a><span class="co"># is called when all the others have failed to match. The assumption here is</span></span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a><span class="co"># that T will be a struct type (if it is concrete) as all other types will</span></span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a><span class="co"># match with a more specific parse_specific method (say what you like about my</span></span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a><span class="co"># naming)</span></span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a><span class="kw">function</span> <span class="fu">parse_specific</span>(io<span class="op">::</span><span class="dt">IO</span>, <span class="op">::</span><span class="dt">Type{T}</span>, tag<span class="op">::</span><span class="dt">BSONType</span>,</span>
<span id="cb4-8"><a href="#cb4-8" tabindex="-1"></a>                        ctx<span class="op">::</span><span class="dt">ParseCtx</span>)<span class="op">::</span><span class="dt">T </span><span class="kw">where</span> T</span>
<span id="cb4-9"><a href="#cb4-9" tabindex="-1"></a></span>
<span id="cb4-10"><a href="#cb4-10" tabindex="-1"></a>  <span class="co"># structs are always represented by BSON documents</span></span>
<span id="cb4-11"><a href="#cb4-11" tabindex="-1"></a>  <span class="co"># note that this runs normally at &#39;runtime&#39; before any generated code or the</span></span>
<span id="cb4-12"><a href="#cb4-12" tabindex="-1"></a>  <span class="co"># fallback</span></span>
<span id="cb4-13"><a href="#cb4-13" tabindex="-1"></a>  <span class="pp">@asserteq</span> tag document</span>
<span id="cb4-14"><a href="#cb4-14" tabindex="-1"></a></span>
<span id="cb4-15"><a href="#cb4-15" tabindex="-1"></a>  <span class="co"># This tells the compiler this is a generated method. The code in the first</span></span>
<span id="cb4-16"><a href="#cb4-16" tabindex="-1"></a>  <span class="co"># branch of this if statement is run at/before &#39;compile time&#39; and returns an</span></span>
<span id="cb4-17"><a href="#cb4-17" tabindex="-1"></a>  <span class="co"># AST (expression). The AST is then compiled, this is the same as for macros.</span></span>
<span id="cb4-18"><a href="#cb4-18" tabindex="-1"></a>  <span class="co"># The difference is that macros have no access to type information.</span></span>
<span id="cb4-19"><a href="#cb4-19" tabindex="-1"></a>  <span class="cf">if</span> <span class="pp">@generated</span></span>
<span id="cb4-20"><a href="#cb4-20" tabindex="-1"></a>    <span class="co"># If it is not a concrete type we fall back to a method which will first load</span></span>
<span id="cb4-21"><a href="#cb4-21" tabindex="-1"></a>    <span class="co"># the embedded type information from the BSON document. This will happen</span></span>
<span id="cb4-22"><a href="#cb4-22" tabindex="-1"></a>    <span class="co"># if (for example) you put abstract or union types in your struct definitions</span></span>
<span id="cb4-23"><a href="#cb4-23" tabindex="-1"></a>    <span class="cf">if</span> !<span class="fu">isconcretetype</span>(T)</span>
<span id="cb4-24"><a href="#cb4-24" tabindex="-1"></a>      <span class="cf">return</span> <span class="op">:</span>(<span class="fu">parse_tag</span>(io, tag, ctx))</span>
<span id="cb4-25"><a href="#cb4-25" tabindex="-1"></a>    <span class="cf">end</span></span>
<span id="cb4-26"><a href="#cb4-26" tabindex="-1"></a></span>
<span id="cb4-27"><a href="#cb4-27" tabindex="-1"></a>    <span class="co"># If it is a concrete struct, this chunk of code looks through the BSON</span></span>
<span id="cb4-28"><a href="#cb4-28" tabindex="-1"></a>    <span class="co"># document for members we are </span></span>
<span id="cb4-29"><a href="#cb4-29" tabindex="-1"></a>    <span class="co"># expecting in a serialised struct or reference to a struct and saves</span></span>
<span id="cb4-30"><a href="#cb4-30" tabindex="-1"></a>    <span class="co"># pointers to what it finds. This can be simplified if we can guarantee</span></span>
<span id="cb4-31"><a href="#cb4-31" tabindex="-1"></a>    <span class="co"># the order the elements will appear in, I am being cautious for now</span></span>
<span id="cb4-32"><a href="#cb4-32" tabindex="-1"></a>    <span class="kw">quote</span></span>
<span id="cb4-33"><a href="#cb4-33" tabindex="-1"></a>      startpos <span class="op">=</span> <span class="fu">position</span>(io)</span>
<span id="cb4-34"><a href="#cb4-34" tabindex="-1"></a>      len <span class="op">=</span> <span class="fu">read</span>(io, <span class="dt">Int32</span>)</span>
<span id="cb4-35"><a href="#cb4-35" tabindex="-1"></a>      <span class="kw">local</span> data<span class="op">::</span><span class="dt">BSONElem</span></span>
<span id="cb4-36"><a href="#cb4-36" tabindex="-1"></a>      ref <span class="op">=</span> <span class="cn">nothing</span></span>
<span id="cb4-37"><a href="#cb4-37" tabindex="-1"></a></span>
<span id="cb4-38"><a href="#cb4-38" tabindex="-1"></a>      <span class="cf">for</span> _ <span class="kw">in</span> <span class="fl">1</span><span class="op">:</span><span class="fl">4</span></span>
<span id="cb4-39"><a href="#cb4-39" tabindex="-1"></a>        <span class="cf">if</span> (tag <span class="op">=</span> <span class="fu">read</span>(io, BSONType)) <span class="op">==</span> eof</span>
<span id="cb4-40"><a href="#cb4-40" tabindex="-1"></a>          <span class="cf">break</span></span>
<span id="cb4-41"><a href="#cb4-41" tabindex="-1"></a>        <span class="cf">end</span></span>
<span id="cb4-42"><a href="#cb4-42" tabindex="-1"></a></span>
<span id="cb4-43"><a href="#cb4-43" tabindex="-1"></a>        k <span class="op">=</span> <span class="fu">parse_cstr_unsafe</span>(io)</span>
<span id="cb4-44"><a href="#cb4-44" tabindex="-1"></a>        <span class="cf">if</span> k <span class="op">==</span> <span class="st">b&quot;tag&quot;</span></span>
<span id="cb4-45"><a href="#cb4-45" tabindex="-1"></a>          <span class="pp">@asserteq</span> tag string</span>
<span id="cb4-46"><a href="#cb4-46" tabindex="-1"></a>        <span class="cf">elseif</span> k <span class="op">==</span> <span class="st">b&quot;ref&quot;</span></span>
<span id="cb4-47"><a href="#cb4-47" tabindex="-1"></a>          ref <span class="op">=</span> <span class="fu">BSONElem</span>(tag, io)</span>
<span id="cb4-48"><a href="#cb4-48" tabindex="-1"></a>        <span class="cf">elseif</span> k <span class="op">==</span> <span class="st">b&quot;type&quot;</span></span>
<span id="cb4-49"><a href="#cb4-49" tabindex="-1"></a>          <span class="pp">@asserteq</span> tag document</span>
<span id="cb4-50"><a href="#cb4-50" tabindex="-1"></a>        <span class="cf">elseif</span> k <span class="op">==</span> <span class="st">b&quot;data&quot;</span></span>
<span id="cb4-51"><a href="#cb4-51" tabindex="-1"></a>          <span class="pp">@asserteq</span> tag array</span>
<span id="cb4-52"><a href="#cb4-52" tabindex="-1"></a>          data <span class="op">=</span> <span class="fu">BSONElem</span>(tag, io)</span>
<span id="cb4-53"><a href="#cb4-53" tabindex="-1"></a>        <span class="cf">end</span></span>
<span id="cb4-54"><a href="#cb4-54" tabindex="-1"></a></span>
<span id="cb4-55"><a href="#cb4-55" tabindex="-1"></a>        <span class="fu">skip_over</span>(io, tag)</span>
<span id="cb4-56"><a href="#cb4-56" tabindex="-1"></a>      <span class="cf">end</span></span>
<span id="cb4-57"><a href="#cb4-57" tabindex="-1"></a></span>
<span id="cb4-58"><a href="#cb4-58" tabindex="-1"></a>      endpos <span class="op">=</span> <span class="fu">position</span>(io)</span>
<span id="cb4-59"><a href="#cb4-59" tabindex="-1"></a>      <span class="pp">@asserteq</span> (startpos <span class="op">+</span> len) endpos</span>
<span id="cb4-60"><a href="#cb4-60" tabindex="-1"></a></span>
<span id="cb4-61"><a href="#cb4-61" tabindex="-1"></a>      <span class="co"># Objects which appear in the data more than once are lowered into a special</span></span>
<span id="cb4-62"><a href="#cb4-62" tabindex="-1"></a>      <span class="co"># array and their instances replaced with indexes/references into that</span></span>
<span id="cb4-63"><a href="#cb4-63" tabindex="-1"></a>      <span class="co"># array before serialisation to BSON.</span></span>
<span id="cb4-64"><a href="#cb4-64" tabindex="-1"></a>      <span class="cf">if</span> ref <span class="op">≠</span> <span class="cn">nothing</span></span>
<span id="cb4-65"><a href="#cb4-65" tabindex="-1"></a>        <span class="co"># Follow the reference into the array to get/parse the actual struct</span></span>
<span id="cb4-66"><a href="#cb4-66" tabindex="-1"></a>        ret <span class="op">=</span> <span class="fu">parse_specific_ref</span>(io, <span class="op">$</span>T, ref, ctx)<span class="op">::</span><span class="dt">$T</span></span>
<span id="cb4-67"><a href="#cb4-67" tabindex="-1"></a>        <span class="fu">seek</span>(io, endpos)</span>
<span id="cb4-68"><a href="#cb4-68" tabindex="-1"></a>        <span class="cf">return</span> ret</span>
<span id="cb4-69"><a href="#cb4-69" tabindex="-1"></a>      <span class="cf">end</span></span>
<span id="cb4-70"><a href="#cb4-70" tabindex="-1"></a></span>
<span id="cb4-71"><a href="#cb4-71" tabindex="-1"></a>      <span class="fu">seek</span>(io, data.pos)</span>
<span id="cb4-72"><a href="#cb4-72" tabindex="-1"></a>      <span class="co"># This is where most the work is done to reconstitute the struct T</span></span>
<span id="cb4-73"><a href="#cb4-73" tabindex="-1"></a>      ret <span class="op">=</span> <span class="fu">load_struct</span>(io, <span class="op">$</span>T, data.tag, ctx)</span>
<span id="cb4-74"><a href="#cb4-74" tabindex="-1"></a>      <span class="fu">seek</span>(io, endpos)</span>
<span id="cb4-75"><a href="#cb4-75" tabindex="-1"></a>      ret</span>
<span id="cb4-76"><a href="#cb4-76" tabindex="-1"></a>    <span class="cf">end</span></span>
<span id="cb4-77"><a href="#cb4-77" tabindex="-1"></a>  else</span>
<span id="cb4-78"><a href="#cb4-78" tabindex="-1"></a>    <span class="co"># Fallback code to use if the compiler decides not to use the</span></span>
<span id="cb4-79"><a href="#cb4-79" tabindex="-1"></a>    <span class="co"># generated version</span></span>
<span id="cb4-80"><a href="#cb4-80" tabindex="-1"></a>    <span class="fu">parse_tag</span>(io<span class="op">::</span><span class="dt">IO</span>, tag, ctx)<span class="op">::</span><span class="dt">T</span></span>
<span id="cb4-81"><a href="#cb4-81" tabindex="-1"></a>  <span class="kw">end</span></span>
<span id="cb4-82"><a href="#cb4-82" tabindex="-1"></a><span class="kw">end</span></span>
<span id="cb4-83"><a href="#cb4-83" tabindex="-1"></a></span>
<span id="cb4-84"><a href="#cb4-84" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb4-85"><a href="#cb4-85" tabindex="-1"></a></span>
<span id="cb4-86"><a href="#cb4-86" tabindex="-1"></a><span class="kw">function</span> <span class="fu">load_struct</span>(io<span class="op">::</span><span class="dt">IO</span>, <span class="op">::</span><span class="dt">Type{T}</span>, dtag<span class="op">::</span><span class="dt">BSONType</span>, ctx<span class="op">::</span><span class="dt">ParseCtx</span>)<span class="op">::</span><span class="dt">T </span><span class="kw">where</span> T</span>
<span id="cb4-87"><a href="#cb4-87" tabindex="-1"></a>  <span class="cf">if</span> <span class="pp">@generated</span></span>
<span id="cb4-88"><a href="#cb4-88" tabindex="-1"></a></span>
<span id="cb4-89"><a href="#cb4-89" tabindex="-1"></a>    <span class="co"># This time we have 4 alternative code blocks which may be produced</span></span>
<span id="cb4-90"><a href="#cb4-90" tabindex="-1"></a>    <span class="co"># depending on the type of struct. Firstly we have some code to build C</span></span>
<span id="cb4-91"><a href="#cb4-91" tabindex="-1"></a>    <span class="co"># style structs with a straightforward memory layout. These structs</span></span>
<span id="cb4-92"><a href="#cb4-92" tabindex="-1"></a>    <span class="co"># only contain bits types themselves (like Int, Float, ComplexF64).</span></span>
<span id="cb4-93"><a href="#cb4-93" tabindex="-1"></a>    <span class="cf">if</span> <span class="fu">isprimitive</span>(T)</span>
<span id="cb4-94"><a href="#cb4-94" tabindex="-1"></a>      <span class="pp">@assert</span> <span class="fu">isbitstype</span>(T)</span>
<span id="cb4-95"><a href="#cb4-95" tabindex="-1"></a>      <span class="kw">quote</span></span>
<span id="cb4-96"><a href="#cb4-96" tabindex="-1"></a>        <span class="pp">@asserteq</span> dtag binary</span>
<span id="cb4-97"><a href="#cb4-97" tabindex="-1"></a>        bits <span class="op">=</span> <span class="fu">parse_bin_unsafe</span>(io, ctx)</span>
<span id="cb4-98"><a href="#cb4-98" tabindex="-1"></a>        <span class="co"># This calls a C function in the Julia runtime which copies &#39;bits&#39;</span></span>
<span id="cb4-99"><a href="#cb4-99" tabindex="-1"></a>        <span class="co"># directly into a new Julia struct.</span></span>
<span id="cb4-100"><a href="#cb4-100" tabindex="-1"></a>        <span class="fu">ccall</span>(<span class="op">:</span>jl_new_bits, <span class="dt">Any</span>, (<span class="dt">Any</span>, <span class="dt">Ptr</span>{Cvoid}), <span class="op">$</span>T, bits)</span>
<span id="cb4-101"><a href="#cb4-101" tabindex="-1"></a>      <span class="cf">end</span></span>
<span id="cb4-102"><a href="#cb4-102" tabindex="-1"></a>    <span class="cf">elseif</span> T <span class="op">&lt;:</span><span class="dt"> Dict</span></span>
<span id="cb4-103"><a href="#cb4-103" tabindex="-1"></a>      <span class="co"># dictionaries have their own special representation in BSON</span></span>
<span id="cb4-104"><a href="#cb4-104" tabindex="-1"></a>      <span class="op">:</span>(<span class="fu">load_dict!</span>(io, <span class="op">$</span><span class="fu">T</span>(), ctx))</span>
<span id="cb4-105"><a href="#cb4-105" tabindex="-1"></a>    <span class="cf">elseif</span> <span class="fu">fieldcount</span>(T) <span class="op">&lt;</span> <span class="fl">1</span></span>
<span id="cb4-106"><a href="#cb4-106" tabindex="-1"></a>      <span class="op">:</span>(<span class="op">$</span><span class="fu">T</span>())</span>
<span id="cb4-107"><a href="#cb4-107" tabindex="-1"></a>    <span class="cf">else</span></span>
<span id="cb4-108"><a href="#cb4-108" tabindex="-1"></a>      <span class="co"># Now we have to deal with the type of struct which has an</span></span>
<span id="cb4-109"><a href="#cb4-109" tabindex="-1"></a>      <span class="co"># unknown memory layout and we have to build one field at a time.</span></span>
<span id="cb4-110"><a href="#cb4-110" tabindex="-1"></a>      n <span class="op">=</span> <span class="fu">fieldcount</span>(T)</span>
<span id="cb4-111"><a href="#cb4-111" tabindex="-1"></a>      <span class="pp">@assert</span> n <span class="op">&gt;</span> <span class="fl">0</span></span>
<span id="cb4-112"><a href="#cb4-112" tabindex="-1"></a>      FT <span class="op">=</span> <span class="fu">fieldtype</span>(T, <span class="fl">1</span>)</span>
<span id="cb4-113"><a href="#cb4-113" tabindex="-1"></a></span>
<span id="cb4-114"><a href="#cb4-114" tabindex="-1"></a>      block <span class="op">=</span> <span class="op">:</span>(x <span class="op">=</span> <span class="fu">ccall</span>(<span class="op">:</span>jl_new_struct_uninit, <span class="dt">Any</span>, (<span class="dt">Any</span>,), <span class="op">$</span>T);</span>
<span id="cb4-115"><a href="#cb4-115" tabindex="-1"></a>                <span class="fu">setref</span>(x, ctx);</span>
<span id="cb4-116"><a href="#cb4-116" tabindex="-1"></a>                <span class="fu">parse_array_len</span>(io, ctx);</span>
<span id="cb4-117"><a href="#cb4-117" tabindex="-1"></a>                tag <span class="op">=</span> <span class="fu">parse_array_tag</span>(io, ctx);</span>
<span id="cb4-118"><a href="#cb4-118" tabindex="-1"></a>                f <span class="op">=</span> <span class="fu">parse_specific</span>(io, <span class="op">$</span>FT, tag, ctx)<span class="op">::</span><span class="dt">$FT</span>;</span>
<span id="cb4-119"><a href="#cb4-119" tabindex="-1"></a>                <span class="fu">ccall</span>(<span class="op">:</span>jl_set_nth_field, <span class="dt">Nothing</span>, (<span class="dt">Any</span>, <span class="dt">Csize_t</span>, <span class="dt">Any</span>), x, <span class="fl">0</span>, f))</span>
<span id="cb4-120"><a href="#cb4-120" tabindex="-1"></a>      <span class="cf">for</span> i <span class="kw">in</span> <span class="fl">2</span><span class="op">:</span>n</span>
<span id="cb4-121"><a href="#cb4-121" tabindex="-1"></a>        <span class="co"># this loop is at compile time, we concatenate the previous contents</span></span>
<span id="cb4-122"><a href="#cb4-122" tabindex="-1"></a>        <span class="co"># of &#39;block&#39; with the current field expression. There will be no loop in the</span></span>
<span id="cb4-123"><a href="#cb4-123" tabindex="-1"></a>        <span class="co"># final code just a list of type specific calls to parse and set field</span></span>
<span id="cb4-124"><a href="#cb4-124" tabindex="-1"></a>        </span>
<span id="cb4-125"><a href="#cb4-125" tabindex="-1"></a>        FT <span class="op">=</span> <span class="fu">fieldtype</span>(T, i)</span>
<span id="cb4-126"><a href="#cb4-126" tabindex="-1"></a>        block <span class="op">=</span> <span class="op">:</span>(<span class="op">$</span>block;</span>
<span id="cb4-127"><a href="#cb4-127" tabindex="-1"></a>                  tag <span class="op">=</span> <span class="fu">parse_array_tag</span>(io, ctx);</span>
<span id="cb4-128"><a href="#cb4-128" tabindex="-1"></a>                  f <span class="op">=</span> <span class="fu">parse_specific</span>(io, <span class="op">$</span>FT, tag, ctx)<span class="op">::</span><span class="dt">$FT</span>;</span>
<span id="cb4-129"><a href="#cb4-129" tabindex="-1"></a>                  <span class="fu">ccall</span>(<span class="op">:</span>jl_set_nth_field, <span class="dt">Nothing</span>, (<span class="dt">Any</span>, <span class="dt">Csize_t</span>, <span class="dt">Any</span>), x, <span class="op">$</span>i<span class="op">-</span><span class="fl">1</span>, f))</span>
<span id="cb4-130"><a href="#cb4-130" tabindex="-1"></a>      <span class="cf">end</span></span>
<span id="cb4-131"><a href="#cb4-131" tabindex="-1"></a></span>
<span id="cb4-132"><a href="#cb4-132" tabindex="-1"></a>      <span class="op">:</span>(<span class="op">$</span>block; x)</span>
<span id="cb4-133"><a href="#cb4-133" tabindex="-1"></a>    <span class="cf">end</span></span>
<span id="cb4-134"><a href="#cb4-134" tabindex="-1"></a>  else</span>
<span id="cb4-135"><a href="#cb4-135" tabindex="-1"></a>    <span class="co"># Fallback code omitted, it looks similar to the above but without expressions</span></span>
<span id="cb4-136"><a href="#cb4-136" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb4-137"><a href="#cb4-137" tabindex="-1"></a>  <span class="kw">end</span></span>
<span id="cb4-138"><a href="#cb4-138" tabindex="-1"></a><span class="kw">end</span></span></code></pre></div>
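            <p>Stripped of the BSON details, the same technique can be
            shown in a few lines. This is a sketch only:
            <code>Point</code> and <code>read_fields</code> are invented
            for the example, and the input is just the raw bits of each
            field in declaration order. The comprehension runs when the
            method is generated for a given <code>T</code>, so the code
            that finally runs contains one type specific
            <code>read</code> call per field and no loop.</p>
            <div class="sourceCode"><pre
            class="sourceCode julia"><code class="sourceCode julia">struct Point
  x::Int
  y::Float64
end

# Hypothetical helper: read every field of T from io in declaration order.
# Inside the generator T is a type, so fieldcount and fieldtype can be used
# to build the expression that becomes the method body.
@generated function read_fields(io::IO, ::Type{T}) where T
  reads = [:(read(io, $(fieldtype(T, i)))) for i in 1:fieldcount(T)]
  # For Point this returns the expression
  # Point(read(io, Int), read(io, Float64))
  :($T($(reads...)))
end

io = IOBuffer()
write(io, 7)
write(io, 2.5)
seekstart(io)

p = read_fields(io, Point)</code></pre></div>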
            <p>This results in a ~3.5x performance improvement in my
            benchmarks. Probably in the best case, with only concrete
            types, it would be ~4x. However, this is on the second run
            with a large data set. On the first run, with any data set
            which is not huge, the compilation time can easily come to
            dominate, and the compilation time for the third parser is
            much longer. This is probably because it must generate far
            more type-specific code.</p>
            <p>I am not sure exactly what to do next with this, if
            anything, although it clearly would benefit from some
            cleanup, but who has time for that? I have forked the
            package because I have practically rewritten most of the
            library and changed its behavior (although it is still
            mostly compatible). Definitely some of these improvements
            can be added to the original (with effort); for others it is
            not so clear whether they are wanted, as no one has shown an
            interest in it.</p>
            <p>There is nothing strictly binding me to this particular
            BSON format either and there is probably something
            intrinsically more efficient. For a start a lot of type
            information could be omitted. On the other hand, I’d like to
            keep up the pretense of being compatible with something that
            has an existing user base.</p>
    </div>
  </content>
</entry>
<entry>
  <title>How to 10x the performance of most software</title>
  <id>https://richiejp.com/how-to-10x-most-software</id>
  <published>2024-11-12T22:40:43Z</published>
  <updated>2024-12-18T17:04:45Z</updated>
  <link rel="alternate" href="https://richiejp.com/how-to-10x-most-software" />
  <summary>By focusing on the code performance optimizations that you
can see with simple tools and methodology</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>I have a very large and excellent book called Systems
            Performance written by Brendan Gregg, who, among other
            things, made Netflix faster. Opening it to a random page I
            see an overview of what appears to be the TCP stack in the
            Linux kernel.</p>
            <p>Near the beginning of the book there is a chapter
            describing various methodologies for performance tuning,
            starting with what Brendan calls <em>anti-methods</em>.</p>
            <p>One method is the random-change anti-method, that is,
            looking at the code, trying stuff and manually running the
            program to see if the issue has gone away. Another is the
            ad hoc checklist method, which is not referred to as an
            anti-method, but is listed directly after them.</p>
            <p>What I’m about to suggest is somewhat of a combination of
            these two methods and perhaps the slightest bit of more
            intelligent analysis. It’s essentially what most engineers
            who achieve a large performance gain do and it works because
            of the absolute state of software.</p>
            <p>The natural dynamics of the market mean that it is
            usually safer to quickly write slow software than to slowly
            write fast software. Not always of course because sometimes
            the main selling point of an application or component is
            speed, but even then this is relative to what came
            before.</p>
            <p>Note that this method does benefit from familiarity with
            the project in question and the components involved. The
            more unknowns there are the more a systematic approach will
            be superior to this.</p>
            <h1 id="x-is-easier-than-2x">10x is easier than 2x</h1>
            <p>There is a business book I haven’t read called “10x Is
            Easier than 2x”. The idea being that it is easier to aim
            for a 10x increase in revenue than a 2x.</p>
            <p>When optimizing software there is some truth to the idea
            that focusing on ideas that will, in a single blow, increase
            performance by 10x, is easier than trying to find multiple
            1-2x performance increases and compounding them.</p>
            <p>Due to things such as <a
            href="https://bsky.app/profile/andersonc0d3.bsky.social/post/3lc2tqap7as2r">measurement
            bias</a>, noise and the general difficulty of setting up a
            benchmark environment that can accurately detect small
            increases in performance, if there is a radical thing you
            can do to increase performance by an order of magnitude or
            more, it could be easier to do that.</p>
            <p>A single sample can be statistically significant if the
            difference is large enough relative to the variance. Meaning
            that if a process that previously took hours is reduced to
            seconds, then you can be fairly sure your performance
            optimization worked without an advanced benchmarking
            setup.</p>
            <p>You of course need to test for functionality regressions
            and edge cases which result in a timeout, but you don’t need
            to worry about whether a speed up is due to an environmental
            change or noise.</p>
            <p>What’s more the changes required are likely to be radical
            to the extent that pure logic or theory can be used to
            predict what the relative performance outcome will be. You
            also won’t have as many options to choose from and will be
            forced to focus on things that are high impact.</p>
            <p>Meanwhile a radical change isn’t necessarily difficult.
            To some extent this is just a filtering exercise; you look
            for the low hanging fruit that’s especially juicy, ignoring
            stuff that is too difficult or low impact.</p>
            <p>Logically it is not easier to 10x than to 2x because
            there are multiple 2x’s in a 10x. The point, I guess, is to
            discard tasks that won’t result in a big improvement because
            there are better things to be doing.</p>
            <p>So something to ask yourself before trying to improve
            performance is: will a 10x improvement in performance result
            in a similar sized cost saving or improvement in sales? The
            latter is often very hard to measure, but you can pick some
            other metric as a proxy.</p>
            <h1 id="code-optimization-method">Code optimization
            method</h1>
            <ol style="list-style-type: decimal">
            <li>Isolate the slow feature with a minimum of effort</li>
            </ol>
            <p>This could be done with profiling or tracing tools, but
            also just using the program and seeing what is visibly slow,
            uses up system memory, spends credits on AWS or causes the
            CPU fan to spin up.</p>
            <ol start="2" style="list-style-type: decimal">
            <li>Look at the code or a code trace</li>
            </ol>
            <p>Open up the source code and look for slow things (see
            below) especially anything that happens in a loop. Stepping
            through the code using debugging tools or a profiler can
            help, but it is also good to read the code and think.</p>
            <ol start="3" style="list-style-type: decimal">
            <li>Rewrite the code to do less work</li>
            </ol>
            <p>Can the code be rewritten to not do something or only do
            it once?</p>
            <ol start="4" style="list-style-type: decimal">
            <li>Try the slow feature again or run a simple
            benchmark</li>
            </ol>
            <p>Does it now feel nice when using the program? Do a few
            runs of the benchmark show a radical improvement?</p>
            <p>Because we are looking for very large changes we don’t
            need advanced benchmarking tools for performance, however
            you may need to check that performance doesn’t violently
            degrade with certain inputs.</p>
            <h1 id="things-to-look-for-when-optimizing-code">Things to
            look for when optimizing code</h1>
            <p>None of these things are necessarily slow and there is
            overlap between them. They are just places to start looking
            assuming it’s not so obvious that you can guess.</p>
            <ol style="list-style-type: decimal">
            <li>Loops and recursion (i.e. the algorithm)</li>
            </ol>
            <p>Unless a loop only ever iterates a small number of times
            then it should come under some suspicion. If the number of
            times a loop happens depends on the size of the input and
            the loop count grows <em>faster</em> than the input size
            grows then it should come under maximum suspicion.</p>
            <p>Operations that would usually be considered fast suddenly
            become slow and dangerous if they happen inside a loop.</p>
            <p>Using the wrong data structure or algorithm can result in
            unnecessary looping. For example, repeatedly scanning an
            unsorted list or array from start to finish, instead of
            sorting it once and then using a faster search, such as
            binary search, that sorted data enables.</p>
            <p>Searching an array instead of using a hash map is another
            classic example. Often though doing the search is not the
            problem, doing it repeatedly within another loop is.</p>
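            <p>A minimal sketch of the search-inside-a-loop problem, in
            Python. The function names and data sizes are made up for
            illustration. Both functions return the same count, but the
            first scans a list on every iteration while the second
            builds a hash set once. A single timed run is enough to see
            the difference.</p>
<pre class="sourceCode python">
```python
import time


def count_known_slow(items, known):
    # `known` is a list, so every `in` test scans it from the start.
    return sum(1 for item in items if item in known)


def count_known_fast(items, known):
    # Build the hash set once, outside the loop; each lookup is then
    # constant time on average.
    known_set = set(known)
    return sum(1 for item in items if item in known_set)


def time_once(fn, *args):
    # One run is enough when the difference we are looking for is large.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    known = list(range(0, 10_000, 2))
    items = list(range(2_000))
    slow, slow_s = time_once(count_known_slow, items, known)
    fast, fast_s = time_once(count_known_fast, items, known)
    assert slow == fast
    print(f"list: {slow_s:.4f}s, set: {fast_s:.4f}s")
```
</pre>
            <p>The set version spends extra memory and one pass over
            the data up front, so it only pays off when the lookup
            really is repeated.</p>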
            <p>Sometimes introducing more loops is faster than caching
            the results or using a fancy data structure or algorithm.
            Nevertheless when you see a loop at the top level that could
            iterate many times this is a place to start looking.</p>
            <ol start="2" style="list-style-type: decimal">
            <li>I/O</li>
            </ol>
            <p>In particular database queries and creating files.
            Databases and file systems are typically very very fast, but
            if you, for example, repeatedly create a temporary file,
            copy data into it and copy it back out again before deleting
            it, then you could be thrashing your system.</p>
            <p>A problem I’ve often seen with databases is not really
            with the database, but with ORMs (object-relational mappers)
            or even just the database connection library. In fact it’s
            not even with the ORM, it’s because the ORM is being used to
            extract an entire database table into memory.</p>
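            <p>To make that concrete, here is a rough sketch using
            Python’s built-in <code>sqlite3</code> module rather than
            any particular ORM. The <code>orders</code> table and the
            function names are hypothetical. The slow version pulls
            every row into memory and filters in the application; the
            fast version lets the database filter and aggregate so only
            one row comes back.</p>
<pre class="sourceCode python">
```python
import sqlite3


def make_db(rows):
    # Hypothetical schema, used only for this illustration.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    db.executemany("INSERT INTO orders (customer, total) VALUES (?, ?)", rows)
    return db


def total_for_customer_slow(db, customer):
    # Pulls every row into memory, then filters and sums in the application.
    rows = db.execute("SELECT customer, total FROM orders").fetchall()
    return sum(total for name, total in rows if name == customer)


def total_for_customer_fast(db, customer):
    # Lets the database filter and aggregate; only one row comes back.
    row = db.execute(
        "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer = ?",
        (customer,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    db = make_db([("alice", 10.0), ("bob", 5.0), ("alice", 2.5)])
    print(total_for_customer_slow(db, "alice"))
    print(total_for_customer_fast(db, "alice"))
```
</pre>
            <p>With three rows the difference is invisible. With
            millions of rows the first version allocates an object per
            row just to throw most of them away.</p>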
            <p>Another problem I have seen a lot is with RPCs (remote
            procedure calls) and not even because of the network
            latency, but because the RPC transmits a huge amount of
            unfiltered JSON that an interpreted language struggles to
            parse and turn into native objects. The network may not even
            struggle to send that much JSON, but the memory and CPU
            requirements to turn it all into a language’s arrays and
            objects is very high.</p>
            <p>Removing I/O altogether can be faster, such as by using
            an in-memory cache or recalculating values when needed, but
            often it is not. The point is that the interface between
            an application and the outside world is a sensitive area for
            performance.</p>
            <ol start="3" style="list-style-type: decimal">
            <li>Synchronisation and processes</li>
            </ol>
            <p>In an application with multiple threads, processes or
            instances on different servers, you will have points where
            everything needs to agree.</p>
            <p>For example one process will need to take a lock and
            check a value set by another. In fact it’s not uncommon to
            see that a process calls <code>sleep</code> in a loop while
            polling something.</p>
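            <p>As an illustration, here is a small Python sketch of the
            two styles using threads. The shared <code>state</code>
            dictionary and the function names are invented for the
            example. The polling waiter wakes up on a timer to check a
            flag, whereas the event waiter blocks until the producer
            signals it.</p>
<pre class="sourceCode python">
```python
import threading
import time


def wait_by_polling(state, interval=0.05):
    # Wakes up every `interval` seconds to check a flag. The waiter can
    # notice the change up to one interval late and keeps waking up
    # while nothing is happening.
    while not state["ready"]:
        time.sleep(interval)
    return state["value"]


def wait_by_event(event, state):
    # Blocks until the producer signals; no periodic wake-ups.
    event.wait()
    return state["value"]


def produce(state, event, value):
    # Store the value before raising the flag and signalling waiters.
    state["value"] = value
    state["ready"] = True
    event.set()


if __name__ == "__main__":
    state = {"ready": False, "value": None}
    event = threading.Event()
    producer = threading.Timer(0.1, produce, args=(state, event, 42))
    producer.start()
    print(wait_by_event(event, state))
    producer.join()
```
</pre>
            <p>Neither waiter uses much CPU time, which is why this
            kind of delay tends not to show up in a CPU profile.</p>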
            <p>Problems with synchronisation can be more difficult to
            detect because the program may not be using a lot of CPU
            time. It may spend a lot of time sleeping waiting for a
            lock.</p>
            <p>Locks and calls to sleep can be hidden in library
            functions or appear in unexpected places. Problems with
            synchronisation may only appear in production when running
            at capacity.</p>
            <p>Creating new threads or processes is relatively expensive
            compared to doing calculations. Something to look out for
            are external processes being created in a loop. Some
            languages have a thread pool built in, so seeing a thread
            being created is not necessarily a bad sign.</p>
            <ol start="4" style="list-style-type: decimal">
            <li>Memory allocation and copying memory</li>
            </ol>
            <p>The worst crimes involving memory allocation I have seen
            were done during I/O. However in dynamic languages it can
            creep in anywhere and cause huge slow downs.</p>
            <p>During calculations or something like building a string,
            you may find that lots of intermediate results are created
            and they each require temporary allocations. Then the
            garbage collector has to free all of those objects.</p>
            <p>It varies from language to language which objects cause
            an allocation and how good the runtime is at optimising out
            intermediate results. Sometimes it can detect that an object
            can be modified in place instead of creating a copy.</p>
            <ol start="5" style="list-style-type: decimal">
            <li>System calls</li>
            </ol>
            <p>A system call is like a function call, but it is used to
            communicate with the operating system kernel. Even the most
            trivial system call incurs a penalty because the CPU has to
            swap out the user land context for the kernel context and
            back again.</p>
            <p>If you are using a high-level language then browsing the
            code for system calls is probably not going to be effective;
            this is where tooling comes in very handy. Even just using
            <code>strace</code> on Linux to see if a call is being made
            repeatedly can do the job.</p>
            <p>A lot of the system calls you are likely to see are
            related to I/O, but possibly from library calls you didn’t
            think are doing I/O.</p>
            <ol start="6" style="list-style-type: decimal">
            <li>Slow CPU instructions and cache invalidation</li>
            </ol>
            <p>Most of the time I have to go out of my way to get into
            this. Basically this is about optimising calculations you
            can’t avoid. For example my <a
            href="https://richiejp.com/1d-reversible-automata">reversible
            automata render</a> or <a
            href="https://richiejp.com/zc-data#hash-map">this hash
            map</a>.</p>
            <p>Once you are at the point of trying to optimise your data
            structures to avoid loading a new cache line during a
            calculation, then it’s probably time to consider investing
            in a real methodology although I’m sure that engineers still
            make progress by just rewriting code they think will be
            slow.</p>
            <p>The most common optimisation I have used on this level is
            avoiding modulo (<code>%</code>) and division
            (<code>/</code>) by making everything a power of 2 and using
            bitshifts (<code>&lt;&lt;</code>, <code>&gt;&gt;</code>)
            with bitwise AND (<code>&amp;</code>) instead.</p>
            <p>Most probably you are not going to see a 10 or 100x speed
            up in the overall program unless it is doing some heavy
            computation as the main purpose of the program.</p>
            <h1 id="examples">Examples</h1>
            <h2 id="adding-an-index-to-a-csv-viewer"><a
            href="zsv-index">Adding an index to a CSV viewer</a></h2>
            <p>In this case an index was added which meant that
            navigating a very large CSV file went from linear time to
            constant. Potentially an infinite increase in performance,
            but practically it reduces lag time from seconds to
            milliseconds.</p>
            <h2 id="fuzzy-sync-race-exposition-library"><a
            href="a-rare-data-race">Fuzzy Sync race exposition
            library</a></h2>
            <p>You perhaps wouldn’t think of this as performance
            optimization, but this library allowed us to reproduce race
            condition bugs in the Linux kernel in seconds where
            previously it took hours, days or perhaps even never.</p>
            <h2 id="generating-deserialiser-code-for-bson-in-julia"><a
            href="generating-type-specific-deserialisers-for-bson">Generating
            deserialiser code for BSON in Julia</a></h2>
            <p>To some extent this is an example of what not to do as it
            <em>only</em> resulted in a 3-4x speed up. On long running
            tasks the cost saving was clear, but it’s questionable
            whether I should have been (de)serialising so much data in
            the first place.</p>
            <h1 id="when-this-optimization-method-goes-wrong">When this
            optimization method goes wrong</h1>
            <p>This method works because often once you achieve some
            familiarity with a program and its code base it becomes
            glaringly obvious where a performance issue is. Fixing it is
            a case of doing the same thing you have done a bunch of
            times before.</p>
            <p>The problem comes when you mistake what is causing the
            performance issue or dramatically overestimate how much
            impact a change will have. Let’s say you make a change
            removing that “unnecessary” work and find nothing happens or
            things get worse.</p>
            <p>It could be possible that your change improved some
            metrics, but there was no perceptible overall change. You’re
            then left with a decision to reverse the change or keep
            going with it.</p>
            <p>The risk is going in circles without making progress or
            even causing regressions.</p>
            <p>If you are in a situation where a single radical change
            appears to be impossible or very costly, then you need to
            compound multiple small changes and use a process which
            systematically validates each change in a statistically
            sound way.</p>
            <p>Modifying the code to test a hypothesis may not be
            feasible; it could be too time consuming or you may need to
            optimize for one metric at a time. That means you need to
            instrument the system and identify bottlenecks. For example
            if you think of a change that would improve the instruction
            cache usage, you need evidence that it is a limiting factor
            and that the change did in fact improve it.</p>
            <h1 id="an-even-simpler-way">An even simpler way</h1>
            <p>Finally, instead of trying to locate a particular problem
            and fix it, just rewrite the whole application in a language
            where everyone is obsessed with performance.</p>
            <p>If all of the library authors and runtime contributors
            think performance is important, then every interface and bit
            of documentation will drive you towards performance.</p>
            <p>This of course only works if the initial language was of
            the slow variety and I am also joking, but at the same time
            if the project is very small then it is a workable
            solution.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Exploiting a bug in the Linux kernel with Zig</title>
  <id>https://richiejp.com/linux-kernel-exploit-tls_context-uaf</id>
  <published>2023-11-09T10:14:23Z</published>
  <updated>2023-11-11T17:03:18Z</updated>
  <link rel="alternate" href="https://richiejp.com/linux-kernel-exploit-tls_context-uaf" />
  <summary>What it is like to exploit a Linux kernel bug in 2023 when
you have never exploited anything</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>At the tender young age of 36 I decided to write my first
            exploit. Because I have been writing Linux kernel tests and
            bug reproducers for years, it looked like a good idea to
            target the Linux kernel.</p>
            <div class="float">
            <img src="sec-not-qa.jpg"
            alt="I’m not bitter or anything" />
            <div class="figcaption">I’m not bitter or anything</div>
            </div>
            <p>The Linux kernel is probably the hardest thing I can
            expect to exploit in a reasonable time. I know less about
            browser internals and other things which appear to enjoy the
            same level of competition.</p>
            <p>The payoff here being insight into what it takes rather
            than a direct monetary reward.</p>
            <p>I chose to exploit a bug that was previously used in
            Google’s kernelCTF. It’s not a trendy bug with a marketing
            team. It’s just known by the name CVE-2023-0461. I chose it
            because it was the first bug in <a
            href="https://docs.google.com/spreadsheets/d/e/2PACX-1vS1REdTA29OJftst8xN5B5x8iIUcxuK6bXdzF8G1UXCmRtoNsoQ9MbebdRdFnj6qZ0Yd7LwQfvYC2oF/pubhtml">Google’s
            spreadsheet</a>.</p>
            <p>I don’t like the idea of exploiting fake bugs or bugs in
            older kernels without most of today’s mitigations. However I
            didn’t want to target a bug that has never been
            exploited.</p>
            <p>I wanted to know for certain that overall exploitation
            was possible. Otherwise I could get discouraged and give up
            on a bug when actually I would learn an important lesson
            from pushing through.</p>
            <p>Regardless of what I do next, I think it was well worth
            the effort to get this working.</p>
            <p>This isn’t a rigorous write-up on a new technique or a
            zero-day. It’s just my thoughts on the exploitation.
            Nonetheless it contains many technical details, a lot of
            which will be difficult to follow unless you are already
            very familiar with the subject matter.</p>
            <p>I have created a video on the <a
            href="https://youtu.be/g7ATRgat0v4?si=savUd2jwNCrJr-aG">last
            part of the exploit covering the ROP chain</a>. I had the
            intention of covering the other sections too, but when I
            started trying to step through those bits it got very
            disjointed and confused. I may come back to those, perhaps
            after I come up with a better solution for the heap spray.
            Perhaps not.</p>
            <p>I’m really stretching the Zig theme here, but yes the
            exploit is written in Zig. It’s terrible Zig code, I made no
            effort to clean it up. However there is now a <a
            href="https://github.com/richiejp/m/blob/main/src/xplt.zig">Linux
            kernel exploit written in Zig</a>.</p>
            <p>I would speculate that Zig is an attractive language for
            exploit writers, for precisely the same reasons it’s
            attractive for writing embedded and system code. However
            there is no evidence of that presented here.</p>
            <div class="float">
            <img src="exp-vs-ltp.jpg"
            alt="Again it’s not like I am bitter or anything" />
            <div class="figcaption">Again it’s not like I am bitter or
            anything</div>
            </div>
            <p>Note that more usefully there is also an <a
            href="https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/setsockopt/setsockopt10.c">LTP
            test which reproduces the bug</a>. The Linux Test Project
            has a number of such tests.</p>
            <h1 id="the-bug">The bug</h1>
            <p>The bug is in the “User Level Protocol” (ULP for short)
            layer on top of TCP sockets. In theory all ULPs are
            vulnerable, but I only looked at TLS.</p>
            <p>The Linux TLS module allows applications to offload some
            processing to the kernel or hardware. Once you have
            established a TCP connection and done the public key
            cryptography bit then the ULP allows you to offload the
            symmetric cryptographic stage to the kernel.</p>
            <p>The bug was originally exploited (probably) by “D3v17 -
            savy@syst3mfailure.io”. They used it against the mitigation
            bypass target in Google’s kernelCTF. I’m looking at their <a
            href="https://github.com/google/security-research/tree/master/pocs/linux/kernelctf/CVE-2023-0461_mitigation">exploit
            and write-up</a> for the first time now and it is mostly a
            mystery to me.</p>
            <p>The steps to reproduce the bug are the same, but that is
            where the similarities end. They didn’t resort to using FUSE
            and instead found a way to get their hands on those <a
            href="https://dl.acm.org/doi/10.1145/3372297.3423353">elastic
            objects</a> I keep hearing about.</p>
            <p>We’ll come back to how annoyed I am at using FUSE later.
            For now let’s look at my description of the bug which I wrote
            for the LTP test.</p>
            <blockquote>
            <p>Reproducer for CVE-2023-0461 which is an exploitable
            use-after-free in a TLS socket. In fact it is exploitable in
            any User Level Protocol (ULP) which does not clone its
            context when accepting a connection.</p>
            <p>Because it does not clone the context, the child socket
            which is created on accept has a pointer to the listening
            socket’s context. When the child is closed the parent’s
            context is freed while it still has a reference to it.</p>
            <p>TLS can only be added to a socket which is connected. Not
            listening or disconnected, and a connected socket can not be
            set to listening. So we have to connect the socket, add TLS,
            then disconnect, then set it to listening.</p>
            <p>To my knowledge, setting a socket from open to
            disconnected requires a trick; we have to “connect” to an
            unspecified address. This could explain why the bug was not
            found earlier.</p>
            <p>The accepted fix was to disallow listening on sockets
            with a ULP set which does not have a clone function.</p>
            </blockquote>
            <p>It took me a while of scanning the kernel code to find
            out that the connect system call can be used to disconnect a
            socket. I reviewed the locations where the socket state is
            updated and this appears to be the only place that meets the
            criteria.</p>
            <p>This is written in the <code>connect</code> Linux man
            pages, but I wouldn’t have thought to look there because…
            you know, the name. I also must have seen this information
            before, but strangely enough my initial thought was that
            this bug is impossible because socket state transitions are
            one way.</p>
            <p>For stream sockets at least this would match the BSD man
            pages. I find the POSIX page painful to read, but it doesn’t
            appear to address calling connect multiple times on a
            <em>connection mode</em> socket. It is left open to
            implementation.</p>
            <div class="float">
            <img src="opposite-of-connecting.jpg"
            alt="Never think things do what their name suggests" />
            <div class="figcaption">Never think things do what their
            name suggests</div>
            </div>
            <p>This bug is probably a <a
            href="https://www.johndcook.com/blog/2010/01/12/software-sins-of-omission/">sin
            of omission</a>, that is, one where the software developer
            did not take into account some possible program state. If
            that is the case then I would also say that the semantic
            overloading of <code>connect</code> did not help.</p>
            <h1 id="reproducing">Reproducing</h1>
            <p>I’m not sure how the bug was found, but I was starting
            with a CVE and fix commit with no reproducer. So the first
            step was to find a way of reliably reproducing the bug.</p>
            <p>Below is the fix commit.</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode diff"><code class="sourceCode diff"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a>commit 2c02d41d71f90a5168391b6a5f2954112ba2307c</span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a>Author: Paolo Abeni &lt;pabeni@redhat.com&gt;</span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a>Date:   Tue Jan 3 12:19:17 2023 +0100</span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a></span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a>    net/ulp: prevent ULP without clone op from entering the LISTEN status</span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a></span>
<span id="cb1-7"><a href="#cb1-7" tabindex="-1"></a>    When an ULP-enabled socket enters the LISTEN status, the listener ULP data</span>
<span id="cb1-8"><a href="#cb1-8" tabindex="-1"></a>    pointer is copied inside the child/accepted sockets by sk_clone_lock().</span>
<span id="cb1-9"><a href="#cb1-9" tabindex="-1"></a></span>
<span id="cb1-10"><a href="#cb1-10" tabindex="-1"></a>    The relevant ULP can take care of de-duplicating the context pointer via</span>
<span id="cb1-11"><a href="#cb1-11" tabindex="-1"></a>    the clone() operation, but only MPTCP and SMC implement such op.</span>
<span id="cb1-12"><a href="#cb1-12" tabindex="-1"></a></span>
<span id="cb1-13"><a href="#cb1-13" tabindex="-1"></a>    Other ULPs may end-up with a double-free at socket disposal time.</span>
<span id="cb1-14"><a href="#cb1-14" tabindex="-1"></a></span>
<span id="cb1-15"><a href="#cb1-15" tabindex="-1"></a>    We can&#39;t simply clear the ULP data at clone time, as TLS replaces the</span>
<span id="cb1-16"><a href="#cb1-16" tabindex="-1"></a>    socket ops with custom ones assuming a valid TLS ULP context is</span>
<span id="cb1-17"><a href="#cb1-17" tabindex="-1"></a>    available.</span>
<span id="cb1-18"><a href="#cb1-18" tabindex="-1"></a></span>
<span id="cb1-19"><a href="#cb1-19" tabindex="-1"></a>    Instead completely prevent clone-less ULP sockets from entering the</span>
<span id="cb1-20"><a href="#cb1-20" tabindex="-1"></a>    LISTEN status.</span>
<span id="cb1-21"><a href="#cb1-21" tabindex="-1"></a></span>
<span id="cb1-22"><a href="#cb1-22" tabindex="-1"></a>    Fixes: 734942cc4ea6 (&quot;tcp: ULP infrastructure&quot;)</span>
<span id="cb1-23"><a href="#cb1-23" tabindex="-1"></a>    Reported-by: slipper &lt;slipper.alive@gmail.com&gt;</span>
<span id="cb1-24"><a href="#cb1-24" tabindex="-1"></a>    Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;</span>
<span id="cb1-25"><a href="#cb1-25" tabindex="-1"></a>    Link: https://lore.kernel.org/r/4b80c3d1dbe3d0ab072f80450c202d9bc88b4b03.1672740602.git.pabeni@redhat.com</span>
<span id="cb1-26"><a href="#cb1-26" tabindex="-1"></a>    Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;</span>
<span id="cb1-27"><a href="#cb1-27" tabindex="-1"></a></span>
<span id="cb1-28"><a href="#cb1-28" tabindex="-1"></a><span class="kw">diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c</span></span>
<span id="cb1-29"><a href="#cb1-29" tabindex="-1"></a>index 848ffc3e0239..d1f837579398 100644</span>
<span id="cb1-30"><a href="#cb1-30" tabindex="-1"></a><span class="dt">--- a/net/ipv4/inet_connection_sock.c</span></span>
<span id="cb1-31"><a href="#cb1-31" tabindex="-1"></a><span class="dt">+++ b/net/ipv4/inet_connection_sock.c</span></span>
<span id="cb1-32"><a href="#cb1-32" tabindex="-1"></a><span class="dt">@@ -1200,12 +1200,26 @@ void inet_csk_prepare_forced_close(struct sock *sk)</span></span>
<span id="cb1-33"><a href="#cb1-33" tabindex="-1"></a> }</span>
<span id="cb1-34"><a href="#cb1-34" tabindex="-1"></a> EXPORT_SYMBOL(inet_csk_prepare_forced_close);</span>
<span id="cb1-35"><a href="#cb1-35" tabindex="-1"></a></span>
<span id="cb1-36"><a href="#cb1-36" tabindex="-1"></a><span class="va">+static int inet_ulp_can_listen(const struct sock *sk)</span></span>
<span id="cb1-37"><a href="#cb1-37" tabindex="-1"></a><span class="va">+{</span></span>
<span id="cb1-38"><a href="#cb1-38" tabindex="-1"></a><span class="va">+       const struct inet_connection_sock *icsk = inet_csk(sk);</span></span>
<span id="cb1-39"><a href="#cb1-39" tabindex="-1"></a><span class="va">+</span></span>
<span id="cb1-40"><a href="#cb1-40" tabindex="-1"></a><span class="va">+       if (icsk-&gt;icsk_ulp_ops &amp;&amp; !icsk-&gt;icsk_ulp_ops-&gt;clone)</span></span>
<span id="cb1-41"><a href="#cb1-41" tabindex="-1"></a><span class="va">+               return -EINVAL;</span></span>
<span id="cb1-42"><a href="#cb1-42" tabindex="-1"></a><span class="va">+</span></span>
<span id="cb1-43"><a href="#cb1-43" tabindex="-1"></a><span class="va">+       return 0;</span></span>
<span id="cb1-44"><a href="#cb1-44" tabindex="-1"></a><span class="va">+}</span></span>
<span id="cb1-45"><a href="#cb1-45" tabindex="-1"></a><span class="va">+</span></span>
<span id="cb1-46"><a href="#cb1-46" tabindex="-1"></a> int inet_csk_listen_start(struct sock *sk)</span>
<span id="cb1-47"><a href="#cb1-47" tabindex="-1"></a> {</span>
<span id="cb1-48"><a href="#cb1-48" tabindex="-1"></a>        struct inet_connection_sock *icsk = inet_csk(sk);</span>
<span id="cb1-49"><a href="#cb1-49" tabindex="-1"></a>        struct inet_sock *inet = inet_sk(sk);</span>
<span id="cb1-50"><a href="#cb1-50" tabindex="-1"></a>        int err;</span>
<span id="cb1-51"><a href="#cb1-51" tabindex="-1"></a></span>
<span id="cb1-52"><a href="#cb1-52" tabindex="-1"></a><span class="va">+       err = inet_ulp_can_listen(sk);</span></span>
<span id="cb1-53"><a href="#cb1-53" tabindex="-1"></a><span class="va">+       if (unlikely(err))</span></span>
<span id="cb1-54"><a href="#cb1-54" tabindex="-1"></a><span class="va">+               return err;</span></span>
<span id="cb1-55"><a href="#cb1-55" tabindex="-1"></a><span class="va">+</span></span></code></pre></div>
            <p>and then the CVE description which was perhaps updated at
            some point. I will go with what’s in my notes.</p>
            <blockquote>
            <p>There is a use-after-free vulnerability in the Linux
            Kernel which can be exploited to achieve local privilege
            escalation. To reach the vulnerability kernel configuration
            flag CONFIG_TLS or CONFIG_XFRM_ESPINTCP has to be
            configured, but the operation does not require any
            privilege.</p>
            </blockquote>
            <blockquote>
            <p>There is a use-after-free bug of icsk_ulp_data of a
            struct inet_connection_sock. When CONFIG_TLS is enabled,
            user can install a tls context (struct tls_context) on a
            connected tcp socket. The context is not cleared if this
            socket is disconnected and reused as a listener.</p>
            </blockquote>
            <blockquote>
            <p>If a new socket is created from the listener, the context
            is inherited and vulnerable. The setsockopt TCP_ULP
            operation does not require any privilege.</p>
            </blockquote>
            <p>So there is no hint at how one disconnects a TCP socket,
            and actually we need more than that. We want to set the
            lower level socket state to <code>SS_UNCONNECTED</code>,
            otherwise we can’t call <code>listen</code>. This is
            different from the TCP level connection.</p>
            <p>Disconnecting a TCP socket can be done in at least two
            other ways: you can close the other end or call
            <code>shutdown</code>. However that won’t set the socket
            state to <code>SS_UNCONNECTED</code> unless the connection
            was never fully established. It will just set the TCP
            state to <code>TCP_CLOSE</code>.</p>
            <p>If the connection is never fully established then we
            cannot set the ULP to TLS. So we need to fully establish the
            connection and then reset both the socket and TCP states,
            which, as I already mentioned, is done using
            <code>connect</code> with an address of
            <code>AF_UNSPEC</code>.</p>
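            <p>As a minimal sketch of that last call, assuming nothing
            beyond the standard sockets API: the helper names
            <code>unspec_addr</code> and <code>disconnect_socket</code>
            are made up for illustration and are not taken from the
            exploit. It only shows the shape of the
            <code>AF_UNSPEC</code> address that is passed to
            <code>connect</code>.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;string.h&gt;
#include &lt;sys/socket.h&gt;

/* Build a sockaddr whose only meaningful field is the AF_UNSPEC family. */
static struct sockaddr unspec_addr(void)
{
    struct sockaddr sa;

    memset(&amp;sa, 0, sizeof(sa));
    sa.sa_family = AF_UNSPEC;
    return sa;
}

/*
 * Hypothetical helper: ask the kernel to dissolve the association on fd.
 * Returns 0 on success or -1 with errno set, exactly like connect(2).
 */
static int disconnect_socket(int fd)
{
    struct sockaddr sa = unspec_addr();

    return connect(fd, &amp;sa, sizeof(sa));
}</code></pre></div>
            <p>The return value is simply whatever <code>connect</code>
            reports, so a caller would check it before going on to call
            <code>listen</code> on the same descriptor.</p>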
            <p>Once we know how to do it then it is <em>relatively</em>
            straightforward to reproduce. First we create a connection
            and call <code>setsockopt</code> on the connected socket to
            add TLS to it. Then we call <code>connect</code> with
            <code>AF_UNSPEC</code> to set <code>SS_UNCONNECTED</code>
            and <code>TCP_CLOSE</code> on the socket. Finally we call
            <code>listen</code> and <code>accept</code> to accept an
            incoming connection and close that connection.</p>
            <p>When that final connection is closed it will free
            <code>tls_context</code> while the listening socket still
            has a reference to it.</p>
            <p>There are no race conditions and we could easily trigger
            the free multiple times once we have a listening socket with
            a <code>tls_context</code>. No instability comes from this
            part of the process.</p>
            <h1 id="heap-spray">Heap spray</h1>
            <p>At this point I knew that a use-after-free is useful for
            causing a <em>type confusion</em>. That is, the freed memory
            we still have a pointer to can be reallocated as a different
            type of object.</p>
            <div class="float">
            <img src="how-its-going-one.jpg"
            alt="The real fun begins" />
            <div class="figcaption">The real fun begins</div>
            </div>
            <p>More than one type of object then shares the same memory,
            so we can corrupt the memory from the point of view of one
            or both of the objects.</p>
            <p>The active Linux heap allocator (SLUB) uses various
            object caches. Some important objects get their own cache.
            Most however are allocated from general purpose caches of
            different sizes.</p>
            <p>The <code>tls_context</code> object is allocated in the
            <code>kmalloc-512</code> cache. There are multiple ways to
            find this out, including looking at the size of the struct
            (328 bytes) and the flags it is allocated with
            (<code>GFP_KERNEL</code>).</p>
            <p>Also by adding <code>slub_debug=T,*-512</code> to the
            kernel command line then doing the allocation we will see
            which 512 cache it is in. If we don’t know the size of the
            struct then debugging with gdb or tracing with kernel probes
            can show which cache is used.</p>
            <p>Based on various writeups I had read (sorry I lost track
            of where I first saw some things), there are a number of
            objects, such as <code>msg_msg</code> and
            <code>sk_buff</code>, that can be allocated in any cache
            size and are mostly full of attacker controlled data.</p>
            <p>So I started trying to heap spray these objects, however
            I didn’t appear to be getting any hits. Using some of the
            tracing methods mentioned above I realised that these
            objects were being allocated in <code>kmalloc-cg-512</code>,
            not <code>kmalloc-512</code>.</p>
            <p>Looking at their allocations, they use
            <code>GFP_KERNEL_ACCOUNT</code> not <code>GFP_KERNEL</code>.
            On the face of it this flag has something to do with memory
            counters used in control groups. However it now also causes
            objects to be allocated in a different cache.</p>
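            <p>To make the cache selection concrete, here is a rough
            model of it. This is a simplification and not kernel code:
            the size classes assume a typical x86-64 configuration, the
            function names are invented for this sketch, and the
            <code>accounted</code> parameter merely stands in for
            whether <code>GFP_KERNEL_ACCOUNT</code> was used.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stddef.h&gt;
#include &lt;stdio.h&gt;

/* General purpose kmalloc size classes, assuming a typical x86-64 config. */
static const size_t kmalloc_sizes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192,
};

/* Round a request up to its size class; returns 0 if nothing fits. */
static size_t kmalloc_size_class(size_t size)
{
    size_t i;

    for (i = 0; i &lt; sizeof(kmalloc_sizes) / sizeof(kmalloc_sizes[0]); i++) {
        if (size &lt;= kmalloc_sizes[i])
            return kmalloc_sizes[i];
    }
    return 0;
}

/*
 * Write the cache name into buf. accounted models GFP_KERNEL_ACCOUNT.
 * Returns the snprintf result, or -1 if the size has no class here.
 */
static int kmalloc_cache_name(char *buf, size_t len, size_t size, int accounted)
{
    size_t cls = kmalloc_size_class(size);
    const char *cg = accounted ? "cg-" : "";

    if (cls == 0)
        return -1;
    if (cls &gt;= 1024)
        return snprintf(buf, len, "kmalloc-%s%zuk", cg, cls / 1024);
    return snprintf(buf, len, "kmalloc-%s%zu", cg, cls);
}</code></pre></div>
            <p>With a size of 328 this model gives
            <code>kmalloc-512</code> for a plain allocation and
            <code>kmalloc-cg-512</code> for an accounted one, which
            matches the mismatch described above.</p>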
            <p>This is good for security because a lot of juicy variable
            length objects containing arbitrary user data are now
            segregated from everything else.</p>
            <p>At this point I knew I could arbitrarily free any object
            in the <code>kmalloc-512</code> cache. I wasn’t sure what to
            do with that though. I also knew that cross cache attacks
            are possible, however I assumed it would be easier to find
            an object in the same cache. So I started looking for more
            objects I could use to replace <code>msg_msg</code>.</p>
            <p>I became fixated with <code>splice</code> and
            <code>writev</code>. Firstly because I read about some
            ancient attack that hasn’t worked since the introduction of
            <code>copy_to/from_user</code>. I should have known better,
            but at least at this point I could see that my heap spray
            worked.</p>
            <p>Secondly I moved onto <code>bio_vec</code>, which is used
            when splicing between pipes, because it can allocate an
            object in <code>kmalloc-512</code> and then block. The
            problem is the object is a <code>bio_vec</code> array which
            contains pointers to <code>struct page</code>. These pages
            are mapped in just before use.</p>
            <p>I didn’t have anything to pass as a pointer to a
            <code>struct page</code> at this point. I suppose that if I
            knew any valid page addresses (or pfn) then this could be
            used to read or write to it.</p>
            <p>Finding a suitable object was far from the only
            problem. To my knowledge the free is delayed by up to 5000
            jiffies, which appears to be 5 seconds on the target kernel
            config.</p>
            <p>This is because the context is freed with
            <code>kfree_rcu</code> which batches up frees. The maximum
            batch age is 5000 jiffies and on my quiet system I would
            often hit that. I suppose the time could be reduced by
            spamming frees, but then that could interfere with other
            things.</p>
            <p>Throughout the exploit development I would accidentally
            introduce a change that put the wrong object into the freed
            slot. In one case I also moved the free and allocation to
            different processes and therefore different CPUs. Each CPU
            has its own free list, so this also stopped it from working.
            It was the main source of instability and confusion.</p>
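            <p>One general way to keep two steps on the same CPU is to
            pin the calling thread before doing either of them. The
            sketch below uses the Linux <code>sched_setaffinity</code>
            call; the wrapper name <code>pin_to_cpu</code> is
            hypothetical, and this is offered as an illustration of the
            idea, not as what the exploit actually did.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#define _GNU_SOURCE
#include &lt;sched.h&gt;

/* Restrict the calling thread to one CPU; returns 0 on success, -1 on error. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&amp;set);
    CPU_SET(cpu, &amp;set);
    return sched_setaffinity(0, sizeof(set), &amp;set);
}</code></pre></div>
            <p>Calling <code>pin_to_cpu(sched_getcpu())</code> early on
            keeps the thread wherever it already is, so later frees and
            allocations made by that thread should go through the same
            per-CPU free list.</p>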
            <div class="float">
            <img src="how-its-going-two.jpg" alt="Uhhff" />
            <div class="figcaption">Uhhff</div>
            </div>
            <p>I have a couple of pages of notes and ideas from this
            stage. Plus there will be more stuff that I looked into, but
            didn’t bother to write down. This is the part of the exploit
            I would most like to revisit.</p>
            <h1 id="fuse">FUSE</h1>
            <p>What I did know at this point, was that there was a <a
            href="https://chompie.rip/Blog+Posts/Put+an+io_uring+on+it+-+Exploiting+the+Linux+Kernel#userfaultfd+is+over%2C+FUSE+is+in">well
            proven heap spray technique using FUSE</a> and <a
            href="https://duasynt.com/blog/linux-kernel-heap-spray">extended
            file attributes</a>. Originally it was done with
            <code>userfaultfd</code>.</p>
            <div class="float">
            <img src="bugs-vs-people.jpg"
            alt="Please do not nitpick, it is the sentiment that counts" />
            <div class="figcaption">Please do not nitpick, it is the
            sentiment that counts</div>
            </div>
            <p>I don’t like this technique because I think it can be
            easily mitigated like userfaultfd. It relies on the attacker
            having access to <code>/dev/fuse</code> <em>and</em> either
            <code>fusermount</code> or unprivileged user namespaces.</p>
            <p>On a desktop it is quite likely that all users have
            access to <code>/dev/fuse</code>, but the only processes
            that need access to this are file system daemons. So there
            are obvious ways to shut down this attack vector.</p>
            <p>Having said that, it may not be convenient for desktops
            to restrict access to <code>/dev/fuse</code>. FUSE is
            extremely useful, so I don’t see it going away altogether,
            unlike <code>userfaultfd</code>, the absence of which has
            probably not bothered many people outside of data
            centers.</p>
            <p>I have written two articles about FUSE already.</p>
            <ul>
            <li><a href="zig-fuse-one">Zig &amp; FUSE: Hello file
            systems</a></li>
            <li><a href="zig-fuse-two">Zig &amp; /dev/fuse: A weird
            file system</a></li>
            </ul>
            <p>The second article is about using the raw FUSE interface
            which was partially motivated by this exploit.</p>
            <p>I found that using the raw FUSE interface I don’t need to
            use the <code>mmap</code> page fault technique described in
            Chompie’s blog. Instead of mapping a file and blocking while
            reading from the file, I block on processing the
            <code>xattr</code> request itself.</p>
            <p>I don’t know that I need to use the raw interface for
            this or that it is the better way to do it. However I wanted
            to poke at the raw interface and I poked it.</p>
            <p>Needless to say it added a fair chunk of time onto the
            exploitation. Still, using FUSE, at least, was the right
            thing to do because the probability it wouldn’t work was
            very low.</p>
            <p>Using this method I don’t really do heap spray. I just
            allocate a single xattr after a delay. It usually gets the
            freed slot so long as the free and the alloc happen on the
            same CPU. This works on my very quiet VM, but I guess it
            would not fare so well with more noise.</p>
            <p>One last note on FUSE is that there are a lot of FUSE
            message types and I would be surprised if the
            <code>xattr</code> messages are the only two of interest to
            attackers. FUSE gives user land the ability to pause inside
            a whole bunch of internal kernel operations.</p>
            <h1 id="kaslr">KASLR</h1>
            <p>Now that I could overlap <code>xattr</code> buffers with
            <code>tls_context</code> I could read the content of
            <code>tls_context</code>. A caveat is that
            <code>xattr</code> zeros its buffer, also zeroing
            <code>tls_context</code>. This corrupts
            <code>tls_context</code> because after being initialised it
            contains a couple of pointers.</p>
            <p>Luckily these pointers are not used much. Without
            touching them we are able to set the encryption keys used
            for transmitting or receiving using <code>setsockopt</code>.
            This writes lots of interesting things into
            <code>tls_context</code>.</p>
            <p>One of these things is a function pointer to
            <code>tls_sw_push_pending_record</code> (or
            <code>tls_device_push_pending_record</code> if hardware with
            TLS offload is present). We can write this pointer into the
            <code>xattr</code> buffer while blocking the FUSE request.
            When <code>setsockopt</code> returns we can then allow the
            FUSE request to continue and read back the internal kernel
            pointer.</p>
            <p>We need a function pointer because of kernel address
            space layout randomisation. If KASLR is enabled then the
            kernel shifts its address space by some amount, so that
            every symbol is offset from its default location.</p>
            <div class="float">
            <img src="kaslr.jpg"
            alt="If you can read then you can defeat KASLR" />
            <div class="figcaption">If you can read then you can defeat
            KASLR</div>
            </div>
            <p>However the symbols are all still in the same order and
            there are no random gaps introduced between them. So if we
            retrieve a pointer to a known symbol, then we can calculate
            the location of any symbol.</p>
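            <p>The arithmetic is small enough to write down. In this
            sketch the function names are invented, and it assumes the
            leaked function lives in the core kernel image so that it
            shares a single offset with every other symbol.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stdint.h&gt;

/* Slide: where the symbol really is minus where an unrandomised image puts it. */
static uint64_t kaslr_slide(uint64_t leaked, uint64_t unslid)
{
    return leaked - unslid;
}

/* Apply the same slide to the unrandomised address of any other symbol. */
static uint64_t relocate(uint64_t unslid, uint64_t slide)
{
    return unslid + slide;
}</code></pre></div>
            <p>For example, with made-up addresses, a symbol expected at
            <code>0xffffffff81a00000</code> but leaked as
            <code>0xffffffff92a00000</code> gives a slide of
            <code>0x11000000</code>, so a second symbol expected at
            <code>0xffffffff810c0000</code> would be found at
            <code>0xffffffff920c0000</code>.</p>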
            <h1 id="rip">RIP</h1>
            <p><em>The more sensitive reader may want to turn away now
            as things are going to get really ROPey.</em></p>
            <p>At last I could read and write to
            <code>tls_context</code>. This doesn’t instantly grant root
            privileges though. The structure doesn’t contain a pointer
            to the process credentials. Plus it’s not obvious to me how
            to write to some arbitrary memory just using
            <code>tls_context</code> without hijacking kernel control
            flow.</p>
            <p>Interestingly <code>tls_context</code> can contain an
            indirect reference to the <em>creds</em> via
            <code>ctx-&gt;priv_ctx_rx-&gt;strp.sk-&gt;sk_socket-&gt;file-&gt;f_cred</code>
            if we setup encryption for receiving data. I say interesting
            because any time we have a socket or <code>sk</code> pointer
            we know an arbitrary read would allow us to find the task
            creds.</p>
            <p>I like the idea of a <em>data only</em> attack, however I
            didn’t see a clear path to getting arbitrary R/W. I expect
            it’s totally possible, especially by adding more object
            types to the type confusion.</p>
            <p>With a data only attack I wouldn’t have to use a <em>code
            reuse attack</em> like ROP or JOP. This seemed attractive
            for a number of reasons, not least that ROP appeared
            magical, but also because the chain is specific to a
            particular kernel binary.</p>
            <p>The fact is though that <code>tls_context</code> provides
            access to at least three function pointers which I can
            overwrite and execute. So that is three different places
            where I can try to launch my attack. I gather this is really
            good.</p>
            <p>That isn’t to say it is easy though, as each location
            requires different setup to execute. We have to execute the
            function pointer without crashing the kernel before or
            afterwards.</p>
            <p>Note that, back in the bad old days, I would have just
            been able to write some <em>shell code</em> (I don’t know
            why the word shell is there, it’s just code) into
            <code>tls_context</code>, then point one of the function
            pointers at it. This would then get executed like any other
            kernel code.</p>
            <p>These days the heap memory where <code>tls_context</code>
            lives is not executable. So if we try that the CPU will
            refuse to execute it. We can set memory to be executable,
            but we need code execution to do it, so there is a chicken
            and egg issue there.</p>
            <p>Hence if we want to take control of execution we have to
            reuse existing code. The code we want though is something
            like the following.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="co">// resets credentials to root in the root user namespace</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="dt">void</span> escalate_privs<span class="op">(</span><span class="dt">void</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a>    commit_creds<span class="op">(</span>prepare_kernel_cred<span class="op">(</span>NULL<span class="op">));</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>If you are writing a kernel, then don’t include a handy
            function like this. Indeed Linux does not include such code.
            This does not stop us though because of the way the
            <code>jmp</code>, <code>call</code> and <code>ret</code> CPU
            instructions work (for now).</p>
            <p>The <code>call</code> instruction takes an address,
            <em>any</em> executable address, and updates
            <code>rip</code> to it after pushing the current value of
            <code>rip</code> to the stack. Meanwhile the
            <code>ret</code> instruction pops the top value off the
            stack into <code>rip</code>.</p>
            <p>Updating <code>rip</code> changes the next instruction
            the CPU will execute. Presently there is nothing to stop us
            from passing an address to <code>call</code> which is not
            the start of a function.</p>
            <p>Likewise <code>ret</code> will jump to whatever value is
            at the top of the stack. Furthermore we can change the
            location of the stack, including to heap memory that we
            control.</p>
            <p>So each function has at least one <code>ret</code>
            instruction and we can jump to the instructions just before
            each <code>ret</code> to do whatever. A useful sequence of
            one or more instructions in front of a <code>ret</code>
            is called a gadget.</p>
            <p>We can put the addresses of gadgets on the stack and the
            <code>ret</code> instructions at the end of each gadget will
            link them together. This forms a ROP chain, where ROP stands
            for Return Oriented Programming.</p>
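            <p>The chaining mechanism can be shown with a toy
            interpreter. This is purely a user space model and not x86:
            the small integers stand in for gadget addresses, the three
            gadgets are invented, and every pass through the loop plays
            the part of one <code>ret</code> popping the next address
            from a stack we control.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stdint.h&gt;

/* Toy machine state: one register and a stack pointer into a fake stack. */
struct toy_cpu {
    uint64_t rax;
    const uint64_t *rsp;
};

/* Pretend gadget addresses. */
enum { G_POP_RAX, G_ADD_RAX_1, G_DOUBLE_RAX, G_HALT };

/* Run a chain; the stack must end with G_HALT or the loop reads past it. */
static uint64_t run_chain(const uint64_t *stack)
{
    struct toy_cpu cpu = { 0, stack };

    for (;;) {
        uint64_t gadget = *cpu.rsp++;   /* ret: pop the next "address" */

        switch (gadget) {
        case G_POP_RAX:                 /* pop rax; ret */
            cpu.rax = *cpu.rsp++;
            break;
        case G_ADD_RAX_1:               /* add rax, 1; ret */
            cpu.rax += 1;
            break;
        case G_DOUBLE_RAX:              /* add rax, rax; ret */
            cpu.rax += cpu.rax;
            break;
        default:                        /* G_HALT or anything unknown */
            return cpu.rax;
        }
    }
}</code></pre></div>
            <p>Running it on the stack <code>{ G_POP_RAX, 20,
            G_ADD_RAX_1, G_DOUBLE_RAX, G_HALT }</code> returns 42. Note
            how the data word 20 sits on the stack in between the
            gadget addresses, which is also how constants are fed to a
            real chain.</p>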
            <div class="float">
            <img src="ROP-chain.jpg"
            alt="At first sight of ROP gadgets" />
            <div class="figcaption">At first sight of ROP gadgets</div>
            </div>
            <p>That’s not all though, because the various
            <code>jmp</code> instructions can also be used to set
            <code>rip</code>. There seem to be a variety of ways in
            which to construct a JOP gadget. In the end I just used
            ROP.</p>
            <p>Note that finding ROP and JOP gadgets in general is easy
            because of the tooling available. Constructing them into a
            chain is more challenging, although I’m sure a constraint
            solver or similar could partially automate that as well.</p>
            <p>I extracted the gadgets using the <a
            href="https://a13xp0p0v.github.io/2021/08/25/lkrg-bypass.html">method
            described here</a>. Then constructed the chain by grepping
            for suitable gadgets. Notably ChatGPT was useful for
            describing the purpose of basic x86 instructions and
            registers. It seems to have synthesized the various sources
            of x86 documentation well.</p>
            <p>Initially I was not sure I could use a ROP chain. I knew
            I had to change the stack location <code>rsp</code>, but I
            wasn’t sure it would be allowed. This wasn’t an issue
            though; apparently no distinction is drawn between stack and
            heap locations.</p>
            <p>Secondly I don’t have the address of
            <code>tls_context</code>; I have addresses to other memory
            locations within it, but not the structure itself. There is
            a pointer to a small amount of memory I control which is
            used to store some encryption gubbins, however it is too
            small for a ROP chain.</p>
            <p>I noted though that if a register contains
            <code>tls_context</code> already then I can move that value
            into <code>rsp</code> using the first gadget. The first
            items in <code>tls_context</code> aren’t used for a lot of
            operations, so I can overwrite them with return
            addresses.</p>
            <p>In the middle of <code>tls_context</code> there are some
            function pointers which we want to use to start the ROP
            chain. So we have to <em>stack pivot</em> over those. After
            that there is plenty of space to store a ROP chain.</p>
            <p>The first place I tried to take <code>rip</code> control
            turned out to be a bad spot. Firstly it wanted to use the
            <code>tls_context.sk_proto</code> pointer which had been
            wiped when allocating <code>xattr</code>.</p>
            <p>It only wanted one field in <code>sk_proto</code>, which
            turned out to be a function pointer. This wasn’t the
            function pointer I was originally targeting, but it was
            needed to reach the original, so I targeted it instead.</p>
            <p>I couldn’t fake the whole <code>sk_proto</code> object
            because it is too large to fit in the memory I control.
            However only one field is accessed at a given offset. So I
            could fake that one field and put a pointer in
            <code>ctx-&gt;sk_proto</code> which is offset by the
            distance to the field.</p>
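            <p>The offset trick is easier to see with numbers. In the
            sketch below <code>struct big_ops</code> is a hypothetical
            stand-in for the real structure and its layout is invented.
            The point is only that the stored base address is the
            address of the controlled slot minus the field offset.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

/* Hypothetical stand-in for a large ops table; only one field matters. */
struct big_ops {
    char padding[400];
    void (*handler)(void);
};

/*
 * Given the address of a slot we control, return the base address to store
 * so that base + offsetof(struct big_ops, handler) lands on that slot.
 */
static uintptr_t fake_base_for(uintptr_t slot_addr)
{
    return slot_addr - offsetof(struct big_ops, handler);
}</code></pre></div>
            <p>Code that later reads the <code>handler</code> field
            through the fake base ends up reading the controlled slot,
            while the rest of the fake object points at memory that is
            never touched on that path.</p>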
            <p>This worked, but then I discovered that I had made a
            mistake. I thought that a pointer to
            <code>tls_context</code> would be in one of the registers at
            the call site which would start the ROP chain. It turns out
            that I was looking at a point in the execution just before
            the call site.</p>
            <p>The register containing the context was cleared just
            before the call. So now I didn’t know where my chain was
            located. Instead of giving up on this path immediately
            I decided to try dereferencing the <code>sk</code> pointer,
            which was in a register, because the <code>sk</code>
            structure also contains a pointer to the context.</p>
            <p>I would have to do this with JOP and didn’t grasp how to
            do that. I think it should be possible using a suitable JOP
            gadget that jumps to a location specified by a register I
            control. However I gave up and moved to another location to
            take RIP control.</p>
            <p>This time I tried
            <code>ctx-&gt;push_pending_record</code> which can be called
            by setting <code>ctx-&gt;pending_open_record_frags</code> to
            true and sending a <em>control message</em> with
            <code>sendmsg</code>. This isn’t the only way to call it,
            but it appears to be the easiest.</p>
            <p>I also got lucky because it has the context in the
            <code>rax</code> register and there is a ROP gadget which
            simply sets <code>rsp</code> to <code>rax</code>, thus
            pointing the stack at the context.</p>
            <p>Not only that, but I discovered that <code>cmsg</code>
            allocates a variable sized object with <code>GFP_KERNEL</code>
            with only a small bit of header data. Potentially that is a
            replacement for <code>xattr</code>! Although I don’t know
            how a <code>cmsg</code> payload could be read back to user
            space.</p>
            <p>Anyway I ignored that and managed to write a ROP chain
            that calls the necessary functions to set the creds. It’s
            only a very small amount of code, but it was not easy.</p>
            <p>The biggest issue was restoring the stack pointer to its
            original value after elevating privileges. I can’t leave the
            stack pointing to the context if I want to gracefully return
            from <code>push_pending_record</code> and back to user
            land.</p>
            <p>I couldn’t think of a way to save the <code>rsp</code>
            register, so that it could be restored later. I guess it’s
            possible with JOP, but I had already given up on that. It so
            happened though that 4 of the registers contained pointers
            to locations on the original stack.</p>
            <p>It took a while, but I found a simple sequence of gadgets
            that calculated the original <code>rsp</code> value from
            <code>rbx</code> and restored it. After that I just had to
            do something with the newly gained privileges and that was
            it, the exploit was complete.</p>
            <div class="float">
            <img src="sub-rbx-0x8f-ret.jpg"
            alt="I apologise for these pictures" />
            <div class="figcaption">I apologise for these pictures</div>
            </div>
            <p>Writing a ROP chain is rather odd compared to writing
            plain assembly. The instructions which immediately appear
            before a <code>ret</code> are constrained by the x86 calling
            convention. It’s impossible to find <code>mov</code> with
            some pairs of registers for example. Even if you look at
            gadgets with multiple instructions, the compiler just
            doesn’t produce some things.</p>
            <p>On the other hand it is surprising what it does produce.
            Although I guess <code>ROPgadget</code> looks at
            <em>misaligned instructions</em> as described in the book
            <em>Practical Binary Analysis</em>. So it’s not necessarily
            the case that the gadgets it finds are part of the
            kernel.</p>
            <h1 id="shell">Shell</h1>
            <p>When the exploit successfully returns to user land, it
            just starts a shell. When the shell closes the kernel
            crashes because the <code>tls_context</code> is still
            corrupted and closing the socket causes some zeroed fields
            to be accessed. This can be fixed, of course; it just
            requires more cleanup.</p>
            <p>Because of the way I use FUSE the exploit requires
            unprivileged user namespaces to mount the FUSE FS. This is
            only because there is no <code>fusermount</code> in my image
            and my raw FUSE implementation doesn’t implement something
            needed to use <code>fusermount</code> anyway. Again this
            could be fixed, but I’d be more interested in switching away
            from FUSE.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Recording your screen on (SUSE) Linux</title>
  <id>https://richiejp.com/linux-screen-record</id>
  <published>2020-07-04T17:11:18+01:00</published>
  <updated>2022-01-19T18:56:04Z</updated>
  <link rel="alternate" href="https://richiejp.com/linux-screen-record" />
  <summary>Options for recording your screen on Linux (X) and editing
the results</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>It is 2020 and this is still not as easy as it should be.
            Somehow recording my screen with myself on webcam is a pain.
            Possibly the easiest way is actually to use <a
            href="https://jitsi.org/">Jitsi</a> or similar. That is, to
            use your <em>web</em> browser; Chromium or Firefox.</p>
            <div class="message is-warning">
            <div class="message-body">
            <p>It is now 2022 and I am now using <a href="#obs">OBS
            Studio</a> with wlrobs on Sway/wlroots/Wayland. This is much
            better than the other options listed here.</p>
            </div>
            </div>
            <p>However I have this strange desire to do things locally
            without Firefox. At least for now anyway. Firefox
            <em>is</em> an interesting virtual machine and development
            environment xD.</p>
            <p>I’m using openSUSE Tumbleweed, but the package names are
            the same on most RPM-based distros, or similar on Debian.</p>
            <h1 id="outline">Outline</h1>
            <ul>
            <li><strong>vokoscreenNG</strong> Recording the screen and
            web cam</li>
            <li><strong>H.264</strong> encoding for video and
            <strong>Vorbis</strong> for audio</li>
            <li><strong>VLC</strong> for video playback</li>
            <li><strong>ffmpeg</strong> for clipping, cropping,
            concatenating and re-encoding the videos</li>
            </ul>
            <p>Alternatively to <strong>vokoscreenNG</strong>:</p>
            <ul>
            <li><strong>guvcview</strong> for displaying the webcam
            video (and crashing X, probably)</li>
            <li><strong>Simplescreenrecorder</strong> for recording the
            screen</li>
            </ul>
            <h1 id="setup">Setup</h1>
            <p>Codecs are the bane of Linux distributions everywhere and
            openSUSE is no exception. H.264 is by far the best that I
            have tried and of course it is on the naughty list. So the
            first thing you <em>must</em> do is enable the Packman or
            VLC repository in Yast or <a
            href="https://en.opensuse.org/Additional_package_repositories">with
            Zypper</a> and force an update of any existing multimedia
            packages to that repository.</p>
            <p>Then we need to <a
            href="https://en.opensuse.org/SDB:Installing_codecs_from_Packman_repositories">add
            some packages</a> like:</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode zsh"><code class="sourceCode zsh"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="fu">sudo</span> zypper in gstreamer-plugins-bad-orig-addon gstreamer-plugins-ugly-orig-addon</span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a><span class="fu">sudo</span> zypper in simplescreenrecorder vlc guvcview ffmpeg</span></code></pre></div>
            <p>I’m actually not sure which package contains the H.264
            <em>gubbins</em> which Simplescreenrecorder uses because I
            installed so much crap in the process of discovery.</p>
            <h1 id="recording">Recording</h1>
            <p>I think <strong>vokoscreenNG</strong> (or
            <strong>Simplescreenrecorder</strong> and
            <strong>guvcview</strong>) are fairly self-explanatory. Just
            start them and click around or look at the
            <code>--help</code> if you can’t stand GUIs. Just note that
            <strong>vokoscreenNG</strong>/<strong>Simplescreenrecorder</strong>
            may not default to H.264.</p>
            <p>Previously on my videos I was trying to do them in one
            shot, like a live presentation. Which is good practice, but
            time consuming (and I was experiencing some crashes), so I
            now set
            <strong>Simplescreenrecorder</strong>/<strong>vokoscreenNG</strong>
            to save the video as chunks and I clip them and stitch them
            together afterwards.</p>
            <p>For webcam I simply have <del>guvcview</del>
            <strong>vokoscreenNG</strong> display the output on the
            desktop. <strong>guvcview</strong> works OK except that the
            graphics driver occasionally <em>throws a wobbly</em> due to
            a buffer being filled or something and crashes
            <strong>X</strong>. Probably I should do something about
            that…</p>
            <p>I also tried recording with <strong>ffmpeg</strong>, but
            performance seemed worse, audio and video were out of sync
            and so on. <strong>vokoscreenNG</strong> is not without
            problems either, at least not on the <strong>i3</strong>
            desktop where it is unable to select a subsection of the
            screen for recording. So I had to crop the video
            afterwards.</p>
            <h1 id="editing">Editing</h1>
            <p>I really didn’t want to <em>lay ruin upon</em> my package
            manager by installing 5 different video editing packages to
            see what worked. So I just use a combination of
            <strong>VLC</strong> to watch the video and
            <strong>ffmpeg</strong> to concatenate the clips
            together.</p>
            <h2 id="failed-attempt">Failed attempt</h2>
            <p><em>This failed because <code>inpoint</code> and
            <code>outpoint</code> had no appreciable effect.</em></p>
            <p>As befits a staple, bloated, <em>all things to all
            people</em> tool like <strong>ffmpeg</strong> there are at
            least three different ways to join some videos together. I’m
            not doing anything fancy, so I use the simplest.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode zsh"><code class="sourceCode zsh"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="fu">tee</span> vids.txt <span class="op">&lt;&lt;EOF</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="st">file vid1.mkv</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="st">inpoint 00:00:01.000</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="st">outpoint 00:00:05.000</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a><span class="st">file vid2.mkv</span></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a><span class="st">file vid3.mkv</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a><span class="st">outpoint 00:00:30.000</span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a><span class="op">EOF</span></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a><span class="ex">ffmpeg</span> <span class="at">-f</span> concat <span class="at">-i</span> vids.txt <span class="at">-c</span> copy final.mkv</span></code></pre></div>
            <p><del>I can clip the videos with <code>inpoint</code> and
            <code>outpoint</code></del>. I’m not interested in making
            everything perfect, but I think it is best not to waste too
            much of the viewer’s time with stuff that can be easily
            cut.</p>
            <h2 id="successfull-attempt">Successful attempt</h2>
            <p>The simplest way (IMO) didn’t work, so I had to resort to
            <code>-filter_complex</code>. The name is accurate; it
            appears <strong>ffmpeg</strong> implements its own stream
            processing language to connect <em>filters</em> together.
            After some time I was able to figure out how to chain the
            <code>concat</code> <em>filter</em> with
            <code>crop</code>.</p>
            <p>The command below concatenates the three input video
            segments and crops off the bottom 1% of the frame.</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode zsh"><code class="sourceCode zsh"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="ex">ffmpeg</span> <span class="at">-ss</span> 1 <span class="at">-t</span> 1:03 <span class="at">-i</span> p0.mkv <span class="dt">\</span></span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a>       <span class="at">-ss</span> 2 <span class="at">-t</span> 13:30 <span class="at">-i</span> p1.mkv <span class="dt">\</span></span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a>       <span class="at">-ss</span> 3 <span class="at">-t</span> 07:27 <span class="at">-i</span> p2.mkv <span class="dt">\</span></span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a>       <span class="at">-filter_complex</span> <span class="st">&#39;concat=a=1:n=3 [v][a]; [v] crop=in_w:in_h*0.99:0:0&#39;</span> <span class="dt">\</span></span>
<span id="cb3-5"><a href="#cb3-5" tabindex="-1"></a>       <span class="at">-map</span> <span class="st">&#39;[a]&#39;</span> <span class="dt">\</span></span>
<span id="cb3-6"><a href="#cb3-6" tabindex="-1"></a>       <span class="at">-c:a</span> libvorbis <span class="at">-c:v</span> libx264 fzsync.mkv</span></code></pre></div>
            <p>Clipping was easier to figure out, for that you just need
            to specify <code>-ss</code> and <code>-t</code> on each
            input (<code>-i</code>) file.</p>
            <p>Using the filter means we must re-encode the video, which
            is slow, but had the added benefit of halving the overall
            size. So you probably shouldn’t use the <em>copy codec</em>
            with the <em>concat demuxer</em> anyway, as it won’t give the
            best end result.</p>
            <h1 id="obs">OBS</h1>
            <p>If you are on Tumbleweed then you will still need the
            Packman repository. Also you will need to install
            <code>obs-studio-devel</code>, <code>wayland-devel</code>,
            <code>gcc</code> and <code>meson</code>.</p>
            <p>To record a Wayland session we need the <a
            href="https://hg.sr.ht/~scoopta/wlrobs">wlrobs obs
            plugin</a>. It’s relatively easy to compile and install as
            it has few dependencies. However there is no package on
            Tumbleweed as far as I know.</p>
            <p>OBS Studio is what <a
            href="https://www.ei8fdb.org/thoughts/2021/01/03/practical-advice-for-speakers-at-fosdem-2021-open-source-design-devroom/">FOSDEM
            recommends</a> speakers use to record their presentations and
            it works pretty well.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Linux socket example</title>
  <id>https://richiejp.com/linux-socket-example</id>
  <published>2021-08-31T17:09:28+01:00</published>
  <updated>2022-01-25T21:13:21Z</updated>
  <link rel="alternate" href="https://richiejp.com/linux-socket-example" />
  <summary>Intro to using sockets on Linux in C</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>First the code for a small socket example.</p>
            <div class="message is-danger">
            <div class="message-body">
            <p>All networking code is unsafe, it’s just a question of
            degree. You should start by assuming any code you find here
            is very unsafe. It’s not been tested, fuzzed, reviewed,
            formally verified nor bathed in the fire of real world
            usage. If you want to see hardened network code then you
            need to look at… real code.</p>
            </div>
            </div>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;unistd.h&gt;</span></span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;string.h&gt;</span></span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;stdio.h&gt;</span></span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;sys/socket.h&gt;</span></span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a></span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a><span class="pp">#define MSG </span><span class="st">&quot;Hello, World!&quot;</span></span>
<span id="cb1-7"><a href="#cb1-7" tabindex="-1"></a></span>
<span id="cb1-8"><a href="#cb1-8" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> argc<span class="op">,</span></span>
<span id="cb1-9"><a href="#cb1-9" tabindex="-1"></a>         <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> argv<span class="op">[])</span></span>
<span id="cb1-10"><a href="#cb1-10" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb1-11"><a href="#cb1-11" tabindex="-1"></a>    <span class="dt">char</span> read_buf<span class="op">[</span><span class="kw">sizeof</span><span class="op">(</span>MSG<span class="op">)];</span></span>
<span id="cb1-12"><a href="#cb1-12" tabindex="-1"></a>    <span class="dt">int</span> socket<span class="op">[</span><span class="dv">2</span><span class="op">];</span></span>
<span id="cb1-13"><a href="#cb1-13" tabindex="-1"></a></span>
<span id="cb1-14"><a href="#cb1-14" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> ret <span class="op">=</span></span>
<span id="cb1-15"><a href="#cb1-15" tabindex="-1"></a>        socketpair<span class="op">(</span>AF_UNIX<span class="op">,</span> SOCK_STREAM<span class="op">,</span></span>
<span id="cb1-16"><a href="#cb1-16" tabindex="-1"></a>               <span class="dv">0</span><span class="op">,</span> socket<span class="op">);</span></span>
<span id="cb1-17"><a href="#cb1-17" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>ret <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb1-18"><a href="#cb1-18" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;socketpair&quot;</span><span class="op">);</span></span>
<span id="cb1-19"><a href="#cb1-19" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb1-20"><a href="#cb1-20" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb1-21"><a href="#cb1-21" tabindex="-1"></a></span>
<span id="cb1-22"><a href="#cb1-22" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">ssize_t</span> write_len <span class="op">=</span></span>
<span id="cb1-23"><a href="#cb1-23" tabindex="-1"></a>        write<span class="op">(</span>socket<span class="op">[</span><span class="dv">0</span><span class="op">],</span> MSG<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>MSG<span class="op">));</span></span>
<span id="cb1-24"><a href="#cb1-24" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>write_len <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb1-25"><a href="#cb1-25" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;write&quot;</span><span class="op">);</span></span>
<span id="cb1-26"><a href="#cb1-26" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb1-27"><a href="#cb1-27" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb1-28"><a href="#cb1-28" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;Wrote </span><span class="sc">%zd</span><span class="st"> of </span><span class="sc">%zu</span><span class="st"> bytes</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span></span>
<span id="cb1-29"><a href="#cb1-29" tabindex="-1"></a>           write_len<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>MSG<span class="op">));</span></span>
<span id="cb1-30"><a href="#cb1-30" tabindex="-1"></a></span>
<span id="cb1-31"><a href="#cb1-31" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">ssize_t</span> read_len <span class="op">=</span></span>
<span id="cb1-32"><a href="#cb1-32" tabindex="-1"></a>        read<span class="op">(</span>socket<span class="op">[</span><span class="dv">1</span><span class="op">],</span> read_buf<span class="op">,</span></span>
<span id="cb1-33"><a href="#cb1-33" tabindex="-1"></a>             <span class="kw">sizeof</span><span class="op">(</span>read_buf<span class="op">));</span></span>
<span id="cb1-34"><a href="#cb1-34" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>read_len <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb1-35"><a href="#cb1-35" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;read&quot;</span><span class="op">);</span></span>
<span id="cb1-36"><a href="#cb1-36" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb1-37"><a href="#cb1-37" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb1-38"><a href="#cb1-38" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;Read </span><span class="sc">%zd</span><span class="st"> of </span><span class="sc">%zu</span><span class="st"> bytes</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span></span>
<span id="cb1-39"><a href="#cb1-39" tabindex="-1"></a>           read_len<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>MSG<span class="op">));</span></span>
<span id="cb1-40"><a href="#cb1-40" tabindex="-1"></a></span>
<span id="cb1-41"><a href="#cb1-41" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb1-42"><a href="#cb1-42" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>You may copy this to a file <code>sockets.c</code>, then
            compile and run it as follows.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a>$ gcc <span class="op">-</span>Wall <span class="op">-</span>pedantic sockets<span class="op">.</span>c <span class="op">-</span>o sockets</span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a>$ <span class="op">./</span>sockets</span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a>Wrote <span class="dv">14</span> of <span class="dv">14</span> bytes</span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a>Read <span class="dv">14</span> of <span class="dv">14</span> bytes</span></code></pre></div>
            <h1 id="preamble">Preamble</h1>
            <p>Now let me tell you about <em>man pages</em>. On almost
            any Linux distribution you can type the following in a
            terminal.</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="ex">$</span> man 2 socketpair</span></code></pre></div>
            <p>Or often just <code>man socketpair</code> will do.
            Exactly what happens in this case is dependent on your
            distribution. There are things, such as Emacs’s helm mode or
            <code>man -k</code>, which can search the man pages. This
            is useful if, like myself, you are irritated by searching
            the web.</p>
            <p>Most <em>system calls</em> are documented in man pages.
            They are not always accurate, complete or easy to read.
            However it is expected that Linux (and POSIX) behave the way
            the man pages describe.</p>
            <p><code>socketpair</code> is a system call. System calls
            are how the user, in user land, tells the Linux kernel, in
            kernel land, to do something. Usually kernel land is where
            the <em>network stack</em> and sockets live. In user land we
            are just given an ID number, a <em>file descriptor</em>,
            representing the socket. We never interact with the socket
            ‘object’<a href="#fn1" class="footnote-ref"
            id="fnref1"><sup>1</sup></a> directly.</p>
            <p>Usually system calls are required to issue commands to
            the kernel. These are like function calls in C except that
            they cause a <em>context switch</em>. That is, a switch
            between user land context and kernel context. Exactly what
            that entails changes with every kernel version, hardware
            architecture and configuration.</p>
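            <p>To make the comparison with a function call a bit more
            concrete, glibc has a generic <code>syscall</code> function
            (see <code>man 2 syscall</code>) which issues a system call by
            number. The sketch below is not part of the example program;
            <code>raw_write</code> is a hypothetical helper name and I am
            assuming Linux with glibc.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to perform a write by system call
 * number, instead of going through the libc write() wrapper.
 * Returns the number of bytes written, or -1 with errno set.
 */
static long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```
]]></code></pre></div>
            <p>Calling <code>raw_write(1, MSG, sizeof(MSG))</code> should
            have the same effect as the equivalent <code>write</code>
            call. In normal code the libc wrappers are the better
            choice.</p>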
            <p>You can find out more with <code>man 2 syscalls</code>.
            More importantly right now, there is a useful tool for
            tracking system calls.</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a>$ strace <span class="op">-</span>e read<span class="op">,</span>write<span class="op">,</span>socketpair <span class="op">./</span>sockets <span class="op">&gt;/</span>dev<span class="op">/</span>null</span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a>read<span class="op">(</span><span class="dv">3</span><span class="op">,</span> <span class="st">&quot;</span><span class="er">\1</span><span class="st">77ELF</span><span class="er">\2\1\1\3</span><span class="sc">\0\0\0\0\0\0\0\0</span><span class="er">\3</span><span class="sc">\0</span><span class="st">&gt;</span><span class="sc">\0</span><span class="er">\1</span><span class="sc">\0\0\0</span><span class="st">p|</span><span class="er">\2</span><span class="sc">\0\0\0\0\0</span><span class="st">&quot;</span><span class="op">...,</span> <span class="dv">832</span><span class="op">)</span> <span class="op">=</span> <span class="dv">832</span></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a>socketpair<span class="op">(</span>AF_UNIX<span class="op">,</span> SOCK_STREAM<span class="op">,</span> <span class="dv">0</span><span class="op">,</span> <span class="op">[</span><span class="dv">3</span><span class="op">,</span> <span class="dv">4</span><span class="op">])</span> <span class="op">=</span> <span class="dv">0</span></span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a>write<span class="op">(</span><span class="dv">3</span><span class="op">,</span> <span class="st">&quot;Hello, World!</span><span class="sc">\0</span><span class="st">&quot;</span><span class="op">,</span> <span class="dv">14</span><span class="op">)</span>         <span class="op">=</span> <span class="dv">14</span></span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a>read<span class="op">(</span><span class="dv">4</span><span class="op">,</span> <span class="st">&quot;Hello, World!</span><span class="sc">\0</span><span class="st">&quot;</span><span class="op">,</span> <span class="dv">14</span><span class="op">)</span>          <span class="op">=</span> <span class="dv">14</span></span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a>write<span class="op">(</span><span class="dv">1</span><span class="op">,</span> <span class="st">&quot;Wrote 14 of 14 bytes</span><span class="sc">\n</span><span class="st">Read 14 of &quot;</span><span class="op">...,</span> <span class="dv">41</span><span class="op">)</span> <span class="op">=</span> <span class="dv">41</span></span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a><span class="op">+++</span> exited with <span class="dv">0</span> <span class="op">+++</span></span></code></pre></div>
            <p>On SUSE this can be installed with
            <code>zypper in strace</code>. It is probably similar on
            other distributions.</p>
            <p>The above command prints the system calls that our
            <code>sockets</code> program makes. The <code>-e</code> flag
            filters out all calls except <code>read</code>,
            <code>write</code> and <code>socketpair</code>. The first
            <code>read</code> call is loading the <code>libc</code>
            library and can be ignored. Try running <code>strace</code>
            with no filter. It can be seen that the system call trace
            does not exactly match the source code.</p>
            <p>Calls to <code>read</code> and <code>write</code> take a
            file descriptor (FD) as the first argument. This is an index
            number for a row in the FD table. Each process has its own
            FD table. This is managed by the kernel; we can’t access the
            table directly, only via system calls.</p>
            <p>Sockets are not files; the name “file descriptor” is
            historical. Lots of things can be represented by an FD. This
            includes, but is not limited to, files and sockets. The
            above program will have an FD table similar to the one below
            by the end.</p>
            <table>
            <thead>
            <tr class="header">
            <th align="left">ID</th>
            <th align="left">Description</th>
            </tr>
            </thead>
            <tbody>
            <tr class="odd">
            <td align="left">0</td>
            <td align="left">stdin</td>
            </tr>
            <tr class="even">
            <td align="left">1</td>
            <td align="left">stdout</td>
            </tr>
            <tr class="odd">
            <td align="left">2</td>
            <td align="left">stderr</td>
            </tr>
            <tr class="even">
            <td align="left">3</td>
            <td align="left">UNIX socket 0</td>
            </tr>
            <tr class="odd">
            <td align="left">4</td>
            <td align="left">UNIX socket 1</td>
            </tr>
            </tbody>
            </table>
            <p>You can inspect a program’s FDs either by looking in
            <code>/proc/</code> or using <code>lsof</code>. Programs
            like <code>netstat</code> and <code>ss</code> can display
            more socket specific information.</p>
            <h1 id="unix">UNIX</h1>
            <p>Sockets are an interface centered around the socket
            ‘object’. As sockets live in the kernel, we are just given a
            <em>file descriptor</em> as a reference to a socket. Usually
            sockets are used to send and receive data over a network.
            However in the example above we are not sending data over a
            network. Just from a process to one or more buffers in the
            kernel and back again.</p>
            <p>Usually <code>socketpair</code> is used in a program
            which forks a child process. Let’s make the example above a
            little more realistic by creating a child process.</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> child_proc<span class="op">(</span><span class="dt">int</span> socket<span class="op">)</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a>    <span class="dt">char</span> read_buf<span class="op">[</span><span class="kw">sizeof</span><span class="op">(</span>MSG<span class="op">)];</span></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">ssize_t</span> read_len <span class="op">=</span></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a>        read<span class="op">(</span>socket<span class="op">,</span> read_buf<span class="op">,</span></span>
<span id="cb5-9"><a href="#cb5-9" tabindex="-1"></a>             <span class="kw">sizeof</span><span class="op">(</span>read_buf<span class="op">));</span></span>
<span id="cb5-10"><a href="#cb5-10" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>read_len <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-11"><a href="#cb5-11" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;read&quot;</span><span class="op">);</span></span>
<span id="cb5-12"><a href="#cb5-12" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb5-13"><a href="#cb5-13" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb5-14"><a href="#cb5-14" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;Read </span><span class="sc">%zd</span><span class="st"> of </span><span class="sc">%zu</span><span class="st"> bytes</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span></span>
<span id="cb5-15"><a href="#cb5-15" tabindex="-1"></a>           read_len<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>MSG<span class="op">));</span></span>
<span id="cb5-16"><a href="#cb5-16" tabindex="-1"></a></span>
<span id="cb5-17"><a href="#cb5-17" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb5-18"><a href="#cb5-18" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-19"><a href="#cb5-19" tabindex="-1"></a></span>
<span id="cb5-20"><a href="#cb5-20" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> argc<span class="op">,</span></span>
<span id="cb5-21"><a href="#cb5-21" tabindex="-1"></a>         <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> argv<span class="op">[])</span></span>
<span id="cb5-22"><a href="#cb5-22" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-23"><a href="#cb5-23" tabindex="-1"></a>    <span class="dt">int</span> socket<span class="op">[</span><span class="dv">2</span><span class="op">];</span></span>
<span id="cb5-24"><a href="#cb5-24" tabindex="-1"></a></span>
<span id="cb5-25"><a href="#cb5-25" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> ret <span class="op">=</span></span>
<span id="cb5-26"><a href="#cb5-26" tabindex="-1"></a>        socketpair<span class="op">(</span>AF_UNIX<span class="op">,</span> SOCK_STREAM<span class="op">,</span></span>
<span id="cb5-27"><a href="#cb5-27" tabindex="-1"></a>               <span class="dv">0</span><span class="op">,</span> socket<span class="op">);</span></span>
<span id="cb5-28"><a href="#cb5-28" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>ret <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-29"><a href="#cb5-29" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;socketpair&quot;</span><span class="op">);</span></span>
<span id="cb5-30"><a href="#cb5-30" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb5-31"><a href="#cb5-31" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb5-32"><a href="#cb5-32" tabindex="-1"></a></span>
<span id="cb5-33"><a href="#cb5-33" tabindex="-1"></a>    <span class="dt">const</span> pid_t child_pid <span class="op">=</span> fork<span class="op">();</span></span>
<span id="cb5-34"><a href="#cb5-34" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>child_pid <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-35"><a href="#cb5-35" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;fork&quot;</span><span class="op">);</span></span>
<span id="cb5-36"><a href="#cb5-36" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb5-37"><a href="#cb5-37" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb5-38"><a href="#cb5-38" tabindex="-1"></a></span>
<span id="cb5-39"><a href="#cb5-39" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>child_pid<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-40"><a href="#cb5-40" tabindex="-1"></a>        close<span class="op">(</span>socket<span class="op">[</span><span class="dv">0</span><span class="op">]);</span></span>
<span id="cb5-41"><a href="#cb5-41" tabindex="-1"></a>        <span class="cf">return</span> child_proc<span class="op">(</span>socket<span class="op">[</span><span class="dv">1</span><span class="op">]);</span></span>
<span id="cb5-42"><a href="#cb5-42" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb5-43"><a href="#cb5-43" tabindex="-1"></a></span>
<span id="cb5-44"><a href="#cb5-44" tabindex="-1"></a>    close<span class="op">(</span>socket<span class="op">[</span><span class="dv">1</span><span class="op">]);</span></span>
<span id="cb5-45"><a href="#cb5-45" tabindex="-1"></a></span>
<span id="cb5-46"><a href="#cb5-46" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">ssize_t</span> write_len <span class="op">=</span></span>
<span id="cb5-47"><a href="#cb5-47" tabindex="-1"></a>        write<span class="op">(</span>socket<span class="op">[</span><span class="dv">0</span><span class="op">],</span> MSG<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>MSG<span class="op">));</span></span>
<span id="cb5-48"><a href="#cb5-48" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>write_len <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-49"><a href="#cb5-49" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;write&quot;</span><span class="op">);</span></span>
<span id="cb5-50"><a href="#cb5-50" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb5-51"><a href="#cb5-51" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb5-52"><a href="#cb5-52" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;Wrote </span><span class="sc">%zd</span><span class="st"> of </span><span class="sc">%zu</span><span class="st"> bytes</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span></span>
<span id="cb5-53"><a href="#cb5-53" tabindex="-1"></a>           write_len<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>MSG<span class="op">));</span></span>
<span id="cb5-54"><a href="#cb5-54" tabindex="-1"></a></span>
<span id="cb5-55"><a href="#cb5-55" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb5-56"><a href="#cb5-56" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>When a process forks (with <code>fork()</code>), the
            child receives a copy of the parent’s file descriptor table.
            So we can call <code>socketpair</code> before forking to
            create a pair of connected sockets, then assign one to each
            process by closing one end in the child and the other in the
            parent. Closing the unused ends avoids confusion, but it is
            possible to leave both open.</p>
            <p>Again we can run <code>strace</code> on this program to
            see what is happening. However an extra flag is needed
            (<code>-f</code>) to see what the child process does.</p>
            <div class="sourceCode" id="cb6"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb6-1"><a href="#cb6-1" tabindex="-1"></a>$ strace <span class="op">-</span>f <span class="op">-</span>e read<span class="op">,</span>write<span class="op">,</span>socketpair<span class="op">,</span>close<span class="op">,</span>clone <span class="op">~/</span>c<span class="op">/</span>scratch<span class="op">/</span>sockets <span class="op">&gt;</span> <span class="op">/</span>dev<span class="op">/</span>null</span>
<span id="cb6-2"><a href="#cb6-2" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb6-3"><a href="#cb6-3" tabindex="-1"></a>socketpair<span class="op">(</span>AF_UNIX<span class="op">,</span> SOCK_STREAM<span class="op">,</span> <span class="dv">0</span><span class="op">,</span> <span class="op">[</span><span class="dv">3</span><span class="op">,</span> <span class="dv">4</span><span class="op">])</span> <span class="op">=</span> <span class="dv">0</span></span>
<span id="cb6-4"><a href="#cb6-4" tabindex="-1"></a>clone<span class="op">(</span>child_stack<span class="op">=</span>NULL<span class="op">,</span> flags<span class="op">=</span>CLONE_CHILD_CLEARTID<span class="op">|</span>CLONE_CHILD_SETTID<span class="op">|</span>SIGCHLD<span class="op">,</span> child_tidptr<span class="op">=</span><span class="bn">0x7fd2148ad850</span><span class="op">)</span> <span class="op">=</span> <span class="dv">12370</span></span>
<span id="cb6-5"><a href="#cb6-5" tabindex="-1"></a>strace<span class="op">:</span> Process <span class="dv">12370</span> attached</span>
<span id="cb6-6"><a href="#cb6-6" tabindex="-1"></a><span class="op">[</span>pid <span class="dv">12369</span><span class="op">]</span> close<span class="op">(</span><span class="dv">4</span><span class="op">)</span>                    <span class="op">=</span> <span class="dv">0</span></span>
<span id="cb6-7"><a href="#cb6-7" tabindex="-1"></a><span class="op">[</span>pid <span class="dv">12369</span><span class="op">]</span> write<span class="op">(</span><span class="dv">3</span><span class="op">,</span> <span class="st">&quot;Hello, World!</span><span class="sc">\0</span><span class="st">&quot;</span><span class="op">,</span> <span class="dv">14</span> <span class="op">&lt;</span>unfinished <span class="op">...&gt;</span></span>
<span id="cb6-8"><a href="#cb6-8" tabindex="-1"></a><span class="op">[</span>pid <span class="dv">12370</span><span class="op">]</span> close<span class="op">(</span><span class="dv">3</span> <span class="op">&lt;</span>unfinished <span class="op">...&gt;</span></span>
<span id="cb6-9"><a href="#cb6-9" tabindex="-1"></a><span class="op">[</span>pid <span class="dv">12369</span><span class="op">]</span> <span class="op">&lt;...</span> write resumed<span class="op">&gt;)</span>        <span class="op">=</span> <span class="dv">14</span></span>
<span id="cb6-10"><a href="#cb6-10" tabindex="-1"></a><span class="op">[</span>pid <span class="dv">12370</span><span class="op">]</span> <span class="op">&lt;...</span> close resumed<span class="op">&gt;)</span>        <span class="op">=</span> <span class="dv">0</span></span>
<span id="cb6-11"><a href="#cb6-11" tabindex="-1"></a><span class="op">[</span>pid <span class="dv">12370</span><span class="op">]</span> read<span class="op">(</span><span class="dv">4</span><span class="op">,</span> <span class="st">&quot;Hello, World!</span><span class="sc">\0</span><span class="st">&quot;</span><span class="op">,</span> <span class="dv">14</span><span class="op">)</span> <span class="op">=</span> <span class="dv">14</span></span>
<span id="cb6-12"><a href="#cb6-12" tabindex="-1"></a><span class="op">[</span>pid <span class="dv">12369</span><span class="op">]</span> write<span class="op">(</span><span class="dv">1</span><span class="op">,</span> <span class="st">&quot;Wrote 14 of 14 bytes</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="dv">21</span><span class="op">)</span> <span class="op">=</span> <span class="dv">21</span></span>
<span id="cb6-13"><a href="#cb6-13" tabindex="-1"></a><span class="op">[</span>pid <span class="dv">12370</span><span class="op">]</span> write<span class="op">(</span><span class="dv">1</span><span class="op">,</span> <span class="st">&quot;Read 14 of 14 bytes</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="dv">20</span><span class="op">)</span> <span class="op">=</span> <span class="dv">20</span></span>
<span id="cb6-14"><a href="#cb6-14" tabindex="-1"></a><span class="op">[</span>pid <span class="dv">12369</span><span class="op">]</span> <span class="op">+++</span> exited with <span class="dv">0</span> <span class="op">+++</span></span>
<span id="cb6-15"><a href="#cb6-15" tabindex="-1"></a><span class="op">+++</span> exited with <span class="dv">0</span> <span class="op">+++</span></span></code></pre></div>
            <p>The output of <code>strace</code> is becoming more
            confusing. Our call to <code>fork</code> actually resulted
            in a call to <code>clone</code>. Also, because some system
            calls were executed in parallel, they interrupt each other’s
            log messages. You may wish to try playing with the
            <code>strace</code> options to see what information can be
            revealed.</p>
            <p>There are many different <em>socket families</em> which
            support various <em>types</em> of socket and
            <em>protocols</em>. Additionally there are many socket
            options. These change the operations (system calls)
            available and their behaviour. These changes are significant
            and can be surprising.</p>
            <p>Currently we are using the stream type of UNIX socket.
            These are also known as local sockets, because they only
            allow communication between processes on the same machine.
            As usual there is a man page (<code>man 7 unix</code>).</p>
            <p>The way we are currently using UNIX sockets is almost
            identical to a pipe (<code>man 2 pipe</code>). Indeed, to use
            a pipe all we need to do is replace
            <code>socketpair()</code> with <code>pipe()</code> then swap
            the FD numbers. Unlike a UNIX socket, a pipe is
            unidirectional, so we need to read and write to the correct
            FD. There are many other subtle differences. However, we are
            unlikely to notice them with our simple program.</p>
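            <p>As a rough sketch of that substitution, the function
            below sends a short message from parent to child over a
            pipe. The name <code>pipe_to_child</code>, the 64 byte
            buffer and the use of the child’s exit status to report
            success are all assumptions made for illustration. Note that
            the parent writes to <code>fds[1]</code> and the child reads
            from <code>fds[0]</code>; with a pipe these roles are
            fixed.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;sys/types.h&gt;
#include &lt;sys/wait.h&gt;
#include &lt;unistd.h&gt;

/* Hypothetical helper, not part of the original program: send len
 * bytes (at most 64) to a forked child over a pipe. Returns 0 if the
 * child read exactly len bytes, -1 otherwise. */
static int pipe_to_child(const char *const msg, const size_t len)
{
    int fds[2];
    int status;

    if (pipe(fds) &lt; 0)
        return -1;

    const pid_t child_pid = fork();
    if (child_pid &lt; 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (!child_pid) {
        char buf[64];

        /* The child only reads, so it keeps fds[0] */
        close(fds[1]);
        const ssize_t read_len = read(fds[0], buf, sizeof(buf));
        _exit(read_len == (ssize_t)len ? 0 : 1);
    }

    /* The parent only writes, so it keeps fds[1] */
    close(fds[0]);
    const ssize_t write_len = write(fds[1], msg, len);
    close(fds[1]);

    if (waitpid(child_pid, &amp;status, 0) &lt; 0)
        return -1;
    if (write_len != (ssize_t)len)
        return -1;

    return WIFEXITED(status) &amp;&amp; WEXITSTATUS(status) == 0 ? 0 : -1;
}</code></pre></div>
            <p>Calling <code>pipe_to_child("Hello, World!", 14)</code>
            should return 0 on a typical Linux system.</p>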
            <p>As well as being bidirectional there are other things a
            UNIX stream socket can do. For one thing we can use the
            <code>send</code>, <code>recv</code>, <code>sendmsg</code>
            and <code>recvmsg</code> interfaces. Before continuing, you
            may wish to convert the program to use these yourself.</p>
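            <p>One thing these interfaces add is a <em>flags</em>
            argument. As a small illustration, the sketch below uses
            <code>recv</code> with <code>MSG_PEEK</code> to look at
            queued data without consuming it, then calls
            <code>recv</code> again to actually take it. The helper name
            <code>peek_then_recv</code> is made up for this example; it
            is only one of many things the flags can do.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;sys/socket.h&gt;
#include &lt;sys/types.h&gt;

/* Hypothetical helper: peek at the data queued on sk, then consume
 * it. Returns the number of bytes consumed, or -1 on error. The byte
 * count seen by the MSG_PEEK call is stored in *peeked. */
static ssize_t peek_then_recv(const int sk, void *const buf,
                              const size_t len, ssize_t *const peeked)
{
    /* MSG_PEEK copies the data out, but leaves it in the queue */
    *peeked = recv(sk, buf, len, MSG_PEEK);
    if (*peeked &lt; 0)
        return -1;

    /* Without MSG_PEEK the same data is returned and removed */
    return recv(sk, buf, len, 0);
}</code></pre></div>
            <p>With the socket pair from the program above, sending the
            14 byte message with <code>send(socket[0], MSG,
            sizeof(MSG), 0)</code> and then calling
            <code>peek_then_recv</code> on <code>socket[1]</code> should
            report 14 bytes both times.</p>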
            <p>Something to note is that we only send a very small
            amount of data. We also don’t interrupt our program with
            signals. So <code>read</code> and <code>write</code> are
            likely to receive or send the full amount. However, in
            general, there is no guarantee they will <code>read</code>
            or <code>write</code> the full amount. This means the above
            programs are technically incorrect.</p>
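            <p>A common way to deal with this is to loop until
            everything has been transferred. The following is a minimal
            sketch of that idea, not a complete solution; the names
            <code>write_all</code> and <code>read_all</code> are
            invented here. It retries when a call is interrupted by a
            signal (<code>EINTR</code>) and continues after a short read
            or write. It assumes a blocking file descriptor.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;errno.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;unistd.h&gt;

/* Hypothetical helper: keep calling write until len bytes are sent.
 * Returns len on success or -1 on error. */
static ssize_t write_all(const int fd, const void *const buf,
                         const size_t len)
{
    const char *const p = buf;
    size_t done = 0;

    while (done &lt; len) {
        const ssize_t n = write(fd, p + done, len - done);

        if (n &lt; 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal, try again */
            return -1;
        }
        done += (size_t)n;
    }

    return (ssize_t)done;
}

/* Hypothetical helper: keep calling read until len bytes arrive or
 * the peer closes. Returns the bytes read or -1 on error. */
static ssize_t read_all(const int fd, void *const buf,
                        const size_t len)
{
    char *const p = buf;
    size_t done = 0;

    while (done &lt; len) {
        const ssize_t n = read(fd, p + done, len - done);

        if (n &lt; 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break; /* end of stream, the other end was closed */
        done += (size_t)n;
    }

    return (ssize_t)done;
}</code></pre></div>
            <p>Note that <code>read_all</code> only makes sense when we
            know how many bytes to expect, which is true for our fixed
            size message.</p>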
            <p>Next let’s start using sockets capable of remote
            communication.</p>
            <h1 id="udp">UDP</h1>
            <p>User Datagram Protocol allows us to send packets
            (datagrams) to an IP address. We do not need to set up a
            <em>connection</em>; we can send and receive packets
            immediately. Usually, though, one participant needs to
            <code>bind</code> to a known address and port. Otherwise a
            port and address are chosen automatically, but remote peers
            won’t know what these are until we message them.</p>
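            <p>To see which port was chosen we can ask with
            <code>getsockname</code>. The sketch below binds a UDP
            socket to port 0 on the loopback address, which requests an
            automatically chosen port, then reads the result back. The
            helper name <code>udp_bind_any_port</code> is made up for
            this example, and the behaviour described is what I would
            expect on Linux.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;netinet/in.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/socket.h&gt;
#include &lt;unistd.h&gt;

/* Hypothetical helper: create a UDP socket bound to the loopback
 * address and let the kernel pick the port. Returns the socket, or
 * -1 on error. The chosen port is stored in *port in host byte
 * order. */
static int udp_bind_any_port(unsigned short *const port)
{
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof(addr);
    const int sk = socket(AF_INET, SOCK_DGRAM, 0);

    if (sk &lt; 0)
        return -1;

    memset(&amp;addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0); /* 0 means "choose one for me" */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(sk, (struct sockaddr *)&amp;addr, sizeof(addr)) &lt; 0 ||
        getsockname(sk, (struct sockaddr *)&amp;addr, &amp;addr_len) &lt; 0) {
        close(sk);
        return -1;
    }

    *port = ntohs(addr.sin_port);

    return sk;
}</code></pre></div>
            <p>The port will usually differ on each run, which is why a
            peer can’t know it in advance.</p>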
            <p>Of course there are connections. However, these are
            maintained by lower parts of the stack, such as the IP, ARP
            and Ethernet layers. Our program usually doesn’t need to set
            these up. We just aim a packet at an IP address, send it and
            hope it is routed to the correct location.</p>
            <p>UDP is not reliable; it will happily let us send messages
            to a location that doesn’t exist. The program below is also
            unreliable and contains a race condition; note the
            <code>usleep</code>.</p>
            <div class="sourceCode" id="cb7"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb7-1"><a href="#cb7-1" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;errno.h&gt;</span></span>
<span id="cb7-2"><a href="#cb7-2" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;unistd.h&gt;</span></span>
<span id="cb7-3"><a href="#cb7-3" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;string.h&gt;</span></span>
<span id="cb7-4"><a href="#cb7-4" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;ctype.h&gt;</span></span>
<span id="cb7-5"><a href="#cb7-5" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;stdlib.h&gt;</span></span>
<span id="cb7-6"><a href="#cb7-6" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;stdio.h&gt;</span></span>
<span id="cb7-7"><a href="#cb7-7" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;sys/socket.h&gt;</span></span>
<span id="cb7-8"><a href="#cb7-8" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;sys/types.h&gt;</span></span>
<span id="cb7-9"><a href="#cb7-9" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;sys/wait.h&gt;</span></span>
<span id="cb7-10"><a href="#cb7-10" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;sys/uio.h&gt;</span></span>
<span id="cb7-11"><a href="#cb7-11" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;netinet/in.h&gt;</span></span>
<span id="cb7-12"><a href="#cb7-12" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;netinet/udp.h&gt;</span></span>
<span id="cb7-13"><a href="#cb7-13" tabindex="-1"></a></span>
<span id="cb7-14"><a href="#cb7-14" tabindex="-1"></a><span class="pp">#define PING </span><span class="st">&quot;PING&quot;</span></span>
<span id="cb7-15"><a href="#cb7-15" tabindex="-1"></a><span class="pp">#define PONG </span><span class="st">&quot;PONG&quot;</span></span>
<span id="cb7-16"><a href="#cb7-16" tabindex="-1"></a></span>
<span id="cb7-17"><a href="#cb7-17" tabindex="-1"></a><span class="pp">#define PONG_ADDR </span><span class="op">{</span><span class="pp">         </span><span class="op">\</span></span>
<span id="cb7-18"><a href="#cb7-18" tabindex="-1"></a><span class="pp">    </span><span class="op">.</span><span class="pp">sin_family </span><span class="op">=</span><span class="pp"> AF_INET</span><span class="op">,</span><span class="pp">      </span><span class="op">\</span></span>
<span id="cb7-19"><a href="#cb7-19" tabindex="-1"></a><span class="pp">    </span><span class="op">.</span><span class="pp">sin_port </span><span class="op">=</span><span class="pp"> htons</span><span class="op">(</span><span class="dv">21000</span><span class="op">),</span><span class="pp">   </span><span class="op">\</span></span>
<span id="cb7-20"><a href="#cb7-20" tabindex="-1"></a><span class="pp">    </span><span class="op">.</span><span class="pp">sin_addr </span><span class="op">=</span><span class="pp"> </span><span class="op">{</span><span class="pp">           </span><span class="op">\</span></span>
<span id="cb7-21"><a href="#cb7-21" tabindex="-1"></a><span class="pp">        htonl</span><span class="op">(</span><span class="pp">INADDR_LOOPBACK</span><span class="op">)</span><span class="pp">  </span><span class="op">\</span></span>
<span id="cb7-22"><a href="#cb7-22" tabindex="-1"></a><span class="pp">    </span><span class="op">}</span><span class="pp">               </span><span class="op">\</span></span>
<span id="cb7-23"><a href="#cb7-23" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-24"><a href="#cb7-24" tabindex="-1"></a></span>
<span id="cb7-25"><a href="#cb7-25" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> udp_socket<span class="op">(</span><span class="dt">void</span><span class="op">)</span></span>
<span id="cb7-26"><a href="#cb7-26" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb7-27"><a href="#cb7-27" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> sk <span class="op">=</span></span>
<span id="cb7-28"><a href="#cb7-28" tabindex="-1"></a>        socket<span class="op">(</span>AF_INET<span class="op">,</span> SOCK_DGRAM<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb7-29"><a href="#cb7-29" tabindex="-1"></a></span>
<span id="cb7-30"><a href="#cb7-30" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>sk <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb7-31"><a href="#cb7-31" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;socket&quot;</span><span class="op">);</span></span>
<span id="cb7-32"><a href="#cb7-32" tabindex="-1"></a>        exit<span class="op">(</span><span class="dv">1</span><span class="op">);</span></span>
<span id="cb7-33"><a href="#cb7-33" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb7-34"><a href="#cb7-34" tabindex="-1"></a></span>
<span id="cb7-35"><a href="#cb7-35" tabindex="-1"></a>    <span class="cf">return</span> sk<span class="op">;</span></span>
<span id="cb7-36"><a href="#cb7-36" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-37"><a href="#cb7-37" tabindex="-1"></a></span>
<span id="cb7-38"><a href="#cb7-38" tabindex="-1"></a><span class="dt">static</span> <span class="dt">ssize_t</span> udp_recvfrom<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> sk<span class="op">,</span></span>
<span id="cb7-39"><a href="#cb7-39" tabindex="-1"></a>                <span class="dt">const</span> <span class="kw">struct</span> iovec <span class="op">*</span><span class="dt">const</span> iov<span class="op">,</span></span>
<span id="cb7-40"><a href="#cb7-40" tabindex="-1"></a>                <span class="dt">const</span> <span class="kw">struct</span> sockaddr_in <span class="op">*</span><span class="dt">const</span> addr<span class="op">)</span></span>
<span id="cb7-41"><a href="#cb7-41" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb7-42"><a href="#cb7-42" tabindex="-1"></a>    socklen_t addr_len <span class="op">=</span> <span class="kw">sizeof</span><span class="op">(*</span>addr<span class="op">);</span></span>
<span id="cb7-43"><a href="#cb7-43" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">ssize_t</span> recv_len <span class="op">=</span></span>
<span id="cb7-44"><a href="#cb7-44" tabindex="-1"></a>        recvfrom<span class="op">(</span>sk<span class="op">,</span></span>
<span id="cb7-45"><a href="#cb7-45" tabindex="-1"></a>             iov<span class="op">-&gt;</span>iov_base<span class="op">,</span></span>
<span id="cb7-46"><a href="#cb7-46" tabindex="-1"></a>             iov<span class="op">-&gt;</span>iov_len <span class="op">-</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb7-47"><a href="#cb7-47" tabindex="-1"></a>             <span class="dv">0</span><span class="op">,</span></span>
<span id="cb7-48"><a href="#cb7-48" tabindex="-1"></a>             <span class="op">(</span><span class="kw">struct</span> sockaddr <span class="op">*)</span>addr<span class="op">,</span></span>
<span id="cb7-49"><a href="#cb7-49" tabindex="-1"></a>             addr <span class="op">?</span> <span class="op">&amp;</span>addr_len <span class="op">:</span> NULL<span class="op">);</span></span>
<span id="cb7-50"><a href="#cb7-50" tabindex="-1"></a></span>
<span id="cb7-51"><a href="#cb7-51" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>recv_len <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb7-52"><a href="#cb7-52" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;recvfrom&quot;</span><span class="op">);</span></span>
<span id="cb7-53"><a href="#cb7-53" tabindex="-1"></a>        exit<span class="op">(</span><span class="dv">1</span><span class="op">);</span></span>
<span id="cb7-54"><a href="#cb7-54" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb7-55"><a href="#cb7-55" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>addr_len <span class="op">!=</span> <span class="kw">sizeof</span><span class="op">(*</span>addr<span class="op">))</span> <span class="op">{</span></span>
<span id="cb7-56"><a href="#cb7-56" tabindex="-1"></a>        printf<span class="op">(</span><span class="st">&quot;address is not expected size</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb7-57"><a href="#cb7-57" tabindex="-1"></a>        exit<span class="op">(</span><span class="dv">1</span><span class="op">);</span></span>
<span id="cb7-58"><a href="#cb7-58" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb7-59"><a href="#cb7-59" tabindex="-1"></a></span>
<span id="cb7-60"><a href="#cb7-60" tabindex="-1"></a>    <span class="op">((</span><span class="dt">char</span> <span class="op">*)</span>iov<span class="op">-&gt;</span>iov_base<span class="op">)[</span>recv_len<span class="op">]</span> <span class="op">=</span> <span class="ch">&#39;</span><span class="sc">\0</span><span class="ch">&#39;</span><span class="op">;</span></span>
<span id="cb7-61"><a href="#cb7-61" tabindex="-1"></a></span>
<span id="cb7-62"><a href="#cb7-62" tabindex="-1"></a>    <span class="cf">return</span> recv_len<span class="op">;</span></span>
<span id="cb7-63"><a href="#cb7-63" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-64"><a href="#cb7-64" tabindex="-1"></a></span>
<span id="cb7-65"><a href="#cb7-65" tabindex="-1"></a><span class="dt">static</span> <span class="dt">ssize_t</span> udp_sendto<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> sk<span class="op">,</span></span>
<span id="cb7-66"><a href="#cb7-66" tabindex="-1"></a>              <span class="dt">const</span> <span class="kw">struct</span> iovec <span class="op">*</span><span class="dt">const</span> iov<span class="op">,</span></span>
<span id="cb7-67"><a href="#cb7-67" tabindex="-1"></a>              <span class="dt">const</span> <span class="kw">struct</span> sockaddr_in <span class="op">*</span><span class="dt">const</span> addr<span class="op">)</span></span>
<span id="cb7-68"><a href="#cb7-68" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb7-69"><a href="#cb7-69" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">ssize_t</span> send_len <span class="op">=</span></span>
<span id="cb7-70"><a href="#cb7-70" tabindex="-1"></a>        sendto<span class="op">(</span>sk<span class="op">,</span></span>
<span id="cb7-71"><a href="#cb7-71" tabindex="-1"></a>               iov<span class="op">-&gt;</span>iov_base<span class="op">,</span></span>
<span id="cb7-72"><a href="#cb7-72" tabindex="-1"></a>               iov<span class="op">-&gt;</span>iov_len<span class="op">,</span></span>
<span id="cb7-73"><a href="#cb7-73" tabindex="-1"></a>               MSG_DONTROUTE<span class="op">,</span></span>
<span id="cb7-74"><a href="#cb7-74" tabindex="-1"></a>               <span class="op">(</span><span class="kw">struct</span> sockaddr <span class="op">*)</span>addr<span class="op">,</span></span>
<span id="cb7-75"><a href="#cb7-75" tabindex="-1"></a>               <span class="kw">sizeof</span><span class="op">(*</span>addr<span class="op">));</span></span>
<span id="cb7-76"><a href="#cb7-76" tabindex="-1"></a></span>
<span id="cb7-77"><a href="#cb7-77" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>send_len <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb7-78"><a href="#cb7-78" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;sendto&quot;</span><span class="op">);</span></span>
<span id="cb7-79"><a href="#cb7-79" tabindex="-1"></a>        exit<span class="op">(</span><span class="dv">1</span><span class="op">);</span></span>
<span id="cb7-80"><a href="#cb7-80" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb7-81"><a href="#cb7-81" tabindex="-1"></a></span>
<span id="cb7-82"><a href="#cb7-82" tabindex="-1"></a>    <span class="cf">return</span> send_len<span class="op">;</span></span>
<span id="cb7-83"><a href="#cb7-83" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-84"><a href="#cb7-84" tabindex="-1"></a></span>
<span id="cb7-85"><a href="#cb7-85" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> can_print<span class="op">(</span><span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span>buf<span class="op">)</span></span>
<span id="cb7-86"><a href="#cb7-86" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb7-87"><a href="#cb7-87" tabindex="-1"></a>    <span class="cf">while</span> <span class="op">(</span>isprint<span class="op">((</span><span class="dt">unsigned</span> <span class="dt">char</span><span class="op">)*</span>buf<span class="op">))</span></span>
<span id="cb7-88"><a href="#cb7-88" tabindex="-1"></a>        buf<span class="op">++;</span></span>
<span id="cb7-89"><a href="#cb7-89" tabindex="-1"></a></span>
<span id="cb7-90"><a href="#cb7-90" tabindex="-1"></a>    <span class="cf">return</span> <span class="op">*</span>buf <span class="op">==</span> <span class="ch">&#39;</span><span class="sc">\0</span><span class="ch">&#39;</span><span class="op">;</span></span>
<span id="cb7-91"><a href="#cb7-91" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-92"><a href="#cb7-92" tabindex="-1"></a></span>
<span id="cb7-93"><a href="#cb7-93" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> pinger<span class="op">(</span><span class="dt">void</span><span class="op">)</span></span>
<span id="cb7-94"><a href="#cb7-94" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb7-95"><a href="#cb7-95" tabindex="-1"></a>    <span class="dt">const</span> <span class="kw">struct</span> sockaddr_in pong_addr <span class="op">=</span> PONG_ADDR<span class="op">;</span></span>
<span id="cb7-96"><a href="#cb7-96" tabindex="-1"></a>    <span class="dt">char</span> buf<span class="op">[</span>BUFSIZ<span class="op">];</span></span>
<span id="cb7-97"><a href="#cb7-97" tabindex="-1"></a>    <span class="dt">const</span> <span class="kw">struct</span> iovec recv_iov <span class="op">=</span> <span class="op">{</span></span>
<span id="cb7-98"><a href="#cb7-98" tabindex="-1"></a>        <span class="op">.</span>iov_base <span class="op">=</span> buf<span class="op">,</span></span>
<span id="cb7-99"><a href="#cb7-99" tabindex="-1"></a>        <span class="op">.</span>iov_len <span class="op">=</span> BUFSIZ</span>
<span id="cb7-100"><a href="#cb7-100" tabindex="-1"></a>    <span class="op">};</span></span>
<span id="cb7-101"><a href="#cb7-101" tabindex="-1"></a>    <span class="dt">const</span> <span class="kw">struct</span> iovec send_iov <span class="op">=</span> <span class="op">{</span></span>
<span id="cb7-102"><a href="#cb7-102" tabindex="-1"></a>        <span class="op">.</span>iov_base <span class="op">=</span> PING<span class="op">,</span></span>
<span id="cb7-103"><a href="#cb7-103" tabindex="-1"></a>        <span class="op">.</span>iov_len <span class="op">=</span> <span class="kw">sizeof</span><span class="op">(</span>PING<span class="op">)</span></span>
<span id="cb7-104"><a href="#cb7-104" tabindex="-1"></a>    <span class="op">};</span></span>
<span id="cb7-105"><a href="#cb7-105" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> sk <span class="op">=</span> udp_socket<span class="op">();</span></span>
<span id="cb7-106"><a href="#cb7-106" tabindex="-1"></a></span>
<span id="cb7-107"><a href="#cb7-107" tabindex="-1"></a>    udp_sendto<span class="op">(</span>sk<span class="op">,</span> <span class="op">&amp;</span>send_iov<span class="op">,</span> <span class="op">&amp;</span>pong_addr<span class="op">);</span></span>
<span id="cb7-108"><a href="#cb7-108" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">ssize_t</span> recv_len <span class="op">=</span></span>
<span id="cb7-109"><a href="#cb7-109" tabindex="-1"></a>        udp_recvfrom<span class="op">(</span>sk<span class="op">,</span> <span class="op">&amp;</span>recv_iov<span class="op">,</span> NULL<span class="op">);</span></span>
<span id="cb7-110"><a href="#cb7-110" tabindex="-1"></a></span>
<span id="cb7-111"><a href="#cb7-111" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>can_print<span class="op">(</span>buf<span class="op">))</span></span>
<span id="cb7-112"><a href="#cb7-112" tabindex="-1"></a>        printf<span class="op">(</span><span class="st">&quot;pinger recv: </span><span class="sc">%s\n</span><span class="st">&quot;</span><span class="op">,</span> buf<span class="op">);</span></span>
<span id="cb7-113"><a href="#cb7-113" tabindex="-1"></a>    <span class="cf">else</span></span>
<span id="cb7-114"><a href="#cb7-114" tabindex="-1"></a>        printf<span class="op">(</span><span class="st">&quot;pinger recv </span><span class="sc">%zd</span><span class="st"> bytes</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> recv_len<span class="op">);</span></span>
<span id="cb7-115"><a href="#cb7-115" tabindex="-1"></a></span>
<span id="cb7-116"><a href="#cb7-116" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb7-117"><a href="#cb7-117" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-118"><a href="#cb7-118" tabindex="-1"></a></span>
<span id="cb7-119"><a href="#cb7-119" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> ponger<span class="op">(</span><span class="dt">void</span><span class="op">)</span></span>
<span id="cb7-120"><a href="#cb7-120" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb7-121"><a href="#cb7-121" tabindex="-1"></a>    <span class="dt">const</span> <span class="kw">struct</span> sockaddr_in pong_addr <span class="op">=</span> PONG_ADDR<span class="op">;</span></span>
<span id="cb7-122"><a href="#cb7-122" tabindex="-1"></a>    <span class="kw">struct</span> sockaddr_in ping_addr<span class="op">;</span> <span class="co">/* filled in by recvfrom */</span></span>
<span id="cb7-123"><a href="#cb7-123" tabindex="-1"></a>    <span class="dt">char</span> buf<span class="op">[</span>BUFSIZ<span class="op">];</span></span>
<span id="cb7-124"><a href="#cb7-124" tabindex="-1"></a>    <span class="dt">const</span> <span class="kw">struct</span> iovec recv_iov <span class="op">=</span> <span class="op">{</span></span>
<span id="cb7-125"><a href="#cb7-125" tabindex="-1"></a>        <span class="op">.</span>iov_base <span class="op">=</span> buf<span class="op">,</span></span>
<span id="cb7-126"><a href="#cb7-126" tabindex="-1"></a>        <span class="op">.</span>iov_len <span class="op">=</span> BUFSIZ</span>
<span id="cb7-127"><a href="#cb7-127" tabindex="-1"></a>    <span class="op">};</span></span>
<span id="cb7-128"><a href="#cb7-128" tabindex="-1"></a>    <span class="dt">const</span> <span class="kw">struct</span> iovec send_iov <span class="op">=</span> <span class="op">{</span></span>
<span id="cb7-129"><a href="#cb7-129" tabindex="-1"></a>        <span class="op">.</span>iov_base <span class="op">=</span> PONG<span class="op">,</span></span>
<span id="cb7-130"><a href="#cb7-130" tabindex="-1"></a>        <span class="op">.</span>iov_len <span class="op">=</span> <span class="kw">sizeof</span><span class="op">(</span>PONG<span class="op">)</span></span>
<span id="cb7-131"><a href="#cb7-131" tabindex="-1"></a>    <span class="op">};</span></span>
<span id="cb7-132"><a href="#cb7-132" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> sk <span class="op">=</span> udp_socket<span class="op">();</span></span>
<span id="cb7-133"><a href="#cb7-133" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> ret <span class="op">=</span></span>
<span id="cb7-134"><a href="#cb7-134" tabindex="-1"></a>        bind<span class="op">(</span>sk<span class="op">,</span></span>
<span id="cb7-135"><a href="#cb7-135" tabindex="-1"></a>             <span class="op">(</span><span class="kw">struct</span> sockaddr <span class="op">*)&amp;</span>pong_addr<span class="op">,</span></span>
<span id="cb7-136"><a href="#cb7-136" tabindex="-1"></a>             <span class="kw">sizeof</span><span class="op">(</span>pong_addr<span class="op">));</span></span>
<span id="cb7-137"><a href="#cb7-137" tabindex="-1"></a></span>
<span id="cb7-138"><a href="#cb7-138" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>ret <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb7-139"><a href="#cb7-139" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;bind&quot;</span><span class="op">);</span></span>
<span id="cb7-140"><a href="#cb7-140" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb7-141"><a href="#cb7-141" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb7-142"><a href="#cb7-142" tabindex="-1"></a></span>
<span id="cb7-143"><a href="#cb7-143" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">ssize_t</span> recv_len <span class="op">=</span></span>
<span id="cb7-144"><a href="#cb7-144" tabindex="-1"></a>        udp_recvfrom<span class="op">(</span>sk<span class="op">,</span> <span class="op">&amp;</span>recv_iov<span class="op">,</span> <span class="op">&amp;</span>ping_addr<span class="op">);</span></span>
<span id="cb7-145"><a href="#cb7-145" tabindex="-1"></a></span>
<span id="cb7-146"><a href="#cb7-146" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>can_print<span class="op">(</span>buf<span class="op">))</span></span>
<span id="cb7-147"><a href="#cb7-147" tabindex="-1"></a>        printf<span class="op">(</span><span class="st">&quot;ponger recv: </span><span class="sc">%s\n</span><span class="st">&quot;</span><span class="op">,</span> buf<span class="op">);</span></span>
<span id="cb7-148"><a href="#cb7-148" tabindex="-1"></a>    <span class="cf">else</span></span>
<span id="cb7-149"><a href="#cb7-149" tabindex="-1"></a>        printf<span class="op">(</span><span class="st">&quot;ponger recv </span><span class="sc">%zd</span><span class="st"> bytes</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> recv_len<span class="op">);</span></span>
<span id="cb7-150"><a href="#cb7-150" tabindex="-1"></a></span>
<span id="cb7-151"><a href="#cb7-151" tabindex="-1"></a>    udp_sendto<span class="op">(</span>sk<span class="op">,</span> <span class="op">&amp;</span>send_iov<span class="op">,</span> <span class="op">&amp;</span>ping_addr<span class="op">);</span></span>
<span id="cb7-152"><a href="#cb7-152" tabindex="-1"></a></span>
<span id="cb7-153"><a href="#cb7-153" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb7-154"><a href="#cb7-154" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-155"><a href="#cb7-155" tabindex="-1"></a></span>
<span id="cb7-156"><a href="#cb7-156" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> argc<span class="op">,</span></span>
<span id="cb7-157"><a href="#cb7-157" tabindex="-1"></a>     <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> argv<span class="op">[])</span></span>
<span id="cb7-158"><a href="#cb7-158" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb7-159"><a href="#cb7-159" tabindex="-1"></a>    <span class="dt">int</span> ret<span class="op">;</span></span>
<span id="cb7-160"><a href="#cb7-160" tabindex="-1"></a>    siginfo_t infop<span class="op">;</span></span>
<span id="cb7-161"><a href="#cb7-161" tabindex="-1"></a></span>
<span id="cb7-162"><a href="#cb7-162" tabindex="-1"></a>    <span class="dt">const</span> pid_t pinger_pid <span class="op">=</span> fork<span class="op">();</span></span>
<span id="cb7-163"><a href="#cb7-163" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>pinger_pid<span class="op">)</span></span>
<span id="cb7-164"><a href="#cb7-164" tabindex="-1"></a>        <span class="cf">return</span> pinger<span class="op">();</span></span>
<span id="cb7-165"><a href="#cb7-165" tabindex="-1"></a></span>
<span id="cb7-166"><a href="#cb7-166" tabindex="-1"></a>    <span class="dt">const</span> pid_t ponger_pid <span class="op">=</span> fork<span class="op">();</span></span>
<span id="cb7-167"><a href="#cb7-167" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>ponger_pid<span class="op">)</span></span>
<span id="cb7-168"><a href="#cb7-168" tabindex="-1"></a>        <span class="cf">return</span> ponger<span class="op">();</span></span>
<span id="cb7-169"><a href="#cb7-169" tabindex="-1"></a></span>
<span id="cb7-170"><a href="#cb7-170" tabindex="-1"></a>    <span class="cf">do</span> <span class="op">{</span></span>
<span id="cb7-171"><a href="#cb7-171" tabindex="-1"></a>        ret <span class="op">=</span> waitid<span class="op">(</span>P_ALL<span class="op">,</span> <span class="dv">0</span><span class="op">,</span> <span class="op">&amp;</span>infop<span class="op">,</span> WEXITED<span class="op">);</span></span>
<span id="cb7-172"><a href="#cb7-172" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>ret<span class="op">)</span></span>
<span id="cb7-173"><a href="#cb7-173" tabindex="-1"></a>            <span class="co">/* should read infop here */</span></span>
<span id="cb7-174"><a href="#cb7-174" tabindex="-1"></a>            <span class="cf">continue</span><span class="op">;</span></span>
<span id="cb7-175"><a href="#cb7-175" tabindex="-1"></a></span>
<span id="cb7-176"><a href="#cb7-176" tabindex="-1"></a>        <span class="cf">switch</span> <span class="op">(</span>errno<span class="op">)</span> <span class="op">{</span></span>
<span id="cb7-177"><a href="#cb7-177" tabindex="-1"></a>        <span class="cf">case</span> EINTR<span class="op">:</span></span>
<span id="cb7-178"><a href="#cb7-178" tabindex="-1"></a>            <span class="cf">continue</span><span class="op">;</span></span>
<span id="cb7-179"><a href="#cb7-179" tabindex="-1"></a>        <span class="cf">case</span> ECHILD<span class="op">:</span></span>
<span id="cb7-180"><a href="#cb7-180" tabindex="-1"></a>            <span class="cf">break</span><span class="op">;</span></span>
<span id="cb7-181"><a href="#cb7-181" tabindex="-1"></a>        <span class="cf">default</span><span class="op">:</span></span>
<span id="cb7-182"><a href="#cb7-182" tabindex="-1"></a>            perror<span class="op">(</span><span class="st">&quot;waitid&quot;</span><span class="op">);</span></span>
<span id="cb7-183"><a href="#cb7-183" tabindex="-1"></a>            <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb7-184"><a href="#cb7-184" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb7-185"><a href="#cb7-185" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">while</span> <span class="op">(!</span>ret <span class="op">||</span> errno <span class="op">==</span> EINTR<span class="op">);</span></span>
<span id="cb7-186"><a href="#cb7-186" tabindex="-1"></a></span>
<span id="cb7-187"><a href="#cb7-187" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb7-188"><a href="#cb7-188" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>This starts two processes. Ponger binds to
            <code>localhost:21000</code> and waits for a packet. When it
            receives a packet it prints the contents and sends “PONG”
            back. Meanwhile Pinger sends a packet to
            <code>localhost:21000</code> and waits for a response. When
            it gets a response it prints it.</p>
            <p>Pinger does not choose an address to bind to. It is
            automatically assigned a port and is bound to any local
            address. Meanwhile we bind Ponger to <code>localhost</code>,
            that is, the address of the <em>loopback</em> device. Usually
            <code>localhost</code> (<code>127.0.0.1</code>,
            <code>::1</code>, <code>lo</code> etc.) cannot receive
            messages from a remote host. So Ponger probably won’t
            receive messages from a remote device.</p>
            <p>Pinger, on the other hand, will receive messages from
            anywhere, so long as they are addressed to some network
            interface on the local machine (or in the process’s
            <em>network namespace</em>) and to the port which was
            automatically assigned to it. This means Pinger could
            randomly receive a packet from some remote source. Ponger
            could also receive unexpected data from a local process,
            because port 21000 may be used for something else.</p>
            <p>Which brings me onto constructing an address. Let’s look
            at Ponger’s address with the macro expanded.</p>
            <div class="sourceCode" id="cb8"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb8-1"><a href="#cb8-1" tabindex="-1"></a><span class="dt">const</span> <span class="kw">struct</span> sockaddr_in pong_addr <span class="op">=</span> <span class="op">{</span></span>
<span id="cb8-2"><a href="#cb8-2" tabindex="-1"></a>    <span class="op">.</span>sin_family <span class="op">=</span> AF_INET<span class="op">,</span></span>
<span id="cb8-3"><a href="#cb8-3" tabindex="-1"></a>    <span class="op">.</span>sin_port <span class="op">=</span> htons<span class="op">(</span><span class="dv">21000</span><span class="op">),</span></span>
<span id="cb8-4"><a href="#cb8-4" tabindex="-1"></a>    <span class="op">.</span>sin_addr <span class="op">=</span> <span class="op">(</span><span class="kw">struct</span> in_addr<span class="op">){</span></span>
<span id="cb8-5"><a href="#cb8-5" tabindex="-1"></a>        htonl<span class="op">(</span>INADDR_LOOPBACK<span class="op">)</span></span>
<span id="cb8-6"><a href="#cb8-6" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb8-7"><a href="#cb8-7" tabindex="-1"></a><span class="op">};</span></span></code></pre></div>
            <p>What catches this author out time and again is that the
            port and address are in <em>network byte order</em>. This
            happens to be <em>big endian</em>, whereas my computer
            uses <em>little endian</em>. So we need to swap the bytes
            around, which is what <code>htons</code> and
            <code>htonl</code> do. Consider that
            <code>21000 = 0x5208</code>.</p>
            <table>
            <thead>
            <tr class="header">
            <th align="left"></th>
            <th align="center">Byte 0</th>
            <th align="center">Byte 1</th>
            </tr>
            </thead>
            <tbody>
            <tr class="odd">
            <td align="left">Little Endian</td>
            <td align="center"><code>0x08</code></td>
            <td align="center"><code>0x52</code></td>
            </tr>
            <tr class="even">
            <td align="left">Big Endian</td>
            <td align="center"><code>0x52</code></td>
            <td align="center"><code>0x08</code></td>
            </tr>
            </tbody>
            </table>
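            <p>As a rough sketch of the table above, we can copy the
            bytes of a 16-bit value out of memory and look at them one
            at a time. The helpers <code>bytes_of_u16</code> and
            <code>port_bytes</code> are made-up names for illustration
            and are not part of the demo.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;arpa/inet.h&gt;
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

/* Hypothetical helper: copy the bytes of val, in the order they
 * are stored in memory, into out[0] and out[1]. */
static void bytes_of_u16(const uint16_t val, unsigned char out[2])
{
    memcpy(out, &amp;val, sizeof(val));
}

/* Hypothetical helper: the bytes of a port number as they would
 * be stored in sin_port, that is in network byte order. */
static void port_bytes(const uint16_t port, unsigned char out[2])
{
    bytes_of_u16(htons(port), out);
}</code></pre></div>
            <p>Calling <code>port_bytes(21000, out)</code> should give
            <code>0x52</code> then <code>0x08</code> on any machine,
            whereas <code>bytes_of_u16(21000, out)</code> gives whatever
            order the host uses.</p>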
            <p>If byte 0 is on the left, then the <em>end</em> is
            considered to be on the left. This is, of course,
            nonsensical as this means Big Endian <em>starts</em> the
            transmission with the high (i.e. big) order byte. Perhaps it
            should be called Big Startian or HOBAZ (High Order Byte At
            Zero)?</p>
            <p>Another way to visualise it is from top to bottom.
            Address zero is at the top end and there is no bottom end;
            it goes all the way down to infinity. So the <em>end</em> is
            address zero.</p>
            <p>The littleness or bigness of the end depends on the
            <em>significance</em> of the byte. The significance is
            greater if the byte has a greater effect on the number’s
            magnitude. So the least significant byte can only add at
            most 255 (<code>0xff</code>) to a number. The next byte can
            add at most <code>255 * 256</code>
            (<code>0xff00</code>).</p>
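            <p>The arithmetic can be sketched by rebuilding a number
            from its bytes. This assumes the bytes arrive in big endian
            order, so byte 0 is the most significant.
            <code>u16_from_be</code> is a hypothetical name used only
            here.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stdint.h&gt;

/* Hypothetical helper: byte 0 contributes up to 255 * 256,
 * byte 1 contributes up to 255. */
static uint16_t u16_from_be(const unsigned char bytes[2])
{
    return (uint16_t)(bytes[0] * 256 + bytes[1]);
}</code></pre></div>
            <p>Given the bytes <code>0x52</code> and <code>0x08</code>
            this returns <code>0x52 * 256 + 0x08 = 21000</code>.</p>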
            <p>To be clear we are discussing bytes not bits. Binary
            numbers written in Arabic numerals (that is 0 and 1) have
            the high order bit on the left. Generally programming
            languages and machine instructions follow this convention.
            The order in which the hardware stores or transmits the bits
            is irrelevant.</p>
            <p>Let’s say we shift the bits in a 64-bit integer left
            (<code>&lt;&lt;</code>) by one place. Then we expect the low
            order bit to now be zero and all other bits to have moved
            one place to the left. This holds regardless of whether they
            cross a byte boundary or in what order the CPU handles the
            bytes. Nor do we care what the actual bit order is within
            bytes.</p>
            <p>Individual bits are not directly addressable. You need to
            use a combination of shifts and masking to get a single
            bit’s value. Which bit you consider to be index zero is
            arbitrary. It can be the low or high order bit.</p>
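            <p>A minimal sketch of the shift and mask, assuming we
            decide that index zero is the low order bit.
            <code>get_bit</code> is a made-up helper, not something from
            the demo.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stdint.h&gt;

/* Hypothetical helper: return bit idx of val, where index 0 is
 * the low order bit. idx must be less than 64. */
static int get_bit(const uint64_t val, const unsigned int idx)
{
    /* Move the wanted bit to the low order position, then
     * mask off everything else. */
    return (int)((val &gt;&gt; idx) &amp; 1);
}</code></pre></div>
            <p>For <code>0x5208</code> bit 3 is set and bit 0 is not. If
            we would rather count from the high order bit then we can
            pass <code>63 - idx</code> instead.</p>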
            <p>Now let’s look at receiving a packet.</p>
            <div class="sourceCode" id="cb9"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb9-1"><a href="#cb9-1" tabindex="-1"></a><span class="dt">static</span> <span class="dt">ssize_t</span> udp_recvfrom<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> sk<span class="op">,</span></span>
<span id="cb9-2"><a href="#cb9-2" tabindex="-1"></a>                <span class="dt">const</span> <span class="kw">struct</span> iovec <span class="op">*</span><span class="dt">const</span> iov<span class="op">,</span></span>
<span id="cb9-3"><a href="#cb9-3" tabindex="-1"></a>                <span class="dt">const</span> <span class="kw">struct</span> sockaddr_in <span class="op">*</span><span class="dt">const</span> addr<span class="op">)</span></span>
<span id="cb9-4"><a href="#cb9-4" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb9-5"><a href="#cb9-5" tabindex="-1"></a>    socklen_t addr_len <span class="op">=</span> <span class="kw">sizeof</span><span class="op">(*</span>addr<span class="op">);</span></span>
<span id="cb9-6"><a href="#cb9-6" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">ssize_t</span> recv_len <span class="op">=</span></span>
<span id="cb9-7"><a href="#cb9-7" tabindex="-1"></a>        recvfrom<span class="op">(</span>sk<span class="op">,</span></span>
<span id="cb9-8"><a href="#cb9-8" tabindex="-1"></a>             iov<span class="op">-&gt;</span>iov_base<span class="op">,</span></span>
<span id="cb9-9"><a href="#cb9-9" tabindex="-1"></a>             iov<span class="op">-&gt;</span>iov_len <span class="op">-</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb9-10"><a href="#cb9-10" tabindex="-1"></a>             <span class="dv">0</span><span class="op">,</span></span>
<span id="cb9-11"><a href="#cb9-11" tabindex="-1"></a>             <span class="op">(</span><span class="kw">struct</span> sockaddr <span class="op">*)</span>addr<span class="op">,</span></span>
<span id="cb9-12"><a href="#cb9-12" tabindex="-1"></a>             addr <span class="op">?</span> <span class="op">&amp;</span>addr_len <span class="op">:</span> NULL<span class="op">);</span></span>
<span id="cb9-13"><a href="#cb9-13" tabindex="-1"></a></span>
<span id="cb9-14"><a href="#cb9-14" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>recv_len <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb9-15"><a href="#cb9-15" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;recvfrom&quot;</span><span class="op">);</span></span>
<span id="cb9-16"><a href="#cb9-16" tabindex="-1"></a>        exit<span class="op">(</span><span class="dv">1</span><span class="op">);</span></span>
<span id="cb9-17"><a href="#cb9-17" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb9-18"><a href="#cb9-18" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>addr_len <span class="op">!=</span> <span class="kw">sizeof</span><span class="op">(*</span>addr<span class="op">))</span> <span class="op">{</span></span>
<span id="cb9-19"><a href="#cb9-19" tabindex="-1"></a>        printf<span class="op">(</span><span class="st">&quot;address is not expected size</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb9-20"><a href="#cb9-20" tabindex="-1"></a>        exit<span class="op">(</span><span class="dv">1</span><span class="op">);</span></span>
<span id="cb9-21"><a href="#cb9-21" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb9-22"><a href="#cb9-22" tabindex="-1"></a></span>
<span id="cb9-23"><a href="#cb9-23" tabindex="-1"></a>    <span class="op">((</span><span class="dt">char</span> <span class="op">*)</span>iov<span class="op">-&gt;</span>iov_base<span class="op">)[</span>recv_len<span class="op">]</span> <span class="op">=</span> <span class="ch">&#39;</span><span class="sc">\0</span><span class="ch">&#39;</span><span class="op">;</span></span>
<span id="cb9-24"><a href="#cb9-24" tabindex="-1"></a></span>
<span id="cb9-25"><a href="#cb9-25" tabindex="-1"></a>    <span class="cf">return</span> recv_len<span class="op">;</span></span>
<span id="cb9-26"><a href="#cb9-26" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>The <code>struct iovec</code> is used to wrap the buffer
            and length into a single argument. It’s not necessary,
            but it’s commonly used in networking.</p>
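            <p>One place <code>struct iovec</code> shows up is
            <code>writev</code>, which takes an array of them and writes
            all the buffers with one system call. The following is only
            a sketch; <code>write_pair</code> is a hypothetical helper
            and the demo does not use <code>writev</code>.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;string.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/uio.h&gt;

/* Hypothetical helper: write two strings, without their null
 * terminators, using a single system call. Returns the number of
 * bytes written or -1 on error. */
static ssize_t write_pair(const int fd,
              const char *const a,
              const char *const b)
{
    /* iov_base is not const, so the casts are needed even
     * though writev only reads from the buffers. */
    const struct iovec iov[2] = {
        { .iov_base = (void *)a, .iov_len = strlen(a) },
        { .iov_base = (void *)b, .iov_len = strlen(b) },
    };

    return writev(fd, iov, 2);
}</code></pre></div>
            <p>This saves copying a header and a body into one buffer
            before sending them.</p>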
            <p>We reserve one byte of the receive buffer for null
            termination. That is, we add a sentinel value which marks
            the end of a string. Pinger and Ponger already send a null
            terminated string. However, we could get some random data
            from another source. It’s also possible to receive corrupted
            data, although UDP does have a checksum to mitigate that. It
            can happen so it will happen.</p>
            <p>When we receive a UDP packet the kernel informs us of the
            source address. This allows us to respond. The source
            address could be fraudulent. It’s only some data sent in the
            packet’s header. There is no encryption or signing in basic
            UDP. So we can’t trust anything.</p>
            <p>It’s worth noting that <code>send</code> and
            <code>recv</code> only ever accept or return one packet. The
            data in this packet can be anywhere from 0 bytes up to the
            maximum UDP payload (65,507 bytes over IPv4), although
            anything larger than the maximum-transmission-unit will be
            fragmented at the IP layer. The buffer we use to receive the
            packet data must be large enough to contain all of it,
            otherwise the excess is discarded. Furthermore the order the
            packets are sent in may not be the order they are received
            in.</p>
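            <p>If we want to know whether the buffer was big enough,
            then on Linux we can pass the <code>MSG_TRUNC</code> flag to
            <code>recv</code>. It then returns the real length of the
            datagram even when that is longer than the buffer. This is a
            sketch which assumes Linux behaviour;
            <code>recv_whole</code> is a hypothetical helper and not
            part of the example above.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;sys/types.h&gt;
#include &lt;sys/socket.h&gt;

/* Hypothetical helper: receive one datagram into buf. Returns 0
 * if it fitted, 1 if it was truncated and -1 on error. */
static int recv_whole(const int sk, void *const buf, const size_t len)
{
    /* With MSG_TRUNC the return value is the datagram's real
     * length, not the number of bytes copied into buf. */
    const ssize_t real_len = recv(sk, buf, len, MSG_TRUNC);

    if (real_len &lt; 0)
        return -1;

    return (size_t)real_len &gt; len;
}</code></pre></div>
            <p>Either way the bytes which did not fit are gone; the next
            call returns the next packet, not the remainder of this
            one.</p>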
            <h1 id="tcp-http">TCP &amp; HTTP</h1>
            <p>This is quite unlike files or streams where we can read
            or write arbitrarily sized chunks of data. Where the data is
            usually in the order it was sent or written. If we want to
            use a stream instead then we can use TCP. The above example
            can be converted to TCP by creating a
            <code>SOCK_STREAM</code> socket, using the
            <code>listen</code>, <code>accept</code> and
            <code>connect</code> system calls and switching to
            <code>read</code> and <code>write</code>.</p>
            <p>TCP is connection or stream orientated, meaning we have
            to establish a connection before sending or receiving data.
            Once we have a connection then we can write bytes to a
            socket on one end and expect them to be read in the same
            order at the other end. Of course things can still go wrong, but
            it is more reliable than UDP. On the other hand we can no
            longer read and write single packets. Nor can we just send a
            packet immediately.</p>
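            <p>Because a stream has no packet boundaries, a single
            <code>read</code> may return only part of what the other end
            wrote. A common way to cope is to loop until we have as many
            bytes as we expect. This is a rough sketch with a
            hypothetical helper called <code>read_full</code>.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;errno.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;unistd.h&gt;

/* Hypothetical helper: keep reading until len bytes have arrived,
 * the other end closes the stream or an error occurs. Returns the
 * number of bytes read or -1 on error. */
static ssize_t read_full(const int fd, char *const buf, const size_t len)
{
    size_t total = 0;

    while (total &lt; len) {
        const ssize_t ret = read(fd, buf + total, len - total);

        if (ret &lt; 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal; retry */
            return -1;
        }

        if (!ret)
            break; /* end of stream */

        total += (size_t)ret;
    }

    return (ssize_t)total;
}</code></pre></div>
            <p>If the sender writes “PI” and then “NG”, the reader still
            ends up with “PING”, however the bytes were split up on the
            way.</p>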
            <p>Although things like QUIC now exist, TCP is generally
            used to serve web content. Let’s make a minimal HTTP web
            server to serve <a href="/https/richiejp.com/pandoc-bulma-static-site">my
            static website</a>. Now I have to warn you that HTTP is
            hugely complicated. We can get away with ignoring most of
            that complication, but we still end up with a fair old chunk
            of code.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>You can find the <a
            href="https://gitlab.com/Palethorpe/portfolio/src/self-server.c">latest
            source here</a>. It can be built with something like
            <code>gcc -fno-omit-frame-pointer -fsanitize=address,undefined -Wall -Wextra self-serve.c -o self-serve</code>.
            Also note in the <a
            href="https://gitlab.com/Palethorpe/portfolio/-/commits/master/src/self-serve.c">Git
            history</a> that I went from more to less complicated while
            also fixing a number of bugs. In general I think it is best
            to do <a href="ways-to-help-your-project-fail">the simplest
            thing that works first</a>. This is easier said than done,
            so sometimes one has to work backwards, ruthlessly
            discarding things that don’t appear necessary.</p>
            </div>
            </div>
            <div class="sourceCode" id="cb10"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb10-1"><a href="#cb10-1" tabindex="-1"></a><span class="pp">#define </span><span class="ot">_GNU_SOURCE</span></span>
<span id="cb10-2"><a href="#cb10-2" tabindex="-1"></a></span>
<span id="cb10-3"><a href="#cb10-3" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;limits.h&gt;</span></span>
<span id="cb10-4"><a href="#cb10-4" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;errno.h&gt;</span></span>
<span id="cb10-5"><a href="#cb10-5" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;unistd.h&gt;</span></span>
<span id="cb10-6"><a href="#cb10-6" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;string.h&gt;</span></span>
<span id="cb10-7"><a href="#cb10-7" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;stdio.h&gt;</span></span>
<span id="cb10-8"><a href="#cb10-8" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;fcntl.h&gt;</span></span>
<span id="cb10-9"><a href="#cb10-9" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;signal.h&gt;</span></span>
<span id="cb10-10"><a href="#cb10-10" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;sys/stat.h&gt;</span></span>
<span id="cb10-11"><a href="#cb10-11" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;sys/socket.h&gt;</span></span>
<span id="cb10-12"><a href="#cb10-12" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;sys/sendfile.h&gt;</span></span>
<span id="cb10-13"><a href="#cb10-13" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;netinet/in.h&gt;</span></span>
<span id="cb10-14"><a href="#cb10-14" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;netinet/tcp.h&gt;</span></span>
<span id="cb10-15"><a href="#cb10-15" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;arpa/inet.h&gt;</span></span>
<span id="cb10-16"><a href="#cb10-16" tabindex="-1"></a></span>
<span id="cb10-17"><a href="#cb10-17" tabindex="-1"></a><span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> http_head <span class="op">=</span></span>
<span id="cb10-18"><a href="#cb10-18" tabindex="-1"></a>    <span class="st">&quot;HTTP/1.1 200 OK</span><span class="sc">\r\n</span><span class="st">&quot;</span></span>
<span id="cb10-19"><a href="#cb10-19" tabindex="-1"></a>    <span class="st">&quot;Connection: close</span><span class="sc">\r\n</span><span class="st">&quot;</span></span>
<span id="cb10-20"><a href="#cb10-20" tabindex="-1"></a>    <span class="st">&quot;Content-Type: </span><span class="sc">%s\r\n</span><span class="st">&quot;</span></span>
<span id="cb10-21"><a href="#cb10-21" tabindex="-1"></a>    <span class="st">&quot;Content-Length: </span><span class="sc">%lu\r\n</span><span class="st">&quot;</span></span>
<span id="cb10-22"><a href="#cb10-22" tabindex="-1"></a>    <span class="st">&quot;</span><span class="sc">\r\n</span><span class="st">&quot;</span><span class="op">;</span></span>
<span id="cb10-23"><a href="#cb10-23" tabindex="-1"></a></span>
<span id="cb10-24"><a href="#cb10-24" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> serve_file<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> sk<span class="op">,</span> <span class="dt">const</span> <span class="dt">int</span> public_dir<span class="op">)</span></span>
<span id="cb10-25"><a href="#cb10-25" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb10-26"><a href="#cb10-26" tabindex="-1"></a>    <span class="dt">char</span> recv_buf<span class="op">[</span>BUFSIZ<span class="op">];</span></span>
<span id="cb10-27"><a href="#cb10-27" tabindex="-1"></a>    <span class="dt">char</span> head_buf<span class="op">[</span>BUFSIZ<span class="op">];</span></span>
<span id="cb10-28"><a href="#cb10-28" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">size_t</span> buf_len <span class="op">=</span> BUFSIZ <span class="op">-</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb10-29"><a href="#cb10-29" tabindex="-1"></a>    <span class="dt">char</span> path_buf<span class="op">[</span><span class="dv">256</span><span class="op">];</span></span>
<span id="cb10-30"><a href="#cb10-30" tabindex="-1"></a>    <span class="dt">char</span> <span class="op">*</span>file_path<span class="op">;</span></span>
<span id="cb10-31"><a href="#cb10-31" tabindex="-1"></a>    <span class="dt">ssize_t</span> recv<span class="op">,</span> sent<span class="op">;</span></span>
<span id="cb10-32"><a href="#cb10-32" tabindex="-1"></a>    <span class="dt">size_t</span> recv_total <span class="op">=</span> <span class="dv">0</span><span class="op">,</span> sent_total <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb10-33"><a href="#cb10-33" tabindex="-1"></a>    <span class="dt">int</span> body_fd<span class="op">;</span></span>
<span id="cb10-34"><a href="#cb10-34" tabindex="-1"></a></span>
<span id="cb10-35"><a href="#cb10-35" tabindex="-1"></a>    <span class="cf">while</span> <span class="op">(</span>recv_total <span class="op">&lt;</span> buf_len<span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-36"><a href="#cb10-36" tabindex="-1"></a>        recv <span class="op">=</span> read<span class="op">(</span>sk<span class="op">,</span></span>
<span id="cb10-37"><a href="#cb10-37" tabindex="-1"></a>                recv_buf <span class="op">+</span> recv_total<span class="op">,</span></span>
<span id="cb10-38"><a href="#cb10-38" tabindex="-1"></a>                buf_len <span class="op">-</span> recv_total<span class="op">);</span></span>
<span id="cb10-39"><a href="#cb10-39" tabindex="-1"></a></span>
<span id="cb10-40"><a href="#cb10-40" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>recv <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-41"><a href="#cb10-41" tabindex="-1"></a>            perror<span class="op">(</span><span class="st">&quot;[-] read&quot;</span><span class="op">);</span></span>
<span id="cb10-42"><a href="#cb10-42" tabindex="-1"></a>            <span class="cf">return</span><span class="op">;</span></span>
<span id="cb10-43"><a href="#cb10-43" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb10-44"><a href="#cb10-44" tabindex="-1"></a></span>
<span id="cb10-45"><a href="#cb10-45" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>recv<span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-46"><a href="#cb10-46" tabindex="-1"></a>            dprintf<span class="op">(</span>STDERR_FILENO<span class="op">,</span></span>
<span id="cb10-47"><a href="#cb10-47" tabindex="-1"></a>                <span class="st">&quot;[-] End of data before header was received</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb10-48"><a href="#cb10-48" tabindex="-1"></a>            <span class="cf">return</span><span class="op">;</span></span>
<span id="cb10-49"><a href="#cb10-49" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb10-50"><a href="#cb10-50" tabindex="-1"></a></span>
<span id="cb10-51"><a href="#cb10-51" tabindex="-1"></a>        recv_total <span class="op">+=</span> recv<span class="op">;</span></span>
<span id="cb10-52"><a href="#cb10-52" tabindex="-1"></a>        recv_buf<span class="op">[</span>recv_total<span class="op">]</span> <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb10-53"><a href="#cb10-53" tabindex="-1"></a></span>
<span id="cb10-54"><a href="#cb10-54" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>recv_buf<span class="op">,</span> <span class="st">&quot;</span><span class="sc">\r\n\r\n</span><span class="st">&quot;</span><span class="op">))</span></span>
<span id="cb10-55"><a href="#cb10-55" tabindex="-1"></a>            <span class="cf">goto</span> got_header<span class="op">;</span></span>
<span id="cb10-56"><a href="#cb10-56" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-57"><a href="#cb10-57" tabindex="-1"></a></span>
<span id="cb10-58"><a href="#cb10-58" tabindex="-1"></a>    dprintf<span class="op">(</span>STDERR_FILENO<span class="op">,</span></span>
<span id="cb10-59"><a href="#cb10-59" tabindex="-1"></a>        <span class="st">&quot;Exceeded buffer reading header</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb10-60"><a href="#cb10-60" tabindex="-1"></a>    <span class="cf">return</span><span class="op">;</span></span>
<span id="cb10-61"><a href="#cb10-61" tabindex="-1"></a></span>
<span id="cb10-62"><a href="#cb10-62" tabindex="-1"></a>got_header<span class="op">:</span></span>
<span id="cb10-63"><a href="#cb10-63" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;[*] &lt;&lt;&lt;</span><span class="sc">\n%s\n</span><span class="st">&quot;</span><span class="op">,</span> recv_buf<span class="op">);</span></span>
<span id="cb10-64"><a href="#cb10-64" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>sscanf<span class="op">(</span>recv_buf<span class="op">,</span> <span class="st">&quot;GET </span><span class="sc">%250s</span><span class="st"> HTTP/1.1&quot;</span><span class="op">,</span> path_buf<span class="op">)</span> <span class="op">!=</span> <span class="dv">1</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-65"><a href="#cb10-65" tabindex="-1"></a>        dprintf<span class="op">(</span>STDERR_FILENO<span class="op">,</span></span>
<span id="cb10-66"><a href="#cb10-66" tabindex="-1"></a>            <span class="st">&quot;[-] &#39;GET &lt;file_path&gt; HTTP/1.1&#39; not matched in:</span><span class="sc">\n</span><span class="st"> </span><span class="sc">%s</span><span class="st">&quot;</span><span class="op">,</span> recv_buf<span class="op">);</span></span>
<span id="cb10-67"><a href="#cb10-67" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb10-68"><a href="#cb10-68" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-69"><a href="#cb10-69" tabindex="-1"></a></span>
<span id="cb10-70"><a href="#cb10-70" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>strcmp<span class="op">(</span><span class="st">&quot;/&quot;</span><span class="op">,</span> path_buf<span class="op">))</span> <span class="op">{</span></span>
<span id="cb10-71"><a href="#cb10-71" tabindex="-1"></a>        strcpy<span class="op">(</span>path_buf<span class="op">,</span> <span class="st">&quot;index.html&quot;</span><span class="op">);</span></span>
<span id="cb10-72"><a href="#cb10-72" tabindex="-1"></a>        file_path <span class="op">=</span> path_buf<span class="op">;</span></span>
<span id="cb10-73"><a href="#cb10-73" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="op">(</span>path_buf<span class="op">[</span><span class="dv">0</span><span class="op">]</span> <span class="op">==</span> <span class="ch">&#39;/&#39;</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-74"><a href="#cb10-74" tabindex="-1"></a>        file_path <span class="op">=</span> path_buf <span class="op">+</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb10-75"><a href="#cb10-75" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-76"><a href="#cb10-76" tabindex="-1"></a></span>
<span id="cb10-77"><a href="#cb10-77" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;[*] Opening </span><span class="sc">%s</span><span class="st">&quot;</span><span class="op">,</span> file_path<span class="op">);</span></span>
<span id="cb10-78"><a href="#cb10-78" tabindex="-1"></a>    body_fd <span class="op">=</span> openat<span class="op">(</span>public_dir<span class="op">,</span> file_path<span class="op">,</span> O_RDONLY<span class="op">);</span></span>
<span id="cb10-79"><a href="#cb10-79" tabindex="-1"></a></span>
<span id="cb10-80"><a href="#cb10-80" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>body_fd <span class="op">&lt;</span> <span class="dv">0</span> <span class="op">&amp;&amp;</span> errno <span class="op">==</span> ENOENT<span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-81"><a href="#cb10-81" tabindex="-1"></a>        strcpy<span class="op">(</span>file_path <span class="op">+</span> strlen<span class="op">(</span>file_path<span class="op">),</span> <span class="st">&quot;.html&quot;</span><span class="op">);</span></span>
<span id="cb10-82"><a href="#cb10-82" tabindex="-1"></a>        body_fd <span class="op">=</span> openat<span class="op">(</span>public_dir<span class="op">,</span> file_path<span class="op">,</span> O_RDONLY<span class="op">);</span></span>
<span id="cb10-83"><a href="#cb10-83" tabindex="-1"></a>        printf<span class="op">(</span><span class="st">&quot; failed trying with .html&quot;</span><span class="op">);</span></span>
<span id="cb10-84"><a href="#cb10-84" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-85"><a href="#cb10-85" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb10-86"><a href="#cb10-86" tabindex="-1"></a></span>
<span id="cb10-87"><a href="#cb10-87" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>body_fd <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-88"><a href="#cb10-88" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;[-] openat&quot;</span><span class="op">);</span></span>
<span id="cb10-89"><a href="#cb10-89" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb10-90"><a href="#cb10-90" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-91"><a href="#cb10-91" tabindex="-1"></a></span>
<span id="cb10-92"><a href="#cb10-92" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span>mime <span class="op">=</span> <span class="st">&quot;text/html&quot;</span><span class="op">;</span></span>
<span id="cb10-93"><a href="#cb10-93" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>file_path<span class="op">,</span> <span class="st">&quot;.css&quot;</span><span class="op">))</span></span>
<span id="cb10-94"><a href="#cb10-94" tabindex="-1"></a>        mime <span class="op">=</span> <span class="st">&quot;text/css&quot;</span><span class="op">;</span></span>
<span id="cb10-95"><a href="#cb10-95" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>file_path<span class="op">,</span> <span class="st">&quot;.map&quot;</span><span class="op">))</span></span>
<span id="cb10-96"><a href="#cb10-96" tabindex="-1"></a>        mime <span class="op">=</span> <span class="st">&quot;application/json&quot;</span><span class="op">;</span></span>
<span id="cb10-97"><a href="#cb10-97" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>file_path<span class="op">,</span> <span class="st">&quot;.svg&quot;</span><span class="op">))</span></span>
<span id="cb10-98"><a href="#cb10-98" tabindex="-1"></a>        mime <span class="op">=</span> <span class="st">&quot;image/svg+xml&quot;</span><span class="op">;</span></span>
<span id="cb10-99"><a href="#cb10-99" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>file_path<span class="op">,</span> <span class="st">&quot;.jpg&quot;</span><span class="op">))</span></span>
<span id="cb10-100"><a href="#cb10-100" tabindex="-1"></a>        mime <span class="op">=</span> <span class="st">&quot;image/jpeg&quot;</span><span class="op">;</span></span>
<span id="cb10-101"><a href="#cb10-101" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>file_path<span class="op">,</span> <span class="st">&quot;.png&quot;</span><span class="op">))</span></span>
<span id="cb10-102"><a href="#cb10-102" tabindex="-1"></a>        mime <span class="op">=</span> <span class="st">&quot;image/png&quot;</span><span class="op">;</span></span>
<span id="cb10-103"><a href="#cb10-103" tabindex="-1"></a></span>
<span id="cb10-104"><a href="#cb10-104" tabindex="-1"></a>    <span class="kw">struct</span> stat body_stat<span class="op">;</span></span>
<span id="cb10-105"><a href="#cb10-105" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>fstat<span class="op">(</span>body_fd<span class="op">,</span> <span class="op">&amp;</span>body_stat<span class="op">))</span> <span class="op">{</span></span>
<span id="cb10-106"><a href="#cb10-106" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;[-] fstat&quot;</span><span class="op">);</span></span>
<span id="cb10-107"><a href="#cb10-107" tabindex="-1"></a>        <span class="cf">goto</span> close_body<span class="op">;</span></span>
<span id="cb10-108"><a href="#cb10-108" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-109"><a href="#cb10-109" tabindex="-1"></a>    sprintf<span class="op">(</span>head_buf<span class="op">,</span> http_head<span class="op">,</span> mime<span class="op">,</span> body_stat<span class="op">.</span>st_size<span class="op">);</span></span>
<span id="cb10-110"><a href="#cb10-110" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;[*] &gt;&gt;&gt;</span><span class="sc">\n%s</span><span class="st">&quot;</span><span class="op">,</span> head_buf<span class="op">);</span></span>
<span id="cb10-111"><a href="#cb10-111" tabindex="-1"></a></span>
<span id="cb10-112"><a href="#cb10-112" tabindex="-1"></a>    <span class="cf">while</span> <span class="op">(</span>sent_total <span class="op">&lt;</span> strlen<span class="op">(</span>head_buf<span class="op">))</span> <span class="op">{</span></span>
<span id="cb10-113"><a href="#cb10-113" tabindex="-1"></a>        sent <span class="op">=</span> write<span class="op">(</span>sk<span class="op">,</span> head_buf <span class="op">+</span> sent_total<span class="op">,</span> strlen<span class="op">(</span>head_buf<span class="op">)</span> <span class="op">-</span> sent_total<span class="op">);</span></span>
<span id="cb10-114"><a href="#cb10-114" tabindex="-1"></a></span>
<span id="cb10-115"><a href="#cb10-115" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>sent <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-116"><a href="#cb10-116" tabindex="-1"></a>            perror<span class="op">(</span><span class="st">&quot;[-] write&quot;</span><span class="op">);</span></span>
<span id="cb10-117"><a href="#cb10-117" tabindex="-1"></a>            <span class="cf">goto</span> close_body<span class="op">;</span></span>
<span id="cb10-118"><a href="#cb10-118" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb10-119"><a href="#cb10-119" tabindex="-1"></a></span>
<span id="cb10-120"><a href="#cb10-120" tabindex="-1"></a>        sent_total <span class="op">+=</span> sent<span class="op">;</span></span>
<span id="cb10-121"><a href="#cb10-121" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-122"><a href="#cb10-122" tabindex="-1"></a></span>
<span id="cb10-123"><a href="#cb10-123" tabindex="-1"></a>    <span class="cf">do</span> <span class="op">{</span></span>
<span id="cb10-124"><a href="#cb10-124" tabindex="-1"></a>        sent <span class="op">=</span> sendfile<span class="op">(</span>sk<span class="op">,</span> body_fd<span class="op">,</span> NULL<span class="op">,</span> body_stat<span class="op">.</span>st_size<span class="op">);</span></span>
<span id="cb10-125"><a href="#cb10-125" tabindex="-1"></a></span>
<span id="cb10-126"><a href="#cb10-126" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>sent <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-127"><a href="#cb10-127" tabindex="-1"></a>            perror<span class="op">(</span><span class="st">&quot;[-] sendfile&quot;</span><span class="op">);</span></span>
<span id="cb10-128"><a href="#cb10-128" tabindex="-1"></a>            <span class="cf">goto</span> close_body<span class="op">;</span></span>
<span id="cb10-129"><a href="#cb10-129" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb10-130"><a href="#cb10-130" tabindex="-1"></a></span>
<span id="cb10-131"><a href="#cb10-131" tabindex="-1"></a>        sent_total <span class="op">+=</span> sent<span class="op">;</span></span>
<span id="cb10-132"><a href="#cb10-132" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">while</span> <span class="op">(</span>sent <span class="op">&gt;</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb10-133"><a href="#cb10-133" tabindex="-1"></a></span>
<span id="cb10-134"><a href="#cb10-134" tabindex="-1"></a>close_body<span class="op">:</span></span>
<span id="cb10-135"><a href="#cb10-135" tabindex="-1"></a>    close<span class="op">(</span>body_fd<span class="op">);</span></span>
<span id="cb10-136"><a href="#cb10-136" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb10-137"><a href="#cb10-137" tabindex="-1"></a></span>
<span id="cb10-138"><a href="#cb10-138" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> argc<span class="op">,</span> <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> argv<span class="op">[])</span></span>
<span id="cb10-139"><a href="#cb10-139" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb10-140"><a href="#cb10-140" tabindex="-1"></a>    <span class="dt">const</span> pid_t orig_parent <span class="op">=</span> getppid<span class="op">();</span></span>
<span id="cb10-141"><a href="#cb10-141" tabindex="-1"></a>    <span class="dt">const</span> <span class="kw">struct</span> sockaddr_in self_addr <span class="op">=</span> <span class="op">{</span></span>
<span id="cb10-142"><a href="#cb10-142" tabindex="-1"></a>        <span class="op">.</span>sin_family <span class="op">=</span> AF_INET<span class="op">,</span></span>
<span id="cb10-143"><a href="#cb10-143" tabindex="-1"></a>        <span class="op">.</span>sin_port <span class="op">=</span> htons<span class="op">(</span><span class="dv">9000</span><span class="op">),</span></span>
<span id="cb10-144"><a href="#cb10-144" tabindex="-1"></a>        <span class="op">.</span>sin_addr <span class="op">=</span> <span class="op">{</span></span>
<span id="cb10-145"><a href="#cb10-145" tabindex="-1"></a>            htonl<span class="op">(</span>INADDR_LOOPBACK<span class="op">)</span></span>
<span id="cb10-146"><a href="#cb10-146" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb10-147"><a href="#cb10-147" tabindex="-1"></a>    <span class="op">};</span></span>
<span id="cb10-148"><a href="#cb10-148" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> listen_sk <span class="op">=</span> socket<span class="op">(</span>AF_INET<span class="op">,</span> SOCK_STREAM<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb10-149"><a href="#cb10-149" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> public_dir <span class="op">=</span> open<span class="op">(</span>argv<span class="op">[</span><span class="dv">1</span><span class="op">],</span> O_PATH<span class="op">);</span></span>
<span id="cb10-150"><a href="#cb10-150" tabindex="-1"></a>    <span class="kw">struct</span> sockaddr client_addr<span class="op">;</span></span>
<span id="cb10-151"><a href="#cb10-151" tabindex="-1"></a>    socklen_t addr_len <span class="op">=</span> <span class="kw">sizeof</span><span class="op">(</span>client_addr<span class="op">);</span></span>
<span id="cb10-152"><a href="#cb10-152" tabindex="-1"></a></span>
<span id="cb10-153"><a href="#cb10-153" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>argc <span class="op">&lt;</span> <span class="dv">2</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-154"><a href="#cb10-154" tabindex="-1"></a>        dprintf<span class="op">(</span>STDERR_FILENO<span class="op">,</span></span>
<span id="cb10-155"><a href="#cb10-155" tabindex="-1"></a>            <span class="st">&quot;usage: </span><span class="sc">%s</span><span class="st"> &lt;dir to serve files from&gt;</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span></span>
<span id="cb10-156"><a href="#cb10-156" tabindex="-1"></a>            argv<span class="op">[</span><span class="dv">0</span><span class="op">]);</span></span>
<span id="cb10-157"><a href="#cb10-157" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb10-158"><a href="#cb10-158" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-159"><a href="#cb10-159" tabindex="-1"></a></span>
<span id="cb10-160"><a href="#cb10-160" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>bind<span class="op">(</span>listen_sk<span class="op">,</span> <span class="op">(</span><span class="kw">struct</span> sockaddr <span class="op">*)&amp;</span>self_addr<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>self_addr<span class="op">)))</span> <span class="op">{</span></span>
<span id="cb10-161"><a href="#cb10-161" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;bind&quot;</span><span class="op">);</span></span>
<span id="cb10-162"><a href="#cb10-162" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb10-163"><a href="#cb10-163" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-164"><a href="#cb10-164" tabindex="-1"></a></span>
<span id="cb10-165"><a href="#cb10-165" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>listen<span class="op">(</span>listen_sk<span class="op">,</span> <span class="dv">8</span><span class="op">))</span> <span class="op">{</span></span>
<span id="cb10-166"><a href="#cb10-166" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;listen&quot;</span><span class="op">);</span></span>
<span id="cb10-167"><a href="#cb10-167" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb10-168"><a href="#cb10-168" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-169"><a href="#cb10-169" tabindex="-1"></a></span>
<span id="cb10-170"><a href="#cb10-170" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;[+] Listening; press Ctrl-C to exit...</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb10-171"><a href="#cb10-171" tabindex="-1"></a></span>
<span id="cb10-172"><a href="#cb10-172" tabindex="-1"></a>    <span class="cf">while</span> <span class="op">(</span>orig_parent <span class="op">==</span> getppid<span class="op">())</span> <span class="op">{</span></span>
<span id="cb10-173"><a href="#cb10-173" tabindex="-1"></a>        <span class="dt">const</span> <span class="dt">int</span> sk <span class="op">=</span> accept<span class="op">(</span>listen_sk<span class="op">,</span> <span class="op">&amp;</span>client_addr<span class="op">,</span> <span class="op">&amp;</span>addr_len<span class="op">);</span></span>
<span id="cb10-174"><a href="#cb10-174" tabindex="-1"></a></span>
<span id="cb10-175"><a href="#cb10-175" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>sk <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-176"><a href="#cb10-176" tabindex="-1"></a>            perror<span class="op">(</span><span class="st">&quot;[-] accept&quot;</span><span class="op">);</span></span>
<span id="cb10-177"><a href="#cb10-177" tabindex="-1"></a>            <span class="cf">break</span><span class="op">;</span></span>
<span id="cb10-178"><a href="#cb10-178" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb10-179"><a href="#cb10-179" tabindex="-1"></a></span>
<span id="cb10-180"><a href="#cb10-180" tabindex="-1"></a>        printf<span class="op">(</span><span class="st">&quot;[+] Accepted Connection</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb10-181"><a href="#cb10-181" tabindex="-1"></a></span>
<span id="cb10-182"><a href="#cb10-182" tabindex="-1"></a>        serve_file<span class="op">(</span>sk<span class="op">,</span> public_dir<span class="op">);</span></span>
<span id="cb10-183"><a href="#cb10-183" tabindex="-1"></a>        close<span class="op">(</span>sk<span class="op">);</span></span>
<span id="cb10-184"><a href="#cb10-184" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-185"><a href="#cb10-185" tabindex="-1"></a></span>
<span id="cb10-186"><a href="#cb10-186" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb10-187"><a href="#cb10-187" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>I tested this on Firefox and Chromium. Neither seemed too
            concerned that most of the things they asked for were
            ignored. They didn’t cope very well without the
            content-length header though and Chromium also needs the
            MIME type to be spelled out for it.</p>
            <p>All of the HTTP complication is in
            <code>serve_file</code>, so <code>main</code> shows just
            what is involved in accepting an incoming TCP connection.
            The client side is simpler; you just need to call
            <code>connect</code>.</p>
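            <p>For comparison, here is a minimal sketch of the client
            side, assuming the server above is listening on the
            loopback address. The helper name
            <code>connect_loopback</code> is made up for this example
            and is not part of the server.</p>
            <div class="sourceCode"><pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to the loopback address
 * on the given port. Returns the connected socket or -1 on error. */
int connect_loopback(const unsigned short port)
{
    const struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(port),
        .sin_addr = { htonl(INADDR_LOOPBACK) }
    };
    const int sk = socket(AF_INET, SOCK_STREAM, 0);

    if (sk < 0)
        return -1;

    /* No bind or listen is needed; the kernel picks the local port */
    if (connect(sk, (const struct sockaddr *)&addr, sizeof(addr))) {
        close(sk);
        return -1;
    }

    return sk;
}
```
]]></code></pre></div>
            <p>Calling <code>connect_loopback(9000)</code> while the
            server is running should return a socket that can be
            written to and read from like any other file
            descriptor.</p>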
            <p>Inside <code>serve_file</code> we first load the whole
            HTTP header into a buffer. We do this by reading until we
            see the first instance of two consecutive line breaks
            (<code>\r\n\r\n</code>). HTTP doesn’t appear to set any
            limit on the size of a header. It also has a dreadful
            feature which allows “comments” to be put in some header
            fields which are delimited by <code>(</code> and
            <code>)</code>. These can contain <code>\r\n\r\n</code>. It
            doesn’t matter to us though because we ignore most of the
            header and are not trying to be standards compliant.</p>
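            <p>The header loop can be pulled out into a function of its
            own. This is only a sketch: the name
            <code>read_header</code> and its return convention are
            assumptions, and unlike the listing above it reserves one
            byte of the buffer for the terminating
            <code>\0</code>.</p>
            <div class="sourceCode"><pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <string.h>
#include <unistd.h>

/* Read from fd until the blank line ending an HTTP header ("\r\n\r\n")
 * is seen. Returns the number of bytes read, 0 if the peer closed the
 * connection or the buffer filled up first, and -1 on a read error.
 * One byte of buf is reserved so the result is always terminated. */
ssize_t read_header(const int fd, char *const buf, const size_t buf_len)
{
    size_t total = 0;

    while (total + 1 < buf_len) {
        const ssize_t n = read(fd, buf + total, buf_len - 1 - total);

        if (n < 0)
            return -1;
        if (n == 0)
            return 0;

        total += n;
        buf[total] = '\0';

        /* Rescans from the start each time; fine for small headers */
        if (strstr(buf, "\r\n\r\n"))
            return total;
    }

    return 0;
}
```
]]></code></pre></div>
            <p>Because it only needs a file descriptor, it can be tried
            out on a pipe as well as on a socket.</p>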
            <p>The browser would prefer it if we kept the connection
            open between requests, but it’s easier for us just to close
            it. However, it should be noted that opening and closing TCP
            connections is expensive. It seems that Firefox even
            preemptively opens a connection when you move your mouse
            towards a link.</p>
            <p>Anyway, once we have some complete data then we scan the
            first line of it to get the URI path. We only accept paths
            up to 250 characters long, which leaves another 5 characters
            for “.html” to be added, plus <code>\0</code>, the null
            character.</p>
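            <p>The buffer arithmetic can be written down explicitly.
            In this sketch the macro and function names are
            hypothetical; only the <code>sscanf</code> format string
            comes from the listing above. Comparing the result with 1
            also catches the case where <code>sscanf</code> returns
            <code>EOF</code>.</p>
            <div class="sourceCode"><pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <stdio.h>
#include <string.h>

#define PATH_MAX_CHARS 250
/* 250 path characters + 5 for ".html" + 1 for the terminating '\0' */
#define PATH_BUF_LEN (PATH_MAX_CHARS + 5 + 1)

/* Extract the request path into path_buf, which must hold PATH_BUF_LEN
 * bytes. Returns 0 on success, -1 if the request line did not match. */
int parse_get_path(const char *const request, char *const path_buf)
{
    /* The field width must agree with PATH_MAX_CHARS; sscanf stores
     * at most that many characters and then appends a '\0'. */
    if (sscanf(request, "GET %250s HTTP/1.1", path_buf) != 1)
        return -1;

    return 0;
}
```
]]></code></pre></div>
            <p>After a successful call there is always room left to
            append <code>.html</code> to <code>path_buf</code> with
            <code>strcat</code>.</p>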
            <p>Unfortunately the C library’s string functions are prone
            to dangerous errors. It’s easy to overwrite the null
            terminating character <code>\0</code> or to forget it
            requires extra space in buffers. Also you need to pay
            attention to whether functions like <code>strlen</code>
            count <code>\0</code>. Then there are the attempted fixes
            for these functions, like <code>strncpy</code>, which make
            matters worse by potentially leaving strings
            unterminated.</p>
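            <p>One common way around the <code>strncpy</code> problem
            is to copy with <code>snprintf</code>, which always
            terminates the destination when its size is non-zero. The
            two helpers below are a sketch with made-up names, not
            something the server uses.</p>
            <div class="sourceCode"><pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if buf contains a '\0' within its first len bytes. */
int is_terminated(const char *const buf, const size_t len)
{
    return memchr(buf, '\0', len) != NULL;
}

/* Copy src into dst, truncating if necessary, and always terminate.
 * Returns 1 if the whole string fitted, 0 if it was truncated. */
int copy_str(char *const dst, const size_t dst_len, const char *const src)
{
    /* snprintf returns the length it wanted to write, not counting
     * the '\0', so a value >= dst_len means truncation. */
    const int n = snprintf(dst, dst_len, "%s", src);

    return n >= 0 && (size_t)n < dst_len;
}
```
]]></code></pre></div>
            <p>With a four byte destination,
            <code>strncpy(dst, "abcdef", 4)</code> fills all four bytes
            with letters, so <code>is_terminated</code> reports 0. By
            contrast <code>copy_str</code> stores <code>abc</code> plus
            the terminator and reports the truncation through its
            return value.</p>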
            <p>C itself does not help because by default there is no
            bounds checking, although thorough testing with the address
            sanitizer enabled can help with that.</p>
            <p>Eventually we open the file requested. As the file path
            is not validated, this could be any file on your system
            that the server is allowed to read. We use
            <code>openat</code> which takes, as the first argument, a
            file descriptor for a path to a directory (opened with
            <code>O_PATH</code>); not the directory itself, just the
            path to that directory. The second argument is the file
            path relative to the directory described by the FD. This
            avoids having to construct the full file path with
            <code>sprintf</code> or similar.</p>
            <p>We then <code>fstat</code> the file to get its size for
            the content-length header. The header is formatted and sent
            before writing the file content to the socket with
            <code>sendfile</code>.</p>
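            <p>Formatting the header with <code>snprintf</code> rather
            than <code>sprintf</code> makes it possible to detect a
            header that does not fit. In this sketch the layout in
            <code>head_fmt</code> and the name
            <code>format_head</code> are assumptions made for the
            example; the <code>http_head</code> string used by the
            server may look different.</p>
            <div class="sourceCode"><pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <stdio.h>
#include <sys/stat.h>

/* Assumed header layout; the server's http_head may differ. */
static const char head_fmt[] =
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: %s\r\n"
    "Content-Length: %lld\r\n"
    "\r\n";

/* Format a response header for the open file body_fd into head_buf.
 * Returns the header length, or -1 on error or if it would not fit. */
int format_head(char *const head_buf, const size_t head_len,
                const int body_fd, const char *const mime)
{
    struct stat body_stat;
    int n;

    if (fstat(body_fd, &body_stat))
        return -1;

    /* off_t has no printf conversion of its own, so cast it */
    n = snprintf(head_buf, head_len, head_fmt,
                 mime, (long long)body_stat.st_size);

    if (n < 0 || (size_t)n >= head_len)
        return -1;

    return n;
}
```
]]></code></pre></div>
            <p>The returned length can then be used as the number of
            bytes to write to the socket, instead of calling
            <code>strlen</code> on the buffer again.</p>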
            <p>The <code>sendfile</code> system call shown here is
            specific to Linux, although FreeBSD has a similar one, as
            no doubt other kernels do. It avoids having to read the
            file into a buffer before writing it back to the socket.
            The reason for this function’s existence is probably
            performance. However, it also happens to make things
            simpler, which is why it’s used here.</p>
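            <p>To show what <code>sendfile</code> saves us from, here
            is a sketch of the portable alternative, which copies
            through a buffer with <code>read</code> and
            <code>write</code>. The function name
            <code>copy_fd</code> and the 4096 byte buffer size are
            arbitrary choices for the example.</p>
            <div class="sourceCode"><pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <errno.h>
#include <unistd.h>

/* Copy everything from in_fd to out_fd through a userspace buffer.
 * Returns the number of bytes copied or -1 on error. */
ssize_t copy_fd(const int out_fd, const int in_fd)
{
    char buf[4096];
    ssize_t total = 0;

    for (;;) {
        const ssize_t got = read(in_fd, buf, sizeof(buf));
        ssize_t put = 0;

        if (got < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (got == 0)
            return total;

        /* write may accept fewer bytes than asked for, so loop */
        while (put < got) {
            const ssize_t n = write(out_fd, buf + put, got - put);

            if (n < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            put += n;
        }

        total += got;
    }
}
```
]]></code></pre></div>
            <p>Every byte crosses into user space and back again here,
            and there are two loops to get right instead of one.</p>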
            <p>Once we are finished sending the file, the FD and socket
            are closed. Then we wait for the next connection.</p>
            <div class="footnotes footnotes-end-of-document">
            <hr />
            <ol>
            <li id="fn1"><p>I’m using the term object in a loosely
            defined way. There are a number of C structs and associated
            data used to represent a socket in the kernel. Exactly what
            is encapsulated in the socket object and what is external to
            it is unclear.<a href="#fnref1"
            class="footnote-back">↩︎</a></p></li>
            </ol>
            </div>
    </div>
  </content>
</entry>
<entry>
  <title>LocalAGI: Create and run AI agents locally without writing
code</title>
  <id>https://richiejp.com/localagi-announcement-local-agents</id>
  <published>2025-05-02T09:04:13+01:00</published>
  <updated>2025-05-02T09:04:13+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/localagi-announcement-local-agents" />
  <summary>Announcing an Open Source project which makes creating local
AI agents, assistants and chat bots easy</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p align="center">
            <img src="/https/richiejp.com/localagi-logo-2.png" alt="LocalAGI Logo" width="220"/>
            </p>
            <p><a href="https://github.com/mudler/LocalAGI">LocalAGI
            allows you to create “AI agents”</a> that run completely on
            your hardware and can access your systems to perform tasks.
            It’s Open Source, allowing you to self-host, modify and
            audit it. You don’t need an internet connection to run it
            and none of your data leaves your hardware unless you want
            it to.</p>
            <p>It comes with a web UI that allows you to configure and
            chat to AI agents. You can also configure connectors, for
            instance Slack. Then you can interact with agents on Slack
            in a similar way to a regular user. If you don’t trust
            Slack, you can use a locally hosted IRC server, or if there
            is another platform you prefer to communicate on, just let
            us know.</p>
            <iframe width="560" height="315" src="https://www.youtube.com/embed/HtVwIxW3ePg?si=ke1vq1F_oAKeuy0c" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="allowfullscreen">
            </iframe>
            <p>There are built-in actions for things like search,
            posting to social media and creating GitHub issues. So, for
            example, you can ask an agent on Slack to summarize
            discussions and create GitHub issues from them.</p>
            <p>To give a more specific case: we had a long discussion on
            Slack about loop detection in agent planning. We called on
            an agent to create a GitHub issue from this discussion. The
            result is a nicely formatted issue with acceptance
            criteria.</p>
            <p>The agent can also provide background information in the
            ticket by doing an internet search, and break the issue down
            into sub-tasks. The result isn’t always what we’d want, but there
            is a kernel of something very useful here on which to
            build.</p>
            <p>In order to create the issue we didn’t need to manually
            browse to GitHub or integrate Slack with GitHub. It’s
            handled by the agent and the workflow can be transported to
            other chat and issue tracking software, let’s say Matrix and
            GitLab. Once we have added the relevant “connectors”, for
            you it is just a case of entering the connection
            details.</p>
            <iframe width="560" height="315" src="https://www.youtube.com/embed/v82rswGJt_M?si=SeP8TMEN4X-oLHZE" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="allowfullscreen">
            </iframe>
            <p>The ability to search the internet is just one of the
            kinds of research we could do. Another is that the agent
            automatically checks for similar issues using semantic
            search and then links to them. It could also search the
            code, it could ping people involved in previous similar
            discussions and so on. We don’t quite have all of the pieces
            working in harmony yet, but the foundations are set.</p>
            <p>As long as an agent has the tools and connector
            configurations necessary, it can include these in a
            workflow. The agents know what tools they have available to
            them and can create a plan to use the tools together to
            achieve a goal.</p>
            <p>You can also provide agents with instructions depending
            on the context (dynamic prompts), so that when a particular
            thing happens the agent has a playbook to follow. For
            example, when someone asks it on Slack to create an issue, it
            is given the instructions for creating an issue from a Slack
            discussion. There is also a knowledge base which
            agents can search and retrieve instructions from.</p>
            <p>Essentially you can create standard operating procedures
            for your AI, using natural language in much the same way as
            you would with human agents. Creating an issue from a
            discussion saves a bit of time, but performing a wide set of
            validation checks on an issue can save a lot of time. Also, while
            LLMs sometimes do the wrong thing, they don’t get bored of
            following procedures, so they can be given the least
            interesting work to do without fear of upsetting them.</p>
            <iframe width="560" height="315" src="https://www.youtube.com/embed/d_we-AYksSw?si=czKzXTpbv76Aw7Ia" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen>
            </iframe>
            <p>It’s multi-modal, so you can give it pictures which it
            will look at and describe, or use as a reference to
            generate new images. We use this to automatically create
            avatars for agents, but potentially there are more serious
            uses. For instance, let’s say a customer e-mails a picture
            of a faulty product; the agent can use image recognition to
            describe the state of the product, say whether or not the
            picture shows the serial number, and suggest a response.</p>
            <p>You can create groups of agents with different personas
            and capabilities. For example you could have one agent which
            is configured to be good at reasoning and planning. Then
            another which is configured with the model and persona to
            create posts for X and another specifically for critiquing
            the posts.</p>
            <p>The planning and reasoning agent can create a high level
            plan for converting some promotional material into a series
            of posts. It can then call on the creative agent a number of
            times to create the posts and the critique agent to check
            each result.</p>
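            <p>As a very rough sketch of that control flow, with each
            “agent” reduced to a plain function: none of this is
            LocalAGI’s API, and <code>planAgent</code>,
            <code>writeAgent</code>, <code>critiqueAgent</code> and
            <code>runCampaign</code> are names invented for the
            example.</p>

```javascript
// Hypothetical sketch: each "agent" is a plain function, not LocalAGI's API.

// Planner: turns promotional material into one brief per sentence.
function planAgent(material) {
  return material
    .split(".")
    .map((sentence) => sentence.trim())
    .filter((sentence) => sentence.length > 0);
}

// Creative agent: drafts a post from a brief; later attempts are terser.
function writeAgent(brief, attempt) {
  return attempt === 0
    ? `Big news! ${brief}. Find out more on our site today!`
    : `${brief}.`;
}

// Critique agent: accepts a draft only if it fits the length limit.
function critiqueAgent(draft, maxLength) {
  return draft.length > maxLength ? "reject" : "accept";
}

// The planner drives the other two, retrying a rejected draft a few times.
function runCampaign(material, maxLength, maxAttempts) {
  const posts = [];
  for (const brief of planAgent(material)) {
    for (const attempt of Array(maxAttempts).keys()) {
      const draft = writeAgent(brief, attempt);
      if (critiqueAgent(draft, maxLength) === "accept") {
        posts.push(draft);
        break;
      }
    }
  }
  return posts;
}

const posts = runCampaign("LocalAGI runs locally. Agents can work in teams", 40, 2);
console.log(posts);
```

            <p>In a real setup each function would be a call to a
            separately configured agent; the point is only that the
            planner owns the loop and the critic owns the accept or
            reject decision.</p>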
            <p>This allows you to organise and think about your agents
            in a natural way. Essentially like a team of humans, each
            with different capabilities, personalities and
            authorisations. So if you want to write some code, let’s say
            a custom action module for LocalAGI that lets agents get
            the weather forecast from the UK Met Office, then you ask
            the agent which is configured to write code for LocalAGI
            custom actions.</p>
            <p>You don’t have to worry about your coding agent randomly
            making a post to LinkedIn if it’s not been configured with
            that capability. Meanwhile the agent which does have that
            ability could be set up as a gatekeeper, so that it is
            instructed not to produce content itself, but instead to
            only handle requests to post content. Before posting it
            reviews each request against some criteria and may reject
            the post.</p>
            <h1 id="local-first">Local First</h1>
            <p>All of this can be done locally using hardware that at
            the very least small to medium businesses can afford. In
            fact I run it on a $300 GPU from Intel which is good enough
            for experimentation and I suspect that, with tuning and
            refinement, it would be good enough for a number of production use
            cases.</p>
            <p>This means you won’t have to hand over all of your data
            to the major AI service providers, nor will all of your
            systems stop when the internet goes down.</p>
            <p>However, you can still use external LLM providers, and it
            is possible to mix local and remote providers, so that
            one agent uses a remote provider while others use
            local ones.</p>
            <p>For the adventurous, LocalAGI is ready today, but if you
            are not quite ready to dive in now please still head over to
            <a href="https://github.com/mudler/LocalAGI">GitHub and star
            it</a>. Follow Ettore Di Giacinto and myself for development
            updates. Between my starting to write this
            article and releasing it, Ettore implemented a browser
            operator and a coding agent, so things are moving fast.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Getting NodeJS 18.x to run on the nanos unikernel</title>
  <id>https://richiejp.com/nanos-clone3-brk-and-nodejs</id>
  <published>2022-06-28T17:47:52+01:00</published>
  <updated>2022-06-28T17:47:52+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/nanos-clone3-brk-and-nodejs" />
  <summary>Implementing clone3, debugging brk and weird Node
behavior</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>I told myself, “Don’t do anything fancy, no new tech,
            just write the booking app how most web developers write
            apps”. A few months later I am implementing the
            <code>clone3</code> system call in the nanos unikernel so
            that it can run programs linked with new versions of
            glibc.</p>
            <p>Why do I need to run my nodejs app on a unikernel, you
            ask? Because it’s more efficient and very simple… at least
            it would be simple if <a
            href="ways-to-help-your-project-fail">I took my own
            advice</a> and didn’t need cutting edge versions of
            everything. Also, if I strictly adhered to my own advice,
            then nanos would be too new as well.</p>
            <p>Nanos is pretty amazing though, it implements enough
            Linux system calls that it can run hefty programs like Node
            which were compiled for Linux.</p>
            <p>You run a command and it puts a program in an image along
            with any libraries it’s linked to. Then the image can be run
            in the cloud or QEMU. It’s a lot like creating a container
            from scratch except that, in production, you don’t need to
            boot Linux and have an init program start the container.
            This cuts out a huge number of layers.</p>
            <p>If you are not sure what I mean, then look at how
            containers are run on <a
            href="https://fly.io/docs/reference/architecture/">Fly.io</a>.
            They run one container per VM. They are not running a full
            Linux distro in the VM; it’s just the kernel and an init
            process which spawns the container image.</p>
            <p>In nanos there is no separate init process, it’s not
            needed or possible. It runs your app executable directly.
            There is no creating new PID or user namespaces or even just
            processes. It doesn’t even have processes or users (plural).
            It has one process and one (fake) user.</p>
            <p>This is a major limitation, but it also makes everything
            faster and simpler, to the extent that it can beat Linux in
            benchmarks even though nanos has not been optimised at all by
            Linux standards. Nanos also retains the kernel-user-land
            barrier, which means it is sacrificing some performance in
            comparison to other unikernels. So it’s not just running
            everything in kernel-land (which would save on context
            switches).</p>
            <p>It does have threads however, so multi-threaded
            applications can run. With the exception of thread local
            storage, threads share the same virtual address space and
            underlying memory. They also share the same file descriptors
            and other resources. Nanos doesn’t need to implement copy on
            write and such to efficiently support threads.</p>
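            <p>To see what that sharing looks like from Node’s side,
            here is a minimal sketch, assuming a Node version with
            <code>worker_threads</code> (12 or later). A worker is a
            real thread in the same process, so a
            <code>SharedArrayBuffer</code> handed to it is the same
            memory, not a copy. The variable names are made up for the
            example.</p>

```javascript
// Sketch: two threads of one process see the same memory.
// Assumes Node.js 12+ (worker_threads, SharedArrayBuffer, Atomics).
const { Worker } = require("worker_threads");

// One shared buffer: slot 0 holds a value, slot 1 is a "done" flag.
const shared = new Int32Array(new SharedArrayBuffer(8));

// Source for the worker, which runs on a new thread of this process.
const workerSource = `
  const { workerData } = require("worker_threads");
  Atomics.store(workerData, 0, 42);  // write into the shared memory
  Atomics.store(workerData, 1, 1);   // raise the flag
  Atomics.notify(workerData, 1);     // wake the waiting thread
`;
const worker = new Worker(workerSource, { eval: true, workerData: shared });

// Block this thread until the flag leaves 0 (give up after 5 seconds).
const outcome = Atomics.wait(shared, 1, 0, 5000);
const seen = Atomics.load(shared, 0);
console.log(outcome, seen);
worker.unref();
```

            <p>If the worker had its own address space, the main thread
            would never see the value the worker stored. Creating that
            worker thread is the kind of operation that ends up in
            <code>clone</code> or <code>clone3</code> on Linux.</p>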
            <p>On the downside, my app uses Redis as the database. Redis
            forks a new process to write to storage in the background.
            It’s possible to run Redis on nanos, but it won’t persist to
            disk (or at least it won’t rewrite the RDB/AOF, something
            like that). The solution is to rewrite Redis to use threads,
            but let’s just ignore that for now.</p>
            <p>Something to note here is that I’m using nodejs because
            of SvelteKit. It’s also, strictly speaking, too new. However
            it’s really good, much better than React. I won’t get into
            that here, it deserves its own article. The thing to note
            though is that if you throw out nodejs and use Go, Rust, <a
            href="zig-vs-c-mini-http-server">Zig</a>, Pony (maybe) etc.,
            then you are not likely to hit the issues I did with
            node.</p>
            <h1 id="running-node">Running node</h1>
            <p>How do you make and run a nanos-node image? Well, we can
            make a simple web server with the following js:</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode js"><code class="sourceCode javascript"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="co">// hi.js</span></span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a><span class="kw">var</span> http <span class="op">=</span> <span class="pp">require</span>(<span class="st">&#39;http&#39;</span>)<span class="op">;</span></span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a>http<span class="op">.</span><span class="fu">createServer</span>(<span class="kw">function</span> (req<span class="op">,</span> res) {</span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a>    res<span class="op">.</span><span class="fu">writeHead</span>(<span class="dv">200</span><span class="op">,</span> {<span class="st">&#39;Content-Type&#39;</span><span class="op">:</span> <span class="st">&#39;text/plain&#39;</span>})<span class="op">;</span></span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a>    res<span class="op">.</span><span class="fu">end</span>(<span class="st">&#39;Hello World</span><span class="sc">\n</span><span class="st">&#39;</span>)<span class="op">;</span></span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a>})<span class="op">.</span><span class="fu">listen</span>(<span class="dv">8083</span><span class="op">,</span> <span class="st">&quot;0.0.0.0&quot;</span>)<span class="op">;</span></span>
<span id="cb1-7"><a href="#cb1-7" tabindex="-1"></a><span class="bu">console</span><span class="op">.</span><span class="fu">log</span>(<span class="st">&#39;Server running at http://127.0.0.1:8083/&#39;</span>)<span class="op">;</span></span></code></pre></div>
            <p>And a manifest for the image, saved as
            <code>config.json</code>:</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode js"><code class="sourceCode javascript"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a>{</span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a>    <span class="st">&quot;Args&quot;</span><span class="op">:</span> [<span class="st">&quot;hi.js&quot;</span>]<span class="op">,</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a>    <span class="st">&quot;Files&quot;</span><span class="op">:</span> [<span class="st">&quot;hi.js&quot;</span>]</span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a>}</span></code></pre></div>
            <p>Then run it locally with <code>ops</code> in QEMU
            with:</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="ex">$</span> ops run /usr/bin/node <span class="at">-a</span> hi.js <span class="at">-c</span> config.json</span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a><span class="ex">...</span></span></code></pre></div>
            <p>The <code>ops</code> command is made specifically for
            building and deploying nanos images. This works somewhat
            like creating a container from scratch and copying in node,
            node’s libs and <code>hi.js</code>, except that it creates a
            raw bootable VM image.</p>
            <p>The above works fine when used with versions of node
            compiled for older distros which use older kernels. However
            there is a problem for people living on the cutting
            edge…</p>
            <h1 id="implementing-clone3">Implementing clone3</h1>
            <p>What happens if we run node 17/18 from OpenSUSE
            Tumbleweed or Nix?</p>
            <p>…</p>
            <p>Uh, well, it works just fine now! Perhaps I was
            hallucinating or someone got fed up of containers randomly
            breaking because <code>clone3</code> is disallowed by
            seccomp or similar.</p>
            <p>Let’s just pretend that it fails and talk about my
            <code>clone3</code> implementation. Running <code>ops</code>
            with <code>--trace</code> shows that node dies when it tries
            to use syscall 435. We can find out what syscall that is by
            looking in
            <code>$linux_tree/arch/x86/entry/syscalls/syscall_64.tbl</code>.</p>
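            <p>That lookup can be sketched in a few lines of JavaScript.
            The excerpt below is hand-copied in the format of the table
            (number, ABI, name, entry point) and is not the full file;
            <code>tableExcerpt</code> and <code>syscallName</code> are
            names made up for this example.</p>

```javascript
// A few lines in the format of Linux's arch/x86/entry/syscalls/syscall_64.tbl.
// Columns: number, ABI, name, entry point. Excerpt only, not the full table.
const tableExcerpt = [
  "# comment lines start with a hash",
  "56\t64\tclone\t\t\tsys_clone",
  "57\t64\tfork\t\t\tsys_fork",
  "435\tcommon\tclone3\t\t\tsys_clone3",
].join("\n");

// Return the syscall name for a number, or undefined if it is not listed.
function syscallName(table, number) {
  for (const line of table.split("\n")) {
    if (line.startsWith("#") || line.trim() === "") continue;
    const [num, , name] = line.split(/\s+/);
    if (Number(num) === number) return name;
  }
  return undefined;
}

console.log(syscallName(tableExcerpt, 435)); // clone3
```

            <p>To run it against a real kernel tree you would pass the
            file’s contents, read with <code>fs.readFileSync</code>,
            instead of the excerpt.</p>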
            <p>It’s <code>clone3</code> of course, which I happened to
            previously write a test for in the Linux Test Project. It’s
            a nicer interface than <code>clone</code>, especially as the
            arguments don’t change position on different platforms. It’s
            also more extensible, allowing new processes to be cloned
            directly into new namespaces and CGroups, which I have had <a
            href="cgroup-compat-layer">lots of fun with</a>.</p>
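            <p>To make the point about argument positions concrete, here
            is a small model of it. The two orderings are taken from the
            nanos <code>clone</code> signatures shown further down; the
            field names follow <code>clone_args</code>, while
            <code>CLONE_ARG_ORDER</code> and <code>namedCloneArgs</code>
            are invented for the example.</p>

```javascript
// Positional order of the raw clone arguments on two architectures.
// Only the last two arguments swap places.
const CLONE_ARG_ORDER = {
  x86_64: ["flags", "stack", "parent_tid", "child_tid", "tls"],
  aarch64: ["flags", "stack", "parent_tid", "tls", "child_tid"],
};

// Turn positional clone arguments into named fields, clone3 style.
function namedCloneArgs(arch, positional) {
  const order = CLONE_ARG_ORDER[arch];
  if (order === undefined) throw new Error(`unknown arch: ${arch}`);
  return Object.fromEntries(order.map((name, i) => [name, positional[i]]));
}

// The same logical request, written in each architecture's order.
const onX86 = namedCloneArgs("x86_64", [0x10000, 0x7000, 0x100, 0x200, 0x300]);
const onArm = namedCloneArgs("aarch64", [0x10000, 0x7000, 0x100, 0x300, 0x200]);
console.log(onX86.tls === onArm.tls, onX86.child_tid === onArm.child_tid);
```

            <p>With <code>clone3</code> the caller fills in named struct
            fields directly, so there is no per-architecture ordering to
            get wrong.</p>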
            <div class="message is-info">
            <div class="message-body">
            <p>By the way, <code>clone</code> is used for spawning new
            processes (usually done with <code>fork</code>, which is
            implemented in terms of <code>clone</code> these days) or
            for spawning new threads, which is usually done with the
            POSIX pthreads library.</p>
            </div>
            </div>
            <p>Luckily nanos doesn’t even have processes, never mind
            CGroups. So <code>clone3</code> really just allows the stack
            size to be specified. Otherwise there’s no difference between
            it and <code>clone</code>.</p>
            <p>The original nanos <code>clone</code> is implemented in
            <code>$nanos_tree/src/unix/thread.c</code>. It looks like
            the following:</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a><span class="pp">#if defined(__x86_64__)</span></span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a>sysreturn clone<span class="op">(</span><span class="dt">unsigned</span> <span class="dt">long</span> flags<span class="op">,</span> <span class="dt">void</span> <span class="op">*</span>child_stack<span class="op">,</span> <span class="dt">int</span> <span class="op">*</span>ptid<span class="op">,</span> <span class="dt">int</span> <span class="op">*</span>ctid<span class="op">,</span> <span class="dt">unsigned</span> <span class="dt">long</span> newtls<span class="op">)</span></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a><span class="pp">#elif defined(__aarch64__) || defined(__riscv)</span></span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a>sysreturn clone<span class="op">(</span><span class="dt">unsigned</span> <span class="dt">long</span> flags<span class="op">,</span> <span class="dt">void</span> <span class="op">*</span>child_stack<span class="op">,</span> <span class="dt">int</span> <span class="op">*</span>ptid<span class="op">,</span> <span class="dt">unsigned</span> <span class="dt">long</span> newtls<span class="op">,</span> <span class="dt">int</span> <span class="op">*</span>ctid<span class="op">)</span></span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a><span class="pp">#endif</span></span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a>    thread_log<span class="op">(</span>current<span class="op">,</span> <span class="st">&quot;clone: flags </span><span class="sc">%lx</span><span class="st">, child_stack </span><span class="sc">%p</span><span class="st">, ptid </span><span class="sc">%p</span><span class="st">, ctid </span><span class="sc">%p</span><span class="st">, newtls </span><span class="sc">%lx</span><span class="st">&quot;</span><span class="op">,</span></span>
<span id="cb4-8"><a href="#cb4-8" tabindex="-1"></a>        flags<span class="op">,</span> child_stack<span class="op">,</span> ptid<span class="op">,</span> ctid<span class="op">,</span> newtls<span class="op">);</span></span>
<span id="cb4-9"><a href="#cb4-9" tabindex="-1"></a></span>
<span id="cb4-10"><a href="#cb4-10" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!(</span>flags <span class="op">&amp;</span> CLONE_THREAD<span class="op">))</span> <span class="op">{</span></span>
<span id="cb4-11"><a href="#cb4-11" tabindex="-1"></a>        thread_log<span class="op">(</span>current<span class="op">,</span> <span class="st">&quot;attempted to create new process, aborting.&quot;</span><span class="op">);</span></span>
<span id="cb4-12"><a href="#cb4-12" tabindex="-1"></a>        <span class="cf">return</span> set_syscall_error<span class="op">(</span>current<span class="op">,</span> ENOSYS<span class="op">);</span></span>
<span id="cb4-13"><a href="#cb4-13" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb4-14"><a href="#cb4-14" tabindex="-1"></a></span>
<span id="cb4-15"><a href="#cb4-15" tabindex="-1"></a>    <span class="co">/* no stack size given, just validate the top word */</span></span>
<span id="cb4-16"><a href="#cb4-16" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>validate_user_memory<span class="op">(</span>child_stack<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>u64<span class="op">),</span> <span class="kw">true</span><span class="op">))</span></span>
<span id="cb4-17"><a href="#cb4-17" tabindex="-1"></a>        <span class="cf">return</span> set_syscall_error<span class="op">(</span>current<span class="op">,</span> EFAULT<span class="op">);</span></span>
<span id="cb4-18"><a href="#cb4-18" tabindex="-1"></a></span>
<span id="cb4-19"><a href="#cb4-19" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(((</span>flags <span class="op">&amp;</span> CLONE_PARENT_SETTID<span class="op">)</span> <span class="op">&amp;&amp;</span></span>
<span id="cb4-20"><a href="#cb4-20" tabindex="-1"></a>         <span class="op">!</span>validate_user_memory<span class="op">(</span>ptid<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span><span class="dt">int</span><span class="op">),</span> <span class="kw">true</span><span class="op">))</span> <span class="op">||</span></span>
<span id="cb4-21"><a href="#cb4-21" tabindex="-1"></a>        <span class="op">((</span>flags <span class="op">&amp;</span> CLONE_CHILD_CLEARTID<span class="op">)</span> <span class="op">&amp;&amp;</span></span>
<span id="cb4-22"><a href="#cb4-22" tabindex="-1"></a>         <span class="op">!</span>validate_user_memory<span class="op">(</span>ctid<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span><span class="dt">int</span><span class="op">),</span> <span class="kw">true</span><span class="op">)))</span></span>
<span id="cb4-23"><a href="#cb4-23" tabindex="-1"></a>        <span class="cf">return</span> set_syscall_error<span class="op">(</span>current<span class="op">,</span> EFAULT<span class="op">);</span></span>
<span id="cb4-24"><a href="#cb4-24" tabindex="-1"></a></span>
<span id="cb4-25"><a href="#cb4-25" tabindex="-1"></a>    thread t <span class="op">=</span> create_thread<span class="op">(</span>current<span class="op">-&gt;</span>p<span class="op">,</span> INVALID_PHYSICAL<span class="op">);</span></span>
<span id="cb4-26"><a href="#cb4-26" tabindex="-1"></a>    context_frame f <span class="op">=</span> thread_frame<span class="op">(</span>t<span class="op">);</span></span>
<span id="cb4-27"><a href="#cb4-27" tabindex="-1"></a>    <span class="co">/* clone frame processor state */</span></span>
<span id="cb4-28"><a href="#cb4-28" tabindex="-1"></a>    clone_frame_pstate<span class="op">(</span>f<span class="op">,</span> thread_frame<span class="op">(</span>current<span class="op">));</span></span>
<span id="cb4-29"><a href="#cb4-29" tabindex="-1"></a>    thread_clone_sigmask<span class="op">(</span>t<span class="op">,</span> current<span class="op">);</span></span>
<span id="cb4-30"><a href="#cb4-30" tabindex="-1"></a></span>
<span id="cb4-31"><a href="#cb4-31" tabindex="-1"></a>    <span class="co">/* clone behaves like fork at the syscall level, returning 0 to the child */</span></span>
<span id="cb4-32"><a href="#cb4-32" tabindex="-1"></a>    set_syscall_return<span class="op">(</span>t<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb4-33"><a href="#cb4-33" tabindex="-1"></a>    f<span class="op">[</span>SYSCALL_FRAME_SP<span class="op">]</span> <span class="op">=</span> u64_from_pointer<span class="op">(</span>child_stack<span class="op">);</span></span>
<span id="cb4-34"><a href="#cb4-34" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>flags <span class="op">&amp;</span> CLONE_SETTLS<span class="op">)</span></span>
<span id="cb4-35"><a href="#cb4-35" tabindex="-1"></a>        set_tls<span class="op">(</span>f<span class="op">,</span> newtls<span class="op">);</span></span>
<span id="cb4-36"><a href="#cb4-36" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>flags <span class="op">&amp;</span> CLONE_PARENT_SETTID<span class="op">)</span></span>
<span id="cb4-37"><a href="#cb4-37" tabindex="-1"></a>        <span class="op">*</span>ptid <span class="op">=</span> t<span class="op">-&gt;</span>tid<span class="op">;</span></span>
<span id="cb4-38"><a href="#cb4-38" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>flags <span class="op">&amp;</span> CLONE_CHILD_CLEARTID<span class="op">)</span></span>
<span id="cb4-39"><a href="#cb4-39" tabindex="-1"></a>        t<span class="op">-&gt;</span>clear_tid <span class="op">=</span> ctid<span class="op">;</span></span>
<span id="cb4-40"><a href="#cb4-40" tabindex="-1"></a>    t<span class="op">-&gt;</span>blocked_on <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb4-41"><a href="#cb4-41" tabindex="-1"></a>    t<span class="op">-&gt;</span>syscall <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb4-42"><a href="#cb4-42" tabindex="-1"></a>    f<span class="op">[</span>FRAME_FULL<span class="op">]</span> <span class="op">=</span> <span class="kw">true</span><span class="op">;</span></span>
<span id="cb4-43"><a href="#cb4-43" tabindex="-1"></a>    thread_reserve<span class="op">(</span>t<span class="op">);</span></span>
<span id="cb4-44"><a href="#cb4-44" tabindex="-1"></a>    schedule_thread<span class="op">(</span>t<span class="op">);</span></span>
<span id="cb4-45"><a href="#cb4-45" tabindex="-1"></a>    <span class="cf">return</span> t<span class="op">-&gt;</span>tid<span class="op">;</span></span>
<span id="cb4-46"><a href="#cb4-46" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>Linux declares syscalls with a system of macros that
            actually wrap the function definition. It seems that in
            nanos we just write a normal function (named accordingly),
            add the <code>SYS_*</code> define to a header file and
            register it like so:</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="dt">void</span> register_thread_syscalls<span class="op">(</span><span class="kw">struct</span> syscall <span class="op">*</span>map<span class="op">)</span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a>    register_syscall<span class="op">(</span>map<span class="op">,</span> futex<span class="op">,</span> futex<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a>    register_syscall<span class="op">(</span>map<span class="op">,</span> set_robust_list<span class="op">,</span> set_robust_list<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a>    register_syscall<span class="op">(</span>map<span class="op">,</span> get_robust_list<span class="op">,</span> get_robust_list<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a>    register_syscall<span class="op">(</span>map<span class="op">,</span> clone<span class="op">,</span> clone<span class="op">,</span> SYSCALL_F_SET_PROC<span class="op">);</span></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a><span class="pp">#ifdef __x86_64__</span></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a>    register_syscall<span class="op">(</span>map<span class="op">,</span> arch_prctl<span class="op">,</span> arch_prctl<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb5-9"><a href="#cb5-9" tabindex="-1"></a><span class="pp">#endif</span></span>
<span id="cb5-10"><a href="#cb5-10" tabindex="-1"></a>    register_syscall<span class="op">(</span>map<span class="op">,</span> set_tid_address<span class="op">,</span> set_tid_address<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb5-11"><a href="#cb5-11" tabindex="-1"></a>    register_syscall<span class="op">(</span>map<span class="op">,</span> gettid<span class="op">,</span> gettid<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb5-12"><a href="#cb5-12" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
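            <p>The idea behind such a table can be modelled in a few
            lines. This is only a sketch of the concept, not how nanos
            implements <code>register_syscall</code>: the table is a
            plain <code>Map</code>, the handlers are stand-ins, and
            returning <code>-ENOSYS</code> for an unregistered number
            follows the Linux convention, which I am assuming here for
            illustration.</p>

```javascript
// Model of a syscall table: number -> handler. A sketch of the idea only.
const ENOSYS = 38;      // Linux errno: function not implemented
const SYS_clone = 56;   // x86_64 syscall numbers
const SYS_clone3 = 435;

// Add a handler for a syscall number.
function registerSyscall(map, number, handler) {
  map.set(number, handler);
}

// Look up the handler; an unregistered number fails with -ENOSYS.
function dispatch(map, number, ...args) {
  const handler = map.get(number);
  if (handler === undefined) return -ENOSYS;
  return handler(...args);
}

const map = new Map();
registerSyscall(map, SYS_clone, () => 1001);  // pretend new thread id
const before = dispatch(map, SYS_clone3);     // not registered yet
registerSyscall(map, SYS_clone3, () => 1002);
const after = dispatch(map, SYS_clone3);
console.log(before, after); // -38 1002
```

            <p>A program built against a newer glibc that asks for 435
            hits the first case until a handler is registered for
            it.</p>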
            <p>The new <code>clone3</code> syscall takes a single struct
            and its size as the only two arguments.</p>
            <div class="sourceCode" id="cb6"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb6-1"><a href="#cb6-1" tabindex="-1"></a><span class="kw">struct</span> clone_args <span class="op">{</span></span>
<span id="cb6-2"><a href="#cb6-2" tabindex="-1"></a>     u64 flags<span class="op">;</span></span>
<span id="cb6-3"><a href="#cb6-3" tabindex="-1"></a>     u64 pidfd<span class="op">;</span></span>
<span id="cb6-4"><a href="#cb6-4" tabindex="-1"></a>     u64 child_tid<span class="op">;</span></span>
<span id="cb6-5"><a href="#cb6-5" tabindex="-1"></a>     u64 parent_tid<span class="op">;</span></span>
<span id="cb6-6"><a href="#cb6-6" tabindex="-1"></a>     u64 exit_signal<span class="op">;</span></span>
<span id="cb6-7"><a href="#cb6-7" tabindex="-1"></a>     u64 stack<span class="op">;</span></span>
<span id="cb6-8"><a href="#cb6-8" tabindex="-1"></a>     u64 stack_size<span class="op">;</span></span>
<span id="cb6-9"><a href="#cb6-9" tabindex="-1"></a>     u64 tls<span class="op">;</span></span>
<span id="cb6-10"><a href="#cb6-10" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb6-11"><a href="#cb6-11" tabindex="-1"></a></span>
<span id="cb6-12"><a href="#cb6-12" tabindex="-1"></a>sysreturn clone3<span class="op">(</span><span class="kw">struct</span> clone_args <span class="op">*</span>args<span class="op">,</span> bytes size<span class="op">)</span></span>
<span id="cb6-13"><a href="#cb6-13" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb6-14"><a href="#cb6-14" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb6-15"><a href="#cb6-15" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
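            <p>The size argument is what lets the struct grow over time:
            the kernel can tell which version of the struct the caller
            was compiled against. Below is a sketch of decoding the
            eight <code>u64</code> fields above from raw bytes. It
            assumes a little-endian layout, as on x86_64, and rejects a
            size too small to hold those fields, which is what Linux
            does for sizes under 64 bytes. I have not checked whether
            nanos validates the size the same way, and
            <code>decodeCloneArgs</code> is a made-up helper.</p>

```javascript
// Field names in the order of the struct above; each is a u64 (8 bytes).
const CLONE_ARGS_FIELDS = [
  "flags", "pidfd", "child_tid", "parent_tid",
  "exit_signal", "stack", "stack_size", "tls",
];
const CLONE_ARGS_SIZE = CLONE_ARGS_FIELDS.length * 8; // 64 bytes
const EINVAL = 22;

// Decode a little-endian clone_args buffer into named BigInt fields.
// Returns -EINVAL if the caller's size cannot hold these eight fields.
function decodeCloneArgs(buffer, size) {
  if (CLONE_ARGS_SIZE > size) return -EINVAL;
  const view = new DataView(buffer);
  const args = {};
  CLONE_ARGS_FIELDS.forEach((name, i) => {
    args[name] = view.getBigUint64(i * 8, true);
  });
  return args;
}

// Build a buffer the way a caller would fill in the struct.
const buf = new ArrayBuffer(CLONE_ARGS_SIZE);
const raw = new DataView(buf);
raw.setBigUint64(0, 0x10000n, true);  // flags = CLONE_THREAD
raw.setBigUint64(48, 8192n, true);    // stack_size is the seventh field
const args = decodeCloneArgs(buf, CLONE_ARGS_SIZE);
console.log(args.flags, args.stack_size);
console.log(decodeCloneArgs(buf, 16));
```

            <p>Fields the caller leaves at zero, such as
            <code>pidfd</code> here, simply decode as zero.</p>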
            <p>Originally I just copied and pasted the <code>clone</code>
            syscall and modified it to take this struct. I was asked to
            deduplicate the code, so I copied what Linux does and created
            an internal clone which takes a cut down version of
            <code>clone_args</code>. I then used this to implement both
            syscalls.</p>
            <div class="sourceCode" id="cb7"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb7-1"><a href="#cb7-1" tabindex="-1"></a><span class="kw">struct</span> clone_args_internal <span class="op">{</span></span>
<span id="cb7-2"><a href="#cb7-2" tabindex="-1"></a>     u64 flags<span class="op">;</span></span>
<span id="cb7-3"><a href="#cb7-3" tabindex="-1"></a>     <span class="dt">int</span> <span class="op">*</span>child_tid<span class="op">;</span></span>
<span id="cb7-4"><a href="#cb7-4" tabindex="-1"></a>     <span class="dt">int</span> <span class="op">*</span>parent_tid<span class="op">;</span></span>
<span id="cb7-5"><a href="#cb7-5" tabindex="-1"></a>     <span class="dt">void</span> <span class="op">*</span>stack<span class="op">;</span></span>
<span id="cb7-6"><a href="#cb7-6" tabindex="-1"></a>     bytes stack_size<span class="op">;</span></span>
<span id="cb7-7"><a href="#cb7-7" tabindex="-1"></a>     u64 tls<span class="op">;</span></span>
<span id="cb7-8"><a href="#cb7-8" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb7-9"><a href="#cb7-9" tabindex="-1"></a></span>
<span id="cb7-10"><a href="#cb7-10" tabindex="-1"></a>sysreturn clone_internal<span class="op">(</span><span class="kw">struct</span> clone_args_internal <span class="op">*</span>args<span class="op">,</span> bytes size<span class="op">)</span></span>
<span id="cb7-11"><a href="#cb7-11" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb7-12"><a href="#cb7-12" tabindex="-1"></a>     u64 flags <span class="op">=</span> args<span class="op">-&gt;</span>flags<span class="op">;</span></span>
<span id="cb7-13"><a href="#cb7-13" tabindex="-1"></a></span>
<span id="cb7-14"><a href="#cb7-14" tabindex="-1"></a>     <span class="cf">if</span> <span class="op">(!</span>args<span class="op">-&gt;</span>stack_size<span class="op">)</span></span>
<span id="cb7-15"><a href="#cb7-15" tabindex="-1"></a>          <span class="cf">return</span> set_syscall_error<span class="op">(</span>current<span class="op">,</span> EINVAL<span class="op">);</span></span>
<span id="cb7-16"><a href="#cb7-16" tabindex="-1"></a></span>
<span id="cb7-17"><a href="#cb7-17" tabindex="-1"></a>     <span class="cf">if</span> <span class="op">(!(</span>flags <span class="op">&amp;</span> CLONE_THREAD<span class="op">))</span> <span class="op">{</span></span>
<span id="cb7-18"><a href="#cb7-18" tabindex="-1"></a>          thread_log<span class="op">(</span>current<span class="op">,</span> <span class="st">&quot;attempted to create new process, aborting.&quot;</span><span class="op">);</span></span>
<span id="cb7-19"><a href="#cb7-19" tabindex="-1"></a>          <span class="cf">return</span> set_syscall_error<span class="op">(</span>current<span class="op">,</span> ENOSYS<span class="op">);</span></span>
<span id="cb7-20"><a href="#cb7-20" tabindex="-1"></a>     <span class="op">}</span></span>
<span id="cb7-21"><a href="#cb7-21" tabindex="-1"></a></span>
<span id="cb7-22"><a href="#cb7-22" tabindex="-1"></a>     <span class="cf">if</span> <span class="op">(!</span>validate_user_memory<span class="op">(</span>args<span class="op">-&gt;</span>stack<span class="op">,</span> args<span class="op">-&gt;</span>stack_size<span class="op">,</span> <span class="kw">true</span><span class="op">))</span></span>
<span id="cb7-23"><a href="#cb7-23" tabindex="-1"></a>          <span class="cf">return</span> set_syscall_error<span class="op">(</span>current<span class="op">,</span> EFAULT<span class="op">);</span></span>
<span id="cb7-24"><a href="#cb7-24" tabindex="-1"></a></span>
<span id="cb7-25"><a href="#cb7-25" tabindex="-1"></a>     <span class="cf">if</span> <span class="op">(((</span>flags <span class="op">&amp;</span> CLONE_PARENT_SETTID<span class="op">)</span> <span class="op">&amp;&amp;</span></span>
<span id="cb7-26"><a href="#cb7-26" tabindex="-1"></a>          <span class="op">!</span>validate_user_memory<span class="op">(</span>args<span class="op">-&gt;</span>parent_tid<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>u64<span class="op">),</span> <span class="kw">true</span><span class="op">))</span> <span class="op">||</span></span>
<span id="cb7-27"><a href="#cb7-27" tabindex="-1"></a>         <span class="op">((</span>flags <span class="op">&amp;</span> CLONE_CHILD_CLEARTID<span class="op">)</span> <span class="op">&amp;&amp;</span></span>
<span id="cb7-28"><a href="#cb7-28" tabindex="-1"></a>          <span class="op">!</span>validate_user_memory<span class="op">(</span>args<span class="op">-&gt;</span>child_tid<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>u64<span class="op">),</span> <span class="kw">true</span><span class="op">)))</span></span>
<span id="cb7-29"><a href="#cb7-29" tabindex="-1"></a>          <span class="cf">return</span> set_syscall_error<span class="op">(</span>current<span class="op">,</span> EFAULT<span class="op">);</span></span>
<span id="cb7-30"><a href="#cb7-30" tabindex="-1"></a></span>
<span id="cb7-31"><a href="#cb7-31" tabindex="-1"></a>     thread t <span class="op">=</span> create_thread<span class="op">(</span>current<span class="op">-&gt;</span>p<span class="op">,</span> INVALID_PHYSICAL<span class="op">);</span></span>
<span id="cb7-32"><a href="#cb7-32" tabindex="-1"></a>     context_frame f <span class="op">=</span> thread_frame<span class="op">(</span>t<span class="op">);</span></span>
<span id="cb7-33"><a href="#cb7-33" tabindex="-1"></a></span>
<span id="cb7-34"><a href="#cb7-34" tabindex="-1"></a>     clone_frame_pstate<span class="op">(</span>f<span class="op">,</span> thread_frame<span class="op">(</span>current<span class="op">));</span></span>
<span id="cb7-35"><a href="#cb7-35" tabindex="-1"></a>     thread_clone_sigmask<span class="op">(</span>t<span class="op">,</span> current<span class="op">);</span></span>
<span id="cb7-36"><a href="#cb7-36" tabindex="-1"></a></span>
<span id="cb7-37"><a href="#cb7-37" tabindex="-1"></a>     set_syscall_return<span class="op">(</span>t<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb7-38"><a href="#cb7-38" tabindex="-1"></a>     f<span class="op">[</span>SYSCALL_FRAME_SP<span class="op">]</span> <span class="op">=</span> <span class="op">(</span>u64<span class="op">)</span>args<span class="op">-&gt;</span>stack<span class="op">;</span></span>
<span id="cb7-39"><a href="#cb7-39" tabindex="-1"></a>     <span class="cf">if</span> <span class="op">(</span>flags <span class="op">&amp;</span> CLONE_SETTLS<span class="op">)</span></span>
<span id="cb7-40"><a href="#cb7-40" tabindex="-1"></a>          set_tls<span class="op">(</span>f<span class="op">,</span> args<span class="op">-&gt;</span>tls<span class="op">);</span></span>
<span id="cb7-41"><a href="#cb7-41" tabindex="-1"></a>     <span class="cf">if</span> <span class="op">(</span>flags <span class="op">&amp;</span> CLONE_PARENT_SETTID<span class="op">)</span></span>
<span id="cb7-42"><a href="#cb7-42" tabindex="-1"></a>          <span class="op">*(</span>args<span class="op">-&gt;</span>parent_tid<span class="op">)</span> <span class="op">=</span> t<span class="op">-&gt;</span>tid<span class="op">;</span></span>
<span id="cb7-43"><a href="#cb7-43" tabindex="-1"></a>     <span class="cf">if</span> <span class="op">(</span>flags <span class="op">&amp;</span> CLONE_CHILD_SETTID<span class="op">)</span></span>
<span id="cb7-44"><a href="#cb7-44" tabindex="-1"></a>          <span class="op">*(</span>args<span class="op">-&gt;</span>child_tid<span class="op">)</span> <span class="op">=</span> t<span class="op">-&gt;</span>tid<span class="op">;</span></span>
<span id="cb7-45"><a href="#cb7-45" tabindex="-1"></a>     <span class="cf">if</span> <span class="op">(</span>flags <span class="op">&amp;</span> CLONE_CHILD_CLEARTID<span class="op">)</span></span>
<span id="cb7-46"><a href="#cb7-46" tabindex="-1"></a>          t<span class="op">-&gt;</span>clear_tid <span class="op">=</span> args<span class="op">-&gt;</span>child_tid<span class="op">;</span></span>
<span id="cb7-47"><a href="#cb7-47" tabindex="-1"></a>     t<span class="op">-&gt;</span>blocked_on <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb7-48"><a href="#cb7-48" tabindex="-1"></a>     t<span class="op">-&gt;</span>syscall <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb7-49"><a href="#cb7-49" tabindex="-1"></a>     f<span class="op">[</span>FRAME_FULL<span class="op">]</span> <span class="op">=</span> <span class="kw">true</span><span class="op">;</span></span>
<span id="cb7-50"><a href="#cb7-50" tabindex="-1"></a>     thread_reserve<span class="op">(</span>t<span class="op">);</span></span>
<span id="cb7-51"><a href="#cb7-51" tabindex="-1"></a>     schedule_thread<span class="op">(</span>t<span class="op">);</span></span>
<span id="cb7-52"><a href="#cb7-52" tabindex="-1"></a>     <span class="cf">return</span> t<span class="op">-&gt;</span>tid<span class="op">;</span></span>
<span id="cb7-53"><a href="#cb7-53" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-54"><a href="#cb7-54" tabindex="-1"></a></span>
<span id="cb7-55"><a href="#cb7-55" tabindex="-1"></a><span class="pp">#if defined(__x86_64__)</span></span>
<span id="cb7-56"><a href="#cb7-56" tabindex="-1"></a>sysreturn clone<span class="op">(</span><span class="dt">unsigned</span> <span class="dt">long</span> flags<span class="op">,</span> <span class="dt">void</span> <span class="op">*</span>child_stack<span class="op">,</span> <span class="dt">int</span> <span class="op">*</span>ptid<span class="op">,</span> <span class="dt">int</span> <span class="op">*</span>ctid<span class="op">,</span> <span class="dt">unsigned</span> <span class="dt">long</span> newtls<span class="op">)</span></span>
<span id="cb7-57"><a href="#cb7-57" tabindex="-1"></a><span class="pp">#elif defined(__aarch64__) || defined(__riscv)</span></span>
<span id="cb7-58"><a href="#cb7-58" tabindex="-1"></a>sysreturn clone<span class="op">(</span><span class="dt">unsigned</span> <span class="dt">long</span> flags<span class="op">,</span> <span class="dt">void</span> <span class="op">*</span>child_stack<span class="op">,</span> <span class="dt">int</span> <span class="op">*</span>ptid<span class="op">,</span> <span class="dt">unsigned</span> <span class="dt">long</span> newtls<span class="op">,</span> <span class="dt">int</span> <span class="op">*</span>ctid<span class="op">)</span></span>
<span id="cb7-59"><a href="#cb7-59" tabindex="-1"></a><span class="pp">#endif</span></span>
<span id="cb7-60"><a href="#cb7-60" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb7-61"><a href="#cb7-61" tabindex="-1"></a>    thread_log<span class="op">(</span>current<span class="op">,</span> <span class="st">&quot;clone: flags </span><span class="sc">%lx</span><span class="st">, child_stack </span><span class="sc">%p</span><span class="st">, ptid </span><span class="sc">%p</span><span class="st">, ctid </span><span class="sc">%p</span><span class="st">, newtls </span><span class="sc">%lx</span><span class="st">&quot;</span><span class="op">,</span></span>
<span id="cb7-62"><a href="#cb7-62" tabindex="-1"></a>        flags<span class="op">,</span> child_stack<span class="op">,</span> ptid<span class="op">,</span> ctid<span class="op">,</span> newtls<span class="op">);</span></span>
<span id="cb7-63"><a href="#cb7-63" tabindex="-1"></a></span>
<span id="cb7-64"><a href="#cb7-64" tabindex="-1"></a>    <span class="kw">struct</span> clone_args_internal args <span class="op">=</span> <span class="op">{</span></span>
<span id="cb7-65"><a href="#cb7-65" tabindex="-1"></a>         <span class="op">.</span>flags <span class="op">=</span> flags<span class="op">,</span></span>
<span id="cb7-66"><a href="#cb7-66" tabindex="-1"></a>         <span class="op">.</span>child_tid <span class="op">=</span> ctid<span class="op">,</span></span>
<span id="cb7-67"><a href="#cb7-67" tabindex="-1"></a>         <span class="op">.</span>parent_tid <span class="op">=</span> ptid<span class="op">,</span></span>
<span id="cb7-68"><a href="#cb7-68" tabindex="-1"></a>         <span class="co">/* no stack size given, just validate the top word */</span></span>
<span id="cb7-69"><a href="#cb7-69" tabindex="-1"></a>         <span class="op">.</span>stack <span class="op">=</span> child_stack<span class="op">,</span></span>
<span id="cb7-70"><a href="#cb7-70" tabindex="-1"></a>         <span class="op">.</span>stack_size <span class="op">=</span> <span class="kw">sizeof</span><span class="op">(</span>u64<span class="op">),</span></span>
<span id="cb7-71"><a href="#cb7-71" tabindex="-1"></a>         <span class="op">.</span>tls <span class="op">=</span> newtls<span class="op">,</span></span>
<span id="cb7-72"><a href="#cb7-72" tabindex="-1"></a>    <span class="op">};</span></span>
<span id="cb7-73"><a href="#cb7-73" tabindex="-1"></a></span>
<span id="cb7-74"><a href="#cb7-74" tabindex="-1"></a>    <span class="cf">return</span> clone_internal<span class="op">(&amp;</span>args<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>args<span class="op">));</span></span>
<span id="cb7-75"><a href="#cb7-75" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-76"><a href="#cb7-76" tabindex="-1"></a></span>
<span id="cb7-77"><a href="#cb7-77" tabindex="-1"></a><span class="kw">struct</span> clone_args <span class="op">{</span></span>
<span id="cb7-78"><a href="#cb7-78" tabindex="-1"></a>     u64 flags<span class="op">;</span></span>
<span id="cb7-79"><a href="#cb7-79" tabindex="-1"></a>     u64 pidfd<span class="op">;</span></span>
<span id="cb7-80"><a href="#cb7-80" tabindex="-1"></a>     u64 child_tid<span class="op">;</span></span>
<span id="cb7-81"><a href="#cb7-81" tabindex="-1"></a>     u64 parent_tid<span class="op">;</span></span>
<span id="cb7-82"><a href="#cb7-82" tabindex="-1"></a>     u64 exit_signal<span class="op">;</span></span>
<span id="cb7-83"><a href="#cb7-83" tabindex="-1"></a>     u64 stack<span class="op">;</span></span>
<span id="cb7-84"><a href="#cb7-84" tabindex="-1"></a>     u64 stack_size<span class="op">;</span></span>
<span id="cb7-85"><a href="#cb7-85" tabindex="-1"></a>     u64 tls<span class="op">;</span></span>
<span id="cb7-86"><a href="#cb7-86" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb7-87"><a href="#cb7-87" tabindex="-1"></a></span>
<span id="cb7-88"><a href="#cb7-88" tabindex="-1"></a>sysreturn clone3<span class="op">(</span><span class="kw">struct</span> clone_args <span class="op">*</span>args<span class="op">,</span> bytes size<span class="op">)</span></span>
<span id="cb7-89"><a href="#cb7-89" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb7-90"><a href="#cb7-90" tabindex="-1"></a>     thread_log<span class="op">(</span>current<span class="op">,</span></span>
<span id="cb7-91"><a href="#cb7-91" tabindex="-1"></a>         <span class="st">&quot;clone3: args_size: </span><span class="sc">%ld</span><span class="st">, pidfd: </span><span class="sc">%p</span><span class="st">, child_tid: </span><span class="sc">%p</span><span class="st">, parent_tid: </span><span class="sc">%p</span><span class="st">, exit_signal: </span><span class="sc">%ld</span><span class="st">, stack: </span><span class="sc">%p</span><span class="st">, stack_size: 0x</span><span class="sc">%lx</span><span class="st">, tls: </span><span class="sc">%p</span><span class="st">&quot;</span><span class="op">,</span></span>
<span id="cb7-92"><a href="#cb7-92" tabindex="-1"></a>         size<span class="op">,</span> args<span class="op">-&gt;</span>pidfd<span class="op">,</span> args<span class="op">-&gt;</span>child_tid<span class="op">,</span> args<span class="op">-&gt;</span>parent_tid<span class="op">,</span> args<span class="op">-&gt;</span>exit_signal<span class="op">,</span></span>
<span id="cb7-93"><a href="#cb7-93" tabindex="-1"></a>         args<span class="op">-&gt;</span>stack<span class="op">,</span> args<span class="op">-&gt;</span>stack_size<span class="op">,</span> args<span class="op">-&gt;</span>tls<span class="op">);</span></span>
<span id="cb7-94"><a href="#cb7-94" tabindex="-1"></a></span>
<span id="cb7-95"><a href="#cb7-95" tabindex="-1"></a>     <span class="cf">if</span> <span class="op">(</span>size <span class="op">&lt;</span> <span class="kw">sizeof</span><span class="op">(*</span>args<span class="op">))</span></span>
<span id="cb7-96"><a href="#cb7-96" tabindex="-1"></a>          <span class="cf">return</span> set_syscall_error<span class="op">(</span>current<span class="op">,</span> EINVAL<span class="op">);</span></span>
<span id="cb7-97"><a href="#cb7-97" tabindex="-1"></a></span>
<span id="cb7-98"><a href="#cb7-98" tabindex="-1"></a>     <span class="kw">struct</span> clone_args_internal argsi <span class="op">=</span> <span class="op">{</span></span>
<span id="cb7-99"><a href="#cb7-99" tabindex="-1"></a>          <span class="op">.</span>flags <span class="op">=</span> args<span class="op">-&gt;</span>flags<span class="op">,</span></span>
<span id="cb7-100"><a href="#cb7-100" tabindex="-1"></a>          <span class="op">.</span>child_tid <span class="op">=</span> <span class="op">(</span><span class="dt">int</span> <span class="op">*)</span>args<span class="op">-&gt;</span>child_tid<span class="op">,</span></span>
<span id="cb7-101"><a href="#cb7-101" tabindex="-1"></a>          <span class="op">.</span>parent_tid <span class="op">=</span> <span class="op">(</span><span class="dt">int</span> <span class="op">*)</span>args<span class="op">-&gt;</span>parent_tid<span class="op">,</span></span>
<span id="cb7-102"><a href="#cb7-102" tabindex="-1"></a>          <span class="op">.</span>stack <span class="op">=</span> <span class="op">((</span><span class="dt">char</span> <span class="op">*)</span>args<span class="op">-&gt;</span>stack<span class="op">)</span> <span class="op">+</span> args<span class="op">-&gt;</span>stack_size<span class="op">,</span></span>
<span id="cb7-103"><a href="#cb7-103" tabindex="-1"></a>          <span class="op">.</span>stack_size <span class="op">=</span> args<span class="op">-&gt;</span>stack_size<span class="op">,</span></span>
<span id="cb7-104"><a href="#cb7-104" tabindex="-1"></a>          <span class="op">.</span>tls <span class="op">=</span> args<span class="op">-&gt;</span>tls</span>
<span id="cb7-105"><a href="#cb7-105" tabindex="-1"></a>     <span class="op">};</span></span>
<span id="cb7-106"><a href="#cb7-106" tabindex="-1"></a></span>
<span id="cb7-107"><a href="#cb7-107" tabindex="-1"></a>     <span class="cf">return</span> clone_internal<span class="op">(&amp;</span>argsi<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>argsi<span class="op">));</span></span>
<span id="cb7-108"><a href="#cb7-108" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>Note that <code>clone3</code> accepts a pointer to the
            bottom of the stack (its lowest address) while
            <code>clone</code> takes a pointer to the top. Stacks grow
            down (from high to low addresses), so without a stack size
            the only useful thing to pass was a pointer to the end of the
            memory range. Now that there is a stack size, a pointer to
            the start of the range can be provided instead. This means
            the result of <code>mmap</code> can be passed directly.</p>
            <p>Now did I get caught out by this? You bet I did. It’s
            also the major difference between <code>clone</code> and
            <code>clone3</code> from the nanos perspective.</p>
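            <p>The stack pointer difference can be sketched in a few
            lines of C. This is only an illustration:
            <code>stack_top_from_base</code> and
            <code>stack_base_from_top</code> are hypothetical helper
            names, not functions from nanos or Linux. They show the
            conversion between the <code>clone3</code> convention (start
            of the range plus a size) and the <code>clone</code>
            convention (end of the range).</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of nanos or Linux: given the lowest
 * address of a stack mapping and its size (the clone3 convention),
 * return the address one past the end of the mapping (the clone
 * convention for child_stack). */
static inline uintptr_t stack_top_from_base(uintptr_t base, size_t size)
{
    return base + size;
}

/* The inverse conversion. It needs the size, which is why plain clone
 * could never be handed the start of the mapping. */
static inline uintptr_t stack_base_from_top(uintptr_t top, size_t size)
{
    return top - size;
}
```

            <p>With a 1 MiB stack mapped at <code>0x10000000</code>,
            <code>clone3</code> would be given <code>0x10000000</code>
            and the size, whereas <code>clone</code> would be given
            <code>0x10100000</code>.</p>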
            <p>You can see how the pull request is going (or went) <a
            href="https://github.com/nanovms/nanos/pull/1750">here</a>.</p>
            <h1 id="running-node-18">Running node 18</h1>
            <p>The above gets node 17 running. However, that is not new
            and unstable enough for me. Node 18 has been released, so I
            want to be running that. However, there is a problem, and this
            time I can reproduce it.</p>
            <div class="sourceCode" id="cb8"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb8-1"><a href="#cb8-1" tabindex="-1"></a><span class="ex">$</span> ops run /nix/store/yg2w28z1ph0h4a2ydkgbyfz9rl5gd9yh-nodejs-18.2.0/bin/node <span class="at">-a</span> hi.js <span class="at">-c</span> config.json <span class="at">-f</span></span>
<span id="cb8-2"><a href="#cb8-2" tabindex="-1"></a><span class="ex">booting</span> /home/rich/.ops/images/node.img ...</span>
<span id="cb8-3"><a href="#cb8-3" tabindex="-1"></a><span class="ex">en1:</span> assigned 10.0.2.15</span>
<span id="cb8-4"><a href="#cb8-4" tabindex="-1"></a></span>
<span id="cb8-5"><a href="#cb8-5" tabindex="-1"></a><span class="ex">frame</span> trace:</span>
<span id="cb8-6"><a href="#cb8-6" tabindex="-1"></a><span class="ex">ffffc0000706ff40:</span>   ffffffff800a1929    <span class="er">(</span><span class="ex">adjust_process_heap</span> + 0000000000000049/0000000000000064<span class="kw">)</span></span>
<span id="cb8-7"><a href="#cb8-7" tabindex="-1"></a><span class="ex">ffffc0000706ff60:</span>   ffffffff800ba3b6    <span class="er">(</span><span class="ex">brk</span> + 0000000000000156/00000000000001fc<span class="kw">)</span></span>
<span id="cb8-8"><a href="#cb8-8" tabindex="-1"></a><span class="ex">ffffc0000706ffb0:</span>   ffffffff800c8d8d    <span class="er">(</span><span class="ex">syscall_handler</span> + 00000000000002ed/00000000000005e4<span class="kw">)</span></span>
<span id="cb8-9"><a href="#cb8-9" tabindex="-1"></a><span class="ex">ffffc0000706fff0:</span>   0000000000001000</span>
<span id="cb8-10"><a href="#cb8-10" tabindex="-1"></a><span class="ex">assertion</span> rbtree_remove_by_key<span class="er">(</span><span class="ex">t,</span> n<span class="kw">)</span> <span class="ex">failed</span> at /rich/kernel/nanos/src/runtime/rbtree.h:28  in rbtree_remove_node<span class="er">(</span><span class="kw">);</span> <span class="ex">halt</span></span></code></pre></div>
            <p>Oh no, an assertion failure during an operation on a
            red-black tree. The syscall which triggers this is
            <code>brk</code>. This is used to move the end of the heap.
            Unless I am mistaken <code>brk</code> is used for smaller
            memory allocations by the likes of <code>malloc</code>.
            Bigger and more complicated ones are done with
            <code>mmap</code>.</p>
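            <p>For readers who have not met the program break before,
            here is a minimal userspace sketch of moving it, assuming a
            Linux-like libc that still provides <code>sbrk</code> (a thin
            wrapper over <code>brk</code>). The helper name
            <code>grow_break</code> is made up for this example and this
            is not how <code>malloc</code> is actually implemented.</p>

```c
#define _DEFAULT_SOURCE /* asks glibc to declare sbrk() */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: move the program break up by `bytes` and return
 * the previous break, which is the start of the newly usable memory.
 * Returns NULL if the kernel refused to move the break. */
static void *grow_break(intptr_t bytes)
{
    void *old = sbrk(bytes);

    return old == (void *)-1 ? NULL : old;
}
```

            <p>Calling <code>sbrk(0)</code> before and after
            <code>grow_break(4096)</code> should show the break moving up
            by one 4 KiB page, provided nothing else adjusts the heap in
            between.</p>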
            <p>To find out more about what’s going on we can add nanos
            trace output (with <code>ops run ... --trace</code>),
            although it turned out that <code>brk</code> didn’t output
            much trace information.</p>
            <p>So I incrementally <a
            href="https://github.com/richiejp/nanos/commit/a75a400c848320e9715b7c147336aac0542462f6">added
            trace messages</a> to <code>brk</code> until I was able to
            pinpoint where things were going wrong. By the end
            <code>brk</code> looked like this.</p>
            <div class="sourceCode" id="cb9"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb9-1"><a href="#cb9-1" tabindex="-1"></a><span class="dt">static</span> sysreturn brk<span class="op">(</span><span class="dt">void</span> <span class="op">*</span>addr<span class="op">)</span></span>
<span id="cb9-2"><a href="#cb9-2" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb9-3"><a href="#cb9-3" tabindex="-1"></a>    process p <span class="op">=</span> current<span class="op">-&gt;</span>p<span class="op">;</span></span>
<span id="cb9-4"><a href="#cb9-4" tabindex="-1"></a>    process_lock<span class="op">(</span>p<span class="op">);</span></span>
<span id="cb9-5"><a href="#cb9-5" tabindex="-1"></a></span>
<span id="cb9-6"><a href="#cb9-6" tabindex="-1"></a>    thread_log<span class="op">(</span>current<span class="op">,</span> <span class="st">&quot;brk: p-&gt;brk: </span><span class="sc">%p</span><span class="st">, addr: </span><span class="sc">%p</span><span class="st">&quot;</span><span class="op">,</span> p<span class="op">-&gt;</span>brk<span class="op">,</span> addr<span class="op">);</span></span>
<span id="cb9-7"><a href="#cb9-7" tabindex="-1"></a></span>
<span id="cb9-8"><a href="#cb9-8" tabindex="-1"></a>    <span class="co">/* on failure, return the current break */</span></span>
<span id="cb9-9"><a href="#cb9-9" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>addr <span class="op">||</span> p<span class="op">-&gt;</span>brk <span class="op">==</span> addr<span class="op">)</span></span>
<span id="cb9-10"><a href="#cb9-10" tabindex="-1"></a>        <span class="cf">goto</span> out<span class="op">;</span></span>
<span id="cb9-11"><a href="#cb9-11" tabindex="-1"></a></span>
<span id="cb9-12"><a href="#cb9-12" tabindex="-1"></a>    u64 old_end <span class="op">=</span> pad<span class="op">(</span>u64_from_pointer<span class="op">(</span>p<span class="op">-&gt;</span>brk<span class="op">),</span> PAGESIZE<span class="op">);</span></span>
<span id="cb9-13"><a href="#cb9-13" tabindex="-1"></a>    u64 new_end <span class="op">=</span> pad<span class="op">(</span>u64_from_pointer<span class="op">(</span>addr<span class="op">),</span> PAGESIZE<span class="op">);</span></span>
<span id="cb9-14"><a href="#cb9-14" tabindex="-1"></a></span>
<span id="cb9-15"><a href="#cb9-15" tabindex="-1"></a>    thread_log<span class="op">(</span>current<span class="op">,</span> <span class="st">&quot;brk: old_end: </span><span class="sc">%lx</span><span class="st">, new_end: </span><span class="sc">%lx</span><span class="st">&quot;</span><span class="op">,</span> old_end<span class="op">,</span> new_end<span class="op">);</span></span>
<span id="cb9-16"><a href="#cb9-16" tabindex="-1"></a></span>
<span id="cb9-17"><a href="#cb9-17" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>old_end <span class="op">&gt;</span> new_end<span class="op">)</span> <span class="op">{</span></span>
<span id="cb9-18"><a href="#cb9-18" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>u64_from_pointer<span class="op">(</span>addr<span class="op">)</span> <span class="op">&lt;</span> p<span class="op">-&gt;</span>heap_base <span class="op">||</span></span>
<span id="cb9-19"><a href="#cb9-19" tabindex="-1"></a>            <span class="op">!</span>adjust_process_heap<span class="op">(</span>p<span class="op">,</span> irange<span class="op">(</span>p<span class="op">-&gt;</span>heap_base<span class="op">,</span> new_end<span class="op">)))</span></span>
<span id="cb9-20"><a href="#cb9-20" tabindex="-1"></a>            <span class="cf">goto</span> out<span class="op">;</span></span>
<span id="cb9-21"><a href="#cb9-21" tabindex="-1"></a>        write_barrier<span class="op">();</span></span>
<span id="cb9-22"><a href="#cb9-22" tabindex="-1"></a>        unmap_and_free_phys<span class="op">(</span>new_end<span class="op">,</span> old_end <span class="op">-</span> new_end<span class="op">);</span></span>
<span id="cb9-23"><a href="#cb9-23" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="op">(</span>new_end <span class="op">&gt;</span> old_end<span class="op">)</span> <span class="op">{</span></span>
<span id="cb9-24"><a href="#cb9-24" tabindex="-1"></a>        u64 alloc <span class="op">=</span> new_end <span class="op">-</span> old_end<span class="op">;</span></span>
<span id="cb9-25"><a href="#cb9-25" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>validate_user_memory<span class="op">(</span>pointer_from_u64<span class="op">(</span>old_end<span class="op">),</span> alloc<span class="op">,</span> <span class="kw">true</span><span class="op">)</span> <span class="op">||</span></span>
<span id="cb9-26"><a href="#cb9-26" tabindex="-1"></a>            <span class="op">!</span>adjust_process_heap<span class="op">(</span>p<span class="op">,</span> irange<span class="op">(</span>p<span class="op">-&gt;</span>heap_base<span class="op">,</span> new_end<span class="op">)))</span> <span class="op">{</span></span>
<span id="cb9-27"><a href="#cb9-27" tabindex="-1"></a>            thread_log<span class="op">(</span>current<span class="op">,</span> <span class="st">&quot;brk: failed&quot;</span><span class="op">);</span></span>
<span id="cb9-28"><a href="#cb9-28" tabindex="-1"></a>            <span class="cf">goto</span> out<span class="op">;</span></span>
<span id="cb9-29"><a href="#cb9-29" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb9-30"><a href="#cb9-30" tabindex="-1"></a>        pageflags flags <span class="op">=</span> pageflags_writable<span class="op">(</span>pageflags_noexec<span class="op">(</span>pageflags_user<span class="op">(</span>pageflags_memory<span class="op">())));</span></span>
<span id="cb9-31"><a href="#cb9-31" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>new_zeroed_pages<span class="op">(</span>old_end<span class="op">,</span> alloc<span class="op">,</span> flags<span class="op">,</span> <span class="dv">0</span><span class="op">)</span> <span class="op">==</span> INVALID_PHYSICAL<span class="op">)</span> <span class="op">{</span></span>
<span id="cb9-32"><a href="#cb9-32" tabindex="-1"></a>            adjust_process_heap<span class="op">(</span>p<span class="op">,</span> irange<span class="op">(</span>p<span class="op">-&gt;</span>heap_base<span class="op">,</span> old_end<span class="op">));</span></span>
<span id="cb9-33"><a href="#cb9-33" tabindex="-1"></a>            <span class="cf">goto</span> out<span class="op">;</span></span>
<span id="cb9-34"><a href="#cb9-34" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb9-35"><a href="#cb9-35" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb9-36"><a href="#cb9-36" tabindex="-1"></a>    p<span class="op">-&gt;</span>brk <span class="op">=</span> addr<span class="op">;</span></span>
<span id="cb9-37"><a href="#cb9-37" tabindex="-1"></a>  out<span class="op">:</span></span>
<span id="cb9-38"><a href="#cb9-38" tabindex="-1"></a>    addr <span class="op">=</span> p<span class="op">-&gt;</span>brk<span class="op">;</span></span>
<span id="cb9-39"><a href="#cb9-39" tabindex="-1"></a>    process_unlock<span class="op">(</span>p<span class="op">);</span></span>
<span id="cb9-40"><a href="#cb9-40" tabindex="-1"></a></span>
<span id="cb9-41"><a href="#cb9-41" tabindex="-1"></a>    thread_log<span class="op">(</span>current<span class="op">,</span> <span class="st">&quot;brk: ret addr: </span><span class="sc">%p</span><span class="st">&quot;</span><span class="op">,</span> addr<span class="op">);</span></span>
<span id="cb9-42"><a href="#cb9-42" tabindex="-1"></a>    <span class="cf">return</span> sysreturn_from_pointer<span class="op">(</span>addr<span class="op">);</span></span>
<span id="cb9-43"><a href="#cb9-43" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>The <code>thread_log</code> calls add trace information.
            Of particular interest is the one which prints “brk:
            failed”.</p>
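            <p>The page arithmetic in <code>brk</code> can be reproduced
            outside the kernel. The sketch below assumes that
            <code>pad</code> rounds an address up to the next multiple of
            a power-of-two page size and that pages are 4 KiB; both are
            assumptions rather than something checked against the nanos
            source, and <code>pad_to_page</code> and
            <code>brk_growth</code> are names invented for the
            example.</p>

```c
#include <stdint.h>

/* Assumed page size; nanos calls its own constant PAGESIZE. */
#define PAGE_SIZE_BYTES 4096ULL

/* Round addr up to the next multiple of the page size. This is what
 * pad() is assumed to do; the real macro lives in the nanos runtime. */
static inline uint64_t pad_to_page(uint64_t addr)
{
    return (addr + PAGE_SIZE_BYTES - 1) & ~(PAGE_SIZE_BYTES - 1);
}

/* Number of bytes that would need mapping when the break moves from
 * old_brk to new_brk, or 0 if the heap is not growing. */
static inline uint64_t brk_growth(uint64_t old_brk, uint64_t new_brk)
{
    uint64_t old_end = pad_to_page(old_brk);
    uint64_t new_end = pad_to_page(new_brk);

    return new_end > old_end ? new_end - old_end : 0;
}
```

            <p>Taking a break of <code>0x2a6f000</code> that is moved to
            <code>0x2a91000</code> as an example, both addresses are
            already page aligned, so the growth is <code>0x22000</code>
            bytes, or 34 pages.</p>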
            <p>The trace output looked like this.</p>
            <div class="sourceCode" id="cb10"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb10-1"><a href="#cb10-1" tabindex="-1"></a><span class="ex">$</span> ops run /nix/store/yg2w28z1ph0h4a2ydkgbyfz9rl5gd9yh-nodejs-18.2.0/bin/node <span class="at">-a</span> hi.js <span class="at">-c</span> config.json <span class="at">-f</span> <span class="at">--trace</span> <span class="kw">|</span> <span class="ex">rg</span> <span class="at">-A</span> 10 brk</span>
<span id="cb10-2"><a href="#cb10-2" tabindex="-1"></a></span>
<span id="cb10-3"><a href="#cb10-3" tabindex="-1"></a><span class="ex">...</span> many lines redacted ...</span>
<span id="cb10-4"><a href="#cb10-4" tabindex="-1"></a></span>
<span id="cb10-5"><a href="#cb10-5" tabindex="-1"></a>    <span class="ex">2</span> brk</span>
<span id="cb10-6"><a href="#cb10-6" tabindex="-1"></a>    <span class="ex">2</span> brk: p-<span class="op">&gt;</span>brk: 0x0000000002a6f000, addr: 0x0000000002a91000</span>
<span id="cb10-7"><a href="#cb10-7" tabindex="-1"></a>    <span class="ex">2</span> brk: old_end: 2a6f000, new_end: 2a91000</span>
<span id="cb10-8"><a href="#cb10-8" tabindex="-1"></a>    <span class="ex">2</span> brk: failed</span>
<span id="cb10-9"><a href="#cb10-9" tabindex="-1"></a>    <span class="ex">2</span> brk: ret addr: 0x0000000002a6f000</span>
<span id="cb10-10"><a href="#cb10-10" tabindex="-1"></a>    <span class="ex">2</span> direct return: 44494848, rsp 0xffe38e98</span>
<span id="cb10-11"><a href="#cb10-11" tabindex="-1"></a>    <span class="ex">2</span> run thread, cpu 0, frame 0xffffc00001807000, pc 0x100121d37, sp 0xffe38e98, rv 0x2a6f000</span>
<span id="cb10-12"><a href="#cb10-12" tabindex="-1"></a>    <span class="ex">2</span> mmap</span>
<span id="cb10-13"><a href="#cb10-13" tabindex="-1"></a>    <span class="ex">2</span> mmap: addr 0x0000000000000000, length 0x100000, prot 0x3, flags 0x22, fd <span class="at">-1,</span> offset 0x0</span>
<span id="cb10-14"><a href="#cb10-14" tabindex="-1"></a>    <span class="ex">2</span>    returning 0x1007a3000</span>
<span id="cb10-15"><a href="#cb10-15" tabindex="-1"></a>    <span class="ex">2</span> direct return: 4302974976, rsp 0xffe38e98</span>
<span id="cb10-16"><a href="#cb10-16" tabindex="-1"></a>    <span class="ex">2</span> run thread, cpu 0, frame 0xffffc00001807000, pc 0x1001254b3, sp 0xffe38e98, rv 0x1007a3000</span>
<span id="cb10-17"><a href="#cb10-17" tabindex="-1"></a>    <span class="ex">2</span> page fault, vaddr 0x1007a3008, vmap 0xffffc0000040a500, ctx 0xffffc00001807000, type 3, pc 0x1000b769e</span>
<span id="cb10-18"><a href="#cb10-18" tabindex="-1"></a>    <span class="ex">2</span> page fault, vaddr 0x1007ab018, vmap 0xffffc0000040a500, ctx 0xffffc00001807000, type 3, pc 0x1000b7229</span>
<span id="cb10-19"><a href="#cb10-19" tabindex="-1"></a>    <span class="ex">2</span> page fault, vaddr 0x1007a7010, vmap 0xffffc0000040a500, ctx 0xffffc00001807000, type 3, pc 0xd54c32</span>
<span id="cb10-20"><a href="#cb10-20" tabindex="-1"></a><span class="ex">--</span></span>
<span id="cb10-21"><a href="#cb10-21" tabindex="-1"></a>    <span class="ex">2</span> brk</span>
<span id="cb10-22"><a href="#cb10-22" tabindex="-1"></a>    <span class="ex">2</span> brk: p-<span class="op">&gt;</span>brk: 0x0000000002a6f000, addr: 0x0000000002afe000</span>
<span id="cb10-23"><a href="#cb10-23" tabindex="-1"></a>    <span class="ex">2</span> brk: old_end: 2a6f000, new_end: 2afe000</span>
<span id="cb10-24"><a href="#cb10-24" tabindex="-1"></a></span>
<span id="cb10-25"><a href="#cb10-25" tabindex="-1"></a><span class="ex">frame</span> trace:</span>
<span id="cb10-26"><a href="#cb10-26" tabindex="-1"></a><span class="ex">ffffc0000706ff40:</span>   ffffffff800a1ce9    <span class="er">(</span><span class="ex">adjust_process_heap</span> + 0000000000000049/0000000000000064<span class="kw">)</span></span>
<span id="cb10-27"><a href="#cb10-27" tabindex="-1"></a><span class="ex">ffffc0000706ff60:</span>   ffffffff800bb107    <span class="er">(</span><span class="ex">brk</span> + 00000000000004d7/0000000000000548<span class="kw">)</span></span>
<span id="cb10-28"><a href="#cb10-28" tabindex="-1"></a><span class="ex">ffffc0000706ffb0:</span>   ffffffff800c959d    <span class="er">(</span><span class="ex">syscall_handler</span> + 00000000000002ed/00000000000005e4<span class="kw">)</span></span>
<span id="cb10-29"><a href="#cb10-29" tabindex="-1"></a><span class="ex">ffffc0000706fff0:</span>   0000000000001000</span>
<span id="cb10-30"><a href="#cb10-30" tabindex="-1"></a><span class="ex">assertion</span> rbtree_remove_by_key<span class="er">(</span><span class="ex">t,</span> n<span class="kw">)</span> <span class="ex">failed</span> at /home/rich/kernel/nanos/src/runtime/rbtree.h:28  in rbtree_remove_node<span class="er">(</span><span class="kw">);</span> <span class="ex">halt</span></span></code></pre></div>
            <p>The second-to-last <code>brk</code> fails, then the last
            one triggers the assertion. My suspicions fell on
            <code>adjust_process_heap(p, irange(p-&gt;heap_base, new_end))</code>
            before even adding the “brk: failed” message.</p>
            <p>To see why let’s look at the implementation.</p>
            <div class="sourceCode" id="cb11"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb11-1"><a href="#cb11-1" tabindex="-1"></a>boolean adjust_process_heap<span class="op">(</span>process p<span class="op">,</span> range new<span class="op">)</span></span>
<span id="cb11-2"><a href="#cb11-2" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb11-3"><a href="#cb11-3" tabindex="-1"></a>    vmap_lock<span class="op">(</span>p<span class="op">);</span></span>
<span id="cb11-4"><a href="#cb11-4" tabindex="-1"></a>    boolean inserted <span class="op">=</span> rangemap_reinsert<span class="op">(</span>p<span class="op">-&gt;</span>vmaps<span class="op">,</span> <span class="op">&amp;</span>p<span class="op">-&gt;</span>heap_map<span class="op">-&gt;</span>node<span class="op">,</span> new<span class="op">);</span></span>
<span id="cb11-5"><a href="#cb11-5" tabindex="-1"></a>    vmap_unlock<span class="op">(</span>p<span class="op">);</span></span>
<span id="cb11-6"><a href="#cb11-6" tabindex="-1"></a>    <span class="cf">return</span> inserted<span class="op">;</span></span>
<span id="cb11-7"><a href="#cb11-7" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>OK, OK, we really want to look at
            <code>rangemap_reinsert</code>.</p>
            <div class="sourceCode" id="cb12"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb12-1"><a href="#cb12-1" tabindex="-1"></a>boolean rangemap_reinsert<span class="op">(</span>rangemap rm<span class="op">,</span> rmnode n<span class="op">,</span> range k<span class="op">)</span></span>
<span id="cb12-2"><a href="#cb12-2" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb12-3"><a href="#cb12-3" tabindex="-1"></a>    rangemap_remove_node<span class="op">(</span>rm<span class="op">,</span> n<span class="op">);</span></span>
<span id="cb12-4"><a href="#cb12-4" tabindex="-1"></a>    n<span class="op">-&gt;</span>r <span class="op">=</span> k<span class="op">;</span></span>
<span id="cb12-5"><a href="#cb12-5" tabindex="-1"></a>    <span class="cf">return</span> rangemap_insert<span class="op">(</span>rm<span class="op">,</span> n<span class="op">);</span></span>
<span id="cb12-6"><a href="#cb12-6" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>If <code>rangemap_insert</code> can fail after
            <code>rangemap_remove_node</code> succeeded then we can
            remove the node representing the heap memory from the
            rangemap using <code>brk</code>.</p>
            <p>In fact <code>rangemap_remove_node</code> can’t fail
            without triggering an assertion. That’s what causes the
            assertion failure above. So let’s look at
            <code>rangemap_insert</code>.</p>
            <div class="sourceCode" id="cb13"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb13-1"><a href="#cb13-1" tabindex="-1"></a>boolean rangemap_insert<span class="op">(</span>rangemap rm<span class="op">,</span> rmnode n<span class="op">)</span></span>
<span id="cb13-2"><a href="#cb13-2" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb13-3"><a href="#cb13-3" tabindex="-1"></a>    init_rbnode<span class="op">(&amp;</span>n<span class="op">-&gt;</span>n<span class="op">);</span></span>
<span id="cb13-4"><a href="#cb13-4" tabindex="-1"></a>    rangemap_foreach_of_range<span class="op">(</span>rm<span class="op">,</span> curr<span class="op">,</span> n<span class="op">)</span> <span class="op">{</span></span>
<span id="cb13-5"><a href="#cb13-5" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>curr<span class="op">-&gt;</span>r<span class="op">.</span>start <span class="op">&gt;=</span> n<span class="op">-&gt;</span>r<span class="op">.</span>end<span class="op">)</span></span>
<span id="cb13-6"><a href="#cb13-6" tabindex="-1"></a>            <span class="cf">break</span><span class="op">;</span></span>
<span id="cb13-7"><a href="#cb13-7" tabindex="-1"></a>        range i <span class="op">=</span> range_intersection<span class="op">(</span>curr<span class="op">-&gt;</span>r<span class="op">,</span> n<span class="op">-&gt;</span>r<span class="op">);</span></span>
<span id="cb13-8"><a href="#cb13-8" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>range_span<span class="op">(</span>i<span class="op">))</span> <span class="op">{</span></span>
<span id="cb13-9"><a href="#cb13-9" tabindex="-1"></a>            msg_warn<span class="op">(</span><span class="st">&quot;attempt to insert </span><span class="sc">%p</span><span class="st"> (%R) but overlap with </span><span class="sc">%p</span><span class="st"> (%R)</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span></span>
<span id="cb13-10"><a href="#cb13-10" tabindex="-1"></a>                     n<span class="op">,</span> n<span class="op">-&gt;</span>r<span class="op">,</span> curr<span class="op">,</span> curr<span class="op">-&gt;</span>r<span class="op">);</span></span>
<span id="cb13-11"><a href="#cb13-11" tabindex="-1"></a>            <span class="cf">return</span> <span class="kw">false</span><span class="op">;</span></span>
<span id="cb13-12"><a href="#cb13-12" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb13-13"><a href="#cb13-13" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb13-14"><a href="#cb13-14" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>rbtree_insert_node<span class="op">(&amp;</span>rm<span class="op">-&gt;</span>t<span class="op">,</span> <span class="op">&amp;</span>n<span class="op">-&gt;</span>n<span class="op">))</span> <span class="op">{</span></span>
<span id="cb13-15"><a href="#cb13-15" tabindex="-1"></a>        halt<span class="op">(</span><span class="st">&quot;scan found no intersection but rb insert failed, node </span><span class="sc">%p</span><span class="st"> (%R)</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span></span>
<span id="cb13-16"><a href="#cb13-16" tabindex="-1"></a>             n<span class="op">,</span> n<span class="op">-&gt;</span>r<span class="op">);</span></span>
<span id="cb13-17"><a href="#cb13-17" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb13-18"><a href="#cb13-18" tabindex="-1"></a>    <span class="cf">return</span> <span class="kw">true</span><span class="op">;</span></span>
<span id="cb13-19"><a href="#cb13-19" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>It appears that insertion can fail if the new range
            intersects an existing one. When this happens it should
            print a warning message that we don’t see. The reason is
            that <code>msg_warn</code> needs to be enabled at compile
            time.</p>
            <p>Doing that confirms my suspicions. The following line is
            added to the log.</p>
            <pre><code>rangemap_insert warning: attempt to insert 0xffffc00000405f00 ([0x29b8000 0x2a91000)) but overlap with 0xffffc0000040a400 ([0x2a6f000 0xaa6f000))</code></pre>
            <p>So what range is it overlapping? Grepping the start of
            the range reveals it.</p>
            <pre><code>...
    2 mmap
    2 mmap: addr 0x0000000002a6f000, length 0x8000000, prot 0x0, flags 0x4022, fd -1, offset 0x0
    2    returning 0x2a6f000
...</code></pre>
            <p>This means that node <em>deliberately</em> maps this
            address. It does this on Linux too, which we can see more
            clearly with <code>strace</code>.</p>
            <div class="sourceCode" id="cb16"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb16-1"><a href="#cb16-1" tabindex="-1"></a><span class="ex">$</span> strace <span class="at">-e</span> brk,mmap /nix/store/dv8rq1kl181whp5r1f30j0ar4i11axqw-nodejs-18.4.0/bin/node</span>
<span id="cb16-2"><a href="#cb16-2" tabindex="-1"></a><span class="ex">...</span></span>
<span id="cb16-3"><a href="#cb16-3" tabindex="-1"></a><span class="ex">brk</span><span class="er">(</span><span class="ex">0x2e8e000</span><span class="kw">)</span>                          <span class="ex">=</span> 0x2e8e000</span>
<span id="cb16-4"><a href="#cb16-4" tabindex="-1"></a><span class="ex">mmap</span><span class="er">(</span><span class="ex">0x2e8e000,</span> 134217728, PROT_NONE, MAP_PRIVATE<span class="kw">|</span><span class="ex">MAP_ANONYMOUS</span><span class="kw">|</span><span class="ex">MAP_NORESERVE,</span> <span class="at">-1,</span> 0<span class="kw">)</span> <span class="ex">=</span> 0x2e8e000</span>
<span id="cb16-5"><a href="#cb16-5" tabindex="-1"></a><span class="ex">...</span></span></code></pre></div>
            <p>The addresses are different, but this is the offending
            <code>mmap</code>. What I find odd is that it’s deliberately
            mapping the end of the heap. I haven’t investigated this
            further. From the nanos point of view, it simply needs to
            deal with it.</p>
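            <p>The same pattern of calls can be reproduced outside of
            node. The following is a minimal, Linux-only sketch and not
            what node does internally: it asks for a
            <code>PROT_NONE</code> mapping at the current program break,
            then tries to grow the heap with <code>sbrk</code>. The
            helper name <code>heap_growth_blocked</code> is made up for
            this example, and the address is only a hint, so the kernel
            is free to place the mapping elsewhere.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#define _DEFAULT_SOURCE
#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;
#include &lt;sys/mman.h&gt;
#include &lt;unistd.h&gt;

#define RESERVE_LEN (128UL &lt;&lt; 20) /* 0x8000000, as in the trace */

/*
 * Returns 1 if growing the heap failed once the mapping was in place,
 * 0 if it still succeeded, and -1 if the mapping could not be placed
 * at the end of the heap.
 */
static int heap_growth_blocked(void)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t brk_now = (uintptr_t)sbrk(0);
    /* Round the break up to a page boundary so it can be used as a hint. */
    uintptr_t hint = (brk_now + page - 1) &amp; ~(page - 1);

    /* No MAP_FIXED: the address is only a hint, like in the strace output. */
    void *m = mmap((void *)hint, RESERVE_LEN, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (m == MAP_FAILED)
        return -1;
    if ((uintptr_t)m != hint) {
        munmap(m, RESERVE_LEN);
        return -1;
    }

    /* Two pages is enough to cross from the old break into the mapping. */
    int blocked = sbrk((intptr_t)(2 * page)) == (void *)-1;
    munmap(m, RESERVE_LEN);
    return blocked;
}

int main(void)
{
    switch (heap_growth_blocked()) {
    case 1:  puts("sbrk failed: the mapping blocks heap growth"); break;
    case 0:  puts("sbrk succeeded"); break;
    default: puts("could not map the end of the heap"); break;
    }
    return 0;
}</code></pre></div>
            <p>On Linux I would expect the <code>sbrk</code> call to
            fail whenever the hint is honoured, which is the situation
            the failing <code>brk</code> in the nanos trace is in.</p>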
            <p>To this end I introduced the <a
            href="https://github.com/nanovms/nanos/pull/1753">following
            change</a> to <code>rangemap_reinsert</code>.</p>
            <div class="sourceCode" id="cb17"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb17-1"><a href="#cb17-1" tabindex="-1"></a>boolean rangemap_reinsert<span class="op">(</span>rangemap rm<span class="op">,</span> rmnode n<span class="op">,</span> range k<span class="op">)</span></span>
<span id="cb17-2"><a href="#cb17-2" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb17-3"><a href="#cb17-3" tabindex="-1"></a>    range old <span class="op">=</span> n<span class="op">-&gt;</span>r<span class="op">;</span></span>
<span id="cb17-4"><a href="#cb17-4" tabindex="-1"></a>    rangemap_remove_node<span class="op">(</span>rm<span class="op">,</span> n<span class="op">);</span></span>
<span id="cb17-5"><a href="#cb17-5" tabindex="-1"></a>    n<span class="op">-&gt;</span>r <span class="op">=</span> k<span class="op">;</span></span>
<span id="cb17-6"><a href="#cb17-6" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>rangemap_insert<span class="op">(</span>rm<span class="op">,</span> n<span class="op">))</span> <span class="op">{</span></span>
<span id="cb17-7"><a href="#cb17-7" tabindex="-1"></a>        n<span class="op">-&gt;</span>r <span class="op">=</span> old<span class="op">;</span></span>
<span id="cb17-8"><a href="#cb17-8" tabindex="-1"></a>        assert<span class="op">(</span>rangemap_insert<span class="op">(</span>rm<span class="op">,</span> n<span class="op">));</span></span>
<span id="cb17-9"><a href="#cb17-9" tabindex="-1"></a>        <span class="cf">return</span> <span class="kw">false</span><span class="op">;</span></span>
<span id="cb17-10"><a href="#cb17-10" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb17-11"><a href="#cb17-11" tabindex="-1"></a>    <span class="cf">return</span> <span class="kw">true</span><span class="op">;</span></span>
<span id="cb17-12"><a href="#cb17-12" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>So now, when the insert fails, it tries to put the
            rangemap back into the state it found it in. This way we
            won’t unmap the heap when <code>brk</code> fails. The result
            is that node 18 can now run.</p>
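            <p>To see the effect of the rollback in isolation, here is
            a toy model of it. This is only a sketch: the array-backed
            <code>toy_map</code> and the <code>toy_*</code> functions
            are stand-ins invented for this example, not the nanos
            rangemap, which is built on a red-black tree. The ranges
            are the ones from the log above.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;assert.h&gt;
#include &lt;stdbool.h&gt;
#include &lt;stdio.h&gt;

/* Hypothetical stand-ins for the nanos types; ranges are [start, end). */
struct range { unsigned long start, end; };

#define MAX_NODES 8
struct toy_map {
    struct range *nodes[MAX_NODES];
    int count;
};

static bool overlaps(struct range a, struct range b)
{
    return a.start &lt; b.end &amp;&amp; b.start &lt; a.end;
}

/* Refuse the insert if the range intersects any node already in the map. */
static bool toy_insert(struct toy_map *m, struct range *n)
{
    for (int i = 0; i &lt; m-&gt;count; i++)
        if (overlaps(*m-&gt;nodes[i], *n))
            return false;
    if (m-&gt;count == MAX_NODES)
        return false;
    m-&gt;nodes[m-&gt;count++] = n;
    return true;
}

/* Removing a node that is not in the map is a bug, so assert on it. */
static void toy_remove(struct toy_map *m, struct range *n)
{
    for (int i = 0; i &lt; m-&gt;count; i++) {
        if (m-&gt;nodes[i] == n) {
            m-&gt;nodes[i] = m-&gt;nodes[--m-&gt;count];
            return;
        }
    }
    assert(!"node not in map");
}

/* Same shape as the patched function: restore the old range on failure. */
static bool toy_reinsert(struct toy_map *m, struct range *n, struct range k)
{
    struct range old = *n;
    toy_remove(m, n);
    *n = k;
    if (!toy_insert(m, n)) {
        *n = old;
        bool restored = toy_insert(m, n);
        assert(restored);
        return false;
    }
    return true;
}

int main(void)
{
    struct toy_map m = { .count = 0 };
    struct range heap = { 0x29b8000, 0x2a6f000 };
    struct range node_mmap = { 0x2a6f000, 0xaa6f000 };

    toy_insert(&amp;m, &amp;heap);
    toy_insert(&amp;m, &amp;node_mmap);

    /* First brk: growing the heap to 0x2a91000 collides with the mmap. */
    bool ok = toy_reinsert(&amp;m, &amp;heap, (struct range){ 0x29b8000, 0x2a91000 });
    printf("first brk: %s, heap still mapped: %s\n",
           ok ? "ok" : "failed", m.count == 2 ? "yes" : "no");

    /* Second brk: the heap node is still in the map, so removal is safe. */
    ok = toy_reinsert(&amp;m, &amp;heap, (struct range){ 0x29b8000, 0x2afe000 });
    printf("second brk: %s\n", ok ? "ok" : "failed");
    return 0;
}</code></pre></div>
            <p>Both calls fail, but neither trips the assertion in
            <code>toy_remove</code>. Without the rollback the first
            failure would leave the heap node out of the map, and the
            second call would then assert while trying to remove
            it.</p>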
            <h1 id="conclusion">Conclusion</h1>
            <p>Can I host my app on nanos yet? Not quite, there is the
            issue of Redis’s call(s) to <code>fork</code> and also the
            fact I haven’t finished my app. Hosting providers like Fly
            abstract away many of the issues with containers, although
            you still pay for the CPU and memory their kernel and init
            system use. You also start getting into some vendor lock-in,
            but none of this is a concern when you have zero users.</p>
            <p>On the other hand, if you have thousands or millions of
            users then nanos has huge potential. Although this article
            is about doing kernel development, I wouldn’t expect you to
            need to do that if you are just deploying a Go or Rust
            microservice or sticking to the NanoVMs-supported
            packages.</p>
            <p>If you want to rewrite your application <em>as a</em>
            unikernel then you are likely to fall off the deep end at
            some point. The fact that nanos keeps the userland barrier
            and copies the Linux ABI is pretty important.</p>
            <p>Anyway, back to writing TypeScript, HTML and CSS.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 11</title>
  <id>https://richiejp.com/news-letter-issue-11</id>
  <published>2025-04-10T09:10:33+01:00</published>
  <updated>2025-04-10T09:10:33+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-11" />
  <summary>10M Tokens Deep: Llama 4 boosts working memory, but not close
to Total Recall</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Llama 4’s 10 million token context window is not enough
            to eliminate RAG. Not even close. However the increased
            context could drastically simplify agent workflow
            development.</p>
            <p>Byte Pair Encoding is used to produce tokens and
            typically each token represents one or more letters. Common
            words may be represented by a single token, but longer
            sequences are unusual.</p>
            <p>If we assume an average of 4 letters per token and each
            letter takes one byte, then 10M tokens is 40MB of text data.
            We could assume an average of 2 or 8 letters, it doesn’t
            make a lot of difference; we’re still just talking
            megabytes.</p>
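            <p>The arithmetic is simple enough to check by hand, but
            here it is as a short sketch. It assumes one byte per letter
            as above; the function name <code>context_bytes</code> is
            just for illustration.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stdio.h&gt;

/* Bytes of text that fit in a context window, at one byte per letter. */
static unsigned long long context_bytes(unsigned long long tokens,
                                        unsigned letters_per_token)
{
    return tokens * letters_per_token;
}

int main(void)
{
    const unsigned long long tokens = 10000000ULL; /* 10M token context */
    const unsigned letters[] = { 2, 4, 8 };

    for (size_t i = 0; i &lt; sizeof(letters) / sizeof(letters[0]); i++) {
        unsigned long long bytes = context_bytes(tokens, letters[i]);
        printf("%u letters/token: %llu MB\n", letters[i], bytes / 1000000ULL);
    }
    return 0;
}</code></pre></div>
            <p>This prints 20, 40 and 80 MB for the three
            assumptions.</p>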
            <p>Many companies have terabytes of text data, not to
            mention audiovisual data. Even if you had a trillion tokens
            of context, the whole of it could be filled by your
            knowledge base, leaving none for output.</p>
            <p>So let’s say that for long term memory this is not close
            to enough, but what about the opposite end? Let’s call it
            working memory.</p>
            <p>When an agent takes instructions from the user, creates a
            plan, calls tools and reads the results, this all takes up
            context space.</p>
            <p>If the agent runs out of context then it can forget what
            steps of the plan it has executed and what the results
            were.</p>
            <p>For example I have seen an agent get stuck in a loop
            searching the web, because while it had space reserved for
            my original prompt in its context, the results of the search
            were not entirely present. So it kept deciding to do the
            search.</p>
            <p>To mitigate this we must use tricks to compress previous
            steps and remove irrelevant details from the context. In the
            case above one thing we could do is summarize the search
            results or add them to a vector store.</p>
            <p>The bigger the context the less we have to do this. Our
            working memory is handled by the LLM and we just need to
            manage longer term memory.</p>
            <p>This is nothing new for software; basically, the more RAM
            you have the simpler it is to write software. You don’t have
            to decide which data should be in RAM and which on some
            longer term storage. It’s just all in RAM.</p>
            <p>As to whether Llama 4 has an impact on agent development,
            I don’t know. While it may have a large context, it has been
            reported that the model’s responses are poor quality. These
            reports could be based on incorrectly deployed models or
            other confusions, so as usual it’s good to wait for the dust
            to settle before writing the model off.</p>
            <p>Unlike with RAM, recall accuracy, speed and overall quality
            may degrade as the context is filled. There is research
            (https://x.com/rajhans_samdani/status/1899969384191582218)
            to show that this is the case and it also fits with my
            anecdotal experience.</p>
            <p>Still, perhaps the architecture that allows for such a
            large context can be replicated without negative effects on
            quality. In fact even if the quality is poor once you have a
            few million tokens in context, this could be preferable to a
            hard limit.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 12</title>
  <id>https://richiejp.com/news-letter-issue-12</id>
  <published>2025-04-10T12:17:52+01:00</published>
  <updated>2025-04-10T12:17:52+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-12" />
  <summary>Enabling Intel GPUs with GGML’s SYCL support in LocalAI, how
hard can it be?</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>LocalAI has at least two backends that support Intel’s
            SYCL. One is Llama.cpp which is used to run LLMs locally and
            another is stable-diffusion.cpp which can generate
            images.</p>
            <p>I own an Intel Arc A770 16GB card because there aren’t
            many GPUs you can buy with 16GB of RAM for $300. To my
            knowledge this is an unnecessary amount of VRAM for computer
            games, but it’s just enough to run smaller LLMs and other
            types of gen AI.</p>
            <p>To be clear, this amount of VRAM allows you to run a 12b
            LLM with 8bit quantization. You won’t be able to plug this
            into a system that expects Claude or ChatGPT, but it is
            usable with software that takes the limits of a model this
            size into account.</p>
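            <p>As a rough back-of-the-envelope check, the sketch below
            only counts the weights. It ignores the KV cache,
            activations and anything else that has to share the card,
            so the leftover figure is an upper bound. The function name
            <code>weights_gb</code> is made up for this example.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stdio.h&gt;

/* Approximate size of the weights alone, in gigabytes (10^9 bytes). */
static double weights_gb(double params_billions, unsigned bits_per_weight)
{
    return params_billions * bits_per_weight / 8.0;
}

int main(void)
{
    const double vram_gb = 16.0; /* Arc A770 16GB */
    const double w = weights_gb(12.0, 8);

    printf("weights: %.1f GB, left over: %.1f GB\n", w, vram_gb - w);
    return 0;
}</code></pre></div>
            <p>A 12b model at 8 bits per weight needs about 12GB,
            leaving at most 4GB for everything else.</p>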
            <p>However there is a bigger problem with this card, namely
            that it doesn’t support CUDA, NVIDIA’s very special toolkit
            for creating GPU accelerated applications that are not
            computer games.</p>
            <p>Instead it supports Intel SYCL, OpenVINO and Vulkan. I
            don’t know where to begin with OpenVINO; LocalAI supports it
            as a separate backend, it is fast when it works, but models
            have to be converted to its format and I could only ever get
            integrated GPUs to work with it. The A770 caused a bug which
            I reported, but I’m not sure what came of it.</p>
            <p>Meanwhile Llama.cpp supports both SYCL and Vulkan, the
            latter is what computer games use and has good support on
            Linux. It has compute shaders
            (https://richiejp.com/1d-reversible-automata-webgpu) which
            can be used to perform the inference, but it’s rather low
            level and requires a lot of optimization work. As a result
            it is presently not that fast.</p>
            <p>Intel OneAPI SYCL is similar to CUDA in that you write
            the computations in a C/C++ like language. It does a whole
            bunch to make this easier for you in comparison to using a
            shader language, including optimized library functions and
            a special LLVM MLIR based C/C++ optimizing compiler. The
            result is that the performance is decent. I could be
            missing some details, but this is my general
            impression.</p>
            <p>So this has been implemented in Llama.cpp and all LocalAI
            has to do is turn it on and use it. What could go wrong?</p>
            <p>Compiling and linking fucking C/C++, that’s what.</p>
            <p>To keep pace with the current version of Llama.cpp
            LocalAI compiles it from source. LocalAI has multiple
            backends to run various types of AI model and each backend
            communicates over gRPC because they are written in different
            languages. The Llama.cpp backend is written in C++ and uses
            Llama.cpp as a library.</p>
            <p>In order to compile Llama.cpp with SYCL support it is not
            only required that Intel’s OneAPI libraries are present, but
            even that the whole thing is compiled with a Clang based
            Intel compiler called icx/icpx. What’s more you must pass
            the flag ‘-fsycl’ otherwise linking will fail with cryptic
            error messages about missing symbols.</p>
            <p>And no, asking AI does not help at this point in time,
            it’s the type of issue that requires a lot of context and
            investigation. The framework needed to get an agent to
            perform this investigation is not in place yet. I’m sure
            we’ll get there, but you’ll be happy to know that solving
            shitty issues like this is still in the human domain at this
            time.</p>
            <p>Anyway Llama.cpp is compiled with CMake along with the
            GGML library that is at its core. The Intel OneAPI supports
            CMake and GGML includes it this way. The LocalAI backend for
            Llama.cpp also uses CMake, so it can include Llama.cpp and
            enable SYCL using CMake variables.</p>
            <p>Well actually that is not enough, we also need to call a
            script from OneAPI that sets some variables and set the
            CMake variables that control what compiler is being
            used.</p>
            <p>This doesn’t all need to be figured out for LocalAI,
            Llama.cpp has a Dockerfile for Intel SYCL where it compiles
            its CLI in CI. We should just be able to adapt this
            Dockerfile to LocalAI’s build system and it should all work,
            right? Wrong, no, it did not work.</p>
            <p>The reason it did not work is that ‘-fsycl’ was
            missing from some invocation of the icx/icpx compiler. It
            shouldn’t be missing though, it is set by Llama.cpp’s CMake
            file and it’s why the Llama.cpp Dockerfile works. At least
            that is what I thought.</p>
            <p>However Builker pointed out here
            (https://github.com/mudler/LocalAI/issues/4905#issuecomment-2774074934)
            that Ollama encountered a similar issue and added the ‘-fsycl’
            flag to correct it. So I tried it and it worked. Why? I’m
            not sure exactly, perhaps the final executable doesn’t get
            any of the flags that Llama.cpp configures. I haven’t looked
            because prying apart the build system to find out takes
            time.</p>
            <p>Then there is the stable-diffusion.cpp backend which has
            bigger problems because it is partially written in Go and
            uses CGO to interoperate with C++. Once again I encountered
            cryptic linker errors about missing symbols, but I had
            implemented all the fixes for Llama.cpp and also applied
            updates to stable-diffusion.cpp itself.</p>
            <p>Then I realised from trawling through the thousands of lines
            of logs that the errors were actually coming from CGO and it
            wasn’t even using Intel’s special compiler. So of course it
            wouldn’t work. This made me feel rather silly, but then
            again, the Go code just calls the stable-diffusion.cpp
            library and doesn’t do anything with SYCL.</p>
            <p>So why should it need to use the special compiler or link
            to some low level intrinsic? This should all be abstracted
            away in the library, but no. Linking is left until as late
            as possible so that when building the final executable I
            need to provide the dependencies of my dependencies.</p>
            <p>Perhaps it’s possible to force linking at an earlier
            stage, perhaps the library could be compiled into a dynamic
            library that the CGO program links to at runtime or the Go
            code could be rewritten in C++. However instead I opted to
            configure CGO to use the right compiler and libraries.</p>
            <p>I’m not sure if CMake works well with Go, so I ended up
            configuring the necessary libraries and flags with a
            combination of pkg-config and Intel OneAPI’s online linker
            tool. This appears to work, but we have to hope that the
            necessary flags don’t change on a regular basis.
            (https://github.com/mudler/LocalAI/pull/5144)</p>
            <p>So what is the moral of this story? Simply saying CMake,
            C/C++, Makefiles, etc. are bad is tempting, but… actually no
            I’m just going to say it, they are fucking awful. There is a
            lot of wisdom and knowledge locked away in them, there is a
            reason they survived, but there is so much dead weight
            being carried with them.</p>
            <p>Take the fact that I can write complete rubbish in a
            Makefile and it won’t throw a syntax error until it passes
            the text to a shell, or how in C you can call functions
            that don’t exist. These would be fine if they were the
            exception rather than the norm, but they’re not; the
            default is to be extremely relaxed and let stuff happen
            that will waste time.</p>
            <p>That said I don’t blame people for picking tried and
            tested tools, but there are alternatives like Nix, Zig, Go
            and, if you really must go to the opposite extreme,
            Rust.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 13</title>
  <id>https://richiejp.com/news-letter-issue-13</id>
  <published>2025-05-02T09:04:13+01:00</published>
  <updated>2025-05-02T09:04:13+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-13" />
  <summary>Enabling Intel GPUs with GGML’s SYCL support in LocalAI, how
hard can it be?</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>LocalAI has at least two backends that support Intel’s
            SYCL. One is Llama.cpp which is used to run LLMs locally and
            another is stable-diffusion.cpp which can generate
            images.</p>
            <p>I own an Intel Arc A770 16GB card because there aren’t
            many GPUs you can buy with 16GB of RAM for $300. To my
            knowledge this is an unnecessary amount of VRAM for computer
            games, but it’s just enough to run smaller LLMs and other
            types of gen AI.</p>
            <p>To be clear, this amount of VRAM allows you to run a 12b
            LLM with 8bit quantization. You won’t be able to plug this
            into a system that expects Claude or ChatGPT, but it is
            usable with software that takes the limits of a model this
            size into account.</p>
            <p>However there is a bigger problem with this card, namely
            that it doesn’t support CUDA, NVIDIA’s very special toolkit
            for creating GPU accelerated applications that are not
            computer games.</p>
            <p>Instead it supports Intel SYCL, OpenVINO and Vulkan. I
            don’t know where to begin with OpenVINO; LocalAI supports it
            as a separate backend, it is fast when it works, but models
            have to be converted to its format and I could only ever get
            integrated GPUs to work with it. The A770 caused a bug which
            I reported, but I’m not sure what came of it.</p>
            <p>Meanwhile Llama.cpp supports both SYCL and Vulkan. The
            latter is what computer games use and has good support on
            Linux. It has compute shaders
            (https://richiejp.com/1d-reversible-automata-webgpu) which
            can be used to perform the inference, but it’s rather low
            level and requires a lot of optimization work. As a result
            it is presently not that fast.</p>
            <p>Intel OneAPI SYCL is similar to CUDA in that you write
            the computations in a C/C++ like language. It does a whole
            bunch to make this easier for you in comparison to using a
            shader language, including optimized library functions and
            a special LLVM MLIR based C/C++ optimizing compiler. The
            result is that the performance is decent. I could be missing
            some details, but this is my general impression.</p>
            <p>So this has been implemented in Llama.cpp and all LocalAI
            has to do is turn it on and use it. What could go wrong?</p>
            <p>Compiling and linking fucking C/C++, that’s what.</p>
            <p>To keep pace with the current version of Llama.cpp
            LocalAI compiles it from source. LocalAI has multiple
            backends to run various types of AI model and each backend
            communicates over gRPC because they are written in different
            languages. The Llama.cpp backend is written in C++ and uses
            Llama.cpp as a library.</p>
            <p>In order to compile Llama.cpp with SYCL support it is not
            only required that Intel’s OneAPI libraries are present, but
            even that the whole thing is compiled with a Clang based
            Intel compiler called icx/icpx. What’s more you must pass
            the flag ‘-fsycl’ otherwise linking will fail with cryptic
            error messages about missing symbols.</p>
            <p>And no, asking AI does not help at this point in time,
            it’s the type of issue that requires a lot of context and
            investigation. The framework needed to get an agent to
            perform this investigation is not in place yet. I’m sure
            we’ll get there, but you’ll be happy to know that solving
            shitty issues like this is still in the human domain at this
            time.</p>
            <p>Anyway Llama.cpp is compiled with CMake along with the
            GGML library that is at its core. The Intel OneAPI supports
            CMake and GGML includes it this way. The LocalAI backend for
            Llama.cpp also uses CMake, so it can include Llama.cpp and
            enable SYCL using CMake variables.</p>
            <p>Well actually that is not enough, we also need to call a
            script from OneAPI that sets some variables and set the
            CMake variables that control what compiler is being
            used.</p>
            <p>This doesn’t all need to be figured out for LocalAI,
            Llama.cpp has a Dockerfile for Intel SYCL where it compiles
            its CLI in CI. We should just be able to adapt this
            Dockerfile to LocalAI’s build system and it should all work,
            right? Wrong, no, it did not work.</p>
            <p>The reason it did not work is that ‘-fsycl’ was
            missing from some invocation of the icx/icpx compiler. It
            shouldn’t be missing though, it is set by Llama.cpp’s CMake
            file and it’s why the Llama.cpp Dockerfile works. At least
            that is what I thought.</p>
            <p>However Builker pointed out here
            (https://github.com/mudler/LocalAI/issues/4905#issuecomment-2774074934)
            that Ollama encountered a similar issue and added the ‘-fsycl’
            flag to correct it. So I tried it and it worked. Why? I’m
            not sure exactly, perhaps the final executable doesn’t get
            any of the flags that Llama.cpp configures. I haven’t looked
            because prying apart the build system to find out takes
            time.</p>
            <p>Then there is the stable-diffusion.cpp backend which has
            bigger problems because it is partially written in Go and
            uses CGO to interoperate with C++. Once again I encountered
            cryptic linker errors about missing symbols, but I had
            implemented all the fixes for Llama.cpp and also applied
            updates to stable-diffusion.cpp itself.</p>
            <p>Then I realised from trawling through the thousands of lines
            of logs that the errors were actually coming from CGO and it
            wasn’t even using Intel’s special compiler. So of course it
            wouldn’t work. This made me feel rather silly, but then
            again, the Go code just calls the stable-diffusion.cpp
            library and doesn’t do anything with SYCL.</p>
            <p>So why should it need to use the special compiler or link
            to some low level intrinsic? This should all be abstracted
            away in the library, but no. Linking is left until as late
            as possible so that when building the final executable I
            need to provide the dependencies of my dependencies.</p>
            <p>Perhaps it’s possible to force linking at an earlier
            stage, perhaps the library could be compiled into a dynamic
            library that the CGO program links to at runtime, or the Go
            code could be rewritten in C++. However instead I opted to
            configure CGO to use the right compiler and libraries.</p>
            <p>I’m not sure if CMake works well with Go, so I ended up
            configuring the necessary libraries and flags with a
            combination of pkg-config and Intel OneAPI’s online linker
            tool. This appears to work, but we have to hope that the
            necessary flags don’t change on a regular basis.
            (https://github.com/mudler/LocalAI/pull/5144)</p>
            <p>So what is the moral of this story? Simply saying CMake,
            C/C++, Makefiles, etc. are bad is tempting, but… actually no
            I’m just going to say it, they are fucking awful. There is a
            lot of wisdom and knowledge locked away in them, there is a
            reason they survived, but there is so much dead weight
            being carried with them.</p>
            <p>Take the fact that I can write complete rubbish in a
            Makefile and it won’t throw a syntax error until it passes
            the text to a shell, or how in C you can call functions that
            don’t exist. These would be fine if they were the exception
            rather than the norm, but they’re not; the default is to be
            extremely relaxed and let stuff happen that will waste
            time.</p>
            <p>That said I don’t blame people for picking tried and
            tested tools, but there are alternatives like Nix, Zig, Go
            and, if you really must go to the opposite extreme,
            Rust.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 1: Deepseek, Apps Script
and more</title>
  <id>https://richiejp.com/news-letter-issue-1</id>
  <published>2025-02-21T14:16:03Z</published>
  <updated>2025-02-21T14:16:03Z</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-1" />
  <summary>I didn’t like DeepSeek, I liked Apps Script even though it’s
rubbish, tools, planes, finance and security</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>I’m starting a weekly newsletter to note down and share
            interesting things I have found in technology and business.
            Comments and corrections welcome.</p>
            <h1 id="whats-in-this-issue">What’s in this issue</h1>
            <ul>
            <li>I tried DeepSeek and didn’t like it</li>
            <li>Google Apps Script is weird and clunky but I like
            it</li>
            <li>Tools: ngrok, ryelang, scrap/scratch, Bun, ZML,
            Cilium</li>
            <li>Other: Boom Supersonic, DeepSeek impact on financial
            markets, Zseano’s bug bounty methodology</li>
            </ul>
            <h1 id="richies-opinion">Richie’s opinion</h1>
            <h2 id="i-tried-deepseek-and-didnt-like-it">I tried DeepSeek
            and didn’t like it</h2>
            <p>Well at least I thought I did…</p>
            <p>As soon as the DeepSeek R1 drama found its way into my X
            feed I wanted to try it on Groq as an alternative to Llama
            3.x. Groq produces tokens very quickly and cheaply and the
            idea of having a model as good as Claude on there is a very
            attractive one.</p>
            <p>Unfortunately they don’t have the full DeepSeek R1 on
            there. The full model is a staggering 671B parameters, which
            is over a terabyte even if each parameter is only 16-bit.
            It’s also a mixture of experts architecture (MoE). Possibly
            Groq’s specialised hardware has difficulty with both.</p>
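            <p>As a rough check of that figure, here is a minimal
            Python sketch. It counts the weights only and ignores any
            runtime overhead, and the function name is just something I
            made up for illustration.</p>
            <pre><code>def weights_size_bytes(parameters, bits_per_parameter):
    """Storage needed for the weights alone, ignoring runtime overhead."""
    return parameters * bits_per_parameter // 8


# 671B parameters at 16 bits each.
r1_bytes = weights_size_bytes(671 * 10**9, 16)

print(r1_bytes / 10**12)  # 1.342 (terabytes)
print(r1_bytes / 2**40)   # roughly 1.22 (tebibytes)
</code></pre>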
            <p>What Groq did put up though was a Llama 3.3 70b variant
            which I found to be worse than the original Llama. It used
            more tokens because it does some initial “thinking out
            loud”, then hallucinated more and produced more
            verbiage.</p>
            <p>Admittedly I tested it on Google Apps Script which
            appears to be quite difficult for all LLMs, including
            Google’s. Claude 3.5 did quite well, but oddly hallucinated
            more than Llama 3.3 when I started to copy and paste
            Google’s documentation into the prompt. More on Google Apps
            Script later.</p>
            <p>I haven’t actually been able to try the full R1 model
            because DeepSeek won’t allow me to sign up.</p>
            <p>I suspect that once developers creating LLM “wrapper”
            products get more chance to evaluate DeepSeek, there will be
            some backlash as people find it doesn’t perform well for
            their niche. The time it takes to have an impact could be
            much longer than some have suggested.</p>
            <h2 id="google-apps-script-is-a-hotbed-of-activity">Google
            Apps Script is a hotbed of activity</h2>
            <p>So from the hottest thing in tech to one of the most
            boring. Second only to VBA in terms of eye watering cringe.
            However I quite like it, I shouldn’t do, I’m the type of
            software developer who is drawn towards new tech and OSS
            like a moth to a flame, but I still like it.</p>
            <p>I’m using it to create a Google Workspace plugin for my
            <a href="https://dobu.uk">availability calendar web</a> app.
            Google have essentially forced me into this if I want to get
            on the Workspace add-on directory.</p>
            <p>There is nothing particularly good about Apps Script,
            it’s just JavaScript that runs in a sandbox on Google’s
            infra and can automate or extend most of Google’s products.
            It has its own weird little IDE which I don’t use, instead
            opting to write TypeScript and upload the results using
            Clasp. The documentation seems oddly difficult to navigate
            and grok. Finally the scripts take a long time to execute
            even if they are not fetching any data.</p>
            <p>It all feels a little broken down and neglected, like no
            one at Google really cares that much about it. Certainly
            like no one cares too much about add-on developers having a
            smooth experience in the cloud console.</p>
            <p>Despite all that it provides a pretty effective “duct
            tape” that brings all of the Google services together.
            What’s more it is clearly very well used by many businesses
            who rely heavily on Google’s office suite for their
            operations. There is a hose pipe of requests coming into
            Upwork every day for Apps Script automations.</p>
            <h1 id="tools">Tools</h1>
            <ul>
            <li>https://ngrok.com/: Host web apps or APIs locally even
            if you don’t have a static IP</li>
            <li>https://ryelang.org/: Caught my eye on HN because it has
            a library for spread sheets</li>
            <li>scrap and scratch
            https://bsky.app/profile/raysan5.bsky.social/post/3lgo7bpgskk2n
            <ul>
            <li>scratch https://scratch.mit.edu/about</li>
            </ul></li>
            <li>https://bun.sh/blog/bun-v1.2 released: It’s like Node.js,
            but faster and has more built in. I find it interesting that
            a whole web app could be built just using Bun’s built-in
            features</li>
            <li>https://zml.ai: Run AI models on any hardware (except my
            Intel Arc of course)</li>
            <li>https://github.com/cilium/ebpf: Write Go that runs
            inside the Linux kernel</li>
            </ul>
            <h1 id="other">Other</h1>
            <ul>
            <li><p>Soon it may once again be possible for rich people to
            do a round trip from New York to London in one day, opening
            up the exciting possibility of being able to go shopping in
            London and be back for dinner in New York. There are more
            serious uses for fast transport as well which I’ll leave to
            the reader to figure out
            https://x.com/boomaero/status/1884320653840392587?t=2X-b_6CRZnn8Gq8os2Ojbg&amp;s=19</p></li>
            <li><p>“High Beta Vs. Low Volatility Large Caps: Largest
            Divergence Since GFC”
            https://x.com/priceactionlab/status/1884174855869972689?t=E9Hr114ENn_Ij7Z0B5bLmw&amp;s=19
            DeepSeek’s release had some interesting consequences for the
            financial markets and should serve as a warning.</p></li>
            <li><p>https://www.bugbountyhunter.com/methodology/zseanos-methodology.pdf
            An introduction to how to go about finding security
            vulnerabilities in web applications and getting paid for
            it.</p></li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 2: Erasure codes to
Entropy, Apple SLAP and FLOP, ZOHO and n8n</title>
  <id>https://richiejp.com/news-letter-issue-2</id>
  <published>2025-02-21T14:16:03Z</published>
  <updated>2025-02-21T14:16:03Z</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-2" />
  <summary>This week I stumbled into entropy which always leaves me
uncertain.</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>This week I stumbled into entropy which always leaves me
            uncertain.</p>
            <h1 id="in-this-weeks-issue">In this week’s issue</h1>
            <ul>
            <li>Siamese codes, Huffman codes and an introduction to
            information theory from NASA</li>
            <li>SLAP and FLOP; Apple’s turn to suffer speculation
            attacks</li>
            <li>ZOHO CRM has its own language</li>
            <li>Tools: n8n</li>
            </ul>
            <h1 id="entropy">Entropy</h1>
            <p>While trying to understand a library for <a
            href="https://github.com/catid/siamese">streaming forward
            erasure correction codes</a> I was once again reminded how
            closely
            close probability theory lurks in the background of computer
            engineering.</p>
            <p>I have a collection of computer science text books and I
            was thinking to myself “which one of those did I see error
            correction codes in?”. It turned out that none of them had a
            satisfactory explanation of these, however I do have a
            large book on probability theory that goes into a lot of
            detail.</p>
            <p>It has the catchy title “Probability, Random Variables
            and Stochastic Processes” and was written by Athanasios
            Papoulis and S. Unnikrishna Pillai. It contains a section on
            information theory which in my naive estimation boils down
            to calculating how many bits of <em>information</em> a
            particular combination of <em>data</em> bits represents.</p>
            <p>If we receive a byte of data (<code>8 = 2^3</code> bits)
            and each data bit has an equally likely probability of being
            1 or 0 then we have 8 bits of information. Put another way
            if we have 256 symbols (<code>2^8</code>) and each symbol
            has an equal likelihood of occurring then we have
            <code>log2(256) = 8</code> bits of information.</p>
            <p>However let’s say that the byte of information we receive
            is almost always equal to the ASCII code for either
            <code>{</code> (<code>01111011</code>) or <code>[</code>
            (<code>01011011</code>) as is the case when transmitting the
            first byte of a JSON object or array. I say “almost always”
            to allow for corruption. In this case, although we get 8
            bits of data, we are not receiving 8 bits of information.
            It’s more like 1 bit, since only two symbols are likely, and
            actually if I calculate the “self entropy” with a 2% chance
            of corruption then it comes out as
            approximately <code>1.28</code> bits of information.</p>
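            <p>The figure above can be reproduced with a short Python
            sketch. The assumptions here are mine: <code>{</code> and
            <code>[</code> are equally likely, and the 2% chance of
            corruption is spread evenly over the other 254 byte values.
            The function name is only for illustration.</p>
            <pre><code>import math


def entropy_bits(probabilities):
    """Shannon entropy in bits; zero-probability symbols contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p)


corruption = 0.02

# Assumption: '{' and '[' share the uncorrupted probability equally and
# corruption is uniform over the remaining 254 byte values.
first_byte = [(1 - corruption) / 2] * 2 + [corruption / 254] * 254

print(round(entropy_bits(first_byte), 2))  # 1.28
print(entropy_bits([1 / 256] * 256))       # 8.0, every byte equally likely
</code></pre>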
            <p>Ironically we get a higher information value the higher
            the chance of corruption, with completely random noise
            having the highest value. Although I should note that if the
            corruption had a strong pattern to it then its information
            or entropy value could be lower. You can also choose to
            merge all symbols except <code>{</code> and <code>[</code>
            into a single “corrupted” symbol in which case the maximum
            entropy will occur when there is a <code>1/3</code> chance
            of corruption and will decrease thereafter.</p>
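            <p>The merged case can be checked numerically with another
            small Python sketch. It assumes three symbols
            (<code>{</code>, <code>[</code> and “corrupted”) with the two
            brackets equally likely, and sweeps the chance of corruption
            in 1% steps. The function name is hypothetical.</p>
            <pre><code>import math


def merged_entropy_bits(corruption):
    """Entropy of '{', '[' and one merged 'corrupted' symbol, in bits."""
    probabilities = [(1 - corruption) / 2, (1 - corruption) / 2, corruption]
    return -sum(p * math.log2(p) for p in probabilities if p)


# Sweep the chance of corruption from 0% to 100% in 1% steps.
chances = [step / 100 for step in range(101)]
best = max(chances, key=merged_entropy_bits)

# The best step is the one closest to 1/3, where all three symbols are
# equally likely and the entropy approaches log2(3).
print(best, merged_entropy_bits(best), math.log2(3))
</code></pre>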
            <p>If this makes no sense to you then I came across <a
            href="https://ntrs.nasa.gov/api/citations/19970009858/downloads/19970009858.pdf">this
            gem written by Jon C. Freeman and published by NASA</a>.
            Unlike the text book I mentioned, it is written with
            engineers in mind and it shows. <a
            href="https://www.linkedin.com/feed/update/urn:li:activity:7292384863393849344/">Huffman
            codes</a>, erasure codes and so on all follow from
            information theory which in turn is an offshoot of
            probability theory. At least that is one way of viewing
            it.</p>
            <h1 id="slap-and-flop">SLAP and FLOP</h1>
            <p>I missed this somehow, but Apple got a turn at having to
            deal with <a
            href="https://bsky.app/profile/lukaszolejnik.bsky.social/post/3lgurdyknu72o">hardware
            vulnerabilities in their CPUs</a>. As usual it involves
            speculation of one form or another which CPUs do for
            performance.</p>
            <p>Apparently data needs to be in the same address space for
            these techniques to work (don’t quote me on that), but one
            expects a real Safari exploit to require more than one
            bug.</p>
            <p>Again probability theory crops up here because
            speculation attacks usually rely on timing certain events
            and finding a statistically significant timing difference
            between when speculation has happened or not.</p>
            <h1 id="zoho-crm-has-its-scripting-language">ZOHO CRM has
            its own scripting language</h1>
            <p>On my hunt to integrate my calendar app with more
            services (I think Google’s verification team have put me in
            the slow lane) I discovered ZOHO Deluge.</p>
            <p>ZOHO is a vast CRM and office suite plus a wild range of
            other stuff including a serverless platform and nocode
            tools. For some reason they have a programming language that
            appears to be not based on anything I particularly
            recognise.</p>
            <p>I couldn’t see a way to shoehorn my app into their app
            store, so I decided to try my luck with Webflow instead, which
            I’ll leave for another time. It’s worth noting though that
            ZOHO is big enough to have its own ecosystem and
            specialists. Not close to the same level as Google
            Workspaces, but I could see significant activity on
            Upwork.</p>
            <h1 id="tools">Tools</h1>
            <p>Somehow I ended up with a very short list this week.</p>
            <ul>
            <li>https://github.com/n8n-io/n8n - An <del>Open</del> Fair
            source Zapier? Definitely one for the AI agent party.</li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 3: Spying with eBPF, WASM
shouldn’t exist, Go fast with Unikernels and tools</title>
  <id>https://richiejp.com/news-letter-issue-3</id>
  <published>2025-02-21T14:16:03Z</published>
  <updated>2025-02-21T14:16:03Z</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-3" />
  <summary>Spying on packets and processes with eBPF, Why WASM shouldn’t
exist, Can your app run in a Unikernel?</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <h1 id="in-this-weeks-issue">In this week’s issue</h1>
            <ul>
            <li>Spying on packets and processes with eBPF</li>
            <li>Why WASM shouldn’t exist</li>
            <li>Can your app run in a Unikernel?</li>
            <li>Tools: Computer engineering for Babies, AstroVim and
            Gitu</li>
            </ul>
            <h1 id="using-ebpf-to-spy-on-linux-kernel-internals">Using
            eBPF to spy on Linux kernel internals</h1>
            <p>Before I get into the tech details here are some practical
            things you can do with eBPF:</p>
            <ul>
            <li>Crush DDOS attacks by dropping the packets so early the
            CPU barely notices</li>
            <li>Test applications by hooking into any system call and
            changing the return value to an error code</li>
            <li>Hook into any kernel function, read and parse the
            function’s arguments, allowing you to observe exactly what
            is happening in a production kernel at almost any point</li>
            <li>Control the exact arguments and circumstances under
            which a program is allowed to make a syscall, allowing you
            to implement rules like “only call this once at
            startup”</li>
            </ul>
            <p>The eBPF subsystem for Linux allows you to safely (well
            almost) insert code into the Linux kernel and userspace. The
            code can be attached at various points, some stable and
            others not so much. There are different types of eBPF
            program with some allowing actions to be taken via helper
            functions or return codes, while others can read arbitrary
            kernel memory. The programs have access to maps which allow
            them to communicate with user space programs or each
            other.</p>
            <p>This essentially provides a safe runtime extension
            mechanism for the Linux kernel. I’m a bit dubious about
            whether it is safe to load some random eBPF code, but the
            eBPF verifier makes it safer than loading a kernel module.
            The verifier itself has been the subject of numerous
            exploits and no doubt the JIT compiler too. When I worked on
            creating reproducers for eBPF bugs I got the impression that
            the subsystem is complex enough that securing it against
            malicious eBPF may be a Sisyphean task beyond the resources
            available for it.</p>
            <p>Of course focusing on bugs gives one a slightly warped
            view of things. The verifier does an excellent job of making
            code safe, it can be a pain to work with, but it’s hard to
            accidentally write eBPF that breaks your system. Especially
            if you stick to the eBPF programs that can run in
            unprivileged mode. Typically I would think it unwise to
            allow completely untrusted users to load any kind of eBPF
            program, but as a way of stopping accidents it is a
            brilliant tool.</p>
            <p>A strong use for eBPF, and perhaps even more common than
            packet filtering, is observability. There are many tools
            available that use eBPF to measure performance, bandwidth
            and various other metrics. This week I spent some time
            investigating how to track which process is responsible for
            sending a packet.</p>
            <p>Although there are many tools that claim to do this, it
            is somewhat surprising that there isn’t a truly convenient
            hook point in the kernel where a packet’s headers can be
            read while the sending process’s PID is known. I’ll expand
            on this in a separate article, but here are some things to
            look out for when writing eBPF, in particular programs
            dealing with packets:</p>
            <ul>
            <li>Is the process returned by
            <code>bpf_get_current_pid_tgid</code> the one which sent the
            packet or was it interrupted by the network interface?</li>
            <li>Can the function you are hooking with a kprobe or
            Fentry/Fexit tracepoint be inlined by the compiler at some
            call sites?</li>
            <li>Is the skbuff data you are accessing locked by the
            calling thread or can it be overwritten while you are
            reading it?</li>
            <li>Does the skbuff for a receiving packet have a socket
            assigned to it yet?</li>
            <li>Does tc or XDP work on the interface types
            (e.g. Wireguard) you are interested in?</li>
            </ul>
            <p>libbpf in combination with BTF has this great feature
            called CO-RE which allows one to partially define kernel
            structs at compile time, then relocates the program when it
            is loaded into the kernel. This means kprobes can be used on
            different kernels. However I’ve seen a number of eBPF tools
            that try to hook kernel functions where some of the call
            sites are probably missing on different kernels. There’s no
            error when this happens because some call sites still exist,
            so telemetry can be silently discarded. Some eBPF program
            types have stable interfaces, but the more advanced programs
            I have seen usually resort to kprobes where problems like
            this abound.</p>
            <p>The stable interface for eBPF is constantly expanding
            however, so my prediction is that in the future these tools
            will evolve with the kernel to become rock solid. I even
            think it could make the Rust for Linux project slightly
            redundant if device drivers can be written in eBPF.</p>
            <h2 id="relevant-links">Relevant links</h2>
            <ul>
            <li>Path of a packet:
            https://x.com/alexjplaskett/status/1887924295265051133,
            https://www.net.in.tum.de/fileadmin/TUM/NET/NET-2024-04-1/NET-2024-04-1_16.pdf</li>
            <li>https://ebpf.io/applications/</li>
            <li>https://github.com/bpftrace/bpftrace &lt;- Probably the
            best tool to start learning eBPF with</li>
            </ul>
            <h1 id="wasm-shouldnt-exist">WASM shouldn’t exist</h1>
            <p>I remember reading the original paper on BPF and why it
            was important that BPF was register based. They thought
            about the actual hardware BPF would need to run on. BPF was
            then baptised in fire in FreeBSD and then the Linux kernel.
            It wasn’t forced on Linux, it didn’t have a standards body
            behind it, there was just FreeBSD to vouch for it. Later
            eBPF was introduced which added JIT, more instructions and a
            bunch of other stuff you can see in the linked article. It
            has evolved over time based on feedback and contains a lot
            of organic solutions for integrating byte code into a
            broader system.</p>
            <p>The fact that it is register based and has about the
            number of registers that real CPUs have, means it can be
            easily JIT translated into the host CPU’s native
            instructions. This may not be the absolute best thing for
            the performance of hot loops on any particular CPU, but it
            makes the JIT translation very fast and the performance is
            close enough to what you would get if the compiler were
            optimizing for a particular CPU.</p>
            <p>Meanwhile WASM is stack based and it is expected that the
            WASM compiler will optimize this for real, register based,
            CPUs at load time. It’s not clear to me what the advantage
            of this is over JavaScript with some extensions for things
            like SIMD, native types and manual memory management. They
            both require compiling and actually JavaScript has the
            advantage of being able to access all of the web APIs
            without awkward bindings.</p>
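            <p>To make the register versus stack distinction concrete,
            here is a toy sketch in Python. Neither instruction set is
            real eBPF or WASM; the opcodes, register names and function
            names are all made up for illustration. Both programs
            compute <code>(x + y) * z</code>. The register version names
            its operands explicitly, which maps fairly directly onto a
            real CPU, while the stack version leaves operand placement
            implicit for a later compiler to work out.</p>
            <pre><code>def run_register(program, registers):
    """Each instruction names its destination and source registers."""
    for op, dst, src in program:
        if op == "mov":
            registers[dst] = registers[src]
        elif op == "add":
            registers[dst] += registers[src]
        elif op == "mul":
            registers[dst] *= registers[src]
        else:
            raise ValueError(op)
    return registers["r0"]


def run_stack(program, local_vars):
    """Operands are implicit: popped from and pushed onto a stack."""
    stack = []
    for op, *args in program:
        if op == "get":
            stack.append(local_vars[args[0]])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(op)
    return stack.pop()


# (x + y) * z with x=2, y=3, z=4 in both toy encodings.
register_program = [("mov", "r0", "r1"), ("add", "r0", "r2"), ("mul", "r0", "r3")]
stack_program = [("get", "x"), ("get", "y"), ("add",), ("get", "z"), ("mul",)]

register_result = run_register(register_program, {"r0": 0, "r1": 2, "r2": 3, "r3": 4})
stack_result = run_stack(stack_program, {"x": 2, "y": 3, "z": 4})
print(register_result, stack_result)  # 20 20
</code></pre>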
            <p>In my opinion they should have adapted eBPF to the needs
            of the browser or just stuck with asm.js. Having said that,
            in absolute terms, WASM is very good and most of its
            problems will be arbitraged away with shims and libraries. So
            practically speaking WASM may be a good choice when choosing
            a byte code to support.</p>
            <h2 id="relevant-link">Relevant links</h2>
            <ul>
            <li>https://www.tcpdump.org/papers/bpf-usenix93.pdf</li>
            <li>http://troubles.md/wasm-is-not-a-stack-machine/</li>
            </ul>
            <h1 id="you-should-convert-your-app-to-use-a-unikernel">You
            should convert your app to use a Unikernel</h1>
            <p>In fact you should hire me to convert your app to run
            on/in a Unikernel. What you’ll get:</p>
            <ul>
            <li>The cold boot time will be negligible, meaning you pay
            less when it’s not serving requests</li>
            <li>It’ll be much faster when it is running, meaning you pay
            less when it’s serving requests</li>
            <li>The VM image size can be very small, meaning it can be
            quickly transferred even on a bad connection and can be
            deployed to edge locations with expensive storage</li>
            <li>Much reduced attack surface, meaning a much reduced need
            for security updates, hardening and risk of attack</li>
            </ul>
            <p>Although I have to point out that cutting down the Linux
            kernel and running your app as init (I did this) will get a
            lot of these benefits while retaining Linux’s hardening and
            many other features. Fly do something similar by using a
            very slim init and running apps in a single container on a
            lightweight container runtime. So in this case an attacker
            would usually first need to get root in their VM, then
            escape the VM. In a Unikernel an attacker is either “root”
            or very close as soon as they get code execution.</p>
            <p>Regardless of how much you cut out of Linux though, it’s
            never going to beat a unikernel in terms of overall
            performance. There may be edge cases where Linux’s memory
            management has been far better optimized, but most of the
            time it will just be doing a whole lot of unnecessary
            work.</p>
            <p>Traditionally Unikernels have been difficult to write for
            because they don’t have POSIX compatible system calls or
            indeed any system calls. They have an API particular to
            them, like embedded kernels or really just like libraries
            you use on bare-metal. Some time ago though I came across
            Nanos and more recently Unikraft. Both of these are Linux
            compatible to the extent that many popular applications will
            run on them unmodified.</p>
            <p>Nanos even retains the kernel-user-space barrier, meaning
            it has system calls and the kernel has some memory
            protection from your app. With Unikraft I’m not so sure, but
            of course there is a performance cost with having real
            system calls so it is a trade-off.</p>
            <p>I haven’t used Unikraft, but I did convert a NodeJS app
            to work on Nanos and here are some issues you may face:</p>
            <ul>
            <li>Missing system calls or system call arguments: I
            implemented clone3 for Nanos because I found my NodeJS
            binary was using it by default</li>
            <li>The app attempts to start a new process: Nanos at least
            does not and will not support multiple processes, so you
            have to convert your app to threads</li>
            <li>The tooling is different, you may be best off switching
            hosts or dumping Kubernetes</li>
            </ul>
            <p>On the last point Unikraft claims some support for Docker
            and Kubernetes and possibly Nanos has moved on since I used
            it. However a VM is fundamentally different from a container
            and if your current infra is based on containers from top to
            bottom then there is going to be friction. Personally though
            I think it would be a net win to get rid of Kubernetes
            :-).</p>
            <p>Writing your app to run on top of Linux with a minimal
            userland is also a valid way of doing things if you want to
            keep features like eBPF, or if you want to deploy to a
            bare metal system which needs the Linux drivers. There are
            of course embedded “distributions” like Yocto and Buildroot,
            which can produce a stripped down userland.</p>
            <h2 id="relevant-links-1">Relevant Links</h2>
            <ul>
            <li>https://richiejp.com/nanos-clone3-brk-and-nodejs</li>
            <li>https://nanos.org/</li>
            <li>https://github.com/unikraft/unikraft</li>
            <li>https://github.com/richiejp/m &lt;– The Linux kernel
            with just Zig on top</li>
            </ul>
            <h1 id="tools">Tools</h1>
            <ul>
            <li>https://computerengineeringforbabies.com/: I bought both
            books from Kickstarter, I’m finding the material in the
            second book challenging, but I fully recommend them</li>
            <li>https://astronvim.com/: I realised recently that
            LunarVim was no longer being maintained so I have switched
            to Astro. I don’t want to configure Neovim</li>
            <li>https://terminaltrove.com/gitu/: I really miss Magit
            from Emacs, this is a promising replacement</li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 4: What language does farm
equipment speak? Should have used Pocketbase and the 3 types of
OSS</title>
  <id>https://richiejp.com/news-letter-issue-4</id>
  <published>2025-02-21T14:16:03Z</published>
  <updated>2025-02-21T14:16:03Z</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-4" />
  <summary>Internet of Tractors: What language does farm equipment
speak? How not to write a web app: Should have used Pocketbase, Avoid
upset by understanding the 3 types of OSS</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <ul>
            <li>Internet of Tractors: What language does farm equipment
            speak?</li>
            <li>How not to write a web app: Should have used
            Pocketbase</li>
            <li>Avoid upset by understanding the 3 types of OSS</li>
            </ul>
            <h1 id="how-does-farm-equipment-communicate">How does farm
            equipment communicate?</h1>
            <p>Would it surprise you to learn that agricultural
            equipment and, from what I can gather, most manufacturing
            equipment, uses a protocol that has remained largely
            unchanged since the late 1970s?</p>
            <p>Depending on your background it may surprise you that
            farm equipment contains networked computer equipment at all.
            However it does and I’m not talking about weather stations
            that transmit readings over LoRaWAN or things that can be
            secreted on top of existing infrastructure.</p>
            <p>I’m talking about heavy duty industrial equipment such as
            wood chip boilers and heat distribution systems for
            greenhouses, amongst a host of other things. These devices often
            have ethernet connections and sometimes VNC for remote
            access to the display panel. VNC is not the best protocol
            for collecting telemetry and coordinating the operation of
            multiple devices however. For this these machines provide
            Modbus over TCP or RS232/RS485 (i.e. a serial cable).</p>
            <p>My experience is with a single farm, but practically all
            of the equipment supports Modbus with one or two exceptions.
            This is exciting because it means there is a way to create a
            centralised system that collects telemetry data and tunes
            equipment based on feedback. The problem is that Modbus is a
            very basic protocol that just allows setting and reading
            <em>registers</em>. What the data in each register means is
            machine dependent, and working it out requires either a
            <em>register map</em> or reverse engineering.</p>
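            <p>As a rough illustration of how little the protocol
            itself says, here is a minimal sketch in Go that builds a
            Modbus TCP “read holding registers” request and decodes the
            reply. The frame layout follows the Modbus TCP
            specification, but the register map (names, addresses and
            scale factors) and the canned response are invented for
            illustration; a real machine’s map would come from the
            manufacturer or from reverse engineering.</p>
            <pre><code>package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Register is one entry in a hypothetical register map: the names,
// addresses and scale factors used below are invented for illustration.
type Register struct {
	Address uint16
	Name    string
	Scale   float64 // multiply the raw value by this to get real units
}

// BuildReadHoldingRegisters builds a Modbus TCP request frame for
// function code 0x03 (read holding registers).
func BuildReadHoldingRegisters(txID uint16, unitID byte, start, count uint16) []byte {
	req := make([]byte, 12)
	binary.BigEndian.PutUint16(req[0:], txID) // transaction ID, echoed back by the device
	binary.BigEndian.PutUint16(req[2:], 0)    // protocol ID, always 0 for Modbus
	binary.BigEndian.PutUint16(req[4:], 6)    // bytes that follow: unit ID plus a 5 byte PDU
	req[6] = unitID
	req[7] = 0x03 // function code
	binary.BigEndian.PutUint16(req[8:], start)
	binary.BigEndian.PutUint16(req[10:], count)
	return req
}

// ParseHoldingRegisters extracts the raw 16-bit values from a response
// frame. The values carry no meaning without a register map.
func ParseHoldingRegisters(resp []byte) ([]uint16, error) {
	if len(resp) &lt; 9 {
		return nil, errors.New("response too short")
	}
	if resp[7] != 0x03 {
		return nil, fmt.Errorf("unexpected function code 0x%02x", resp[7])
	}
	byteCount := int(resp[8])
	if byteCount%2 != 0 || len(resp) != 9+byteCount {
		return nil, errors.New("byte count does not match frame length")
	}
	values := make([]uint16, byteCount/2)
	for i := range values {
		values[i] = binary.BigEndian.Uint16(resp[9+2*i:])
	}
	return values, nil
}

func main() {
	regMap := []Register{
		{Address: 0, Name: "flow_temp_c", Scale: 0.1},
		{Address: 1, Name: "return_temp_c", Scale: 0.1},
	}
	req := BuildReadHoldingRegisters(1, 1, 0, uint16(len(regMap)))
	fmt.Printf("request: % x\n", req)

	// A canned response standing in for what a device might send back.
	resp := []byte{0, 1, 0, 0, 0, 7, 1, 0x03, 4, 0x02, 0xEE, 0x02, 0x58}
	values, err := ParseHoldingRegisters(resp)
	if err != nil || len(values) != len(regMap) {
		fmt.Println("bad response:", err)
		return
	}
	for i, r := range regMap {
		fmt.Printf("%s = %.1f\n", r.Name, float64(values[i])*r.Scale)
	}
}</code></pre>
            <p>On a real device the request would be written to a TCP
            connection (502 is the registered Modbus port) and the
            response read back. Everything interesting lives in
            <code>regMap</code>, which is exactly the part the protocol
            does not provide.</p>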
            <p>A second issue is finding a way to plug in to Modbus when
            it is already connected to the display panel or a PLC via
            serial. This may require physically interfering with the
            machine to install a Modbus to ethernet gateway or indeed a
            Modbus to MQTT gateway or something else with a modicum of
            security. On the plus side there are a lot of modules which
            can be bought off the shelf to facilitate this.</p>
            <p>Now, for those of you who are used to working with a
            Kubernetes cluster with full observability tooling, or who
            work in data engineering: imagine going from having a bunch
            of disparate systems that require a human to run between
            them to communicate, to a centralised observability and data
            analysis platform with all the tools developed for the cloud
            and big data.</p>
            <p>At this point I’m not talking about doing anything really
            fancy, just having a Grafana or Metabase dashboard with
            everything that happened that day to all of the devices and
            sensors would be a huge improvement. In high end motor sport
            it’s required to have telemetry from every part of the
            vehicle just to be competitive and I don’t believe having
            fast and detailed telemetry would be any less valuable in
            agriculture.</p>
            <p>While all of the machines are digitised and even
            networked, they are still essentially islands and if we
            extrapolate this to many other industries it’s easy to see
            the potential. There are many networked machines out there
            which only require minor hardware updates and some software
            to be brought into an integrated whole. The major challenge
            is that every machine is different, Modbus isn’t
            self-describing and you can be sure most of the equipment
            manufacturers don’t publish the docs in a convenient
            location.</p>
            <p>As Karl Marx observed, technology or human progress in
            general, advances unequally. In some areas we are racing
            ahead with LLMs, that are going to replace you next year by
            the way, and in others we are trying to get a machine to
            report sensor readings using technologies that have barely
            changed since the 1970s.</p>
            <h2 id="relevant-links">Relevant Links</h2>
            <ul>
            <li>https://www.youtube.com/watch?v=cpVYEEsYEq8,
            https://www.youtube.com/watch?v=RhSTszNc91A - Modbus
            protocol and how it is used</li>
            <li>https://thepihut.com/products/rs485-to-wifi-ethernet-module-modbus-mqtt-gateway
            - Serial to ethernet/WiFi gateway module that can be rail
            mounted</li>
            </ul>
            <h1 id="should-have-used-pocketbase">Should have used
            Pocketbase</h1>
            <p>One of my regrets when creating dobu.uk was that I didn’t
            use something like Pocketbase. Instead I recreated a whole
            bunch of plumbing needed by any web app. What you get with
            Pocketbase:</p>
            <ul>
            <li>Auth and user management, meaning you don’t have to
            implement passwords or OAuth2 endpoints</li>
            <li>Collections/Records API that abstracts away SQLite,
            meaning you don’t have to write SQL (unless you want to) nor
            deal with an ORM</li>
            <li>Optional automatic database migrations, meaning you
            don’t have to write migration scripts or schedule them</li>
            <li>Inbuilt admin and DB UI, meaning you can create your
            schema, query and update data from a GUI without reaching
            for any other tools</li>
            <li>Scale to 1000s of concurrent connections on a low cost
            VM</li>
            <li>Use it as a HTTP API server, as a Go web framework or
            both, meaning the API can be extended with Go</li>
            <li>Completely self contained and easy to host</li>
            </ul>
            <p>There is more stuff like Cron jobs, Backups and so on. On
            the downside it is pre-1.0 and they specifically warn of
            breaking changes. Breaking changes are always a risk in web
            development, but if they actually warn you of it then I tend
            to err on the side of caution. At least now I err on the
            side of caution after being burned by SvelteKit.</p>
            <p>I assume that Pocketbase is a play on the names of
            Firebase and Supabase. All of these abstract away a lot of
            the plumbing associated with making even the most trivial of
            SaaS. Unless an app is completely stateless, people who
            are used to, ahem, writing zero-dependency-low-level code
            where 1ms is a long time can quickly get into trouble: they
            see the disgusting mess that JavaScript libraries represent
            and decide to do everything themselves.</p>
            <p>The reality is one may be able to produce something
            better in one particular area. Let’s say an area of very
            great importance to one’s particular application. However
            everything else should be accepted as it is because there
            just isn’t enough time to sort through all of the details
            and craft a solution specific to your needs. This is a very
            domain dependent statement and that was the issue for me,
            because when crafting some fundamental low level code it can
            absolutely make sense to pay attention to every detail.</p>
            <p>Pocketbase makes no attempt to scale horizontally and
            this is another important point. I wanted to have my app’s
            data replicated around the world so that it could be
            accessed with low latency in any location. I got hooked on
            the idea of doing this at the database level and using
            eventual consistency for most operations. I managed to do it
            using KeyDB, but its only benefit was an educational
            one.</p>
            <p>I have to admit that basic caching could have solved this
            problem in my particular app, even just a CDN like KeyCDN
            could do most of it I reckon. If the VM running Pocketbase
            goes down this could be an issue, but if most functions of
            the website keep running from the cache, it’s not a huge
            one. Finally, if you really need to scale horizontally it
            can be done at the application level.</p>
            <p>Finally Pocketbase is a True Open Source project rather
            than a sales funnel for something else. This got me thinking
            about Open Source again and the different types of Open
            Source which I’ll explain below.</p>
            <h2 id="relevant-links-1">Relevant links</h2>
            <ul>
            <li>https://pocketbase.io/</li>
            <li>https://docs.keydb.dev/</li>
            <li>https://supabase.com/</li>
            <li>https://firebase.google.com/</li>
            </ul>
            <h1 id="the-three-types-of-open-source">The three types of
            Open Source</h1>
            <p>I woke up in the middle of the night, heart pounding and
            sweating and thought “There are three types of Open Source”.
            I think I had eaten too much chocolate before going to bed
            and was having an insulin crash, but obviously this must
            have been percolating somewhere in the back of my mind.</p>
            <ol style="list-style-type: decimal">
            <li>The Sales Funnel</li>
            <li>Art</li>
            <li>The Collaboration</li>
            </ol>
            <p>The Sales Funnel is an Open Source project where the
            motivation to maintain and develop it comes from the
            marketing value of being the developer of the Open Source
            software. For the obvious reason that Open Source can not be
            sold, there is no way to directly make money from it.
            However one way you can extract value from it is by
            integrating it into your sales funnel.</p>
            <p>The Art project encompasses a wide range of
            motivations, including those that defy analysis. Perhaps the
            author is motivated by status and recognition, or the pure
            joy of creation or learning. Maybe they want to make
            something that spreads and see what happens, just to make
            their mark on reality.</p>
            <p>Finally there is The Collaboration, where a group of
            individuals or companies get together to develop and
            maintain an open source project. The motivation is primarily
            the economic value of running the code and the cost savings
            of developing it together instead of individually or by
            paying a third party.</p>
            <p>So the three types are separated by how the authors
            capture value from the project or intend to capture value.
            Of course a single project may have elements of all 3 types.
            Pocketbase is more or less written by one person and it
            looks a lot like an Art project with elements of a
            Collaboration. The evidence of it being a Sales Funnel is
            minor.</p>
            <p>If this newsletter were open source, it would primarily
            be a Sales Funnel. I do enjoy writing and spreading ideas
            purely for its own sake, but I have limited time to spend on
            hobbies. Committing myself to writing once a week on
            LinkedIn requires a financial justification.</p>
            <p>The most outrage occurs when a project giving the
            appearance of being a Collaboration or an Art piece suddenly
            flips to a non-open source license. Usually though this
            happens with the projects that are clearly a Sales Funnel
            for a SaaS product. There is also upset when a project which
            nominally looks like Art or a Collaboration is revealed as
            being a Sales Funnel. Such is the case when the core
            developers get hired or acquired and the new owner puts
            their stamp on the project.</p>
            <p>As a user of Open Source or a contributor you have an
            interest in knowing what really motivates a project’s
            development. In absolute terms I have no problem with any of
            the three types. However I think it is useful to identify
            the type of project one is dealing with and whether that is
            suitable for your situation. Also whether the project is
            likely to remain in a particular category or if there is a
            mismatch between contributor motivations and the outward
            appearance of the project.</p>
            <p>Clearly most projects don’t neatly fit into one of the
            three categories. I’ve seen strong evidence of all three in
            the Linux kernel for example. It’s clearly a Sales Funnel
            for the Linux Foundation and companies like SUSE. Meanwhile
            Meta and Netflix collaborate on it because it costs them
            less than developing their own individual kernels. Finally
            there are people who contribute just for the sake of taking
            part.</p>
            <p>In terms of attribution I think some of these ideas came
            from Ron Evans (https://www.linkedin.com/in/deadprogram/)
            speaking on a podcast about TinyGo, although I’m not saying
            he would agree with any of this. The rest is from Austrian
            economics and my imagination.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 5</title>
  <id>https://richiejp.com/news-letter-issue-5</id>
  <published>2025-02-28T11:24:17Z</published>
  <updated>2025-02-28T11:24:17Z</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-5" />
  <summary>Configuration control with NixOS and Kairos. Tools: blxrep,
udpspeeder and more</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>In this week’s issue:</p>
            <ul>
            <li>Configuration control with NixOS and Kairos</li>
            <li>Tools: blxrep, udpspeeder, Node-RED and below</li>
            <li>Random: LTE/5G vulnerabilities and Mr. Beast</li>
            </ul>
            <h1
            id="configuration-control-nixos-and-kairos">Configuration
            control, NixOS and Kairos</h1>
            <p>A client asked me about configuration control and brought
            up Ansible, Puppet and so on. I’m not sure if configuration
            control is the right term, but given the context my client
            was asking in, I immediately thought of NixOS and Kairos.
            Although these two are very different there are some
            commonalities:</p>
            <ul>
            <li>Unified declarative configuration, meaning you can
            configure your whole OS using the same syntax (although you
            may have to add support for some settings)</li>
            <li>Immutable, meaning you don’t get configuration drift at
            runtime (with caveats)</li>
            <li>Atomic updates and rollback, meaning if an update goes
            wrong the system will be fine and you can select previous
            generations at boot</li>
            </ul>
            <p>There is quite a large group of software that comes into
            the configuration management and infrastructure as code
            (IaC) categories that will control the state of a Linux
            installation. The idea being that you can automate
            installing and updating one or more Linux devices. There are
            a number of systems which can install and configure Ubuntu,
            these act as an external controller and usually use an
            agent, such as <code>sshd</code>, to control the box once it
            is up and running. During installation something like
            cloud-init can be used to get the initial configuration.</p>
            <p>This isn’t exactly how NixOS and Kairos work. NixOS is in
            fact a standalone Linux distribution which is built upon
            the Nix language and package manager. Kairos is a
            meta-distribution which transforms other distributions into
            immutable systems with generations and rollback. They both
            allow creating customized ISO and VM images.</p>
            <p>Kairos uses container technologies to take a distribution
            like Ubuntu and turn it into an immutable OS. The resulting
            image still uses the Ubuntu kernel and to my knowledge there
            is no container runtime. However it uses file system
            overlays at runtime to allow rollback. At build time it uses
            Dockerfiles and so it’s essentially using Dockerfiles to
            package a bare metal OS. The overall system configuration is
            done using a cloud-init file in a similar fashion to how
            NixOS uses Nix to configure everything (see below). The
            cloud-init syntax can be extended using bundles, which
            themselves are created with Dockerfiles and are kind of like
            meta-packages.</p>
            <p>Nix is actually a functional programming language that is
            intended for reproducibly building and packaging software.
            You can install the Nix package manager on Linux, Mac and
            maybe Windows at some point. It’s in fact the <a
            href="https://repology.org/repositories/graphs">largest and
            most up to date OSS repository in the world</a> by a very
            long way. Only AUR comes close and that is actually still
            less fresh than a NixOS stable release from over 3 years
            ago. The Google and Apple app stores are larger, but Nix is
            catching up.</p>
            <p>NixOS is a Linux distribution built with Nix. Almost
            everything on NixOS can be configured from a single Nix
            file. The base Nix system is partially immutable and
            supports rolling back to previous generations. I started
            using NixOS because I was fed up of reinstalling and
            configuring machines from scratch as well as maintaining
            lots of different config files. I also loved the Nix package
            manager, especially because I could just type
            <code>nix run nixpkgs#some-program</code> or
            <code>nix shell nixpkgs#some-program</code> and try out some
            program without installing it or using a container.</p>
            <p>NixOS also has the steepest learning curve I’ve ever
            encountered in a Linux distribution. You aren’t just
            learning an unusual distribution, you are also learning an
            unusual programming language and its libraries. A lot of
            software does stupid stuff like download binaries that link
            to a specific libc in a specific location. This does not
            work with Nix unless you implement a workaround and this is
            usually not that easy. The fact then that NixOS has more
            than just a few users is a testament to how incredibly
            powerful it is. There has to be a strong incentive to use
            this software because it is bloody difficult to get into
            it.</p>
            <p>I vaguely remember a post from someone saying “I finally
            wrote my first Nix flake/package and it only took me two
            years!”. Obviously the story is quite different with Kairos
            which still requires some mental adjustments, but will be
            more familiar to a lot of system admins who have already
            been exposed to Kubernetes, cloud-init and whatever distro
            they decided to wrap.</p>
            <p>It could actually make sense to wrap NixOS with Kairos
            because the level of immutability is different between them,
            but there are a lot of details to sort through before
            deciding something like that. Indeed there is a lot of depth
            to this topic.</p>
            <h2 id="relevant-links">Relevant links</h2>
            <ul>
            <li>https://nixos.org/</li>
            <li>https://kairos.io/</li>
            </ul>
            <h1 id="tools">Tools</h1>
            <ul>
            <li><a
            href="https://www.linkedin.com/feed/update/urn:li:activity:7290666326879096832?updateEntityUrn=urn%3Ali%3Afs_updateV2%3A%28urn%3Ali%3Aactivity%3A7290666326879096832%2CFEED_DETAIL%2CEMPTY%2CDEFAULT%2Cfalse%29">blxrep</a>
            - disk replication tool that uses eBPF tracepoints</li>
            <li><a
            href="https://github.com/NixOS/nixpkgs/pull/385260">udpspeeder</a>
            - Wraps UDP packets (e.g. VPN traffic) providing forward
            error correction codes making any UDP traffic more
            reliable</li>
            <li><a href="https://nodered.org/">Node-RED</a> - Low code
            event-driven programming. This is great for writing programs
            that you can then hand off to non-coders, especially
            electrical and mechanical engineers who despise anything
            text based. It has a module for Modbus which I covered in my
            previous newsletter.</li>
            <li><a
            href="https://www.linkedin.com/feed/update/urn:li:activity:7300274962085347329?updateEntityUrn=urn%3Ali%3Afs_updateV2%3A%28urn%3Ali%3Aactivity%3A7300274962085347329%2CFEED_DETAIL%2CEMPTY%2CDEFAULT%2Cfalse%29">below</a>
            - Like top, but for CGroups</li>
            </ul>
            <h1 id="random">Random</h1>
            <ul>
            <li><a
            href="https://x.com/alexjplaskett/status/1892464401988440574">119
            vulnerabilities in LTE/5G core infrastructure</a></li>
            <li><a
            href="https://drive.google.com/file/d/1YaG9xpu-WQKBPUi8yQ4HaDYQLUSa7Y3J/view?usp=sharing">HOW
            TO SUCCEED IN MRBEAST PRODUCTION</a></li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 6</title>
  <id>https://richiejp.com/news-letter-issue-6</id>
  <published>2025-03-07T10:14:35Z</published>
  <updated>2025-03-20T14:27:01Z</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-6" />
  <summary>Neovim Avante, Bolt.new and Cursor: coding tools that are
going to replace me next year</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>In this week’s issue: Avante is a great tool for bringing
            AI to Neovim. Also a comparison with Cursor and Bolt.new</p>
            <h1 id="coding-tools-for-replacing-me">Coding tools for
            replacing me</h1>
            <p>The first things to be replaced will be the current
            generation of AI tools; last in first out. However after
            that there are the professions that are the product of
            technology. Not least of all software developer, a practice
            that is well documented and can be carried out entirely
            using a keyboard.</p>
            <p>The below tools are more developer aids than total
            replacements, but they can already replace developers for
            certain tasks. Agents like Devin are supposed to replace a
            developer on your team, but the difference is really just
            the UI and I think at this stage Devin has it wrong. You can
            give your developers Cursor and let one of them go.</p>
            <p>Alternatively you can let one developer go, but replace
            them with a specialist, thus expanding the breadth of your
            team’s expertise. The economic effects of AI on developers
            are far from certain, but it’s clear that we must adapt.</p>
            <h2 id="cursor-and-avante">Cursor and Avante</h2>
            <p>Cursor is a Visual Studio Code fork that enables chat bots
            to perform various actions on your code base and appears to
            be the leader in this space. Avante is the same thing, but it
            is an Open Source plugin for Neovim. What you get with
            these:</p>
            <ul>
            <li>Ask LLMs within your editor to modify or add code
            files</li>
            <li>Choose which additions or modifications suggested by the
            LLM you want to keep</li>
            <li>LLMs can use tools such as code and file search to
            answer questions</li>
            <li>LLMs can generate one-off scripts to run on your code
            base e.g. <code>go mod init</code></li>
            <li>Autocomplete code as you type with an LLM</li>
            <li>Choose specific regions of code and ask the LLM to
            describe or rewrite them</li>
            </ul>
            <p>I put off trying Cursor because I use Neovim and Emacs. I
            do a lot of development in a terminal over SSH and I hate
            ever having to reach for the mouse. I tried Cursor after
            using Avante just to check if it is significantly different
            and my conclusion is that I still hate Visual Studio Code.
            Meanwhile I was blown away by Avante.</p>
            <p>I was not expecting it to work well without significant
            pain. However, the first thing I tried, using Claude 3.7,
            was to add an IRC connector to an existing project. I asked
            it to base the IRC connector on the existing Slack connector,
            which had a reasonable chunk of functionality. This required
            creating a new file and modifying several others. It
            completed the task with only a few minor bug fixes from
            me.</p>
            <p>The basic structure of the code was good and it saved me
            a lot of laborious work creating modules and wiring up
            interfaces. Claude did get very confused while trying to
            generate a configuration file. I had to fully figure that
            out on my own. So it did hit a clear limit, but it made
            significant progress beforehand.</p>
            <p>Next I tried using it to create a Go-based web app from
            scratch using HTMX and TailwindCSS. It managed to create the
            initial app without errors; I then iterated on the site’s
            style and layout. This all worked wonderfully except for
            some highly verbose Tailwind code and a few instances where
            Avante got stuck.</p>
            <p>The question is: can I refuse to use these tools and still
            be competitive? Or will they actually make developers capable
            of debugging and writing novel code more in demand?</p>
            <h2 id="bolt.new">Bolt.new</h2>
            <p>Bolt hosts your app in addition to writing code, similar
            to Replit, which also has an AI. It feels more like a
            developer replacement than an aid. The scope is much
            smaller and the tool is less complicated. It just allows
            creating web apps and doesn’t appear to have all the code
            editing features of Cursor.</p>
            <p>Tools like this, or ones that are even more focused, are
            the ones which can defer hiring a developer. If the tool can
            produce limited but flawless apps, then a non-developer can
            use it for prototyping, internal tools or an MVP. That is
            exactly how I use image generation and LLMs to replace other
            professionals.</p>
            <p>However, I used Bolt to create an Astro JS app and hit a
            bug, and I couldn’t figure out whether it was in the
            generated code or in Bolt itself. The Bolt editor is crappy
            compared to Neovim or Visual Studio, so for me it is just not
            worth the effort of figuring out what went wrong. With a tool
            like this, the second I have to start debugging using my own
            brain, I have to spend significant time understanding the
            application code.</p>
            <p>Still, this is the kind of thing I would point less
            technical people to, although I feel like Bolt doesn’t go
            far enough to capture a non-technical audience. It falls
            into a middle ground between AI-assisted no-code/low-code
            and AI-assisted coding.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 7</title>
  <id>https://richiejp.com/news-letter-issue-7</id>
  <published>2025-03-13T16:14:55Z</published>
  <updated>2025-03-13T16:14:55Z</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-7" />
  <summary>Tracking bandwidth usage per process with eBPF and
CGroups</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Finding which processes are sending data to or receiving
            data from a given host appears to be a common need, and here
            are some of the reasons I have seen for it:</p>
            <ul>
            <li>Preventing data usage charges on cellular networks by
            finding which application makes requests.</li>
            <li>Detecting malicious applications by tracking whether the
            data usage of applications aligns with their stated
            purpose.</li>
            <li>Debugging applications or the Linux kernel itself, by
            measuring the actual data usage against the expected
            amount.</li>
            </ul>
            <p>So, for example, you may see on your firewall that a
            particular host downloads a lot of data from Hugging Face.
            It’s normal for it to download some data from Hugging Face,
            but the current level is saturating your bandwidth.</p>
            <p>You could set up an HTTP proxy to cache the downloads, but
            it’s still natural to want to find which application is
            using the bandwidth and see if it can be avoided
            altogether.</p>
            <p>By default the kernel has no user-friendly interface to
            present this information. The closest is a tracepoint which
            shows how much data is sent or received on a per-process
            basis. If you want to know where the data is coming from or
            going to, then this needs to be correlated with other
            sources.</p>
            <p>The kernel can be extended with eBPF bytecode, which can
            be used to hook into various points within the kernel and
            provide the necessary functionality. That is, you can
            register a small eBPF program with a particular hook point,
            so that when the hook point is hit, the eBPF program runs.
            There are multiple types of eBPF program with different
            restrictions and capabilities.</p>
            <p>The eBPF-based utility <a
            href="https://github.com/dkorunic/pktstat-bpf">pktstat-bpf</a>
            tracks the bandwidth usage of source-destination IP address
            pairs using a number of techniques. It can also track the
            process when using kprobes or the newly added (by me)
            CGroups hook points.</p>
            <p>So far so good, but now let’s get further into the
            technical details. In the interest of keeping the article
            short, I won’t try to expand on all the jargon.</p>
            <p>As I have mentioned previously in my newsletter, there
            don’t appear to be any convenient hook points to both
            measure the bandwidth and record the responsible process at
            the same time.</p>
            <p>That is, there don’t appear to be any stable hook points
            that have access to the packet data and only execute in the
            context of the sending or receiving process. To overcome
            this, pktstat-bpf’s author, Dinko, used a kprobe-based
            approach that attaches probes to several internal kernel
            functions.</p>
            <p>This provides the necessary information, however the
            internal functions that kprobes hook can change or disappear
            between kernel versions. In fact, just recompiling the kernel
            with different compiler settings can cause some functions to
            be inlined and their kprobe attach points to disappear.
            Kprobes are also more challenging for security and can’t ever
            be attached by a non-root user. Due to this I decided to
            investigate some stable hook points, the most fruitful of
            which so far have been the CGroup hook points.</p>
            <h2 id="taking-advantage-cgroups">Taking advantage of
            CGroups</h2>
            <p>CGroups can be used to organise processes into a
            hierarchy of groups and place various controls on them,
            including memory limits, CPU limits, I/O limits and a whole
            host of other things. A big advantage of CGroups is that you
            can prevent one type of process from using up all of your
            system’s resources.</p>
            <p>It’s also possible to attach eBPF programs to CGroups
            which hook into a number of different actions.</p>
            <p>In particular there are CGroup hook points for sending
            and receiving packets and for creating sockets.
            Unfortunately the hook points for sending and receiving
            packets are not inside process context. However the hook
            point for socket creation is always executed inside the
            relevant process’s task. So what we can do is track which
            process creates a socket, then when we see a packet being
            sent or received on a particular socket, we can look up
            which process created that socket.</p>
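            <p>As a rough illustration of that bookkeeping, here is a
            minimal sketch in plain Python. It is only a user-space model
            of the idea; it is not eBPF code and not how pktstat-bpf is
            actually written. The class, method and field names are all
            made up for the example, and each socket is identified by an
            arbitrary integer.</p>
<pre>
```python
# Simplified user-space model of the bookkeeping; this is not eBPF code.
# Every name below is hypothetical and chosen purely for illustration.


class SocketTracker:
    def __init__(self):
        # socket id mapped to the (pid, command name) that created it
        self.owner_by_socket = {}
        # owner, or None when the owner is unknown, mapped to a byte count
        self.bytes_by_owner = {}

    def on_socket_create(self, socket_id, pid, comm):
        """Models the socket creation hook, which runs in process context."""
        self.owner_by_socket[socket_id] = (pid, comm)

    def on_packet(self, socket_id, length):
        """Models the packet hooks, which do not run in process context."""
        # None means the socket was created before tracking started.
        owner = self.owner_by_socket.get(socket_id)
        self.bytes_by_owner[owner] = self.bytes_by_owner.get(owner, 0) + length
        return owner


tracker = SocketTracker()
tracker.on_socket_create(socket_id=1, pid=4242, comm="curl")
tracker.on_packet(socket_id=1, length=1500)
tracker.on_packet(socket_id=1, length=500)
tracker.on_packet(socket_id=2, length=64)  # socket predates the tracker
```
</pre>
            <p>In a real eBPF program the dictionary would presumably be
            an eBPF map shared between the socket creation program and
            the packet programs; the sketch only shows the lookup logic,
            including the case where no owner is known.</p>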
            <p>This has the disadvantage that sockets which were created
            before we started pktstat-bpf won’t have any process
            associated with them. Also there is some cost associated
            with tracking which processes created a socket. However if
            we start pktstat-bpf before the workload then we’ll get the
            full information. The only other issue is that a process may
            create a socket then transfer it to a different process.
            Commonly this happens when a process forks and its children
            inherit its sockets. However processes can also send sockets
            (actually file descriptors) to each other using UNIX domain
            socket control messages (<code>SCM_RIGHTS</code>).</p>
            <p>On the plus side the CGroup hook points are stable and
            provide almost the same level of information as the kprobes.
            Also you can restrict monitoring to just one group of
            processes which can avoid spending resources on monitoring
            irrelevant traffic. On the other hand if you want to track
            all processes then it’s simply a case of attaching to the
            root CGroup.</p>
            <p>There are more CGroup hook points that may be useful for
            getting the process, as well as other types of hook point
            that I haven’t investigated. In conclusion, there are a
            number of options for tracking process bandwidth usage in
            eBPF and pktstat-bpf is a good tool to try this out with.</p>
            <p>For more on eBPF see my <a
            href="https://www.linkedin.com/pulse/week-7-spying-ebpf-wasm-shouldnt-exist-go-fast-tools-palethorpe-mwkae/?trackingId=e9EWTz%2F3QTmD2spwwfeuMw%3D%3D">previous
            article</a>.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 8</title>
  <id>https://richiejp.com/news-letter-issue-8</id>
  <published>2025-03-20T15:06:24Z</published>
  <updated>2025-03-20T15:11:40Z</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-8" />
  <summary>Vibe coding an OS kernel into existence</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>In this week’s issue I vibe coded a “bare metal” program
            that runs inside a Firecracker microVM on x86_64 and
            accepts commands through the serial terminal.</p>
            <p>I’ve been trying out Windsurf, which is a competitor to
            Cursor. I found that it’s rather good, so I thought to
            myself, what’s the most absurd thing I can try to code just
            using AI?</p>
            <p>In other words, what can I try vibe coding that wouldn’t
            appear to lend itself to vibe coding? Vibe coding being
            where you just ask the AI to write the code based on your
            gut and don’t engage your brain.</p>
            <p>Operating system kernels came to mind first of all. So I
            decided to try starting a kernel that runs inside a
            Firecracker microVM. The end result works quite nicely and
            it took very little time, but there are caveats.</p>
            <p>I asked Claude from Anthropic via Windsurf to write me a
            bare metal program that can be booted inside a microVM.</p>
            <p>On the first attempt, it decided to use Rust. This did
            not go well, and I’m not sure what was most to blame.</p>
            <p>Windsurf first tried to create a new Rust crate using
            Cargo. When it discovered that wasn’t installed, it tried to
            install it using apt-get. That didn’t work because I’m using
            NixOS.</p>
            <p>In the process, it saw that I am using Nix because of the
            Nix paths in the error output. It didn’t think to use Nix to
            install the packages, though, or to use a Nix flake.</p>
            <p>So I helped it get Rust installed. It then
            generated the Rust code after checking the Rust version, but
            the code would not compile.</p>
            <p>It then got into a loop trying to compile the code and
            doing various different things like using rustup to install
            the toolchain, switching between nightly and stable, trying
            different compiler flags, modifying the code, etc.</p>
            <p>I’m not that familiar with Rust so it wasn’t clear to me
            if it was making progress or just creating new problems and
            getting a different error message as a result.</p>
            <p>I asked it to start again and pick a different language.
            This time it picked C, which has the advantage of being a
            language that is very well established, doesn’t change much,
            and is used extensively in operating systems.</p>
            <p>This is good for large language models because there’s
            high-quality training data available for C. It also chose
            NASM assembly for the entry code and created a linker script
            for LD.</p>
            <p>It wrote some C code to output ASCII to the serial
            console and to read and echo input from the serial console.
            It also made some scripts and config to run Firecracker.</p>
            <p>Apart from problems with Nix again, which I fixed because
            I think it’s unfair to expect AI to deal with Nix, it all
            worked.</p>
            <p>However, the program polled the serial console in a tight
            loop which used 100% of one CPU core. I decided to try
            asking it to fix this so that it didn’t use 100% CPU.</p>
            <p>Initially it completely failed to do this; it first broke
            the program by halting the CPU indefinitely. Then it tried
            various ways of using the x86 pause instruction, skirting
            around the real solution which is to use interrupts.</p>
            <p>Eventually I asked it, “why don’t you just use
            interrupts?” This prompted it to set up an interrupt table
            and use interrupts. Initially there was a bug, but it fixed
            it and it worked.</p>
            <p>Perhaps if I were more clueless about kernel development
            this could have been a sticking point. Then again, while the
            LLM did not attempt to implement interrupt handlers, it did
            mention their absence and someone could simply question that
            without knowing what interrupt handlers are.</p>
            <p>After this I asked it to add support for an exit command.
            It actually added a number of commands including help.
            Initially there were a couple of visible bugs but after
            providing descriptions of the bugs it solved the issues and
            got the code working.</p>
            <p>It would be quite easy to start picking the program
            apart. However what’s incredible is that it worked at all.
            Also it’s worth thinking about the trajectory and velocity
            of this technology. Machine learning models are improving
            quickly and at the same time the tooling around these
            probabilistic models is improving too.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Richie’s Techbits newsletter: Issue 9</title>
  <id>https://richiejp.com/news-letter-issue-9</id>
  <published>2025-03-30T19:57:11+01:00</published>
  <updated>2025-03-30T19:57:11+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/news-letter-issue-9" />
  <summary>Chat coding feels awful, but it can’t be ignored</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Following on from last week’s letter, where I ‘vibe
            coded’ a bare metal program in C, this week I have ‘chat
            coded’ the conversion of a moderately complicated front end
            from raw HTML/JS to React, and started a shell in Go which
            is intended to have the usability of Fish shell while
            maintaining POSIX compatibility.</p>
            <p>I’m not the world’s best frontend engineer so I managed
            to do in a few hours what would have previously taken me
            months. Although there are hidden costs which I’ll get to in
            a moment. The shell experiment did not go so well.</p>
            <p>I wouldn’t describe this as vibe coding, which I hit hard
            limits with. Instead, I’m going to call this ‘chat coding’
            because I’m thinking much harder about what is being
            written, rather than just going with the vibes. The term
            chat coding has been floating around for a while, but since
            Andrej Karpathy coined (or popularized) vibe coding it has
            taken on a new distinction.</p>
            <p>With pure vibe coding I found the AI consistently makes
            poor strategic decisions, leading to huge files that are
            highly duplicated and lots of functions with pointless code
            in them. Eventually the AI can’t progress except by creating
            completely new code and hooking that into what already
            exists. The resulting code is about as intelligible as the
            product of a genetic algorithm.</p>
            <p>In some cases the code reaches such a level of mess that
            the code agent goes wild and just produces yet more
            nonsense. This is typical of what happens to LLMs when the
            input is too far outside of their training data. Which is
            amusing when it’s the LLM which produced what is now being
            input.</p>
            <p>I think there are (expensive) ways that coding agents
            could mitigate this; however, we’re not there yet. So in the
            meantime the user has to provide controls to prevent this
            from happening.</p>
            <p>At this stage that means looking at and understanding the
            code to some extent, then providing the AI with constant
            feedback. It also means taking time to think about, or chat
            with the AI about, the overall structure of the program.</p>
            <p>Generally speaking, coding agents will make the laziest,
            most direct changes to achieve whatever you’ve given them.
            It’s easy to see why; when they deviate from this, becoming
            more proactive, it’s often disastrous.</p>
            <p>Constantly bullying and directing the AI to get the
            result you want is not nearly as fun as vibe coding.
            Neither is it more satisfying than writing code by hand and
            having complete control.</p>
            <p>It’s an unhappy middle ground, but for tasks lacking in
            novelty and not requiring strict security guarantees, it is
            so much faster. The major reasons for this are, I think,
            quite mundane.</p>
            <p>I feel like for a lot of programming tasks the limiting
            factors are typing and reading speed.</p>
            <p>That is, how fast you can read the documentation and
            existing code to absorb the exact details of how it was
            constructed. Followed by how fast you can type in the code
            to use whatever combination of features you need.</p>
            <p>For an experienced software developer, doing routine
            development, not much deep thought is required. It only
            takes a second to form an abstract idea of how the code
            should work. Most of the time is taken hammering out the
            routine details which follow standard patterns.</p>
            <p>Any gap between thinking of what you want to write and
            writing it is an opportunity to get distracted. In my case
            it is an opportunity to get bored and start complicating
            things to make it more interesting. So even if the
            percentage of time spent typing is small it’s still an
            important factor.</p>
            <p>The amount of time spent typing can be reduced by asking
            an LLM to write the code you want, because even though
            you’re still typing (unless you’re using speech-to-text),
            the key combinations are easier and some of the details can
            be inferred by the LLM.</p>
            <p>If you’re able to find the right balance, where you’re
            not asking the LLM to exceed its capabilities, then you can
            offload a lot of reading, writing, and editing at a
            reasonable cost.</p>
            <p>However for some tasks I think it’s unlikely you’ll be
            able to find that balance. For the shell program for
            instance, while I did achieve some nice results quickly, the
            LLM overstepped and tried to do stuff that wasn’t gonna
            work. It then started piling hacks on top of hacks to try
            and fix it. In fact it quite often left a TODO style comment
            saying this is tricky, so I’ll just do this hack for
            now.</p>
            <p>It seemed almost every detail of the program confused it,
            but in particular when I started trying to use the AST to
            highlight the syntax. Although to be fair it’s not just the
            LLM that was getting confused.</p>
            <p>While I wouldn’t rule out using chat coding for elements
            of this shell project, clearly some parts need to be written
            manually with special care. Trying to get an LLM to write
            them is partially a battle with the LLM and an expensive
            battle at that.</p>
            <p>I have to say though, that I’m getting more and more
            excited by the thought of not having to write code by hand.
            To be clear, I enjoy being a competent Vim and Emacs user; I
            deliberately trained myself to touch type using the standard
            10 finger method. I won’t be happy to never use these skills
            again, but I will be happy to have the option of not using
            them.</p>
            <p>I like the idea of being able to walk around while
            writing code or doing research. Keyboards aren’t exactly a
            natural phenomenon after all and needing to sit at one
            constrains my work environment.</p>
            <p>In the past I’ve not used speech-to-text tools much
            because they were usually rubbish. These days, though,
            they’re getting very good, which is in no small part
            due to LLMs and the underlying advances in machine
            learning.</p>
            <p>Editing some text using speech commands is quite
            difficult if you have to specify the exact changes required.
            However if you can casually describe what is needed and
            there is a layer which translates this into the exact edits
            required, then it becomes a lot easier.</p>
            <p>So today chat coding has a number of downsides that
            annoy me a lot. However the upsides make it difficult to
            ignore and in the future I see a number of possible
            evolutionary paths that I am positively excited about,
            including that it becomes a superior alternative to the
            keyboard and mouse. If this could be the case for software
            development it would have deep implications for how everyone
            interfaces with computers.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Creating a static site with Pandoc and Bulma</title>
  <id>https://richiejp.com/pandoc-bulma-static-site</id>
  <published>2020-06-29T21:37:07+01:00</published>
  <updated>2023-07-23T16:10:39+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/pandoc-bulma-static-site" />
  <summary>Duct tape edition of static web site generation</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <h1 id="intro">Intro</h1>
            <p>The other day I got this <em>incredible</em> urge to
            resurrect my website and create a portfolio<a href="#fn1"
            class="footnote-ref" id="fnref1"><sup>1</sup></a>. It is
            something I have been meaning to do for a <em>long
            time</em>, but have been putting off due to the pain of
            dealing with web technology.</p>
            <p>The web is awash with unnecessary complexity and bloat. I
            don’t even have the patience for learning static site
            generators<a href="#fn2" class="footnote-ref"
            id="fnref2"><sup>2</sup></a>, I don’t need anything dynamic
            and I don’t want to maintain it. I’m almost happy to write
            raw HTML, but despite claims to the contrary the structure
            of HTML is tightly coupled with the rendering of it (unless
            you reprocess it). Plus, <em>even as an intermediate
            language without all the extra elements needed for styling,
            it is still ugly and verbose</em>. It is nicer to use
            <strong>Markdown</strong> or similar.</p>
            <p>To be clear, I don’t want to spend ages learning about a
            new tool when I can increase my knowledge of ones I already
            use that have more utility in other domains. Don’t get me
            wrong, sometimes (OK, historically, <em>most of the
            time</em>)<a href="#fn3" class="footnote-ref"
            id="fnref3"><sup>3</sup></a> I’m happy to rewrite everything
            from scratch in a language no one has heard of using
            technologies that are barely invented, but this is not one
            of those occasions.</p>
            <p>It so happens I have already been using <a
            href="https://pandoc.org">Pandoc</a>, which can generate
            HTML amongst other formats. Using <strong>Pandoc</strong> I
            can convert from a single source format to HTML and
            LaTeX/PDF. I don’t like the HTML/CSS which
            <strong>Pandoc</strong> produces by <em>default</em>, but I
            also have some basic familiarity with a CSS framework called
            <a href="https://bulma.io">Bulma</a> which I can use without
            any CSS knowledge.</p>
            <p><strong>GitLab CI</strong> makes it easy to host a static
            site. So I decided to <em>duct tape</em> together
            <strong>Pandoc</strong> and <strong>Bulma</strong> with GNU
            Make to create a static site generator. I’m not saying
            anything about how good or bad these are in relation to
            similar stuff, but my patience was low with this one, so I
            went with what worked quickly in the past.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>I have been tweaking my website since writing this. Look
            at my GitLab to see the <a
            href="https://gitlab.com/Palethorpe/portfolio">latest
            source</a>.</p>
            <p>In fact it now uses TailwindCSS which requires some CSS
            knowledge. However it tended to be the case with Bulma that
            something wouldn’t quite work right. Then I’d have to start
            tweaking the CSS which usually took a long time.</p>
            </div>
            </div>
            <h1 id="pandoc-bulma">Pandoc &amp; Bulma</h1>
            <p>For now I am writing the page source in Markdown;
            <strong>Pandoc</strong> can turn this into fairly generic
            HTML, either as a standalone page or a fragment. I decided
            to use the standalone variant, which uses a template to
            generate the document header and footer.</p>
            <p>So executing pandoc looks something like
            this:</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="ex">pandoc</span> <span class="at">--standalone</span> <span class="at">--template</span><span class="op">=</span>src/std.tmpl <span class="at">--css</span><span class="op">=</span>bulma.css <span class="dt">\</span></span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a>       src/pandoc-bulma-static-site.md</span></code></pre></div>
            <p>By default the HTML document doesn’t have much style to
            speak of except for the code highlights. So I can freely add
            <strong>Bulma</strong> which doesn’t conflict much with the
            code highlighting. <strong>Bulma</strong> does require
            particular classes on <em>some</em> of the HTML, so this
            must be added in the template.</p>
            <p>Luckily <strong>Bulma</strong> has a class simply called
            <code>content</code> which nicely handles the HTML
            <strong>Pandoc</strong> produces (the part that does not
            come from the template). Note that <strong>Pandoc</strong>
            also allows one to customize its output with Lua scripts,
            but so far I haven’t needed that (thankfully).</p>
            <p>Let’s go through the template (at the time of
            writing).</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode html"><code class="sourceCode html"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="dt">&lt;!DOCTYPE</span> html<span class="dt">&gt;</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="dt">&lt;</span><span class="kw">html</span><span class="ot"> xmlns</span><span class="op">=</span><span class="st">&quot;http://www.w3.org/1999/xhtml&quot;</span><span class="ot"> lang</span><span class="op">=</span><span class="st">&quot;$lang$&quot;</span><span class="ot"> xml:lang</span><span class="op">=</span><span class="st">&quot;$lang$&quot;</span><span class="er">$</span><span class="ot">if(dir)</span><span class="er">$</span><span class="ot"> dir</span><span class="op">=</span><span class="st">&quot;$dir$&quot;</span><span class="er">$</span><span class="ot">endif</span><span class="er">$</span><span class="dt">&gt;</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="dt">&lt;</span><span class="kw">head</span><span class="dt">&gt;</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> charset</span><span class="op">=</span><span class="st">&quot;utf-8&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> name</span><span class="op">=</span><span class="st">&quot;generator&quot;</span><span class="ot"> content</span><span class="op">=</span><span class="st">&quot;pandoc&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> name</span><span class="op">=</span><span class="st">&quot;viewport&quot;</span><span class="ot"> content</span><span class="op">=</span><span class="st">&quot;width=device-width, initial-scale=1.0, user-scalable=yes&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a>$for(author-meta)$</span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> name</span><span class="op">=</span><span class="st">&quot;author&quot;</span><span class="ot"> content</span><span class="op">=</span><span class="st">&quot;$author-meta$&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a>$endfor$</span>
<span id="cb2-10"><a href="#cb2-10" tabindex="-1"></a>$if(date-meta)$</span>
<span id="cb2-11"><a href="#cb2-11" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> name</span><span class="op">=</span><span class="st">&quot;dcterms.date&quot;</span><span class="ot"> content</span><span class="op">=</span><span class="st">&quot;$date-meta$&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb2-12"><a href="#cb2-12" tabindex="-1"></a>$endif$</span>
<span id="cb2-13"><a href="#cb2-13" tabindex="-1"></a>$if(keywords)$</span>
<span id="cb2-14"><a href="#cb2-14" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> name</span><span class="op">=</span><span class="st">&quot;keywords&quot;</span><span class="ot"> content</span><span class="op">=</span><span class="st">&quot;$for(keywords)$$keywords$$sep$, $endfor$&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb2-15"><a href="#cb2-15" tabindex="-1"></a>$endif$</span>
<span id="cb2-16"><a href="#cb2-16" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">title</span><span class="dt">&gt;</span>Richie&#39;s $pagetitle$<span class="dt">&lt;/</span><span class="kw">title</span><span class="dt">&gt;</span></span>
<span id="cb2-17"><a href="#cb2-17" tabindex="-1"></a>$for(css)$</span>
<span id="cb2-18"><a href="#cb2-18" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">link</span><span class="ot"> rel</span><span class="op">=</span><span class="st">&quot;stylesheet&quot;</span><span class="ot"> href</span><span class="op">=</span><span class="st">&quot;$css$&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb2-19"><a href="#cb2-19" tabindex="-1"></a>$endfor$</span>
<span id="cb2-20"><a href="#cb2-20" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">style</span><span class="dt">&gt;</span></span>
<span id="cb2-21"><a href="#cb2-21" tabindex="-1"></a>    $styles<span class="fu">.html</span><span class="in">()</span>$</span>
<span id="cb2-22"><a href="#cb2-22" tabindex="-1"></a>  <span class="dt">&lt;/</span><span class="kw">style</span><span class="dt">&gt;</span></span>
<span id="cb2-23"><a href="#cb2-23" tabindex="-1"></a>$if(math)$</span>
<span id="cb2-24"><a href="#cb2-24" tabindex="-1"></a>  $math$</span>
<span id="cb2-25"><a href="#cb2-25" tabindex="-1"></a>$endif$</span>
<span id="cb2-26"><a href="#cb2-26" tabindex="-1"></a>  <span class="co">&lt;!--[if lt IE 9]&gt;</span></span>
<span id="cb2-27"><a href="#cb2-27" tabindex="-1"></a><span class="co">    &lt;script src=&quot;//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js&quot;&gt;&lt;/script&gt;</span></span>
<span id="cb2-28"><a href="#cb2-28" tabindex="-1"></a><span class="co">  &lt;![endif]--&gt;</span></span>
<span id="cb2-29"><a href="#cb2-29" tabindex="-1"></a>$for(header-includes)$</span>
<span id="cb2-30"><a href="#cb2-30" tabindex="-1"></a>  $header-includes$</span>
<span id="cb2-31"><a href="#cb2-31" tabindex="-1"></a>$endfor$</span>
<span id="cb2-32"><a href="#cb2-32" tabindex="-1"></a><span class="dt">&lt;/</span><span class="kw">head</span><span class="dt">&gt;</span></span></code></pre></div>
            <p><strong>Pandoc</strong> templates have support for
            various control structures (branches and loops) and
            inserting variables. For example</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode html"><code class="sourceCode html"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a>$for(css)$</span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">link</span><span class="ot"> rel</span><span class="op">=</span><span class="st">&quot;stylesheet&quot;</span><span class="ot"> href</span><span class="op">=</span><span class="st">&quot;$css$&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a>$endfor$</span></code></pre></div>
            <p>This loop is where the <strong>Bulma</strong> style sheet
            gets linked: each <code>--css</code> argument becomes one
            <code>link</code> element. I have removed some bits from
            the default template, but otherwise this is just what
            <strong>Pandoc</strong> uses by default.</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode html"><code class="sourceCode html"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a><span class="dt">&lt;</span><span class="kw">body</span><span class="dt">&gt;</span></span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">section</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;hero is-small is-warning is-bold&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a>    $for(include-before)$</span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a>    $include-before$</span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a>    $endfor$</span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a>    $if(title)$</span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a>    <span class="dt">&lt;</span><span class="kw">div</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;hero-head&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-8"><a href="#cb4-8" tabindex="-1"></a>      <span class="dt">&lt;</span><span class="kw">nav</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;navbar is-pulled-right&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-9"><a href="#cb4-9" tabindex="-1"></a>    <span class="dt">&lt;</span><span class="kw">div</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;container&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-10"><a href="#cb4-10" tabindex="-1"></a>          <span class="dt">&lt;</span><span class="kw">div</span><span class="ot"> id</span><span class="op">=</span><span class="st">&quot;navbarMenuHeroA&quot;</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;navbar-menu&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-11"><a href="#cb4-11" tabindex="-1"></a>            <span class="dt">&lt;</span><span class="kw">div</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;navbar-end&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-12"><a href="#cb4-12" tabindex="-1"></a>              <span class="dt">&lt;</span><span class="kw">a</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;navbar-item&quot;</span><span class="ot"> href</span><span class="op">=</span><span class="st">&quot;/&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-13"><a href="#cb4-13" tabindex="-1"></a>        /index</span>
<span id="cb4-14"><a href="#cb4-14" tabindex="-1"></a>              <span class="dt">&lt;/</span><span class="kw">a</span><span class="dt">&gt;</span></span>
<span id="cb4-15"><a href="#cb4-15" tabindex="-1"></a>          <span class="dt">&lt;</span><span class="kw">a</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;navbar-item&quot;</span></span>
<span id="cb4-16"><a href="#cb4-16" tabindex="-1"></a><span class="ot">         href</span><span class="op">=</span><span class="st">&quot;https://gitlab.com/Palethorpe&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-17"><a href="#cb4-17" tabindex="-1"></a>                  <span class="dt">&lt;</span><span class="kw">img</span><span class="ot"> src</span><span class="op">=</span><span class="st">&quot;gitlab_logo.svg&quot;</span><span class="dt">&gt;&lt;/</span><span class="kw">img</span><span class="dt">&gt;</span></span>
<span id="cb4-18"><a href="#cb4-18" tabindex="-1"></a>          <span class="dt">&lt;/</span><span class="kw">a</span><span class="dt">&gt;</span></span>
<span id="cb4-19"><a href="#cb4-19" tabindex="-1"></a>          <span class="dt">&lt;</span><span class="kw">a</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;navbar-item&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-20"><a href="#cb4-20" tabindex="-1"></a>          <span class="dt">&lt;/</span><span class="kw">a</span><span class="dt">&gt;</span></span>
<span id="cb4-21"><a href="#cb4-21" tabindex="-1"></a>            <span class="dt">&lt;/</span><span class="kw">div</span><span class="dt">&gt;</span></span>
<span id="cb4-22"><a href="#cb4-22" tabindex="-1"></a>          <span class="dt">&lt;/</span><span class="kw">div</span><span class="dt">&gt;</span></span>
<span id="cb4-23"><a href="#cb4-23" tabindex="-1"></a>    <span class="dt">&lt;/</span><span class="kw">div</span><span class="dt">&gt;</span></span>
<span id="cb4-24"><a href="#cb4-24" tabindex="-1"></a>      <span class="dt">&lt;/</span><span class="kw">nav</span><span class="dt">&gt;</span></span>
<span id="cb4-25"><a href="#cb4-25" tabindex="-1"></a>    <span class="dt">&lt;/</span><span class="kw">div</span><span class="dt">&gt;</span></span>
<span id="cb4-26"><a href="#cb4-26" tabindex="-1"></a>    <span class="dt">&lt;</span><span class="kw">div</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;hero-body&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-27"><a href="#cb4-27" tabindex="-1"></a>      <span class="dt">&lt;</span><span class="kw">header</span><span class="ot"> id</span><span class="op">=</span><span class="st">&quot;title-block-header&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-28"><a href="#cb4-28" tabindex="-1"></a>    <span class="dt">&lt;</span><span class="kw">h1</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;title&quot;</span><span class="dt">&gt;</span>Richie&#39;s<span class="dt">&lt;/</span><span class="kw">h1</span><span class="dt">&gt;</span></span>
<span id="cb4-29"><a href="#cb4-29" tabindex="-1"></a>    <span class="dt">&lt;</span><span class="kw">h2</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;subtitle&quot;</span><span class="dt">&gt;</span>$title$<span class="dt">&lt;/</span><span class="kw">h2</span><span class="dt">&gt;</span></span>
<span id="cb4-30"><a href="#cb4-30" tabindex="-1"></a>      <span class="dt">&lt;/</span><span class="kw">header</span><span class="dt">&gt;</span></span>
<span id="cb4-31"><a href="#cb4-31" tabindex="-1"></a>    <span class="dt">&lt;/</span><span class="kw">div</span><span class="dt">&gt;</span></span>
<span id="cb4-32"><a href="#cb4-32" tabindex="-1"></a>    $endif$</span>
<span id="cb4-33"><a href="#cb4-33" tabindex="-1"></a>    <span class="dt">&lt;</span><span class="kw">div</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;hero-foot&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-34"><a href="#cb4-34" tabindex="-1"></a>      <span class="dt">&lt;</span><span class="kw">nav</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;tabs is-boxed is-pulled-right&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-35"><a href="#cb4-35" tabindex="-1"></a>    $if(toc)$</span>
<span id="cb4-36"><a href="#cb4-36" tabindex="-1"></a>    <span class="dt">&lt;</span><span class="kw">div</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;container&quot;</span><span class="dt">&gt;</span></span>
<span id="cb4-37"><a href="#cb4-37" tabindex="-1"></a>      $table-of-contents$</span>
<span id="cb4-38"><a href="#cb4-38" tabindex="-1"></a>    <span class="dt">&lt;/</span><span class="kw">div</span><span class="dt">&gt;</span></span>
<span id="cb4-39"><a href="#cb4-39" tabindex="-1"></a>    $endif$</span>
<span id="cb4-40"><a href="#cb4-40" tabindex="-1"></a>      <span class="dt">&lt;/</span><span class="kw">nav</span><span class="dt">&gt;</span></span>
<span id="cb4-41"><a href="#cb4-41" tabindex="-1"></a>    <span class="dt">&lt;/</span><span class="kw">div</span><span class="dt">&gt;</span></span>
<span id="cb4-42"><a href="#cb4-42" tabindex="-1"></a>  <span class="dt">&lt;/</span><span class="kw">section</span><span class="dt">&gt;</span></span></code></pre></div>
            <p>Next up is the ‘hero’ banner. This was more or less
            copied from <a
            href="https://bulma.io/documentation/layout/hero/">Bulma’s
            documentation</a>; I just added some modifiers
            (e.g. <code>is-warning</code>, <code>is-pulled-right</code>)
            and removed a few bits to customise it. I think it looks
            great!</p>
            <p><strong>Pandoc</strong> can generate a table of contents
            (<code>--toc</code>), which I have hacked into the
            <code>hero-foot</code>. Luckily the HTML renders OK when
            the depth is limited with <code>--toc-depth 1</code>.</p>
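            <p>To illustrate why a depth of one matters, here is roughly
            the shape of what ends up in the <code>hero-foot</code>.
            This is a hand-written sketch rather than captured output:
            the exact attributes <strong>Pandoc</strong> emits vary
            between versions, and the entries are simply this page’s
            top-level headings. With a single level the table of
            contents is a flat <code>ul</code> of links, which is the
            structure <strong>Bulma</strong>’s <code>tabs</code>
            component expects; deeper levels would nest further lists
            inside the items.</p>
            <div class="sourceCode"><pre
            class="sourceCode html"><code class="sourceCode html">&lt;nav class=&quot;tabs is-boxed is-pulled-right&quot;&gt;
  &lt;div class=&quot;container&quot;&gt;
    &lt;!-- approximate expansion of $table-of-contents$ at depth 1 --&gt;
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#pandoc-bulma&quot;&gt;Pandoc &amp;amp; Bulma&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#making-the-pages&quot;&gt;Making the Pages&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#fancy-links&quot;&gt;Fancy links&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/div&gt;
&lt;/nav&gt;</code></pre></div>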
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode html"><code class="sourceCode html"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">section</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;section&quot;</span><span class="dt">&gt;</span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a>    <span class="dt">&lt;</span><span class="kw">div</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;content&quot;</span><span class="dt">&gt;</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a>$body$</span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a>    <span class="dt">&lt;/</span><span class="kw">div</span><span class="dt">&gt;</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a>  <span class="dt">&lt;/</span><span class="kw">section</span><span class="dt">&gt;</span></span></code></pre></div>
            <p>This is the important part: the <code>$body$</code> is
            wedged into a <code>content</code> <code>div</code>.
            <strong>Pandoc</strong> mostly outputs HTML content like the
            following.</p>
            <div class="sourceCode" id="cb6"><pre
            class="sourceCode html"><code class="sourceCode html"><span id="cb6-1"><a href="#cb6-1" tabindex="-1"></a><span class="dt">&lt;</span><span class="kw">h1</span><span class="ot"> id</span><span class="op">=</span><span class="st">&quot;intro&quot;</span><span class="dt">&gt;</span>Intro<span class="dt">&lt;/</span><span class="kw">h1</span><span class="dt">&gt;</span></span>
<span id="cb6-2"><a href="#cb6-2" tabindex="-1"></a><span class="dt">&lt;</span><span class="kw">p</span><span class="dt">&gt;</span>The other day I got this <span class="dt">&lt;</span><span class="kw">em</span><span class="dt">&gt;</span>incredible<span class="dt">&lt;/</span><span class="kw">em</span><span class="dt">&gt;</span> urge to resurrect my website and create a portfolio. It is something I have been meaning to do for a <span class="dt">&lt;</span><span class="kw">em</span><span class="dt">&gt;</span>long time<span class="dt">&lt;/</span><span class="kw">em</span><span class="dt">&gt;</span>, but have been putting off due to the pain of dealing with web technology.<span class="dt">&lt;/</span><span class="kw">p</span><span class="dt">&gt;</span></span>
<span id="cb6-3"><a href="#cb6-3" tabindex="-1"></a><span class="dt">&lt;</span><span class="kw">p</span><span class="dt">&gt;&lt;</span><span class="kw">strong</span><span class="dt">&gt;</span>The web is awash with unnecessary complexity and bloat<span class="dt">&lt;/</span><span class="kw">strong</span><span class="dt">&gt;</span>. I don’t even have the patience for learning static site generators, I don’t need anything dynamic and I don’t want to maintain it.</span></code></pre></div>
            <p>I guess these are HTML ‘content’ elements, which
            <strong>Bulma</strong> styles sensibly when they sit inside
            an element with the <code>content</code> class.</p>
            <div class="sourceCode" id="cb7"><pre
            class="sourceCode html"><code class="sourceCode html"><span id="cb7-1"><a href="#cb7-1" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">footer</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;footer&quot;</span><span class="dt">&gt;</span></span>
<span id="cb7-2"><a href="#cb7-2" tabindex="-1"></a>    <span class="dt">&lt;</span><span class="kw">div</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;content&quot;</span><span class="dt">&gt;</span></span>
<span id="cb7-3"><a href="#cb7-3" tabindex="-1"></a>      <span class="dt">&lt;</span><span class="kw">p</span><span class="dt">&gt;&lt;</span><span class="kw">strong</span><span class="dt">&gt;</span>Richard Palethorpe<span class="dt">&lt;/</span><span class="kw">strong</span><span class="dt">&gt;&lt;/</span><span class="kw">p</span><span class="dt">&gt;</span></span>
<span id="cb7-4"><a href="#cb7-4" tabindex="-1"></a>      <span class="dt">&lt;</span><span class="kw">p</span><span class="dt">&gt;</span>richiejp@f-m.fm,</span>
<span id="cb7-5"><a href="#cb7-5" tabindex="-1"></a>    <span class="dt">&lt;</span><span class="kw">a</span><span class="ot"> href</span><span class="op">=</span><span class="st">&quot;https://twitter.com/jichiep&quot;</span><span class="dt">&gt;</span>@jichiep<span class="dt">&lt;/</span><span class="kw">a</span><span class="dt">&gt;</span></span>
<span id="cb7-6"><a href="#cb7-6" tabindex="-1"></a>      <span class="dt">&lt;/</span><span class="kw">p</span><span class="dt">&gt;</span></span>
<span id="cb7-7"><a href="#cb7-7" tabindex="-1"></a>    <span class="dt">&lt;/</span><span class="kw">div</span><span class="dt">&gt;</span></span>
<span id="cb7-8"><a href="#cb7-8" tabindex="-1"></a>  <span class="dt">&lt;/</span><span class="kw">footer</span><span class="dt">&gt;</span></span>
<span id="cb7-9"><a href="#cb7-9" tabindex="-1"></a><span class="dt">&lt;/</span><span class="kw">body</span><span class="dt">&gt;</span></span>
<span id="cb7-10"><a href="#cb7-10" tabindex="-1"></a><span class="dt">&lt;/</span><span class="kw">html</span><span class="dt">&gt;</span></span></code></pre></div>
            <p>And that is the footer…</p>
            <h1 id="making-the-pages">Making the Pages</h1>
            <p>OK, brace yourself, I use GNU make to build the site and
            it is not pretty.</p>
            <div class="sourceCode" id="cb8"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb8-1"><a href="#cb8-1" tabindex="-1"></a><span class="ex">CSS</span> <span class="pp">?</span>= css</span>
<span id="cb8-2"><a href="#cb8-2" tabindex="-1"></a></span>
<span id="cb8-3"><a href="#cb8-3" tabindex="-1"></a><span class="ex">inputs</span> = <span class="va">$(</span><span class="ex">wildcard</span> src/<span class="pp">*</span>.md<span class="va">)</span></span>
<span id="cb8-4"><a href="#cb8-4" tabindex="-1"></a><span class="ex">pages</span> = <span class="va">$(</span><span class="ex">subst</span> src,public,<span class="va">$(</span><span class="ex">inputs:.md=.html</span><span class="va">))</span></span>
<span id="cb8-5"><a href="#cb8-5" tabindex="-1"></a><span class="ex">svgs</span> = <span class="va">$(</span><span class="ex">subst</span> src,public,<span class="va">$(</span><span class="ex">wildcard</span> src/<span class="pp">*</span>.svg<span class="va">))</span></span>
<span id="cb8-6"><a href="#cb8-6" tabindex="-1"></a><span class="ex">pngs</span> = <span class="va">$(</span><span class="ex">subst</span> src,public,<span class="va">$(</span><span class="ex">wildcard</span> src/<span class="pp">*</span>.png<span class="va">))</span></span>
<span id="cb8-7"><a href="#cb8-7" tabindex="-1"></a><span class="ex">imgs</span> = <span class="va">$(</span><span class="ex">svgs</span><span class="va">)</span> <span class="va">$(</span><span class="ex">pngs</span><span class="va">)</span></span>
<span id="cb8-8"><a href="#cb8-8" tabindex="-1"></a></span>
<span id="cb8-9"><a href="#cb8-9" tabindex="-1"></a><span class="ex">all:</span> <span class="va">$(</span><span class="ex">pages</span><span class="va">)</span> <span class="va">$(</span><span class="ex">imgs</span><span class="va">)</span></span>
<span id="cb8-10"><a href="#cb8-10" tabindex="-1"></a></span>
<span id="cb8-11"><a href="#cb8-11" tabindex="-1"></a><span class="va">$(</span><span class="ex">svgs</span><span class="va">)</span><span class="ex">:</span> public/%.svg: src/%.svg</span>
<span id="cb8-12"><a href="#cb8-12" tabindex="-1"></a>    <span class="fu">cp</span> $<span class="op">&lt;</span> <span class="va">$@</span></span>
<span id="cb8-13"><a href="#cb8-13" tabindex="-1"></a><span class="va">$(</span><span class="ex">pngs</span><span class="va">)</span><span class="ex">:</span> public/%.png: src/%.png</span>
<span id="cb8-14"><a href="#cb8-14" tabindex="-1"></a>    <span class="fu">cp</span> $<span class="op">&lt;</span> <span class="va">$@</span></span>
<span id="cb8-15"><a href="#cb8-15" tabindex="-1"></a><span class="va">$(</span><span class="ex">pages</span><span class="va">)</span><span class="ex">:</span> src/std.tmpl</span>
<span id="cb8-16"><a href="#cb8-16" tabindex="-1"></a><span class="va">$(</span><span class="ex">pages</span><span class="va">)</span><span class="ex">:</span> public/%.html: src/%.md</span>
<span id="cb8-17"><a href="#cb8-17" tabindex="-1"></a>    <span class="ex">pandoc</span> <span class="at">-s</span> <span class="at">--css</span><span class="op">=</span><span class="va">$(</span><span class="ex">CSS</span><span class="va">)</span> <span class="at">--template</span><span class="op">=</span>src/std.tmpl <span class="at">--toc</span> <span class="at">--toc-depth</span> 1 $<span class="op">&lt;</span> <span class="op">&gt;</span> <span class="va">$@</span></span>
<span id="cb8-18"><a href="#cb8-18" tabindex="-1"></a></span>
<span id="cb8-19"><a href="#cb8-19" tabindex="-1"></a><span class="ex">public/css:</span></span>
<span id="cb8-20"><a href="#cb8-20" tabindex="-1"></a>    <span class="fu">cp</span> res/bulma-0.9.0/css/bulma.css public/css</span>
<span id="cb8-21"><a href="#cb8-21" tabindex="-1"></a></span>
<span id="cb8-22"><a href="#cb8-22" tabindex="-1"></a><span class="ex">clean:</span></span>
<span id="cb8-23"><a href="#cb8-23" tabindex="-1"></a>    <span class="fu">rm</span> <span class="at">-f</span> <span class="va">$(</span><span class="ex">pages</span><span class="va">)</span></span></code></pre></div>
            <p>This will only rebuild files when they change, although
            everything will be rebuilt if the template changes. If you
            drop an <code>.md</code> file in <code>src/</code> it will
            be built automatically. Make has the following
            advantages:</p>
            <ol style="list-style-type: decimal">
            <li>It is available everywhere</li>
            <li>It never changes</li>
            <li>I have a vague understanding of how it works</li>
            </ol>
            <p>I won’t pretend I’m sure this is how the Makefile should
            be written, in fact I’m not sure anyone <em>really</em>
            knows, but it works. Finally we want to run this on GitLab
            CI.</p>
            <div class="sourceCode" id="cb9"><pre
            class="sourceCode yml"><code class="sourceCode yaml"><span id="cb9-1"><a href="#cb9-1" tabindex="-1"></a><span class="fu">image</span><span class="kw">:</span><span class="at"> pandoc/core:latest</span></span>
<span id="cb9-2"><a href="#cb9-2" tabindex="-1"></a></span>
<span id="cb9-3"><a href="#cb9-3" tabindex="-1"></a><span class="fu">pages</span><span class="kw">:</span></span>
<span id="cb9-4"><a href="#cb9-4" tabindex="-1"></a><span class="at">  </span><span class="fu">stage</span><span class="kw">:</span><span class="at"> deploy</span></span>
<span id="cb9-5"><a href="#cb9-5" tabindex="-1"></a><span class="at">  </span><span class="fu">script</span><span class="kw">:</span></span>
<span id="cb9-6"><a href="#cb9-6" tabindex="-1"></a><span class="at">    </span><span class="kw">-</span><span class="at"> apk add make</span></span>
<span id="cb9-7"><a href="#cb9-7" tabindex="-1"></a><span class="at">    </span><span class="kw">-</span><span class="at"> mkdir public</span></span>
<span id="cb9-8"><a href="#cb9-8" tabindex="-1"></a><span class="at">    </span><span class="kw">-</span><span class="at"> export CSS=https://cdn.jsdelivr.net/npm/bulma@0.9.0/css/bulma.min.css</span></span>
<span id="cb9-9"><a href="#cb9-9" tabindex="-1"></a><span class="at">    </span><span class="kw">-</span><span class="at"> make</span></span>
<span id="cb9-10"><a href="#cb9-10" tabindex="-1"></a><span class="at">  </span><span class="fu">artifacts</span><span class="kw">:</span></span>
<span id="cb9-11"><a href="#cb9-11" tabindex="-1"></a><span class="at">    </span><span class="fu">paths</span><span class="kw">:</span></span>
<span id="cb9-12"><a href="#cb9-12" tabindex="-1"></a><span class="at">      </span><span class="kw">-</span><span class="at"> public</span></span>
<span id="cb9-13"><a href="#cb9-13" tabindex="-1"></a><span class="at">  </span><span class="fu">only</span><span class="kw">:</span></span>
<span id="cb9-14"><a href="#cb9-14" tabindex="-1"></a><span class="at">    </span><span class="kw">-</span><span class="at"> master</span></span></code></pre></div>
            <p>This is the <code>.gitlab-ci.yml</code> file, which
            more-or-less just calls Make after installing it. The
            <strong>Pandoc</strong> image is specified so we don’t need
            to install that. I <em>could</em> create my own Docker image
            based on <strong>Pandoc</strong>’s image, but with Make
            included. I could do that…</p>
            <h1 id="fancy-links">Fancy links</h1>
            <p>You know how links to most websites display as a fancy
            box or picture on Twitter? Well those are “Twitter Cards”
            and require Twitter Card tags, or alternatively the Open
            Graph Protocol tags invented by Facebook. I assume the
            latter work on more websites. I added both.</p>
            <p>For this we need to start using YAML metadata in our
            markdown.</p>
            <div class="sourceCode" id="cb10"><pre
            class="sourceCode yaml"><code class="sourceCode yaml"><span id="cb10-1"><a href="#cb10-1" tabindex="-1"></a><span class="pp">---</span></span>
<span id="cb10-2"><a href="#cb10-2" tabindex="-1"></a><span class="fu">title</span><span class="kw">:</span><span class="at"> Creating a static site with Pandoc and Bulma</span></span>
<span id="cb10-3"><a href="#cb10-3" tabindex="-1"></a><span class="fu">description</span><span class="kw">:</span><span class="at"> Duck tape edition of static web site generation</span></span>
<span id="cb10-4"><a href="#cb10-4" tabindex="-1"></a><span class="pp">---</span></span></code></pre></div>
            <p>This goes at the top of each Markdown file. It replaces
            the <code>%</code> line(s), which only allow title, author
            and date metadata. To use YAML we add
            <code>--from=markdown+yaml_metadata_block</code> to the
            Pandoc arguments (recent Pandoc versions enable this
            extension for Markdown by default, so the flag mainly makes
            the intent explicit).</p>
            <p>The head element of the HTML template has the following
            added to it.</p>
            <div class="sourceCode" id="cb11"><pre
            class="sourceCode html"><code class="sourceCode html"><span id="cb11-1"><a href="#cb11-1" tabindex="-1"></a><span class="dt">&lt;</span><span class="kw">head</span><span class="ot"> prefix</span><span class="op">=</span><span class="st">&quot;og: https://ogp.me/ns#&quot;</span><span class="dt">&gt;</span></span>
<span id="cb11-2"><a href="#cb11-2" tabindex="-1"></a>  ...</span>
<span id="cb11-3"><a href="#cb11-3" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> property</span><span class="op">=</span><span class="st">&quot;twitter:card&quot;</span><span class="ot"> content</span><span class="op">=</span><span class="st">&quot;summary&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb11-4"><a href="#cb11-4" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> property</span><span class="op">=</span><span class="st">&quot;twitter:site&quot;</span><span class="ot"> content</span><span class="op">=</span><span class="st">&quot;@jichiep&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb11-5"><a href="#cb11-5" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> property</span><span class="op">=</span><span class="st">&quot;twitter:creator&quot;</span><span class="ot"> content</span><span class="op">=</span><span class="st">&quot;@jichiep&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb11-6"><a href="#cb11-6" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> property</span><span class="op">=</span><span class="st">&quot;og:type&quot;</span><span class="ot"> content</span><span class="op">=</span><span class="st">&quot;website&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb11-7"><a href="#cb11-7" tabindex="-1"></a>$if(title)$</span>
<span id="cb11-8"><a href="#cb11-8" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> property</span><span class="op">=</span><span class="st">&quot;og:title&quot;</span><span class="ot"> content</span><span class="op">=</span><span class="st">&quot;$title$&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb11-9"><a href="#cb11-9" tabindex="-1"></a>$endif$</span>
<span id="cb11-10"><a href="#cb11-10" tabindex="-1"></a>$if(description)$</span>
<span id="cb11-11"><a href="#cb11-11" tabindex="-1"></a>  <span class="dt">&lt;</span><span class="kw">meta</span><span class="ot"> property</span><span class="op">=</span><span class="st">&quot;og:description&quot;</span><span class="ot"> content</span><span class="op">=</span><span class="st">&quot;$description$&quot;</span><span class="ot"> </span><span class="dt">/&gt;</span></span>
<span id="cb11-12"><a href="#cb11-12" tabindex="-1"></a>$endif$</span>
<span id="cb11-13"><a href="#cb11-13" tabindex="-1"></a>  ...</span></code></pre></div>
            <p>There is more you can add of course, including an
            image. Images are always better than just text.</p>
            <h1 id="dev-server">Dev server</h1>
            <p>Originally I was using
            <code>python3.9 -m http.server 8000</code> from the public
            folder to serve my website locally. That’s not fun though is
            it? Instead I have now written a <a
            href="/https/richiejp.com/linux-socket-example#tcp-http">minimal HTTP static
            file server</a>.</p>
            <p>If you have GCC installed then you can build and run this
            with</p>
            <div class="sourceCode" id="cb12"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb12-1"><a href="#cb12-1" tabindex="-1"></a><span class="ex">$</span> mkdir build</span>
<span id="cb12-2"><a href="#cb12-2" tabindex="-1"></a><span class="ex">$</span> make build/self-serve</span>
<span id="cb12-3"><a href="#cb12-3" tabindex="-1"></a><span class="ex">$</span> build/self-serve public</span></code></pre></div>
            <p>Then point your browser to
            <code>localhost:9000</code>.</p>
            <h1 id="todo">TODO</h1>
            <ul>
            <li><p><strong>Pandoc</strong> is fast (enough), so I
            <em>could</em> write an inotify script to monitor the
            directory for changes and rebuild/redisplay automatically on
            save.</p></li>
            <li><p>It might be best to generate parts of the <a
            href="/">Index</a>. This could be done with the
            <strong>Pandoc</strong> JSON filter, that is, outputting the
            AST, injecting some elements into it (most likely with
            <strong>shell</strong> and <strong>jq</strong>) and passing
            it back to <strong>Pandoc</strong>.</p></li>
            <li><p>I’m not entirely sure the header is fully correct on
            mobile. It seems like the wrong parts disappear. The TOC is
            a hack and it shows.</p></li>
            </ul>
            <div class="footnotes footnotes-end-of-document">
            <hr />
            <ol>
            <li id="fn1"><p>Because I realised I was learning the tools
            needed to do it anyway<a href="#fnref1"
            class="footnote-back">↩︎</a></p></li>
            <li id="fn2"><p>I have since learned Next.js and SvelteKit.
            There is a plugin to write Markdown in Svelte files, so it
            will beat the pants off Pandoc in a straight up fight.
            However, you have to be brave enough to run ‘npm’.<a
            href="#fnref2" class="footnote-back">↩︎</a></p></li>
            <li id="fn3"><p>I’m not so popular in some circles for that
            :-)<a href="#fnref3" class="footnote-back">↩︎</a></p></li>
            </ol>
            </div>
    </div>
  </content>
</entry>
<entry>
  <title>100 pull-ups and 100 dips per day challenge</title>
  <id>https://richiejp.com/pull-ups-and-dips-challenge</id>
  <published>2020-07-23T10:36:53+01:00</published>
  <updated>2023-05-07T13:52:23+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/pull-ups-and-dips-challenge" />
  <summary>Attempting to get to 100 reps per day</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p><em>If you got here from my index page, then please don’t
            be alarmed, but this is not an article about software
            development.</em></p>
            <div class="float">
            <img src="onedipman.jpg" alt="One hundred dips man" />
            <div class="figcaption">One hundred dips man</div>
            </div>
            <p>Guru Anaerobic, whose book, <em>Gang Fit</em>, I have
            been reading, said that he did 6x10 pull-ups, 4x10 of
            something else (curls?) and 10x10 dips per day, every day,
            for a month. This resulted in a serious improvement to
            strength and muscle mass.</p>
            <div class="message is-warning">
            <div class="message-body">
            <p>Update!</p>
            <p>For reasons I don’t understand this article gets some
            traffic. So I have to say that the above person, along with
            similar individuals, has a track record of encouraging
            people to push on until they are seriously injured.</p>
            <p>This is just an anecdotal report of something I tried. I
            was in reasonable shape when I started it.</p>
            </div>
            </div>
            <p>From everything I have read and experienced over the last
            2 years, this kind of thing shouldn’t work. If nothing else,
            it should be suboptimal, resulting in over-training and
            injury. So I decided to try it.</p>
            <p>My version of the challenge is to do 100 pull ups and 100
            dips per day for a month (or until my wife gives birth). The
            only problem is that I can’t do that many dips (unless I
            spread it over a few hours), never mind pull ups. So I did
            as many full pull ups and dips as I could (in sets of 10
            reps ideally) until failure. Then moved to a secondary
            exercise.</p>
            <p>For dips, which I can do more of, I switch out to diamond
            press ups, then when I can’t do any more of those, to regular
            press ups, then when that fails I put my knees down, until I
            am basically just flopping around on the floor like a fish
            out of water.</p>
            <p>For pull ups I started switching out to various types of
            curl, but I hate curls, so eventually replaced these with
            bent over rows which I find more satisfying.</p>
            <h1 id="log">Log</h1>
            <p>What follows is a very rough log of the exercise. I
            didn’t record my rest periods or when I only did 2 sets of 5
            reps instead of 1 set of 10 reps for pull ups. Also I did
            all of the pull ups on my door frame on some days and some
            on the monkey bars or football goal on others. It is much
            easier on the monkey frame, which doesn’t need a climber
            style finger grip to stop it from digging into my fingers.</p>
            <p>So this comes down quite a lot to perception, but the
            numbers maybe help a bit. Below is the approximate number
            of sets of 10 reps I did.</p>
            <table>
            <thead>
            <tr class="header">
            <th>Day</th>
            <th align="left">Pull-ups</th>
            <th align="left">Rows</th>
            <th align="left">Dips</th>
            <th>Press-ups</th>
            <th>Notes</th>
            </tr>
            </thead>
            <tbody>
            <tr class="odd">
            <td>1</td>
            <td align="left">3</td>
            <td align="left">6</td>
            <td align="left">5</td>
            <td>3</td>
            <td>Didn’t actually do rows, just curls</td>
            </tr>
            <tr class="even">
            <td>2</td>
            <td align="left">2</td>
            <td align="left">6</td>
            <td align="left">5</td>
            <td>5</td>
            <td>Experimenting with secondary exercises, pulled back
            muscle</td>
            </tr>
            <tr class="odd">
            <td>3</td>
            <td align="left">3</td>
            <td align="left">7</td>
            <td align="left">5</td>
            <td>5</td>
            <td>Also did some hill sprints, used park frame, slept 12
            hours that night</td>
            </tr>
            <tr class="even">
            <td>4</td>
            <td align="left">1</td>
            <td align="left">9</td>
            <td align="left">8</td>
            <td>2</td>
            <td>Sudden improvement in dips (also did them with less rest
            between sets!), switched fully to rows, bicep hurts</td>
            </tr>
            <tr class="odd">
            <td>5</td>
            <td align="left">4</td>
            <td align="left">6</td>
            <td align="left">7</td>
            <td>3</td>
            <td>Fatigued, fewer dips, but cleaner and less rest between
            sets, also did a few sprints, visible increase in muscle
            mass</td>
            </tr>
            <tr class="even">
            <td>6</td>
            <td align="left">7</td>
            <td align="left">3</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Maybe more rest between sets</td>
            </tr>
            <tr class="odd">
            <td>7</td>
            <td align="left">4</td>
            <td align="left">6</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Forearms hurting in a novel way. Wife complains that I
            have ketone breath despite the fact I have been lax with
            intermittent fasting and carb intake.</td>
            </tr>
            <tr class="even">
            <td>8</td>
            <td align="left">6</td>
            <td align="left">4</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Used park frame for pull-ups and did some sprints too.
            Did 2x20 dip sets and 6x10. Still had to do some of the
            pull-ups as 2 sets of 5, but feel like there is some
            improvement.</td>
            </tr>
            <tr class="odd">
            <td>9</td>
            <td align="left">5</td>
            <td align="left">5</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Increased row weight a bit, helps get the rows to a more
            similar difficulty to the pull-ups. Feel far less like I am
            over-training.</td>
            </tr>
            <tr class="even">
            <td>10</td>
            <td align="left">6</td>
            <td align="left">4</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Not the cleanest sets, but less rest time.</td>
            </tr>
            <tr class="odd">
            <td>11</td>
            <td align="left">5</td>
            <td align="left">5</td>
            <td align="left">10</td>
            <td>0</td>
            <td>I feel like my pull-ups are increasing in range and the
            reps are better quality.</td>
            </tr>
            <tr class="even">
            <td>12</td>
            <td align="left">5</td>
            <td align="left">5</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Also did sprints, quite tired.</td>
            </tr>
            <tr class="odd">
            <td>13</td>
            <td align="left">7</td>
            <td align="left">3</td>
            <td align="left">10</td>
            <td>0</td>
            <td></td>
            </tr>
            <tr class="even">
            <td>14</td>
            <td align="left">5</td>
            <td align="left">5</td>
            <td align="left">10</td>
            <td>0</td>
            <td></td>
            </tr>
            <tr class="odd">
            <td>15</td>
            <td align="left">3</td>
            <td align="left">7</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Also did sprints and jogging. It is hard to tell if I am
            going backwards or forwards with pull-ups.</td>
            </tr>
            <tr class="even">
            <td>16</td>
            <td align="left">2</td>
            <td align="left">8</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Hung over, bit of a shambles</td>
            </tr>
            <tr class="odd">
            <td>17</td>
            <td align="left">5</td>
            <td align="left">5</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Sprints again and some extras. Had some pretensions
            about doing a 48 hour fast, but the thought of going to bed
            without eating was too much this time.</td>
            </tr>
            <tr class="even">
            <td>18</td>
            <td align="left">1</td>
            <td align="left">9</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Woke up early so went for a run as well. Starting to
            feel fatigue again. Need to back off a bit, wife is near her
            due date. Did another 20 hour fast.</td>
            </tr>
            <tr class="odd">
            <td>19</td>
            <td align="left">1</td>
            <td align="left">9</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Ate lunch today and still backing off pull-ups, want to
            try and recover a bit.</td>
            </tr>
            <tr class="even">
            <td>20</td>
            <td align="left">2</td>
            <td align="left">8</td>
            <td align="left">10</td>
            <td>0</td>
            <td>Woke up too early and couldn’t go back to sleep, feel
            pretty knackered. 20 is a nice round number so I think I’m
            done!</td>
            </tr>
            </tbody>
            </table>
            <h1 id="injury-fatigue-and-performance">Injury, fatigue and
            performance</h1>
            <p>Pull-ups are always a high risk move for me. Usually when
            I am trying to eke out one last rep is when something goes
            in my back and this happened on day two. However I carried
            on; it hurts when I sneeze, but somehow not so much during
            actual exercise. Conspicuous by its absence is the tendon
            strain I used to get in my arms when rock climbing. I
            suppose that plain two-handed pull-ups are putting less
            strain on my joints compared to many climbing moves where
            you often have most of your weight on one outstretched
            arm.</p>
            <p>Fatigue is also present, but disappears during actual
            exercise. The effects of <em>overtraining</em> are clearly
            present, but qualitatively my performance improved in a
            noticeable way by day 4, at least for dips; pull-ups were
            harder to judge. Around day 8-9 I noticed that the feeling
            of fatigue was significantly reduced along with aches and
            pains.</p>
            <p>Towards the end I was in a bit of a state some days. This
            could have been because some sessions were just harder due
            to extra bits I threw in. However it also seemed to be
            building up. I could have continued, but it seemed like the
            wrong time to be pushing myself physically.</p>
            <h1 id="theorizing">“Theorizing”</h1>
            <p>Thinking about how our ancestors might have lived, the
            fact this works makes sense to me. When times were very good
            or very bad, they may have had to hunt or fight every day
            for a prolonged period. There is a fair amount of material
            suggesting just 1-2 hours of exercise a week, with plenty of
            rest in between sessions, is optimal. However I think in the
            wild it is likely you would get days or weeks of intense
            activity followed by a period of rest.</p>
            <p>When large animals were passing by, eating grass and
            farting, hunters would have to track them for hours at a
            moderate pace, followed by an intense burst of sprinting and
            spear throwing. Then they would have to haul anything they
            couldn’t eat on the spot back to camp, home or at least away
            from any large cats or dogs. If it was too heavy to move,
            then maybe they would have to fight off anything which also
            wanted to eat it.</p>
            <p>Of course this is just one possible scenario of many. It
            is just a thought exercise. I think it is reasonable to
            think about such things because it gives you some grounding
            outside of circular health and fitness metrics. The optimal
            diet and exercise for health and fitness depends on your
            environment and your environment provides your diet and
            exercise.</p>
            <p><em>Assuming</em> that some combination of habitats from
            the bulk of our evolution are optimal (excluding the
            occasional extreme stresses which result in death) is
            reasonable to avoid epistemological issues. If you accept
            that we <em>evolved</em> in a given set of environments,
            then we probably perform best in a permutation of those. It
            is not guaranteed by evolution and our evolutionary
            environment is non-extant. However it provides a counter to
            arbitrarily chosen metrics defining health and fitness.</p>
    </div>
  </content>
</entry>
<entry>
  <title>libactors; Actor model and message passing in C with Userland
RCU</title>
  <id>https://richiejp.com/rcu-actors</id>
  <published>2020-09-19T15:54:13+01:00</published>
  <updated>2022-01-20T14:21:02Z</updated>
  <link rel="alternate" href="https://richiejp.com/rcu-actors" />
  <summary>Using liburcu (read-copy-update, lock-free) to create an actor
library in C</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>I recently came across liburcu, which is a userland
            implementation of the Linux kernel’s RCU (read-copy-update)
            synchronisation mechanism and a bunch of useful concurrent
            data structures. This struck me as something I could quickly
            use to build <a
            href="https://gitlab.com/Palethorpe/libactors">yet-another
            actor model library</a> and use in a new <a
            href="https://gitlab.com/Palethorpe/ltp-executor">concurrent
            Linux kernel test executor</a>. Indeed I think that for a C
            project I was able to do quite a lot very quickly. Below are
            some reflections on this.</p>
            <h1 id="read-copy-update">Read-Copy-Update</h1>
            <p>Linux’s RCU strikes me as a typical example of something
            which at first seems magical and difficult to use, but
            actually is simpler than many alternatives.</p>
            <p>What initially confused me about RCU and its API usage
            (and maybe still does confuse me) is that RCU-enabled data
            structures often only require a <em>read lock</em> to
            perform an update (that is, a write). For example, a hash
            map may only require a read lock to add an entry. However
            this is because the hash map data structure is completely
            <em>lockless</em>, meaning that it can be mutated
            concurrently without requiring a lock to be taken<a
            href="#fn1" class="footnote-ref"
            id="fnref1"><sup>1</sup></a>; it is in fact the value of the
            entry which is protected by the RCU read lock<a href="#fn2"
            class="footnote-ref" id="fnref2"><sup>2</sup></a>.</p>
            <p>Of course that doesn’t have to be the case and a data
            structure can use a lock and RCU to give updaters exclusive
            access or some other combination thereof. More to the point
            though, the end result of a lockless data structure plus RCU
            combo is often simpler than the equivalent with mutexes, or
            whatever, because taking an RCU read lock or performing a
            synchronisation are just three calls with no function
            parameters.</p>
            <p>Take a look at the following pseudo C which deletes a
            value from a concurrently accessed hash map. The details are
            not too important, just look for the 3 functions with RCU in
            the name.</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="dt">static</span> <span class="kw">struct</span> cds_lfht <span class="op">*</span>book<span class="op">;</span></span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a></span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a></span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a>addr_t addr <span class="op">=</span> <span class="co">/* some address */</span></span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a><span class="kw">struct</span> addr_entry <span class="op">*</span>entry<span class="op">;</span></span>
<span id="cb1-7"><a href="#cb1-7" tabindex="-1"></a><span class="kw">struct</span> cds_lfht_iter iter<span class="op">;</span></span>
<span id="cb1-8"><a href="#cb1-8" tabindex="-1"></a><span class="kw">struct</span> cds_lfht_node <span class="op">*</span>node<span class="op">;</span></span>
<span id="cb1-9"><a href="#cb1-9" tabindex="-1"></a><span class="dt">int</span> ret<span class="op">;</span></span>
<span id="cb1-10"><a href="#cb1-10" tabindex="-1"></a></span>
<span id="cb1-11"><a href="#cb1-11" tabindex="-1"></a><span class="co">/* Start a read critical section */</span></span>
<span id="cb1-12"><a href="#cb1-12" tabindex="-1"></a>rcu_read_lock<span class="op">();</span></span>
<span id="cb1-13"><a href="#cb1-13" tabindex="-1"></a></span>
<span id="cb1-14"><a href="#cb1-14" tabindex="-1"></a><span class="co">/* Lookup the addr in a hash map &#39;book&#39; (we assume it still exists) */</span></span>
<span id="cb1-15"><a href="#cb1-15" tabindex="-1"></a>cds_lfht_lookup<span class="op">(</span>book<span class="op">,</span> addr<span class="op">,</span> addr_entry_match<span class="op">,</span> <span class="op">&amp;</span>addr<span class="op">,</span> <span class="op">&amp;</span>iter<span class="op">);</span></span>
<span id="cb1-16"><a href="#cb1-16" tabindex="-1"></a>node <span class="op">=</span> cds_lfht_iter_get_node<span class="op">(&amp;</span>iter<span class="op">);</span></span>
<span id="cb1-17"><a href="#cb1-17" tabindex="-1"></a>entry <span class="op">=</span> caa_container_of<span class="op">(</span>node<span class="op">,</span> <span class="kw">struct</span> addr_entry<span class="op">,</span> node<span class="op">);</span></span>
<span id="cb1-18"><a href="#cb1-18" tabindex="-1"></a></span>
<span id="cb1-19"><a href="#cb1-19" tabindex="-1"></a><span class="co">/* Remove the entry from the hash map, but don&#39;t delete the value yet */</span></span>
<span id="cb1-20"><a href="#cb1-20" tabindex="-1"></a>ret <span class="op">=</span> cds_lfht_del<span class="op">(</span>book<span class="op">,</span> <span class="op">&amp;</span>entry<span class="op">-&gt;</span>node<span class="op">);</span></span>
<span id="cb1-21"><a href="#cb1-21" tabindex="-1"></a></span>
<span id="cb1-22"><a href="#cb1-22" tabindex="-1"></a>rcu_read_unlock<span class="op">();</span></span>
<span id="cb1-23"><a href="#cb1-23" tabindex="-1"></a></span>
<span id="cb1-24"><a href="#cb1-24" tabindex="-1"></a><span class="co">/* Delete the value after all existing readers have finished their read critical section */</span></span>
<span id="cb1-25"><a href="#cb1-25" tabindex="-1"></a><span class="cf">if</span> <span class="op">(!</span>ret<span class="op">)</span> <span class="op">{</span></span>
<span id="cb1-26"><a href="#cb1-26" tabindex="-1"></a>    synchronize_rcu<span class="op">();</span></span>
<span id="cb1-27"><a href="#cb1-27" tabindex="-1"></a>    free<span class="op">(</span>entry<span class="op">);</span></span>
<span id="cb1-28"><a href="#cb1-28" tabindex="-1"></a><span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="op">(</span>ret <span class="op">==</span> <span class="op">-</span>ENOENT<span class="op">)</span> <span class="op">{</span></span>
<span id="cb1-29"><a href="#cb1-29" tabindex="-1"></a>    <span class="co">// someone else removed it...</span></span>
<span id="cb1-30"><a href="#cb1-30" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>The entry is removed from the hash map between
            <code>rcu_read_lock()</code> and
            <code>rcu_read_unlock()</code>, then later deleted after
            <code>synchronize_rcu()</code>. No need to acquire a
            particular mutex or anything like that, at least not in this
            case. There is more to the RCU API, but I didn’t use any of
            it in this case.</p>
            <p>So I like RCU; not least because of its API and also it
            has some fancy performance characteristics. Once you get the
            interaction between read sections and synchronisation, I
            actually think it is preferable to some other
            synchronisation mechanisms which are better understood in
            general.</p>
            <p>Speaking of getting it, RCU consists of two things: read
            sections and synchronisation points. When we enter a
            synchronisation point, we wait until all <em>current</em>
            read sections exit. However we do not wait for any
            <em>new</em> read sections which are entered while
            synchronising. Furthermore synchronisation points do not
            block read sections.</p>
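            <p>The rule about which read sections a synchronisation
            point waits for can be sketched as a toy, single-threaded
            model. This is only an illustration under loud assumptions:
            every <code>toy_</code> name is made up, nothing here is
            part of liburcu, and a real implementation tracks read
            sections per thread rather than with a global counter.</p>
            <pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one reader; not a liburcu type. */
struct toy_reader {
    bool in_section;        /* currently inside a read section? */
    unsigned long entered;  /* global tick at which it entered */
};

/* Global logical clock, bumped on every event we care about. */
static unsigned long toy_tick;

static void toy_read_lock(struct toy_reader *r)
{
    assert(!r->in_section);
    r->in_section = true;
    r->entered = ++toy_tick;
}

static void toy_read_unlock(struct toy_reader *r)
{
    r->in_section = false;
}

/* Record the moment a synchronisation point begins. */
static unsigned long toy_sync_begin(void)
{
    return ++toy_tick;
}

/* Count the read sections that a synchronisation point which began at
 * sync_start still has to wait for: only those which entered before it
 * began and have not exited yet. Sections entered later are ignored. */
static size_t toy_sync_pending(const struct toy_reader *rs, size_t n,
                               unsigned long sync_start)
{
    size_t pending = 0;

    for (size_t i = 0; i < n; i++) {
        if (rs[i].in_section && rs[i].entered < sync_start)
            pending++;
    }
    return pending;
}
```
]]></code></pre>
            <p>If one reader enters before <code>toy_sync_begin()</code>
            and a second enters after it, only the first is counted as
            pending, and the count drops to zero as soon as that first
            reader exits, whatever the second one does.</p>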
            <p>A big impediment to understanding RCU is seeing how
            waiting only for current read sections is ever sufficient to
            ensure exclusive access to some data. Hopefully looking at
            the above code we can see why it is in this particular
            case.</p>
            <p>First we enter a read section where we fetch and remove a
            node from the <em>lock free</em> hash map. Any read sections
            which started before or during ours may still have a copy of
            the entry pointer, however any which begin after we call the
            delete function are guaranteed not to obtain a copy of the
            entry.</p>
            <p>By the time we call <code>synchronize_rcu()</code> the
            only read sections which may have access to the entry are
            ones which began before we finished our read section. We
            will wait for these to finish, after which we are safe to
            free the entry.</p>
            <p>Hopefully in this particular case you can see why it is
            only necessary to wait for existing ‘readers’. In general it
            is sufficient because one first modifies a pointer
            atomically, removing access to some memory, then waits for
            all readers that <em>may</em> have access to that memory to
            finish. New readers won’t have access after the atomic
            update, so there is no need to block them.</p>
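            <p>The general pattern, first remove access atomically and
            only then wait, can be sketched with plain C11 atomics. This
            is a minimal sketch and not how liburcu implements it:
            <code>struct config</code>, <code>shared_slot</code> and the
            <code>slot_</code> functions are hypothetical names, and the
            waiting step is left to the caller (with liburcu that would
            be <code>synchronize_rcu()</code>).</p>
            <pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload; stands in for any value reachable through a
 * pointer stored in a shared data structure. */
struct config {
    int value;
};

/* Hypothetical shared pointer; static storage starts out as NULL. */
static _Atomic(struct config *) shared_slot;

/* Reader side: take a snapshot of the pointer. In real RCU code this
 * load would happen inside a read section. */
static struct config *slot_read(void)
{
    return atomic_load(&shared_slot);
}

/* Make a value reachable to readers that load the pointer from now on. */
static void slot_publish(struct config *c)
{
    assert(c != NULL);
    atomic_store(&shared_slot, c);
}

/* Updater side, step one: atomically remove access. The old pointer is
 * returned so the caller can free it, but only after waiting for every
 * reader that may have loaded it before the exchange. */
static struct config *slot_unpublish(void)
{
    return atomic_exchange(&shared_slot, (struct config *)NULL);
}
```
]]></code></pre>
            <p>A reader which called <code>slot_read()</code> before
            <code>slot_unpublish()</code> may still hold the old
            pointer, which is why the wait is needed. Any reader calling
            <code>slot_read()</code> afterwards gets <code>NULL</code>,
            which is why new readers do not have to be blocked.</p>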
            <p>Another strange thing about RCU is that
            <code>synchronize_rcu()</code> waits for all read sections
            on the system, including ones which have no relevance to the
            current operation. Not only is this fine performance wise,
            but it is actually very efficient at least when implemented
            in kernel space. The user space implementations are maybe
            not as efficient, but I assume they are still very good.
            From a user point of view, this is nice because we don’t need
            to worry about which lock to take, however it is also
            confusing.</p>
            <h1 id="userland-rcu">Userland RCU</h1>
            <p>While I do like RCU and that is why I was looking at
            liburcu, it is not really why I am using it. It’s more
            because it provides some nice concurrent data structures and
            some other libs which are missing from the C standard
            library.</p>
            <p>Most languages have a hash map, vector and similar in
            their base library. This is not always a good thing, but in
            C it is completely missing and so every project has to go
            looking for these things.</p>
            <p>liburcu provides a lot of standard data structures in one
            place along with much other boilerplate. I’m not completely
            sure the APIs it provides are the best, perhaps they are a
            bit intrusive, but it is difficult to make things much
            better in C.</p>
            <h1 id="actors">Actors</h1>
            <p>Message passing and the Actor model make for a nice way
            of doing concurrency and data processing. It also easily
            leads to nondeterministic chaos as the state of different
            Actors and message orders interact, but this is maybe
            something which can be mitigated.</p>
            <p>I think most people are introduced to the Actor model
            informally through Erlang or some Actor library like Akka,
            but there is also a rigorous definition put forward at the
            dawn of modern computing. The formalisms in <a
            href="https://www.amazon.de/dp/026251141X/ref=sr_1_2?keywords=actor+model&amp;qid=1571479272&amp;sr=8-2">Actors:
            A Model of Concurrent Computation in Distributed Systems</a>
            are interesting not least for how much they differ from the
            practical adoption of <em>Actors</em>.</p>
            <p>I previously created an <a
            href="https://gitlab.com/Palethorpe/Actors.jl">Actors
            library in Julia</a> and now <a
            href="https://gitlab.com/Palethorpe/libactors">libactors</a>.
            These are both quite far from the formal definition of the
            Actor model, especially <code>libactors</code>. I will set
            out below roughly what I mean by an actor.</p>
            <p>An Actor is an object (<code>struct</code>) with:</p>
            <ul>
            <li>An address</li>
            <li>A message box</li>
            <li>Some message handlers</li>
            <li>Some other user defined state or data</li>
            </ul>
            <p>Actors can send messages to each other, by sending a
            message to an address. They can’t access each other
            directly. This is analogous to a collection of networked
            computers; they can only communicate by message passing and
            cannot access each other’s memory directly<a
            href="#fn3" class="footnote-ref"
            id="fnref3"><sup>3</sup></a>.</p>
            <p>An Actor is able to determine if an address exists and
            send a message to it. It is able to check its own message
            box for new arrivals and update its own state/memory.
            Finally it can also start new Actors.</p>
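            <p>As a rough illustration of the list above, an actor
            could be laid out as in the following sketch. It is
            single-threaded and every name in it
            (<code>sketch_actor</code>, <code>sketch_send</code> and so
            on) is made up for this example; it is not the
            <code>libactors</code> API, which needs a concurrent queue
            and address table rather than the plain list and array used
            here.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;assert.h&gt;
#include &lt;stddef.h&gt;

#define MAX_ACTORS 16 /* arbitrary limit for this sketch */

typedef unsigned int sketch_addr;

struct sketch_msg {
    int type;
    sketch_addr from;
    struct sketch_msg *next; /* intrusive link used by the inbox */
};

struct sketch_actor {
    sketch_addr addr;                     /* address */
    struct sketch_msg *head, *tail;       /* message box (FIFO) */
    void (*hear)(struct sketch_actor *self,
                 struct sketch_msg *msg); /* message handler */
    void *priv;                           /* user defined state */
};

/* Address table: the only route from one actor to another */
static struct sketch_actor *address_table[MAX_ACTORS];

static int sketch_exists(sketch_addr addr)
{
    return addr &lt; MAX_ACTORS &amp;&amp; address_table[addr] != NULL;
}

static void sketch_register(struct sketch_actor *actor)
{
    assert(actor-&gt;addr &lt; MAX_ACTORS);
    address_table[actor-&gt;addr] = actor;
}

/* Append msg to the inbox of the actor at address `to`.
 * Returns 0 on success or -1 if no such actor exists. */
static int sketch_send(struct sketch_actor *self, sketch_addr to,
                       struct sketch_msg *msg)
{
    struct sketch_actor *dst;

    if (!sketch_exists(to))
        return -1;

    dst = address_table[to];
    msg-&gt;from = self-&gt;addr;
    msg-&gt;next = NULL;
    if (dst-&gt;tail)
        dst-&gt;tail-&gt;next = msg;
    else
        dst-&gt;head = msg;
    dst-&gt;tail = msg;
    return 0;
}

/* Remove and return the oldest message, or NULL if the inbox is empty */
static struct sketch_msg *sketch_pop(struct sketch_actor *self)
{
    struct sketch_msg *msg = self-&gt;head;

    if (!msg)
        return NULL;
    self-&gt;head = msg-&gt;next;
    if (!self-&gt;head)
        self-&gt;tail = NULL;
    return msg;
}</code></pre></div>
            <p>Because <code>sketch_send</code> looks the destination up
            by address, the sender never holds a pointer to the
            receiving actor, only to the message it hands over.</p>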
            <p>This is also somewhat analogous to Operating System
            processes, communicating with pipes or sockets, or
            microservices. Indeed an Actor library may abstract away where an
            Actor is running and what transport is used so that each
            Actor runs in its own thread, process or even computer, but
            the interface remains the same.</p>
            <p>This makes a nice fractal where concurrency at the
            smallest scale, say POSIX threads, looks similar to a large
            scale cluster of disparate machines, thanks to the Actor
            abstraction. Having said that, <code>libactors</code> would
            have to do a deep copy of message data in order to provide
            such an abstraction, which it does not, but it’s
            possible.</p>
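            <p>To make the deep copy concrete, here is a minimal sketch
            of what would be needed before a message could cross a
            process or machine boundary. The <code>wire_msg</code>
            layout and both functions are assumptions made for
            illustration; <code>libactors</code> does not do this.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;assert.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;

/* Hypothetical message whose payload is stored inline, so the whole
 * thing can be copied (or written to a socket) as one contiguous block */
struct wire_msg {
    int type;
    size_t length; /* number of payload bytes */
    char bytes[];  /* flexible array member */
};

static struct wire_msg *wire_msg_new(int type, const void *data, size_t length)
{
    struct wire_msg *msg = malloc(sizeof(*msg) + length);

    if (!msg)
        return NULL;
    msg-&gt;type = type;
    msg-&gt;length = length;
    if (length)
        memcpy(msg-&gt;bytes, data, length);
    return msg;
}

/* Deep copy: the clone shares no memory with the original, so sender
 * and receiver can no longer interfere with each other through it */
static struct wire_msg *wire_msg_clone(const struct wire_msg *src)
{
    assert(src);
    return wire_msg_new(src-&gt;type, src-&gt;bytes, src-&gt;length);
}</code></pre></div>
            <p>This only works because the payload is inline. A message
            body containing pointers would have to be serialised field
            by field instead.</p>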
            <p>Also see <a
            href="https://tutorial.ponylang.io/types/actors.html">Pony
            Lang’s</a> idea of an Actor and my description for <a
            href="https://palethorpe.gitlab.io/Actors.jl/#The-Actor-Model-1">Actors.jl</a>.</p>
            <h1 id="libactors">libactors</h1>
            <p>Not to be confused with <a
            href="https://github.com/airplug/libactor">libactor</a>; I
            have created <a
            href="https://gitlab.com/Palethorpe/libactors">libactors</a>
            and used it in the experimental <a
            href="https://gitlab.com/Palethorpe/ltp-executor">LTP
            Executor</a><a href="#fn4" class="footnote-ref"
            id="fnref4"><sup>4</sup></a>. I had surprisingly few
            problems doing this, not least thanks to Clang’s
            address sanitizer, compiling with the strictest set of
            options I can find<a href="#fn5" class="footnote-ref"
            id="fnref5"><sup>5</sup></a> and much use of function
            attributes.</p>
            <p>So far the library is fairly simple and doesn’t contain
            anything magical. An Actor definition looks something like
            the following contrived nonsense.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="kw">enum</span> static_addrs <span class="op">{</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a>    ADDR_FOO <span class="op">=</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a>    ADDR_BAR <span class="op">=</span> <span class="dv">2</span><span class="op">,</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a><span class="kw">enum</span> msg_types <span class="op">{</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a>    MSG_PING<span class="op">,</span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a>    MSG_PONG<span class="op">,</span></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a>    MSG_DATA</span>
<span id="cb2-10"><a href="#cb2-10" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb2-11"><a href="#cb2-11" tabindex="-1"></a></span>
<span id="cb2-12"><a href="#cb2-12" tabindex="-1"></a><span class="co">// Actor state definition</span></span>
<span id="cb2-13"><a href="#cb2-13" tabindex="-1"></a><span class="kw">struct</span> foo <span class="op">{</span></span>
<span id="cb2-14"><a href="#cb2-14" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> state<span class="op">;</span></span>
<span id="cb2-15"><a href="#cb2-15" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb2-16"><a href="#cb2-16" tabindex="-1"></a></span>
<span id="cb2-17"><a href="#cb2-17" tabindex="-1"></a><span class="co">// Message body definition</span></span>
<span id="cb2-18"><a href="#cb2-18" tabindex="-1"></a><span class="kw">struct</span> bin <span class="op">{</span></span>
<span id="cb2-19"><a href="#cb2-19" tabindex="-1"></a>    <span class="dt">int</span> length<span class="op">;</span></span>
<span id="cb2-20"><a href="#cb2-20" tabindex="-1"></a>    <span class="dt">char</span> bytes<span class="op">[];</span></span>
<span id="cb2-21"><a href="#cb2-21" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb2-22"><a href="#cb2-22" tabindex="-1"></a></span>
<span id="cb2-23"><a href="#cb2-23" tabindex="-1"></a><span class="co">// Foo&#39;s message handler</span></span>
<span id="cb2-24"><a href="#cb2-24" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> foo_hear<span class="op">(</span><span class="kw">struct</span> actor <span class="op">*</span>self<span class="op">,</span> <span class="kw">struct</span> msg <span class="op">*</span>msg<span class="op">)</span></span>
<span id="cb2-25"><a href="#cb2-25" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-26"><a href="#cb2-26" tabindex="-1"></a>    <span class="co">// Get our private, user defined, actor state</span></span>
<span id="cb2-27"><a href="#cb2-27" tabindex="-1"></a>    <span class="kw">struct</span> foo <span class="op">*</span>my <span class="op">=</span> self<span class="op">-&gt;</span>priv<span class="op">;</span></span>
<span id="cb2-28"><a href="#cb2-28" tabindex="-1"></a>    </span>
<span id="cb2-29"><a href="#cb2-29" tabindex="-1"></a>    <span class="co">// Dispatch on the message type</span></span>
<span id="cb2-30"><a href="#cb2-30" tabindex="-1"></a>    <span class="cf">switch</span><span class="op">(</span>msg<span class="op">-&gt;</span>type<span class="op">)</span> <span class="op">{</span></span>
<span id="cb2-31"><a href="#cb2-31" tabindex="-1"></a>    <span class="cf">case</span> MSG_PING<span class="op">:</span></span>
<span id="cb2-32"><a href="#cb2-32" tabindex="-1"></a>        <span class="co">// Just change the message type and pass (say) the same struct back</span></span>
<span id="cb2-33"><a href="#cb2-33" tabindex="-1"></a>        msg<span class="op">-&gt;</span>type <span class="op">=</span> MSG_PONG<span class="op">;</span></span>
<span id="cb2-34"><a href="#cb2-34" tabindex="-1"></a>        actor_say<span class="op">(</span>self<span class="op">,</span> msg<span class="op">-&gt;</span>from<span class="op">,</span> msg<span class="op">);</span></span>
<span id="cb2-35"><a href="#cb2-35" tabindex="-1"></a>        <span class="cf">break</span><span class="op">;</span></span>
<span id="cb2-36"><a href="#cb2-36" tabindex="-1"></a>    <span class="cf">case</span> MSG_PONG<span class="op">:</span></span>
<span id="cb2-37"><a href="#cb2-37" tabindex="-1"></a>        free<span class="op">(</span>msg<span class="op">);</span></span>
<span id="cb2-38"><a href="#cb2-38" tabindex="-1"></a>        <span class="co">// So they are awake, send them some junk</span></span>
<span id="cb2-39"><a href="#cb2-39" tabindex="-1"></a>        msg <span class="op">=</span> msg_alloc_extra<span class="op">(</span><span class="kw">sizeof</span><span class="op">(</span><span class="kw">struct</span> bin<span class="op">)</span> <span class="op">+</span> <span class="dv">1024</span><span class="op">);</span></span>
<span id="cb2-40"><a href="#cb2-40" tabindex="-1"></a>        msg<span class="op">-&gt;</span>type <span class="op">=</span> MSG_DATA<span class="op">;</span></span>
<span id="cb2-41"><a href="#cb2-41" tabindex="-1"></a>        actor_say<span class="op">(</span>self<span class="op">,</span> ADDR_BAR<span class="op">,</span> msg<span class="op">);</span></span>
<span id="cb2-42"><a href="#cb2-42" tabindex="-1"></a>        <span class="cf">break</span><span class="op">;</span></span>
<span id="cb2-43"><a href="#cb2-43" tabindex="-1"></a>    <span class="cf">default</span><span class="op">:</span></span>
<span id="cb2-44"><a href="#cb2-44" tabindex="-1"></a>        <span class="co">// Blow up or something</span></span>
<span id="cb2-45"><a href="#cb2-45" tabindex="-1"></a>        <span class="op">...</span></span>
<span id="cb2-46"><a href="#cb2-46" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb2-47"><a href="#cb2-47" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb2-48"><a href="#cb2-48" tabindex="-1"></a></span>
<span id="cb2-49"><a href="#cb2-49" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb2-50"><a href="#cb2-50" tabindex="-1"></a></span>
<span id="cb2-51"><a href="#cb2-51" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">void</span><span class="op">)</span></span>
<span id="cb2-52"><a href="#cb2-52" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb2-53"><a href="#cb2-53" tabindex="-1"></a>    <span class="kw">struct</span> actor <span class="op">*</span>foo<span class="op">;</span></span>
<span id="cb2-54"><a href="#cb2-54" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb2-55"><a href="#cb2-55" tabindex="-1"></a>    </span>
<span id="cb2-56"><a href="#cb2-56" tabindex="-1"></a>    actors_init<span class="op">();</span></span>
<span id="cb2-57"><a href="#cb2-57" tabindex="-1"></a>    </span>
<span id="cb2-58"><a href="#cb2-58" tabindex="-1"></a>    foo <span class="op">=</span> actor_alloc_extra<span class="op">(</span><span class="kw">sizeof</span><span class="op">(</span><span class="kw">struct</span> foo<span class="op">));</span></span>
<span id="cb2-59"><a href="#cb2-59" tabindex="-1"></a>    foo<span class="op">-&gt;</span>addr <span class="op">=</span> ADDR_FOO<span class="op">;</span></span>
<span id="cb2-60"><a href="#cb2-60" tabindex="-1"></a>    foo<span class="op">-&gt;</span>hear <span class="op">=</span> foo_hear<span class="op">;</span></span>
<span id="cb2-61"><a href="#cb2-61" tabindex="-1"></a>    </span>
<span id="cb2-62"><a href="#cb2-62" tabindex="-1"></a>    <span class="co">// Create other actors</span></span>
<span id="cb2-63"><a href="#cb2-63" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb2-64"><a href="#cb2-64" tabindex="-1"></a>    </span>
<span id="cb2-65"><a href="#cb2-65" tabindex="-1"></a>    <span class="co">// Wait for all the actors to exit</span></span>
<span id="cb2-66"><a href="#cb2-66" tabindex="-1"></a>    actors_wait<span class="op">();</span></span>
<span id="cb2-67"><a href="#cb2-67" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>By default when an Actor <em>hears</em> a message it
            executes the function pointed to by
            <code>struct actor::hear</code>. To send a message<a
            href="#fn6" class="footnote-ref"
            id="fnref6"><sup>6</sup></a> one calls
            <code>void actor_say(struct actor *self, addr_t to, struct msg *msg)</code>.
            Actors are usually only accessed through one or more user
            defined addresses, not memory pointers, and you are strongly
            discouraged from directly accessing another Actor’s
            memory.</p>
            <p>Unfortunately it is quite easy to accidentally access
            another Actor’s memory by passing a message containing a
            pointer to it. This is something which Rust’s borrow checker
            can prevent and Pony appears to have an <a
            href="https://tutorial.ponylang.io/reference-capabilities.html">even
            richer mechanism</a> for dealing with shared
            references/pointers.</p>
            <p>However not all is lost in C; for one we can instrument
            runtime code with the Clang address or thread sanitizers.
            These are slow (well, still faster than most languages) and
            don’t catch everything, but are easy to use. Then there is
            the possibility of adding annotations to variables and
            functions which can be enforced by a <a
            href="/https/richiejp.com/custom-c-static-analysis-tools#sparse">static
            analyzer like sparse</a>. In fact Clang and GCC will check
            some annotations as well. I haven’t tried it, but possibly
            this could be used to check that pointers passed with
            messages are accessed in a sensible way. The Linux kernel
            uses this to put some constraints on pointers used in RCU
            data structures.</p>
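            <p>As a sketch of what such annotations might look like, the
            macros below follow the pattern the kernel uses for
            <code>__rcu</code>: when sparse is running (it defines
            <code>__CHECKER__</code>) the pointer is placed in its own
            address space and marked <code>noderef</code>, and for
            ordinary compilers the macros expand to nothing. The names
            <code>MSG_PTR</code>, <code>MSG_FORCE</code>,
            <code>note_attach</code> and <code>note_take</code>, and the
            address space number, are invented for this example, and it
            is untested against sparse itself.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;assert.h&gt;
#include &lt;stddef.h&gt;

#ifdef __CHECKER__
/* Seen only by sparse: a distinct address space that may not be dereferenced */
# define MSG_PTR   __attribute__((noderef, address_space(7)))
# define MSG_FORCE __attribute__((force))
#else
/* Ordinary compilers see plain pointers */
# define MSG_PTR
# define MSG_FORCE
#endif

struct payload {
    int value;
};

struct note {
    int type;
    struct payload MSG_PTR *body; /* not to be dereferenced directly */
};

/* Store a pointer in the message; the cast is the explicit hand-over */
static void note_attach(struct note *n, struct payload *p)
{
    assert(n);
    n-&gt;body = (struct payload MSG_PTR MSG_FORCE *)p;
}

/* The one sanctioned way to get a usable pointer back: take ownership
 * of the body and clear it from the message so it cannot be taken twice */
static struct payload *note_take(struct note *n)
{
    struct payload *p;

    assert(n);
    p = (struct payload MSG_FORCE *)n-&gt;body;
    n-&gt;body = NULL;
    return p;
}</code></pre></div>
            <p>The intent is that any code which reads
            <code>body</code> without going through
            <code>note_take</code> would be flagged by the checker,
            while a normal build is unaffected.</p>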
            <p>By default, each Actor has only a single message handler
            at any one time; we simply branch on the msg type to
            determine what to do when receiving a message<a href="#fn7"
            class="footnote-ref" id="fnref7"><sup>7</sup></a>. I think
            this works out fairly well for basic usage as there is no
            magic involved and it doesn’t look too ugly. Also C compilers
            are pretty clever so there is no need to worry about the
            performance of large switch statements.</p>
            <p>Alternatively the user can provide a listen callback
            instead, which requires some more boilerplate, but gives them
            the freedom to read some other data source at the same time
            or perform some setup when the actor starts.</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> tester_listen<span class="op">(</span><span class="kw">struct</span> actor <span class="op">*</span>self<span class="op">)</span></span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a>    <span class="kw">struct</span> msg <span class="op">*</span>msg <span class="op">=</span> msg_alloc<span class="op">();</span></span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a>    <span class="kw">struct</span> tester <span class="op">*</span>my <span class="op">=</span> malloc<span class="op">(</span><span class="kw">sizeof</span><span class="op">(</span><span class="kw">struct</span> tester<span class="op">));</span></span>
<span id="cb3-5"><a href="#cb3-5" tabindex="-1"></a></span>
<span id="cb3-6"><a href="#cb3-6" tabindex="-1"></a>    assert_perror<span class="op">(</span>errno<span class="op">);</span></span>
<span id="cb3-7"><a href="#cb3-7" tabindex="-1"></a>    assert<span class="op">(</span>my<span class="op">);</span></span>
<span id="cb3-8"><a href="#cb3-8" tabindex="-1"></a>    memset<span class="op">(</span>my<span class="op">,</span> <span class="dv">0</span><span class="op">,</span> <span class="kw">sizeof</span><span class="op">(*</span>my<span class="op">));</span></span>
<span id="cb3-9"><a href="#cb3-9" tabindex="-1"></a>    self<span class="op">-&gt;</span>priv <span class="op">=</span> my<span class="op">;</span></span>
<span id="cb3-10"><a href="#cb3-10" tabindex="-1"></a></span>
<span id="cb3-11"><a href="#cb3-11" tabindex="-1"></a>    <span class="co">// Inform another actor we have been allocated/started</span></span>
<span id="cb3-12"><a href="#cb3-12" tabindex="-1"></a>    msg<span class="op">-&gt;</span>type <span class="op">=</span> MSG_ALLC<span class="op">;</span></span>
<span id="cb3-13"><a href="#cb3-13" tabindex="-1"></a>    actor_say<span class="op">(</span>self<span class="op">,</span> ADDR_WRITER<span class="op">,</span> msg<span class="op">);</span></span>
<span id="cb3-14"><a href="#cb3-14" tabindex="-1"></a></span>
<span id="cb3-15"><a href="#cb3-15" tabindex="-1"></a>    <span class="co">// The message loop</span></span>
<span id="cb3-16"><a href="#cb3-16" tabindex="-1"></a>    <span class="cf">for</span> <span class="op">(;;)</span> <span class="op">{</span></span>
<span id="cb3-17"><a href="#cb3-17" tabindex="-1"></a>        msg <span class="op">=</span> actor_inbox_pop<span class="op">(</span>self<span class="op">);</span></span>
<span id="cb3-18"><a href="#cb3-18" tabindex="-1"></a></span>
<span id="cb3-19"><a href="#cb3-19" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>msg<span class="op">)</span></span>
<span id="cb3-20"><a href="#cb3-20" tabindex="-1"></a>            self<span class="op">-&gt;</span>hear<span class="op">(</span>self<span class="op">,</span> msg<span class="op">);</span> <span class="co">// If we have an actor message then handle it</span></span>
<span id="cb3-21"><a href="#cb3-21" tabindex="-1"></a>        <span class="cf">else</span> <span class="cf">if</span> <span class="op">(!(</span>my<span class="op">-&gt;</span>child <span class="op">||</span> my<span class="op">-&gt;</span>cout <span class="op">||</span> my<span class="op">-&gt;</span>eout<span class="op">))</span></span>
<span id="cb3-22"><a href="#cb3-22" tabindex="-1"></a>            actor_wait<span class="op">(</span>self<span class="op">,</span> NULL<span class="op">);</span> <span class="co">// Sleep-wait if there is nothing to do</span></span>
<span id="cb3-23"><a href="#cb3-23" tabindex="-1"></a></span>
<span id="cb3-24"><a href="#cb3-24" tabindex="-1"></a>        <span class="co">// Check if the child process has completed</span></span>
<span id="cb3-25"><a href="#cb3-25" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>my<span class="op">-&gt;</span>child<span class="op">)</span></span>
<span id="cb3-26"><a href="#cb3-26" tabindex="-1"></a>            tester_check_child<span class="op">(</span>self<span class="op">);</span></span>
<span id="cb3-27"><a href="#cb3-27" tabindex="-1"></a></span>
<span id="cb3-28"><a href="#cb3-28" tabindex="-1"></a>        <span class="co">// Check the child&#39;s standard output which we log</span></span>
<span id="cb3-29"><a href="#cb3-29" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>my<span class="op">-&gt;</span>cout <span class="op">||</span> my<span class="op">-&gt;</span>eout<span class="op">)</span></span>
<span id="cb3-30"><a href="#cb3-30" tabindex="-1"></a>            tester_check_output<span class="op">(</span>self<span class="op">);</span></span>
<span id="cb3-31"><a href="#cb3-31" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb3-32"><a href="#cb3-32" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>Above is an actual snippet from the LTP Executor; it
            shows the listen function for a <code>tester</code> actor.
            This actor starts a child process which is usually an LTP
            test. It needs to listen for new actor messages at the same
            time as reading the child’s output and checking its exit
            status.</p>
            <p>This is a common problem with Actors, where you need to
            wait for new actor messages while also waiting for some
            other kind of I/O to appear. At the same time you can’t
            merely spin-wait because the CPU won’t be happy. In this
            case, when there is an active child, we block using
            <code>poll</code> with a short timeout, which is acceptable
            because we don’t need to respond quickly to Actor messages.
            When no child is active, we can use <code>actor_wait</code>
            which uses fancy Linux <code>futex</code>es to wait
            efficiently.</p>
            <p>In general this might not be suitable, but on the other
            hand it would be possible to interrupt <code>poll</code>
            with a signal if we really need the actor to check its
            messages.</p>
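            <p>A minimal sketch of the blocking step described above,
            assuming the child’s output arrives on a file descriptor.
            The helper names <code>output_ready</code> and
            <code>read_output</code> are hypothetical and this is not
            the LTP Executor’s code; only <code>poll</code> and
            <code>read</code> are real APIs.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;assert.h&gt;
#include &lt;errno.h&gt;
#include &lt;poll.h&gt;
#include &lt;unistd.h&gt;

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if there is something to read (or the writer hung up),
 * 0 on timeout or if a signal interrupted the wait, and -1 on error. */
static int output_ready(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret;

    /* A negative timeout would block forever and starve the inbox */
    assert(timeout_ms &gt;= 0);

    ret = poll(&amp;pfd, 1, timeout_ms);
    if (ret &gt; 0)
        return (pfd.revents &amp; (POLLIN | POLLHUP)) ? 1 : -1;
    if (ret == 0 || errno == EINTR)
        return 0; /* the caller can now check its actor inbox */
    return -1;
}

/* Read pending output, if any, without blocking longer than timeout_ms.
 * Returns the result of read(), or -1 if nothing was ready in time. */
static ssize_t read_output(int fd, char *buf, size_t len, int timeout_ms)
{
    if (output_ready(fd, timeout_ms) != 1)
        return -1;
    return read(fd, buf, len);
}</code></pre></div>
            <p>Treating <code>EINTR</code> like a timeout means that a
            signal delivered to the thread, with a handler installed,
            cuts the wait short and sends the loop back to the
            inbox.</p>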
            <p>Note that we can also call the default message loop from
            listen and just use it to perform some actor setup or
            something like the following.</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a>__attribute__<span class="op">((</span>pure<span class="op">))</span></span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> only_pong<span class="op">(</span><span class="dt">const</span> <span class="kw">struct</span> actor <span class="op">*</span>self __attribute__<span class="op">((</span>unused<span class="op">)),</span></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a>             <span class="dt">const</span> <span class="kw">struct</span> msg <span class="op">*</span>msg<span class="op">)</span></span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a>    <span class="cf">return</span> msg<span class="op">-&gt;</span>type <span class="op">==</span> MSG_PONG<span class="op">;</span></span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a></span>
<span id="cb4-8"><a href="#cb4-8" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> writer_listen<span class="op">(</span><span class="kw">struct</span> actor <span class="op">*</span>self<span class="op">)</span></span>
<span id="cb4-9"><a href="#cb4-9" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb4-10"><a href="#cb4-10" tabindex="-1"></a>    <span class="kw">struct</span> msg <span class="op">*</span>msg<span class="op">;</span></span>
<span id="cb4-11"><a href="#cb4-11" tabindex="-1"></a></span>
<span id="cb4-12"><a href="#cb4-12" tabindex="-1"></a>    <span class="co">// Only pop messages for which only_pong returns true</span></span>
<span id="cb4-13"><a href="#cb4-13" tabindex="-1"></a>    <span class="co">// Other messages are put into a buffer for later</span></span>
<span id="cb4-14"><a href="#cb4-14" tabindex="-1"></a>    actor_inbox_filter<span class="op">(</span>self<span class="op">,</span> only_pong<span class="op">);</span></span>
<span id="cb4-15"><a href="#cb4-15" tabindex="-1"></a></span>
<span id="cb4-16"><a href="#cb4-16" tabindex="-1"></a>    <span class="co">// Send ping until we get a pong</span></span>
<span id="cb4-17"><a href="#cb4-17" tabindex="-1"></a>    <span class="cf">while</span> <span class="op">((</span>msg <span class="op">=</span> actor_inbox_pop<span class="op">(</span>self<span class="op">))</span> <span class="op">==</span> NULL<span class="op">)</span> <span class="op">{</span></span>
<span id="cb4-18"><a href="#cb4-18" tabindex="-1"></a>        dprintf<span class="op">(</span>STDOUT_FILENO<span class="op">,</span> <span class="st">&quot;PING</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb4-19"><a href="#cb4-19" tabindex="-1"></a>        usleep<span class="op">(</span><span class="dv">1000000</span><span class="op">);</span></span>
<span id="cb4-20"><a href="#cb4-20" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb4-21"><a href="#cb4-21" tabindex="-1"></a>    assert<span class="op">(</span>msg<span class="op">-&gt;</span>type <span class="op">==</span> MSG_PONG<span class="op">);</span></span>
<span id="cb4-22"><a href="#cb4-22" tabindex="-1"></a>    free<span class="op">(</span>msg<span class="op">);</span></span>
<span id="cb4-23"><a href="#cb4-23" tabindex="-1"></a></span>
<span id="cb4-24"><a href="#cb4-24" tabindex="-1"></a>    <span class="co">// Release any filtered messages</span></span>
<span id="cb4-25"><a href="#cb4-25" tabindex="-1"></a>    actor_inbox_filter<span class="op">(</span>self<span class="op">,</span> NULL<span class="op">);</span></span>
<span id="cb4-26"><a href="#cb4-26" tabindex="-1"></a></span>
<span id="cb4-27"><a href="#cb4-27" tabindex="-1"></a>    <span class="co">// Enter the default message loop</span></span>
<span id="cb4-28"><a href="#cb4-28" tabindex="-1"></a>    actor_hear_loop<span class="op">(</span>self<span class="op">);</span></span>
<span id="cb4-29"><a href="#cb4-29" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>The <em>writer</em> actor is a gateway to some other,
            probably remote, process. It is not known what transport is
            being used or whether it is reliable; we just write to
            <code>stdout</code>. Therefore, before sending any important
            messages, we first send a ping and wait for a pong. Other
            actors may send messages to the <em>writer</em> and they
            will be queued while waiting for communication to be
            established.</p>
            <p>Presently each Actor gets its own thread and starting a
            new actor starts a new thread. This is perhaps not ideal if
            you wish to create many more actors than CPUs or create many
            short-lived actors. Also it would perhaps be better to use
            processes rather than threads at the cost of being forced to
            copy some message data. Using processes would provide much
            better isolation.</p>
            <p>I suspect that having many threads is OK on modern
            kernels and it offloads a lot of work to the kernel. However
            creating threads is relatively slow, so it would at least
            make sense to reuse threads (or actors) in the case where
            there are a lot of short-lived actors.</p>
            <p>So far I have only shown actors being created in
            <code>main</code> <em>outside</em> the actor system with
            static addresses. However actors may arbitrarily create
            other actors.</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="co">// This is called by a &#39;reader&#39; actor in response to an &#39;ALLC&#39; (allocate) </span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a><span class="co">// message from a remote process</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a><span class="dt">void</span> tester_start<span class="op">(</span><span class="kw">struct</span> actor <span class="op">*</span>self<span class="op">,</span> addr_t id<span class="op">)</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a>    <span class="kw">struct</span> msg <span class="op">*</span>msg<span class="op">;</span></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a>    <span class="kw">struct</span> actor <span class="op">*</span>tester<span class="op">;</span></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a>    <span class="co">// Possibly the remote side is asking us to do something we already did</span></span>
<span id="cb5-9"><a href="#cb5-9" tabindex="-1"></a>    <span class="co">// in that case we just inform the existing actor</span></span>
<span id="cb5-10"><a href="#cb5-10" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>actor_exists<span class="op">(</span>id<span class="op">))</span> <span class="op">{</span></span>
<span id="cb5-11"><a href="#cb5-11" tabindex="-1"></a>        msg <span class="op">=</span> msg_alloc<span class="op">();</span></span>
<span id="cb5-12"><a href="#cb5-12" tabindex="-1"></a>        msg<span class="op">-&gt;</span>type <span class="op">=</span> MSG_ALLC<span class="op">;</span></span>
<span id="cb5-13"><a href="#cb5-13" tabindex="-1"></a>        actor_say<span class="op">(</span>self<span class="op">,</span> id<span class="op">,</span> msg<span class="op">);</span></span>
<span id="cb5-14"><a href="#cb5-14" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb5-15"><a href="#cb5-15" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb5-16"><a href="#cb5-16" tabindex="-1"></a></span>
<span id="cb5-17"><a href="#cb5-17" tabindex="-1"></a>    <span class="co">// Create (allocate) a new tester actor</span></span>
<span id="cb5-18"><a href="#cb5-18" tabindex="-1"></a>    tester <span class="op">=</span> actor_alloc<span class="op">();</span></span>
<span id="cb5-19"><a href="#cb5-19" tabindex="-1"></a>    tester<span class="op">-&gt;</span>addr <span class="op">=</span> id<span class="op">;</span></span>
<span id="cb5-20"><a href="#cb5-20" tabindex="-1"></a>    tester<span class="op">-&gt;</span>listen <span class="op">=</span> tester_listen<span class="op">;</span></span>
<span id="cb5-21"><a href="#cb5-21" tabindex="-1"></a>    tester<span class="op">-&gt;</span>hear <span class="op">=</span> tester_hear<span class="op">;</span></span>
<span id="cb5-22"><a href="#cb5-22" tabindex="-1"></a></span>
<span id="cb5-23"><a href="#cb5-23" tabindex="-1"></a>    actor_start<span class="op">(</span>tester<span class="op">);</span></span>
<span id="cb5-24"><a href="#cb5-24" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>So the above function checks whether an actor already
            exists with the address provided and if it does not then it
            starts a new one.</p>
            <h1 id="looking-forwards">Looking forwards</h1>
            <p>As I mentioned when describing actors, a big problem is
            the chaos which results from unstructured message passing.
            When something goes wrong in an actor based system, or any
            concurrent system for that matter, it can be very difficult
            to know what sequence of events led up to the error.</p>
            <p>Stack traces from an error often end with
            <code>msg_box_pop</code> before revealing any important
            information. What is really needed is the causal chain of
            messages and actor states that led to the error, not just a
            log of the messages received by a single actor.</p>
            <p>Currently there isn’t even a message log. This could be
            implemented by copying some message data into a buffer for
            each actor. It’s difficult to log the content of the message
            in a generic way, because this is just a pointer to some
            arbitrary user defined data, but at least the message type
            and from address can be logged.</p>
            <p>Actor state is also some pointer to arbitrary data, so
            providing a generic mechanism for logging this may be
            difficult. In some cases however the actor state may be
            entirely or partially represented by an integer (or bit
            field) and this could certainly be logged very cheaply.</p>
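            <p>A per-actor log of the message type, from address and an
            integer state, as described above, could be a small ring
            buffer. The sketch below is only an illustration: the
            <code>trace_*</code> names and the fixed size are
            assumptions, and nothing like it exists in
            <code>libactors</code> yet.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;assert.h&gt;
#include &lt;stddef.h&gt;

#define TRACE_LEN 8 /* entries kept per actor */

struct trace_entry {
    int type;           /* message type */
    unsigned int from;  /* sender&#39;s address */
    unsigned int state; /* actor state when the message was heard */
};

struct trace_log {
    struct trace_entry ring[TRACE_LEN];
    size_t count;       /* total entries ever recorded */
};

/* Record one message, overwriting the oldest entry once the ring is full */
static void trace_record(struct trace_log *log, int type,
                         unsigned int from, unsigned int state)
{
    struct trace_entry *e;

    assert(log);
    e = &amp;log-&gt;ring[log-&gt;count % TRACE_LEN];
    e-&gt;type = type;
    e-&gt;from = from;
    e-&gt;state = state;
    log-&gt;count++;
}

/* Number of entries currently held */
static size_t trace_size(const struct trace_log *log)
{
    return log-&gt;count &lt; TRACE_LEN ? log-&gt;count : TRACE_LEN;
}

/* Fetch the i-th oldest surviving entry, or NULL if i is out of range */
static const struct trace_entry *trace_get(const struct trace_log *log, size_t i)
{
    size_t first;

    if (i &gt;= trace_size(log))
        return NULL;
    first = log-&gt;count - trace_size(log);
    return &amp;log-&gt;ring[(first + i) % TRACE_LEN];
}</code></pre></div>
            <p>Each entry is three integers, so recording one on every
            received message costs very little, and the last few
            entries can be dumped when an actor hits an error.</p>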
            <p>Initially liburcu was created for <a
            href="https://lttng.org/">LTTng</a> which is a tracing
            framework. So possibly libactors could define some
            tracepoints on sending and receiving messages. The user can
            then provide more tracepoints if they wish.</p>
            <p>Finally, libactors is being used in the experimental LTP
            test executor. If the executor is to be included in the LTP,
            then libactors should be bundled with it. Generally we avoid
            new dependencies like the plague, unless the dependency has
            been included in the base packages of practically every
            Linux distribution for the last 10+ years. This
            may not be the case with liburcu, so we will probably need
            some other, more compact, implementation of a concurrent
            message queue and address table (hash map). This would be
            somewhat ironic given that the original motivation for
            trying this was liburcu.</p>
            <div class="footnotes footnotes-end-of-document">
            <hr />
            <ol>
            <li id="fn1"><p>Lockless data structures usually rely on the
            optimistic updating of some atomic variable(s) which will be
            retried on failure, which arguably is equivalent to a spin
            lock, but it looks different<a href="#fnref1"
            class="footnote-back">↩︎</a></p></li>
            <li id="fn2"><p>Actually the RCU API is also used to
            dereference the data structure’s pointers and maybe more,
            but you don’t need to worry about this as you will simply be
            told “call this in a read lock”<a href="#fnref2"
            class="footnote-back">↩︎</a></p></li>
            <li id="fn3"><p>RDMA still uses message passing, it just
            cuts out some parts of the stack. Ultimately any long
            distance communication will require encapsulation and
            serialisation into something which looks like a
            <em>message</em><a href="#fnref3"
            class="footnote-back">↩︎</a></p></li>
            <li id="fn4"><p>which itself may not be a good idea, but
            that is another matter<a href="#fnref4"
            class="footnote-back">↩︎</a></p></li>
            <li id="fn5"><p>-O1 -Wall -Wextra -pedantic -Werror -g
            -fno-omit-frame-pointer -fsanitize=address<a href="#fnref5"
            class="footnote-back">↩︎</a></p></li>
            <li id="fn6"><p>unless you are operating outside the actor
            system and have a pointer to the actor struct, in which case
            you can use <code>msg_box_push</code><a href="#fnref6"
            class="footnote-back">↩︎</a></p></li>
            <li id="fn7"><p>Actually I also use the from address in one
            instance and obviously the body of the message in many
            cases<a href="#fnref7" class="footnote-back">↩︎</a></p></li>
            </ol>
            </div>
    </div>
  </content>
</entry>
<entry>
  <title>Reproducers make the best (Linux kernel) tests</title>
  <id>https://richiejp.com/reproducers-are-best</id>
  <published>2023-01-24T17:29:25Z</published>
  <updated>2023-01-24T17:29:25Z</updated>
  <link rel="alternate" href="https://richiejp.com/reproducers-are-best" />
  <summary>Why tests that try to reproduce existing bugs are most
profitable</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p><em>Originally posted on <a
            href="https://medium.com/@richiejp/reproducers-make-the-best-linux-kernel-tests-1054a3ac4f4f">Medium</a></em></p>
            <div id="reproducers-make-the-best-linux-kernel-tests"
            class="container" style="max-width: 60ch">
            <h1>Reproducers make the best (Linux Kernel) tests</h1>
            <p>After several years of writing Linux kernel tests, my
            conclusion is that reproducers of existing bugs find the
            most new bugs.</p>
            <p>Possibly fuzzers find more bugs than anything else. This
            does not contradict my hypothesis because fuzzers generate
            reproducers. At least Syzkaller can generate a C-based
            reproducer or replay its activity some other way.</p>
            <p>With some difficulty the C-based reproducers can be run
            independently. In a number of cases we have manually
            converted them to Linux Test Project (LTP) test cases. These
            are much better behaved. The LTP is relatively easy to
            install and run.</p>
            <p>Of course fuzzers are not the only source of reproducers.
            Even in this day and age of super fancy artificial
            intelligence, many reproducers are created entirely by hand,
            even as the result of manual code review. Static analysis
            also plays a big role, though.</p>
            <p>Usually reproducers are created to prove a hypothesis:
            that a bug is real and can be practically triggered, and
            that it exists on a code path which may be executed in a
            context that matters. All bugs should be fixed, just perhaps
            not with the same priority.</p>
            <p>After an attempt to fix the bug, the reproducer can be
            used to verify the fix. Crucially, for those maintaining
            their own kernel branches, a reproducer helps to validate
            backports of fixes or the decision that a backport is not
            required.</p>
            <p>They may also prevent the exact same bug from being
            reintroduced. Regressions of this type do happen, especially
            during rewrites and refactoring of complicated code.</p>
            <p>This appears to be relatively well understood. What may
            be surprising to the reader is my following assertion.
            Reproducers of a specific bug, tailored precisely to trigger
            that bug, often find the most new bugs.</p>
            <p>More so if the reproducer can be trivially generalised.
            For example, say a reproducer triggers a bug via a TCP
            socket. If the test can be slightly modified to work with
            UDP sockets, or indeed any protocol, then we say it can be
            trivially generalised.</p>
            <p>To be clear, generalisation is not required. In comparison
            to other hand-written tests, reproducers find the most new
            bugs, including bugs that are unrelated to the original.</p>
            <p>What is required is complication or complexity. The
            software under test and the reproducer need to be
            non-trivial. If they are both trivial, then there is not
            much room for the unknown.</p>
            <p>It has often been observed that bugs cluster. If you
            find one bug, the probability of finding another in the
            “adjacent area” increases. This is true for both insects
            and software defects. This is usually true regardless of how
            one measures the proximity of bugs.</p>
            <p>There are many convincing explanations for why bugs
            cluster, and they are easy to get sucked into. It pays to keep
            an open mind as to what the causes might be. In fact it
            often pays to temporarily suspend any thoughts about
            causation.</p>
            <p>If we think about the underlying causes of a bug, then we
            may be tempted to skip reproducing it reliably, because we
            may think up a way of preventing that type of bug. Likewise,
            if we think we know why bugs cluster, we may decide to skip
            bug reproduction, preferring instead to spend time on
            process changes, instrumentation, adjacent code review,
            exploratory testing or static analysis.</p>
            <p>Our thoughts about causation may even be correct. However
            we may miss a far more profitable line of inquiry. Sometimes
            a bug is found which exists in largely bug-free code. The
            reason it was not found previously is that an unusual
            context is required to exercise it.</p>
            <p>The code required to create the context in question may
            be very buggy, simply because it is neglected, changes very
            often or for any other reason we may think of. Creating a reproducer
            therefore increases coverage in a very profitable
            direction.</p>
            <p>This is merely an example to aid understanding. The
            general principle is that in complicated software there are
            always unknowns. A clear, convincing and correct root cause
            analysis may lead to a lower payoff when compared to
            assuming ignorance and testing via an exact reproduction of
            the bug.</p>
            <p>Root cause analysis and speculation on how to prevent
            similar bugs are still essential. The point is that the process of
            creating a reproducer, including making it reliable, reveals
            new information. Running it reveals yet more
            information.</p>
            <p>The full value of a reproducer is exposed over time and
            “space”. It must be run across many configurations and
            versions. Put another way, it needs multiple iterations of
            the test matrix.</p>
            <p>This means making it reliable and portable, which is
            where things usually go wrong. The extra effort required is
            not invested, understandably, because that effort is
            often great and the payoff unclear.</p>
            <p>An immense amount of time has gone into making the Linux
            Test Project “portable” across multiple kernel versions and
            user lands. Many of the tests would be suitable for BSD and
            other operating systems as well. Some work has been done on
            that, but alas not enough.</p>
            <p>It is one thing to create a reproducer that works on one
            machine, with one kernel version, some of the time. It is
            entirely another to make it work on any suitable
            configuration in a timely and reliable manner.</p>
            <p>This is especially true when a bug is the result of a
            data race, or if a bug requires a particular outcome to one
            or more data races (also known as race conditions). For the
            LTP we have put huge effort into reproducing these
            reliably.</p>
            <p>Regardless of race involvement, reproducer code is often
            fragile, relying on particulars of a given configuration. At
            least the initial reproducer code, as produced by a fuzzer
            or human being, is fragile.</p>
            <p>Making it compile and run across multiple kernel
            versions, architectures and compilers requires tweaks and
            vendoring in odd bits of libraries. Ensuring reproduction of
            the bug may require some level of generalisation.</p>
            <p>For example, the reproducer may rely on particular
            internal memory offsets which change between
            configurations. These may be practically unknown, possibly
            requiring some complicated probing to figure out exactly. So
            we may need to try a range of read or write lengths at
            runtime.</p>
            <p>I suspect that tests which would be incapable of
            reproducing the original bug, were it reintroduced to a
            newer kernel, are still more reliable at finding new bugs
            than purely speculative tests. Hence my assertion that
            generalisation is not required for finding new bugs.</p>
            <p>However, for those who are testing backports, there is an
            obvious need to generalise.</p>
            <p>This all comes from my anecdotal experience with the
            Linux kernel and low level user space libraries. With other
            software it can be simpler. The kernel suffers from
            combinatorial explosions on all fronts, including over time,
            as the pace of development is so fast. This makes full
            coverage of the whole test matrix impossible.</p>
            <p>Therefore we are poking around in the dark to some
            extent. We need to use probabilistic methods to identify the
            most urgent areas for testing. The observation that bugs
            cluster is then very important.</p>
            <p>The following aphorism is another way to look at it: to
            avoid writing pointless tests, write reproducers and then
            generalise them.</p>
            <p>Ironically this suggests following an “evidence-based”
            approach, yet I have provided no evidence. I encourage
            developers to mention specific test names in fix commits.
            Even better, submit patches adding commit or CVE tags to
            LTP tests. With data such as this we can perhaps validate my
            claims.</p>
            <p>On a project less nebulous than the Linux kernel, it is
            easier to test the hypothesis. Simply tag test failures
            which occur for legitimate reasons, that is, because of a
            real bug in the software under test. Then collect and
            aggregate the tags. You will see which tests have
            historically been the most effective.</p>
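            <p>The aggregation step could look something like the
            following minimal Python sketch. The test names, the tag
            format and the function <code>bugs_found_per_test</code> are
            all made up for illustration. It assumes each confirmed
            failure has been recorded as a pair of test name and bug
            tag.</p>

```python
from collections import Counter

# Hypothetical records: (test name, tag) for failures confirmed as real bugs.
# The test names and the tag format are invented for this example.
confirmed_failures = [
    ("repro_tcp_close", "bug-101"),
    ("repro_tcp_close", "bug-117"),
    ("speculative_api_check", "bug-120"),
    ("repro_tcp_close", "bug-131"),
    ("repro_tcp_close", "bug-131"),  # same bug hit again on another run
]


def bugs_found_per_test(failures):
    """Count distinct bug tags per test, ignoring repeat hits of one bug."""
    unique = set(failures)
    return Counter(test for test, _tag in unique)


# Most effective tests first
ranking = bugs_found_per_test(confirmed_failures).most_common()
for test, count in ranking:
    print(f"{test}: {count}")
```

            <p>Counting distinct tags rather than raw failures is a
            design choice here. Otherwise a test which fails on every
            run because of one unfixed bug would look more effective
            than it is.</p>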
            <p>Of course this is not a scientific experiment. The
            results are open to interpretation and may not hold any more
            value than gut feeling. Nonetheless the data would be
            interesting.</p>
            <p>Finally, consider the following: the thing under test is
            testing its testers.</p>
            </div>
    </div>
  </content>
</entry>
<entry>
  <title>Responsive IFrame</title>
  <id>https://richiejp.com/responsive-iframe</id>
  <published>2022-11-09T12:53:44Z</published>
  <updated>2022-11-09T12:53:44Z</updated>
  <link rel="alternate" href="https://richiejp.com/responsive-iframe" />
  <summary>Using an IFrame and CSS to embed a responsive calendar widget
in Wordpress</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <h1 id="intro">Intro</h1>
            <p>This isn’t about embedding a YouTube video, which seems to
            be the subject of the majority of articles on this topic. I
            have made an
            <a href="svelte-booking-app-one">availability calendar web
            app</a> which adapts to the screen size.</p>
            <p>My first user has a Wordpress site and wants to embed it
            on a page. The options for doing this are:</p>
            <ol style="list-style-type: decimal">
            <li>Reimplement the availability calendar frontend in
            Wordpress.</li>
            <li>Inject the HTML and JS without an IFrame.</li>
            <li>Use an IFrame.</li>
            </ol>
            <p>Option one is not a bad idea in general, but I don’t like
            PHP or React. I also don’t want complicated interactions
            between open and closed source.</p>
            <p>Option two is horrendous. Mashing JS and CSS together from
            online sources results in non-reproducible behaviour. I also
            have a relatively strict content security policy on <a
            href="https://dobu.uk">DoBu.uk</a> and hooks to report
            errors.</p>
            <p>Finally there is the IFrame, the main advantage of which
            is near total isolation. This is also why it causes
            trouble.</p>
            <h1 id="fixed-size">Fixed size</h1>
            <p>If your IFrame content is a fixed size, then you don’t
            have an issue. Just set the width and height to the content
            size.</p>
            <p>Below is the availability calendar iframe forcibly set to
            the width of an iPhone X/XS (according to Firefox). The
            height is the minimum required to display the calendar
            without scrolling.</p>
            <iframe title="Availability and booking calendar" width="375px" height="600px" src="https://dobu.uk/availability/richiejp?no_header&amp;no_footer&amp;min_scroll=640&amp;new_tab_enquire">
            </iframe>
            <p>I’ll quote the HTML to save you opening dev tools.</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode html"><code class="sourceCode html"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="dt">&lt;</span><span class="kw">iframe</span></span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a><span class="ot">    title</span><span class="op">=</span><span class="st">&quot;Availability and booking calendar&quot;</span></span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a><span class="ot">    width</span><span class="op">=</span><span class="st">&quot;375px&quot;</span></span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a><span class="ot">    height</span><span class="op">=</span><span class="st">&quot;600px&quot;</span></span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a><span class="ot">    src</span><span class="op">=</span><span class="st">&quot;https://dobu.uk/availability/richiejp?no_header</span><span class="er">&amp;</span><span class="st">no_footer</span><span class="er">&amp;</span><span class="st">min_scroll=640</span><span class="er">&amp;</span><span class="st">new_tab_enquire&quot;</span></span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a><span class="dt">&gt;</span></span>
<span id="cb1-7"><a href="#cb1-7" tabindex="-1"></a><span class="dt">&lt;/</span><span class="kw">iframe</span><span class="dt">&gt;</span></span></code></pre></div>
            <p>The only styling applied to this is by Bulma’s
            “minireset” which removes the default border.</p>
            <p>Usually at this size DoBu.uk would still allow scrolling
            (see <a
            href="https://dobu.uk/availability/richiejp">dobu.uk/availability</a>
            on a phone). However with the <code>min_scroll=640</code>
            query param, the interface switches to a fixed height view
            when the width goes below 640px.</p>
            <p>This prevents scrolling within the IFrame when the IFrame
            is taking up most of the screen width. My present opinion is
            that scrolling within an IFrame is OK if there is space
            around it. Otherwise it messes up the UI.</p>
            <p>Note that, in general, setting the <code>width</code> and
            <code>height</code> attributes to an absolute value is a bad
            idea. These should be set to <code>100%</code> and the
            dimensions controlled by the enclosing element with CSS.
            This holds even for a fixed size, because it is consistent
            with how most other elements are sized.</p>
            <h1 id="resizing">Resizing</h1>
            <p>My IFrame content has mobile and desktop versions
            (i.e. it’s responsive). Things get worse if its
            content can change size and you want the IFrame to take the
            size of its content.</p>
            <p>We have two basic cases:</p>
            <ol style="list-style-type: decimal">
            <li>Resize the IFrame to fit the screen size or
            orientation.</li>
            <li>Resize the IFrame to fit its content.</li>
            </ol>
            <p>So far I have avoided needing to do the latter. For
            security reasons, finding the content size of an IFrame is a pain.
            It requires setting up a communication channel
            (e.g. <code>window.postMessage</code> if available).</p>
            <p>The former can be done entirely with HTML and CSS. Below
            is a responsive IFrame.</p>
            <div class="dobuuk-availability">
            <iframe title="Availability and booking calendar" width="100%" height="100%" src="https://dobu.uk/availability/richiejp?no_header&amp;no_footer&amp;min_scroll=640&amp;new_tab_enquire">
            </iframe>
            </div>
            <p>The HTML is still very simple:</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode html"><code class="sourceCode html"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="dt">&lt;</span><span class="kw">div</span><span class="ot"> class</span><span class="op">=</span><span class="st">&quot;dobuuk-availability&quot;</span><span class="dt">&gt;</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="dt">&lt;</span><span class="kw">iframe</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="ot">    title</span><span class="op">=</span><span class="st">&quot;Availability and booking calendar&quot;</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="ot">    width</span><span class="op">=</span><span class="st">&quot;100%&quot;</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a><span class="ot">    height</span><span class="op">=</span><span class="st">&quot;100%&quot;</span></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a><span class="ot">    src</span><span class="op">=</span><span class="st">&quot;https://dobu.uk/availability/richiejp?no_header</span><span class="er">&amp;</span><span class="st">no_footer</span><span class="er">&amp;</span><span class="st">min_scroll=640</span><span class="er">&amp;</span><span class="st">new_tab_enquire&quot;</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a><span class="dt">&gt;</span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a><span class="dt">&lt;/</span><span class="kw">iframe</span><span class="dt">&gt;</span></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a><span class="dt">&lt;/</span><span class="kw">div</span><span class="dt">&gt;</span></span></code></pre></div>
            <p>Now the CSS, excluding properties set by minireset:</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode css"><code class="sourceCode css"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="fu">.dobuuk-availability</span> {</span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a>    <span class="kw">height</span><span class="ch">:</span> <span class="dv">600</span><span class="dt">px</span><span class="op">;</span></span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a>    <span class="kw">width</span><span class="ch">:</span> <span class="dv">100</span><span class="dt">%</span><span class="op">;</span></span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a>}</span>
<span id="cb3-5"><a href="#cb3-5" tabindex="-1"></a></span>
<span id="cb3-6"><a href="#cb3-6" tabindex="-1"></a><span class="im">@media</span> <span class="fu">(</span><span class="kw">min-width</span><span class="ch">:</span> <span class="dv">768</span><span class="dt">px</span><span class="fu">)</span> <span class="kw">and</span> <span class="fu">(</span><span class="kw">min-height</span><span class="ch">:</span> <span class="dv">1080</span><span class="dt">px</span><span class="fu">)</span> {</span>
<span id="cb3-7"><a href="#cb3-7" tabindex="-1"></a>    <span class="fu">.dobuuk-availability</span> {</span>
<span id="cb3-8"><a href="#cb3-8" tabindex="-1"></a>        <span class="kw">height</span><span class="ch">:</span> <span class="dv">70</span><span class="dt">vh</span><span class="op">;</span></span>
<span id="cb3-9"><a href="#cb3-9" tabindex="-1"></a>    }</span>
<span id="cb3-10"><a href="#cb3-10" tabindex="-1"></a>}</span>
<span id="cb3-11"><a href="#cb3-11" tabindex="-1"></a></span>
<span id="cb3-12"><a href="#cb3-12" tabindex="-1"></a><span class="fu">.dobuuk-availability</span> iframe {</span>
<span id="cb3-13"><a href="#cb3-13" tabindex="-1"></a>    <span class="kw">border</span><span class="ch">:</span> <span class="dv">1</span><span class="dt">px</span> <span class="dv">solid</span> <span class="cn">#e2e8f0</span><span class="op">;</span></span>
<span id="cb3-14"><a href="#cb3-14" tabindex="-1"></a>}</span></code></pre></div>
            <p>On small screens the height is set to 600px. On larger
            ones we set it to 70% of the viewport height.</p>
            <p>The media queries provide a heuristic for when there is
            enough space around the IFrame for inner scrolling. This
            requires tweaking, but the basics are there.</p>
            <p>Possibly there are other tricks we could apply as well.
            For instance snapping to the IFrame when it gains focus.</p>
            <p>However I think it is best to keep as much complication
            as possible on the side of the thing being embedded. This
            makes it easy for someone to embed DoBu.uk. In particular if
            they have a strict CSP or other restrictions on JS.</p>
    </div>
  </content>
</entry>
<entry>
  <title>On creating a booking app with SvelteKit</title>
  <id>https://richiejp.com/svelte-booking-app-one</id>
  <published>2022-08-03T13:26:25+01:00</published>
  <updated>2023-04-08T11:04:44+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/svelte-booking-app-one" />
  <summary>A systems programmer’s thoughts on indiehacking and creating a
web app with SvelteKit and Redis</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <h1 id="intro">Intro</h1>
            <p>For the most part over the last 5 or 6 years I have been
            employed to work on the Linux kernel. Finding, reproducing
            and fixing bugs. Also on test automation and things I am
            unmotivated to write about. To get a feel for the kind of
            stuff I usually work on, here are some articles I have
            written related to my job:</p>
            <ul>
            <li><a href="custom-c-static-analysis-tools">A review of
            tools for rolling your own C static analysis</a></li>
            <li><a href="a-rare-data-race">Fuzzy Sync: Winning a rare
            data race</a></li>
            <li><a href="cgroup-compat-layer">Supporting both Linux
            CGroup APIs</a></li>
            <li><a href="https://youtu.be/Z6Bhhjpj1w4">How to write eBPF
            byte code by hand</a></li>
            <li><a href="zig-cross-compile-ltp-ltx-linux">Minimal Linux
            VM cross compiled with Clang and Zig</a></li>
            </ul>
            <p>I also wrote a <a
            href="https://fosdem.org/2020/schedule/event/testing_openqa_jdp/">data
            analysis framework in Julia</a> which failed horribly. The
            major issue was that no one I wanted to use it would use it. I
            made it do the job I needed it to do. However the engineers
            I wanted to use and expand it wouldn’t do so.</p>
            <p>Other people were interested in what it could do if I (or
            someone else) made it do it. Indeed it turned out that
            Palantir had made a very successful business out of a
            similar concept.</p>
            <p>This helped form my conclusions on <a
            href="ways-to-help-your-project-fail">what not to do when
            creating a project</a>. The semi-serious linked article can
            be summarised as follows. You can encourage failure by doing
            these things:</p>
            <ul>
            <li>Neomania: Combine multiple new things at once</li>
            <li>Antisocial: Avoid talking to people</li>
            <li>Quixotic: Future-proof, abstract and generalise</li>
            </ul>
            <p>As you shall see, I am not terribly good at following my
            own advice. However, perhaps I have managed to avoid the
            worst excesses.</p>
            <h2 id="update">UPDATE</h2>
            <p>Since writing this the app is now live.</p>
            <iframe title="Availability and booking calendar" width="375px" height="700px" src="https://dobu.uk/availability/richiejp?no_header&amp;no_footer">
            </iframe>
            <p>At the time of writing I’m putting the app into
            maintenance mode while I find other ways to make my life
            more difficult. I have written more about it on <a
            href="https://www.indiehackers.com/product/dobu-uk">IndieHackers</a>.</p>
            <h1 id="indie-hacking">Indie Hacking</h1>
            <p>I want to create my own products and be exposed to the
            risk of running my own business. Meanwhile I don’t want to
            completely run out of beer tokens and be forced to drink
            lighter fluid like Withnail.</p>
            <p>It would be more respectable for me to use my family as
            the reason why I can’t just quit my job and go full
            entrepreneur. However that wouldn’t explain why I didn’t do
            it before I had a family. So really it’s just left to your
            imagination (or you can buy my book when I finish and
            publish it).</p>
            <p>Anyway my employer were good enough to allow me to go
            part time and work on my own product or freelance. My backup
            plan was to quit and freelance initially. This is almost
            what happened, but handing my notice in seemed to push
            things along.</p>
            <p>Initially it looked like they had a serious problem with
            me going part-time, but this wasn’t the case at all. Someone
            just needed to do the work to change my contract and IP
            agreement. Don’t get me wrong, nobody liked me going
            part-time, but it was deemed better than me disappearing
            altogether.</p>
            <p>Being on 50% salary and splitting my week in half is less
            than ideal. I find it far better to work on things serially
            in solid blocks of at least a week.</p>
            <p>Meanwhile the salary will stop us from going bankrupt,
            however it’s “not enough”. I could elaborate further, but I
            think it suffices to say that sooner or later we will be
            uncomfortable.</p>
            <p>Nevertheless this is good enough; I should be able to do
            something with it. A lot of ink has been spilled on whether
            people should try a side project or go full time
            immediately. Obviously it depends on your circumstances,
            including what opportunities are in front of you.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Update!</p>
            <p>Splitting your time 50/50 is fine. It’s important to move
            fast, but there is a hard ceiling to the work hours in a
            week anyway. There are far more important things than
            maximising focus time.</p>
            <p>To be clear, chunks of a few hours of focus time are
            essential to get anything done. After that though, what one
            chooses to do is far more important.</p>
            </div>
            </div>
            <p>The major problem was (and perhaps still is) finding a
            problem which I can identify a sensible solution for.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Update!</p>
            <p>Or sales, but if you have a great product then it only <a
            href="https://twitter.com/nntaleb/status/1575437452617072640">needs
            to be put in front of people</a>. Still, I would seek to get
            buy-in from more than one person before writing a line of
            code for a future product.</p>
            </div>
            </div>
            <p>I explored a few opportunities, some of which are on
            hold, but settled on a booking app in collaboration with my
            sister, a professional musician. This is more than feasible
            to complete as a self funded part-time project.</p>
            <p>It’s incredible to think that in 2022 this isn’t a solved
            problem. However she wasn’t completely happy with any of the
            available solutions.</p>
            <p>She has an existing system comprised of disparate
            components which allows her to automate some aspects of her
            business. The idea is to create a streamlined and tightly
            integrated app from the system.</p>
            <p>Firstly so that it can be resold to similar businesses or
            individuals. Secondly to allow some types of automation
            which require complicated logic. Thirdly to make the
            booking process highly polished.</p>
            <p>To be clear my sister is more technical than many other
            people with similar businesses. Not to mention the amount of
            time she has spent over the years on her website and admin
            process. Other people are not as likely to cobble together
            their own automation out of various nocode tools.</p>
            <p>Because my sister’s system is quite complicated, not to
            mention the things she hasn’t been able to implement herself
            yet, I asked her to provide an idea for a minimum viable
            product (MVP). This turned out to be an availability
            calendar and simple enquiry form.</p>
            <p>These things exist already of course; they just don’t fit
            the use case. They don’t look right and are usually too
            complicated and expensive. Meanwhile it’s clear there are
            performers out there who could make use of such a product if
            it were simple to use.</p>
            <p>Presently they will be taking enquiries for days where
            they are already booked or unavailable for some other
            reason. My sister also pointed out that her availability is
            a marketing tool. Her unavailable days are evidence she is
            in demand.</p>
            <p>Whether or not people are willing to part with money for
            these reasons is another matter. The proliferation of
            similar software would suggest that people are willing to
            pay for this. Perhaps this won’t include the people in this
            tiny niche, we shall see.</p>
            <p>I decided to make this a subscription based web
            application for the obvious reasons. My target audience
            aren’t going to want to install the software themselves on a
            web server or meddle with the source code. Furthermore this
            doesn’t preclude me from making it “shared” or open source
            at a later date.</p>
            <p>You may also wonder if working with family is wise. I
            think that depends on your family and what it is you are
            doing. For the time being my sister is under no obligation
            to use or promote the app and I’m not under any obligation
            to provide it. At least beyond the basic principle that I
            said I would try, so I will at least take it to the point
            that it’s clear it won’t work.</p>
            <p>If it does work to any extent, then things get more
            complicated. We still need to decide what amount of profit
            the app needs to make for each of us to keep us motivated.
            The amount of motivation for each of us depends on the
            division of responsibilities and intellectual property being
            exposed.</p>
            <p>Leaving this kind of thing to be sorted out later
            requires a lot of trust. We somehow need to negotiate this
            without making Christmas dinner very awkward. The risk of a
            serious fallout is very low in my opinion, but it wouldn’t
            be good if it did happen.</p>
            <h1 id="techfoolery">Techfoolery</h1>
            <p>For someone who has hitherto avoided front-end
            development like the plague, this may not be considered the
            wisest choice. Indeed keeping myself focused on features
            visible to the end user has been a nightmare.</p>
            <p>At one point I began writing my own database so that I
            could run the app on <a
            href="nanos-clone3-brk-and-nodejs">Nanos unikernel VMs</a>
            and avoid making NodeJS (de)serialise RESP (Redis’s
            communication format). Without writing native modules,
            NodeJS/V8 is pretty bad at reading any format other than
            JSON because <code>JSON.parse</code> is implemented in
            highly optimised native code by the V8 team.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Update: I’m probably wrong about the performance. It
            seems V8 can deliver near native performance. The problem is
            the JavaScript ecosystem.</p>
            </div>
            </div>
            <p>So really it makes sense to have a database that just
            sends and receives NodeJS optimised JSON over a web socket
            (or whatever). At least that is what I think. I haven’t
            implemented it because it is not even remotely necessary for
            my booking app (although I think it would be similar to
            Firestore, see below).</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Update: Or MongoDB and indeed PouchDB discussed
            below.</p>
            </div>
            </div>
            <p>On the plus side, I see a non-stop list of ways in which
            web app development can be improved. Forcing myself to write
            a web app has exposed me to the real challenges of modern
            web app development.</p>
            <p>There is clearly a lot that can be done at every stage in
            the stack. There is no shortage of ideas; the issue is
            finding one that I can monetise in a reasonable time
            frame.</p>
            <p>I have previously done web dev in a bunch of frameworks.
            None of which I was terribly familiar with, so I went
            looking for what start ups and indie hackers use today. I
            guessed this is the group I want to copy because people
            who make the wrong choices are filtered from the group.</p>
            <p>It appeared that the most popular choice was Next.js on
            Vercel with Firestore as the DB. Next up was SvelteKit, also
            on Vercel, with PostgreSQL on Supabase or whatever.</p>
            <p>I decided that Svelte(Kit) was too new and tried Next.js
            first. However I hated it, or at least I hated React. It’s a
            bloated verbose awful thing with docs floating around for
            legacy API features. It seemed like a lot of work to do
            something very simple in React.</p>
            <p>So I tried SvelteKit also on Vercel. Initially with Bulma
            CSS which I have used before, but got annoyed with it and
            switched to TailwindCSS which seems to be more popular. I’m
            still using SvelteKit and TailwindCSS which says a lot about
            those two things.</p>
            <p>Svelte creates really svelte client side JS and the
            Svelte syntax is also nice and concise. This contrasts
            heavily with other client JS libraries which transfer a lot
            of JS and get into all kinds of complications to make pages
            load quickly.</p>
            <p>The ‘Kit’ part of SvelteKit is not quite as brilliant.
            It’s not V1 yet which is my own fault for using it before
            V1. However I wonder if even when it is V1 I will still have
            a mild itch to replace it.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Update!</p>
            <p>I got “hosed” by SvelteKit and ended up replacing the
            Kit part. They totally changed routing and I never figured
            out how to get error reporting the way I wanted it.</p>
            <p>Meanwhile the Svelte part has been steady as a rock. I
            replaced Vite with an ESBuild script, wrote my own client
            side router and used Polkadot on the server.</p>
            </div>
            </div>
            <p>Everything was going pretty well with Vercel until I
            started looking into databases. In fact it is better to say
            everything was going pretty well until I looked into
            databases. This threw all my hosting plans into
            disarray.</p>
            <p>It’s worth mentioning that at this point, despite some
            chopping and changing with front end frameworks, I had
            quickly created a mock UI that my sister was happy with. It
            is quite similar to <a
            href="https://www.dobu.uk/availability/richiejp">what’s
            there now</a>.</p>
            <p>Things got rapidly bogged down though. Perhaps due to my
            personality, but I also suspect there is something missing
            in the Vercel or serverless calculus.</p>
            <p>Regardless of what database you use it’s not going to be
            in the same VLAN as Vercel’s server (or whatever they would
            like to call it, considering they are serverless). Possibly
            it may be in the same datacenter depending on who you use to
            host your database.</p>
            <p>This severely reduces the number of options you have for
            databases. You need a secure DB exposed to the internet and
            that can accept many simultaneous connections (because
            “serverless”). Regardless of what you pick there will be
            some latency added.</p>
            <p>Firestore, amongst others, can handle this and I suppose
            you can host your app on Firebase as well. I imagine that
            latency is not too much of an issue then. However you can’t
            very well pick up an app designed for Firestore and Firebase
            then move it to a different host.</p>
            <p>Firestore in particular is owned by Google whose calendar
            and auth APIs I do use. However I don’t want a significant
            portion of my tech stack to be reliant on one vendor. I’m
            not just thinking about my current project, but future ones
            as well.</p>
            <p>Also the pricing is based on a variety of metrics,
            including some relatively high level ones like the number
            of document writes and reads, as well as the stored data
            and network egress. This doesn’t leave many degrees of
            freedom with which to optimise.</p>
            <p>When pricing is based on VM resources you have many
            degrees of freedom to optimise. You can send more or fewer
            requests based on what actually results in lower memory,
            storage and processor usage.</p>
            <p>In other words, by using these hosted databases you are
            giving up a lot of optionality. Admittedly this won’t matter
            for most developers, most of the time. Perhaps I could have
            saved a lot of time by using Firestore.</p>
            <p>Without having decided on my hosting, I decided to try
            using PostgreSQL for my database. It is very popular and
            there are lots of options for hosting it that also support a
            NodeJS instance.</p>
            <p>I tried using it with an ORM that is not an ORM, but it
            just seemed to get in the way of using PostgreSQL features.
            So then I tried writing raw SQL/DDL where (as usual) I was
            greeted with weird compilation errors and a manual full of
            ad-hoc features no one outside of enterprise uses.</p>
            <p>Previously I have simply resorted to using Redis and
            implementing the indexes myself if really necessary. I
            thought I would be sensible this time and not do that.
            However here I am again, using Redis.</p>
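            <p>For anyone wondering what “implementing the indexes
            myself” means in practice, here is a minimal sketch. It is
            not code from the booking app: the <code>Booking</code> type,
            the <code>BookingStore</code> class and its methods are all
            hypothetical, and plain in-memory maps stand in for a
            key-value store so the example runs on its own.</p>
<pre>
```typescript
// Hypothetical in-memory stand-in for a key-value store such as Redis.
// Nothing here is taken from the booking app; all names are illustrative.
type Booking = { id: string; performer: string; date: string };

class BookingStore {
  // Primary records, keyed by booking id.
  private records = new Map();
  // Hand-maintained secondary index: performer -> set of booking ids.
  private byPerformer = new Map();

  // Insert or overwrite a booking, keeping the index in step.
  put(b: Booking): void {
    const old = this.records.get(b.id);
    // Drop the stale index entry before writing the new one.
    if (old !== undefined) this.unindex(old);
    this.records.set(b.id, b);
    let ids = this.byPerformer.get(b.performer);
    if (ids === undefined) {
      ids = new Set();
      this.byPerformer.set(b.performer, ids);
    }
    ids.add(b.id);
  }

  // Delete a booking and its index entry; unknown ids are ignored.
  remove(id: string): void {
    const old = this.records.get(id);
    if (old === undefined) return;
    this.unindex(old);
    this.records.delete(id);
  }

  // Look up via the index rather than scanning every record.
  datesFor(performer: string): string[] {
    const ids = this.byPerformer.get(performer);
    if (ids === undefined) return [];
    const dates: string[] = [];
    ids.forEach((id: string) => {
      dates.push(this.records.get(id).date);
    });
    return dates.sort();
  }

  private unindex(b: Booking): void {
    const ids = this.byPerformer.get(b.performer);
    if (ids !== undefined) ids.delete(b.id);
  }
}
```
</pre>
            <p>The cost is that every write path has to remember to
            update the index, which is exactly the bookkeeping a
            relational database does for you. With Redis the same idea
            would roughly map onto a set per performer kept in step with
            the record on each write.</p>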
            <p>However this is where things start to get crazy. I got
            fixated on using an embedded database, like LevelDB,
            BadgerDB or PouchDB. This would then allow me to host the
            whole app in a single VM on Vultr running nanos.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Update: I severely regret not trying PouchDB in NodeJS
            and a central CouchDB instance for operations which must be
            consistent.</p>
            </div>
            </div>
            <p>In the meantime I was making some progress with app
            features using Redis. Including some Lua libraries, a
            feature introduced in Redis 7.0, that I later was forced to
            remove due to the awful nature of Lua. This is despite
            having made hosting decisions based on needing Redis 7.0
            that almost nobody supported.</p>
            <p>The Redis Lua problems made me question Redis altogether,
            again. However I was also thinking that actually an embedded
            DB is not such a great idea because how would it scale?
            PouchDB can probably do that, but its performance is maybe
            not so great to begin with.</p>
            <p>Then I would think well I should just use Redis, I know
            that works already. I just need to throw out the Lua; I can
            make do without it.</p>
            <p>However Redis forks to write to disk, so it doesn’t work
            with nanos. So then I have to host it elsewhere. Maybe in a
            container, but I hate containers. Reading the Kubernetes
            documentation almost gave me a hernia.</p>
            <p>I started writing a JSON document database in Go (I’ve
            never used Go before) based on BadgerDB. With the intention
            of creating something that my NodeJS VMs could offload data
            operations into.</p>
            <p>By this point I think I had fully lost the plot. The
            whole point of using Redis was that it is just a default
            option. I’ve used it before, it’s simple, well supported
            etc. It’s probably not ideal, but this project isn’t for
            exploring databases.</p>
            <p>At any rate, I discovered Fly.io and got over my hatred
            of containers for now. I think Fly gives me enough freedom
            while taking away most concerns about updating the Linux
            kernel and various packages. I especially do not want to
            maintain Kubernetes.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Update!</p>
            <p>Oh, but it doesn’t end there. I got into trouble with
            Redis because Fly.io shut down the instance because the
            physical host storage was full. Then I tried Upstash, which
            I didn’t like. Eventually I switched to KeyDB with
            active-active replication which is HA and eventually
            consistent across regions.</p>
            <p>It’s still far from perfect. Again; should have tried
            Couch/PouchDB.</p>
            </div>
            </div>
            <p>So eventually my tech stack is comprised of a NodeJS 18.x
            container with the following deps: SvelteKit (Vite), Vitest,
            TailwindCSS, TypeScript and ioredis. Then another container
            running Redis 7.x.</p>
            <p>My app talks to Google’s REST APIs, it also sends mail
            through Fastmail using JMAP and logs to Loki. Amongst other
            things, I implemented these from scratch instead of using
            libraries.</p>
            <p>I haven’t regretted these decisions; they make the server
            side code small and maintainable. I’m only left to deal with
            breaking changes from SvelteKit and Vite updates.</p>
            <p>As everyone says, even quite basic NPM libraries
            themselves use many dependencies. They are also full of
            shims and bloat that isn’t completely solved with “tree
            shaking”. Not least because I still have to download it all
            on my dev machines.</p>
            <p>There are a lot of things which really bug me, but all in
            all my Svelte app is very svelte and I feel like I now have
            a tailwind allowing me to progress <em>très vite</em>.</p>
            <h1 id="updated-contemplation">Updated contemplation</h1>
            <p>Eventually I ended up writing a whole lot from scratch. I
            don’t regret this when it comes to the core of the
            application. That is the calendar itself. What I do regret
            is putting this level of effort into things merely
            supporting the core component.</p>
            <p>Reducing one’s dependencies to those with a very high
            value-to-maintenance ratio means I have good performance,
            security and low maintenance. More importantly to the
            customer, I have microscopic control over the product
            behaviour where it matters.</p>
            <p>The problem is obviously that it’s a big effort to write
            everything just the way you want it. Meanwhile the payoff
            for having microscopic control over the menu and landing
            page (for example) is small.</p>
            <p>I believe writing from scratch is a huge edge. However
            it’s an edge because it is hard. You can easily blow a month
            reinventing the wheel and end up with a substandard
            solution.</p>
            <p>On the other hand, you can get a vastly better solution,
            that is faster, more secure and lower maintenance. Without
            doing anything fancy, just removing complexity your
            application doesn’t need.</p>
            <p>As usual I believe the solution here is not compromise,
            but a bar-bell strategy: write the core from scratch and use
            no/low code for the periphery.</p>
            <p>If I were to start again I would either use BudiBase or
            Google Apps Script/JSON for the user panel. The latter is
            necessary to get a Google Workspace plugin anyway.</p>
            <p>Likewise I’d use some kind of no/low code platform for
            things like the landing page and docs. I think this will
            produce an average solution without much effort. Which is
            all that is needed to let the core shine.</p>
            <p>I think it is essential that the no/low code platform is
            fully managed. It’s possible to self host BudiBase, but I do
            not trust apps that complex unless they can be given
            constant attention.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Three ways to help your project fail</title>
  <id>https://richiejp.com/ways-to-help-your-project-fail</id>
  <published>2022-01-23T16:44:59Z</published>
  <updated>2022-01-26T10:51:59Z</updated>
  <link rel="alternate" href="https://richiejp.com/ways-to-help-your-project-fail" />
  <summary>And make it an expensive failure</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <h1 id="intro">Intro</h1>
            <p>It’s said that failures provide more information than
            successes. If you know what doesn’t work then you can focus
            on what might work. If something succeeds, maybe you got
            lucky; perhaps you didn’t test the limits.</p>
            <p>When something does fail, it is sometimes clear why.
            Especially when you spot a potential failure and structure
            things as an experiment. It’s fine and necessary to fail.
            What isn’t, is to fail in a costly manner. Where you test
            things in the wrong order or don’t know what to blame. Where
            you spend a long time building supporting structures for a
            baseless undertaking.</p>
            <p>So here is a list of ways in which I have ruined
            projects. <em>Not just ruined them in the sense that they
            ultimately failed, but also in that I made them costly</em>.
            I did learn from them, a lot, however the time expenditure
            to payoff ratio is not as good as it could be.</p>
            <p>Doing the things listed here will take you away from
            fulfilling the basic premise of a project. Indeed it will
            set testing the project’s core assumptions further down the
            road. Also when the project eventually fails, it will be
            difficult to untangle the root causes.</p>
            <p>Additionally I have suggested some mitigations or what to
            do instead. These I am less confident about than what not to
            do. It has to be said that knowing what to do in the
            abstract is only 1% of the battle.</p>
            <p>For example, it’s all very well and good to know that
            presenting your work at a conference is a good way to get
            users or feedback. In fact you don’t even need to know it’s a
            good idea. The cost of trying it is relatively small, so you
            can merely try it… right?</p>
            <p>Usually the things you should be trying or avoiding are
            obvious. The problem is making yourself do them or not. The
            true battle is “self-overcoming” as Nietzsche put it. That
            said, it sometimes helps to have a map of the terrain, even
            if some important details are missing. Also there is maybe,
            just maybe, the possibility someone will find this article
            entertaining and share it. Thus bringing more traffic to <a
            href="https://richiejp.com">my website</a>.</p>
            <p>It amuses me quite a lot that <a href="/https/richiejp.com/block-chain">web3
            and block-chain</a> fall foul of all of these if you
            discount gamblers as real end users. If you look at
            Ethereum; they invented their own language and VM (Solana
            did better here, but only slightly). The underlying
            technology is new to begin with. They don’t ask the little
            people if they want their private details storing on a block
            chain forever, where they have to pay to <em>add</em> to
            them. As for whether block-chain is an over-abstracted,
            over-generalised, utopian vision that is essentially
            quixotical, well, time will tell.</p>
            <p>Of course a block-chain or an element of block-chain
            could still be successful. Often the biggest discoveries are
            made when someone breaks the rules. Does all the things that
            shouldn’t work and still succeeds. These make interesting
            and heroic stories. However one has to look past these and
            look at how similar attempts fared.</p>
            <h1 id="neomania">Neomania</h1>
            <p><strong>1. Combine multiple new things at
            once</strong></p>
            <p>Implement a novel idea in a new language, with a new
            database, using new protocols and methodologies (for
            example). Combine things that have never been used together
            before. Do all the work yourself to fix any issues that
            arise.</p>
            <p>Ignore the lack of a track record by the creators of
            multiple components you rely on. Ignore the lack of scrutiny
            these technologies have come under or the fact most of the
            users are enthusiasts and True Believers.</p>
            <p>Try to sell a new language and ecosystem to existing
            associates. In addition to selling them on the basic premise
            of your project. Which is also a new concept for them to
            digest.</p>
            <h2 id="mitigations">Mitigations</h2>
            <p>At the outset, if possible, use tried and tested
            components that are known to work together. If there is a
            strong reason to use something new, try to make it a single
            component. Then experiment with new tech by swapping out
            components one at a time. Keeping most of the things the
            same.</p>
            <p>Sometimes new components come as a package and work
            better together. If this is the case then try to stick to
            what the largest number of other people are doing. Usually
            this is not the case though, new projects try to be
            compatible with the largest existing code bases. As you
            should.</p>
            <p>The cost of <em>not</em> using something new can also be
            very high in theory. It’s just hidden until years down the
            line. So while it is an option to ignore new stuff, you
            probably don’t want to do that either. It’s often less
            costly to spend time trying things both ways and
            comparing.</p>
            <p>However if something is constantly changing in a way that
            breaks backwards compatibility, then a comparison at a
            single point in time may not be a fair one. If a component
            introduces breaking changes on a regular basis, this may
            outweigh the advantages of using it.</p>
            <p>I suspect that it is perhaps easier to find new
            associates than to convince old ones to learn new things. It
            makes sense to find people for whom only one or two aspects
            of your project are new to them.</p>
            <h1 id="antisocial">Antisocial</h1>
            <p><strong>2. Avoid talking to people</strong></p>
            <p>Focus on documentation and 1-to-n communication. Don’t
            partake in general discussions in whatever communities your
            project is related to. Don’t ask other people for advice on
            the best way to do something. Remain aloof and
            unapproachable. Avoid any display of weakness or
            ignorance.</p>
            <p>If you do ask for feedback, then make it very specific.
            If people comment on things outside the scope of what you
            requested, then ignore it and say you are not interested in
            that.</p>
            <p>Meanwhile consume large amounts of 1-to-n communication
            yourself. Don’t engage with the authors directly and risk
            exposing a treasured opinion to public scrutiny.</p>
            <h2 id="mitigations-1">Mitigations</h2>
            <p>Before writing any code, ask around if anyone would be
            interested. Whether someone is already working on it.
            Whether there is a way to cut out the problem entirely
            without writing code. If you think you know the answer
            already, put it out there for people to refute.</p>
            <p>Ask for advice on what components and methods to use.
            People love to be asked for advice. They don’t like being
            asked to do actual work. However people enjoy giving advice
            so much that it often comes unsolicited.</p>
            <p>Find some communities related to your project and take
            part in the general discussion. Avoid reading articles and
            discussions you’re not likely to respond to. Most of what is
            discussed <em>is noise</em> when viewed passively. Only as
            an active participant does it become a medium for building
            lasting bonds.</p>
            <p>Only write minimal documentation and instead make
            yourself available for discussion. You don’t want to spend
            time writing and maintaining documentation no one reads or
            understands. Your idea of how to explain things may be way
            off. It’s obvious in an interactive discussion if people
            don’t get what you are trying to say. Not so, if someone
            reads some documentation then disappears into the ether.</p>
            <h1 id="quixotic">Quixotic</h1>
            <p><strong>3. Future-proof, abstract and
            generalise</strong></p>
            <p>Spend time and effort designing and writing code for a
            future that may never come. Try to generalise, decouple and
            abstract every little detail. Split up and compartmentalise
            all of the functionality, before you have implemented most
            of it.</p>
            <p>Pick an abstract utopian model of how an application
            should look and try to stick to it. Regardless of if it
            makes the initial implementation far more difficult. Stick
            to your ideals at all costs. Especially if those ideals are
            lifted from a handful of unusual projects or companies. Top
            marks if you read about them in a blog or by a non-founding
            employee. Bonus points if they are more of a journalist than
            a software developer.</p>
            <p>Design for your application to scale in all directions.
            To scale in users, to scale in complexity, to scale in
            contributors, to scale in robustness. You especially want it
            to be decentralised across multiple dimensions.</p>
            <h2 id="mitigations-2">Mitigations</h2>
            <p>Do the simplest thing that works first and by “works”, I
            mean works in enough of a capacity to prove the concept.
            Satisfying the underlying purpose of the software is of
            primary importance.</p>
            <p>I’ve never seen a project fail because it didn’t use a
            distributed thingy that can scale to infinity from the
            outset. Not even using Perl and a single DB instance seems
            to result in failure. Scaling issues can be solved by
            profiling and fixing shoddy code for the most part.</p>
            <p>Often it’s only really one aspect of an application which
            needs to scale into the realm where structural changes are
            required. So a distributed thingy or a bit of code written
            in C (or equivalent) can be brought in to deal with that. If
            one constantly simplifies their application and rewrites
            shoddy code, farming bits out is not all that difficult.</p>
            <p>Of course things are not constantly simplified, so this
            is often a very painful process, where code has to be
            disentangled and pulled out at the same time as being
            cleaned up. However, this still doesn’t cause projects to
            fail if their basis is sound and alternatives are not
            obvious. That’s why there is so much “bad” code around.</p>
            <p>Ironically, the easiest thing to change later is usually
            the simplest. Introducing abstractions and decoupling
            things often introduces complications. Abstractions are
            rarely free (even if there is no runtime cost), so you have
            to be conscious of the cost-to-benefit ratio.</p>
            <p>Sometimes abstractions and fancy scalable solutions are
            as easy to use as anything else. They are the default way of
            doing things. In this case you should obviously use them;
            why not? However, when in doubt, I think it’s best to assume
            they aren’t.</p>
            <p>Sometimes an abstraction or generalisation actually helps
            to understand the core problem. However, I often find it
            better to do this on paper and still only write code for a
            specific case. It’s easy to change a model. Code, on the
            other hand, has to compile and run correctly.</p>
    </div>
  </content>
</entry>
<entry>
  <title>zc-data</title>
  <id>https://richiejp.com/zc-data</id>
  <published>2021-12-28T13:18:48Z</published>
  <updated>2022-02-04T12:14:47Z</updated>
  <link rel="alternate" href="https://richiejp.com/zc-data" />
  <summary>Some data-structures and algorithms in C. Built and tested
with Zig and Meson.</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>To brush up on my coding-test skills, I decided to
            implement radix sort and hash map. This was a chance to
            learn how some core data structures and algorithms work.
            Also I tried out using Zig to build and test the project. I
            believe there is a very strong case for using Zig in
            existing C codebases.</p>
            <p>I ended up using Meson to build some tests written in C
            as well. Zig is frankly very new and unstable (version
            0.9.0-dev). At least the C header translation gave me some
            trouble. Hopefully I will get time to look into that at a
            later date.</p>
            <h1 id="zig">Zig</h1>
            <p>Insofar as Zig did work, I was very happy with it. It
            can cross-compile C to almost any architecture with minimal
            difficulty. That is usually a nightmare unless using
            something like Buildroot. Even then, Buildroot recompiles
            GCC to make cross-compilation work and is more challenging
            than Zig, at least if you ignore Zig’s immaturity.</p>
            <p>Zig’s build system is quite nice; it’s just a Zig file
            and library. Below is the Zig build file.</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="at">const</span> std <span class="op">=</span> <span class="bu">@import</span>(<span class="st">&quot;std&quot;</span>);</span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a></span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a><span class="at">const</span> cflags <span class="op">=</span> <span class="op">&amp;</span>[_][]<span class="at">const</span> <span class="dt">u8</span>{ <span class="st">&quot;-Wall&quot;</span><span class="op">,</span> <span class="st">&quot;-Wextra&quot;</span> };</span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a></span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> addCExe(</span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a>    b<span class="op">:</span> <span class="op">*</span>std<span class="op">.</span>build<span class="op">.</span>Builder<span class="op">,</span></span>
<span id="cb1-7"><a href="#cb1-7" tabindex="-1"></a>    name<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span></span>
<span id="cb1-8"><a href="#cb1-8" tabindex="-1"></a>    files<span class="op">:</span> []<span class="at">const</span> []<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span></span>
<span id="cb1-9"><a href="#cb1-9" tabindex="-1"></a>) <span class="op">*</span>std<span class="op">.</span>build<span class="op">.</span>LibExeObjStep {</span>
<span id="cb1-10"><a href="#cb1-10" tabindex="-1"></a>    <span class="at">const</span> exe <span class="op">=</span> b<span class="op">.</span>addExecutable(name<span class="op">,</span> <span class="cn">null</span>);</span>
<span id="cb1-11"><a href="#cb1-11" tabindex="-1"></a>    exe<span class="op">.</span>linkLibC();</span>
<span id="cb1-12"><a href="#cb1-12" tabindex="-1"></a>    exe<span class="op">.</span>addIncludeDir(<span class="st">&quot;include&quot;</span>);</span>
<span id="cb1-13"><a href="#cb1-13" tabindex="-1"></a>    exe<span class="op">.</span>addCSourceFiles(files<span class="op">,</span> cflags);</span>
<span id="cb1-14"><a href="#cb1-14" tabindex="-1"></a>    exe<span class="op">.</span>install();</span>
<span id="cb1-15"><a href="#cb1-15" tabindex="-1"></a></span>
<span id="cb1-16"><a href="#cb1-16" tabindex="-1"></a>    <span class="cf">return</span> exe;</span>
<span id="cb1-17"><a href="#cb1-17" tabindex="-1"></a>}</span>
<span id="cb1-18"><a href="#cb1-18" tabindex="-1"></a></span>
<span id="cb1-19"><a href="#cb1-19" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> build(b<span class="op">:</span> <span class="op">*</span>std<span class="op">.</span>build<span class="op">.</span>Builder) <span class="dt">void</span> {</span>
<span id="cb1-20"><a href="#cb1-20" tabindex="-1"></a>    <span class="at">const</span> exes <span class="op">=</span> [_]<span class="op">*</span>std<span class="op">.</span>build<span class="op">.</span>LibExeObjStep{</span>
<span id="cb1-21"><a href="#cb1-21" tabindex="-1"></a>        addCExe(b<span class="op">,</span> <span class="st">&quot;sort_tests&quot;</span><span class="op">,</span> <span class="op">&amp;.</span>{</span>
<span id="cb1-22"><a href="#cb1-22" tabindex="-1"></a>            <span class="st">&quot;src/sort_tests.c&quot;</span><span class="op">,</span></span>
<span id="cb1-23"><a href="#cb1-23" tabindex="-1"></a>            <span class="st">&quot;src/radix_sort.c&quot;</span><span class="op">,</span></span>
<span id="cb1-24"><a href="#cb1-24" tabindex="-1"></a>            <span class="st">&quot;src/hash.c&quot;</span><span class="op">,</span></span>
<span id="cb1-25"><a href="#cb1-25" tabindex="-1"></a>        })<span class="op">,</span></span>
<span id="cb1-26"><a href="#cb1-26" tabindex="-1"></a>        addCExe(b<span class="op">,</span> <span class="st">&quot;map_tests&quot;</span><span class="op">,</span> <span class="op">&amp;.</span>{</span>
<span id="cb1-27"><a href="#cb1-27" tabindex="-1"></a>            <span class="st">&quot;src/map_tests.c&quot;</span><span class="op">,</span></span>
<span id="cb1-28"><a href="#cb1-28" tabindex="-1"></a>            <span class="st">&quot;src/hash.c&quot;</span><span class="op">,</span></span>
<span id="cb1-29"><a href="#cb1-29" tabindex="-1"></a>        })<span class="op">,</span></span>
<span id="cb1-30"><a href="#cb1-30" tabindex="-1"></a>    };</span>
<span id="cb1-31"><a href="#cb1-31" tabindex="-1"></a></span>
<span id="cb1-32"><a href="#cb1-32" tabindex="-1"></a>    <span class="at">const</span> target <span class="op">=</span> b<span class="op">.</span>standardTargetOptions(<span class="op">.</span>{});</span>
<span id="cb1-33"><a href="#cb1-33" tabindex="-1"></a>    <span class="at">const</span> mode <span class="op">=</span> b<span class="op">.</span>standardReleaseOptions();</span>
<span id="cb1-34"><a href="#cb1-34" tabindex="-1"></a>    <span class="cf">for</span> (exes) <span class="op">|</span>exe<span class="op">|</span> {</span>
<span id="cb1-35"><a href="#cb1-35" tabindex="-1"></a>        exe<span class="op">.</span>setTarget(target);</span>
<span id="cb1-36"><a href="#cb1-36" tabindex="-1"></a>        exe<span class="op">.</span>setBuildMode(mode);</span>
<span id="cb1-37"><a href="#cb1-37" tabindex="-1"></a>    }</span>
<span id="cb1-38"><a href="#cb1-38" tabindex="-1"></a></span>
<span id="cb1-39"><a href="#cb1-39" tabindex="-1"></a>    <span class="at">const</span> tests <span class="op">=</span> b<span class="op">.</span>addTest(<span class="st">&quot;src/main.zig&quot;</span>);</span>
<span id="cb1-40"><a href="#cb1-40" tabindex="-1"></a>    tests<span class="op">.</span>linkLibC();</span>
<span id="cb1-41"><a href="#cb1-41" tabindex="-1"></a>    tests<span class="op">.</span>addIncludeDir(<span class="st">&quot;include&quot;</span>);</span>
<span id="cb1-42"><a href="#cb1-42" tabindex="-1"></a>    tests<span class="op">.</span>addCSourceFiles(<span class="op">&amp;.</span>{ <span class="st">&quot;src/radix_sort.c&quot;</span><span class="op">,</span> <span class="st">&quot;src/hash.c&quot;</span> }<span class="op">,</span> cflags);</span>
<span id="cb1-43"><a href="#cb1-43" tabindex="-1"></a></span>
<span id="cb1-44"><a href="#cb1-44" tabindex="-1"></a>    <span class="at">const</span> test_step <span class="op">=</span> b<span class="op">.</span>step(<span class="st">&quot;test&quot;</span><span class="op">,</span> <span class="st">&quot;Run tests&quot;</span>);</span>
<span id="cb1-45"><a href="#cb1-45" tabindex="-1"></a>    test_step<span class="op">.</span>dependOn(<span class="op">&amp;</span>tests<span class="op">.</span>step);</span>
<span id="cb1-46"><a href="#cb1-46" tabindex="-1"></a>}</span></code></pre></div>
            <p>This builds some C object files and executables. Also it
            builds a Zig executable that links to the C. The Zig
            executable just contains some unit tests. These all look
            similar to the following test.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="at">const</span> std <span class="op">=</span> <span class="bu">@import</span>(<span class="st">&quot;std&quot;</span>);</span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="at">const</span> c <span class="op">=</span> <span class="bu">@cImport</span>({</span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a>    <span class="bu">@cInclude</span>(<span class="st">&quot;slab.h&quot;</span>);</span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a>    <span class="bu">@cInclude</span>(<span class="st">&quot;sort.h&quot;</span>);</span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a>    <span class="bu">@cInclude</span>(<span class="st">&quot;hash.h&quot;</span>);</span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a>});</span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a><span class="kw">fn</span> lt(_<span class="op">:</span> <span class="dt">void</span><span class="op">,</span> lhs<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span> rhs<span class="op">:</span> <span class="dt">u64</span>) <span class="dt">bool</span> {</span>
<span id="cb2-10"><a href="#cb2-10" tabindex="-1"></a>    <span class="cf">return</span> lhs <span class="op">&lt;</span> rhs;</span>
<span id="cb2-11"><a href="#cb2-11" tabindex="-1"></a>}</span>
<span id="cb2-12"><a href="#cb2-12" tabindex="-1"></a></span>
<span id="cb2-13"><a href="#cb2-13" tabindex="-1"></a><span class="kw">test</span> <span class="st">&quot;order matches&quot;</span> {</span>
<span id="cb2-14"><a href="#cb2-14" tabindex="-1"></a>    <span class="at">var</span> prng <span class="op">=</span> std<span class="op">.</span>rand<span class="op">.</span>SplitMix64<span class="op">.</span>init(<span class="dv">0</span><span class="er">x505e</span>);</span>
<span id="cb2-15"><a href="#cb2-15" tabindex="-1"></a>    <span class="at">var</span> unsorted_ints<span class="op">:</span> [<span class="dv">100000</span>]<span class="dt">u64</span> <span class="op">=</span> <span class="cn">undefined</span>;</span>
<span id="cb2-16"><a href="#cb2-16" tabindex="-1"></a>    <span class="cf">for</span> (unsorted_ints) <span class="op">|*</span>n<span class="op">|</span> {</span>
<span id="cb2-17"><a href="#cb2-17" tabindex="-1"></a>        n<span class="op">.*</span> <span class="op">=</span> prng<span class="op">.</span>next();</span>
<span id="cb2-18"><a href="#cb2-18" tabindex="-1"></a>    }</span>
<span id="cb2-19"><a href="#cb2-19" tabindex="-1"></a>    <span class="at">var</span> sorted_ints <span class="op">=</span> unsorted_ints;</span>
<span id="cb2-20"><a href="#cb2-20" tabindex="-1"></a>    <span class="at">var</span> radix_sorted_ints <span class="op">=</span> unsorted_ints;</span>
<span id="cb2-21"><a href="#cb2-21" tabindex="-1"></a></span>
<span id="cb2-22"><a href="#cb2-22" tabindex="-1"></a>    std<span class="op">.</span>sort<span class="op">.</span>sort(<span class="dt">u64</span><span class="op">,</span> <span class="op">&amp;</span>sorted_ints<span class="op">,</span> {}<span class="op">,</span> lt);</span>
<span id="cb2-23"><a href="#cb2-23" tabindex="-1"></a>    _ <span class="op">=</span> c<span class="op">.</span>radix_sort(<span class="op">&amp;</span>radix_sorted_ints<span class="op">,</span> radix_sorted_ints<span class="op">.</span>len);</span>
<span id="cb2-24"><a href="#cb2-24" tabindex="-1"></a></span>
<span id="cb2-25"><a href="#cb2-25" tabindex="-1"></a>    <span class="cf">for</span> (sorted_ints) <span class="op">|</span>v<span class="op">,</span> i<span class="op">|</span> {</span>
<span id="cb2-26"><a href="#cb2-26" tabindex="-1"></a>        <span class="cf">try</span> std<span class="op">.</span>testing<span class="op">.</span>expect(radix_sorted_ints[i] <span class="op">==</span> v);</span>
<span id="cb2-27"><a href="#cb2-27" tabindex="-1"></a>    }</span>
<span id="cb2-28"><a href="#cb2-28" tabindex="-1"></a>}</span></code></pre></div>
            <p>This tests my radix sort implementation
            (<code>c.radix_sort</code>) against Zig’s sort function in
            the standard library (<code>std.sort.sort</code>). This all
            worked quite nicely if we just ignore bugs, missing docs and
            such. The foundations being set here are very strong.</p>
            <p>The Zig guys seem to be serious about augmenting C
            projects. Clearly its inventor wants to replace C with Zig.
            However, they are taking a very pragmatic approach to it.
            There is Zig the programming language and Zig the C build
            system. The former can piggyback on the latter.</p>
            <h1 id="meson">Meson</h1>
            <p>Having said that, I have been burnt (repeatedly) before
            by trying to do too many new things at once. Or equally,
            being sidetracked so far from my primary goal that it’s no
            longer in sight. So after having some issues with Zig I
            decided to try Meson.</p>
            <p>This worked surprisingly nicely. I was expecting to be
            forced back into using CMake or simply Make. However I got
            things to work very easily in Meson.</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode python"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a>project(<span class="st">&#39;zc-data&#39;</span>, <span class="st">&#39;c&#39;</span>,</span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a>  version : <span class="st">&#39;0.1&#39;</span>,</span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a>  default_options : [</span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a>      <span class="st">&#39;warning_level=3&#39;</span>,</span>
<span id="cb3-5"><a href="#cb3-5" tabindex="-1"></a>      <span class="st">&#39;b_sanitize=address,undefined&#39;</span>,</span>
<span id="cb3-6"><a href="#cb3-6" tabindex="-1"></a>      <span class="st">&#39;b_lto=true&#39;</span></span>
<span id="cb3-7"><a href="#cb3-7" tabindex="-1"></a>  ])</span>
<span id="cb3-8"><a href="#cb3-8" tabindex="-1"></a></span>
<span id="cb3-9"><a href="#cb3-9" tabindex="-1"></a>incdir <span class="op">=</span> include_directories(<span class="st">&#39;include&#39;</span>)</span>
<span id="cb3-10"><a href="#cb3-10" tabindex="-1"></a></span>
<span id="cb3-11"><a href="#cb3-11" tabindex="-1"></a>add_global_arguments(<span class="st">&#39;-fanalyzer&#39;</span>, language : <span class="st">&#39;c&#39;</span>)</span>
<span id="cb3-12"><a href="#cb3-12" tabindex="-1"></a></span>
<span id="cb3-13"><a href="#cb3-13" tabindex="-1"></a>executable(<span class="st">&#39;sort_tests&#39;</span>,</span>
<span id="cb3-14"><a href="#cb3-14" tabindex="-1"></a>           <span class="st">&#39;src/radix_sort.c&#39;</span>,</span>
<span id="cb3-15"><a href="#cb3-15" tabindex="-1"></a>           <span class="st">&#39;src/sort_tests.c&#39;</span>,</span>
<span id="cb3-16"><a href="#cb3-16" tabindex="-1"></a>           install : true,</span>
<span id="cb3-17"><a href="#cb3-17" tabindex="-1"></a>           include_directories : incdir)</span>
<span id="cb3-18"><a href="#cb3-18" tabindex="-1"></a></span>
<span id="cb3-19"><a href="#cb3-19" tabindex="-1"></a>executable(<span class="st">&#39;map_tests&#39;</span>,</span>
<span id="cb3-20"><a href="#cb3-20" tabindex="-1"></a>           <span class="st">&#39;src/map_tests.c&#39;</span>,</span>
<span id="cb3-21"><a href="#cb3-21" tabindex="-1"></a>           include_directories : incdir)</span>
<span id="cb3-22"><a href="#cb3-22" tabindex="-1"></a></span>
<span id="cb3-23"><a href="#cb3-23" tabindex="-1"></a>executable(<span class="st">&#39;slab_tests&#39;</span>,</span>
<span id="cb3-24"><a href="#cb3-24" tabindex="-1"></a>           <span class="st">&#39;src/slab_tests.c&#39;</span>,</span>
<span id="cb3-25"><a href="#cb3-25" tabindex="-1"></a>           include_directories : incdir)</span></code></pre></div>
            <p>Above is the build script. It creates a few test
            executables. The only unusual thing here is that I have set
            the defaults to include my favorite sanitizers. Also I
            enabled GCC’s static analyzer (<code>-fanalyzer</code>),
            which found one bug at compile time.</p>
            <p>This setup has a lot of the same benefits as Zig, which
            at least turns on the undefined behavior sanitizer as well.
            In my opinion the sanitizers should all be enabled by
            default. Using a modern C compiler with all these features
            turned off is like disabling ABS and traction control on a
            sports car, then driving it blindfolded and drunk as a
            sailor on shore leave. Compilers themselves can’t turn them
            on by default without causing chaos. However, build systems
            could.</p>
            <h1 id="radix-sort">Radix sort</h1>
            <p>Radix sort is the best kind of sort if you can allocate
            memory. My implementation allocates a buffer the same size
            as the input data. There are of course other constraints to
            using radix sort. However, its performance and simplicity
            for fixed-size keys are quite incredible.</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a>def_attr <span class="dt">static</span> <span class="kw">inline</span> u8 radix_n<span class="op">(</span>u64 key<span class="op">,</span> usize n<span class="op">)</span></span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a>    <span class="cf">return</span> <span class="op">(</span>key <span class="op">&gt;&gt;</span> <span class="op">(</span>n <span class="op">&lt;&lt;</span> <span class="dv">3</span><span class="op">))</span> <span class="op">&amp;</span> <span class="bn">0xff</span><span class="op">;</span></span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a></span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a><span class="dt">void</span> radix_sort<span class="op">(</span>u64 <span class="op">*</span>keys<span class="op">,</span> <span class="dt">const</span> usize keys_len<span class="op">)</span></span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb4-8"><a href="#cb4-8" tabindex="-1"></a>    usize col_indx<span class="op">[</span><span class="dv">256</span><span class="op">];</span></span>
<span id="cb4-9"><a href="#cb4-9" tabindex="-1"></a>    u64 <span class="op">*</span>keys_buf <span class="op">=</span> calloc<span class="op">(</span>keys_len<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>u64<span class="op">));</span></span>
<span id="cb4-10"><a href="#cb4-10" tabindex="-1"></a>    u64 <span class="op">*</span>keys_l <span class="op">=</span> keys<span class="op">,</span> <span class="op">*</span>keys_r <span class="op">=</span> keys_buf<span class="op">;</span></span>
<span id="cb4-11"><a href="#cb4-11" tabindex="-1"></a></span>
<span id="cb4-12"><a href="#cb4-12" tabindex="-1"></a>    <span class="dt">const</span> usize max_pass <span class="op">=</span> <span class="kw">sizeof</span><span class="op">(*</span>keys<span class="op">);</span></span>
<span id="cb4-13"><a href="#cb4-13" tabindex="-1"></a></span>
<span id="cb4-14"><a href="#cb4-14" tabindex="-1"></a>    <span class="cf">for</span> <span class="op">(</span>usize p <span class="op">=</span> <span class="dv">0</span><span class="op">;</span> p <span class="op">&lt;</span> max_pass<span class="op">;</span> p<span class="op">++)</span> <span class="op">{</span></span>
<span id="cb4-15"><a href="#cb4-15" tabindex="-1"></a>        memset<span class="op">(</span>col_indx<span class="op">,</span> <span class="dv">0</span><span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>col_indx<span class="op">));</span></span>
<span id="cb4-16"><a href="#cb4-16" tabindex="-1"></a></span>
<span id="cb4-17"><a href="#cb4-17" tabindex="-1"></a>        <span class="cf">for</span> <span class="op">(</span>usize i <span class="op">=</span> <span class="dv">0</span><span class="op">;</span> i <span class="op">&lt;</span> keys_len<span class="op">;</span> i<span class="op">++)</span> <span class="op">{</span></span>
<span id="cb4-18"><a href="#cb4-18" tabindex="-1"></a>            u8 key <span class="op">=</span> radix_n<span class="op">(</span>keys_l<span class="op">[</span>i<span class="op">],</span> p<span class="op">);</span></span>
<span id="cb4-19"><a href="#cb4-19" tabindex="-1"></a>            col_indx<span class="op">[</span>key<span class="op">]++;</span></span>
<span id="cb4-20"><a href="#cb4-20" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb4-21"><a href="#cb4-21" tabindex="-1"></a></span>
<span id="cb4-22"><a href="#cb4-22" tabindex="-1"></a>        <span class="cf">for</span> <span class="op">(</span>usize i <span class="op">=</span> <span class="dv">0</span><span class="op">,</span> offs <span class="op">=</span> <span class="dv">0</span><span class="op">;</span> i <span class="op">&lt;</span> <span class="dv">256</span><span class="op">;</span> i<span class="op">++)</span> <span class="op">{</span></span>
<span id="cb4-23"><a href="#cb4-23" tabindex="-1"></a>            <span class="dt">const</span> usize freq <span class="op">=</span> col_indx<span class="op">[</span>i<span class="op">];</span></span>
<span id="cb4-24"><a href="#cb4-24" tabindex="-1"></a>            col_indx<span class="op">[</span>i<span class="op">]</span> <span class="op">=</span> offs<span class="op">;</span></span>
<span id="cb4-25"><a href="#cb4-25" tabindex="-1"></a>            offs <span class="op">+=</span> freq<span class="op">;</span></span>
<span id="cb4-26"><a href="#cb4-26" tabindex="-1"></a></span>
<span id="cb4-27"><a href="#cb4-27" tabindex="-1"></a>            <span class="cf">if</span> <span class="op">(</span>i <span class="op">==</span> <span class="dv">255</span><span class="op">)</span></span>
<span id="cb4-28"><a href="#cb4-28" tabindex="-1"></a>                assert<span class="op">(</span>offs <span class="op">==</span> keys_len<span class="op">);</span></span>
<span id="cb4-29"><a href="#cb4-29" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb4-30"><a href="#cb4-30" tabindex="-1"></a></span>
<span id="cb4-31"><a href="#cb4-31" tabindex="-1"></a>        <span class="cf">for</span> <span class="op">(</span>usize i <span class="op">=</span> <span class="dv">0</span><span class="op">;</span> i <span class="op">&lt;</span> keys_len<span class="op">;</span> i<span class="op">++)</span> <span class="op">{</span></span>
<span id="cb4-32"><a href="#cb4-32" tabindex="-1"></a>            u8 key <span class="op">=</span> radix_n<span class="op">(</span>keys_l<span class="op">[</span>i<span class="op">],</span> p<span class="op">);</span></span>
<span id="cb4-33"><a href="#cb4-33" tabindex="-1"></a>            usize col <span class="op">=</span> col_indx<span class="op">[</span>key<span class="op">]++;</span></span>
<span id="cb4-34"><a href="#cb4-34" tabindex="-1"></a>            keys_r<span class="op">[</span>col<span class="op">]</span> <span class="op">=</span> keys_l<span class="op">[</span>i<span class="op">];</span></span>
<span id="cb4-35"><a href="#cb4-35" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb4-36"><a href="#cb4-36" tabindex="-1"></a></span>
<span id="cb4-37"><a href="#cb4-37" tabindex="-1"></a>        assert<span class="op">(</span>col_indx<span class="op">[</span><span class="dv">255</span><span class="op">]</span> <span class="op">==</span> keys_len<span class="op">);</span></span>
<span id="cb4-38"><a href="#cb4-38" tabindex="-1"></a></span>
<span id="cb4-39"><a href="#cb4-39" tabindex="-1"></a>        keys <span class="op">=</span> keys_r<span class="op">;</span></span>
<span id="cb4-40"><a href="#cb4-40" tabindex="-1"></a>        keys_r <span class="op">=</span> keys_l<span class="op">;</span></span>
<span id="cb4-41"><a href="#cb4-41" tabindex="-1"></a>        keys_l <span class="op">=</span> keys<span class="op">;</span></span>
<span id="cb4-42"><a href="#cb4-42" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb4-43"><a href="#cb4-43" tabindex="-1"></a></span>
<span id="cb4-44"><a href="#cb4-44" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>keys <span class="op">==</span> keys_buf<span class="op">)</span> <span class="op">{</span></span>
<span id="cb4-45"><a href="#cb4-45" tabindex="-1"></a>        memcpy<span class="op">(</span>keys_l<span class="op">,</span> keys<span class="op">,</span> keys_len <span class="op">*</span> <span class="kw">sizeof</span><span class="op">(*</span>keys<span class="op">));</span></span>
<span id="cb4-46"><a href="#cb4-46" tabindex="-1"></a>        keys <span class="op">=</span> keys_l<span class="op">;</span></span>
<span id="cb4-47"><a href="#cb4-47" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb4-48"><a href="#cb4-48" tabindex="-1"></a></span>
<span id="cb4-49"><a href="#cb4-49" tabindex="-1"></a>    free<span class="op">(</span>keys_buf<span class="op">);</span></span>
<span id="cb4-50"><a href="#cb4-50" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>This is the whole implementation. Even though it
            allocates memory, it was still an order of magnitude faster
            than Zig’s default sort function. Zig uses a beast of an
            algorithm called “block sort”, which performs the sort
            in-place and is fairly generic. The extra constraints which
            block sort satisfies have a large impact on speed and
            complexity.</p>
            <p>I won’t try to explain how radix sort works. I’ll just
            point out that my implementation counts how many keys will
            go in each radix. It uses those figures to calculate offsets
            into a shared buffer. This means that we don’t need a full
            sized buffer for each radix. It also complicates the
            algorithm a bit by introducing some pointer gymnastics.</p>
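            <p>To make the counting idea concrete, here is a minimal
            sketch of a single pass over just the lowest byte of each
            key. It is not taken from zc-data:
            <code>scatter_by_low_byte</code> and its parameters are
            hypothetical names, and it uses the standard
            <code>uint64_t</code> and <code>size_t</code> types rather
            than the project’s <code>u64</code> and
            <code>usize</code>.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of zc-data: one stable counting pass
 * that scatters `src` into `dst`, ordered by the lowest byte of each key. */
static void scatter_by_low_byte(const uint64_t *const src,
                                uint64_t *const dst, const size_t len)
{
    size_t offsets[256];
    size_t start = 0;

    memset(offsets, 0, sizeof(offsets));

    /* Step 1: histogram, i.e. how many keys fall into each bucket. */
    for (size_t i = 0; i < len; i++)
        offsets[src[i] & 0xff]++;

    /* Step 2: an exclusive prefix sum turns counts into start offsets. */
    for (size_t b = 0; b < 256; b++) {
        const size_t count = offsets[b];

        offsets[b] = start;
        start += count;
    }

    /* Every key was counted exactly once. */
    assert(start == len);

    /* Step 3: place each key in the next free slot of its bucket. */
    for (size_t i = 0; i < len; i++)
        dst[offsets[src[i] & 0xff]++] = src[i];
}
```
]]></code></pre></div>
            <p>Keys that share a byte value keep their relative order,
            which is what allows a pass like this to be repeated for
            each byte, from least to most significant.</p>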
            <p>The function <code>radix_n</code> is preceded by a macro
            containing some function attributes. Let’s explore those for
            a minute.</p>
            <h1 id="attributes">Attributes</h1>
            <p>For legacy reasons C defaults to all the wrong settings.
            I don’t just mean compiler flags. By default variables and
            pointers can be written to. You have to write
            <code>const</code> in several places if something shouldn’t
            change. In the above code I’ve actually forgotten to add
            <code>const</code> in some places it could be added.</p>
            <p>I now try to add <code>const</code> everywhere because
            over a long period I have observed that it catches enough
            bugs to be worth the pain. Zig deals with the problem nicely
            by forcing all variables that are actually variable to be
            declared with <code>var</code>.</p>
            <p>This is just one example of where C supports something
            nice, but it’s not the default. C compilers also support a
            number of extensions, including some exposed as
            <em>attributes</em>, which should be the default.</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="pp">#ifndef ZCD_ATTRS_H</span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a><span class="pp">#  define ZCD_ATTRS_H</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a><span class="pp">#  define attr</span><span class="op">(...)</span><span class="pp"> __attribute__</span><span class="op">((</span><span class="pp">__VA_ARGS__</span><span class="op">))</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a><span class="pp">#  define nonnull_attr attr</span><span class="op">(</span><span class="pp">nonnull</span><span class="op">)</span></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a><span class="pp">#  define warn_unused_attr attr</span><span class="op">(</span><span class="pp">warn_unused_result</span><span class="op">)</span></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a><span class="pp">#  define access_attr</span><span class="op">(...)</span><span class="pp"> attr</span><span class="op">(</span><span class="pp">access </span><span class="op">(</span><span class="pp">__VA_ARGS__</span><span class="op">))</span></span>
<span id="cb5-9"><a href="#cb5-9" tabindex="-1"></a></span>
<span id="cb5-10"><a href="#cb5-10" tabindex="-1"></a><span class="pp">#  define def_attr attr</span><span class="op">(</span><span class="pp">nonnull</span><span class="op">,</span><span class="pp"> warn_unused_result</span><span class="op">)</span></span>
<span id="cb5-11"><a href="#cb5-11" tabindex="-1"></a><span class="pp">#endif</span></span></code></pre></div>
            <p>GCC introduced the <code>__attribute__((...))</code>
            syntax, but it is now also supported by Clang. A new C
            standard also introduces some new syntax for attributes. I
            look forward to using that in 20 years.</p>
            <p>There are many attributes, but most of them are useless
            for catching bugs. They are mostly to deal with edge cases
            and allow optimizations. However, at least one is very
            useful: <code>warn_unused_result</code>.</p>
            <p>Again it should be the default that if a function returns
            something, then it is a bug to discard it. Other languages
            force you to assign the result of a function to a dummy
            variable (e.g. <code>_</code>) if it is not needed. This at
            least shows you didn’t simply forget to check the
            result.</p>
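            <p>As a rough sketch of what this buys you, the snippet below
            marks a small parser with the attribute. The names
            <code>parse_digit</code> and <code>digit_or</code> are
            hypothetical and only serve the illustration; the
            <code>attr</code> macro is repeated so the snippet stands on
            its own. With GCC or Clang, calling <code>parse_digit</code>
            and ignoring its return value should produce a warning at
            compile time.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>

/* Same shorthand as in the attribute header shown earlier. */
#define attr(...) __attribute__((__VA_ARGS__))

/* Hypothetical helper: stores the parsed digit in *out and returns
 * true on success. Ignoring the return value draws a warning. */
attr(nonnull, warn_unused_result)
static bool parse_digit(const char *const s, int *const out)
{
    if (s[0] < '0' || s[0] > '9' || s[1] != '\0')
        return false;

    *out = s[0] - '0';
    return true;
}

/* Hypothetical caller: the result is checked, so no warning. */
static int digit_or(const char *const s, const int fallback)
{
    int d;

    /* parse_digit(s, &d);  <- this line alone would be flagged */
    if (!parse_digit(s, &d))
        return fallback;

    assert(d >= 0 && d <= 9);
    return d;
}
```
]]></code></pre></div>
            <p>Here <code>digit_or("7", -1)</code> yields 7, while
            anything that is not a single digit yields the fallback.</p>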
            <h1 id="hash-map">Hash Map</h1>
            <p>At its core the hash map is an array. The index of an
            element or key is calculated from its value somehow. My hash
            map is an array with extra meta-data inserted at intervals.
            It’s of the variety which deals with collisions via chaining.
            This seems unpopular these days. However I think it’s
            possible that chaining is more cache efficient if one avoids
            pure linked lists.</p>
            <p>On modern hardware cache lines are typically 64 bytes,
            meaning memory is loaded in 64-byte chunks. Because reading memory
            is much slower than most CPU instructions, we want to do as
            much work as possible on batches of 64 bytes. At least that
            is the simple theory I have stuck in my head.</p>
            <p>Therefore I have shaped the data so that everything is
            done on batches of 64 bytes. With 64-bit integers (or
            words), that is 8 units. So the main hash map vector is
            split into <em>slabs</em> of 64 bytes. The first word being
            used by meta data for the next 7 map entries. When multiple
            entries hash to the same value, they are put in a
            <em>bucket</em>.</p>
            <p>The buckets are split up into slabs of 64 bytes. The last
            word of each slab is used as a pointer to the next slab. So
            buckets are essentially still linked lists, but we clump
            items together to avoid chasing pointers. The first item of
            the first slab is also used to store the length of the
            bucket. This means the bucket items can take any value,
            including <code>NULL</code>.</p>
            <p>One reason the main vector (or array) has meta-data is
            because it allows us to store pointers to items directly in
            the <em>slots</em>. In fact we can store the items
            themselves in the slots if they fit in 8 bytes. This makes
            the best case very good. That is, when one item occupies a
            slot, we can retrieve it with a single cache line fill.</p>
            <p>When there is a collision we need to store a pointer to a
            bucket in the slot. Using the meta-data we can tell if the
            value in the slot is a pointer to a bucket or something
            else. When it is a bucket, then we need at least two cache
            line fills to retrieve the item. Up to 6 items can collide
            before this increases to 3 fills, when we must load another
            slab.</p>
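            <p>To make the bucket layout and the fill counts more
            concrete, here is a minimal sketch of a bucket walk. It is
            not the code from the repository: <code>struct
            bucket_slab</code> and <code>bucket_find</code> are
            assumed names, and items are compared as plain 64-bit
            values. The first slab holds the length plus six items,
            each following slab holds seven, and the last word always
            links to the next slab.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of one 64-byte bucket slab: seven payload words and
 * a pointer to the next slab. */
struct bucket_slab {
    uint64_t words[7];
    struct bucket_slab *next;
};

_Static_assert(sizeof(struct bucket_slab) == 64, "one cache line");

/* Returns a pointer to the first item equal to key, or NULL. *fills is
 * set to the number of bucket slabs visited; the fill for the main
 * vector slab comes on top of that. */
static const uint64_t *bucket_find(const struct bucket_slab *slab,
                                   const uint64_t key, size_t *const fills)
{
    /* words[0] of the first slab holds the item count. */
    size_t left = (size_t)slab->words[0];
    size_t i = 1;

    *fills = 1;
    while (left > 0) {
        if (i == 7) {
            /* Payload of this slab is exhausted, follow the link. */
            slab = slab->next;
            assert(slab != NULL);
            i = 0;
            (*fills)++;
        }
        if (slab->words[i] == key)
            return &slab->words[i];
        i++;
        left--;
    }

    return NULL;
}
```
]]></code></pre></div>
            <p>Because the walk is bounded by the stored length rather
            than by a sentinel, an item may hold any value, including
            zero. A bucket of six or fewer items reports one bucket
            fill, which is two in total once the main slab is
            counted.</p>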
            <p>Much of the time we will need to load the full value of a
            key (or item) to compare it to the key we are looking for.
            At least when there is a full hash collision. For now
            though, I’m just ignoring that.</p>
            <p>Instead of putting the meta-data in each slot, it is
            clumped together in one word at the start of each hash
            slab.</p>
            <div class="sourceCode" id="cb6"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb6-1"><a href="#cb6-1" tabindex="-1"></a><span class="kw">enum</span> hash_slot_kind <span class="op">{</span></span>
<span id="cb6-2"><a href="#cb6-2" tabindex="-1"></a>    hash_slot_nil<span class="op">,</span></span>
<span id="cb6-3"><a href="#cb6-3" tabindex="-1"></a>    hash_slot_imm<span class="op">,</span></span>
<span id="cb6-4"><a href="#cb6-4" tabindex="-1"></a>    hash_slot_ptr</span>
<span id="cb6-5"><a href="#cb6-5" tabindex="-1"></a><span class="op">}</span> attr<span class="op">(</span>packed<span class="op">);</span></span>
<span id="cb6-6"><a href="#cb6-6" tabindex="-1"></a>ASSERT_SIZE<span class="op">(</span><span class="kw">enum</span> hash_slot_kind<span class="op">,</span> <span class="dv">1</span><span class="op">);</span></span>
<span id="cb6-7"><a href="#cb6-7" tabindex="-1"></a></span>
<span id="cb6-8"><a href="#cb6-8" tabindex="-1"></a><span class="kw">union</span> hash_slot64 <span class="op">{</span></span>
<span id="cb6-9"><a href="#cb6-9" tabindex="-1"></a>    u64 imm_val<span class="op">;</span></span>
<span id="cb6-10"><a href="#cb6-10" tabindex="-1"></a>    <span class="kw">struct</span> slab64 <span class="op">*</span>bucket<span class="op">;</span></span>
<span id="cb6-11"><a href="#cb6-11" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb6-12"><a href="#cb6-12" tabindex="-1"></a>ASSERT_SIZE<span class="op">(</span><span class="kw">union</span> hash_slot64<span class="op">,</span> <span class="dv">8</span><span class="op">);</span></span>
<span id="cb6-13"><a href="#cb6-13" tabindex="-1"></a></span>
<span id="cb6-14"><a href="#cb6-14" tabindex="-1"></a><span class="kw">struct</span> hash_slot_kinds64 <span class="op">{</span></span>
<span id="cb6-15"><a href="#cb6-15" tabindex="-1"></a>    u8 error_bits<span class="op">;</span></span>
<span id="cb6-16"><a href="#cb6-16" tabindex="-1"></a>    <span class="kw">enum</span> hash_slot_kind kinds<span class="op">[</span><span class="dv">7</span><span class="op">];</span></span>
<span id="cb6-17"><a href="#cb6-17" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb6-18"><a href="#cb6-18" tabindex="-1"></a>ASSERT_SIZE<span class="op">(</span><span class="kw">struct</span> hash_slot_kinds64<span class="op">,</span> <span class="dv">8</span><span class="op">);</span></span>
<span id="cb6-19"><a href="#cb6-19" tabindex="-1"></a></span>
<span id="cb6-20"><a href="#cb6-20" tabindex="-1"></a><span class="kw">struct</span> hash_slab64 <span class="op">{</span></span>
<span id="cb6-21"><a href="#cb6-21" tabindex="-1"></a>    <span class="kw">struct</span> hash_slot_kinds64 slot<span class="op">;</span></span>
<span id="cb6-22"><a href="#cb6-22" tabindex="-1"></a>    <span class="kw">union</span> hash_slot64 slots<span class="op">[</span><span class="dv">7</span><span class="op">];</span></span>
<span id="cb6-23"><a href="#cb6-23" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb6-24"><a href="#cb6-24" tabindex="-1"></a>ASSERT_SIZE<span class="op">(</span><span class="kw">struct</span> hash_slab64<span class="op">,</span> <span class="dv">64</span><span class="op">);</span></span></code></pre></div>
            <p>Above you can see the meta-data in
            <code>struct hash_slot_kinds64</code>. Putting all the
            meta-data at the beginning makes alignment easier. Otherwise
            we would have one byte of meta-data then 8 bytes of data in
            each slot. This would most likely get padded or else we will
            have unaligned memory accesses.</p>
            <p>Presently we don’t need one byte of meta-data per entry.
            <code>error_bits</code> is unused and
            <code>enum hash_slot_kind</code> has only three values, so
            it can be represented by 2 bits. This means there is room
            for tighter packing or more meta-data.</p>
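            <p>One possible tighter packing, purely as a sketch and not
            something the library does: three kinds need only two bits
            each, so the kinds of all seven slots fit into 14 bits of
            a 16-bit word. The names <code>slot_kind</code>,
            <code>kinds_get</code> and <code>kinds_set</code> are made
            up for this example.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for enum hash_slot_kind; values 0..2 fit in two bits. */
enum slot_kind { slot_nil, slot_imm, slot_ptr };

/* Read the 2-bit kind of slot i (0..6) from the packed word. */
static enum slot_kind kinds_get(const uint16_t kinds, const unsigned i)
{
    assert(i < 7);
    return (enum slot_kind)((kinds >> (2 * i)) & 0x3);
}

/* Return a copy of the packed word with slot i set to kind. */
static uint16_t kinds_set(const uint16_t kinds, const unsigned i,
                          const enum slot_kind kind)
{
    const unsigned shift = 2 * i;

    assert(i < 7);
    return (uint16_t)((kinds & ~(0x3u << shift)) | ((unsigned)kind << shift));
}
```
]]></code></pre></div>
            <p>The cost is a shift and a mask on every access instead
            of a plain byte load, so whether this is worthwhile depends
            on what the freed bits would be used for.</p>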
            <p>Inserting the meta-data into the data vector does make
            calculating which slab an index belongs to a little
            complicated. I have to admit that I got a little confused
            and was trying to divide by 8. This was perhaps wishful
            thinking because 8 is a power of 2. Dividing by a power of
            two can be done with a fast right shift. However it’s
            division by 7 that is needed.</p>
            <div class="sourceCode" id="cb7"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb7-1"><a href="#cb7-1" tabindex="-1"></a><span class="dt">const</span> usize offset_indx <span class="op">=</span> i <span class="op">+</span> <span class="op">(</span>i<span class="op">/</span><span class="dv">7</span> <span class="op">+</span> <span class="dv">1</span><span class="op">);</span></span>
<span id="cb7-2"><a href="#cb7-2" tabindex="-1"></a><span class="dt">const</span> usize slab_indx <span class="op">=</span> offset_indx <span class="op">&gt;&gt;</span> <span class="dv">3</span><span class="op">;</span></span>
<span id="cb7-3"><a href="#cb7-3" tabindex="-1"></a><span class="dt">const</span> usize slot_indx <span class="op">=</span> <span class="op">(</span>offset_indx <span class="op">&amp;</span> <span class="bn">0x7</span><span class="op">)</span> <span class="op">-</span> <span class="dv">1</span><span class="op">;</span></span></code></pre></div>
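            <p>The same arithmetic can be wrapped in a small helper to
            check it in isolation. This is only a sketch:
            <code>struct slab_pos</code> and <code>slab_pos_of</code>
            are hypothetical names, and <code>size_t</code> stands in
            for <code>usize</code>. Writing <code>i = 7q + r</code>
            gives an offset index of <code>8q + r + 1</code>, so the
            shift recovers <code>q</code> and the mask recovers
            <code>r + 1</code>. The asserts confirm that this agrees
            with plain division and remainder by 7.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Position of a logical entry inside the slab vector. */
struct slab_pos {
    size_t slab; /* which 64-byte slab */
    size_t slot; /* which of its seven slots, 0..6 */
};

/* Skip one meta-data word per slab, then split into slab and slot. */
static struct slab_pos slab_pos_of(const size_t i)
{
    const size_t offset_indx = i + (i / 7 + 1);
    const struct slab_pos pos = {
        .slab = offset_indx >> 3,
        .slot = (offset_indx & 0x7) - 1,
    };

    /* Cross-check against the direct formulation. */
    assert(pos.slab == i / 7);
    assert(pos.slot == i % 7);

    return pos;
}
```
]]></code></pre></div>
            <p>For example, index 6 is the last slot of slab 0 and
            index 7 is the first slot of slab 1.</p>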
            <p>Every 7 items we have to add one to the index to skip the
            meta-data. Relatively speaking though, loading and storing
            items is simple. At least the functions are quite short.</p>
            <div class="sourceCode" id="cb8"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb8-1"><a href="#cb8-1" tabindex="-1"></a>attr<span class="op">(</span>nonnull<span class="op">,</span> warn_unused_result<span class="op">)</span></span>
<span id="cb8-2"><a href="#cb8-2" tabindex="-1"></a><span class="dt">static</span> <span class="kw">inline</span> <span class="kw">enum</span> hash_slot_kind hash_map64_load<span class="op">(</span><span class="dt">const</span> <span class="kw">struct</span> hash_map64 <span class="op">*</span><span class="dt">const</span> self<span class="op">,</span></span>
<span id="cb8-3"><a href="#cb8-3" tabindex="-1"></a>                          <span class="dt">const</span> usize i<span class="op">,</span></span>
<span id="cb8-4"><a href="#cb8-4" tabindex="-1"></a>                          u64 <span class="op">*</span><span class="dt">const</span> key<span class="op">,</span></span>
<span id="cb8-5"><a href="#cb8-5" tabindex="-1"></a>                          hash_map64_cmp_fn cmp<span class="op">)</span></span>
<span id="cb8-6"><a href="#cb8-6" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb8-7"><a href="#cb8-7" tabindex="-1"></a>    <span class="dt">const</span> usize offset_indx <span class="op">=</span> i <span class="op">+</span> <span class="op">(</span>i<span class="op">/</span><span class="dv">7</span> <span class="op">+</span> <span class="dv">1</span><span class="op">);</span></span>
<span id="cb8-8"><a href="#cb8-8" tabindex="-1"></a>    <span class="dt">const</span> usize slab_indx <span class="op">=</span> offset_indx <span class="op">&gt;&gt;</span> <span class="dv">3</span><span class="op">;</span></span>
<span id="cb8-9"><a href="#cb8-9" tabindex="-1"></a>    <span class="dt">const</span> usize slot_indx <span class="op">=</span> <span class="op">(</span>offset_indx <span class="op">&amp;</span> <span class="bn">0x7</span><span class="op">)</span> <span class="op">-</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb8-10"><a href="#cb8-10" tabindex="-1"></a>    <span class="dt">const</span> <span class="kw">struct</span> hash_slab64 <span class="op">*</span><span class="dt">const</span> slab <span class="op">=</span> self<span class="op">-&gt;</span>slabs <span class="op">+</span> slab_indx<span class="op">;</span></span>
<span id="cb8-11"><a href="#cb8-11" tabindex="-1"></a>    <span class="dt">const</span> <span class="kw">enum</span> hash_slot_kind <span class="op">*</span><span class="dt">const</span> kind <span class="op">=</span> slab<span class="op">-&gt;</span>slot<span class="op">.</span>kinds <span class="op">+</span> slot_indx<span class="op">;</span></span>
<span id="cb8-12"><a href="#cb8-12" tabindex="-1"></a>    <span class="dt">const</span> <span class="kw">union</span> hash_slot64 <span class="op">*</span><span class="dt">const</span> slot <span class="op">=</span> slab<span class="op">-&gt;</span>slots <span class="op">+</span> slot_indx<span class="op">;</span></span>
<span id="cb8-13"><a href="#cb8-13" tabindex="-1"></a></span>
<span id="cb8-14"><a href="#cb8-14" tabindex="-1"></a>    zcd_assert<span class="op">(</span>slab_indx <span class="op">&lt;</span> self<span class="op">-&gt;</span>slab_nr<span class="op">,</span> <span class="st">&quot;</span><span class="sc">%zu</span><span class="st"> &lt; </span><span class="sc">%zu</span><span class="st">&quot;</span><span class="op">,</span> slab_indx<span class="op">,</span> self<span class="op">-&gt;</span>slab_nr<span class="op">);</span></span>
<span id="cb8-15"><a href="#cb8-15" tabindex="-1"></a>    zcd_assert<span class="op">(</span>slot_indx <span class="op">&lt;</span> <span class="dv">7</span><span class="op">,</span> <span class="st">&quot;</span><span class="sc">%zu</span><span class="st"> &lt; 7&quot;</span><span class="op">,</span> slot_indx<span class="op">);</span></span>
<span id="cb8-16"><a href="#cb8-16" tabindex="-1"></a></span>
<span id="cb8-17"><a href="#cb8-17" tabindex="-1"></a>    <span class="cf">switch</span><span class="op">(*</span>kind<span class="op">)</span> <span class="op">{</span></span>
<span id="cb8-18"><a href="#cb8-18" tabindex="-1"></a>    <span class="cf">case</span> hash_slot_imm<span class="op">:</span> <span class="op">{</span></span>
<span id="cb8-19"><a href="#cb8-19" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>cmp<span class="op">(*</span>key<span class="op">,</span> slot<span class="op">-&gt;</span>imm_val<span class="op">))</span></span>
<span id="cb8-20"><a href="#cb8-20" tabindex="-1"></a>            <span class="cf">return</span> hash_slot_nil<span class="op">;</span></span>
<span id="cb8-21"><a href="#cb8-21" tabindex="-1"></a></span>
<span id="cb8-22"><a href="#cb8-22" tabindex="-1"></a>        <span class="op">*</span>key <span class="op">=</span> slot<span class="op">-&gt;</span>imm_val<span class="op">;</span></span>
<span id="cb8-23"><a href="#cb8-23" tabindex="-1"></a>        <span class="cf">return</span> hash_slot_imm<span class="op">;</span></span>
<span id="cb8-24"><a href="#cb8-24" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb8-25"><a href="#cb8-25" tabindex="-1"></a>    <span class="cf">case</span> hash_slot_ptr<span class="op">:</span> <span class="op">{</span></span>
<span id="cb8-26"><a href="#cb8-26" tabindex="-1"></a>        <span class="dt">const</span> u64 <span class="op">*</span>bckt_slot<span class="op">;</span></span>
<span id="cb8-27"><a href="#cb8-27" tabindex="-1"></a></span>
<span id="cb8-28"><a href="#cb8-28" tabindex="-1"></a>        for_each_slab64_u64<span class="op">(</span>slot<span class="op">-&gt;</span>bucket<span class="op">,</span> bckt_slot<span class="op">)</span> <span class="op">{</span></span>
<span id="cb8-29"><a href="#cb8-29" tabindex="-1"></a>            <span class="cf">if</span> <span class="op">(</span>cmp<span class="op">(*</span>key<span class="op">,</span> <span class="op">*</span>bckt_slot<span class="op">))</span></span>
<span id="cb8-30"><a href="#cb8-30" tabindex="-1"></a>                <span class="cf">continue</span><span class="op">;</span></span>
<span id="cb8-31"><a href="#cb8-31" tabindex="-1"></a></span>
<span id="cb8-32"><a href="#cb8-32" tabindex="-1"></a>            <span class="op">*</span>key <span class="op">=</span> <span class="op">*</span>bckt_slot<span class="op">;</span></span>
<span id="cb8-33"><a href="#cb8-33" tabindex="-1"></a>            <span class="cf">return</span> hash_slot_ptr<span class="op">;</span></span>
<span id="cb8-34"><a href="#cb8-34" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb8-35"><a href="#cb8-35" tabindex="-1"></a></span>
<span id="cb8-36"><a href="#cb8-36" tabindex="-1"></a>        <span class="cf">return</span> hash_slot_nil<span class="op">;</span></span>
<span id="cb8-37"><a href="#cb8-37" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb8-38"><a href="#cb8-38" tabindex="-1"></a>    <span class="cf">case</span> hash_slot_nil<span class="op">:</span></span>
<span id="cb8-39"><a href="#cb8-39" tabindex="-1"></a>        <span class="cf">return</span> hash_slot_nil<span class="op">;</span></span>
<span id="cb8-40"><a href="#cb8-40" tabindex="-1"></a>    <span class="cf">default</span><span class="op">:</span></span>
<span id="cb8-41"><a href="#cb8-41" tabindex="-1"></a>        zcd_assert<span class="op">(</span><span class="dv">0</span><span class="op">,</span> <span class="st">&quot;unreachable&quot;</span><span class="op">);</span></span>
<span id="cb8-42"><a href="#cb8-42" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb8-43"><a href="#cb8-43" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>Note that <code>i</code> should already be the index as
            selected by some hashing function. I also implemented a
            basic hash function which you can see in <a
            href="https://gitlab.com/Palethorpe/zc-data">the full
            code</a>.</p>
            <h1 id="radix-or-digital-trees">Radix or Digital Trees</h1>
            <p>Linux has a data structure called the
            <code>xarray</code>, which can be used like an array or hash
            map. Apparently this is based on the radix tree. While
            looking at these data structures I was led to discover
            “Judy arrays”: a cache-efficient, compact digital tree.</p>
            <p>Dealing with the issues of hashing makes these “hybrid”
            tree structures look appealing. However they are far more
            involved than creating a simple radix tree or hash map.
            These are where I would be looking next though.</p>
    </div>
  </content>
</entry>
<entry>
  <title>Minimal Linux VM cross compiled with Clang and Zig</title>
  <id>https://richiejp.com/zig-cross-compile-ltp-ltx-linux</id>
  <published>2022-07-11T17:49:47+01:00</published>
  <updated>2024-11-13T10:53:06Z</updated>
  <link rel="alternate" href="https://richiejp.com/zig-cross-compile-ltp-ltx-linux" />
  <summary>Beginning to enable rapid Linux kernel testing on all
arches</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>One of my jobs is to write and run Linux kernel tests.
            These are part of the Linux Test Project (LTP). Building and
            running kernel tests requires a fair amount of unusual
            tooling.</p>
            <p>At least if you are concerned about more than one
            platform and kernel version. Writing a test and making sure
            it builds and runs on all architectures and Linux distros is
            a pain.</p>
            <p>It would be a big achievement just to be able to
            <em>quickly</em> and simply run a new test on all the
            architectures QEMU and Linux jointly support. It would also
            be nice to be able to test a kernel patch in a reasonable
            time.</p>
            <p>Of course there are various existing solutions to this.
            One is to use a build farm that someone has set up. If you
            pick a major distro, you can create a package and have their
            build farm compile it across multiple arches.</p>
            <p>You could then run the tests in something like OpenQA… Ah
            no, hold on. I said <em>quickly</em> and simply. In that
            case, the closest solution I have found is buildroot.</p>
            <p>Buildroot compiles GCC and everything else from source.
            It’s good at cross compiling and supports LTP. There is
            still a lot of stuff going on here though which we don’t
            want or care about.</p>
            <p>In general there is a lot of profit to be had in <a
            href="nanos-clone3-brk-and-nodejs">stripping back
            unnecessary layers</a>. The problem is it has a high up
            front cost. Meanwhile complexity is cheap as <a
            href="https://rule11.tech/papers/2018-complexitysecuritysec-dullien.pdf">Halvar
            Flake says</a>.</p>
            <p>Anyway we don’t even need busybox (for most tests anyway,
            some require <code>sh</code>). We are developing a <a
            href="https://github.com/richiejp/runltp-ng-1/tree/ltx/ltx">test
            executor (LTX)</a> which can run as init. All we need is a
            VM with LTX and some tests inside.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Update2: I created a small project from this where a <a
            href="https://github.com/richiejp/m">Zig program is running
            as init</a></p>
            <p>Update: Andrea Cervesato rewrote LTX again (I wrote it
            twice, hopefully third time is a charm) and I added Zig <a
            href="https://github.com/acerv/ltx/blob/15febbd23a536ca4b55171cdbd9bac7e8e40b9c3/docs/cross.md">cross
            compilation</a> to that version too.</p>
            <p>LTX now works with <a
            href="https://github.com/acerv/kirk">Kirk</a> and I renewed
            my plan to automate compiling Linux and <a
            href="https://github.com/acerv/ltx/issues/1">direct boot VMs
            with LTX as init</a>. As of writing this is still very much
            work in progress.</p>
            </div>
            </div>
            <p>QEMU allows us to boot directly from a kernel image and
            an initrd file. We don’t need GRUB installed in a disk
            image. The initrd can contain LTX and some LTP tests.</p>
            <p>So all we need to do is cross compile the kernel, cross
            compile LTX and some LTP tests. Then create an initrd file.
            For this I used my favorite new shiny thing <a
            href="zig-vs-c-mini-http-server">which I am itching for an
            excuse to use</a>.</p>
            <h1 id="building-linux">Building Linux</h1>
            <p>Clang LLVM supports cross compilation out of the box.
            This doesn’t work well with userland due to a dependency on
            libgcc (see below). However Linux doesn’t require this.
            Using Clang with Linux is as simple as adding
            <code>LLVM=1</code> to the Make command.</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="ex">$</span> cd <span class="va">$linux</span></span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a><span class="ex">$</span> make LLVM=1 ARCH=arm64 defconfig</span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a><span class="ex">$</span> make LLVM=1 ARCH=arm64 menuconfig</span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a><span class="ex">$</span> make LLVM=1 ARCH=arm64 <span class="at">-j</span><span class="va">$(</span><span class="fu">nproc</span><span class="va">)</span></span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a><span class="ex">$</span> cp arch/arm64/boot/Image.gz <span class="va">$ltx</span>/cross/aarch64/</span></code></pre></div>
            <p>GCC seems to need compiling for each architecture’s
            backend. I suppose that one should be able to use their
            distro’s cross compiler GCC package. However I haven’t had
            much luck with it.</p>
            <p>Note that I wasn’t able to get <code>zig cc</code> to
            work with the kernel. It’s not clear if it’s worth the
            effort either. At least not until Zig implements its own C
            compiler instead of wrapping Clang. And yes someone is
            working on that.</p>
            <h1 id="building-userland">Building Userland</h1>
            <p>Userland is complicated by libc’s dependency on the
            compiler runtime library (compiler-rt, libgcc). LLVM has
            its own compiler-rt, but it is missing features that are
            implemented by libgcc. Cross compiling GCC is a farce, which
            is why we are using Clang in the first instance.</p>
            <p>Luckily the Zig language bundles Clang, its own
            compiler-rt and some libc’s (e.g. musl, glibc). Zig’s
            compiler-rt appears to be incomplete as well, however it
            doesn’t seem to matter for our purposes.</p>
            <p>It appears that Zig compiles only the parts of libc
            required for our application from source. This is only a
            small subset of musl which doesn’t include some floating
            point functionality which is missing from LLVM’s and Zig’s
            runtime libraries.</p>
            <p>Also, even though the LTX executable produced by Zig is
            statically linked, it is relatively small at 126K. This is
            double the size compared to being dynamically linked to
            musl. However we can live with this.</p>
            <h2 id="ltx">LTX</h2>
            <p>So LTX can be compiled by Zig as follows.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="ex">$</span> cd <span class="va">$runltp</span>-ng/ltx</span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="ex">$</span> zig cc <span class="at">--target</span><span class="op">=</span>aarch64-linux-musl <span class="at">-o</span> ltx ltx.c</span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="ex">$</span> cp ltx cross/initrd/init</span></code></pre></div>
            <p>We could of course put this in a Makefile or <a
            href="zc-data">build.zig</a>.</p>
            <h2 id="ltp">LTP</h2>
            <p>My first attempt at cross compiling LTP with Zig has not
            been entirely successful. However it appears that it can
            compile most tests. This is good enough for now and also a
            pleasant surprise.</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="ex">$</span> cd <span class="va">$ltp</span></span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a><span class="ex">$</span> make autotools</span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a><span class="ex">$</span> ./configure <span class="at">--prefix</span><span class="op">=</span><span class="va">$(</span><span class="fu">realpath</span> ../ltp-install/<span class="va">)</span> <span class="va">CC</span><span class="op">=</span><span class="st">&#39;zig cc --target=aarch64-linux-musl&#39;</span> <span class="ex">--host=aarch64</span></span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a><span class="ex">$</span> make <span class="at">-j</span><span class="va">$(</span><span class="fu">nproc</span><span class="va">)</span></span></code></pre></div>
            <p>Test executables can be copied into
            <code>$ltx/cross/initrd/bin</code> or similar. Unfortunately
            the test executables come out as ~600-700K. This is a
            potential problem for LTP, because it has thousands of
            them.</p>
            <p>This is also LTP’s fault though, at least to some extent.
            The LTP library is built first and rolled up into an
            archive. Then it is linked into each test executable.</p>
            <p>It would be better for Zig if all the library sources
            were passed in each time. This should allow much better dead
            code elimination. Possibly it would improve compile times as
            well. Zig is designed to compile from source.</p>
            <h1 id="initrd">initrd</h1>
            <p>Unless we embed init (LTX) inside the kernel image, we
            need an initial system image which the kernel can load init
            from. This can be created with <code>cpio</code>.</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a><span class="ex">$</span> cd <span class="va">$ltx</span>/cross/initrd</span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a><span class="ex">$</span> find . <span class="kw">|</span> <span class="fu">cpio</span> <span class="at">-H</span> newc <span class="at">-o</span> <span class="kw">|</span> <span class="fu">gzip</span> <span class="at">-n</span> <span class="op">&gt;</span> ../aarch64/initrd.cpio.gz</span></code></pre></div>
            <p>Thus we have a compressed system image with just
            <code>/init</code> (LTX) in the root directory.</p>
            <p>N.B. LTX must be located at <code>/init</code> (assuming
            default kernel config). If we don’t do that then Linux
            starts trying to use some deprecated boot procedure. This
            then results in a confusing error message about not being
            able to find the root partition or whatever.</p>
            <h1 id="running">Running</h1>
            <p>The kernel and initramfs can be direct booted by QEMU.</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="ex">$</span> qemu-system-aarch64 <span class="at">-m</span> 1G <span class="dt">\</span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a>                      <span class="at">-smp</span> 2 <span class="dt">\</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a>                      <span class="at">-display</span> none <span class="dt">\</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a>                      <span class="at">-kernel</span> aarch64/Image <span class="dt">\</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a>                      <span class="at">-initrd</span> aarch64/initrd.cpio.gz <span class="dt">\</span></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a>                      <span class="at">-machine</span> virt <span class="at">-cpu</span> cortex-a57 <span class="dt">\</span></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a>                      <span class="at">-serial</span> stdio <span class="dt">\</span></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a>                      <span class="at">-append</span> <span class="st">&#39;console=ttyAMA0 earlyprintk=ttyAMA0&#39;</span></span></code></pre></div>
            <p>LTX starts up fine as init, the last lines from the
            kernel log and LTX’s stderr are below.</p>
            <pre><code>[    1.278618] Run /init as init process
[ltx.c:main:1075] Running as init</code></pre>
            <p>Next we need to communicate with LTX, using serial or
            whatever, and get it to run some tests. I’ll leave that for
            another day.</p>
            <h1 id="related">Related</h1>
            <ul>
            <li><a href="/https/richiejp.com/barely-http2-zig">Barely HTTP/2 server in
            Zig</a></li>
            <li><a href="/https/richiejp.com/zig-vs-c-mini-http-server">Zig Vs C - Minimal
            HTTP server</a></li>
            <li><a href="/https/richiejp.com/zig-ld-preload-trick">Override libc’s malloc
            with Zig</a></li>
            <li><a href="zig-fuse-one">Zig &amp; FUSE: Hello file
            systems</a></li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>Zig &amp; FUSE: Hello file systems</title>
  <id>https://richiejp.com/zig-fuse-one</id>
  <published>2023-07-26T00:18:09+01:00</published>
  <updated>2023-09-29T16:13:49+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/zig-fuse-one" />
  <summary>Using Zig’s ability to call into libfuse to create a very
simple file system in userspace</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>On Linux you can create and mount file systems in
            userspace. You don’t even need to be root. This allows for
            things like <a
            href="https://github.com/superfly/litefs">LiteFS</a> which
            intercepts reads and writes to an SQLite database, or <a
            href="https://github.com/190n/jxl-fuse">jxl-fuse</a>.</p>
            <p>Jxl-fuse is particularly relevant because it is written
            in Zig. It is also an interesting use case; it allows you to
            store images in JPEG XL format and then convert them on the
            fly to regular JPEG.</p>
            <p>This means you get better compression while maintaining
            compatibility. Generalising, this lets us decouple the
            storage format from what the application loads. In effect,
            we can transparently insert an adapter between the storage
            and the application at runtime.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>I have a <a href="https://youtu.be/6Lv6-7kWIvI">video
            covering this topic on YouTube</a> as well.</p>
            <p>Update: I added jxl-fuse, thanks to 190n for pointing it
            out on Discord.</p>
            <p>Update2: There is now a second article covering the raw
            interface <a href="/https/richiejp.com/zig-fuse-two">Zig &amp; /dev/fuse: A
            weird file system</a></p>
            </div>
            </div>
            <p>FUSE gets interesting when you ask: what is a file
            system, really? I usually think of a complicated data
            structure which stores data at particular paths.</p>
            <p>However, if you are familiar with the Linux kernel (or
            similar) interfaces, in particular the <code>/proc</code>
            and <code>/sys</code> file systems, then the term starts to
            take on another meaning. A file system in the kernel is
            really some code which implements an interface.</p>
            <p>The interface consists of functions like <code>open</code>,
            <code>stat</code>, <code>read</code>, <code>write</code>,
            <code>seek</code>, <code>close</code>, etc. Each function is
            limited by what arguments it takes, but potentially it can
            do anything. <code>read</code> can generate data on the fly
            or have side effects.</p>
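            <p>A minimal sketch of this idea, assuming the same Zig
            version as the other listings. <code>FileOps</code>,
            <code>readCounter</code> and <code>counter_fs</code> are
            hypothetical names for illustration only and are not part
            of FUSE or libfuse. The point is that <code>read</code>
            computes its bytes from the offset instead of fetching
            stored data.</p>
            <div class="sourceCode"><pre class="sourceCode zig"><code class="sourceCode zig">const std = @import("std");

// Hypothetical, much simplified "file system" interface with one callback.
const FileOps = struct {
    read: *const fn (buf: []u8, offset: usize) usize,
};

// Nothing is stored anywhere; each byte is computed from its offset.
fn readCounter(buf: []u8, offset: usize) usize {
    for (buf, 0..) |*b, i| {
        b.* = '0' + @as(u8, @intCast((offset + i) % 10));
    }
    return buf.len;
}

const counter_fs = FileOps{ .read = readCounter };

pub fn main() void {
    var buf: [8]u8 = undefined;
    // Read 8 bytes starting at offset 5.
    const n = counter_fs.read(buf[0..], 5);
    std.debug.print("{s}\n", .{buf[0..n]});
}</code></pre></div>
            <p>Reading eight bytes at offset five yields
            <code>56789012</code>, even though no such data exists
            anywhere until it is asked for.</p>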
            <p>Zig is a great language for systems programming and it
            turns out to be reasonably easy to get it working with
            FUSE.</p>
            <h1 id="libfuse">libfuse</h1>
            <p>To simplify interacting with the kernel interface there
            is a C library with the obvious name. It would be better to
            use the kernel interface directly, both for performance and
            to take full advantage of Zig. However, that would require
            implementing the message protocol from scratch.</p>
            <p>Also libfuse comes with some examples, so I approximately
            translated one of the examples into Zig. This is an easy way
            to get a feel for FUSE development or to quickly get started
            implementing a file system.</p>
            <p>I had some trouble compiling libfuse with Zig. Libfuse
            uses Meson which didn’t like Zig’s linker version output. It
            would be nice to create a <code>build.zig</code> for
            libfuse, but I think the effort would be better directed at
            implementing the FUSE protocol directly.</p>
            <p>So to get things moving I linked against the system’s
            libfuse (NixOS in my case). This can be seen in the <a
            href="https://github.com/richiejp/fuse.zig/blob/main/build.zig"><code>build.zig</code></a>,
            which we will get to in a moment.</p>
            <p>Zig can directly import C headers; however, I decided to
            translate the header to Zig and include that instead, so
            that I can look at the contents and modify them.</p>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="ex">$</span> zig translate-c <span class="at">-DFUSE_USE_VERSION</span><span class="op">=</span>31 <span class="dt">\</span></span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a>    <span class="at">-isystem</span> /nix/store/jan1gkl34v83h1pwd43q716nsvf06miq-fuse-3.11.0/include<span class="dt">\</span></span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a>    <span class="at">-isystem</span> /nix/store/kd1z202w3l3njfn7n6dkyridwvnm3yg2-musl-1.2.3-dev/include <span class="dt">\</span></span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a>    /nix/store/jan1gkl34v83h1pwd43q716nsvf06miq-fuse-3.11.0/include/fuse3/fuse.h <span class="op">&gt;</span> src/fuse31.zig</span></code></pre></div>
            <p>The path names are awful because each one contains a
            hash, which allows Nix to maintain many versions of the
            same software on the same system.</p>
            <p>I found that I had to include the FUSE directory and the
            libc directory. I used musl instead of glibc because
            whenever I want to know how something in libc works I go to
            musl.</p>
            <p>The <code>FUSE_USE_VERSION</code> needs to be set.
            Possibly other things could be set, but this was enough to
            get the symbols I wanted.</p>
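            <p>For comparison, the direct import route mentioned above
            would look roughly like the fragment below. This is only a
            sketch: it assumes the fuse3 headers can be found on the
            include path as <code>fuse3/fuse.h</code> and that libc is
            linked, and it is not what the rest of this article
            uses.</p>
            <div class="sourceCode"><pre class="sourceCode zig"><code class="sourceCode zig">// Assumption: the compiler can find fuse3/fuse.h on its include path.
const fuse = @cImport({
    // Same define that was passed to translate-c with -D.
    @cDefine("FUSE_USE_VERSION", "31");
    @cInclude("fuse3/fuse.h");
});</code></pre></div>
            <p>The translation still happens, but at build time, so
            there is no generated file to read through or edit by
            hand.</p>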
            <h1 id="building">Building</h1>
            <p>The <code>build.zig</code> is pretty much the default
            produced by <code>zig init-exe</code>. I’ll just include the
            bits that were changed.</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a>    <span class="at">const</span> exe <span class="op">=</span> b<span class="op">.</span>addExecutable(<span class="op">.</span>{</span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a>        <span class="op">.</span>name <span class="op">=</span> <span class="st">&quot;fuse&quot;</span><span class="op">,</span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a>        <span class="op">...</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a>        <span class="op">.</span>link_libc <span class="op">=</span> <span class="cn">true</span><span class="op">,</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a>    });</span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a>    exe<span class="op">.</span>linkSystemLibrary(<span class="st">&quot;fuse3&quot;</span>);</span></code></pre></div>
            <p>So all I had to do was link to libc and fuse3. As
            discussed above, using the system’s libfuse is not ideal;
            depending on the distribution the static and cross-compiled
            libraries may not be available. Linking to a shared library
            is not great for optimisation.</p>
            <p>It is actually quite easy to compile libfuse to a static
            library with Meson if the distribution doesn’t support it.
            However we still don’t get the full magic of Zig’s cross
            compilation.</p>
            <h1 id="hello-zig">Hello Zig</h1>
            <p>I copied <a
            href="https://github.com/libfuse/libfuse/blob/master/example/hello.c">libfuse/example/hello.c</a>.
            The entry point in C looks like the following.</p>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="dt">static</span> <span class="dt">const</span> <span class="kw">struct</span> fuse_operations hello_oper <span class="op">=</span> <span class="op">{</span></span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a>    <span class="op">.</span>init           <span class="op">=</span> hello_init<span class="op">,</span></span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a>    <span class="op">.</span>getattr    <span class="op">=</span> hello_getattr<span class="op">,</span></span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a>    <span class="op">.</span>readdir    <span class="op">=</span> hello_readdir<span class="op">,</span></span>
<span id="cb3-5"><a href="#cb3-5" tabindex="-1"></a>    <span class="op">.</span>open       <span class="op">=</span> hello_open<span class="op">,</span></span>
<span id="cb3-6"><a href="#cb3-6" tabindex="-1"></a>    <span class="op">.</span>read       <span class="op">=</span> hello_read<span class="op">,</span></span>
<span id="cb3-7"><a href="#cb3-7" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb3-8"><a href="#cb3-8" tabindex="-1"></a></span>
<span id="cb3-9"><a href="#cb3-9" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb3-10"><a href="#cb3-10" tabindex="-1"></a></span>
<span id="cb3-11"><a href="#cb3-11" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">int</span> argc<span class="op">,</span> <span class="dt">char</span> <span class="op">*</span>argv<span class="op">[])</span></span>
<span id="cb3-12"><a href="#cb3-12" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb3-13"><a href="#cb3-13" tabindex="-1"></a>    <span class="dt">int</span> ret<span class="op">;</span></span>
<span id="cb3-14"><a href="#cb3-14" tabindex="-1"></a>    <span class="kw">struct</span> fuse_args args <span class="op">=</span> FUSE_ARGS_INIT<span class="op">(</span>argc<span class="op">,</span> argv<span class="op">);</span></span>
<span id="cb3-15"><a href="#cb3-15" tabindex="-1"></a></span>
<span id="cb3-16"><a href="#cb3-16" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb3-17"><a href="#cb3-17" tabindex="-1"></a>    options<span class="op">.</span>filename <span class="op">=</span> strdup<span class="op">(</span><span class="st">&quot;hello&quot;</span><span class="op">);</span></span>
<span id="cb3-18"><a href="#cb3-18" tabindex="-1"></a>    options<span class="op">.</span>contents <span class="op">=</span> strdup<span class="op">(</span><span class="st">&quot;Hello World!</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb3-19"><a href="#cb3-19" tabindex="-1"></a></span>
<span id="cb3-20"><a href="#cb3-20" tabindex="-1"></a>    <span class="co">/* Parse options */</span></span>
<span id="cb3-21"><a href="#cb3-21" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>fuse_opt_parse<span class="op">(&amp;</span>args<span class="op">,</span> <span class="op">&amp;</span>options<span class="op">,</span> option_spec<span class="op">,</span> NULL<span class="op">)</span> <span class="op">==</span> <span class="op">-</span><span class="dv">1</span><span class="op">)</span></span>
<span id="cb3-22"><a href="#cb3-22" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb3-23"><a href="#cb3-23" tabindex="-1"></a></span>
<span id="cb3-24"><a href="#cb3-24" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb3-25"><a href="#cb3-25" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>options<span class="op">.</span>show_help<span class="op">)</span> <span class="op">{</span></span>
<span id="cb3-26"><a href="#cb3-26" tabindex="-1"></a>        show_help<span class="op">(</span>argv<span class="op">[</span><span class="dv">0</span><span class="op">]);</span></span>
<span id="cb3-27"><a href="#cb3-27" tabindex="-1"></a>        assert<span class="op">(</span>fuse_opt_add_arg<span class="op">(&amp;</span>args<span class="op">,</span> <span class="st">&quot;--help&quot;</span><span class="op">)</span> <span class="op">==</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb3-28"><a href="#cb3-28" tabindex="-1"></a>        args<span class="op">.</span>argv<span class="op">[</span><span class="dv">0</span><span class="op">][</span><span class="dv">0</span><span class="op">]</span> <span class="op">=</span> <span class="ch">&#39;</span><span class="sc">\0</span><span class="ch">&#39;</span><span class="op">;</span></span>
<span id="cb3-29"><a href="#cb3-29" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb3-30"><a href="#cb3-30" tabindex="-1"></a></span>
<span id="cb3-31"><a href="#cb3-31" tabindex="-1"></a>    ret <span class="op">=</span> fuse_main<span class="op">(</span>args<span class="op">.</span>argc<span class="op">,</span> args<span class="op">.</span>argv<span class="op">,</span> <span class="op">&amp;</span>hello_oper<span class="op">,</span> NULL<span class="op">);</span></span>
<span id="cb3-32"><a href="#cb3-32" tabindex="-1"></a>    fuse_opt_free_args<span class="op">(&amp;</span>args<span class="op">);</span></span>
<span id="cb3-33"><a href="#cb3-33" tabindex="-1"></a>    <span class="cf">return</span> ret<span class="op">;</span></span>
<span id="cb3-34"><a href="#cb3-34" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>I ignored the stuff about parsing the filename and
            contents from the command line. You probably don’t want
            libfuse to parse the command line when using Zig, but for
            now I just passed the args into <code>fuse_main</code>
            untouched (there are alternatives to
            <code>fuse_main</code>).</p>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a><span class="at">const</span> std <span class="op">=</span> <span class="bu">@import</span>(<span class="st">&quot;std&quot;</span>);</span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a><span class="at">const</span> log <span class="op">=</span> std<span class="op">.</span>log;</span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a><span class="at">const</span> fuse <span class="op">=</span> <span class="bu">@import</span>(<span class="st">&quot;fuse31.zig&quot;</span>);</span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a></span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a></span>
<span id="cb4-8"><a href="#cb4-8" tabindex="-1"></a><span class="at">const</span> ops <span class="op">=</span> mem<span class="op">.</span>zeroInit(fuse<span class="op">.</span>struct_fuse_operations<span class="op">,</span> <span class="op">.</span>{</span>
<span id="cb4-9"><a href="#cb4-9" tabindex="-1"></a>    <span class="op">.</span>init <span class="op">=</span> init<span class="op">,</span></span>
<span id="cb4-10"><a href="#cb4-10" tabindex="-1"></a>    <span class="op">.</span>getattr <span class="op">=</span> getattr<span class="op">,</span></span>
<span id="cb4-11"><a href="#cb4-11" tabindex="-1"></a>    <span class="op">.</span>readdir <span class="op">=</span> readdir<span class="op">,</span></span>
<span id="cb4-12"><a href="#cb4-12" tabindex="-1"></a>    <span class="op">.</span>open <span class="op">=</span> open<span class="op">,</span></span>
<span id="cb4-13"><a href="#cb4-13" tabindex="-1"></a>    <span class="op">.</span>read <span class="op">=</span> read<span class="op">,</span></span>
<span id="cb4-14"><a href="#cb4-14" tabindex="-1"></a>});</span>
<span id="cb4-15"><a href="#cb4-15" tabindex="-1"></a></span>
<span id="cb4-16"><a href="#cb4-16" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> main() <span class="op">!</span><span class="dt">u8</span> {</span>
<span id="cb4-17"><a href="#cb4-17" tabindex="-1"></a>    log<span class="op">.</span>info(<span class="st">&quot;Zig hello FUSE&quot;</span><span class="op">,</span> <span class="op">.</span>{});</span>
<span id="cb4-18"><a href="#cb4-18" tabindex="-1"></a></span>
<span id="cb4-19"><a href="#cb4-19" tabindex="-1"></a>    <span class="at">const</span> ret <span class="op">=</span> fuse<span class="op">.</span>fuse_main_real(</span>
<span id="cb4-20"><a href="#cb4-20" tabindex="-1"></a>        <span class="bu">@intCast</span>(std<span class="op">.</span>os<span class="op">.</span>argv<span class="op">.</span>len)<span class="op">,</span></span>
<span id="cb4-21"><a href="#cb4-21" tabindex="-1"></a>        <span class="bu">@ptrCast</span>(std<span class="op">.</span>os<span class="op">.</span>argv<span class="op">.</span>ptr)<span class="op">,</span></span>
<span id="cb4-22"><a href="#cb4-22" tabindex="-1"></a>        <span class="op">&amp;</span>ops<span class="op">,</span></span>
<span id="cb4-23"><a href="#cb4-23" tabindex="-1"></a>        <span class="bu">@sizeOf</span>(<span class="bu">@TypeOf</span>(ops))<span class="op">,</span></span>
<span id="cb4-24"><a href="#cb4-24" tabindex="-1"></a>        <span class="cn">null</span><span class="op">,</span></span>
<span id="cb4-25"><a href="#cb4-25" tabindex="-1"></a>    );</span>
<span id="cb4-26"><a href="#cb4-26" tabindex="-1"></a></span>
<span id="cb4-27"><a href="#cb4-27" tabindex="-1"></a>    <span class="cf">return</span> <span class="cf">switch</span> (ret) {</span>
<span id="cb4-28"><a href="#cb4-28" tabindex="-1"></a>        <span class="dv">0</span> <span class="op">=&gt;</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb4-29"><a href="#cb4-29" tabindex="-1"></a>        <span class="dv">1</span> <span class="op">=&gt;</span> <span class="kw">error</span><span class="op">.</span>FuseParseCmdline<span class="op">,</span></span>
<span id="cb4-30"><a href="#cb4-30" tabindex="-1"></a>        <span class="dv">2</span> <span class="op">=&gt;</span> <span class="kw">error</span><span class="op">.</span>FuseMountpoint<span class="op">,</span></span>
<span id="cb4-31"><a href="#cb4-31" tabindex="-1"></a>        <span class="dv">3</span> <span class="op">=&gt;</span> <span class="kw">error</span><span class="op">.</span>FuseNew<span class="op">,</span></span>
<span id="cb4-32"><a href="#cb4-32" tabindex="-1"></a>        <span class="dv">4</span> <span class="op">=&gt;</span> <span class="kw">error</span><span class="op">.</span>FuseMount<span class="op">,</span></span>
<span id="cb4-33"><a href="#cb4-33" tabindex="-1"></a>        <span class="dv">5</span> <span class="op">=&gt;</span> <span class="kw">error</span><span class="op">.</span>FuseDaemonize<span class="op">,</span></span>
<span id="cb4-34"><a href="#cb4-34" tabindex="-1"></a>        <span class="dv">6</span> <span class="op">=&gt;</span> <span class="kw">error</span><span class="op">.</span>FuseSession<span class="op">,</span></span>
<span id="cb4-35"><a href="#cb4-35" tabindex="-1"></a>        <span class="dv">7</span> <span class="op">=&gt;</span> <span class="kw">error</span><span class="op">.</span>FuseLoopCfg<span class="op">,</span></span>
<span id="cb4-36"><a href="#cb4-36" tabindex="-1"></a>        <span class="dv">8</span> <span class="op">=&gt;</span> <span class="kw">error</span><span class="op">.</span>FuseEventLoop<span class="op">,</span></span>
<span id="cb4-37"><a href="#cb4-37" tabindex="-1"></a>        <span class="cf">else</span> <span class="op">=&gt;</span> <span class="kw">error</span><span class="op">.</span>FuseUnknown<span class="op">,</span></span>
<span id="cb4-38"><a href="#cb4-38" tabindex="-1"></a>    };</span>
<span id="cb4-39"><a href="#cb4-39" tabindex="-1"></a>}</span></code></pre></div>
            <p>libfuse uses a common C idiom of having a struct full of
            callbacks to implement an interface. In this case
            <code>struct fuse_operations</code>
            (<code>fuse.struct_fuse_operations</code> in Zig) which we
            pass to <code>fuse_main_real</code>. We’ll look at the
            function implementations below.</p>
            <p>Note that in the C we just call <code>fuse_main</code>,
            which is a macro. Zig could not translate this macro, so
            instead we have to call <code>fuse_main_real</code>, which is
            what the macro expands to.</p>
            <p>In Zig <code>struct fuse_operations</code> needs to be
            initialised with <code>mem.zeroInit</code>. This sets all
            of the fields (of which there are a lot) to
            <code>null</code> except for those specified in the second
            argument. This is an anti-pattern in Zig, but is required
            when dealing with C.</p>
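            <p>To see the effect in isolation, here is a small sketch
            that assumes the same Zig version as the other listings.
            <code>Ops</code> is a hypothetical three-field stand-in for
            a C callback table, not the real
            <code>fuse_operations</code>.</p>
            <div class="sourceCode"><pre class="sourceCode zig"><code class="sourceCode zig">const std = @import("std");

// Hypothetical stand-in for a C struct full of optional callbacks.
const Ops = extern struct {
    open: ?*const fn () callconv(.C) c_int,
    read: ?*const fn () callconv(.C) c_int,
    write: ?*const fn () callconv(.C) c_int,
};

fn open() callconv(.C) c_int {
    return 0;
}

// Only .open is given; zeroInit zeroes the remaining fields,
// which for optional function pointers means null.
const ops = std.mem.zeroInit(Ops, .{ .open = open });

pub fn main() void {
    std.debug.print("open set: {}, read set: {}, write set: {}\n", .{
        ops.open != null,
        ops.read != null,
        ops.write != null,
    });
}</code></pre></div>
            <p>Without <code>zeroInit</code> every field would have to
            be listed explicitly, because this struct has no default
            values.</p>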
            <p>In the Zig version I translated the
            <code>fuse_main</code> return value to an inferred error
            set. I’m not sure why; I probably thought it would help
            with debugging.</p>
            <h1 id="getattr">getattr</h1>
            <p>Now let’s look at some of the interface implementation.
            First we have <code>getattr</code>, which more or less
            corresponds to the <code>stat</code> system call. This
            returns some file attributes like its size, whether it is a
            directory, and whether it can be read or written.</p>
            <p>The C version looks like this.</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> hello_getattr<span class="op">(</span><span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span>path<span class="op">,</span> <span class="kw">struct</span> stat <span class="op">*</span>stbuf<span class="op">,</span></span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a>             <span class="kw">struct</span> fuse_file_info <span class="op">*</span>fi<span class="op">)</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a>    <span class="op">(</span><span class="dt">void</span><span class="op">)</span> fi<span class="op">;</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a>    <span class="dt">int</span> res <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a>    memset<span class="op">(</span>stbuf<span class="op">,</span> <span class="dv">0</span><span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span><span class="kw">struct</span> stat<span class="op">));</span></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strcmp<span class="op">(</span>path<span class="op">,</span> <span class="st">&quot;/&quot;</span><span class="op">)</span> <span class="op">==</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-9"><a href="#cb5-9" tabindex="-1"></a>        stbuf<span class="op">-&gt;</span>st_mode <span class="op">=</span> S_IFDIR <span class="op">|</span> <span class="bn">0755</span><span class="op">;</span></span>
<span id="cb5-10"><a href="#cb5-10" tabindex="-1"></a>        stbuf<span class="op">-&gt;</span>st_nlink <span class="op">=</span> <span class="dv">2</span><span class="op">;</span></span>
<span id="cb5-11"><a href="#cb5-11" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="op">(</span>strcmp<span class="op">(</span>path<span class="op">+</span><span class="dv">1</span><span class="op">,</span> options<span class="op">.</span>filename<span class="op">)</span> <span class="op">==</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-12"><a href="#cb5-12" tabindex="-1"></a>        stbuf<span class="op">-&gt;</span>st_mode <span class="op">=</span> S_IFREG <span class="op">|</span> <span class="bn">0444</span><span class="op">;</span></span>
<span id="cb5-13"><a href="#cb5-13" tabindex="-1"></a>        stbuf<span class="op">-&gt;</span>st_nlink <span class="op">=</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb5-14"><a href="#cb5-14" tabindex="-1"></a>        stbuf<span class="op">-&gt;</span>st_size <span class="op">=</span> strlen<span class="op">(</span>options<span class="op">.</span>contents<span class="op">);</span></span>
<span id="cb5-15"><a href="#cb5-15" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span></span>
<span id="cb5-16"><a href="#cb5-16" tabindex="-1"></a>        res <span class="op">=</span> <span class="op">-</span>ENOENT<span class="op">;</span></span>
<span id="cb5-17"><a href="#cb5-17" tabindex="-1"></a></span>
<span id="cb5-18"><a href="#cb5-18" tabindex="-1"></a>    <span class="cf">return</span> res<span class="op">;</span></span>
<span id="cb5-19"><a href="#cb5-19" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>When converting this, my first question was how do I
            create a Zig function which can be called from C? This is
            where <code>fuse31.zig</code> is very useful because it
            contains the function signatures inside.</p>
            <div class="sourceCode" id="cb6"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb6-1"><a href="#cb6-1" tabindex="-1"></a><span class="kw">pub</span> <span class="at">const</span> struct_fuse_operations <span class="op">=</span> <span class="at">extern</span> <span class="kw">struct</span> {</span>
<span id="cb6-2"><a href="#cb6-2" tabindex="-1"></a>    getattr<span class="op">:</span> <span class="op">?*</span><span class="at">const</span> <span class="kw">fn</span> ([<span class="op">*</span>c]<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> <span class="op">?*</span>struct_stat<span class="op">,</span> <span class="op">?*</span>struct_fuse_file_info) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="dt">c_int</span><span class="op">,</span></span>
<span id="cb6-3"><a href="#cb6-3" tabindex="-1"></a>    readlink<span class="op">:</span> <span class="op">?*</span><span class="at">const</span> <span class="kw">fn</span> ([<span class="op">*</span>c]<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> [<span class="op">*</span>c]<span class="dt">u8</span><span class="op">,</span> <span class="dt">usize</span>) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="dt">c_int</span><span class="op">,</span></span>
<span id="cb6-4"><a href="#cb6-4" tabindex="-1"></a>    mknod<span class="op">:</span> <span class="op">?*</span><span class="at">const</span> <span class="kw">fn</span> ([<span class="op">*</span>c]<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> mode_t<span class="op">,</span> dev_t) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="dt">c_int</span><span class="op">,</span></span>
<span id="cb6-5"><a href="#cb6-5" tabindex="-1"></a>    mkdir<span class="op">:</span> <span class="op">?*</span><span class="at">const</span> <span class="kw">fn</span> ([<span class="op">*</span>c]<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> mode_t) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="dt">c_int</span><span class="op">,</span></span>
<span id="cb6-6"><a href="#cb6-6" tabindex="-1"></a>    <span class="op">...</span></span></code></pre></div>
            <p>We just need to add a function name and some argument
            names to the function pointer’s signature. Below is the Zig
            implementation of <code>getattr</code> along with some
            helpers.</p>
            <div class="sourceCode" id="cb7"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb7-1"><a href="#cb7-1" tabindex="-1"></a><span class="at">const</span> E <span class="op">=</span> std<span class="op">.</span>os<span class="op">.</span>linux<span class="op">.</span>E;</span>
<span id="cb7-2"><a href="#cb7-2" tabindex="-1"></a></span>
<span id="cb7-3"><a href="#cb7-3" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb7-4"><a href="#cb7-4" tabindex="-1"></a></span>
<span id="cb7-5"><a href="#cb7-5" tabindex="-1"></a><span class="at">const</span> filename<span class="op">:</span> [<span class="op">:</span><span class="dv">0</span>]<span class="at">const</span> <span class="dt">u8</span> <span class="op">=</span> <span class="st">&quot;hello&quot;</span>;</span>
<span id="cb7-6"><a href="#cb7-6" tabindex="-1"></a><span class="at">const</span> contents <span class="op">=</span> <span class="st">&quot;Alright, mate!</span><span class="sc">\n</span><span class="st">&quot;</span>;</span>
<span id="cb7-7"><a href="#cb7-7" tabindex="-1"></a></span>
<span id="cb7-8"><a href="#cb7-8" tabindex="-1"></a><span class="kw">fn</span> cErr(err<span class="op">:</span> E) <span class="dt">c_int</span> {</span>
<span id="cb7-9"><a href="#cb7-9" tabindex="-1"></a>    <span class="at">const</span> n<span class="op">:</span> <span class="dt">c_int</span> <span class="op">=</span> @intFromEnum(err);</span>
<span id="cb7-10"><a href="#cb7-10" tabindex="-1"></a></span>
<span id="cb7-11"><a href="#cb7-11" tabindex="-1"></a>    <span class="cf">return</span> <span class="op">-</span>n;</span>
<span id="cb7-12"><a href="#cb7-12" tabindex="-1"></a>}</span>
<span id="cb7-13"><a href="#cb7-13" tabindex="-1"></a></span>
<span id="cb7-14"><a href="#cb7-14" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb7-15"><a href="#cb7-15" tabindex="-1"></a></span>
<span id="cb7-16"><a href="#cb7-16" tabindex="-1"></a><span class="kw">fn</span> getattr(</span>
<span id="cb7-17"><a href="#cb7-17" tabindex="-1"></a>    path<span class="op">:</span> [<span class="op">*</span>c]<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span></span>
<span id="cb7-18"><a href="#cb7-18" tabindex="-1"></a>    stat<span class="op">:</span> <span class="op">?*</span>fuse<span class="op">.</span>struct_stat<span class="op">,</span></span>
<span id="cb7-19"><a href="#cb7-19" tabindex="-1"></a>    _<span class="op">:</span> <span class="op">?*</span>fuse<span class="op">.</span>struct_fuse_file_info<span class="op">,</span></span>
<span id="cb7-20"><a href="#cb7-20" tabindex="-1"></a>) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="dt">c_int</span> {</span>
<span id="cb7-21"><a href="#cb7-21" tabindex="-1"></a>    <span class="at">var</span> st <span class="op">=</span> mem<span class="op">.</span>zeroes(fuse<span class="op">.</span>struct_stat);</span>
<span id="cb7-22"><a href="#cb7-22" tabindex="-1"></a>    <span class="at">const</span> p <span class="op">=</span> mem<span class="op">.</span>span(path);</span>
<span id="cb7-23"><a href="#cb7-23" tabindex="-1"></a></span>
<span id="cb7-24"><a href="#cb7-24" tabindex="-1"></a>    log<span class="op">.</span>info(<span class="st">&quot;stat: {s}&quot;</span><span class="op">,</span> <span class="op">.</span>{p});</span>
<span id="cb7-25"><a href="#cb7-25" tabindex="-1"></a></span>
<span id="cb7-26"><a href="#cb7-26" tabindex="-1"></a>    <span class="cf">if</span> (mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> <span class="st">&quot;/&quot;</span><span class="op">,</span> p)) {</span>
<span id="cb7-27"><a href="#cb7-27" tabindex="-1"></a>        st<span class="op">.</span>st_mode <span class="op">=</span> fuse<span class="op">.</span>S_IFDIR <span class="op">|</span> <span class="dv">0</span><span class="er">o0755</span>;</span>
<span id="cb7-28"><a href="#cb7-28" tabindex="-1"></a>        st<span class="op">.</span>st_nlink <span class="op">=</span> <span class="dv">2</span>;</span>
<span id="cb7-29"><a href="#cb7-29" tabindex="-1"></a>    } <span class="cf">else</span> <span class="cf">if</span> (mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> filename<span class="op">,</span> p[<span class="dv">1</span><span class="er">.</span><span class="op">.</span>])) {</span>
<span id="cb7-30"><a href="#cb7-30" tabindex="-1"></a>        st<span class="op">.</span>st_mode <span class="op">=</span> fuse<span class="op">.</span>S_IFREG <span class="op">|</span> <span class="dv">0</span><span class="er">o0444</span>;</span>
<span id="cb7-31"><a href="#cb7-31" tabindex="-1"></a>        st<span class="op">.</span>st_nlink <span class="op">=</span> <span class="dv">1</span>;</span>
<span id="cb7-32"><a href="#cb7-32" tabindex="-1"></a>        st<span class="op">.</span>st_size <span class="op">=</span> contents<span class="op">.</span>len;</span>
<span id="cb7-33"><a href="#cb7-33" tabindex="-1"></a>    } <span class="cf">else</span> {</span>
<span id="cb7-34"><a href="#cb7-34" tabindex="-1"></a>        <span class="cf">return</span> cErr(E<span class="op">.</span>NOENT);</span>
<span id="cb7-35"><a href="#cb7-35" tabindex="-1"></a>    }</span>
<span id="cb7-36"><a href="#cb7-36" tabindex="-1"></a></span>
<span id="cb7-37"><a href="#cb7-37" tabindex="-1"></a>    stat<span class="op">.?.*</span> <span class="op">=</span> st;</span>
<span id="cb7-38"><a href="#cb7-38" tabindex="-1"></a></span>
<span id="cb7-39"><a href="#cb7-39" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span>;</span>
<span id="cb7-40"><a href="#cb7-40" tabindex="-1"></a>}</span></code></pre></div>
            <p>It looks similar to the C, but note that we do not write
            directly to the passed <code>struct stat</code> in Zig. We
            zero a new struct and copy it at the end of the function.
            The stat argument is an optional pointer and it feels like
            Zig discourages one from interacting with it piecemeal. It
            makes sense to only check once if it is null or not.</p>
            <p>The path argument should be a null terminated C string. I
            like to convert it to a slice using <code>mem.span</code>.
            Then we can compare it directly with other slices or get its
            length without doing another count.</p>
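            <p>The same pattern can be sketched in C without libfuse.
            Everything prefixed <code>toy_</code> below is a hypothetical
            stand-in of mine, including <code>struct toy_stat</code> and
            the mode constants; it only shows a zeroed local struct being
            filled in and copied out with a single null check.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for struct stat, reduced to the fields used here. */
struct toy_stat {
    unsigned int mode;
    unsigned int nlink;
    long size;
};

/* Assumed values, chosen to resemble the usual S_IFDIR and S_IFREG bits. */
#define TOY_IFDIR 0040000u
#define TOY_IFREG 0100000u

const char toy_filename[] = "hello";
const char toy_contents[] = "Alright, mate!\n";

/* Fill a zeroed local struct, then copy it out in one assignment. */
int toy_getattr(const char *path, struct toy_stat *out)
{
    struct toy_stat st;
    size_t path_len = strlen(path); /* counted once, much like mem.span */

    memset(&st, 0, sizeof(st));

    if (path_len == 1 && path[0] == '/') {
        st.mode = TOY_IFDIR | 0755u;
        st.nlink = 2;
    } else if (path_len > 1 && strcmp(path + 1, toy_filename) == 0) {
        st.mode = TOY_IFREG | 0444u;
        st.nlink = 1;
        st.size = (long)(sizeof(toy_contents) - 1);
    } else {
        return -ENOENT; /* negated errno, as cErr does */
    }

    /* Assumption: report a null pointer as an error. The Zig version
     * unwraps with `.?`, which is a safety-checked panic instead. */
    if (out == NULL)
        return -EINVAL;

    *out = st; /* the only write through the caller's pointer */
    return 0;
}
```
]]></code></pre></div>
            <p>A failed lookup returns before the copy, so the caller's
            struct is never left half written.</p>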
            <h1 id="readdir">readdir</h1>
            <p>Next up we have <code>readdir</code> which correlates
            with the <code>open[at]</code> and
            <code>getdents[64]</code> system calls. There is a
            deprecated <code>readdir</code> syscall on some
            architectures as well, but <code>getdents</code> superseded it.</p>
            <p>Implementing this allows us to use <code>ls</code> on the
            root of the mount. The C implementation looks like this</p>
            <div class="sourceCode" id="cb8"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb8-1"><a href="#cb8-1" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> hello_readdir<span class="op">(</span><span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span>path<span class="op">,</span> <span class="dt">void</span> <span class="op">*</span>buf<span class="op">,</span> fuse_fill_dir_t filler<span class="op">,</span></span>
<span id="cb8-2"><a href="#cb8-2" tabindex="-1"></a>             off_t offset<span class="op">,</span> <span class="kw">struct</span> fuse_file_info <span class="op">*</span>fi<span class="op">,</span></span>
<span id="cb8-3"><a href="#cb8-3" tabindex="-1"></a>             <span class="kw">enum</span> fuse_readdir_flags flags<span class="op">)</span></span>
<span id="cb8-4"><a href="#cb8-4" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb8-5"><a href="#cb8-5" tabindex="-1"></a>    <span class="op">(</span><span class="dt">void</span><span class="op">)</span> offset<span class="op">;</span></span>
<span id="cb8-6"><a href="#cb8-6" tabindex="-1"></a>    <span class="op">(</span><span class="dt">void</span><span class="op">)</span> fi<span class="op">;</span></span>
<span id="cb8-7"><a href="#cb8-7" tabindex="-1"></a>    <span class="op">(</span><span class="dt">void</span><span class="op">)</span> flags<span class="op">;</span></span>
<span id="cb8-8"><a href="#cb8-8" tabindex="-1"></a></span>
<span id="cb8-9"><a href="#cb8-9" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strcmp<span class="op">(</span>path<span class="op">,</span> <span class="st">&quot;/&quot;</span><span class="op">)</span> <span class="op">!=</span> <span class="dv">0</span><span class="op">)</span></span>
<span id="cb8-10"><a href="#cb8-10" tabindex="-1"></a>        <span class="cf">return</span> <span class="op">-</span>ENOENT<span class="op">;</span></span>
<span id="cb8-11"><a href="#cb8-11" tabindex="-1"></a></span>
<span id="cb8-12"><a href="#cb8-12" tabindex="-1"></a>    filler<span class="op">(</span>buf<span class="op">,</span> <span class="st">&quot;.&quot;</span><span class="op">,</span> NULL<span class="op">,</span> <span class="dv">0</span><span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb8-13"><a href="#cb8-13" tabindex="-1"></a>    filler<span class="op">(</span>buf<span class="op">,</span> <span class="st">&quot;..&quot;</span><span class="op">,</span> NULL<span class="op">,</span> <span class="dv">0</span><span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb8-14"><a href="#cb8-14" tabindex="-1"></a>    filler<span class="op">(</span>buf<span class="op">,</span> options<span class="op">.</span>filename<span class="op">,</span> NULL<span class="op">,</span> <span class="dv">0</span><span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb8-15"><a href="#cb8-15" tabindex="-1"></a></span>
<span id="cb8-16"><a href="#cb8-16" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb8-17"><a href="#cb8-17" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>It seems we are given a buffer and a function called
            <code>filler</code>. We can add entries to the buffer with
            <code>filler</code>. Most of the functionality is ignored;
            only the entry names are added.</p>
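            <p>To get a feel for the callback shape, here is a toy
            version with no libfuse involved. The buffer layout,
            <code>toy_filler</code> and <code>toy_readdir</code> are all
            made up for illustration; libfuse's real buffer is opaque and
            its filler takes more arguments. The only thing borrowed is
            the convention of returning 0 on success and 1 when the
            buffer is full.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENTRIES 4
#define TOY_NAME_MAX 32

/* Hypothetical directory buffer; not how libfuse stores entries. */
struct toy_dirbuf {
    char names[TOY_MAX_ENTRIES][TOY_NAME_MAX];
    size_t count;
};

/* Reduced callback type: 0 on success, 1 when the entry does not fit. */
typedef int (*toy_fill_dir_t)(void *buf, const char *name);

int toy_filler(void *buf, const char *name)
{
    struct toy_dirbuf *d = buf;

    if (d->count == TOY_MAX_ENTRIES || strlen(name) >= TOY_NAME_MAX)
        return 1;

    strcpy(d->names[d->count], name);
    d->count++;
    return 0;
}

/* Shaped like hello_readdir: add ".", ".." and the single file. */
int toy_readdir(void *buf, toy_fill_dir_t filler)
{
    const char *names[] = { ".", "..", "hello" };
    size_t i;

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        /* Assumption: stop quietly once the buffer reports it is full. */
        if (filler(buf, names[i]) != 0)
            break;
    }

    return 0;
}
```
]]></code></pre></div>
            <p>The callback hides how the buffer is laid out, so the
            filesystem code only ever deals in entry names.</p>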
            <p>Now the Zig version</p>
            <div class="sourceCode" id="cb9"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb9-1"><a href="#cb9-1" tabindex="-1"></a><span class="kw">fn</span> readdir(</span>
<span id="cb9-2"><a href="#cb9-2" tabindex="-1"></a>    path<span class="op">:</span> [<span class="op">*</span>c]<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span></span>
<span id="cb9-3"><a href="#cb9-3" tabindex="-1"></a>    buf<span class="op">:</span> <span class="op">?*</span><span class="dt">anyopaque</span><span class="op">,</span></span>
<span id="cb9-4"><a href="#cb9-4" tabindex="-1"></a>    filler<span class="op">:</span> fuse<span class="op">.</span>fuse_fill_dir_t<span class="op">,</span></span>
<span id="cb9-5"><a href="#cb9-5" tabindex="-1"></a>    _<span class="op">:</span> fuse<span class="op">.</span>off_t<span class="op">,</span></span>
<span id="cb9-6"><a href="#cb9-6" tabindex="-1"></a>    _<span class="op">:</span> <span class="op">?*</span>fuse<span class="op">.</span>struct_fuse_file_info<span class="op">,</span></span>
<span id="cb9-7"><a href="#cb9-7" tabindex="-1"></a>    _<span class="op">:</span> fuse<span class="op">.</span>enum_fuse_readdir_flags<span class="op">,</span></span>
<span id="cb9-8"><a href="#cb9-8" tabindex="-1"></a>) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="dt">c_int</span> {</span>
<span id="cb9-9"><a href="#cb9-9" tabindex="-1"></a>    <span class="at">const</span> p <span class="op">=</span> mem<span class="op">.</span>span(path);</span>
<span id="cb9-10"><a href="#cb9-10" tabindex="-1"></a></span>
<span id="cb9-11"><a href="#cb9-11" tabindex="-1"></a>    log<span class="op">.</span>info(<span class="st">&quot;readdir: {s}&quot;</span><span class="op">,</span> <span class="op">.</span>{p});</span>
<span id="cb9-12"><a href="#cb9-12" tabindex="-1"></a></span>
<span id="cb9-13"><a href="#cb9-13" tabindex="-1"></a>    <span class="cf">if</span> (<span class="op">!</span>mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> <span class="st">&quot;/&quot;</span><span class="op">,</span> p))</span>
<span id="cb9-14"><a href="#cb9-14" tabindex="-1"></a>        <span class="cf">return</span> cErr(E<span class="op">.</span>NOENT);</span>
<span id="cb9-15"><a href="#cb9-15" tabindex="-1"></a></span>
<span id="cb9-16"><a href="#cb9-16" tabindex="-1"></a>    <span class="at">const</span> names <span class="op">=</span> [_][<span class="op">:</span><span class="dv">0</span>]<span class="at">const</span> <span class="dt">u8</span>{ <span class="st">&quot;.&quot;</span><span class="op">,</span> <span class="st">&quot;..&quot;</span><span class="op">,</span> filename };</span>
<span id="cb9-17"><a href="#cb9-17" tabindex="-1"></a></span>
<span id="cb9-18"><a href="#cb9-18" tabindex="-1"></a>    <span class="cf">for</span> (names) <span class="op">|</span>n<span class="op">|</span> {</span>
<span id="cb9-19"><a href="#cb9-19" tabindex="-1"></a>        <span class="at">const</span> ret <span class="op">=</span> filler<span class="op">.?</span>(buf<span class="op">,</span> n<span class="op">,</span> <span class="cn">null</span><span class="op">,</span> <span class="dv">0</span><span class="op">,</span> <span class="dv">0</span>);</span>
<span id="cb9-20"><a href="#cb9-20" tabindex="-1"></a></span>
<span id="cb9-21"><a href="#cb9-21" tabindex="-1"></a>        <span class="cf">if</span> (ret <span class="op">&gt;</span> <span class="dv">0</span>)</span>
<span id="cb9-22"><a href="#cb9-22" tabindex="-1"></a>            log<span class="op">.</span>err(<span class="st">&quot;readdir: {s}: {}&quot;</span><span class="op">,</span> <span class="op">.</span>{ p<span class="op">,</span> ret });</span>
<span id="cb9-23"><a href="#cb9-23" tabindex="-1"></a>    }</span>
<span id="cb9-24"><a href="#cb9-24" tabindex="-1"></a></span>
<span id="cb9-25"><a href="#cb9-25" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span>;</span>
<span id="cb9-26"><a href="#cb9-26" tabindex="-1"></a>}</span></code></pre></div>
            <p>The <code>filler</code> callback returns a value which
            Zig doesn’t want to be ignored. C has an attribute for that
            as well, but it is not the default. In Zig we either pay
            attention to the return value or explicitly ignore it with
            <code>_ = filler...</code>.</p>
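            <p>For comparison, this is roughly how the opt-in looks in C
            with GCC or Clang. The attribute is a compiler extension
            (C23 has <code>[[nodiscard]]</code> for the same idea) and
            the function names are hypothetical, so treat it as a
            sketch.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
/* GCC and Clang extension; not part of the older C standards. */
#define TOY_MUST_CHECK __attribute__((warn_unused_result))

/* Hypothetical callback-like function: 0 on success, 1 when out of room. */
TOY_MUST_CHECK int toy_add_entry(int *used, int capacity)
{
    if (*used >= capacity)
        return 1;

    (*used)++;
    return 0;
}

/* Returns how many of the `wanted` entries could not be added. */
int toy_add_many(int *used, int capacity, int wanted)
{
    int failed = 0;
    int i;

    for (i = 0; i < wanted; i++) {
        /* A bare `toy_add_entry(used, capacity);` here draws a warning. */
        if (toy_add_entry(used, capacity) != 0)
            failed++;
    }

    return failed;
}
```
]]></code></pre></div>
            <p>In C this is only a warning and only for functions that
            ask for it, whereas Zig refuses to compile any discarded
            non-void value.</p>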
            <p>I didn’t look into what is the right thing to do when
            filler fails. It depends on what is likely to fail and how
            that could be communicated to the user.</p>
            <h1 id="open">open</h1>
            <p>The <code>open</code> syscall tries to associate a file
            handle with a path. All the libfuse callbacks I have seen
            take a path as their first argument instead of a file
            handle. However, the file handle is still there; it is just
            buried in <code>struct fuse_file_info</code>.</p>
            <p>The C implementation of <code>open</code> looks like
            this</p>
            <div class="sourceCode" id="cb10"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb10-1"><a href="#cb10-1" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> hello_open<span class="op">(</span><span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span>path<span class="op">,</span> <span class="kw">struct</span> fuse_file_info <span class="op">*</span>fi<span class="op">)</span></span>
<span id="cb10-2"><a href="#cb10-2" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb10-3"><a href="#cb10-3" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strcmp<span class="op">(</span>path<span class="op">+</span><span class="dv">1</span><span class="op">,</span> options<span class="op">.</span>filename<span class="op">)</span> <span class="op">!=</span> <span class="dv">0</span><span class="op">)</span></span>
<span id="cb10-4"><a href="#cb10-4" tabindex="-1"></a>        <span class="cf">return</span> <span class="op">-</span>ENOENT<span class="op">;</span></span>
<span id="cb10-5"><a href="#cb10-5" tabindex="-1"></a></span>
<span id="cb10-6"><a href="#cb10-6" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">((</span>fi<span class="op">-&gt;</span>flags <span class="op">&amp;</span> O_ACCMODE<span class="op">)</span> <span class="op">!=</span> O_RDONLY<span class="op">)</span></span>
<span id="cb10-7"><a href="#cb10-7" tabindex="-1"></a>        <span class="cf">return</span> <span class="op">-</span>EACCES<span class="op">;</span></span>
<span id="cb10-8"><a href="#cb10-8" tabindex="-1"></a></span>
<span id="cb10-9"><a href="#cb10-9" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb10-10"><a href="#cb10-10" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>It just checks the path and access mode. The Zig version
            looks like this</p>
            <div class="sourceCode" id="cb11"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb11-1"><a href="#cb11-1" tabindex="-1"></a><span class="co">// May not be the correct size depending on the target because of the</span></span>
<span id="cb11-2"><a href="#cb11-2" tabindex="-1"></a><span class="co">// bitfield: https://github.com/ziglang/zig/issues/1499</span></span>
<span id="cb11-3"><a href="#cb11-3" tabindex="-1"></a><span class="at">const</span> FileInfo <span class="op">=</span> <span class="at">extern</span> <span class="kw">struct</span> {</span>
<span id="cb11-4"><a href="#cb11-4" tabindex="-1"></a>    flags<span class="op">:</span> <span class="dt">c_int</span><span class="op">,</span></span>
<span id="cb11-5"><a href="#cb11-5" tabindex="-1"></a>    bitfield<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb11-6"><a href="#cb11-6" tabindex="-1"></a>    padding2<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb11-7"><a href="#cb11-7" tabindex="-1"></a>    fh<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb11-8"><a href="#cb11-8" tabindex="-1"></a>    lock_owner<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb11-9"><a href="#cb11-9" tabindex="-1"></a>    poll_events<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb11-10"><a href="#cb11-10" tabindex="-1"></a>};</span>
<span id="cb11-11"><a href="#cb11-11" tabindex="-1"></a></span>
<span id="cb11-12"><a href="#cb11-12" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb11-13"><a href="#cb11-13" tabindex="-1"></a></span>
<span id="cb11-14"><a href="#cb11-14" tabindex="-1"></a><span class="kw">fn</span> open(</span>
<span id="cb11-15"><a href="#cb11-15" tabindex="-1"></a>    path<span class="op">:</span> [<span class="op">*</span>c]<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span></span>
<span id="cb11-16"><a href="#cb11-16" tabindex="-1"></a>    file_info<span class="op">:</span> <span class="op">?*</span>fuse<span class="op">.</span>struct_fuse_file_info<span class="op">,</span></span>
<span id="cb11-17"><a href="#cb11-17" tabindex="-1"></a>) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="dt">c_int</span> {</span>
<span id="cb11-18"><a href="#cb11-18" tabindex="-1"></a>    <span class="at">const</span> p <span class="op">=</span> mem<span class="op">.</span>span(path);</span>
<span id="cb11-19"><a href="#cb11-19" tabindex="-1"></a>    <span class="at">const</span> fi<span class="op">:</span> <span class="op">*</span>FileInfo <span class="op">=</span> <span class="bu">@ptrCast</span>(<span class="bu">@alignCast</span>(file_info<span class="op">.?</span>));</span>
<span id="cb11-20"><a href="#cb11-20" tabindex="-1"></a></span>
<span id="cb11-21"><a href="#cb11-21" tabindex="-1"></a>    log<span class="op">.</span>info(<span class="st">&quot;open: {s}&quot;</span><span class="op">,</span> <span class="op">.</span>{p});</span>
<span id="cb11-22"><a href="#cb11-22" tabindex="-1"></a></span>
<span id="cb11-23"><a href="#cb11-23" tabindex="-1"></a>    <span class="cf">if</span> (<span class="op">!</span>mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> filename<span class="op">,</span> p[<span class="dv">1</span><span class="er">.</span><span class="op">.</span>]))</span>
<span id="cb11-24"><a href="#cb11-24" tabindex="-1"></a>        <span class="cf">return</span> cErr(E<span class="op">.</span>NOENT);</span>
<span id="cb11-25"><a href="#cb11-25" tabindex="-1"></a></span>
<span id="cb11-26"><a href="#cb11-26" tabindex="-1"></a>    <span class="cf">if</span> ((fi<span class="op">.</span>flags <span class="op">&amp;</span> fuse<span class="op">.</span>O_ACCMODE) <span class="op">!=</span> fuse<span class="op">.</span>O_RDONLY)</span>
<span id="cb11-27"><a href="#cb11-27" tabindex="-1"></a>        <span class="cf">return</span> cErr(E<span class="op">.</span>ACCES);</span>
<span id="cb11-28"><a href="#cb11-28" tabindex="-1"></a></span>
<span id="cb11-29"><a href="#cb11-29" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span>;</span>
<span id="cb11-30"><a href="#cb11-30" tabindex="-1"></a>}</span></code></pre></div>
            <p>The <code>struct fuse_file_info</code> contains a
            bitfield which can’t presently be translated from C. Zig has
            bitfields as well, but they have the same layout on all
            targets. In C, bitfields change between targets, which means
            extra work for Zig’s authors. You can see why in the <a
            href="https://github.com/ziglang/zig/issues/1499">linked
            issue</a>.</p>
            <p>Luckily we just want to access <code>flags</code> which
            comes before the bitfield. We could even just cast the
            pointer to <code>*c_int</code> as we don’t access any memory
            after it. If we needed to know where some other part of the
            struct came in memory then we could have an issue.</p>
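            <p>The reason reading <code>flags</code> is safe can be
            shown in plain C. The struct below is a cut-down,
            hypothetical mirror of the start of
            <code>struct fuse_file_info</code>, not the real definition;
            the point is only that the first member always sits at
            offset zero, whatever the compiler does with the bitfield
            after it.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, shortened mirror; the real struct has more fields. */
struct toy_file_info {
    int flags;
    unsigned int writepage : 1;
    unsigned int direct_io : 1;
    unsigned int keep_cache : 1;
    uint64_t fh;
};

/* Read the open flags through a pointer to the first member only. */
int toy_open_flags(const void *file_info)
{
    return *(const int *)file_info;
}

/* 1 if the open request is read-only, 0 otherwise. */
int toy_is_read_only(const void *file_info)
{
    return (toy_open_flags(file_info) & O_ACCMODE) == O_RDONLY;
}
```
]]></code></pre></div>
            <p>Anything after the bitfield, such as <code>fh</code>,
            would need the target's real layout, which is where a
            hand-written mirror can go wrong.</p>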
            <h1 id="read">read</h1>
            <p>Finally we have a call to read the file content</p>
            <div class="sourceCode" id="cb12"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb12-1"><a href="#cb12-1" tabindex="-1"></a><span class="dt">static</span> <span class="dt">int</span> hello_read<span class="op">(</span><span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span>path<span class="op">,</span> <span class="dt">char</span> <span class="op">*</span>buf<span class="op">,</span> <span class="dt">size_t</span> size<span class="op">,</span> off_t offset<span class="op">,</span></span>
<span id="cb12-2"><a href="#cb12-2" tabindex="-1"></a>              <span class="kw">struct</span> fuse_file_info <span class="op">*</span>fi<span class="op">)</span></span>
<span id="cb12-3"><a href="#cb12-3" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb12-4"><a href="#cb12-4" tabindex="-1"></a>    <span class="dt">size_t</span> len<span class="op">;</span></span>
<span id="cb12-5"><a href="#cb12-5" tabindex="-1"></a>    <span class="op">(</span><span class="dt">void</span><span class="op">)</span> fi<span class="op">;</span></span>
<span id="cb12-6"><a href="#cb12-6" tabindex="-1"></a>    <span class="cf">if</span><span class="op">(</span>strcmp<span class="op">(</span>path<span class="op">+</span><span class="dv">1</span><span class="op">,</span> options<span class="op">.</span>filename<span class="op">)</span> <span class="op">!=</span> <span class="dv">0</span><span class="op">)</span></span>
<span id="cb12-7"><a href="#cb12-7" tabindex="-1"></a>        <span class="cf">return</span> <span class="op">-</span>ENOENT<span class="op">;</span></span>
<span id="cb12-8"><a href="#cb12-8" tabindex="-1"></a></span>
<span id="cb12-9"><a href="#cb12-9" tabindex="-1"></a>    len <span class="op">=</span> strlen<span class="op">(</span>options<span class="op">.</span>contents<span class="op">);</span></span>
<span id="cb12-10"><a href="#cb12-10" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>offset <span class="op">&lt;</span> len<span class="op">)</span> <span class="op">{</span></span>
<span id="cb12-11"><a href="#cb12-11" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>offset <span class="op">+</span> size <span class="op">&gt;</span> len<span class="op">)</span></span>
<span id="cb12-12"><a href="#cb12-12" tabindex="-1"></a>            size <span class="op">=</span> len <span class="op">-</span> offset<span class="op">;</span></span>
<span id="cb12-13"><a href="#cb12-13" tabindex="-1"></a>        memcpy<span class="op">(</span>buf<span class="op">,</span> options<span class="op">.</span>contents <span class="op">+</span> offset<span class="op">,</span> size<span class="op">);</span></span>
<span id="cb12-14"><a href="#cb12-14" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span></span>
<span id="cb12-15"><a href="#cb12-15" tabindex="-1"></a>        size <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb12-16"><a href="#cb12-16" tabindex="-1"></a></span>
<span id="cb12-17"><a href="#cb12-17" tabindex="-1"></a>    <span class="cf">return</span> size<span class="op">;</span></span>
<span id="cb12-18"><a href="#cb12-18" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>And the Zig version</p>
            <div class="sourceCode" id="cb13"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb13-1"><a href="#cb13-1" tabindex="-1"></a><span class="kw">fn</span> read(</span>
<span id="cb13-2"><a href="#cb13-2" tabindex="-1"></a>    path<span class="op">:</span> [<span class="op">*</span>c]<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span></span>
<span id="cb13-3"><a href="#cb13-3" tabindex="-1"></a>    buf<span class="op">:</span> [<span class="op">*</span>c]<span class="dt">u8</span><span class="op">,</span></span>
<span id="cb13-4"><a href="#cb13-4" tabindex="-1"></a>    size<span class="op">:</span> <span class="dt">usize</span><span class="op">,</span></span>
<span id="cb13-5"><a href="#cb13-5" tabindex="-1"></a>    offset<span class="op">:</span> fuse<span class="op">.</span>off_t<span class="op">,</span></span>
<span id="cb13-6"><a href="#cb13-6" tabindex="-1"></a>    _<span class="op">:</span> <span class="op">?*</span>fuse<span class="op">.</span>struct_fuse_file_info<span class="op">,</span></span>
<span id="cb13-7"><a href="#cb13-7" tabindex="-1"></a>) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="dt">c_int</span> {</span>
<span id="cb13-8"><a href="#cb13-8" tabindex="-1"></a>    <span class="at">const</span> p <span class="op">=</span> mem<span class="op">.</span>span(path);</span>
<span id="cb13-9"><a href="#cb13-9" tabindex="-1"></a>    <span class="at">const</span> off<span class="op">:</span> <span class="dt">usize</span> <span class="op">=</span> <span class="bu">@intCast</span>(offset);</span>
<span id="cb13-10"><a href="#cb13-10" tabindex="-1"></a></span>
<span id="cb13-11"><a href="#cb13-11" tabindex="-1"></a>    log<span class="op">.</span>info(<span class="st">&quot;read: {s},size={},offset={}&quot;</span><span class="op">,</span> <span class="op">.</span>{ p<span class="op">,</span> size<span class="op">,</span> offset });</span>
<span id="cb13-12"><a href="#cb13-12" tabindex="-1"></a></span>
<span id="cb13-13"><a href="#cb13-13" tabindex="-1"></a>    <span class="cf">if</span> (<span class="op">!</span>mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> filename<span class="op">,</span> p[<span class="dv">1</span><span class="er">.</span><span class="op">.</span>]))</span>
<span id="cb13-14"><a href="#cb13-14" tabindex="-1"></a>        <span class="cf">return</span> cErr(E<span class="op">.</span>NOENT);</span>
<span id="cb13-15"><a href="#cb13-15" tabindex="-1"></a></span>
<span id="cb13-16"><a href="#cb13-16" tabindex="-1"></a>    <span class="cf">if</span> (off <span class="op">&gt;=</span> contents<span class="op">.</span>len)</span>
<span id="cb13-17"><a href="#cb13-17" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">0</span>;</span>
<span id="cb13-18"><a href="#cb13-18" tabindex="-1"></a></span>
<span id="cb13-19"><a href="#cb13-19" tabindex="-1"></a>    <span class="at">const</span> s <span class="op">=</span> <span class="cf">if</span> (off <span class="op">+</span> size <span class="op">&gt;</span> contents<span class="op">.</span>len)</span>
<span id="cb13-20"><a href="#cb13-20" tabindex="-1"></a>        contents<span class="op">.</span>len <span class="op">-</span> off</span>
<span id="cb13-21"><a href="#cb13-21" tabindex="-1"></a>    <span class="cf">else</span></span>
<span id="cb13-22"><a href="#cb13-22" tabindex="-1"></a>        size;</span>
<span id="cb13-23"><a href="#cb13-23" tabindex="-1"></a></span>
<span id="cb13-24"><a href="#cb13-24" tabindex="-1"></a>    <span class="bu">@memcpy</span>(buf[<span class="dv">0</span><span class="er">.</span><span class="op">.</span>s]<span class="op">,</span> contents[off<span class="op">..</span>][<span class="dv">0</span><span class="er">.</span><span class="op">.</span>s]);</span>
<span id="cb13-25"><a href="#cb13-25" tabindex="-1"></a></span>
<span id="cb13-26"><a href="#cb13-26" tabindex="-1"></a>    <span class="cf">return</span> <span class="bu">@intCast</span>(s);</span>
<span id="cb13-27"><a href="#cb13-27" tabindex="-1"></a>}</span></code></pre></div>
            <p>Zig is quite strict about what types can appear in
            operations together. So <code>offset</code> has to be cast to
            <code>usize</code> or else we would have to cast
            <code>size</code> and <code>contents.len</code> to
            <code>fuse.off_t</code>/<code>c_long</code>.</p>
            <p>The <code>memcpy</code> in Zig is done with slices
            instead of pointer arithmetic. The <code>buf</code> argument
            is a C pointer, but like a many-item pointer it can be
            sliced, so we slice it up to <code>s</code> which is either
            <code>size</code> or <code>contents.len - off</code>.
            <code>@memcpy</code> requires the source and destination to
            be the same length, so the source is cut down to
            <code>s</code> bytes as well.</p>
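            <p>The clamping is the part that is easy to get subtly
            wrong, so here it is on its own as a small C sketch.
            <code>toy_read</code> is a hypothetical helper, not part of
            libfuse, and it takes the offset as a <code>size_t</code>,
            assuming the caller has already rejected negative
            offsets.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Copy at most `size` bytes of `src` (which is `len` bytes long), starting
 * at `offset`, into `buf`. Returns the number of bytes copied; 0 means the
 * offset is at or past the end of the data. */
size_t toy_read(char *buf, size_t size, size_t offset,
                const char *src, size_t len)
{
    size_t n;

    if (offset >= len)
        return 0;

    n = len - offset; /* bytes left after the offset */
    if (size < n)
        n = size;     /* the caller asked for fewer */

    memcpy(buf, src + offset, n);
    return n;
}
```
]]></code></pre></div>
            <p>Comparing <code>size</code> against
            <code>len - offset</code> rather than adding
            <code>offset + size</code> also sidesteps overflow for very
            large requests.</p>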
            <h1 id="run">run</h1>
            <p>The Zig version can be built and run as follows; this
            will also mount the filesystem.</p>
            <div class="sourceCode" id="cb14"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb14-1"><a href="#cb14-1" tabindex="-1"></a><span class="ex">$</span> mkdir /tmp/fuse</span>
<span id="cb14-2"><a href="#cb14-2" tabindex="-1"></a><span class="ex">$</span> zig build run <span class="at">--</span> <span class="at">-f</span> /tmp/fuse</span></code></pre></div>
            <p>Then in another terminal you can do</p>
            <div class="sourceCode" id="cb15"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb15-1"><a href="#cb15-1" tabindex="-1"></a><span class="ex">$</span> ls <span class="at">-l</span> /tmp/fuse</span>
<span id="cb15-2"><a href="#cb15-2" tabindex="-1"></a><span class="ex">total</span> 0</span>
<span id="cb15-3"><a href="#cb15-3" tabindex="-1"></a><span class="ex">-r--r--r--</span> 1 root root 15 Jan  1  1970 hello</span>
<span id="cb15-4"><a href="#cb15-4" tabindex="-1"></a><span class="ex">$</span> cat /tmp/fuse/hello</span>
<span id="cb15-5"><a href="#cb15-5" tabindex="-1"></a><span class="ex">Alright,</span> mate!</span></code></pre></div>
            <p>This produces log output similar to</p>
            <div class="sourceCode" id="cb16"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb16-1"><a href="#cb16-1" tabindex="-1"></a><span class="ex">info:</span> Zig hello FUSE</span>
<span id="cb16-2"><a href="#cb16-2" tabindex="-1"></a><span class="ex">info:</span> stat: /</span>
<span id="cb16-3"><a href="#cb16-3" tabindex="-1"></a><span class="ex">info:</span> readdir: /</span>
<span id="cb16-4"><a href="#cb16-4" tabindex="-1"></a><span class="ex">info:</span> stat: /hello</span>
<span id="cb16-5"><a href="#cb16-5" tabindex="-1"></a><span class="ex">info:</span> stat: /hello</span>
<span id="cb16-6"><a href="#cb16-6" tabindex="-1"></a><span class="ex">info:</span> stat: /hello</span>
<span id="cb16-7"><a href="#cb16-7" tabindex="-1"></a><span class="ex">info:</span> stat: /</span>
<span id="cb16-8"><a href="#cb16-8" tabindex="-1"></a><span class="ex">info:</span> readdir: /</span>
<span id="cb16-9"><a href="#cb16-9" tabindex="-1"></a><span class="ex">info:</span> open: /hello</span>
<span id="cb16-10"><a href="#cb16-10" tabindex="-1"></a><span class="ex">info:</span> read: /hello,size=4096,offset=0</span></code></pre></div>
            <h1 id="fin">fin</h1>
            <p>Depending on what it is you want to do, this is a quick
            way of getting started. If you are embarking on a complex
            project then implementing the kernel interface directly
            seems like the way to go.</p>
            <p>For something simple, the main concern is the usage
            of bitfields. I guess from looking at the bitfield in
            question that it won’t have padding added to it. Below is
            the struct definition with comments removed.</p>
            <div class="sourceCode" id="cb17"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb17-1"><a href="#cb17-1" tabindex="-1"></a><span class="kw">struct</span> fuse_file_info <span class="op">{</span></span>
<span id="cb17-2"><a href="#cb17-2" tabindex="-1"></a>    <span class="dt">int</span> flags<span class="op">;</span></span>
<span id="cb17-3"><a href="#cb17-3" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> writepage <span class="op">:</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb17-4"><a href="#cb17-4" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> direct_io <span class="op">:</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb17-5"><a href="#cb17-5" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> keep_cache <span class="op">:</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb17-6"><a href="#cb17-6" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> parallel_direct_writes <span class="op">:</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb17-7"><a href="#cb17-7" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> flush <span class="op">:</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb17-8"><a href="#cb17-8" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> nonseekable <span class="op">:</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb17-9"><a href="#cb17-9" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> flock_release <span class="op">:</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb17-10"><a href="#cb17-10" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> cache_readdir <span class="op">:</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb17-11"><a href="#cb17-11" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> noflush <span class="op">:</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb17-12"><a href="#cb17-12" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> padding <span class="op">:</span> <span class="dv">23</span><span class="op">;</span></span>
<span id="cb17-13"><a href="#cb17-13" tabindex="-1"></a>    <span class="dt">unsigned</span> <span class="dt">int</span> padding2 <span class="op">:</span> <span class="dv">32</span><span class="op">;</span></span>
<span id="cb17-14"><a href="#cb17-14" tabindex="-1"></a>    <span class="op">...</span></span>
<span id="cb17-15"><a href="#cb17-15" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>The author added explicit padding for the remaining 23
            bits in a 32-bit int, so it is probably fine and the same
            struct can be recreated in Zig.</p>
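            <p>The bit arithmetic can be checked with a small C sketch.
            The struct below is a hypothetical stand-in that copies only
            the fields shown above. Bitfield layout is
            implementation-defined, so the 12-byte size is an assumption
            that holds for GCC and Clang on common Linux ABIs, not a
            guarantee.</p>
            <pre><code><![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the start of struct fuse_file_info; only the
 * fields shown in the article are reproduced. */
struct file_info_head {
    int flags;
    unsigned int writepage : 1;
    unsigned int direct_io : 1;
    unsigned int keep_cache : 1;
    unsigned int parallel_direct_writes : 1;
    unsigned int flush : 1;
    unsigned int nonseekable : 1;
    unsigned int flock_release : 1;
    unsigned int cache_readdir : 1;
    unsigned int noflush : 1;
    unsigned int padding : 23;
    unsigned int padding2 : 32;
};

/* Nine 1-bit flags plus 23 padding bits fill one 32-bit unit exactly. */
enum { FLAG_BITS = 9, PAD_BITS = 23 };

/* Return the raw 32-bit word that holds the flags when only direct_io is set.
 * Assumes the bitfields start right after the 4-byte flags member. */
uint32_t raw_flags_with_direct_io(void)
{
    struct file_info_head fi;
    uint32_t raw;

    memset(&fi, 0, sizeof(fi));
    fi.direct_io = 1;
    memcpy(&raw, (const unsigned char *)&fi + sizeof(int), sizeof(raw));
    return raw;
}
```
]]></code></pre>
            <p>If the compiler packs the flags as expected then
            <code>sizeof(struct file_info_head)</code> is 12 and the raw
            word has exactly one bit set. Which bit that is depends on
            the ABI.</p>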
            <h1 id="related">Related</h1>
            <ul>
            <li><a href="/https/richiejp.com/zig-fuse-two">Zig &amp; /dev/fuse: A weird
            file system</a></li>
            <li><a href="/https/richiejp.com/barely-http2-zig">Barely HTTP/2 server in
            Zig</a></li>
            <li><a href="/https/richiejp.com/zig-cross-compile-ltp-ltx-linux">Minimal Linux
            VM cross compiled with Clang and Zig</a></li>
            <li><a href="/https/richiejp.com/zig-ld-preload-trick">Override libc’s malloc
            with Zig</a></li>
            <li><a href="/https/richiejp.com/zig-vs-c-mini-http-server">Zig Vs C - Minimal
            HTTP server</a></li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>Zig &amp; /dev/fuse: A weird file system</title>
  <id>https://richiejp.com/zig-fuse-two</id>
  <published>2023-09-28T23:04:10+01:00</published>
  <updated>2023-11-09T10:13:26Z</updated>
  <link rel="alternate" href="https://richiejp.com/zig-fuse-two" />
  <summary>Using the raw Linux kernel FUSE interface to create a strange
file system that only allows setxattr</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>I previously wrote about using <a
            href="/https/richiejp.com/zig-fuse-one">libfuse and Zig</a> to create a minimal
            file system. You can see from that article that I had some
            issues compiling libfuse and also speculated that it would
            be better to use the raw interface directly.</p>
            <p>Soon after finishing that article I decided to try
            exploiting a bug (an n-day UAF). I have spent a lot of time
            reproducing bugs for the Linux Test Project, but never
            exploiting one.</p>
            <p>Presently I’m stuck on finding a <em>heap spray</em>
            technique that I can get to work with the bug given the
            limitations of Google’s kCTF targets. I’ll leave the details
            for another article if or when I get it to work. For now
            let’s just say that I want to implement
            <code>setxattr</code> and/or some other messages. To see why
            take a look at <a
            href="https://chompie.rip/Blog+Posts/Put+an+io_uring+on+it+-+Exploiting+the+Linux+Kernel#Universal+Heap+Spray">chompie’s
            article</a>.</p>
            <p>To get a deeper look at how FUSE works I decided to use
            the raw interface. It seems like there is a lot of interest
            in FUSE and I keep finding use cases for it (in addition to
            nefarious activities). Most recently <a
            href="https://github.com/ufrisk/MemProcFS">MemProcFs</a>
            which lets you mount your RAM as a file system. I’ve also
            been using <a
            href="https://github.com/libfuse/sshfs">sshfs</a> quite
            extensively with my headless workstation.</p>
            <p>This article is as much a narrative of my investigation
            as it is information about FUSE and Zig. As such it will
            contain false understanding and speculation.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>I created a video which mainly <a
            href="https://studio.youtube.com/video/B1G7p5qUW2o/edit">shows
            the debugging process</a>.</p>
            </div>
            </div>
            <h1 id="raw-fuse">Raw FUSE</h1>
            <p>The raw interface for FUSE is usually found at
            <code>/dev/fuse</code>. This is a <em>special character
            file</em> that your system creates using <code>mknod</code>
            with the device type <code>0xa:0xe5</code> (I’ll just refer
            to it as <code>/dev/fuse</code>).</p>
            <p>The character device type with the major number
            <code>0xa</code> and minor number <code>0xe5</code> is what
            allows us to create file systems in userspace. Usually this
            is made accessible at <code>/dev/fuse</code>, but in theory
            you could create this file at any path using
            <code>mknod</code>. This is true for all devices and
            device-like interfaces such as FUSE.</p>
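            <p>Since the path is arbitrary, what identifies the node is
            its device number. A minimal C sketch of that check follows.
            The function names are hypothetical and it assumes glibc’s
            <code>sys/sysmacros.h</code> for <code>major</code> and
            <code>minor</code>.</p>
            <pre><code><![CDATA[
```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Major/minor pair named in the article. */
#define FUSE_DEV_MAJOR 0xa
#define FUSE_DEV_MINOR 0xe5

/* Return 1 if the device number matches the FUSE character device. */
int is_fuse_rdev(dev_t rdev)
{
    return major(rdev) == FUSE_DEV_MAJOR && minor(rdev) == FUSE_DEV_MINOR;
}

/* Return 1 if path is a character device node with the FUSE device number,
 * 0 if it is something else, and -1 if it cannot be inspected. */
int path_is_fuse_device(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISCHR(st.st_mode))
        return 0;
    return is_fuse_rdev(st.st_rdev);
}
```
]]></code></pre>
            <p>On a system where FUSE is available I would expect
            <code>path_is_fuse_device("/dev/fuse")</code> to return 1,
            and the same for a node created elsewhere with
            <code>mknod</code>.</p>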
            <p>Whether this file is accessible depends on what
            permissions it is given. Most distros allow regular users
            to access it. If they don’t then a user that has
            <code>CAP_MKNOD</code> could still create the device node.
            Then again, a user that can create device nodes can probably
            do anything.</p>
            <p>A character device can be opened like a regular file.
            Once we have a file handle to it we can read from and write
            to it. Note that each open and its resulting file handle are
            independent. To create multiple file systems we just open
            <code>/dev/fuse</code> multiple times.</p>
            <p>Exactly what happens when we read or write to the file
            handle depends on the device driver. In FUSE’s case we’re
            not really dealing with a device; it is purely an interface
            between software components.</p>
            <p>So FUSE’s device driver is basically a big piece of glue
            code which translates file system requests into messages
            that are passed to our daemon reading
            <code>/dev/fuse</code>.</p>
            <p>We read from <code>/dev/fuse</code> and when we get a
            message, we send back a response. The FUSE code then does
            all kinds of caching, sends out notifications, translates
            error codes, creates internal kernel objects and so on.</p>
            <p>Opening <code>/dev/fuse</code> is not enough to use the
            file system; it also needs to be mounted. Usually a device
            is mounted by specifying its path
            (e.g. <code>/dev/hda</code>) or if there is no device, we
            just specify the file system name (e.g. tmpfs). However, to
            my knowledge, a FUSE file system has no device or name
            associated with it.</p>
            <p>So instead we pass the file descriptor we just opened to
            mount. This is done using a file system option called
            <code>fd</code>. The mount system call then looks at our
            process’s file descriptor table and checks that the file
            descriptor points to an instance of
            <code>/dev/fuse</code>.</p>
            <p>Before looking into any of this I was contacted by
            José-Paul Dominguez who was using the low-level libfuse API.
            While debugging their application with <code>strace</code>
            they noticed that libfuse called mount and it failed.
            However something still got mounted.</p>
            <p>I mentioned in my previous article that a regular user
            can use FUSE. Strictly speaking though this is not true,
            because a regular user cannot call the <code>mount</code>
            system call. This confused me and I wondered if there is in
            fact some other magical way to mount a file system, or at
            least a FUSE file system.</p>
            <p>It turns out that what libfuse actually does is call a
            <code>suid</code> binary called <code>fusermount</code>.
            This has the permission bit set to run as root, so it can
            perform the mount. This didn’t turn up in
            <code>strace</code> because it was missing the
            <code>-f</code> follow switch, so child processes
            (i.e. fusermount) were not traced. Also I doubt that suid
            binaries can be traced unless strace is run as root.</p>
            <p>For some reason <code>fusermount</code> does not simply
            take an argument specifying the open <code>fd</code> for
            <code>/dev/fuse</code>. Usually unless
            <code>FD_CLOEXEC</code> is set, the <code>fd</code> should
            just stay open after the <code>fork</code> and
            <code>exec</code> to execute <code>fusermount</code>.</p>
            <p>Instead it opens a UNIX socket and uses a control message
            to transfer the <code>fd</code> from the parent process to
            the child. Passing file descriptors between running
            processes is not something I have seen very often. OpenQA
            does it to pass an FD to QEMU, but let’s not get into
            that.</p>
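            <p>For reference, passing a descriptor over a UNIX socket
            uses an <code>SCM_RIGHTS</code> control message. The C
            sketch below is not libfuse’s code. It is a generic
            illustration with hypothetical function names, and the demo
            passes the write end of a pipe rather than a
            <code>/dev/fuse</code> handle.</p>
            <pre><code><![CDATA[
```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for one descriptor, aligned for struct cmsghdr. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send fd over the UNIX socket sock as SCM_RIGHTS ancillary data. */
int send_fd(int sock, int fd)
{
    char byte = 0; /* at least one byte of normal data must accompany it */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor sent by send_fd(); returns the new fd or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Pass the write end of a pipe across a socketpair, then write through the
 * received copy. Returns 0 if the byte arrives. Error paths leak descriptors,
 * which is acceptable only because this is a demo. */
int fd_passing_demo(void)
{
    int sv[2], pfd[2], copy;
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0 || pipe(pfd) != 0)
        return -1;
    if (send_fd(sv[0], pfd[1]) != 0 || (copy = recv_fd(sv[1])) < 0)
        return -1;
    if (write(copy, "x", 1) != 1 || read(pfd[0], &c, 1) != 1)
        return -1;
    close(copy); close(pfd[0]); close(pfd[1]); close(sv[0]); close(sv[1]);
    return c == 'x' ? 0 : -1;
}
```
]]></code></pre>
            <p>The received descriptor is a new number in the receiving
            process that refers to the same open file, which is why
            writing to the copy shows up on the original pipe.</p>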
            <p>I decided not to do that; instead I run my code in an
            unprivileged user and mount namespace. This is not a good
            general solution, but it works for this demo. User
            namespaces are how it’s possible to have a container with a
            root user in. It’s not the real root user, but it lets you
            mount some file systems, including FUSE filesystems.</p>
            <p>Being able to create a new user namespace from an
            unprivileged user is obviously a massive security challenge.
            It simply makes a lot more system calls available to an
            attacker.</p>
            <p>It’s useful though for avoiding your container runtime
            requiring root while still being able to run containers with
            a root user in. So a lot of distributions allow it by
            default.</p>
            <h1 id="methodology">Methodology</h1>
            <p>The raw FUSE API has a man page. However it is both
            incomplete and not up to date. <code>libfuse</code> or the
            Go equivalent can also be used as documentation. I didn’t
            spend much time looking at these though.</p>
            <p>Instead I decided to (mostly) ignore the library and look
            at the kernel directly. I set up some breakpoints in GDB
            (actually <code>pwndbg</code>, along with the Linux kernel
            scripts) and started reading <code>/dev/fuse</code> to see
            what happened. I like <code>bpftrace</code> for debugging
            the kernel, but didn’t use it for this.</p>
            <p>I have a <a
            href="https://github.com/richiejp/m/blob/main/script/run-qemu.sh">script
            which starts QEMU</a> with the <code>-s</code> switch to
            enable the GDB stub. Then once QEMU has started I run
            <code>pwndbg</code> from the kernel source directory. The
            output of <code>pwndbg</code> is highly verbose, so I have
            removed most of it.</p>
            <pre><code>$ pwndbg vmlinux
...
pwndbg&gt; target remote localhost:1234
...
pwndbg&gt; lx-symbols
loading vmlinux
scanning for modules in /home/rich/kernel/linux
loading @0xffffffffc0201000: /home/rich/kernel/linux/fs/fuse/fuse.ko</code></pre>
            <p>For <code>lx-symbols</code> to work I have a
            <code>~/.config/gdb/gdbinit</code> like:
            <code>add-auto-load-safe-path /home/rich/kernel/linux/scripts/gdb/vmlinux-gdb.py</code></p>
            <p>There is a whole bunch of other setup to build the VM
            image I am using. You can see this at <a
            href="https://github.com/richiejp/m">github.com/richiejp/m</a>.
            At some point I’d like to fully automate recreating the
            environment and get cross compilation to work for a
            reasonable set of tools. Right now though you may find some
            bits are missing from the initrd. Also see the related
            article on <a href="/https/richiejp.com/zig-cross-compile-ltp-ltx-linux">cross
            compiling with Zig and LLVM</a>.</p>
            <p>Anyway, once we have all this set up, it’s possible
            to set a breakpoint at, say, <code>fuse_simple_request</code>
            in the kernel.</p>
            <pre><code>pwndbg&gt; b fuse_simple_request
Breakpoint 1 at 0xffffffffc0201587: file fs/fuse/dev.c, line 485.</code></pre>
            <p>Then when we start processing requests we’ll get dropped
            into this function. This function is particularly useful
            because it gets called for most requests. If we inspect the
            backtrace then it is usually easy to see what the current
            operation is.</p>
            <pre><code>pwndbg&gt; bt
#0  fuse_simple_request (fm=0xffff888100ac1860, args=0xffffc900003f7b18) at fs/fuse/dev.c:485
#1  0xffffffffc0207e67 in fuse_do_getattr (inode=0xffff888107e7c340, stat=0x0 &lt;fixed_percpu_data&gt;, file=0x0 &lt;fixed_percpu_data&gt;) at fs/fuse/dir.c:1119
#2  0xffffffffc02081e3 in fuse_perm_getattr (inode=0xffff888107e7c340, mask=1) at fs/fuse/dir.c:1306
#3  fuse_permission (mnt_userns=&lt;optimized out&gt;, inode=0xffff888107e7c340, mask=1) at fs/fuse/dir.c:1347
#4  0xffffffff81347749 in do_inode_permission (mnt_userns=0xffffffff828534b0 &lt;init_user_ns&gt;, inode=0xffff888107e7c340, mask=1) at fs/namei.c:458
#5  inode_permission (mnt_userns=0xffffffff828534b0 &lt;init_user_ns&gt;, inode=0xffff888107e7c340, mask=1) at fs/namei.c:525
#6  0xffffffff8134e8a5 in may_lookup (mnt_userns=0xffffffff828534b0 &lt;init_user_ns&gt;, nd=0xffffc900003f7d30) at fs/namei.c:1715
#7  link_path_walk (name=0xffff88810088b02f &quot;foo&quot;, nd=0xffffc900003f7d30) at fs/namei.c:2262
#8  0xffffffff8134809a in path_lookupat (nd=0xffffc900003f7d30, flags=65, path=0xffffc900003f7e88) at fs/namei.c:2473
#9  0xffffffff81347f37 in filename_lookup (dfd=-100, name=0xffff88810088b000, flags=&lt;optimized out&gt;, path=0xffffc900003f7e88, root=0x0 &lt;fixed_percpu_data&gt;) at fs/namei.c:2503
#10 0xffffffff81348c16 in user_path_at_empty (dfd=-100, name=0x7b9079d55ef8 &quot;/tmp/fuse-test/foo&quot;, flags=1, path=0xffffc900003f7e88, empty=0x0 &lt;fixed_percpu_data&gt;) at fs/namei.c:2876
#11 0xffffffff8136e73f in user_path_at (dfd=-100, name=0x7b9079d55ef8 &quot;/tmp/fuse-test/foo&quot;, flags=1, path=0xffffc900003f7e88) at ./include/linux/namei.h:57
#12 path_setxattr (pathname=0x7b9079d55ef8 &quot;/tmp/fuse-test/foo&quot;, name=0x215f25 &quot;user.bar&quot;, value=0x215f2e, size=3, flags=0, lookup_flags=1) at fs/xattr.c:631
#13 0xffffffff8136d807 in __do_sys_setxattr (pathname=0xb54d318b60b4ad00 &lt;error: Cannot access memory at address 0xb54d318b60b4ad00&gt;, name=0xffffc900003f7b18 &quot;\001&quot;, value=0x0 &lt;fixed_percpu_data&gt;, size=0,
    flags=81) at fs/xattr.c:652
#14 __se_sys_setxattr (pathname=-5382591504945206016, name=-60473135367400, value=0, size=0, flags=&lt;optimized out&gt;) at fs/xattr.c:648
#15 __x64_sys_setxattr (regs=&lt;optimized out&gt;) at fs/xattr.c:648
#16 0xffffffff81b90089 in do_syscall_x64 (regs=0xffffc900003f7f58, nr=1622453504) at arch/x86/entry/common.c:50
#17 do_syscall_64 (regs=0xffffc900003f7f58, nr=1622453504) at arch/x86/entry/common.c:80
#18 0xffffffff81c0009b in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120</code></pre>
            <p>A lot is going on in this backtrace and I’ll
            come back to it again. However we can see at the bottom that
            we enter the kernel due to the <code>setxattr</code> system
            call. Then there are a bunch of functions related to a path
            lookup, which results in a permissions check, which then
            results in <code>getattr</code> (note there is no
            <code>x</code>).</p>
            <p>The actual request is <code>GETATTR</code> which can be
            seen by printing the <code>args-&gt;opcode</code> struct
            member passed to <code>fuse_simple_request</code>.</p>
            <pre><code>pwndbg&gt; p args-&gt;opcode
$3 = 3</code></pre>
            <p>We know that <code>FUSE_GETATTR = 3</code> from
            <code>enum fuse_opcode</code> in
            <code>uapi/linux/fuse.h</code>. This is part of the Linux
            headers. One of the first things I did was to translate it
            into Zig using <code>zig translate-c</code>. I then just
            copied and pasted useful bits of this into <a
            href="https://github.com/richiejp/m/blob/5da25f865712c185a09e83fab836be3364fb4f8c/src/fuse.zig">the
            final program</a>.</p>
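            <p>While debugging it helps to turn opcode numbers back into
            names. Below is a small C sketch of such a lookup. The
            values are, to my knowledge, those of
            <code>enum fuse_opcode</code> in
            <code>uapi/linux/fuse.h</code>. The <code>OP_</code> prefix
            and <code>opcode_name</code> are hypothetical names, and
            only a handful of opcodes are listed.</p>
            <pre><code><![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A few values from enum fuse_opcode, renamed with an OP_ prefix so they do
 * not clash with the real header if it is also included. */
enum opcode {
    OP_LOOKUP = 1,
    OP_GETATTR = 3,
    OP_OPEN = 14,
    OP_READ = 15,
    OP_SETXATTR = 21,
    OP_GETXATTR = 22,
    OP_INIT = 26,
};

/* Map an opcode number, as found in the request header, to a short name. */
const char *opcode_name(uint32_t opcode)
{
    switch (opcode) {
    case OP_LOOKUP:   return "LOOKUP";
    case OP_GETATTR:  return "GETATTR";
    case OP_OPEN:     return "OPEN";
    case OP_READ:     return "READ";
    case OP_SETXATTR: return "SETXATTR";
    case OP_GETXATTR: return "GETXATTR";
    case OP_INIT:     return "INIT";
    default:          return "UNKNOWN";
    }
}
```
]]></code></pre>
            <p>With this, the <code>3</code> printed by GDB maps to
            <code>GETATTR</code>.</p>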
            <h1 id="tlv">TLV</h1>
            <p>The FUSE protocol can be described as roughly
            tag-length-value (TLV). Every message contains some minimal
            information including the message length, opcode (tag) and
            some common info like the user ID.</p>
            <p>The opcode decides what other data is transmitted after
            the standard header. Compared to <a
            href="/https/richiejp.com/barely-http2-zig">HTTP/2</a> it’s wonderfully simple,
            although this would be expected from a purely local protocol
            (maybe not USB).</p>
            <p>The headers sent from kernel to userspace daemon and
            userspace daemon to kernel are different. In Zig notation
            they are</p>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="at">const</span> InHeader <span class="op">=</span> <span class="at">extern</span> <span class="kw">struct</span> {</span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a>    len<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a>    opcode<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a>    unique<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a>    nodeid<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a>    uid<span class="op">:</span> l<span class="op">.</span>uid_t<span class="op">,</span></span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a>    gid<span class="op">:</span> l<span class="op">.</span>gid_t<span class="op">,</span></span>
<span id="cb5-9"><a href="#cb5-9" tabindex="-1"></a>    pid<span class="op">:</span> l<span class="op">.</span>pid_t<span class="op">,</span></span>
<span id="cb5-10"><a href="#cb5-10" tabindex="-1"></a></span>
<span id="cb5-11"><a href="#cb5-11" tabindex="-1"></a>    padding<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb5-12"><a href="#cb5-12" tabindex="-1"></a>};</span>
<span id="cb5-13"><a href="#cb5-13" tabindex="-1"></a></span>
<span id="cb5-14"><a href="#cb5-14" tabindex="-1"></a><span class="at">const</span> OutHeader <span class="op">=</span> <span class="at">extern</span> <span class="kw">struct</span> {</span>
<span id="cb5-15"><a href="#cb5-15" tabindex="-1"></a>    len<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb5-16"><a href="#cb5-16" tabindex="-1"></a>    err<span class="op">:</span> <span class="dt">i32</span><span class="op">,</span></span>
<span id="cb5-17"><a href="#cb5-17" tabindex="-1"></a>    unique<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb5-18"><a href="#cb5-18" tabindex="-1"></a>};</span></code></pre></div>
            <p>“In” is what comes into the daemon and “Out” is sent back
            to the kernel. This naming seems to be used consistently
            between the kernel and userspace headers.</p>
            <p>The <code>len</code> includes the length of the header
            and the following operation arguments. The
            <code>unique</code> field identifies the request so that
            responses can be matched to requests out-of-order.</p>
            <p><code>nodeid</code> appears to be the <code>inode</code>
            number. I’d describe an <code>inode</code> as a thing which
            exists in a file system. These things have numbers and we’re
            supposed to keep track of them, but instead I choose just to
            give them amusing (to me) numbers like
            <code>0xf00</code>.</p>
            <p>The purpose of <code>nodeid</code> of course depends on
            the opcode. It seems that at least the init operation does
            not need it, but probably most other operations do.</p>
            <p>We also have the user credentials and the process ID. I
            assume these are usually of the process that triggered a
            file system request.</p>
            <p>Then there is <code>padding</code> which I guess is there
            to make the struct size a multiple of 8 bytes (64 bits).
            Otherwise padding may get inserted elsewhere if the struct
            is embedded in another struct with 8 byte fields or if it is
            put into an array.</p>
            <p>Because the kernel and userland may use different
            compilers, the padding in structs needs to be made explicit.
            In our case we are not even using the same language.</p>
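            <p>The effect of the explicit padding can be sketched in C.
            The first struct mirrors <code>InHeader</code> using
            fixed-width integers (my assumption for the ID types). The
            second is a hypothetical variant without the
            <code>padding</code> field, whose size is left up to the
            compiler.</p>
            <pre><code><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as InHeader, with the trailing padding written out. */
struct in_header {
    uint32_t len;
    uint32_t opcode;
    uint64_t unique;
    uint64_t nodeid;
    uint32_t uid;
    uint32_t gid;
    uint32_t pid;
    uint32_t padding;
};

/* Hypothetical variant that leaves any tail padding to the compiler. */
struct in_header_implicit {
    uint32_t len;
    uint32_t opcode;
    uint64_t unique;
    uint64_t nodeid;
    uint32_t uid;
    uint32_t gid;
    uint32_t pid;
};

/* With explicit padding the layout is pinned down: 40 bytes, no hidden gaps. */
_Static_assert(sizeof(struct in_header) == 40, "in_header must be 40 bytes");
_Static_assert(offsetof(struct in_header, padding) == 36, "padding follows pid");

/* Bytes the compiler appends to the implicit variant on this ABI. The fields
 * themselves add up to 36 bytes. */
size_t hidden_tail_bytes(void)
{
    return sizeof(struct in_header_implicit) - 36;
}
```
]]></code></pre>
            <p>On x86-64 I would expect <code>hidden_tail_bytes()</code>
            to return 4. On an ABI that only aligns 64-bit integers to 4
            bytes it would return 0, so the two sides would disagree on
            the struct size without the explicit field.</p>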
            <p>The out header has an <code>err</code> field which holds
            either a negative error code or 0. For a lot of opcodes we can
            set the error to <code>-ENOSYS</code> and the kernel won’t
            try making another request with the same opcode. We also
            don’t have to send a valid message body.</p>
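            <p>A body-less error reply is then just the out header. Here
            is a minimal C sketch. The struct and function names are
            hypothetical, and the caller passes a positive errno value
            which gets negated for the kernel.</p>
            <pre><code><![CDATA[
```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* C mirror of the article's InHeader. */
struct request_header {
    uint32_t len;
    uint32_t opcode;
    uint64_t unique;
    uint64_t nodeid;
    uint32_t uid;
    uint32_t gid;
    uint32_t pid;
    uint32_t padding;
};

/* C mirror of the article's OutHeader. */
struct reply_header {
    uint32_t len;
    int32_t error;
    uint64_t unique;
};

/* Build an error reply with no body for req. err is a positive errno value
 * such as ENOSYS; it is stored negated. unique is copied so the kernel can
 * match the reply to its request. */
struct reply_header make_error_reply(const struct request_header *req, int err)
{
    struct reply_header out = {
        .len = sizeof(struct reply_header),
        .error = -err,
        .unique = req->unique,
    };

    return out;
}
```
]]></code></pre>
            <p>The resulting 16 bytes would be written to the
            <code>/dev/fuse</code> file descriptor in a single
            <code>write</code>.</p>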
            <p>libfuse implements a lot of basic features by default so
            that you can implement a small file system and have it work
            with standard tools like <code>ls</code>. However this isn’t
            necessary for the kernel which just passes on errors to
            userspace or just gives up trying to do some operation.</p>
            <h1 id="init">INIT</h1>
            <p>The first thing we actually receive when reading from
            <code>/dev/fuse</code> is an init message that negotiates
            which protocol version to use.</p>
            <div class="sourceCode" id="cb6"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb6-1"><a href="#cb6-1" tabindex="-1"></a><span class="at">const</span> InitIn <span class="op">=</span> <span class="at">extern</span> <span class="kw">struct</span> {</span>
<span id="cb6-2"><a href="#cb6-2" tabindex="-1"></a>    major<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-3"><a href="#cb6-3" tabindex="-1"></a>    minor<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-4"><a href="#cb6-4" tabindex="-1"></a>    max_readahead<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-5"><a href="#cb6-5" tabindex="-1"></a>    flags<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-6"><a href="#cb6-6" tabindex="-1"></a>    flags2<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-7"><a href="#cb6-7" tabindex="-1"></a>    unused<span class="op">:</span> [<span class="dv">11</span>]<span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-8"><a href="#cb6-8" tabindex="-1"></a>};</span>
<span id="cb6-9"><a href="#cb6-9" tabindex="-1"></a></span>
<span id="cb6-10"><a href="#cb6-10" tabindex="-1"></a><span class="at">const</span> InitOut <span class="op">=</span> <span class="at">extern</span> <span class="kw">struct</span> {</span>
<span id="cb6-11"><a href="#cb6-11" tabindex="-1"></a>    major<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-12"><a href="#cb6-12" tabindex="-1"></a>    minor<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-13"><a href="#cb6-13" tabindex="-1"></a>    max_readahead<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-14"><a href="#cb6-14" tabindex="-1"></a>    flags<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-15"><a href="#cb6-15" tabindex="-1"></a></span>
<span id="cb6-16"><a href="#cb6-16" tabindex="-1"></a>    max_background<span class="op">:</span> <span class="dt">u16</span><span class="op">,</span></span>
<span id="cb6-17"><a href="#cb6-17" tabindex="-1"></a>    congestion_threshold<span class="op">:</span> <span class="dt">u16</span><span class="op">,</span></span>
<span id="cb6-18"><a href="#cb6-18" tabindex="-1"></a>    max_write<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-19"><a href="#cb6-19" tabindex="-1"></a>    time_gran<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-20"><a href="#cb6-20" tabindex="-1"></a>    max_pages<span class="op">:</span> <span class="dt">u16</span><span class="op">,</span></span>
<span id="cb6-21"><a href="#cb6-21" tabindex="-1"></a>    map_alignment<span class="op">:</span> <span class="dt">u16</span><span class="op">,</span></span>
<span id="cb6-22"><a href="#cb6-22" tabindex="-1"></a>    flags2<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb6-23"><a href="#cb6-23" tabindex="-1"></a>    unused<span class="op">:</span> [<span class="dv">7</span>]<span class="dt">u32</span> <span class="op">=</span> <span class="op">.</span>{<span class="dv">0</span>} <span class="op">**</span> <span class="dv">7</span><span class="op">,</span></span>
<span id="cb6-24"><a href="#cb6-24" tabindex="-1"></a>};</span></code></pre></div>
            <p>For the most part I just reflect what is in the
            <code>InitIn</code> message to the <code>InitOut</code> or
            pick some minimal value the kernel will accept. In reality
            my FS does not support even a small fraction of what’s in
            <code>flags</code> and <code>flags2</code>.</p>
            <p>To get <code>InitIn</code> we just read the bytes from
            the file descriptor and cast them to the above structs.</p>
            <p>From <a
            href="https://github.com/richiejp/m/blob/5da25f865712c185a09e83fab836be3364fb4f8c/src/fuse.zig#L272">init</a>:</p>
            <div class="sourceCode" id="cb7"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb7-1"><a href="#cb7-1" tabindex="-1"></a>        <span class="at">var</span> buf<span class="op">:</span> []<span class="dt">u8</span> <span class="op">=</span> <span class="op">&amp;</span><span class="va">self</span><span class="op">.</span>read_buf;</span>
<span id="cb7-2"><a href="#cb7-2" tabindex="-1"></a>        <span class="at">const</span> len <span class="op">=</span> <span class="cf">try</span> os<span class="op">.</span>read(fd<span class="op">,</span> buf);</span>
<span id="cb7-3"><a href="#cb7-3" tabindex="-1"></a></span>
<span id="cb7-4"><a href="#cb7-4" tabindex="-1"></a>        assert(len <span class="op">&gt;=</span> <span class="bu">@sizeOf</span>(InHeader) <span class="op">+</span> <span class="bu">@sizeOf</span>(InitIn));</span>
<span id="cb7-5"><a href="#cb7-5" tabindex="-1"></a></span>
<span id="cb7-6"><a href="#cb7-6" tabindex="-1"></a>        <span class="at">const</span> hdr <span class="op">=</span> mem<span class="op">.</span>bytesAsValue(InHeader<span class="op">,</span> buf[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="bu">@sizeOf</span>(InHeader)]);</span>
<span id="cb7-7"><a href="#cb7-7" tabindex="-1"></a></span>
<span id="cb7-8"><a href="#cb7-8" tabindex="-1"></a>        std<span class="op">.</span>debug<span class="op">.</span>print(<span class="st">&quot;kernel: hdr: {}</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="op">.</span>{hdr<span class="op">.*</span>});</span>
<span id="cb7-9"><a href="#cb7-9" tabindex="-1"></a></span>
<span id="cb7-10"><a href="#cb7-10" tabindex="-1"></a>        <span class="at">const</span> opcode<span class="op">:</span> Opcode <span class="op">=</span> @enumFromInt(hdr<span class="op">.</span>opcode);</span>
<span id="cb7-11"><a href="#cb7-11" tabindex="-1"></a>        assert(opcode <span class="op">==</span> <span class="op">.</span>INIT);</span>
<span id="cb7-12"><a href="#cb7-12" tabindex="-1"></a>        assert(hdr<span class="op">.</span>len <span class="op">==</span> <span class="bu">@sizeOf</span>(InHeader) <span class="op">+</span> <span class="bu">@sizeOf</span>(InitIn));</span>
<span id="cb7-13"><a href="#cb7-13" tabindex="-1"></a></span>
<span id="cb7-14"><a href="#cb7-14" tabindex="-1"></a>        <span class="va">self</span><span class="op">.</span>read_len <span class="op">=</span> len <span class="op">-</span> hdr<span class="op">.</span>len;</span>
<span id="cb7-15"><a href="#cb7-15" tabindex="-1"></a></span>
<span id="cb7-16"><a href="#cb7-16" tabindex="-1"></a>        <span class="at">const</span> req <span class="op">=</span> mem<span class="op">.</span>bytesAsValue(InitIn<span class="op">,</span> (buf[<span class="bu">@sizeOf</span>(InHeader)<span class="op">..</span>][<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="bu">@sizeOf</span>(InitIn)]));</span>
<span id="cb7-17"><a href="#cb7-17" tabindex="-1"></a></span>
<span id="cb7-18"><a href="#cb7-18" tabindex="-1"></a>        std<span class="op">.</span>debug<span class="op">.</span>print(<span class="st">&quot;kernel: init: {}</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="op">.</span>{req<span class="op">.*</span>});</span>
<span id="cb7-19"><a href="#cb7-19" tabindex="-1"></a></span>
<span id="cb7-20"><a href="#cb7-20" tabindex="-1"></a>        assert(req<span class="op">.</span>major <span class="op">==</span> <span class="dv">7</span>);</span>
<span id="cb7-21"><a href="#cb7-21" tabindex="-1"></a>        assert(req<span class="op">.</span>minor <span class="op">==</span> <span class="dv">37</span>);</span></code></pre></div>
            <p>The Zig standard library has
            <code>mem.bytesAsValue</code> which just casts the buffer to
            a specified type. Apart from some safety checks (assuming
            they are enabled), it doesn’t appear to do anything at
            runtime.</p>
            <div class="sourceCode" id="cb8"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb8-1"><a href="#cb8-1" tabindex="-1"></a><span class="kw">fn</span> BytesAsValueReturnType(<span class="at">comptime</span> T<span class="op">:</span> <span class="dt">type</span><span class="op">,</span> <span class="at">comptime</span> B<span class="op">:</span> <span class="dt">type</span>) <span class="dt">type</span> {</span>
<span id="cb8-2"><a href="#cb8-2" tabindex="-1"></a>    <span class="at">const</span> size <span class="op">=</span> <span class="bu">@as</span>(<span class="dt">usize</span><span class="op">,</span> <span class="bu">@sizeOf</span>(T));</span>
<span id="cb8-3"><a href="#cb8-3" tabindex="-1"></a></span>
<span id="cb8-4"><a href="#cb8-4" tabindex="-1"></a>    <span class="cf">if</span> (<span class="at">comptime</span> <span class="op">!</span>trait<span class="op">.</span>is(<span class="op">.</span>Pointer)(B) <span class="kw">or</span></span>
<span id="cb8-5"><a href="#cb8-5" tabindex="-1"></a>        (meta<span class="op">.</span>Child(B) <span class="op">!=</span> [size]<span class="dt">u8</span> <span class="kw">and</span> meta<span class="op">.</span>Child(B) <span class="op">!=</span> [size<span class="op">:</span><span class="dv">0</span>]<span class="dt">u8</span>))</span>
<span id="cb8-6"><a href="#cb8-6" tabindex="-1"></a>    {</span>
<span id="cb8-7"><a href="#cb8-7" tabindex="-1"></a>        <span class="bu">@compileError</span>(std<span class="op">.</span>fmt<span class="op">.</span>comptimePrint(<span class="st">&quot;expected *[{}]u8, passed &quot;</span> <span class="op">++</span> <span class="bu">@typeName</span>(B)<span class="op">,</span> <span class="op">.</span>{size}));</span>
<span id="cb8-8"><a href="#cb8-8" tabindex="-1"></a>    }</span>
<span id="cb8-9"><a href="#cb8-9" tabindex="-1"></a></span>
<span id="cb8-10"><a href="#cb8-10" tabindex="-1"></a>    <span class="cf">return</span> CopyPtrAttrs(B<span class="op">,</span> <span class="op">.</span>One<span class="op">,</span> T);</span>
<span id="cb8-11"><a href="#cb8-11" tabindex="-1"></a>}</span>
<span id="cb8-12"><a href="#cb8-12" tabindex="-1"></a></span>
<span id="cb8-13"><a href="#cb8-13" tabindex="-1"></a><span class="co">/// Given a pointer to an array of bytes, returns a pointer to a value of the specified type</span></span>
<span id="cb8-14"><a href="#cb8-14" tabindex="-1"></a><span class="co">/// backed by those bytes, preserving pointer attributes.</span></span>
<span id="cb8-15"><a href="#cb8-15" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> bytesAsValue(<span class="at">comptime</span> T<span class="op">:</span> <span class="dt">type</span><span class="op">,</span> bytes<span class="op">:</span> <span class="kw">anytype</span>) BytesAsValueReturnType(T<span class="op">,</span> <span class="bu">@TypeOf</span>(bytes)) {</span>
<span id="cb8-16"><a href="#cb8-16" tabindex="-1"></a>    <span class="cf">return</span> <span class="bu">@as</span>(BytesAsValueReturnType(T<span class="op">,</span> <span class="bu">@TypeOf</span>(bytes))<span class="op">,</span> <span class="bu">@ptrCast</span>(bytes));</span>
<span id="cb8-17"><a href="#cb8-17" tabindex="-1"></a>}</span></code></pre></div>
            <p>My understanding is that
            <code>BytesAsValueReturnType</code> inspects the argument
            types at compile time. If <code>B</code> (the type of
            <code>bytes</code>) is a pointer to an array of
            <code>u8</code> with the same size as <code>T</code>, then it
            copies some of the “pointer attributes” of <code>B</code> to
            a new pointer type. Otherwise it raises a compile error.</p>
            <p>Pointer attributes here include things such as
            <code>const</code>, <code>volatile</code>,
            <code>address_space</code> and <code>alignment</code>. The
            size and underlying type are not copied though; they come
            from the type we are trying to cast to (<code>T</code>).</p>
            <p>Apart from the fact that this is implemented in the Zig
            library and not the compiler, it’s interesting because the
            <code>alignment</code> ends up being 1 (the alignment of
            <code>u8</code>), which is copied from the slice. To my
            knowledge an alignment of 8 (<code>u64</code>) should be
            ideal and is what the protocol structures are aligned
            to.</p>
            <p>I’m not sure how to get the alignment to be 8. There’s
            probably some assertion or cast that can be done. It’s not
            important for getting the code to work, but it’s interesting
            from a performance point of view.</p>
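            <p>One way to sidestep the alignment question altogether is
            to copy the bytes into a properly aligned local value
            instead of casting a pointer. The following is only a
            sketch: the <code>InHeader</code> layout is assumed from the
            fields in the debug output, and <code>readHeader</code> is a
            hypothetical helper that is not part of the project.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// Assumed layout, based on the fields shown in the debug output.
const InHeader = extern struct {
    len: u32,
    opcode: u32,
    unique: u64,
    nodeid: u64,
    uid: u32,
    gid: u32,
    pid: u32,
    padding: u32,
};

// Hypothetical helper: copy the header out of a byte slice that may
// have any alignment. The local `hdr` is always correctly aligned.
fn readHeader(bytes: []const u8) InHeader {
    std.debug.assert(bytes.len &gt;= @sizeOf(InHeader));
    var hdr: InHeader = undefined;
    @memcpy(std.mem.asBytes(&amp;hdr), bytes[0..@sizeOf(InHeader)]);
    return hdr;
}

pub fn main() void {
    var raw = [_]u8{0} ** 64;
    const len: u32 = 104;
    // Start the header at offset 1 so that it is deliberately misaligned.
    @memcpy(raw[1..5], std.mem.asBytes(&amp;len));
    const hdr = readHeader(raw[1..]);
    std.debug.print(&quot;len = {}\n&quot;, .{hdr.len});
}</code></pre></div>
            <p>The copy costs a few instructions per message, so it
            trades a little performance for not having to think about
            where the message starts in the buffer.</p>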
            <div class="message is-info">
            <div class="message-body">
            <p>Update! The alignment of the read buffer can be set as
            follows</p>
            <div class="sourceCode" id="cb9"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb9-1"><a href="#cb9-1" tabindex="-1"></a>read_buf<span class="op">:</span> [MIN_READ_BUFFER <span class="op">*</span> <span class="dv">2</span>]<span class="dt">u8</span> <span class="kw">align</span>(<span class="bu">@alignOf</span>(InHeader)) <span class="op">=</span> <span class="cn">undefined</span><span class="op">,</span></span></code></pre></div>
            <p>Then the beginning of the buffer is aligned so that we
            can directly cast it to <code>InHeader</code>.</p>
            </div>
            </div>
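            <p>As a rough illustration of why the aligned buffer helps,
            the sketch below casts the start of such a buffer with a
            plain <code>@ptrCast</code>. The <code>InHeader</code>
            layout is assumed from the debug output and
            <code>castHeader</code> is a hypothetical helper. Its
            parameter type demands the alignment, so passing an
            unaligned buffer should fail at compile time.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// Assumed layout, based on the fields shown in the debug output.
const InHeader = extern struct {
    len: u32,
    opcode: u32,
    unique: u64,
    nodeid: u64,
    uid: u32,
    gid: u32,
    pid: u32,
    padding: u32,
};

// Hypothetical helper: the parameter type requires header alignment,
// so the pointer cast needs no runtime alignment check.
fn castHeader(
    bytes: *align(@alignOf(InHeader)) const [@sizeOf(InHeader)]u8,
) *const InHeader {
    return @ptrCast(bytes);
}

pub fn main() void {
    var buf: [128]u8 align(@alignOf(InHeader)) = [_]u8{0} ** 128;
    const unique: u64 = 2;
    // `unique` sits after the two u32 fields, at byte offset 8.
    @memcpy(buf[8..16], std.mem.asBytes(&amp;unique));
    // Slicing from offset 0 with comptime-known bounds keeps the
    // alignment of the buffer in the resulting pointer type.
    const hdr = castHeader(buf[0..@sizeOf(InHeader)]);
    std.debug.print(&quot;unique = {}\n&quot;, .{hdr.unique});
}</code></pre></div>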
            <p>After casting into the <code>InitIn</code> struct we can
            print it using the Zig standard library. The result is quite
            ugly, but requires minimal effort and works for
            debugging.</p>
            <pre><code>kernel: hdr: fuse.InHeader{ .len = 104, .opcode = 26, .unique = 2, .nodeid = 0, .uid = 0, .gid = 0, .pid = 0, .padding = 0 }
kernel: init: fuse.InitIn{ .major = 7, .minor = 37, .max_readahead = 131072, .flags = 1946157051, .flags2 = 1, .unused = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }</code></pre>
            <p>Once we have checked some assumptions about what we
            should receive and have printed it, we write a response.</p>
            <div class="sourceCode" id="cb11"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb11-1"><a href="#cb11-1" tabindex="-1"></a>        <span class="at">const</span> res <span class="op">=</span> InitOutMsg{</span>
<span id="cb11-2"><a href="#cb11-2" tabindex="-1"></a>            <span class="op">.</span>head <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb11-3"><a href="#cb11-3" tabindex="-1"></a>                <span class="op">.</span>len <span class="op">=</span> <span class="bu">@sizeOf</span>(InitOutMsg)<span class="op">,</span></span>
<span id="cb11-4"><a href="#cb11-4" tabindex="-1"></a>                <span class="op">.</span>err <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb11-5"><a href="#cb11-5" tabindex="-1"></a>                <span class="op">.</span>unique <span class="op">=</span> hdr<span class="op">.</span>unique<span class="op">,</span></span>
<span id="cb11-6"><a href="#cb11-6" tabindex="-1"></a>            }<span class="op">,</span></span>
<span id="cb11-7"><a href="#cb11-7" tabindex="-1"></a>            <span class="op">.</span>body <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb11-8"><a href="#cb11-8" tabindex="-1"></a>                <span class="op">.</span>major <span class="op">=</span> <span class="dv">7</span><span class="op">,</span></span>
<span id="cb11-9"><a href="#cb11-9" tabindex="-1"></a>                <span class="op">.</span>minor <span class="op">=</span> <span class="dv">37</span><span class="op">,</span></span>
<span id="cb11-10"><a href="#cb11-10" tabindex="-1"></a>                <span class="op">.</span>max_readahead <span class="op">=</span> req<span class="op">.</span>max_readahead<span class="op">,</span></span>
<span id="cb11-11"><a href="#cb11-11" tabindex="-1"></a>                <span class="op">.</span>flags <span class="op">=</span> req<span class="op">.</span>flags<span class="op">,</span></span>
<span id="cb11-12"><a href="#cb11-12" tabindex="-1"></a>                <span class="op">.</span>flags2 <span class="op">=</span> req<span class="op">.</span>flags2<span class="op">,</span></span>
<span id="cb11-13"><a href="#cb11-13" tabindex="-1"></a></span>
<span id="cb11-14"><a href="#cb11-14" tabindex="-1"></a>                <span class="op">.</span>max_background <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb11-15"><a href="#cb11-15" tabindex="-1"></a>                <span class="op">.</span>congestion_threshold <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb11-16"><a href="#cb11-16" tabindex="-1"></a>                <span class="op">.</span>max_write <span class="op">=</span> <span class="dv">4096</span><span class="op">,</span></span>
<span id="cb11-17"><a href="#cb11-17" tabindex="-1"></a>                <span class="op">.</span>time_gran <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb11-18"><a href="#cb11-18" tabindex="-1"></a>                <span class="op">.</span>max_pages <span class="op">=</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb11-19"><a href="#cb11-19" tabindex="-1"></a>                <span class="op">.</span>map_alignment <span class="op">=</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb11-20"><a href="#cb11-20" tabindex="-1"></a>            }<span class="op">,</span></span>
<span id="cb11-21"><a href="#cb11-21" tabindex="-1"></a>        };</span>
<span id="cb11-22"><a href="#cb11-22" tabindex="-1"></a></span>
<span id="cb11-23"><a href="#cb11-23" tabindex="-1"></a>        std<span class="op">.</span>debug<span class="op">.</span>print(<span class="st">&quot;fuse: init: {}</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="op">.</span>{res});</span>
<span id="cb11-24"><a href="#cb11-24" tabindex="-1"></a>        assert(<span class="cf">try</span> os<span class="op">.</span>write(fd<span class="op">,</span> mem<span class="op">.</span>asBytes(<span class="op">&amp;</span>res)) <span class="op">==</span> <span class="bu">@sizeOf</span>(<span class="bu">@TypeOf</span>(res)));</span>
<span id="cb11-25"><a href="#cb11-25" tabindex="-1"></a></span>
<span id="cb11-26"><a href="#cb11-26" tabindex="-1"></a>        mem<span class="op">.</span>copyForwards(<span class="dt">u8</span><span class="op">,</span> buf<span class="op">,</span> buf[<span class="bu">@sizeOf</span>(InHeader) <span class="op">+</span> <span class="bu">@sizeOf</span>(InitIn) <span class="op">..</span>]);</span></code></pre></div>
            <p>This casts our response to a <code>u8</code> slice,
            writes it to <code>/dev/fuse</code> and shifts the read
            buffer to the left. Probably the read buffer only has the
            init message in it, but we just treat it the same as other
            messages in this regard.</p>
            <p>When we read we assume that multiple messages will be
            queued and that we could read a chunk of the next
            message(s).</p>
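            <p>The shifting step can be sketched in isolation. This is
            only an illustration under the assumption above that several
            messages may be buffered at once; <code>shiftLeft</code> and
            its parameters are hypothetical names, not the project’s
            code.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// Hypothetical helper: drop the first `msg_len` bytes out of the `used`
// bytes held in `buf` and move the remainder to the front.
// Returns how many bytes are still buffered.
fn shiftLeft(buf: []u8, used: usize, msg_len: usize) usize {
    std.debug.assert(msg_len &lt;= used and used &lt;= buf.len);
    const rest = used - msg_len;
    // The regions may overlap. copyForwards copies from the lowest
    // index up, which is safe because the destination starts first.
    std.mem.copyForwards(u8, buf[0..rest], buf[msg_len..used]);
    return rest;
}

pub fn main() void {
    // Pretend &quot;AAA&quot; is a handled message and &quot;BB&quot; the start of the next.
    var buf = [_]u8{ &#39;A&#39;, &#39;A&#39;, &#39;A&#39;, &#39;B&#39;, &#39;B&#39;, 0, 0, 0 };
    const left = shiftLeft(&amp;buf, 5, 3);
    std.debug.print(&quot;{s}\n&quot;, .{buf[0..left]});
}</code></pre></div>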
            <h1 id="getattr">GETATTR</h1>
            <p>The first message I expected to receive was
            <code>LOOKUP</code>, because we call <code>setxattr</code>
            on the file path <code>/tmp/fuse-test/foo</code>. The
            <code>foo</code> part is inside our file system’s mount and
            it needs to be resolved to an <code>inode</code> number.
            It’s the job of the file system to take components of a
            path and resolve them to a number.</p>
            <p>Most internal operations in the kernel work on inodes or
            some similar thing, not paths.</p>
            <p>According to the kernel our file system already has an
            inode in it with <code>nodeid = 1</code>. Maybe this is the
            mount point itself or some other file system thing. Whatever
            it is, the kernel calls <code>inode_permission</code>, then
            <code>fuse_permission</code> and eventually
            <code>fuse_do_getattr</code>. So we can guess it wants the
            inode’s mode to see if it can be accessed.</p>
            <div class="sourceCode" id="cb12"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb12-1"><a href="#cb12-1" tabindex="-1"></a>    <span class="cf">switch</span> (opcode) {</span>
<span id="cb12-2"><a href="#cb12-2" tabindex="-1"></a>        <span class="op">.</span>GETATTR <span class="op">=&gt;</span> {</span>
<span id="cb12-3"><a href="#cb12-3" tabindex="-1"></a>            <span class="at">const</span> getattr_in <span class="op">=</span></span>
<span id="cb12-4"><a href="#cb12-4" tabindex="-1"></a>                mem<span class="op">.</span>bytesAsValue(GetattrIn<span class="op">,</span> msg[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="bu">@sizeOf</span>(GetattrIn)]);</span>
<span id="cb12-5"><a href="#cb12-5" tabindex="-1"></a></span>
<span id="cb12-6"><a href="#cb12-6" tabindex="-1"></a>            std<span class="op">.</span>debug<span class="op">.</span>print(<span class="st">&quot;kernel: getattr: {}</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="op">.</span>{getattr_in});</span>
<span id="cb12-7"><a href="#cb12-7" tabindex="-1"></a></span>
<span id="cb12-8"><a href="#cb12-8" tabindex="-1"></a>            <span class="at">const</span> time<span class="op">:</span> <span class="dt">u64</span> <span class="op">=</span> <span class="bu">@intCast</span>(<span class="bu">@max</span>(<span class="dv">0</span><span class="op">,</span> std<span class="op">.</span>time<span class="op">.</span>timestamp()));</span>
<span id="cb12-9"><a href="#cb12-9" tabindex="-1"></a></span>
<span id="cb12-10"><a href="#cb12-10" tabindex="-1"></a>            res<span class="op">.</span>out<span class="op">.</span>attr <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb12-11"><a href="#cb12-11" tabindex="-1"></a>                <span class="op">.</span>valid <span class="op">=</span> time <span class="op">+</span> <span class="dv">300</span><span class="op">,</span></span>
<span id="cb12-12"><a href="#cb12-12" tabindex="-1"></a>                <span class="op">.</span>valid_nsec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb12-13"><a href="#cb12-13" tabindex="-1"></a>                <span class="op">.</span>dummy <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb12-14"><a href="#cb12-14" tabindex="-1"></a>                <span class="op">.</span>attr <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb12-15"><a href="#cb12-15" tabindex="-1"></a>                    <span class="op">.</span>ino <span class="op">=</span> getattr_in<span class="op">.</span>nodeid<span class="op">,</span></span>
<span id="cb12-16"><a href="#cb12-16" tabindex="-1"></a>                    <span class="op">.</span>blocks <span class="op">=</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb12-17"><a href="#cb12-17" tabindex="-1"></a>                    <span class="op">.</span>size <span class="op">=</span> <span class="dv">42</span><span class="op">,</span></span>
<span id="cb12-18"><a href="#cb12-18" tabindex="-1"></a>                    <span class="op">.</span>atime <span class="op">=</span> time<span class="op">,</span></span>
<span id="cb12-19"><a href="#cb12-19" tabindex="-1"></a>                    <span class="op">.</span>mtime <span class="op">=</span> time<span class="op">,</span></span>
<span id="cb12-20"><a href="#cb12-20" tabindex="-1"></a>                    <span class="op">.</span>ctime <span class="op">=</span> time<span class="op">,</span></span>
<span id="cb12-21"><a href="#cb12-21" tabindex="-1"></a>                    <span class="op">.</span>atimensec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb12-22"><a href="#cb12-22" tabindex="-1"></a>                    <span class="op">.</span>mtimensec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb12-23"><a href="#cb12-23" tabindex="-1"></a>                    <span class="op">.</span>ctimensec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb12-24"><a href="#cb12-24" tabindex="-1"></a>                    <span class="op">.</span>mode <span class="op">=</span> l<span class="op">.</span>S<span class="op">.</span>IFDIR <span class="op">|</span> <span class="dv">0</span><span class="er">o666</span><span class="op">,</span></span>
<span id="cb12-25"><a href="#cb12-25" tabindex="-1"></a>                    <span class="op">.</span>nlink <span class="op">=</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb12-26"><a href="#cb12-26" tabindex="-1"></a>                    <span class="op">.</span>uid <span class="op">=</span> l<span class="op">.</span>getuid()<span class="op">,</span></span>
<span id="cb12-27"><a href="#cb12-27" tabindex="-1"></a>                    <span class="op">.</span>gid <span class="op">=</span> l<span class="op">.</span>getgid()<span class="op">,</span></span>
<span id="cb12-28"><a href="#cb12-28" tabindex="-1"></a>                    <span class="op">.</span>rdev <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb12-29"><a href="#cb12-29" tabindex="-1"></a>                    <span class="op">.</span>blksize <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb12-30"><a href="#cb12-30" tabindex="-1"></a>                    <span class="op">.</span>flags <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb12-31"><a href="#cb12-31" tabindex="-1"></a>                }<span class="op">,</span></span>
<span id="cb12-32"><a href="#cb12-32" tabindex="-1"></a>            };</span>
<span id="cb12-33"><a href="#cb12-33" tabindex="-1"></a></span>
<span id="cb12-34"><a href="#cb12-34" tabindex="-1"></a>            res<span class="op">.</span>hdr<span class="op">.</span>len <span class="op">+=</span> <span class="bu">@sizeOf</span>(AttrOut);</span>
<span id="cb12-35"><a href="#cb12-35" tabindex="-1"></a>        }<span class="op">,</span></span></code></pre></div>
            <p>I’ve skipped <a
            href="https://github.com/richiejp/m/blob/5da25f865712c185a09e83fab836be3364fb4f8c/src/fuse.zig#L330">reading
            the headers and setting up the response</a>. Above you can
            just see the response we send.</p>
            <p>From reading the kernel code I deduced that
            <code>valid</code> is the time when the attributes cease to
            be valid. If that is not the case then it still accepted
            this value. I suppose in the worst case it’ll remain valid
            for a few decades.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Update! The valid time provided is in fact the amount of
            time the attributes are valid for, not an absolute expiry
            time. This is seen in
            <code>fs/fuse/dir.c:time_to_jiffies()</code>.</p>
            </div>
            </div>
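            <p>Given that the field is a duration, a small helper could
            express the lifetime directly instead of adding it to a
            timestamp. This is a sketch only: <code>Validity</code> and
            <code>validFor</code> are hypothetical, and the two fields
            stand in for <code>valid</code> and <code>valid_nsec</code>
            in the response above.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// Assumed subset of the reply: just the two validity fields.
const Validity = struct {
    valid: u64,
    valid_nsec: u32,
};

// Hypothetical helper: split a lifetime given in milliseconds into
// whole seconds plus the remaining nanoseconds.
fn validFor(ms: u64) Validity {
    return Validity{
        .valid = ms / std.time.ms_per_s,
        .valid_nsec = @intCast((ms % std.time.ms_per_s) * std.time.ns_per_ms),
    };
}

pub fn main() void {
    // Five minutes, matching the 300 used in the article&#39;s code.
    const v = validFor(300 * std.time.ms_per_s);
    std.debug.print(&quot;valid = {}s + {}ns\n&quot;, .{ v.valid, v.valid_nsec });
}</code></pre></div>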
            <p>Next we have things like <code>blocks</code>,
            <code>size</code>, <code>blksize</code>, <code>rdev</code>
            and various timestamps. These almost certainly don’t have
            any effect on what the kernel is currently trying to do, but
            it’s probably best to pick values in a sensible
            range.</p>
            <p>The <code>nlink</code> does seem to be important. Setting
            it to zero would seem to indicate the item has been deleted,
            although whether that would affect the current operation, I
            don’t know.</p>
            <p>The important field is <code>mode</code>, which lets us
            set the file type and permissions. I decided that, whatever
            is being accessed, it should be a directory and all users
            should have read-write permissions. The kernel
            was happy with this.</p>
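            <p>To make the layout of <code>mode</code> a little more
            concrete, here is a sketch that packs and unpacks the file
            type and permission bits. The constants are the usual Linux
            values, written out locally so the example does not depend
            on the <code>l.S</code> namespace used above. The helper
            names are hypothetical.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// File type bits as defined on Linux.
const S_IFMT: u32 = 0o170000; // mask covering the file type
const S_IFDIR: u32 = 0o040000; // directory
const S_IFREG: u32 = 0o100000; // regular file

// Hypothetical helpers for picking a mode apart.
fn isDir(mode: u32) bool {
    return (mode &amp; S_IFMT) == S_IFDIR;
}

fn isRegular(mode: u32) bool {
    return (mode &amp; S_IFMT) == S_IFREG;
}

fn permissions(mode: u32) u32 {
    return mode &amp; 0o7777;
}

pub fn main() void {
    // Same combination as the GETATTR response: directory, rw for all.
    const mode = S_IFDIR | 0o666;
    std.debug.print(&quot;dir: {}, perms: {o}\n&quot;, .{ isDir(mode), permissions(mode) });
}</code></pre></div>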
            <h1 id="lookup">LOOKUP</h1>
            <p>Next the message I was expecting arrived.</p>
            <div class="sourceCode" id="cb13"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb13-1"><a href="#cb13-1" tabindex="-1"></a>        <span class="op">.</span>LOOKUP <span class="op">=&gt;</span> blk<span class="op">:</span> {</span>
<span id="cb13-2"><a href="#cb13-2" tabindex="-1"></a>            <span class="at">const</span> Static <span class="op">=</span> <span class="kw">struct</span> {</span>
<span id="cb13-3"><a href="#cb13-3" tabindex="-1"></a>                <span class="at">var</span> generation<span class="op">:</span> <span class="dt">u64</span> <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb13-4"><a href="#cb13-4" tabindex="-1"></a>            };</span>
<span id="cb13-5"><a href="#cb13-5" tabindex="-1"></a>            <span class="at">const</span> lookup_in<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span> <span class="op">=</span> msg[<span class="dv">0</span><span class="er">.</span><span class="op">.</span>msg_len];</span>
<span id="cb13-6"><a href="#cb13-6" tabindex="-1"></a></span>
<span id="cb13-7"><a href="#cb13-7" tabindex="-1"></a>            std<span class="op">.</span>debug<span class="op">.</span>print(<span class="st">&quot;kernel: lookup: {s}</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="op">.</span>{lookup_in});</span>
<span id="cb13-8"><a href="#cb13-8" tabindex="-1"></a></span>
<span id="cb13-9"><a href="#cb13-9" tabindex="-1"></a>            <span class="cf">if</span> (<span class="op">!</span>mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> <span class="st">&quot;foo&quot;</span><span class="op">,</span> lookup_in[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="dv">3</span>])) {</span>
<span id="cb13-10"><a href="#cb13-10" tabindex="-1"></a>                res<span class="op">.</span>hdr<span class="op">.</span>err <span class="op">=</span> <span class="op">-</span><span class="bu">@as</span>(<span class="dt">i32</span><span class="op">,</span> @intFromEnum(E<span class="op">.</span>NOENT));</span>
<span id="cb13-11"><a href="#cb13-11" tabindex="-1"></a>                <span class="cf">break</span> <span class="op">:</span>blk;</span>
<span id="cb13-12"><a href="#cb13-12" tabindex="-1"></a>            }</span>
<span id="cb13-13"><a href="#cb13-13" tabindex="-1"></a></span>
<span id="cb13-14"><a href="#cb13-14" tabindex="-1"></a>            <span class="at">const</span> time<span class="op">:</span> <span class="dt">u64</span> <span class="op">=</span> <span class="bu">@intCast</span>(<span class="bu">@max</span>(<span class="dv">0</span><span class="op">,</span> std<span class="op">.</span>time<span class="op">.</span>timestamp()));</span>
<span id="cb13-15"><a href="#cb13-15" tabindex="-1"></a></span>
<span id="cb13-16"><a href="#cb13-16" tabindex="-1"></a>            Static<span class="op">.</span>generation <span class="op">+=</span> <span class="dv">1</span>;</span>
<span id="cb13-17"><a href="#cb13-17" tabindex="-1"></a></span>
<span id="cb13-18"><a href="#cb13-18" tabindex="-1"></a>            res<span class="op">.</span>out<span class="op">.</span>entry <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb13-19"><a href="#cb13-19" tabindex="-1"></a>                <span class="op">.</span>nodeid <span class="op">=</span> <span class="dv">0</span><span class="er">xf00</span><span class="op">,</span></span>
<span id="cb13-20"><a href="#cb13-20" tabindex="-1"></a>                <span class="op">.</span>generation <span class="op">=</span> Static<span class="op">.</span>generation<span class="op">,</span></span>
<span id="cb13-21"><a href="#cb13-21" tabindex="-1"></a>                <span class="op">.</span>entry_valid <span class="op">=</span> time <span class="op">+</span> <span class="dv">300</span><span class="op">,</span></span>
<span id="cb13-22"><a href="#cb13-22" tabindex="-1"></a>                <span class="op">.</span>entry_valid_nsec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-23"><a href="#cb13-23" tabindex="-1"></a>                <span class="op">.</span>attr_valid <span class="op">=</span> time <span class="op">+</span> <span class="dv">300</span><span class="op">,</span></span>
<span id="cb13-24"><a href="#cb13-24" tabindex="-1"></a>                <span class="op">.</span>attr_valid_nsec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-25"><a href="#cb13-25" tabindex="-1"></a>                <span class="op">.</span>attr <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb13-26"><a href="#cb13-26" tabindex="-1"></a>                    <span class="op">.</span>ino <span class="op">=</span> <span class="dv">0</span><span class="er">xf00</span><span class="op">,</span></span>
<span id="cb13-27"><a href="#cb13-27" tabindex="-1"></a>                    <span class="op">.</span>blocks <span class="op">=</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb13-28"><a href="#cb13-28" tabindex="-1"></a>                    <span class="op">.</span>size <span class="op">=</span> <span class="dv">420</span><span class="op">,</span></span>
<span id="cb13-29"><a href="#cb13-29" tabindex="-1"></a>                    <span class="op">.</span>atime <span class="op">=</span> time<span class="op">,</span></span>
<span id="cb13-30"><a href="#cb13-30" tabindex="-1"></a>                    <span class="op">.</span>mtime <span class="op">=</span> time<span class="op">,</span></span>
<span id="cb13-31"><a href="#cb13-31" tabindex="-1"></a>                    <span class="op">.</span>ctime <span class="op">=</span> time<span class="op">,</span></span>
<span id="cb13-32"><a href="#cb13-32" tabindex="-1"></a>                    <span class="op">.</span>atimensec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-33"><a href="#cb13-33" tabindex="-1"></a>                    <span class="op">.</span>mtimensec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-34"><a href="#cb13-34" tabindex="-1"></a>                    <span class="op">.</span>ctimensec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-35"><a href="#cb13-35" tabindex="-1"></a>                    <span class="op">.</span>mode <span class="op">=</span> l<span class="op">.</span>S<span class="op">.</span>IFREG <span class="op">|</span> <span class="dv">0</span><span class="er">o666</span><span class="op">,</span></span>
<span id="cb13-36"><a href="#cb13-36" tabindex="-1"></a>                    <span class="op">.</span>nlink <span class="op">=</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb13-37"><a href="#cb13-37" tabindex="-1"></a>                    <span class="op">.</span>uid <span class="op">=</span> l<span class="op">.</span>getuid()<span class="op">,</span></span>
<span id="cb13-38"><a href="#cb13-38" tabindex="-1"></a>                    <span class="op">.</span>gid <span class="op">=</span> l<span class="op">.</span>getgid()<span class="op">,</span></span>
<span id="cb13-39"><a href="#cb13-39" tabindex="-1"></a>                    <span class="op">.</span>rdev <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-40"><a href="#cb13-40" tabindex="-1"></a>                    <span class="op">.</span>blksize <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-41"><a href="#cb13-41" tabindex="-1"></a>                    <span class="op">.</span>flags <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb13-42"><a href="#cb13-42" tabindex="-1"></a>                }<span class="op">,</span></span>
<span id="cb13-43"><a href="#cb13-43" tabindex="-1"></a>            };</span>
<span id="cb13-44"><a href="#cb13-44" tabindex="-1"></a></span>
<span id="cb13-45"><a href="#cb13-45" tabindex="-1"></a>            res<span class="op">.</span>hdr<span class="op">.</span>len <span class="op">+=</span> <span class="bu">@sizeOf</span>(EntryOut);</span>
<span id="cb13-46"><a href="#cb13-46" tabindex="-1"></a>        }<span class="op">,</span></span></code></pre></div>
            <p>The <code>Static</code> struct containing
            <code>generation</code> is how static variables are declared
            in Zig. We use this to set the inode generation. For now it
            is unlikely we’ll get more than one lookup request, but if
            we do we’ll increase the generation to ensure it’s unique
            for this session.</p>
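            <p>Stripped of the FUSE details, the static variable pattern
            looks like the sketch below. <code>nextGeneration</code> is
            a hypothetical function used only for illustration. The
            variable lives in the struct’s namespace, so it keeps its
            value between calls.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// Hypothetical helper: returns 1, 2, 3, ... on successive calls.
fn nextGeneration() u64 {
    // Declarations inside a struct are container-level variables, so
    // `generation` has static storage even though it is scoped here.
    const Static = struct {
        var generation: u64 = 0;
    };
    Static.generation += 1;
    return Static.generation;
}

pub fn main() void {
    const first = nextGeneration();
    const second = nextGeneration();
    std.debug.print(&quot;{} then {}\n&quot;, .{ first, second });
}</code></pre></div>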
            <p>Another interesting Zig feature is the
            <code>blk: {...}</code> label. Here <code>blk</code> is an
            arbitrary name, but I went with convention. In Zig you can
            add a label to a block (i.e. <code>{...}</code>) and then use
            the <code>break</code> keyword to return from that block,
            optionally with a value, although we don’t use that here.
            This eliminates one use of <code>goto</code> or very small
            functions.</p>
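            <p>For completeness, here is a sketch of the variant that
            does break with a value. <code>opcodeName</code> is a
            hypothetical function. The numbers are the FUSE opcode
            values for <code>LOOKUP</code>, <code>GETATTR</code> and
            <code>INIT</code> (the last one matches the
            <code>.opcode = 26</code> printed earlier).</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import(&quot;std&quot;);

// Hypothetical helper: map a few opcode numbers to names.
fn opcodeName(opcode: u32) []const u8 {
    // The labeled block is an expression. Each `break :blk value`
    // leaves the block early and supplies its result.
    const name: []const u8 = blk: {
        if (opcode == 1) break :blk &quot;LOOKUP&quot;;
        if (opcode == 3) break :blk &quot;GETATTR&quot;;
        if (opcode == 26) break :blk &quot;INIT&quot;;
        break :blk &quot;other&quot;;
    };
    return name;
}

pub fn main() void {
    std.debug.print(&quot;{s}\n&quot;, .{opcodeName(26)});
}</code></pre></div>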
            <p>We check that the lookup path is “foo” and return from
            the block if it is not. This isn’t strictly necessary; it’s
            just checking my assumptions about what is happening.</p>
            <p>Then the result again has an attribute embedded within it
            (<code>attr</code>). So the response to <code>LOOKUP</code>
            is a superset of <code>GETATTR</code>.</p>
            <p>The big difference is that we get to set the inode number
            and how long the mapping lasts. The mapping being the path
            to inode number relationship.</p>
            <h1 id="setxattr">SETXATTR</h1>
            <p>Finally we get to the operation that matters to us. This
            allows us to implement the <code>setxattr</code> system call
            for our file system. This system call is used to set
            extended attributes.</p>
            <p>I imagine many readers have never heard of extended
            attributes. On file systems that support them, they let you
            set name-value pairs on files.</p>
            <p>The name is a null terminated string and the value is a
            binary chunk. This allows files to be tagged with whatever
            meta-data you like.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>I probably also need to implement <code>OPEN</code> and
            <code>READ</code> to enable <code>mmap</code> on my file
            system. However for now I’m going with the idea that if I
            fully control <code>SETXATTR</code> or <code>GETXATTR</code>
            then I don’t need to inject delays via
            <code>mmap</code>.</p>
            <p>Also extended attributes in the “user” namespace are not
            supported on TMPFS. If <code>/tmp</code> or
            <code>/run</code> are mounted on TMPFS and we can’t write to
            any other FS, then the only option left may be to mount a
            FUSE FS that supports <code>setxattr</code>.</p>
            </div>
            </div>
            <p>For instance the QEMU 9p remote filesystem implementation
            uses this to map user permissions within a VM to user
            permissions on the host. I’m not entirely sure what that
            entails, but I use 9p to share folders with VMs with this
            mapping feature enabled.</p>
            <p>You can imagine that if you want to have multiple sets of
            permissions on a file for different systems, then you can
            use extended attributes to store those permissions.</p>
            <p>Extended attribute names start with a namespace. We’re
            interested in the “user” namespace which any user can write
            to. There are a number of other namespaces, such as the
            security namespace which SELinux uses.</p>
            <p>If you try writing to these other namespaces, you will
            probably get EPERM, or some other error if you have
            permission but the attribute is in the wrong format.</p>
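            <p>To illustrate the naming scheme, the namespace is
            everything before the first dot in the attribute name. The
            helper below is a hypothetical sketch for pulling it out; it
            is not something the daemon needs.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import("std");

// Hypothetical helper: return the namespace part of an extended attribute
// name, i.e. everything before the first dot, or null if there is no dot.
fn xattrNamespace(name: []const u8) ?[]const u8 {
    const dot = std.mem.indexOfScalar(u8, name, '.') orelse return null;
    return name[0..dot];
}

test "xattr namespace" {
    try std.testing.expectEqualStrings("user", xattrNamespace("user.bar").?);
    try std.testing.expectEqualStrings("security", xattrNamespace("security.selinux").?);
    try std.testing.expect(xattrNamespace("bar") == null);
}</code></pre></div>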
            <p>First of all let’s look at how we use the
            <code>setxattr</code> system call. At the time of writing
            Zig has no wrapper for <code>setxattr</code> (I should add
            it), so we make the system call directly.</p>
            <div class="sourceCode" id="cb14"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb14-1"><a href="#cb14-1" tabindex="-1"></a><span class="at">const</span> XATTR_NAME<span class="op">:</span> [<span class="op">:</span><span class="dv">0</span>]<span class="at">const</span> <span class="dt">u8</span> <span class="op">=</span> <span class="st">&quot;user.bar&quot;</span>;</span>
<span id="cb14-2"><a href="#cb14-2" tabindex="-1"></a></span>
<span id="cb14-3"><a href="#cb14-3" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb14-4"><a href="#cb14-4" tabindex="-1"></a></span>
<span id="cb14-5"><a href="#cb14-5" tabindex="-1"></a><span class="kw">fn</span> setxattr(path<span class="op">:</span> [<span class="op">*:</span><span class="dv">0</span>]<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> name<span class="op">:</span> [<span class="op">*:</span><span class="dv">0</span>]<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> value<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span><span class="op">,</span> size<span class="op">:</span> <span class="dt">usize</span><span class="op">,</span> flags<span class="op">:</span> <span class="dt">usize</span>) <span class="dt">usize</span> {</span>
<span id="cb14-6"><a href="#cb14-6" tabindex="-1"></a>    <span class="cf">return</span> l<span class="op">.</span>syscall5(<span class="op">.</span>setxattr<span class="op">,</span> @intFromPtr(path)<span class="op">,</span> @intFromPtr(name)<span class="op">,</span> @intFromPtr(value<span class="op">.</span>ptr)<span class="op">,</span> size<span class="op">,</span> flags);</span>
<span id="cb14-7"><a href="#cb14-7" tabindex="-1"></a>}</span>
<span id="cb14-8"><a href="#cb14-8" tabindex="-1"></a></span>
<span id="cb14-9"><a href="#cb14-9" tabindex="-1"></a><span class="kw">fn</span> setXAttr(env<span class="op">:</span> <span class="op">*</span><span class="at">const</span> TestEnv) <span class="dt">void</span> {</span>
<span id="cb14-10"><a href="#cb14-10" tabindex="-1"></a>    <span class="at">var</span> buf<span class="op">:</span> [os<span class="op">.</span>PATH_MAX]<span class="dt">u8</span> <span class="op">=</span> <span class="op">.</span>{<span class="dv">0</span>} <span class="op">**</span> os<span class="op">.</span>PATH_MAX;</span>
<span id="cb14-11"><a href="#cb14-11" tabindex="-1"></a></span>
<span id="cb14-12"><a href="#cb14-12" tabindex="-1"></a>    <span class="at">const</span> path <span class="op">=</span> std<span class="op">.</span>fmt<span class="op">.</span>bufPrint(buf[<span class="dv">0</span> <span class="op">..</span> buf<span class="op">.</span>len <span class="op">-</span> <span class="dv">1</span>]<span class="op">,</span> <span class="st">&quot;{s}/{s}&quot;</span><span class="op">,</span> <span class="op">.</span>{ env<span class="op">.</span>mnt_path<span class="op">,</span> <span class="st">&quot;foo&quot;</span> }) <span class="cf">catch</span> <span class="op">|</span>err<span class="op">|</span> {</span>
<span id="cb14-13"><a href="#cb14-13" tabindex="-1"></a>        std<span class="op">.</span>debug<span class="op">.</span>print(<span class="st">&quot;bufPrint: {}&quot;</span><span class="op">,</span> <span class="op">.</span>{err});</span>
<span id="cb14-14"><a href="#cb14-14" tabindex="-1"></a>        <span class="cf">return</span>;</span>
<span id="cb14-15"><a href="#cb14-15" tabindex="-1"></a>    };</span>
<span id="cb14-16"><a href="#cb14-16" tabindex="-1"></a></span>
<span id="cb14-17"><a href="#cb14-17" tabindex="-1"></a>    <span class="at">const</span> res <span class="op">=</span> setxattr(<span class="bu">@ptrCast</span>(path)<span class="op">,</span> XATTR_NAME<span class="op">,</span> <span class="st">&quot;baz&quot;</span><span class="op">,</span> <span class="dv">3</span><span class="op">,</span> <span class="dv">0</span>);</span>
<span id="cb14-18"><a href="#cb14-18" tabindex="-1"></a>    <span class="at">const</span> err <span class="op">=</span> os<span class="op">.</span>errno(res);</span>
<span id="cb14-19"><a href="#cb14-19" tabindex="-1"></a></span>
<span id="cb14-20"><a href="#cb14-20" tabindex="-1"></a>    std<span class="op">.</span>debug<span class="op">.</span>print(<span class="st">&quot;setxattr: {s}: {}</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="op">.</span>{ path<span class="op">,</span> err });</span>
<span id="cb14-21"><a href="#cb14-21" tabindex="-1"></a>}</span></code></pre></div>
            <p>We could set any binary data, but for now I just set
            “user.bar” to “baz”.</p>
            <p>The code to handle this is pretty simple as we don’t
            bother to actually store anything.</p>
            <div class="sourceCode" id="cb15"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb15-1"><a href="#cb15-1" tabindex="-1"></a><span class="at">const</span> SetxattrIn <span class="op">=</span> <span class="at">extern</span> <span class="kw">struct</span> {</span>
<span id="cb15-2"><a href="#cb15-2" tabindex="-1"></a>    size<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb15-3"><a href="#cb15-3" tabindex="-1"></a>    flags<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb15-4"><a href="#cb15-4" tabindex="-1"></a>    setxattr_flags<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb15-5"><a href="#cb15-5" tabindex="-1"></a>    padding<span class="op">:</span> <span class="dt">u32</span> <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb15-6"><a href="#cb15-6" tabindex="-1"></a>};</span>
<span id="cb15-7"><a href="#cb15-7" tabindex="-1"></a></span>
<span id="cb15-8"><a href="#cb15-8" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb15-9"><a href="#cb15-9" tabindex="-1"></a></span>
<span id="cb15-10"><a href="#cb15-10" tabindex="-1"></a>        <span class="op">.</span>SETXATTR <span class="op">=&gt;</span> {</span>
<span id="cb15-11"><a href="#cb15-11" tabindex="-1"></a>            <span class="at">const</span> xattr_in <span class="op">=</span> mem<span class="op">.</span>bytesToValue(SetxattrIn<span class="op">,</span> msg[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="bu">@sizeOf</span>(SetxattrIn)]);</span>
<span id="cb15-12"><a href="#cb15-12" tabindex="-1"></a>            <span class="at">const</span> tail <span class="op">=</span> msg[<span class="bu">@sizeOf</span>(SetxattrIn)<span class="op">..</span>];</span>
<span id="cb15-13"><a href="#cb15-13" tabindex="-1"></a></span>
<span id="cb15-14"><a href="#cb15-14" tabindex="-1"></a>            std<span class="op">.</span>debug<span class="op">.</span>print(<span class="st">&quot;kernel: setxattr: {}</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="op">.</span>{xattr_in});</span>
<span id="cb15-15"><a href="#cb15-15" tabindex="-1"></a></span>
<span id="cb15-16"><a href="#cb15-16" tabindex="-1"></a>            <span class="at">const</span> name_len <span class="op">=</span> msg_len <span class="op">-</span> <span class="bu">@sizeOf</span>(SetxattrIn) <span class="op">-</span> xattr_in<span class="op">.</span>size;</span>
<span id="cb15-17"><a href="#cb15-17" tabindex="-1"></a>            <span class="at">const</span> name <span class="op">=</span> tail[<span class="dv">0</span> <span class="op">..</span> name_len <span class="op">-</span> <span class="dv">1</span> <span class="op">:</span><span class="dv">0</span>];</span>
<span id="cb15-18"><a href="#cb15-18" tabindex="-1"></a>            <span class="at">const</span> value <span class="op">=</span> tail[name_len<span class="op">..</span>];</span>
<span id="cb15-19"><a href="#cb15-19" tabindex="-1"></a></span>
<span id="cb15-20"><a href="#cb15-20" tabindex="-1"></a>            std<span class="op">.</span>debug<span class="op">.</span>print(<span class="st">&quot;kernel: setxattr: [{}]{s} =&gt; {s}</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="op">.</span>{ name<span class="op">.</span>len<span class="op">,</span> name<span class="op">,</span> value });</span>
<span id="cb15-21"><a href="#cb15-21" tabindex="-1"></a></span>
<span id="cb15-22"><a href="#cb15-22" tabindex="-1"></a>            assert(mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> XATTR_NAME<span class="op">,</span> name));</span>
<span id="cb15-23"><a href="#cb15-23" tabindex="-1"></a>            assert(mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> <span class="st">&quot;baz&quot;</span><span class="op">,</span> value));</span>
<span id="cb15-24"><a href="#cb15-24" tabindex="-1"></a>        }<span class="op">,</span></span></code></pre></div>
            <p>The <code>SETXATTR</code> name-value pair is not part of
            the <code>SetxattrIn</code> structure. The name and value
            are transmitted immediately after it. The <code>size</code>
            field refers to
            the value size. This seems rather arbitrary and a bit
            redundant because the name is null-terminated and we already
            have the overall message length.</p>
            <p>This worries me because possibly it means that padding
            can be inserted somewhere which would warrant sending the
            value size. Or something else I have not thought of. On the
            other hand it does allow separating the name from the value
            without scanning for the null byte.</p>
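            <p>To make the two ways of separating the name from the
            value concrete, here is a minimal sketch that splits the
            same payload using the value size and by scanning for the
            null byte. The <code>Pair</code> type and both helper
            functions are hypothetical, and it assumes no padding is
            inserted between the name and the value.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import("std");

const Pair = struct { name: []const u8, value: []const u8 };

// Split using the value size from the request, as the handler above does.
fn splitBySize(tail: []const u8, value_size: usize) ?Pair {
    // Everything before the value is the name plus its null terminator.
    const name_len = std.math.sub(usize, tail.len, value_size) catch return null;
    if (name_len == 0 or tail[name_len - 1] != 0) return null;
    return Pair{ .name = tail[0 .. name_len - 1], .value = tail[name_len..] };
}

// Split by scanning for the first null byte instead.
fn splitByNull(tail: []const u8) ?Pair {
    const nul = std.mem.indexOfScalar(u8, tail, 0) orelse return null;
    return Pair{ .name = tail[0..nul], .value = tail[nul + 1 ..] };
}

test "both splits agree on the payload from the test" {
    const tail = "user.bar\x00baz";
    const a = splitBySize(tail, 3).?;
    const b = splitByNull(tail).?;
    try std.testing.expectEqualStrings("user.bar", a.name);
    try std.testing.expectEqualStrings("baz", a.value);
    try std.testing.expectEqualStrings(a.name, b.name);
    try std.testing.expectEqualStrings(a.value, b.value);
}</code></pre></div>
            <p>If padding ever did appear between the name and the
            value, only the size-based split would still find the start
            of the value, which may be one reason the field exists.</p>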
            <p>Anyway, all we do is assert that we got the expected
            values.</p>
            <h1 id="setattr">SETATTR</h1>
            <p>I thought that would be the end of it, but even after
            handling <code>SETXATTR</code> the kernel refused to return
            from <code>setxattr</code> (the system call). I also
            foolishly did not check if any further messages had been
            received in my test.</p>
            <p>Initially debugging did not reveal what was going on
            because I made some assumptions about what was happening and
            did not do a careful analysis of each call to
            <code>fuse_simple_request</code>.</p>
            <p>However looking back on the kernel code which handles
            <code>setxattr</code>, there is an obvious culprit.</p>
            <div class="sourceCode" id="cb16"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb16-1"><a href="#cb16-1" tabindex="-1"></a><span class="dt">int</span> fuse_setxattr<span class="op">(</span><span class="kw">struct</span> inode <span class="op">*</span>inode<span class="op">,</span> <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span>name<span class="op">,</span> <span class="dt">const</span> <span class="dt">void</span> <span class="op">*</span>value<span class="op">,</span></span>
<span id="cb16-2"><a href="#cb16-2" tabindex="-1"></a>          <span class="dt">size_t</span> size<span class="op">,</span> <span class="dt">int</span> flags<span class="op">,</span> <span class="dt">unsigned</span> <span class="dt">int</span> extra_flags<span class="op">)</span></span>
<span id="cb16-3"><a href="#cb16-3" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb16-4"><a href="#cb16-4" tabindex="-1"></a>    <span class="kw">struct</span> fuse_mount <span class="op">*</span>fm <span class="op">=</span> get_fuse_mount<span class="op">(</span>inode<span class="op">);</span></span>
<span id="cb16-5"><a href="#cb16-5" tabindex="-1"></a>    FUSE_ARGS<span class="op">(</span>args<span class="op">);</span></span>
<span id="cb16-6"><a href="#cb16-6" tabindex="-1"></a>    <span class="kw">struct</span> fuse_setxattr_in inarg<span class="op">;</span></span>
<span id="cb16-7"><a href="#cb16-7" tabindex="-1"></a>    <span class="dt">int</span> err<span class="op">;</span></span>
<span id="cb16-8"><a href="#cb16-8" tabindex="-1"></a></span>
<span id="cb16-9"><a href="#cb16-9" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>fm<span class="op">-&gt;</span>fc<span class="op">-&gt;</span>no_setxattr<span class="op">)</span></span>
<span id="cb16-10"><a href="#cb16-10" tabindex="-1"></a>        <span class="cf">return</span> <span class="op">-</span>EOPNOTSUPP<span class="op">;</span></span>
<span id="cb16-11"><a href="#cb16-11" tabindex="-1"></a></span>
<span id="cb16-12"><a href="#cb16-12" tabindex="-1"></a>    memset<span class="op">(&amp;</span>inarg<span class="op">,</span> <span class="dv">0</span><span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>inarg<span class="op">));</span></span>
<span id="cb16-13"><a href="#cb16-13" tabindex="-1"></a>    inarg<span class="op">.</span>size <span class="op">=</span> size<span class="op">;</span></span>
<span id="cb16-14"><a href="#cb16-14" tabindex="-1"></a>    inarg<span class="op">.</span>flags <span class="op">=</span> flags<span class="op">;</span></span>
<span id="cb16-15"><a href="#cb16-15" tabindex="-1"></a>    inarg<span class="op">.</span>setxattr_flags <span class="op">=</span> extra_flags<span class="op">;</span></span>
<span id="cb16-16"><a href="#cb16-16" tabindex="-1"></a></span>
<span id="cb16-17"><a href="#cb16-17" tabindex="-1"></a>    args<span class="op">.</span>opcode <span class="op">=</span> FUSE_SETXATTR<span class="op">;</span></span>
<span id="cb16-18"><a href="#cb16-18" tabindex="-1"></a>    args<span class="op">.</span>nodeid <span class="op">=</span> get_node_id<span class="op">(</span>inode<span class="op">);</span></span>
<span id="cb16-19"><a href="#cb16-19" tabindex="-1"></a>    args<span class="op">.</span>in_numargs <span class="op">=</span> <span class="dv">3</span><span class="op">;</span></span>
<span id="cb16-20"><a href="#cb16-20" tabindex="-1"></a>    args<span class="op">.</span>in_args<span class="op">[</span><span class="dv">0</span><span class="op">].</span>size <span class="op">=</span> fm<span class="op">-&gt;</span>fc<span class="op">-&gt;</span>setxattr_ext <span class="op">?</span></span>
<span id="cb16-21"><a href="#cb16-21" tabindex="-1"></a>        <span class="kw">sizeof</span><span class="op">(</span>inarg<span class="op">)</span> <span class="op">:</span> FUSE_COMPAT_SETXATTR_IN_SIZE<span class="op">;</span></span>
<span id="cb16-22"><a href="#cb16-22" tabindex="-1"></a>    args<span class="op">.</span>in_args<span class="op">[</span><span class="dv">0</span><span class="op">].</span>value <span class="op">=</span> <span class="op">&amp;</span>inarg<span class="op">;</span></span>
<span id="cb16-23"><a href="#cb16-23" tabindex="-1"></a>    args<span class="op">.</span>in_args<span class="op">[</span><span class="dv">1</span><span class="op">].</span>size <span class="op">=</span> strlen<span class="op">(</span>name<span class="op">)</span> <span class="op">+</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb16-24"><a href="#cb16-24" tabindex="-1"></a>    args<span class="op">.</span>in_args<span class="op">[</span><span class="dv">1</span><span class="op">].</span>value <span class="op">=</span> name<span class="op">;</span></span>
<span id="cb16-25"><a href="#cb16-25" tabindex="-1"></a>    args<span class="op">.</span>in_args<span class="op">[</span><span class="dv">2</span><span class="op">].</span>size <span class="op">=</span> size<span class="op">;</span></span>
<span id="cb16-26"><a href="#cb16-26" tabindex="-1"></a>    args<span class="op">.</span>in_args<span class="op">[</span><span class="dv">2</span><span class="op">].</span>value <span class="op">=</span> value<span class="op">;</span></span>
<span id="cb16-27"><a href="#cb16-27" tabindex="-1"></a>    err <span class="op">=</span> fuse_simple_request<span class="op">(</span>fm<span class="op">,</span> <span class="op">&amp;</span>args<span class="op">);</span> <span class="op">&lt;--</span> The request is sent</span>
<span id="cb16-28"><a href="#cb16-28" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>err <span class="op">==</span> <span class="op">-</span>ENOSYS<span class="op">)</span> <span class="op">{</span> <span class="op">&lt;--</span> I look in debugger and see err <span class="op">==</span> <span class="dv">0</span></span>
<span id="cb16-29"><a href="#cb16-29" tabindex="-1"></a>        fm<span class="op">-&gt;</span>fc<span class="op">-&gt;</span>no_setxattr <span class="op">=</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb16-30"><a href="#cb16-30" tabindex="-1"></a>        err <span class="op">=</span> <span class="op">-</span>EOPNOTSUPP<span class="op">;</span></span>
<span id="cb16-31"><a href="#cb16-31" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb16-32"><a href="#cb16-32" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>err<span class="op">)</span></span>
<span id="cb16-33"><a href="#cb16-33" tabindex="-1"></a>        fuse_update_ctime<span class="op">(</span>inode<span class="op">);</span> <span class="op">&lt;--</span> I skip over this function in the debugger</span>
<span id="cb16-34"><a href="#cb16-34" tabindex="-1"></a></span>
<span id="cb16-35"><a href="#cb16-35" tabindex="-1"></a>    <span class="cf">return</span> err<span class="op">;</span></span>
<span id="cb16-36"><a href="#cb16-36" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>The function <code>fuse_update_ctime</code> results in
            <code>SETATTR</code> being sent to the FUSE daemon. It takes
            a rather roundabout route to achieve this (going through
            <code>fs-writeback.c</code>), but it tells our daemon the
            inode’s timestamps need updating.</p>
            <p>Our code which handles <code>SETATTR</code> looks a lot
            like <code>GETATTR</code>. The main difference is we get
            some attribute info from the kernel. This tells us what
            attributes are being set and to which values. However we get
            a chance to modify this and send it back.</p>
            <div class="sourceCode" id="cb17"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb17-1"><a href="#cb17-1" tabindex="-1"></a><span class="at">const</span> SetattrIn <span class="op">=</span> <span class="at">extern</span> <span class="kw">struct</span> {</span>
<span id="cb17-2"><a href="#cb17-2" tabindex="-1"></a>    valid<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb17-3"><a href="#cb17-3" tabindex="-1"></a>    padding<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb17-4"><a href="#cb17-4" tabindex="-1"></a>    fh<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb17-5"><a href="#cb17-5" tabindex="-1"></a>    size<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb17-6"><a href="#cb17-6" tabindex="-1"></a>    lock_owner<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb17-7"><a href="#cb17-7" tabindex="-1"></a>    atime<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb17-8"><a href="#cb17-8" tabindex="-1"></a>    mtime<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb17-9"><a href="#cb17-9" tabindex="-1"></a>    ctime<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb17-10"><a href="#cb17-10" tabindex="-1"></a>    atimensec<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb17-11"><a href="#cb17-11" tabindex="-1"></a>    mtimensec<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb17-12"><a href="#cb17-12" tabindex="-1"></a>    ctimensec<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb17-13"><a href="#cb17-13" tabindex="-1"></a>    mode<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb17-14"><a href="#cb17-14" tabindex="-1"></a>    unused4<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb17-15"><a href="#cb17-15" tabindex="-1"></a>    uid<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb17-16"><a href="#cb17-16" tabindex="-1"></a>    gid<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb17-17"><a href="#cb17-17" tabindex="-1"></a>    unused5<span class="op">:</span> <span class="dt">u32</span><span class="op">,</span></span>
<span id="cb17-18"><a href="#cb17-18" tabindex="-1"></a>};</span>
<span id="cb17-19"><a href="#cb17-19" tabindex="-1"></a></span>
<span id="cb17-20"><a href="#cb17-20" tabindex="-1"></a></span>
<span id="cb17-21"><a href="#cb17-21" tabindex="-1"></a><span class="op">...</span></span>
<span id="cb17-22"><a href="#cb17-22" tabindex="-1"></a></span>
<span id="cb17-23"><a href="#cb17-23" tabindex="-1"></a>        <span class="op">.</span>SETATTR <span class="op">=&gt;</span> {</span>
<span id="cb17-24"><a href="#cb17-24" tabindex="-1"></a>            <span class="at">const</span> setattr_in <span class="op">=</span></span>
<span id="cb17-25"><a href="#cb17-25" tabindex="-1"></a>                mem<span class="op">.</span>bytesAsValue(SetattrIn<span class="op">,</span> msg[<span class="dv">0</span><span class="er">.</span><span class="op">.</span><span class="bu">@sizeOf</span>(SetattrIn)]);</span>
<span id="cb17-26"><a href="#cb17-26" tabindex="-1"></a></span>
<span id="cb17-27"><a href="#cb17-27" tabindex="-1"></a>            std<span class="op">.</span>debug<span class="op">.</span>print(<span class="st">&quot;kernel: setattr: {}</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="op">.</span>{setattr_in});</span>
<span id="cb17-28"><a href="#cb17-28" tabindex="-1"></a></span>
<span id="cb17-29"><a href="#cb17-29" tabindex="-1"></a>            <span class="at">const</span> time<span class="op">:</span> <span class="dt">u64</span> <span class="op">=</span> <span class="bu">@intCast</span>(<span class="bu">@min</span>(<span class="dv">0</span><span class="op">,</span> std<span class="op">.</span>time<span class="op">.</span>timestamp()));</span>
<span id="cb17-30"><a href="#cb17-30" tabindex="-1"></a></span>
<span id="cb17-31"><a href="#cb17-31" tabindex="-1"></a>            res<span class="op">.</span>out<span class="op">.</span>attr <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb17-32"><a href="#cb17-32" tabindex="-1"></a>                <span class="op">.</span>valid <span class="op">=</span> time <span class="op">+</span> <span class="dv">300</span><span class="op">,</span></span>
<span id="cb17-33"><a href="#cb17-33" tabindex="-1"></a>                <span class="op">.</span>valid_nsec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb17-34"><a href="#cb17-34" tabindex="-1"></a>                <span class="op">.</span>dummy <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb17-35"><a href="#cb17-35" tabindex="-1"></a>                <span class="op">.</span>attr <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb17-36"><a href="#cb17-36" tabindex="-1"></a>                    <span class="op">.</span>ino <span class="op">=</span> hdr<span class="op">.</span>nodeid<span class="op">,</span></span>
<span id="cb17-37"><a href="#cb17-37" tabindex="-1"></a>                    <span class="op">.</span>blocks <span class="op">=</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb17-38"><a href="#cb17-38" tabindex="-1"></a>                    <span class="op">.</span>size <span class="op">=</span> <span class="dv">42</span><span class="op">,</span></span>
<span id="cb17-39"><a href="#cb17-39" tabindex="-1"></a>                    <span class="op">.</span>atime <span class="op">=</span> time<span class="op">,</span></span>
<span id="cb17-40"><a href="#cb17-40" tabindex="-1"></a>                    <span class="op">.</span>mtime <span class="op">=</span> time<span class="op">,</span></span>
<span id="cb17-41"><a href="#cb17-41" tabindex="-1"></a>                    <span class="op">.</span>ctime <span class="op">=</span> time<span class="op">,</span></span>
<span id="cb17-42"><a href="#cb17-42" tabindex="-1"></a>                    <span class="op">.</span>atimensec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb17-43"><a href="#cb17-43" tabindex="-1"></a>                    <span class="op">.</span>mtimensec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb17-44"><a href="#cb17-44" tabindex="-1"></a>                    <span class="op">.</span>ctimensec <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb17-45"><a href="#cb17-45" tabindex="-1"></a>                    <span class="op">.</span>mode <span class="op">=</span> l<span class="op">.</span>S<span class="op">.</span>IFDIR <span class="op">|</span> <span class="dv">0</span><span class="er">o666</span><span class="op">,</span></span>
<span id="cb17-46"><a href="#cb17-46" tabindex="-1"></a>                    <span class="op">.</span>nlink <span class="op">=</span> <span class="dv">1</span><span class="op">,</span></span>
<span id="cb17-47"><a href="#cb17-47" tabindex="-1"></a>                    <span class="op">.</span>uid <span class="op">=</span> l<span class="op">.</span>getuid()<span class="op">,</span></span>
<span id="cb17-48"><a href="#cb17-48" tabindex="-1"></a>                    <span class="op">.</span>gid <span class="op">=</span> l<span class="op">.</span>getgid()<span class="op">,</span></span>
<span id="cb17-49"><a href="#cb17-49" tabindex="-1"></a>                    <span class="op">.</span>rdev <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb17-50"><a href="#cb17-50" tabindex="-1"></a>                    <span class="op">.</span>blksize <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb17-51"><a href="#cb17-51" tabindex="-1"></a>                    <span class="op">.</span>flags <span class="op">=</span> <span class="dv">0</span><span class="op">,</span></span>
<span id="cb17-52"><a href="#cb17-52" tabindex="-1"></a>                }<span class="op">,</span></span>
<span id="cb17-53"><a href="#cb17-53" tabindex="-1"></a>            };</span>
<span id="cb17-54"><a href="#cb17-54" tabindex="-1"></a></span>
<span id="cb17-55"><a href="#cb17-55" tabindex="-1"></a>            <span class="at">const</span> v <span class="op">=</span> setattr_in<span class="op">.</span>valid;</span>
<span id="cb17-56"><a href="#cb17-56" tabindex="-1"></a>            <span class="cf">if</span> (v <span class="op">&amp;</span> <span class="op">~</span>(FATTR<span class="op">.</span>ATIME <span class="op">|</span> FATTR<span class="op">.</span>MTIME <span class="op">|</span> FATTR<span class="op">.</span>CTIME) <span class="op">&gt;</span> <span class="dv">0</span>) {</span>
<span id="cb17-57"><a href="#cb17-57" tabindex="-1"></a>                std<span class="op">.</span>debug<span class="op">.</span>print(<span class="st">&quot;setattr: setting attributes not supported</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span> <span class="op">.</span>{});</span>
<span id="cb17-58"><a href="#cb17-58" tabindex="-1"></a>                res<span class="op">.</span>hdr<span class="op">.</span>err <span class="op">=</span> <span class="op">-</span><span class="bu">@as</span>(<span class="dt">i32</span><span class="op">,</span> @intFromEnum(E<span class="op">.</span>OPNOTSUPP));</span>
<span id="cb17-59"><a href="#cb17-59" tabindex="-1"></a>            } <span class="cf">else</span> {</span>
<span id="cb17-60"><a href="#cb17-60" tabindex="-1"></a>                res<span class="op">.</span>hdr<span class="op">.</span>len <span class="op">+=</span> <span class="bu">@sizeOf</span>(AttrOut);</span>
<span id="cb17-61"><a href="#cb17-61" tabindex="-1"></a>            }</span>
<span id="cb17-62"><a href="#cb17-62" tabindex="-1"></a>        }<span class="op">,</span></span></code></pre></div>
            <p>The <code>valid</code> field in <code>SetattrIn</code> we
            receive from the kernel says which attributes are being set.
            It is a bit field, where each bit represents an
            attribute.</p>
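            <p>A minimal sketch of that check in isolation. The bit
            positions are assumed to mirror the <code>FATTR_*</code>
            flags in Linux’s <code>fuse.h</code>, and
            <code>onlyTimestamps</code> is a hypothetical helper name
            rather than something from the code above. It should run
            with <code>zig test</code>.</p>
            <div class="sourceCode"><pre
            class="sourceCode zig"><code class="sourceCode zig">const std = @import("std");

// Assumed bit positions, mirroring the FATTR_* flags in Linux's fuse.h.
const FATTR = struct {
    const MODE: u32 = 1 &lt;&lt; 0;
    const ATIME: u32 = 1 &lt;&lt; 4;
    const MTIME: u32 = 1 &lt;&lt; 5;
    const CTIME: u32 = 1 &lt;&lt; 10;
};

// Hypothetical helper: true when `valid` names nothing but timestamps.
fn onlyTimestamps(valid: u32) bool {
    const allowed: u32 = FATTR.ATIME | FATTR.MTIME | FATTR.CTIME;
    // Clear the allowed bits; anything left over is an unsupported attribute.
    return (valid &amp; ~allowed) == 0;
}

test "only timestamp bits are accepted" {
    // Updating access and modification times only: accepted.
    try std.testing.expect(onlyTimestamps(FATTR.ATIME | FATTR.MTIME));
    // Changing the mode as well: rejected.
    try std.testing.expect(!onlyTimestamps(FATTR.MTIME | FATTR.MODE));
}</code></pre></div>
            <p>Masking out the permitted bits and comparing the rest
            against zero means any flag we have not thought about is
            rejected rather than silently accepted.</p>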
            <p>We only expect some time attributes to be valid in the
            request. It’s not clear if the <code>valid</code> bitfield
            also applies to the response. You can see in the code I
            accidentally set <code>mode</code> to a directory. I’m not
            sure whether the kernel ignores this or is simply happy
            that, after updating the timestamps, the inode transformed
            from a file into a directory.</p>
            <h1 id="closing-remarks">Closing remarks</h1>
            <p>Part of me thinks that I really need to read up on what
            an inode is. Or some other FS thing like a super block.
            However this would give a normative understanding of these
            things. At best it is a peg to hang future knowledge on and
            at worst it delays achieving real world understanding that
            comes from exposure.</p>
            <p>Often labels are mistaken for abstractions. The same
            labels are given to some group of objects or interfaces or
            illogical concepts. When you start digging into it you
            realise that the code only behaves according to the label’s
            description in some specific circumstance or not at all.</p>
            <p>I hope you enjoyed that.</p>
            <h1 id="related">Related</h1>
            <ul>
            <li><a href="zig-fuse-one">Zig &amp; FUSE: Hello file
            systems</a></li>
            <li><a href="/https/richiejp.com/barely-http2-zig">Barely HTTP/2 server in
            Zig</a></li>
            <li><a href="/https/richiejp.com/zig-cross-compile-ltp-ltx-linux">Minimal Linux
            VM cross compiled with Clang and Zig</a></li>
            <li><a href="/https/richiejp.com/zig-ld-preload-trick">Override libc’s malloc
            with Zig</a></li>
            <li><a href="/https/richiejp.com/zig-vs-c-mini-http-server">Zig Vs C - Minimal
            HTTP server</a></li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>Override libc’s malloc with Zig</title>
  <id>https://richiejp.com/zig-ld-preload-trick</id>
  <published>2023-06-16T14:33:24+01:00</published>
  <updated>2023-07-26T15:24:08+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/zig-ld-preload-trick" />
  <summary>Use the LD_PRELOAD trick with dynamically linked libc and Zig
to override malloc</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>The LD_PRELOAD trick allows you to load a dynamic library
            before any other libraries are loaded. You can use this to
            override functions inside an application at runtime. This is
            often used to override functions in libc to aid
            debugging.</p>
            <p>Doing it with Zig is cool because Zig itself does not
            rely on libc. So if we override the <code>malloc</code>
            family of functions we don’t have to worry about
            <code>malloc</code> <a
            href="https://stackoverflow.com/a/10008252">recursively
            calling itself</a>.</p>
            <p>We can also use the Zig standard library, and we don’t
            need to worry about how to implement <code>malloc</code>
            because it can be copied and pasted from <a
            href="https://github.com/marler8997/ziglibc">ziglibc</a>
            with some minor alterations.</p>
            <h1 id="malloc.zig">malloc.zig</h1>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="at">const</span> builtin <span class="op">=</span> <span class="bu">@import</span>(<span class="st">&quot;builtin&quot;</span>);</span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a><span class="at">const</span> std <span class="op">=</span> <span class="bu">@import</span>(<span class="st">&quot;std&quot;</span>);</span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a></span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a><span class="at">const</span> alloc_align <span class="op">=</span> <span class="dv">16</span>;</span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a><span class="at">const</span> alloc_metadata_len <span class="op">=</span> std<span class="op">.</span>mem<span class="op">.</span>alignForward(<span class="bu">@sizeOf</span>(<span class="dt">usize</span>)<span class="op">,</span> alloc_align);</span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a><span class="at">var</span> gpa <span class="op">=</span> std<span class="op">.</span>heap<span class="op">.</span>GeneralPurposeAllocator(<span class="op">.</span>{</span>
<span id="cb1-7"><a href="#cb1-7" tabindex="-1"></a>    <span class="op">.</span>MutexType <span class="op">=</span> std<span class="op">.</span>Thread<span class="op">.</span>Mutex<span class="op">,</span></span>
<span id="cb1-8"><a href="#cb1-8" tabindex="-1"></a>}){};</span>
<span id="cb1-9"><a href="#cb1-9" tabindex="-1"></a></span>
<span id="cb1-10"><a href="#cb1-10" tabindex="-1"></a><span class="kw">export</span> <span class="kw">fn</span> malloc(size<span class="op">:</span> <span class="dt">usize</span>) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="op">?</span>[<span class="op">*</span>]<span class="kw">align</span>(alloc_align) <span class="dt">u8</span> {</span>
<span id="cb1-11"><a href="#cb1-11" tabindex="-1"></a>    std<span class="op">.</span>debug<span class="op">.</span>assert(size <span class="op">&gt;</span> <span class="dv">0</span>); <span class="co">// </span><span class="al">TODO</span><span class="co">: what should we do in this case?</span></span>
<span id="cb1-12"><a href="#cb1-12" tabindex="-1"></a>    <span class="at">const</span> full_len <span class="op">=</span> alloc_metadata_len <span class="op">+</span> size;</span>
<span id="cb1-13"><a href="#cb1-13" tabindex="-1"></a>    <span class="at">const</span> buf <span class="op">=</span> gpa<span class="op">.</span>allocator()<span class="op">.</span>alignedAlloc(<span class="dt">u8</span><span class="op">,</span> alloc_align<span class="op">,</span> full_len) <span class="cf">catch</span> <span class="op">|</span>err<span class="op">|</span> <span class="cf">switch</span> (err) {</span>
<span id="cb1-14"><a href="#cb1-14" tabindex="-1"></a>        <span class="kw">error</span><span class="op">.</span>OutOfMemory <span class="op">=&gt;</span> {</span>
<span id="cb1-15"><a href="#cb1-15" tabindex="-1"></a>            std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;malloc return null&quot;</span><span class="op">,</span> <span class="op">.</span>{});</span>
<span id="cb1-16"><a href="#cb1-16" tabindex="-1"></a>            <span class="cf">return</span> <span class="cn">null</span>;</span>
<span id="cb1-17"><a href="#cb1-17" tabindex="-1"></a>        }<span class="op">,</span></span>
<span id="cb1-18"><a href="#cb1-18" tabindex="-1"></a>    };</span>
<span id="cb1-19"><a href="#cb1-19" tabindex="-1"></a>    <span class="bu">@ptrCast</span>(<span class="op">*</span><span class="dt">usize</span><span class="op">,</span> buf)<span class="op">.*</span> <span class="op">=</span> full_len;</span>
<span id="cb1-20"><a href="#cb1-20" tabindex="-1"></a>    <span class="at">const</span> result <span class="op">=</span> <span class="bu">@intToPtr</span>([<span class="op">*</span>]<span class="kw">align</span>(alloc_align) <span class="dt">u8</span><span class="op">,</span> <span class="bu">@ptrToInt</span>(buf<span class="op">.</span>ptr) <span class="op">+</span> alloc_metadata_len);</span>
<span id="cb1-21"><a href="#cb1-21" tabindex="-1"></a>    std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;malloc({}) return {*}&quot;</span><span class="op">,</span> <span class="op">.</span>{ size<span class="op">,</span> result });</span>
<span id="cb1-22"><a href="#cb1-22" tabindex="-1"></a>    <span class="cf">return</span> result;</span>
<span id="cb1-23"><a href="#cb1-23" tabindex="-1"></a>}</span>
<span id="cb1-24"><a href="#cb1-24" tabindex="-1"></a></span>
<span id="cb1-25"><a href="#cb1-25" tabindex="-1"></a><span class="kw">fn</span> getGpaBuf(ptr<span class="op">:</span> [<span class="op">*</span>]<span class="dt">u8</span>) []<span class="kw">align</span>(alloc_align) <span class="dt">u8</span> {</span>
<span id="cb1-26"><a href="#cb1-26" tabindex="-1"></a>    <span class="at">const</span> start <span class="op">=</span> <span class="bu">@ptrToInt</span>(ptr) <span class="op">-</span> alloc_metadata_len;</span>
<span id="cb1-27"><a href="#cb1-27" tabindex="-1"></a>    <span class="at">const</span> len <span class="op">=</span> <span class="bu">@intToPtr</span>(<span class="op">*</span><span class="dt">usize</span><span class="op">,</span> start)<span class="op">.*</span>;</span>
<span id="cb1-28"><a href="#cb1-28" tabindex="-1"></a>    <span class="cf">return</span> <span class="bu">@alignCast</span>(alloc_align<span class="op">,</span> <span class="bu">@intToPtr</span>([<span class="op">*</span>]<span class="dt">u8</span><span class="op">,</span> start)[<span class="dv">0</span><span class="er">.</span><span class="op">.</span>len]);</span>
<span id="cb1-29"><a href="#cb1-29" tabindex="-1"></a>}</span>
<span id="cb1-30"><a href="#cb1-30" tabindex="-1"></a></span>
<span id="cb1-31"><a href="#cb1-31" tabindex="-1"></a><span class="kw">export</span> <span class="kw">fn</span> realloc(ptr<span class="op">:</span> <span class="op">?</span>[<span class="op">*</span>]<span class="kw">align</span>(alloc_align) <span class="dt">u8</span><span class="op">,</span> size<span class="op">:</span> <span class="dt">usize</span>) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="op">?</span>[<span class="op">*</span>]<span class="kw">align</span>(alloc_align) <span class="dt">u8</span> {</span>
<span id="cb1-32"><a href="#cb1-32" tabindex="-1"></a>    std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;realloc {*} {}&quot;</span><span class="op">,</span> <span class="op">.</span>{ ptr<span class="op">,</span> size });</span>
<span id="cb1-33"><a href="#cb1-33" tabindex="-1"></a>    <span class="at">const</span> gpa_buf <span class="op">=</span> getGpaBuf(ptr <span class="kw">orelse</span> {</span>
<span id="cb1-34"><a href="#cb1-34" tabindex="-1"></a>        <span class="at">const</span> result <span class="op">=</span> malloc(size);</span>
<span id="cb1-35"><a href="#cb1-35" tabindex="-1"></a>        std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;realloc return {*} (from malloc)&quot;</span><span class="op">,</span> <span class="op">.</span>{result});</span>
<span id="cb1-36"><a href="#cb1-36" tabindex="-1"></a>        <span class="cf">return</span> result;</span>
<span id="cb1-37"><a href="#cb1-37" tabindex="-1"></a>    });</span>
<span id="cb1-38"><a href="#cb1-38" tabindex="-1"></a>    <span class="cf">if</span> (size <span class="op">==</span> <span class="dv">0</span>) {</span>
<span id="cb1-39"><a href="#cb1-39" tabindex="-1"></a>        gpa<span class="op">.</span>allocator()<span class="op">.</span>free(gpa_buf);</span>
<span id="cb1-40"><a href="#cb1-40" tabindex="-1"></a>        <span class="cf">return</span> <span class="cn">null</span>;</span>
<span id="cb1-41"><a href="#cb1-41" tabindex="-1"></a>    }</span>
<span id="cb1-42"><a href="#cb1-42" tabindex="-1"></a></span>
<span id="cb1-43"><a href="#cb1-43" tabindex="-1"></a>    <span class="at">const</span> gpa_size <span class="op">=</span> alloc_metadata_len <span class="op">+</span> size;</span>
<span id="cb1-44"><a href="#cb1-44" tabindex="-1"></a>    <span class="cf">if</span> (gpa<span class="op">.</span>allocator()<span class="op">.</span>rawResize(gpa_buf<span class="op">,</span> std<span class="op">.</span>math<span class="op">.</span>log2(alloc_align)<span class="op">,</span> gpa_size<span class="op">,</span> <span class="bu">@returnAddress</span>())) {</span>
<span id="cb1-45"><a href="#cb1-45" tabindex="-1"></a>        <span class="bu">@ptrCast</span>(<span class="op">*</span><span class="dt">usize</span><span class="op">,</span> gpa_buf<span class="op">.</span>ptr)<span class="op">.*</span> <span class="op">=</span> gpa_size;</span>
<span id="cb1-46"><a href="#cb1-46" tabindex="-1"></a>        std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;realloc return {*}&quot;</span><span class="op">,</span> <span class="op">.</span>{ptr});</span>
<span id="cb1-47"><a href="#cb1-47" tabindex="-1"></a>        <span class="cf">return</span> ptr;</span>
<span id="cb1-48"><a href="#cb1-48" tabindex="-1"></a>    }</span>
<span id="cb1-49"><a href="#cb1-49" tabindex="-1"></a></span>
<span id="cb1-50"><a href="#cb1-50" tabindex="-1"></a>    <span class="at">const</span> new_buf <span class="op">=</span> gpa<span class="op">.</span>allocator()<span class="op">.</span>reallocAdvanced(</span>
<span id="cb1-51"><a href="#cb1-51" tabindex="-1"></a>        gpa_buf<span class="op">,</span></span>
<span id="cb1-52"><a href="#cb1-52" tabindex="-1"></a>        gpa_size<span class="op">,</span></span>
<span id="cb1-53"><a href="#cb1-53" tabindex="-1"></a>        <span class="bu">@returnAddress</span>()<span class="op">,</span></span>
<span id="cb1-54"><a href="#cb1-54" tabindex="-1"></a>    ) <span class="cf">catch</span> <span class="op">|</span>e<span class="op">|</span> <span class="cf">switch</span> (e) {</span>
<span id="cb1-55"><a href="#cb1-55" tabindex="-1"></a>        <span class="kw">error</span><span class="op">.</span>OutOfMemory <span class="op">=&gt;</span> {</span>
<span id="cb1-56"><a href="#cb1-56" tabindex="-1"></a>            std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;realloc out-of-mem from {} to {}&quot;</span><span class="op">,</span> <span class="op">.</span>{ gpa_buf<span class="op">.</span>len<span class="op">,</span> gpa_size });</span>
<span id="cb1-57"><a href="#cb1-57" tabindex="-1"></a>            <span class="cf">return</span> <span class="cn">null</span>;</span>
<span id="cb1-58"><a href="#cb1-58" tabindex="-1"></a>        }<span class="op">,</span></span>
<span id="cb1-59"><a href="#cb1-59" tabindex="-1"></a>    };</span>
<span id="cb1-60"><a href="#cb1-60" tabindex="-1"></a>    <span class="bu">@ptrCast</span>(<span class="op">*</span><span class="dt">usize</span><span class="op">,</span> new_buf<span class="op">.</span>ptr)<span class="op">.*</span> <span class="op">=</span> gpa_size;</span>
<span id="cb1-61"><a href="#cb1-61" tabindex="-1"></a>    <span class="at">const</span> result <span class="op">=</span> <span class="bu">@intToPtr</span>([<span class="op">*</span>]<span class="kw">align</span>(alloc_align) <span class="dt">u8</span><span class="op">,</span> <span class="bu">@ptrToInt</span>(new_buf<span class="op">.</span>ptr) <span class="op">+</span> alloc_metadata_len);</span>
<span id="cb1-62"><a href="#cb1-62" tabindex="-1"></a>    std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;realloc return {*}&quot;</span><span class="op">,</span> <span class="op">.</span>{result});</span>
<span id="cb1-63"><a href="#cb1-63" tabindex="-1"></a>    <span class="cf">return</span> result;</span>
<span id="cb1-64"><a href="#cb1-64" tabindex="-1"></a>}</span>
<span id="cb1-65"><a href="#cb1-65" tabindex="-1"></a></span>
<span id="cb1-66"><a href="#cb1-66" tabindex="-1"></a><span class="kw">export</span> <span class="kw">fn</span> calloc(nmemb<span class="op">:</span> <span class="dt">usize</span><span class="op">,</span> size<span class="op">:</span> <span class="dt">usize</span>) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="op">?</span>[<span class="op">*</span>]<span class="kw">align</span>(alloc_align) <span class="dt">u8</span> {</span>
<span id="cb1-67"><a href="#cb1-67" tabindex="-1"></a>    <span class="at">const</span> total <span class="op">=</span> std<span class="op">.</span>math<span class="op">.</span>mul(<span class="dt">usize</span><span class="op">,</span> nmemb<span class="op">,</span> size) <span class="cf">catch</span> {</span>
<span id="cb1-68"><a href="#cb1-68" tabindex="-1"></a>        <span class="co">// </span><span class="al">TODO</span><span class="co">: set errno</span></span>
<span id="cb1-69"><a href="#cb1-69" tabindex="-1"></a>        <span class="co">//errno = c.ENOMEM;</span></span>
<span id="cb1-70"><a href="#cb1-70" tabindex="-1"></a>        <span class="cf">return</span> <span class="cn">null</span>;</span>
<span id="cb1-71"><a href="#cb1-71" tabindex="-1"></a>    };</span>
<span id="cb1-72"><a href="#cb1-72" tabindex="-1"></a>    <span class="at">const</span> ptr <span class="op">=</span> malloc(total) <span class="kw">orelse</span> <span class="cf">return</span> <span class="cn">null</span>;</span>
<span id="cb1-73"><a href="#cb1-73" tabindex="-1"></a>    <span class="bu">@memset</span>(ptr[<span class="dv">0</span><span class="er">.</span><span class="op">.</span>total]<span class="op">,</span> <span class="dv">0</span>);</span>
<span id="cb1-74"><a href="#cb1-74" tabindex="-1"></a>    <span class="cf">return</span> ptr;</span>
<span id="cb1-75"><a href="#cb1-75" tabindex="-1"></a>}</span>
<span id="cb1-76"><a href="#cb1-76" tabindex="-1"></a></span>
<span id="cb1-77"><a href="#cb1-77" tabindex="-1"></a><span class="kw">export</span> <span class="kw">fn</span> free(ptr<span class="op">:</span> <span class="op">?</span>[<span class="op">*</span>]<span class="kw">align</span>(alloc_align) <span class="dt">u8</span>) <span class="kw">callconv</span>(<span class="op">.</span>C) <span class="dt">void</span> {</span>
<span id="cb1-78"><a href="#cb1-78" tabindex="-1"></a>    std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;free {*}&quot;</span><span class="op">,</span> <span class="op">.</span>{ptr});</span>
<span id="cb1-79"><a href="#cb1-79" tabindex="-1"></a>    <span class="at">const</span> p <span class="op">=</span> ptr <span class="kw">orelse</span> <span class="cf">return</span>;</span>
<span id="cb1-80"><a href="#cb1-80" tabindex="-1"></a>    gpa<span class="op">.</span>allocator()<span class="op">.</span>free(getGpaBuf(p));</span>
<span id="cb1-81"><a href="#cb1-81" tabindex="-1"></a>}</span></code></pre></div>
            <p>You can see that <code>realloc</code> complicates things
            a bit. It needs to be included, otherwise libc’s own
            <code>realloc</code> could hand out pointers that later get
            passed to our <code>free</code>.</p>
            <h1 id="build-and-run">Build and run</h1>
            <p>We can compile the above and run it like so:</p>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode sh"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="ex">$</span> zig build-lib malloc.zig <span class="at">-dynamic</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="ex">$</span> LD_PRELOAD=./libmalloc.so ls</span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="ex">...</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">128</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a7a6310</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a><span class="ex">info:</span> realloc <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@0</span> 20800</span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">20800</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a488010</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a><span class="ex">info:</span> realloc return <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@7fa33a488010</span> <span class="er">(</span><span class="ex">from</span> malloc<span class="kw">)</span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">32</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a4900d0</span></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">2</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a7a4610</span></span>
<span id="cb2-10"><a href="#cb2-10" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">32816</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a47f010</span></span>
<span id="cb2-11"><a href="#cb2-11" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">11</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a7a4630</span></span>
<span id="cb2-12"><a href="#cb2-12" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">15</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a7a4650</span></span>
<span id="cb2-13"><a href="#cb2-13" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">13</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a7a4670</span></span>
<span id="cb2-14"><a href="#cb2-14" tabindex="-1"></a><span class="ex">info:</span> free <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@7fa33a47f010</span></span>
<span id="cb2-15"><a href="#cb2-15" tabindex="-1"></a><span class="ex">info:</span> free <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@0</span></span>
<span id="cb2-16"><a href="#cb2-16" tabindex="-1"></a><span class="ex">info:</span> realloc <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@0</span> 72</span>
<span id="cb2-17"><a href="#cb2-17" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">72</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a4a3e90</span></span>
<span id="cb2-18"><a href="#cb2-18" tabindex="-1"></a><span class="ex">info:</span> realloc return <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@7fa33a4a3e90</span> <span class="er">(</span><span class="ex">from</span> malloc<span class="kw">)</span></span>
<span id="cb2-19"><a href="#cb2-19" tabindex="-1"></a><span class="ex">info:</span> realloc <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@0</span> 144</span>
<span id="cb2-20"><a href="#cb2-20" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">144</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a7a6410</span></span>
<span id="cb2-21"><a href="#cb2-21" tabindex="-1"></a><span class="ex">info:</span> realloc return <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@7fa33a7a6410</span> <span class="er">(</span><span class="ex">from</span> malloc<span class="kw">)</span></span>
<span id="cb2-22"><a href="#cb2-22" tabindex="-1"></a><span class="ex">info:</span> realloc <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@0</span> 168</span>
<span id="cb2-23"><a href="#cb2-23" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">168</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a7a6510</span></span>
<span id="cb2-24"><a href="#cb2-24" tabindex="-1"></a><span class="ex">info:</span> realloc return <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@7fa33a7a6510</span> <span class="er">(</span><span class="ex">from</span> malloc<span class="kw">)</span></span>
<span id="cb2-25"><a href="#cb2-25" tabindex="-1"></a><span class="ex">info:</span> malloc<span class="er">(</span><span class="ex">1024</span><span class="kw">)</span> <span class="cf">return</span> <span class="ex">u8@7fa33a487010</span></span>
<span id="cb2-26"><a href="#cb2-26" tabindex="-1"></a><span class="ex">libmalloc.so*</span>  libmalloc.so.o  malloc.zig</span>
<span id="cb2-27"><a href="#cb2-27" tabindex="-1"></a><span class="ex">info:</span> free <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@7fa33a7a4610</span></span>
<span id="cb2-28"><a href="#cb2-28" tabindex="-1"></a><span class="ex">info:</span> free <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@0</span></span>
<span id="cb2-29"><a href="#cb2-29" tabindex="-1"></a><span class="ex">info:</span> free <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@7fa33a4900d0</span></span>
<span id="cb2-30"><a href="#cb2-30" tabindex="-1"></a><span class="ex">info:</span> free <span class="pp">[</span><span class="ss">*</span><span class="pp">]</span>align<span class="er">(</span><span class="ex">16</span><span class="kw">)</span> <span class="ex">u8@7fa33a487010</span></span></code></pre></div>
            <p>I have removed most of the output.</p>
            <h1 id="related">Related</h1>
            <ul>
            <li><a href="/https/richiejp.com/barely-http2-zig">Barely HTTP/2 server in
            Zig</a></li>
            <li><a href="/https/richiejp.com/zig-vs-c-mini-http-server">Zig Vs C - Minimal
            HTTP server</a></li>
            <li><a href="/https/richiejp.com/zig-cross-compile-ltp-ltx-linux">Minimal Linux
            VM cross compiled with Clang and Zig</a></li>
            <li><a href="zig-fuse-one">Zig &amp; FUSE: Hello file
            systems</a></li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>Zig Vs C - Minimal HTTP server</title>
  <id>https://richiejp.com/zig-vs-c-mini-http-server</id>
  <published>2022-02-04T12:16:40Z</published>
  <updated>2023-07-26T15:24:08+01:00</updated>
  <link rel="alternate" href="https://richiejp.com/zig-vs-c-mini-http-server" />
  <summary>Comparison of a minimal HTTP server written in C and
Zig</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <p>While working on my <a href="/https/richiejp.com/linux-socket-example">Linux
            socket example</a> I decided to write a tiny HTTP server for
            previewing <a href="/https/richiejp.com/pandoc-bulma-static-site">my static
            website</a>. This shows the basics of using TCP sockets,
            correctly adds <code>.html</code> to routes without it and
            saves me the distress of typing <code>python</code>,
            <code>npm</code> or similar blasphemies. The server is
            barely functional of course. However it is enough to get my
            pages to appear in Firefox and Chrome.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>I started an HTTP/2 version: <a href="/https/richiejp.com/barely-http2-zig">A
            barely HTTP/2 server in Zig</a></p>
            </div>
            </div>
            <p>It also happens that I am desperate to write Zig code.
            It’s an unfortunate part of my personality that I cannot
            stay away from new languages (and <a
            href="/https/richiejp.com/nanos-clone3-brk-and-nodejs">kernels</a>, web
            frameworks etc.). If you want to <a
            href="/https/richiejp.com/ways-to-help-your-project-fail">ruin a project</a>
            then choosing all new stuff is an excellent way to go about
            it. However I’ve learned the hard way to try out one new
            thing at a time. So in this article I’m just going to use
            Zig to do something I have done before.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Update in 2023!</p>
            <p>To my knowledge, the segfaults mentioned here have since
            been fixed.</p>
            </div>
            </div>
            <p>This is the second time I have written some Zig. The
            first time I tried using it to <a href="/https/richiejp.com/zc-data">build and
            test a radix sort and hash map implementation in C</a>. This
            was moderately successful. One problem was that I managed to
            segfault the compiler; the other was that I was confused
            about slices and pointers. This time I managed to <a
            href="https://github.com/ziglang/zig/issues/10315#issuecomment-1028999168">also
            segfault the compiler</a> and was still confused about
            slices.</p>
            <div class="message is-info">
            <div class="message-body">
            <p>Update in 2023!</p>
            <p>The compiler now prints some helpful hints when there is
            an issue. Also I think it is now more permissive in
            situations where there is no ambiguity.</p>
            <p>After a bit more Zig hacking I now feel totally
            comfortable with slices and the various pointer types.</p>
            </div>
            </div>
            <p>This hasn’t deterred me however. For one thing I have
            spent barely any time on Zig. I’ve spent more time trying to
            figure out if something is a scalar or an array in Perl than
            I have with Zig. So I can forgive some head scratching over
            its obtuse type system errors.</p>
            <p>Just to be clear, this is hardly an apples to apples
            comparison. For that I think we would have to rip out the
            standard libraries for both languages, then build an
            application with total feature parity. Only then would we
            see exactly what each <em>language</em> gives us.
            Alternatively we could try using a C library which provides
            similar features to the Zig one.</p>
            <p>Anyway enough rambling and interlinking. You can see the
            <a
            href="https://gitlab.com/Palethorpe/portfolio/-/blob/master/src/self-serve.zig">latest
            Zig code here</a> and the <a
            href="https://gitlab.com/Palethorpe/portfolio/-/blob/master/src/self-serve.c">latest
            C code here</a>. Let’s compare the imports and includes
            first.</p>
            <h1 id="importinclude">Import/Include</h1>
            <h3 id="zig">Zig</h3>
            <div class="sourceCode" id="cb1"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="at">const</span> std <span class="op">=</span> <span class="bu">@import</span>(<span class="st">&quot;std&quot;</span>);</span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a><span class="at">const</span> net <span class="op">=</span> std<span class="op">.</span>net;</span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a><span class="at">const</span> mem <span class="op">=</span> std<span class="op">.</span>mem;</span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a><span class="at">const</span> fs <span class="op">=</span> std<span class="op">.</span>fs;</span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a><span class="at">const</span> io <span class="op">=</span> std<span class="op">.</span>io;</span></code></pre></div>
            <h3 id="c">C</h3>
            <div class="sourceCode" id="cb2"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="pp">#define </span><span class="ot">_GNU_SOURCE</span></span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;limits.h&gt;</span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;errno.h&gt;</span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;unistd.h&gt;</span></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;string.h&gt;</span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;stdio.h&gt;</span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;fcntl.h&gt;</span></span>
<span id="cb2-9"><a href="#cb2-9" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;signal.h&gt;</span></span>
<span id="cb2-10"><a href="#cb2-10" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;sys/stat.h&gt;</span></span>
<span id="cb2-11"><a href="#cb2-11" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;sys/socket.h&gt;</span></span>
<span id="cb2-12"><a href="#cb2-12" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;sys/sendfile.h&gt;</span></span>
<span id="cb2-13"><a href="#cb2-13" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;netinet/in.h&gt;</span></span>
<span id="cb2-14"><a href="#cb2-14" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;netinet/tcp.h&gt;</span></span>
<span id="cb2-15"><a href="#cb2-15" tabindex="-1"></a><span class="pp">#include </span><span class="im">&lt;arpa/inet.h&gt;</span></span></code></pre></div>
            <p>I only used the standard library for Zig and POSIX for C,
            with the exception of <code>sys/sendfile.h</code> and
            perhaps something else I have forgotten about. Everything
            from the Zig standard library is imported with a single
            <code>@import("std")</code>; the other statements are just
            regular assignments.</p>
            <p>Zig doesn’t specifically have <em>modules</em> or
            whatever; things like structs and unions act as namespaces.
            The <code>@import</code> builtin turns the source file it
            imports into a struct type. So <code>std</code> is a struct
            type. Struct types (or just structs) can contain
            declarations as well as fields, a bit like static variables,
            which is what <code>std.io</code> is.</p>
            <p>All struct types in Zig are anonymous unless they are
            assigned to a variable or appear in a return statement. Then
            they take on the name of the variable or the returning
            function respectively. It seems the first assignment becomes
            the canonical name.</p>
            <p>Already this is saying a lot about Zig I think. Meanwhile
            the C <code>#includes</code> are not actually C, they are
            preprocessor directives. The C preprocessor is a templating
            language more or less. Including a file inserts its
            processed content at the point of the include. It’s not
            immediately obvious what was included and which parts of it
            we use.</p>
            <p>I’m not entirely sure all of those includes are needed
            either. It should be possible to find out using <a
            href="custom-c-static-analysis-tools">static analysis</a>,
            however I’m not exactly sure how to do it. Having said that,
            I’m pretty sure they all are needed.</p>
            <p>The header files don’t include the full code for the
            functions being included either. They could do of course,
            but I’m linking against glibc and that is not how it works.
            By default Zig’s standard library is fully included. There
            is a huge discussion to be had about that, but it doesn’t
            affect the current project.</p>
            <p>The Zig produced executable is bigger than the C one and
            it takes longer to compile. However they are both more than
            adequate for this project. It’s difficult to extrapolate
            this to a larger or more constrained scenario because Zig
            appears to have ways of dealing with these issues. Not to
            mention that you can throw out the C standard library.</p>
            <p>What I think matters most here is that we have a big long
            list of C headers for a relatively simple program. Also we
            know that everything from <code>std</code> is in the
            <code>std</code> variable. At least until we assign
            something from <code>std</code> to an outer variable.</p>
            <p>It may be feasible to do something similar in C with
            structs and clever macros. However, using the defaults, Zig
            wins here.</p>
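            <p>As a rough illustration of the structs part of that idea,
            here is a minimal sketch of a C struct of function pointers
            acting as a namespace. The names <code>text_ns</code>,
            <code>text</code>, <code>text_len</code> and
            <code>text_eq</code> are made up for this example; nothing
            like this exists in the C code for this project.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;stddef.h&gt;
#include &lt;string.h&gt;

/* Hypothetical helpers; static keeps them out of the global namespace. */
static size_t text_len(const char *s)
{
    return strlen(s);
}

static int text_eq(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* The struct type plays the role of the namespace... */
struct text_ns {
    size_t (*len)(const char *s);
    int (*eq)(const char *a, const char *b);
};

/* ...and a single const instance is the only name callers need. */
static const struct text_ns text = {
    .len = text_len,
    .eq = text_eq,
};</code></pre></div>
            <p>A caller then writes <code>text.len("GET")</code> or
            <code>text.eq(a, b)</code>, which at least shows where each
            function came from. Unlike the Zig version, every call goes
            through a function pointer unless the compiler manages to
            see through the constant.</p>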
            <h1 id="main">Main</h1>
            <h3 id="zig-1">Zig</h3>
            <div class="sourceCode" id="cb3"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> main() <span class="dt">anyerror</span><span class="op">!</span><span class="dt">void</span> {</span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a>    <span class="at">var</span> args <span class="op">=</span> std<span class="op">.</span>process<span class="op">.</span>args();</span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a>    <span class="at">const</span> exe_name <span class="op">=</span> args<span class="op">.</span>next() <span class="kw">orelse</span> <span class="st">&quot;zelf-zerve&quot;</span>;</span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a>    <span class="at">const</span> public_path <span class="op">=</span> args<span class="op">.</span>next() <span class="kw">orelse</span> {</span>
<span id="cb3-5"><a href="#cb3-5" tabindex="-1"></a>        std<span class="op">.</span>log<span class="op">.</span>err(<span class="st">&quot;Usage: {s} &lt;dir to serve files from&gt;&quot;</span><span class="op">,</span> <span class="op">.</span>{exe_name});</span>
<span id="cb3-6"><a href="#cb3-6" tabindex="-1"></a>        <span class="cf">return</span>;</span>
<span id="cb3-7"><a href="#cb3-7" tabindex="-1"></a>    };</span>
<span id="cb3-8"><a href="#cb3-8" tabindex="-1"></a></span>
<span id="cb3-9"><a href="#cb3-9" tabindex="-1"></a>    <span class="at">var</span> dir <span class="op">=</span> <span class="cf">try</span> fs<span class="op">.</span>cwd()<span class="op">.</span>openDir(public_path<span class="op">,</span> <span class="op">.</span>{});</span>
<span id="cb3-10"><a href="#cb3-10" tabindex="-1"></a>    <span class="at">const</span> self_addr <span class="op">=</span> <span class="cf">try</span> net<span class="op">.</span>Address<span class="op">.</span>resolveIp(<span class="st">&quot;127.0.0.1&quot;</span><span class="op">,</span> <span class="dv">9000</span>);</span>
<span id="cb3-11"><a href="#cb3-11" tabindex="-1"></a>    <span class="at">var</span> listener <span class="op">=</span> net<span class="op">.</span>StreamServer<span class="op">.</span>init(<span class="op">.</span>{});</span>
<span id="cb3-12"><a href="#cb3-12" tabindex="-1"></a>    <span class="cf">try</span> (<span class="op">&amp;</span>listener)<span class="op">.</span>listen(self_addr);</span>
<span id="cb3-13"><a href="#cb3-13" tabindex="-1"></a></span>
<span id="cb3-14"><a href="#cb3-14" tabindex="-1"></a>    std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;Listening on {}; press Ctrl-C to exit...&quot;</span><span class="op">,</span> <span class="op">.</span>{self_addr});</span>
<span id="cb3-15"><a href="#cb3-15" tabindex="-1"></a></span>
<span id="cb3-16"><a href="#cb3-16" tabindex="-1"></a>    <span class="cf">while</span> ((<span class="op">&amp;</span>listener)<span class="op">.</span>accept()) <span class="op">|</span>conn<span class="op">|</span> {</span>
<span id="cb3-17"><a href="#cb3-17" tabindex="-1"></a>        std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;Accepted Connection from: {}&quot;</span><span class="op">,</span> <span class="op">.</span>{conn<span class="op">.</span>address});</span>
<span id="cb3-18"><a href="#cb3-18" tabindex="-1"></a></span>
<span id="cb3-19"><a href="#cb3-19" tabindex="-1"></a>        serveFile(<span class="op">&amp;</span>conn<span class="op">.</span>stream<span class="op">,</span> dir) <span class="cf">catch</span> <span class="op">|</span>err<span class="op">|</span> {</span>
<span id="cb3-20"><a href="#cb3-20" tabindex="-1"></a>            <span class="cf">if</span> (<span class="bu">@errorReturnTrace</span>()) <span class="op">|</span>bt<span class="op">|</span> {</span>
<span id="cb3-21"><a href="#cb3-21" tabindex="-1"></a>                std<span class="op">.</span>log<span class="op">.</span>err(<span class="st">&quot;Failed to serve client: {}: {}&quot;</span><span class="op">,</span> <span class="op">.</span>{err<span class="op">,</span> bt});</span>
<span id="cb3-22"><a href="#cb3-22" tabindex="-1"></a>            } <span class="cf">else</span> {</span>
<span id="cb3-23"><a href="#cb3-23" tabindex="-1"></a>                std<span class="op">.</span>log<span class="op">.</span>err(<span class="st">&quot;Failed to serve client: {}&quot;</span><span class="op">,</span> <span class="op">.</span>{err});</span>
<span id="cb3-24"><a href="#cb3-24" tabindex="-1"></a>            }</span>
<span id="cb3-25"><a href="#cb3-25" tabindex="-1"></a>        };</span>
<span id="cb3-26"><a href="#cb3-26" tabindex="-1"></a></span>
<span id="cb3-27"><a href="#cb3-27" tabindex="-1"></a>        conn<span class="op">.</span>stream<span class="op">.</span>close();</span>
<span id="cb3-28"><a href="#cb3-28" tabindex="-1"></a>    } <span class="cf">else</span> <span class="op">|</span>err<span class="op">|</span> {</span>
<span id="cb3-29"><a href="#cb3-29" tabindex="-1"></a>        <span class="cf">return</span> err;</span>
<span id="cb3-30"><a href="#cb3-30" tabindex="-1"></a>    }</span>
<span id="cb3-31"><a href="#cb3-31" tabindex="-1"></a>}</span></code></pre></div>
            <h3 id="c-1">C</h3>
            <div class="sourceCode" id="cb4"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> argc<span class="op">,</span> <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> argv<span class="op">[])</span></span>
<span id="cb4-2"><a href="#cb4-2" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb4-3"><a href="#cb4-3" tabindex="-1"></a>    <span class="dt">const</span> pid_t orig_parent <span class="op">=</span> getppid<span class="op">();</span></span>
<span id="cb4-4"><a href="#cb4-4" tabindex="-1"></a>    <span class="dt">const</span> <span class="kw">struct</span> sockaddr_in self_addr <span class="op">=</span> <span class="op">{</span></span>
<span id="cb4-5"><a href="#cb4-5" tabindex="-1"></a>        <span class="op">.</span>sin_family <span class="op">=</span> AF_INET<span class="op">,</span></span>
<span id="cb4-6"><a href="#cb4-6" tabindex="-1"></a>        <span class="op">.</span>sin_port <span class="op">=</span> htons<span class="op">(</span><span class="dv">9000</span><span class="op">),</span></span>
<span id="cb4-7"><a href="#cb4-7" tabindex="-1"></a>        <span class="op">.</span>sin_addr <span class="op">=</span> <span class="op">{</span></span>
<span id="cb4-8"><a href="#cb4-8" tabindex="-1"></a>            htonl<span class="op">(</span>INADDR_LOOPBACK<span class="op">)</span></span>
<span id="cb4-9"><a href="#cb4-9" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb4-10"><a href="#cb4-10" tabindex="-1"></a>    <span class="op">};</span></span>
<span id="cb4-11"><a href="#cb4-11" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> listen_sk <span class="op">=</span> socket<span class="op">(</span>AF_INET<span class="op">,</span> SOCK_STREAM<span class="op">,</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb4-12"><a href="#cb4-12" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">int</span> public_dir <span class="op">=</span> open<span class="op">(</span>argv<span class="op">[</span><span class="dv">1</span><span class="op">],</span> O_PATH<span class="op">);</span></span>
<span id="cb4-13"><a href="#cb4-13" tabindex="-1"></a>    <span class="kw">struct</span> sockaddr client_addr<span class="op">;</span></span>
<span id="cb4-14"><a href="#cb4-14" tabindex="-1"></a>    socklen_t addr_len<span class="op">;</span></span>
<span id="cb4-15"><a href="#cb4-15" tabindex="-1"></a></span>
<span id="cb4-16"><a href="#cb4-16" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>argc <span class="op">&lt;</span> <span class="dv">2</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb4-17"><a href="#cb4-17" tabindex="-1"></a>        dprintf<span class="op">(</span>STDERR_FILENO<span class="op">,</span></span>
<span id="cb4-18"><a href="#cb4-18" tabindex="-1"></a>            <span class="st">&quot;usage: </span><span class="sc">%s</span><span class="st"> &lt;dir to serve files from&gt;</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">,</span></span>
<span id="cb4-19"><a href="#cb4-19" tabindex="-1"></a>            argv<span class="op">[</span><span class="dv">0</span><span class="op">]);</span></span>
<span id="cb4-20"><a href="#cb4-20" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb4-21"><a href="#cb4-21" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb4-22"><a href="#cb4-22" tabindex="-1"></a></span>
<span id="cb4-23"><a href="#cb4-23" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>bind<span class="op">(</span>listen_sk<span class="op">,</span> <span class="op">(</span><span class="kw">struct</span> sockaddr <span class="op">*)&amp;</span>self_addr<span class="op">,</span> <span class="kw">sizeof</span><span class="op">(</span>self_addr<span class="op">)))</span> <span class="op">{</span></span>
<span id="cb4-24"><a href="#cb4-24" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;bind&quot;</span><span class="op">);</span></span>
<span id="cb4-25"><a href="#cb4-25" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb4-26"><a href="#cb4-26" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb4-27"><a href="#cb4-27" tabindex="-1"></a></span>
<span id="cb4-28"><a href="#cb4-28" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>listen<span class="op">(</span>listen_sk<span class="op">,</span> <span class="dv">8</span><span class="op">))</span> <span class="op">{</span></span>
<span id="cb4-29"><a href="#cb4-29" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;listen&quot;</span><span class="op">);</span></span>
<span id="cb4-30"><a href="#cb4-30" tabindex="-1"></a>        <span class="cf">return</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb4-31"><a href="#cb4-31" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb4-32"><a href="#cb4-32" tabindex="-1"></a></span>
<span id="cb4-33"><a href="#cb4-33" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;[+] Listening; press Ctrl-C to exit...</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb4-34"><a href="#cb4-34" tabindex="-1"></a></span>
<span id="cb4-35"><a href="#cb4-35" tabindex="-1"></a>    <span class="cf">while</span> <span class="op">(</span>orig_parent <span class="op">==</span> getppid<span class="op">())</span> <span class="op">{</span></span>
<span id="cb4-36"><a href="#cb4-36" tabindex="-1"></a>        <span class="dt">const</span> <span class="dt">int</span> sk <span class="op">=</span> accept<span class="op">(</span>listen_sk<span class="op">,</span> <span class="op">&amp;</span>client_addr<span class="op">,</span> <span class="op">&amp;</span>addr_len<span class="op">);</span></span>
<span id="cb4-37"><a href="#cb4-37" tabindex="-1"></a></span>
<span id="cb4-38"><a href="#cb4-38" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>sk <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb4-39"><a href="#cb4-39" tabindex="-1"></a>            perror<span class="op">(</span><span class="st">&quot;[-] accept&quot;</span><span class="op">);</span></span>
<span id="cb4-40"><a href="#cb4-40" tabindex="-1"></a>            <span class="cf">break</span><span class="op">;</span></span>
<span id="cb4-41"><a href="#cb4-41" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb4-42"><a href="#cb4-42" tabindex="-1"></a></span>
<span id="cb4-43"><a href="#cb4-43" tabindex="-1"></a>        printf<span class="op">(</span><span class="st">&quot;[+] Accepted Connection</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb4-44"><a href="#cb4-44" tabindex="-1"></a></span>
<span id="cb4-45"><a href="#cb4-45" tabindex="-1"></a>        serve_file<span class="op">(</span>sk<span class="op">,</span> public_dir<span class="op">);</span></span>
<span id="cb4-46"><a href="#cb4-46" tabindex="-1"></a>        close<span class="op">(</span>sk<span class="op">);</span></span>
<span id="cb4-47"><a href="#cb4-47" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb4-48"><a href="#cb4-48" tabindex="-1"></a></span>
<span id="cb4-49"><a href="#cb4-49" tabindex="-1"></a>    <span class="cf">return</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb4-50"><a href="#cb4-50" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
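            <p>If you want to access <code>argv</code> in Zig, then you
            usually create an iterator around it. You can of course
            access it directly, but this is more error prone. You can
            see in the C code that I am accessing <code>argv[1]</code>
            before checking <code>argc</code>. With no arguments
            <code>argv[1]</code> is just the terminating
            <code>NULL</code> and <code>open</code> fails. However, if
            the program is started with an empty <code>argv</code>, then
            <code>argv[1]</code> is past the end of the array, which on
            Linux is where the environment pointers begin. The result is
            that it could try opening a path descriptor from an
            environment variable or something along these lines.</p>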
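            <p>For comparison, here is a minimal sketch of doing the
            check first in C. The helper names
            <code>public_path_arg</code> and
            <code>open_public_dir</code> are hypothetical, and I am
            assuming <code>O_RDONLY | O_DIRECTORY</code> in place of the
            Linux-specific <code>O_PATH</code> so that the sketch does
            not depend on <code>_GNU_SOURCE</code>.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;fcntl.h&gt;
#include &lt;stddef.h&gt;

/* Hypothetical helper: returns the directory argument or NULL if absent. */
static const char *public_path_arg(const int argc, const char *const argv[])
{
    if (argc &lt; 2)
        return NULL;

    return argv[1];
}

/* Only touches argv[1] once argc has been checked. Returns -1 on failure. */
static int open_public_dir(const int argc, const char *const argv[])
{
    const char *const path = public_path_arg(argc, argv);

    if (!path)
        return -1;

    /* Assumption: a plain directory descriptor instead of O_PATH */
    return open(path, O_RDONLY | O_DIRECTORY);
}</code></pre></div>
            <p>This is roughly what the Zig iterator and
            <code>orelse</code> give us, except that in C nothing stops
            me from skipping the helper and indexing <code>argv</code>
            directly.</p>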
            <p>For whatever reason Zig does not include args in main’s
            arguments. I can’t say this makes any difference to me. The
            Zig return value is <code>void</code> or an error code. If
            an error code is returned from main then Zig prints it. If
            debugging info is available then Zig also prints an
            <em>error return trace</em>. This is not to be confused with
            a <em>back trace</em>.</p>
            <p>The way that Zig handles errors has a very significant
            impact on this program. Most calls to functions which can
            return an error are prefixed with <code>try</code>. If an
            error is returned then <code>try</code> acts like
            <code>return</code> and propagates the error to the caller.
            Otherwise it evaluates to the successful result, so it can
            be used inside an expression.</p>
            <p>There is also <code>catch</code> which can be used in
            various places to branch on an error. Other things like
            <code>while</code> can handle errors as well. You can see at
            the bottom of <code>main</code> that the loop has an
            <code>else</code> clause, which captures the error when
            <code>accept()</code> fails.</p>
            <p>In C we just use <code>if</code> statements and you can
            see I am ignoring some errors. My guess is that it is
            possible to implement error return traces in C and something
            similar to <code>try</code> using various types of magic.
            However I haven’t seen it done, so this is a win for
            Zig.</p>
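            <p>To give an idea of the kind of magic I mean, here is a
            minimal sketch of a <code>try</code>-like macro in C. The
            <code>TRY</code> macro and the <code>step_ok</code>,
            <code>step_fail</code> and <code>setup</code> functions are
            all made up for illustration. It assumes every function
            returns a negative errno-style code on failure, and it does
            nothing about error return traces.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;errno.h&gt;

/*
 * Hypothetical TRY macro: evaluate a call which returns a negative
 * errno-style code on failure and return that code from the enclosing
 * function. Only works in functions which themselves return int.
 */
#define TRY(expr) do {                  \
        const int try_ret_ = (expr);    \
        if (try_ret_ &lt; 0)               \
            return try_ret_;            \
    } while (0)

/* Made up steps standing in for bind(), listen() and so on. */
static int step_ok(void)
{
    return 0;
}

static int step_fail(void)
{
    return -EINVAL;
}

/* Propagates the first failure to the caller, otherwise returns 0. */
static int setup(const int should_fail)
{
    TRY(step_ok());

    if (should_fail)
        TRY(step_fail());

    return 0;
}</code></pre></div>
            <p>Here <code>setup(0)</code> returns <code>0</code> and
            <code>setup(1)</code> returns <code>-EINVAL</code>. Unlike
            Zig’s <code>try</code>, this macro cannot yield a value in
            standard C and the compiler does not force anyone to use
            it.</p>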
            <p>The way that the <code>while</code> loop captures the
            connection variable <code>|conn|</code> is a big win. Also
            note the <code>orelse</code> which specifically handles a
            <code>null</code> result. The type system forces us to check
            that something is not <code>null</code> or an error before
            we try using it. This mitigates a category of bugs and then
            Zig also provides some syntax to avoid having
            <code>if</code>s all over the place (or if you have used
            Rust then… well, you know).</p>
            <p>Variables in Zig must be declared with either
            <code>const</code> or <code>var</code>. What is more, if a
            variable can be <code>const</code> it must be. By default,
            in C everything is mutable. I also haven’t found a way to
            warn when a variable could be const. Again it should be
            possible to implement for C, but for now Zig wins here. Zig
            can also infer the type of a variable most of the time. This
            is obviously a good thing in some situations, but here it
            may just leave a reader wondering what types the variables
            are.</p>
            <p>Let’s ignore the address declaration in C; I could have
            done that differently. So moving on.</p>
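            <p>For the curious, one different way would be to build the
            address from a string and a port, much like the Zig
            <code>resolveIp</code> call does. This is only a sketch;
            <code>resolve_ip4</code> is a hypothetical helper and is not
            in the actual C code.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c">#include &lt;arpa/inet.h&gt;
#include &lt;netinet/in.h&gt;
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

/*
 * Hypothetical helper: fill addr from a dotted quad string and a port.
 * Returns 0 on success and -1 if ip is not a valid IPv4 address.
 */
static int resolve_ip4(struct sockaddr_in *const addr,
                       const char *const ip, const uint16_t port)
{
    memset(addr, 0, sizeof(*addr));
    addr-&gt;sin_family = AF_INET;
    addr-&gt;sin_port = htons(port);

    /* inet_pton returns 1 on success and 0 for a malformed address */
    if (inet_pton(AF_INET, ip, &amp;addr-&gt;sin_addr) != 1)
        return -1;

    return 0;
}</code></pre></div>
            <p>Calling <code>resolve_ip4(&amp;self_addr, "127.0.0.1",
            9000)</code> fills in the same loopback address as the
            initializer in <code>main</code>, at the cost of
            <code>self_addr</code> no longer being
            <code>const</code>.</p>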
            <h1 id="receiving">Receiving</h1>
            <h3 id="zig-2">Zig</h3>
            <div class="sourceCode" id="cb5"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="at">const</span> ServeFileError <span class="op">=</span> <span class="kw">error</span> {</span>
<span id="cb5-2"><a href="#cb5-2" tabindex="-1"></a>    RecvHeaderEOF<span class="op">,</span></span>
<span id="cb5-3"><a href="#cb5-3" tabindex="-1"></a>    RecvHeaderExceededBuffer<span class="op">,</span></span>
<span id="cb5-4"><a href="#cb5-4" tabindex="-1"></a>    HeaderDidNotMatch<span class="op">,</span></span>
<span id="cb5-5"><a href="#cb5-5" tabindex="-1"></a>};</span>
<span id="cb5-6"><a href="#cb5-6" tabindex="-1"></a></span>
<span id="cb5-7"><a href="#cb5-7" tabindex="-1"></a><span class="kw">fn</span> serveFile(stream<span class="op">:</span> <span class="op">*</span><span class="at">const</span> net<span class="op">.</span>Stream<span class="op">,</span> dir<span class="op">:</span> fs<span class="op">.</span>Dir) <span class="op">!</span><span class="dt">void</span> {</span>
<span id="cb5-8"><a href="#cb5-8" tabindex="-1"></a>    <span class="at">var</span> recv_buf<span class="op">:</span> [BUFSIZ]<span class="dt">u8</span> <span class="op">=</span> <span class="cn">undefined</span>;</span>
<span id="cb5-9"><a href="#cb5-9" tabindex="-1"></a>    <span class="at">var</span> recv_total<span class="op">:</span> <span class="dt">usize</span> <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb5-10"><a href="#cb5-10" tabindex="-1"></a></span>
<span id="cb5-11"><a href="#cb5-11" tabindex="-1"></a>    <span class="cf">while</span> (stream<span class="op">.</span>read(recv_buf[recv_total<span class="op">..</span>])) <span class="op">|</span>recv_len<span class="op">|</span> {</span>
<span id="cb5-12"><a href="#cb5-12" tabindex="-1"></a>        <span class="cf">if</span> (recv_len <span class="op">==</span> <span class="dv">0</span>)</span>
<span id="cb5-13"><a href="#cb5-13" tabindex="-1"></a>            <span class="cf">return</span> ServeFileError<span class="op">.</span>RecvHeaderEOF;</span>
<span id="cb5-14"><a href="#cb5-14" tabindex="-1"></a></span>
<span id="cb5-15"><a href="#cb5-15" tabindex="-1"></a>        recv_total <span class="op">+=</span> recv_len;</span>
<span id="cb5-16"><a href="#cb5-16" tabindex="-1"></a></span>
<span id="cb5-17"><a href="#cb5-17" tabindex="-1"></a>        <span class="cf">if</span> (mem<span class="op">.</span>containsAtLeast(<span class="dt">u8</span><span class="op">,</span> recv_buf[<span class="dv">0</span><span class="op">..</span>recv_total]<span class="op">,</span> <span class="dv">1</span><span class="op">,</span> <span class="st">&quot;</span><span class="sc">\r\n\r\n</span><span class="st">&quot;</span>))</span>
<span id="cb5-18"><a href="#cb5-18" tabindex="-1"></a>            <span class="cf">break</span>;</span>
<span id="cb5-19"><a href="#cb5-19" tabindex="-1"></a></span>
<span id="cb5-20"><a href="#cb5-20" tabindex="-1"></a>        <span class="cf">if</span> (recv_total <span class="op">&gt;=</span> recv_buf<span class="op">.</span>len)</span>
<span id="cb5-21"><a href="#cb5-21" tabindex="-1"></a>            <span class="cf">return</span> ServeFileError<span class="op">.</span>RecvHeaderExceededBuffer;</span>
<span id="cb5-22"><a href="#cb5-22" tabindex="-1"></a>    } <span class="cf">else</span> <span class="op">|</span>read_err<span class="op">|</span> {</span>
<span id="cb5-23"><a href="#cb5-23" tabindex="-1"></a>        <span class="cf">return</span> read_err;</span>
<span id="cb5-24"><a href="#cb5-24" tabindex="-1"></a>    }</span>
<span id="cb5-25"><a href="#cb5-25" tabindex="-1"></a></span>
<span id="cb5-26"><a href="#cb5-26" tabindex="-1"></a>    <span class="at">const</span> recv_slice <span class="op">=</span> recv_buf[<span class="dv">0</span><span class="op">..</span>recv_total];</span>
<span id="cb5-27"><a href="#cb5-27" tabindex="-1"></a>    std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot; &lt;&lt;&lt;</span><span class="sc">\n</span><span class="st">{s}&quot;</span><span class="op">,</span> <span class="op">.</span>{recv_slice});</span>
<span id="cb5-28"><a href="#cb5-28" tabindex="-1"></a></span>
<span id="cb5-29"><a href="#cb5-29" tabindex="-1"></a>    <span class="op">...</span></span></code></pre></div>
            <h3 id="c-2">C</h3>
            <div class="sourceCode" id="cb6"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb6-1"><a href="#cb6-1" tabindex="-1"></a><span class="dt">static</span> <span class="dt">void</span> serve_file<span class="op">(</span><span class="dt">const</span> <span class="dt">int</span> sk<span class="op">,</span> <span class="dt">const</span> <span class="dt">int</span> public_dir<span class="op">)</span></span>
<span id="cb6-2"><a href="#cb6-2" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb6-3"><a href="#cb6-3" tabindex="-1"></a>    <span class="dt">char</span> recv_buf<span class="op">[</span>BUFSIZ<span class="op">];</span></span>
<span id="cb6-4"><a href="#cb6-4" tabindex="-1"></a>    <span class="dt">char</span> head_buf<span class="op">[</span>BUFSIZ<span class="op">];</span></span>
<span id="cb6-5"><a href="#cb6-5" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">size_t</span> buf_len <span class="op">=</span> BUFSIZ <span class="op">-</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb6-6"><a href="#cb6-6" tabindex="-1"></a>    <span class="dt">char</span> path_buf<span class="op">[</span><span class="dv">256</span><span class="op">];</span></span>
<span id="cb6-7"><a href="#cb6-7" tabindex="-1"></a>    <span class="dt">char</span> <span class="op">*</span>file_path<span class="op">;</span></span>
<span id="cb6-8"><a href="#cb6-8" tabindex="-1"></a>    <span class="dt">ssize_t</span> recv<span class="op">,</span> sent<span class="op">;</span></span>
<span id="cb6-9"><a href="#cb6-9" tabindex="-1"></a>    <span class="dt">size_t</span> recv_total <span class="op">=</span> <span class="dv">0</span><span class="op">,</span> sent_total <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb6-10"><a href="#cb6-10" tabindex="-1"></a>    <span class="dt">int</span> body_fd<span class="op">;</span></span>
<span id="cb6-11"><a href="#cb6-11" tabindex="-1"></a></span>
<span id="cb6-12"><a href="#cb6-12" tabindex="-1"></a>    <span class="cf">while</span> <span class="op">(</span><span class="dv">1</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb6-13"><a href="#cb6-13" tabindex="-1"></a>        recv <span class="op">=</span> read<span class="op">(</span>sk<span class="op">,</span></span>
<span id="cb6-14"><a href="#cb6-14" tabindex="-1"></a>                recv_buf <span class="op">+</span> recv_total<span class="op">,</span></span>
<span id="cb6-15"><a href="#cb6-15" tabindex="-1"></a>                buf_len <span class="op">-</span> recv_total<span class="op">);</span></span>
<span id="cb6-16"><a href="#cb6-16" tabindex="-1"></a></span>
<span id="cb6-17"><a href="#cb6-17" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>recv <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb6-18"><a href="#cb6-18" tabindex="-1"></a>            perror<span class="op">(</span><span class="st">&quot;[-] read&quot;</span><span class="op">);</span></span>
<span id="cb6-19"><a href="#cb6-19" tabindex="-1"></a>            <span class="cf">return</span><span class="op">;</span></span>
<span id="cb6-20"><a href="#cb6-20" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb6-21"><a href="#cb6-21" tabindex="-1"></a></span>
<span id="cb6-22"><a href="#cb6-22" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(!</span>recv<span class="op">)</span> <span class="op">{</span></span>
<span id="cb6-23"><a href="#cb6-23" tabindex="-1"></a>            dprintf<span class="op">(</span>STDERR_FILENO<span class="op">,</span></span>
<span id="cb6-24"><a href="#cb6-24" tabindex="-1"></a>                <span class="st">&quot;[-] End of data before header was received</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb6-25"><a href="#cb6-25" tabindex="-1"></a>            <span class="cf">return</span><span class="op">;</span></span>
<span id="cb6-26"><a href="#cb6-26" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb6-27"><a href="#cb6-27" tabindex="-1"></a></span>
<span id="cb6-28"><a href="#cb6-28" tabindex="-1"></a>        recv_total <span class="op">+=</span> recv<span class="op">;</span></span>
<span id="cb6-29"><a href="#cb6-29" tabindex="-1"></a>        recv_buf<span class="op">[</span>recv_total<span class="op">]</span> <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
<span id="cb6-30"><a href="#cb6-30" tabindex="-1"></a></span>
<span id="cb6-31"><a href="#cb6-31" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>recv_buf<span class="op">,</span> <span class="st">&quot;</span><span class="sc">\r\n\r\n</span><span class="st">&quot;</span><span class="op">))</span></span>
<span id="cb6-32"><a href="#cb6-32" tabindex="-1"></a>            <span class="cf">break</span><span class="op">;</span></span>
<span id="cb6-33"><a href="#cb6-33" tabindex="-1"></a></span>
<span id="cb6-34"><a href="#cb6-34" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>recv_total <span class="op">&gt;=</span> buf_len<span class="op">)</span> <span class="op">{</span></span>
<span id="cb6-35"><a href="#cb6-35" tabindex="-1"></a>            dprintf<span class="op">(</span>STDERR_FILENO<span class="op">,</span></span>
<span id="cb6-36"><a href="#cb6-36" tabindex="-1"></a>                <span class="st">&quot;Exceeded buffer reading header</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb6-37"><a href="#cb6-37" tabindex="-1"></a>            <span class="cf">return</span><span class="op">;</span></span>
<span id="cb6-38"><a href="#cb6-38" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb6-39"><a href="#cb6-39" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb6-40"><a href="#cb6-40" tabindex="-1"></a></span>
<span id="cb6-41"><a href="#cb6-41" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;[*] &lt;&lt;&lt;</span><span class="sc">\n%s\n</span><span class="st">&quot;</span><span class="op">,</span> recv_buf<span class="op">);</span></span>
<span id="cb6-42"><a href="#cb6-42" tabindex="-1"></a>    <span class="op">...</span></span></code></pre></div>
            <p>When we have a connection the first thing we do is
            receive the header. It’s expected that the entire header
            will be received in a single read most of the time. This web
            server is only for local usage after all. However,
            occasionally this won’t happen: a read can return early with
            only part of the data, for example when it is interrupted
            or when the request arrives in more than one piece. So we
            need a loop.</p>
            <p>It’s difficult to know where to start here. I guess the
            weirdest thing about the Zig code is that the while has a
            <code>|recv_len|</code> capture and an <code>else</code>
            clause. The while loop here is saying “while
            <code>read</code> is not an error then… else if it is an
            error…”. The symbol enclosed in pipes (<code>|</code>)
            captures the return value or, in the <code>else</code>
            clause, the error.</p>
            <p>The call to <code>read</code> is the first thing we do
            and we want to break out of the loop if it goes wrong. In
            the C code I use a <code>while(1)</code> loop for the same
            reason; there is nothing to check before we do the read. If
            the Zig code provides any concrete advantage over C it is
            that it forces error checking. On top of that, Zig gives
            you a minimal effort way of debugging errors, because an
            error that propagates upwards can be reported together with
            the place it came from.</p>
            <p>If I were to just return the <code>errno</code> from
            <code>serve_file</code> in C then I wouldn’t know exactly
            where an error came from. That is unless I use an outside
            tool like <code>strace</code> to see which system call
            caused an error (if any). So ignoring outside tracing
            methods, Zig gets another win here.</p>
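            <p>To be a little fairer to C, here is a minimal sketch of
            how one might record where an error came from without
            outside tools. It is not part of the server above;
            <code>check_sys</code>, <code>CHECK_SYS</code>,
            <code>last_err_line</code> and <code>read_from_bad_fd</code>
            are hypothetical names made up for this illustration.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical: line number of the most recent failed call, 0 if none. */
static int last_err_line;

/* Pass the return value of a system call through unchanged, but log the
 * call site and errno text when the value is negative. */
static long check_sys(long ret, const char *expr, const char *file, int line)
{
    if (ret < 0) {
        int saved_errno = errno; /* fprintf may overwrite errno */

        last_err_line = line;
        fprintf(stderr, "[-] %s:%d: %s: %s\n",
            file, line, expr, strerror(saved_errno));
        errno = saved_errno;
    }

    return ret;
}

/* #call turns the wrapped expression into a string for the log line. */
#define CHECK_SYS(call) check_sys((call), #call, __FILE__, __LINE__)

/* Example: reading from an invalid descriptor fails with EBADF and the
 * file and line of this call are written to stderr. */
static long read_from_bad_fd(void)
{
    char buf[8];

    return CHECK_SYS(read(-1, buf, sizeof(buf)));
}
```
]]></code></pre></div>
            <p>This only gives the one call site rather than the path
            the error took on its way up, and you have to remember to
            wrap every call, so I would still count this as a win for
            Zig.</p>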
            <p>Also here you can see Zig’s arrays and slices;
            <code>recv_buf[recv_total..]</code> means we begin reading
            into the buffer at an offset of <code>recv_total</code>.
            Also we don’t need to pass the buffer length separately
            because it is part of the slice struct. Nor do we need to
            calculate the remaining length. Hurray!</p>
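            <p>For readers coming from C, a slice can be pictured
            roughly as a pointer and a length travelling together. The
            following is only a sketch of that idea, not how Zig
            implements it; <code>struct byte_slice</code> and
            <code>slice_from</code> are hypothetical names.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C counterpart of a Zig []u8 slice: pointer plus length. */
struct byte_slice {
    unsigned char *ptr;
    size_t len;
};

/* Rough equivalent of s[start..]: skip `start` bytes and shrink the
 * length to match, so the caller never computes the remainder by hand. */
static struct byte_slice slice_from(struct byte_slice s, size_t start)
{
    /* Zig checks this bound for us in its safe build modes. */
    assert(start <= s.len);

    return (struct byte_slice){ s.ptr + start, s.len - start };
}
```
]]></code></pre></div>
            <p>With something like this, a call such as
            <code>slice_from(buf, recv_total)</code> would stand in for
            the separate <code>recv_buf + recv_total</code> and
            <code>buf_len - recv_total</code> arguments in the C
            version.</p>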
            <p>I suspect that Zig gets another win through slices for
            making it easy to avoid null terminated strings. Zig
            explicitly supports null terminated strings, but you don’t
            need them for the standard library’s string functions.</p>
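            <p>The C version relies on <code>strstr</code>, which is why
            it has to write a terminating zero after every read. As a
            sketch of what the length-based alternative could look like
            in C, here is a search for the end of the header that never
            looks for a terminator. The function name
            <code>find_header_end</code> is made up for this
            example.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Return the offset of the first "\r\n\r\n" within the first `len`
 * bytes of buf, or -1 if it is not there. No null terminator is read
 * or required. */
static long find_header_end(const char *buf, size_t len)
{
    static const char term[4] = { '\r', '\n', '\r', '\n' };
    size_t i;

    /* Stop while a full terminator still fits in the remaining bytes. */
    for (i = 0; i + sizeof(term) <= len; i++) {
        if (memcmp(buf + i, term, sizeof(term)) == 0)
            return (long)i;
    }

    return -1;
}
```
]]></code></pre></div>
            <p>It works, but it is exactly the sort of thing the Zig
            standard library already provides for slices.</p>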
            <h1 id="routing">Routing</h1>
            <h3 id="zig-3">Zig</h3>
            <div class="sourceCode" id="cb7"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb7-1"><a href="#cb7-1" tabindex="-1"></a>    <span class="at">var</span> file_path<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span> <span class="op">=</span> <span class="cn">undefined</span>;</span>
<span id="cb7-2"><a href="#cb7-2" tabindex="-1"></a>    <span class="at">var</span> tok_itr <span class="op">=</span> mem<span class="op">.</span>tokenize(<span class="dt">u8</span><span class="op">,</span> recv_slice<span class="op">,</span> <span class="st">&quot; &quot;</span>);</span>
<span id="cb7-3"><a href="#cb7-3" tabindex="-1"></a></span>
<span id="cb7-4"><a href="#cb7-4" tabindex="-1"></a>    <span class="cf">if</span> (<span class="op">!</span>mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> tok_itr<span class="op">.</span>next() <span class="kw">orelse</span> <span class="st">&quot;&quot;</span><span class="op">,</span> <span class="st">&quot;GET&quot;</span>))</span>
<span id="cb7-5"><a href="#cb7-5" tabindex="-1"></a>        <span class="cf">return</span> ServeFileError<span class="op">.</span>HeaderDidNotMatch;</span>
<span id="cb7-6"><a href="#cb7-6" tabindex="-1"></a></span>
<span id="cb7-7"><a href="#cb7-7" tabindex="-1"></a>    <span class="at">const</span> path <span class="op">=</span> tok_itr<span class="op">.</span>next() <span class="kw">orelse</span> <span class="st">&quot;&quot;</span>;</span>
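<span id="cb7-8"><a href="#cb7-8" tabindex="-1"></a>    <span class="cf">if</span> (path<span class="op">.</span>len <span class="op">==</span> <span class="dv">0</span> <span class="kw">or</span> path[<span class="dv">0</span>] <span class="op">!=</span> <span class="ch">&#39;/&#39;</span>)</span>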
<span id="cb7-9"><a href="#cb7-9" tabindex="-1"></a>        <span class="cf">return</span> ServeFileError<span class="op">.</span>HeaderDidNotMatch;</span>
<span id="cb7-10"><a href="#cb7-10" tabindex="-1"></a></span>
<span id="cb7-11"><a href="#cb7-11" tabindex="-1"></a>    <span class="cf">if</span> (mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> path<span class="op">,</span> <span class="st">&quot;/&quot;</span>))</span>
<span id="cb7-12"><a href="#cb7-12" tabindex="-1"></a>        file_path <span class="op">=</span> <span class="st">&quot;index&quot;</span></span>
<span id="cb7-13"><a href="#cb7-13" tabindex="-1"></a>    <span class="cf">else</span></span>
<span id="cb7-14"><a href="#cb7-14" tabindex="-1"></a>        file_path <span class="op">=</span> path[<span class="dv">1</span><span class="er">.</span><span class="op">.</span>];</span>
<span id="cb7-15"><a href="#cb7-15" tabindex="-1"></a></span>
<span id="cb7-16"><a href="#cb7-16" tabindex="-1"></a>    <span class="cf">if</span> (<span class="op">!</span>mem<span class="op">.</span>startsWith(<span class="dt">u8</span><span class="op">,</span> tok_itr<span class="op">.</span>rest()<span class="op">,</span> <span class="st">&quot;HTTP/1.1</span><span class="sc">\r\n</span><span class="st">&quot;</span>))</span>
<span id="cb7-17"><a href="#cb7-17" tabindex="-1"></a>        <span class="cf">return</span> ServeFileError<span class="op">.</span>HeaderDidNotMatch;</span>
<span id="cb7-18"><a href="#cb7-18" tabindex="-1"></a></span>
<span id="cb7-19"><a href="#cb7-19" tabindex="-1"></a>    <span class="at">var</span> file_ext <span class="op">=</span> fs<span class="op">.</span>path<span class="op">.</span>extension(file_path);</span>
<span id="cb7-20"><a href="#cb7-20" tabindex="-1"></a>    <span class="at">var</span> path_buf<span class="op">:</span> [fs<span class="op">.</span>MAX_PATH_BYTES]<span class="dt">u8</span> <span class="op">=</span> <span class="cn">undefined</span>;</span>
<span id="cb7-21"><a href="#cb7-21" tabindex="-1"></a></span>
<span id="cb7-22"><a href="#cb7-22" tabindex="-1"></a>    <span class="cf">if</span> (file_ext<span class="op">.</span>len <span class="op">==</span> <span class="dv">0</span>) {</span>
<span id="cb7-23"><a href="#cb7-23" tabindex="-1"></a>        <span class="at">var</span> path_fbs <span class="op">=</span> io<span class="op">.</span>fixedBufferStream(<span class="op">&amp;</span>path_buf);</span>
<span id="cb7-24"><a href="#cb7-24" tabindex="-1"></a></span>
<span id="cb7-25"><a href="#cb7-25" tabindex="-1"></a>        <span class="cf">try</span> path_fbs<span class="op">.</span>writer()<span class="op">.</span>print(<span class="st">&quot;{s}.html&quot;</span><span class="op">,</span> <span class="op">.</span>{file_path});</span>
<span id="cb7-26"><a href="#cb7-26" tabindex="-1"></a>        file_ext <span class="op">=</span> <span class="st">&quot;.html&quot;</span>;</span>
<span id="cb7-27"><a href="#cb7-27" tabindex="-1"></a>        file_path <span class="op">=</span> path_fbs<span class="op">.</span>getWritten();</span>
<span id="cb7-28"><a href="#cb7-28" tabindex="-1"></a>    }</span>
<span id="cb7-29"><a href="#cb7-29" tabindex="-1"></a></span>
<span id="cb7-30"><a href="#cb7-30" tabindex="-1"></a>    std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot;Opening {s}&quot;</span><span class="op">,</span> <span class="op">.</span>{file_path});</span>
<span id="cb7-31"><a href="#cb7-31" tabindex="-1"></a></span>
<span id="cb7-32"><a href="#cb7-32" tabindex="-1"></a>    <span class="at">var</span> body_file <span class="op">=</span> <span class="cf">try</span> dir<span class="op">.</span>openFile(file_path<span class="op">,</span> <span class="op">.</span>{});</span>
<span id="cb7-33"><a href="#cb7-33" tabindex="-1"></a>    <span class="cf">defer</span> body_file<span class="op">.</span>close();</span>
<span id="cb7-34"><a href="#cb7-34" tabindex="-1"></a></span>
<span id="cb7-35"><a href="#cb7-35" tabindex="-1"></a>    <span class="at">const</span> file_len <span class="op">=</span> <span class="cf">try</span> body_file<span class="op">.</span>getEndPos();</span></code></pre></div>
            <h3 id="c-3">C</h3>
            <div class="sourceCode" id="cb8"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb8-1"><a href="#cb8-1" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>sscanf<span class="op">(</span>recv_buf<span class="op">,</span> <span class="st">&quot;GET </span><span class="sc">%250s</span><span class="st"> HTTP/1.1&quot;</span><span class="op">,</span> path_buf<span class="op">)</span> <span class="op">!=</span> <span class="dv">1</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb8-2"><a href="#cb8-2" tabindex="-1"></a>        dprintf<span class="op">(</span>STDERR_FILENO<span class="op">,</span></span>
<span id="cb8-3"><a href="#cb8-3" tabindex="-1"></a>            <span class="st">&quot;[-] &#39;GET &lt;file_path&gt; HTTP/1.1&#39; not matched in:</span><span class="sc">\n</span><span class="st"> </span><span class="sc">%s</span><span class="st">&quot;</span><span class="op">,</span> recv_buf<span class="op">);</span></span>
<span id="cb8-4"><a href="#cb8-4" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb8-5"><a href="#cb8-5" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb8-6"><a href="#cb8-6" tabindex="-1"></a></span>
<span id="cb8-7"><a href="#cb8-7" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>strcmp<span class="op">(</span><span class="st">&quot;/&quot;</span><span class="op">,</span> path_buf<span class="op">))</span> <span class="op">{</span></span>
<span id="cb8-8"><a href="#cb8-8" tabindex="-1"></a>        strcpy<span class="op">(</span>path_buf<span class="op">,</span> <span class="st">&quot;index.html&quot;</span><span class="op">);</span></span>
<span id="cb8-9"><a href="#cb8-9" tabindex="-1"></a>        file_path <span class="op">=</span> path_buf<span class="op">;</span></span>
<span id="cb8-10"><a href="#cb8-10" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="op">(</span>path_buf<span class="op">[</span><span class="dv">0</span><span class="op">]</span> <span class="op">==</span> <span class="ch">&#39;/&#39;</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb8-11"><a href="#cb8-11" tabindex="-1"></a>        file_path <span class="op">=</span> path_buf <span class="op">+</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb8-12"><a href="#cb8-12" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb8-13"><a href="#cb8-13" tabindex="-1"></a></span>
<span id="cb8-14"><a href="#cb8-14" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;[*] Opening </span><span class="sc">%s</span><span class="st">&quot;</span><span class="op">,</span> file_path<span class="op">);</span></span>
<span id="cb8-15"><a href="#cb8-15" tabindex="-1"></a>    body_fd <span class="op">=</span> openat<span class="op">(</span>public_dir<span class="op">,</span> file_path<span class="op">,</span> O_RDONLY<span class="op">);</span></span>
<span id="cb8-16"><a href="#cb8-16" tabindex="-1"></a></span>
<span id="cb8-17"><a href="#cb8-17" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>body_fd <span class="op">&lt;</span> <span class="dv">0</span> <span class="op">&amp;&amp;</span> errno <span class="op">==</span> ENOENT<span class="op">)</span> <span class="op">{</span></span>
<span id="cb8-18"><a href="#cb8-18" tabindex="-1"></a>        strcpy<span class="op">(</span>file_path <span class="op">+</span> strlen<span class="op">(</span>file_path<span class="op">),</span> <span class="st">&quot;.html&quot;</span><span class="op">);</span></span>
<span id="cb8-19"><a href="#cb8-19" tabindex="-1"></a>        body_fd <span class="op">=</span> openat<span class="op">(</span>public_dir<span class="op">,</span> file_path<span class="op">,</span> O_RDONLY<span class="op">);</span></span>
<span id="cb8-20"><a href="#cb8-20" tabindex="-1"></a>        printf<span class="op">(</span><span class="st">&quot; failed trying with .html&quot;</span><span class="op">);</span></span>
<span id="cb8-21"><a href="#cb8-21" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb8-22"><a href="#cb8-22" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;</span><span class="sc">\n</span><span class="st">&quot;</span><span class="op">);</span></span>
<span id="cb8-23"><a href="#cb8-23" tabindex="-1"></a></span>
<span id="cb8-24"><a href="#cb8-24" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>body_fd <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb8-25"><a href="#cb8-25" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;[-] openat&quot;</span><span class="op">);</span></span>
<span id="cb8-26"><a href="#cb8-26" tabindex="-1"></a>        <span class="cf">return</span><span class="op">;</span></span>
<span id="cb8-27"><a href="#cb8-27" tabindex="-1"></a>    <span class="op">}</span></span></code></pre></div>
            <p>The Zig code is quite a bit longer because there is no
            <code>sscanf</code> equivalent in the Zig standard library.
            I’m not that confident about either the C or Zig code.
            However note the <code>defer body_file.close()</code> line.
            This saves having to do a <code>goto</code> or close the
            file at every early return thereafter.</p>
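            <p>One thing worth knowing about <code>sscanf</code> is that
            its return value only counts conversions, so literal text
            after the last conversion (here the <code>HTTP/1.1</code>
            part) is not reflected in it, and an empty input yields
            <code>EOF</code> rather than zero. The following is a
            sketch of a stricter check using <code>%n</code>; the helper
            name <code>parse_get</code> is hypothetical and the buffer
            size simply mirrors the <code>%250s</code> limit above.</p>
            <div class="sourceCode"><pre
            class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <stdio.h>
#include <string.h>

/* Copy the request path into `path`, which must hold at least 251
 * bytes. Returns 0 if the request line matched, -1 otherwise. */
static int parse_get(const char *req, char *path)
{
    int end = 0;

    /* %n stores the number of characters consumed so far. It is only
     * reached if everything before it matched, including the trailing
     * "HTTP/1.1" literal, and it does not count towards the return
     * value. */
    if (sscanf(req, "GET %250s HTTP/1.1%n", path, &end) != 1 || end == 0)
        return -1;

    return path[0] == '/' ? 0 : -1;
}
```
]]></code></pre></div>
            <p>Whether this is clearer than the tokenizer based Zig
            version is debatable, but it does reject a request line
            that has the wrong method or protocol.</p>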
            <h1 id="sending">Sending</h1>
            <h3 id="zig-4">Zig</h3>
            <div class="sourceCode" id="cb9"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb9-1"><a href="#cb9-1" tabindex="-1"></a>    <span class="at">const</span> http_head <span class="op">=</span></span>
<span id="cb9-2"><a href="#cb9-2" tabindex="-1"></a>        <span class="st">&quot;HTTP/1.1 200 OK</span><span class="sc">\r\n</span><span class="st">&quot;</span> <span class="op">++</span></span>
<span id="cb9-3"><a href="#cb9-3" tabindex="-1"></a>        <span class="st">&quot;Connection: close</span><span class="sc">\r\n</span><span class="st">&quot;</span> <span class="op">++</span></span>
<span id="cb9-4"><a href="#cb9-4" tabindex="-1"></a>        <span class="st">&quot;Content-Type: {s}</span><span class="sc">\r\n</span><span class="st">&quot;</span> <span class="op">++</span></span>
<span id="cb9-5"><a href="#cb9-5" tabindex="-1"></a>        <span class="st">&quot;Content-Length: {}</span><span class="sc">\r\n</span><span class="st">&quot;</span> <span class="op">++</span></span>
<span id="cb9-6"><a href="#cb9-6" tabindex="-1"></a>        <span class="st">&quot;</span><span class="sc">\r\n</span><span class="st">&quot;</span>;</span>
<span id="cb9-7"><a href="#cb9-7" tabindex="-1"></a>    <span class="at">const</span> mimes <span class="op">=</span> <span class="op">.</span>{</span>
<span id="cb9-8"><a href="#cb9-8" tabindex="-1"></a>        <span class="op">.</span>{<span class="st">&quot;.html&quot;</span><span class="op">,</span> <span class="st">&quot;text/html&quot;</span>}<span class="op">,</span></span>
<span id="cb9-9"><a href="#cb9-9" tabindex="-1"></a>        <span class="op">.</span>{<span class="st">&quot;.css&quot;</span><span class="op">,</span> <span class="st">&quot;text/css&quot;</span>}<span class="op">,</span></span>
<span id="cb9-10"><a href="#cb9-10" tabindex="-1"></a>        <span class="op">.</span>{<span class="st">&quot;.map&quot;</span><span class="op">,</span> <span class="st">&quot;application/json&quot;</span>}<span class="op">,</span></span>
<span id="cb9-11"><a href="#cb9-11" tabindex="-1"></a>        <span class="op">.</span>{<span class="st">&quot;.svg&quot;</span><span class="op">,</span> <span class="st">&quot;image/svg+xml&quot;</span>}<span class="op">,</span></span>
<span id="cb9-12"><a href="#cb9-12" tabindex="-1"></a>        <span class="op">.</span>{<span class="st">&quot;.jpg&quot;</span><span class="op">,</span> <span class="st">&quot;image/jpeg&quot;</span>}<span class="op">,</span></span>
<span id="cb9-13"><a href="#cb9-13" tabindex="-1"></a>        <span class="op">.</span>{<span class="st">&quot;.png&quot;</span><span class="op">,</span> <span class="st">&quot;image/png&quot;</span>}</span>
<span id="cb9-14"><a href="#cb9-14" tabindex="-1"></a>    };</span>
<span id="cb9-15"><a href="#cb9-15" tabindex="-1"></a>    <span class="at">var</span> mime<span class="op">:</span> []<span class="at">const</span> <span class="dt">u8</span> <span class="op">=</span> <span class="st">&quot;text/plain&quot;</span>;</span>
<span id="cb9-16"><a href="#cb9-16" tabindex="-1"></a></span>
<span id="cb9-17"><a href="#cb9-17" tabindex="-1"></a>    <span class="kw">inline</span> <span class="cf">for</span> (mimes) <span class="op">|</span>kv<span class="op">|</span> {</span>
<span id="cb9-18"><a href="#cb9-18" tabindex="-1"></a>        <span class="cf">if</span> (mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> file_ext<span class="op">,</span> kv[<span class="dv">0</span>]))</span>
<span id="cb9-19"><a href="#cb9-19" tabindex="-1"></a>            mime <span class="op">=</span> kv[<span class="dv">1</span>];</span>
<span id="cb9-20"><a href="#cb9-20" tabindex="-1"></a>    }</span>
<span id="cb9-21"><a href="#cb9-21" tabindex="-1"></a></span>
<span id="cb9-22"><a href="#cb9-22" tabindex="-1"></a>    std<span class="op">.</span>log<span class="op">.</span>info(<span class="st">&quot; &gt;&gt;&gt;</span><span class="sc">\n</span><span class="st">&quot;</span> <span class="op">++</span> http_head<span class="op">,</span> <span class="op">.</span>{mime<span class="op">,</span> file_len});</span>
<span id="cb9-23"><a href="#cb9-23" tabindex="-1"></a>    <span class="cf">try</span> stream<span class="op">.</span>writer()<span class="op">.</span>print(http_head<span class="op">,</span> <span class="op">.</span>{mime<span class="op">,</span> file_len});</span>
<span id="cb9-24"><a href="#cb9-24" tabindex="-1"></a></span>
<span id="cb9-25"><a href="#cb9-25" tabindex="-1"></a>    <span class="at">const</span> zero_iovec <span class="op">=</span> <span class="op">&amp;</span>[<span class="dv">0</span>]std<span class="op">.</span>os<span class="op">.</span>iovec_const{};</span>
<span id="cb9-26"><a href="#cb9-26" tabindex="-1"></a>    <span class="at">var</span> send_total<span class="op">:</span> <span class="dt">usize</span> <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb9-27"><a href="#cb9-27" tabindex="-1"></a></span>
<span id="cb9-28"><a href="#cb9-28" tabindex="-1"></a>    <span class="cf">while</span> (<span class="cn">true</span>) {</span>
<span id="cb9-29"><a href="#cb9-29" tabindex="-1"></a>        <span class="at">const</span> send_len <span class="op">=</span> <span class="cf">try</span> std<span class="op">.</span>os<span class="op">.</span>sendfile(</span>
<span id="cb9-30"><a href="#cb9-30" tabindex="-1"></a>            stream<span class="op">.</span>handle<span class="op">,</span></span>
<span id="cb9-31"><a href="#cb9-31" tabindex="-1"></a>            body_file<span class="op">.</span>handle<span class="op">,</span></span>
<span id="cb9-32"><a href="#cb9-32" tabindex="-1"></a>            send_total<span class="op">,</span></span>
<span id="cb9-33"><a href="#cb9-33" tabindex="-1"></a>            file_len<span class="op">,</span></span>
<span id="cb9-34"><a href="#cb9-34" tabindex="-1"></a>            zero_iovec<span class="op">,</span></span>
<span id="cb9-35"><a href="#cb9-35" tabindex="-1"></a>            zero_iovec<span class="op">,</span></span>
<span id="cb9-36"><a href="#cb9-36" tabindex="-1"></a>            <span class="dv">0</span></span>
<span id="cb9-37"><a href="#cb9-37" tabindex="-1"></a>        );</span>
<span id="cb9-38"><a href="#cb9-38" tabindex="-1"></a></span>
<span id="cb9-39"><a href="#cb9-39" tabindex="-1"></a>        <span class="cf">if</span> (send_len <span class="op">==</span> <span class="dv">0</span>)</span>
<span id="cb9-40"><a href="#cb9-40" tabindex="-1"></a>            <span class="cf">break</span>;</span>
<span id="cb9-41"><a href="#cb9-41" tabindex="-1"></a></span>
<span id="cb9-42"><a href="#cb9-42" tabindex="-1"></a>        send_total <span class="op">+=</span> send_len;</span>
<span id="cb9-43"><a href="#cb9-43" tabindex="-1"></a>    }</span>
<span id="cb9-44"><a href="#cb9-44" tabindex="-1"></a>}</span></code></pre></div>
            <h3>C</h3>
            <div class="sourceCode" id="cb10"><pre
            class="sourceCode c"><code class="sourceCode c"><span id="cb10-1"><a href="#cb10-1" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span><span class="dt">const</span> http_head <span class="op">=</span></span>
<span id="cb10-2"><a href="#cb10-2" tabindex="-1"></a>        <span class="st">&quot;HTTP/1.1 200 OK</span><span class="sc">\r\n</span><span class="st">&quot;</span></span>
<span id="cb10-3"><a href="#cb10-3" tabindex="-1"></a>        <span class="st">&quot;Connection: close</span><span class="sc">\r\n</span><span class="st">&quot;</span></span>
<span id="cb10-4"><a href="#cb10-4" tabindex="-1"></a>        <span class="st">&quot;Content-Type: </span><span class="sc">%s\r\n</span><span class="st">&quot;</span></span>
<span id="cb10-5"><a href="#cb10-5" tabindex="-1"></a>        <span class="st">&quot;Content-Length: </span><span class="sc">%lu\r\n</span><span class="st">&quot;</span></span>
<span id="cb10-6"><a href="#cb10-6" tabindex="-1"></a>        <span class="st">&quot;</span><span class="sc">\r\n</span><span class="st">&quot;</span><span class="op">;</span></span>
<span id="cb10-7"><a href="#cb10-7" tabindex="-1"></a>    <span class="dt">const</span> <span class="dt">char</span> <span class="op">*</span>mime <span class="op">=</span> <span class="st">&quot;text/html&quot;</span><span class="op">;</span></span>
<span id="cb10-8"><a href="#cb10-8" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>file_path<span class="op">,</span> <span class="st">&quot;.css&quot;</span><span class="op">))</span></span>
<span id="cb10-9"><a href="#cb10-9" tabindex="-1"></a>        mime <span class="op">=</span> <span class="st">&quot;text/css&quot;</span><span class="op">;</span></span>
<span id="cb10-10"><a href="#cb10-10" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>file_path<span class="op">,</span> <span class="st">&quot;.map&quot;</span><span class="op">))</span></span>
<span id="cb10-11"><a href="#cb10-11" tabindex="-1"></a>        mime <span class="op">=</span> <span class="st">&quot;application/json&quot;</span><span class="op">;</span></span>
<span id="cb10-12"><a href="#cb10-12" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>file_path<span class="op">,</span> <span class="st">&quot;.svg&quot;</span><span class="op">))</span></span>
<span id="cb10-13"><a href="#cb10-13" tabindex="-1"></a>        mime <span class="op">=</span> <span class="st">&quot;image/svg+xml&quot;</span><span class="op">;</span></span>
<span id="cb10-14"><a href="#cb10-14" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>file_path<span class="op">,</span> <span class="st">&quot;.jpg&quot;</span><span class="op">))</span></span>
<span id="cb10-15"><a href="#cb10-15" tabindex="-1"></a>        mime <span class="op">=</span> <span class="st">&quot;image/jpeg&quot;</span><span class="op">;</span></span>
<span id="cb10-16"><a href="#cb10-16" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>strstr<span class="op">(</span>file_path<span class="op">,</span> <span class="st">&quot;.png&quot;</span><span class="op">))</span></span>
<span id="cb10-17"><a href="#cb10-17" tabindex="-1"></a>        mime <span class="op">=</span> <span class="st">&quot;image/png&quot;</span><span class="op">;</span></span>
<span id="cb10-18"><a href="#cb10-18" tabindex="-1"></a></span>
<span id="cb10-19"><a href="#cb10-19" tabindex="-1"></a>    <span class="kw">struct</span> stat body_stat<span class="op">;</span></span>
<span id="cb10-20"><a href="#cb10-20" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>fstat<span class="op">(</span>body_fd<span class="op">,</span> <span class="op">&amp;</span>body_stat<span class="op">))</span> <span class="op">{</span></span>
<span id="cb10-21"><a href="#cb10-21" tabindex="-1"></a>        perror<span class="op">(</span><span class="st">&quot;[-] fstat&quot;</span><span class="op">);</span></span>
<span id="cb10-22"><a href="#cb10-22" tabindex="-1"></a>        <span class="cf">goto</span> close_body<span class="op">;</span></span>
<span id="cb10-23"><a href="#cb10-23" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-24"><a href="#cb10-24" tabindex="-1"></a>    sprintf<span class="op">(</span>head_buf<span class="op">,</span> http_head<span class="op">,</span> mime<span class="op">,</span> body_stat<span class="op">.</span>st_size<span class="op">);</span></span>
<span id="cb10-25"><a href="#cb10-25" tabindex="-1"></a>    printf<span class="op">(</span><span class="st">&quot;[*] &gt;&gt;&gt;</span><span class="sc">\n%s</span><span class="st">&quot;</span><span class="op">,</span> head_buf<span class="op">);</span></span>
<span id="cb10-26"><a href="#cb10-26" tabindex="-1"></a></span>
<span id="cb10-27"><a href="#cb10-27" tabindex="-1"></a>    <span class="cf">while</span> <span class="op">(</span>sent_total <span class="op">&lt;</span> strlen<span class="op">(</span>head_buf<span class="op">))</span> <span class="op">{</span></span>
<span id="cb10-28"><a href="#cb10-28" tabindex="-1"></a>        sent <span class="op">=</span> write<span class="op">(</span>sk<span class="op">,</span> head_buf <span class="op">+</span> sent_total<span class="op">,</span> strlen<span class="op">(</span>head_buf<span class="op">)</span> <span class="op">-</span> sent_total<span class="op">);</span></span>
<span id="cb10-29"><a href="#cb10-29" tabindex="-1"></a></span>
<span id="cb10-30"><a href="#cb10-30" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>sent <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-31"><a href="#cb10-31" tabindex="-1"></a>            perror<span class="op">(</span><span class="st">&quot;[-] write&quot;</span><span class="op">);</span></span>
<span id="cb10-32"><a href="#cb10-32" tabindex="-1"></a>            <span class="cf">goto</span> close_body<span class="op">;</span></span>
<span id="cb10-33"><a href="#cb10-33" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb10-34"><a href="#cb10-34" tabindex="-1"></a></span>
<span id="cb10-35"><a href="#cb10-35" tabindex="-1"></a>        sent_total <span class="op">+=</span> sent<span class="op">;</span></span>
<span id="cb10-36"><a href="#cb10-36" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb10-37"><a href="#cb10-37" tabindex="-1"></a></span>
<span id="cb10-38"><a href="#cb10-38" tabindex="-1"></a>    <span class="cf">do</span> <span class="op">{</span></span>
<span id="cb10-39"><a href="#cb10-39" tabindex="-1"></a>        sent <span class="op">=</span> sendfile<span class="op">(</span>sk<span class="op">,</span> body_fd<span class="op">,</span> NULL<span class="op">,</span> body_stat<span class="op">.</span>st_size<span class="op">);</span></span>
<span id="cb10-40"><a href="#cb10-40" tabindex="-1"></a></span>
<span id="cb10-41"><a href="#cb10-41" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>sent <span class="op">&lt;</span> <span class="dv">0</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-42"><a href="#cb10-42" tabindex="-1"></a>            perror<span class="op">(</span><span class="st">&quot;[-] sendfile&quot;</span><span class="op">);</span></span>
<span id="cb10-43"><a href="#cb10-43" tabindex="-1"></a>            <span class="cf">goto</span> close_body<span class="op">;</span></span>
<span id="cb10-44"><a href="#cb10-44" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb10-45"><a href="#cb10-45" tabindex="-1"></a></span>
<span id="cb10-46"><a href="#cb10-46" tabindex="-1"></a>        sent_total <span class="op">+=</span> sent<span class="op">;</span></span>
<span id="cb10-47"><a href="#cb10-47" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">while</span> <span class="op">(</span>sent <span class="op">&gt;</span> <span class="dv">0</span><span class="op">);</span></span>
<span id="cb10-48"><a href="#cb10-48" tabindex="-1"></a></span>
<span id="cb10-49"><a href="#cb10-49" tabindex="-1"></a>close_body<span class="op">:</span></span>
<span id="cb10-50"><a href="#cb10-50" tabindex="-1"></a>    close<span class="op">(</span>body_fd<span class="op">);</span></span>
<span id="cb10-51"><a href="#cb10-51" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
            <p>So here we can see the C has a <code>goto</code> in it.
            I’m not sure it makes much of a difference here although I
            guess it’s easier to mess up using <code>goto</code> than
            <code>defer</code> for freeing resources on exit. On the
            other hand you may be looking at <code>defer</code> thinking
            “huh? When does that run?”.</p>
            <p>I have to say that Zig suffered a major fail in this part
            because the compiler segfaulted when I was trying to write
            the mime selection code. At the time of writing the
            following code will cause a segfault.</p>
            <div class="sourceCode" id="cb11"><pre
            class="sourceCode zig"><code class="sourceCode zig"><span id="cb11-1"><a href="#cb11-1" tabindex="-1"></a>    <span class="at">const</span> ms <span class="op">=</span> <span class="op">.</span>{ <span class="st">&quot;a&quot;</span><span class="op">,</span> <span class="st">&quot;b&quot;</span> };</span>
<span id="cb11-2"><a href="#cb11-2" tabindex="-1"></a>    <span class="at">const</span> a <span class="op">=</span> set<span class="op">:</span> {</span>
<span id="cb11-3"><a href="#cb11-3" tabindex="-1"></a>        <span class="kw">inline</span> <span class="cf">for</span> (ms) <span class="op">|</span>m<span class="op">|</span> {</span>
<span id="cb11-4"><a href="#cb11-4" tabindex="-1"></a>            <span class="cf">if</span> (mem<span class="op">.</span>eql(<span class="dt">u8</span><span class="op">,</span> <span class="st">&quot;a&quot;</span><span class="op">,</span> m))</span>
<span id="cb11-5"><a href="#cb11-5" tabindex="-1"></a>                <span class="cf">break</span> <span class="op">:</span>set m;</span>
<span id="cb11-6"><a href="#cb11-6" tabindex="-1"></a>        }</span>
<span id="cb11-7"><a href="#cb11-7" tabindex="-1"></a>        <span class="cf">break</span> <span class="op">:</span>set <span class="st">&quot;c&quot;</span>;</span>
<span id="cb11-8"><a href="#cb11-8" tabindex="-1"></a>    };</span>
<span id="cb11-9"><a href="#cb11-9" tabindex="-1"></a></span>
<span id="cb11-10"><a href="#cb11-10" tabindex="-1"></a>    <span class="at">const</span> a2<span class="op">:</span> [<span class="op">:</span><span class="dv">0</span>]<span class="at">const</span> <span class="dt">u8</span> <span class="op">=</span> <span class="st">&quot;a&quot;</span>;</span>
<span id="cb11-11"><a href="#cb11-11" tabindex="-1"></a>    <span class="cf">try</span> testing<span class="op">.</span>expectEqual(a2<span class="op">,</span> a);</span></code></pre></div>
            <p>This appears to be valid Zig code because it at least
            gets as far as emitting LLVM IR. However there is some issue
            there. Of course this is also very weird looking, so it’s
            perhaps best that I removed it.</p>
            <p>Also note the <code>inline for</code>. This is
            <em>required</em> because <code>ms</code> and
            <code>mimes</code> are known at compile time and, I think,
            have <code>comptime</code> types. Zig doesn’t have a
            preprocessor, macros or templates. Instead it allows code
            whose inputs are known at compile time to be run at compile
            time. I suppose we could stop this code being evaluated at
            compile time by specifying runtime types on
            <code>mimes</code>.</p>
            <p>In this program it’s not clear what the advantages of
            <code>comptime</code> are. Meanwhile it got in my way a
            little when I hit errors like:</p>
            <pre><code>./src/self-serve.zig:114:5: error: unable to evaluate constant expression
    for (mimes) |kv| {</code></pre>
            <p>It’s worth mentioning that C compilers can evaluate a lot
            at compile time as well. You can see this demonstrated in my
            <a href="https://richiejp.com/1d-reversible-automata">automata article</a>. This
            simply happens when turning on optimisations and avoiding
            things which will hide the “constness” of variables. I
            suppose that <code>comptime</code> has resulted in a win for
            C here. Although this won’t dampen my enthusiasm for
            <code>comptime</code> in general.</p>
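            <p>As a rough illustration of the C side, here is a minimal
            sketch of a lookup that an optimiser has a chance to fold.
            The contents of the <code>mimes</code> table and the
            <code>mime_for</code> helper are assumptions made up for
            this example rather than code from the program above, and
            whether a call is really evaluated at compile time depends
            on the compiler and its flags.</p>
            <pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table for illustration only. */
struct mime {
    const char *ext;
    const char *type;
};

static const struct mime mimes[] = {
    { "html", "text/html" },
    { "css",  "text/css" },
    { "js",   "text/javascript" },
};

/* static inline plus const data keeps everything visible to the
 * optimiser, so a call with a string literal may be folded away when
 * optimisations are on. Nothing in the C standard requires that. */
static inline const char *mime_for(const char *ext)
{
    assert(ext != NULL);

    for (size_t i = 0; i < sizeof(mimes) / sizeof(mimes[0]); i++) {
        if (!strcmp(mimes[i].ext, ext))
            return mimes[i].type;
    }

    /* Fallback for extensions that are not in the table. */
    return "application/octet-stream";
}
```
]]></code></pre>
            <p>With a literal argument and optimisations turned on, GCC
            or Clang may reduce such a call to a constant. The only way
            to be sure is to read the generated assembly, which is the
            main practical difference from <code>comptime</code>, where
            the evaluation is requested explicitly.</p>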
            <p>Frankly I’m finding it increasingly difficult to draw
            solid comparisons at this point. While writing this article
            I keep discovering things I could do differently in both Zig
            and C. However I feel like it is time to cap this off.</p>
            <h1 id="conclusion">Conclusion</h1>
            <p>This application isn’t exactly a major stress test for
            either language. They both fit well within my requirements
            for executable size and execution performance even with all
            the sanitizers turned on. There aren’t any of the
            complications of a large modular code base either. It
            doesn’t even allocate heap memory.</p>
            <p>However I think this shows that Zig makes some concrete
            advances over C. Meanwhile it doesn’t appear to make
            anything more difficult. At least so long as the compiler
            doesn’t segfault or blurt out something like “cannot store
            runtime value in type ‘comptime_int’”, without any hint as
            to what to do about it.</p>
            <p>Most issues I have encountered seem to be temporary
            implementation problems. Andrew Kelley and Co. didn’t decide
            to make radical changes over C that introduce new problems.
            Rather they changed some defaults and added evolutionary
            improvements. At least as far as this application shows. I
            still wonder if there are dragons lurking in the
            <code>comptime</code> features. On the other hand
            <code>comptime</code> can be seen as an evolution of the C
            preprocessor and other tools which generate C code.</p>
            <h1 id="related">Related</h1>
            <ul>
            <li><a href="https://richiejp.com/barely-http2-zig">Barely HTTP/2 server in
            Zig</a></li>
            <li><a href="https://richiejp.com/zig-cross-compile-ltp-ltx-linux">Minimal Linux
            VM cross compiled with Clang and Zig</a></li>
            <li><a href="https://richiejp.com/zig-ld-preload-trick">Override libc’s malloc
            with Zig</a></li>
            <li><a href="zig-fuse-one">Zig &amp; FUSE: Hello file
            systems</a></li>
            </ul>
    </div>
  </content>
</entry>
<entry>
  <title>ZSV: Viewing large CSV files without latency</title>
  <id>https://richiejp.com/zsv-index</id>
  <published>2024-11-29T16:59:10Z</published>
  <updated>2024-12-18T17:04:45Z</updated>
  <link rel="alternate" href="https://richiejp.com/zsv-index" />
  <summary>Introduction to the ZSV sheet viewer and how implementing an
index removed seconds of latency. ZSV is a collection of software tools
for viewing and processing CSV files, this article covers the basics of
the sheet feature and implementation details of indexing a CSV
file.</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
            <h1 id="zsv-swiss-army-knife-cli-for-csv">ZSV: Swiss army
            knife CLI for CSV</h1>
            <p><a href="https://github.com/liquidaty/zsv">ZSV is a
            collection of high performance tools</a> for reading,
            manipulating and viewing CSV files. At its core is a
            library for processing CSV files in C and on top of that
            there is an extensible CLI and TUI.</p>
            <p>There are commands in the CLI for converting CSV to JSON
            or selecting a subset of columns or rows from the CSV and
            normalising the results. It is compatible with Excel, so it can
            be used for cleaning files that wouldn’t be accepted by
            other tools.</p>
            <p>In this article we are going to focus on the sheet
            command. This is a CSV viewer that displays the file in a
            terminal window and allows you to browse it.</p>
            <div id="zsv-sheet-asciinema" class="asciinemaFigure">

            </div>
            <script>
            AsciinemaPlayer.create('/zsv-2.ascii', document.getElementById('zsv-sheet-asciinema'))
            </script>
            <h1 id="reading-very-large-csv-files">Reading very large CSV
            files</h1>
            <p>CSV files are a simple text format where each row can
            vary in length. The only way to know how long a row is, is
            to read it into memory and parse the data. If you are only
            interested in the millionth row of a CSV file then you still
            have to read all of the rows preceding it.</p>
            <p>This presents a problem if you wish to open a CSV file
            and skip around in it. Especially if you want to skip to the
            end and still know what the row numbers are.</p>
            <p>ZSV can parse even quite large CSV files from start to
            finish in under a second and without using a large amount of
            memory. The Sheet command in particular only loads chunks of
            the file into memory that it needs to display.</p>
            <p>However if you have a 13GB CSV file then parsing it from
            start to finish can take several seconds. If you want to
            jump around to particular line numbers then the closer you
            get to the end of the file, the longer it’ll take to
            jump.</p>
            <p>The full 13GB of CSV is not loaded into memory all at
            once. My laptop has more than enough memory to do that, but
            imagine you are trying to browse several of these files
            simultaneously. Being forced to close one file to open
            another is inconvenient.</p>
            <p>So you can imagine that if the user is trying to jump
            between rows towards the end of the file then there will be
            unbearable latency. If the CSV is appended to in such a way
            that the newest entries are at the end then it’s common for
            the user to skip to the end.</p>
            <h1 id="improving-performance-with-an-index">Improving
            performance with an index</h1>
            <p>Let’s imagine for a second that our CSV file has a fixed
            row size where each line is the same number of bytes. In
            this case we could jump directly to any line we desired
            using a simple calculation:
            <code>row_number * row_length</code>.</p>
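            <p>To make the calculation concrete, here is a minimal
            sketch in C. The function name
            <code>read_fixed_row</code> is hypothetical, rows are
            counted from zero, and <code>row_length</code> is assumed
            to include the line ending.</p>
            <pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stdio.h>

/* Read row `row_number` (counting from 0) of a file in which every row,
 * including its newline, is exactly `row_length` bytes long.
 * `buf` must have room for row_length + 1 bytes.
 * Returns 0 on success and -1 on failure.
 * Using long offsets limits this sketch to files fseek can address. */
static int read_fixed_row(FILE *f, long row_number, long row_length, char *buf)
{
    assert(row_number >= 0 && row_length > 0);

    /* The whole trick: the offset is a single multiplication. */
    if (fseek(f, row_number * row_length, SEEK_SET) != 0)
        return -1;

    if (fread(buf, 1, (size_t)row_length, f) != (size_t)row_length)
        return -1;

    buf[row_length] = '\0';
    return 0;
}
```
]]></code></pre>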
            <p>CSV files are not usually fixed width, however, so we have
            to parse the file and count the rows. On smaller files this
            operation is so quick the user will not notice. On larger
            files it can take a considerable time.</p>
            <p>For smaller files we can load each line into an array
            entry. Getting a row then becomes a case of looking up an
            array entry. Typically array lookups are considered a
            constant time operation (<code>O(1)</code>) and are very
            quick, faster than a lookup on a more complicated data
            structure that also offers constant time lookups (e.g. a
            hash map).</p>
            <p>Meanwhile for larger files we don’t want to load the full
            CSV data into memory. If the CSV file had fixed row sizes
            then we wouldn’t have to. Instead we could calculate where
            the row exists inside the file and just read that location.
            However the rows are different lengths.</p>
            <p>What we can do is store the offset of each row’s line end
            or line beginning in an array. Then when we want to read a
            particular row we can look up its location inside the array.
            This array is our index. It allows us to read any row in
            constant time, and we never need to load the whole CSV file
            into memory at once.</p>
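            <p>The idea can be sketched in a few lines of C. This is an
            illustration under stated assumptions, not ZSV’s code: the
            <code>row_index</code> type and both functions are
            hypothetical, the file is assumed to be opened in binary
            mode, and every newline is treated as the end of a row,
            which ignores newlines inside quoted fields.</p>
            <pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Offset of the first byte of every row. Hypothetical type. */
struct row_index {
    long *starts;
    size_t count;
};

/* Scan the file once, recording where each row begins.
 * Returns 0 on success, -1 on failure. In both cases the caller
 * releases idx->starts with free(). */
static int row_index_build(FILE *f, struct row_index *idx)
{
    size_t cap = 0;
    long pos = 0;
    int c, at_row_start = 1;

    idx->starts = NULL;
    idx->count = 0;
    rewind(f);

    while ((c = getc(f)) != EOF) {
        if (at_row_start) {
            if (idx->count == cap) {
                /* Grow the array geometrically. */
                size_t new_cap = cap ? cap * 2 : 1024;
                long *p = realloc(idx->starts, new_cap * sizeof(*p));

                if (!p)
                    return -1;
                idx->starts = p;
                cap = new_cap;
            }
            idx->starts[idx->count++] = pos;
            at_row_start = 0;
        }
        if (c == '\n')
            at_row_start = 1;
        pos++;
    }

    return ferror(f) ? -1 : 0;
}

/* Copy row `row` into buf, without its newline, using one array lookup
 * and one seek. Rows longer than the buffer are truncated.
 * Returns 0 on success, -1 on failure. */
static int row_index_read(FILE *f, const struct row_index *idx,
                          size_t row, char *buf, size_t buf_len)
{
    size_t len = 0;
    int c;

    assert(buf_len > 0);

    if (row >= idx->count || fseek(f, idx->starts[row], SEEK_SET) != 0)
        return -1;

    while (len + 1 < buf_len && (c = getc(f)) != EOF && c != '\n')
        buf[len++] = (char)c;

    buf[len] = '\0';
    return 0;
}
```
]]></code></pre>
            <p>Building the index still costs one full pass over the
            file, but each later call to <code>row_index_read</code>
            only touches the bytes of the row that was asked for.</p>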
            <p>Note that the performance improvement of this change is
            so dramatic that we can rely on simple functional testing to
            validate it. <a href="how-to-10x-most-software">Focusing on
            radical performance improvements has a number of
            advantages.</a></p>
            <h1 id="reducing-the-indexs-memory-footprint">Reducing the
            index’s memory footprint</h1>
            <div class="float">
            <img src="zsv-index.svg" alt="ZSV Index" />
            <div class="figcaption">ZSV Index</div>
            </div>
            <p>If a CSV file contains a large number of rows where each
            row is only a few bytes, then the index may not be much
            smaller than the data itself. Indeed if we use 64-bit file
            offsets then that is 8 bytes per index entry. So a CSV file
            with fewer than 8 bytes per line on average (including the
            line ending) will be smaller than its index.</p>
            <p>In addition the ZSV sheet command typically loads 1024
            lines at a time. If we were to systematically jump around a
            file until we had seen every line, then most of the index
            entries would never be used.</p>
            <p>Finally, reading 1024 lines with ZSV is very fast unless
            the lines are exceptionally large. So storing the location
            of every line end is both wasteful and unnecessary.</p>
            <p>Therefore ZSV takes the approach of only storing every
            1024th line end in the index. If a line number is requested
            that is not divisible by 1024 then the previous index entry
            will be used and ZSV will parse and discard the rows leading
            up to the requested line.</p>
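            <p>A minimal sketch of this scheme in C follows. It is not
            ZSV’s implementation: the <code>sparse_index</code> type
            and its functions are hypothetical, it records the start of
            every 1024th row rather than a line end, it assumes the
            file is opened in binary mode, and it treats every newline
            as the end of a row.</p>
            <pre class="sourceCode c"><code class="sourceCode c"><![CDATA[
```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define STRIDE 1024 /* rows covered by one index entry */

/* starts[i] is the offset of the first byte of row i * STRIDE.
 * Hypothetical layout for illustration. */
struct sparse_index {
    long *starts;
    size_t count;     /* number of index entries */
    size_t row_count; /* number of rows in the file */
};

/* One pass over the file, keeping only every STRIDE-th row start.
 * Returns 0 on success, -1 on failure. In both cases the caller
 * releases idx->starts with free(). */
static int sparse_index_build(FILE *f, struct sparse_index *idx)
{
    size_t cap = 0;
    long pos = 0;
    int c, at_row_start = 1;

    idx->starts = NULL;
    idx->count = 0;
    idx->row_count = 0;
    rewind(f);

    while ((c = getc(f)) != EOF) {
        if (at_row_start) {
            if (idx->row_count % STRIDE == 0) {
                if (idx->count == cap) {
                    size_t new_cap = cap ? cap * 2 : 64;
                    long *p = realloc(idx->starts, new_cap * sizeof(*p));

                    if (!p)
                        return -1;
                    idx->starts = p;
                    cap = new_cap;
                }
                idx->starts[idx->count++] = pos;
            }
            idx->row_count++;
            at_row_start = 0;
        }
        if (c == '\n')
            at_row_start = 1;
        pos++;
    }

    return ferror(f) ? -1 : 0;
}

/* Position f at the start of `row`: seek to the preceding index entry,
 * then read and discard the rows in between (at most STRIDE - 1).
 * Returns 0 on success, -1 on failure. */
static int sparse_index_seek(FILE *f, const struct sparse_index *idx, size_t row)
{
    size_t skip = row % STRIDE;
    int c;

    assert(f != NULL && idx != NULL);

    if (row >= idx->row_count)
        return -1;
    if (fseek(f, idx->starts[row / STRIDE], SEEK_SET) != 0)
        return -1;

    while (skip > 0 && (c = getc(f)) != EOF) {
        if (c == '\n')
            skip--;
    }

    return skip == 0 ? 0 : -1;
}
```
]]></code></pre>
            <p>With a stride of 1024 and 8-byte offsets the index costs
            about 8 bytes per 1024 rows, and a jump never has to
            discard more than 1023 rows before reaching the one that
            was requested.</p>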
    </div>
  </content>
</entry>
</feed>
