Fatih’s blog (https://blog.fatihbakir.net/feed.xml)

Are C++ Threads Preemptive? (2019-05-18) https://blog.fatihbakir.net/2019/05/18/std-thread

<p>Although we have a consensus on our desktops, servers and phones that an OS should provide preemptive threads, not all software is written for such environments, nor do all operating systems support preemptive threads. I believe there’s a case for non-preemptive (or cooperative) threads in special applications. But that’s the topic of another article.</p>
<p>In this article, I’d like to see if the C++ standard allows for <code class="highlighter-rouge">std::thread</code>s to have cooperative semantics rather than preemptive.</p>
<p>All the <code class="highlighter-rouge">std::thread</code> implementations I have access to provide preemptive threads. This is expected, however, as they all target Win32 or POSIX, which themselves support preemptive threads. As far as I can see, there’s no <code class="highlighter-rouge">std::thread</code> implementation in the wild that provides cooperative threads.</p>
<p>So we have to dive into the standard to find our answer. The C++ standard <a class="citation" href="#cppstd">[1]</a> defines what a thread is in [intro.multithread]:</p>
<blockquote>
<p>A thread of execution (also known as a thread) is a single flow of control within a program, including the initial invocation of a specific top-level function, and recursively including every function invocation subsequently executed by the thread.</p>
</blockquote>
<p>No mention of preemption, so we have to keep looking.</p>
<p>The next interesting information is in [intro.progress], which defines what making progress for a C++ thread means:</p>
<blockquote>
<p>The implementation may assume that any thread will eventually do one of the following:</p>
<ul>
<li>terminate,</li>
<li>make a call to a library I/O function,</li>
<li>perform an access through a volatile glvalue, or</li>
<li>perform a synchronization operation or an atomic operation.</li>
</ul>
</blockquote>
<p>This is <em>somewhat</em> closer to what we’re looking for, but it still doesn’t mention anything regarding preemption. These requirements can be satisfied by both preemptive and cooperative threads.</p>
<p>However, something more interesting is in the 7th point:</p>
<blockquote>
<p>For a thread of execution providing concurrent forward progress guarantees, the implementation ensures that the thread will eventually make progress for as long as it has not terminated. [ Note: This is required regardless of whether or not other threads of executions (if any) have been or are making progress. To eventually fulfill this requirement means that this will happen in an unspecified but finite amount of time. — end note ]</p>
</blockquote>
<p>Making progress here means, in a hand-wavy way, doing something that has visible effects. The interesting bit is in the note, however: it states that a thread must make progress in a finite amount of time. Thus, I believe <em>it kind of</em> follows that cooperative threads <strong>cannot provide</strong> concurrent forward progress guarantees.</p>
<p>Imagine the following program:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">atomic</span><span class="o"><</span><span class="kt">bool</span><span class="o">></span> <span class="n">b</span> <span class="o">=</span> <span class="nb">false</span><span class="p">;</span>
<span class="k">auto</span> <span class="n">t1</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="kr">thread</span><span class="p">([]{</span>
<span class="k">while</span><span class="p">(</span><span class="o">!</span><span class="n">b</span><span class="p">);</span>
<span class="p">});</span>
<span class="k">auto</span> <span class="n">t2</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="kr">thread</span><span class="p">([]{</span>
<span class="n">b</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
<span class="n">print</span><span class="p">(</span><span class="n">uart</span><span class="p">,</span> <span class="s">"foo"</span><span class="p">);</span>
<span class="p">});</span>
</code></pre></div></div>
<p>If the first thread <code class="highlighter-rouge">t1</code> starts executing first, it’ll spin in the loop forever, thus starving <code class="highlighter-rouge">t2</code>. This means that <code class="highlighter-rouge">t2</code> may not make progress in a finite amount of time. Providing such a guarantee is impossible without either implementing preemptive threads, or instrumenting atomic (and volatile?) accesses to potentially yield.</p>
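<p>To make the starvation concrete, here’s a toy model of a cooperative scheduler. This is a hypothetical sketch, not real <code class="highlighter-rouge">std::thread</code> semantics: tasks are plain functions run to completion in order, and the spin loop is bounded only so the demo terminates.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <functional>
#include <iostream>
#include <vector>

int main() {
    bool b = false;
    bool t2_ran = false;

    // A run queue of cooperative "threads": nothing ever preempts them.
    std::vector<std::function<void()>> run_queue;

    run_queue.push_back([&] {
        // t1: spins until b becomes true. It never yields, so on a
        // cooperative scheduler no other task can ever run to set b.
        for (int i = 0; i < 1000000 && !b; ++i) { /* spin */ }
        std::cout << (b ? "t1 saw b\n" : "t1 starved t2\n");
    });
    run_queue.push_back([&] {
        b = true;       // t2 would unblock t1...
        t2_ran = true;  // ...but it only gets the CPU after t1 finishes.
    });

    for (auto& task : run_queue) task();  // run each task, in order
    std::cout << "t2 ran after t1 gave up: " << t2_ran << '\n';
}
</code></pre></div></div>

<p>On a preemptive scheduler, <code class="highlighter-rouge">t2</code> would eventually run and break <code class="highlighter-rouge">t1</code> out of the loop; here it only runs after <code class="highlighter-rouge">t1</code> gives up.</p>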
<p>However, the next bullet in the standard says the following:</p>
<blockquote>
<p>It is implementation-defined whether the implementation-created thread of execution that executes main ([basic.start.main]) and the threads of execution created by std::thread ([thread.thread.class]) provide concurrent forward progress guarantees. [ Note: General-purpose implementations should provide these guarantees. — end note ]</p>
</blockquote>
<p>This states that whether <code class="highlighter-rouge">std::thread</code>s provide the concurrent forward progress guarantee is implementation-defined. So, if I’m implementing <code class="highlighter-rouge">std::thread</code>, I don’t really have to provide preemptive threads. A cooperative implementation is certainly standard conforming.</p>
<p>However, the point of implementing the <code class="highlighter-rouge">std::thread</code> API would be to ease the effort of porting programs to such an OS. I don’t have any data on this, but I’m pretty certain most (>90%?) <code class="highlighter-rouge">std::thread</code> usage assumes a preemptive threading model. Supporting such programs would therefore give a false sense of portability and cause a lot of misunderstandings. Thus, I don’t believe cooperative threads should be exposed through a <code class="highlighter-rouge">std::thread</code> interface, even though it’s completely legal to do so.</p>
<h2 id="references">References</h2>
<ul class="bibliography"><li><span id="cppstd">[1] “Draft C++ Standard.” http://eel.is/c++draft/.</span></li></ul>

SSL considered harmful (in IoT)? (2019-05-15) https://blog.fatihbakir.net/2019/05/15/ssl-considered-harmful

<p>I’ve been working on <em>Internet of Things</em> research for over a year now. The bulk of it has been
working on getting cheap installations to work reliably and securely.</p>
<p>It’s a known fact that security hasn’t been the top priority of IoT applications, whether it’s a
temperature sensor, a thermostat or a smart bulb. They get hacked, they become part of botnets,
and, if nothing else, they reduce your uptime.</p>
<p>This is obviously not desirable. These things usually work over WiFi, hence TCP/IP. So we should just
slap our trusty SSL on top of it and call it a day, right?</p>
<p>Nope. The main problem many people don’t see is what separates the <em>Internet of Things</em> from the
regular internet. On the regular internet, we’ve mostly solved the security problems, and yes, the
solution is usually just encrypting the traffic. However, this solution just doesn’t scale <em>down</em>
to tiny embedded systems that are now part of the same internet as our crazy back end servers. Here,
the devices are extremely constrained.</p>
<p>Public key cryptography, which powers our security infrastructure on the internet, was not designed
to run on processors with clock speeds of just tens of megahertz and a few KB of RAM.</p>
<p>No, it was designed for much bigger machines. A conforming, bidirectional SSL server <strong>must</strong> be able
to keep about 33 kilobytes of buffers. The (relatively high end) microcontroller I frequently use
only gives me about 48 kilobytes of RAM. So, if I wanted to run a full-fledged SSL server
on it, I’d have to waste two thirds of my RAM. Fortunately, you neither have to be conforming nor have to run
an SSL server on these things, so you can get away with about 5-10 KB of RAM for buffers.</p>
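<p>For a sense of how that 5-10 KB figure is achieved in practice: embedded TLS stacks let you shrink the record buffers at compile time. In mbed TLS, for instance, the knobs look roughly like this (macro names from mbed TLS’s <code class="highlighter-rouge">config.h</code>; check your version’s documentation before relying on them):</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Excerpt from a hypothetical mbed TLS config.h for a constrained device.
// Shrink the maximum TLS record content length from the default 16384
// bytes down to 4096, cutting the two I/O buffers from ~33 KB to ~8 KB.
#define MBEDTLS_SSL_MAX_CONTENT_LEN      4096

// Enable the RFC 6066 Maximum Fragment Length extension so the peer
// knows not to send records larger than our buffers.
#define MBEDTLS_SSL_MAX_FRAGMENT_LENGTH
</code></pre></div></div>

<p>The catch is that both endpoints have to cooperate: a server that insists on sending 16 KB records simply won’t work against such a build.</p>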
<p>But no, I’m not finished yet. Next comes actually using these buffers in an SSL session.
Regardless of whether you are a server or a client, you have to perform the dreaded
SSL handshake. Boy, oh boy. It takes slightly more than 4 KB of stack to execute. If you’re
smart and looking for some adventure, you can probably reuse that memory after the handshake
for other purposes, but otherwise it’ll just lie there once the handshake is done.</p>
<p>Oh, there’s also the issue of runtime. A single SSL handshake with 2048-bit RSA keys takes about
4 seconds on an ESP8266. And no, ECC doesn’t help much either.</p>
<table>
<thead>
<tr>
<th>Operation</th>
<th>Time</th>
</tr>
</thead>
<tbody>
<tr>
<td>RSA Handshake (2048-bit)</td>
<td>3.95 seconds</td>
</tr>
<tr>
<td>RSA Handshake (4096-bit)</td>
<td>32.32 seconds</td>
</tr>
</tbody>
</table>
<p>I’m not an expert on SSL, but my understanding is that the handshake is bottlenecked by the digital
signature operations. Here are the numbers for those primitives as well:</p>
<table>
<thead>
<tr>
<th>Algorithm</th>
<th>Sign Time</th>
<th>Verify Time</th>
</tr>
</thead>
<tbody>
<tr>
<td>PKCS1 (2048 Bits)</td>
<td>3280 ms</td>
<td>187 ms</td>
</tr>
<tr>
<td>PKCS1 (4096 Bits)</td>
<td>31580 ms</td>
<td>9190 ms</td>
</tr>
<tr>
<td>ECDSA (256 Bits)</td>
<td>214 ms</td>
<td>4340 ms</td>
</tr>
</tbody>
</table>
<p>Now, 4 seconds isn’t a huge amount of time. My sensors are duty cycled: they take a measurement every 5
minutes and sleep in between. So, running for 4 extra seconds shouldn’t hurt, right?</p>
<p>The reason I go to sleep is to conserve power. We’re running on batteries here, and a regular
wake up, sample the sensor, publish to cloud/edge, go back to sleep cycle takes about 2 seconds. Now I pay
twice that time just to do an SSL handshake. Oh, and did I mention I’m on a WiFi network, which already has
encryption in the link layer? So, when I’m pushing to the edge, I probably don’t benefit from the SSL
<em>at all</em>. It’s pure overhead.</p>
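<p>The arithmetic is easy to check. With hypothetical but representative numbers (the 80 mA active draw is an assumed ESP8266 ballpark figure, not a measurement), the handshake triples the energy spent per wake-up:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <iostream>

int main() {
    // Numbers from the post: ~2 s of useful work per wake-up, and
    // ~4 s extra for a 2048-bit RSA TLS handshake.
    const double work_s      = 2.0;
    const double handshake_s = 4.0;
    const double active_mA   = 80.0;  // assumed active current draw

    // Charge consumed per wake-up, in mAh.
    double without_tls = active_mA * work_s / 3600.0;
    double with_tls    = active_mA * (work_s + handshake_s) / 3600.0;

    std::cout << "energy multiplier: " << with_tls / without_tls << '\n';
}
</code></pre></div></div>

<p>Three times the battery drain per cycle, before even accounting for the radio staying up longer.</p>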
<p>Well, you might say that I don’t have to use SSL, so why complain this much? The reason is that every
single public cloud provider I’ve used (<a class="citation" href="#azure-mqtt-ssl">[1]</a>, <a class="citation" href="#aws-mqtt-ssl">[2]</a>, <a class="citation" href="#gc-mqtt-ssl">[3]</a>)
expects me to publish using MQTT over SSL, and nothing else.</p>
<p>Now, it’s not their fault. Security in this domain is extremely important. However, maybe porting
everything we’re used to from regular internet programming to these constrained devices isn’t
the greatest idea. SSL (and thus public key cryptography) is truly a marvellous technology, but it solves
a very specific combination of problems: you want to verify each party’s identity, encrypt the
communication, and avoid sharing a lot of keys ahead of time.</p>
<p>For the commands to my thermostat, I don’t care a lot about encryption. As long as I can verify
the authenticity of the commands, an attacker cannot change my settings. Similarly, for sensing
applications, if an attacker can’t inject bogus data into my public cloud database, I don’t really
care about the data being encrypted, especially if I’m storing my data in an edge cloud where a
layer of security already exists at the link layer.</p>
<p>Maybe it’s time to search for lighter security primitives that can scale down to the embedded systems
we have to use in this domain. Otherwise, we’ll either not do security properly, or not do it at all.</p>
<h2 id="references">References</h2>
<ul class="bibliography"><li><span id="azure-mqtt-ssl">[1] “Azure IoT Hub communication protocols and ports | Microsoft Docs.” https://docs.microsoft.com/en-us/azure/iot-hub/iot-hub-devguide-protocols.</span></li>
<li><span id="aws-mqtt-ssl">[2] “Protocols - AWS IoT.” https://docs.aws.amazon.com/iot/latest/developerguide/protocols.html.</span></li>
<li><span id="gc-mqtt-ssl">[3] “Publishing over the MQTT bridge | Cloud IoT Core Documentation | Google Cloud.” https://cloud.google.com/iot/docs/how-tos/mqtt-bridge#mqtt_server.</span></li></ul>

How to expect the unexpected (2019-05-14) https://blog.fatihbakir.net/2019/05/14/unexpected

<p>Although C++ has a sophisticated error handling mechanism, i.e. exceptions, in embedded domains they are
usually disabled, due to the code size spent on jump tables in zero-overhead implementations, or the runtime
overhead in other implementations. And in projects where they are enabled, their use is usually frowned upon
due to the extreme cost of throwing an exception. It’s said that you should only throw an exception
in exceptional situations, though what constitutes an exceptional situation is not well defined.</p>
<p>Anyway, due to these reasons, the C++ community has been in search of better error handling mechanisms.
While some <a class="citation" href="#herbceptions">[1]</a> are working on fixing exceptions with core language changes, others
<a class="citation" href="#boostoutcome">[2]</a>, <a class="citation" href="#expected">[3]</a> are taking a different, purely library-based route.</p>
<p>I personally like and use expected objects as the return values of fallible functions. They encapsulate
the reason for the failure, so the caller can find out why something failed.</p>
<p>However, there are a few pain points I have with it:</p>
<p><strong>They don’t easily compose</strong></p>
<p>If you have a function <code class="highlighter-rouge">expected<T, foo_errors> foo();</code> and another <code class="highlighter-rouge">expected<T, bar_errors> bar();</code> which
calls <code class="highlighter-rouge">foo</code>, it’s difficult to propagate <code class="highlighter-rouge">foo</code>’s error value directly. Either you have to copy all the values of
<code class="highlighter-rouge">foo_errors</code> into <code class="highlighter-rouge">bar_errors</code>, or discard the detail and put a single <code class="highlighter-rouge">foo_failed</code> value in <code class="highlighter-rouge">bar_errors</code>. That’s
not really nice. You could change <code class="highlighter-rouge">bar</code> to return <code class="highlighter-rouge">expected<T, variant<foo_errors, bar_errors>></code> for a better
interface, but it just nests deeper and deeper and still doesn’t compose automatically; you have to keep
track of the error types of every callee in a function.</p>
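<p>A minimal sketch of the manual lifting this forces on you, using a hypothetical stand-in for <code class="highlighter-rouge">expected</code> (not <code class="highlighter-rouge">tl::expected</code>’s real interface):</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <iostream>
#include <variant>

enum class foo_errors { not_found, timeout };
enum class bar_errors { bad_input };

// Toy expected-like type: holds either a value or an error.
template <class T, class E>
struct expected {
    std::variant<T, E> v;
    bool has_value() const { return v.index() == 0; }
};

// bar's error type must mention foo's errors explicitly.
using bar_error = std::variant<foo_errors, bar_errors>;

expected<int, foo_errors> foo() { return {foo_errors::timeout}; }

expected<int, bar_error> bar() {
    auto r = foo();
    if (!r.has_value())
        // Manual lifting of foo's error into bar's error type.
        return {bar_error{std::get<foo_errors>(r.v)}};
    return {std::get<int>(r.v) * 2};
}

int main() {
    auto r = bar();
    std::cout << (r.has_value() ? "ok" : "propagated foo error") << '\n';
}
</code></pre></div></div>

<p>Every new callee with its own error enum means another alternative in <code class="highlighter-rouge">bar_error</code> and another hand-written lifting branch.</p>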
<p>However, I can live with this.</p>
<p><strong>They don’t work in embedded</strong></p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">expected</span><span class="o"><</span><span class="kt">float</span><span class="p">,</span> <span class="kt">int</span><span class="o">></span> <span class="n">foo</span><span class="p">()</span> <span class="p">{</span> <span class="k">return</span> <span class="n">unexpected</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span> <span class="p">}</span>
<span class="k">auto</span> <span class="n">r</span> <span class="o">=</span> <span class="n">foo</span><span class="p">();</span>
<span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o"><<</span> <span class="n">r</span><span class="p">.</span><span class="n">value</span><span class="p">()</span> <span class="o"><<</span> <span class="sc">'\n'</span><span class="p">;</span>
</code></pre></div></div>
<p>What does this program do?</p>
<p>According to the proposal, and the reference implementation, it throws an exception. But the reason I
picked this library is that it seemed promising as an exception replacement. Though that isn’t exactly the
purpose of the paper, I think we can actually solve this.</p>
<p>The main reason it’s unusable in embedded is the <code class="highlighter-rouge">expected::value</code> function, which promises to return
the internal value unconditionally, and throws when there is none. We want a tighter interface. For it to be usable in a mission critical
domain, it has to enforce error checking at compile time.</p>
<h2 id="tosexpected"><code class="highlighter-rouge">tos::expected</code></h2>
<p>This is a type we provide in our embedded operating system. It simply inherits privately from <code class="highlighter-rouge">tl::expected</code>
and exposes a much more restricted interface. In short, there are only <em>safe</em> functions for accessing the
internal value:</p>
<ol>
<li><code class="highlighter-rouge">with</code></li>
<li><code class="highlighter-rouge">get_or</code></li>
<li><code class="highlighter-rouge">operator std::optional<T></code></li>
<li><code class="highlighter-rouge">force_get</code></li>
</ol>
<p>The first one is simple: you call it with an expected and pass two lambdas, one for when there’s a value and
one for when there’s an error:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">with</span><span class="p">(</span><span class="n">fallible_func</span><span class="p">(),</span> <span class="p">[](</span><span class="k">auto</span><span class="o">&</span> <span class="n">val</span><span class="p">)</span> <span class="p">{</span>
<span class="p">...</span> <span class="n">use</span> <span class="n">val</span> <span class="p">...</span>
<span class="p">},</span> <span class="p">[](</span><span class="k">auto</span><span class="o">&</span> <span class="n">err</span><span class="p">){</span>
<span class="p">...</span> <span class="n">use</span> <span class="n">err</span> <span class="p">...</span>
<span class="p">});</span>
</code></pre></div></div>
<p>It’s statically enforced that you can’t try to access the internal value if there’s none.</p>
<p>The second one is a refinement over <code class="highlighter-rouge">with</code>: it tries to get the internal value, and if
there’s none, it returns a fallback value passed to the function:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">auto</span> <span class="n">v</span> <span class="o">=</span> <span class="n">get_or</span><span class="p">(</span><span class="n">fallible_func</span><span class="p">(),</span> <span class="mi">705</span><span class="p">);</span>
</code></pre></div></div>
<p>This doesn’t let you handle the error explicitly, but you still can’t access a non-existent value, and that’s
enforced at compile time.</p>
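<p>For illustration, here’s roughly how such helpers can be written over any expected-like type. The mini <code class="highlighter-rouge">result</code> type below is hypothetical, not the actual <code class="highlighter-rouge">tos::expected</code> implementation:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <iostream>
#include <variant>

// Toy expected-like type for demonstration purposes.
template <class T, class E>
struct result {
    std::variant<T, E> v;
    bool ok() const { return v.index() == 0; }
};

// 'with': the only way to touch the value is through the success lambda,
// so an unchecked access cannot even be written.
template <class T, class E, class V, class F>
void with(const result<T, E>& r, V&& on_val, F&& on_err) {
    if (r.ok())
        on_val(std::get<0>(r.v));
    else
        on_err(std::get<1>(r.v));
}

// 'get_or': the value if present, the fallback otherwise.
template <class T, class E>
T get_or(const result<T, E>& r, T fallback) {
    return r.ok() ? std::get<0>(r.v) : fallback;
}

int main() {
    result<int, const char*> bad{"sensor timeout"};
    with(bad,
         [](int val) { std::cout << "value: " << val << '\n'; },
         [](const char* err) { std::cout << "error: " << err << '\n'; });
    std::cout << get_or(bad, 705) << '\n';
}
</code></pre></div></div>

<p>Neither helper gives the caller a way to reach the value without the error path being handled (or a fallback supplied) at the call site.</p>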
<p>The third one is a little convenience conversion operator for times when you don’t care about the error at all
and just want a <code class="highlighter-rouge">std::optional</code>. The operator is explicit, so you don’t get any unexpected (see the pun?)
conversions. However, it is a bit dangerous, as <code class="highlighter-rouge">std::optional</code> doesn’t enforce checking at compile
time the way we do.</p>
<p>Finally, <code class="highlighter-rouge">force_get</code>. Despite its name, it doesn’t really force anything. When you call this function and there’s
no value in the expected object, the kernel panics. So, no undefined behavior, but still not a desirable
thing to have in your system.</p>
<p>However, it’s definitely not meant to be called on arbitrary <code class="highlighter-rouge">expected</code>s. The use case is to call it
only when you <em>know</em> there’s a value inside:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">auto</span> <span class="n">e</span> <span class="o">=</span> <span class="n">fallible_func</span><span class="p">();</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">e</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span>
<span class="k">auto</span><span class="o">&</span> <span class="n">v</span> <span class="o">=</span> <span class="n">force_get</span><span class="p">(</span><span class="n">e</span><span class="p">);</span>
</code></pre></div></div>
<p>Since you check the expected before calling <code class="highlighter-rouge">force_get</code>, there’s no risk of a kernel panic.</p>
<p>However, as you might’ve guessed, we can’t enforce this at compile time.</p>
<p>Or can we?</p>
<p>Although it’s not completely standard, there’s a hack we use to enforce this. We can’t enforce it
in the type system proper, since types don’t know anything about control flow.</p>
<p>However, using some tricks, we can actually detect whether you’ve forgotten to check an error. The trick is to
always inline the <code class="highlighter-rouge">force_get</code> call, and to have a special hook that is called when the check fails:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template <class ExpectedT>
decltype(auto) ALWAYS_INLINE force_get(ExpectedT&& e)
{
    if (e)
    {
        return *e.m_internal;
    }
    tos_force_get_failed(nullptr); // [[noreturn]] hook: reports the failed access
}
</code></pre></div></div>
<p><code class="highlighter-rouge">expected::m_internal</code> is the underlying <code class="highlighter-rouge">tl::expected</code>. As you can see, we have an always-inline function that repeats
the check the caller is supposed to have already performed. The compiler will see that
the inlined check is redundant, and drop the if that comes from <code class="highlighter-rouge">force_get</code>. Since it drops the if, the
branch that calls <code class="highlighter-rouge">tos_force_get_failed</code> disappears completely. Therefore, we don’t pay any runtime overhead
for doing this.</p>
<p>We also use link-time size optimizations such as garbage collecting unused symbols. In a program that always
checks whether there’s a value before calling <code class="highlighter-rouge">force_get</code>, the <code class="highlighter-rouge">tos_force_get_failed</code> symbol must be unused,
and thus should not appear in the final binary. Therefore, with a single <code class="highlighter-rouge">nm | grep tos_force_get_failed</code>,
we can determine whether we’ve called <code class="highlighter-rouge">force_get</code> on an unchecked expected.</p>
<p>Obviously, this won’t tell you much about <em>where</em> you’ve forgotten to check an expected, but
it’s better to find out before programming the device than after crashing at runtime.</p>
<h2 id="references">References</h2>
<ul class="bibliography"><li><span id="herbceptions">[1] “ACCU talk video posted – Sutter’s Mill.” https://herbsutter.com/2019/04/28/accu-talk-video-posted/.</span></li>
<li><span id="boostoutcome">[2] “Outcome documentation.” https://ned14.github.io/outcome/.</span></li>
<li><span id="expected">[3] V. J. Botet Escribá and J. F. Bastien, “Utility class to represent expected object.” http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0323r3.pdf.</span></li></ul>

Please do not disturb (2019-05-13) https://blog.fatihbakir.net/2019/05/13/dont-disturb

<p>Just like regular operating systems, embedded OSes have multiple responsibilities. First, they have to execute user programs. Second, they need to manage IO. IO usually doesn’t execute instantaneously; it takes time, and that’s time we’d rather spend executing user code than just waiting for the IO to complete. The external devices are usually smart enough to run on their own and just <em>interrupt</em> the processor when they need something. So, upon an IO request by a thread, we just initiate the operation on the hardware and block the thread until the IO finishes. If there are other threads to execute, we’ll just do that.</p>
<p>However, interrupts act like preemptive threads. Meaning, at any point in a program, we might get preempted by an interrupt service routine. Although ISRs and the normal thread world should be as decoupled as possible, at some point some data will be shared between them. This easily leads to difficult-to-track-down race conditions.</p>
<p>The most obvious resource ISRs and the <em>normal</em> world share is the thread queues. There’s a queue of runnable threads in a system that the scheduler uses, called the run queue. When a thread is created, it’s placed on that queue so it can start executing. When a thread starts executing, it’s taken off that queue. When a thread blocks, it’s placed in the wait queue of whatever it’s blocking on. The threads in a wait queue are usually placed back into the run queue by an interrupt service routine. For instance, when you’re sleeping for 5 seconds, you’re waiting for an ISR to eventually wake you up by placing you back on the run queue.</p>
<p>Now, imagine that a task attempts to block on a resource, called <script type="math/tex">R_x</script>, at time <script type="math/tex">T_1</script>, and that it takes <script type="math/tex">t</script> time to finish placing the thread in the wait queue and suspend it. There exists an ISR <script type="math/tex">I_x</script> that unblocks threads waiting on <script type="math/tex">R_x</script>.</p>
<p><img src="/assets/img/img1.png" alt="" /></p>
<p>It is possible that <script type="math/tex">I_x</script> will be serviced during <script type="math/tex">[T_1, T_1 + t)</script>. In that case, we’ll have a nice race condition and quickly end up in undefined behavior land.</p>
<p><img src="/assets/img/img2.png" alt="" /></p>
<p>To avoid such a problem, we disable interrupts before placing a thread in the wait queue of a resource and re-enable them right after placing it there.</p>
<h2 id="death-by-a-thousand-interrupt-disables">Death by a thousand interrupt disables</h2>
<p>That solves our little problem. However, it turns out that you don’t always want to block unconditionally. You want to check whether a condition has been satisfied and, if not, block until it is. This is exactly what a <em>semaphore</em> represents:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">semaphore</span> <span class="n">bytes_received</span><span class="p">{</span><span class="mi">0</span><span class="p">};</span>
<span class="n">ring_buf</span><span class="o"><</span><span class="kt">char</span><span class="p">,</span> <span class="mi">32</span><span class="o">></span> <span class="n">bytes</span><span class="p">;</span>
<span class="kt">char</span> <span class="n">read_byte</span><span class="p">(){</span>
<span class="n">bytes_received</span><span class="p">.</span><span class="n">down</span><span class="p">();</span>
<span class="k">auto</span> <span class="n">f</span> <span class="o">=</span> <span class="n">bytes</span><span class="p">.</span><span class="n">front</span><span class="p">();</span>
<span class="n">bytes</span><span class="p">.</span><span class="n">pop_front</span><span class="p">();</span>
<span class="k">return</span> <span class="n">f</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>When <code class="highlighter-rouge">read_byte</code> is called, we don’t necessarily want to block. If there are already some bytes in the buffer, we want to take the first one. If there are none, we want to wait until some arrive.</p>
<p>Now, to check how many bytes there are in the buffer, we have to read a shared integer. To do so, we must disable interrupts. This happens in the semaphore’s <code class="highlighter-rouge">down</code> function, which basically looks like this:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">semaphore</span><span class="o">::</span><span class="n">down</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">int_guard</span> <span class="n">ig</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="o">--</span><span class="n">count</span> <span class="o"><</span> <span class="mi">0</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">m_wait</span><span class="p">.</span><span class="n">wait</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Now, we disable interrupts once in the <code class="highlighter-rouge">int_guard</code>. Then, we call <code class="highlighter-rouge">wait</code> on the <code class="highlighter-rouge">m_wait</code> object, which is of type <code class="highlighter-rouge">waitable</code>. <code class="highlighter-rouge">waitable</code>s are basically just a wrapper around a queue of threads, presenting an easier interface: a <code class="highlighter-rouge">wait</code> function rather than <code class="highlighter-rouge">emplace_back(current_thread)</code> followed by a suspend.</p>
<p>But, as we talked about previously, blocking also needs to disable interrupts. So, it’ll construct another <code class="highlighter-rouge">int_guard</code> object before placing the thread into the wait queue.</p>
<p><img src="/assets/img/img3.png" alt="" /></p>
<p>After that, we have to suspend the current thread, which means switching context to another thread. However, context switching is another potentially dangerous function. So interrupts must be disabled during that time as well.</p>
<p>So, for a single <code class="highlighter-rouge">semaphore::down</code> call, we have to disable interrupts 3 times. This is called the abstraction penalty. Since we obviously don’t want to pay this cost, we’ll cut some corners in terms of safety. For instance, the last call, <code class="highlighter-rouge">suspend_self</code>, will have a precondition:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/**
* Gives control of the CPU back to the scheduler,
 * suspending the current thread.
* If the interrupts are not disabled when this function
* is called, the behaviour is undefined
*/</span>
<span class="kt">void</span> <span class="n">suspend_self</span><span class="p">();</span>
<span class="c1">// pre-condition: interrupts must be disabled</span>
</code></pre></div></div>
<p>So, we’ve just traded safety for performance. Now, our code has UB if we attempt to suspend the current thread without disabling interrupts. Granted, this function isn’t meant to be called directly, so the danger isn’t that high, but still, there’s some unsafety, and we still execute the interrupt disable/enable pair twice.</p>
<p>You might be wondering how we can actually disable and enable interrupts twice. The hardware doesn’t know how many times you’ve disabled interrupts, after all. There are 2 ways to go about it: either store the current interrupt state in the <code class="highlighter-rouge">int_guard</code> object and restore it upon destruction, or count how many times we’ve disabled interrupts, and only re-enable them when the counter reaches 0. The former solution is more <em>pure</em>, but the latter has <script type="math/tex">O(1)</script> storage cost, so that’s the way we go about it. However, either method imposes some non-trivial runtime overhead, so I’d rather not pay it twice.</p>
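<p>The counting approach can be sketched like this. This is a hypothetical model: the two <code class="highlighter-rouge">arch_</code> functions stand in for the real instructions that mask and unmask interrupts (e.g. <code class="highlighter-rouge">cpsid i</code>/<code class="highlighter-rouge">cpsie i</code> on ARM), and in a real kernel the counter updates themselves would have to be protected from interrupts as well:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <iostream>

static int  disable_depth      = 0;
static bool interrupts_enabled = true;

void arch_disable_interrupts() { interrupts_enabled = false; }
void arch_enable_interrupts()  { interrupts_enabled = true; }

// RAII guard: only the outermost guard touches the hardware.
struct int_guard {
    int_guard()  { if (disable_depth++ == 0) arch_disable_interrupts(); }
    ~int_guard() { if (--disable_depth == 0) arch_enable_interrupts(); }
};

int main() {
    {
        int_guard outer;                              // disables interrupts
        {
            int_guard inner;                          // nested: only bumps the counter
            std::cout << interrupts_enabled << '\n';  // 0: still disabled
        }
        std::cout << interrupts_enabled << '\n';      // 0: outer guard still alive
    }
    std::cout << interrupts_enabled << '\n';          // 1: re-enabled at depth 0
}
</code></pre></div></div>

<p>The nice property is that nested guards compose: each scope can pretend it owns the interrupt state, and the hardware is only touched at the outermost level.</p>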
<p>Ideally, I could mark functions as <em>no interrupts</em> like we do with <code class="highlighter-rouge">const</code> or <code class="highlighter-rouge">noexcept</code>:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">waitable</span><span class="o">::</span><span class="n">wait</span><span class="p">()</span> <span class="n">no_int</span><span class="p">;</span>
<span class="kt">void</span> <span class="n">suspend_self</span><span class="p">()</span> <span class="n">no_int</span><span class="p">;</span>
</code></pre></div></div>
<p>And the compiler would statically check whether I’m calling them from a non-interruptible context. However, this doesn’t scale: I can come up with ever more properties that fit the same pattern, and we can’t add a new qualifier to the language for each of them.</p>
<p>The solution is to use the type system. We’ll introduce a new empty type, and make <code class="highlighter-rouge">int_guard</code> inherit from that:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">no_int</span> <span class="p">{};</span>
<span class="k">struct</span> <span class="n">int_guard</span> <span class="o">:</span> <span class="n">no_int</span> <span class="p">{</span> <span class="p">...</span> <span class="p">};</span>
</code></pre></div></div>
<p>And we’ll change any function that must run with interrupts disabled to take a const reference to a <code class="highlighter-rouge">no_int</code>:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">waitable</span><span class="o">::</span><span class="n">wait</span><span class="p">(</span><span class="k">const</span> <span class="n">no_int</span><span class="o">&</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">suspend_self</span><span class="p">(</span><span class="k">const</span> <span class="n">no_int</span><span class="o">&</span><span class="p">);</span>
</code></pre></div></div>
<p>Now, the C++ compiler will just prevent anyone from calling these functions unless they have an <code class="highlighter-rouge">int_guard</code> instance lying around.</p>
<p>And <code class="highlighter-rouge">waitable::wait</code> just passes its reference on to <code class="highlighter-rouge">suspend_self</code>, so it’s easy to carry this information down the call stack.</p>
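<p>A quick sketch of that forwarding, with made-up bodies standing in for the real kernel code (a real <code class="highlighter-rouge">wait</code> would enqueue the current thread, and a real <code class="highlighter-rouge">suspend_self</code> would switch contexts):</p>

```cpp
#include <cassert>

struct no_int {};

struct int_guard : no_int {
    int_guard()  { /* disable interrupts on real hardware */ }
    ~int_guard() { /* re-enable interrupts */ }
};

// Stand-in so the sketch is observable on a host.
inline int g_suspend_calls = 0;

void suspend_self(const no_int&) {
    // A real implementation would hand control to the scheduler here.
    ++g_suspend_calls;
}

struct waitable {
    void wait(const no_int& ni) {
        // A real implementation would first enqueue the current thread.
        suspend_self(ni);  // token forwarded; no extra disable/enable
    }
};
```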
<p>So, <code class="highlighter-rouge">semaphore::down</code> will just look like this:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">semaphore</span><span class="o">::</span><span class="n">down</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">int_guard</span> <span class="n">ig</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="o">--</span><span class="n">count</span> <span class="o"><</span> <span class="mi">0</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">m_wait</span><span class="p">.</span><span class="n">wait</span><span class="p">(</span><span class="n">ig</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>If you aren’t familiar with C++’s inheritance mechanism, rest assured: this doesn’t add any overhead to functions the compiler inlines. If it can’t inline, it’s equivalent to passing a single reference.</p>
<p>Awesome! Now, you might say a user could just construct a <code class="highlighter-rouge">no_int</code> instance themselves and pass that. Well, they shouldn’t. And we have the technology to prevent it:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">no_int</span> <span class="p">{</span>
<span class="k">private</span><span class="o">:</span>
<span class="n">no_int</span><span class="p">()</span> <span class="o">=</span> <span class="k">default</span><span class="p">;</span>
<span class="k">friend</span> <span class="k">class</span> <span class="nc">int_guard</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>
<p>This way, short of modifying the <code class="highlighter-rouge">no_int</code> type, the users of the library <em>must</em> use an <code class="highlighter-rouge">int_guard</code>.</p>
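<p>We can even verify the lock-out at compile time. The sketch below puts the pieces together; the <code class="highlighter-rouge">static_assert</code>s show that outside code cannot default-construct a <code class="highlighter-rouge">no_int</code>, while an <code class="highlighter-rouge">int_guard</code> still converts to <code class="highlighter-rouge">const no_int&amp;</code> as required:</p>

```cpp
#include <cassert>
#include <type_traits>

struct int_guard;

// Only int_guard may create a no_int, so user code
// cannot forge the "interrupts are off" token.
struct no_int {
private:
    no_int() = default;
    friend struct int_guard;
};

struct int_guard : no_int {
    int_guard()  { /* disable interrupts */ }
    ~int_guard() { /* re-enable interrupts */ }
};

// Compile-time evidence: no_int can't be constructed from the outside,
// but an int_guard is usable wherever a const no_int& is expected.
static_assert(!std::is_default_constructible_v<no_int>);
static_assert(std::is_convertible_v<int_guard&, const no_int&>);
```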
<h2 id="theres-another">There’s another</h2>
<p>You might be wondering why we’re going to such lengths instead of just using a <code class="highlighter-rouge">const int_guard&</code>. The answer is that there’s another way for interrupts to be disabled in a system: we might already be in an interrupt context, that is, already servicing an interrupt!</p>
<p>To model this, we add another type:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">namespace</span> <span class="n">detail</span> <span class="p">{</span>
<span class="k">struct</span> <span class="n">int_ctx</span> <span class="o">:</span> <span class="n">no_int</span> <span class="p">{};</span>
<span class="p">}</span>
</code></pre></div></div>
<p>When we enter an ISR, we’ll construct an <code class="highlighter-rouge">int_ctx</code>, and pass that to the functions that expect interrupts to be turned off.</p>
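<p>A sketch of what that ISR entry could look like. The names <code class="highlighter-rouge">wake_one</code> and <code class="highlighter-rouge">timer_isr</code> are made up for illustration, and for brevity <code class="highlighter-rouge">no_int</code> gets a protected constructor here rather than the private-plus-friend arrangement shown earlier (the real code would presumably also befriend <code class="highlighter-rouge">detail::int_ctx</code>):</p>

```cpp
#include <cassert>

// Protected rather than private-plus-friend, so both token types
// below can construct it; a sketch simplification.
struct no_int {
protected:
    no_int() = default;
};

namespace detail {
struct int_ctx : no_int {};
}

// Stand-in so the sketch is observable on a host.
inline int g_woken = 0;

// A function that requires interrupts to be off, e.g. waking a
// thread that was blocked on a semaphore.
void wake_one(const no_int&) { ++g_woken; }

// What a hypothetical interrupt handler would do: interrupts are
// already off in an ISR, so hand out a free int_ctx token instead
// of paying for an int_guard's disable/enable pair.
void timer_isr() {
    detail::int_ctx ctx;
    wake_one(ctx);
}
```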