<p>Maurits van Altvorst. Student in Econometrics and Economics at Erasmus University Rotterdam.</p>
<p>Robot archery (w/ Jane Street’s December 2021 challenge). Published 2022-01-01. https://mauritsvanaltvorst.com/jane-street-december-2021</p>
<pre style="white-space: pre-wrap;">
After a grueling year filled with a wide variety of robot sporting events, we have arrived at the final event of the year: Robot Archery. Four robots have qualified for this year’s finals, and have been seeded in the following order:
Robot   Seed
Aaron   1
Barron  2
Caren   3
Darrin  4
The robots will take turns shooting arrows at a target [1], starting with Aaron and proceeding in order by seed. When it is a given robot’s turn, they shoot a single arrow. If it is closer to the center of the target than all previous arrows by all players, that robot remains in the tournament, going to the back of the queue to await their next turn. Otherwise that robot is eliminated immediately. The last robot remaining in the queue is the winner.
For example, here is how last year’s finals went, in which Caren was the winner. (Oddly enough it involved the same robots in the same seeding.)
Turn  Robot   Distance
1     Aaron   10nm
2     Barron  8nm
3     Caren   7nm
4     Darrin  1km
5     Aaron   9nm
6     Barron  2nm
7     Caren   1nm
8     Barron  1Ym [2]
To ten decimal places, what is the probability that Darrin will be this year’s winner? (Or, if you want to send in the exact answer, that’s fine too!)
[1] Each robot is equally skilled. Which is to say: for any region R on the target with nonzero area, the robots all have the same positive probability of landing an arrow within R on any given shot.
[2] It’s a large target.
</pre>
<p>The first thing we consider is the fact that the actual distance of the arrow is not important. All arrows are independent and identically distributed, so by applying the cumulative distribution function of the distance to each shot we can transform the distances into quantiles that follow a \(\text{UNIF}(0, 1)\) distribution.</p>
<p>Furthermore, given that after \(k\) shots we have distances with quantiles \(q_1, \ldots, q_k\), let us calculate the probability of a new quantile \(Q \sim \text{UNIF}(0, 1)\) being lower than all previous quantiles:</p>
\[\begin{align*}
\mathbb{P}(Q < \text{min}\{q_1, \ldots, q_k\}) &= \int_0^1 f_Q(t) \cdot \mathbb{P}(t < \text{min}\{q_1, \ldots, q_k\}) dt \\
&\overset{i.i.d.}{=} \int_0^1 \prod_{i=1}^k \mathbb{P}(t < q_i) dt \\
&\overset{i.i.d.}{=} \int_0^1 (1-t)^k dt \\
&= \left[- \frac 1 {k+1} \cdot (1-t)^{k+1}\right]^{t=1}_{t=0} \\
&= \frac 1 {k+1}
\end{align*}\]
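<p>As a quick numerical sanity check of this result, the integral can be approximated with a midpoint rule. This is only a sketch: the helper names integrate and probNewClosest and the number of steps are my own choices for illustration.</p>

```javascript
// Midpoint-rule approximation of the integral of f over [a, b].
function integrate(f, a, b, steps = 100000) {
  const h = (b - a) / steps;
  let sum = 0;
  for (let i = 0; i < steps; i++) {
    sum += f(a + (i + 0.5) * h);
  }
  return sum * h;
}

// P(new arrow beats all k previous arrows) = integral of (1 - t)^k over [0, 1].
function probNewClosest(k) {
  return integrate((t) => Math.pow(1 - t, k), 0, 1);
}

// Print the numerical value next to the closed form 1 / (k + 1).
for (const k of [0, 1, 2, 5, 10]) {
  console.log(k, probNewClosest(k).toFixed(6), (1 / (k + 1)).toFixed(6));
}
```

<p>The two printed columns should agree for every \(k\), which is what the closed form \(\frac 1 {k+1}\) predicts.</p>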
<p>It turns out this probability only depends on the number of previous arrows. This makes it easy for us to turn this puzzle into a dynamic programming problem. Let \(N \in \mathbb{N}\), then</p>
\[\begin{align*}
p &\colon \mathbb{N}^3 \rightarrow \mathbb{R} \\
p&(n, m, k) = \begin{cases}
1, & \text{if } n = 1 \\
\frac 1 {k+1} \cdot p(n, n-1, k+1), & \text{if } m = 0 \\
\frac 1 {k+1} \cdot p(n, m-1, k+1) + \left(1 - \frac 1 {k+1}\right) \cdot p(n-1, m-1, k+1), & \text{otherwise}
\end{cases}
\end{align*}\]
<p>Here \(N\) stands for the initial number of players, \(n\) for the number of players still in the game, \(m\) for the number of players that still have to shoot before it’s Darrin’s turn, and \(k\) for the number of arrows that have been shot.</p>
<p>The first case, where \(n=1\), is the case where there is only one player (Darrin) left. In that case the probability of Darrin winning is 1, given that he is the only player left.</p>
<p>In the second case, where \(m=0\), it is currently Darrin’s turn. The only way for him to win is to shoot the closest arrow and then win the game from the new game state, where there are \(m=n-1\) players that have to shoot before it is Darrin’s turn again.</p>
<p>The last case is the case where there are still other players left, and it is not Darrin’s turn. In that case, there are two scenarios to take into account. If the current player shoots the closest arrow, he is still in the game (\(n := n\)), but if he does not shoot the closest arrow he gets removed from the game (\(n := n-1\)). Via the law of total probability, we can take the sum of these two cases.</p>
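<p>The recursion above can be sketched in a few lines of JavaScript. This is not necessarily how I originally computed it: the function name darrinWins, the memoisation and the cut-off parameter maxDepth are assumptions made for this illustration. States deeper than maxDepth are simply counted as a loss, which slightly underestimates the true value.</p>

```javascript
// Probability that the tracked robot (Darrin) wins.
// n: robots still in the game, m: robots that shoot before Darrin,
// k: arrows shot so far. maxDepth is an assumed cut-off for the recursion.
function darrinWins(n, m, k, maxDepth, memo = new Map()) {
  if (n === 1) return 1;        // only Darrin is left
  if (k >= maxDepth) return 0;  // truncate the otherwise endless recursion
  const key = `${n},${m},${k}`;
  if (memo.has(key)) return memo.get(key);

  const survive = 1 / (k + 1);  // chance that the next arrow is the closest so far
  let result;
  if (m === 0) {
    // Darrin shoots: he has to survive and then goes to the back of the queue.
    result = survive * darrinWins(n, n - 1, k + 1, maxDepth, memo);
  } else {
    // Another robot shoots: it either stays in the game or is eliminated.
    result =
      survive * darrinWins(n, m - 1, k + 1, maxDepth, memo) +
      (1 - survive) * darrinWins(n - 1, m - 1, k + 1, maxDepth, memo);
  }
  memo.set(key, result);
  return result;
}

console.log(darrinWins(4, 3, 0, 40).toFixed(10));
```

<p>A useful check of the recursion is the two-robot game: the robot that shoots second should win with probability \(\frac 1 e\), and darrinWins(2, 1, 0, 40) reproduces that value.</p>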
<p><img src="/assets/jane-street-december-2021/tree.png" alt="a diagram showing the tree structure of the DP formula above" style="margin-left: auto;" /></p>
<p>This way, we have translated our original problem into evaluating \(p(4, 3, 0)\) given \(N=4\). We cannot evaluate the recursion exactly as written, because \(k\) keeps growing and the recursion never terminates. However, we only need ten decimals, and the contribution of the states at depth \(k\) shrinks extremely quickly (roughly like \(\left(\frac 1 {k+1}\right)^{k-4}\)). We can therefore cut the recursion off at a depth of \(k \approx 24\) and still get an accurate enough answer: \(p(4,3,0) \approx 0.18343765086\).</p>
<p>Robot tug-of-war (w/ Jane Street’s August 2021 challenge). Published 2021-09-01. https://mauritsvanaltvorst.com/jane-street-august-2021</p>
<p>The fourth post in my series of <a href="https://www.janestreet.com/puzzles/robot-tug-of-war-index/">Jane Street puzzles</a>; previous solves can be found <a href="/jane-street-june-2021">here</a>, <a href="/jane-street-december-2020">here</a> and <a href="/jane-street-april-2021">here</a>.</p>
<pre style="white-space: pre-wrap;">
The Robot Weightlifting World Championship was such a huge success that the organizers have hired you to help design its sequel: a Robot Tug-of-War Competition!
In each one-on-one matchup, two robots are tied together with a rope. The center of the rope has a marker that begins above position 0 on the ground. The robots then alternate pulling on the rope. The first robot pulls in the positive direction towards 1; the second robot pulls in the negative direction towards -1. Each pull moves the marker a uniformly random draw from [0,1] towards the pulling robot. If the marker first leaves the interval [‑½,½] past ½, the first robot wins. If instead it first leaves the interval past -½, the second robot wins.
However, the organizers quickly noticed that the robot going second is at a disadvantage. They want to handicap the first robot by changing the initial position of the marker on the rope to be at some negative real number. Your job is to compute the position of the marker that makes each matchup a 50-50 competition between the robots. Find this position to seven significant digits—the integrity of the Robot Tug-of-War Competition hangs in the balance!
</pre>
<h1 id="simulation">Simulation</h1>
<p>Let’s make a simulation to get an initial overview of the problem. We denote \(x\) as the starting position of the marker, and we have to find \(x \in \left(-\frac 1 2, \frac 1 2 \right)\) such that</p>
\[p(x) = \mathbb{P}(\text{robot 1 wins} \mid \text{starting position is } x) = 0.5\]
<p><img src="/assets/jane-street-august-2021/overview.svg" alt="A graph that shows the probability of robot 1 winning versus x, its starting position." /></p>
<p>We can read from the graph that the solution is approximately \(-0.285\). Unfortunately, brute force will not be quick enough to find the solution to seven significant digits. We also cannot use the binary search technique <a href="/jane-street-june-2021">I used last time</a>, because we are not dealing with a kink in a graph. We will have to find the actual closed-form formula of the graph.</p>
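<p>A simulation of this kind can be sketched as follows. The names mulberry32 and estimateWinProbability, the seed and the number of trials are assumptions for this illustration; the seeded generator is only there to make runs reproducible, and Math.random would work just as well.</p>

```javascript
// Small seeded pseudo-random generator (mulberry32), returns numbers in [0, 1).
function mulberry32(a) {
  return function () {
    let t = (a += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Monte Carlo estimate of P(robot 1 wins | marker starts at `start`).
function estimateWinProbability(start, trials, random) {
  let wins = 0;
  for (let i = 0; i < trials; i++) {
    let position = start;
    while (true) {
      position += random(); // robot 1 pulls towards +1
      if (position > 0.5) {
        wins++;
        break;
      }
      position -= random(); // robot 2 pulls towards -1
      if (position < -0.5) break;
    }
  }
  return wins / trials;
}

const random = mulberry32(2021);
for (const x of [-0.4, -0.285, 0, 0.4]) {
  console.log(x, estimateWinProbability(x, 200000, random).toFixed(3));
}
```

<p>With a few hundred thousand trials per point the estimate is only good to two or three digits, which is why this approach cannot deliver seven significant digits in reasonable time.</p>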
<h1 id="analytic-description">Analytic description</h1>
<p>The clue lies in the symmetry between robot 1 and robot 2. Let’s try to reason from the perspective of robot 2 after robot 1 has made his move. Suppose that robot 1 moved the marker from \(x\) to \(x + t\) with \(x + t < \frac 1 2\), so the game is not over yet. Now it is the turn of robot 2, and from symmetry we can deduce that the probability of robot 2 winning in this case is the same as the probability of robot 1 winning after starting from \(-x-t\).</p>
<p><img src="/assets/jane-street-august-2021/line.svg" alt="An illustrative diagram showing the step from x to x+t." style="margin-left: auto;" /></p>
\[\mathbb{P}(\text{robot 2 wins} \mid \text{robot 2's turn, current position is } x+t)\]
\[=\]
\[\mathbb{P}(\text{robot 1 wins} \mid \text{robot 1's turn, current position is } -x-t)\]
<p>Let’s combine this with the law of total probability:</p>
\[\begin{align*}
p(x) &= \mathbb{P}(\text{robot 1 wins} \mid \text{starting position is } x) \\
p(x) &= \int_0^1 \mathbb{P}(\text{robot 1 wins} \mid \text{his first pull is } t) dt
\end{align*}\]
<p>We can split this integral into two different cases:</p>
\[\begin{cases}
\mathbb{P}(\text{robot 1 wins} \mid \text{position is } x+t) = 1, &x+t \geq \frac 1 2 \\
\mathbb{P}(\text{robot 1 wins} \mid \text{position is } x+t) = 1 - p(-x-t), &x+t < \frac 1 2
\end{cases}\]
<p>Therefore we have</p>
\[\begin{align*}
p(x) &= \int_0^{\frac 1 2 - x} \mathbb{P}(\text{robot 1 wins} \mid \text{his first pull is } t) dt + \int_{\frac 1 2 - x}^1 dt \\
&= \int_0^{\frac 1 2 - x} \left(1 - p(-x-t)\right) dt + \frac 1 2 + x \\
&= 1 - \int_0^{\frac 1 2 - x} p(-x-t) dt
\end{align*}\]
<p>Apply a substitution with \(u = - x - t\):</p>
\[\begin{align*}
p(x) &= 1 + \int_{-x}^{-\frac 1 2} p(u) du \\
&= 1 - \int_{-\frac 1 2}^{-x} p(u) du
\end{align*}\]
<p>Let’s differentiate both sides, and apply the <a href="https://en.wikipedia.org/wiki/Fundamental_theorem_of_calculus">fundamental theorem of calculus</a>:</p>
\[\frac {dp(x)} {dx} = \frac d {dx} \left[ 1 - \int_{-\frac 1 2}^{-x} {p(u)}du \right]\]
\[p'(x) = p(-x)\]
<p>We can check this differential equation with the approximate simulation we’ve made before. You can see our tangent line aligns nicely.</p>
<p><img src="/assets/jane-street-august-2021/tangent.svg" alt="A check of the differential equation with a tangent line." /></p>
<p>To solve this differential equation analytically, let’s consider the Taylor series of \(p(x)\):</p>
\[p(x) = p(0) + \frac {p'(0)} {1!} \cdot x + \frac {p''(0)} {2!} \cdot x^2 + \ldots \\
\implies \begin{cases}
p(-x) &= p(0) &- \frac {p'(0)} {1!} \cdot x &+ \frac {p''(0)} {2!} \cdot x^2 &- \ldots \\
p'(x) &= p'(0) &+ \frac {p''(0)} {1!} \cdot x &+ \frac {p'''(0)} {2!} \cdot x^2 &+ \ldots \\
\end{cases}\]
<p>Because \(p'(x) = p(-x)\) holds for all \(x \in \left(-\frac 1 2, \frac 1 2 \right)\), all \(i\)-th derivatives of these two power series must be equal on this interval. Therefore, the coefficients of these two power series must be equal:</p>
\[\begin{cases}
p'(0) &= p(0), \\
p''(0) &= -p'(0), \\
p'''(0) &= p''(0), \\
p''''(0) &= -p'''(0), \\
\ldots
\end{cases}\]
<p>Let \(p(0) = a\), and let’s denote the \(n^{th}\) derivative of \(p(x)\) as \(p^{(n)}(x)\) for simplicity:</p>
\[\begin{cases}
p^{(0)}(0) & &= a, \\
p^{(1)}(0) = &p^{(0)}(0) &= a, \\
p^{(2)}(0) = - &p^{(1)}(0) &= -a, \\
p^{(3)}(0) = &p^{(2)}(0) &= -a, \\
p^{(4)}(0) = - &p^{(3)}(0) &= a, \\
p^{(5)}(0) = &p^{(4)}(0) &= a, \\
\ldots
\end{cases}\]
<p>Do you notice the pattern? Alternating double positives and double negatives, repeating indefinitely. We can use this to find the final Taylor expansion for \(p(x)\):</p>
\[\begin{align*}
p(x) &= &a &+ \frac a {1!} \cdot x &- \frac a {2!} \cdot x^2 &- \frac a {3!} \cdot x^3 &+ \ldots \\
&= &a & &- \frac a {2!} \cdot x^2 & &+ \ldots \\
& & &+ \frac a {1!} \cdot x & &- \frac a {3!} \cdot x^3 &+ \ldots
\end{align*}\]
<p>These are two Taylor series you might recognize! The top one is the series of \(a \cdot \cos(x)\), and the bottom one that of \(a \cdot \sin(x)\). Therefore, we conclude that</p>
\[p(x) = a(\sin(x) + \cos(x))\]
<p>Solving for \(a\) is simple when you consider that \(p \left(\frac 1 2 \right) = 1\). When robot 1 starts at \(\frac 1 2\), it is certain it will win:</p>
\[\begin{align*}
a &= \frac 1 {\sin \left(\frac 1 2 \right) + \cos \left(\frac 1 2 \right)} \\
\implies p(x) &= \frac {\sin(x) + \cos(x)} {\sin \left(\frac 1 2 \right) + \cos \left(\frac 1 2 \right)}
\end{align*}\]
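<p>We can check numerically that this closed form satisfies both the differential equation \(p'(x) = p(-x)\) and the boundary condition. The sketch below uses a central finite difference as a stand-in for the derivative; the helper name derivative and the step size are assumptions for this illustration.</p>

```javascript
// Closed form from the derivation: p(x) = a (sin x + cos x) with p(1/2) = 1.
const a = 1 / (Math.sin(0.5) + Math.cos(0.5));
const p = (x) => a * (Math.sin(x) + Math.cos(x));

// Central finite difference approximating the derivative of f at x.
function derivative(f, x, h = 1e-5) {
  return (f(x + h) - f(x - h)) / (2 * h);
}

// The two printed columns should agree: p'(x) on the left, p(-x) on the right.
for (const x of [-0.4, -0.2, 0, 0.2, 0.4]) {
  console.log(x, derivative(p, x).toFixed(8), p(-x).toFixed(8));
}
console.log(p(0.5).toFixed(8)); // boundary condition p(1/2) = 1
```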
<p>Now we can solve for \(x\) and find the solution to our original problem.</p>
\[\begin{align*}
& & p(x) &= 0.5 \\
&\iff & \frac {\sin(x) + \cos(x)} {\sin \left(\frac 1 2 \right) + \cos \left(\frac 1 2 \right)} &= 0.5 \\
&\iff & x &= \frac \pi 4 - \cos^{-1}\left(\frac {\sin\left(\frac 1 2\right) + \cos\left(\frac 1 2\right)} {2 \sqrt 2}\right) \\
&\implies & x &\approx -0.2850001
\end{align*}\]
<p>So if robot 1 starts at \(-0.2850001\), both robots will have a \(50\%\) chance of winning the tug of war, which is the answer to our problem.</p>
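<p>The last step uses the identity \(\sin(x) + \cos(x) = \sqrt 2 \cos\left(x - \frac \pi 4\right)\). As a small sketch (the variable names are my own for this illustration), the formula can be evaluated directly and plugged back into \(p\):</p>

```javascript
const s = Math.sin(0.5) + Math.cos(0.5);
const p = (x) => (Math.sin(x) + Math.cos(x)) / s;

// p(x) = 1/2  <=>  sqrt(2) cos(x - pi/4) = s / 2; take the negative solution.
const x = Math.PI / 4 - Math.acos(s / (2 * Math.SQRT2));

console.log(x.toFixed(7)); // starting position of the marker
console.log(p(x));         // should be 0.5 up to rounding
```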
<p><img src="/assets/jane-street-august-2021/closedform.svg" alt="A graph that shows the probability of robot 1 winning versus x, but this time a closed form solution." /></p>
<p>Double binary search (w/ Jane Street’s June 2021 challenge). Published 2021-07-01. https://mauritsvanaltvorst.com/jane-street-june-2021</p>
<p>This was definitely the most difficult <a href="https://www.janestreet.com/puzzles/robot-weightlifting-index/">Jane Street puzzle</a> I have solved so far. You can find solutions to previous Jane Street puzzles I’ve solved <a href="/jane-street-december-2020">here</a> and <a href="/jane-street-april-2021">here</a>.</p>
<pre style="white-space: pre-wrap;">
The Robot Weightlifting World Championship’s final round is about to begin! Three robots, seeded 1, 2, and 3, remain in contention. They take turns from the 3rd seed to the 1st seed publicly declaring exactly how much weight (any nonnegative real number) they will attempt to lift, and no robot can choose exactly the same amount as a previous robot. Once the three weights have been announced, the robots attempt their lifts, and the robot that successfully lifts the most weight is the winner. If all robots fail, they just repeat the same lift amounts until at least one succeeds.
Assume the following:
1) all the robots have the same probability p(w) of successfully lifting a given weight w;
2) p(w) is exactly known by all competitors, continuous, strictly decreasing as the w increases, p(0) = 1, and p(w) -> 0 as w -> infinity; and
3) all competitors want to maximize their chance of winning the RWWC.
If w is the amount of weight the 3rd seed should request, find p(w). Give your answer to an accuracy of six decimal places.
</pre>
<h1 id="weights-vs-probabilities">Weights vs probabilities</h1>
<p>While trying to model this problem mathematically, notice that the weights are not directly of importance: only \(p(w)\) is. \(p(w)\) is strictly decreasing and continuous, therefore bijective. It also holds that \(w_i < w_j \iff p(w_i) > p(w_j)\). Therefore, we can work directly with the probabilities of successful lifts, and reason about the ordering of the weights from these probabilities. This simple rework simplifies the problem massively. Let’s denote the probabilities of successful lifts by \(p_1\), \(p_2\) and \(p_3\). Robot #3 starts the game by choosing \(p_3\).</p>
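<p>For reference, here is a sketch of the win probability that such a model needs, written to be consistent with the functions used in the interactive plots further down. As in that code, \(r\) denotes the probability that all three robots fail in a round. Robot \(i\) wins a round if it succeeds while every robot that attempts a heavier weight (that is, has a lower success probability) fails, and rounds are repeated as long as everybody fails:</p>
\[\begin{align*}
r &= (1 - p_1)(1 - p_2)(1 - p_3) \\
\mathbb{P}(i \text{ wins}) &= \sum_{n=0}^{\infty} r^n \cdot p_i \prod_{j \colon p_j < p_i} (1 - p_j) \\
&= \frac {p_i \prod_{j \colon p_j < p_i} (1 - p_j)} {1 - r}
\end{align*}\]
<p>For example, if \(p_3 < p_1 < p_2\), robot 1 only has to worry about robot 3, so \(\mathbb{P}(1 \text{ wins}) = \frac {p_1 (1 - p_3)} {1 - r}\).</p>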
<p>We can make a simulation to get an approximate overview of the problem:</p>
<p><img src="/generated/assets/jane-street-june-2021/overview-799-421693f8b.png" alt="A graph that shows the probability of robot 3 winning versus p3." srcset="/generated/assets/jane-street-june-2021/overview-400-421693f8b.png 400w, /generated/assets/jane-street-june-2021/overview-600-421693f8b.png 600w, /generated/assets/jane-street-june-2021/overview-799-421693f8b.png 799w" /></p>
<p>In reality, the full range of \(p_1, p_2, p_3\) is \((0, 1]\), which contains infinitely many points. To simplify this, I chose 400 evenly distributed points on this continuous range (that was the max my computer could handle in a reasonable amount of time). That causes some small artifacts: some lines show a jumping pattern. We still get a good overview of the game if we ignore these artifacts for now.</p>
<p>Unfortunately, this brute force algorithm is too inefficient to solve the problem to 6 decimal digits of precision. In that case, we would have to choose at least \(10^6\) points for all three variables. This algorithm has time complexity \(O((10^n)^3) = O(1000^n)\) where \(n\) is the required precision in decimal digits. Definitely not quick enough for \(n=6\).</p>
<p>We can see that our optimal \(p_3\) is somewhere around \(0.28\). That value of \(p_3\) maximizes the probability of robot 3 winning. Our new goal is to find the exact \(p_3\) that corresponds to the kink at 0.28. Let’s redefine our boundaries on \(p_3\) and zoom in:</p>
<p><img src="/generated/assets/jane-street-june-2021/zoomed-794-0ec612b28.png" alt="The same graph as above, zoomed in around the maximum of p3" srcset="/generated/assets/jane-street-june-2021/zoomed-400-0ec612b28.png 400w, /generated/assets/jane-street-june-2021/zoomed-600-0ec612b28.png 600w, /generated/assets/jane-street-june-2021/zoomed-794-0ec612b28.png 794w" /></p>
<p>You can clearly see the rounding artifacts in this image. Notice I’ve added colours: orange corresponds to optimal solutions where \(p_1 = 1.0\). Consider the following data:</p>
<table>
<thead>
<tr>
<th>P(2 wins)</th>
<th>\(p_1\)</th>
<th>\(p_2\)</th>
<th>\(p_3\)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0.311602</td>
<td>1.000000</td>
<td>0.436090</td>
<td>0.285464</td>
</tr>
<tr>
<td>0.311165</td>
<td>1.000000</td>
<td>0.436090</td>
<td>0.286466</td>
</tr>
<tr>
<td>0.310728</td>
<td>1.000000</td>
<td>0.436090</td>
<td>0.287469</td>
</tr>
<tr>
<td>0.311596</td>
<td>0.288221</td>
<td>0.441103</td>
<td>0.288471</td>
</tr>
<tr>
<td>0.310984</td>
<td>0.288221</td>
<td>0.441103</td>
<td>0.289474</td>
</tr>
<tr>
<td>0.310372</td>
<td>0.288221</td>
<td>0.441103</td>
<td>0.290476</td>
</tr>
</tbody>
</table>
<p>Notice the sudden flip of the optimal \(p_1\) when we increase \(p_3\) too much. It is equal to 1.0 at first, but suddenly changes to ~0.28. The kink in the graph seems to correspond with the exact point where the optimal strategy for robot 1 switches from choosing \(p_1 = 1.0\) to choosing \(p_1 = p_3 - \epsilon\) (more on that later). It makes sense that this optimum point for robot #3 corresponds to a strategy switch of another player: at some point, robot #3 has made \(p_3\) so large that it provides robot #1 the opportunity to “undercut” \(p_3\) (that is, choose a slightly bigger weight than robot #3) and increase his chance of winning.</p>
<h1 id="layers-of-argmax">Layers of argmax</h1>
<p>Let’s fix \(p_3\) and look at the graph of \(p_2\) versus the probability of robot 2 winning (use the slider to change \(p_3\)!):</p>
<div style="display: flex;">
<span style="width: 3em;">$$p_3$$</span>
<input type="range" min="0" max="1000" value="280" oninput="handlep2(this.value)" style="width:100%" />
</div>
<div id="plot_area_p2"></div>
<script>
const w = 650;
const h = 450;
const padding = 60;
const epsilon = 0.00001;
function linspace(startValue, stopValue, cardinality) {
var arr = [];
var step = (stopValue - startValue) / (cardinality - 1);
for (var i = 0; i < cardinality; i++) {
arr.push(startValue + (step * i));
}
return arr;
}
function compute_probability_2_win(p1, p2, p3) {
let r = (1 - p1) * (1 - p2) * (1 - p3);
let win_prob = (1 - p1) * p2 * (1-p3);
if (p2 < p3) {
win_prob += (1 - p1) * p2 * p3;
}
if (p2 < p1) {
win_prob += p1 * p2 * (1 - p3);
}
if (p2 < p3 && p2 < p1) {
win_prob += p1 * p2 * p3;
}
return win_prob/(1-r);
}
function compute_probability_1_win(p1, p2, p3) {
let r = (1 - p1) * (1 - p2) * (1 - p3);
let win_prob = p1 * (1 - p2) * (1 - p3);
if (p1 < p2) {
win_prob += p1 * p2 * (1 - p3);
}
if (p1 < p3) {
win_prob += p1 * (1 - p2) * p3;
}
if (p1 < p2 && p1 < p3) {
win_prob += p1 * p2 * p3;
}
return win_prob/(1-r);
}
function getDatap2(p3) {
let data = [];
const np2 = 250;
let prob;
for (let p2 of linspace(0, 1, np2)) {
if (p3 == p2) continue;
let maximum_1 = [0, -1];
for (let p1 of [1.0, p2-epsilon, p3-epsilon]) {
if (p3 == p1) continue;
if (p1 <= 0) continue;
prob = compute_probability_1_win(p1, p2, p3);
if (prob > maximum_1[0]) maximum_1 = [prob, p1];
}
prob = compute_probability_2_win(maximum_1[1], p2, p3);
data.push([p2*100, prob*100]);
}
return data;
}
function handlep2(value) {
let p3 = value/1000;
updatep2(p3, getDatap2(p3));
}
let svgp2;
let label; // text element showing the current p₃, created in draw_initial
let xScalep2;
let yScalep2;
function updatep2(p3, data) {
let pMax = data.reduce((acc, value) => Math.max(acc, value[1]), 0);
label.text(`p₃ = ${p3}`);
svgp2
.select("#scatter")
.selectAll("circle")
.data(data)
.attr("cx", (d) => xScalep2(d[0]))
.attr("cy", (d) => yScalep2(d[1]))
.attr("r", 3)
.attr("fill", "steelblue");
svgp2
.select("#maxLine")
.datum([[0, pMax], [100, pMax]])
.attr('d', d3.line()
.x((d) => xScalep2(d[0]))
.y((d) => yScalep2(d[1])));
}
draw_initial(0.28, getDatap2(0.28));
function draw_initial(p3, data) {
const xMin = 0;
const xMax = 100;
const yMin = 0;
const yMax = 50;
let pMax = data.reduce((acc, value) => Math.max(acc, value[1]), 0);
xScalep2 = d3
.scaleLinear()
.domain([xMin, xMax])
.range([padding, w - padding]);
yScalep2 = d3
.scaleLinear()
.domain([yMin, yMax])
.range([h - padding, padding]);
svgp2 = d3
.select("#plot_area_p2")
.append("svg")
.attr("viewBox", "0 0 " + w + " " + h)
.attr("preserveAspectRatio", "xMinYMin meet");
label = svgp2.append('text')
.attr('x', xScalep2(50))
.attr('y', yScalep2(50))
.attr('text-anchor', 'middle')
.attr('font-weight', 'bold')
.style('font-family', 'sans-serif')
.style('font-size', '16px')
.text(`p₃ = ${p3}`);
svgp2
.append("g")
.style("font-size", "12px")
.attr("transform", "translate(0," + (h - padding) + ")")
.call(d3.axisBottom(xScalep2));
svgp2
.append("g")
.style("font-size", "12px")
.attr("transform", "translate(" + padding + ",0)")
.call(d3.axisLeft(yScalep2));
svgp2
.append("text")
.attr("x", w / 2)
.attr("y", h - 15)
.attr("text-anchor", "middle")
.style("font-family", "sans-serif")
.text("p₂");
svgp2
.append("text")
.attr("text-anchor", "middle")
.attr("transform", "translate(15," + h / 2 + ")rotate(-90)")
.style("font-family", "sans-serif")
.text("P(2 wins)");
svgp2.append('path')
.attr("id", "maxLine")
.datum([[0, pMax], [100, pMax]])
.attr('stroke', "brown")
.attr('stroke-width', 3)
.attr('d', d3.line()
.x((d) => xScalep2(d[0]))
.y((d) => yScalep2(d[1])));
const dots = svgp2
.append("g")
.attr("id", "scatter")
.selectAll("circle")
.data(data)
.enter()
.append("circle")
.attr("cx", d => xScalep2(d[0]))
.attr("cy", d => yScalep2(d[1]))
.attr("r", 3)
.attr("fill", "steelblue");
}
</script>
<p>Notice that across the entire \(0.0 < p_3 < 0.31\) range, the optimal \(p_2\) corresponds to the rightmost discontinuity. If we zoom in on this discontinuity, we find the following image:</p>
<p><img src="/generated/assets/jane-street-june-2021/p2graphzoomed-794-aa93df06a.png" alt="The same graph as above, zoomed in around the maximum of p2" srcset="/generated/assets/jane-street-june-2021/p2graphzoomed-400-aa93df06a.png 400w, /generated/assets/jane-street-june-2021/p2graphzoomed-600-aa93df06a.png 600w, /generated/assets/jane-street-june-2021/p2graphzoomed-794-aa93df06a.png 794w" /></p>
<p>This time, we have three different approximate values of \(p_1\): \(p_1=p_3 - \epsilon\), \(p_1 = p_2 - \epsilon\) or \(p_1 = 1.0\). Why these values? Consider the following graph, where we fix both \(p_3\) and \(p_2\) (use the sliders to change their values!):</p>
<div style="display: flex;">
<span style="width: 3em;">$$p_2$$</span>
<input type="range" min="0" max="1000" value="280" oninput="handlep1()" style="width:100%" id="p2" />
</div>
<div style="display: flex;">
<span style="width: 3em;">$$p_3$$</span>
<input type="range" min="0" max="1000" value="560" oninput="handlep1()" style="width:100%" id="p3" />
</div>
<div id="plot_area_p1"></div>
<script>
const wp1 = 650;
const hp1 = 450;
const paddingp1 = 60;
function linspace(startValue, stopValue, cardinality) {
var arr = [];
var step = (stopValue - startValue) / (cardinality - 1);
for (var i = 0; i < cardinality; i++) {
arr.push(startValue + (step * i));
}
return arr;
}
function compute_probability_1_win(p1, p2, p3) {
let r = (1 - p1) * (1 - p2) * (1 - p3);
let win_prob = p1 * (1 - p2) * (1 - p3);
if (p1 < p2) {
win_prob += p1 * p2 * (1 - p3);
}
if (p1 < p3) {
win_prob += p1 * (1 - p2) * p3;
}
if (p1 < p2 && p1 < p3) {
win_prob += p1 * p2 * p3;
}
return win_prob/(1-r);
}
function getDatap1(p2, p3) {
let data = [];
const np1 = 250;
for (let p1 of linspace(0, 1, np1)) {
if (p3 == p1) continue;
if (p1 <= 0) continue;
const prob = compute_probability_1_win(p1, p2, p3);
data.push([p1*100, prob*100]);
}
return data;
}
function handlep1() {
let p2 = d3.select("#p2").property("value")/1000;
let p3 = d3.select("#p3").property("value")/1000;
updatep1(p2, p3, getDatap1(p2, p3));
}
let xScalep1;
let yScalep1;
let labelp2;
let labelp3;
let svgp1;
function updatep1(p2, p3, data) {
let pMax = data.reduce((acc, value) => Math.max(acc, value[1]), 0);
labelp2.text(`p₂ = ${p2}`);
labelp3.text(`p₃ = ${p3}`);
svgp1
.select("#scatterp1")
.selectAll("circle")
.data(data)
.attr("cx", (d) => xScalep1(d[0]))
.attr("cy", (d) => yScalep1(d[1]))
.attr("r", 3)
.attr("fill", "steelblue");
svgp1
.select("#maxLinep1")
.datum([[0, pMax], [100, pMax]])
.attr('d', d3.line()
.x((d) => xScalep1(d[0]))
.y((d) => yScalep1(d[1])));
svgp1
.select("#p2line")
.attr("x1", xScalep1(p2*100))
.attr("x2", xScalep1(p2*100));
svgp1
.select("#p3line")
.attr("x1", xScalep1(p3*100))
.attr("x2", xScalep1(p3*100));
// circle.enter()
// .append("circle")
// .attr("cx", (d) => xScale(d[0]))
// .attr("cy", (d) => yScale(d[1]))
// .attr("r", 3)
// .attr("fill", "steelblue");
// circle.exit().remove();
}
draw_initialp1(0.28, 0.56, getDatap1(0.28, 0.56));
function draw_initialp1(p2, p3, data) {
// Set axis limits
const xMin = 0;
const xMax = 100;
const yMin = 0;
const yMax = 100;
let pMax = data.reduce((acc, value) => Math.max(acc, value[1]), 0);
// Set x and y-axis scales
xScalep1 = d3
.scaleLinear()
.domain([xMin, xMax])
.range([paddingp1, wp1 - paddingp1]);
yScalep1 = d3
.scaleLinear()
.domain([yMin, yMax])
.range([hp1 - paddingp1, paddingp1]);
// Append an svg to the plot_area div
svgp1 = d3
.select("#plot_area_p1")
.append("svg")
.attr("viewBox", "0 0 " + wp1 + " " + hp1)
.attr("preserveAspectRatio", "xMinYMin meet");
labelp2 = svgp1.append('text')
.attr('x', xScalep1(75))
.attr('y', yScalep1(95))
// .attr('text-anchor', 'middle')
.attr('font-weight', 'bold')
.style('font-family', 'sans-serif')
.style('font-size', '16px')
.text(`p₂ = ${p2}`);
labelp3 = svgp1.append('text')
.attr('x', xScalep1(75))
.attr('y', yScalep1(90))
// .attr('text-anchor', 'middle')
.attr('font-weight', 'bold')
.style('font-family', 'sans-serif')
.style('font-size', '16px')
.text(`p₃ = ${p3}`);
// Add x-axis
svgp1
.append("g")
.style("font-size", "12px")
.attr("transform", "translate(0," + (hp1 - paddingp1) + ")")
.call(d3.axisBottom(xScalep1));
// Add y-axis
svgp1
.append("g")
.style("font-size", "12px")
.attr("transform", "translate(" + paddingp1 + ",0)")
.call(d3.axisLeft(yScalep1));
// Add x-axis label
svgp1
.append("text")
.attr("x", wp1 / 2)
.attr("y", hp1 - 15)
.attr("text-anchor", "middle")
.style("font-family", "sans-serif")
.text("p₁");
// Add y-axis label
svgp1
.append("text")
.attr("text-anchor", "middle")
.attr("transform", "translate(15," + hp1 / 2 + ")rotate(-90)")
.style("font-family", "sans-serif")
.text("P(1 wins)");
svgp1.append('path')
.attr("id", "maxLinep1")
.datum([[0, pMax], [100, pMax]])
.attr('stroke', "brown")
.attr('stroke-width', 3)
.attr('d', d3.line()
.x((d) => xScalep1(d[0]))
.y((d) => yScalep1(d[1])));
svgp1.append("line")
.attr("id", "p2line")
.attr("x1", xScalep1(28))
.attr("y1", yScalep1(0))
.attr("x2", xScalep1(28))
.attr("y2", yScalep1(100))
.style("stroke-width", 1.5)
.style("stroke", "green")
.style("fill", "none");
svgp1.append("line")
.attr("id", "p3line")
.attr("x1", xScalep1(56))
.attr("y1", yScalep1(0))
.attr("x2", xScalep1(56))
.attr("y2", yScalep1(100))
.style("stroke-width", 1.5)
.style("stroke", "green")
.style("fill", "none");
const dots = svgp1
.append("g")
.attr("id", "scatterp1")
.selectAll("circle")
.data(data)
.enter()
.append("circle")
.attr("cx", d => xScalep1(d[0]))
.attr("cy", d => yScalep1(d[1]))
.attr("r", 3)
.attr("fill", "steelblue");
}
</script>
<p>Notice that no matter what values you choose via the sliders, the discontinuities always sit at the same places on the horizontal axis: \(p_2\), \(p_3\) and \(1.0\). Furthermore, the function is increasing between two discontinuities, so its maximum is always reached right before one of them: at \(p_1 = p_2 - \epsilon\), \(p_1 = p_3 - \epsilon\) or \(p_1 = 1.0\). This makes it easy to create a reaction function for robot #1: evaluate the chance of winning for the three possible values of \(p_1\) and choose the one with the highest probability of winning. It’s easy to obtain \(p_1\) this way, and, most importantly, we get the value of \(p_1\) to an arbitrary number of decimals. This reaction function of robot #1 will be called \(R_1(p_2, p_3)\).</p>
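<p>A minimal sketch of such a reaction function is shown below. The win probability is the same computation as in the plotting code above; the names winProbability1 and reaction1 and the value of epsilon are assumptions for this illustration.</p>

```javascript
// Probability that robot 1 wins given the three success probabilities.
function winProbability1(p1, p2, p3) {
  const allFail = (1 - p1) * (1 - p2) * (1 - p3);
  let win = p1 * (1 - p2) * (1 - p3);              // only robot 1 succeeds
  if (p1 < p2) win += p1 * p2 * (1 - p3);          // 1 lifts more than 2
  if (p1 < p3) win += p1 * (1 - p2) * p3;          // 1 lifts more than 3
  if (p1 < p2 && p1 < p3) win += p1 * p2 * p3;     // 1 lifts more than both
  return win / (1 - allFail);                      // rounds repeat while all fail
}

// R1(p2, p3): pick the best of the three candidate values for p1.
function reaction1(p2, p3, epsilon = 1e-9) {
  let best = null;
  let bestProbability = -1;
  for (const p1 of [1.0, p2 - epsilon, p3 - epsilon]) {
    if (p1 <= 0) continue;
    const probability = winProbability1(p1, p2, p3);
    if (probability > bestProbability) {
      best = p1;
      bestProbability = probability;
    }
  }
  return best;
}

console.log(reaction1(0.3, 0.2)); // robot 1 plays it safe with p1 = 1.0
console.log(reaction1(0.9, 0.5)); // robot 1 undercuts robot 3
console.log(reaction1(0.9, 0.1)); // robot 1 undercuts robot 2
```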
<p>Back to robot #2: for a given \(p_3\), and now that we know how robot #1 will react to robot #2 and #3, how can we figure out the exact optimal value of \(p_2\)?</p>
<h1 id="the-first-binary-search">The first binary search</h1>
<p>Let’s create the following indicator function:</p>
\[𝟙_2(p_2, p_3) = \begin{cases}
1 & & \mathbb{P}(\text{1 wins} \mid p_1 = p_2 - \epsilon, p_2, p_3) > \mathbb{P}(\text{1 wins} \mid p_1 = p_3 - \epsilon, &p_2, p_3) \\ & \wedge & \mathbb{P}(\text{1 wins} \mid p_1 = p_2 - \epsilon, p_2, p_3) > \mathbb{P}(\text{1 wins} \mid p_1 = 1.0, &p_2, p_3) \\
0 & & \text{otherwise}
\end{cases}\]
<p>It basically checks whether \(p_1 = p_2 - \epsilon\) is the optimal value for \(p_1\), and that is equivalent to being to the right of the discontinuity we showed above (i.e. is \(p_2\) one of the green dots?).</p>
<p>With this indicator function, we can do a binary search on \(p_2\). We keep track of a lower and an upper bound (0.27 and 0.29, for example), and we repeatedly check whether the middle of this interval is to the left or to the right of the discontinuity via the indicator function. If we are to the left of the discontinuity, we set the value of the new lower bound equal to the current middle. In the opposite case, we set the new value of the upper bound equal to the current middle. Because we reduce the size of the interval by half each iteration, we can compute \(p_2\) to an arbitrary number of decimal places in very little time. Each additional digit only takes \(\log_2(10) \approx 3.32\) iterations to compute. We can calculate \(p_2\) to 20 decimal places in only 67 iterations! This algorithm provides us with the reaction function of robot #2: \(R_2(p_3)\).</p>
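<p>The binary search itself is generic. The sketch below shows it on a toy indicator with a discontinuity at a made-up position, just to illustrate the mechanics and the iteration count. The names bisect and threshold are assumptions for this illustration, and ordinary floating-point numbers only carry about 16 significant digits, so 20 decimal places would need a higher-precision number type.</p>

```javascript
// Returns the point where `indicator` flips from false to true on [lo, hi],
// assuming it is false at lo, true at hi, and flips exactly once in between.
function bisect(indicator, lo, hi, iterations) {
  for (let i = 0; i < iterations; i++) {
    const mid = (lo + hi) / 2;
    if (indicator(mid)) {
      hi = mid; // we are to the right of the discontinuity
    } else {
      lo = mid; // we are to the left of the discontinuity
    }
  }
  return (lo + hi) / 2;
}

// Each decimal digit costs log2(10) iterations.
const iterationsFor20Digits = Math.ceil(20 * Math.log2(10));
console.log(iterationsFor20Digits);

// Toy example: a hypothetical discontinuity at 0.436.
const threshold = 0.436;
const found = bisect((x) => x > threshold, 0, 1, 40);
console.log(found.toFixed(10));
```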
<h1 id="the-second-binary-search">The second binary search</h1>
<p>Back to the main question: what value of \(p_3\) maximizes the probability of robot #3 winning?</p>
<p>We already found \(R_1\) and \(R_2\). Therefore we have all we need to create a function that returns the maximum probability of 3 winning given some \(p_3\):</p>
\[f(p_3) = \mathbb{P}(\text{3 wins} \mid p_1 = R_1(p_2, p_3), p_2 = R_2(p_3), p_3)\]
<p>I used a trick to check whether \(f\) is increasing or decreasing at some \(p_3\):</p>
\[𝟙_3(p_3) = \begin{cases}
1 & f(p_3 + \epsilon) - f(p_3) < 0 \\
0 & \text{otherwise}
\end{cases}\]
<p>By incrementing \(p_3\) by a small amount (\(\epsilon\)) and checking the influence of this change on f, we can define an indicator function that checks whether we’re to the left or right of the peak. We apply the same binary search algorithm as described above to this function to find the optimal value to six decimal places: \(p_3 = 0.286833\).</p>
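<p>Note that \(f\) actually depends on a binary search, and we use it within the indicator function of another binary search. Each binary search needs a number of iterations proportional to the number of digits, so the runtime of this algorithm is \(O(n^2)\) where \(n\) is the precision in decimal digits, or equivalently \(O(\log^2(N))\) where \(N = 10^n\) is the number of grid points brute force would need per variable. A time complexity I have never encountered before!</p>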
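<p>To make the nesting concrete, here is a compact, self-contained sketch of the whole procedure. It is not the code from my repository: the helper names, the values of EPS and DELTA, the iteration counts and the search interval \([0.28, 0.29]\) are all assumptions for this illustration. It also relies on the observation from above that robot #2 wants to sit just to the left of the discontinuity.</p>

```javascript
const EPS = 1e-9;   // how far a robot "undercuts" another probability
const DELTA = 1e-8; // step used to test whether f is increasing or decreasing

// Probability that robot i (index into p) wins: it succeeds while every robot
// with a lower success probability (a heavier weight) fails.
function winProbability(i, p) {
  let winsRound = p[i];
  let allFail = 1;
  for (let j = 0; j < p.length; j++) {
    allFail *= 1 - p[j];
    if (j !== i && p[j] < p[i]) winsRound *= 1 - p[j];
  }
  return winsRound / (1 - allFail);
}

// Robot 1's payoff for each of its three candidate choices.
function robot1Payoffs(p2, p3) {
  const value = (p1) => winProbability(0, [p1, p2, p3]);
  return { safe: value(1.0), under2: value(p2 - EPS), under3: value(p3 - EPS) };
}

// Indicator for the first binary search: is undercutting robot 2 strictly best?
function indicator2(p2, p3) {
  const v = robot1Payoffs(p2, p3);
  return v.under2 > v.under3 && v.under2 > v.safe;
}

function reaction1(p2, p3) {
  if (indicator2(p2, p3)) return p2 - EPS;
  const v = robot1Payoffs(p2, p3);
  return v.safe >= v.under3 ? 1.0 : p3 - EPS;
}

// Binary search that returns the last point where the indicator is still false.
function bisect(indicator, lo, hi, iterations = 60) {
  for (let i = 0; i < iterations; i++) {
    const mid = (lo + hi) / 2;
    if (indicator(mid)) hi = mid; else lo = mid;
  }
  return lo;
}

// R2(p3): just to the left of the discontinuity in p2.
function reaction2(p3) {
  return bisect((p2) => indicator2(p2, p3), p3 + 2 * EPS, 1);
}

// f(p3): probability that robot 3 wins when robots 1 and 2 react optimally.
function f(p3) {
  const p2 = reaction2(p3);
  const p1 = reaction1(p2, p3);
  return winProbability(2, [p1, p2, p3]);
}

// Second binary search: are we to the right of the peak of f?
const indicator3 = (p3) => f(p3 + DELTA) - f(p3) < 0;
const bestP3 = bisect(indicator3, 0.28, 0.29, 40);
console.log(bestP3.toFixed(6));
```

<p>Every evaluation of indicator3 runs two inner binary searches, which is exactly the nesting described above.</p>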
<p>My code can be found <a href="https://github.com/mvanaltvorst/janestreet/tree/main/06-2021">here.</a></p>
<h1 id="analytic-solution">Analytic solution</h1>
<p>I made a lot of assumptions and educated guesses during my initial solve of this problem; for example, that the maximum corresponds exactly to the kink at \(p_3 \approx 0.288\). My solution was the result of a lot of trial and error, and of many simulations to test my hypotheses. It would be a lot more elegant to solve this problem analytically. The Bellman equation can be used to solve problems with optimal substructure, a class this problem seems to belong to. I might revisit this problem in the future and try to solve it that way, as I have not yet figured out exactly how that method works!</p>
This was definitely the most difficult Jane Street puzzle I have solved so far. You can find solutions to previous Jane Street puzzles I’ve solved here and here.
Dynamic programming with probabilities (w/ Jane Street’s April 2021 challenge)
2021-05-01T09:00:00+00:00
https://mauritsvanaltvorst.com/jane-street-april-2021
<p>Three months ago, I wrote <a href="/jane-street-december-2020">an article about Jane Street’s December 2020 challenge</a>. Continuing in the same vein, I decided to write about this month’s Jane Street problem: <a href="https://www.janestreet.com/puzzles/bracketology-101-index/">Bracketology 101.</a></p>
<pre style="white-space: pre-wrap;">
There’s a certain insanity in the air this time of the year that gets us thinking about tournament brackets. Consider a tournament with 16 competitors, seeded 1-16, and arranged in the single-elimination bracket pictured below (identical to a “region” of the NCAA Division 1 basketball tournament). Assume that when the X-seed plays the Y-seed, the X-seed has a Y/(X+Y) probability of winning. E.g. in the first round, the 5-seed has a 12/17 chance of beating the 12-seed.
Suppose the 2-seed has the chance to secretly swap two teams’ placements in the bracket before the tournament begins. So, for example, say they choose to swap the 8- and 16-seeds. Then the 8-seed would play their first game against the 1-seed and have a 1/9 chance of advancing to the next round, and the 16-seed would play their first game against the 9-seed and have a 9/25 chance of advancing.
What seeds should the 2-seed swap to maximize their (the 2-seed’s) probability of winning the tournament, and how much does the swap increase that probability? Give your answer to six significant figures.
</pre>
<p><img src="/generated/assets/jane-street-april-2021/problem-562-08afad52e.png" alt="The input bracket arrangement" srcset="/generated/assets/jane-street-april-2021/problem-400-08afad52e.png 400w, /generated/assets/jane-street-april-2021/problem-562-08afad52e.png 562w" /></p>
<h1 id="approach">Approach</h1>
<p>There are \({16 \choose 2} = 120\) possible seed swaps, which is not many for a computer to enumerate. The difficulty lies in the evaluation of a bracket arrangement: what is the probability that the 2-seed wins with a given bracket assignment?</p>
<p>A simplified example reveals the answer. Take a look at this sample arrangement with 4 seeds.</p>
<p><img src="/generated/assets/jane-street-april-2021/sample4-800-2892d7ed0.png" alt="[1, 2, 3, 4] sample arrangement" srcset="/generated/assets/jane-street-april-2021/sample4-400-2892d7ed0.png 400w, /generated/assets/jane-street-april-2021/sample4-600-2892d7ed0.png 600w, /generated/assets/jane-street-april-2021/sample4-800-2892d7ed0.png 800w, /generated/assets/jane-street-april-2021/sample4-1000-2892d7ed0.png 1000w" /></p>
<p>To calculate the probability that 2 wins in the second round, you have to take into account both the case where 3 won the other first-round match <em>and</em> the case where 4 won:</p>
\[\begin{aligned}
&\text{P}(\text{2 wins #2}) = \text{P}(\text{2 wins #2} \mid \text{2 wins #1}) \cdot \text{P}(\text{2 wins #1}) \\
&\text{P}(\text{2 wins #2} \mid \text{2 wins #1}) =
\end{aligned} \\
\begin{aligned}
&\text{P}(\text{2 wins #2} \mid \text{2 wins #1}, \text{3 wins #1}) \cdot \text{P}(\text{3 wins #1})
\\ +\; &\text{P}(\text{2 wins #2} \mid \text{2 wins #1}, \text{4 wins #1}) \cdot \text{P}(\text{4 wins #1})
\end{aligned}\]
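<p>As a sanity check, here is the same calculation with numbers plugged in. It assumes the sample bracket pairs the 1-seed with the 2-seed and the 3-seed with the 4-seed in the first round, which is what the conditioning above implies, and it uses the rule that the X-seed beats the Y-seed with probability Y/(X+Y):</p>
\[\begin{aligned}
\text{P}(\text{2 wins #1}) &= \tfrac{1}{3}, \quad \text{P}(\text{3 wins #1}) = \tfrac{4}{7}, \quad \text{P}(\text{4 wins #1}) = \tfrac{3}{7} \\
\text{P}(\text{2 wins #2} \mid \text{2 wins #1}) &= \tfrac{3}{5} \cdot \tfrac{4}{7} + \tfrac{4}{6} \cdot \tfrac{3}{7} = \tfrac{22}{35} \\
\text{P}(\text{2 wins #2}) &= \tfrac{22}{35} \cdot \tfrac{1}{3} = \tfrac{22}{105} \approx 0.2095
\end{aligned}\]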
<p>The number of computations scales exponentially with the number of rounds. Worse than exponentially, actually. For the final round you have to take into account all possible opponents of the 2-seed, and those opponents have to take into account <em>their</em> competition as well, etc. You quickly end up with a lot of recursion, and time complexity becomes nontrivial.</p>
<p>We can avoid this recursive mess with some careful tabulation, a technique called <em>dynamic programming</em>. Let’s start with a 5x16 matrix: 5 rows for the rounds (I added a dummy round #0 for simplicity), and 16 columns for the competitors. Each round we memoize the probability that the \(i\)-seed has survived so far, and we store that information in the table. And each seed starts with a probability of survival of \(1.0\):</p>
<p><img src="/generated/assets/jane-street-april-2021/start-800-b13c37f6a.png" alt="A skeleton of the matrix we are about to fill" srcset="/generated/assets/jane-street-april-2021/start-400-b13c37f6a.png 400w, /generated/assets/jane-street-april-2021/start-600-b13c37f6a.png 600w, /generated/assets/jane-street-april-2021/start-800-b13c37f6a.png 800w, /generated/assets/jane-street-april-2021/start-1000-b13c37f6a.png 1000w" /></p>
<p>For round #1, we will start actually battling the seeds against each other. We store the probabilities of survival in the second row:</p>
<p><img src="/generated/assets/jane-street-april-2021/two-rows-800-a506a9347.png" alt="The second row of the skeleton filled" srcset="/generated/assets/jane-street-april-2021/two-rows-400-a506a9347.png 400w, /generated/assets/jane-street-april-2021/two-rows-600-a506a9347.png 600w, /generated/assets/jane-street-april-2021/two-rows-800-a506a9347.png 800w, /generated/assets/jane-street-april-2021/two-rows-1000-a506a9347.png 1000w" /></p>
<p>For round #2, we will have to take into account the two different possible competitors of each participant. The probability that the 1-seed survives depends on the probabilities of survival of both the 8-seed and the 9-seed. Therefore we will group all competitors in groups of two, and carefully calculate the conditional probabilities for the next row of the table. We repeat the same for the next row in groups of four, and after that in groups of eight.</p>
<p><img src="/generated/assets/jane-street-april-2021/final-800-6541dd819.png" alt="The final filled matrix" srcset="/generated/assets/jane-street-april-2021/final-400-6541dd819.png 400w, /generated/assets/jane-street-april-2021/final-600-6541dd819.png 600w, /generated/assets/jane-street-april-2021/final-800-6541dd819.png 800w, /generated/assets/jane-street-april-2021/final-1000-6541dd819.png 1000w" /></p>
<p>And that’s it. We can retrieve our answer in \(\text{dp}[4][\text{indexOf}(2)]\), and it only takes a computer a few milliseconds. By enumerating all 120 possible swaps we find the optimal answer: by swapping 3 and 16, we increase the probability of the 2-seed winning from \(21.6039\%\) to \(28.1619\%\).</p>
<h1 id="further-experimentation">Further experimentation</h1>
<p>We’ve done a single seed swap, but what happens when you repeatedly find the optimal seed swap and apply it? It turns out that if you take a random assignment of 16 seeds and repeatedly apply the greedy optimum seed swap, more than 99% of the time you converge to an arrangement that gives the 2-seed a \(33.65\%\) chance of winning!</p>
<p>Most of these arrangements seem to be “equivalent” variants of the arrangement
[1, 3, 4, 5, 6, 9, 8, 7, 10, 11, 12, 13, 14, 15, 16, 2]
which is an arrangement that makes sense intuitively: you want to battle the 2-seed against high seeds to make it to the final round with the highest probability, and you pit all the other seeds in the other side of the bracket against each other to hopefully eliminate the 1-seed.</p>
<p>It’s quite surprising that greedily doing optimal swaps results in an optimal solution this often. Probably because 16 is such a low population size; I expect that this greedy algorithm becomes less effective once you start battling 256 seeds, for example.</p>
<h1 id="conclusion">Conclusion</h1>
<p>You can look at my implementation <a href="https://github.com/mvanaltvorst/janestreet/tree/main/04-2021">here</a>. I would like to thank Jane Street for another one of their interesting monthly challenges.</p>
A guide to backtracking (w/ Jane Street’s December 2020 challenge)
2021-01-01T09:00:00+00:00
https://mauritsvanaltvorst.com/jane-street-december-2020
<p>Backtracking is an important concept in computer science and can be used in many different applications.
There are many guides on the internet that explain backtracking with relatively simple problems (n-queens, sudoku, etc.). In this article, we will consider a more challenging problem and the intuition you might develop while solving it.</p>
<p>We will look at <a href="https://www.janestreet.com/puzzles/twenty-four-seven-2-by-2-2/">Jane Street’s December 2020 challenge (Twenty Four Seven 2-by-2 #2)</a>:</p>
<pre style="white-space: pre-wrap;">
Each of the grids below is incomplete. Place numbers in some of the empty cells so that in total each grid’s interior contains one 1, two 2’s, etc., up to seven 7’s. Furthermore, each row and column within each grid must contain exactly 4 numbers which sum to 20. Finally, the numbered cells must form a connected region, but every 2-by-2 subsquare in the completed grid must contain at least one empty cell.
Some numbers have been placed inside each grid. Additionally, some blue numbers have been placed outside of the grids. These blue numbers indicate the first value seen in the corresponding row or column when looking into the grid from that location.
Once each of the grids is complete, create a 7-by-7 grid by “adding” the four grids’ interiors together (as if they were 7-by-7 matrices). The answer to this month’s puzzle is the sum of the squares of the values in this final grid.
</pre>
<p><img src="/generated/assets/jane-street-december-2020/problem-800-55042bce9.png" alt="The four input grids" srcset="/generated/assets/jane-street-december-2020/problem-400-55042bce9.png 400w, /generated/assets/jane-street-december-2020/problem-600-55042bce9.png 600w, /generated/assets/jane-street-december-2020/problem-800-55042bce9.png 800w, /generated/assets/jane-street-december-2020/problem-803-55042bce9.png 803w" /></p>
<h2 id="some-information-about-backtracking">Some information about backtracking</h2>
<p>Generally, backtracking is used when you want to consider all possible solutions to a problem. Fundamentally, backtracking is a variant of path finding in a more abstract space: the space of states and actions.</p>
<p>Consider the game as a graph. Each “game state” is a vertex, and each “action” you take is an edge. There are many possible actions. In this instance, we define an action as picking a not yet considered square and choosing either to leave it empty or to fill in a number in the range 1 through 7.</p>
<p><img src="/generated/assets/jane-street-december-2020/tree_overview-800-db463e129.jpg" alt="A game tree" srcset="/generated/assets/jane-street-december-2020/tree_overview-400-db463e129.jpg 400w, /generated/assets/jane-street-december-2020/tree_overview-600-db463e129.jpg 600w, /generated/assets/jane-street-december-2020/tree_overview-800-db463e129.jpg 800w, /generated/assets/jane-street-december-2020/tree_overview-1000-db463e129.jpg 1000w" /></p>
<p>Let’s consider the graph you could make if you start with an empty grid. With an empty grid, there are 49 squares where you can do an action, with 8 possible actions per square (empty or 1-7). After we’ve placed the first number, there are 48 squares left with 8 actions per square, et cetera.</p>
<p>This implies that we end up with \((49\times8)\times(48\times8)\times \dots \times (2\times 8) \times (1 \times 8) = 49! \times 8^{49} \approx 1.08 \times 10^{107}\) final game states.</p>
<p>How big is this number? Imagine a modern desktop computer for every atom in the visible universe. Even if you cluster them into one big supercomputer, brute force computation would still take slightly longer than the age of the universe. That’s where backtracking comes in. The key to backtracking is to cut off branches of this game tree as quickly as possible:</p>
<p><img src="/generated/assets/jane-street-december-2020/cutoff-800-ceb236d4c.jpg" alt="A cut off tree" srcset="/generated/assets/jane-street-december-2020/cutoff-400-ceb236d4c.jpg 400w, /generated/assets/jane-street-december-2020/cutoff-600-ceb236d4c.jpg 600w, /generated/assets/jane-street-december-2020/cutoff-800-ceb236d4c.jpg 800w, /generated/assets/jane-street-december-2020/cutoff-1000-ceb236d4c.jpg 1000w" /></p>
<p>The solution should contain one 1, two 2’s, …, and seven 7’s. As soon as we notice that we have used number 1 two times, we can stop further search from this game state, because we’ll never arrive at the solution anyways. By cutting off the tree this early during the search, we save a massive amount of computation.</p>
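<p>To make the pruning idea concrete, here is a toy sketch that only enforces the “digit d is used at most d times” rule on a strip of squares. The struct <code>Search</code> and its members are hypothetical names for this illustration; the real solver tracks far more state.</p>

```cpp
#include <array>

// Toy backtracking search: fill squares one by one with 0 (empty) or a digit
// 1..maxDigit, never using digit d more than d times.
struct Search {
    int maxDigit;
    std::array<int, 8> used{};  // used[d] = how often digit d is on the board
    long long nodes = 0;        // number of partial states visited

    explicit Search(int m) : maxDigit(m) {}

    // Returns the number of valid ways to fill the remaining squares.
    long long count(int squaresLeft) {
        ++nodes;
        if (squaresLeft == 0) return 1;  // reached a complete, valid filling
        long long total = 0;
        for (int d = 0; d <= maxDigit; ++d) {
            if (d > 0 && used[d] == d) continue;  // prune: digit d is used up
            if (d > 0) ++used[d];
            total += count(squaresLeft - 1);
            if (d > 0) --used[d];  // backtrack
        }
        return total;
    }
};
```

<p>The <code>continue</code> is where a whole subtree disappears: nothing below an over-used digit is ever visited, which is what keeps <code>nodes</code> far below the size of the full tree.</p>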
<p>There are several observations you should consider when backtracking:</p>
<h1 id="key-observation-1-pruning-in-the-beginning-has-more-effect-than-pruning-at-the-end">Key observation #1: pruning in the beginning has more effect than pruning at the end</h1>
<p>The number of possible states rises exponentially. By cutting off the tree at the beginning you can reduce computation time by a lot. Some constraints are better at cutting off computation early than others. The rule that states that you can only use one 1, for example, already starts cutting off branches at move #2. In contrast, the constraint that all numbers should form one big connected island is less useful at the beginning of the search. There might be multiple disconnected islands at the beginning that converge later on.</p>
<h1 id="key-observation-2-more-constraints-make-computation-faster-not-slower">Key observation #2: more constraints make computation faster, not slower</h1>
<p>This might sound unintuitive. For humans, more rules usually make games more complex, but for computers more rules allow more tree pruning which reduces computation time.</p>
<p>With these observations, we can start optimizing our tree search:</p>
<h1 id="optimization-1-fix-the-order-of-actions">Optimization #1: fix the order of actions</h1>
<p>Consider the following two situations:</p>
<p><strong>Situation A:</strong></p>
<ol>
<li>place 2 at (0, 0)</li>
<li>place 5 at (0, 1)</li>
</ol>
<p><strong>Situation B:</strong></p>
<ol>
<li>place 5 at (0, 1)</li>
<li>place 2 at (0, 0)</li>
</ol>
<p><img src="/generated/assets/jane-street-december-2020/situations-800-7a234bcf3.jpg" alt="A diagram of the situations" srcset="/generated/assets/jane-street-december-2020/situations-400-7a234bcf3.jpg 400w, /generated/assets/jane-street-december-2020/situations-600-7a234bcf3.jpg 600w, /generated/assets/jane-street-december-2020/situations-800-7a234bcf3.jpg 800w, /generated/assets/jane-street-december-2020/situations-1000-7a234bcf3.jpg 1000w" /></p>
<p>You might notice that both of these situations lead to exactly the same puzzle state. If you’ve already established that situation A never leads to a valid solution, any tree expansion from situation B is redundant. This means that a naive expansion of the whole tree leads to an enormous amount of redundant computation: every possible final game state appears in the expanded tree \(49! \approx 6.08 \times 10^{62}\) times! There is an easy way to fix this: if we visit all coordinates in a fixed order, we can be certain that every state in the tree is unique. With this measure in place, we achieve a massive 608 novemdecillion times speedup.</p>
<p><img src="/generated/assets/jane-street-december-2020/fixed_order-800-946600f5c.jpg" alt="A diagram fixed order actions" srcset="/generated/assets/jane-street-december-2020/fixed_order-400-946600f5c.jpg 400w, /generated/assets/jane-street-december-2020/fixed_order-600-946600f5c.jpg 600w, /generated/assets/jane-street-december-2020/fixed_order-800-946600f5c.jpg 800w, /generated/assets/jane-street-december-2020/fixed_order-1000-946600f5c.jpg 1000w" /></p>
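<p>Put differently, with a fixed visiting order only the choice per square remains (empty or 1 through 7), so the number of leaves in the unpruned tree drops to:</p>
\[\frac{49! \times 8^{49}}{49!} = 8^{49} = 2^{147} \approx 1.78 \times 10^{44}\]
<p>That is still far too many states to enumerate, which is why the remaining optimizations focus on pruning.</p>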
<h1 id="optimization-2-use-derivative-weak-rules">Optimization #2: use “derivative” weak rules</h1>
<p>Consider the following state:</p>
<p><img src="/generated/assets/jane-street-december-2020/weak-800-d3fc16129.jpg" alt="An illustrative diagram for weak rules" srcset="/generated/assets/jane-street-december-2020/weak-400-d3fc16129.jpg 400w, /generated/assets/jane-street-december-2020/weak-600-d3fc16129.jpg 600w, /generated/assets/jane-street-december-2020/weak-800-d3fc16129.jpg 800w, /generated/assets/jane-street-december-2020/weak-937-d3fc16129.jpg 937w" /></p>
<p>Consider column 6. The sum of that column so far is 28. Because this sum can never get smaller, we know that this state is invalid. Mind that this is not one of the original constraints; it is a derivative rule of the constraint “the sum of each column is 20”. We can do something similar with row 3: even though that row is not filled yet, we can deduce that this row will never satisfy the constraint that the leftmost number should be a 5.</p>
<p>Similar derivative rules can be derived from the other constraints. Consider the following constraint: “all numbers should form one connected component”. This is a constraint you can only use relatively late in the game: there might be multiple disconnected islands in the beginning that only start to converge later on. However, we can still take advantage of this rule by creating our own “derivative rule”: never block off an island with empty squares. This rule can be enforced more easily at the beginning of the tree, and thus is more valuable when you’re trying to do backtracking.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Weak column check: a partially filled column is only rejected once it has more than 4 numbers or a sum above 20.</span>
<span class="kt">bool</span> <span class="nf">checkColumnWeak</span><span class="p">(</span><span class="kt">int</span> <span class="n">j</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="n">columnPieces</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">4</span> <span class="o">&&</span> <span class="n">columnSum</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">20</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// ...</span>
<span class="c1">// Weak west-view check: the first number seen from the left in row i must equal wv[i].</span>
<span class="kt">bool</span> <span class="nf">checkWVWeak</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="mi">7</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">wv</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="k">continue</span><span class="p">;</span> <span class="c1">// rows marked -1 are skipped</span>
        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">j</span> <span class="o"><</span> <span class="mi">7</span><span class="p">;</span> <span class="n">j</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
            <span class="k">if</span> <span class="p">(</span><span class="n">grid</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="n">j</span><span class="p">]</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="k">continue</span><span class="p">;</span> <span class="c1">// no number in this cell, keep looking</span>
            <span class="k">if</span> <span class="p">(</span><span class="n">grid</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="n">j</span><span class="p">]</span> <span class="o">>=</span> <span class="mi">1</span> <span class="o">&&</span> <span class="n">grid</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="n">j</span><span class="p">]</span> <span class="o">!=</span> <span class="n">wv</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
            <span class="k">break</span><span class="p">;</span> <span class="c1">// only the first number in the row matters</span>
        <span class="p">}</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// ... similar functions for north view, east view and south view.</span>
</code></pre></div></div>
<h1 id="optimization-3-change-the-order-of-actions">Optimization #3: change the order of actions</h1>
<p>There are multiple ways to visit all coordinates one-by-one. The most straightforward way would be row-by-row:</p>
<p><img src="/generated/assets/jane-street-december-2020/row-by-row-800-6e41981ee.jpg" alt="A diagram of row-by-row search" srcset="/generated/assets/jane-street-december-2020/row-by-row-400-6e41981ee.jpg 400w, /generated/assets/jane-street-december-2020/row-by-row-600-6e41981ee.jpg 600w, /generated/assets/jane-street-december-2020/row-by-row-800-6e41981ee.jpg 800w, /generated/assets/jane-street-december-2020/row-by-row-1000-6e41981ee.jpg 1000w" /></p>
<p>There are also other possible orders in which you can consider squares. Consider a spiral, for example:</p>
<p><img src="/generated/assets/jane-street-december-2020/spiral-800-e34a9811e.jpg" alt="A diagram of spiral search" srcset="/generated/assets/jane-street-december-2020/spiral-400-e34a9811e.jpg 400w, /generated/assets/jane-street-december-2020/spiral-600-e34a9811e.jpg 600w, /generated/assets/jane-street-december-2020/spiral-800-e34a9811e.jpg 800w, /generated/assets/jane-street-december-2020/spiral-1000-e34a9811e.jpg 1000w" /></p>
<p>This has two big advantages over the row-by-row order:</p>
<ol>
<li>You can consider whole rows and whole columns earlier. After 7 squares, you can strongly check row #1. 6 squares further, and you can consider column #7, etc. With the row-by-row order you can only consider whole rows every 7 squares, and you can only start checking whole columns all the way at the end of the tree (when you start filling the last row).</li>
<li>By tracing along the edge in a spiral, you can start using the side views earlier on in your computation. When you use the row-by-row order, you can only start using the south view when you start filling the last row, for example.</li>
</ol>
<p>Formulas to calculate which coordinate follows in a spiral <a href="https://web.archive.org/web/20141202041502/https://danpearcymaths.wordpress.com/2012/09/30/infinity-programming-in-geogebra-and-failing-miserably/">exist</a>, though it’s easier and more efficient to just compute a table at the beginning:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Excerpt: relies on pair and make_pair from the utility header, with using namespace std.</span>
<span class="n">pair</span><span class="o"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="o">></span> <span class="n">nextCoordinate</span><span class="p">[</span><span class="mi">7</span><span class="p">][</span><span class="mi">7</span><span class="p">];</span>
<span class="kt">bool</span> <span class="n">isTurn</span><span class="p">[</span><span class="mi">7</span><span class="p">][</span><span class="mi">7</span><span class="p">];</span>
<span class="kt">void</span> <span class="nf">initCoordinateOrder</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// Walk the spiral outwards from the centre. Every newly reached square points back at</span>
    <span class="c1">// the square we came from, so following nextCoordinate runs the spiral from the outside in.</span>
    <span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">3</span><span class="p">,</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">3</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">di</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="n">dj</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">maxi</span> <span class="o">=</span> <span class="mi">3</span><span class="p">,</span> <span class="n">mini</span> <span class="o">=</span> <span class="mi">3</span><span class="p">,</span> <span class="n">maxj</span> <span class="o">=</span> <span class="mi">3</span><span class="p">,</span> <span class="n">minj</span> <span class="o">=</span> <span class="mi">3</span><span class="p">;</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">k</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">k</span> <span class="o"><</span> <span class="mi">48</span><span class="p">;</span> <span class="n">k</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">nextCoordinate</span><span class="p">[</span><span class="n">i</span><span class="o">+</span><span class="n">di</span><span class="p">][</span><span class="n">j</span><span class="o">+</span><span class="n">dj</span><span class="p">]</span> <span class="o">=</span> <span class="n">make_pair</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">);</span>
        <span class="n">i</span> <span class="o">+=</span> <span class="n">di</span><span class="p">;</span>
        <span class="n">j</span> <span class="o">+=</span> <span class="n">dj</span><span class="p">;</span>
        <span class="c1">// Whenever the walk extends the bounding box, rotate the direction by 90 degrees.</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="o">></span> <span class="n">maxi</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">maxi</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span>
            <span class="kt">int</span> <span class="n">buffer</span> <span class="o">=</span> <span class="n">dj</span><span class="p">;</span>
            <span class="n">dj</span> <span class="o">=</span> <span class="n">di</span><span class="p">;</span>
            <span class="n">di</span> <span class="o">=</span> <span class="o">-</span><span class="n">buffer</span><span class="p">;</span>
            <span class="n">isTurn</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="o"><</span> <span class="n">mini</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">mini</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span>
            <span class="kt">int</span> <span class="n">buffer</span> <span class="o">=</span> <span class="n">dj</span><span class="p">;</span>
            <span class="n">dj</span> <span class="o">=</span> <span class="n">di</span><span class="p">;</span>
            <span class="n">di</span> <span class="o">=</span> <span class="o">-</span><span class="n">buffer</span><span class="p">;</span>
            <span class="n">isTurn</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">j</span> <span class="o">></span> <span class="n">maxj</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">maxj</span> <span class="o">=</span> <span class="n">j</span><span class="p">;</span>
            <span class="kt">int</span> <span class="n">buffer</span> <span class="o">=</span> <span class="n">dj</span><span class="p">;</span>
            <span class="n">dj</span> <span class="o">=</span> <span class="n">di</span><span class="p">;</span>
            <span class="n">di</span> <span class="o">=</span> <span class="o">-</span><span class="n">buffer</span><span class="p">;</span>
            <span class="n">isTurn</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">j</span> <span class="o"><</span> <span class="n">minj</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">minj</span> <span class="o">=</span> <span class="n">j</span><span class="p">;</span>
            <span class="kt">int</span> <span class="n">buffer</span> <span class="o">=</span> <span class="n">dj</span><span class="p">;</span>
            <span class="n">dj</span> <span class="o">=</span> <span class="n">di</span><span class="p">;</span>
            <span class="n">di</span> <span class="o">=</span> <span class="o">-</span><span class="n">buffer</span><span class="p">;</span>
            <span class="n">isTurn</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
        <span class="p">}</span>
    <span class="p">}</span>
    <span class="c1">// The centre is the last square of the spiral; it points to itself.</span>
    <span class="n">nextCoordinate</span><span class="p">[</span><span class="mi">3</span><span class="p">][</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">make_pair</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<h1 id="optimization-4-check-connectedness-every-once-in-a-while">Optimization #4: check connectedness every once in a while</h1>
<p>To check connectedness between islands, we do a <a href="https://en.wikipedia.org/wiki/Flood_fill">flood fill</a> across the entire grid every once in a while. This traversal of the whole grid is relatively expensive compared to the other constraint checks we’ve made so far. It happens to be faster to do this check every once in a while than to do it on every action.</p>
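<p>A flood fill for this check could look roughly like the sketch below. The cell encoding (-1 for undecided, 0 for empty, 1 through 7 for numbers) and the name <code>canStillBeConnected</code> are assumptions made for this illustration. Undecided cells are treated as passable, so the check only rejects states in which two islands can never meet again.</p>

```cpp
#include <array>
#include <queue>
#include <utility>

constexpr int N = 7;
// Hypothetical cell encoding: -1 = undecided, 0 = empty, 1..7 = number.
using Grid = std::array<std::array<int, N>, N>;

// Weak connectivity check: can all numbered cells still end up in one region?
// Undecided cells count as passable because they might receive a number later.
bool canStillBeConnected(const Grid& grid) {
    int numbered = 0;
    std::pair<int, int> start(-1, -1);
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            if (grid[i][j] > 0) {
                if (numbered == 0) start = std::make_pair(i, j);
                ++numbered;
            }
        }
    }
    if (numbered == 0) return true;  // nothing placed yet

    std::array<std::array<bool, N>, N> seen{};  // all false
    std::queue<std::pair<int, int>> frontier;
    frontier.push(start);
    seen[start.first][start.second] = true;
    int reached = 0;  // numbered cells reached by the flood fill
    const int di[4] = {1, -1, 0, 0};
    const int dj[4] = {0, 0, 1, -1};
    while (!frontier.empty()) {
        const std::pair<int, int> cell = frontier.front();
        frontier.pop();
        if (grid[cell.first][cell.second] > 0) ++reached;
        for (int d = 0; d < 4; ++d) {
            const int ni = cell.first + di[d];
            const int nj = cell.second + dj[d];
            if (ni < 0 || ni >= N || nj < 0 || nj >= N) continue;
            if (seen[ni][nj] || grid[ni][nj] == 0) continue;  // visited or empty
            seen[ni][nj] = true;
            frontier.push(std::make_pair(ni, nj));
        }
    }
    return reached == numbered;
}
```

<p>Every call touches up to 49 cells, which is why it pays off to run it only every few actions instead of after each one.</p>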
<p>There are alternatives to doing a whole flood fill to check connectedness. You could, for example, use a <a href="https://en.wikipedia.org/wiki/Disjoint-set_data_structure">disjoint-set</a> structure to keep track of which components are connected to which other components. For small grids such as this one the speed difference with flood fill would probably be marginal, but for larger grids a disjoint-set structure would yield increasingly large speedups (for an \(n \times n\) grid, a flood fill takes \(O(n^2)\) per check, while a disjoint-set operation takes \(O(\log(n))\)).</p>
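<p>One wrinkle is that backtracking has to undo merges, which a plain disjoint-set cannot do. A sketch of a variant with rollback is shown below; I have not used this in my solution, and the class name <code>RollbackDsu</code> is hypothetical. It uses union by size without path compression, which keeps <code>find</code> logarithmic while making each merge cheap to revert.</p>

```cpp
#include <utility>
#include <vector>

// Disjoint-set with undo: union by size, no path compression.
class RollbackDsu {
public:
    explicit RollbackDsu(int n) : parent_(n), size_(n, 1) {
        for (int i = 0; i < n; ++i) parent_[i] = i;
    }

    // Returns the representative of the component containing x.
    int find(int x) const {
        while (parent_[x] != x) x = parent_[x];
        return x;
    }

    // Merges the components of a and b; returns false if already merged.
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) {
            history_.push_back(-1);  // record a no-op so undo stays in sync
            return false;
        }
        if (size_[a] < size_[b]) std::swap(a, b);
        parent_[b] = a;  // attach the smaller tree below the larger one
        size_[a] += size_[b];
        history_.push_back(b);
        return true;
    }

    // Reverts the most recent unite call (call in reverse order of unite).
    void undo() {
        const int b = history_.back();
        history_.pop_back();
        if (b < 0) return;  // that call did not merge anything
        const int a = parent_[b];
        size_[a] -= size_[b];
        parent_[b] = b;
    }

private:
    std::vector<int> parent_;
    std::vector<int> size_;
    std::vector<int> history_;
};
```

<p>In a solver, placing a number would call <code>unite</code> with each numbered neighbour, and stepping back up the tree would call <code>undo</code> once per <code>unite</code>.</p>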
<h1 id="conclusion">Conclusion</h1>
<p>You can look at my implementation <a href="https://github.com/mvanaltvorst/janestreet/tree/main/12-2020">here</a>. With all of these optimizations in place, solving all 4 grids takes less than 1 second on an Intel Core i5-6300HQ. I would like to thank Jane Street for their monthly challenges; this was an interesting problem with endless possibilities to speed up computation.</p>
My interactive LED strip
2019-08-01T09:00:00+00:00
https://mauritsvanaltvorst.com/ledlight
<p>My old light strip recently broke. Instead of buying a new one,
I thought I’d try making my own improved version.</p>
<p><img src="/generated/assets/ledlight/interactivestrip-800-d45e7addf.jpg" alt="A LED strip above my bed showing different segments, bounded by colors" srcset="/generated/assets/ledlight/interactivestrip-400-d45e7addf.jpg 400w, /generated/assets/ledlight/interactivestrip-600-d45e7addf.jpg 600w, /generated/assets/ledlight/interactivestrip-800-d45e7addf.jpg 800w, /generated/assets/ledlight/interactivestrip-1000-d45e7addf.jpg 1000w" /></p>
<p>It has several different “widgets”, from left to right:</p>
<ul>
<li>a 2 hour weather forecast (blue means rain, yellow means sun, quite
straightforward)</li>
<li>a timeline of upcoming calendar events in the next 4 hours, synced with my phone</li>
<li>my stock portfolio (green means good, red means bad)</li>
</ul>
<p>It is also possible to use the strip as an ordinary RGB LED strip via a simple
web UI, and there’s a “night mode” that turns off the blue LEDs.</p>
<p><img src="/generated/assets/ledlight/nightlight-800-b2c4570a3.jpg" alt="The LED strip showing orange light (zero blue light emission)" srcset="/generated/assets/ledlight/nightlight-400-b2c4570a3.jpg 400w, /generated/assets/ledlight/nightlight-600-b2c4570a3.jpg 600w, /generated/assets/ledlight/nightlight-800-b2c4570a3.jpg 800w, /generated/assets/ledlight/nightlight-1000-b2c4570a3.jpg 1000w" /></p>
<h2 id="architecture">Architecture</h2>
<p>There are two parts to this project: the controller that talks directly to
the LEDs, and the server that fetches weather, stock and calendar information and passes that
information on to the controller.</p>
<p>The LED strip + controller consists of a few different parts:</p>
<ul>
<li>A 4 meter 30 LEDs/m WS2812B strip. This is a 5V LED strip that can be cut to
length. What makes this strip special is that each LED is individually addressable over a single data line (the WS2812B uses its own timing-based one-wire protocol rather than I2C).</li>
<li>An ESP8266. This is a cheap microcontroller with built-in WiFi.</li>
<li>A 3.3V to 5V logic level converter. The ESP8266 uses 3.3V logic, but the LED strip expects
5V logic. This logic level converter acts as a middle man between these two
components and converts 3.3V signals to 5V and vice versa.</li>
<li>A 5V 8A power supply. Each of these LEDs can pull up to 60mA. With 120 LEDs,
that is 120 * 0.060 = 7.2A.</li>
<li>A big 1000µF capacitor connected to the 5V rail, to prevent sudden power
spikes from damaging the ESP8266 or the LED strip.</li>
</ul>
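<p>The power budget from the parts list can be double-checked with a few lines of Python. This is only a back-of-the-envelope sketch: it uses the up-to-60mA-per-LED figure from above as the worst case, works in milliamps to avoid floating-point rounding, and the constant names are made up for this example.</p>

```python
LEDS_PER_METER = 30
STRIP_LENGTH_M = 4
MAX_CURRENT_PER_LED_MA = 60   # worst case per LED, as quoted above
SUPPLY_CURRENT_MA = 8000      # the 5V 8A power supply

# Total number of LEDs on the strip.
led_count = LEDS_PER_METER * STRIP_LENGTH_M

# Worst-case draw if every LED pulls its maximum current at once.
peak_current_ma = led_count * MAX_CURRENT_PER_LED_MA

# What is left of the supply's rating at that peak.
headroom_ma = SUPPLY_CURRENT_MA - peak_current_ma

print(f"{led_count} LEDs, peak {peak_current_ma / 1000} A, "
      f"headroom {headroom_ma / 1000} A")
```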
<p>These parts cost me ~€26 in total.</p>
<p>One problem I ran into was the voltage drop after a few meters. 7.2 amperes
through a thin copper wire means that you lose a significant amount of voltage
over a few meters. This meant that the last LEDs in the strip got a reddish tint
as the blue LEDs became dimmer (this has to do with the difference in <a href="http://www.learningaboutelectronics.com/Articles/What-is-the-forward-voltage-of-an-LED">forward voltage</a>
between red and blue LEDs; blue LEDs require a higher voltage to turn on than red LEDs).
This was easily solved by running two extra thin copper wires (one for +5V and
one for GND) along the LED strip and soldering them to the other end of the strip.</p>
<h2 id="the-controller">The controller</h2>
<p>The ESP8266 runs a very simple HTTP REST API that allows you to set the color of ranges of LEDs on
the strip. It is very dumb and does not do anything other than turning some LEDs
off and on. The code is available <a href="https://github.com/mvanaltvorst/bedlightcontrol/">here</a>.</p>
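<p>To make the idea of “setting ranges” concrete, here is a small Python sketch of the logic involved. It is not the firmware from the repository (which runs on the ESP8266), and the function names and the half-open range convention are assumptions made for this example. The second function mimics the “night mode” mentioned earlier by zeroing the blue channel.</p>

```python
from typing import List, Tuple

Color = Tuple[int, int, int]  # (red, green, blue), each 0-255


def set_range(strip: List[Color], start: int, end: int, color: Color) -> None:
    """Set LEDs start..end-1 to one color, ignoring indices outside the strip."""
    start = max(0, start)
    end = min(len(strip), end)
    for i in range(start, end):
        strip[i] = color


def night_mode(strip: List[Color]) -> List[Color]:
    """Return a copy of the strip with the blue channel switched off."""
    return [(r, g, 0) for r, g, _ in strip]


strip = [(0, 0, 0)] * 120              # 4 m at 30 LEDs/m
set_range(strip, 0, 30, (0, 0, 255))   # first meter blue
set_range(strip, 30, 60, (255, 200, 0))  # second meter yellow
dimmed = night_mode(strip)
```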
<h2 id="the-server">The server</h2>
<p>The server that fetches weather, calendar and stock info was written in Go and is
available <a href="https://github.com/mvanaltvorst/bedlightserver">here</a>. It is easy to create new modules (see <a href="https://github.com/mvanaltvorst/bedlightserver/blob/master/widgets/weatherwidget.go">the weather widget</a>, for
example). The server automatically gives each module the chance to update
every 15 minutes. There is also a small web server running on port 8080 that
allows you to control the different modes of the strip, although its interface is
very rudimentary, ugly and not yet finished.</p>
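<p>The update cycle can be sketched roughly as follows. This is a Python illustration of the idea only, not the Go code from the repository: the <code class="highlighter-rouge">Widget</code> class, its <code class="highlighter-rouge">fetch</code> callback and the <code class="highlighter-rouge">tick</code> function are hypothetical names, and the current time is passed in explicitly so the behaviour is easy to test.</p>

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Color = Tuple[int, int, int]
UPDATE_INTERVAL_S = 15 * 60  # the 15-minute interval mentioned above


@dataclass
class Widget:
    name: str
    fetch: Callable[[], List[Color]]  # hypothetical: returns one color per LED
    colors: List[Color] = field(default_factory=list)
    last_update: Optional[float] = None


def tick(widgets: List[Widget], now: float) -> List[str]:
    """Refresh every widget whose data is stale; return the names refreshed."""
    refreshed = []
    for widget in widgets:
        stale = (widget.last_update is None
                 or now - widget.last_update >= UPDATE_INTERVAL_S)
        if stale:
            widget.colors = widget.fetch()
            widget.last_update = now
            refreshed.append(widget.name)
    return refreshed


# Example: a fake weather widget that always forecasts rain (blue).
weather = Widget("weather", lambda: [(0, 0, 255)] * 2)
first = tick([weather], now=0.0)
second = tick([weather], now=60.0)
third = tick([weather], now=900.0)
```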
<h2 id="future-possible-ideas">Possible future ideas</h2>
<ul>
<li>Use MQTT instead of HTTP</li>
<li>Adapt brightness depending on surrounding light</li>
<li>Make a wake-up light (<a href="https://www.amazon.com/Philips-Simulation-Headspace-Subscription-HF3520/dp/B0093162RM">these cost $140 on Amazon</a>)</li>
<li>Philips Hue integration</li>
<li>IFTTT integration</li>
<li>Extra widgets</li>
</ul>
<h2 id="things-i-would-have-done-differently">Things I would have done differently</h2>
<p>Looking back, I would probably use SK6812 LEDs instead of WS2812B.
WS2812B LEDs have 3 channels: one each for red, green and blue.
This is great for most colors, though they have trouble recreating warm whites. SK6812
strips have a 4th LED channel for warm white LEDs, so they are much better at
creating warm white light. They are just ~€1 extra per meter, so I would probably choose these if
I had to start all over again.</p>Achieving remote code execution on an insecure IP camera2019-02-14T09:00:00+00:002019-02-14T09:00:00+00:00https://mauritsvanaltvorst.com/rce-insecure-ip-camera<p>Internet of Things devices are on the rise. Unfortunately, security on these devices is often an afterthought. I recently got my hands on the “Alecto DVC-155IP” IP camera. It has Wi-Fi, night vision, two-axis tilt and yaw control, motion sensing and more. My expectations regarding security were low, but this camera was still able
to surprise me.</p>
<p><img src="/generated/assets/rce-insecure-ip-camera/camera-600-92abf56cf.jpg" alt="The Alecto DVC-155IP" srcset="/generated/assets/rce-insecure-ip-camera/camera-400-92abf56cf.jpg 400w, /generated/assets/rce-insecure-ip-camera/camera-600-92abf56cf.jpg 600w" /></p>
<h2 id="setting-up-the-camera">Setting up the camera</h2>
<p>Setting up the camera using the app was a breeze. I had to enter my Wi-Fi details,
a name for the camera and a password. Nothing too interesting so far.</p>
<p>Using Nmap on the camera gave me the following results:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>➜ ~ nmap -A 192.168.178.59
Starting Nmap 7.70 ( https://nmap.org ) at 2019-02-09 12:59 CET
Nmap scan report for 192.168.178.59
Host is up (0.010s latency).
Not shown: 997 closed ports
PORT STATE SERVICE VERSION
23/tcp open telnet BusyBox telnetd
80/tcp open http thttpd 2.25b 29dec2003
|_http-server-header: thttpd/2.25b 29dec2003
|_http-title: Site doesn't have a title (text/html; charset=utf-8).
554/tcp open rtsp HiLinux IP camera rtspd V100R003 (VodServer 1.0.0)
|_rtsp-methods: OPTIONS, DESCRIBE, SETUP, TEARDOWN, PLAY
Service Info: Host: RT-IPC; Device: webcam
</code></pre></div></div>
<p>Three open ports: 23, 80 and 554. Surprisingly, port 23 doesn’t get mentioned anywhere in the manual.
Is this some debug port from the manufacturer, or a hidden backdoor?
After manually testing a few passwords via telnet I moved on.</p>
<p>When I connected to the admin panel - accessible on port 80 - I was greeted with a standard login screen
that prompts the user for a username and password.</p>
<p>The first step I took was opening the Network tab of Chrome’s developer tools.
This lets you inspect the network requests that Chrome makes while
visiting a website. I saw that a lot of requests were being made
for a simple login page.</p>
<p><img src="/generated/assets/rce-insecure-ip-camera/snap-800-5abcc0c08.png" alt="The Chrome developer tab" srcset="/generated/assets/rce-insecure-ip-camera/snap-400-5abcc0c08.png 400w, /generated/assets/rce-insecure-ip-camera/snap-600-5abcc0c08.png 600w, /generated/assets/rce-insecure-ip-camera/snap-800-5abcc0c08.png 800w, /generated/assets/rce-insecure-ip-camera/snap-1000-5abcc0c08.png 1000w" /></p>
<p>My eye quickly fell on a specific request: <code class="highlighter-rouge">/cgi-bin/hi3510/snap.cgi?&-getstream&-chn=2</code>.
Hmm, “getstream”, I wonder what happens if I open this in another tab…</p>
<p><img src="/generated/assets/rce-insecure-ip-camera/live-800-7ed8cd572.png" alt="An unauthenticated live view of the camera" srcset="/generated/assets/rce-insecure-ip-camera/live-400-7ed8cd572.png 400w, /generated/assets/rce-insecure-ip-camera/live-600-7ed8cd572.png 600w, /generated/assets/rce-insecure-ip-camera/live-800-7ed8cd572.png 800w, /generated/assets/rce-insecure-ip-camera/live-1000-7ed8cd572.png 1000w" /></p>
<p>Within 2 minutes I had gained unauthenticated access to the live view of the camera.
I had heard before that IP cameras aren’t secure, but I didn’t expect it to be this bad.</p>
<h2 id="other-observations">Other observations</h2>
<p>While looking through the network requests, I noticed some more notable endpoints:</p>
<ul>
<li>You are able to get the Wi-Fi SSID, BSSID, and password of the network the camera is connected
to by visiting <code class="highlighter-rouge">/cgi-bin/getwifiattr.cgi</code>. This allows you to retrieve the location of the
camera via a service such as <a href="https://wigle.net">wigle.net</a>.</li>
<li>You are able to set the camera’s internal time via
<code class="highlighter-rouge">/cgi-bin/hi3510/setservertime.cgi?-time=YYYY.MM.DD.HH.MM.SS&-utc</code>. I’m not sure if this
opens up any attack vectors, but it’s interesting nonetheless. It might be possible to do
some interesting things by sending invalid times or big strings, but I don’t want to risk
bricking my camera testing this.</li>
<li>You are able to get the camera’s password via <code class="highlighter-rouge">/cgi-bin/p2p.cgi?cmd=p2p.cgi&-action=get</code>.
Of course, you don’t even need the password to log in. Just set the “AuthLevel” cookie to 255
and you instantly get admin access.</li>
<li>You are able to get the serial number, hardware revision, uptime, and storage info
via <code class="highlighter-rouge">/web/cgi-bin/hi3510/param.cgi?cmd=getserverinfo</code>.</li>
</ul>
<p>All of these requests are unauthenticated.</p>
<h2 id="remote-code-execution">Remote code execution</h2>
<p>Let’s take another look at the requests made on the login page.
You can see a lot of “.cgi” requests. CGI stands for
“Common Gateway Interface”: CGI files are executable programs or scripts that a web server
runs to generate pages dynamically. Because they are often shell scripts,
I focused on these requests first, hoping to find an
endpoint susceptible to shell command injection.</p>
<p>To find out if a .cgi endpoint was vulnerable,
I tried substituting some request parameters with <code class="highlighter-rouge">$(sleep 3)</code>.
When I tried <code class="highlighter-rouge">/cgi-bin/p2p.cgi?cmd=p2p.cgi&-action=$(sleep 3)</code>,
it took a suspiciously long time before I got back my response. To confirm
that I could execute shell commands, I opened Wireshark on my laptop and sent the
following payload to the camera:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$(ping -c2 192.168.178.243)
</code></pre></div></div>
<p>And sure enough, I saw two ICMP echo requests appear on my laptop.</p>
<p><img src="/generated/assets/rce-insecure-ip-camera/wireshark-800-f01c1e58a.png" alt="Two ping requests in Wireshark" srcset="/generated/assets/rce-insecure-ip-camera/wireshark-400-f01c1e58a.png 400w, /generated/assets/rce-insecure-ip-camera/wireshark-600-f01c1e58a.png 600w, /generated/assets/rce-insecure-ip-camera/wireshark-800-f01c1e58a.png 800w, /generated/assets/rce-insecure-ip-camera/wireshark-1000-f01c1e58a.png 1000w" /></p>
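<p>The camera’s CGI source isn’t shown here, so the following is only a guess at the class of bug involved, written as a harmless local Python demo rather than anything that talks to a camera. It assumes the vulnerable script pastes a request parameter into a shell command line, which lets the shell expand <code class="highlighter-rouge">$(...)</code> before the intended command runs. The function names are made up, a POSIX shell is assumed, and <code class="highlighter-rouge">echo</code> stands in for whatever the real script executes.</p>

```python
import shlex
import subprocess


def run_unsafe(action: str) -> str:
    """Vulnerable pattern: the parameter is pasted into a shell command line."""
    result = subprocess.run(f"echo action={action}", shell=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def run_quoted(action: str) -> str:
    """Safer pattern: the parameter is quoted, so the shell treats it as data."""
    result = subprocess.run(f"echo action={shlex.quote(action)}", shell=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


payload = "$(echo injected)"

# The shell runs the inner command first and substitutes its output.
unsafe_output = run_unsafe(payload)

# With quoting, the payload comes back as a literal string.
quoted_output = run_quoted(payload)
```

<p>With a payload like <code class="highlighter-rouge">$(sleep 3)</code> the same substitution would simply delay the response, which matches the timing behaviour described above.</p>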
<p>But surely, nobody in their right mind would connect
such a cheap, insecure IP camera directly to the internet, right?</p>
<p><img src="/generated/assets/rce-insecure-ip-camera/shodan-800-40428d4f1.png" alt="Vulnerable IP cameras via shodan.io" srcset="/generated/assets/rce-insecure-ip-camera/shodan-400-40428d4f1.png 400w, /generated/assets/rce-insecure-ip-camera/shodan-600-40428d4f1.png 600w, /generated/assets/rce-insecure-ip-camera/shodan-800-40428d4f1.png 800w, /generated/assets/rce-insecure-ip-camera/shodan-1000-40428d4f1.png 1000w" /></p>
<p>That’s 710 Alecto DVC-155IP cameras connected to the internet
that disclose their Wi-Fi details (which means that I
can figure out their location by using a service such as
<a href="https://wigle.net">wigle.net</a>), allow anyone to view their live stream
and are vulnerable to RCE. And this is just the DVC-155IP model:
Alecto sells many different IP cameras, each running the same software.</p>
<h2 id="returning-to-port-23">Returning to port 23</h2>
<p>Now that I’m able to run commands, it’s time to return to the mysterious port 23.
Unfortunately, the injection is blind: I don’t get any output back from the commands I execute.
Using netcat to send the output back to my laptop also didn’t work for
some reason.</p>
<p>After spending way too much time without progress, this was the command that did the trick:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>telnetd -l/bin/sh -p9999
</code></pre></div></div>
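<p>This starts a second telnet server on port 9999 that runs <code class="highlighter-rouge">/bin/sh</code> for every connection instead of a login prompt. And sure enough,
after connecting to it I was greeted with an unauthenticated root shell.</p>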
<p>Reading /etc/passwd gave me the following output:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root:$1$xFoO/s3I$zRQPwLG2yX1biU31a2wxN/:0:0::/root:/bin/sh
</code></pre></div></div>
<p>I didn’t even have to start Hashcat to crack this hash: a quick Google search
was all I needed to find that the password of the mysterious
backdoor port was <code class="highlighter-rouge">cat1029</code>.</p>
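<p>For readers unfamiliar with the format of that line, the short Python sketch below splits it into its fields. The <code class="highlighter-rouge">$1$</code> prefix marks the hash as md5crypt, followed by the salt and the digest. The function name and the small scheme table are made up for this example, and the table lists only a few common prefixes.</p>

```python
from typing import Dict

# crypt(3) scheme prefixes; only a few common ones are listed here.
CRYPT_SCHEMES = {"1": "md5crypt", "5": "sha256crypt", "6": "sha512crypt"}


def parse_passwd_line(line: str) -> Dict[str, str]:
    """Split a passwd line and break its password field into scheme, salt, digest."""
    name, password, uid, gid, gecos, home, shell = line.strip().split(":")
    # A crypt-style hash looks like $<scheme>$<salt>$<digest>.
    _, scheme_id, salt, digest = password.split("$")
    return {
        "name": name,
        "uid": uid,
        "shell": shell,
        "scheme": CRYPT_SCHEMES.get(scheme_id, "unknown"),
        "salt": salt,
        "digest": digest,
    }


entry = parse_passwd_line(
    "root:$1$xFoO/s3I$zRQPwLG2yX1biU31a2wxN/:0:0::/root:/bin/sh"
)
```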
<p>The root password of probably thousands of IP cameras on the internet is
<code class="highlighter-rouge">cat1029</code>. And the worst part is that there’s no way to change this
password anywhere in the regular user interface.</p>
<h2 id="contacting-the-manufacturer">Contacting the manufacturer</h2>
<p>When I contacted Alecto with my findings, they told
me they weren’t able to solve these problems because
they don’t develop the software for their devices. After a quick Shodan
search I found that there were also internet-connected cameras from
other brands, such as Foscam and DIGITUS, that had these vulnerabilities.
Their user interfaces look different, but they were susceptible
to the exact same vulnerabilities via the exact same endpoints.</p>
<p>It seems that these IP cameras are manufactured in bulk by a Chinese OEM.
Other companies, like Alecto, Foscam and DIGITUS, resell them with slightly modified firmware and custom branding.
A vulnerability in the OEM’s software means that the products of all of these resellers
are vulnerable too. Unfortunately, I don’t think that the OEM will
do much about these vulnerabilities. I guess that the phrase
“The S in IoT stands for security” is true after all.</p>