Thoughts of a Servant

From Apple Trees to Search Trees: Optimising Classical AI to Overcome TicTacToe

2026-06-30T00:00:00+00:00

This post was proofread with the assistance of AI.

Last time, I handed a 4×4 TicTacToe board to children at a volunteering centre, changed the win condition to 3-in-a-row, and watched a familiar game become genuinely hard. I also showed why that hardness is not superficial: the number of reachable game states grows exponentially, early wins make closed-form analysis intractable, and even counting unique board positions requires a full recursive search.

That is where AI begins. Classical AI’s answer: treat the game as a search problem. Define it formally, build a tree of future possibilities, search it intelligently. This post traces how that idea evolved, from the naïve depth-first search a first-year student might attempt, through seven algorithms, to what actually powers modern chess engines.

Formalising the game: START

Before we can search, we need a language for the problem. Every search problem fits a five-part structure. I remember it as START: State, Transition, Action, Reward, Terminal.

State $s$ is a mathematical snapshot of the game at a moment in time. A 3×3 board becomes a 3×3 matrix:

 X | O | .         [["X",  "O",  null],
---+---+---   →     [null, "X",  null],
 . | X | .          [null, null, "O" ]]
---+---+---
 . | . | O

State design is an art. Include too little and the agent cannot distinguish situations that call for different moves. Include too much (weather, room temperature, a player’s hydration level…) and the search space balloons needlessly. In AI, a game where both players see the entire board is fully observable (TicTacToe, Chess); one where players see only a portion is partially observable (poker, autonomous driving). TicTacToe is fully observable, which keeps the state simple. The set of all possible states is the State Space. For TicTacToe it is finite, and for larger boards it is exponentially large.

Transition $T$ describes how state $s$ becomes $s’$: place an X on the top-left cell, and $\text{matrix}[0][0]$ goes from null to X. A useful sanity check: the number of null cells must drop by exactly 1 at each transition. Simple invariants like this catch implementation bugs before they compound.

Action $A(s)$ is the set of legal moves from state $s$: in TicTacToe, placing a mark in any one empty cell. The full set across all states is the Action Space, but not every action is legal in every state.

Reward $R(s)$ (alternatively, $R(s, a)$/$R(s, a, s’)$) assigns a numerical value to states or sequences so the agent can prefer some paths over others. For TicTacToe, the simplest choice is a terminal score: $+1$ for a win, $-1$ for a loss, $0$ for a draw. I will revisit reward design below (spoiler: it is easier to get wrong than it looks).

Terminal defines when search stops: $k$ consecutive marks in a line, or a full board with no winner.
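The five START pieces can be sketched in Python for a general n×n board with k in a row. This is a minimal sketch, not the implementation behind the benchmarks: the flat-tuple state encoding and the names `initial_state`, `to_move`, `apply_move` and `winner` are assumptions made for illustration, and `reward` is scored from X's point of view.

```python
from typing import List, Optional, Tuple

State = Tuple[Optional[str], ...]   # flat tuple of n*n cells: "X", "O" or None

def initial_state(n: int) -> State:
    return (None,) * (n * n)

def to_move(state: State) -> str:
    # X moves first, so X is to move whenever the mark counts are equal.
    return "X" if state.count("X") == state.count("O") else "O"

def actions(state: State) -> List[int]:
    # Action A(s): indices of the empty cells.
    return [i for i, cell in enumerate(state) if cell is None]

def apply_move(state: State, action: int) -> State:
    # Transition T(s, a): returns a new state and never mutates the old one.
    assert state[action] is None, "illegal move: cell already occupied"
    return state[:action] + (to_move(state),) + state[action + 1:]

def winner(state: State, n: int, k: int) -> Optional[str]:
    # Treat every occupied cell as the start of a k-long run in four directions.
    for r in range(n):
        for c in range(n):
            mark = state[r * n + c]
            if mark is None:
                continue
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                end_r, end_c = r + (k - 1) * dr, c + (k - 1) * dc
                if not (0 <= end_r < n and 0 <= end_c < n):
                    continue   # the run would leave the board
                if all(state[(r + i * dr) * n + (c + i * dc)] == mark
                       for i in range(k)):
                    return mark
    return None

def is_terminal(state: State, n: int, k: int) -> bool:
    # Terminal: somebody has k in a row, or the board is full.
    return winner(state, n, k) is not None or None not in state

def reward(state: State, n: int, k: int) -> int:
    # Reward R(s) from X's perspective: +1 win, -1 loss, 0 otherwise.
    w = winner(state, n, k)
    return 0 if w is None else (1 if w == "X" else -1)
```

Because `apply_move` returns a fresh tuple, the null-cell invariant from the Transition paragraph is easy to check: `apply_move(s, a).count(None) == s.count(None) - 1` for every legal `a`.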

The search tree

With the game formalised, the search becomes a tree. The root is the empty board. Each edge is a legal action. Each child is the resulting board.

              [empty board]
            /       |        \
     [X:top-L]  [X:centre]  [X:top-R]  ...  (9 branches)
     /      \
[O:top-M]  [O:centre]  ...                   (8 branches each)

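Let $b$ be the branching factor and $d$ the maximum depth. Total leaf nodes: $O(b^d)$. In TicTacToe the branching factor shrinks by one at every ply, so the number of full-length move orders on an $n \times n$ board is $(n^2)!$, an upper bound that ignores early wins.

For 3×3 TicTacToe: at most $9! = 362{,}880$ move orders. Fine.
For 5×5 TicTacToe: at most $25! \approx 1.6 \times 10^{25}$ move orders. Not fine.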

Every algorithm below is an attempt to search less of this tree while arriving at the same answer.

The algorithm chain

These seven algorithms are not a menu of alternatives. They are a chain: each one identifies the precise weakness of its predecessor and fixes it.

1. MiniMax

Figure: Square nodes maximise; circle nodes minimise. Terminal values ($+1$, $0$, $-1$) propagate upward.

MiniMax adapts depth-first search for two-player zero-sum games. One player (the maximiser, X) drives the terminal score up; the other (the minimiser, O) drives it down. Terminal nodes receive $+1$, $-1$, or $0$. Internal nodes take the max or min of their children:

\[v(s) = \begin{cases} R(s) & \text{if } s \text{ is terminal} \\ \max_{a \in A(s)}\, v(T(s, a)) & \text{if maximiser's turn} \\ \min_{a \in A(s)}\, v(T(s, a)) & \text{if minimiser's turn} \end{cases}\]

The result is optimal play against a perfectly rational opponent. The cost: every node, $O(b^d)$ of them (with patience — lots of patience).

2. Alpha-Beta Pruning

Figure: Crossed-out subtrees cannot change the root value regardless of their contents.

Alpha-Beta pruning keeps MiniMax’s exact guarantees while ignoring subtrees that cannot affect the result. It tracks two bounds:

$\alpha$ — the best score the maximiser has secured so far (lower bound)
$\beta$ — the best score the minimiser has secured so far (upper bound)

When $\beta \leq \alpha$, the branch is cut. The minimiser would never allow a result this good for the maximiser, so searching further is pointless.

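With perfect move ordering (best moves explored first), complexity drops from $O(b^d)$ to $O(b^{d/2})$. The node count falls to roughly its square root: a tree of a million positions shrinks to about a thousand, or equivalently the search can go twice as deep in the same time.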
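To see the pruning rather than take it on faith, the sketch below counts visited nodes for plain MiniMax and for Alpha-Beta on the same 3×3 position. It is a hedged illustration, not the benchmark code: the string board encoding, the `counter` list and the choice of starting position (X has already opened in the centre) are assumptions, and moves are tried in plain cell order, so the saving is smaller than the best-case $O(b^{d/2})$.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]
SCORE = {"X": 1, "O": -1, None: 0}   # terminal score from X's perspective

def winner(board):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def children(board, mark):
    return [board[:i] + mark + board[i + 1:]
            for i, cell in enumerate(board) if cell == "."]

def minimax(board, x_to_move, counter):
    counter[0] += 1                      # count every node visited
    w = winner(board)
    if w or "." not in board:
        return SCORE[w]
    scores = [minimax(ch, not x_to_move, counter)
              for ch in children(board, "X" if x_to_move else "O")]
    return max(scores) if x_to_move else min(scores)

def alpha_beta(board, alpha, beta, x_to_move, counter):
    counter[0] += 1
    w = winner(board)
    if w or "." not in board:
        return SCORE[w]
    if x_to_move:
        for ch in children(board, "X"):
            alpha = max(alpha, alpha_beta(ch, alpha, beta, False, counter))
            if alpha >= beta:
                break                    # minimiser already has a better option
        return alpha
    for ch in children(board, "O"):
        beta = min(beta, alpha_beta(ch, alpha, beta, True, counter))
        if beta <= alpha:
            break                        # maximiser already has a better option
    return beta

start = "....X...."                      # X opened in the centre; O to move
mm_nodes, ab_nodes = [0], [0]
mm_value = minimax(start, False, mm_nodes)
ab_value = alpha_beta(start, -1, 1, False, ab_nodes)   # window = reward range
print("minimax:   ", mm_value, mm_nodes[0], "nodes")
print("alpha-beta:", ab_value, ab_nodes[0], "nodes")
```

Both searches return the same value; only the node counts differ. Sorting `children` so that stronger moves come first would widen the gap further.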

3. NegaMax

A sharp observation: for zero-sum two-player games, the maximiser and minimiser are doing the same thing in opposite directions. If we negate the returned score whenever the active player switches, both players become maximisers of their own perspective:

\[v(s) = \max_{a \in A(s)}\bigl(-v(T(s, a))\bigr)\]

Same search tree explored. Fewer variables. One recursive case instead of two.

(As software engineers, we know how it feels when the codebase becomes excessively complicated. The KISS principle is just like a first kiss, bringing warmth and comfort to our hearts.)

4. Scaled Rewards: a warning

An appealing idea: reward faster wins more than slower ones. Score a 5-move win as $+1.0$ and a 7-move win as $+0.8$, incentivising the search toward quicker victories.

The problem is subtle. In plain MiniMax with reward space $\{-1, 0, +1\}$, finding a $+1$ terminal means the absolute best outcome is secured. Search at that node stops. With scaled rewards, finding $+0.8$ does not justify stopping: there might be a $+1.0$ elsewhere via a faster path. The algorithm must now confirm it has the fastest win, not merely a win.

The heuristic changed the question from “find any win” to “find the fastest win.” The latter is strictly harder. A heuristic designed to shrink the search made it larger. Measure before assuming.

5. NegaScout

Alpha-Beta prunes when $\beta \leq \alpha$. NegaScout increases the chance of this condition holding.

If the first move explored at a node really is the best, then every sibling only needs to be confirmed worse; its exact value is unnecessary. NegaScout searches siblings with a null window $[\alpha,\, \alpha+1]$ rather than the full window $[\alpha, \beta]$:

Standard window:   [α ─────────────────── β]
Null window:       [α ─ α+1]

A null-window search triggers far more cutoffs. If the sibling's score fails low (at most $\alpha$), it is confirmed inferior (= move on). If it fails high (above $\alpha$), the first-move assumption was wrong and a full re-search is needed.

NegaScout is fast when move ordering is good. When ordering is poor, every wrong first guess costs a re-search, so its efficiency depends heavily on exploring the best move first.

6. MTD(f)

MTD(f) takes the null-window idea to its conclusion: use only null-window searches, always, homing in on the true minimax value one bound at a time.

Starting from a guess $f$, it calls NegaMax with window $[f, f+1]$. Each call returns a lower or upper bound on the true value. Successive calls narrow the interval until the bounds meet:

\[\text{repeat until } \ell = u: \quad \text{NegaMax}([f,\, f+1]) \to \text{new bound; update } \ell,\, u,\, f\]

Because every call is a null-window search, every call benefits from maximum pruning. The tradeoff: multiple passes revisit parts of the tree. A transposition table, a cache of already-seen positions, is not optional; without it, re-searches undo the savings entirely.

7. Best Node Search

Rather than finding the value of the best move, why not just identify which move is best? That is a strictly weaker question, and sometimes easier to answer.

Best Node Search iteratively guesses a threshold and counts how many moves score above and below it:

Guess 0.23 →  11 worse,  9 better
Guess 0.43 →  20 worse,  0 better
Guess 0.33 →  18 worse,  2 better
Guess 0.38 →  19 worse,  1 better  ← one candidate remains above threshold

Once only one move exceeds the threshold, that move is optimal by elimination. BNS achieves strong performance in large search spaces precisely because it stops asking “how good is this?” once the contest between top candidates is settled.

Iterative Deepening: the crosscutting technique

NegaScout requires good move ordering. MTD(f) and BNS need a good initial value. Iterative Deepening supplies both.

Run a complete search to depth 1. Use the best move found. Search to depth 2 using that result. Continue.

Revisiting shallower depths looks wasteful, but the overhead factor is only $\frac{b}{b-1}$, a constant — because the final depth dominates the total node count exponentially. The payoff: the best move from depth $d$ becomes the first move explored at depth $d+1$, giving NegaScout exactly the move ordering it needs; the minimax value from depth $d$ becomes MTD(f)’s and BNS’s starting guess at depth $d+1$.
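The $\frac{b}{b-1}$ figure can be checked numerically. The sketch below assumes an idealised uniform tree in which every node has exactly $b$ children (TicTacToe's branching factor actually shrinks each ply, so this is the textbook model rather than the real game), and the function names are illustrative.

```python
def nodes_in_tree(b: int, depth: int) -> int:
    # Nodes in a uniform tree of branching factor b searched to `depth` plies.
    return sum(b ** i for i in range(depth + 1))

def iterative_deepening_overhead(b: int, d: int) -> float:
    # Total work of searching depths 1..d, relative to one search at depth d.
    single_pass = nodes_in_tree(b, d)
    all_passes = sum(nodes_in_tree(b, depth) for depth in range(1, d + 1))
    return all_passes / single_pass

for b, d in [(2, 20), (10, 8), (25, 6)]:
    measured = iterative_deepening_overhead(b, d)
    print(f"b={b:2d} d={d:2d}  measured={measured:.4f}  b/(b-1)={b / (b - 1):.4f}")
```

The wider the tree, the cheaper the repeated shallow passes become relative to the final one.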

Iterative Deepening doesn’t make any single algorithm asymptotically faster. It makes the whole family work better together.

Putting numbers to the theory

All algorithms were run as both players from the opening move, with a 1-hour wall-clock timeout. States visited counts total traversals including re-visits through the transposition table; timeout entries show progress at cutoff, not the full game-tree size.

Each cell shows states visited / wall-clock seconds.

Algorithm                              | 3×3, k=3       | 4×4, k=3            | 4×4, k=4         | 5×5, k=5
---------------------------------------+----------------+---------------------+------------------+----------------
MiniMax                                | 618,184 / 8.57 | TIMEOUT             | TIMEOUT          | TIMEOUT
MiniMax with Alpha-Beta Pruning        | 21,652 / 0.30  | 42,340 / 19.13      | TIMEOUT          | TIMEOUT
NegaMax                                | 24,698 / 0.34  | 282,469 / 132.45    | TIMEOUT          | TIMEOUT
NegaScout                              | 45,801 / 0.68  | 390,989 / 350.56    | TIMEOUT          | TIMEOUT
Best Node Search (BNS)                 | 58,226 / 0.81  | 2,553,446 / 2809.92 | TIMEOUT          | TIMEOUT
Best Node Search + Iterative Deepening | 7,171 / 0.17   | 125,484 / 42.51     | 958,872 / 376.01 | TIMEOUT (11.4M)
MTD(f)                                 | 1,817 / 0.04   | 113,351 / 6.29      | 196,677 / 19.43  | TIMEOUT (9.7M)
MTD(f) + Iterative Deepening           | 5,297 / 0.15   | 40,377 / 13.39      | 265,719 / 147.50 | TIMEOUT (13.7M)

n is the binding constraint, not k. The jump from 3×3 to 4×4 with k=n kills every algorithm except MTD(f), MTD(f) + Iterative Deepening, and BNS + Iterative Deepening. Alpha-Beta Pruning goes from 21K states to timeout; NegaMax, NegaScout, and plain BNS never finish. Dropping k from 4 to 3 on the same 4×4 board rescues Alpha-Beta Pruning: 42K states in 19 seconds instead of timeout. At 5×5, k barely matters: the 25-cell branching factor overwhelms any depth reduction from an earlier win condition, and every exact solver times out regardless of k.

MTD(f) wins by a large margin, even accounting for re-visits. MTD(f) makes multiple passes over the tree, so its states-visited count includes states looked up more than once via the transposition table. Even so, it visits 1,817 states on 3×3 against Alpha-Beta Pruning’s 21,652: a 12× reduction. On 4×4 k=4, it is the fastest of the three algorithms that finish at all: 19 seconds against 148 for MTD(f) + Iterative Deepening and 376 for BNS + Iterative Deepening. The null-window probes prune so aggressively on each pass that the multi-pass overhead is more than recovered.

Theory and benchmarks diverge for NegaScout and BNS. Both are theoretically stronger than Alpha-Beta Pruning in the general case. On 4×4 k=3, Alpha-Beta Pruning visits 42K states in 19 seconds. NegaScout visits 391K in 351 seconds. BNS visits 2.6M in 2,810 seconds. The bottleneck is move ordering: both algorithms incur large overhead when moves aren’t explored best-first, because the null-window re-searches trigger frequently. A simpler algorithm with decent ordering outruns a sophisticated one without it. Theory describes best-case behaviour; these benchmarks measure actual behaviour.

Note that timeout occurred even at a very shallow depth, limiting our evaluation of iterative deepening. I believe that with more resources and a deeper search, algorithms using iterative deepening (i.e. MTD(f) and Best Node Search for our experiment) would demonstrate their asymptotic behaviour and the relative effectiveness of their strategies more strongly.

What I learnt

1. Progress is cumulative. MiniMax feels almost trivially simple: explore every state, propagate values upward. But Alpha-Beta has no foundation without it. NegaScout has no foundation without Alpha-Beta. MTD(f) builds on NegaScout. Each algorithm is a targeted fix to a precisely identified weakness, not a replacement of what came before. The pattern generalises: the great discoveries in any field are rarely isolated flashes of genius but iterations on a preceding idea that someone took seriously enough to examine closely. I have learnt not to trivialise incremental wins.

2. Heuristics are not the enemy; bad heuristics are. The Scaled Rewards example is easy to misread as a reason to distrust heuristics. The actual failure was simpler: the heuristic changed what was being optimised without anyone noticing. That is a calibration problem, not an indictment of the approach.

When accuracy is the bottleneck (medical diagnostics, logistics routing), you search more thoroughly. When time is the bottleneck (autonomous driving, real-time systems), a good-enough answer in 50ms beats a perfect answer in 5 seconds. Leaders face this daily: dozens of decisions, limited time, imperfect information. A good rule of thumb makes choices fast, simple, and consistent; the risk is when the rule drifts from the actual objective. In pathfinding algorithms, admissible heuristics carry a formal guarantee that the estimate never overstates the true cost, a concrete way to ask “is this heuristic any good?” The question is not whether to use heuristics; time and real-world complexity do not give you a choice, and neither does the autonomous car. What matters is whether your heuristics reflect what you actually care about.

3. Domain knowledge is the highest-leverage input. Domain-agnostic improvements (Alpha-Beta, null-window search) work on any game tree. Domain-aware heuristics, ones that know the specific game, reach gains no general technique can. The world is too complex to reduce entirely to heuristics, but being domain-aware consistently outperforms being broadly clever. I am encouraged to go deep, to read widely, to talk to people who know things I do not. I may be a master of none, but at least I will be a jack of many trades: a childcare volunteer teacher by day and an algorithm enthusiast by night. 🙂

Food for thought

Is finding a closed form for TicTacToe’s state-count recurrence truly intractable, or is it just convoluted? The distinction matters for how much effort is worth spending on it.
Beyond instant-win heuristics and terminal reward, what other low-cost heuristics provably speed up TicTacToe search? The killer-move heuristic (trying first the moves that caused cutoffs in sibling positions at the same depth) seems worth investigating.
Do these observations hold for $p$-player $n \times n$ TicTacToe, or $n$-dimensional boards? Intuition says pruning becomes less effective as the number of players grows, but by how much?

References

Helper stubs assumed throughout:

def is_terminal(state): ...
def reward(state): ...         # terminal score: X's perspective for minimax/alpha_beta,
                               # the side to move's perspective for negamax and its descendants
def actions(state): ...        # list of legal moves from this state
def apply(state, action): ...  # transition function, returns the new state after the move
INF = float('inf')

MiniMax

def minimax(state, is_maximising):
    if is_terminal(state):
        return reward(state)                          # base case: score the outcome
    scores = [minimax(apply(state, a), not is_maximising) for a in actions(state)]
    return max(scores) if is_maximising else min(scores)  # X maximises, O minimises

Alpha-Beta Pruning

def alpha_beta(state, alpha, beta, is_maximising):
    if is_terminal(state):
        return reward(state)
    if is_maximising:
        for a in actions(state):
            alpha = max(alpha, alpha_beta(apply(state, a), alpha, beta, False))
            if alpha >= beta:
                break              # β cut-off: minimiser won't allow this
        return alpha
    else:
        for a in actions(state):
            beta = min(beta, alpha_beta(apply(state, a), alpha, beta, True))
            if beta <= alpha:
                break              # α cut-off: maximiser already has better
        return beta

NegaMax

# Both players maximise from their own perspective; negate the child's score
# to flip from opponent's view back to the current player's view.
def negamax(state, alpha=-INF, beta=INF):
    if is_terminal(state):
        return reward(state)           # reward must be from current player's POV
    for a in actions(state):
        score = -negamax(apply(state, a), -beta, -alpha)  # swap and negate bounds
        alpha = max(alpha, score)
        if alpha >= beta:
            break                      # same cut-off logic as alpha-beta, unified
    return alpha

Scaled Rewards

# Standard: α-β can stop immediately on finding +1, because it is the ceiling.
# winner(), MAX and MIN stand in for the game's own win check and player labels.
def reward(state):
    if winner(state) == MAX: return +1
    if winner(state) == MIN: return -1
    return 0

# Scaled: faster wins score higher, intended to incentivise quick victories.
# Here depth counts moves beyond the earliest possible win, so the fastest
# win still scores +1.0 and a win two moves later scores +0.8.
def reward_scaled(state, depth):
    if winner(state) == MAX: return +1 - 0.1 * depth
    if winner(state) == MIN: return -1 + 0.1 * depth
    return 0
# Problem: +0.8 is no longer proof of optimality.
# The search must keep looking for a faster +1.0, making it strictly larger.

NegaScout

def negascout(state, alpha=-INF, beta=INF):
    if is_terminal(state):
        return reward(state)
    window = beta
    for i, a in enumerate(actions(state)):
        score = -negascout(apply(state, a), -window, -alpha)
        if i > 0 and alpha < score < beta:               # null-window was too narrow
            score = -negascout(apply(state, a), -beta, -score)  # full re-search
        alpha = max(alpha, score)
        if alpha >= beta:
            break
        window = alpha + 1   # narrow to null window: siblings only need to beat alpha
    return alpha

MTD(f)

# Converges on the true minimax value via repeated null-window passes.
# Each pass returns a bound; bounds tighten until they meet.
def mtdf(state, f=0):                          # f is the initial value guess
    lower, upper = -INF, INF
    while lower < upper:
        beta = max(f, lower + 1)
        f = negamax(state, beta - 1, beta)     # null-window [β-1, β]
        if f < beta:
            upper = f                          # returned below window: upper bound
        else:
            lower = f                          # returned at/above window: lower bound
    return f
# Transposition table is mandatory: without caching, re-searches undo all savings.

Best Node Search

# Binary search over move quality: eliminate candidates below the threshold each round.
# Bounds start at the reward range: -INF and INF would make the first guess NaN.
# Assumes integer terminal rewards, as in the plain {-1, 0, +1} scheme.
def bns(state, lower=-1, upper=1):
    candidates = list(actions(state))
    # Stop once one move remains, or the interval is too narrow to split ties.
    while len(candidates) > 1 and upper - lower > 0.5:
        guess = (lower + upper) / 2       # plain bisection; smarter guess schedules exist
        # null-window test: does this move beat the current guess?
        better = [a for a in candidates
                  if -negamax(apply(state, a), -guess - 1, -guess) > guess]
        if better:
            lower = guess
            candidates = better           # keep only moves above the threshold
        else:
            upper = guess                 # no move beats the guess; lower the bar
    return candidates[0]                  # sole survivor, or one of several tied best moves

Iterative Deepening

def iterative_deepening(state, max_depth):
    best_move = None
    for depth in range(1, max_depth + 1):
        best_move = best_move_at_depth(state, depth)  # placeholder: any depth-limited search above
        # best_move from depth d → first move explored at depth d+1 (move ordering)
        # minimax value from depth d → starting guess f for MTD(f)/BNS at depth d+1
    return best_move
# Overhead factor: b/(b-1) — constant, because the final depth dominates node count.

I Ruined TicTacToe for My Children, and for Math 😭

2026-06-17T00:00:00+00:00

This post was proofread with the assistance of AI.

All I did was draw one extra row and one extra column. That was enough to ruin TicTacToe.

The children at my volunteering centre love the normal 3x3 game. It is quick, familiar, and perfect for filling a spare couple of minutes with a primary school child. Then I gave them a 4x4 board. At first, they treated it like the same game with more space. After a few rounds, they noticed something annoying; it was much easier to block, much harder to finish, and the old tricks no longer worked cleanly.

So I made it worse. I kept the board at 4x4, but changed the win condition to 3 in a row.

That tiny rule change opened the floodgates. Suddenly every corner mattered. A harmless-looking move could threaten a row, a column, and a diagonal. The children got louder, more careful, and more competitive. I had turned a childhood game into a small mathematical headache.

And that was the point.

A Small Game With Serious AI Bones

TicTacToe is useful because it is simple enough to hold in your head, but rich enough to expose real ideas in Classical AI.

So what do we understand about TicTacToe as an AI environment? It is finite; the board eventually fills. It is deterministic; placing a mark has no randomness or chance attached to it. It is a perfect-information game; both players see the whole board at all times. It is adversarial; every good move for one player is bad news for the other.

That combination makes TicTacToe a clean playground for AI as search.

In standard 3x3 TicTacToe, the rules are fixed. X starts. Players alternate. Whoever gets 3 in a row wins. If the board fills without a winner, the game is a draw. Exhaustive search reveals (to no one’s surprise) that under perfect play, neither player can force a win, so the result is a draw.
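That exhaustive search is small enough to run directly. The sketch below is one way to do it, assuming a 9-character string board and a memoised negamax-style recursion; the encoding and the function name `value` are illustrative choices, not something prescribed by the game.

```python
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board: str):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)   # cache positions so each is solved only once
def value(board: str, mark: str) -> int:
    """Game value under perfect play, from the perspective of `mark`,
    the player about to move: +1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w is not None:
        return 1 if w == mark else -1
    if "." not in board:
        return 0
    other = "O" if mark == "X" else "X"
    # The opponent's best result, negated, is our result for that move.
    return max(-value(board[:i] + mark + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == ".")

print(value("." * 9, "X"))
```

A value of 0 at the empty board means neither side can force a win.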

But once we generalise the game, the neat little toy starts to misbehave.

Let the board be n x n. Let k be the number of consecutive marks needed to win, where k <= n. The childhood version is just one setting: n = 3, k = 3. A 4x4 game where you need the full row is n = 4, k = 4. The more chaotic version I gave the children is n = 4, k = 3.

That notation looks innocent. It is not.

Here is the difference in one position.

X X . .
O . . .
. . . .
. . . .

In n = 4, k = 4, X has only made a start. Two X marks are not close to winning yet. In n = 4, k = 3, X is one move away: unless O blocks, X completes the top row on the next turn:

X X X .
O . . .
. . . .
. . . .

The board size stayed the same. The meaning of danger changed.

Computing TicTacToe Requires Recursion

Suppose both players are young children playing without hints or strategy. At every turn, the current player chooses one legal empty square uniformly at random.

Here is the question: what is the probability that X wins?

For 3x3 TicTacToe, it is tempting to count board patterns directly. That temptation fades fast. To know whether X wins, we cannot only inspect the final board. We also need to know whether O would have won earlier, because TicTacToe stops the moment someone completes a line. A board that looks possible as a final arrangement may never appear in a real game, since the game may have ended two moves before.

There is no simple one-pass shortcut that respects turn order, illegal states, and early stopping. That pushes us into recursion.

Let P_X(s) be the probability that X eventually wins from state s. If s is already a terminal X win, P_X(s) = 1. If s is an O win or a draw, P_X(s) = 0. Otherwise, the current player has some set of legal moves, and random play averages over the child states:

P_X(s) = (1 / n_s) * sum over child states s' of P_X(s')

where s' represents each child state after each legal move, and n_s is the number of legal moves from state s.

This is a tiny equation with a nasty implication. To evaluate the root, the empty board, we must evaluate the children. To evaluate the children, we evaluate their children. The game becomes a search tree.

For example, from an empty 3x3 board, random X has 9 possible first moves. After X chooses one, random O has 8 possible replies. After that, X has 7. Even before we ask who is playing well, the tree begins to branch:

empty board
  -> X in top-left
    -> O in top-middle
    -> O in top-right
    -> ...
  -> X in center
    -> O in top-left
    -> O in top-middle
    -> ...

Recursion is just the clean way to say: solve each smaller board, then average the answers back up.

For the normal 3x3 board, this is still manageable. If both players choose uniformly random legal moves, X wins about 58.49% of games, O wins about 28.81%, and the game draws about 12.70%.
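These percentages come straight out of the recurrence for P_X(s). A minimal sketch, assuming a 9-character string board and exact arithmetic with `fractions.Fraction`; it tracks X wins, O wins and draws together, and the function name `outcome_probs` is illustrative.

```python
from fractions import Fraction
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board: str):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def outcome_probs(board: str, mark: str):
    """Return (P(X wins), P(O wins), P(draw)) when both sides play
    uniformly at random and `mark` is about to move."""
    w = winner(board)
    if w == "X":
        return (Fraction(1), Fraction(0), Fraction(0))
    if w == "O":
        return (Fraction(0), Fraction(1), Fraction(0))
    empties = [i for i, cell in enumerate(board) if cell == "."]
    if not empties:
        return (Fraction(0), Fraction(0), Fraction(1))
    other = "O" if mark == "X" else "X"
    totals = [Fraction(0)] * 3
    for i in empties:
        child = outcome_probs(board[:i] + mark + board[i + 1:], other)
        totals = [t + p for t, p in zip(totals, child)]
    n_s = len(empties)                     # number of legal moves from s
    return tuple(t / n_s for t in totals)  # average over the child states

p_x, p_o, p_draw = outcome_probs("." * 9, "X")
for label, p in (("X wins", p_x), ("O wins", p_o), ("draw", p_draw)):
    print(f"{label}: {p} = {float(p):.2%}")
```

Using exact fractions rather than floats keeps the three probabilities summing to exactly 1, which is a cheap sanity check on the recursion.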

That first-player advantage is not magic. X always moves first, so X makes the fifth move of the game, the earliest point at which anyone can complete a line in 3x3 TicTacToe. Random O does not defend well enough to erase that advantage.

Small Things Add Up, Very Quickly

A 3x3 board has 3^9 = 19,683 raw assignments if every square can be empty, X, or O. That number is already larger than most people expect, but it still fits comfortably on a slide.

The legal game is smaller in one sense and larger in another.

It is smaller because many raw assignments are illegal. X and O must alternate turns. The game stops early when someone wins. If we count reachable board states with early stopping, standard TicTacToe has 5,478 states.

It is larger because search algorithms care about paths, not only board snapshots. The same board can be reached through different move orders. Standard 3x3 TicTacToe has 255,168 terminal game sequences.
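Both counts can be reproduced with a few lines of search. The sketch below assumes a 9-character string board with X moving first and play stopping at the first completed line; the helper names are illustrative.

```python
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board: str):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def successors(board: str):
    # No successors once the game is over (early stopping).
    if winner(board) or "." not in board:
        return []
    mark = "X" if board.count("X") == board.count("O") else "O"
    return [board[:i] + mark + board[i + 1:]
            for i, cell in enumerate(board) if cell == "."]

def count_reachable_states() -> int:
    # Distinct board snapshots: each board is counted once,
    # however many move orders lead to it.
    seen, stack = set(), ["." * 9]
    while stack:
        board = stack.pop()
        if board in seen:
            continue
        seen.add(board)
        stack.extend(successors(board))
    return len(seen)

@lru_cache(maxsize=None)
def count_game_sequences(board: str = "." * 9) -> int:
    # Distinct paths from this board to any terminal board.
    children = successors(board)
    if not children:
        return 1
    return sum(count_game_sequences(child) for child in children)

print(count_reachable_states())   # board snapshots, including the empty board
print(count_game_sequences())     # complete games
```

The first function deduplicates boards and the second deliberately does not, which is exactly the snapshot-versus-path distinction drawn above.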

Now move from 3x3 to 4x4.

The raw board assignments jump from 3^9 = 19,683 to 3^16 = 43,046,721. If we ignore early wins and simply count full move orders, a 4x4 board has:

16! = 20,922,789,888,000

That is more than twenty trillion full-length move orders.

Of course, real games may stop early. That helps. It also makes the mathematics uglier, because early stopping depends on the exact sequence of previous moves. A win is not just a property of how many marks exist; it is a property of where they were placed and whether the game should have ended earlier. It is this dependence on move history that makes a closed-form count intractable.

This is the first lesson algorithm learners should take seriously: small rule systems can produce huge search spaces without looking complicated. Think of the Travelling Salesman Problem or cryptography.

The Value of `k` Changes the Strategy of the Whole Game

The number of winning lines on an n x n board with k in a row is:

2n(n - k + 1) + 2(n - k + 1)^2

The first term counts horizontal and vertical windows. The second counts the two diagonal directions.

For standard 3x3 TicTacToe, n = 3, k = 3, so there are 8 winning lines. That matches what we learn as children: three rows, three columns, two diagonals.

For 4x4 with k = 4, there are only 10 winning lines. Four rows, four columns, two long diagonals. The board has more space, but each win still needs a full-length line. Blocking is cheap: one opponent mark ruins an entire candidate line.

For 4x4 with k = 3, the count jumps to 24 winning lines. The board did not change. The win condition did.

On a 10x10 board, the contrast is sharper. With k = 10, there are 22 winning lines. With k = 3, there are 288.

For a fixed board size, smaller k means more possible winning windows. Larger k means fewer windows, because each candidate line needs more uninterrupted space.
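The formula is easy to cross-check against brute-force enumeration. In this sketch a "window" is any run of k consecutive cells in one of four directions that fits on the board; the function names are illustrative, and the check assumes k is at least 2.

```python
def winning_lines_formula(n: int, k: int) -> int:
    w = n - k + 1                      # start positions along one axis
    return 2 * n * w + 2 * w * w       # rows + columns, then both diagonals

def winning_lines_enumerated(n: int, k: int) -> int:
    count = 0
    for r in range(n):
        for c in range(n):
            # right, down, down-right, down-left
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                end_r, end_c = r + (k - 1) * dr, c + (k - 1) * dc
                if 0 <= end_r < n and 0 <= end_c < n:
                    count += 1         # the whole run stays on the board
    return count

for n, k in [(3, 3), (4, 4), (4, 3), (10, 10), (10, 3)]:
    print(n, k, winning_lines_formula(n, k), winning_lines_enumerated(n, k))
```

With k = n the formula collapses to 2n + 2, which grows linearly in n; with k held fixed the squared term dominates.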

This is why my 4x4, 3-in-a-row version felt so different to the children. More winning windows means more local threats. A move can matter in several directions at once. Forks become easier to create. Defence becomes less obvious because blocking one threat may leave another one open.

The top row of a 4x4 board already shows the problem:

. X X .

With k = 4, this row is not an immediate emergency. X still needs both empty ends to complete the full row. With k = 3, either end is a winning move:

X X X .   or   . X X X

Diagonals create the same effect. This position is harmless in the 4-in-a-row version, but one move from over in the 3-in-a-row version:

X . . .
. X . .
. . . .
. . . .

If X plays the third diagonal square, the game ends:

X . . .
. X . .
. . X .
. . . .

That is why children who were comfortable with normal TicTacToe suddenly began missing threats. The board did not look crowded, but the number of short lines hiding inside it had grown.

This does not prove that X always wins as the board grows. That would be too strong. What it does show is that k is not a minor rule parameter. It changes the shape of the search problem.

Where the Problem Suddenly Changes: Phase Transition

Algorithm designers care about these turning points.

When k is close to n, wins are long and fragile. A single blocking mark can spoil a whole line. Draws become more plausible because players have enough room to interrupt one another.

When k is small compared with n, winning windows multiply. The board becomes full of short local races. It starts to resemble games like Gomoku, where threats and counter-threats stack quickly.

Somewhere between those extremes, the game changes character. It may shift from draw-heavy to win-heavy. It may shift from easy to search exhaustively to completely impractical. That kind of sharp behavioural change is often called a phase transition.

This points at a useful algorithmic question: as n grows, how should k grow if we want the game to stay balanced and analyzable?

A rough way to see the pressure is to look at the number of winning windows. If k = n, the count grows only linearly with n. If k stays fixed, the count grows quadratically with n. Those are very different worlds.

That is the kind of question that turns a children’s game into research fuel.

Why This Matters for AI

Classical AI did not begin with neural networks. A lot of it began with search.

Take a game state. Generate legal actions. Apply one action to produce a new state. Check whether the new state is terminal. Repeat until the tree ends, then work backward to choose the best move.

That is the heart of tree search (think Breadth-First Search, BFS, and Depth-First Search, DFS). Making the search tree aware of two opposing players gives us MiniMax. Add alpha-beta pruning and you can ignore branches that cannot affect the final decision. Add better move ordering and the pruning improves. Add a transposition table and you avoid re-solving positions you have already seen. These are not decorative tricks; they are the difference between a solver that finishes and a solver that drowns.

Exact search has a hard ceiling. If the tree is too large, the correctness of BFS or DFS does not help you much. An optimal algorithm that cannot return before dinner is not useful during a game.

That is where Monte Carlo Tree Search enters the story. Instead of expanding every possible future, it samples futures and spends more effort where the search looks promising. AlphaZero-style systems go one step further: they use neural networks to guide search toward moves worth considering.

The machine does not become magical. It becomes selective.

And that is the second lesson. Real-world AI often hits the same wall that bigger TicTacToe hits: too many possible states, too many action sequences, too many futures to enumerate exactly. Planning, routing, scheduling, protein folding, game playing, program synthesis; once the branching factor grows, brute force collapses.

TicTacToe just lets us watch the collapse happen on a board small enough to draw by hand.

What I Took Away

I started with a game the children already loved. I made the board bigger. Then I changed one number.

That was enough to move from playground strategy to deep recursion, combinatorial explosion, phase-transition intuition, and the limits of exact AI search.

The funny part is that the children understood the important bit before the math appeared. They could feel that the game had changed. More space did not simply mean more freedom. A smaller k did not simply mean an easier win. The rules interacted, and the board became harder to reason about.

That is algorithm design in miniature.

In the next post, I will take this broken version of TicTacToe and treat it the way classical AI would: as a search problem. We will start with MiniMax, sharpen it with alpha-beta pruning, and see exactly where exact search begins to run out of breath.

Machines Are Becoming Sophisticated Cyberattackers. Is Singapore Ready?

2026-04-19T00:00:00+00:00

This post was proofread with the assistance of AI.

I. You Are Already in the Fight

Have you dismissed a software update notification on your phone this week? Clicked a link in an SMS from “DBS” about a transaction you did not make? Streamed a Korean drama on a website you had never heard of? Reused the same password across your Singpass, banking, and e-commerce accounts? You are not alone, and you have never been a more attractive target.

Last year, Singaporeans lost S$1.1 billion to scams (Cyber Security Agency of Singapore [CSA], 2025), a 70% jump from the year before, bringing cumulative losses since 2019 past S$3.4 billion. One malware-enabled cryptocurrency scam cost a victim S$125 million in a single stroke. Behind each figure is a retiree whose CPF was drained, a small business whose accounts were emptied overnight, a family whose carefully-built nest egg simply vanished.

And that was the world before the machines got smart.

Within 9 days in April 2026, 3 announcements landed in rapid succession, from Anthropic, OpenAI, and our own CSA, that should have dominated front pages across Singapore. They did not. Most Singaporeans scrolled past them. That omission may prove expensive.

This article explains to the everyday Singaporean what happened, why it matters to you, and what must change (quickly!) if Singapore is to emerge from the coming decade as a global hub of secure computing rather than a cautionary tale.

II. A Nation Under Siege

First, let us be clear-eyed about where Singapore already stood, even before frontier AI entered the picture. We were, by any reasonable measure, one of the most heavily targeted countries.

Organisations operating here faced an average of 2,272 cyber attacks per week in 2025, 17% higher than the year before (CrowdStrike, 2026). Singapore was ranked the 7th most attacked country globally in the 4th quarter of 2024. SecurityScorecard found that every single one of Singapore’s top 100 publicly listed companies had at least one compromised third-party provider. Phishing cases rose 49% in 2024 alone; ransomware incidents climbed 21%, hammering small and medium enterprises [SMEs] in professional services hardest of all (CSA, 2025).

The threat actors are no amateurs. Most Singaporeans will remember the 2018 SingHealth breach that exposed 1.5 million patient records, including those of Prime Minister Lee Hsien Loong, and shook public confidence in our digital systems. What fewer people know is that the pressure has only intensified since. In July 2025, CSA publicly disclosed that UNC3886, a sophisticated China-linked Advanced Persistent Threat group, had been quietly targeting Singapore’s critical information infrastructure [CII] since late 2021, using zero-day exploits and rootkits designed to burrow deep into telecommunications and defence systems. In February 2026, Operation Cyber Guardian, a multi-agency response, was mounted specifically to counter UNC3886’s intrusions into our telecommunications networks.

To Singapore’s credit, the national architecture built to defend against these threats is genuinely formidable. CSA, established in 2015, now oversees roughly 500 personnel and enforces standards across 11 CII sectors under the Cybersecurity Act and its 2024 amendments, which expanded regulatory reach to cloud workloads, virtual systems, and third-party-owned infrastructure (Baker McKenzie, 2026). GovTech operates a round-the-clock Government Cybersecurity Operations Centre [GCSOC]. The Defence Science and Technology Agency [DSTA] builds bespoke military cyber solutions. The Digital and Intelligence Service [DIS], stood up in 2022 as the Singapore Armed Forces’ [SAF] 4th service branch, consolidates digital defence, cyber operations, and intelligence under a unified command. Singapore achieved the highest tier in the International Telecommunication Union’s 2024 Global Cybersecurity Index.

On paper, this is a nation that takes cybersecurity seriously. And on paper, it is well-prepared.

But paper defences are no match for what arrived in April.

III. The Models That Changed Everything

On 7 April 2026, Anthropic, the American AI company behind the Claude models, announced Project Glasswing, a consortium formed with Amazon Web Services [AWS], Apple, Google, Microsoft, Cisco, CrowdStrike, NVIDIA, Palo Alto Networks, and several others. The occasion was the unveiling of Claude Mythos Preview, a frontier AI model the company had determined was too dangerous for public release (Anthropic, 2026).

What had Mythos done? Operating autonomously, without human steering, it had discovered thousands of previously unknown security flaws in every major operating system and every major web browser on earth. It found a 27-year-old remotely exploitable bug in OpenBSD, a system long considered one of the most security-hardened in the world and used to run firewalls and critical infrastructure. It unearthed a 16-year-old vulnerability in FFmpeg, a media processing tool embedded in virtually every video application on the planet, in a line of code that automated testing tools had hit 5 million times without ever catching the problem. It chained together multiple vulnerabilities in the Linux kernel, the software that runs most of the world’s servers, to escalate from an ordinary user account to total control of the machine.

Anthropic was blunt about the implications: “AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.” It then added a warning that should land with weight in every boardroom (and living room) in Singapore: “Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely” (Anthropic, 2026).

The benchmarks validate the leap. On CyberGym, a standardised test of offensive cybersecurity capability, Mythos scored 83.1%, compared to 66.6% for its predecessor, Opus 4.6. On Firefox JavaScript engine exploit development, Mythos succeeded 181 times where the previous model managed just 2 across hundreds of attempts. During testing, the model broke out of its sandbox and sent an unsolicited email to a researcher, a vivid demonstration of autonomous behaviour that no one had programmed it to exhibit.

Anthropic is not alone. One week later, on 14 April, OpenAI announced GPT-5.4-Cyber, a variant of its latest model deliberately fine-tuned for cybersecurity tasks, with refusal boundaries lowered to enable binary reverse engineering and other advanced defensive workflows. OpenAI classified GPT-5.4 as “high” cyber capability under its own preparedness framework, an unprecedented admission from a company that has historically been measured about model risks. “Cyber risk is already here and accelerating, but we can act… Safeguards cannot wait for a single future threshold” (OpenAI, 2026).

The next day, on 15 April, CSA issued a formal advisory titled Risks Associated with Frontier AI Models. The language was measured; the message was not. These frontier AI models, the advisory noted, “can reportedly reduce the time taken to identify vulnerabilities and engineer exploits, cutting short the duration from months to hours” (CSA, 2026). CSA urged organisations to patch every high-critical vulnerability on internet-facing systems, enforce multi-factor authentication across all administrative interfaces, review cloud configurations, segment networks against lateral movement, and deploy AI-powered vulnerability detection of their own.

Anthony Grieco, Cisco’s Chief Security and Trust Officer and a Glasswing partner, captured what those 9 days had accomplished: “AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure from cyber threats, and there is no going back. The old ways of hardening systems are no longer sufficient” (Anthropic, 2026).

Within a single week, the most sophisticated offensive cybersecurity capabilities, the kind that once required nation-state resources and years of elite training, became replicable by any computer running autonomously, in hours, for the cost of cloud computing credits.

IV. Singapore’s Hand: Strong Cards, Structural Gaps

Given what is now arriving, how does Singapore stand?

The genuinely good news is that we enter this new era better prepared than almost any comparable nation. The institutional architecture described above is real, not ornamental. Beyond it, the DIS Sentinel Programme trains approximately 800 students annually in penetration testing, network forensics, and, from this year, applied AI-for-cyber modules, developed with AI Singapore (Ministry of Defence [MINDEF], 2025). Budget 2026 committed to training 100,000 AI-savvy workers by 2029 and offered citizens on training courses 6 months of free access to premium AI tools. Singapore has invested over S$400 million in quantum technology through 2030, including Singtel’s quantum key distribution network, a hedge against the day AI-accelerated cryptanalysis makes conventional encryption obsolete.

Diplomatically, Singapore chaired the United Nations Open-Ended Working Group on ICT Security from 2021 to 2025, hosts the ASEAN-Singapore Cybersecurity Centre of Excellence, and has trained more than 900 senior ASEAN officials. By any reasonable standard, we are a regional leader.

But honesty demands we name the weaknesses too. Four stand out.

1. Talent. Singapore’s cybersecurity workforce grew from roughly 4,000 professionals in 2016 to 12,000 in 2022 — yet approximately 4,000 positions remain unfilled. Cybersecurity job postings rose 57% from 2024 to 2025. The Ministry of Manpower has placed cybersecurity roles on its 2026 Shortage Occupation List, a tacit admission that domestic supply cannot meet demand. Worse, the shortage is no longer merely about headcount. The SANS Institute’s 2026 Cybersecurity Workforce Research Report, which featured CSA as a case study, found that, for the first time, skills gaps have overtaken headcount shortages as the industry’s top workforce challenge, with AI-for-cyber the single largest deficit (SANS Institute, 2026). Bitdefender’s 2025 assessment found that 64% of Singapore’s cybersecurity professionals are experiencing burnout, and 53% plan to leave their roles within a year, both figures well above global averages (Bitdefender, 2025). A nation cannot defend itself with exhausted people heading for the exit.

2. Vendor dependence. Not a single Singapore-based company sits among the 12 Project Glasswing launch partners. Our cybersecurity ecosystem, while growing, remains heavily reliant on American, Israeli, and European security products. The CyberSG R&D Programme Office at Nanyang Technological University [NTU] has received S$62 million, a respectable sum, but modest next to the US$29.5 million DARPA spent on its AI Cyber Challenge alone, or the US$4.4 billion in venture funding that flowed into Israeli cybersecurity in 2025 (NTU Singapore, 2023; Ynet News, 2025). We largely buy our defence from abroad. In an era of sharpening great-power competition, that dependency is a strategic vulnerability.

3. The quiet nationalisation of frontier AI itself. Analysts at strategic think tanks have predicted, and are starting to observe, that major global powers are increasingly treating their most capable AI models as sovereign assets, tools reserved for national decisions on economics, defence, and intelligence, and wielded as diplomatic levers to draw smaller states into alignment. The evidence is already on the record. The United States’ January 2025 AI Diffusion Framework placed 120 countries, Singapore among them, into a three-tier system capping advanced AI chip imports, and, for the first time, applied formal export controls on the model weights of the most capable AI systems themselves, through a new Export Control Classification Number for AI models (CSIS, 2025). Enforcement actions across 2025 and 2026 specifically targeted entities in Singapore and Malaysia alleged to have diverted restricted Nvidia chips onward to China, with Washington pressuring regional governments to tighten their own licensing regimes (Sun, 2026). Anthropic itself framed its Project Glasswing partnership as a reason why “the US and its allies must maintain a decisive lead in AI technology” (Anthropic, 2026), language that makes plain how cybersecurity capability is now inseparable from great-power AI strategy. For all Singapore’s computing and policy prowess, it enters this contest as a middle power. Full sovereign control of the AI stack, from compute to foundation models to the cybersecurity applications that sit on top of them, lies beyond Singapore’s reach (Cambrian Research, 2025). The realistic strategy is not to match superpowers on their own terms but to preserve agency within this competition: investing in targeted sovereign capabilities where scale permits, diversifying access pathways across multiple vendors and jurisdictions, and refusing the pressure toward simple alignment that an AI-bifurcated world will increasingly exert.

4. The softest target of all: the citizen, the small business, the household. Which is where this story stops being abstract.

V. Why This Is Personal

Before we discuss national policy responses, an uncomfortable conversation is overdue, one aimed squarely at the reader.

Have you been delaying the software updates on your phone and your laptop, the ones that pop up at inconvenient moments and that you dismiss with a sigh? Those updates are neither cosmetic nor corporate trickery. They patch precisely the kind of vulnerabilities that Mythos has now proven a machine can discover in hours, at scale, autonomously. Every unpatched device is an open door. CSA’s 15 April advisory is explicit: “AI-powered attacks can weaponise newly disclosed vulnerabilities within hours of publication, making rapid patch deployment critical to preventing mass exploitation” (CSA, 2026). You are no longer racing human hackers working weekends. You are racing AI models that never sleep.

Have you clicked a link in a Telegram or Instagram advertisement promising cheap branded hauls, flash-sale sure-win Pokémon booster packs, or unbelievable investment returns? AI-generated phishing messages already achieve click-through rates of 54%, compared to 12% for human-crafted attempts. The spelling errors and awkward phrasing that used to be telltale signs are gone. Frontier models write in fluent, contextually appropriate English, Mandarin, Malay, and Tamil. They adapt to Singaporeans’ profiles, browsing histories, and social graphs. The next “DBS™ bank notification” you receive may have been optimised by AI to look exactly like something you would trust.

Have you streamed a drama on a website that was not Netflix, Viu, Disney+, or mewatch, but a “free” site cluttered with pop-ups and auto-playing advertisements? Such sites are among the most reliable distribution channels for malware in Southeast Asia. The moment a frontier AI model is used to engineer a novel browser exploit, every visitor to such a site becomes a potential victim of a drive-by compromise that needs no click, no download, no consent. Your device is infected by the mere act of loading the page.

Have you reused the same password across your Singpass, your bank, your Shopee and Lazada accounts, and your work login? A single credential breach, combined with an AI-accelerated credential-stuffing attack, is now sufficient to drain accounts, impersonate you on social media, and extract sensitive data from your employer. The attacker no longer needs to target you specifically. The attacker targets everyone, simultaneously, at machine speed, and you happen to be in the snare.

This is what changes when the machines learn to hack: the cost of carelessness rises precipitously, and the margin for error collapses.

Small and medium enterprises [SMEs], which form the backbone of Singapore’s economy, face this same dynamic at organisational scale, and with far higher stakes. CSA data shows that over 80% of organisations, many of them SMEs, suffer at least one cyber incident annually (CSA, 2025). The World Economic Forum [WEF] projects the global economic impact of cyberattacks will surge from US$8.44 trillion in 2022 to US$23.84 trillion by 2027. SMEs, lacking dedicated security teams, proper budgets, or specialist expertise, will absorb a disproportionate share of that damage (World Economic Forum, 2024). A single successful ransomware attack can close a neighbourhood clinic, a family-owned home catering business, or a third-party vendor whose breach cascades up into the multinational bank it services.

The new Cyber Resilience Centre, offering SME helplines, diagnostic clinics, and CISO-as-a-Service, is a step forward. Three homegrown startups, AgileMark, Scantist, and StrongKeep, backed by the S$20 million CyberSG TIG Centre at the National University of Singapore [NUS], are building affordable, accessible tools tailored for exactly this segment (CyberSG TIG Centre, 2026). These are the right instruments. The question is whether they reach the SMEs at Woodlands and Ubi as quickly as they reach the MNCs at Raffles Place, and whether SME owners recognise the urgency before an attack, rather than after one.

VI. Crisis as Catalyst: Three Imperatives

There is a version of this story that ends badly: a nation well-prepared for yesterday’s threats but overwhelmed by tomorrow’s. There is another version, arguably more consistent with Singapore’s historical instinct, in which an external shock catalyses transformation that would otherwise have taken a decade.

Singapore has done this before. The abrupt withdrawal of British forces in 1968 forced the creation of a national defence capability from scratch, giving rise to the Singapore Armed Forces. The SARS crisis of 2003 produced a public health infrastructure whose value was proven during COVID-19. Perpetual water dependency drove investment in NEWater and desalination that made Singapore a global model for water security. In each case, a perceived vulnerability became the foundation of a national strength.

Frontier AI in cybersecurity presents the same structural opportunity, if Singapore moves with the urgency the moment demands. Three imperatives stand out.

1. Build the local cybersecurity workforce as a national project, not a market outcome. The DIS Sentinel Programme, today training 800 students per year, should be scaled to 2,000 or more, with structured post-National-Service pathways directly into private-sector cybersecurity careers. The model to study is Israel’s Unit 8200, whose alumni form the nucleus of a multi-billion-dollar cybersecurity industry, representing only 7% of Israel’s tech sector by number but attracting 36% to 38% of total tech investment (Startup Nation Central, 2025). Defence Cyber Chief Colonel Clarence Cai put the imperative clearly when the Sentinel Programme announced its new AI modules: “Cyber operations are already being transformed by AI. There is a need for Sentinel Programme, as a cyber youth programme, to prepare a new generation of defenders who are grounded in the fundamentals of cyber and also able to reflexively use AI for cyber operations” (MINDEF, 2025). Minister of State for Defence Desmond Choo, addressing the largest-ever cohort of Senior Military Experts in January 2026, added what should be this decade’s organising principle: “People remain our decisive advantage” (MINDEF, 2026).

CSA has trained over 22,000 individuals since 2020 (SANS Institute, 2026). The next 22,000 must be trained in half the time, with applied AI-for-cyber skills at the core, or Singapore will find itself defending against machine-speed threats with human-speed teams.

2. Create non-negotiable urgency for both SMEs and citizens. CSA’s advisory recommendations are not suggestions. They are survival instructions. A national SME Cyber Voucher programme, modelled on the existing Productivity Solutions Grant, could subsidise AI-powered endpoint protection, managed security operations, and automated vulnerability scanning for SMEs with fewer than 200 employees. The cost would be a small fraction of the S$1.1 billion Singaporeans already lose to scams every year.

For citizens, CSA’s existing outreach must be equally direct. Update your devices when prompted. Do not click on links in unsolicited messages, no matter how authentic they appear. Do not stream from pirate sites. Use unique, strong passwords. Enable two-factor authentication on every account that offers it. Report suspicious activity to ScamShield without delay. These are not IT department problems. In the age of frontier AI, they are civic duties, on the same plane as locking your HDB flat door at night.

3. Position Singapore as the world’s most secure computing hub. This is not merely defensive; it is economic strategy. As Anthropic commits US$100 million in usage credits through Project Glasswing and OpenAI scales its Trusted Access for Cyber [TAC] programme, the global market for AI-secured infrastructure is forming in real time. Singapore’s strengths (regulatory maturity, data-centre density, a projected cybersecurity market reaching US$6.41 billion by 2031, and geopolitical neutrality) make it the natural home for organisations that demand both connectivity and demonstrable security.

But “demonstrable” is the operative word. The provisions for Entities of Special Cybersecurity Interest and Foundational Digital Infrastructure in the amended Cybersecurity Act should be commenced as soon as practicable, not stretched across a multi-year timeline (Baker McKenzie, 2026). CSA’s March 2026 announcement that the Centre for Strategic Infocomm Technologies [CSIT] will develop indigenous threat detection tools for critical infrastructure owners is a welcome signal of sovereign ambition, but the resourcing must match the ambition. And Singapore should actively seek partnership or observer status in initiatives like Project Glasswing, not because it needs access to Mythos itself, but because the intelligence-sharing, standard-setting, and advance warning such partnerships confer are worth more than any single tool.

VII. The Choice Before Us

Anthropic’s own Project Glasswing announcement concluded with a sentence that deserves quoting at full weight: “The work of defending the world’s cyber infrastructure might take years; frontier AI capabilities are likely to advance substantially over just the next few months. For cyber defenders to come out ahead, we need to act now” (Anthropic, 2026).

Singapore has every ingredient required to lead: institutional maturity, regulatory sophistication, a highly educated workforce, a government that moves quickly when it chooses to, and a citizenry that has historically responded to existential challenges with pragmatism and resolve. What remains is the choice to treat this moment as what it actually is, not a technical problem for IT departments, but a national challenge that demands a national response.

So here is the call, plainly stated, to every reader of this article.

To Government: fund the Sentinel Programme’s expansion, accelerate the remaining Cybersecurity Act provisions, bankroll an SME Cyber Voucher scheme, and pursue a seat at the international table where the rules of AI-enabled security are being written.

To businesses, especially SMEs: assume breach, patch everything, enforce multi-factor authentication (or stronger Identity and Access Management [IAM]), segment your networks, adopt the affordable locally-built tools that the CyberSG TIG Centre has already brought to market, and do not wait for an incident to force the investment.

To every Singaporean reading this: update your devices tonight, stop clicking unknown links, stop streaming from shady sites, use different passwords for different accounts, turn on two-factor authentication, and understand, as clearly as you understand that your HDB door needs to be locked when you leave home, that digital hygiene is now a matter of national security as much as personal security.

AI models are already at the gates. The question is no longer whether Singapore will be targeted (it already is, 2,272 times a week) but whether each of us, in our role, will act before the window closes.

The clock is running. It does not run slowly.

References

Anthropic. (2026, April 7). Project Glasswing: Securing critical software for the AI era. https://www.anthropic.com/glasswing

Baker McKenzie. (2026, March 31). Singapore: Cybersecurity licensing framework updates, new threat detection tools. https://www.bakermckenzie.com/en/insight/publications/2026/03/singapore-cybersecurity-licensing-framework-updates-threat-detection-tools

Bitdefender. (2025). 2025 Cybersecurity Assessment Report: Singapore findings. Bitdefender.

Cambrian Research. (2025, August 11). Singapore’s AI Strategy and the Limits of Digital Sovereignty. https://cambrianr.substack.com/p/singapores-ai-strategy-and-the-limits

Center for Strategic and International Studies. (2025, April 2). Understanding U.S. allies’ current legal authority to implement AI and semiconductor export controls. https://www.csis.org/analysis/understanding-us-allies-current-legal-authority-implement-ai-and-semiconductor-export

CrowdStrike. (2026). 2026 Global Threat Report: AI accelerated adversaries. https://www.crowdstrike.com/en-us/press-releases/2026-crowdstrike-global-threat-report/

Cyber Security Agency of Singapore. (2025). Singapore Cyber Landscape 2024/2025. https://www.csa.gov.sg/resources/publications/singapore-cyber-landscape-2024-2025/

Cyber Security Agency of Singapore. (2026, April 15). Advisory on risks associated with frontier AI models. https://www.csa.gov.sg

CyberSG TIG Centre. (2026, March 23). Singapore cybersecurity firms showcase SME-focused innovations to counter rising cyber threats at RSAC 2026 Conference. Media OutReach Newswire.

Ministry of Defence, Singapore. (2025, November 28). SAF expands Cybersecurity Student Talent Development Programme to include AI skills in 2026. https://www.mindef.gov.sg/news-and-events/latest-releases/28nov25-nr2/

Ministry of Defence, Singapore. (2026, January 21). Speech by Minister of State for Defence, Mr Desmond Choo, for the 30/26 SAF Senior Military Expert Appointment Ceremony. https://www.mindef.gov.sg/news-and-events/latest-releases/21jan26-speech/

NTU Singapore. (2023). S$62 million CyberSG R&D Programme Office. https://www.ntu.edu.sg/news/detail/sgd62-million-cybersg-r-d-programme-office

OpenAI. (2026, April 14). Trusted access for the next era of cyber defense. https://openai.com

SANS Institute. (2026). 2026 SANS GIAC Cybersecurity Workforce Research Report. SANS Institute.

Startup Nation Central. (2025). Israeli cybersecurity is defining the future in 2025. https://startupnationcentral.org/hub/blog/israeli-cybersecurity-is-defining-the-future-in-2025/

Sun, M. (2026, April). Manacled Manus: The limits of “Singapore washing” for China AI. Asia Times. https://asiatimes.com/2026/04/manacled-manus-the-limits-of-singapore-washing-for-china-ai/

World Economic Forum. (2024). Global Cybersecurity Outlook 2024. WEF.

Ynet News. (2025). Record $4.4B flows into Israeli cybersecurity as global VCs outpace locals in 2025 boom. https://www.ynetnews.com/business/article/rjggjusz11g

The views expressed are the author’s own.

Every Scam Site Leaves One Trace Before It Goes Live. We Built a Tool to Catch It.

2026-03-27T00:00:00+00:00

This post was proofread with the assistance of AI. The source code backing this story is open-sourced and available on GitHub.

When we pointed CertStream at two influential Singapore organisations, it surfaced over ten thousand suspicious domains in a single day. No human team could triage that volume before the threats became active. So we built a machine to do it: one that classifies domains in real time, flags the dangerous ones before the first victim ever loads the page, and hands analysts a curated shortlist instead of a firehose. This is the story of how that tool came to be, what it took to build it, and what it taught us about defending brands at machine speed.

Before I get into the technical details, I want to take a moment to acknowledge the people who made this work possible.

During my internship, I had the privilege of learning from the Asia-Pacific Digital Risk Protection team at a leading cybersecurity company. They showed me what it really looks like to defend a brand in the wild — the volume, the pace, the judgment calls analysts make under pressure, and the genuine satisfaction of taking down a scam site before it harms someone. This project is built on everything they taught me, and I dedicate it to them.

The Problem: We Found the Domains. Now What?

In my previous post, I described how Certificate Transparency logs — and specifically CertStream, the real-time WebSocket feed built by Cali Dog Security — give us a powerful early warning system. Every domain that obtains an HTTPS certificate is publicly logged the moment it does, and brand-targeted scam domains are no exception. That means we can see them at the point of certificate issuance, often before the site is even reachable.

The problem we were left with was a different one entirely: volume.

CertStream processes millions of certificate events every day. Even after filtering to domains that contain your brand keywords, you’re looking at thousands of candidates per day for a single brand. A human analyst cannot manually investigate each one fast enough for the intelligence to be actionable. By the time they’ve worked through the queue, the scam domains they flagged are either weaponised or gone.

We had solved the sourcing problem. We had created a new one.

The Implication: Speed Is Not a Nice-to-Have

Let me be direct about the stakes. In threat intelligence, there are two kinds of errors. False positives — flagging a benign domain as malicious — waste analyst time and erode trust in the tool. Annoying, but recoverable. False negatives — missing a real scam domain — mean a live phishing site with your brand’s logo on it is collecting victims’ credentials, authorising fraudulent transactions, and destroying customer trust, with nobody at your organisation aware it exists.

False negatives are not a metric. They are harm to real people.

The implication is this: an automated classification layer that works around the clock, surfaces threats with high recall, and hands analysts only what truly requires human judgment is not a convenience — it is a prerequisite for any serious DRP operation. That is what we set out to build.

BrandSentinel: A Live Feed of Malicious Domains

Methodology

BrandSentinel is an open-source Digital Risk Protection pipeline. It ingests domains from multiple threat intelligence feeds continuously, classifies them using an ensemble of heuristics, and writes verdicts to per-category output files — all while the analyst monitors a live dashboard in Prefect’s UI.

Ingestion. Ten source workers run as independent async tasks, each polling a different feed on its own schedule:

Source	Frequency
CertStream	Live (CT log stream)
URLhaus	Every 5 minutes
PhishTank	Every 1 hour
OpenPhish	Every 12 hours
CERT Polska, Phishing.Database, Phishing Army, Botvrij.eu, DigitalSide	Configurable
Manual import file	Every 30 seconds

Every domain that any source produces is placed onto a shared, deduplicated asyncio.Queue. If a domain has been seen before — across any source — it is silently dropped. The pipeline never processes the same domain twice.

Filtering. A keyword-based filter checks each domain against the brand configurations in config.yaml. Domains that do not contain a brand keyword are classified as IRRELEVANT and logged immediately. This is the first gate, and it is intentionally cheap — regex matching costs nothing compared to what comes next.
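A rough sketch of that first gate, assuming the brand section of config.yaml boils down to a mapping from brand name to keywords. The BRAND_KEYWORDS dictionary and both function names are hypothetical; the real configuration schema may look different.

```python
from __future__ import annotations

import re
from typing import Optional

# Hypothetical stand-in for the brand section of config.yaml.
BRAND_KEYWORDS = {"mybank": ["mybank", "my-bank"]}


def compile_brand_patterns(brands: dict[str, list[str]]) -> dict[str, re.Pattern]:
    """Build one case-insensitive alternation pattern per brand."""
    return {
        brand: re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
        for brand, keywords in brands.items()
    }


def match_brand(domain: str, patterns: dict[str, re.Pattern]) -> Optional[str]:
    """Return the first brand whose keyword appears in the domain, else None."""
    for brand, pattern in patterns.items():
        if pattern.search(domain):
            return brand
    return None  # the caller would record this domain as IRRELEVANT
```

Compiling the patterns once and reusing them for every domain is what keeps this gate cheap relative to the network lookups that follow.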

Classification. For every domain that passes the filter, BrandSentinel concurrently fetches the HTTP response (following redirects, capturing page content), resolves DNS records, and retrieves the TLS certificate — all in parallel. This enriched context is then passed through a pipeline of 15 heuristics.
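The concurrent enrichment step can be sketched with asyncio.gather. The three fetchers below are placeholders that only sleep and return canned values (one of them fails on purpose); DomainContext and all function names are assumptions made for this example, and real fetchers would perform network I/O.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass
from typing import Any


@dataclass
class DomainContext:
    domain: str
    http: Any
    dns: Any
    tls: Any


# Placeholder fetchers: stand-ins for real HTTP, DNS, and TLS lookups.
async def fetch_http(domain: str) -> dict:
    await asyncio.sleep(0.01)
    return {"status": 200, "body": "<html></html>"}


async def resolve_dns(domain: str) -> dict:
    await asyncio.sleep(0.01)
    return {"A": ["192.0.2.10"]}


async def fetch_tls(domain: str) -> dict:
    await asyncio.sleep(0.01)
    raise TimeoutError("TLS handshake timed out")  # simulated failure


async def enrich(domain: str) -> DomainContext:
    """Run all three lookups concurrently and keep failures as values."""
    http, dns, tls = await asyncio.gather(
        fetch_http(domain),
        resolve_dns(domain),
        fetch_tls(domain),
        return_exceptions=True,  # one failed lookup must not discard the others
    )
    return DomainContext(domain, http, dns, tls)
```

Passing return_exceptions=True means a failed lookup arrives as an exception object in the context, which a heuristic such as the inactive-domain check could then treat as evidence rather than as a crash.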

The heuristics run lazily. If any single heuristic produces a definitive verdict — such as matching a known brand favicon hash on a non-canonical domain, or detecting a phishing kit directory structure — the pipeline short-circuits and assigns that verdict immediately, without running the remaining checks. For contributing signals, scores are accumulated and normalised. The final verdict is one of four categories:

SCAM — High confidence. Investigate and initiate takedown.
INCONCLUSIVE — Suspicious. Requires analyst review.
BENIGN — Passed all checks. Continue monitoring passively.
IRRELEVANT — Not related to the monitored brand. Dropped.
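To make the lazy evaluation concrete, here is a minimal sketch of a short-circuiting classifier with a weighted, normalised score. The Signal dataclass, the weights, and the two thresholds (0.7 and 0.3) are illustrative assumptions; BrandSentinel's real scoring parameters are not stated in this post. IRRELEVANT is left out because it is assigned earlier, by the keyword filter.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, Optional


class Verdict(Enum):
    SCAM = "SCAM"
    INCONCLUSIVE = "INCONCLUSIVE"
    BENIGN = "BENIGN"


@dataclass
class Signal:
    score: float = 0.0                    # contribution in [0, 1]
    weight: float = 1.0                   # assumed per-heuristic weight
    definitive: Optional[Verdict] = None  # set when the heuristic alone decides


Heuristic = Callable[[dict], Signal]


def classify(ctx: dict, heuristics: Iterable[Heuristic],
             scam_at: float = 0.7, benign_below: float = 0.3) -> Verdict:
    total = weight_sum = 0.0
    for heuristic in heuristics:          # evaluated one at a time, in order
        signal = heuristic(ctx)
        if signal.definitive is not None:
            return signal.definitive      # short-circuit: skip remaining checks
        total += signal.score * signal.weight
        weight_sum += signal.weight
    normalised = total / weight_sum if weight_sum else 0.0
    if normalised >= scam_at:
        return Verdict.SCAM
    if normalised < benign_below:
        return Verdict.BENIGN
    return Verdict.INCONCLUSIVE           # ambiguous evidence goes to the analyst
```

Ordering the cheap, definitive heuristics first is what makes the short-circuit pay off: the expensive checks only run for domains that nothing has already settled.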

A few heuristics are worth calling out explicitly because they do the heaviest lifting:

Inactive domain: An HTTP timeout, connection refusal, or non-200 response is a strong signal that the domain is not yet, or no longer, an active threat: scam domains are frequently pre-registered before they’re weaponised, or already retired.
Parking detection: The page content or DNS fingerprint matches a known domain parking service. These are typically benign, but they are watched — parked domains can be activated at any time.
Brand lookalike: Levenshtein distance between the registered domain and the brand’s canonical domain detects typosquatting and combosquatting that a keyword filter alone would miss.
Favicon hash matching: A SHA-1 hash of the site’s favicon is compared against a curated list of brand favicon hashes. A match on a non-canonical domain is a near-definitive SCAM signal — an attacker has copied the brand’s own visual identity.
Forms exfiltration: Login forms or input fields that submit to a different domain than the one being analysed are a reliable indicator of a credential-harvesting phishing page.
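Two of the heuristics above are simple enough to sketch with the standard library alone. This is a hedged illustration, not the project's implementation: the function names, the max_distance threshold of 2, and the idea of passing known hashes as a set are all assumptions.

```python
from __future__ import annotations

import hashlib


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def is_lookalike(domain: str, canonical: str, max_distance: int = 2) -> bool:
    """Flag near-misses of the canonical domain, excluding the domain itself."""
    distance = levenshtein(domain.lower(), canonical.lower())
    return 0 < distance <= max_distance


def favicon_matches(favicon_bytes: bytes, known_sha1: set[str]) -> bool:
    """Compare the SHA-1 hex digest of a favicon against curated brand hashes."""
    return hashlib.sha1(favicon_bytes).hexdigest() in known_sha1
```

With a small threshold like this, the distance check only catches close misspellings such as mybannk.com; longer additions to the name produce larger distances and would need a different comparison.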

Output. Verdicts are written continuously to scam.txt, inconclusive.txt, benign.txt, and irrelevant.txt. The entire pipeline is orchestrated by Prefect, which provides a live UI for observing flow runs, retrying failed tasks, and inspecting per-domain logs without touching the terminal.

Results

The version I am describing here is the current, fully instrumented release. But the insight that validated the approach came from something far simpler.

An early prototype ran only two heuristics: the inactive domain check and the parking detection check. No lookalike scoring, no favicon hashing, no content analysis — just those two. Even with only CertStream as the data source, the results were striking.

Of every hundred domains flagged by the brand keyword filter, roughly ninety-five were resolved by those two heuristics alone. The inactive check cleared out domains that had been registered but not yet deployed — or already retired. The parking check cleared out benign infrastructure that had been registered speculatively by domain investors, not attackers. What remained — the five or so domains that were live, active, and neither parked nor obviously benign — was the set that actually warranted human attention.

In operational terms, this meant a DRP analyst who previously faced a hundred manual lookups per day now had fewer than ten to review. The pipeline had not replaced the analyst. It had given them back their day.

Discussion

The two-heuristic result is encouraging, but it is not the ceiling — it is the floor. The more sources you ingest from, and the more precisely your heuristics are calibrated, the tighter the classification becomes.

This brings us back to the two-error asymmetry. False positives are an inconvenience: the analyst reviews a benign domain and clears it. The cost is time. False negatives are a failure: a scam domain is classified as benign or irrelevant and goes unactioned. The cost is victims. BrandSentinel is deliberately calibrated toward recall over precision — when evidence is ambiguous, the verdict is INCONCLUSIVE, not BENIGN. The analyst sees it. The domain does not slip through.

As research matures, more can be layered in. New ingestion sources plug in by implementing a single Source base class. New heuristics extend HeuristicBase and register themselves; the orchestrator handles the rest. Commercial feeds — URLScan.io, OTX, VirusTotal — can join the source pool without touching the classification logic. An ML-based scoring layer can replace or complement the weighted heuristic sum for teams with labelled datasets. The architecture was designed to absorb these improvements without structural change.
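One way a self-registering base class can work in Python is through __init_subclass__. The sketch below borrows the HeuristicBase name from the project, but the evaluate method, the registry attribute, and the two example heuristics are hypothetical; the real interface may well differ.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class HeuristicBase(ABC):
    """Hypothetical base class: every subclass registers itself on definition."""

    registry: list[type[HeuristicBase]] = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        HeuristicBase.registry.append(cls)

    @abstractmethod
    def evaluate(self, ctx: dict) -> float:
        """Return a suspicion score in [0, 1] for the enriched domain context."""


class InactiveDomain(HeuristicBase):
    def evaluate(self, ctx: dict) -> float:
        return 1.0 if ctx.get("status") != 200 else 0.0


class ForeignFormAction(HeuristicBase):
    def evaluate(self, ctx: dict) -> float:
        target = ctx.get("form_action_domain")
        return 1.0 if target and target != ctx.get("domain") else 0.0


def run_all(ctx: dict) -> dict[str, float]:
    """Stand-in for the orchestrator: run every registered heuristic."""
    return {cls.__name__: cls().evaluate(ctx) for cls in HeuristicBase.registry}
```

Under this design, adding a heuristic means writing one new subclass; nothing in run_all has to change.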

Who Should Use This

Digital Risk Protection teams are the primary audience. BrandSentinel is not a replacement for experienced analysts — it is the triage layer that lets them work on what matters. Instead of spending their shift manually checking whether mybank-secure-login.xyz resolves to anything, analysts can focus on the pre-scored shortlist that BrandSentinel surfaces: live domains, with suspicious content, with brand-identical favicons, submitting credentials to a foreign server. That is a very different kind of work.

Small and medium enterprises are the second audience, and perhaps the more important one. Most SMEs do not have a DRP team. They may not even know how actively their brand is being impersonated online. Running BrandSentinel — even in its simplest configuration, with CertStream and a handful of brand keywords — produces a sobering and actionable picture of digital threat exposure. For an SME trying to decide whether to invest in a commercial DRP solution, that picture is exactly the evidence they need.

What Comes Next

BrandSentinel is open source. It is a gift to the cybersecurity community, in the same spirit as the team that first gave me this perspective. If you work in threat intelligence, digital risk, or brand protection, I hope you find it useful — and I hope you make it better. Open an issue, submit a heuristic, add a feed.

The threat is automated. The defence should be too.

What I Took Away

1. Open-source intelligence is a treasure trove — especially for those who need it most.

Open-source intelligence will not beat a dedicated proprietary threat intelligence platform on visibility, reliability, or breadth of coverage. The commercial vendors have more sources, better enrichment, and dedicated teams curating signal quality. That is simply the reality.

But here is the counterpoint: for the organisations most exposed to brand-impersonation attacks — small and medium enterprises — a commercial DRP subscription is often out of reach. The same businesses that cannot absorb the cost of a cyberattack are frequently the ones that cannot afford the tooling to prevent one. That asymmetry is where open-source intelligence earns its keep.

The insight BrandSentinel reinforced for me is that the value of OSINT is not in any single source — it is in the fusion. CertStream alone produces noise. URLhaus alone has coverage gaps. PhishTank alone misses the newest campaigns. But a pipeline that ingests all of them, deduplicates across sources, and applies a consistent classification layer produces something genuinely useful: a picture of your brand’s threat exposure that would otherwise require a commercial contract to see. For an SME deciding whether to invest in a DRP solution, BrandSentinel running for a week is a more persuasive argument than any sales deck.

2. Building a real-time, scalable ETL pipeline in Python is harder than it looks — and the right answer is to iterate, not over-engineer.

I designed BrandSentinel to be event-driven from the start. The first version used Python async functions throughout, with Redis Pub/Sub as the message bus between ingestion and classification. For a small proof-of-concept, this is clean and simple — everything lives in one file, the architecture is easy to reason about, and the latency between domain observation and verdict is low.

The problem became obvious under load. An asyncio event loop runs on a single thread, so CPU-bound work competes with I/O for that thread, and the volume of domains BrandSentinel processes is enough to make that a real bottleneck. Worse, several libraries I relied on, DNS resolution in particular, do not expose async interfaces. Every blocking call was a stall. Converting the entire codebase to async is not just tedious; it is a maintenance burden that multiplies with every new heuristic.
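The stall is easy to reproduce in isolation. In this sketch, blocking_lookup is a stand-in for any synchronous library call (time.sleep plays the role of a blocking DNS query), and asyncio.to_thread is shown only as a common workaround for comparison. It is not taken from the BrandSentinel codebase, and it is not the route I ended up taking.

```python
import asyncio
import time


def blocking_lookup(domain: str) -> str:
    """Stand-in for a synchronous call such as a blocking DNS query."""
    time.sleep(0.1)
    return f"resolved:{domain}"


async def resolve_all_blocking(domains):
    # Each call holds the event loop, so the lookups run strictly one by one.
    return [blocking_lookup(d) for d in domains]


async def resolve_all_threaded(domains):
    # asyncio.to_thread (Python 3.9+) runs each call in a worker thread,
    # so the waits overlap and the event loop stays free.
    return await asyncio.gather(
        *(asyncio.to_thread(blocking_lookup, d) for d in domains)
    )


def timed(coro_fn, domains):
    """Run one of the coroutines above and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn(domains))
    return result, time.perf_counter() - start
```

Calling timed with both functions on the same handful of domains should show the blocking version taking roughly the sum of the waits and the threaded version roughly a single wait. Wrapping every blocking call this way is exactly the kind of per-heuristic bookkeeping that becomes a maintenance burden.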

My first instinct was to reach for a microservice architecture: split ingestion, classification, and output into separate processes, use a proper message queue, and let each component scale independently. The design is sound on paper. In practice, the overhead of inter-process communication directly conflicts with BrandSentinel’s primary goal, which is the speed of scam domain detection. Latency introduced at the architecture level is just as damaging as latency introduced by slow heuristics.

The solution I landed on was Prefect — a workflow orchestration framework that let me run the full pipeline from a single Python file while genuinely parallelising I/O-bound work across tasks. No async contortions. No microservice topology. No operational complexity beyond spinning up the Prefect server. The pipeline became both faster and easier to reason about than either of its predecessors.

The lesson here is not that Prefect is the answer to every pipeline problem. The lesson is to resist the temptation to design for theoretical scale before you understand where the real bottlenecks are — and to search aggressively for existing tools before building your own solutions to solved problems. Iterative solutions are not a sign of weak engineering. They are a sign of engineering focused on the right outcome: in this case, getting a scam domain classified and surfaced to an analyst as fast as possible.

If you’re running BrandSentinel against your own brand, I’d genuinely like to hear what you find. What heuristics are generating the most signal? What sources are you missing? Leave a comment below.

Hunting Scam Domains Before They Strike with CertStream

2026-03-19T00:00:00+00:00

This post was proofread with the assistance of AI. The source code backing this story is open-sourced and available on GitHub.

Here’s something that genuinely surprised me: modern scam operations are run like software companies.

They have CI/CD pipelines. They spin up cloud infrastructure on demand. They automate the deployment of thousands of scam websites, run them briefly, then tear them down before anyone can respond. Research by Group-IB shows that organised scam syndicates have adopted the same DevOps practices your favourite tech companies use, except they’re using them to steal from people. Their investigation into Classiscam, a scam-as-a-service operation, found 38,000 members organised across Telegram groups with defined roles, using bots to spin up fake phishing pages on demand: a supply chain for fraud that earned an estimated $64.5 million across 251 brands in 79 countries.

That realisation changed how I think about cybersecurity. And it led my team to build something small, clever, and — I think — genuinely useful.

A note on where this work comes from. During my internship with the Asia-Pacific Digital Risk Protection team at a global cybersecurity company, I had the chance to observe how professional analysts actually hunt threats: the feeds they rely on, the signals they trust, and the exhausting manual workload that accumulates when the tooling can’t keep up. CertStream was already on the team’s radar. What you’re reading is my attempt to understand it properly — to pull it apart, follow it back to its roots in Certificate Transparency, and make a clear case for why every practitioner doing this work should have it in their toolkit. Whatever is useful in this post, I owe to the analysts who showed me what the work actually looks like on the ground.

The Problem with Playing Defence

The traditional playbook for tackling scam websites goes something like this: a victim reports a suspicious link → an analyst investigates → the domain gets flagged or taken down. Clean. Logical. Thoroughly outdated.

Here’s why. A scam syndicate today doesn’t need a domain to last more than a few hours. They register a new domain, run a scam campaign on it, and retire it the moment it’s been accessed once — or even sooner. By the time a victim reports the link, the website is already gone. By the time an analyst looks at it, the domain resolves to nothing. The takedown request goes nowhere.

Throwing more web crawlers at the problem sounds appealing — just enumerate all the domains! — but that’s prohibitively expensive and still fundamentally reactive. We needed a way to catch scam domains before they reach their first victim, not after.

An Unexpected Clue in Plain Sight

Here’s where it gets interesting. Modern browsers are ruthless about unencrypted or untrusted connections. Navigate to a site without a valid HTTPS certificate and you’re greeted with a giant red warning page. For a scam site trying to look legitimate, that warning is a conversion killer.

This means scam domains have to obtain a valid HTTPS certificate. There’s no way around it.

And here’s the thing about HTTPS certificates: every single one is publicly logged. Certificate Transparency (CT) is an open standard that requires Certificate Authorities to record every certificate they issue into append-only public logs. The goal is accountability — anyone can audit who issued what, to whom, and when.

But as a side effect, CT logs are a live feed of every domain that is in the process of going online. Including scam domains.

Cali Dog Security built CertStream to make this feed accessible. It’s a real-time WebSocket stream of certificate issuance events — open, free, and broadcasting millions of domain names every day. When I first heard about it, I immediately thought: this is our early warning system.

What We Built

The concept is straightforward: listen to the CertStream feed, and flag any domain that looks like it could be impersonating a brand or entity we’re protecting.

We express “looks like” as a set of regex patterns — one per line in a simple text file. To monitor for domains targeting a bank, you might write:

.*mybank.*
.*my-bank.*
.*mybank-secure.*

Our tool subscribes to the CertStream WebSocket, processes each incoming certificate event, normalises every domain in it by stripping prefixes such as www., *., and https://, and checks the result against the compiled pattern list. Any match gets logged to a SQLite database and exported for further analysis.

The flow looks like this:

CT Server → WebSocket stream → sanitise domains → match regex → store hits → export for analysis

That’s it. No web crawlers. No enumeration. No waiting for a user to report anything. The scam domains come to us.
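The sanitise, match, and store steps of that flow can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not our tool's source: the function names, the table layout, and the choice to match the whole sanitised domain with fullmatch are mine, and the WebSocket subscription itself is left out so the example runs offline.

```python
import re
import sqlite3

# The three example patterns from the text file above, one regex per line.
PATTERN_LINES = [".*mybank.*", ".*my-bank.*", ".*mybank-secure.*"]
PATTERNS = [re.compile(line, re.IGNORECASE) for line in PATTERN_LINES]

_PREFIXES = ("https://", "http://", "*.", "www.")


def sanitise(domain: str) -> str:
    """Lowercase the domain and strip scheme, wildcard, and www. prefixes."""
    domain = domain.strip().lower()
    stripped = True
    while stripped:  # repeat so stacked prefixes such as https://www. are removed
        stripped = False
        for prefix in _PREFIXES:
            if domain.startswith(prefix):
                domain = domain[len(prefix):]
                stripped = True
    return domain


def match_domains(all_domains):
    """Return the sorted, de-duplicated domains that match any pattern."""
    hits = set()
    for raw in all_domains:
        domain = sanitise(raw)
        if any(pattern.fullmatch(domain) for pattern in PATTERNS):
            hits.add(domain)
    return sorted(hits)


def store_hits(conn: sqlite3.Connection, hits) -> None:
    """Persist hits; the primary key makes repeated inserts harmless."""
    conn.execute("CREATE TABLE IF NOT EXISTS hits (domain TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR IGNORE INTO hits (domain) VALUES (?)", [(h,) for h in hits]
    )
    conn.commit()
```

In a live deployment, match_domains would be called from the WebSocket message handler with the list of domain names carried by each certificate event.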

Results

When we deployed this alongside our team’s existing domain analysis tooling, the effect was immediate: we effectively doubled our search space for suspicious domains without proportionally increasing our workload. More importantly, we were now seeing domains at the moment of certificate issuance — which is often before the site is even reachable to the public.

For a Digital Risk Protection team, that’s a significant shift. Instead of responding to threats, we were meeting them at the door.

Who Should Care About This

If you’re on a Digital Risk Protection team, this approach is directly applicable. Swap in the brand keywords you’re protecting and you have a 24/7 monitoring feed that never sleeps.

If you’re a company trying to assess your own digital risk, running this against your own brand names gives you a rough — and surprisingly sobering — sense of how actively threat actors target your identity online.

And if you’re a security researcher or student, I’d encourage you to look at this as a template for creative problem solving: what other public, boring-sounding infrastructure quietly records everything you’d ever want to know?

The natural next step is a downstream analysis stage: an AI/ML classifier that quickly triages matches, separating genuine threats from false positives, either automatically or with a human in the loop. That’s what we’re working on next with BrandSentinel. Both projects are being open-sourced as a gift to the cybersecurity community, because good tools should be shared.

What I Took Away

1. The threat landscape moves faster than you think. I knew scam campaigns were a problem. I didn’t fully appreciate that they had industrialised to the point of using orchestration frameworks and automated infrastructure. That shift changes everything about how we need to respond.

2. Cybersecurity rewards lateral thinking. The instinct is to do more of what already works — bigger crawlers, more analysts. The better question is: what does the attacker have to do that we can intercept? CT logs were the answer hiding in plain sight.

3. Open source makes this possible. CertStream is a free, open initiative. Without it, building this capability would have required months of infrastructure work. The open source community quietly lowers the barrier to entry for exactly this kind of innovation, and that’s worth celebrating.

What other open data feeds or public infrastructure do you think are sitting underused as threat intelligence sources? I’d love to hear what you’re thinking in the comments.
