Recent posts

Surviving the Blizzard: An Application of Markov Chains

In the Call of Cthulhu adventure “Chateau of Blood”, the characters must choose between spending the day trapped in an ominous chateau, where they will likely be attacked by monsters, and attempting to navigate a treacherous mountain trail as a ferocious blizzard blows. Because the adventure is inspired by Hammer films, the characters are encouraged to stay inside the chateau, find clues, and face the monsters rather than go out into almost certain doom. However, as this is a Hammer film, few are expected to survive the supernatural horrors. What are the characters’ chances in the storm? We’ll use Markov chains and a short Rust program to work out the odds.

Read more

Slaying Dragon

Slaying the Dragon: A Secret History of Dungeons & Dragons by Ben Riggs

Author’s website

Slaying the Dragon is a business history of TSR, focusing on the Lorraine Williams period (1985-1997). During this period, TSR recovered from the excesses of the Gygax/Blume Brothers period, launched the 2nd edition rules and a novels line, and attempted pivots and expansions of its IP. However, the sales decline continued until the company was purchased by Wizards of the Coast. Ben Riggs chronicles the company’s history, key products, and strategic mistakes.

Read more

Simplifying Logic

A diagram of Slack’s business logic for showing notifications periodically appears in LinkedIn posts and similar places. In Slack’s original post, the diagram was meant to illustrate what logic was being transferred from the multiple Slack clients to the server, but it has since been taken as an example of product complexity and of how development is harder than it may sound. In contrast, I think the diagram obscures the intended logic, but the logic itself is not complex. I’ll describe how to simplify it and where diagrams might not be the right approach for conveying requirements.

Read more

A Pain Scale for On-call?

Being on-call is often a necessary part of the job, particularly for engineers in a SaaS business. The burden of operations often negatively impacts morale and productivity. If we were to estimate the impact on a team, we could build a model based on incident frequency, ticket severity, time of alert versus sleeping schedules, and other operational metrics. Alternatively, we can ask the on-call engineers directly, which should be more accurate, and use the metrics to help drive improvements. This article describes an on-call experience program intended to be integrated into an operational review system.

Read more