Building fault-tolerant edge computing from the ground up, one kernel panic at a time.

What Is KryOS?

KryOS is a custom Buildroot-based embedded Linux distribution I built from scratch for my final-year B.Tech project, CryoSentinel: a fault-tolerant edge computing system for vaccine cold chain monitoring.

The core problem it solves is simple but critical: vaccines in transit or storage can fail silently. Temperature excursions go unlogged, or logs arrive too late. Existing solutions are often proprietary black boxes or consumer IoT platforms not engineered for medical-grade reliability in low-connectivity environments like rural health centers and last-mile cold-storage depots.

KryOS is the operating layer that changes that. It runs on a Raspberry Pi 4, stays intentionally minimal, and boots into a purpose-built environment where every package, service, and driver has a reason to exist.

Why Build a Custom OS?

Fair question. The default answer is usually “just use Raspberry Pi OS.” Here is why I did not:

Attack surface. A general-purpose OS ships with hundreds of packages you do not need. In a safety-critical medical context, every extra daemon is a liability: a potential crash vector, security hole, or resource drain.

Boot determinism. KryOS boots into a known-good state every time, governed by a tightly controlled init sequence. No display manager, no desktop environment, no package manager in the running image.

Learning. I wanted to understand what actually happens between power-on and a running userspace. Buildroot forced me to answer that question at every layer: toolchain, kernel config, init system, and rootfs layout.

The Stack

Layer        | Choice                              | Why
Build system | Buildroot 2024.02 LTS               | Reproducible, minimal, well-documented
Target board | Raspberry Pi 4 (4 GB)               | Available, strong community BSP support
Kernel       | Linux 6.6 LTS                       | Stable, long-term maintenance window
Init         | BusyBox init + custom S* scripts    | No systemd overhead, full visibility into boot sequence
Networking   | WPA3 WiFi, Avahi mDNS, Dropbear SSH | Secure remote access over local networks
Time sync    | Chrony + hardware RTC (DS3231)      | Accurate timestamps even without internet
Database     | SQLite                              | Embedded, zero-config, sufficient for edge telemetry
MQTT         | Eclipse Paho (C client)             | Lightweight pub/sub for upstream data relay
ML runtime   | TensorFlow Lite (in progress)       | On-device anomaly detection without cloud round-trips
Security     | iptables, read-only rootfs mounts   | Defense-in-depth for unattended deployment

Hardware Interfaces

The Pi is wired to real sensors over the buses KryOS exposes:

  • I2C: BME280 for ambient temperature/humidity, DS3231 RTC for hardware timekeeping
  • SPI: custom kryos_spi.ko kernel module (out-of-tree, cross-compiled with the Buildroot toolchain)
  • Watchdog: the SoC's on-chip hardware watchdog via the bcm2835_wdt driver; the system reboots itself if the monitoring daemon hangs
  • 1-Wire: planned support for DS18B20 temperature probes in cold storage

The SPI driver is the most recent addition: a skeleton loadable kernel module cross-compiled against the exact kernel headers baked into the Buildroot tree. Getting it to load cleanly with insmod kryos_spi.ko after fighting toolchain ABI mismatches for a week was genuinely satisfying.
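One quick check for this kind of mismatch, before calling insmod, is to compare the module's vermagic string with the running kernel release. The sketch below assumes modinfo (with -F) and uname are available on the image. The helper name vermagic_matches is hypothetical, and vermagic only covers the kernel release and a few build flags, so a match does not rule out every ABI problem.

```sh
#!/bin/sh
# Hypothetical helper: succeeds when the first word of a module's vermagic
# string equals the given kernel release (as printed by `uname -r`).
vermagic_matches() {
    [ "${1%% *}" = "$2" ]
}

# Usage on the target (not run here):
#   vm=$(modinfo -F vermagic kryos_spi.ko)
#   if vermagic_matches "$vm" "$(uname -r)"; then
#       insmod kryos_spi.ko
#   else
#       echo "vermagic '$vm' does not match kernel $(uname -r)" >&2
#   fi
```

Keeping the comparison in a pure function makes it easy to exercise on the build host, where the target's modinfo output can simply be pasted in as a string.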

The Boot Sequence

KryOS uses BusyBox init with a conventional rcS-style script chain.

Power-on -> U-Boot -> Linux kernel -> BusyBox init
  -> /etc/init.d/S01logging     # syslog
  -> /etc/init.d/S10network     # wpa_supplicant + udhcpc
  -> /etc/init.d/S20chrony      # NTP sync
  -> /etc/init.d/S30avahi       # mDNS
  -> /etc/init.d/S40dropbear    # SSH
  -> /etc/init.d/S50cryo        # CryoSentinel monitoring daemon
  -> /etc/init.d/S99motd        # MOTD + sysinfo splash

Every S* script is an explicit decision. Nothing runs unless I put it there.
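For readers who have not written one, here is a minimal sketch of what a script such as S50cryo could look like. It is not the actual KryOS script: the DAEMON and PIDFILE paths and the cryo_ctl function name are assumptions. It relies on the stock Buildroot rcS behaviour of running each /etc/init.d/S??* script in name order with the argument start, and on the BusyBox start-stop-daemon applet.

```sh
#!/bin/sh
# Sketch of an init script such as /etc/init.d/S50cryo.
# DAEMON and PIDFILE are assumed paths, not taken from the KryOS tree.
DAEMON=${DAEMON:-/usr/bin/cryo_daemon}
PIDFILE=${PIDFILE:-/var/run/cryo_daemon.pid}

cryo_ctl() {
    case "$1" in
        start)
            printf 'Starting cryo_daemon: '
            # -S start, -b run in background, -m write the pidfile
            start-stop-daemon -S -q -b -m -p "$PIDFILE" -x "$DAEMON" \
                && echo OK || echo FAIL
            ;;
        stop)
            printf 'Stopping cryo_daemon: '
            # -K sends a signal to the process recorded in the pidfile
            start-stop-daemon -K -q -p "$PIDFILE" && echo OK || echo FAIL
            ;;
        restart)
            cryo_ctl stop
            cryo_ctl start
            ;;
        *)
            echo "Usage: $0 {start|stop|restart}" >&2
            return 1
            ;;
    esac
}

# A real init script would end by dispatching on its first argument:
#   cryo_ctl "$1"
```

Because the numeric prefix alone decides ordering, moving a service earlier or later in boot is a rename rather than a dependency declaration.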

MOTD - The Login Screen

The first thing visible over SSH is the KryOS MOTD, split into two layers:

  1. Static MOTD (/etc/motd): the KryOS ASCII banner, hardcoded at build time
  2. Dynamic sysinfo (/etc/profile.d/sysinfo.sh): live system metrics on every login

The sysinfo script runs on BusyBox sh (not bash), so there are practical constraints: no arrays, limited nesting, and several shell differences that required careful debugging. It pulls uptime, CPU temperature, memory usage, IP address, and CryoSentinel daemon status.

Figure: KryOS MOTD on SSH login.

The banner reads KryOS in block ASCII. Below it: hostname, kernel version, uptime, CPU temperature (from vcgencmd measure_temp), free memory, network interface, and a live daemon status line.
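To illustrate the kind of code this implies, here is a rough sketch of sysinfo-style helpers written in plain POSIX sh. It is not the real sysinfo.sh. The function names are made up for illustration, MemAvailable is my assumption for "free memory", and awk, pidof, and vcgencmd are assumed to be present on the image.

```sh
#!/bin/sh
# POSIX sh only: no arrays, no bash-specific syntax.

# "93784.12" (first field of /proc/uptime) -> "1d 2h 3m"
fmt_uptime() {
    secs=${1%%.*}                      # drop fractional seconds
    echo "$((secs / 86400))d $((secs % 86400 / 3600))h $((secs % 3600 / 60))m"
}

# "temp=48.3'C" (vcgencmd measure_temp output) -> "48.3"
parse_cpu_temp() {
    t=${1#temp=}
    echo "${t%%[!0-9.]*}"              # cut at the first non-numeric character
}

# MemAvailable from a meminfo-style file, in whole MB
mem_available_mb() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

daemon_status() {
    if pidof cryo_daemon >/dev/null 2>&1; then echo running; else echo stopped; fi
}

print_sysinfo() {
    read -r up _ < /proc/uptime
    echo "Uptime:   $(fmt_uptime "$up")"
    echo "CPU temp: $(parse_cpu_temp "$(vcgencmd measure_temp)") C"
    echo "Memory:   $(mem_available_mb) MB available"
    echo "Daemon:   $(daemon_status)"
}
```

Splitting parsing from data collection means the string handling can be tested on a laptop with dash or BusyBox ash, which is where bash-only habits tend to surface.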

Fault Tolerance

“Fault-tolerant” is not marketing in KryOS. It means:

Hardware watchdog. If cryo_daemon stops petting /dev/watchdog, the hardware timer expires and resets the board within 15 seconds.
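The feeding logic can be sketched in a few lines of shell. This is only an illustration: the real daemon is written in C and would more likely use write() or the WDIOC_KEEPALIVE ioctl, and the wdt_feed name, the interval, and the health check are all hypothetical. It relies on the standard Linux watchdog behaviour that any write to the device resets the timer, and that closing it without first writing the magic character 'V' leaves the timer armed.

```sh
#!/bin/sh
# wdt_feed DEVICE INTERVAL CHECK [ARGS...]
# Opens DEVICE once, then writes one byte every INTERVAL seconds for as long
# as the health check CHECK keeps succeeding. When the check fails the loop
# ends without sending the magic-close character 'V', so on a real watchdog
# the timer keeps running and the board resets when it expires.
wdt_feed() {
    dev=$1
    interval=$2
    shift 2
    while "$@"; do
        printf '.' >&3        # any write counts as a pet
        sleep "$interval"
    done 3>"$dev"
}

# Example on the target (sensor_ok is a hypothetical health check):
#   wdt_feed /dev/watchdog 5 sensor_ok
```

The important design point is that the pet is conditional on a health check. A loop that pets unconditionally only proves the scheduler is alive, not that monitoring still works.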

Persistent local logging. Temperature and humidity readings are written to SQLite on a separate data partition before MQTT publish attempts. If the network is down, data is queued and replayed when connectivity returns.
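The store-then-forward pattern can be sketched with the sqlite3 command-line shell. This is an assumption-heavy illustration rather than the daemon's code: the C daemon would use the SQLite C API and the Paho client directly, and the database path, table layout, function names, and the publish command passed to replay_unsent are all hypothetical.

```sh
#!/bin/sh
# Store-then-forward sketch using the sqlite3 command-line shell.
# DB path, table layout and function names are illustrative assumptions.
DB=${DB:-/data/telemetry.db}

db_init() {
    sqlite3 "$DB" "CREATE TABLE IF NOT EXISTS readings (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        ts       INTEGER NOT NULL,            -- Unix time of the sample
        temp_c   REAL    NOT NULL,
        humidity REAL    NOT NULL,
        sent     INTEGER NOT NULL DEFAULT 0   -- 1 once published upstream
    );"
}

# log_reading TS TEMP_C HUMIDITY: always write locally first.
# Arguments are interpolated into SQL, so they must be trusted numbers.
log_reading() {
    sqlite3 "$DB" "INSERT INTO readings (ts, temp_c, humidity) VALUES ($1, $2, $3);"
}

# replay_unsent PUBLISH_CMD [ARGS...]: hand unsent rows, oldest first, to
# PUBLISH_CMD as "TS TEMP_C HUMIDITY". Stop at the first failure so the
# remaining rows stay queued for the next attempt.
replay_unsent() {
    rows=$(sqlite3 "$DB" "SELECT id, ts, temp_c, humidity FROM readings
                          WHERE sent = 0 ORDER BY id;") || return 1
    [ -n "$rows" ] || return 0
    printf '%s\n' "$rows" | while IFS='|' read -r id ts temp hum; do
        "$@" "$ts" "$temp" "$hum" </dev/null || break
        sqlite3 "$DB" "UPDATE readings SET sent = 1 WHERE id = $id;"
    done
}
```

A row is marked as sent only after the publish command reports success, so a network drop in the middle of a replay costs at most a duplicate message, never a lost reading.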

Read-only rootfs (in progress). Production builds will mount the root filesystem read-only, with a tmpfs overlay for runtime writes, to reduce the risk of filesystem corruption from abrupt power loss.

Debug init hook. A development S00debug script runs before everything else and writes checkpoints to tmpfs logs, helping isolate boot-stage failures without UART access.
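A checkpoint helper for such a hook can be very small. The sketch below is not the actual S00debug: the BOOTLOG path and the checkpoint function are assumptions, and it presumes /tmp is already mounted as tmpfs by the time the first S* script runs.

```sh
#!/bin/sh
# Checkpoint helper in the spirit of S00debug. BOOTLOG is an assumed path;
# it only needs to live on a tmpfs mount that exists this early in boot.
BOOTLOG=${BOOTLOG:-/tmp/boot-checkpoints.log}

# checkpoint MESSAGE...: append "<seconds since boot> <message>" to the log.
checkpoint() {
    if [ -r /proc/uptime ]; then
        read -r up _ < /proc/uptime
    else
        up=unknown
    fi
    printf '%s %s\n' "$up" "$*" >> "$BOOTLOG"
}

# Example: called at the top of each later S* script (hypothetical usage):
#   checkpoint "S10network: starting wpa_supplicant"
```

Using seconds since boot instead of wall-clock time keeps the log meaningful before chrony or the RTC has set the clock.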

The TFLite Incident

Integrating TensorFlow Lite into the Buildroot image broke bootability because of dependency and toolchain conflicts, and the episode ended with a corrupted filesystem on the SD card.

Recovery was intentionally methodical:

  • attach the SD card to the build host
  • run fsck.ext4 -y on the unmounted data partition
  • inspect superblock manually
  • rebuild from a known-good Buildroot defconfig
  • re-apply patches incrementally to isolate the failing package config

The debug init hook was added after this incident. TFLite integration is still in progress: the anomaly model works in isolation, but stable in-image deployment remains open.

Build Reproducibility

A design goal of KryOS is full reproducibility from source. Buildroot defconfig, kernel config (kryos_defconfig), board overlays, and custom package recipes are version-controlled. A clean make on the build host produces a flashable image.
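One simple way to check this claim is to hash the images from two clean builds and compare them. The sketch below assumes sha256sum is available on the build host; the function names and image paths are hypothetical. Bit-identical output usually needs extra care beyond pinned configs, for example Buildroot's BR2_REPRODUCIBLE option, so a mismatch here does not by itself mean the sources differ.

```sh
#!/bin/sh
# image_digest FILE: print the SHA-256 of FILE as 64 hex characters.
image_digest() {
    sha256sum "$1" | cut -d' ' -f1
}

# same_image A B: succeed only if both files exist and hash identically.
same_image() {
    [ -f "$1" ] && [ -f "$2" ] &&
        [ "$(image_digest "$1")" = "$(image_digest "$2")" ]
}

# Example after two clean builds (paths are hypothetical):
#   same_image build-a/sdcard.img build-b/sdcard.img && echo "bit-identical"
```

Recording the digest next to the git commit of the Buildroot tree gives a compact audit trail: which sources produced which image.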

That reproducibility matters in medical contexts: you need to prove what software is running, rebuild it identically, and audit it.

Current Status

Component                          | Status
Bootable KryOS image               | Working
I2C / SPI / watchdog interfaces    | Verified over SSH
CryoSentinel monitoring daemon (C) | Running
SQLite telemetry logging           | Working
MQTT relay to AWS IoT Core         | Working
Dropbear SSH + WPA3 WiFi           | Working
Split MOTD / sysinfo system        | Working
kryos_spi.ko kernel module         | Loads cleanly
Read-only rootfs                   | In progress
TFLite anomaly detection           | Debugging
OTA update mechanism               | Planned

Lessons

A few lessons from building KryOS:

A Buildroot external toolchain saves time. Pinning one known-good Linaro toolchain for both the image builds and the out-of-tree modules removed an entire class of ABI mismatch issues.

BusyBox sh is not bash. Scripts that run fine on a laptop can silently fail on the device.

Filesystem architecture matters early. A read-only rootfs, a separate writable data partition, and a tmpfs runtime layer are architectural decisions, not cleanup tasks.

UART is worth the wires. Early SSH-only debugging hid boot failures that serial logs would have surfaced immediately.

What Is Next

  • finalize TFLite integration with stable memory footprint
  • implement read-only rootfs with overlay
  • add OTA update support with an A/B partition scheme
  • complete anomaly detection model training pipeline (synthetic + real data)
  • harden MQTT TLS with mutual authentication against AWS IoT Core

KryOS and CryoSentinel are my final-year B.Tech ECE project at North-Eastern Hill University, Shillong. The name KryOS comes from Greek “kryos” meaning frost.