Should operating systems be replaced with an LLM-based Universal Kernel?

Rex St John
6 min read · Dec 3, 2023


Abstract

In this article, I will explore the theoretical concept of a Universal Kernel based on LLM technology: one capable of “speaking” with thousands of devices, natively fluent in x86, Arm, RISC-V, MIPS, WASM, and other instruction sets, and able to run on any device.

The vision of such a Universal Kernel would be to completely replace all operating systems and to eliminate the distinction between different devices by having a single piece of software capable of interfacing with the file systems, underlying hardware, SoC, device drivers, and user space of any system.

Exploring the idea of a Universal Kernel based on LLMs

There has been discussion on the X platform about ways to integrate LLMs ever more directly into the operating systems of computers. Specifically, questions have arisen as to whether LLMs might enable the replacement or augmentation of the operating system as a concept.

This leads to some big questions:

  • Do LLMs spell the end of modern operating systems?
  • How might we go about initiating the process of placing LLMs at the center of a modern operating system?
  • If we DID replace operating systems with LLM-based systems, what might be the advantages?
  • What are potential downsides from doing so?

Having worked in embedded systems, IoT, edge computing, AI, AIoT, TinyML, robotics, and cloud-native technologies at Intel, Arm, and NVIDIA for the last decade, I have significant exposure to the topic of modern operating systems in production environments. I may not be the best person to answer these questions, but I have some ideas.

Table of Contents

  1. A quick rundown of modern operating systems
  2. Problems with modern operating systems
  3. Why add LLMs to modern operating systems
  4. How to arrive at a Universal Kernel LLM
  5. Potential downsides of a Universal Kernel LLM

Sound interesting? Let’s get started.

A Quick Rundown of Modern Operating Systems

Modern operating systems are commonly divided into two zones: user space, where user applications and input live, and kernel space, where system-level components, drivers, and core functionality exist. In a Linux system, the kernel is the effective “AI” of the operating system, using various algorithms to schedule and prioritize workloads.

The kernel also manages connected hardware such as storage, RAM, I/O, CPU, and GPU, relaying between these devices and the application layer to bring the entire system to life.

It is often said that “everything in Linux is a file”: all the I/O, storage, and applications in Linux are managed as though they were files. This construction means that very interesting opportunities begin to emerge once it is understood that all the underlying hardware, applications, users, storage, and I/O are modeled in such a simple fashion, despite the complexity of the kernel itself.
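To make the “everything is a file” idea concrete, here is a minimal sketch showing that on Linux, kernel state and hardware details are exposed through ordinary file reads under `/proc` (the exact entries read here are just common examples; the snippet degrades gracefully on non-Linux systems):

```python
import os

def read_proc(path):
    """Read a kernel-exposed pseudo-file such as /proc/version, if present.

    On Linux, kernel and hardware state are accessed with the same
    open/read interface used for regular files on disk.
    """
    if not os.path.exists(path):
        return None  # not on Linux, or this entry is absent
    with open(path) as f:
        return f.read().strip()

# Kernel version, CPU details, and memory stats are all just file reads:
for entry in ("/proc/version", "/proc/cpuinfo", "/proc/meminfo"):
    contents = read_proc(entry)
    print(entry, "->", (contents or "absent").splitlines()[0])
```

The same file-based interface covers devices (`/dev`), tunables (`/sys`), and per-process state (`/proc/<pid>`), which is exactly the uniformity that makes the modeling opportunity interesting.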

Problems with modern operating systems

The biggest challenges of modern operating systems revolve around the proliferation of millions of devices, often running outdated software. Every year, hardware vendors mass-produce new chipsets along with the software needed to make those chipsets function. Once these devices are out the door, vendors move on to the next generation of hardware and frequently abandon the drivers and operating systems of past products.

The end result is an incredibly fragmented world with tens of millions of devices requiring software, driver, and security updates, and little or no economic incentive for device manufacturers or anyone else to provide the necessary support. Software for embedded systems, IoT devices, mobile handsets, servers, and edge computing racks is frequently subject to disjointed, poorly functioning lifecycles that require large teams to maintain.

What are LLMs and why insert them into the Operating System?

Large Language Models have proven that they can do all kinds of things, including speaking virtually any language and writing code that works reasonably well. It is expected that improvements in LLMs will result in a world where more and more software is partially or substantially written by LLMs or some unknown successor, perhaps using Q-learning techniques (TBD).

Taking a step back, it seems possible that the following might work:

  • Train and fine-tune an LLM specifically on the I/O and observability data of the Linux kernel running on one or more machines …
  • … to create a specialized LLM that “speaks” Linux kernel and can effectively act as a black-box replacement for it.
  • If such an LLM were trained, one might imagine extending it into a “Universal Kernel” that speaks the language of thousands of device drivers, SoCs, GPUs, and storage devices, and to any microprocessor across every architecture.
  • With such a Universal Kernel LLM, it might be possible to replace modern operating systems with a single LLM (or its derivatives) that is continually trained to “speak” to any new chipset, hardware device, storage device, or CPU/GPU introduced to the market.
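To give the first step above some shape: one source of “kernel I/O data” already exists in the form of syscall traces (e.g. `strace` output). The sketch below is a purely hypothetical preprocessing step that turns trace lines into (request, response) pairs that a model could be trained on; the function names are my own invention, not part of any real pipeline:

```python
import re

# strace prints lines like:  openat(AT_FDCWD, "/etc/hostname", O_RDONLY) = 3
# i.e. "what the process asked for" and "what the kernel answered".
SYSCALL_RE = re.compile(r"^(?P<name>\w+)\((?P<args>.*)\)\s*=\s*(?P<ret>\S+)")

def trace_to_pairs(trace_lines):
    """Turn strace-style lines into (request, response) training pairs."""
    pairs = []
    for line in trace_lines:
        m = SYSCALL_RE.match(line.strip())
        if m:
            request = f"{m.group('name')}({m.group('args')})"
            pairs.append((request, m.group("ret")))
    return pairs

sample = [
    'openat(AT_FDCWD, "/etc/hostname", O_RDONLY) = 3',
    'read(3, "devbox\\n", 4096) = 7',
    'close(3) = 0',
]
print(trace_to_pairs(sample))
```

A real training corpus would of course need far richer signals than return values (memory effects, interrupts, timing), but syscall boundaries are a natural place to start framing the kernel as a sequence model.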

How might one go about training a Universal Kernel LLM?

Most modern LLMs are trained on massive internet-scale datasets using a number of specialized techniques. Creating a Universal Kernel LLM might instead involve training a specialized model to mimic the inputs and outputs of the Linux kernel as it talks to user space and the underlying hardware. Such a Universal Kernel would need to be “fluent” in the languages and behaviors of thousands of devices and running systems, and be capable of reproducing the expected I/O of a kernel on a device running in production.
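As a toy illustration of what “mimicking the kernel’s I/O” would have to mean in practice, the sketch below (all names hypothetical) replays a recorded trace of kernel responses and scores a candidate model on exact-match fidelity; unlike ordinary LLM benchmarks, a kernel stand-in would need to be bit-exact, so anything below 100% disqualifies it:

```python
# Hypothetical fluency check: replay a golden trace of (request, response)
# pairs recorded from a real kernel and measure exact-match accuracy.
def trace_accuracy(model, recorded_trace):
    """model: callable mapping a request string to a response string.
    recorded_trace: list of (request, expected_response) pairs."""
    matches = sum(1 for req, expected in recorded_trace if model(req) == expected)
    return matches / len(recorded_trace)

# Stand-in "model" that answers from a lookup table, for illustration only.
golden = [("close(3)", "0"), ("getpid()", "4242")]
lookup = dict(golden)
model = lambda req: lookup.get(req, "?")
print(trace_accuracy(model, golden))  # 1.0
```

Replaying held-out traces like this is also how one could measure whether “fluency” transfers from the training device to an unseen SoC or driver.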

One idea might be to pick a truly open source platform, such as those from SiFive. Attempting to create a Universal Kernel for platforms such as Arm, x86, or NVIDIA may run into hurdles around closed aspects of the OS, drivers, or instruction sets, making it challenging to train the LLM to completely take over the system.

RISC-V, on the other hand, has a completely open instruction set. It might be possible to start with RISC-V, pick a single target device on which to train an LLM kernel replacement, prove the concept, and then migrate to other platforms if it works.

Potential downsides to a Universal Kernel LLM

Security is the obvious concern, and correctness is another: a kernel must behave deterministically, and an LLM that hallucinates a device response or mis-schedules a workload could corrupt data or crash the system outright. For the time being, however, this idea is purely conceptual and mostly a thought experiment about how such a system might be developed.

Closing Thoughts

Over the last few weeks, we have already seen the arrival of three significant advancements in how LLMs can augment or take over aspects of modern operating systems. These include:

  • llamafile: packages an LLM and its runtime into a single executable binary
  • Self-Operating Computer: allows users to directly command their computer to carry out a series of tasks in plain English
  • Ollama and other downloadable clients for running LLMs locally on the desktop

We are starting to see a new direction in how operating systems behave, and I suspect things may go much further as the creativity of the open source ecosystem is unleashed on this topic.


Written by Rex St John

Exploring the intersection between AI, blockchain, IoT, Edge Computing and robotics. From Argentina with love.