- Why I Did This
- Prerequisites
- Cloning the Kernel
- Creating a Case-Sensitive Workspace
- The allnoconfig Mistake
- The macos-include Shim
- A 14-Line Init Binary
- Packing the initramfs
- A Silent First Boot
- Starting Fresh with defconfig
- Missing defconfig Flags
- Image vs Image.gz
- The Silent Hello World
- Reading init/main.c
- Watching the Boot from Inside
- Where to Go from Here
- References
How I Compiled and Ran a Linux Kernel for RISC-V on My Apple Silicon Mac
I wrote this while still learning the topic. I might not have fully understood everything I put here, and parts of it might be wrong. By the time you read this, I might understand it better. If you have any comments or thoughts, I would love to hear them.
Why I Did This
I have been writing software for years but always at the higher levels. Web apps and Node apps. The kernel always felt like a black box to me. Things go in, things come out, somewhere in between something happens.
I wanted to open the box.
One thing I only understood after watching this video is that Linux is just a kernel, and a kernel alone cannot show anything to the user. The kernel boots, sets up its world, then hands control to the first userspace program it is given. That first program is called init. Linux itself does not ship with one. Init can be anything: systemd or sysvinit on a normal desktop, or my own 14-line C program in this article.
From there I started looking for how to actually build the kernel. Linux is open source, so I tried reading it. The codebase was huge and I got lost. I also realized that to even start building, I needed to pick a CPU architecture to target first.
I knew almost nothing about CPU architectures. The closest I had come was knowing my Mac uses Apple Silicon, which is ARM64, and that most PCs use x86. I started searching for the friendliest architecture to learn the kernel on, and the answer that kept coming up was RISC-V. Small, open, designed to be simple. So I picked it.
Now there was a problem. I do not have any RISC-V hardware. So I had to cross-compile: build the kernel for RISC-V on my ARM64 Mac, then boot it inside QEMU, which emulates RISC-V in software.
That meant I needed a working recipe to build Linux on macOS for RISC-V. I found it in Building Linux Kernel on macOS Natively by Seiya, which handles the build side cleanly. I picked up from there: running the kernel in QEMU, writing my own init binary, and watching the boot from the inside with lldb.
This post walks through that journey error by error. Some errors were obvious; others took hours. Neither the build nor the boot worked the first time. The first program I wrote in C had a subtle bug in its inline assembly that took me a disassembly session to spot. By the end though, I had a kernel that I built myself, booting into a userspace program that I wrote myself, both running on emulated hardware on my laptop. That moment is what makes this worth doing.
Here is what we will do:
- Set up the toolchain
- Get the source onto the Mac
- Build the kernel
- Boot it in QEMU
- Write a tiny init program in C
- Step through the boot with lldb
Prerequisites
To build a kernel on a Mac, we need several tools that macOS does not ship with by default. Most of them can be installed through Homebrew. I assume you already have Homebrew. If not, follow the installation guide at brew.sh.
Here is what we need and why:
- LLVM (clang and lld): the kernel can be built with clang via the LLVM=1 flag, which saves us from setting up a separate cross-gcc toolchain. macOS already ships with Apple’s clang, but it is older and may not support all the features the kernel build needs. Homebrew gives us a modern clang and lld in one package.
- GNU make: macOS ships only an old GNU make (version 3.81) by default, but the kernel needs a newer one. After installation we invoke it as gmake.
- coreutils: kernel build scripts use commands like nproc and head, which either are not available on macOS or have BSD versions that behave slightly differently.
- gnu-sed: kernel scripts assume GNU sed semantics.
- findutils: kernel scripts use find -printf, which BSD find does not have.
- libelf: needed by some kernel host tools to parse ELF files.
- QEMU: to actually run the kernel after we build it.
Install everything in one command:
brew install llvm make coreutils libelf gnu-sed findutils qemu

After installation, we need to set up PATH so Homebrew’s llvm and the GNU versions of coreutils, gnu-sed, and findutils come before the macOS defaults. Add these lines to ~/.zshrc:
# ~/.zshrc
LLVM_PREFIX="$(brew --prefix llvm)"
COREUTILS_PREFIX="$(brew --prefix coreutils)"
GNU_SED_PREFIX="$(brew --prefix gnu-sed)"
FINDUTILS_PREFIX="$(brew --prefix findutils)"
export PATH="$LLVM_PREFIX/bin:$PATH"
export PATH="$COREUTILS_PREFIX/libexec/gnubin:$PATH"
export PATH="$GNU_SED_PREFIX/libexec/gnubin:$PATH"
export PATH="$FINDUTILS_PREFIX/libexec/gnubin:$PATH"

Then run source ~/.zshrc or open a new terminal. To verify everything is set up correctly, run clang --version and find --version | head -1. The clang version should mention “Homebrew” rather than Apple, and find should show “GNU findutils”.
Cloning the Kernel
Now we have the toolchain. Time to get the source code. The Linux kernel lives at git.kernel.org and is mirrored to GitHub. Let’s clone it. We use --depth=1 so we only download the latest snapshot, not the full history. We do not need the history for this project.
git clone --depth=1 \
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

~/Learning/LINUX
❯ git clone --depth=1 \
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Cloning into 'linux'...
remote: Enumerating objects: 99198, done.
remote: Counting objects: 100% (99198/99198), done.
remote: Compressing objects: 100% (96420/96420), done.
remote: Total 99198 (delta 7850), reused 21828 (delta 1720), pack-reused 0 (from 0)
Receiving objects: 100% (99198/99198), 274.01 MiB | 6.43 MiB/s, done.
Resolving deltas: 100% (7850/7850), done.
Updating files: 100% (93697/93697), done.
warning: the following paths have collided (e.g. case-sensitive paths
on a case-insensitive filesystem) and only one from the same
colliding group is in the working tree:
'include/uapi/linux/netfilter/xt_CONNMARK.h'
'include/uapi/linux/netfilter/xt_connmark.h'
'include/uapi/linux/netfilter/xt_DSCP.h'
'include/uapi/linux/netfilter/xt_dscp.h'
'include/uapi/linux/netfilter/xt_MARK.h'
'include/uapi/linux/netfilter/xt_mark.h'
'include/uapi/linux/netfilter/xt_RATEEST.h'
'include/uapi/linux/netfilter/xt_rateest.h'
'include/uapi/linux/netfilter/xt_TCPMSS.h'
'include/uapi/linux/netfilter/xt_tcpmss.h'
'include/uapi/linux/netfilter_ipv4/ipt_ECN.h'
'include/uapi/linux/netfilter_ipv4/ipt_ecn.h'
'include/uapi/linux/netfilter_ipv4/ipt_TTL.h'
'include/uapi/linux/netfilter_ipv4/ipt_ttl.h'
'include/uapi/linux/netfilter_ipv6/ip6t_HL.h'
'include/uapi/linux/netfilter_ipv6/ip6t_hl.h'
'net/netfilter/xt_DSCP.c'
'net/netfilter/xt_dscp.c'
'net/netfilter/xt_HL.c'
'net/netfilter/xt_hl.c'
'net/netfilter/xt_RATEEST.c'
'net/netfilter/xt_rateest.c'
'net/netfilter/xt_TCPMSS.c'
'net/netfilter/xt_tcpmss.c'
'tools/memory-model/litmus-tests/Z6.0+pooncelock+poonceLock+pombonce.litmus'
'tools/memory-model/litmus-tests/Z6.0+pooncelock+pooncelock+pombonce.litmus'

The clone finishes, but at the very end git prints a long warning: “the following paths have collided (e.g. case-sensitive paths on a case-insensitive filesystem) and only one from the same colliding group is in the working tree”.
The kernel has files whose names differ only by capitalization. Look at the warning list. xt_CONNMARK.h and xt_connmark.h live in the same directory. Same for xt_DSCP.h and xt_dscp.h. The only difference between each pair is capitalization. On Linux these are two different files. On macOS’s APFS, which is case-insensitive by default, they are treated as the same file. Only one file from each colliding pair ends up on disk.
Even if we ignore the warning, this breaks the build. The header files we need are missing from disk because their case-collision twins took their place.
We need a case-sensitive filesystem.
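One way to check which kind of filesystem a directory sits on is to create a file and then look it up again under a different capitalization. The sketch below does that in C. The function name dir_is_case_sensitive and the probe file names are my own inventions for illustration, and the code assumes the directory is writable.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `dir` is on a case-sensitive filesystem,
 * 0 if it is case-insensitive, and -1 if the probe file cannot be created. */
int dir_is_case_sensitive(const char *dir)
{
    char upper[1024], lower[1024];
    int pid = (int)getpid();

    /* Two names that differ only by capitalization, like xt_DSCP.h and xt_dscp.h. */
    snprintf(upper, sizeof upper, "%s/CASE_PROBE_%d.tmp", dir, pid);
    snprintf(lower, sizeof lower, "%s/case_probe_%d.tmp", dir, pid);

    int fd = open(upper, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    /* On a case-insensitive volume the lowercase name resolves to the same file. */
    int lower_exists = (access(lower, F_OK) == 0);

    unlink(upper);
    return lower_exists ? 0 : 1;
}
```

Called on a default macOS home directory this should return 0, which is exactly the situation that made git drop one file from each colliding pair.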
Creating a Case-Sensitive Workspace
macOS does not let us change the case-sensitivity of our existing disk, but it does let us create a separate volume that is case-sensitive. The hdiutil command makes a sparse disk image we can attach as a volume.
I create a 20 GB sparse image at ~/Learning/LINUX/linuxkernel.dmg. Sparse means the file only grows as it is used. The kernel source plus a build directory comfortably fits in 20 GB.
hdiutil create -size 20g -fs "Case-sensitive APFS" \
-volname linuxkernel ~/Learning/LINUX/linuxkernel.dmg
hdiutil attach ~/Learning/LINUX/linuxkernel.dmg

~/Learning/LINUX
❯ hdiutil create -size 20g -fs "Case-sensitive APFS" \
-volname linuxkernel ~/Learning/LINUX/linuxkernel.dmg
created: /Users/jefrydco/Learning/LINUX/linuxkernel.dmg
~/Learning/LINUX took 5s
❯ hdiutil attach ~/Learning/LINUX/linuxkernel.dmg
/dev/disk6 GUID_partition_scheme
/dev/disk6s1 EFI
/dev/disk6s2 Apple_APFS
/dev/disk7 EF57347C-0000-11AA-AA11-0030654
/dev/disk7s1 41504653-0000-11AA-AA11-0030654 /Volumes/linuxkernel

Once attached, the volume appears at /Volumes/linuxkernel/. From now on, we work inside that volume. Re-clone the kernel there:
cd /Volumes/linuxkernel
git clone --depth=1 \
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

This time the clone finishes without the case-collision warning. All the files are on disk.
The allnoconfig Mistake
I wanted to start small. The kernel has thousands of options, and I thought I would learn faster if I began from nothing and added things only as needed. So I picked allnoconfig, which disables every option Kconfig allows.
This was the wrong choice for a first build, but I did not know that yet.
gmake ARCH=riscv LLVM=1 allnoconfig

The first thing that happened was unexpected: clang refused to run.
linux
❯ gmake ARCH=riscv LLVM=1 allnoconfig
HOSTCC scripts/basic/fixdep
HOSTCC scripts/kconfig/conf.o
HOSTCC scripts/kconfig/confdata.o
HOSTCC scripts/kconfig/expr.o
LEX scripts/kconfig/lexer.lex.c
YACC scripts/kconfig/parser.tab.[ch]
HOSTCC scripts/kconfig/lexer.lex.o
HOSTCC scripts/kconfig/menu.o
HOSTCC scripts/kconfig/parser.tab.o
HOSTCC scripts/kconfig/preprocess.o
HOSTCC scripts/kconfig/symbol.o
HOSTCC scripts/kconfig/util.o
HOSTLD scripts/kconfig/conf
clang: unknown C compiler
scripts/Kconfig.include:45: Sorry, this C compiler is not supported.
gmake[2]: *** [scripts/kconfig/Makefile:85: allnoconfig] Error 1
gmake[1]: *** [/Volumes/linuxkernel/linux/Makefile:755: allnoconfig] Error 2
gmake: *** [Makefile:248: __sub-make] Error 2

The cause was a config file I had at ~/.config/clang/arm64-apple-darwin25.cfg from another project. Clang automatically loads this on every invocation. Here is what was inside:
# ~/.config/clang/arm64-apple-darwin25.cfg
-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
-I/opt/homebrew/opt/llvm/include
-I/opt/homebrew/opt/boost/include
-L/opt/homebrew/opt/llvm/lib/c++
-L/opt/homebrew/opt/llvm/lib/unwind
-L/opt/homebrew/opt/boost/lib
-lunwind
-Wl,-rpath,/opt/homebrew/opt/llvm/lib/c++
-std=c++26

These are C++ flags I use for personal projects: a Boost include path, libc++ link paths, and -std=c++26. When the kernel build invoked clang, clang silently picked up these flags, tried to compile kernel C code with a C++26 standard, and broke.
The fix was to move the file aside:
mv ~/.config/clang/arm64-apple-darwin25.cfg \
~/.config/clang/arm64-apple-darwin25.cfg.bak

Now gmake could actually invoke clang. I re-ran allnoconfig to generate the .config file:

gmake ARCH=riscv LLVM=1 allnoconfig

❯ gmake ARCH=riscv LLVM=1 allnoconfig
#
# configuration written to .config
#

This time it succeeded. Then I tried the actual build:

gmake ARCH=riscv LLVM=1 -j$(nproc)

The -j$(nproc) flag tells make how many compile jobs to run in parallel. nproc is part of the coreutils we installed earlier and prints the number of processors on the machine, so on an 8-core Mac this becomes -j8. The kernel has thousands of independent files, so parallel compilation cuts the build time significantly.
linux on master took 14s
❯ gmake ARCH=riscv LLVM=1 -j$(nproc)
WRAP arch/riscv/include/generated/uapi/asm/errno.h
WRAP arch/riscv/include/generated/uapi/asm/fcntl.h
WRAP arch/riscv/include/generated/uapi/asm/param.h
[... truncated ...]
HOSTCC scripts/elf-parse.o
In file included from scripts/elf-parse.c:12:
In file included from scripts/elf-parse.hscripts/sorttable.c::535::
10:scripts/elf-parse.h:5: 10: fatal error: 'elf.h' file not found
5 | #include <elf.h>
fatal error: | ^~~~~~~'elf.h' file
not found
5 | #include <elf.h>
| ^~~~~~~
1 error generated.
1 error generated.
gmake[2]: *** [scripts/Makefile.host:131: scripts/elf-parse.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[2]: *** [scripts/Makefile.host:131: scripts/sorttable.o] Error 1
UPD include/config/kernel.release
UPD include/generated/utsrelease.h
gmake[1]: *** [/Volumes/linuxkernel/linux/Makefile:1356: scripts] Error 2
gmake[1]: *** Waiting for unfinished jobs....
UPD include/generated/compile.h
gmake: *** [Makefile:248: __sub-make] Error 2

This is the first kernel-side error. The build is calling host tools, programs that run on my Mac to prepare the kernel source for compilation. Those host tools include headers that do not exist on macOS. We need a different fix.
The macos-include Shim
The shim approach in this section follows Seiya’s article referenced in the intro.
The elf.h error tells us a host tool wants <elf.h>, but macOS does not ship elf.h. The libelf package we installed earlier provides the equivalent headers, just at a different path: <libelf/gelf.h>.
We need a layer of indirection. Create a directory scripts/macos-include/ and add a stub elf.h that proxies to libelf’s headers:
// scripts/macos-include/elf.h
#pragma once
#include <libelf/gelf.h>
#define STT_SPARC_REGISTER 3
#define R_386_32 1

Then re-run gmake, telling clang where to find our shim and where libelf’s headers live:
gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include"

linux on master took 6s
❯ gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include"
HOSTCC scripts/basic/fixdep
HOSTCC scripts/dtc/dtc.o
HOSTCC scripts/dtc/flattree.o
[... truncated ...]
HOSTCC scripts/elf-parse.o
In file included from scripts/elf-parse.cIn file included from :scripts/sorttable.c:3512:
:
scripts/elf-parse.hscripts/elf-parse.h::6262::2323:: error: incompatible pointer types passing 'Elf64_Off *' error: (aka 'unsigned long *') incompatible pointerto types parameter ofpassing type'Elf64_Off *' (aka 'unsigned long *') 'const uint64_t *'to
parameter (aka 'const unsigned long long *')of [-Wincompatible-pointer-types]type
'const uint64_t *'
(aka 'const unsigned long long *') [-Wincompatible-pointer-types]
62 | r e62t | urrentu renlf _eplafr_spearr.sre8r(.&re8h(d&re-h>der6-4>.ee6_4s.heo_fsfh)o;ff
) ;|
^~~~~~~~~~~~~~~~~~
[... truncated ...]
6 errors generated.
6 errors generated.
gmake[2]: *** [scripts/Makefile.host:131: scripts/elf-parse.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[2]: *** [scripts/Makefile.host:131: scripts/sorttable.o] Error 1
gmake[1]: *** [/Volumes/linuxkernel/linux/Makefile:1356: scripts] Error 2
gmake: *** [Makefile:248: __sub-make] Error 2

The build progresses past the elf.h error but hits pointer type mismatches. libelf declares ELF field types such as Elf64_Off as unsigned long, while the host tool’s helpers take pointers to uint64_t, which on macOS is unsigned long long. The build reports each of these mismatches as an error. We silence them with -Wno-incompatible-pointer-types:
gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include -Wno-incompatible-pointer-types"

Why silence rather than fix? The mismatch is cosmetic at the C type system level. libelf’s pointers point to data with the same layout as what the host tool expects, just declared with slightly different types. The tools run correctly. Fixing it properly would mean patching the kernel source files themselves, which means maintaining a local fork. Suppressing the warning only affects our build and keeps the kernel tree clean.
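To make the “same layout, different declared type” point concrete, here is a small sketch. It is not code from the kernel or from libelf, and the names DECLARED_TYPE, same_width and read_field are hypothetical. It assumes an LP64 platform such as macOS on Apple Silicon, where unsigned long and unsigned long long are both 64 bits wide.

```c
#include <string.h>

/* Names the declared type of an expression, to show that C keeps
 * `unsigned long` and `unsigned long long` apart even when both are 64-bit. */
#define DECLARED_TYPE(x) _Generic((x),          \
    unsigned long:      "unsigned long",        \
    unsigned long long: "unsigned long long",   \
    default:            "something else")

/* Hypothetical helper: 1 when the two types occupy the same number of bytes. */
int same_width(void)
{
    return sizeof(unsigned long) == sizeof(unsigned long long);
}

/* Hypothetical helper: reads a field declared as `unsigned long` into an
 * `unsigned long long` by copying bytes, so no pointer conversion is needed.
 * Assumes a little-endian host if the widths ever differ. */
unsigned long long read_field(const unsigned long *field)
{
    unsigned long long value = 0;
    memcpy(&value, field, sizeof *field);
    return value;
}
```

On such a platform same_width() returns 1 and the bytes read back unchanged, yet DECLARED_TYPE still reports two different names. That gap between identical bytes and distinct types is what the compiler is complaining about.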
linux on master took 4s
❯ gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include -Wno-incompatible-pointer-types"
[... truncated ...]
HOSTCC scripts/mod/symsearch.o
In file included from scripts/mod/symsearch.c:8:
scripts/mod/modpost.h:2:10: fatal error: 'byteswap.h' file not found
2 | #include <byteswap.h>
| ^~~~~~~~~~~~
In file included from scripts/mod/sumversion.c:13:
scripts/mod/modpost.h:2:10: fatal error: 'byteswap.h' file not found
2 | #include <byteswap.h>
| ^~~~~~~~~~~~
In file included from scripts/mod/modpost.c:28:
scripts/mod/modpost.h:2:10: fatal error: 'byteswap.h' file not found
2 | #include <byteswap.h>
| ^~~~~~~~~~~~
In file included from scripts/mod/file2alias.c:19:
scripts/mod/modpost.h:2:10: fatal error: 'byteswap.h' file not found
2 | #include <byteswap.h>
| ^~~~~~~~~~~~
1 error generated.
1 error generated.
gmake[2]: *** [scripts/Makefile.host:131: scripts/mod/symsearch.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[2]: *** [scripts/Makefile.host:131: scripts/mod/sumversion.o] Error 1
1 error generated.
gmake[2]: *** [scripts/Makefile.host:131: scripts/mod/modpost.o] Error 1
1 error generated.
gmake[2]: *** [scripts/Makefile.host:131: scripts/mod/file2alias.o] Error 1
gmake[1]: *** [/Volumes/linuxkernel/linux/Makefile:1372: prepare0] Error 2
gmake: *** [Makefile:248: __sub-make] Error 2

Next missing header: byteswap.h. macOS does not have it, but clang has builtins for byte-swapping. Stub it:
// scripts/macos-include/byteswap.h
#pragma once
#define bswap_16 __builtin_bswap16
#define bswap_32 __builtin_bswap32
#define bswap_64 __builtin_bswap64

Re-run gmake:
linux on master took 8s
❯ gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include -Wno-incompatible-pointer-types"
HOSTCC scripts/mod/modpost.o
HOSTCC scripts/mod/symsearch.o
HOSTCC scripts/mod/sumversion.o
HOSTCC scripts/mod/file2alias.o
scripts/mod/file2alias.c:112:3: error: typedef redefinition with different types ('struct uuid_t' vs '__darwin_uuid_t' (aka 'unsigned char[16]'))
112 | } uuid_t;
| ^
/Library/Developer/CommandLineTools/SDKs/MacOSX26.sdk/usr/include/sys/_types/_uuid_t.h:31:25: note: previous definition is here
31 | typedef __darwin_uuid_t uuid_t;
| ^
scripts/mod/modpost.c:1177:7: error: use of undeclared identifier 'R_386_PC32'
1177 | case R_386_PC32:
| ^~~~~~~~~~
[... truncated ...]
17 errors generated.
gmake[2]: *** [scripts/Makefile.host:131: scripts/mod/file2alias.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
16 errors generated.
gmake[2]: *** [scripts/Makefile.host:131: scripts/mod/modpost.o] Error 1
gmake[1]: *** [/Volumes/linuxkernel/linux/Makefile:1372: prepare0] Error 2
gmake: *** [Makefile:248: __sub-make] Error 2

Two issues are interleaved here.
uuid_t redefinition in file2alias.c. Both macOS’s <unistd.h> and the kernel’s host tool define uuid_t, but with different shapes. macOS’s is unsigned char[16], while the kernel’s is a struct. macOS’s header guards its typedef with an #ifndef _UUID_T check, so if we predefine _UUID_T on the command line, the header sees it as already defined and skips its own typedef. The kernel’s definition is the only one left. Add -D_UUID_T to HOSTCFLAGS.
Many missing R_* relocation constants in modpost.c and file2alias.c. The kernel’s host tools handle relocations for many CPU architectures, including x86, ARM, MIPS, and AArch64. The constants come from Linux’s elf.h. Our shim only had R_386_32. Update scripts/macos-include/elf.h with the rest:
// scripts/macos-include/elf.h
#pragma once
#include <libelf/gelf.h>
#define STT_SPARC_REGISTER 3
#define R_386_32 1
#define R_386_PC32 2
#define R_MIPS_HI16 5
#define R_MIPS_LO16 6
#define R_MIPS_26 4
#define R_MIPS_32 2
#define R_ARM_ABS32 2
#define R_ARM_REL32 3
#define R_ARM_PC24 1
#define R_ARM_CALL 28
#define R_ARM_JUMP24 29
#define R_ARM_THM_JUMP24 30
#define R_ARM_THM_PC22 10
#define R_ARM_MOVW_ABS_NC 43
#define R_ARM_MOVT_ABS 44
#define R_ARM_THM_MOVW_ABS_NC 47
#define R_ARM_THM_MOVT_ABS 48
#define R_ARM_THM_JUMP19 51
#define R_AARCH64_ABS64 257
#define R_AARCH64_PREL64 260

Re-run with both fixes, expanded elf.h and added -D_UUID_T:
gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include -Wno-incompatible-pointer-types -D_UUID_T"

linux on master took 10s
❯ gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include -Wno-incompatible-pointer-types -D_UUID_T"
HOSTCC scripts/basic/fixdep
In file included from scripts/basic/fixdep.c:94:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX26.sdk/usr/include/unistd.h:670:
/Library/Developer/CommandLineTools/SDKs/MacOSX26.sdk/usr/include/gethostuuid.h:41:25: error: expected identifier
41 | int gethostuuid(uuid_t, const struct timespec *) __API_AVAILABLE(macos(10.5)) __API_UNAVAILABLE(ios, tvos, watchos);
| ^
1 error generated.
gmake[2]: *** [scripts/Makefile.host:114: scripts/basic/fixdep] Error 1
gmake[1]: *** [/Volumes/linuxkernel/linux/Makefile:663: scripts_basic] Error 2
gmake: *** [Makefile:248: __sub-make] Error 2

-D_UUID_T blocks macOS’s uuid_t typedef, which is what we wanted, but <gethostuuid.h>, which <unistd.h> pulls in, references uuid_t and now fails. Shadow that header with an empty stub in our shim directory:
// scripts/macos-include/gethostuuid.h
#pragma once

Now the host tools build cleanly. The scripts/macos-include/ directory holds three small files: elf.h, byteswap.h, and gethostuuid.h. Two HOSTCFLAGS additions go with them: -Wno-incompatible-pointer-types and -D_UUID_T.
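The -D_UUID_T trick relies on an ordinary include-guard pattern. The sketch below reproduces it in a single file with made-up names (DEMO_UUID_T, demo_uuid_t, demo_uuid_size) so it does not collide with the real system headers. The first typedef stands in for the macOS header and the second for the kernel host tool; neither is copied from the actual sources.

```c
/* Stands in for passing -DDEMO_UUID_T on the compiler command line. */
#define DEMO_UUID_T

/* Stand-in for the system header: it only defines the type when the
 * guard macro is not already set. */
#ifndef DEMO_UUID_T
#define DEMO_UUID_T
typedef unsigned char demo_uuid_t[16];
#endif

/* Stand-in for the host tool's own definition. Because the guard was
 * predefined, this is the only typedef the compiler sees. */
typedef struct {
    unsigned char b[16];
} demo_uuid_t;

/* Hypothetical helper: compiles only if demo_uuid_t is the struct form. */
unsigned long demo_uuid_size(void)
{
    demo_uuid_t id = { {0} };
    return (unsigned long)sizeof id.b;
}
```

Deleting the first #define makes both typedefs visible at once, which should reproduce the same kind of “typedef redefinition with different types” error seen in the build log.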
linux on master took 2s
❯ gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include -Wno-incompatible-pointer-types -D_UUID_T"
HOSTCC scripts/basic/fixdep
HOSTCC scripts/dtc/dtc.o
[... truncated ...]
AR built-in.a
AR vmlinux.a
LD vmlinux.o
MODPOST vmlinux.symvers
CC .vmlinux.export.o
UPD include/generated/utsversion.h
CC init/version-timestamp.o
KSYMS .tmp_vmlinux0.kallsyms.S
AS .tmp_vmlinux0.kallsyms.o
LD .tmp_vmlinux1
NM .tmp_vmlinux1.syms
KSYMS .tmp_vmlinux1.kallsyms.S
AS .tmp_vmlinux1.kallsyms.o
LD .tmp_vmlinux2
NM .tmp_vmlinux2.syms
KSYMS .tmp_vmlinux2.kallsyms.S
AS .tmp_vmlinux2.kallsyms.o
LD vmlinux.unstripped
NM System.map
SORTTAB vmlinux.unstripped
OBJCOPY vmlinux
GEN modules.builtin.modinfo
GEN modules.builtin
OBJCOPY arch/riscv/boot/Image
Kernel: arch/riscv/boot/Image is ready
GZIP arch/riscv/boot/Image.gz
Kernel: arch/riscv/boot/Image.gz is ready

A 14-Line Init Binary
The kernel needs a userspace program to run as PID 1 after it finishes booting. On a real Linux system this would be systemd or sysvinit. We just need something tiny that proves our own code is running in userspace.
The smallest such program prints Hello, World! and loops forever. Here it is:
// /Volumes/linuxkernel/initramfs/init.c
void _start() {
const char msg[] = "Hello, World!\n";
__asm__ volatile (
"li a7, 64\n"
"li a0, 1\n"
"mv a1, %0\n"
"li a2, 14\n"
"ecall\n"
: : "r"(msg)
);
while(1);
}

Let me walk through each piece.
_start instead of main. The kernel jumps to whatever address the ELF binary sets as its entry point. On a normal system that entry is _start, provided by libc’s startup code called crt0, which sets up argc/argv and then calls your main. We are not linking libc, so we name our function _start directly and skip everything in between.
The __asm__ keyword is the GCC and Clang extension that lets us embed raw assembly instructions inside C code. The volatile keyword tells the compiler not to optimize the block away. The syscall writes bytes to a file descriptor, a side effect the compiler cannot see from the C alone.
The block invokes a Linux syscall using the RISC-V ABI.
The asm uses two RISC-V instructions: li (“load immediate”) puts a constant into a register, and mv (“move”) copies one register’s value into another. In mv a1, %0, %0 is a placeholder that the compiler replaces with whichever register holds msg. So the block sets up four register arguments and then runs ecall:
- a7 = 64: the syscall number for write on RISC-V
- a0 = 1: stdout’s file descriptor
- a1 = msg: pointer to the string to write
- a2 = 14: the byte count of “Hello, World!\n”
ecall is the instruction that traps from U-mode into S-mode to ask the kernel to run the syscall.
while(1) at the end: PID 1 cannot return. There is no caller to return to. If _start ever returns, the kernel panics because PID 1 has died. So we loop forever.
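For comparison, this is roughly what the same request looks like on a hosted system where libc is available. It is a sketch and not part of the init binary: say_hello is a hypothetical name. The three arguments to write line up with a0, a1 and a2 from the register list above, while libc’s wrapper takes care of the syscall number and the trap instruction.

```c
#include <unistd.h>

/* Hypothetical helper: writes the greeting to `fd` through libc's write()
 * wrapper and returns the number of bytes written, or -1 on error. */
long say_hello(int fd)
{
    static const char msg[] = "Hello, World!\n";

    /* sizeof counts the trailing NUL byte, so subtract 1 to get the 14
     * bytes that should actually be written. */
    return (long)write(fd, msg, sizeof msg - 1);
}
```

Calling say_hello(1) targets stdout, the same file descriptor the init program passes in a0.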
Build it with clang:
clang --target=riscv64-linux-gnu \
-static \
-nostdlib \
-fuse-ld=lld \
-o /Volumes/linuxkernel/initramfs/init \
/Volumes/linuxkernel/initramfs/init.c

Each flag matters:
- --target=riscv64-linux-gnu: tell clang to produce RISC-V Linux code instead of the host’s ARM64. This sets the cross-compile target.
- -static: link the binary statically. No dynamic loader, no shared libraries. The kernel will execute this binary directly, without any of the usual /lib/ld-linux*.so machinery, because our initramfs does not contain those files.
- -nostdlib: skip libc startup objects entirely. Without this, clang would try to link in crt0 and libc, which would clash with our hand-written _start.
- -fuse-ld=lld: use LLVM’s lld instead of the host’s default linker. macOS’s system linker only knows Mach-O, the macOS binary format. We need an ELF binary for Linux, and lld produces ELF.
We can disassemble the binary to see what clang produced:
llvm-objdump -d /Volumes/linuxkernel/initramfs/init

/Volumes/linuxkernel/initramfs via C v21.0.0-clang
❯ llvm-objdump -d /Volumes/linuxkernel/initramfs/init
/Volumes/linuxkernel/initramfs/init: file format elf64-littleriscv
Disassembly of section .text:
00000000000111fc <_start>:
111fc: 1101 addi sp, sp, -0x20
111fe: ec06 sd ra, 0x18(sp)
11200: e822 sd s0, 0x10(sp)
11202: 1000 addi s0, sp, 0x20
11204: 4501 li a0, 0x0
11206: fea40723 sb a0, -0x12(s0)
1120a: 6505 lui a0, 0x1
1120c: a2150513 addi a0, a0, -0x5df
11210: fea41623 sh a0, -0x14(s0)
11214: 646c7537 lui a0, 0x646c7
11218: 26f50513 addi a0, a0, 0x26f
1121c: fea42423 sw a0, -0x18(s0)
11220: fffff517 auipc a0, 0xfffff
11224: f7050513 addi a0, a0, -0x90
11228: 6108 ld a0, 0x0(a0)
1122a: fea43023 sd a0, -0x20(s0)
1122e: fe040513 addi a0, s0, -0x20
11232: 04000893 li a7, 0x40
11236: 4505 li a0, 0x1
11238: 85aa mv a1, a0
1123a: 4639 li a2, 0xe
1123c: 00000073 ecall
11240: a009 j 0x11242 <_start+0x46>
11242: a001 j 0x11242 <_start+0x46>

A quick word on those hex addresses before we move on, because they show up everywhere from here forward. Memory is one big array of bytes. Each byte has an index. We write those indices in hex, which is just the base-16 number system. Hex maps cleanly to binary, because each hex digit is exactly four binary bits. So 0x111fc is the number 70140 written in hex. Same number, different notation.
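The conversion is easy to check with a few lines of C. This is only an illustration, and hex_to_number and low_nibble_bits are names I made up.

```c
#include <stdlib.h>

/* Hypothetical helper: parses a hexadecimal string such as "111fc". */
unsigned long hex_to_number(const char *hex)
{
    return strtoul(hex, NULL, 16);
}

/* Hypothetical helper: writes the four binary digits of the lowest hex digit
 * of `value` into `out`, which must hold at least 5 bytes. */
void low_nibble_bits(unsigned long value, char out[5])
{
    for (int bit = 3; bit >= 0; bit--)
        out[3 - bit] = ((value >> bit) & 1UL) ? '1' : '0';
    out[4] = '\0';
}
```

hex_to_number("111fc") gives 70140, and the last hex digit c expands to the four bits 1100.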
One thing to flag before going further: these are virtual addresses, not physical RAM locations. Every userspace program sees its own private address space. The kernel sets up the MMU so the program’s virtual address 0x111fc maps to whatever physical RAM byte the kernel allocated for that page. Two programs can both claim virtual 0x10000 without conflict because each has its own page table.
Why this exact virtual address for _start? Because lld, our linker, put it there. When a linker builds a program, it picks an image base: the starting address in the program’s own address space. For static RISC-V Linux binaries, lld’s default image base is 0x10000. From that base, the linker lays out the binary in order. First comes the ELF header, a small block of metadata describing the file. Then come the program headers, which tell the kernel how to load each piece. After those, the .text section begins. .text holds our machine code, and our _start is the first function in it. So after the headers, _start lands at 0x111fc.
0x10000 +---------------------+ <-- lld's default image base
| ELF header |
| program headers |
| ... |
0x111fc +---------------------+ <-- _start (our code begins here)
| addi sp, sp, -0x20 |
| sd ra, ... |
| ... |
| ecall |
| j . |
0x11244 +---------------------+ <-- end of _start (the final 2-byte j sits at 0x11242)

If we made the binary longer or shorter, every address inside it would shift accordingly. Passing -Wl,--image-base=0x12345 to clang at link time would tell lld to start the binary at a different base.
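The address the kernel jumps to is recorded in the ELF header itself. As a rough sketch, the function below pulls the entry point out of the first 64 bytes of a 64-bit little-endian ELF file. The field offset (24 bytes in) follows the ELF64 header layout. The function name elf64_entry_point is hypothetical, and the code does no validation beyond the magic bytes, the class and the byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns e_entry from a 64-bit little-endian ELF
 * header, or 0 if the buffer does not look like one.
 * `header` must point to at least 64 bytes. */
uint64_t elf64_entry_point(const unsigned char *header)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (memcmp(header, magic, sizeof magic) != 0)
        return 0;
    /* Byte 4 is EI_CLASS (2 = 64-bit), byte 5 is EI_DATA (1 = little-endian). */
    if (header[4] != 2 || header[5] != 1)
        return 0;

    /* e_entry follows e_ident (16 bytes), e_type (2), e_machine (2)
     * and e_version (4), so it starts at offset 24. */
    uint64_t entry = 0;
    for (int i = 7; i >= 0; i--)
        entry = (entry << 8) | header[24 + i];
    return entry;
}
```

Fed the first 64 bytes of the init binary, this should return the same 0x111fc that the disassembly shows for _start.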
The kernel itself sits at high addresses like 0xffffffff80c00360, set by the kernel’s linker script. The userspace stack lives near the top of user-accessible memory, chosen by the kernel at runtime when it creates our process. None of these are random. Every address in this article comes from somewhere concrete: a config file, a linker script, a CPU spec, or a runtime decision made by code.
The disassembly looks fine at first glance. It loads the values and runs ecall. I will come back to it later. There is something subtly wrong here, but I did not notice it until much later, after the kernel was running and the program was producing no output.
Packing the initramfs
The kernel can boot from an initramfs, but only if we give it one. An initramfs is a cpio archive containing the files we want present at boot. The kernel ships a tool called gen_init_cpio that builds this archive from a simple spec file. We need to compile it ourselves and then run it.
The first compile attempt hits a familiar problem:
cc /Volumes/linuxkernel/linux/usr/gen_init_cpio.c \
-o /Volumes/linuxkernel/linux/usr/gen_init_cpio

linux on master [?]
❯ cc /Volumes/linuxkernel/linux/usr/gen_init_cpio.c \
-o /Volumes/linuxkernel/linux/usr/gen_init_cpio
/Volumes/linuxkernel/linux/usr/gen_init_cpio.c:460:16: error: call to undeclared function 'copy_file_range'; ISO C99 and later do not
support implicit function declarations [-Wimplicit-function-declaration]
460 | this_read = copy_file_range(file, NULL, outfd, NULL, size, 0);
| ^
/Volumes/linuxkernel/linux/usr/gen_init_cpio.c:677:31: error: use of undeclared identifier 'O_LARGEFILE'
677 | O_WRONLY | O_CREAT | O_LARGEFILE | O_TRUNC,
| ^~~~~~~~~~~
2 errors generated.

O_LARGEFILE is a Linux-specific fcntl flag. macOS does not need it because all file operations are 64-bit by default. We add a shim that wraps macOS’s <fcntl.h> and defines the missing flag as 0:
// scripts/macos-include/fcntl.h
#pragma once
#include_next <fcntl.h>
#define O_LARGEFILE 0

Re-compile, this time pointing at our shim:
cc -I/Volumes/linuxkernel/linux/scripts/macos-include \
/Volumes/linuxkernel/linux/usr/gen_init_cpio.c \
-o /Volumes/linuxkernel/linux/usr/gen_init_cpio

linux on master
❯ cc -I/Volumes/linuxkernel/linux/scripts/macos-include \
/Volumes/linuxkernel/linux/usr/gen_init_cpio.c \
-o /Volumes/linuxkernel/linux/usr/gen_init_cpio
/Volumes/linuxkernel/linux/usr/gen_init_cpio.c:460:16: error: call to undeclared function 'copy_file_range'; ISO C99 and later do not
support implicit function declarations [-Wimplicit-function-declaration]
460 | this_read = copy_file_range(file, NULL, outfd, NULL, size, 0);
| ^
1 error generated.

Next: copy_file_range. It is a Linux syscall that macOS’s libc does not have. We stub it as an always-failing function so gen_init_cpio’s code falls back to plain read/write:
// scripts/macos-include/unistd.h
#pragma once
#include_next <unistd.h>
#include <sys/types.h>
/* Always report failure so the caller falls back to read()/write(). */
static inline ssize_t copy_file_range(int a, void *b, int c, void *d, size_t e, unsigned int f) { return -1; }

Re-compile. No errors. Now we have gen_init_cpio. Write the spec file describing what should be in the archive:
# /Volumes/linuxkernel/initramfs.txt
dir /dev 755 0 0
nod /dev/console 644 0 0 c 5 1
file /init /Volumes/linuxkernel/initramfs/init 755 0 0
The nod line is the key. Reading the fields: /dev/console is the path, 644 is the file mode, the two 0s set the owner and group to root, c marks it as a character device, and 5 1 are the major and minor numbers that Linux reserves for the system console. gen_init_cpio records all of this inside the cpio archive directly, without needing a device node on the macOS filesystem at all. macOS does ship mknod28, but on a modern Mac it cannot create device nodes on regular filesystems anyway, so the cpio approach sidesteps the question entirely.
The node only needs to exist inside the archive, which Linux unpacks into its own tmpfs at boot. The kernel needs this device file to exist because, when it starts our init process, it opens /dev/console to wire up stdin, stdout, and stderr. Without the device, our write syscall to fd 1 would have nowhere to go.
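To make the nod line less abstract, here is a minimal sketch of the header such an entry turns into, assuming the archive uses the "newc" cpio layout (a 6-byte magic followed by 13 fields of 8 hex digits each). The function name newc_chardev_header is my own, the inode and mtime values are placeholders, and I am assuming the leading slash is dropped from the stored name. This is not the code from gen_init_cpio.c, only an illustration of what gets recorded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A newc header is a 6-byte magic plus 13 fields of 8 hex digits each. */
#define NEWC_MAGIC "070701"
#define NEWC_HEADER_LEN 110

/* File-type bits that mark a mode word as a character device. */
#define MODE_CHAR_DEVICE 0020000u

/*
 * Format the newc header for a character device node into `out`.
 * `name` is the archive path without the leading slash (assumption),
 * for example "dev/console". Returns the number of characters written.
 */
static int newc_chardev_header(char *out, size_t out_size, const char *name,
                               unsigned perm, unsigned uid, unsigned gid,
                               unsigned major, unsigned minor)
{
    assert(out != NULL && name != NULL);
    assert(out_size > NEWC_HEADER_LEN);

    return snprintf(out, out_size,
                    "%s"
                    "%08X%08X%08X%08X"  /* ino, mode, uid, gid */
                    "%08X%08X%08X"      /* nlink, mtime, filesize */
                    "%08X%08X%08X%08X"  /* devmajor, devminor, rdevmajor, rdevminor */
                    "%08X%08X",         /* namesize, check */
                    NEWC_MAGIC,
                    1u,                          /* inode number (placeholder) */
                    MODE_CHAR_DEVICE | perm,     /* type bits plus permissions */
                    uid, gid,
                    1u,                          /* link count */
                    0u,                          /* mtime (placeholder) */
                    0u,                          /* device nodes carry no data */
                    0u, 0u,                      /* device of the containing fs (unused here) */
                    major, minor,                /* the 5 and 1 from the nod line */
                    (unsigned)strlen(name) + 1u, /* name length including the NUL */
                    0u);                         /* checksum field, unused in this format */
}
```

Calling newc_chardev_header(buf, sizeof buf, "dev/console", 0644, 0, 0, 5, 1) fills buf with a 110-character header in which the mode field combines the character-device bit with 644 and the last device pair holds 5 and 1. The node exists only as these bytes until Linux unpacks the archive.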
Pack it:
/Volumes/linuxkernel/linux/usr/gen_init_cpio /Volumes/linuxkernel/initramfs.txt \
| gzip > /Volumes/linuxkernel/initramfs.cpio.gz
The pipeline writes the cpio archive into /Volumes/linuxkernel/initramfs.cpio.gz and produces no terminal output. We are ready to boot.
A Silent First Boot
We have an Image, an initramfs.cpio.gz, and a tiny init binary inside it. Let’s run it.
QEMU boots with the RISC-V virt29 machine, takes our kernel, our initramfs, and redirects all I/O to our terminal:
qemu-system-riscv64 \
-M virt \
-kernel /Volumes/linuxkernel/linux/arch/riscv/boot/Image.gz \
-initrd /Volumes/linuxkernel/initramfs.cpio.gz \
-nographic \
-append "console=ttyS0"
/Volumes/linuxkernel
❯ qemu-system-riscv64 \
-M virt \
-kernel /Volumes/linuxkernel/linux/arch/riscv/boot/Image.gz \
-initrd /Volumes/linuxkernel/initramfs.cpio.gz \
-nographic \
-append "console=ttyS0"
OpenSBI v1.7
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|
Platform Name : riscv-virtio,qemu
Platform Features : medeleg
Platform HART Count : 1
Platform IPI Device : aclint-mswi
Platform Timer Device : aclint-mtimer @ 10000000Hz
Platform Console Device : uart8250
Platform HSM Device : ---
Platform PMU Device : ---
Platform Reboot Device : syscon-reboot
Platform Shutdown Device : syscon-poweroff
Platform Suspend Device : ---
Platform CPPC Device : ---
Firmware Base : 0x80000000
Firmware Size : 317 KB
Firmware RW Offset : 0x40000
Firmware RW Size : 61 KB
Firmware Heap Offset : 0x46000
Firmware Heap Size : 37 KB (total), 2 KB (reserved), 11 KB (used), 23 KB (free)
Firmware Scratch Size : 4096 B (total), 1400 B (used), 2696 B (free)
Runtime SBI Version : 3.0
Standard SBI Extensions : time,rfnc,ipi,base,hsm,srst,pmu,dbcn,fwft,legacy,dbtr,sse
Experimental SBI Extensions : none
Domain0 Name : root
Domain0 Boot HART : 0
Domain0 HARTs : 0*
Domain0 Region00 : 0x0000000000100000-0x0000000000100fff M: (I,R,W) S/U: (R,W)
Domain0 Region01 : 0x0000000010000000-0x0000000010000fff M: (I,R,W) S/U: (R,W)
Domain0 Region02 : 0x0000000002000000-0x000000000200ffff M: (I,R,W) S/U: ()
Domain0 Region03 : 0x0000000080040000-0x000000008004ffff M: (R,W) S/U: ()
Domain0 Region04 : 0x0000000080000000-0x000000008003ffff M: (R,X) S/U: ()
Domain0 Region05 : 0x000000000c400000-0x000000000c5fffff M: (I,R,W) S/U: (R,W)
Domain0 Region06 : 0x000000000c000000-0x000000000c3fffff M: (I,R,W) S/U: (R,W)
Domain0 Region07 : 0x0000000000000000-0xffffffffffffffff M: () S/U: (R,W,X)
Domain0 Next Address : 0x0000000080200000
Domain0 Next Arg1 : 0x0000000087e00000
Domain0 Next Mode : S-mode
Domain0 SysReset : yes
Domain0 SysSuspend : yes
Boot HART ID : 0
Boot HART Domain : root
Boot HART Priv Version : v1.12
Boot HART Base ISA : rv64imafdch
Boot HART ISA Extensions : sstc,zicntr,zihpm,zicboz,zicbom,sdtrig,svadu
Boot HART PMP Count : 16
Boot HART PMP Granularity : 2 bits
Boot HART PMP Address Bits : 54
Boot HART MHPM Info : 16 (0x0007fff8)
Boot HART Debug Triggers : 2 triggers
Boot HART MIDELEG : 0x0000000000001666
Boot HART MEDELEG : 0x0000000000f4b509
OpenSBI30 starts up, prints its banner with platform info, and then the screen just stops. No kernel boot messages. No “Hello, World!”. Nothing.
I had no idea what was wrong. The kernel was supposed to take over from OpenSBI and print its own boot messages. It did not. I searched through the kernel documentation and Stack Overflow looking for an answer.
The pattern that kept coming up was clear: almost nobody starts a kernel build from allnoconfig. With allnoconfig we had turned off everything Kconfig would let us, which includes the parts of the kernel needed to even talk to the user. No HVC_RISCV_SBI so no console output. No BLK_DEV_INITRD so no initramfs support. No BINFMT_ELF so no way to run our compiled init binary. The recommended starting point is defconfig31, a per-architecture default config that has all the basics enabled.
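In hindsight, a quick check of .config for those three options would have saved me some searching. Here is a rough sketch of such a check in C. The function name config_is_builtin is mine, and it only looks for lines of the exact form CONFIG_<name>=y, so options built as modules or left unset both count as "off".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if `config_text` (the contents of a .config file) has a
 * line that is exactly "CONFIG_<symbol>=y". Disabled options appear as
 * "# CONFIG_<symbol> is not set" and therefore do not match.
 */
static bool config_is_builtin(const char *config_text, const char *symbol)
{
    char wanted[128];
    int len = snprintf(wanted, sizeof wanted, "CONFIG_%s=y", symbol);
    assert(len > 0 && (size_t)len < sizeof wanted);

    const char *line = config_text;
    while (line != NULL && *line != '\0') {
        const char *end = strchr(line, '\n');
        size_t line_len = end ? (size_t)(end - line) : strlen(line);

        /* Compare whole lines so a longer symbol name cannot match by prefix. */
        if (line_len == (size_t)len && strncmp(line, wanted, line_len) == 0)
            return true;

        line = end ? end + 1 : NULL;
    }
    return false;
}
```

Fed the text of an allnoconfig .config, this would presumably report false for HVC_RISCV_SBI, BLK_DEV_INITRD, and BINFMT_ELF alike.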
So I started over with defconfig.
Starting Fresh with defconfig
Switching to defconfig means wiping the current kernel tree and starting again. But we do not want to lose the scripts/macos-include/ directory we just built. Those shim files are local additions, not part of the kernel source, so they would disappear with a fresh clone.
Before deleting the tree, I copy the macos-include/ directory out of it into ~/Learning/LINUX/macos-include/. From now on, the shim files live there and I link them back into the kernel tree with a symlink. That way, any time I re-clone, the shims survive. The init.c and the packed initramfs.cpio.gz live in /Volumes/linuxkernel/initramfs/ and /Volumes/linuxkernel/, outside the kernel tree, so they survive the reset too. Only the .config and the build artifacts get wiped.
rm -rf /Volumes/linuxkernel/linux
cd /Volumes/linuxkernel
git clone --depth=1 https://github.com/torvalds/linux.git
ln -s ~/Learning/LINUX/macos-include linux/scripts/macos-include
cd linux
gmake ARCH=riscv LLVM=1 defconfig
What this does:
- rm -rf deletes the allnoconfig tree
- git clone --depth=1 brings in a fresh kernel
- ln -s symlinks our preserved shim directory back into scripts/macos-include/
- gmake ... defconfig generates a fresh .config from the RISC-V default config
defconfig first builds the kconfig9 tools, then runs them to generate the .config file:
linux on master
❯ gmake ARCH=riscv LLVM=1 defconfig
HOSTCC scripts/basic/fixdep
HOSTCC scripts/kconfig/conf.o
HOSTCC scripts/kconfig/confdata.o
HOSTCC scripts/kconfig/expr.o
LEX scripts/kconfig/lexer.lex.c
YACC scripts/kconfig/parser.tab.[ch]
HOSTCC scripts/kconfig/lexer.lex.o
HOSTCC scripts/kconfig/menu.o
HOSTCC scripts/kconfig/parser.tab.o
HOSTCC scripts/kconfig/preprocess.o
HOSTCC scripts/kconfig/symbol.o
HOSTCC scripts/kconfig/util.o
HOSTLD scripts/kconfig/conf
*** Default configuration is based on 'defconfig'
#
# configuration written to .config
#
Missing defconfig Flags
With defconfig in place, I rebuild the kernel with the macos-include shims still pointed to via HOSTCFLAGS:
gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include -Wno-incompatible-pointer-types -D_UUID_T"
This build takes considerably longer than the allnoconfig one. defconfig enables a lot of drivers and features, so there is a lot more code to compile. Good time for a break.
The build runs through without errors. Image and Image.gz are ready. Time to boot:
qemu-system-riscv64 \
-M virt \
-kernel /Volumes/linuxkernel/linux/arch/riscv/boot/Image.gz \
-initrd /Volumes/linuxkernel/initramfs.cpio.gz \
-nographic \
-append "console=ttyS0"
Still silent. OpenSBI banner, then nothing. The same silence we saw before.
Back to searching. The answer came from reading QEMU’s virt machine documentation alongside the kernel’s HVC Kconfig file. For QEMU’s RISC-V virt machine, the right console driver is HVC_RISCV_SBI32, which talks through OpenSBI’s SBI33 debug console called DBCN34. It turns out defconfig also does not enable this driver by default, and it depends on a config option called NONPORTABLE which is also off by default.
Enable both:
./scripts/config --enable NONPORTABLE
./scripts/config --enable HVC_RISCV_SBI
Verify they are on in .config:
grep -E "^CONFIG_NONPORTABLE|^CONFIG_HVC_RISCV_SBI" .config
linux on master [?]
❯ grep -E "^CONFIG_NONPORTABLE|^CONFIG_HVC_RISCV_SBI" .config
CONFIG_NONPORTABLE=y
CONFIG_HVC_RISCV_SBI=y
There is one more thing to do before rebuilding. Enabling NONPORTABLE unlocks a handful of new config options that have to be answered. If we just start the build, gmake will pause and ask each question on the terminal, one at a time. To accept the defaults for all of them at once, run olddefconfig first:
gmake ARCH=riscv LLVM=1 olddefconfig
This is also the routine recovery step to run whenever you switch kernel versions, check out a different commit, or toggle config options. It walks the existing .config and silently accepts the default for any option that is new or has changed, so the next build starts from a consistent state.
linux on master took 3m18s
❯ gmake ARCH=riscv LLVM=1 olddefconfig
HOSTCC scripts/basic/fixdep
HOSTCC scripts/kconfig/conf.o
HOSTCC scripts/kconfig/confdata.o
HOSTCC scripts/kconfig/expr.o
HOSTCC scripts/kconfig/lexer.lex.o
HOSTCC scripts/kconfig/menu.o
HOSTCC scripts/kconfig/parser.tab.o
HOSTCC scripts/kconfig/preprocess.o
HOSTCC scripts/kconfig/symbol.o
HOSTCC scripts/kconfig/util.o
HOSTLD scripts/kconfig/conf
#
# configuration written to .config
#
Now rebuild the kernel with the new config:
gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include -Wno-incompatible-pointer-types -D_UUID_T"
Image vs Image.gz
With HVC_RISCV_SBI enabled, the kernel can talk through the SBI debug console. We update the QEMU command line to use it. console=hvc0 selects the SBI console as the main console; earlycon=sbi35 adds an early-stage console for messages that come before the main one initializes, so we do not miss anything in early boot.
qemu-system-riscv64 \
-M virt \
-kernel /Volumes/linuxkernel/linux/arch/riscv/boot/Image.gz \
-initrd /Volumes/linuxkernel/initramfs.cpio.gz \
-nographic \
-append "earlycon=sbi console=hvc0"
Still silent. OpenSBI banner, then nothing.
This took a while to figure out. The answer is in QEMU’s source code. The function that loads the kernel for RISC-V is riscv_load_kernel in hw/riscv/boot.c. It tries three loaders in order:
- load_elf_ram_sym reads the first bytes and checks for ELF’s magic bytes36. Our Image.gz is gzipped, so it has gzip’s magic bytes, not ELF’s. The ELF loader rejects it.
- load_uimage_as checks for U-Boot uImage magic bytes. Same story, our file is not a uImage. Rejected.
- load_image_targphys_as is the unconditional fallback. It does not check any magic. It just copies the file bytes into emulated RAM at the kernel load address.
By the third step the file has loaded successfully, but it loaded as raw gzipped bytes. After OpenSBI hands off, the CPU jumps to the kernel address and tries to decode the gzip header as RISC-V instructions. Those bytes are nonsense as instructions. The CPU faults, but no console driver is alive yet to tell us about it. So we see silence.
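The magic-byte checks described above can be sketched in a few lines of C. This is my own illustration, not QEMU's code: the enum and the function name classify_image are made up, and the gzip case exists only to show what the file really was. The loader chain described above has no such case, which is why the bytes were copied in raw.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum image_kind { IMAGE_ELF, IMAGE_UIMAGE, IMAGE_GZIP, IMAGE_UNKNOWN };

/* Classify a file by its leading magic bytes. */
static enum image_kind classify_image(const unsigned char *buf, size_t len)
{
    static const unsigned char elf_magic[4]    = { 0x7f, 'E', 'L', 'F' };
    static const unsigned char uimage_magic[4] = { 0x27, 0x05, 0x19, 0x56 };
    static const unsigned char gzip_magic[2]   = { 0x1f, 0x8b };

    assert(buf != NULL);

    if (len >= 4 && memcmp(buf, elf_magic, 4) == 0)
        return IMAGE_ELF;       /* what an ELF loader would accept */
    if (len >= 4 && memcmp(buf, uimage_magic, 4) == 0)
        return IMAGE_UIMAGE;    /* what a uImage loader would accept */
    if (len >= 2 && memcmp(buf, gzip_magic, 2) == 0)
        return IMAGE_GZIP;      /* our Image.gz: matches neither check above */
    return IMAGE_UNKNOWN;       /* anything else, including the raw Image */
}
```

Running the first few bytes of Image.gz through this returns IMAGE_GZIP. A two-line check like that, before handing the file to QEMU, would have pointed at the problem right away.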
The fix is simple: use the uncompressed Image instead of Image.gz. Both files are produced by the build. Image is the same kernel, just not gzipped:
qemu-system-riscv64 \
-M virt \
-kernel /Volumes/linuxkernel/linux/arch/riscv/boot/Image \
-initrd /Volumes/linuxkernel/initramfs.cpio.gz \
-nographic \
-append "earlycon=sbi console=hvc0"
linux on master took 12s
❯ qemu-system-riscv64 \
-M virt \
-kernel /Volumes/linuxkernel/linux/arch/riscv/boot/Image \
-initrd /Volumes/linuxkernel/initramfs.cpio.gz \
-nographic \
-append "earlycon=sbi console=hvc0"
OpenSBI v1.7
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|
Platform Name : riscv-virtio,qemu
Platform HART Count : 1
[... truncated ...]
Runtime SBI Version : 3.0
Standard SBI Extensions : time,rfnc,ipi,base,hsm,srst,pmu,dbcn,fwft,legacy,dbtr,sse
[... truncated ...]
Boot HART ID : 0
Boot HART Base ISA : rv64imafdch
[... truncated ...]
[ 0.000000] Booting Linux on hartid 0
[ 0.000000] Linux version 7.1.0-rc3 (jefrydco@jefrydco-macbook-personal.local) (Homebrew clang version 22.1.4, Homebrew LLD 22.1.4) #1 SMP PREEMPT Mon May 11 08:01:53 WIB 2026
[ 0.000000] Machine model: riscv-virtio,qemu
[ 0.000000] SBI specification v3.0 detected
[ 0.000000] SBI DBCN extension detected
[ 0.000000] earlycon: sbi0 at I/O port 0x0 (options '')
[ 0.000000] printk: legacy bootconsole [sbi0] enabled
[ 0.000000] Kernel command line: earlycon=sbi console=hvc0
[... truncated ...]
[ 0.256876] Unpacking initramfs...
[... truncated ...]
[ 1.123813] Freeing unused kernel image (initmem) memory: 2484K
[ 1.124509] Run /init as init process
Output! The kernel boots, prints its early messages like “Booting Linux on hartid37 0”, unpacks the initramfs, finds /init, and runs it.
The Silent Hello World
After Run /init as init process the screen showed nothing else. No “Hello, World!”, no crash, no panic. The init binary ran, did not crash, and looped forever in while(1), exactly as written. The kernel could not tell me what was wrong because there was no error to print. So I went to look at what the compiler actually produced for the asm block. The llvm-objdump38 tool turns a binary back into readable assembly:
llvm-objdump -d /Volumes/linuxkernel/initramfs/init
/Volumes/linuxkernel/initramfs via C v21.0.0-clang
❯ llvm-objdump -d /Volumes/linuxkernel/initramfs/init
/Volumes/linuxkernel/initramfs/init: file format elf64-littleriscv
Disassembly of section .text:
00000000000111fc <_start>:
111fc: 1101 addi sp, sp, -0x20
111fe: ec06 sd ra, 0x18(sp)
11200: e822 sd s0, 0x10(sp)
11202: 1000 addi s0, sp, 0x20
11204: 4501 li a0, 0x0
11206: fea40723 sb a0, -0x12(s0)
1120a: 6505 lui a0, 0x1
1120c: a2150513 addi a0, a0, -0x5df
11210: fea41623 sh a0, -0x14(s0)
11214: 646c7537 lui a0, 0x646c7
11218: 26f50513 addi a0, a0, 0x26f
1121c: fea42423 sw a0, -0x18(s0)
11220: fffff517 auipc a0, 0xfffff
11224: f7050513 addi a0, a0, -0x90
11228: 6108 ld a0, 0x0(a0)
1122a: fea43023 sd a0, -0x20(s0)
1122e: fe040513 addi a0, s0, -0x20
11232: 04000893 li a7, 0x40
11236: 4505 li a0, 0x1
11238: 85aa mv a1, a0
1123a: 4639 li a2, 0xe
1123c: 00000073 ecall
11240: a009 j 0x11242 <_start+0x46>
11242: a001 j 0x11242 <_start+0x46>
It took me a while to read through the disassembly. The crucial line in my asm was mv a1, %0. I had thought %0 would somehow stand for msg itself, but the compiler actually replaces it with the name of whatever register holds msg.
Here is what really happened:
- The constraint "r"(msg) told the compiler to put msg into some register, without saying which one.
- The compiler picked a0. Look at addi a0, s0, -0x20 right before the asm body. That is the compiler putting the stack address of msg into a0.
- The asm body then ran in order: li a7, 64, then li a0, 1. The second instruction loaded the number 1 into a0, overwriting the address of msg.
- mv a1, %0 got replaced with mv a1, a0, copying the now-overwritten value 1 into a1. So a1 held 1, not the address of msg.
When ecall ran, the registers held a0=1, a1=1, a2=14, a7=64. The kernel saw this as write(1, (char *)1, 14): write 14 bytes from address 1 to file descriptor 1. Address 1 is not real memory. The syscall returned -EFAULT, an error code meaning “bad address”, and nothing was printed.
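The same failure can be reproduced on an ordinary host without any kernel build. The sketch below is my own: it writes 14 bytes "from address 1" into a pipe and returns the errno the kernel reports. I use a pipe rather than stdout because some targets, /dev/null for example, may accept a write without ever reading the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Ask the kernel to write 14 bytes starting at address 1 into a pipe.
 * Returns the errno reported, or 0 if the write unexpectedly succeeds.
 */
static int errno_for_write_from_address_one(void)
{
    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);

    /* volatile keeps the compiler from reasoning about the bogus pointer. */
    volatile uintptr_t bogus = 1;

    errno = 0;
    ssize_t written = write(fds[1], (const void *)bogus, 14);
    int saved = (written < 0) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return saved;
}
```

On the systems I would expect this to run on, the result is EFAULT. No signal and no crash, just an error return that my init never looked at, which matches the silence after Run /init as init process.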
What I needed was a way to tell the compiler exactly which register each value should be in. I looked up how to do this and found the syntax. The register keyword in C is a hint to the compiler to keep a variable in a register. On its own it does not do much; compilers manage registers automatically without it. When you combine it with an asm(...) clause naming a specific register, like asm("a0"), after the variable name, the two together become a GCC and Clang extension called local register variables39. They pin the variable to that specific hardware register. Both keywords have to be there together: asm(...) after a variable declaration only works if you also write register in front.
The RISC-V Linux syscall convention tells us which register each value needs to be in. a7 holds the syscall number, and a0 through a5 hold the arguments. For write(fd, buf, count), that means a0 is the fd, a1 is the buffer pointer, and a2 is the byte count. With each value pinned to its named register, the asm body itself only needs to do the ecall:
// /Volumes/linuxkernel/initramfs/init.c
void _start() {
const char msg[] = "Hello, World!\n";
register long a0 asm("a0") = 1;
register const char *a1 asm("a1") = msg;
register long a2 asm("a2") = 14;
register long a7 asm("a7") = 64;
__asm__ volatile (
"ecall\n"
: "+r"(a0)
: "r"(a1), "r"(a2), "r"(a7)
: "memory"
);
while(1);
}
Rebuild, disassemble again:
clang --target=riscv64-linux-gnu \
-static \
-nostdlib \
-fuse-ld=lld \
-o /Volumes/linuxkernel/initramfs/init \
/Volumes/linuxkernel/initramfs/init.c
llvm-objdump -d /Volumes/linuxkernel/initramfs/init
/Volumes/linuxkernel/initramfs via C v21.0.0-clang
❯ llvm-objdump -d /Volumes/linuxkernel/initramfs/init
/Volumes/linuxkernel/initramfs/init: file format elf64-littleriscv
Disassembly of section .text:
00000000000111fc <_start>:
111fc: 7139 addi sp, sp, -0x40
111fe: fc06 sd ra, 0x38(sp)
11200: f822 sd s0, 0x30(sp)
11202: 0080 addi s0, sp, 0x40
11204: 4501 li a0, 0x0
11206: fea40723 sb a0, -0x12(s0)
1120a: 6505 lui a0, 0x1
1120c: a2150513 addi a0, a0, -0x5df
11210: fea41623 sh a0, -0x14(s0)
11214: 646c7537 lui a0, 0x646c7
11218: 26f50513 addi a0, a0, 0x26f
1121c: fea42423 sw a0, -0x18(s0)
11220: fffff517 auipc a0, 0xfffff
11224: f7050513 addi a0, a0, -0x90
11228: 6108 ld a0, 0x0(a0)
1122a: fea43023 sd a0, -0x20(s0)
1122e: 4505 li a0, 0x1
11230: fca43c23 sd a0, -0x28(s0)
11234: fe040513 addi a0, s0, -0x20
11238: fca43823 sd a0, -0x30(s0)
1123c: 4539 li a0, 0xe
1123e: fca43423 sd a0, -0x38(s0)
11242: 04000513 li a0, 0x40
11246: fca43023 sd a0, -0x40(s0)
1124a: fd843503 ld a0, -0x28(s0)
1124e: fd043583 ld a1, -0x30(s0)
11252: fc843603 ld a2, -0x38(s0)
11256: fc043883 ld a7, -0x40(s0)
1125a: 00000073 ecall
1125e: fca43c23 sd a0, -0x28(s0)
11262: a009 j 0x11264 <_start+0x68>
11264: a001 j 0x11264 <_start+0x68>
Look at the four ld instructions right before ecall. Each one loads a value into a specific named register: a0 for the fd, a1 for the buffer pointer, a2 for the byte count, a7 for the syscall number. The pinning worked.
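The pinned-register pattern generalizes into a small helper. The sketch below is a variation of my own, not the init.c used in this article: write_bytes is a made-up name, the RISC-V branch uses the same local register variables as above, and the #else branch falls back to the libc write so the function can also be tried on the host. It pulls in <unistd.h>, so it is meant for a normal hosted build, not the -nostdlib init.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define SYS_WRITE_RISCV 64 /* syscall number for write, as used above */

/* Write `count` bytes from `buf` to `fd`; returns the kernel's result. */
static long write_bytes(int fd, const void *buf, size_t count)
{
    assert(buf != NULL);
#if defined(__riscv) && defined(__linux__)
    /* Pin each value to the register the syscall convention expects. */
    register long a0 __asm__("a0") = fd;
    register long a1 __asm__("a1") = (long)buf;
    register long a2 __asm__("a2") = (long)count;
    register long a7 __asm__("a7") = SYS_WRITE_RISCV;

    __asm__ volatile ("ecall"
                      : "+r"(a0)
                      : "r"(a1), "r"(a2), "r"(a7)
                      : "memory");
    return a0; /* bytes written, or a negative errno value */
#else
    /* Host fallback so the sketch can be exercised off-target. */
    return (long)write(fd, buf, count);
#endif
}
```

A call such as write_bytes(1, msg, sizeof msg - 1) also removes the hand-counted 14: for the array "Hello, World!\n", sizeof msg - 1 is the length without the trailing NUL.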
Before booting, we need to repack the initramfs so the new init binary lands inside the cpio archive that the kernel reads. The on-disk binary is fresh, but the cpio archive still contains the old one until we rebuild it:
/Volumes/linuxkernel/linux/usr/gen_init_cpio /Volumes/linuxkernel/initramfs.txt \
| gzip > /Volumes/linuxkernel/initramfs.cpio.gz
Boot once more:
qemu-system-riscv64 \
-M virt \
-kernel /Volumes/linuxkernel/linux/arch/riscv/boot/Image \
-initrd /Volumes/linuxkernel/initramfs.cpio.gz \
-nographic \
-append "earlycon=sbi console=hvc0"
linux on master took 4m28s
❯ qemu-system-riscv64 \
-M virt \
-kernel /Volumes/linuxkernel/linux/arch/riscv/boot/Image \
-initrd /Volumes/linuxkernel/initramfs.cpio.gz \
-nographic \
-append "earlycon=sbi console=hvc0"
OpenSBI v1.7
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|
Platform Name : riscv-virtio,qemu
Platform HART Count : 1
[... truncated ...]
Runtime SBI Version : 3.0
Standard SBI Extensions : time,rfnc,ipi,base,hsm,srst,pmu,dbcn,fwft,legacy,dbtr,sse
[... truncated ...]
Boot HART ID : 0
Boot HART Base ISA : rv64imafdch
[... truncated ...]
[ 0.000000] Booting Linux on hartid 0
[ 0.000000] Linux version 7.1.0-rc3 (jefrydco@jefrydco-macbook-personal.local) (Homebrew clang version 22.1.4, Homebrew LLD 22.1.4) #1 SMP PREEMPT Mon May 11 08:01:53 WIB 2026
[ 0.000000] Machine model: riscv-virtio,qemu
[ 0.000000] SBI specification v3.0 detected
[ 0.000000] SBI DBCN extension detected
[ 0.000000] earlycon: sbi0 at I/O port 0x0 (options '')
[ 0.000000] printk: legacy bootconsole [sbi0] enabled
[ 0.000000] Kernel command line: earlycon=sbi console=hvc0
[... truncated ...]
[ 0.263436] Unpacking initramfs...
[... truncated ...]
[ 1.167737] Freeing unused kernel image (initmem) memory: 2484K
[ 1.168408] Run /init as init process
Hello, World!
The first program I wrote in C ran on a kernel I built myself, on RISC-V emulated by QEMU on my ARM64 Mac. The message printed. That is the moment that made the whole journey worth it.
Reading init/main.c
Hello, World prints. Everything from CPU power-on to our 14-line init program ran. The kernel did a lot of work in between, and most of it is driven from init/main.c. This section walks through that file. By recognizing the parts while things work, you will know where to look when something breaks.
The file has two functions worth knowing about: start_kernel and kernel_init.
start_kernel is the kernel’s main. It does most of the setup work:
- Sets up memory: figures out what RAM is available so the kernel can hand out memory to anything that asks for it later.
- Sets up interrupts and traps: tells the CPU how to handle timer ticks, page faults, and system calls.
- Sets up the scheduler: the part that decides which task gets to run on the CPU next.
- Sets up the console: the reason we see boot messages on our terminal.
By the time start_kernel finishes, the kernel is alive and printing to our console. But there is no /init running yet. That happens next.
kernel_init is the function that runs after start_kernel. It is the bridge from “kernel running on bare hardware” to “our userspace program running”. It:
- Waits for initramfs.cpio.gz to unpack into memory.
- Opens /dev/console so that file descriptors 0, 1, 2 work for any program it runs.
- Calls run_init_process("/init"), the function that prints Run /init as init process and then hands control to our binary.
Below are common symptoms, each paired with where to look in the source code:
- No kernel messages at all, only the OpenSBI banner: the early console did not come up. Look at setup_arch in arch/riscv/kernel/setup.c and the earlycon=sbi parsing in setup_earlycon inside drivers/tty/serial/earlycon.c. This is the same fix we used when we first added earlycon=sbi.
- Boot stops somewhere between OpenSBI and Run /init: something in start_kernel or kernel_init panicked or hung. The last line printed before silence is your landmark. Open init/main.c, grep for the exact text, and read what runs immediately after.
- Run /init as init process and then silence: the kernel did its job. The bug is in your /init binary. We just experienced this in the previous section.
- Warning: unable to open an initial console: console_on_rootfs failed. Either /dev/console is missing from your initramfs, or the console driver was not built into the kernel. Our work on HVC_RISCV_SBI covered the second case.
- Kernel panic with No working init found: the kernel finished its setup but could not find /init in your initramfs. Check that you actually packed it in.
Knowing these landmarks turns “the boot is broken” into a much shorter list of places to check.
The kernel has its own version of printf called printk40. Modern kernel code usually calls it through wrappers such as pr_info, pr_warn, and pr_err. Each wrapper calls printk at a different log level. The output goes to the same console our boot log uses. You can add pr_info("hello from my code\n"); anywhere in the kernel, rebuild, boot, and look for the line. This is the fastest way to confirm a landmark you found in the source actually ran.
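To see what "a different log level" means mechanically, here is a userspace mock-up. As I understand it, the kernel's level macros are short string prefixes (a \001 byte followed by a digit) glued onto the format string, and printk reads the digit back out. Everything named mock_ below is my own stand-in, and the default level of 4 is an assumption. The real printk does far more than this.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Level prefixes: a start-of-header byte followed by one digit. */
#define KERN_SOH     "\001"
#define KERN_ERR     KERN_SOH "3"
#define KERN_WARNING KERN_SOH "4"
#define KERN_INFO    KERN_SOH "6"

static int last_level = -1;     /* level of the most recent message */
static char last_message[256];  /* text of the most recent message */

/* Userspace stand-in for printk: strip the level prefix, then format. */
static int mock_printk(const char *fmt, ...)
{
    assert(fmt != NULL);

    int level = 4; /* assumed default when no prefix is present */
    if (fmt[0] == '\001' && fmt[1] >= '0' && fmt[1] <= '7') {
        level = fmt[1] - '0';
        fmt += 2;
    }

    va_list args;
    va_start(args, fmt);
    int n = vsnprintf(last_message, sizeof last_message, fmt, args);
    va_end(args);

    last_level = level;
    return n;
}

/* The format must be a string literal so it can be joined to the prefix. */
#define pr_err(...)  mock_printk(KERN_ERR __VA_ARGS__)
#define pr_warn(...) mock_printk(KERN_WARNING __VA_ARGS__)
#define pr_info(...) mock_printk(KERN_INFO __VA_ARGS__)
```

With this in place, pr_info("hello from %s\n", "start_kernel") records level 6 and the formatted text, while pr_err records level 3. The wrappers differ only in the prefix they prepend.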
Let me show this with the two landmarks we just named. Open init/main.c and find the start_kernel function. Near the end of it, just before it calls rest_init, add:
pr_info("hello from start_kernel\n");
Now find kernel_init lower in the same file. At the top of the function body, just before the call to wait_for_completion(&kthreadd_done), add:
pr_info("hello from kernel_init\n");
Rebuild the kernel:
gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include -Wno-incompatible-pointer-types -D_UUID_T"
Boot it:
qemu-system-riscv64 \
-M virt \
-kernel /Volumes/linuxkernel/linux/arch/riscv/boot/Image \
-initrd /Volumes/linuxkernel/initramfs.cpio.gz \
-nographic \
-append "earlycon=sbi console=hvc0"
The boot log now carries two new lines, one from each function:
linux on master took 22s
❯ qemu-system-riscv64 \
-M virt \
-kernel /Volumes/linuxkernel/linux/arch/riscv/boot/Image \
-initrd /Volumes/linuxkernel/initramfs.cpio.gz \
-nographic \
-append "earlycon=sbi console=hvc0"
OpenSBI v1.7
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|
Platform Name : riscv-virtio,qemu
Platform HART Count : 1
[... truncated ...]
Runtime SBI Version : 3.0
Standard SBI Extensions : time,rfnc,ipi,base,hsm,srst,pmu,dbcn,fwft,legacy,dbtr,sse
[... truncated ...]
Boot HART ID : 0
Boot HART Base ISA : rv64imafdch
[... truncated ...]
[ 0.000000] Booting Linux on hartid 0
[ 0.000000] Linux version 7.1.0-rc3-dirty (jefrydco@jefrydco-macbook-personal.local) (Homebrew clang version 22.1.4, Homebrew LLD 22.1.4) #2 SMP PREEMPT Mon May 11 09:42:18 WIB 2026
[ 0.000000] Machine model: riscv-virtio,qemu
[ 0.000000] SBI specification v3.0 detected
[ 0.000000] SBI DBCN extension detected
[ 0.000000] earlycon: sbi0 at I/O port 0x0 (options '')
[ 0.000000] printk: legacy bootconsole [sbi0] enabled
[ 0.000000] Kernel command line: earlycon=sbi console=hvc0
[... truncated ...]
[ 0.031952] hello from start_kernel
[ 0.039693] hello from kernel_init
[... truncated ...]
[ 0.267852] Unpacking initramfs...
[... truncated ...]
[ 1.177287] Freeing unused kernel image (initmem) memory: 2484K
[ 1.178003] Run /init as init process
Hello, World!
You can add a pr_info anywhere and learn whether that code path runs. That is the whole technique.
printk is enough to answer “did this run?”. For “what value is in this register?” or “we want to pause the kernel mid-boot”, the next section attaches lldb to a running QEMU.
Watching the Boot from Inside
printk answers “did this run?”. For deeper questions, like “what is the value of this register?” or “we want to pause the kernel and look at memory”, we need a debugger. lldb plus QEMU’s gdb stub gives us that.
The setup has four steps:
- Build the kernel with debug symbols so the debugger knows the names of functions and variables.
- Start QEMU with its gdb stub turned on and the CPU paused, so we can attach the debugger before any kernel code runs.
- Attach lldb to the gdb stub.
- Walk through the boot.
For lldb to map addresses to source lines, the kernel must be built with DWARF41 symbols. It turns out defconfig ships with DEBUG_INFO_NONE turned on, which means the kernel is built without any debug information. We need to turn that off and enable DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT instead. While we are there, GDB_SCRIPTS adds helper scripts that gdb-compatible debuggers can load:
./scripts/config --disable DEBUG_INFO_NONE
./scripts/config --enable DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
./scripts/config --enable GDB_SCRIPTS
Verify both are on in .config:
grep -E "^CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT|^CONFIG_GDB_SCRIPTS" .config
linux on master [?]
❯ grep -E "^CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT|^CONFIG_GDB_SCRIPTS" .config
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_GDB_SCRIPTS=y
Then run olddefconfig and rebuild:
gmake ARCH=riscv LLVM=1 olddefconfig
gmake ARCH=riscv LLVM=1 -j$(nproc) \
HOSTCFLAGS="-Iscripts/macos-include -I$(brew --prefix libelf)/include -Wno-incompatible-pointer-types -D_UUID_T"
olddefconfig reconciles the three toggles with any other config options that depend on them, accepting the default for anything new. Then the build runs against a consistent .config.
This produces a new vmlinux42 at the top of the kernel tree. vmlinux is the unstripped ELF kernel binary, with all symbol tables intact. lldb reads vmlinux to learn where each function lives. The Image file we have been booting from, in arch/riscv/boot/, is vmlinux with debug info stripped and the ELF wrapper removed.
Now start QEMU with two new flags:
qemu-system-riscv64 \
-M virt \
-kernel /Volumes/linuxkernel/linux/arch/riscv/boot/Image \
-initrd /Volumes/linuxkernel/initramfs.cpio.gz \
-nographic \
-append "earlycon=sbi console=hvc0" \
-s -S
The new flags:
- -s turns on QEMU’s gdbstub43 on TCP port 1234. Any debugger that speaks the GDB protocol can connect.
- -S with a capital S tells QEMU to start the guest CPU paused. Without this, the kernel would run past start_kernel and beyond before we have any chance to attach.
The terminal sits there waiting. QEMU has loaded the kernel into RAM and is paused at the reset vector44. No instructions have executed yet.
Open a second terminal. Start lldb with the unstripped kernel binary:
lldb /Volumes/linuxkernel/linux/vmlinux
linux on master via 🐍 v3.14.5
❯ lldb /Volumes/linuxkernel/linux/vmlinux
(lldb) target create "/Volumes/linuxkernel/linux/vmlinux"
Current executable set to '/Volumes/linuxkernel/linux/vmlinux' (riscv64).
lldb reads vmlinux, learns every function in the kernel, and shows a prompt. It is not yet connected to QEMU.
Connect to QEMU’s gdb stub and set a breakpoint45 at the kernel’s entry C function:
(lldb) gdb-remote localhost:1234
(lldb) breakpoint set --name start_kernel
(lldb) continue
gdb-remote localhost:1234 opens a TCP connection to QEMU. breakpoint set asks QEMU, over the GDB wire protocol, to stop when execution reaches start_kernel. continue releases the paused CPU. QEMU runs from the reset vector, through the architecture-specific assembly in head.S whose details I am still learning, and stops at start_kernel:
(lldb) gdb-remote localhost:1234
Process 1 stopped
* thread #1, stop reason = signal SIGTRAP
frame #0: 0x0000000000001000
-> 0x1000: auipc t0, 0x0
0x1004: addi a2, t0, 0x28
0x1008: csrr a0, mhartid
0x100c: ld a1, 0x20(t0)
(lldb) breakpoint set --name start_kernel
Breakpoint 1: where = vmlinux`start_kernel + 12 at main.c:1019:8, address = 0xffffffff80c00360
(lldb) continue
Process 1 resuming
Process 1 stopped
* thread #1, stop reason = breakpoint 1.1
frame #0: 0xffffffff80c00360 vmlinux`start_kernel at main.c:1019:8
1016 asmlinkage __visible __init __no_sanitize_address __noreturn __no_stack_protector
1017 void start_kernel(void)
1018 {
-> 1019 char *command_line;
1020 char *after_dashes;
1021
1022 set_task_stack_end_magic(&init_task);
We are standing inside the kernel’s first C function. Every register value, every variable, every memory address is inspectable from this point.
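The GDB wire protocol that lldb and QEMU speak here is simpler than it sounds. As far as I understand it, every request is framed as a dollar sign, the payload, a hash sign, and a two-digit hex checksum that is the sum of the payload bytes modulo 256. The sketch below builds such a frame. The function name gdb_frame_packet is mine, and it ignores the escaping that payloads containing $, # or } would need.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Frame `payload` as "$<payload>#<checksum>", where the checksum is the
 * sum of the payload bytes modulo 256, written as two hex digits.
 * Returns the frame length, or -1 if `out` is too small.
 */
static int gdb_frame_packet(char *out, size_t out_size, const char *payload)
{
    assert(out != NULL && payload != NULL);

    unsigned char sum = 0;
    for (const char *p = payload; *p != '\0'; p++)
        sum = (unsigned char)(sum + (unsigned char)*p);

    int n = snprintf(out, out_size, "$%s#%02x", payload, sum);
    return (n > 0 && (size_t)n < out_size) ? n : -1;
}
```

For the one-letter payload g, which asks the stub for the general registers, the frame comes out as $g#67, because the ASCII code of g is 0x67. A register read in lldb presumably ends up as a small exchange of frames like this over port 1234.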
Let me jump to a more interesting place. run_init_process is the function we named in the previous section as the one that prints Run /init as init process and hands off to our binary. Set a breakpoint there, continue, and inspect the argument:
(lldb) breakpoint set --name run_init_process
(lldb) continue
(lldb) register read a0
(lldb) memory read --format s --count 1 $a0
When the breakpoint hits, we are at the moment the kernel is about to call our init binary. The first argument is in a0 per the RISC-V calling convention15. run_init_process takes one argument, const char *init_filename, so a0 should hold a pointer to the string /init:
(lldb) breakpoint set --name run_init_process
Breakpoint 2: where = vmlinux`run_init_process + 18 at main.c:1507:15, address = 0xffffffff80002012
(lldb) continue
Process 1 resuming
Process 1 stopped
* thread #1, stop reason = breakpoint 2.1
frame #0: 0xffffffff80002012 vmlinux`run_init_process(init_filename="/init") at main.c:1507:15
1504 {
1505 const char *const *p;
1506
-> 1507 argv_init[0] = init_filename;
1508 pr_info("Run %s as init process\n", init_filename);
1509 pr_debug(" with arguments:\n");
1510 for (p = argv_init; *p; p++)
(lldb) register read a0
a0 = 0xffffffff8134d295
(lldb) memory read --format s --count 1 $a0
0xffffffff8134d295: "/init"
memory read --format s reads the address in $a0 and interprets the bytes as a null-terminated string. The kernel really is about to run /init, not some other binary.
One more stop. start_thread is the architecture-specific function the kernel calls to set up a new userspace thread’s state before the CPU returns to U-mode21. On RISC-V it takes three arguments: a pointer to the task’s saved register state, the program counter46 where userspace should begin executing, and the initial stack pointer47. Set the breakpoint and read the three argument registers:
(lldb) breakpoint set --name start_thread
(lldb) continue
(lldb) register read a0 a1 a2

(lldb) breakpoint set --name start_thread
Breakpoint 3: where = vmlinux`start_thread + 24 at process.c:147:15, address = 0xffffffff8001411c
(lldb) continue
Process 1 resuming
Process 1 stopped
* thread #1, stop reason = breakpoint 3.1
frame #0: 0xffffffff8001411c vmlinux`start_thread(regs=0xff2000000000bee0, pc=70140, sp=140737414192480) at process.c:147:15
144 void start_thread(struct pt_regs *regs, unsigned long pc,
145 unsigned long sp)
146 {
-> 147 regs->status = SR_PIE;
148 if (has_fpu()) {
149 regs->status |= SR_FS_INITIAL;
150 /*
(lldb) register read a0 a1 a2
a0 = 0x0000000000000020
a1 = 0x00000000000111fc
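a2 = 0x00007ffffb945d60
a1 is the program counter: 0x111fc is 70140 in decimal, matching pc in the frame line, and it is the entry address of _start from our init ELF. a2 is the top of the userspace stack the kernel allocated for the new process, and it matches sp in the frame line. a0 reads 0x20 rather than the regs pointer the frame line shows. A likely explanation is that the breakpoint sits 24 bytes into the function, so the compiler has already moved regs elsewhere and reused a0; 0x20 is also the value of SR_PIE, the constant that line 147 is about to store. The frame line gets regs from the debug info, so trust that value.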
After start_thread returns and the kernel transitions back to U-mode via sret48, the CPU’s program counter becomes the value from a1 and execution jumps into our 14 lines of inline assembly. Hello World prints.
We just watched the boot from the inside: reset vector through start_kernel, through kernel_init, through run_init_process, through start_thread, into userspace. Every transition we read about in the previous section just happened live, with breakpoints and register reads as proof.
Where to Go from Here
What I built works, but understanding it deeply is its own journey. Here is what I am looking at next, in case it helps you pick a direction.
Run a real userspace. Our 14-line init proves the kernel can hand off to userspace, but it can only print one line. Replace it with BusyBox to get a working shell along with ls, cat, and the rest. Build BusyBox statically for RISC-V, drop the binary into the initramfs, point /init at BusyBox’s init, and reboot.
Reference: https://busybox.net/
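Add a custom syscall. Pick an unused number, add an entry to the syscall table, implement the handler with a SYSCALL_DEFINE macro, and rebuild. As far as I can tell, arch/riscv/kernel/syscall_table.c only builds the table array from the generic list, which lives in scripts/syscall.tbl on recent kernels and in include/uapi/asm-generic/unistd.h on older ones, so that list is where the new entry goes. From userspace, call it through syscall(NR_my_call, ...). This is the cleanest way to feel the userspace-to-kernel contract from both sides.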
Reference: https://docs.kernel.org/process/adding-syscalls.html
I am still working through the fundamentals myself. The textbooks on my list:
- Computer Systems: A Programmer’s Perspective by Bryant and O’Hallaron for how registers, memory, and assembly fit together
- xv6 from MIT for a kernel small enough to read end to end
- The RISC-V Reader by Waterman and Patterson for a compact overview of the ISA
- The C Programming Language by Kernighan and Ritchie for the language
I got this far. I have done the hardest part: opened the box and looked inside. Everything from here is exploring the contents, one piece at a time.
References
- Building Linux Kernel on macOS Natively by Seiya
- BusyBox
- Homebrew
- Linux kernel: Adding a New System Call
- Linux kernel: Early Userspace Support
- Linux kernel: HVC Kconfig
- Linux kernel: Linux Allocated Devices
- QEMU: RISC-V virt Machine
- QEMU source: riscv_load_kernel in hw/riscv/boot.c
Footnotes
-
The world where regular programs run, separate from the kernel. The CPU enforces a hard boundary: userspace code runs at a lower privilege level and cannot directly touch hardware or kernel memory. To do anything across that boundary, like reading a file, writing to the screen, or exiting the program, userspace asks the kernel through a syscall. Reference: https://man7.org/linux/man-pages/man2/intro.2.html ↩ ↩2
-
The first userspace program the kernel runs after booting. The kernel looks for /init in the initramfs and executes it. Common implementations on real Linux systems are systemd, sysvinit, and OpenRC. Reference: https://man7.org/linux/man-pages/man7/boot.7.html ↩
-
Apple’s branding for ARM-based Macs starting with the M1. The CPU architecture is ARM64. Reference: https://support.apple.com/en-us/HT211814 ↩
-
An open-source CPU architecture. Anyone can implement a chip without paying licensing fees. The instruction set is small and readable, which makes it popular for learning. Reference: https://riscv.org/about/ ↩
-
Building a binary for a different architecture than the machine you’re building on. Compiling a RISC-V kernel on an ARM Mac is cross-compilation. Reference: https://clang.llvm.org/docs/CrossCompilation.html ↩ ↩2
-
LLVM’s linker. Comes with clang. On macOS it’s already installed once you brew install llvm. Reference: https://lld.llvm.org/index.html ↩ ↩2
-
Executable and Linkable Format. The standard binary format on Linux. A compiled program, a shared library, even an object file are all ELF. Reference: https://man7.org/linux/man-pages/man5/elf.5.html ↩ ↩2 ↩3
-
Apple File System. The default filesystem on modern macOS. Case-insensitive by default. Reference: https://support.apple.com/guide/disk-utility/file-system-formats-dsku19ed921c/mac ↩
-
The kernel’s configuration system. Files literally named Kconfig describe options. Tools in scripts/kconfig/ parse them and produce a .config file. Reference: https://docs.kernel.org/kbuild/kconfig-language.html ↩ ↩2
-
Compiler intrinsics provided by clang and gcc that map directly to CPU instructions. __builtin_bswap16, __builtin_bswap32, and __builtin_bswap64 swap byte order in 16-, 32-, and 64-bit integers respectively. Reference: https://clang.llvm.org/docs/LanguageExtensions.html#builtin-bswap16-builtin-bswap32-builtin-bswap64 ↩
-
Process ID 1. The first userspace process the kernel runs. If PID 1 dies, the kernel panics. Reference: https://man7.org/linux/man-pages/man7/boot.7.html ↩
-
C standard library. Provides functions like printf, malloc, strcpy, plus the startup code that calls your main. We skip it in our init to keep the binary tiny. Reference: https://sourceware.org/glibc/manual/2.39/html_node/Introduction.html ↩
-
Tiny startup code linked into a binary by default. It sets up the stack, calls main, then calls exit when main returns. Without crt0, you have to write _start yourself. Reference: https://sourceware.org/git/?p=glibc.git;a=tree;f=csu ↩
-
System call. The way a regular program asks the kernel to do something it can’t do on its own, like reading a file or writing to a screen. On RISC-V, you put the number in register a7, the arguments in a0 to a5, then run ecall. Reference: https://man7.org/linux/man-pages/man2/syscalls.2.html ↩
-
Application Binary Interface. Where an API tells you what functions exist, an ABI tells you exactly how to call them: which registers hold arguments, which holds the return value, how the stack is laid out. Reference: https://github.com/riscv-non-isa/riscv-elf-psabi-doc ↩ ↩2
-
A small storage cell built directly into the CPU. Reading or writing a register is much faster than memory, because the register lives inside the CPU itself while memory sits outside. An analogy: memory is the pantry, huge but you have to walk to it; the register is the small workspace on the counter next to the stove, tiny but instantly within reach. RISC-V has 32 general-purpose registers, named x0 through x31, with ABI names like a0-a7 for arguments, t0-t6 for temporaries, and s0-s11 for saved values. Reference: RISC-V Unprivileged ISA spec at https://github.com/riscv/riscv-isa-manual ↩
-
A small integer the kernel hands back when a program opens something. The thing opened can be a regular file, a device driver, a network socket, a pipe, or the console. The shell | operator works because the kernel can connect one program’s output fd 1 to the next program’s input fd 0. Every process starts with three fds already open: 0 for input, 1 for output, 2 for errors. Reference: https://man7.org/linux/man-pages/man2/open.2.html ↩
-
A number that holds the memory address of some data, not the data itself. Picture a sticky note with a locker number on it: the note is tiny, but the locker is where the stuff actually lives. To get the stuff, you read the number, walk to that locker, and open it. In our asm, a1 = msg writes the locker number where “Hello, World!\n” lives into a1. The kernel then uses that number to walk to the locker and read the bytes there. Reference: https://en.cppreference.com/w/c/language/pointer ↩
-
Environment Call instruction. The RISC-V trap instruction. From U-mode it traps into S-mode, where the kernel runs. It’s how a syscall is initiated. Reference: Unprivileged ISA spec at https://github.com/riscv/riscv-isa-manual ↩
-
A controlled jump from a less-privileged mode to a more-privileged one. Triggered by exceptions, interrupts, or the ecall instruction. Reference: Privileged Architecture spec at https://github.com/riscv/riscv-isa-manual ↩
-
RISC-V has three privilege levels. M-mode is the machine mode, used by the firmware. S-mode is the supervisor mode, used by the kernel. U-mode is the user mode, used by regular programs. M is most privileged, U is least. Reference: Privileged Architecture spec at https://github.com/riscv/riscv-isa-manual ↩ ↩2
-
The native binary format on macOS and iOS. Different from ELF used on Linux. Mach-O binaries have their own magic bytes and structure, so a linker for one format cannot produce the other. Reference: https://developer.apple.com/library/archive/documentation/DeveloperTools/Conceptual/MachORuntime/ ↩
-
Memory Management Unit. The hardware that translates virtual addresses, which are the ones your program sees, into physical addresses, which are the actual locations of the data in RAM. Every program has its own virtual address space, so two programs can both use the address 0x10000 without conflict. The MMU translates each program’s virtual address into a different physical RAM location behind the scenes, and also enforces permissions like read-only. Reference: Privileged Architecture spec at https://github.com/riscv/riscv-isa-manual ↩
-
The starting address where the linker places a program in memory. lld picks a default based on the target platform. The default can be overridden with -Wl,--image-base=ADDR at link time. Reference: https://lld.llvm.org/ ↩
-
A region of memory that holds temporary data with last-in-first-out rules. Picture a stack of plates fresh from the dishwasher: every new plate goes on top, and when someone wants a plate they take from the top. Programs use the stack to track function calls, local variables, and the return path to the caller. The stack pointer is the marker at the top: every push moves it down, every pop moves it back up. Reference: RISC-V calling convention at https://github.com/riscv-non-isa/riscv-elf-psabi-doc ↩
-
Initial RAM Filesystem. A cpio archive the kernel unpacks into memory at boot. The kernel runs /init from it. Reference: https://docs.kernel.org/filesystems/ramfs-rootfs-initramfs.html ↩
-
Copy In, Out. A simple Unix archive format. Used for the kernel initramfs because the kernel’s unpacker doesn’t need a real filesystem yet. Reference: https://www.gnu.org/software/cpio/manual/cpio.html ↩
-
A Unix command that creates special files like device nodes. For example, mknod /dev/console c 5 1 makes a character device named /dev/console with major number 5 and minor number 1. Reference: https://man7.org/linux/man-pages/man1/mknod.1.html ↩
-
QEMU’s generic virtual platform. -M virt gives you a synthetic board with a CPU, memory, a UART, and a PCIe bus. Reference: https://www.qemu.org/docs/master/system/riscv/virt.html ↩
-
An open-source implementation of SBI. What QEMU uses by default. It runs first at boot, then hands off to the kernel. Reference: https://github.com/riscv-software-src/opensbi ↩
-
A default .config for an architecture, living at arch/<arch>/configs/defconfig. Running make defconfig resets to that baseline. Reference: https://docs.kernel.org/kbuild/kconfig.html ↩
-
Hypervisor Virtual Console. A Linux console framework used by consoles that don’t fit the regular UART model, including the SBI debug console on RISC-V. Reference: https://github.com/torvalds/linux/blob/master/drivers/tty/hvc/Kconfig ↩
-
Supervisor Binary Interface. The contract between the kernel (in S-mode) and the firmware (in M-mode). When the kernel needs to do something only firmware can do, it calls into SBI. Reference: https://github.com/riscv-non-isa/riscv-sbi-doc ↩
-
Debug Console Extension. A part of SBI 2.0+ that lets the kernel print characters via the firmware. Reference: DBCN extension chapter at https://github.com/riscv-non-isa/riscv-sbi-doc ↩
-
A kernel boot parameter that turns on a minimal console driver very early in boot, before the main console is registered. Useful for capturing crash messages and configuration errors that would otherwise be lost during early startup. Reference: https://docs.kernel.org/admin-guide/kernel-parameters.html ↩
-
A unique sequence at the start of a file that identifies its format. ELF starts with 7F 45 4C 46, gzip with 1F 8B 08, PNG with 89 50 4E 47. Tools that handle multiple formats check these bytes to know which format the file is. Reference: https://man7.org/linux/man-pages/man1/file.1.html ↩
-
Hardware Thread ID. RISC-V’s name for a CPU core’s identifier. Hart is short for hardware thread. Hart 0 is the first core. Reference: https://github.com/riscv/riscv-isa-manual ↩
-
Reading a compiled binary by translating its machine code back into assembly instructions. Useful for checking what the compiler actually produced. Reference: https://llvm.org/docs/CommandGuide/llvm-objdump.html ↩
-
A GCC and Clang extension that pins a C variable to a specific hardware register. Written as register long var asm("a0") = value; in the source. The register keyword and the asm(...) clause must both be present. Reference: https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html ↩
-
The kernel’s printf. It writes to a ring buffer in memory. Reference: https://docs.kernel.org/core-api/printk-basics.html ↩
-
A standard format for storing source-level debug info inside a compiled binary: function names, variable types, line numbers. With DWARF, a debugger can tell us we are at line 42 of init/main.c instead of just an address like 0xffffffff80123456. Reference: https://dwarfstd.org/ ↩
-
The unstripped ELF version of the Linux kernel produced by the build. Contains all symbol tables and debug info. The Image file we boot from is vmlinux with the debug info stripped and the ELF wrapper removed. Reference: https://docs.kernel.org/admin-guide/bug-hunting.html ↩
-
A small piece of code that speaks the GDB Remote Serial Protocol over a TCP socket. QEMU has one built in. A debugger can connect to it and step through code running inside QEMU. Reference: https://www.qemu.org/docs/master/system/gdb.html ↩
-
The address the CPU jumps to on power-up or reset. The very first instruction the system runs lives there. On QEMU’s RISC-V virt machine, the reset vector points into a small ROM containing a few instructions that hand off to OpenSBI. Reference: https://www.qemu.org/docs/master/system/riscv/virt.html ↩
-
A marker that tells the debugger to pause execution when the program reaches a specific function or address. While paused, the debugger can inspect registers, memory, and variables before letting the program continue. Reference: https://lldb.llvm.org/use/tutorial.html ↩
-
Program Counter. A special CPU register that holds the address of the next instruction to execute. After the CPU finishes one instruction, it reads the next one from the address in the program counter. Reference: RISC-V Unprivileged ISA spec at https://github.com/riscv/riscv-isa-manual ↩
-
Stack Pointer. A CPU register holding the current top of the stack. Pushing data onto the stack moves the stack pointer down. Popping data moves it back up. Reference: RISC-V calling convention at https://github.com/riscv-non-isa/riscv-elf-psabi-doc ↩
-
Supervisor Return instruction. The return half of ecall. It drops privilege from S-mode back to U-mode and resumes execution in userspace. Reference: Privileged Architecture spec at https://github.com/riscv/riscv-isa-manual ↩