XTLDR exposes PML table memory to the kernel as LoaderFree #25

Open
opened 2026-03-07 09:58:17 +01:00 by harraiken · 2 comments
Owner

There is a critical memory management bug in XTLDR. When the bootloader constructs the page tables and enables paging before handing off control to the kernel, it fails to reserve the physical frame containing the top-level PML table before the kernel's memory map is generated. As a result, the physical address held in the CR3 register is passed to the kernel as LoaderFree (Type 2). When the kernel's memory manager begins allocating memory and zeroes out a newly requested page via RTL::Memory::ZeroMemory, it inadvertently overwrites the active PML table. This instantly destroys the entire virtual address space, leading to a system freeze and subsequent Double/Triple Faults because the CPU can no longer read the IDT or the stack.

Problem analysis:

The active CR3 register points to physical address 0x7CE0D000:

(qemu) info registers

CPU#0
RAX=0000000000000000 RBX=000000007e261018 RCX=fffff8080005b708 RDX=fffff6fc00000000
RSI=0000000000000000 RDI=000000007e261018 RBP=fffff80800051110 RSP=fffff80800051030
R8 =0000000000000001 R9 =fffff80800051138 R10=0000800000000000 R11=ffff7fffffffffff
R12=0000000000000000 R13=000000007ee5f652 R14=0000000000000000 R15=000000007ff25be0
RIP=fffff8080002c47a RFL=00000086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =002b 0000000000000000 ffffffff 00cff300 DPL=3 DS   [-WA]
CS =0010 0000000000000000 00000000 00209a00 DPL=0 CS64 [-R-]
SS =0018 0000000000000000 00000000 00209300 DPL=0 DS   [-WA]
DS =002b 0000000000000000 ffffffff 00cff300 DPL=3 DS   [-WA]
FS =0053 0000000000000000 00000fff 0040f300 DPL=3 DS   [-WA]
GS =002b fffff8080005adb0 ffffffff 00cff300 DPL=3 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0040 fffff8080005b6a0 00000067 00008900 DPL=0 TSS64-avl
GDT=     fffff808000595b0 000007ff
IDT=     fffff80800059db0 00000fff
CR0=80050011 CR2=0000000000000000 CR3=000000007ce0d000 CR4=000006f8
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=fffff8080165f140 fffff8080165f180 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
XMM08=0000000000000000 0000000000000000 XMM09=0000000000000000 0000000000000000
XMM10=0000000000000000 0000000000000000 XMM11=0000000000000000 0000000000000000
XMM12=0000000000000000 0000000000000000 XMM13=0000000000000000 0000000000000000
XMM14=0000000000000000 0000000000000000 XMM15=0000000000000000 0000000000000000

A physical memory dump of this exact address confirms the PML4 has been completely zeroed out:

(qemu) xp /80x 0x7CE0D000
000000007ce0d000: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d010: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d020: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d030: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d040: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d050: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d060: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d070: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d080: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d090: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d0a0: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d0b0: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d0c0: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d0d0: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d0e0: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d0f0: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d100: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d110: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d120: 0x00000000 0x00000000 0x00000000 0x00000000
000000007ce0d130: 0x00000000 0x00000000 0x00000000 0x00000000

Code analysis:

The Xtos::RunBootSequence function in the xtos_o module exhibits an architectural flaw:

/* 1. The memory map dump for the kernel is created */
Status = InitializeLoaderBlock(&PageMap, &VirtualAddress, Parameters);

/* 2. Page tables are actually allocated and built */
Status = EnablePaging(&PageMap);

At the time InitializeLoaderBlock is executed, the memory for the PML4 table has not yet been reserved, which is why XTLDR marks it as free. To diagnose the issue, I added custom diagnostic code to the Xtos::GetMemoryDescriptorList function:

    while(ListEntry != &PageMap->MemoryMap)
    {
        /* Retrieve the internal memory mapping record from the current list entry */
        MemoryMapping = CONTAIN_RECORD(ListEntry, XTBL_MEMORY_MAPPING, ListEntry);

        /* Transfer memory type and address information to the kernel descriptor */
        Descriptor->MemoryType = MemoryMapping->MemoryType;
        Descriptor->BasePage = (UINT_PTR)(MemoryMapping->PhysicalAddress / EFI_PAGE_SIZE);
        Descriptor->PageCount = (ULONG)MemoryMapping->NumberOfPages;

        /* Print mapped PFN range for diagnostics */
        XtLdrProtocol->Debug.Print(L"[XTOS] Memory descriptor %lu: Type=%lu, PFN=[0x%zX, 0x%zX)\n",
                                   (ULONG)DescriptorIndex, Descriptor->MemoryType,
                                   Descriptor->BasePage, Descriptor->BasePage + Descriptor->PageCount);

        /* Link the entry */
        XtLdrProtocol->LinkedList.InsertTail(MemoryDescriptorList, &Descriptor->ListEntry);

        /* Move to the next slot in the allocated buffer */
        Descriptor++;
        ListEntry = ListEntry->Flink;
        DescriptorIndex++;
    }

With these additional logs, I was able to analyze the memory map being passed to the kernel and noticed that the address used by the PML4 table is explicitly marked as LoaderFree:

[XTOS] Memory descriptor 13: Type=5, PFN=[0x7BFDC, 0x7BFFC)
[XTOS] Memory descriptor 14: Type=2, PFN=[0x7BFFC, 0x7CE0E)   <--- PML4 (0x7CE0D)
[XTOS] Memory descriptor 15: Type=20, PFN=[0x7CE0E, 0x7CE10)

The physical address of the PML4 table from the CR3 register is 0x7CE0D000. Dividing this by the page size (4096 bytes) gives the PFN: 0x7CE0D000 / 4096 = 0x7CE0D. This frame (our PML4) falls within the half-open range of descriptor 14, i.e., 0x7BFFC <= 0x7CE0D < 0x7CE0E. Consequently, the kernel treats the frame as free and assumes it has every right to overwrite it.
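The PFN arithmetic can be double-checked with a few lines of standalone C++. This is only a sketch for verifying the numbers in this report: the PfnRange structure and the helper names are hypothetical and are not XTLDR or kernel types. The values are copied from the register dump and the descriptor log.

```cpp
#include <cassert>
#include <cstdint>

/* Hypothetical stand-in for a kernel-visible descriptor; not an XTLDR type */
struct PfnRange
{
    std::uint64_t BasePage;   /* First PFN in the range (inclusive) */
    std::uint64_t EndPage;    /* One past the last PFN (exclusive) */
    std::uint32_t MemoryType; /* 2 = LoaderFree, 20 = LoaderMemoryData in the logs */
};

constexpr std::uint64_t PageSize = 4096;

/* Convert a physical address (e.g. the CR3 value) to a page frame number */
constexpr std::uint64_t AddressToPfn(std::uint64_t PhysicalAddress)
{
    return PhysicalAddress / PageSize;
}

/* Half-open containment test: BasePage <= Pfn < EndPage */
constexpr bool ContainsPfn(const PfnRange &Range, std::uint64_t Pfn)
{
    return Pfn >= Range.BasePage && Pfn < Range.EndPage;
}

/* Values taken from the register dump and the descriptor log in this report */
constexpr std::uint64_t Cr3 = 0x7CE0D000;
constexpr PfnRange Descriptor14 = {0x7BFFC, 0x7CE0E, 2};
constexpr PfnRange Descriptor15 = {0x7CE0E, 0x7CE10, 20};

/* Compile-time checks: the PML4 frame sits in the LoaderFree descriptor */
static_assert(AddressToPfn(Cr3) == 0x7CE0D, "CR3 maps to PFN 0x7CE0D");
static_assert(ContainsPfn(Descriptor14, AddressToPfn(Cr3)), "PML4 lies in descriptor 14");
static_assert(!ContainsPfn(Descriptor15, AddressToPfn(Cr3)), "PML4 is not in descriptor 15");
```

The PML4 is the very last frame of descriptor 14, one page below the Type=20 range that starts at 0x7CE0E.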
In contrast, a log read from inside XtLdrProtocol->Memory.BuildPageMap shows what the bootloader does a fraction of a second later:

Mapping and dumping EFI memory:
...
Type=20, PhysicalBase=0x000000007CE0D000, VirtualBase=0000000000000000, Pages=1
...

To expose this bug, I had to force XTLDR to log mappings that do not have a virtual address set. It then becomes apparent that at this stage, the bootloader finally reserves frames from the free pool and marks them as Type=20 (LoaderMemoryData), subsequently building the paging tree within them. However, this updated data never makes it to the kernel.
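The ordering problem can be reproduced in isolation with a toy model. The sketch below is not XTLDR code: Mapping, MemoryMap, ReservePageTable and the two KernelView functions are hypothetical names, and carving the page from the top of the free range is an assumption chosen only because it reproduces the PFN seen in the logs. It shows that a snapshot of the map taken before the page table is reserved still reports the frame as LoaderFree, while a snapshot taken afterwards reports Type=20.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

/* Hypothetical, simplified model of the loader's memory map; not the XTLDR types */
enum MemoryType : std::uint32_t { LoaderFree = 2, LoaderMemoryData = 20 };

struct Mapping
{
    std::uint64_t BasePage;
    std::uint64_t PageCount;
    MemoryType Type;
};

using MemoryMap = std::vector<Mapping>;

/* Carve one page off the top of the first non-empty free range (assumed strategy)
   and record it as LoaderMemoryData. Returns the PFN of the reserved page. */
std::uint64_t ReservePageTable(MemoryMap &Map)
{
    for(std::size_t Index = 0; Index < Map.size(); Index++)
    {
        if(Map[Index].Type == LoaderFree && Map[Index].PageCount > 0)
        {
            Map[Index].PageCount--;
            std::uint64_t Pfn = Map[Index].BasePage + Map[Index].PageCount;
            Map.insert(Map.begin() + Index + 1, Mapping{Pfn, 1, LoaderMemoryData});
            return Pfn;
        }
    }
    assert(false && "no free page available");
    return 0;
}

/* Look up the type that a given map reports for a PFN */
MemoryType TypeOfPfn(const MemoryMap &Map, std::uint64_t Pfn)
{
    for(const Mapping &Entry : Map)
    {
        if(Pfn >= Entry.BasePage && Pfn < Entry.BasePage + Entry.PageCount)
        {
            return Entry.Type;
        }
    }
    assert(false && "PFN not described by the map");
    return LoaderFree;
}

/* Order described in this report: snapshot for the kernel first, reserve afterwards */
MemoryType KernelViewSnapshotFirst(MemoryMap Map, std::uint64_t &Pml4Pfn)
{
    MemoryMap KernelCopy = Map;      /* Stands in for InitializeLoaderBlock */
    Pml4Pfn = ReservePageTable(Map); /* Stands in for EnablePaging */
    return TypeOfPfn(KernelCopy, Pml4Pfn);
}

/* Alternative order: reserve the page table first, then take the snapshot */
MemoryType KernelViewReserveFirst(MemoryMap Map, std::uint64_t &Pml4Pfn)
{
    Pml4Pfn = ReservePageTable(Map);
    MemoryMap KernelCopy = Map;
    return TypeOfPfn(KernelCopy, Pml4Pfn);
}
```

Seeding the model with descriptors 14 and 15 from the log, KernelViewSnapshotFirst reports the reserved frame as LoaderFree and KernelViewReserveFirst reports it as LoaderMemoryData. This only illustrates the class of fix (the kernel must receive a map generated after all page-table frames are reserved); the change that actually landed is the commit linked in this issue and may take a different approach.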

harraiken added the BUG label 2026-03-07 09:58:17 +01:00
harraiken added this to the ExectOS Development Board project 2026-03-07 09:58:17 +01:00
harraiken moved this to In Progress in ExectOS Development Board on 2026-03-07 10:00:38 +01:00
harraiken self-assigned this 2026-03-07 10:00:49 +01:00
Author
Owner

I need to vent about how unbelievably fucked up the xtos_o module is right now. It is an absolute clusterfuck of bad memory management architecture and desperately needs a complete rewrite from the ground up.

Author
Owner
https://git.codingworkshop.eu.org/xt-sys/exectos/commit/21b3b269a712f46927f5336b3e994b1ea5ff498a