Commit bf58a4d

0.2.0 version
Signed-off-by: roots666 <m0980701299@gmail.com>
1 parent 5799575 commit bf58a4d

27 files changed

Lines changed: 806 additions & 202 deletions

.github/workflows/release.yml

Lines changed: 7 additions & 3 deletions
@@ -24,9 +24,9 @@ jobs:
           sudo apt-get update
           sudo apt-get install -y nasm make binutils
 
-      - name: Build amd64 binary
+      - name: Build and test amd64
         run: |
-          make build-amd64
+          make test-all
           strip build/mini-init-amd64
 
       - name: Upload amd64 artifact
@@ -47,13 +47,17 @@ jobs:
       - name: Install cross-compile dependencies
         run: |
           sudo apt-get update
-          sudo apt-get install -y nasm make binutils gcc-aarch64-linux-gnu binutils-aarch64-linux-gnu
+          sudo apt-get install -y nasm make binutils gcc-aarch64-linux-gnu binutils-aarch64-linux-gnu qemu-user-static file
 
       - name: Build arm64 binary
         run: |
           make build-arm64
           aarch64-linux-gnu-strip build/mini-init-arm64
 
+      - name: ARM64 fallback smoke (QEMU-user)
+        run: |
+          ARM64_FALLBACK=1 EP_ARM64_FALLBACK=1 bash scripts/test_harness_arm64.sh build/mini-init-arm64
+
       - name: Upload arm64 artifact
         uses: actions/upload-artifact@v4
         with:

CHANGELOG.md

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+# Changelog
+
+## 0.2.0 - 2025-12-13
+
+### Fixed
+
+- Fix critical PID1 hang: main-child exit could be missed during SIGCHLD storms (now reliably detected/reported on both amd64 and arm64).
+- Fix amd64 restart-mode stack safety when max restarts are reached.
+- Prevent `epoll` fd leakage into the exec’ed child (`epoll_create1(EPOLL_CLOEXEC)`).
+- Avoid SIGKILL escalation to a reused/nonexistent PGID (`kill(-pgid, 0)` probe before escalation).
+- Fix verbose logging writing a NUL byte in timestamps.
+
+### Improved
+
+- More actionable verbose logs: signal number, grace seconds, restart backoff seconds, and restart count.
+- Harden `EP_SIGNALS` parsing:
+  - Trim trailing whitespace per token.
+  - Real-time signals are now bounded to the kernel max (`RT1..RT30`).
+- Numeric env vars are now parsed strictly as decimal digits; invalid/overflow values are ignored (warnings in verbose mode).
+- Timer-related seconds (grace/backoff) are clamped to fit signed 64-bit seconds.
+- `EP_EXIT_CODE_BASE` is now validated as `0..255` (out-of-range values are ignored; default applies).
+- ARM64/QEMU: remove `msub` usage in hot paths; in `EP_ARM64_FALLBACK=1` mode timestamps are omitted to reduce QEMU-user flakiness.
+
+### Tests/Docs
+
+- Add an edge-case test covering “main child exits while many other children are reaped”.
+- Update docs to reflect RT signal bounds and ARM64 fallback timestamp behavior.
+- ARM64/QEMU: in fallback mode, smoke tests now exercise `--version` and the wait4-only path (helper exit propagation) instead of skipping entirely.
+- Clean up `scripts/*` quoting to avoid common ShellCheck warnings (SC2016/SC2086).
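
The `RT1..RT30` bound follows from constants that appear in this commit's include files (`SIGRTMIN` = 34, `SIGRTMAX` = 64): `RTN` means `SIGRTMIN+N`, so `RT30` is signal 64 and `RT31` would be 65, one past the maximum. The bash sketch below only illustrates that arithmetic together with the per-token trailing-whitespace trim. The function name `rt_token_to_signum` is hypothetical, rejecting a leading zero is a choice of this sketch, and none of it is the project's assembly implementation.

```bash
#!/usr/bin/env bash
# Values as defined in include/syscalls_*.inc in this commit.
SIGRTMIN=34
SIGRTMAX=64

# Hypothetical helper: map an "RTN" token to its signal number.
# Prints the number and returns 0 when SIGRTMIN+N is in range; returns 1 otherwise.
rt_token_to_signum() {
  local tok="$1"
  # Trim trailing whitespace, as the changelog describes for each token.
  tok="${tok%"${tok##*[![:space:]]}"}"
  # Accept only "RT" followed by one or two digits (no leading zero in this sketch).
  [[ "$tok" =~ ^RT([1-9][0-9]?)$ ]] || return 1
  local n="${BASH_REMATCH[1]}"
  local signum=$(( SIGRTMIN + n ))
  (( signum <= SIGRTMAX )) || return 1
  printf '%s\n' "$signum"
}

rt_token_to_signum "RT1"      # prints 35
rt_token_to_signum "RT30 "    # prints 64 (trailing space trimmed)
rt_token_to_signum "RT31" || echo "RT31 rejected"
```

With these constants, any `N` above 30 lands outside `SIGRTMIN..SIGRTMAX`, which is why the documented upper bound moved from 31 to 30.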

Makefile

Lines changed: 1 addition & 0 deletions
@@ -55,6 +55,7 @@ test-all: test-amd64
 	bash scripts/test_edge_cases.sh $(BUILD_DIR)/$(TARGET_AMD64)
 	bash scripts/test_exit_code_mapping.sh $(BUILD_DIR)/$(TARGET_AMD64)
 	bash scripts/test_restart.sh $(BUILD_DIR)/$(TARGET_AMD64)
+	bash scripts/test_diagnostics.sh $(BUILD_DIR)/$(TARGET_AMD64)
 
 $(AMD64_BUILD_DIR):
 	mkdir -p $@

README.md

Lines changed: 8 additions & 2 deletions
@@ -176,14 +176,16 @@ mini-init-{amd64|arm64} [--verbose|-v] [--version|-V] -- <command> [args...]
 
 ### Environment variables
 
+Numeric env vars are parsed as **non-negative decimal**. If a value is invalid/overflows, it is ignored (defaults apply); in verbose mode a warning is logged. Timer-related values (grace/backoff seconds) are clamped to fit in signed 64-bit seconds.
+
 - `EP_GRACE_SECONDS`
   Grace period (in seconds) from the *first* forwarded soft signal to `SIGKILL` escalation.
   Default: `10`.
 
 - `EP_SIGNALS`
   CSV of **additional** signal names to monitor/forward (case-sensitive).
-  Supported names: `USR1,USR2,PIPE,WINCH,TTIN,TTOU,CONT,ALRM,RT1,...,RT31`
-  (`RTN` = `SIGRTMIN+N`, 1–31).
+  Supported names: `USR1,USR2,PIPE,WINCH,TTIN,TTOU,CONT,ALRM,RT1,...,RT30`
+  (`RTN` = `SIGRTMIN+N`, 1–30).
   These **augment** the built-in set: `HUP,INT,QUIT,TERM,CHLD` plus default forwarding
   of `USR1,USR2,PIPE,WINCH,TTIN,TTOU,CONT,ALRM`.
   Unknown tokens are ignored with a warning. In verbose mode we only log
@@ -192,12 +194,14 @@ mini-init-{amd64|arm64} [--verbose|-v] [--version|-V] -- <command> [args...]
 - `EP_SUBREAPER`
   If set to `1`, enables `PR_SET_CHILD_SUBREAPER` so that `mini-init-asm` adopts orphaned
   grandchildren. Useful when nested processes need proper reaping.
+  `mini-init-asm` still exits when the main child exits (it does not wait indefinitely for adopted descendants).
   Default: disabled.
 
 - `EP_EXIT_CODE_BASE`
   Base value for mapping “killed by signal” to exit code:
   `exit_code = EP_EXIT_CODE_BASE + signal_number` (default base `128`, like shells).
   For example, `SIGKILL` (9) with base 200 → exit code 209.
+  Valid range: `0..255` (out-of-range values are ignored; default applies).
 
 - `EP_RESTART_ENABLED`
   If set to `1`, enables **restart-on-crash**: when the child is killed by a signal
@@ -219,6 +223,8 @@ mini-init-{amd64|arm64} [--verbose|-v] [--version|-V] -- <command> [args...]
 - `EP_ARM64_FALLBACK` (ARM64/QEMU only)
   If set to `1`, ARM64 builds skip the epoll/signalfd path and use a simpler
   `wait4` loop. Intended as a workaround for QEMU user-mode flakiness in CI smoke tests.
+  This mode does **not** provide the full signal-forwarding/grace-timer behavior; it primarily verifies spawn + exit-code propagation.
+  In fallback mode, verbose logs may omit timestamps to avoid QEMU-user emulation issues.
   Default: `0` (CI jobs typically set this).
 
 ### Examples
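
The `EP_EXIT_CODE_BASE` rule in the README hunk above (`exit_code = EP_EXIT_CODE_BASE + signal_number`, valid base `0..255`, default `128`) can be sketched in bash as follows. The helper name `signal_exit_code` is hypothetical and this is an illustration of the documented formula, not the project's code. What happens when the sum exceeds 255 is not stated in the source, so the sketch does not model it.

```bash
#!/usr/bin/env bash
# Hypothetical helper illustrating the README formula; not the project's code.
DEFAULT_BASE=128

signal_exit_code() {
  local signum="$1" base="${2:-}"
  # A missing, non-numeric, or out-of-range base is ignored and the default
  # applies, matching the documented "Valid range: 0..255" rule.
  if ! [[ "$base" =~ ^0*[0-9]{1,3}$ ]] || (( 10#$base > 255 )); then
    base="$DEFAULT_BASE"
  fi
  # 10# forces decimal so a leading zero is not read as octal.
  printf '%s\n' "$(( 10#$base + signum ))"
}

signal_exit_code 9 200   # prints 209 (the README example: SIGKILL with base 200)
signal_exit_code 11      # prints 139 (SIGSEGV with the default base)
signal_exit_code 9 300   # prints 137 (base 300 is out of range, default used)
```

The second call matches the value that `scripts/test_diagnostics.sh` in this commit expects after the child kills itself with `SEGV`.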

ROADMAP.md

Lines changed: 11 additions & 2 deletions
@@ -4,13 +4,22 @@ This document tracks planned features and improvements for `mini-init-asm`.
 
 ## Short-term (Next Release)
 
+### Possible next steps
+
+- Native ARM64 validation (real hardware or full-system QEMU) for the normal epoll/signalfd path.
+- Consider optional `EP_SUBREAPER_WAIT=1` (wait for adopted children after main child exit) and document tradeoffs.
+- Consider `EP_RESTART_ON_NONZERO_EXIT=1` (opt-in) if restart-on-crash should include nonzero normal exits.
+- Clamp `EP_MAX_RESTARTS` to a sane upper bound to avoid pathological loops.
+- Continue improving diagnostics while keeping pure-syscall design (e.g., log child PGID, kill/escalation decisions).
+
 ### Arm64 tests on linux
 
-- QEMU user-mode remains flaky: ARM64 binary hangs/SIGILLs under `qemu-aarch64-static` right after startup (even with fallback mode).
+- QEMU user-mode remains flaky: ARM64 binary hangs/SIGILLs under `qemu-aarch64-static` right after startup (historically even with fallback mode).
 - Helpers (`helper-exit42`, `helper-sleeper`) run fine under QEMU; issue is specific to `mini-init-arm64` user-mode emulation.
 - Instrumentation shows execution reaches `get_timestamp_ptr`/epoll setup, then no further syscalls; QEMU SIGILL is likely emulator-specific.
 - Added `EP_ARM64_FALLBACK`/`ARM64_FALLBACK` env to skip the QEMU smoke in CI while keeping native behavior unchanged.
-- Next: validate on native ARM64 hardware or full-system QEMU; try newer QEMU user-mode or replace `udiv`/`msub` in `get_timestamp_ptr` with a simpler divide loop if emulation keeps failing.
+- Implemented a safer path: removed `msub` usage and made `EP_ARM64_FALLBACK=1` omit timestamp formatting to avoid QEMU-user issues.
+- Next: validate on native ARM64 hardware or full-system QEMU; try newer QEMU user-mode if emulation still fails.
 
 ---
 
include/macros.inc

Lines changed: 36 additions & 1 deletion
@@ -29,7 +29,7 @@
     je %%done
     call get_timestamp_ptr
     mov rax, SYS_write
-    mov rdx, 19
+    mov rdx, 18
     mov rdi, 2
     mov rsi, rax
     syscall
@@ -86,3 +86,38 @@ parse_u64_dec:
     jmp .loop
 .done:
     ret
+
+; strict parse decimal u64 (digits only, non-empty, full-string)
+; in: rsi = ptr to NUL-terminated string
+; out: rax = value (undefined if rdx=0), rdx = 1 if valid else 0
+parse_u64_dec_checked:
+    xor rax, rax
+    xor r11, r11 ; ok=0
+    mov r8b, [rsi]
+    test r8b, r8b
+    je .bad
+.loop_checked:
+    mov r8b, [rsi]
+    test r8b, r8b
+    je .ok
+    cmp r8b, '0'
+    jb .bad
+    cmp r8b, '9'
+    ja .bad
+    mov r11, 1
+    mov rcx, 10
+    mul rcx ; rdx:rax = rax*10
+    test rdx, rdx
+    jne .bad
+    movzx rcx, r8b
+    sub rcx, '0'
+    add rax, rcx
+    jc .bad
+    inc rsi
+    jmp .loop_checked
+.ok:
+    mov rdx, r11
+    ret
+.bad:
+    xor rdx, rdx
+    ret
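
The contract of `parse_u64_dec_checked` (non-empty, digits only, whole string consumed, overflow rejected) can be mirrored in bash for experimentation. This is a sketch under stated assumptions, not a port: bash arithmetic is signed 64-bit, so instead of the `mul`/carry checks used in the assembly it compares digit strings against the unsigned 64-bit maximum. The bash function reuses the routine's name purely for illustration, and `U64_MAX` is a name introduced here.

```bash
#!/usr/bin/env bash
# Largest unsigned 64-bit value, kept as a string (bash integers are signed).
U64_MAX=18446744073709551615

# Illustrative bash counterpart of the assembly routine.
# Prints the value without leading zeros and returns 0 if the input is a
# non-empty, digits-only decimal that fits in 64 bits; returns 1 otherwise.
parse_u64_dec_checked() {
  local s="$1"
  # Reject empty input and anything containing a non-digit.
  [[ "$s" =~ ^[0-9]+$ ]] || return 1
  # Strip leading zeros so the length comparison below is meaningful.
  local n="${s#"${s%%[!0]*}"}"
  [[ -n "$n" ]] || n=0
  # More digits than U64_MAX means overflow.
  if (( ${#n} > ${#U64_MAX} )); then return 1; fi
  # Same digit count: string order equals numeric order.
  if (( ${#n} == ${#U64_MAX} )) && [[ "$n" > "$U64_MAX" ]]; then return 1; fi
  printf '%s\n' "$n"
}

parse_u64_dec_checked "0010"                  # prints 10
parse_u64_dec_checked "18446744073709551616" || echo "overflow rejected"
parse_u64_dec_checked "12x" || echo "trailing garbage rejected"
```

As in the assembly, a failed parse yields no usable value, so a caller would keep its default rather than a partially parsed number.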

include/macros_arm64.inc

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@
     bl get_timestamp_ptr
     mov x1, x0
     mov x0, #2
-    mov x2, #19
+    mov x2, #18
     SYSCALL SYS_write
     mov x0, #2
     adrp x1, \msg_ptr

include/syscalls_aarch64.inc

Lines changed: 5 additions & 0 deletions
@@ -25,6 +25,7 @@
 // epoll/op flags
 .equ EPOLL_CTL_ADD, 1
 .equ EPOLLIN, 1
+.equ EPOLL_CLOEXEC, 02000000
 
 // sigprocmask how
 .equ SIG_BLOCK, 0
@@ -57,6 +58,10 @@
 .equ SIGRTMIN, 34
 .equ SIGRTMAX, 64
 
+// errno subset
+.equ ESRCH, 3
+.equ EINTR, 4
+
 // timerfd/signalfd flags
 .equ TFD_CLOEXEC, 02000000
 .equ TFD_NONBLOCK, 00004000

include/syscalls_amd64.inc

Lines changed: 2 additions & 0 deletions
@@ -23,6 +23,7 @@
 ; epoll/op flags
 %define EPOLL_CTL_ADD 1
 %define EPOLLIN 1
+%define EPOLL_CLOEXEC 02000000o
 
 ; sigprocmask how
 %define SIG_BLOCK 0
@@ -56,6 +57,7 @@
 %define SIGRTMAX 64
 
 ; errno subset
+%define ESRCH 3
 %define EINTR 4
 
 ; timerfd/signalfd flags (octal per manpages)

scripts/test_diagnostics.sh

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+BIN="${1:-./build/mini-init-amd64}"
+
+echo "[test] Diagnostics/verbose logging checks"
+
+tmp="$(mktemp)"
+tmp2="$(mktemp)"
+cleanup() {
+  rm -f "$tmp" "$tmp2"
+}
+trap cleanup EXIT
+
+echo "[test] 1) Logs include signal and grace_seconds"
+EP_GRACE_SECONDS=2 "$BIN" -v -- /bin/bash scripts/fixtures/trap_exit0.sh 2>"$tmp" &
+pid=$!
+sleep 0.5
+kill -TERM "$pid"
+set +e
+wait "$pid"
+wait_rc=$?
+set -e
+echo "[test] rc=$wait_rc"
+test "$wait_rc" -eq 0
+grep -q "DEBUG: signal=" "$tmp"
+grep -q "DEBUG: grace_seconds=" "$tmp"
+
+echo "[test] 2) Logs include restart_count on restart"
+set +e
+EP_RESTART_ENABLED=1 EP_MAX_RESTARTS=1 EP_RESTART_BACKOFF_SECONDS=0 \
+  "$BIN" -v -- /bin/sh -c "kill -SEGV \$\$" 2>"$tmp2"
+wait_rc=$?
+set -e
+echo "[test] rc=$wait_rc"
+test "$wait_rc" -eq 139
+grep -q "DEBUG: restart_count=" "$tmp2"
+
+echo "[test] Diagnostics tests passed"
