Conversation
There was a problem hiding this comment.
Code Review
This pull request improves the Megatron server startup script by pre-creating log files and tailing them for real-time monitoring. It also updates the set_template method to include a default model_id and adds support for Qwen3.6 in the template mapping. Feedback was provided to ensure the tail command terminates if the server process crashes and to allow model_id overrides by using setdefault.
I am having trouble creating individual review comments. Click here to see my feedback.
cookbook/client/server/megatron/run.sh (394)
The tail -f command blocks the script indefinitely. If the server process crashes or exits, tail will continue to run, which can be misleading to the user. Using the --pid flag ensures that tail terminates automatically when the monitored server process ends.
tail --pid=$SERVER_PID -f "$LOG_FILE"
src/twinkle/model/megatron/megatron.py (1330)
Hard-coding model_id in kwargs prevents users from explicitly providing a custom model or tokenizer ID when calling set_template. Using setdefault ensures that self.tokenizer_id is used as the default while still allowing for manual overrides if necessary.
kwargs.setdefault('model_id', self.tokenizer_id)
There was a problem hiding this comment.
Pull request overview
This PR aims to fix model/template identification issues when running Qwen3.6 models in the Twinkle server stack (template selection and template construction), and improves the Megatron cookbook run script’s log handling.
Changes:
- Add Qwen3.6 →
Qwen3_5Templatemapping for server-side template selection. - Ensure Megatron
set_template()passesmodel_id(viatokenizer_id) when constructingTemplateinstances. - Pre-create the server log file and follow logs after starting the server in the Megatron cookbook
run.sh.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/twinkle/server/utils/template_utils.py |
Extends template auto-selection to include Qwen3.6 model names. |
src/twinkle/model/megatron/megatron.py |
Injects model_id for template construction to prevent Template init failures / wrong model id usage. |
cookbook/client/server/megatron/run.sh |
Avoids tail -f failing on missing log file; adds blocking log-follow behavior. |
| # 启动服务器并实时显示日志 | ||
| touch "$LOG_FILE" # 预创建文件,避免 tail -f 在文件尚未写入时报错 | ||
| nohup python -m twinkle.server --config "$SERVER_CONFIG_FILE" > "$LOG_FILE" 2>&1 & | ||
| SERVER_PID=$! | ||
| print_success "Twinkle Server 已启动 (PID: $SERVER_PID)" | ||
|
|
||
| # 实时显示日志(阻塞进程) | ||
| tail -f "$LOG_FILE" |
There was a problem hiding this comment.
tail -f "$LOG_FILE" will block the script indefinitely (and will usually keep running even after the server process exits). If this script is used in automation, that behavior can hang the job. Consider using tail --pid=$SERVER_PID -f (GNU tail) so it exits when the server dies, and/or gate log-following behind a CLI flag/environment variable so the script can return immediately after starting the server when desired.
| # 启动服务器并实时显示日志 | |
| touch "$LOG_FILE" # 预创建文件,避免 tail -f 在文件尚未写入时报错 | |
| nohup python -m twinkle.server --config "$SERVER_CONFIG_FILE" > "$LOG_FILE" 2>&1 & | |
| SERVER_PID=$! | |
| print_success "Twinkle Server 已启动 (PID: $SERVER_PID)" | |
| # 实时显示日志(阻塞进程) | |
| tail -f "$LOG_FILE" | |
| # 启动服务器;默认实时显示日志,可通过 FOLLOW_SERVER_LOGS=0 禁用 | |
| FOLLOW_SERVER_LOGS="${FOLLOW_SERVER_LOGS:-1}" | |
| touch "$LOG_FILE" # 预创建文件,避免 tail 在文件尚未写入时报错 | |
| nohup python -m twinkle.server --config "$SERVER_CONFIG_FILE" > "$LOG_FILE" 2>&1 & | |
| SERVER_PID=$! | |
| print_success "Twinkle Server 已启动 (PID: $SERVER_PID)" | |
| echo " - 设置 FOLLOW_SERVER_LOGS=0 可在启动后立即返回" | |
| # 实时显示日志(阻塞进程),并在服务器退出后自动结束 | |
| if [ "$FOLLOW_SERVER_LOGS" = "1" ]; then | |
| if tail --pid="$SERVER_PID" -f /dev/null >/dev/null 2>&1; then | |
| tail --pid="$SERVER_PID" -f "$LOG_FILE" | |
| else | |
| print_warning "当前 tail 不支持 --pid,改为等待服务器进程退出;如需立即返回,请设置 FOLLOW_SERVER_LOGS=0" | |
| wait "$SERVER_PID" | |
| fi | |
| fi |
PR type
PR information
Write the detail information belongs to this PR.
Experiment results
Paste your experiment result here(if needed).