Skip to content

Fix model id 0416#162

Open
Yunnglin wants to merge 3 commits intomainfrom
fix_model_id_0416
Open

Fix model id 0416#162
Yunnglin wants to merge 3 commits intomainfrom
fix_model_id_0416

Conversation

@Yunnglin
Copy link
Copy Markdown
Collaborator

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

Write the detail information belongs to this PR.

Experiment results

Paste your experiment result here(if needed).

Copilot AI review requested due to automatic review settings April 16, 2026 12:27
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves the Megatron server startup script by pre-creating log files and tailing them for real-time monitoring. It also updates the set_template method to include a default model_id and adds support for Qwen3.6 in the template mapping. Feedback was provided to ensure the tail command terminates if the server process crashes and to allow model_id overrides by using setdefault.

I am having trouble creating individual review comments. Click here to see my feedback.

cookbook/client/server/megatron/run.sh (394)

medium

The tail -f command blocks the script indefinitely. If the server process crashes or exits, tail will continue to run, which can be misleading to the user. Using the --pid flag ensures that tail terminates automatically when the monitored server process ends.

tail --pid=$SERVER_PID -f "$LOG_FILE"

src/twinkle/model/megatron/megatron.py (1330)

medium

Hard-coding model_id in kwargs prevents users from explicitly providing a custom model or tokenizer ID when calling set_template. Using setdefault ensures that self.tokenizer_id is used as the default while still allowing for manual overrides if necessary.

        kwargs.setdefault('model_id', self.tokenizer_id)

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to fix model/template identification issues when running Qwen3.6 models in the Twinkle server stack (template selection and template construction), and improves the Megatron cookbook run script’s log handling.

Changes:

  • Add Qwen3.6 → Qwen3_5Template mapping for server-side template selection.
  • Ensure Megatron set_template() passes model_id (via tokenizer_id) when constructing Template instances.
  • Pre-create the server log file and follow logs after starting the server in the Megatron cookbook run.sh.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/twinkle/server/utils/template_utils.py Extends template auto-selection to include Qwen3.6 model names.
src/twinkle/model/megatron/megatron.py Injects model_id for template construction to prevent Template init failures / wrong model id usage.
cookbook/client/server/megatron/run.sh Avoids tail -f failing on missing log file; adds blocking log-follow behavior.

Comment on lines 387 to +394
# 启动服务器并实时显示日志
touch "$LOG_FILE" # 预创建文件,避免 tail -f 在文件尚未写入时报错
nohup python -m twinkle.server --config "$SERVER_CONFIG_FILE" > "$LOG_FILE" 2>&1 &
SERVER_PID=$!
print_success "Twinkle Server 已启动 (PID: $SERVER_PID)"

# 实时显示日志(阻塞进程)
tail -f "$LOG_FILE"
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tail -f "$LOG_FILE" will block the script indefinitely (and will usually keep running even after the server process exits). If this script is used in automation, that behavior can hang the job. Consider using tail --pid=$SERVER_PID -f (GNU tail) so it exits when the server dies, and/or gate log-following behind a CLI flag/environment variable so the script can return immediately after starting the server when desired.

Suggested change
# 启动服务器并实时显示日志
touch "$LOG_FILE" # 预创建文件,避免 tail -f 在文件尚未写入时报错
nohup python -m twinkle.server --config "$SERVER_CONFIG_FILE" > "$LOG_FILE" 2>&1 &
SERVER_PID=$!
print_success "Twinkle Server 已启动 (PID: $SERVER_PID)"
# 实时显示日志(阻塞进程)
tail -f "$LOG_FILE"
# 启动服务器;默认实时显示日志,可通过 FOLLOW_SERVER_LOGS=0 禁用
FOLLOW_SERVER_LOGS="${FOLLOW_SERVER_LOGS:-1}"
touch "$LOG_FILE" # 预创建文件,避免 tail 在文件尚未写入时报错
nohup python -m twinkle.server --config "$SERVER_CONFIG_FILE" > "$LOG_FILE" 2>&1 &
SERVER_PID=$!
print_success "Twinkle Server 已启动 (PID: $SERVER_PID)"
echo " - 设置 FOLLOW_SERVER_LOGS=0 可在启动后立即返回"
# 实时显示日志(阻塞进程),并在服务器退出后自动结束
if [ "$FOLLOW_SERVER_LOGS" = "1" ]; then
if tail --pid="$SERVER_PID" -f /dev/null >/dev/null 2>&1; then
tail --pid="$SERVER_PID" -f "$LOG_FILE"
else
print_warning "当前 tail 不支持 --pid,改为等待服务器进程退出;如需立即返回,请设置 FOLLOW_SERVER_LOGS=0"
wait "$SERVER_PID"
fi
fi

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants