Filesystem: optionally report ps -f even when killing many processes
#2114
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -269,7 +269,26 @@ currently accessing the mount directory. | |
| avoid functions that could potentially block during process | ||
| detection | ||
| "false" : Do not kill any processes. | ||
| "move" : like "safe", but try to mount --move first | ||
|
|
||
| You may add one or more comma separated modifiers | ||
| "move", "[no]xargs", "[no]ps", "kill_one_by_one", | ||
| for example: | ||
|
|
||
| "true,xargs,nops": | ||
| find processes using system tools, | ||
| then kill them "simultaneously" using "xargs kill", | ||
| do not bother to show process details. | ||
|
|
||
| "safe,move,xargs,ps": | ||
| move the mount first, then | ||
| find processes by walking /proc/ "manually", then | ||
| use "xargs ps" to show process details before | ||
| using "xargs kill" to get rid of them. | ||
|
|
||
| "safe,move,noxargs,ps": | ||
| move the mount first, then | ||
| find processes by walking /proc/ "manually", then | ||
| show process details and kill them individually in a loop. | ||
|
|
||
| The 'safe' option uses shell logic to walk the /proc/<pid>/ directories | ||
| for pids using the mount point while the default option uses the | ||
|
|
@@ -281,8 +300,9 @@ party apps, we likely never win the race and the file system will be kept busy. | |
| Which may result in a timeout and stop failure, potentially escalating to | ||
| hard-reset of this node via fencing. | ||
|
|
||
| The 'move' option tries to move the mount point somewhere those "rogue apps" | ||
| do not expect it, then proceed to kill current users and attempt to umount. | ||
| The 'move' option tries to "mount --move" the mount point somewhere those | ||
| "rogue apps" do not expect it, then proceed to kill current users and attempt | ||
| to umount. | ||
|
|
||
| For 'move' to work, you will have to make sure the mount point does not reside | ||
| under a shared mount, for example by mount -o bind,private /mount /mount | ||
|
|
@@ -773,22 +793,39 @@ signal_processes() { | |
| if [ $nr_pids = 0 ]; then | ||
| ocf_log info "No processes on $dir were signalled. force_unmount is set to '$FORCE_UNMOUNT'" | ||
| return 1 | ||
| elif [ $nr_pids -le 24 ]; then | ||
| fi | ||
| if $do_xargs_kill; then | ||
| if $do_ps_f; then | ||
| # echo "$pids" | xargs -r kill -s STOP | ||
| echo "sending signal $sig to $nr_pids processes:" | ||
|
|
||
| # According to my man page, 'ps -f "$pid"' might be good enough, | ||
| # but needs rather specific formatting of that argument. | ||
| # And 'ps -f $pid' might produce too many words. | ||
| # Use xargs anyway. | ||
| echo "$pids" | xargs ps -f 2>&1 | ||
| fi | ||
| echo "$pids" | xargs -r kill -s $sig 2>&1 | ||
| if [ $nr_pids -gt 24 ]; then | ||
| sed_script="11 s/^.*/... and more .../; 12,$(( $nr_pids - 10))d" | ||
| pids=$(echo "$pids" | sed -e "$sed_script" | tr '\n' ' ') | ||
| fi | ||
| echo "sent signals $sig to ${nr_pids} processes ["${pids}"]" | ||
| else | ||
| for pid in $pids; do | ||
| ocf_log info "sending signal $sig to: $(ps -f $pid | tail -1)" | ||
| if $do_ps_f; then | ||
| echo "sending signal $sig to: $(ps -f $pid | tail -1)" | ||
| else | ||
| echo "sending signal $sig to $pid" | ||
| fi | ||
| kill -s $sig $pid | ||
| done | ||
| else | ||
| echo "$pids" | xargs -r kill -s $sig | ||
| sed_script="11 s/^.*/... and more .../; 12,$(( $nr_pids - 10))d" | ||
| pids=$(echo "$pids" | sed -e "$sed_script" | tr '\n' ' ') | ||
| ocf_log info "sent signals $sig to ${nr_pids} processes [${pids}]" | ||
| fi | ||
| fi | ocf_log_pipe info | ||
| return 0 | ||
| } | ||
| try_umount() { | ||
| local force_arg="$1" SUB="$2" | ||
| $UMOUNT $force_arg "$SUB" | ||
| $UMOUNT $force_arg "$SUB" 2>&1 | ocf_log_pipe warn | ||
| list_mounts | grep "${TAB}${SUB}${TAB}" >/dev/null 2>&1 || { | ||
| ocf_log info "unmounted $SUB successfully" | ||
| return $OCF_SUCCESS | ||
|
|
@@ -1181,10 +1218,67 @@ if [ ! -z "$OCF_RESKEY_options" ]; then | |
| fi | ||
| FAST_STOP=${OCF_RESKEY_fast_stop:="yes"} | ||
|
|
||
| case $FORCE_UNMOUNT in | ||
| move) move_before_umount=true; FORCE_UNMOUNT=safe ;; | ||
| *) move_before_umount=false ;; | ||
| esac | ||
| parse_force_unmount_modifiers() | ||
| { | ||
| # keep previous "kill one by one" behavior as default for now | ||
| move_before_umount=false | ||
| do_xargs_kill=false | ||
| do_ps_f=true | ||
|
|
||
| local IFS=',' | ||
| local m | ||
|
|
||
| set -- $FORCE_UNMOUNT | ||
|
|
||
| if [ $1 = "move" ]; then | ||
| FORCE_UNMOUNT=safe | ||
| else | ||
| FORCE_UNMOUNT=$1 | ||
| shift | ||
| fi | ||
|
|
||
| for m ; do | ||
| case $m in | ||
| move) | ||
|
Contributor
We shouldn't set multiple variables here, since later modifiers may or may not override them depending on order. We should also make sure that opposites (e.g. move and nomove) are not both set. Let's set one variable to true or false per match. |
||
| move_before_umount=true | ||
| do_xargs_kill=true | ||
| do_ps_f=false | ||
| ;; | ||
| nomove) | ||
| # in case this becomes the default | ||
| move_before_umount=false | ||
| ;; | ||
| ps) | ||
| do_ps_f=true | ||
| ;; | ||
| nops) | ||
| do_ps_f=false | ||
| ;; | ||
| xargs) | ||
| do_xargs_kill=true | ||
| ;; | ||
| noxargs) | ||
| do_xargs_kill=false | ||
| ;; | ||
| kill_one_by_one) | ||
| do_xargs_kill=false | ||
| do_ps_f=true | ||
| ;; | ||
|
|
||
| # catch typos | ||
| *) | ||
| ocf_log warn "force_unmount: unknown modifier $m ignored" | ||
| esac | ||
| done | ||
|
|
||
| # catch typos | ||
| if ! ocf_is_true $FORCE_UNMOUNT && [ $FORCE_UNMOUNT != "safe" ]; then | ||
|
Contributor
This can be removed if you move it to a case-statement in |
||
| ocf_log warn "force_unmount: value '$FORCE_UNMOUNT' interpreted as 'false'" | ||
| fi | ||
| } | ||
|
|
||
| parse_force_unmount_modifiers | ||
|
|
||
|
|
||
| OP=$1 | ||
|
|
||
|
|
||
We should use a case for this parameter, and add it to the for-case below, so we don't depend on it being first.
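To make the suggestions in this review concrete, here is a minimal sketch of how the parsing could look if the base value and the modifiers were handled in one `case` inside the loop, with each match setting exactly one variable. It is not the code from this PR. The names `parse_force_unmount`, `set_flag` and `mode` are hypothetical, the fallback defaults at the end are assumptions for illustration, `echo` stands in for `ocf_log`, only the literal values `true`, `false` and `safe` are accepted as the base value, and `kill_one_by_one` is left out because it would set two variables at once.

```shell
#!/bin/sh
# Hypothetical sketch, not the PR's implementation.

# set_flag NAME VALUE: assign VALUE to the variable NAME, but refuse to
# flip a value that an earlier token already set (e.g. "move,nomove").
set_flag() {
    local name=$1 value=$2 cur
    eval "cur=\${$name}"
    if [ -n "$cur" ] && [ "$cur" != "$value" ]; then
        echo "force_unmount: conflicting values for $name" >&2
        return 1
    fi
    eval "$name=\$value"
}

# parse_force_unmount VALUE: split VALUE on commas and handle the base
# value and all modifiers in one case statement, so order does not matter.
parse_force_unmount() {
    mode="" move_before_umount="" do_xargs_kill="" do_ps_f=""

    local IFS=','
    local m
    set -- $1    # unquoted on purpose: split on commas

    for m; do
        case $m in
        true|false|safe) set_flag mode "$m" ;;
        move)     set_flag move_before_umount true ;;
        nomove)   set_flag move_before_umount false ;;
        xargs)    set_flag do_xargs_kill true ;;
        noxargs)  set_flag do_xargs_kill false ;;
        ps)       set_flag do_ps_f true ;;
        nops)     set_flag do_ps_f false ;;
        *) echo "force_unmount: unknown modifier '$m' ignored" >&2 ;;
        esac || return 1
    done

    # Fallbacks for anything not mentioned (assumed defaults for this sketch).
    : "${mode:=true}"
    : "${move_before_umount:=false}"
    : "${do_xargs_kill:=false}"
    : "${do_ps_f:=true}"
}

# Usage: the base value may appear anywhere in the list.
parse_force_unmount "nops,move,safe,xargs"
echo "mode=$mode move=$move_before_umount xargs=$do_xargs_kill ps=$do_ps_f"
# prints: mode=safe move=true xargs=true ps=false
```

Leaving the variables empty until a token sets them is what lets `set_flag` tell "not mentioned" apart from "explicitly set", which in turn makes contradictory pairs detectable regardless of their position in the list.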