Linux 101: Disown long-running jobs like an InsightIQ database upgrade

If you’re remotely managing a Linux machine, you’ll probably use an SSH connection to run commands on that machine. There’s one problem with this approach: if you close the SSH connection, any long-running jobs or commands started from it will halt. If you know a job will take a long time and you won’t be able to babysit the SSH connection, you can plan accordingly. But what if you underestimated the time a job will take, and you need to disconnect anyway? Here’s how to keep the job running AND make it home in time for dinner!

InsightIQ database upgrade

When upgrading from InsightIQ 3.0 or older releases to 3.1 you need to perform a database upgrade with the update_iiq_datastore command. The wizard is pretty intuitive, so no worries there.

If you’ve got pre-3.0 InsightIQ data in the database, that will need an additional conversion. The wizard will give you the option to either upgrade or delete the pre-3.0 information, and show you which time span of statistics that will impact. In our case, it was roughly 6 months spanning 2013-2014, so I chose to delete that information instead of upgrading it.

The 3.0 data will also need an upgrade though, and mind you: it takes a LONG time. If you made the same mistake I did and started the command from an SSH session shortly before you need to go home, know that exiting an SSH session with a running process usually doesn’t end well. So to avoid stopping the upgrade process, let’s push the process to the background and keep it running.

Hit Ctrl-Z; the shell will show you a stopped job and its job number (most likely 1). Next, resume job 1 in the background with bg 1. Run jobs -l to list the job together with its process ID, e.g. 3122. At this point, you’ll still be getting the progress bar updates, and the job is still yours. You could now choose to run other commands, but disconnecting the SSH session will halt the job: when the session closes, the shell typically receives a hangup signal (SIGHUP) and passes it on to the jobs it still owns. So the next step is disowning the job.

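Run disown %1 or just disown. In bash, use the % prefix here: unlike bg, disown reads a bare number as a process ID rather than a job number. If you run jobs -l again, the list should be empty.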
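The Ctrl-Z, bg and disown steps can be tried out safely before you need them. The sketch below assumes bash (disown is a bash builtin, not part of plain sh) and uses sleep 30 as a hypothetical stand-in for update_iiq_datastore; the variable names are made up for the illustration.

```shell
#!/usr/bin/env bash
# Stand-in for the long-running upgrade (assumption: any slow command will do).
sleep 30 &
pid=$!

# Interactively you would press Ctrl-Z and run "bg 1" to get to this point.
jobs_before=$(jobs -l)          # the shell still owns the job
echo "before disown: ${jobs_before}"

disown "$pid"                   # interactive equivalent: disown %1

jobs_after=$(jobs -l)           # empty: the job left the shell's job table
echo "after disown: '${jobs_after}'"

# The process itself keeps running; kill -0 only tests that it exists.
if kill -0 "$pid" 2>/dev/null; then still_running=yes; else still_running=no; fi
echo "process ${pid} still running: ${still_running}"

kill "$pid"                     # clean up the stand-in job
```

The job disappears from jobs -l while the process stays alive, which is exactly the state you want before logging out.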

Screenshot: disowning the upgrade job. It is a bit messy because the progress bar keeps printing over the commands…

Run top, and you’ll still see the upgrade process consuming resources. It is now safe to exit your SSH session.

At your next convenience, connect to the server again and monitor the processes with top. If there isn’t an update_iiq_datastore process hogging the top rows anymore, it has probably finished. Alternatively, check for the process ID with ps -p <processID>: if no process is listed, the job has finished. (ps -ef | grep <processID> works too, but the grep command itself may show up in its own output.) To verify, run update_iiq_datastore again; it should report that no database upgrade is necessary anymore.
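If you would rather script that check, a minimal sketch could look like the one below. The function name is_running is hypothetical, 3122 is just the example process ID used earlier, and kill -0 assumes you log in as the same user that started the upgrade (or as root). Keep in mind that process IDs can be reused, so a match long after the fact is only a hint.

```shell
# is_running PID: succeeds while a process with that ID exists and
# you are allowed to signal it. kill -0 sends no signal at all.
is_running() {
    kill -0 "$1" 2>/dev/null
}

upgrade_pid=3122   # example ID from the text; use the one jobs -l showed you

if is_running "$upgrade_pid"; then
    echo "process ${upgrade_pid} is still running"
else
    echo "process ${upgrade_pid} is gone, the job has probably finished"
fi
```

When you only know the command name, ps -ef | grep '[u]pdate_iiq_datastore' is a common trick: the brackets keep the grep process from matching itself.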

What if I know a job will take a while to complete?

The disown command is useful if you’re in a pinch and need to run. However, there’s a downside: any output (like the progress bar in the screenshot above) will continue to spam your SSH session, and as soon as you disconnect, that output is lost because it isn’t logged anywhere. So there’s no way to really check the outcome of the process. That’s not a massive problem for this upgrade (since it’s OK to run the upgrade command again), but it is a problem for copy/move operations. So how can you plan for long-running processes?

If it’s a simple command that does not require any interactive input, you can use ‘nohup <command> &’. This runs the command in the background, immune to hangups, and appends its output to nohup.out in your current directory (or in your home directory if the current one isn’t writable).
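A small sketch of that approach, with the output sent to a log file of your own choosing instead of the default nohup.out. The sh -c '...' command is a hypothetical stand-in for a real long job, and the log file name is made up for the example.

```shell
# Hypothetical log location; pick any path you can write to.
logfile="${TMPDIR:-/tmp}/long_job.$$.log"

# Stand-in for a long, non-interactive job. stdout and stderr both go to the log.
nohup sh -c 'echo "started"; sleep 1; echo "finished"' > "$logfile" 2>&1 &
job_pid=$!
echo "job ${job_pid} is logging to ${logfile}"

# Later, possibly from a new SSH session, follow the log with:
#   tail -f "$logfile"

wait "$job_pid"     # only for this demo, so the log is complete before we print it
cat "$logfile"
```

Because the output lands in a file, you can still check the outcome after reconnecting, which is exactly what a disowned job cannot give you.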

If your command does require some interactive input, try ‘screen’. Make sure it’s installed first, then open a standard SSH session. Run the command screen. This will open a virtual terminal inside of your SSH session. Run all the commands you like, then press Ctrl+A followed by D to detach from the screen while leaving it running. It’s now safe to disconnect your SSH session. When you return, run the ‘screen -r’ command to reattach to the previous screen session.
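Giving the screen session a name makes it easier to find again. The sketch below assumes GNU screen; the session name iiq-upgrade and the sh -c '...' stand-in command are made up for the example. It starts the session already detached so that it also runs from a script, with the interactive equivalents shown as comments.

```shell
session="iiq-upgrade"   # hypothetical session name

if command -v screen > /dev/null 2>&1; then
    # Start a detached, named session running a stand-in command.
    # Interactive equivalent: screen -S iiq-upgrade, run your command,
    # then press Ctrl+A followed by D to detach.
    screen -dmS "$session" sh -c 'echo "working"; sleep 5'

    # List sessions; some screen versions return non-zero here, hence || true.
    screen -ls || true

    # After reconnecting over SSH, reattach with:
    #   screen -r iiq-upgrade
else
    echo "screen is not installed" >&2
fi
```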