Log Rotation for a Good Night’s Sleep

Getting called out in the middle of the night because a filesystem has filled up is one of the dumbest things that can happen to you. Dumb, because it’s preventable, and because you shouldn’t be the one doing the housework. It’s a computer; its whole purpose is to do the work for you.

The logrotate utility was created to monitor, archive and delete log files so you don’t have to. It’s an absolutely vital tool with which, in theory, a Linux host could run forever without maintenance, and it’s included in the base install of every major Linux distribution.

The key thing you need to know is that logrotate is run by the cron daemon, via the wrapper script located at:

   /etc/cron.daily/logrotate

And each logfile (or set of logfiles) which is to be monitored and archived has its own logrotate configuration file under:

   /etc/logrotate.d/
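
The wrapper itself is typically little more than a call to the logrotate binary against the main configuration file, /etc/logrotate.conf, which in turn includes everything under /etc/logrotate.d/. As a sketch, on a Red Hat-style system it looks something like this (exact contents vary by distribution):

    #!/bin/sh
    # Run logrotate against the main config, which pulls in /etc/logrotate.d/*
    /usr/sbin/logrotate /etc/logrotate.conf
    EXITVALUE=$?
    if [ $EXITVALUE != 0 ]; then
        /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
    fi
    exit 0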

Every time cron executes logrotate, it checks each monitored logfile against the conditions in its logrotate.d configuration and, if rotation is due, moves the file aside with a number (or date) appended as an extension, typically leaving a fresh, empty logfile in its place.
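
As an illustration, after a couple of rotations with the default numbered extensions, a monitored logfile might look like this (the path is just an example):

    # ls /var/log/messages*
    /var/log/messages  /var/log/messages.1  /var/log/messages.2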

Most well-written programs output to a logfile. To do so, the process holds the filehandle open so that it can write to the file continuously. But for write efficiency, Linux systems use a buffer cache, meaning output is not written immediately to the filesystem but is buffered in memory and written out in chunks. During log rotation, if the buffers haven’t been flushed when the logfile is rotated (that is, moved), data can be lost. With databases, which make even greater use of memory for speed and internal consistency, this is even more important.

The man page for logrotate is extensive and well written, but for the purposes of this post I’m going to give three real-world examples, each illustrating a different challenge you’ll face when rotating log files.

Apache HTTPD

The Apache webserver is a good candidate for logrotate because its usual mode of operation is to record every incoming connection, so log files can grow large quickly on a busy server. Its logrotate configuration is a simple one, because HTTPD can be reloaded quickly and cleanly, making it easy to refresh its log files. The following is the configuration file:
/etc/logrotate.d/httpd

/var/sky/logs/apache/*log {
    missingok
    sharedscripts
    daily
    rotate 31
    postrotate
       /sbin/service httpd reload > /dev/null 2>/dev/null || true
    endscript
}

In short, this performs the rotation at most once a day and retains 31 old logfiles. Thanks to the sharedscripts keyword, the postrotate/endscript block is executed only once per rotation (rather than once per matching file), and it reloads the daemon so that Apache reopens a fresh logfile.
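
Before leaving a new configuration to cron, it’s worth exercising it by hand. logrotate’s -d flag performs a dry run, printing what would be done without touching anything, and -f forces an immediate rotation:

    # logrotate -d /etc/logrotate.d/httpd
    # logrotate -f /etc/logrotate.d/httpd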

Apache Tomcat

Tomcat, on the other hand, doesn’t have a “reload” option for the daemon, as it spawns a separate JVM on startup. What’s more, Tomcat can take some time to restart, and often isn’t particularly reliable at shutting down either. This therefore requires something more brute-force and less elegant, but it’s a necessary trade-off. Here is the configuration file:
/etc/logrotate.d/tomcat

/var/sky/tomcat/logs/catalina.out {
    copytruncate
    daily
    rotate 7
    compress
    missingok
    size 5M
    dateext
}

In order to overcome the problem of not being able to easily refresh the daemon and begin a new log file, the copytruncate directive is used. This copies the log file aside and then truncates the original in place (the equivalent of cat /dev/null > logfile), preserving the open file handle while zeroing the file. The process continues writing to the logfile as if nothing had happened. The drawback of this method is that some log output can be lost, both in the window between the copy and the truncate and through unflushed buffers; that risk needs to be weighed against the benefits, which are considerable.
Also note in the previous example that catalina.out is checked daily but only rotated once it reaches 5 MB in size; the size directive takes precedence over the time interval, and since cron runs logrotate daily, the effect is a daily check against a size threshold. The dateext directive appends the date to the rotated filename, by default in the form -YYYYMMDD, i.e. catalina.out-20120325, rather than just an incremented integer.
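
To picture what copytruncate does, it is roughly the manual sequence below; the important point is that the original file, and therefore Tomcat’s open file handle, is never moved or replaced (the filenames just follow the example above):

    # cp /var/sky/tomcat/logs/catalina.out /var/sky/tomcat/logs/catalina.out-20120325
    # cat /dev/null > /var/sky/tomcat/logs/catalina.out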

MySQL

The final example is a log rotation configuration for the MySQL database daemon. As databases often do, MySQL caches its logs in memory, and if full query logging is enabled these can be considerable. The logs can only be flushed to the filesystem by sending a command to the MySQL daemon:

    # /usr/bin/mysqladmin flush-logs

However, this command requires MySQL’s root credentials and will prompt for a password, and logrotate, being invoked by cron, cannot answer an interactive prompt. So, to pass login credentials to MySQL noninteractively (so the automated logrotate run can authenticate), create the following file with these contents:

/root/.my.cnf

[mysqladmin]
user = root
password = changeme

(Change the password to match the root password of your database instance.)
Then secure this file so that only root can read it, like so:

   # chmod 600 /root/.my.cnf
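
You can confirm that noninteractive authentication is working with a harmless command such as:

    # /usr/bin/mysqladmin status

If this prints the server status without prompting for a password, the credentials file is being picked up.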

This allows us to configure logrotate to work with MySQL. The following logrotate configuration file will do the trick:

/etc/logrotate.d/mysql

/var/log/mysql/*.log {
    create 644 mysql mysql
    notifempty
    daily
    rotate 5
    missingok
    nocompress
    sharedscripts
    postrotate
        if [ `pgrep -n mysqld` ]; then
          /usr/bin/mysqladmin flush-logs
        fi
    endscript
}

Again, the man page will explain all of these options, but the create directive is the important one here. MySQL requires that a log file already exist before it will write to it, and the daemon won’t recreate one on its own, so the create directive makes a new, empty logfile with the specified ownership and permissions after each rotation. The other thing to look at is the postrotate script: before the mysqladmin command is run, logrotate checks that the mysqld daemon is actually running, since the command would otherwise produce an error.

Postscript: Hunting the Disk Hog

As an afterthought, diagnosing the cause of a filesystem filling up can sometimes be a tricky task, particularly if you’re not completely familiar with the system. It’s a matter of locating the culprit file.
The easiest way to do this is, of course, with the find command. There are two assumptions you can make: the file will have been written to very recently, and it is probably over 50 MB in size. The second assumption is an interesting one. Linux systems have very few files larger than 50 MB unless they are database files, backup dumps or log files, which narrows the list of suspects.
Thus, if the root filesystem is filling up, the following command will search it for files modified in the last hour and larger than 50 MB, restricting itself to the / mountpoint rather than drilling down into any other filesystems that might be mounted under it:

# find / -xdev -size +50M -mmin -60
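
If the culprit was last written to longer ago, or the size guess is wrong, a du sweep of the same filesystem will surface the largest directories instead (the -x flag keeps du on one filesystem, like find’s -xdev; sort -h assumes GNU coreutils):

    # du -xh / | sort -rh | head -n 20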

It should of course be self-evident that writing log files to the root filesystem is a really bad idea, but third-party programs sometimes use /tmp or /usr/local/var, which can cause the root filesystem to fill up. Once the file is discovered, it may be possible to infer which process uses it from the location or filename. If not, use the lsof command to find out which process has the file open, e.g.:

# lsof /var/log/syslog
COMMAND    PID   USER  FD  TYPE DEVICE SIZE/OFF   NODE NAME
rsyslogd   948 syslog  1w   REG    8,4    50143 783384 /var/log/syslog
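
One related trap worth knowing about: a log file that has been deleted while a process still holds it open continues to consume disk space, invisibly to du, until the process closes it. lsof can find these too; +L1 lists open files with a link count below one, i.e. deleted but still open:

    # lsof +L1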

Once you know which file is bloating, and which process is feeding it, you can configure logrotate accordingly to keep the logfile in check.

So if logrotate isn’t yet implemented on your systems, get in and set it up everywhere, before you get that unnecessary 2 a.m. booty call from your webserver.

Matt Parsons

Matt is a freelance Linux consultant who’s been a Sysadmin since a time when the job required touch-typing. He believes this whole computer thing to be a fad that people will soon tire of.

He blogs about quick and dirty Unix fixes at www.terminalinflection.com


2 Responses to Log Rotation for a Good Night’s Sleep

  1. Good article! There is some more information, not related to logrotate, that people should know.

    The BEST solution to this problem comes when the system is first set up. Rotation happens when cron runs logrotate, but what about a sudden change in log levels? A spike in logging or a DDoS could fill up the drive between cron invocations of logrotate.

    On systems that will be busy, make sure to configure multiple filesystem volumes. LVM is a great tool for this, but you can use partitions, especially a GUID partition table (GPT) or a disklabel (BSD).

    There is more than one good reason to split a system into partitions: setting noatime on filesystems where it’s not needed is a good performance enhancement, as are other filesystem options and tunables important for different branches of the Unix namespace. Plus, putting /var/log on its own filesystem nearly entirely prevents this issue from waking you up at night, especially when combined with a remote log server; it becomes an issue you can sleep through, or even leave till Monday. There are plenty more reasons to do this than I could list. /usr, /var and /home are all good candidates, as are /usr/local, /var/cache and /opt. On modern systems /var/run is moved to /run and made into a tmpfs.

    Another GOOD tip is that [1]Debian Policy dictates that log files MUST be rotated, so this should not be an issue on any decent operating system.

    1. http://www.debian.org/doc/debian-policy/ch-files.html#s10.8

    Nitpick: logrotate is not a daemon. At best it’s a process or cron job, although program, application, and cron *script all might apply.

    * It’s a compiled binary, not a script, on my system.

  2. That was a good point about logrotate not being a daemon. That was a really silly mistake on my part that I should have turned up in proofreading.

    Thanks for the discussion about segregation of filesystems. It highlights an issue which is surprisingly often overlooked.

    But I do think that even if you have segregated your dynamic log files to a separate filesystem, there will always be a need for some form of automated maintenance, like logrotate, just to keep things in order. And if your /var/log fills up, that’s still going to prevent applications from logging and your monitoring may still throw you an alert of “No space left on device”.

    Having said that, archiving and remote shipping of logs are further issues I didn’t touch upon in this post either, but they’re also increasingly important in maintaining systems.

    Thanks for reading, and taking the time to reply.
