arclog - Archive the Log Files Monthly
Description
arclog archives the log files monthly. It strips the log records of
previous months off the log file, and saves them to compressed
archive files named logfile.yyyymm. This saves hard disk space and
helps prevent potential attacks on log files.
Currently, arclog supports the Apache access log, Syslog, NTP,
Apache 1 SSL engine log, and my own bracketed, modified ISO date/time
log file formats, and the gzip, bzip2, and xz compression methods.
Several software projects log (or can log) in a format compatible with
the Apache access log, such as CUPS, ProFTPD, and Pure-FTPd, and
arclog can archive their Apache-like log files, too.
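The monthly split described above can be pictured with a minimal Python sketch. This is only an illustration under stated assumptions, not arclog's implementation (arclog is written in Perl): the function name split_by_month is made up, only the Apache access log timestamp is handled, and leaving unparsable lines in the source log is an assumption of this sketch.

```python
import re
from collections import defaultdict
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# Matches the bracketed timestamp of an Apache access log record,
# for example [10/Oct/2000:13:55:36 -0700].
APACHE_TIME = re.compile(
    r"\[(\d{2})/([A-Z][a-z]{2})/(\d{4}):\d{2}:\d{2}:\d{2} [+-]\d{4}\]")


def split_by_month(lines, prefix, today: date):
    """Group records of previous months by archive name; keep the rest."""
    this_month = f"{today.year:04d}{today.month:02d}"
    archives = defaultdict(list)  # archive file name -> records
    kept = []                     # records that stay in the source log
    for line in lines:
        match = APACHE_TIME.search(line)
        month = MONTHS.get(match.group(2)) if match else None
        if month is None:
            # Assumption of this sketch: unparsable lines are left alone.
            kept.append(line)
            continue
        record_month = f"{match.group(3)}{month:02d}"
        if record_month < this_month:
            # Records of previous months go to prefix.yyyymm.
            archives[f"{prefix}.{record_month}"].append(line)
        else:
            kept.append(line)
    return dict(archives), kept
```

Called with the lines of access.log, the prefix access.log, and a date in June 2001, the sketch returns the May records under access.log.200105 and keeps the June records for the next run.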
Caution
- Archival takes time. To reduce the time it occupies the source log
  file, arclog first copies the content of the source log file to a
  temporary working file and restarts the source log file. Then
  arclog can take its time working on the temporary working file.
  However, please note:
  - If you have a huge log file (several hundred MB), merely copying
    still takes a lot of time. You had better stop logging first,
    archive the log file, and then restart logging, to avoid a race
    condition in writing. If you archive the log file periodically,
    it should not grow too big.
  - If arclog stops in the middle of the execution, it leaves a
    temporary working file. The next time arclog runs, it stops when
    it sees that temporary working file. You have to process that
    temporary working file first. It is merely a copy of the
    original log file, so you can rename and archive it like an
    ordinary log file to solve this.
- Do not sort unless you have a particular reason. Sorting has the
  following potential problems:
  - Sorting may eat huge memory on large log files. The amount of
    memory required depends on the number of records in each
    archived month. Modern Linux and MS-Windows kill processes that
    eat too much memory, but that still takes minutes, and your
    system hangs in the meantime. I do not know about other
    operating systems. Try at your own risk.
  - The time unit of all recognized log formats is one second. Log
    records that happen in the same second are sorted by the log
    file order (if you are archiving several log files at a time)
    and then by the log record order. I try to ensure that the
    sorted archived records are in the correct order of the events,
    but I cannot guarantee it. Watch out if the order within a
    second is important to you.
- Be careful with Syslog and NTP log files: Syslog and NTP do not
  record the year. arclog uses Date::Parse to parse the date, which,
  when the year is missing, assumes that the date falls within the
  twelve months ending with the current month. For example, if today
  is 2001/6/8, it assumes the date is between 2000/7/1 and
  2001/6/30. This is fair. However, if you have a Syslog or NTP log
  file with records older than one year, do not use arclog. It will
  destroy your log file.
- If you read from STDIN, please note:
  - You must specify the output prefix, since what arclog needs is
    an output pathname prefix, not an output file.
  - STDIN cannot be deleted, restarted or partially kept. If you
    read from STDIN, the keep mode is always "keep all". If you
    archive several source log files including STDIN, the keep mode
    is "keep all" for all source log files, to prevent disaster.
  - The answers of the ask mode are obtained from STDIN, too. Since
    you have only one STDIN, you cannot specify the ask mode while
    reading from STDIN. It falls back to the fail mode in that case.
- I suggest that you install File::MMagic instead of counting on the
  file executable. The internal magic file of File::MMagic works
  better than the file executable. arclog treats everything not
  gzip, bzip2, or xz compressed as plain text. When a compressed log
  file is wrongly recognized as an image, arclog treats it as plain
  text, reads directly from it, and fails. This does not hurt the
  source log files, but is still annoying.
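The year assumption described in the Syslog and NTP caution above can be sketched in a few lines of Python. This only mirrors the rule as this document states it; it is not the code of Date::Parse or arclog, and the function name assumed_year is made up for the example.

```python
from datetime import date


def assumed_year(month: int, today: date) -> int:
    """Year assumed for a record whose timestamp carries no year."""
    # Months up to and including the current one are taken from this
    # year; later months are taken from last year.
    return today.year if month <= today.month else today.year - 1
```

With today at 2001/6/8, a June record is assumed to be from 2001 and a July record from 2000. A July record that was really written in 1999 would also be assumed to be from 2000, which is why log files with records older than one year must not be fed to arclog.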
System Requirement
- Perl, version 5.8.0 or above. arclog uses 3-argument open() to
  duplicate file handles, which is only supported since 5.8.0. I
  have not successfully ported this to earlier versions yet. Please
  tell me if you make it.
  You can run perl -v to check your current Perl version. If you do
  not have Perl, or if you have an older version of Perl, you can
  download and install/upgrade it from the Perl website. For
  MS-Windows, you can download and install Strawberry Perl or
  ActivePerl.
- Required Perl modules:
  - Date::Parse
    This is used to parse the timestamp of the log records. You can
    download and install Date::Parse from the CPAN archive, or
    install it with one of the following:
    - CPAN shell: cpan Date::Parse
    - CPANPLUS shell: cpanp i Date::Parse
    - Debian/Ubuntu: sudo apt install libtimedate-perl
    - Red Hat/Fedora/CentOS: sudo yum install perl-TimeDate
    - FreeBSD: ports install p5-TimeDate
    - ActivePerl: ppm install TimeDate
- Optional Perl modules:
  - File::MMagic
    This is used to check the file type. If this is not available,
    arclog tries the file executable instead. If that is not
    available, either, arclog judges the file type by its name
    suffix (extension). In that case arclog fails when reading from
    STDIN. You can download and install File::MMagic from the CPAN
    archive, or install it with one of the following:
    - CPAN shell: cpan File::MMagic
    - CPANPLUS shell: cpanp i File::MMagic
    - Debian/Ubuntu: sudo apt install libfile-mmagic-perl
    - Red Hat/Fedora/CentOS: sudo yum install perl-File-MMagic
    - FreeBSD: ports install p5-File-MMagic
    - ActivePerl: ppm install File-MMagic
    The alternative file.exe for MS-Windows can be obtained from the
    GnuWin32 home page. Be sure to save it as file.exe somewhere in
    your PATH.
  - IO::Compress::Gzip and IO::Uncompress::Gunzip
    They are used to support reading/writing gzip compressed files.
    They are only needed when gzip compressed files are encountered.
    If they are not available, arclog tries the gzip executable
    instead. If that is not available, either, arclog fails. You
    should not need to worry about IO::Compress::Gzip, since it
    comes with Perl since version 5.9.3. If not, it is contained in
    the IO-Compress distribution. You can download and install it
    from the CPAN archive, or install it with one of the following:
    - CPAN shell: cpan IO::Compress::Gzip
    - CPANPLUS shell: cpanp i IO::Compress::Gzip
    - Debian/Ubuntu: sudo apt install libio-compress-perl
    - Red Hat/Fedora/CentOS: sudo yum install perl-IO-Compress
    - FreeBSD: ports install p5-IO-Compress
    - ActivePerl: ppm install IO-Compress
    The alternative gzip.exe for MS-Windows can be obtained from the
    gzip website. Be sure to save it as gzip.exe somewhere in your
    PATH.
  - IO::Compress::Bzip2 and IO::Uncompress::Bunzip2
    They are used to support reading/writing bzip2 compressed files.
    They are only needed when bzip2 compressed files are
    encountered. If they are not available, arclog tries the bzip2
    executable instead. If that is not available, either, arclog
    fails. You should not need to worry about IO::Compress::Bzip2,
    since it comes with Perl since version 5.10.1. If not, it is
    contained in the IO-Compress distribution. You can download and
    install it from the CPAN archive, or install it with one of the
    following:
    - CPAN shell: cpan IO::Compress::Bzip2
    - CPANPLUS shell: cpanp i IO::Compress::Bzip2
    - Debian/Ubuntu: sudo apt install libio-compress-perl
    - Red Hat/Fedora/CentOS: sudo yum install perl-IO-Compress
    - FreeBSD: ports install p5-IO-Compress
    - ActivePerl: ppm install IO-Compress
    The alternative bzip2.exe for MS-Windows can be obtained from
    the bzip2 website. Be sure to save it as bzip2.exe somewhere in
    your PATH.
  - IO::Compress::Xz and IO::Uncompress::UnXz
    They are used to support reading/writing xz compressed files.
    They are only needed when xz compressed files are encountered.
    If they are not available, arclog tries the xz executable
    instead. If that is not available, either, arclog fails. They
    are contained in the IO-Compress-Lzma distribution. You can
    download and install it from the CPAN archive, or install it
    with one of the following:
    - CPAN shell: cpan IO::Compress::Xz
    - CPANPLUS shell: cpanp i IO::Compress::Xz
    - Debian/Ubuntu: sudo apt install libio-compress-lzma-perl
    - Red Hat/Fedora/CentOS: sudo yum install perl-IO-Compress-Lzma
    - FreeBSD: ports install p5-IO-Compress-Lzma
    - ActivePerl: ppm install IO-Compress-Lzma
    The alternative xz.exe for MS-Windows can be obtained from the
    XZ Utils website. Be sure to save it as xz.exe somewhere in your
    PATH.
  - Term::ReadKey
    This is used to display the progress bar. The progress bar is
    good visual feedback on what arclog is currently doing, but
    arclog is safe without it. You can download and install
    Term::ReadKey from the CPAN archive, or install it with one of
    the following:
    - CPAN shell: cpan Term::ReadKey
    - CPANPLUS shell: cpanp i Term::ReadKey
    - Debian/Ubuntu: sudo apt install libterm-readkey-perl
    - Red Hat/Fedora/CentOS: sudo yum install perl-TermReadKey
    - FreeBSD: ports install p5-Term-ReadKey
    - ActivePerl: ppm install TermReadKey
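As a rough illustration of what checking the file type by content (rather than by name suffix) means, here is a minimal Python sketch. It is not how File::MMagic, the file executable, or arclog implement the check, and the names detect_compression and open_log are made up for this example. It only relies on the well-known leading bytes of gzip, bzip2, and xz files, and treats everything else as plain text, as described in the caution above.

```python
import bz2
import gzip
import lzma

# Leading bytes ("magic numbers") of the three compressed formats.
MAGIC = (
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
)

# How to open a file for reading text, per detected method.
OPENERS = {"gzip": gzip.open, "bzip2": bz2.open, "xz": lzma.open,
           "plain": open}


def detect_compression(head: bytes) -> str:
    """Guess the compression method from the first bytes of a file."""
    for magic, method in MAGIC:
        if head.startswith(magic):
            return method
    # Everything not gzip, bzip2, or xz is treated as plain text.
    return "plain"


def open_log(path):
    """Open a log file for reading text, decompressing when needed."""
    with open(path, "rb") as raw:
        method = detect_compression(raw.read(6))
    return OPENERS[method](path, "rt")
```

Because the check looks at the content, it also works for data without a file name, which is why the name suffix fallback cannot help when reading from STDIN.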
Download
arclog is hosted on…
You can always download the newest version of arclog from…
imacat’s PGP public key is at…
Install
If you are upgrading from arclog.pl 2.1.1dev4 or earlier, please
read the upgrade instruction later in this document.
Install with ExtUtils::MakeMaker
% perl Makefile.PL
% make
% make test
% make install
When running make install, make sure you have the privilege to
write to the installation locations. This usually requires the root
privilege.
If you want to install into another location, you can set the
PREFIX. For example, to install into your home when you are not
root:
% perl Makefile.PL PREFIX=/home/jessica
Refer to the documentation of ExtUtils::MakeMaker for more
installation options (by running perldoc ExtUtils::MakeMaker).
For MS-Windows, since make is not universally available,
Module::Build is preferred to ExtUtils::MakeMaker. See the
instructions below.
Install with Module::Build
% perl Build.PL
% ./Build
% ./Build test
% ./Build install
When running ./Build install, make sure you have the privilege to
write to the installation locations. This usually requires the root
privilege.
If you want to install into another location, you can set the
--prefix. For example, to install into your home when you are not
root:
% perl Build.PL --prefix=/home/jessica
Refer to the documentation of Module::Build for more installation
options (by running perldoc Module::Build).
Upgrade Instruction
Here are a few hints for people upgrading from 2.1.1dev4 or earlier:
The Script Name is Changed from arclog.pl to arclog
This is obvious. If you have any scripts or cron jobs that are
running arclog, remember to modify your script for the new name.
Of course, you can rename arclog to arclog.pl. It still works.
The reason I changed the script and project name is that a dot (.)
in the project name is not valid everywhere. At least SourceForge
doesn’t accept it. Besides, arclog is enough for a script name under
UNIX. The .pl file name suffix/extension may be convenient on
MS-Windows, but MS-Windows users won’t run it with Explorer file name
association anyway, and there is pl2bat to convert arclog to
arclog.bat, which would make more sense. The only disadvantage was
that I was using UltraEdit, which depends on the file name extension
for the syntax highlighting rules, but I could set that manually. I’m
using gedit on Linux now, so this is not a problem anymore.
The Default Installation Location Is at /usr/bin
Also, the man page is at /usr/share/man/man1/arclog.1. This is to
follow Perl’s standard convention, and to avoid breaking
ExtUtils::MakeMaker with future versions.
When you run perl Makefile.PL or perl Build.PL, it lists the
existing old files that should be removed. Please delete them
manually.
If you saved them in other places, you have to delete them yourself.
Also, if you have any scripts or cron jobs that are running arclog,
remember to modify your script for the new arclog location. Of
course, you can copy arclog to the original location. It still
works.
The Arguments of the --keep and --override Options Are Required Now
Support for omitting the arguments of the --keep and --override
options is removed. This helps to avoid confusion between the log
file names and the option arguments.
Options
./arclog [options] logfile… [output]
./arclog [-h|-v]
- logfile
  The log file to be archived. Specify - to read from STDIN. You can
  specify multiple log files. gzip, bzip2, or xz compressed files
  are supported.
- output
  The prefix of the output files. The output files are named as
  output.yyyymm, e.g., output.200101, output.200102. If not
  specified, the default is the same as the log file. You must
  specify this if you want to read from STDIN. You cannot specify -
  (STDOUT), since arclog needs a name prefix, not the output file.
- -c, --compress method
  Specify the compression method for the archived files. Log files
  usually have a large number of similar lines. Compressing them
  saves you lots of disk space. (And this is why we want to archive
  them.) The following compression methods are supported:
  - g, gzip
    Compress with gzip. This is the default. arclog can use
    IO::Compress::Gzip to compress instead of calling gzip. This can
    be safer and faster for not calling foreign binaries. If
    IO::Compress::Gzip is not installed, it tries gzip instead. If
    gzip is not available, either, it fails.
  - b, bzip2
    Compress with bzip2. arclog can use IO::Compress::Bzip2 to
    compress instead of calling bzip2. This can be safer and faster
    for not calling foreign binaries. If IO::Compress::Bzip2 is not
    installed, it tries bzip2 instead. If bzip2 is not available,
    either, it fails.
  - x, xz
    Compress with xz. arclog can use IO::Compress::Xz to compress
    instead of calling xz. This can be safer and faster for not
    calling foreign binaries. If IO::Compress::Xz is not installed,
    it tries xz instead. If xz is not available, either, it fails.
  - n, none
    No compression at all. (Why? :p)
- --nocompress
  Do not compress the archived files. This is equivalent to
  --compress none.
- -s, --sort
  Sort the records by time (and then the record order). Sorting eats
  huge memory and CPU, so it is disabled by default. Refer to the
  caution above for the details on sorting.
- --nosort
  Do not sort the records. This is the default.
- -o, --override mode
  What to do with the existing archived files. The following modes
  are supported:
  - o, overwrite
    Overwrite existing target files. You will lose these existing
    records. Use with care. This is helpful if you are sure the main
    log file has the most complete records.
  - a, append
    Append the records to the existing target files. You may destroy
    the log file completely by accidentally mixing unrelated entries
    together. Use with care. This is helpful if you want to merge 2
    or more log files, for example, 2 log files of different
    periods.
  - i, ignore
    Ignore any existing target file, and discard all the records of
    those months. You will lose these log records. Use with care.
    This is helpful if you are supplying log records for the missing
    months, or if you are merging the log records in a complex
    manner.
  - f, fail
    Stop whenever a target file exists, to prevent destroying
    existing files by accident. This is usually what you want when
    arclog is run from some automatic mechanism, like crontab. So,
    this is the default if no terminal is found at STDIN.
  - ask
    Ask you what to do when a target file exists. This is usually
    what you want when you are running arclog interactively. So,
    this is the default if a terminal is found at STDIN. The answers
    are read from STDIN. Since you have only one STDIN, you cannot
    specify this mode if you want to read the log file from STDIN.
    In that case, it falls back to the fail mode. Also, if arclog
    cannot get its answer from STDIN, for example, on a closed STDIN
    from crontab, it falls back to the fail mode.
- -k, --keep mode
  What to keep in the source file. Currently, the following modes
  are supported:
  - a, all
    Keep the source file after records are archived.
  - r, restart
    Restart the source log file after records are archived.
  - d, delete
    Delete the source log file after records are archived.
  - t, this-month
    Archive and strip records of previous months off the log file.
    Keep the records of this month in the source log file, to be
    archived next month. This is designed to be run from crontab
    monthly, so this is the default.
- -d, --debug
  Show the detailed debugging messages. Repeat -d for more detail.
- -q, --quiet
  Hush! Only yell on error.
- -h, --help
  Display the help message and exit.
- -v, --version
  Output version information and exit.
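The ordering that --sort is described as using (time first, then the log file order, then the record order within each file) can be sketched in Python as follows. This is an illustration under stated assumptions rather than arclog's code: the function name merge_sorted is made up, and each record is assumed to be already parsed into a pair of whole seconds and the original line.

```python
def merge_sorted(files):
    """Merge the records of several log files into event order.

    files is a list with one entry per log file, in the order the
    files were given; each entry is a list of (seconds, line) pairs
    in the order they appear in that file.
    """
    tagged = [
        (seconds, file_index, record_index, line)
        for file_index, records in enumerate(files)
        for record_index, (seconds, line) in enumerate(records)
    ]
    # Records of the same second keep the log file order, and then
    # the record order within each file.
    tagged.sort(key=lambda item: item[:3])
    return [line for _, _, _, line in tagged]
```

Note that the sketch holds every record in memory before sorting, which is the same reason sorting is costly on large log files.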
Documentation
Type perldoc arclog to read the arclog manual.
News, Changes and Updates
Refer to the Changes for changes, bug fixes, updates, new functions,
etc.
Support
The arclog project is hosted on GitHub. Address your issues on the
GitHub issue tracker https://github.com/imacat/arclog/issues.
Thanks
- Thanks to Chen-hsiu Huang for reporting the bug that $WORKING_RES
  was not locked when opened.
- Thanks to SourceForge for providing a compile farm for projects to
  test on different platforms.
License
Copyright (C) 2001-2021 imacat.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
imacat ^_*'
2007/12/3
imacat@mail.imacat.idv.tw
https://www.imacat.idv.tw