AWK: Extract Logs for the Given Date(s) from a Log File

If your log file has entries like these:


2011-12-10T22:00:27.996+0000 [http-8080-1] INFO  my.package.MyClass Hello, I'm alive!
2011-12-11T17:05:46.811+0000 [http-8080-15] ERROR my.package.MyClass  - Error caught in DispatcherServlet
        at my.package.MyServiceClass(MyServiceClass.java:36)
...
2011-12-11T17:06:10.120+0000 [http-8080-14] DEBUG my.package.MyClass Whoo, that has been a long day!


Then you can use the following bash script snippet to extract the logs for a particular day (default: yesterday) or for several consecutive days. It includes everything, even lines that do not start with a date such as stack traces, from the first log entry of the start date up to, but not including, the first log entry of the end date:


LOGFILE_ORIG="$1"; LOGFILE="${LOGFILE_ORIG}.subset"
# Default to yesterday; the -d option is GNU date syntax
if [ -z "$LOGDAY" ]; then LOGDAY=$(date +%F -d "-1 days"); fi
if [ -z "$AFTERLOGDAY" ]; then AFTERLOGDAY=$(date +%F -d "$LOGDAY +1 days"); fi
echo "Extracting logs in the range (>= $LOGDAY && < $AFTERLOGDAY) into $LOGFILE ..."
# Print from the first line starting with $LOGDAY up to, but excluding, the first line starting with $AFTERLOGDAY
awk "/^$LOGDAY/,/^$AFTERLOGDAY/ {if(!/^$AFTERLOGDAY/) print}" "$LOGFILE_ORIG" > "$LOGFILE"
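
To see what the range pattern selects, here is a small self-contained sketch that runs the same awk expression on a temporary sample file. The first four log lines mirror the example entries above; the 2011-12-12 line is made up for illustration, as are the variable names sample and result.

```shell
# Build a small sample log; the last entry is a made-up line for the following day
sample=$(mktemp)
cat > "$sample" <<'EOF'
2011-12-10T22:00:27.996+0000 [http-8080-1] INFO  my.package.MyClass Hello, I'm alive!
2011-12-11T17:05:46.811+0000 [http-8080-15] ERROR my.package.MyClass  - Error caught in DispatcherServlet
        at my.package.MyServiceClass(MyServiceClass.java:36)
2011-12-11T17:06:10.120+0000 [http-8080-14] DEBUG my.package.MyClass Whoo, that has been a long day!
2011-12-12T08:00:00.000+0000 [http-8080-2] INFO  my.package.MyClass Good morning!
EOF

LOGDAY=2011-12-11
AFTERLOGDAY=2011-12-12

# The range opens at the first line starting with $LOGDAY and closes at the
# first line starting with $AFTERLOGDAY; the inner test drops that closing line
result=$(awk "/^$LOGDAY/,/^$AFTERLOGDAY/ {if(!/^$AFTERLOGDAY/) print}" "$sample")

# Prints the ERROR line, its stack trace line and the DEBUG line
printf '%s\n' "$result"
rm -f "$sample"
```

The stack trace line is kept because awk prints every line while the range is open, whether or not it starts with a date.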


This date syntax works on Linux, where date is GNU date; the -d option behaves differently on other systems. Date is very flexible and can produce dates in many other formats, not only yyyy-mm-dd. You may also want to read more about Awk ranges and other tips.
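
A few date calls illustrate this, assuming GNU date from coreutils. The variable names and the dd/Mon/yyyy format are only examples, not something the script above uses.

```shell
# Assumes GNU date; BSD/macOS date does not accept these -d strings
day=2011-12-31

# Day arithmetic crosses month and year boundaries correctly
next_day=$(date +%F -d "$day +1 days")
echo "$next_day"     # 2012-01-01

# The same day in another layout; LC_ALL=C keeps the month name in English
alt_format=$(LC_ALL=C date -d "$day" +%d/%b/%Y)
echo "$alt_format"   # 31/Dec/2011
```

If your log lines start with a different date layout, you would change the +%F format in the script to match it.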

You would run it in one of the following ways:


$ ./analysis.sh /path/to/logfile.log
$ LOGDAY=2011-12-12 AFTERLOGDAY=2011-12-17 ./analysis.sh /path/to/logfile.log
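
One caveat with the second invocation: if the log happens to contain no line starting with the AFTERLOGDAY value, the awk range never closes and everything up to the end of the file is extracted. A possible alternative, sketched here with a hypothetical extract_days function that is not part of the original script, compares the yyyy-mm-dd prefix of each dated line as a string instead of relying on a range. It assumes the dates sort lexicographically, which holds for this format.

```shell
# Hypothetical helper: $1 = first day (inclusive), $2 = end day (exclusive), $3 = log file
extract_days() {
  awk -v from="$1" -v to="$2" '
    # A line starting with yyyy-mm-dd decides whether we are inside the wanted days
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ {
      day = substr($0, 1, 10)
      keep = (day >= from && day < to)
    }
    # Lines without a date (stack traces) inherit the decision of the last dated line
    keep
  ' "$3"
}

# Sample with a gap: nothing was logged on 2011-12-12 (the last entry is made up)
sample=$(mktemp)
cat > "$sample" <<'EOF'
2011-12-11T17:05:46.811+0000 [http-8080-15] ERROR my.package.MyClass  - Error caught in DispatcherServlet
        at my.package.MyServiceClass(MyServiceClass.java:36)
2011-12-13T09:00:00.000+0000 [http-8080-3] INFO  my.package.MyClass Made-up entry two days later
EOF

# Prints only the ERROR line and its stack trace line
subset=$(extract_days 2011-12-11 2011-12-12 "$sample")
printf '%s\n' "$subset"
rm -f "$sample"
```

With the range pattern, the same input would also return the 2011-12-13 entry, because no line starting with 2011-12-12 ever closes the range.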

Tags: DevOps analysis


Copyright © 2024 Jakub Holý