AIX System Log Study Notes, Part 1
Source: 岁月联盟
Date: 2012-02-18
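errdemon is an AIX daemon. It continuously monitors the /dev/error device file for new entries, compares each one with the system error record templates, and writes the resulting error information to the system error log.
The errdemon daemon starts automatically at system boot. It can also be started manually:
# /usr/lib/errdemon
To stop the errdemon daemon:
# /usr/lib/errstop
To check whether the daemon is running:
# ps -ef | grep errdemon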
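The check above can be wrapped in a small shell function. The following is a minimal sketch and not part of the original notes: the function name errdemon_running is hypothetical, and it reads a process listing from standard input so that it can also be tried on a saved listing. It looks for the /usr/lib/errdemon path given above and uses a bracket pattern so that the grep process itself is never counted.

```shell
#!/bin/sh
# errdemon_running: hypothetical helper, not an AIX command.
# Reads a process listing (for example from `ps -ef`) on stdin and
# succeeds if any line mentions /usr/lib/errdemon.
errdemon_running() {
    # The bracket pattern cannot match grep's own command line:
    # that line contains "[e]rrdemon", not "errdemon".
    grep '/usr/lib/[e]rrdemon' >/dev/null
}

# Usage on a live system:
#   ps -ef | errdemon_running && echo "errdemon is running"
```

Reading from standard input keeps the function free of side effects; it only inspects text and never starts or stops the daemon.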
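The AIX error log is stored in /var/adm/ras/errlog.
The following command shows the location of the error log file, its maximum size, and the size of the error buffer:
/usr/lib/errdemon -l
The following command changes the maximum size of the log file (in bytes):
/usr/lib/errdemon -s 2097153
To set the size of the error log buffer (in bytes):
/usr/lib/errdemon -B 16384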
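Both flags take byte counts, so a little arithmetic helps when picking values. The sketch below is only an illustration: the helper names mb_to_bytes and round_up_to_page are hypothetical, and the rounding to 4096-byte pages is an assumption about how errdemon treats -B values, not something stated in these notes. The functions only print numbers and never call errdemon.

```shell
#!/bin/sh
# Hypothetical helpers for choosing errdemon -s / -B values.

# Convert whole megabytes to bytes.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Round a byte count up to a multiple of 4096.
# Assumption: errdemon rounds -B values up to the 4 KB page size.
round_up_to_page() {
    echo $(( ($1 + 4095) / 4096 * 4096 ))
}

mb_to_bytes 2            # prints 2097152
round_up_to_page 16384   # prints 16384 (already page-aligned)
round_up_to_page 10000   # prints 12288
```

On a live system the result could be passed straight to the daemon, for example /usr/lib/errdemon -s "$(mb_to_bytes 2)". Note that 2 MB is 2097152 bytes, so the value 2097153 used above is one byte larger than 2 MB.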
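Once AIX has recorded the errors, the errpt command is used to view the error log. Another diagnostic command, diag, diagnoses and analyzes hardware errors, whereas errpt only prints the logged errors.
1. The errpt command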
# errpt --h
errpt: Not a recognized flag: -
Usage: errpt -@ wpar_name -actgDP -s startdate -e enddate
       -N resource_name_list -S resource_class_list -R resource_type_list
       -T err_type_list -d err_class_list -j id_list -k id_list
       -J label_list -K label_list -l seq_no_list -F flags_list
       -m machine_id -n node_id -i filename -y filename -z filename
       -I filename
Process error log entries from the supplied file(s).
-i filename   Uses the error log file specified by the filename parameter.
-y filename   Uses the error record template file specified by the filename
              parameter.
-z filename   Uses the error logging message catalog specified by the filename
              parameter.
-I filename   Uses the diagnostics error log specified by the filename
              parameter.
Output formatted error log entries sorted chronologically.
Note: -a shows the detailed information for every error log entry.
-a   Print a detailed listing. Default is a summary listing.
-c   Concurrent mode. Display error log entries as they arrive.
-t   Print error templates instead of error log entries.
-g   Output raw ascii error record structures.
-D   Consolidate duplicate errors.
-P   Show only duplicates from the error device driver.
Error log entry qualifiers:
-@ wpar_name    Select entries for the wpar name.
Note: the next two flags set the start and end dates.
-s startdate    Select entries posted later than date. (MMddhhmmyy)
-e enddate      Select entries posted earlier than date. (MMddhhmmyy)
-N list         Select resource_names in 'list'.
-S list         Select resource_classes in 'list'.
-R list         Select resource_types in 'list'.
-T list         Select types in 'list'.
-d list         Select classes in 'list'.
Note: the next flag selects entries by error ID.
-j list         Select ids in 'list'.
-k list         Select ids NOT in 'list'.
-J list         Select labels in 'list'.
-K list         Select labels NOT in 'list'.
-l list         Select sequence_numbers in 'list'.
-F list         Select templates according to the value of the
                Alert, Log, or Report field.
-m machine_id   Select entries for the machine id as output by uname -m.
-n node_id      Select entries for the node id as output by uname -n.
'list' is a list of entries separated by commas.
Error severity (error type):
error_type = PERM,TEMP,PERF,PEND,UNKN,INFO
Error class:
error_class = H (HARDWARE), S (SOFTWARE), O (errlogger MESSAGES), U (UNDETERMINED)
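The -s and -e flags expect dates in the MMddhhmmyy form, which is easy to get wrong by hand. A minimal sketch of building such a value follows. The function name errpt_stamp is hypothetical, and its arguments are plain integers: month, day, hour, minute, and a four-digit year.

```shell
#!/bin/sh
# errpt_stamp: hypothetical helper that formats a date as MMddhhmmyy.
# Arguments: month day hour minute year (integers without leading
# zeros, since some shells read a leading zero as octal).
errpt_stamp() {
    printf '%02d%02d%02d%02d%02d\n' "$1" "$2" "$3" "$4" "$(( $5 % 100 ))"
}

errpt_stamp 2 18 0 0 2012    # prints 0218000012

# The current time in the same layout, using standard date(1) fields:
date +%m%d%H%M%y

# Usage on a live system (entries logged after 18 Feb 2012, 00:00):
#   errpt -s "$(errpt_stamp 2 18 0 0 2012)"
```

Keeping the formatting in one place avoids the most common mistake with this layout, which is putting the two-digit year first instead of last.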
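Commonly used commands:
1. List a short summary of the errors
errpt | more
2. List all hardware errors
errpt -d H
3. List all software errors
errpt -d S
4. List detailed error information
errpt -a
5. Query by error ID
errpt -aj ERROR_ID
6. List permanent hardware errors
errpt -T PERM -d H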
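Filters can be combined, as in the last command. The following sketch checks a class and a type against the values listed in the errpt usage text and then prints the command line instead of running it. The function name errpt_filter_cmd is hypothetical; only the -d and -T flags from the usage text are used.

```shell
#!/bin/sh
# errpt_filter_cmd: hypothetical helper that prints (does not run)
# an errpt command for one error class and one error type.
errpt_filter_cmd() {
    eclass=$1
    etype=$2
    # Accept only the class letters listed in the usage text.
    case "$eclass" in
        H|S|O|U) ;;
        *) echo "unknown error class: $eclass" >&2; return 1 ;;
    esac
    # Accept only the type names listed in the usage text.
    case "$etype" in
        PERM|TEMP|PERF|PEND|UNKN|INFO) ;;
        *) echo "unknown error type: $etype" >&2; return 1 ;;
    esac
    echo "errpt -T $etype -d $eclass"
}

errpt_filter_cmd H PERM    # prints: errpt -T PERM -d H
errpt_filter_cmd S TEMP    # prints: errpt -T TEMP -d S
```

Printing the command first makes it easy to review before running it on a real system.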
2. Managing the error log
errclear: deletes entries from the error log
errstop / errdemon: stops / starts the error logging daemon
# errclear
0315-136 Number of days is required, and must be zero or greater.
Usage:
errclear -@ wpar_name -J err_label_list -K err_label_list -N resource_name_list
         -R resource_type_list -S resource_class_list -T err_type_list
         -d err_class_list -i filename -m machine_id -n node_id
         -j id_list -k id_list -l seq_no_list -y filename number_of_days
Delete error log entries in the specified list that are older than
number_of_days specified. Number_of_days refers to the number of twenty
four hour periods from command invocation time.
-@ wpar_name    Delete entries for the wpar name.
-J list         Select only error_labels in 'list'.
-K list         Select only error_labels not in 'list'.
-N list         Select only resource_names in 'list'.
-S list         Select only resource_classes in 'list'.
-R list         Select only resource_types in 'list'.
-T list         Select only error_types in 'list'.
-d list         Select only error_classes in 'list'.
-i filename     Uses the error log file specified by the filename parameter.
-j list         Select only error_ids in 'list'.
-k list         Select only error_ids not in 'list'.
-l list         Select sequence_numbers in 'list'.
-m machine_id   Delete entries for the machine id as output by uname -m.
-n node_id      Delete entries for the node id as output by uname -n.
-y filename     Uses the error record template file specified by the filename
                parameter.
'list' is a list of entries separated by commas.
error_type = PERM,TEMP,PERF,PEND,UNKN,INFO
error_class = H (HARDWARE), S (SOFTWARE), O (errlogger MESSAGES), U (UNDETERMINED)
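Commonly used errclear commands
To delete all entries from the error log, enter:
errclear 0
To delete all software-class entries from the error log:
errclear -d S 0
To delete all hardware-class entries from the error log:
errclear -d H 0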
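errclear refuses to run without a day count, as the 0315-136 message above shows, and what it deletes cannot be recovered. A cautious way to script it is to validate the arguments and print the command for review first. The sketch below does only that; the function name errclear_cmd is hypothetical and nothing is deleted.

```shell
#!/bin/sh
# errclear_cmd: hypothetical helper that prints (does not run) an
# errclear command. Arguments: number_of_days [error_class]
errclear_cmd() {
    days=$1
    eclass=${2:-}
    # number_of_days must be an integer that is zero or greater.
    case "$days" in
        ''|*[!0-9]*)
            echo "number of days must be zero or greater" >&2
            return 1 ;;
    esac
    if [ -n "$eclass" ]; then
        echo "errclear -d $eclass $days"
    else
        echo "errclear $days"
    fi
}

errclear_cmd 0        # prints: errclear 0
errclear_cmd 30 S     # prints: errclear -d S 30
```

A value such as 30 keeps the most recent 30 days of entries, since only entries older than number_of_days are deleted.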
Excerpted from wolf