Understanding File Reader connector framework
Farid Merchant, Dipali Shah
Technical Solution Consultants
Agenda
- Log formats and connector types
- File Reader thread and persistence
- Log rotation types
- Folder follower operation
- Common issues and customer cases
- Best practices
Log formats and connector types
Log file formats and parsers
- CSV data: Delimited Data Parser
  2200100C3A3A, 5, 2, SZA0002, Administrator, EICAR Test String, C:\eicar.com,5,1,1,256,4214852,,0,,0,,222822400, 11101,0,1,0,0,0,0,,0,2,4,0,MOTOSOC
- Free-form data: Regex Parser
  Oct 21 07:43:48 172.16.0.254 aaa[452]: <125022> <WARN> aaa Authentications failed for User admin, Logged in from 72.16.0.87 port 20817
- Key-value pairs: Key Value Parser
  $IfNo=0, $ruleid=1048909, $rulename=icmp PING *NIX, $ori=built-in, $cat=others, $srcip=192.168.11.103, $dstip=192.168.168.254
- CEF: CEF Parser
  CEF:0|Reconnex|iguard|2.1|0_any_mail|0_any_mail|Medium|cs1=1:admin\content Traffic cs1label=policies cn1=1 cn1label=matchcount src=1.1.1.1 dst=4.4.4.4 spt=34817 dpt=25
- XML: XQuery Parser
  <test status="not-vulnerable" id="generic-icmp-netmask:"> <Paragraph>No Response</Paragraph> </test>
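To make the key-value format on the slide above concrete, here is a minimal Python sketch that splits the sample line into fields. It is not the connector's Key Value Parser; the regular expression and the function name `parse_key_value` are assumptions made for illustration only.

```python
import re

# Assumption: every pair looks like "$key=value" and pairs are separated by ", ".
# A value runs until the next ", $" or the end of the line.
PAIR_RE = re.compile(r"\$(\w+)=(.*?)(?=,\s*\$|$)")


def parse_key_value(line):
    """Return a dict of field name -> value for one key-value log line."""
    return {key: value.strip() for key, value in PAIR_RE.findall(line)}


sample = (
    "$IfNo=0, $ruleid=1048909, $rulename=icmp PING *NIX, $ori=built-in, "
    "$cat=others, $srcip=192.168.11.103, $dstip=192.168.168.254"
)
fields = parse_key_value(sample)
print(fields["rulename"], fields["srcip"])
```

Values that contain spaces, such as the rule name, stay intact because the split happens only at the `, $` boundary.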
File Reader and folder follower connector types
- Single File Reader Connectors
  - Realtime
  - Example: DHCP Connector
- Single Folder Follower Connectors
  - Batch Mode
  - Example: ISS Connector
- Multiple File Reader Connectors
  - Realtime
  - Example: Blue Coat Multiple Server Connector
- Multiple Folder Follower Connectors
  - Batch Mode, Realtime
  - Example: Oracle SYSDBA Audit Multiple Folder Connector
File Reader thread and persistence
File Reader thread
Features
- Can read the file from the beginning or from the end (startatend)
- Allows configuring any Java-supported character encoding for the log file (encoding)
- Automatically detects ZIP or GZIP file formats and uncompresses them before processing. This works only in batch mode file processing
- Allows using the Non Locking Windows File Reader on the Windows platform so that devices can rotate the log files (usenonlockingwindowsfilereader)
- Detects loss of network connection when the log file is remote and recovers automatically
- Remembers the file reading state and resumes processing from that state when the connector is restarted (only when preservestate is enabled)
Note: This applies to all log file types except XML files
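The slide above does not say how the File Reader recognizes compressed files. One common approach, shown here purely as an assumption, is to look at the first bytes of the file: GZIP streams start with the bytes 1f 8b, and ZIP archives that contain at least one entry start with "PK" followed by 03 04. The names `detect_compression` and `open_log` are hypothetical.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of every gzip stream
ZIP_MAGIC = b"PK\x03\x04"  # local file header of a non-empty zip archive


def detect_compression(first_bytes):
    """Classify a file as 'gzip', 'zip' or 'plain' from its leading bytes."""
    if first_bytes.startswith(GZIP_MAGIC):
        return "gzip"
    if first_bytes.startswith(ZIP_MAGIC):
        return "zip"
    return "plain"


def open_log(path, encoding="utf-8"):
    """Open a plain or gzip-compressed log file as text (zip is not handled here)."""
    with open(path, "rb") as raw:
        kind = detect_compression(raw.read(4))
    if kind == "gzip":
        return gzip.open(path, "rt", encoding=encoding)
    if kind == "zip":
        raise NotImplementedError("zip archives need an entry to be selected first")
    return open(path, "r", encoding=encoding)
```

A compressed file can only be decoded once it has been written completely, which fits the note that this feature works only in batch mode.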
File Reader persistence
- The File Reader persistence state consists of three components
  - Byte offset: the byte position in the file where reading last stopped
  - Char offset: the character position in the file where reading last stopped
  - Remnant: any portion of a line left unprocessed in the buffer
- Is stored in a file under user/agent/agentdata
- Is named after the log file path and the agent ID of the connector
- Is enabled by setting the connector parameter preservestate to true
- Saving the state happens in background threads
- Other parameters that govern the saving of the persisted state; a save is triggered by whichever limit is reached first
  - preservestatecount=10: after how many save calls made by the code the state is saved
  - preservestateinterval=30: after how many seconds the state is saved
- The state is also saved on a graceful shutdown of the connector
- On restart, the connector uses the saved state to resume file processing
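The "whichever happens first" rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the connector's code: the JSON file layout, the class names `ReaderState` and `StateSaver`, and the injectable clock are all hypothetical, and the real connector saves in background threads.

```python
import json
import os
import time
from dataclasses import asdict, dataclass


@dataclass
class ReaderState:
    byte_offset: int  # byte position where reading stopped
    char_offset: int  # character position where reading stopped
    remnant: str      # unprocessed tail of the last buffer


class StateSaver:
    """Persists the state after `count` save requests or `interval_s` seconds."""

    def __init__(self, path, count=10, interval_s=30.0, clock=time.monotonic):
        self.path = path
        self.count = count
        self.interval_s = interval_s
        self._clock = clock
        self._calls = 0
        self._last_save = clock()

    def request_save(self, state):
        """Count one save request; write to disk only when a limit is reached."""
        self._calls += 1
        due = (self._calls >= self.count
               or self._clock() - self._last_save >= self.interval_s)
        if due:
            self.save(state)
        return due

    def save(self, state):
        """Write the state unconditionally, for example on graceful shutdown."""
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(asdict(state), handle)
        self._calls = 0
        self._last_save = self._clock()


def load_state(path):
    """Return the saved state, or None when no state file exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as handle:
        return ReaderState(**json.load(handle))
```

On restart, a reader would call `load_state`, seek to `byte_offset`, and prepend `remnant` to the next buffer it reads.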
Log rotation types
Log rotation types
Understanding the concept of log rotation
- Case 1: The device writes to File 1. HP ArcSight creates a new thread and reads File 1 from the device
- Case 2: The device has completed writing to File 1 and created a new File 2. HP ArcSight terminates the current thread and starts a new thread
Log rotation types
- Name Following Rotation
  - Connector follows the same file name
  - Ex. Apache HTTP Server Access File Connector
- Daily Rotation (includes hourly and monthly rotations)
  - Connector follows the file name that has today's timestamp (both time and date) in it
  - Ex. Microsoft IIS File Connector
- Indexed Rotation
  - Connector follows the file that has the latest index
  - Ex. Enterasys Dragon Export Tool File Connector
- Other types exist in individual connectors but are not implemented in the framework
  - The Blue Coat connector sorts the files in a folder by timestamp and reads the next file in timestamp order
Name following log rotation
Operation
- Device renames the current log file (xyz.log => xyz1.log)
- Device starts writing to a new empty file with the same name (xyz.log)
- Connector detects the rotation by the sudden drop in the size of the file and takes the following actions
  - Terminates the current File Reader thread after the old xyz.log is completely processed
  - Launches a new File Reader thread to read the new file
How to enable this rotation?
- Connector parameter followexternalrotation should be set to true
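The size-drop check described above can be sketched as follows. This is a simplified illustration, not the connector's detection logic; the class name `SizeDropDetector` and the `demo` helper are hypothetical, and the file names follow the xyz.log example from the slide.

```python
import os
import tempfile


class SizeDropDetector:
    """Flags a rotation when the followed file is smaller than it was last time."""

    def __init__(self, path):
        self.path = path
        self.last_size = 0

    def rotated(self):
        size = os.path.getsize(self.path)
        dropped = size < self.last_size
        self.last_size = size
        return dropped


def demo():
    """Simulate a device that appends to xyz.log and then rotates it."""
    folder = tempfile.mkdtemp()
    path = os.path.join(folder, "xyz.log")
    with open(path, "w") as log:
        log.write("event\n" * 100)
    detector = SizeDropDetector(path)
    results = [detector.rotated()]            # first look, nothing to compare
    with open(path, "a") as log:
        log.write("event\n")
    results.append(detector.rotated())        # file grew, no rotation
    os.rename(path, os.path.join(folder, "xyz1.log"))
    with open(path, "w") as log:
        log.write("event\n")
    results.append(detector.rotated())        # new, smaller xyz.log
    return results


print(demo())
```

A check based only on size can miss a rotation if the new file grows past the old size between two checks, which may be one reason an alternate detection logic exists.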
Daily log rotation
Operation
- Device writes to a different log file every day, and the log file name has the timestamp in it (xyz_<timestamp>.log)
- Connector reads the log file for the day and continues to read the same file on the following day until a new file with that day's timestamp appears. When that happens, it takes the following actions
  - Terminates the current File Reader thread after the old log file is completely processed
  - Launches a new File Reader thread for the new log file
How to enable the feature?
- For single File Reader connector
  - rotationscheme = Daily, rotationschemeparams = <dateformat>
  - Example: Dhcp_,yyyyMMdd,log
- For multi File Reader connector
  - The log file name should give the date pattern in SimpleDateFormat notation, with the non-date portions enclosed in single quotes
  - Example: '/var/log/dhcp_'yyyyMMdd'.log'
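The "keep reading yesterday's file until today's file appears" behavior can be sketched as below. The file naming used here (prefix, then yyyyMMdd, then a dot and the extension) is an assumption drawn from the Dhcp_,yyyyMMdd,log example; the connector's actual name construction is not shown in the slides, and both function names are hypothetical.

```python
from datetime import date


def daily_name(prefix, day, extension):
    """Build a dated file name. Assumed layout: <prefix><yyyyMMdd>.<extension>."""
    return f"{prefix}{day:%Y%m%d}.{extension}"


def file_to_follow(current, existing, today, prefix="Dhcp_", extension="log"):
    """Switch to today's file only once it exists; otherwise keep the current one."""
    candidate = daily_name(prefix, today, extension)
    return candidate if candidate in existing else current


current = daily_name("Dhcp_", date(2014, 3, 18), "log")

# Day 2, device has not created the new file yet: keep following the old one.
print(file_to_follow(current, {current}, date(2014, 3, 19)))

# Day 2, new file has appeared: follow it.
print(file_to_follow(current, {current, "Dhcp_20140319.log"}, date(2014, 3, 19)))
```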
Indexed rotation
Operation
- Device writes to an indexed log file. Example: Dhcp.log.001, Dhcp.log.002, and so on
- Upon startup, the connector reads the log file with the highest index and continues to read the same file until a new file with the current index incremented by 1 appears. When that happens, it takes the following actions
  - Terminates the current File Reader thread after the old log file is completely processed
  - Launches a new File Reader thread for the new log file
How to enable the feature?
- For single File Reader connector
  - rotationscheme = Index, rotationschemeparams = %0Nd,Min,Max
  - N is the number of digits in the index; shorter indexes are padded with leading zeros
  - Min and Max define the allowed range for the index. On reaching Max, the next index will be Min
- For multi File Reader connector
  - Index format should be embedded in the log file name (Example: Dhcp.log.%03d,0,999)
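The index arithmetic above, including the wrap from Max back to Min, can be sketched using the multi File Reader example Dhcp.log.%03d,0,999. The helper names are hypothetical, and Python's `%` formatting stands in for the connector's own handling of the %0Nd pattern.

```python
def parse_index_spec(spec):
    """Split 'Dhcp.log.%03d,0,999' into (name template, Min, Max)."""
    template, low, high = spec.rsplit(",", 2)
    return template, int(low), int(high)


def next_index(current, low, high):
    """Increment the index by 1, wrapping to Min after Max."""
    return low if current >= high else current + 1


def next_file_name(spec, current):
    """Name of the file the connector would wait for after index `current`."""
    template, low, high = parse_index_spec(spec)
    return template % next_index(current, low, high)


spec = "Dhcp.log.%03d,0,999"
print(next_file_name(spec, 1))    # Dhcp.log.002
print(next_file_name(spec, 999))  # Dhcp.log.000
```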
Other log rotation parameters
- usenonlockingwindowsfilereader
  - The Windows JVM opens the file in a non-sharing mode, in which the file is locked and cannot be rotated by the device while another process is reading it
  - When this parameter is enabled, the connector uses a File Reader that opens the file in non-locking share mode
- usealternaterotationdetection
  - Used in conjunction with the followexternalrotation parameter
  - Tells the connector to use an alternate log rotation detection logic
  - Alternate log rotation detection is more accurate on different operating systems
- onrotation, onrotationoptions
  - Allow for deleting or renaming a file after rotation. Nothing is done by default
Other log rotation parameters
- rotationdelay
  - How long to wait once a new file is detected before the File Reader thread for the current file is terminated and a File Reader thread is launched for the new file
  - Default value is 30 sec
- rotationonlywheneventexists, rotationsleeptime
  - These parameters tell the daily log follower to consider rotation only if there are events in the new file, or once a delay equal to rotationsleeptime has elapsed after the new file appeared
  - This is not enabled by default. When enabled, the default value for rotationsleeptime is 10 sec
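The rotationonlywheneventexists rule can be read as a small decision function. The sketch below is an interpretation of the slide, not connector code: it treats a non-empty new file as "has events", which is an assumption, and the function and argument names are hypothetical.

```python
def should_rotate(new_file_size, seconds_since_new_file,
                  rotation_only_when_event_exists=False,
                  rotation_sleep_time_s=10.0):
    """Decide whether the daily log follower may switch to the new file."""
    if not rotation_only_when_event_exists:
        # Feature disabled (the default): the new file alone is enough.
        return True
    has_events = new_file_size > 0  # assumption: any content counts as events
    waited_long_enough = seconds_since_new_file >= rotation_sleep_time_s
    return has_events or waited_long_enough


# Empty new file, feature enabled, only 3 seconds old: keep reading the old file.
print(should_rotate(0, 3.0, rotation_only_when_event_exists=True))
```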
Folder follower operation
Folder follower operation - single
- Allows for a single folder
- Processes files only in batch mode
Parameters and default values
- folder: absolute path of the folder where the files are processed
  agents[0].foldertable[0].folder=c\:\\logs\\
- processfoldersrecursively: whether to process the subfolders in the folder recursively (false)
  agents[0].foldertable[0].processfoldersrecursively=false
- wildcard: name pattern to select which files in the folder are picked up for processing (*.*)
  agents[0].foldertable[0].wildcard=u_ex*.log
- sleeptime: how often to check for new files to process in the folder (5 sec)
  agents[0].foldertable[0].sleeptime=30000
- usetriggerfile: whether to use a trigger file to indicate when a file is ready for processing (false)
  agents[0].foldertable[0].usetriggerfile=false
- triggerextension: trigger file extension when the previous parameter is enabled (.done)
  agents[0].foldertable[0].triggerextension=.done
Folder follower operation - single
- delay: how long after the file appeared in the folder to consider it for processing (10 sec)
  agents[0].foldertable[0].delay=10000
- minfilelength: minimum file length restriction before a file is considered for processing (-1)
  agents[0].foldertable[0].minfilelength=-1
- retryinterval: how long to wait before retrying if file processing failed (10 sec)
  agents[0].foldertable[0].retryinterval=1000
- maxretries: how many times to retry when the file processing fails (-1)
  agents[0].foldertable[0].maxretries=-1
- badsubfolder: subfolder where a file will be transferred if file processing fails for maxretries (bad)
  agents[0].foldertable[0].badsubfolder=bad
- mode: whether to rename, delete or remember the file after it is processed (rename)
  agents[0].foldertable[0].mode=persistfile
- modeoptions: extension for the renamed file after it is processed (.processed)
  agents[0].foldertable[0].modeoptions=processed
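To show how the wildcard, delay and minfilelength parameters could interact when a folder follower looks for files to process, here is a hedged Python sketch. It is not the connector's implementation: the function name `ready_files` is hypothetical, the file's modification time is used as a stand-in for "when the file appeared", and trigger files, retries and the post-processing mode are left out.

```python
import fnmatch
import os
import time


def ready_files(folder, wildcard="*.*", delay_s=10.0, min_file_length=-1, now=None):
    """Return the files in `folder` that qualify for batch processing."""
    now = time.time() if now is None else now
    ready = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue  # subfolders are skipped (processfoldersrecursively=false)
        if not fnmatch.fnmatch(name, wildcard):
            continue  # name does not match the wildcard pattern
        info = os.stat(path)
        if now - info.st_mtime < delay_s:
            continue  # too recent; the device may still be writing
        if info.st_size < min_file_length:
            continue  # shorter than the configured minimum (-1 disables this)
        ready.append(path)
    return ready
```

A polling loop would call `ready_files` once per sleeptime interval and hand each returned path to a File Reader.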
Folder follower operation - multiple
- Allows for multiple folders to be configured
- Parameters are configured per folder. This enables tuning the processing of files in each folder independently of the others
Parameters and default values
- All parameters for the single folder follower
- processingmode: allows for processing the files in a folder in batch mode or real time (batch)
  agents[0].foldertable[0].processingmode=realtime
- processingtimeout: idle time after which real time processing will be temporarily suspended (-1)
  agents[0].foldertable[0].processingtimeout=-1
- retryinterval: how often to check for any activity on a suspended file and resume processing (-1)
  agents[0].foldertable[0].retryinterval=-1
- processingthreshold: idle time after which real time processing will be completely stopped (-1)
  agents[0].foldertable[0].processingthreshold=-1
Folder follower operation - multiple
- configfile: parser properties file
  agents[0].foldertable[0].configfile=iis\\iis_file
- configtype: parser type (regex, key value, delimited or cef)
  agents[0].foldertable[0].configtype=sdkfilereader
- Some File Reader connector parameters
  agents[0].foldertable[0].startatend=true
  agents[0].foldertable[0].encoding=utf8
  agents[0].foldertable[0].usealternaterotationdetection=false
  agents[0].foldertable[0].usenonlockingwindowsfilereader=false
  agents[0].foldertable[0].followexternalrotation=false
Folder follower operation - multiple
- processinglimit: maximum number of concurrent File Reader threads for real time processing (256)
  agents[0].foldertable[0].processinglimit=256
- configfolder: relative AUP folder for the parser content
  agents[0].foldertable[0].configfolder=config\\agent\\oldsdk\\
- Persistence parameters (preservestate, preservestatecount, preservestateinterval)
  agents[0].persistenceinterval=0
  agents[0].preservedstatecount=10
  agents[0].preservedstateinterval=30000
- Field Extractor parameters (usefieldextractor, extractsource, extractregex, extractfieldnames)
  agents[0].foldertable[0].extractfieldnames=devicehostname,devicecustomnumber1
  agents[0].foldertable[0].extractregex=(\\w+)\\.(\\d+)\\.log
  agents[0].foldertable[0].extractsource=file Name
  agents[0].foldertable[0].usefieldextractor=true
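The Field Extractor settings above pair the capture groups of extractregex with the names in extractfieldnames. The Python sketch below shows that pairing on a hypothetical file name, fw01.20140318.log; whether the connector searches within the name or requires a full match is not stated in the slides, so the use of `re.search` is an assumption, as is the function name.

```python
import re


def extract_fields(file_name, extract_regex, field_names):
    """Map each capture group of the regex to the field name at the same position."""
    match = re.search(extract_regex, file_name)
    if match is None:
        return {}
    return dict(zip(field_names, match.groups()))


# The properties file doubles each backslash; the effective regex is shown here.
regex = r"(\w+)\.(\d+)\.log"
names = ["devicehostname", "devicecustomnumber1"]

print(extract_fields("fw01.20140318.log", regex, names))
```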
Common issues and customer cases
Common issues
- Connector not able to read files from the folder. Ex. "File doesn't exist" error in logs
- Connector not able to rename/delete files
- Delay in events due to time zone difference between connector server and device
- Connector stopped processing events
Customer case 1
Blue Coat file connector reports "Failed to open log file" and does not process events
Observations:
- In the agent.log file:
  INFO jvm 1 2014/03/18 21:46:41 FATAL EXCEPTION:
  INFO jvm 1 2014/03/18 21:46:41 Failed to open log file [/usr/bcreporter/sg_192_168_1_91_main 910318131832.log.gz] for locating fields
- File format is .gz. Example: SG_192_168_1_91_main 910318131832.log.gz
- Blue Coat is configured in continuous mode
- SmartConnector is configured to read in realtime
Resolution:
- In realtime mode the connector expects to read the file continuously, without the constraint of opening and closing a .gz file
- Changing the processing mode from realtime to batch, and the Blue Coat configuration from continuous to periodic, allows the connector to read the file
Customer case 2
IIS MultiServer connector not able to read files
Observations:
- Connector logs indicate that it is not able to find files in the configured path
  [2014-05-10 21:10:39,029][INFO ][default.com.arcsight.agent.yc.b][run] 0 files processed
  [2014-05-10 21:10:39,763][INFO ][default.com.arcsight.agent.xh][logstatus] {Agent Type=iis_multiserver, foldertable[0].folder=c:\inetpub\logs\logfiles\w3svc1, foldertable[0].latestlogonly=true, foldertable[0].version=7.0, foldertable[0].wildcard=u_ex*.log}
Resolution:
- The connector expects a folder named W3SVCX under the configured path, so the configured path should be the parent folder rather than the W3SVC1 folder itself
- Incorrect configuration:
  agents[0].foldertable[1].folder=c:\\inetpub\\logs\\logfiles\\w3svc1
- Correct configuration:
  agents[0].foldertable[1].folder=C:\\inetpub\\logs\\LogFiles
Customer case 3
5-hour delay in event processing for Microsoft TMG connectors
Observation:
- The connector is installed in EST and the file rolls over (rotates) according to GMT
- The file therefore rolls over at 7 PM EST, and the connector doesn't process events for the next 5 hours, until the connector server time reaches 12 AM EST
Resolution:
- Have the connector and the Microsoft TMG server in the same time zone
- Modify the following parameter to the time zone where the ISA server is located: isalogfiletimezoneid
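The 5-hour figure in this case follows directly from the offset between GMT and EST. The short Python check below works it through for a hypothetical rollover date; EST is modeled as a fixed UTC-5 offset, so daylight saving time is deliberately ignored, and the function name is an assumption.

```python
from datetime import datetime, timedelta, timezone

GMT = timezone.utc
EST = timezone(timedelta(hours=-5), "EST")  # fixed offset, no daylight saving


def rollover_lag(year, month, day):
    """Time between the GMT-midnight rollover and midnight on the EST connector host."""
    device_rollover = datetime(year, month, day, tzinfo=GMT)
    connector_midnight = datetime(year, month, day, tzinfo=EST)
    return device_rollover.astimezone(EST), connector_midnight - device_rollover


seen_locally, lag = rollover_lag(2014, 3, 19)  # hypothetical date
print(seen_locally.strftime("%Y-%m-%d %H:%M %Z"))  # rollover as seen on the connector host
print(lag)                                          # how long the connector keeps waiting
```

Telling the connector which time zone the log file names use removes the lag because both sides then agree on when the day changes.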
Customer case 4
IIS connector skips reading every alternate file
Observation:
- The connector reads File A for Day 1. When the time reaches UTC midnight (Day 2), IIS creates File B and the connector creates a new File Reader thread to read File B
- File A and File B now have the same timestamp. IIS adds more events to File B without changing the timestamp, so the connector reads File B but does not tail it: it reads the initial events and closes the thread, while tailing of File A continues
- On Day 3, File C is created. Because by the third day the timestamps of File C and File A differ, the connector tails File C and closes File A
- This cycle continues, which means that every other day almost one day's worth of logs is missing
Resolution:
- A hotfix has been created for this issue. With the fix, the connector follows the file name rather than the timestamp of the file
- This hotfix will be merged into the GA build in the upcoming connector release
Best practices
Best practices
- Verify new files/events are generated
- Verify the network is accessible remotely or locally (connectivity)
- Verify the user running the connector has sufficient permission to access the log files
- Review logs for errors
  - File not found error, possible causes
    - File permission
    - Network connectivity
    - Configuration error: the file or path specified may be incorrect
  - Parsing error, possible causes
    - Vendor not supported
    - Version not supported
  - No error, possible causes
    - No events to read
    - No new events generated
    - Connector hung
Q&A
For more information
Attend these sessions
- TT3113, Exploration of HP ArcSight Database Connectors & Best Practices (Wed 11:30 AM)
After the event
- Stop by the Support Booth and meet the expert engineers
- Provide valuable feedback on how support can serve you better
Your feedback is important to us. Please take a few minutes to complete the session survey.
Please give me your feedback
Session 3114
Speakers: Farid Merchant and Dipali Shah
Please fill out a survey. Hand it to the door monitor on your way out.
Thank you for providing your feedback, which helps us enhance content for future events.
Thank you!