Setting Up Parsers


Parsing Data From Your Datasources

ZSA currently assumes that no data can be handled as-is, and will try to parse each data entry before saving it in the database. 

Types of parsers

Three parsers are available:

  • LEGACY_PARSER
    • Manages data already parsed by the Mainframe Zetaly Agent
    • Its main task is to insert the data as-is into the database
  • RAW_PARSER
    • Tries to find a ZUP parser matching the binary SMF record type. If an SMF record is identified as "SMF-16" and a ZUP parser exists for "SMF-16", that parser is used to parse the data and insert it into the database
  • CSV_PARSER
    • Parser used to handle CSV data from files


Starting the Parser

The parser has its own dedicated Parser List, which resembles the connector one. 

All buttons function as they do for the connectors.
Once you start the parser with the play button, it will start parsing incoming messages. See this page to learn how to make the parsed data available to the other Zetaly apps.

Configuring the Parser

As with connectors, the number of instances for the parser can be changed using the edit function; clicking it opens the dialog box. 


The dialog box contains the following fields:

  • Number of instances - defines the minimum number of instances.
  • Number of records - defines the number of records to be parsed at once.
  • Poll interval - defines the time (in seconds) to wait before attempting to parse records again when none are waiting to be parsed.
  • Database buffer size - defines the size of the data bulk inserted at once.
  • Autoscale instances - automatically creates a new instance when the Usage value reaches 90%. (read more here)
  • Keep raw record without parser (RAW parser only) - if no parser is found for a given record, the record is stored in binary form in the database (raw_smf_2 table).
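The autoscale rule described above can be sketched as follows. This is an illustrative sketch only; the function name and signature are assumptions, not ZSA's actual implementation.

```python
def autoscale(current_instances: int, usage_pct: float,
              threshold: float = 90.0) -> int:
    """Illustrative autoscale rule: add one instance when the
    Usage value reaches the threshold (90% by default)."""
    if usage_pct >= threshold:
        return current_instances + 1
    return current_instances
```

For example, an instance count of 2 at 95% usage would scale to 3, while the same count at 50% usage would stay unchanged.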

ZSA Parsing Statistics

Expand the row to see the statistics of the ZSA parser. Use them to monitor the parser operation and to determine whether more instances are needed.

Parsers Charts

Multiple graphs help you track the performance of the parsers:

  • Current/Dequeue per second: tracks queue usage. If Current is high, the parsers cannot process all records in memory and are using the buffer on disk. Dequeue is the rate at which records leave the queue
  • Parsed error/ignored/success per second: parsing rate per status
  • Insert errors/inserted: database insert/error rate
  • Usage: measures the percentage of time the instances spend processing records
  • Average insert/parsing/process time: average time for each workload
  • Instances: instance count over time (can vary if the autoscale option is enabled)

How to find the right configuration

The current architecture introduces two buffers:

  • one at the mainframe level: DXQ
  • one at the Linux level: ZQM

The purpose of connector and parser tuning is to free up these two buffers fast enough.


Connector and parser

The goal of the collect processes is to collect all data from the mainframe, parse it and insert it into the database. There are therefore three possible points of contention:

  • A given LPAR, or the sum of all LPARs, generates too much data, creating a "network" contention between the DXQ process and the ZQM process
    • Increase the network bandwidth
    • Reduce the collected data (ZSA option)
    • Increase the DXQ buffer size
    • Increase the ZQM buffer size
    • Increase the resources allocated to the Linux ZSA processes
  • The parser doesn't process fast enough
    • Reduce the collected data (ZSA option)
    • Increase the ZQM buffer size
    • Increase the resources allocated to the Linux ZSA processes
  • Database insertions are not fast enough
    • Reduce the collected data (ZSA option)
    • Tune the database infrastructure to allow more write operations in parallel

Parser process

Open the parser statistics and observe the graph named "Queues". The "Number of stored messages" curve should be stable and close to 0. If it is not, you have a potential problem. Performance can be improved via two properties:

  • Number of records
  • Number of instances

Increasing the number of records will reduce the number of calls made to ZQM, thus reducing the fixed network overhead and improving response time. The default value is 5000, but it can be increased drastically (to 100,000, for example). The aim is to ensure that processing a defined message packet takes no more than a few seconds. 

If you are in a saturation situation, check that the value you have set is not too high. Open the "Queues" graph, take the value of the "Number of processed messages" curve, divide it by 5, then divide the result by the value of "Number of records". This gives the number of seconds required to process a message packet. This number must remain below 5 (note that it is only meaningful in the event of saturation).
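The check above can be written as a small calculation. This is a sketch that follows the formula exactly as stated in the text; the function names are illustrative, not part of ZSA.

```python
def packet_processing_seconds(processed_messages: float,
                              number_of_records: int) -> float:
    """Rule of thumb from the text: divide the 'Number of processed
    messages' curve value by 5, then divide the result by the
    'Number of records' setting."""
    return (processed_messages / 5) / number_of_records

def number_of_records_too_high(processed_messages: float,
                               number_of_records: int) -> bool:
    # The result must stay below 5 (only meaningful under saturation).
    return packet_processing_seconds(processed_messages, number_of_records) >= 5
```

For instance, with 100,000 processed messages and the default "Number of records" of 5000, the result is (100000 / 5) / 5000 = 4, which stays below the threshold of 5.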

If you've already increased the "Number of records" property but you're still in a saturation situation, you can increase the number of parser instances. This will allow more CPU to be allocated to message processing. Beware, however, that increasing the number of instances increases the competition for ZQM access, which can slow it down. The aim is therefore to increase the number of instances to benefit from parallel processing, while avoiding overloading ZQM access.

Open the "Queues" graph, look at the value of the "Number of processed messages" curve. At the same time, increase the number of parser instances. You should see an increase in this number after a few minutes. As long as you remain saturated, repeat the operation. If the number no longer increases, or even decreases, then return to the previous value. If this doesn't resolve the ZQM saturation, please contact our support team with your configuration, analysis and environment specifications.

Parser insert into database

Open the parser statistics and observe the first graph, named "Inserted records". Look at the value of "Waiting for bulk size or flush time". This value should be stable and close to zero. If it increases and does not seem to decrease, you need to modify your configuration via two properties:

  • Bulk insert quantity
  • Number of instances

Increasing the "Bulk insert quantity" value will reduce the number of calls made to the database for high-volume SMF records. This value can be increased to several hundred thousand if necessary. Be careful, however, as this will increase the RAM consumption of both ZSA and your database.

If you've already increased the "Bulk insert quantity" property but are still in a saturation situation, you can increase the number of parser instances. This will allow more CPU to be allocated to message inserts. Beware, however, that increasing the number of instances increases the competition for ZQM and database access, which can slow them down. The aim is therefore to increase the number of instances to benefit from parallel processing, while avoiding overloading ZQM and database access. 

Open the "Inserted records" graph, look at the value of the "Inserted in last 30 secs" curve. At the same time, increase the number of parser instances. You should see an increase in this number after a few minutes. As long as you remain saturated, repeat the operation. If the number no longer increases, or even decreases, then return to the previous value. If this doesn't resolve the database saturation, please contact our support team with your configuration, analysis and environment specifications.