We’ve all been there. Response times become erratic, odd and unexplained things start to occur on the system, and calls into the Service Desk increase dramatically, yet you can’t see anything noticeably wrong as you frantically conduct a series of checks. As you verify auxiliary storage levels via WRKSYSSTS, you notice that the system ASP percentage is shown in inverse video and is close to 100%. Cue panic.
Likely candidates for sudden disk growth are a runaway query writing to a physical file or spooled file, a build-up of journal receivers, or another process not running as expected and using an excessive amount of temporary storage.
Slower organic disk growth can normally be attributed to a build-up of spooled files, application log files not being cleared down, temporary save files or libraries remaining on the system after use, or simply natural growth in application data.
Away from production, the backup server is often neglected and assumed to be running fine. As it’s not running production applications, a certain apathy towards it can slowly creep in, despite backup servers forming an integral part of high availability environments. Any prolonged outage can impact high availability and RPO/RTO, and in some cases the ability to conduct BI or backups, tasks often performed away from the production workload so as not to have a negative impact on performance.
To avoid hitting 100% there are several things you can configure in the operating system:
1) In SST (System Service Tools) you can define a threshold (percentage) at which point CPF0907 ‘Serious storage condition may exist’ is sent to the QSYSOPR and QSYSMSG message queues at hourly intervals.
2) There are two System Values QSTGLOWLMT and QSTGLOWACN that work in tandem to alert you:
QSTGLOWLMT (Auxiliary storage lower limit). This value specifies the percentage of available (unused) storage remaining in the system auxiliary storage pool at which the system takes action. The default value is 5 (percent), which means *SYSBAS can reach 95% full before the system reacts.
QSTGLOWACN (Auxiliary storage lower limit action) defines the action you’d like to occur when QSTGLOWLMT is reached. The available actions are detailed below:
*MSG – CPI099C ‘Critical storage lower limit reached’ is sent to the QSYSOPR and QSYSMSG message queues.
*CRITMSG – CPI099B ‘Critical storage condition exists’ is sent to the user(s) defined in the service attribute to receive critical messages. These users can be seen and changed by using the CHGSRVA ‘Change Service Attributes’ command.
*REGFAC – Submit a job to make a call to the exit program(s) defined for the QIBM_QWC_QSTGLOWACN exit point.
*ENDSYS – End the system to a restricted state.
*PWRDWNSYS – Power down the system immediately and restart it (IPL).
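To see how these two system values are currently set without leaving a SQL session, a query along the following lines may help. It is a minimal sketch that assumes the QSYS2.SYSTEM_VALUE_INFO service is available on your release. Depending on the system value, the setting may be returned in either the numeric or the character column, so both are selected here.

```sql
-- Sketch: show the current settings of the two storage lower-limit system values.
-- Assumes the QSYS2.SYSTEM_VALUE_INFO service exists on this release.
SELECT SYSTEM_VALUE_NAME,
       CURRENT_NUMERIC_VALUE,
       CURRENT_CHARACTER_VALUE
  FROM QSYS2.SYSTEM_VALUE_INFO
 WHERE SYSTEM_VALUE_NAME IN ('QSTGLOWLMT', 'QSTGLOWACN');
```

With the default lower limit of 5, the configured action would be expected once roughly 95% of *SYSBAS is in use.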
It's worth noting that there is a modern method via ‘Run SQL Scripts’ in Access Client Solutions to check a number of server related elements such as:
ASP information.
SELECT * FROM QSYS2.ASP_INFO;
Temporary storage sorted by the current bucket size.
SELECT * FROM QSYS2.SYSTMPSTG ORDER BY BUCKET_CURRENT_SIZE DESC;
Many more IBM i Services are available and can be viewed here.
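As a further illustration, the two queries below narrow the output down to the figures discussed above: the percentage used per ASP, and any of the storage messages already sitting in QSYSOPR. They are sketches rather than definitive implementations. The column names (TOTAL_CAPACITY, TOTAL_CAPACITY_AVAILABLE, MESSAGE_ID and so on) are taken from the documented layouts of QSYS2.ASP_INFO and QSYS2.MESSAGE_QUEUE_INFO and should be verified against your release, and PERCENT_USED is simply a name chosen here for the calculated column.

```sql
-- Sketch: percentage used per ASP, calculated from the capacity columns.
-- NULLIF guards against a division by zero if a capacity is reported as 0.
SELECT ASP_NUMBER,
       TOTAL_CAPACITY,
       TOTAL_CAPACITY_AVAILABLE,
       DECIMAL((TOTAL_CAPACITY - TOTAL_CAPACITY_AVAILABLE) * 100.0
               / NULLIF(TOTAL_CAPACITY, 0), 5, 2) AS PERCENT_USED
  FROM QSYS2.ASP_INFO
 ORDER BY PERCENT_USED DESC;

-- Sketch: storage-related messages currently in QSYSOPR, newest first.
SELECT MESSAGE_ID,
       MESSAGE_TIMESTAMP,
       MESSAGE_TEXT
  FROM QSYS2.MESSAGE_QUEUE_INFO
 WHERE MESSAGE_QUEUE_LIBRARY = 'QSYS'
   AND MESSAGE_QUEUE_NAME = 'QSYSOPR'
   AND MESSAGE_ID IN ('CPF0907', 'CPI099C')
 ORDER BY MESSAGE_TIMESTAMP DESC;
```

Running these on a schedule is one way to spot a trend before the thresholds are hit, although it still relies on someone looking at the results.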
Maxava Monitor Mi8 allows you to define monitoring rules across a large variety of metrics including ASPs, both the *SYSBAS and IASPs (Independent Auxiliary Storage Pools). Monitoring both the slower organic growth and sudden changes in disk space occupancy is possible, with alerts capable of being raised, distributed, and escalated to meet any variety of support structure. Mi8 also has a tiered level of monitoring where you can raise differing levels of alerts depending on the length of time thresholds have been breached. ASP levels can also be checked via an at-a-glance view in Maxava Monitor Mi8.
In addition, Maxava provides the capability to conduct root cause analysis to identify the detail behind any ASP related alerts. In the example below we can see that QSPL has the largest current level of growth.
Temporary storage can also be monitored, with alerts raised for any jobs that exceed a customizable threshold.
In addition, Maxava Monitor Mi8 provides a feature to monitor for key messages such as CPF0907, CPI099B or CPI099C appearing on message queues, so that you do not need to proactively check for these yourself. Instead, an alert will be raised and sent to you and your team on a device of your choice.
Where Maxava HA is also used, Maxava Monitor Mi8 can provide visual information on remote journaling elements, including apply backlog and receiver backlog, along with any noticeable lag.
Mi8 is a trademark of Furasta, a sister company of Maxava.