Performance engineering

It is pervasive, involving people from multiple organizational units, but predominantly within the information technology organization.

Typically they are classified as critical based upon revenue value, cost savings, or other assigned business value.

High-level risks that may impact system performance are identified and described at this time.

However, if some tests cannot be tuned into compliance (perhaps because proper performance engineering working practices were not applied), it will be necessary to return portions of the system to development for refactoring.

Transaction response time is logged in a database so that queries and reports can be run against the data.
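As a rough illustration of logging response times for later querying, the following sketch uses Python's built-in sqlite3 module with an in-memory database. The table name response_times, its columns, and the helper functions are hypothetical choices made for this example; a production system would more likely write to a dedicated monitoring database.

```python
import sqlite3

# Hypothetical schema for this sketch: one row per completed transaction.
SCHEMA = """
CREATE TABLE IF NOT EXISTS response_times (
    recorded_at      REAL NOT NULL,  -- Unix timestamp, in seconds
    transaction_name TEXT NOT NULL,
    elapsed_ms       REAL NOT NULL
)
"""

def open_log(path=":memory:"):
    """Open (or create) the response-time log and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def log_response(conn, recorded_at, transaction_name, elapsed_ms):
    """Record one transaction's response time."""
    conn.execute(
        "INSERT INTO response_times VALUES (?, ?, ?)",
        (recorded_at, transaction_name, elapsed_ms),
    )
    conn.commit()

def average_by_transaction(conn):
    """Report the mean response time in milliseconds per transaction name."""
    rows = conn.execute(
        "SELECT transaction_name, AVG(elapsed_ms) "
        "FROM response_times "
        "GROUP BY transaction_name "
        "ORDER BY transaction_name"
    ).fetchall()
    return dict(rows)

if __name__ == "__main__":
    conn = open_log()
    log_response(conn, 1.0, "checkout", 100.0)
    log_response(conn, 2.0, "checkout", 300.0)
    log_response(conn, 3.0, "login", 50.0)
    print(average_by_transaction(conn))
```

Keeping one row per transaction, rather than pre-aggregated averages, leaves later reports free to compute percentiles or per-period breakdowns from the same data.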

When user transactions fall out of band, the events should generate alerts so that the situation receives attention.
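One simple way to picture an out-of-band check is a comparison of each measured response time against a lower and upper limit. The sketch below assumes the band is given as two fixed millisecond values; the function names and the message format are hypothetical, and real monitoring tools typically offer richer rules (percentiles, sustained-duration conditions, and so on).

```python
def out_of_band(samples_ms, lower_ms, upper_ms):
    """Return (index, value) pairs for samples outside [lower_ms, upper_ms]."""
    return [
        (index, value)
        for index, value in enumerate(samples_ms)
        if not (lower_ms <= value <= upper_ms)
    ]

def alert_messages(transaction_name, samples_ms, lower_ms, upper_ms):
    """Build one human-readable alert per out-of-band sample."""
    return [
        f"ALERT {transaction_name}: sample {index} took {value:.0f} ms "
        f"(band {lower_ms:.0f}-{upper_ms:.0f} ms)"
        for index, value in out_of_band(samples_ms, lower_ms, upper_ms)
    ]

if __name__ == "__main__":
    # Hypothetical measurements for a "checkout" transaction.
    for message in alert_messages("checkout", [120, 480, 950, 30], 50, 500):
        print(message)
```

A lower bound is included because an unusually fast response can also signal a problem, such as a transaction failing early.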

This means executing trend analysis on historical monitoring-generated data, so that the future time of noncompliance is predictable.

For example, if a system is showing a trend of slowing transaction processing (which might be due to growing data set sizes, or increasing numbers of concurrent users, or other factors) then at some point the system will no longer meet the criteria specified within the service level agreements.
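The idea of predicting when a slowing system will breach its service level can be sketched with an ordinary least-squares line fitted to historical response times. This is only one possible trend model, chosen here for simplicity; the function names, the sample data, and the 300 ms limit are assumptions made for illustration, and real workloads often call for seasonal or non-linear models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all observations share the same time value")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_breach(days, response_ms, limit_ms):
    """Estimate the day on which the fitted trend reaches limit_ms.

    Returns None when the trend is flat or improving, since a
    non-increasing line never reaches a higher limit.
    """
    slope, intercept = fit_line(days, response_ms)
    if slope <= 0:
        return None
    return (limit_ms - intercept) / slope

if __name__ == "__main__":
    # Hypothetical daily averages: response time grows by 10 ms per day.
    days = [0, 1, 2, 3, 4]
    response_ms = [200, 210, 220, 230, 240]
    print(predict_breach(days, response_ms, limit_ms=300))
```

The predicted day is an extrapolation, so it is best treated as an early warning rather than a precise date.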

Capacity management is charged with ensuring that additional capacity is added in advance of that point (additional CPUs, more memory, new database indexing, et cetera) so that the trend lines are reset and the system will remain within the specified performance range.

These typically involve system tuning, changing operating system or device parameters, or even refactoring the application software to resolve poor performance due to poor design or bad coding practices.