KPI for Quality of Service

Measuring quality of service requires defining relevant KPI criteria.

We put all WebLogic server logs into a partitioned Apache Hive table, which is stored in HDFS in the Optimized Row Columnar (ORC) file format.
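As a rough illustration, such a table could be declared as in the sketch below. The table name wlogs1 and the columns servername, severity, tstamp and logmessage are the ones used in the queries in this post. The column types and the partition column logdate are assumptions made only for this example, not the actual schema.

```sql
-- Sketch only: column types and the partition column are assumed.
create table if not exists wlogs1 (
  servername string,   -- WebLogic server name
  severity   string,   -- e.g. 'Error'
  tstamp     bigint,   -- assumed: Unix epoch seconds
  logmessage string    -- raw log message text
)
partitioned by (logdate string)  -- hypothetical partition key, e.g. '2016-03-01'
stored as orc;
```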

KPI – Total Errors and Average MTBE (Mean Time Between Errors)
This KPI shows the total error count and the mean time between errors (in hours) for each server. The query runs at the beginning of every month and collects data for the previous month. Results are then appended to the corresponding server graphs. This makes it easy to see trends, the effect of improvements in development and testing processes, and the maturity level of products.

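-- Total errors and average MTBE (hours) per server.
-- Assumes tstamp holds Unix epoch seconds; the window is the last 30 days,
-- which approximates the previous month when run at the start of a month.
select t.servername,
       count(*) as errors,
       avg(t.next_tstamp - t.tstamp) / 3600 as avg_mtbe_hours
from (
  select servername, severity, tstamp,
         -- timestamp of the next error on the same server (NULL for the last one)
         lead(tstamp) over (partition by servername order by tstamp) as next_tstamp
  from wlogs1
  where severity = 'Error' and tstamp > unix_timestamp() - 3600*24*30
) t
group by t.servername;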
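To make explicit what the average computes: for a server with n errors at timestamps t_1 < t_2 < ... < t_n (assumed here to be in seconds), lead() yields n - 1 non-NULL gaps, and avg() ignores the NULL produced for the last row. The sum of consecutive gaps telescopes, so the result depends only on the first and last error in the window:

```latex
\mathrm{MTBE}_{\text{hours}}
  = \frac{1}{3600\,(n-1)} \sum_{i=1}^{n-1} \left( t_{i+1} - t_i \right)
  = \frac{t_n - t_1}{3600\,(n-1)}, \qquad n \ge 2 .
```

For example, 3 errors with an average MTBE of 0.14 hours means the first and last error were about 2 × 0.14 = 0.28 hours apart. A server with a single error in the window has no gap, so its MTBE is NULL.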

Example results for one month:

Server Name    Errors    Average MTBE (hours)
AppSrv1        3         0.14
SOA1           13        8
SOA2           29        6
Portal1        2         0.89

If we add the condition and logmessage like 'Exception%' to the where clause, we can track development errors. The remaining errors belong to the support organization (e.g. the log message Tunneling result not OK, result: 'DEAD', and the like).
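A minimal sketch of that variant follows. It is the same KPI query with the extra filter on logmessage. The output alias dev_errors is a name chosen for this example, and tstamp is again assumed to hold Unix epoch seconds.

```sql
-- Sketch: development errors only (messages starting with 'Exception').
select t.servername,
       count(*) as dev_errors,
       avg(t.next_tstamp - t.tstamp) / 3600 as avg_mtbe_hours
from (
  select servername, tstamp,
         lead(tstamp) over (partition by servername order by tstamp) as next_tstamp
  from wlogs1
  where severity = 'Error'
    and logmessage like 'Exception%'
    and tstamp > unix_timestamp() - 3600*24*30
) t
group by t.servername;
```

Replacing like with not like in the same place would give the counterpart figure for the errors that fall to the support organization.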
