Escaping the Zabbix UI pain: How to create a combined graph for a number of hosts using the Zabbix API



This post will answer two questions:
  • How to display the same item, e.g. Processor load, for a number of hosts on the same graph
  • How to avoid going crazy from all the slow clicking in the Zabbix UI by using its API
I will indicate how it could be done with plain HTTP POST and then show a solution using a Python library for accessing the Zabbix API.

The problem we want to solve: create a graph that plots the same item for several hosts that all belong to the same host group, while leaving out some of the hosts in that group.

Zabbix API

Zabbix API is a JSON-RPC API over HTTP (you POST JSON to a single endpoint), introduced in 1.8, that enables the management of Zabbix resources such as items, triggers, and graphs. The resources are linked via IDs, so you usually need to get a resource by its name, extract its ID, and use that to get related resources.
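The name, then ID, then related resources chain can be sketched roughly as follows. This is only an illustration: the helper names are hypothetical, nothing is sent to a server, and the sample result is made up. I am assuming hostgroup.get accepts a filter on name and host.get accepts groupids; check both against the documentation for your Zabbix version.

```python
def group_lookup_params(group_name):
    """Params for hostgroup.get: find a host group by its name."""
    return {"output": "extend", "filter": {"name": group_name}}


def extract_ids(result, id_field):
    """Pull one ID field out of every object in a get result."""
    return [obj[id_field] for obj in result]


def hosts_in_groups_params(group_ids):
    """Params for host.get: all hosts that belong to the given groups."""
    return {"output": "extend", "groupids": group_ids}


# Made-up hostgroup.get result, for illustration only
sample_groups = [{"groupid": "12", "name": "Analytics production"}]

group_ids = extract_ids(sample_groups, "groupid")
print(hosts_in_groups_params(group_ids))
```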

The API documentation in v1.8 is ... lacking, so I usually read the v2.2 documentation and then check the corresponding page and example in 1.8 to see how it works there. The documentation occasionally contains mistakes, so don't trust it blindly. (E.g. according to the documentation, item.get's filter should be an array, but in fact it is an object.) It also isn't clear, for get, when fields are ORed and when they are ANDed, and whether you can do anything about it.

There are multiple libraries for the Zabbix API. I've chosen Python because I like it, and zabbix_api.py because it seems to be maintained and developed. I had an authorization issue with it but managed to work around it.

Using the API

Authentication & authorization

You usually first authenticate with Zabbix and use the auth token you get from it in all subsequent calls.
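A minimal sketch of that flow, without any network traffic: build the login envelope, read the token from the "result" field of the response, and put it into the "auth" field of later requests. The function names are hypothetical, and user.authenticate is the method name used by the old API version this post is about; the sample response only has the shape of a successful login.

```python
import json


def login_request(user, password, request_id=0):
    """JSON-RPC envelope for user.authenticate."""
    return {
        "jsonrpc": "2.0",
        "method": "user.authenticate",
        "params": {"user": user, "password": password},
        "auth": None,
        "id": request_id,
    }


def authed_request(method, params, auth_token, request_id=1):
    """Envelope for any later call; the token goes into the "auth" field."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "auth": auth_token,
        "id": request_id,
    }


# A successful login response carries the token in "result"
login_response = json.loads(
    '{"jsonrpc":"2.0","result":"2dddea29e1d37b9f90069dd129d7a66d","id":0}'
)
token = login_response["result"]
print(json.dumps(authed_request("item.get", {"output": "shorten"}, token)))
```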

Catch: Zabbix API must be enabled for the user

The Zabbix user used for communication with the API must be authorized to use the API (there is a check box for that in Zabbix administration). In our configuration this is off by default, and users must be added to the Zabbix API group.

If you do not have access, you will be able to authenticate with the API but other requests will fail with "No API access".
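One way to make this failure obvious is to unwrap every response and look for the message. The sketch below assumes the standard JSON-RPC 2.0 error object (code, message, data) and that the "No API access" text appears in its message or data field; the exception class, the function name and the sample error response are made up for illustration.

```python
class ZabbixApiError(Exception):
    """Hypothetical exception type used only in this sketch."""


def unwrap(response):
    """Return the "result" of a JSON-RPC response, or raise if it has an "error"."""
    error = response.get("error")
    if error is None:
        return response["result"]
    text = ("%s %s" % (error.get("message", ""), error.get("data", ""))).strip()
    if "No API access" in text:
        raise ZabbixApiError("API access is not enabled for this user: " + text)
    raise ZabbixApiError(text)


# Illustrative error response, not copied from a real server
sample_error = {
    "jsonrpc": "2.0",
    "error": {"code": -32602, "message": "Invalid params.", "data": "No API access"},
    "id": 1,
}

try:
    unwrap(sample_error)
except ZabbixApiError as exc:
    print(exc)
```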

Common get attributes

The get requests have some common attributes such as
  • output = (extend|shorten|refer|...) - how much to include in the output (refer => only ids, extend => as much as possible)
  • filter - we will see this below

Implementation

Creating a graph with HTTP POST

You communicate with the API by posting JSON to it. The easiest way is to execute
curl -i -n -X POST --header "Content-Type: application/json" -d@- https://zabbix.example.com/api_jsonrpc.php
and paste the JSON there, add a new line and press Control-D to finish the input.
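For comparison, roughly the same POST can be built with Python's standard library alone. This is a sketch, not what the rest of the post uses: build_post and send are hypothetical helpers, send needs a reachable server and is not called here, and the HTTP basic auth that curl's -n option takes from ~/.netrc is left out.

```python
import json
import urllib.request

API_URL = "https://zabbix.example.com/api_jsonrpc.php"


def build_post(payload, url=API_URL):
    """Build (but do not send) the POST request that the curl command makes."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response (needs a reachable server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_post({
    "jsonrpc": "2.0",
    "method": "user.authenticate",
    "params": {"user": "my zabbix username", "password": "my precious"},
    "auth": None,
    "id": 0,
})
print(req.get_method(), req.full_url)
```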

Authenticate with Zabbix:
curl -i -n -X POST --header "Content-Type: application/json" -d@- https://zabbix.example.com/api_jsonrpc.php
{
    "jsonrpc": "2.0",
    "method": "user.authenticate",
    "params": {
        "user": "my zabbix username",
        "password": "my precious"
    },
    "auth": null,
    "id": 0
}
=>
{"jsonrpc":"2.0","result":"2dddea29e1d37b9f90069dd129d7a66d","id":0}
  • I believe the value of id isn't important, but you need to provide one to get a reasonable response; 0, 1, 2 work fine.
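In JSON-RPC 2.0 the server echoes the request's id in its response, which is what lets a client pair the two up; when you send one request at a time, any value will do. A small sketch with hypothetical helper names, using a counter for the ids:

```python
import itertools

# Every new request gets the next number
_request_ids = itertools.count(1)


def make_request(method, params, auth=None):
    """Build a JSON-RPC envelope with a fresh id."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "auth": auth,
        "id": next(_request_ids),
    }


def matches(request, response):
    """True if the response answers this request, i.e. carries the same id."""
    return response.get("id") == request["id"]


request = make_request("item.get", {"output": "shorten"})
print(matches(request, {"jsonrpc": "2.0", "result": [], "id": request["id"]}))
```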
Get (up to 10 of) the Processor load15 items in the host group Analytics production, using the auth token:
{
    "jsonrpc": "2.0",
    "method": "item.get",
    "params": {
        "output": "shorten",
        "filter": {"description": "Processor load15"},
        "group": "Analytics production",
        "select_hosts": "extend",
        "limit": 10
    },
    "auth": "2dddea29e1d37b9f90069dd129d7a66d",
    "id": 2
}
=> [{"itemid":"40002","hosts":[{"maintenances":[..],"hostid":"10242","proxy_hostid":"10381","host":"my-server.example.com","dns":"my-server.example.com",...}]},{"itemid":"40003",...
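The result is a list of items, each with its itemid and a list of hosts. A sketch of walking it, using a trimmed sample that contains only fields visible in the response above; item_hosts is a hypothetical helper and assumes every item belongs to exactly one host.

```python
# Trimmed sample in the shape of the response above
sample_result = [
    {
        "itemid": "40002",
        "hosts": [
            {
                "hostid": "10242",
                "host": "my-server.example.com",
                "dns": "my-server.example.com",
            }
        ],
    },
]


def item_hosts(result):
    """Yield (itemid, hostid, dns) for every item in an item.get result."""
    for item in result:
        host = item["hosts"][0]  # assumption: one host per item
        yield item["itemid"], host["hostid"], host["dns"]


for row in item_hosts(sample_result):
    print(row)
```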
Well, we will skip the rest and go to the real fun - the Python API.

Creating a graph with zabbix_api.py

Some notes:
  1. I had trouble with authorization: I had to specify the user and password both in the constructor (for the HTTP basic auth headers) and in a call to the login method to make it work; in theory, only one of the two should be necessary. (I might have made a mistake somewhere.)
  2. There are some advantages over curl, such as not needing to specify unimportant attributes like the request id, and automatic mapping between Python lists/dicts and JSON.
Before we show the code, let's see how to use it:
bash$ ipython
In [1]: %run create_graph.py
In [2]: g = ZabbixGrapher(user="my zabbix user", passwd="my precious")
20: url: https://zabbix.example.com/api_jsonrpc.php
20: HTTP Auth enabled
20: Sending: {"params": {"password": "my precious", "user": "my zabbix user"}, "jsonrpc": "2.0", "method": "user.authenticate", "auth": "", "id": 0}
20: Response Code: 200
Logged in, auth: c417623c2d72e0f14ddab044429b80e7
In [3]: g.create_graph("CPU iowait on data nodes(avg1)", item_key="system.cpu.util[,iowait,avg1]", item_descr = None, host_group = "Analytics staging", hostname_filter_fn = lambda dns: "-data" in dns)
The long, scary code itself:
create_graph.py

import logging
import sys

from zabbix_api import ZabbixAPI, ZabbixAPIException

BOLD = "\033[1m"
RESET = "\033[0;0m"


class Palette:
    last_idx = 0
    colors = ["C04000", "800000", "191970", "3EB489", "FFDB58", "000080",
              "CC7722", "808000", "FF7F00", "002147", "AEC6CF", "836953",
              "CFCFC4", "77DD77", "F49AC2", "FFB347", "FFD1DC", "B39EB5",
              "FF6961", "CB99C9", "FDFD96", "FFE5B4", "D1E231", "8E4585",
              "FF5A36", "701C1C", "FF7518", "69359C", "E30B5D", "826644",
              "FF0000", "414833", "65000B", "002366", "E0115F", "B7410E",
              "FF6700", "F4C430", "FF8C69", "C2B280", "967117", "ECD540",
              "082567"]

    def next(self):
        self.last_idx = (self.last_idx + 1) % len(self.colors)
        return self.colors[self.last_idx]

class ZabbixGrapher:

    def __init__(self, user, passwd, log_level=logging.INFO):
        try:
            # I had to specify user + password here to use HTTP basic auth to be
            # able to log in even though I supply them to login below;
            # otherwise the call failed with
            # 'Error: HTTP Error 401: Authorization Required'
            self.zapi = ZabbixAPI(
                server="https://zabbix.example.com",
                path="/api_jsonrpc.php",
                user=user, passwd=passwd,
                log_level=log_level)  # or DEBUG

            # BEWARE: The user must have access to the Zabbix API enabled (be
            # in the Zabbix API user group)
            self.zapi.login(user, passwd)
            print "Logged in, auth: " + self.zapi.auth
        except ZabbixAPIException as e:
            msg = None
            # Assumption: a timeout shows up as "timed out" in the error text
            if "timed out" in str(e):
                msg = "Connection to Zabbix timed out, it's likely having temporary problems, retry now or later"
            else:
                msg = "Error communicating with Zabbix. Please check your authentication, Zabbix availability. Err: " + str(e)

            print BOLD + "\n" + msg + RESET
            # Re-raise with the friendlier message but the original traceback
            raise ZabbixAPIException, ZabbixAPIException(msg), sys.exc_info()[2]

    def create_graph(self,
                     graph_name="CPU Loads All Data Nodes",
                     item_descr="Processor load15",
                     item_key=None,
                     host_group="Analytics production",
                     hostname_filter_fn=lambda dns: "-analytics-prod-data" in dns and (
                         "-analytics-prod-data01" in dns
                         or dns >= "aewa-analytics-prod-data15"),
                     # show_legend=True - has no effect (old Z. version?)
                     ):

        palette = Palette()
        try:
            items = self.get_items(item_descr=item_descr, item_key=item_key, host_group=host_group)

            if not items:
                raise Exception("No items with (descr=" + str(item_descr) +
                                ", key=" + str(item_key) + ") in the group '" +
                                host_group + "' found")

            # Transform into a list of {'host':.., 'itemid':..} pairs,
            # filter out unwanted hosts and sort by host to have a defined order
            item_maps = self.to_host_itemid_pairs(items, hostname_filter_fn)
            item_maps = sorted(
                filter(lambda it: hostname_filter_fn(it['host']), item_maps),
                key=lambda it: it['host'])

            if not item_maps:
                # Report the unfiltered items; item_maps is empty at this point
                raise Exception("All retrieved items filtered out by the filter function; orig. items: " +
                                str(items))

            # The graph will be created on the 1st item's host:
            graph_host = item_maps[0]['host']

            ## CREATE GRAPH
            # See https://www.zabbix.com/documentation/2.0/manual/appendix/api/graph/definitions
            graph_items = []

            for idx, item in enumerate(item_maps):
                graph_items.append({
                    "itemid": item['itemid'],
                    "color": palette.next(),
                    "sortorder": idx
                })

            graph = self.zapi.graph.create({
                "gitems": graph_items,
                "name": graph_name,
                "width": 900,
                "height": 200
                #,"show_legend": str(int(show_legend))
            })

            print "DONE. The graph %s has been created on the node %s: %s." % (graph_name, graph_host, str(graph))
        except Exception as e:
            msg = None
            if "No API access while sending" in str(e):
                msg = "The user isn't allowed to access the Zabbix API; go to Zabbix administration and enable it (f.ex. add the group API access to the user)"
            else:
                msg = "Error communicating with Zabbix. Please check your request and whether Zabbix is available. Err: " + str(e)

            print BOLD + "\n" + msg + RESET
            # Re-raise the same exception type with the friendlier message
            raise type(e), type(e)(msg), sys.exc_info()[2]

    def get_items(self, item_descr=None, item_key=None, host_group=None):
        if not item_descr and not item_key:
            raise ValueError("Either item_key or item_descr must be provided")

        ## GET ITEMS to include in the graph
        # See (Zabbix 2.0 so not 100% relevant but better doc)
        # https://www.zabbix.com/documentation/2.0/manual/appendix/api/item/get
        filters = {}
        if item_descr:
            filters["description"] = item_descr
        if item_key:
            filters["key_"] = item_key

        return self.zapi.item.get({
            "output": "shorten",
            "filter": filters,
            "group": host_group,
            "select_hosts": "extend"
        })

    @staticmethod
    def to_host_itemid_pairs(items, hostname_filter_fn):
        # List of (host, itemid) pairs sorted by host
        items_by_host = []

        for item in items:
            itemid = item['itemid']
            dns = item['hosts'][0]['dns']

            if hostname_filter_fn(dns):
                items_by_host.append({"host": dns, "itemid": itemid})

        return items_by_host
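
The part of create_graph that doesn't talk to Zabbix (filter the hosts, sort them, give each item a color and a sort order) can be tried out on its own. The following is a stand-alone sketch rather than the code above: build_graph_items is a hypothetical function, the host names are invented, and it uses itertools.cycle, so unlike Palette.next it starts with the first color in the list.

```python
from itertools import cycle


def build_graph_items(pairs, hostname_filter_fn, colors):
    """Filter host/item pairs, sort them by host and assign colors in turn."""
    kept = sorted(
        (pair for pair in pairs if hostname_filter_fn(pair["host"])),
        key=lambda pair: pair["host"],
    )
    color_cycle = cycle(colors)  # wraps around when there are more items than colors
    return [
        {"itemid": pair["itemid"], "color": next(color_cycle), "sortorder": idx}
        for idx, pair in enumerate(kept)
    ]


# Invented host names, for illustration only
pairs = [
    {"host": "b-data.example.com", "itemid": "2"},
    {"host": "web.example.com", "itemid": "3"},
    {"host": "a-data.example.com", "itemid": "1"},
]

for gitem in build_graph_items(pairs, lambda dns: "-data" in dns, ["C04000", "800000"]):
    print(gitem)
```

The resulting list has the shape of the "gitems" value passed to graph.create above.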


Other ways

As my colleague Markus Krüger has noted:
You could also use auto registration or auto discovery to add hosts to groups, then extract aggregated data across all hosts in the host group. (Granted, that only works if you want data from all hosts in the group - but if you don't want the data, don't add the host to that group.) That way, no manual work is needed to add monitoring and graphing across multiple instances winking into and out of existence.
Some links:
Auto registration (2.0 docs, but should be fairly accurate for 1.8.3 as well)
That being said, Zabbix is still fairly awkward to work with.
This makes it possible to get aggregate metrics such as avg, max, min, sum of a metric for the whole host group. Using auto discovery and auto registration makes it possible to assign hosts to groups automatically.
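As far as I know, such aggregates are defined as items of the type "Zabbix aggregate" whose key has the form groupfunc["Host group","Item key",itemfunc,timeperiod], with grpavg, grpmax, grpmin and grpsum as the group functions. The helper below only assembles such a key string; its name, its defaults and the example item key are my assumptions, so verify the exact syntax against the documentation for your Zabbix version.

```python
# Assumed mapping from the aggregate we want to Zabbix's group function name
GROUP_FUNCS = {"avg": "grpavg", "max": "grpmax", "min": "grpmin", "sum": "grpsum"}


def aggregate_key(func, host_group, item_key, item_func="last", period="0"):
    """Assemble an aggregate item key for a whole host group."""
    if func not in GROUP_FUNCS:
        raise ValueError("func must be one of: " + ", ".join(sorted(GROUP_FUNCS)))
    return '%s["%s","%s",%s,%s]' % (
        GROUP_FUNCS[func], host_group, item_key, item_func, period)


print(aggregate_key("avg", "Analytics production", "system.cpu.load[,avg15]"))
```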

Conclusion

Using the API is easy and quick, especially with Python. Working with the UI is so slow and painful that I really recommend using the API.


Tags: DevOps python


Copyright © 2024 Jakub Holý