Infrastructure monitoring with Pappaya Cloud ELK

Infrastructure monitoring is a business process that collects and analyses data from an organization’s IT infrastructure. The collected data is used to improve business outcomes and create value for the organization. IT infrastructure monitoring should cover every endpoint and application connected to the organization’s network, including hardware, operating systems, networks, and applications.

Pappaya Cloud ELK automatically monitors the entire infrastructure. It gives deep visibility into the IT infrastructure as a whole, across cloud, on-premise, virtual machines, and containers. Pappaya Cloud ELK allows organizations to act on potential security threats in real time and reduce unplanned application downtime.

Benefits of Pappaya Cloud ELK

  • One dashboard to analyze data metrics from servers, Docker, Kubernetes, and applications
  • Fast data search
  • In-depth metrics analysis

Working of Pappaya Cloud ELK

Pappaya Logstash reads the log files and indexes them into Pappaya Elasticsearch, and Pappaya Kibana provides the UI for the Pappaya Elasticsearch results.

Legacy way

Logs are generated on server1, server2, and server3. Pappaya Logstash indexes these files into Pappaya Elasticsearch, which runs on a separate server. In this setup, Pappaya Logstash must be installed on every server. Because Pappaya Logstash is heavyweight, it consumes a large share of each server's resources, which makes this setup far from optimal. To avoid that resource consumption, we use Beats, a few of which are Filebeat, Metricbeat, and Heartbeat.

Optimized way

Beats are lightweight data shippers, so they consume fewer resources. Pappaya installs them as agents on servers to send specific types of operational data to Pappaya Logstash. For example, Filebeat is responsible for sending log data to Pappaya Logstash. Pappaya Logstash processes this data and indexes it into Pappaya Elasticsearch.

Filebeat

This beat is mainly used for server logs. It uses a backpressure-sensitive protocol when sending data to Pappaya Logstash or Pappaya Elasticsearch, so that it can cope with higher volumes of data. If Pappaya Logstash is busy processing data, it tells Filebeat to slow down its reads. Once the congestion is resolved, Filebeat builds back up to its original pace and continues sending data to Pappaya Logstash.
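The slow-down behaviour described above can be illustrated with a bounded queue. This is only a minimal sketch of the general backpressure idea, not the actual protocol Filebeat speaks: the function name `run_pipeline`, the queue size, and the uppercase "processing" step are all assumptions made for illustration.

```python
import queue
import threading
import time


def run_pipeline(lines, queue_size=2, process_delay=0.01):
    """Ship lines through a bounded queue; a full queue makes the shipper wait."""
    # Assumption: a small bounded queue stands in for the shipper-to-Logstash link.
    buffer = queue.Queue(maxsize=queue_size)
    indexed = []

    def consumer():
        # Stand-in for Logstash: each event takes a while to process.
        while True:
            item = buffer.get()
            if item is None:  # sentinel: no more events
                break
            time.sleep(process_delay)
            indexed.append(item.upper())

    worker = threading.Thread(target=consumer)
    worker.start()
    for line in lines:
        # put() blocks while the queue is full, so the reader slows down
        # automatically and speeds up again once the consumer catches up.
        buffer.put(line)
    buffer.put(None)
    worker.join()
    return indexed


if __name__ == "__main__":
    print(run_pipeline(["login ok", "disk warning", "login failed"]))
```

The shipper never drops events and never needs to know how fast the consumer is; it simply cannot get further ahead than the queue size allows.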

Harvester

A harvester reads the content of a single file and sends that content to the output.

Input

An input manages the harvesters and finds all sources of logs. When the user starts the Filebeat service, it starts one or more inputs that look in the configured log data locations.
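The relationship between an input and its harvesters can be sketched roughly as follows. This is an illustration of the concept only, not Filebeat's implementation; `harvest`, `run_input`, and the event fields `source` and `message` are hypothetical names.

```python
import glob
import os
import tempfile


def harvest(path):
    """Harvester: read a single file line by line and yield its content."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            yield {"source": path, "message": line.rstrip("\n")}


def run_input(pattern):
    """Input: find every file matching the pattern and run one harvester per file."""
    events = []
    for path in sorted(glob.glob(pattern)):
        events.extend(harvest(path))
    return events


if __name__ == "__main__":
    # Create two throwaway log files so the example is self-contained.
    with tempfile.TemporaryDirectory() as log_dir:
        for name, body in [("app.log", "started\nready\n"), ("db.log", "connected\n")]:
            with open(os.path.join(log_dir, name), "w", encoding="utf-8") as f:
                f.write(body)
        for event in run_input(os.path.join(log_dir, "*.log")):
            print(os.path.basename(event["source"]), event["message"])
```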

Metricbeat

This beat is used for infrastructure-level and service-level monitoring. Metricbeat is a lightweight shipper that you can install on your servers to periodically collect metrics from the operating system and from services running on the server. Metricbeat takes the metrics and statistics that it collects and ships them to the output you specify, such as Pappaya Elasticsearch or Pappaya Logstash.

Metricbeat helps you monitor your servers by collecting metrics from the system and from services running on it, such as web servers, load balancers, and databases.
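To make "periodically collect and ship" concrete, here is a small sketch that uses only the Python standard library. It is not how the beat itself is built; the field names, the `collect_metrics` and `run` functions, and the `ship` callback are assumptions chosen for the example.

```python
import json
import os
import shutil
import time


def collect_metrics(path="."):
    """Collect one snapshot of host metrics as a JSON-ready document."""
    disk = shutil.disk_usage(path)
    doc = {
        "timestamp": time.time(),
        "cpu_count": os.cpu_count(),
        "disk_total_bytes": disk.total,
        "disk_used_pct": round(100 * disk.used / disk.total, 2) if disk.total else 0.0,
    }
    try:
        # The 1-minute load average is not available on every platform.
        doc["load_1m"] = os.getloadavg()[0]
    except (AttributeError, OSError):
        pass
    return doc


def run(period_seconds, samples, ship=print):
    """Collect `samples` snapshots, one every `period_seconds`, and ship each one."""
    for i in range(samples):
        ship(json.dumps(collect_metrics()))
        if i < samples - 1:
            time.sleep(period_seconds)


if __name__ == "__main__":
    run(period_seconds=1, samples=3)
```

Here `ship` defaults to `print`; in a real deployment the output would be whatever destination is configured, such as Pappaya Elasticsearch or Pappaya Logstash.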

Auditbeat

Auditbeat is a lightweight shipper that you can install on your servers to audit the activities of users and processes on your systems.

Pappaya Cloud Xpack

Pappaya Cloud Xpack is an Elastic Stack extension that provides security, alerting, monitoring, reporting, machine learning, and many other capabilities.

Pappaya Cloud Elasticsearch

It is a search engine that stores data in the form of JSON documents. Each document can be compared to a row in an RDBMS. As in any other database, the insert, delete, update, and retrieve operations can be performed in Pappaya Cloud Elasticsearch. Its primary purpose, however, is its powerful search capability.
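The four operations map onto HTTP requests in the standard Elasticsearch document API (version 7 and later). The sketch below only builds the request method, path, and body; it sends nothing. It assumes that Pappaya Cloud Elasticsearch exposes that standard API, which the text above does not confirm, and the index name `server-logs` and the document fields are hypothetical.

```python
import json


def index_request(index, doc_id, document):
    """'insert': store a JSON document under the given ID."""
    return ("PUT", f"/{index}/_doc/{doc_id}", json.dumps(document))


def get_request(index, doc_id):
    """'retrieve': fetch a document by ID."""
    return ("GET", f"/{index}/_doc/{doc_id}", None)


def update_request(index, doc_id, changes):
    """'update': merge the changed fields into an existing document."""
    return ("POST", f"/{index}/_update/{doc_id}", json.dumps({"doc": changes}))


def delete_request(index, doc_id):
    """'delete': remove a document by ID."""
    return ("DELETE", f"/{index}/_doc/{doc_id}", None)


if __name__ == "__main__":
    # Hypothetical index and document, for illustration only.
    print(index_request("server-logs", "1", {"host": "server1", "level": "error"}))
    print(get_request("server-logs", "1"))
    print(update_request("server-logs", "1", {"level": "warning"}))
    print(delete_request("server-logs", "1"))
```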

Pappaya Cloud Kibana

It is a visualization tool that analyses the data in Pappaya Elasticsearch and represents it visually as charts, graphs, and many other formats. It can also manage parts of Pappaya Elasticsearch and Pappaya Logstash. Pappaya Kibana can be used to create dashboards for system admins, developers, and management.

Pappaya Cloud Logstash

It is a data processing pipeline that pulls in data in parallel from various sources, then parses and transforms it into a convenient format for fast and easy analysis. Finally, it sends the data to Pappaya Elasticsearch or any other destination of choice.
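The parse-then-transform step can be pictured with a short sketch. It is written in plain Python rather than in a real pipeline configuration, and the log line layout, the regular expression, and the `parse_failure` tag are assumptions made for this example.

```python
import json
import re

# Assumed line layout: "<timestamp> <LEVEL> <service>: <message>"
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<service>\S+): (?P<message>.*)"
)


def parse(line):
    """Parse: turn a raw log line into named fields, or tag it as unparsed."""
    match = LINE_PATTERN.fullmatch(line.strip())
    if match is None:
        return {"message": line.strip(), "tags": ["parse_failure"]}
    return match.groupdict()


def transform(event):
    """Transform: normalise fields so they are easy to search and aggregate."""
    event = dict(event)
    if "level" in event:
        event["level"] = event["level"].lower()
    return event


def pipeline(lines):
    """Run every line through parse and transform and emit JSON documents."""
    return [json.dumps(transform(parse(line)), sort_keys=True) for line in lines]


if __name__ == "__main__":
    for doc in pipeline(["2024-01-01T00:00:00Z ERROR web: disk full", "garbage"]):
        print(doc)
```

Lines that do not match the expected layout are kept and tagged rather than dropped, so nothing is lost on the way to the destination.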

Pappaya Cloud Kibana

Once all the documents are indexed, the indexes can be found under Management > Index Management in Pappaya Cloud Kibana.

Several visualizations can be created with those indexes/documents using Pappaya Cloud Kibana.

Shards and Replicas

The documents within an index can be split into pieces called shards, which are distributed across multiple nodes of the Pappaya Cloud Elasticsearch cluster and physically stored on disk. Pappaya Cloud Elasticsearch combines the data from the different shards when responding to a query, so a request can be sent to any node. It also keeps each shard’s data in sync with its replicas during operations such as indexing and deletion. Shards allow operations to run in parallel across nodes, which improves performance.

Pappaya Cloud Elasticsearch allows copies of the shards to be kept, providing a failover mechanism in a distributed environment. These copies are called replicas. It is very important that an original shard and its replica are never on the same node: if that node went down for any reason, both copies would be lost, which would defeat the purpose of having a replica.
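The two ideas above, routing a document to a shard and keeping a replica away from its original, can be sketched in a few lines. This is a simplified illustration: `zlib.crc32` stands in for whatever hash function the real engine uses, the round-robin placement is an assumption, and the node names are hypothetical.

```python
import zlib


def shard_for(doc_id, num_primary_shards):
    """Pick the shard that holds a document, based on a hash of its ID."""
    # Assumption: crc32 is only a stand-in for the engine's own hash function.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def allocate(num_primary_shards, nodes):
    """Place each original shard and one replica on two different nodes."""
    if len(nodes) < 2:
        raise ValueError("a replica needs a second node to live on")
    layout = {}
    for shard in range(num_primary_shards):
        # Round-robin placement: the replica always lands on the next node,
        # so it can never share a node with its original shard.
        layout[shard] = {
            "primary": nodes[shard % len(nodes)],
            "replica": nodes[(shard + 1) % len(nodes)],
        }
    return layout


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
    for shard, placement in allocate(3, nodes).items():
        print(shard, placement)
    print("doc 'log-42' lives on shard", shard_for("log-42", 3))
```

With this layout, losing any single node still leaves one copy of every shard available on another node.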

Querying exact value and full text

Querying an exact value is like extracting data with a WHERE clause in a SQL statement.
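As a rough illustration, the standard Elasticsearch Query DSL expresses an exact-value lookup as a `term` query and a full-text search as a `match` query. The sketch below only builds the JSON request bodies. It assumes that Pappaya Cloud Elasticsearch accepts the standard Query DSL, and the field names `level` and `message` are hypothetical.

```python
import json


def exact_value_query(field, value):
    """Exact value: comparable to SQL's  WHERE field = 'value'."""
    return {"query": {"term": {field: value}}}


def full_text_query(field, text):
    """Full text: the text is analysed and matched against the field's words."""
    return {"query": {"match": {field: text}}}


if __name__ == "__main__":
    # Comparable to: SELECT * FROM logs WHERE level = 'error'
    print(json.dumps(exact_value_query("level", "error")))
    # Finds documents whose message contains words such as "disk" or "full".
    print(json.dumps(full_text_query("message", "disk full")))
```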