Friday, 19 December 2014

HDFS Admin Commands



While the dfs module for bin/hadoop provides common file and directory manipulation commands, they all work with objects within the file system. The dfsadmin module manipulates or queries the file system as a whole. The operation of the commands in this module is described in this section.
Getting overall status: A brief status report for HDFS can be retrieved with bin/hadoop dfsadmin -report. This returns basic information about the overall health of the HDFS cluster, as well as some per-server metrics.
More involved status: If you need to know more details about what the state of the NameNode's metadata is, the command bin/hadoop dfsadmin -metasave filename will record this information in filename. The metasave command will enumerate lists of blocks which are under-replicated, in the process of being replicated, and scheduled for deletion. NB: The help for this command states that it "saves NameNode's primary data structures," but this is a misnomer; the NameNode's state cannot be restored from this information. However, it will provide good information about how the NameNode is managing HDFS's blocks.
Safemode: Safemode is an HDFS state in which the file system is mounted read-only; no replication is performed, nor can files be created or deleted. This is automatically entered as the NameNode starts, to allow all DataNodes time to check in with the NameNode and announce which blocks they hold, before the NameNode determines which blocks are under-replicated, etc. The NameNode waits until a specific percentage of the blocks are present and accounted for; this is controlled in the configuration by the dfs.safemode.threshold.pct parameter. After this threshold is met, safemode is automatically exited, and HDFS allows normal operations. The bin/hadoop dfsadmin -safemode what command allows the user to manipulate safemode based on the value of what, described below (see the short example after this list):
  • enter - Enters safemode
  • leave - Forces the NameNode to exit safemode
  • get - Returns a string indicating whether safemode is ON or OFF
  • wait - Waits until safemode has exited and returns
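For example, a minimal sequence for taking HDFS in and out of safemode by hand could look like the following; the commands are exactly the four subcommands listed above:

 bin/hadoop dfsadmin -safemode get    # report whether safemode is ON or OFF
 bin/hadoop dfsadmin -safemode enter  # force the NameNode into safemode
 bin/hadoop dfsadmin -safemode leave  # force the NameNode back out of safemode
 bin/hadoop dfsadmin -safemode wait   # block until safemode has been exited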
Changing HDFS membership - When decommissioning nodes, it is important to disconnect nodes from HDFS gradually to ensure that data is not lost. See the section on decommissioning later in this document for an explanation of the use of the -refreshNodes dfsadmin command.
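As a rough sketch of what that involves, decommissioning usually amounts to listing the node in an excludes file and asking the NameNode to re-read it; the hostname and file path below are placeholders, and the excludes file is whatever path your dfs.hosts.exclude property in conf/hdfs-site.xml points to:

 echo "datanode05.example.com" >> /path/to/excludes   # hypothetical host and excludes file
 bin/hadoop dfsadmin -refreshNodes                    # tell the NameNode to re-read the excludes file
 bin/hadoop dfsadmin -report                          # repeat until the node shows as decommissioned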
Upgrading HDFS versions - When upgrading from one version of Hadoop to the next, the file formats used by the NameNode and DataNodes may change. When you first start the new version of Hadoop on the cluster, you need to tell Hadoop to change the HDFS version (or else it will not mount), using the command: bin/start-dfs.sh -upgrade. It will then begin upgrading the HDFS version. The status of an ongoing upgrade operation can be queried with the bin/hadoop dfsadmin -upgradeProgress status command. More verbose information can be retrieved with bin/hadoop dfsadmin -upgradeProgress details. If the upgrade is blocked and you would like to force it to continue, use the command: bin/hadoop dfsadmin -upgradeProgress force. (Note: be sure you know what you are doing if you use this last command.)
When HDFS is upgraded, Hadoop retains backup information allowing you to downgrade to the original HDFS version in case you need to revert Hadoop versions. To back out the changes, stop the cluster, re-install the older version of Hadoop, and then use the command: bin/start-dfs.sh -rollback. It will restore the previous HDFS state.
Only one such archival copy can be kept at a time. Thus, after a few days of operation with the new version (when it is deemed stable), the archival copy can be removed with the command bin/hadoop dfsadmin -finalizeUpgrade. The rollback command cannot be issued after this point. This must be performed before a second Hadoop upgrade is allowed.
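Putting the commands from the last few paragraphs together, an upgrade lifecycle looks roughly like this:

 bin/start-dfs.sh -upgrade                    # start the new version and begin the upgrade
 bin/hadoop dfsadmin -upgradeProgress status  # check whether the upgrade has completed
 bin/hadoop dfsadmin -upgradeProgress details # more verbose progress information
 bin/start-dfs.sh -rollback                   # only if reverting (after re-installing the old version)
 bin/hadoop dfsadmin -finalizeUpgrade         # once the new version is deemed stable, discard the backup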
Getting help - As with the dfs module, typing bin/hadoop dfsadmin -help cmd will provide more usage information about the particular command.
For the detailed commands, see the link below.

ZOOKEEPER


Apache ZooKeeper provides operational services for a Hadoop cluster. ZooKeeper provides a distributed configuration service, a synchronization service and a naming registry for distributed systems. Distributed applications use Zookeeper to store and mediate updates to important configuration information.

What ZooKeeper Does

ZooKeeper provides a very simple interface and services. ZooKeeper brings these key benefits:
  • Fast. ZooKeeper is especially fast with workloads where reads to the data are more common than writes. The ideal read/write ratio is about 10:1.
  • Reliable. ZooKeeper is replicated over a set of hosts (called an ensemble) and the servers are aware of each other. As long as a critical mass of servers is available, the ZooKeeper service will also be available. There is no single point of failure.
  • Simple. ZooKeeper maintains a standard hierarchical name space, similar to files and directories.
  • Ordered. The service maintains a record of all transactions, which can be used for higher-level abstractions, like synchronization primitives.

How ZooKeeper Works

ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical name space of data registers, known as znodes. Every znode is identified by a path, with path elements separated by a slash (“/”). Aside from the root, every znode has a parent, and a znode cannot be deleted if it has children.
This is much like a normal file system, but ZooKeeper provides superior reliability through redundant services. The service is replicated over a set of machines, and each machine maintains an in-memory image of the data tree and transaction logs. Clients connect to a single ZooKeeper server and maintain a TCP connection through which they send requests and receive responses.
This architecture allows ZooKeeper to provide high throughput and availability with low latency, but the size of the database that ZooKeeper can manage is limited by memory.
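As an illustration of the znode hierarchy, here is a short session with the zkCli.sh command-line shell that ships with ZooKeeper; the /app1 path and the data values are made up for the example:

 bin/zkCli.sh -server localhost:2181
 create /app1 "config-v1"
 create /app1/workers ""
 ls /app1
 get /app1
 delete /app1

The final delete fails with "Node not empty" because /app1 still has a child, which is exactly the parent/child rule described above; delete /app1/workers first (or use rmr /app1) and it succeeds.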


For more details on ZooKeeper administration, see the link below.

Zookeeper Admin Guide

SQOOP & FLUME

 SQOOP

Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. Sqoop imports data from external structured datastores into HDFS or related systems like Hive and HBase. Sqoop can also be used to extract data from Hadoop and export it to external structured datastores such as relational databases and enterprise data warehouses. Sqoop works with relational databases such as: Teradata, Netezza, Oracle, MySQL, Postgres, and HSQLDB.

What Sqoop Does

Designed to efficiently transfer bulk data between Apache Hadoop and structured datastores such as relational databases, Apache Sqoop:
  • Allows data imports from external datastores and enterprise data warehouses into Hadoop
  • Parallelizes data transfer for fast performance and optimal system utilization
  • Copies data quickly from external systems to Hadoop
  • Makes data analysis more efficient 
  • Mitigates excessive loads to external systems.

How Sqoop Works

Sqoop provides a pluggable connector mechanism for optimal connectivity to external systems. The Sqoop extension API provides a convenient framework for building new connectors which can be dropped into Sqoop installations to provide connectivity to various systems. Sqoop itself comes bundled with various connectors that can be used for popular database and data warehousing systems.
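As a concrete example, a typical import from a relational database into HDFS, and an export back out, look something like the following; the JDBC connection string, credentials, table names and paths are placeholders rather than values from this article:

 sqoop import --connect jdbc:mysql://dbserver/sales --username dbuser -P \
       --table orders --target-dir /user/hadoop/orders -m 4

 sqoop export --connect jdbc:mysql://dbserver/sales --username dbuser -P \
       --table order_summary --export-dir /user/hadoop/order_summary

The -m flag controls how many parallel map tasks Sqoop uses for the transfer, which is where the parallel, bulk nature of the tool comes from.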

Refer to the links below for complete details on Sqoop and its commands.

 Sqoop User Guide

hortonworks sqoop example


FLUME

Apache™ Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of streaming data into the Hadoop Distributed File System (HDFS). It has a simple and flexible architecture based on streaming data flows; and is robust and fault tolerant with tunable reliability mechanisms for failover and recovery.

What Flume Does

Flume lets Hadoop users make the most of valuable log data. Specifically, Flume allows users to:
  • Stream data from multiple sources into Hadoop for analysis
  • Collect high-volume Web logs in real time
  • Insulate themselves from transient spikes when the rate of incoming data exceeds the rate at which data can be written to the destination
  • Guarantee data delivery
  • Scale horizontally to handle additional data volume

How Flume Works

Flume’s high-level architecture is focused on delivering a streamlined codebase that is easy-to-use and easy-to-extend. The project team has designed Flume with the following components:
  • Event – a singular unit of data that is transported by Flume (typically a single log entry)
  • Source – the entity through which data enters into Flume. Sources either actively poll for data or passively wait for data to be delivered to them. A variety of sources allow data to be collected, such as log4j logs and syslogs.
  • Sink – the entity that delivers the data to the destination. A variety of sinks allow data to be streamed to a range of destinations. One example is the HDFS sink that writes events to HDFS.
  • Channel – the conduit between the Source and the Sink. Sources ingest events into the channel and the sinks drain the channel.
  • Agent – any physical Java virtual machine running Flume. It is a collection of sources, sinks and channels.
  • Client – produces and transmits the Event to the Source operating within the Agent
A flow in Flume starts from the Client. The Client transmits the event to a Source operating within the Agent. The Source receiving this event then delivers it to one or more Channels. These Channels are drained by one or more Sinks operating within the same Agent. Channels allow decoupling of ingestion rate from drain rate using the familiar producer-consumer model of data exchange. When spikes in client side activity cause data to be generated faster than what the provisioned capacity on the destination can handle, the channel size increases. This allows sources to continue normal operation for the duration of the spike. Flume agents can be chained together by connecting the sink of one agent to the source of another agent. This enables the creation of complex dataflow topologies.
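To make the Source/Channel/Sink wiring concrete, here is a minimal single-agent configuration, assuming a netcat source and an HDFS sink; the agent name, port and HDFS path are illustrative only:

cat > flume.conf <<'EOF'
agent1.sources  = src1
agent1.channels = ch1
agent1.sinks    = sink1

# source: listen for newline-separated events on a TCP port
agent1.sources.src1.type = netcat
agent1.sources.src1.bind = 0.0.0.0
agent1.sources.src1.port = 44444
agent1.sources.src1.channels = ch1

# channel: buffer events in memory between source and sink
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000

# sink: drain the channel into HDFS
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = /flume/events
agent1.sinks.sink1.channel = ch1
EOF

flume-ng agent --conf ./conf --conf-file flume.conf --name agent1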

Reliability & Scaling

Flume is designed to be highly reliable, so that no data is lost during normal operation. Flume also supports dynamic reconfiguration without the need for a restart, which reduces downtime for Flume agents. Flume is architected to be fully distributed with no central coordination point. Each agent runs independently of the others, with no inherent single point of failure. Flume also features built-in support for load balancing and failover. Flume’s fully decentralized architecture also plays a key role in its ability to scale: since each agent runs independently, Flume can be scaled horizontally with ease.

For more details on Flume, see the links below.

Flume User Guide

Hortonworks example on Flume

Hortonworks Example 2

HIVE Vs PIG


While I was looking at Hive and Pig for processing large amounts of data without the need to write MapReduce code, I found that there is no easy way to compare them against each other without reading into both in greater detail.

In this post I am trying to give you a 10,000ft view of both and compare some of the more prominent and interesting features. The following table - which is discussed below - compares what I deemed to be such features:


Feature                           Hive                 Pig
Language                          SQL-like             PigLatin
Schemas/Types                     Yes (explicit)       Yes (implicit)
Partitions                        Yes                  No
Server                            Optional (Thrift)    No
User Defined Functions (UDF)      Yes (Java)           Yes (Java)
Custom Serializer/Deserializer    Yes                  Yes
DFS Direct Access                 Yes (implicit)       Yes (explicit)
Join/Order/Sort                   Yes                  Yes
Shell                             Yes                  Yes
Streaming                         Yes                  Yes
Web Interface                     Yes                  No
JDBC/ODBC                         Yes (limited)        No


Let us now look at each of these in a bit more detail.

General Purpose

The question is "What does Hive or Pig solve?". Both - and I think this is lucky for us in regards to comparing them - have a very similar goal. They try to ease the complexity of writing MapReduce jobs in a programming language like Java by giving the user a set of tools that they may be more familiar with (more on this below). The raw data is stored in Hadoop's HDFS and can be in any format, although natively it is usually a TAB-separated text file; internally they may also make use of Hadoop's SequenceFile format. The idea is to be able to parse a raw data file, for example a web server log file, and use the contained information to slice and dice it into what the business needs. Therefore they provide means to aggregate fields based on specific keys. In the end they both emit the result again in either text or a custom file format. Efforts are also underway to have both use other systems as a source for data, for example HBase.

The features I am comparing are chosen pretty much at random because they stood out when I read into each of these two frameworks. So keep in mind that this is a subjective list.

Language

Hive lends itself to SQL. But since it can only read already existing files in HDFS, it lacks UPDATE or DELETE support, for example. It focuses primarily on the query part of SQL. But even there it has its own spin on things to better reflect the underlying MapReduce process. Overall it seems that someone familiar with SQL can very quickly learn Hive's version of it and get results fast.

Pig, on the other hand, looks more like a very simple scripting language. As with those (and this is a nearly religious topic) some are more intuitive and some are less so. With PigLatin I was able to see what the samples do, but lacking full knowledge of its syntax I found myself wondering whether I would really be able to get what I needed without too many trial-and-error loops. Sure, Hive's SQL probably needs as many iterations to fully grasp - but there is at least a greater understanding of what to expect.
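To give a feel for the two languages, here is the same simple aggregation expressed in both, run from the command line; the logs table, the /data/access_logs path and the column names are invented for the example:

# HiveQL: count log lines per HTTP status (assumes a 'logs' table has already been defined)
hive -e "SELECT status, COUNT(*) AS hits FROM logs GROUP BY status;"

# PigLatin: the same aggregation over a raw TAB-separated file
cat > status_counts.pig <<'EOF'
logs    = LOAD '/data/access_logs' USING PigStorage('\t')
          AS (ip:chararray, url:chararray, status:int);
grouped = GROUP logs BY status;
counts  = FOREACH grouped GENERATE group AS status, COUNT(logs) AS hits;
DUMP counts;
EOF
pig status_counts.pig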

Schemas/Types

Hive uses once more a specific variation of SQL's Data Definition Language (DDL). It defines the "tables" beforehand and stores the schema in either a shared or a local database. Any JDBC offering will do, but it also comes with a built-in Derby instance to get you started quickly. If the database is local then only you can run specific Hive commands. If you share the database then others can also run these - or they would have to set up their own local database copy. Types are also defined upfront, and supported types are INT, BIGINT, BOOLEAN, STRING and so on. There are also array types that let you handle specific fields in the raw data files as a group.

Pig has no such metadata database. Datatypes and schemas are defined within each script. Types, furthermore, are usually automatically determined by their use. So if you use a field as an integer it is handled that way by Pig. You do have the option, though, to override this and give explicit type definitions, again within the script that needs them. Pig has a similar set of types compared to Hive. For example it also has an array type called "bag".

Partitions

Hive has a notion of partitions, which are basically subdirectories in HDFS. This allows, for example, processing a subset of the data by alphabet or date. It is up to the user to create these "partitions", as they are neither enforced nor required.
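As a small sketch of how this is used (the table, columns and dates are hypothetical), partitions are declared in the table definition and then addressed explicitly when loading and querying:

hive -e "CREATE TABLE weblogs (ip STRING, url STRING, status INT)
         PARTITIONED BY (dt STRING)
         ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';"

# each partition becomes its own subdirectory, e.g. .../weblogs/dt=2014-12-19/
hive -e "LOAD DATA INPATH '/staging/2014-12-19' INTO TABLE weblogs PARTITION (dt='2014-12-19');"
hive -e "SELECT COUNT(*) FROM weblogs WHERE dt='2014-12-19';"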

Pig does not seem to have such a feature. It may be that filters can achieve the same but it is not immediately obvious to me.

Server

Hive can start an optional server, which is allegedly Thrift based. With the server I presume you can send queries from anywhere to the Hive server which in turn executes them.

Pig does not seem to have such a facility yet.

User Defined Functions

Hive and Pig allow for user functionality by supplying Java code to the query process. These functions can add any additional feature that is required to crunch the numbers as required.

Custom Serializer/Deserializer

Again, both Hive and Pig allow for custom Java classes that can read or write any file format required. I also assume that is how it connects to HBase eventually (just a guess). You can write a parser for Apache log files or, for example, the binary Tokyo Tyrant Ulog format. The same goes for the output, write a database output class and you can write the results back into a database.

DFS Direct Access

Hive is smart about how to access the raw data. A "select * from table limit 10" for example does a direct read from the file. If the query is too complicated it will fall back to use a full MapReduce run to determine the outcome, just as expected.

With Pig I am not sure if it does the same to speed up simple PigLatin scripts. At least it does not seem to be mentioned anywhere as an important feature.

Join/Order/Sort

Hive and Pig have support for joining, ordering and sorting data dynamically. They serve the same purpose in both, pretty much allowing you to aggregate and sort the result as needed. Pig also has a COGROUP feature that allows you to do OUTER JOINs and so on. I think this is where you spend most of your time with either package - especially when you start out. But from a cursory look it seems both can do pretty much the same.
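A rough sketch of an equivalent join in both, again with made-up users and orders datasets:

# HiveQL join, ordered by the order total
hive -e "SELECT u.name, o.total AS total FROM users u JOIN orders o ON (u.id = o.user_id) ORDER BY total DESC;"

# PigLatin equivalent
cat > join_example.pig <<'EOF'
users   = LOAD '/data/users'  USING PigStorage('\t') AS (id:int, name:chararray);
orders  = LOAD '/data/orders' USING PigStorage('\t') AS (user_id:int, total:double);
joined  = JOIN users BY id, orders BY user_id;
ordered = ORDER joined BY orders::total DESC;
DUMP ordered;
EOF
pig join_example.pig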

Shell

Both Hive and Pig have a shell that allows you to query specific things or run the actual queries. Pig also passes on DFS commands such as "cat" to allow you to quickly check what an outcome of a specific PigLatin script was.

Streaming

Once more, both frameworks seem to provide streaming interfaces so that you can process data with external tools or languages, such as Ruby or Python. How the streaming performs, and whether it affects the two differently, I do not know. This is for you to tell me :)

Web Interface

Only Hive has a web interface or UI that can be used to visualize the various schemas and issue queries. This is different from the above-mentioned Server, as it is an interactive web UI for a human operator. The Hive Server is for use from another programming or scripting language, for example.

JDBC/ODBC

Another Hive-only feature is the availability of a - again limited in functionality - JDBC/ODBC driver. It is another way for programmers to use Hive without having to bother with its shell or web interface, or even the Hive Server. Since only a subset of features is available it will require small adjustments on the programmer's side of things, but otherwise it seems like a nice-to-have feature.

Conclusion

Well, it seems to me that both can help you achieve the same goals, while Hive comes more naturally to database developers and Pig to "script kiddies" (just kidding). Hive has more features as far as access choices are concerned. Both projects also reportedly have roughly the same number of committers and are going strong development-wise.

This is it from me. Do you have a different opinion or comment on the above then please feel free to reply below. Over and out!

HIVE Vs HBASE


What They Do

Apache Hive is a data warehouse infrastructure built on top of Hadoop. It allows for querying data stored on HDFS for analysis via HQL, an SQL-like language that gets translated to MapReduce jobs. Despite providing SQL functionality, Hive does not provide interactive querying yet - it only runs batch processes on Hadoop.
Apache HBase is a NoSQL key/value store which runs on top of HDFS. Unlike Hive, HBase operations run in real-time on its database rather than as MapReduce jobs. HBase is partitioned into tables, and tables are further split into column families. Column families, which must be declared in the schema, group together a certain set of columns (columns don’t require schema definition). For example, the "message" column family may include the columns: "to", "from", "date", "subject", and "body". Each key/value pair in HBase is defined as a cell, and each key consists of row-key, column family, column, and time-stamp. A row in HBase is a grouping of key/value mappings identified by the row-key. HBase enjoys Hadoop’s infrastructure and scales horizontally using off-the-shelf servers.

Features

Hive can help the SQL savvy to run MapReduce jobs. Since it’s JDBC compliant, it also integrates with existing SQL based tools. Running Hive queries could take a while since they go over all of the data in the table by default. Nonetheless, the amount of data can be limited via Hive’s partitioning feature. Partitioning allows running a filter query over data that is stored in separate folders, and only read the data which matches the query. It could be used, for example, to only process files created between certain dates, if the files include the date format as part of their name.
HBase works by storing data as key/value. It supports four primary operations: put to add or update rows, scan to retrieve a range of cells, get to return cells for a specified row, and delete to remove rows, columns or column versions from the table. Versioning is available so that previous values of the data can be fetched (the history can be deleted every now and then to clear space via HBase compactions). Although HBase includes tables, a schema is only required for tables and column families, but not for columns, and it includes increment/counter functionality.
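For illustration, here is what those four operations look like in the HBase shell, reusing the "message" column family from the example above; the table name, row key and values are made up:

hbase shell
create 'messages', 'message'                          # a table with one column family
put 'messages', 'row1', 'message:subject', 'Hello'    # add or update a cell
put 'messages', 'row1', 'message:body', 'Hi there'
get 'messages', 'row1'                                # return the cells for one row
scan 'messages', {LIMIT => 10}                        # retrieve a range of rows
delete 'messages', 'row1', 'message:subject'          # remove a cell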

Limitations

Hive does not currently support update statements. Additionally, since it runs batch processing on Hadoop, it can take minutes or even hours to get back results for queries. Hive must also be provided with a predefined schema to map files and directories into columns and it is not ACID compliant.
HBase queries are written in a custom language that needs to be learned. SQL-like functionality can be achieved via Apache Phoenix, though it comes at the price of maintaining a schema. Furthermore, HBase isn’t fully ACID compliant, although it does support certain properties. Last but not least - in order to run HBase, ZooKeeper is required - a server for distributed coordination such as configuration, maintenance, and naming.

Use Cases

Hive should be used for analytical querying of data collected over a period of time - for instance, to calculate trends or website logs. Hive should not be used for real-time querying since it could take a while before any results are returned.
HBase is perfect for real-time querying of Big Data. Facebook uses it for messaging and real-time analytics. They may even be using it to count Facebook likes.

Summary

Hive and HBase are two different Hadoop based technologies - Hive is an SQL-like engine that runs MapReduce jobs, and HBase is a NoSQL key/value database on Hadoop. But hey, why not use them both? Just like Google can be used for search and Facebook for social networking, Hive can be used for analytical queries while HBase for real-time querying. Data can even be read and written from Hive to HBase and back again.

You would not normally compare Hive and HBase at all; the comparison commonly happens because of the SQL-like layer on Hive. HBase is a database, but Hive is not a database.

Hive is a MapReduce-based analysis/summarization tool running on top of Hadoop. Hive depends on MapReduce (batch processing) + HDFS.

HBase is a database (NoSQL), which is used to store and retrieve data.
To query (scan) HBase, MapReduce is not required, so HBase depends only on HDFS, not on MapReduce. This makes HBase an online processing system.

Wednesday, 27 August 2014

Hadoop (HDFS) version upgrade without losing data

 We upgraded the versions from

 Hadoop 0.20.205.0 to ==> hadoop 1.0.3

 Hbase 0.90.4 to      ==> hbase 0.94.1

 

 We followed the steps below and hope they help.

 

 Before upgrading HDFS, make sure the existing cluster is working fine and the filesystem is healthy.

 1.       Stop all client applications running on the MapReduce cluster.

 stop-mapred.sh

 2.       Kill any orphaned task processes on the TaskTrackers.

 3.       Perform a filesystem check:

 hadoop fsck / -files -blocks -locations > dfs-v-old-fsck-1.log

 4.       Save a complete listing of the HDFS namespace to a local file.

 hadoop dfs -lsr / > dfs-v-old-lsr-1.log

 5.       Create a list of DataNodes participating in the cluster.

 hadoop dfsadmin -report > dfs-v-old-report-1.log

 6.    Stop and restart the HDFS cluster (to create a checkpoint of the old version).

 stop-dfs.sh

 start-dfs.sh

 7.       Before stopping the DFS, take a backup of the data directory used for storing the image and other files of HDFS

          (the path specified in conf/hdfs-site.xml by the <name>dfs.data.dir</name> property).

 8.       stop the hdfs cluster.

 stop-dfs.sh            

 After you have installed the new Hadoop version

 1.       Update the following configuration files in the new installation:

 conf/slaves , conf/masters, conf/core-site.xml , conf/hdfs-site.xml, conf/mapred-site.xml

 2.       Start the actual HDFS upgrade process.

 hadoop-daemon.sh start namenode -upgrade

 3.       Check the upgrade process status

 hadoop dfsadmin -upgradeProgress status

 This should give you:

 Upgrade for version -<new version number> has been completed.

 Upgrade is not finalized.

 4.       Take a new listing of the HDFS namespace.

 hadoop dfs -lsr / > dfs-v-new-lsr-0.log

 Compare it with old

 5.       Perform a filesystem check

 hadoop fsck / -files -blocks -locations > dfs-v-new-fsck-1.log

 and compare it with old
 6. Create list of DataNodes participating in the cluster.

    hadoop dfsadmin -report > dfs-v-new-report-1.log

     and compare it with old

 7.       Start the HDFS cluster

 start-dfs.sh

 8.       Start the MapReduce cluster

 start-mapred.sh

 9.       Finalize the upgrade

        hadoop dfsadmin -finalizeUpgrade

Tuesday, 8 July 2014

Ganglia Monitoring Tool 1


What is Ganglia:

  • It is a highly scalable monitoring system for high performance computing.
  • It can monitor a system or clusters of systems or grid of clusters.
  • It uses the XML technology for data representation.
  • It uses RRDtool for data storage and visualization.
  • The implementation of ganglia is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world.
  • It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.
In a simple manner, “Ganglia is a real-time cluster monitoring tool that collects information from each computer in the cluster and provides an interactive way to view the performance of individual computers and the cluster as a whole.”

Like other monitoring tools, Ganglia only provides a way to view, not control, the performance of the cluster.
Architecture of Ganglia:
      The Ganglia system consists of two daemons (gmond and gmetad), a PHP-based web frontend, and two other utilities, gmetric and gstat.
What is Gmond:
      Gmond runs on every node of the cluster and gathers information such as CPU, memory, network, disk and swap usage.
What is Gmetad:
      Gmetad runs on the head node. It gathers data from all other nodes and stores it in a round-robin database. It can poll multiple clusters and aggregate the metrics. It is also used by the web frontend when generating the UI.
What is PHP Web Frontend:
     The Ganglia web front-end provides a view of the gathered information via real-time dynamic web pages. Most importantly, it displays Ganglia data in a meaningful way for system administrators and computer users. It should be installed on the same machine where gmetad is installed.
Ganglia Installation:
Installation of ganglia on master node:
 apt-get install ganglia-monitor rrdtool gmetad ganglia-webfrontend

The above command will install the gmond, gmetad and ganglia web UI on the node. The ganglia web frontend package also installs the required apache server and php modules. In order to deploy and run Ganglia in Apache server, it is required to copy the apache.conf file from /etc/ganglia-webfrontend/apache.conf to /etc/apache2/sites-enabled/:
sudo cp /etc/ganglia-webfrontend/apache.conf /etc/apache2/sites-enabled/ganglia.conf

The /etc/ganglia-webfrontend/apache.conf contains a simple alias for /ganglia towards /usr/share/ganglia-webfrontend.
Installation of ganglia on other nodes:
  apt-get install ganglia-monitor

The above command will install the ganglia monitor.
Gmond configuration on master node:
      There are two types of configuration Ganglia supports: one is multicast and the other is unicast. Here I am taking an example of a cluster to configure Ganglia in unicast mode. I have a cluster named “Test” with 192.168.1.1 as the master node and 192.168.1.2 and 192.168.1.3 as slave nodes.
 globals {                   
  daemonize = yes             
  setuid = yes            
  user = ganglia             
  debug_level = 0              
  max_udp_msg_len = 1472       
  mute = no             
  deaf = no         
  allow_extra_data = yes  
  host_dmax = 0 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no            
  send_metadata_interval = 30                                       
}
cluster {
  name = "Test"
  owner = "clusterOwner"
  latlong = "unspecified"
  url = "unspecified"
}
udp_send_channel {
  host = 192.168.1.1
  port = 8649
  ttl = 1
} 
udp_recv_channel {
  port = 8649
}
tcp_accept_channel {
  port = 8649
}
Gmond configuration on other nodes:
globals {                   
  daemonize = yes             
  setuid = yes            
  user = ganglia             
  debug_level = 0              
  max_udp_msg_len = 1472       
  mute = no            
  deaf = no         
  allow_extra_data = yes  
  host_dmax = 0 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no            
  send_metadata_interval = 30
}
cluster {
  name = "Test"
  owner = "clusterOwner"
  latlong = "unspecified"
  url = "unspecified"
}
udp_send_channel {
   # mcast_join = 239.2.11.71
  host = 192.168.1.1
  port = 8649
  ttl = 1
} 
tcp_accept_channel {
  port = 8649
}
The gmond configuration defines the following properties.
global section :
  • daemonize : It is a Boolean attribute. When true, gmond will daemonize. When false, gmond will run in the foreground.
  • setuid : The setuid attribute is a boolean. When true, gmond will set its effective UID to the uid of the user specified by the user attribute. When false, gmond will not change its effective user. 
  • debug_level : The debug_level is an integer value. When set to zero (0), gmond will run normally. A debug_level greater than zero will result in gmond running in the foreground and outputting debugging information. The higher the debug_level the more verbose the output.
  • mute : The mute attribute is a boolean. When true, gmond will not send data regardless of any other configuration directives.
  • deaf : The deaf attribute is a boolean. When true, gmond will not receive data regardless of any other configuration directives. 
  • allow_extra_data  : The allow_extra_data attribute is a boolean. When false, gmond will not send out the EXTRA_ELEMENT and EXTRA_DATA parts of the XML. This might be useful if you are using your own frontend to the metric data and would like to save some bandwidth.
  • host_dmax: The host_dmax value is an integer with units in seconds. When set to zero (0), gmond will never delete a host from its list even when a remote host has stopped reporting. If host_dmax is set to a positive number then gmond will flush a host after it has not heard from it for host_dmax seconds. 
  • cleanup_threshold : The cleanup_threshold is the minimum amount of time before gmond will cleanup any hosts or metrics where tn > dmax a.k.a. expired data. 
  • gexec : The gexec boolean allows you to specify whether gmond will announce the hosts availability to run gexec jobs. Note: this requires that gexecd is running on the host and the proper keys have been installed.
  • send_metadata_interval  : The send_metadata_interval establishes an interval in which gmond will send or resend the metadata packets that describe each enabled metric. This directive by default is set to 0 which means that gmond will only send the metadata packets at startup and upon request from other gmond nodes running remotely. If a new machine running gmond is added to a cluster, it needs to announce itself and inform all other nodes of the metrics that it currently supports. In multicast mode, this isn't a problem because any node can request the metadata of all other nodes in the cluster. However in unicast mode, a resend interval must be established. The interval value is the minimum number of seconds between resends.

Cluster section : 
  • name : The name attribute specifies the name of the cluster of machines.
  • owner : The owner tag specifies the administrators of the cluster. The pair name/owner should be unique to all clusters in the world.
  • latlong : The latlong attribute is the latitude and longitude GPS coordinates of this cluster on earth.
    Specified to 1 mile accuracy with two decimal places per axis in decimal.
  • url : The url for more information on the cluster. Intended to give purpose, owner, administration, and account details for this cluster.
Udp_send_channel :
      You can define as many udp_send_channel sections as you like within the limitations of memory and file descriptors. If gmond is configured as mute this section will be ignored.
The udp_send_channel has a total of five attributes: mcast_join, mcast_if, host, port and ttl.
  • mcast_join and mcast_if : The mcast_join and mcast_if attributes are optional. When specified, gmond will create the UDP socket, join the mcast_join multicast group, and send data out of the interface specified by mcast_if.
  • ttl : The ttl is time to live field for send data.
  • host and port : If only a host and port are specified then gmond will send unicast UDP messages to the hosts specified. You could specify multiple unicast hosts for redundancy as gmond will send UDP messages to all UDP channels.
Udp_recv_channel :
      You can specify as many udp_recv_channel sections as you like within the limits of memory and file descriptors. If gmond is configured deaf this attribute will be ignored.
The udp_recv_channel section has following attributes: mcast_join, bind, port, mcast_if, family.
  • mcast_join and mcast_if : The mcast_join and mcast_if should only be used if you want this UDP channel to receive multicast packets from the multicast group mcast_join on interface mcast_if. If you do not specify multicast attributes then gmond will simply create a UDP server on the specified port.
  • port : The port is for creating a udp server on port.
  • bind : You can use the bind attribute to bind to a particular local address.
Tcp_accept_channel :
      You can specify as many tcp_accept_channel sections as you like within the limitations of memory and file descriptors. If gmond is configured to be mute, then these sections are ignored.
  • bind : The bind address is optional and allows you to specify which local address gmond will bind to for this channel. 
  • port : The port is an integer that specifies which port to answer requests for data on.

Gmetad Configuration:
 data_source "Test" 15 192.168.1.1:8649
The gmetad configuration defines the data source with the cluster name, the polling interval, and the IP and port of a running gmond. In the data source configuration above, “Test” is the cluster name, 15 is the gmetad polling interval in seconds, and “192.168.1.1:8649” is the gmond IP and port of the head node.
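If gmetad needs to poll more than one cluster, additional data_source lines can simply be added. The second cluster below, its addresses and its 30-second interval are made-up values for illustration; listing two gmond addresses on one line gives gmetad a fallback if the first host is unreachable:

 data_source "Test" 15 192.168.1.1:8649
 data_source "Analytics" 30 192.168.2.1:8649 192.168.2.2:8649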
Starting Ganglia :
No old gmetad or gmond processes should be running on the machines.
Starting gmetad : Run the below command on head node of cluster.
 sudo service gmetad start
Starting gmond : Run the below command on all the nodes of cluster.
 sudo service ganglia-monitor start
Starting Apache Server :
Stop any old running instance of the apache2 server, then run the below command to start it.
 sudo service apache2 start

Ganglia Monitoring tool



Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization.
It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.
Be mindful that Ganglia will only help you to view the performance of your servers, and it doesn’t tweak or improve the performance. In this tutorial, we are going to implement Ganglia Monitoring Tool on Ubuntu 13.10 server and let us use Ubuntu 13.04 as our Monitoring target. Though it was tested on Ubuntu 13.10, the same method should work on Debian 7 and other Ubuntu versions as well.
Install Ganglia On Ubuntu 13.10
Before proceeding to install Ganglia, you have to complete the following tasks.
Make sure your server has a properly installed and configured LAMP stack. To install and configure a LAMP server, refer to the following link.
If you’re using Debian, refer to the following link.
Ganglia consists of two main daemons called gmond (Ganglia Monitoring Daemon) and gmetad (Ganglia Meta Daemon), a PHP-based web front-end and a few other small utilities.
Ganglia Monitoring Daemon (gmond):
Gmond runs on each node you want to monitor. It monitors changes in the host state, announces relevant changes, listens to the state of all other Ganglia nodes via a unicast or multicast channel, and answers requests for an XML description of the cluster state.
Ganglia Meta Daemon (gmetad):
Gmetad runs on the master node and gathers all information from the client nodes.
Ganglia PHP Web Front-end:
It displays all the information gathered from the clients in a meaningful way, such as graphs, via web pages.
Ganglia Installation On Master node
Install Ganglia using command:
$ sudo apt-get install ganglia-monitor rrdtool gmetad ganglia-webfrontend
During installation, you’ll be asked to restart apache service to activate the new configuration. Click Yes to continue.
Configure Master node
Now copy ganglia configuration file /etc/ganglia-webfrontend/apache.conf to /etc/apache2/sites-enabled/ directory as shown below.
$ sudo cp /etc/ganglia-webfrontend/apache.conf /etc/apache2/sites-enabled/ganglia.conf
Then edit file /etc/ganglia/gmetad.conf,
$ sudo nano /etc/ganglia/gmetad.conf
Find the following line and modify as shown below.
data_source "my cluster" 50 192.168.1.101:8649
As per the above line, metrics will be collected from each node every 50 seconds. Also, you can assign a name for your client groups. In my case, I use the default group name “my cluster”. Here 192.168.1.101 is my master node IP address.
Save and close the file.
Edit file /etc/ganglia/gmond.conf,
$ sudo nano /etc/ganglia/gmond.conf
Find the following sections and modify them with your values.
[...]
cluster {
  name = "my cluster"  ## Name assigned to the client groups
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

[...]

udp_send_channel   {
#mcast_join = 239.2.11.71 ## Comment
  host = 192.168.1.101   ## Master node IP address
  port = 8649
  ttl = 1
}

[...]

udp_recv_channel {
  port = 8649
}

/* You can specify as many tcp_accept_channels as you like to share
   an xml description of the state of the cluster */
tcp_accept_channel {
  port = 8649
}

[...]
The changes in the above configuration file show that the master node which has IP address 192.168.1.101 will collect data from all nodes on tcp and udp port 8649.
Save and close the file. Then start ganglia-monitor, gmetad and apache services.
$ sudo /etc/init.d/ganglia-monitor start
$ sudo /etc/init.d/gmetad start
$ sudo /etc/init.d/apache2 restart
Ganglia Installation On Clients
Install the following package for each client you want to monitor.
On Debian / Ubuntu clients:
$ sudo apt-get install ganglia-monitor
On RHEL based clients:
# yum install ganglia-gmond
Configure Clients
Edit file /etc/ganglia/gmond.conf,
$ sudo nano /etc/ganglia/gmond.conf
Make the changes as shown below.
[...]

cluster {
  name = "my cluster"     ## Cluster name
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"

[...]

udp_send_channel {
  #mcast_join = 239.2.11.71   ## Comment
  host = 192.168.1.101   ## IP address of master node
  port = 8649
  ttl = 1
}
## Comment the whole section
/* You can specify as many udp_recv_channels as you like as well.
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}
*/

tcp_accept_channel {
  port = 8649
}

[...]
Save and close the file. Next, restart ganglia-monitor service.
On Debian based systems:
$ sudo /etc/init.d/ganglia-monitor restart
On RHEL based systems:
# service gmond restart
Access Ganglia web frontend
Now point your web browser to http://ip-address/ganglia. You should see the client node graphs.
(Screenshot: Ganglia cluster report)
To view a particular node's graphs, select the node you want from the Grid / Choose Node drop-down box.
For example, I want to see the graphs of the Ubuntu client which has IP address 192.168.1.100.
(Screenshot: cluster report with the node selected)
Graphs of my Ubuntu client (192.168.1.100):
(Screenshot: host report for 192.168.1.100)
Client node view:
(Screenshot: node view for 192.168.1.100)
Server node view:
(Screenshot: node view for 192.168.1.101)
As you can see in the above views, the server node (192.168.1.101) is down and the client node (192.168.1.100) is up.