
Posts

Showing posts from June, 2021

Connect SparkThriftServer with Tableau/PowerBI

REFERENCE: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-use-bi-tools

Use Power BI for Spark data visualization

Note: This section is applicable only for Spark 1.6 on HDInsight 3.4 and Spark 2.0 on HDInsight 3.5.

Once you have saved the data as a table, you can use Power BI to connect to the data and visualize it to create reports, dashboards, etc.

1. Make sure you have access to Power BI. You can get a free preview subscription of Power BI from http://www.powerbi.com/.
2. Sign in to Power BI.
3. From the bottom of the left pane, click Get Data.
4. On the Get Data page, under Import or Connect to Data, for Databases, click Get.
5. On the next screen, click Spark on Azure HDInsight and then click Connect. When prompted, enter th...

Enable Hive LLAP Via Knox

What is Hive LLAP?

Live Long And Process (LLAP) provides a hybrid execution model. It consists of a long-lived daemon, which replaces direct interactions with the HDFS DataNode, and a tightly integrated DAG-based framework. For more details on Hive LLAP, see: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62689557

What is Knox?

The Apache Knox™ Gateway is an application gateway for interacting with the REST APIs and UIs of Apache Hadoop deployments. The Knox Gateway provides a single access point for all REST and HTTP interactions with Apache Hadoop clusters. For more details on Knox, see: https://knox.apache.org/

Purpose: This page explains the process of enabling Hive LLAP via Knox.

Process:

Step 1: Log in to Ambari and add the following properties to the Hive configs:

hive.server2.thrift.http.path = cliservice
hive.server2.thrift.http.port = 10001
hive.server2.transport.m...
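To illustrate how the HTTP path and port from Step 1 are used on the client side, here is a minimal sketch that assembles a HiveServer2 JDBC URL for HTTP transport. The function name `hive_http_jdbc_url` and the host `hs2.example.com` are hypothetical, and `transportMode=http` is an assumption: the excerpt is cut off before the transport setting is shown. This is the direct HiveServer2 URL, not the Knox gateway URL, which the excerpt does not reach.

```shell
#!/bin/sh
# Sketch only: build a HiveServer2 JDBC URL for HTTP transport.
# Defaults mirror the values listed in Step 1 (port 10001, path cliservice).
# transportMode=http is assumed, since the excerpt is truncated at that property.
hive_http_jdbc_url() {
    host="$1"                     # HiveServer2 host (hypothetical in the call below)
    port="${2:-10001}"            # hive.server2.thrift.http.port
    http_path="${3:-cliservice}"  # hive.server2.thrift.http.path
    printf 'jdbc:hive2://%s:%s/;transportMode=http;httpPath=%s\n' \
        "$host" "$port" "$http_path"
}

# Example call with a placeholder host name.
hive_http_jdbc_url hs2.example.com
```

The resulting string could then be passed to a JDBC client such as beeline with its `-u` option.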

Configuring Tableau and ODBC

Hello All,

In this blog we will see how to configure Tableau and ODBC on your system. Tableau is one of the most popular data visualization tools. With it, you can discover and share insights that can change business value.

Data visualization is used by data scientists to communicate with their end users or clients. The results of data interpretation are not immediately clear to users who do not deal with the data directly. Data visualization acts as the connecting point and helps users appreciate the results of the data analysis. Tableau is one of the best data visualization tools and is now widely used by businesses.

Tableau is a visual analytics program that is changing the way we use data to solve problems by enabling individuals and businesses to make the most of their data. Tableau can perform the following tasks.

1.  ...

Attach Disks to Azure VM

Add a data disk in Windows/Linux

1. Go to the Azure portal.
2. In the menu on the left, click Virtual Machines.
3. Select the virtual machine from the list.
4. On the virtual machine blade, click Disks.
5. On the Disks blade, click + Add data disk.
6. In the drop-down for the new disk, select Create disk.
7. In the Create managed disk blade, type in a name for the disk (the name should be similar to the name of the disk that is already present) and adjust the other settings as necessary. When you are done, click Create. We can add only a 1 TB disk at a time; to add more capacity, we have to attach additional disks.
8. In the Disks blade, click Save to save the new disk configuration for the VM.

After Azure creates the disk and attaches it to the virtual machine, the new disk is listed in the v...

How To Add Jenkins Node

Hello All,

We all know what Jenkins is and how Jenkins and its plugins work. It's one of the most widely used open source automation servers, and it helps in building, deploying and automating any project. In this post we will see how to add a new node to a Jenkins cluster. Before we start, we assume that you already have a Jenkins server installed and running.

Prerequisites

You should have sudo access.

On the agent machine:

1- Install the necessary packages

Install the OpenJDK Java package.

# sudo yum install java-1.8.0-openjdk

2- Create a user on the agent to be used by Jenkins

Create a user named jenkins with /var/lib/jenkins as its home directory.

# sudo useradd -d /var/lib/jenkins jenkins
# passwd jenkins

3- Generate an SSH key

Switch to the jenkins user and create an SSH key.

# su - jenkins
# ssh-keygen -t rsa -C "Jenkins agent key"
Generating public/private rsa key pair.
Ent...
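The agent-side steps above can be collected into one script. The sketch below is an illustration, not the author's script: the `run` helper, the `setup_agent` function and the `DRY_RUN` switch are hypothetical names, the `-y` flag on `yum` is added so the install does not prompt, and the `passwd` and `su` calls are wrapped in `sudo` on the assumption that the script is run by a non-root user with sudo access. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the agent-side setup steps. DRY_RUN is a hypothetical switch:
# when it is 1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_agent() {
    # 1- Install the Java package named in the post (-y is an added assumption).
    run sudo yum install -y java-1.8.0-openjdk
    # 2- Create the jenkins user with /var/lib/jenkins as home, then set a password.
    run sudo useradd -d /var/lib/jenkins jenkins
    run sudo passwd jenkins
    # 3- Generate an SSH key as the jenkins user (ssh-keygen still prompts for a path).
    run sudo su - jenkins -c 'ssh-keygen -t rsa -C "Jenkins agent key"'
}

setup_agent
```

Running it with `DRY_RUN=0` executes the commands for real, so review the printed output first.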