
What is Big Data?




What is Data?

Data is the quantities, characters, or symbols on which a computer performs operations. It may be stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media.

Now, let’s move on to an introduction to big data.

Big data defined

What exactly is big data?

Big data is data that contains greater variety, arrives in increasing volumes, and moves with more velocity. These three characteristics are known as the three Vs.

Put simply, big data means larger, more complex data sets, especially from new data sources. These data sets are so voluminous that traditional data processing software can’t manage them. But these massive volumes of data can be used to address business problems you wouldn’t have been able to tackle before.


The three Vs of big data

Volume: The amount of data matters. With big data, you’ll have to process high volumes of low-density, unstructured data. This can be data of unknown value, such as Twitter data feeds, clickstreams on a web page or a mobile app, or readings from sensor-enabled equipment. For some organizations, this might be tens of terabytes of data. For others, it may be hundreds of petabytes.
Velocity: Velocity is the fast rate at which data is received and (perhaps) acted on. Normally, the highest-velocity data streams directly into memory rather than being written to disk. Some internet-enabled smart products operate in real time or near real time and require real-time evaluation and action.
Variety: Variety refers to the many types of data that are available. Traditional data types were structured and fit neatly in a relational database. With the rise of big data, data comes in new unstructured and semistructured types, such as text, audio, and video, which require additional preprocessing to derive meaning and support metadata.
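
To make the velocity point more concrete, here is a minimal Python sketch of evaluating a stream in memory as it arrives, instead of writing it to disk first. The function name rolling_alerts, the window size, and the threshold are all hypothetical choices made for illustration; a real streaming system would look quite different.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def rolling_alerts(
    readings: Iterable[float], window: int, threshold: float
) -> Iterator[Tuple[int, float]]:
    """Yield (index, rolling_mean) whenever the mean of the last
    `window` readings exceeds `threshold`.

    Only the most recent `window` values are held in memory, so the
    stream itself can be arbitrarily long. (Hypothetical helper.)
    """
    recent = deque(maxlen=window)  # oldest value drops out automatically
    total = 0.0
    for index, value in enumerate(readings):
        if len(recent) == window:
            total -= recent[0]  # subtract the value about to be evicted
        recent.append(value)
        total += value
        if len(recent) == window:
            mean = total / window
            if mean > threshold:
                yield index, mean


# Example: six made-up sensor readings, evaluated three at a time.
for index, mean in rolling_alerts([1, 1, 1, 10, 10, 10], window=3, threshold=5.0):
    print(f"reading {index}: rolling mean {mean} is above the threshold")
```

Because the function is a generator, each alert is produced as soon as the reading that triggers it arrives, which is the "act on it in near real time" idea described above.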

The value—and truth—of big data

Two more Vs have emerged over the past few years: value and veracity. Data has intrinsic value. But it’s of no use until that value is discovered. Equally important: How truthful is your data—and how much can you rely on it?
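
One simple way to put a number on veracity is to measure how many records are complete. The Python sketch below is only an illustration under stated assumptions: the function name completeness and the field names id and temp are hypothetical, and completeness is just one of many possible data quality measures.

```python
from typing import Any, Dict, List


def completeness(records: List[Dict[str, Any]], required: List[str]) -> float:
    """Return the fraction of records in which every required field
    is present and neither None nor an empty string.

    (Hypothetical helper; an empty input is scored as 0.0.)
    """
    if not records:
        return 0.0
    complete = sum(
        1
        for record in records
        if all(record.get(field) not in (None, "") for field in required)
    )
    return complete / len(records)


# Example with made-up sensor records: two of the four are complete.
records = [
    {"id": 1, "temp": 20.5},
    {"id": 2, "temp": None},  # value missing
    {"id": 3},                # field missing
    {"id": 4, "temp": 19.0},
]
print(completeness(records, ["id", "temp"]))
```

A low score does not say which values are wrong, but it does give a quick signal of how far the data can be relied on before deeper analysis.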

Today, big data has become capital. Think of some of the world’s biggest tech companies. A large part of the value they offer comes from their data, which they’re constantly analyzing to produce more efficiency and develop new products.

Recent technological breakthroughs have sharply reduced the cost of data storage and compute, making it easier and less expensive to store more data than ever before. With larger volumes of data now cheaper to keep and easier to access, you can make more accurate and precise business decisions.

Finding value in big data isn’t only about analyzing it (which is a whole other benefit). It’s an entire discovery process that requires insightful analysts, business users, and executives who ask the right questions, recognize patterns, make informed assumptions, and predict behavior.

