Wednesday 14 December 2016

Data and Big Data

What is Data?
     Data is information in a raw or unorganized form. It is present everywhere in the world. The amount of data created and stored globally is almost inconceivable, and it keeps growing every second. By one widely quoted estimate, 90% of the world's data has been created in the last 2 years alone. On its own, raw data has little value because it is unorganized. Information, which is derived from data, is what carries the potential and the value. There are generally 3 types of data:
  • Structured Data
  • Unstructured Data
  • Semi-Structured Data
Why is data important?

Suppose you want to buy one of the two cars below. Which one do you buy?
  
What do you mean you can't make that decision? Are you thinking to yourself you need more information? You probably do if you want to make a good decision.
So, what is it you need? You need data! This is true of business decisions as well: more data generally means less risk in decision making. The better the data, the lower the risk. Making a decision without data increases the risk that the decision will be faulty, while more and better data generally leads to more accurate decisions.

So you know you need lots of data, but what does data look like? You're probably thinking of lots of numbers and maybe some charts and graphs. But data comes in many forms and is often categorized as structured or unstructured. Structured data generally consists of numerical information and is objective. Unstructured data is more subjective and is usually text-heavy. It generally can't be arranged into a data structure such as rows and columns. By some estimates, around 80% of business decisions rely on unstructured data and only 20% on structured data. The two pictures below illustrate structured data and unstructured data.
 
Semi-structured data is data that has not been organized into a specialized repository, such as a database, but that nevertheless has associated information, such as metadata, that makes it more amenable to processing than raw data.
Examples of unstructured data: raw files, videos, images, audio files, social media posts, etc.
Examples of semi-structured data: JSON files, XML files, etc.
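
To make the three types concrete, here is a minimal Python sketch that holds similar car information in a structured, a semi-structured and an unstructured form. The car names, prices and review text are made-up sample values, and the `describe` helper is a hypothetical function written only for this illustration.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (sample values).
csv_text = "name,price,mileage_kmpl\nCar A,500000,18\nCar B,650000,22\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: keys describe the data, but records may differ in shape.
json_text = (
    '[{"name": "Car A", "price": 500000},'
    ' {"name": "Car B", "reviews": ["smooth ride"]}]'
)
records = json.loads(json_text)

# Unstructured: free text with no predefined fields.
review = "Car B feels smoother on the highway, though it costs a bit more."


def describe(rows, records, review):
    """Summarize how much structure each form of the data exposes."""
    return {
        "structured_columns": sorted(rows[0].keys()),
        "semi_structured_keys": [sorted(r.keys()) for r in records],
        "unstructured_word_count": len(review.split()),
    }


summary = describe(rows, records, review)
print(summary)
```

The structured rows all share the same columns, the JSON records carry their own keys but do not have to match each other, and the review text offers nothing to query beyond the words themselves.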

Big data

Big data is a term for datasets that are so large or complex that traditional data processing applications are inadequate to deal with them. The data can be any voluminous amount of structured, unstructured and semi-structured data that has the potential to be mined for information. In practice, any data that is beyond the storage and processing capacity of a single centralized physical machine or server is considered Big Data.
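
When data no longer fits on one machine, a common approach is to split it into partitions, process each partition independently, and then combine the partial results. The sketch below only imitates that idea on a single machine with a small in-memory list. The event names and the helper functions `split_into_partitions`, `count_partition` and `merge_counts` are assumptions made for this illustration, not part of any particular Big Data framework.

```python
from collections import Counter
from typing import Iterable, List


def split_into_partitions(records: List[str], size: int) -> List[List[str]]:
    """Cut the records into chunks of at most `size` items."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_partition(partition: Iterable[str]) -> Counter:
    """Work done independently on one partition (one 'machine')."""
    return Counter(partition)


def merge_counts(partials: Iterable[Counter]) -> Counter:
    """Combine the partial results from all partitions."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Made-up sample events standing in for a much larger dataset.
events = ["login", "purchase", "login", "logout", "login", "purchase", "logout"]
partitions = split_into_partitions(events, size=3)
totals = merge_counts(count_partition(p) for p in partitions)
print(dict(totals))
```

Because each partition is counted without looking at the others, the same three steps could in principle run on separate machines, which is the property that distributed processing systems rely on.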

In other words, Big Data describes collections of datasets so large and complex that they are difficult to capture, process, store, search and analyze using conventional database systems. Big Data doesn't revolve around how much data you have, but around what you do with it: how you process and analyze it, and the insights, products, and services that emerge from that analysis.

You can take data from any source and analyze it to find answers that enable 1) cost reductions, 2) time reductions, 3) new product development and optimized offerings, and 4) smart decision making. When you combine big data with high-powered analytics, you can accomplish business-related tasks such as:
  • Determining root causes of failures, issues and defects in near-real time.
  • Generating coupons at the point of sale based on the customer’s buying habits.
  • Recalculating entire risk portfolios in minutes.
  • Detecting fraudulent behavior before it affects your organization.


Big Data is often characterized by 7 Vs, but in general it is mainly characterized by 3 Vs:

  • Volume: The amount of generated and stored data.
  • Variety: The type and nature of the data. It can be structured or unstructured data, text data, sensor data, log files, etc.
  • Velocity: The speed at which the data is generated and the speed at which it is processed, stored and analyzed.
The remaining Vs are:

  • Variability: Variability is different from variety. For example, a coffee shop may offer 6 different blends of coffee, but if you get the same blend every day and it tastes different every day, that is variability. The same is true of data: if its meaning is constantly changing, it can have a huge impact on your data homogenization.
  • Veracity: Veracity is all about making sure the data is accurate, which requires processes to keep the bad data from accumulating in your systems. 
  • Visualization: Visualization is critical in today’s world. Using charts and graphs to visualize large amounts of complex data is much more effective in conveying meaning than spreadsheets and reports chock-full of numbers and formulas.
  • Value: Value is the end game. After addressing volume, velocity, variety, variability, veracity, and visualization – which takes a lot of time, effort, and resources – you want to be sure your organization is getting value from the data.
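
As a small illustration of the veracity point above, the sketch below separates incoming records into good and bad ones before they are stored. The field names `customer_id` and `amount`, the validation rules, and the sample records are all hypothetical and chosen only for this example. Real systems would apply rules specific to their own data.

```python
def is_valid(record):
    """Hypothetical rules: an integer customer id and a non-negative amount."""
    return (
        isinstance(record.get("customer_id"), int)
        and isinstance(record.get("amount"), (int, float))
        and record["amount"] >= 0
    )


def split_by_validity(records):
    """Keep bad data out of the main store by routing it to a separate list."""
    good, bad = [], []
    for record in records:
        (good if is_valid(record) else bad).append(record)
    return good, bad


# Made-up sample records, three of which break the rules above.
incoming = [
    {"customer_id": 1, "amount": 25.0},
    {"customer_id": None, "amount": 10.0},  # missing customer id
    {"customer_id": 2, "amount": -5},       # negative amount
    {"customer_id": 3},                     # no amount at all
]

good, bad = split_by_validity(incoming)
print(len(good), "good records,", len(bad), "bad records")
```

Setting the rejected records aside, rather than silently dropping them, leaves room to inspect them later and fix the source of the bad data.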


Big data in action:

Banking:
With large amounts of information streaming in from countless sources, banks are faced with finding new and innovative ways to manage big data. While it’s important to understand customers and boost their satisfaction, it’s equally important to minimize risk and fraud while maintaining regulatory compliance. Big data brings big insights, but it also requires financial institutions to stay one step ahead of the game with advanced analytics.

Education:
Educators armed with data-driven insight can make a significant impact on school systems, students, and curriculums. By analyzing big data, they can identify at-risk students, make sure students are making adequate progress, and can implement a better system for evaluation and support of teachers and principals.

Government:
When government agencies are able to harness and apply analytics to their big data, they gain significant ground when it comes to managing utilities, running agencies, dealing with traffic congestion or preventing crime. But while there are many advantages to big data, governments must also address issues of transparency and privacy.

Health Care:
Patient records. Treatment plans. Prescription information. When it comes to health care, everything needs to be done quickly, accurately – and, in some cases, with enough transparency to satisfy stringent industry regulations. When big data is managed effectively, health care providers can uncover hidden insights that improve patient care.

Manufacturing:
Armed with insight that big data can provide, manufacturers can boost quality and output while minimizing waste – processes that are key in today’s highly competitive market. More and more manufacturers are working in an analytics-based culture, which means they can solve problems faster and make more agile business decisions.

Retail:
Customer relationship building is critical to the retail industry – and the best way to manage that is to manage big data. Retailers need to know the best way to market to customers, the most effective way to handle transactions, and the most strategic way to bring back lapsed business. Big data remains at the heart of all those things.