Big data is first of all an awareness: companies are starting to believe that the data they generate has capital value. Data can be a tool for a new service or product, or can even generate new value that complements an existing product. The most important thing in big data is the use case for the data, that is to say, defining the most useful way to enrich our day-to-day business with it.
Some people define big data as the important data that already exists and has to be handled correctly (tweets, comments, pictures, videos, or even actions only loosely linked to our customers). Others define it simply as a set of storage and processing frameworks that let us work as effectively as possible with this data lake.
Data has existed for a long time, but social media, the internet of things and the exponentially growing mass of data they produce have brought us into the era of big data. In fact, it is estimated that there are as many pieces of data as there are stars in the universe.
By 2012, the volume of data had been multiplied by 400. In our previous post about how data is used nowadays, we said that 2.5 quintillion (2.5 × 10^18) bytes of data had been generated. Big data is usually described by five Vs: volume, velocity, variety, veracity and value.
Volume means the massive quantity of data generated, yet according to one data analyst only 18% of it is processed in large companies. Big data offers tools to store and analyse this data.
Velocity, the speed at which data is generated and processed, is also part of big data: data grows fast, and technologies have to be ready for that growth. For example, if someone decided to watch all the videos posted on YouTube in a single day, they would spend 8 years in front of their computer, because 48 hours of video are uploaded to the site every minute.
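The 8-year figure can be checked with a quick calculation. The sketch below only uses the 48 hours per minute quoted above; the function name and the assumption of non-stop viewing over 365-day years are ours, for illustration.

```python
# Figure quoted in the text: 48 hours of video uploaded every minute.
HOURS_UPLOADED_PER_MINUTE = 48

MINUTES_PER_DAY = 60 * 24
HOURS_PER_YEAR = 24 * 365  # assumption: non-stop viewing, 365-day years


def years_to_watch_one_day_of_uploads(hours_per_minute: float) -> float:
    """Return the years of continuous viewing needed for one day of uploads."""
    hours_uploaded_per_day = hours_per_minute * MINUTES_PER_DAY
    return hours_uploaded_per_day / HOURS_PER_YEAR


if __name__ == "__main__":
    years = years_to_watch_one_day_of_uploads(HOURS_UPLOADED_PER_MINUTE)
    print(f"{years:.1f} years")
```

One day of uploads is 48 × 1,440 = 69,120 hours of video, which is a little under 8 years of watching around the clock.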
Variety means that data format must be taken into account. Previously, we had classic data storage in relational databases, which oblige data to follow the same predefined schema. This is not the case with big data, which processes structured and unstructured data at the same time. Note also that 80 to 90% of data is unstructured. Moreover, it is impossible to know the format of data before collecting it.
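To make the contrast concrete, here is a minimal sketch, not a description of any particular big data tool. It uses Python's built-in `sqlite3` module as a stand-in for a relational database and plain JSON strings as a stand-in for schema-less storage. The `customers` table and the sample records are hypothetical.

```python
import json
import sqlite3

# Hypothetical records: the second carries a field the schema did not plan for.
records = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob", "tweet": "Loving the new product!"},
]

# Relational storage: the table fixes the columns in advance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

rejected = []
for record in records:
    columns = ", ".join(record)
    placeholders = ", ".join("?" for _ in record)
    try:
        conn.execute(
            f"INSERT INTO customers ({columns}) VALUES ({placeholders})",
            list(record.values()),
        )
    except sqlite3.OperationalError:
        # The "tweet" column does not exist, so the whole row is refused.
        rejected.append(record["id"])

stored_rows = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

# Schema-less storage: each record is kept as a self-describing JSON document.
documents = [json.dumps(record) for record in records]

print(stored_rows, rejected, len(documents))
```

The relational table keeps only the record that matches its schema, while the document-style list keeps both records whatever fields they carry.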
Veracity is the fourth concern: collecting data and extracting correct information from it is not always possible, because the data is messy and sometimes lacks quality and precision. Big data brings order by organizing access to the data, and at the same time makes the necessary analysis possible.
Value comes last: all this data is meaningless for the company if it does not bring value. It is necessary to find IT statisticians (data scientists) who will discover solutions to each department's problems using production data.
The business intelligence side of big data is the module that processes data all the way to reporting. BI already processes data, but not at this scale, and we can do BI without doing big data. BI processes numerical data and production data, whereas big data is interested in data linked to human behavior, such as data generated on social media.