The Four V’s of Big Data: Which Wins in the Sea of Information?
by Claire Gilbert, Senior Statistical Analyst, Gongos, Inc.
As those guiding strategy and making data-driven decisions know, there is a bottomless ocean filled with statistics and facts that can answer complex—and often perplexing—questions. Ideally, these fragments fit together to create a complete image of humans, our behaviors and motivations, and the world in which we operate.
However, data is collected through various methods—smartphone tracking and geolocation, transactional records, and social scraping only scratch the surface—and these methods are each designed to meet unique business challenges. The resulting pool of information varies greatly based on the four V’s of big data – volume, velocity, variety, and veracity. Let’s review practical examples of these four dimensions in action:
- Volume: The heaviest smartphone users click, tap, or swipe their phone 5,427 times a day. These recorded touches add to the vast assortment of information available.
- Velocity: 40,000 Google searches occur every second, rapidly creating new data.
- Variety: With 48 hours of new content uploaded to YouTube each minute and 30 billion pieces of content shared monthly on Facebook, gone are the days where variety comes from text versus numeric fields. In today’s world, variety translates to images, video content, behavioral tracking, and much more.
- Veracity: Poor quality data costs the U.S. economy approximately $3.1 trillion/year according to a 2016 IBM study.
While the stats are staggering, these data reservoirs are often messy, poorly managed, and riddled with gaps. Data scientists and decision makers, faced with the ever-growing challenge of understanding their customers, often make tradeoffs within the four V’s to efficiently decipher inbound information. Because of the enormous costs of poor quality data, the computer science adage warning ‘garbage in, garbage out’ is ever more applicable. Analyzing inferior data of any type and amount will result in useless or potentially risky outcomes. And with reliable data as the bedrock of customer-centric decisions, veracity is vital.
Veracity in Action: The story of one company wading into the depths to find clarity
Stakeholders at a global brand experience company were struggling to connect disparate customer touchpoints to refine their customer experience model. Internalizing this knowledge would ensure that resources were allocated to the most crucial drivers of growth. Historically, they had collected customer satisfaction metrics on a quarterly basis, amassing a large volume of data. In tying the satisfaction to multiple financial measures, they hoped to identify areas within the business where improving the customers’ experience would lead to increased revenue.
While velocity wasn’t of great concern since the data refreshed quarterly, the large variety and sheer volume of available information presented a grave challenge to veracity. The information captured was inconsistent from wave to wave, and the vast number of previous waves made it nearly impossible to track how it had been updated. In this raw state, a data scientist could analyze the data; however, the resulting insights would be suspect because the foundation of the analysis wasn’t stable.
To ensure confidence in the results, and in the decisions they inform, volume and, to some degree, variety were sacrificed in favor of veracity. Analysis was limited to fiscal quarters in which critical measures, such as NPS and overall satisfaction, were reliable. Additionally, certain fields of data were excluded from consideration if they couldn’t be converted to consistent formats. This reduced the overall data available but bolstered confidence that the outcomes could guide organizational change and result in greater ROI.
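The filtering described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis the team actually ran: the wave records, the field names (`nps`, `overall_satisfaction`, `visits`), and the rule that "reliable" means "parses as a number" are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical quarterly survey waves; every value here is invented.
WAVES = [
    {"quarter": "2015-Q1", "nps": "42", "overall_satisfaction": "8.1", "visits": "3"},
    {"quarter": "2015-Q2", "nps": "n/a", "overall_satisfaction": "7.9", "visits": "4"},
    {"quarter": "2015-Q3", "nps": "38", "overall_satisfaction": "8.4", "visits": "three"},
    {"quarter": "2015-Q4", "nps": "45", "overall_satisfaction": "", "visits": "5"},
]

# Assumed critical measures: a quarter is kept only if all of them are usable.
CRITICAL_MEASURES = ("nps", "overall_satisfaction")


def to_number(value) -> Optional[float]:
    """Return the value as a float, or None if it cannot be converted."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def reliable_waves(waves):
    """Keep only the waves in which every critical measure converts cleanly."""
    return [
        wave for wave in waves
        if all(to_number(wave.get(name)) is not None for name in CRITICAL_MEASURES)
    ]


def consistent_fields(waves, fields):
    """Keep only the fields that convert cleanly in every remaining wave."""
    return [
        name for name in fields
        if all(to_number(wave.get(name)) is not None for wave in waves)
    ]


kept = reliable_waves(WAVES)
usable = consistent_fields(kept, ["nps", "overall_satisfaction", "visits"])
print([wave["quarter"] for wave in kept])
print(usable)
```

In this toy data, two of the four quarters are dropped and the `visits` field is excluded because one remaining wave recorded it as text. Less data survives, but what remains can be compared across waves.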
Had the company acted on findings from the initial pool of data, it would have wrongly increased its annual on-site spend. However, the on-site protocols in their current state proved to satisfy customers. Instead, the client used the results to invest in meaningful points along their customers’ journey, improving overall customer satisfaction and ultimately increasing their revenue on each interaction.
Not All Four V’s are Created Equal
Veracity can make or break an analysis; however, there are significant advantages to velocity, volume, and variety. Greater volume allows for increased confidence in predictions and can reduce the impact of outliers and extraneous data points that would otherwise skew the results in a smaller data set. Live information updates create the ability to change course on the fly, and a large variety of data types often rounds out the story in a more comprehensive portrait of customers. But if the data is messy or mismanaged, the confidence is misplaced, resulting in damaging decisions and severed trust with customers.
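The point about volume and outliers can be made concrete with a small Python sketch. The numbers are invented for illustration, and the function name `mean_shift_from_outlier` is hypothetical; it measures how far a single extreme value pulls the average of otherwise identical observations.

```python
from statistics import mean


def mean_shift_from_outlier(n, typical=50.0, outlier=500.0):
    """Return how much one outlier moves the mean of n observations.

    All values are assumed for illustration: n - 1 observations equal
    `typical`, and one observation equals `outlier`.
    """
    clean = [typical] * n
    contaminated = clean[:-1] + [outlier]
    return mean(contaminated) - mean(clean)


small_shift = mean_shift_from_outlier(10)
large_shift = mean_shift_from_outlier(1000)
print(f"shift with 10 observations:   {small_shift:.2f}")
print(f"shift with 1000 observations: {large_shift:.2f}")
```

With ten observations the single outlier moves the mean by about 45; with a thousand it moves it by less than half a point. This only covers isolated bad values. Errors that run systematically through the data do not average out, no matter how much volume is added.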
If the goal is to extract meaningful and actionable knowledge, it’s the quality of data that has an immeasurable impact. When swimming in the sea of data, veracity is a decision maker’s life raft – bringing the most valuable and impactful insights to the surface.