
Big Data Overview Part Two: A more in-depth look

As we discussed last week, Big Data is powerful and is making huge waves across all industries. This week, we’ll take a more in-depth look at Big Data and how it works on a more technical level. The statistics and case studies presented in this post are drawn from a 2013 research white paper entitled ‘Big Data in Big Companies’, made available by the International Institute for Analytics.

Before we dive into specific technical details, we must first understand the primary objectives for Big Data. As with most new technologies in the IT industry, Big Data is focused on three major objectives: reducing the time required for computational tasks, reducing operating costs, and creating new products or services (as mentioned with UPS and Google last week – see link). While it is possible to target all three objectives in a single data-based Business Intelligence snapshot, success is more likely when focusing on one or two objectives at a time. The objectives chosen depend on the company and its current needs, because different objectives affect different business units within the company. For example, speeding up computational tasks is greatly appreciated by development teams, while cost savings make the financials look better to management. The objectives chosen will affect the process as well as the outcome.

Last week, we discussed cost savings from Big Data Business Intelligence at UPS, and testing new products and services at Google, so this week, I’ll focus on time reduction. This is the second most common usage of Big Data. While cost reduction doesn’t always result in less time, more often than not, less time will result in less cost. For time reduction, let’s look at an example from Macy’s Department Stores.

Unsurprisingly, department stores can carry millions of items. Macy’s sells a whopping 73 million items. Of course, the information for all these items, including details and pricing, is stored in a database. Working with and manipulating data at this scale usually consumes a great deal of time and resources, so any time savings can be substantial. It used to take Macy’s over 27 hours to change its pricing for sales or specials, only to change the prices back again at the close of the sale. To improve this situation, Macy’s chose a Big Data solution.

Using Big Data tools (such as Pig, Hive, or HBase), Macy’s was able to pull its data out of a Hadoop cluster (Hadoop being one of the most prominent technologies in Big Data) and place that data in parallel computing and in-memory software architectures. In a practical sense, this means that Macy’s was able to stream in and manipulate its data in real time. The end result was a reduction in processing time to just over an hour for optimization and price changes. This 26-hour saving resulted in lower operating costs and allowed the company to change its prices on the fly, so it can respond almost immediately when other stores run sales on particular items or classes of items. This example shows that time savings are definitely an objective where Big Data delivers.
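To make the idea of parallel, in-memory processing a little more concrete, here is a minimal Python sketch of a price change applied to chunks of a catalog at the same time. It is not Macy’s actual system: the function names, the SKUs, the prices, and the 25% discount are all made up for illustration, and a thread pool stands in for the workers of a real cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def apply_discount(chunk, percent_off):
    """Return new (sku, price) pairs with the discount applied to each price."""
    factor = 1 - percent_off / 100
    return [(sku, round(price * factor, 2)) for sku, price in chunk]


def split_into_chunks(items, chunk_size):
    """Split the catalog into fixed-size slices that can be processed independently."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def reprice_catalog(catalog, percent_off, chunk_size=2, workers=4):
    """Reprice every chunk concurrently, keeping the original item order."""
    chunks = split_into_chunks(catalog, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        results = list(pool.map(lambda c: apply_discount(c, percent_off), chunks))
    return [pair for chunk in results for pair in chunk]


# Hypothetical in-memory catalog; a real one would hold millions of rows.
catalog = [
    ("SKU-1", 20.00),
    ("SKU-2", 50.00),
    ("SKU-3", 8.00),
    ("SKU-4", 100.00),
    ("SKU-5", 16.00),
]
sale_prices = reprice_catalog(catalog, percent_off=25)
```

Because each chunk is independent of the others, the same pattern scales out by adding workers rather than by waiting longer for one long batch job.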

Now that we have a very basic understanding of Big Data’s top three objectives and how they are leveraged, we can look deeper to see the actual moving parts of Big Data. Big Data is driving a lot of industry change with regard to technologies and hardware, particularly legacy technologies. By its nature, Big Data does not mesh well with existing legacy code or hardware, so open solutions are required. Many organizations are following this trend: instead of proprietary software and hardware, they are using open solutions such as Apache Hadoop, running on commodity hardware with custom-written applications.

After the hardware, there is the Big Data Stack consideration. 

The standard ‘Full-Stack’ consists of:

  • Front-End
  • Back-End Code
  • Back-End Databases

The Big Data Stack consists of:

  • Front-End
  • Back-End Code
  • Data
  • Platform Infrastructure
  • Back-End Databases

One of the appeals of Big Data is that it moves data very quickly. Keeping data readily accessible makes it possible to take advantage of in-memory processing and parallel computing. This approach is also what makes Big Data’s analytics and aggregation so powerful. For example, you can get real-time statistics about your data and use them to make business decisions. With this stack design, you can read through your data as it’s entered, using a text-mining application set up with trigger words. You can use this to see how your users feel about, react to, or generally respond to certain things. Twitter analytics is a real-world example where this would be beneficial.
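As a rough illustration of trigger-word text mining on incoming data, here is a small Python sketch. The trigger words, their labels, and the sample messages are invented for this example; a production tool would use a much richer vocabulary and a real message feed.

```python
import re
from collections import Counter

# Hypothetical trigger words mapped to the reaction they signal.
TRIGGER_WORDS = {
    "love": "positive",
    "great": "positive",
    "hate": "negative",
    "broken": "negative",
}


def scan_message(text):
    """Return the reaction label for every trigger word found in one message."""
    words = re.findall(r"[a-z']+", text.lower())
    return [TRIGGER_WORDS[word] for word in words if word in TRIGGER_WORDS]


def running_sentiment(stream):
    """Yield an updated tally after each message, as the data arrives."""
    totals = Counter()
    for message in stream:
        totals.update(scan_message(message))
        yield dict(totals)


# Sample messages standing in for a live feed of user posts.
messages = [
    "I love the new store app",
    "Checkout is broken again",
    "Great sale, love the prices",
]
snapshots = list(running_sentiment(messages))
```

Each snapshot is the tally as of that message, so a dashboard could display the latest one without waiting for a batch job to finish.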

(Note: Big Data analytics is closely related to the concept of Business Intelligence. If you’d like a free whitepaper download about the use of Business Intelligence in a wide variety of business situations, click here.)

Beyond real-time analytics, this stacking approach allows for real-time data aggregation. As data comes in, it can be structured to fit predefined relational tables before being stored in them. This approach is extremely useful for collecting data for a multitude of different reports and comparisons. Using this method, all the data required for daily, weekly, or monthly reports is ready to go. Data can also be sorted into multiple report types and even pre-calculated, so there’s no need to run batch jobs to create standard reports. By sorting the required data before it even hits the database, you lower the number of scheduled jobs that must be run, which in turn reduces the number of errors and job conflicts that may occur.
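The aggregation idea can be sketched in a few lines of Python. This is only an assumption about how such a step might look: the `ReportAggregator` class, its bucket names, and the sample sales are hypothetical, and in-memory dictionaries stand in for the predefined report tables.

```python
from collections import defaultdict
from datetime import date


class ReportAggregator:
    """Keeps daily, weekly, and monthly totals up to date as each record arrives."""

    def __init__(self):
        self.daily = defaultdict(float)
        self.weekly = defaultdict(float)
        self.monthly = defaultdict(float)

    def ingest(self, sale_date, amount):
        """Add one sale to every report bucket it belongs to."""
        iso_year, iso_week, _ = sale_date.isocalendar()
        self.daily[sale_date.isoformat()] += amount
        self.weekly[f"{iso_year}-W{iso_week:02d}"] += amount
        self.monthly[f"{sale_date.year}-{sale_date.month:02d}"] += amount


# Sample records standing in for data streaming in from the front end.
incoming_sales = [
    (date(2024, 1, 1), 100.0),
    (date(2024, 1, 1), 50.0),
    (date(2024, 1, 3), 25.0),
    (date(2024, 2, 5), 10.0),
]

aggregator = ReportAggregator()
for sale_date, amount in incoming_sales:
    aggregator.ingest(sale_date, amount)
```

Since every total is updated at ingest time, producing a standard report becomes a simple lookup instead of a scheduled job.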

Considering even just a few of the benefits mentioned here, it is clear why large companies are choosing to invest in Big Data. But you don’t have to be Macy’s to reap the benefits of the technology behind Big Data. Once you understand the concept of Big Data, there are many affordable solutions for enterprise-level and mid-market firms that put data manipulation to work for the business. Palm Beach Software Design has a great deal of experience bringing customized Big Data benefits to the mid-market.

About Palm Beach Software Design

Palm Beach Software Design is a small, tight team of software and business professionals dedicated to growing businesses of up to $75M. We help them reach their potential by making operations more efficient, increasing sales and public impact, and modernizing for today’s business climate, using technology and software as a basis. We are process-driven, with high standards of excellence and a dedicated staff. We have been in business for 30 years, and although we are a Florida-based company, we serve clients throughout North America. Please contact us at 561-572-0233 or visit us on the web to learn more about how Palm Beach Software Design, Inc. can help your business gain a competitive advantage.


