In the first article in this series, I explained what machine data is and how it can help business applications. In this article, I explain how people can use a data fabric like Splunk to take machine data and use it to improve operational intelligence (I also explain what I mean by those terms).
If you look closely at why Splunk is popular in data centers, you can glimpse its broader potential and see why machine data is going to become an important source of insights.
In most data centers, dozens of web servers, routers, databases, switches, security devices, and application accelerators work together, and each one creates its own log file. When something goes wrong, admins turn to those log files, and because a single problem can leave traces on several machines, it is important to be able to search all the logs at once.
Splunk aggregates logs from all these devices in one master index. Instead of logging into each machine, a system administrator can diagnose a problem using Splunk’s central, aggregated index. This capability is wildly popular in data centers, continues to fuel Splunk’s growth, and has spawned a raft of competitors.
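The idea of a central, aggregated index can be illustrated with a minimal Python sketch. This is not how Splunk is implemented. It only shows the principle of merging per-device logs into one time-ordered timeline that can be searched in a single pass. The device names, log lines, and the `Event`, `build_index`, and `search` helpers are all hypothetical.

```python
import heapq
from datetime import datetime
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: datetime
    source: str
    message: str


def parse_lines(source: str, lines: Iterable[str]) -> Iterator[Event]:
    """Turn 'ISO-timestamp message' lines from one device into events."""
    for line in lines:
        stamp, _, message = line.strip().partition(" ")
        yield Event(datetime.fromisoformat(stamp), source, message)


def build_index(logs: dict[str, list[str]]) -> list[Event]:
    """Merge per-device logs (each already in time order) into one timeline."""
    streams = [parse_lines(source, lines) for source, lines in logs.items()]
    return list(heapq.merge(*streams, key=lambda event: event.timestamp))


def search(index: list[Event], term: str) -> list[Event]:
    """Return every event, from any device, whose message contains the term."""
    return [e for e in index if term.lower() in e.message.lower()]


# Made-up sample logs from three hypothetical devices.
logs = {
    "web01": [
        "2013-05-01T10:00:01 GET /checkout 500",
        "2013-05-01T10:00:05 GET /home 200",
    ],
    "db01": [
        "2013-05-01T10:00:00 connection pool exhausted",
        "2013-05-01T10:00:03 query timeout",
    ],
    "router01": ["2013-05-01T10:00:02 link flap on port 7"],
}

index = build_index(logs)
for event in index:
    print(event.timestamp.isoformat(), event.source, event.message)
```

With the events from all three devices interleaved by time, an administrator can read the database error and the web server failure one second later as a single story instead of logging into each machine.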
What’s in a Log?
Logs consist of millions of micro-events, as shown in the graphic below. A web server logs every element of every page it serves. A database logs every request for information and every update. Often different types of records are written to the same log. If there’s a failure, huge chunks of data are written to a log to record the details. Almost all machine data has a timestamp.
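To make a single micro-event concrete, here is a small Python sketch that parses one web server log line into named fields. It assumes the line follows the widely used Common Log Format. The article does not say which formats are involved, so the format, the sample line, and the `parse_line` helper are illustrative assumptions.

```python
import re
from datetime import datetime
from typing import Optional

# Common Log Format: host ident user [time] "request" status size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line: str) -> Optional[dict]:
    """Parse one log line into a dict, or return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Almost all machine data has a timestamp; convert it to a real datetime.
    event["timestamp"] = datetime.strptime(event.pop("time"), "%d/%b/%Y:%H:%M:%S %z")
    event["status"] = int(event["status"])
    # A size of "-" means no bytes were sent.
    event["size"] = 0 if event["size"] == "-" else int(event["size"])
    return event


sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
event = parse_line(sample)
print(event)
```

Returning `None` for lines that do not match is a deliberate choice: because different types of records are often written to the same log, a parser for one record type should skip the others instead of failing.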
Data, Data, and More Data
Now consider the new world we are facing. Devices track all sorts of behavior, pumping out huge amounts of machine data. Each device has, or could have, a log file, but it can’t store its own log data (or much of it). To see the bigger picture, you must track what is happening on many different devices. As time goes on, more and more machine data will start spewing out of mobile devices, sensors, industrial equipment, medical devices, and so on. This is in addition to the data we all throw off every day in social media, email, banking, and other activities.
Right now, much of this data is never captured or examined because existing systems cannot make it useful. They cannot handle the scale or the velocity of the data. A data fabric fills this gap and allows machine data to be harvested.
What Is a Data Fabric?
A data fabric is an umbrella concept for the capabilities needed to make machine data useful. Splunk is the first instance of a data fabric. It makes machine data useful because it can handle:
- Time series data
- Structured or unstructured data
- Data at huge volumes
- Data from a vast number of heterogeneous sources
- Static data in huge repositories
- Real-time streams of data
- Ad hoc queries
- Advanced application development through APIs
Most data processing systems make you choose one or two of these capabilities. A data fabric like Splunk doesn’t. You can throw the fabric over the data and start exploring it.
A data fabric must also be able to serve many types of users. Splunk’s Search Processing Language (SPL) lets most people start exploring data incrementally within a few minutes. Message Bus, for example, uses Splunk to let its customers explore machine data that shows how each of them is using its product for sending email at scale.
In addition, Splunk is a powerful application development environment that can be controlled completely through APIs. Splunk has developed dozens of special purpose applications, and customers have created hundreds more.
In essence, a data fabric is a system that makes machine data useful without restricting your options for exploring it.
How Operational Intelligence Will Change Business Applications
A data fabric opens up the possibility for machine data to start providing value. In environments like data centers where machine data is the focus, the benefits are immediate. System administrators can investigate problems and monitor proactively to detect problems early.
But since most business applications were built before the era of machine data, it isn’t always obvious how machine data could improve them.
What Machine Data Does
Machine data extends and deepens the operating model. Most business applications were built in an era of information scarcity. If machine data allows, for example, a consumer’s movements around a store to be tracked instead of just his purchases, it is possible to create a richer model of his behavior. As sensors emit more machine data, the sophistication and granularity of the models will increase. The visibility provided by better, more granular models is the first impact of machine data.
Machine data identifies important events. Machine data changes business applications by identifying events with business significance. For example, elevator data could raise an alert when traffic drops by a certain percentage. With richer models of behavior, it is possible to create a sophisticated description of what is normal, which provides the foundation for many insights.
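The elevator example can be sketched in a few lines of Python. The sketch treats "normal" as the average of the most recent readings and raises an alert when a new reading falls a given percentage below that baseline. The window size, the 50 percent threshold, the trip counts, and the `traffic_drop_alerts` helper are all assumptions made for illustration. A real model of normal behavior would be richer.

```python
from statistics import mean


def traffic_drop_alerts(counts, window=3, drop_pct=50.0):
    """Return (position, count, baseline) for each reading that falls
    drop_pct percent or more below the mean of the previous `window` readings."""
    alerts = []
    for i in range(window, len(counts)):
        baseline = mean(counts[i - window:i])
        threshold = baseline * (1 - drop_pct / 100)
        if baseline > 0 and counts[i] <= threshold:
            alerts.append((i, counts[i], baseline))
    return alerts


# Hypothetical elevator trips per hour; the fifth reading is a sharp drop.
trips_per_hour = [40, 42, 38, 41, 12, 39]

for position, count, baseline in traffic_drop_alerts(trips_per_hour):
    print(f"hour {position}: {count} trips, baseline was {baseline:.1f}")
```

Comparing each reading with a rolling baseline, not a fixed number, is what lets the same rule work for a busy office tower and a quiet apartment building.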
A richer model and an expanded collection of events pave the way for better applications that expand awareness of what is happening. In addition, applications that were not initially designed to operate on real-time data can get the benefit of insights collected in real time. This opens the door to advanced automation that makes an application more powerful.
Machine Data Informs Operational Intelligence
A paper published last year, “Operational Intelligence: What It Is and Why You Need It Now,” identifies four levels of this transformation, as shown in the graphic below:
Of course, the ability to transform an application with operational intelligence greatly depends on the control you have over it. But even the most locked down, packaged applications have ways of accepting new data and integrating new UI elements into their screens. Almost every application in a company’s portfolio could benefit from operational intelligence if the right machine data is available to be exploited through a data fabric.
Running a Machine Data Adoption Program
Installing a data fabric like Splunk and using operational intelligence as a roadmap will be a good start, but nothing will happen unless relevant machine data is found. Some industries, like manufacturing, are awash in data and need to find a way to put it to better use. Most companies are going to need to explicitly search for relevant machine data.
Exploring Machine Data
To lead this transformation, technology leaders will need to start a formal program of searching for machine data and evaluating its value. While the tech staff will usually have to play a role and set up the initial environment, once you have Splunk in place and people have some initial training, they can explore machine data on their own. The more people involved in the process of finding value in data, the better. The best way to direct this activity is to make sure everyone knows which questions would be valuable to answer, or to focus on the processes that create the most value.
Understanding the Impact of Machine Data
Once awareness of the value of machine data has spread and more people can search and evaluate data, the challenge becomes understanding how it can have an impact. For most companies, operational intelligence will help other efforts:
- For a center of excellence focused on integration, the data fabric makes it easier to integrate machine data sources with applications, or for applications to create new sets of data.
- For app dev groups, using semantic logging, that is, having applications write descriptions of activity to log files, can increase the ability to track user behavior. A data fabric can put that data to use.
- For business intelligence, machine data can add new sources of insight and also recognize events in real time.
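Semantic logging, mentioned in the list above, can be sketched with Python's standard `logging` module. The application writes each business-level action as `key="value"` pairs, which are easy for a person to read and for a tool to extract later. The event name, the field names, and the `log_event` and `read_event` helpers are hypothetical, and the key-value layout is one common convention, not a required format.

```python
import io
import logging
import re

PAIR = re.compile(r'(\w+)="([^"]*)"')


def log_event(logger: logging.Logger, action: str, **fields: object) -> None:
    """Write one business-level event as key="value" pairs."""
    pairs = {"action": action, **fields}
    logger.info(" ".join(f'{key}="{value}"' for key, value in pairs.items()))


def read_event(line: str) -> dict[str, str]:
    """Recover the key-value pairs from a logged line."""
    return dict(PAIR.findall(line))


# Log to an in-memory buffer so the example is self-contained.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
logger = logging.getLogger("checkout")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

log_event(logger, "add_to_cart", user_id=42, sku="A-100", quantity=2)

line = buffer.getvalue().strip()
print(line)
print(read_event(line))
```

This simple layout breaks if a value itself contains a double quote, so a production version would need escaping or a structured format such as JSON.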
My belief is that CIOs and CTOs can use machine data to escape a back office focus in which the biggest victory that could be achieved is to drive IT costs to zero. By focusing on the potential of exploiting machine data using a data fabric like Splunk and operational intelligence as a roadmap, CIOs and CTOs can start having an impact on the front office where there is unlimited upside potential. That’s the real mission of IT in the first place, to make a business more successful, and machine data will play a huge role.