This article is a CITO Research Overview of Attivio, and will attempt to answer the following questions:
- What does Attivio do?
- The logic of Attivio
- How Attivio Works
- How is Attivio different from other competing products and approaches?
- Which CITO Research Technology Perspectives is Attivio aimed at?
- What compelling value does Attivio provide to a CITO?
- What questions should be explored in further research?
What does Attivio do?
Attivio is a fascinating company that was born of deep experience in business intelligence and enterprise search. Its basic value proposition is as follows:
- Business Intelligence systems are better at raising questions than at answering them
- When a BI Dashboard shows that sales have dropped or risen, the answer to why that happened is often in the unstructured information in a company
- To answer the questions that arise, you often have to then search through the unstructured information to find what really happened
- Attivio remedies this problem by connecting BI dashboards with the relevant unstructured information
Here's a perfect example of how this could work. I recently talked with the CTO of a large systems integrator. One technology accounted for a large amount of the firm's business and had been generally on the decline, yet he and the rest of senior management were puzzled that work related to it was not declining at one major client.
They then looked through the statements of work at that client and found that many projects were being purposefully misclassified as related to the declining technology when they were not. The account manager was doing this because he knew that he would be punished if the work in that area declined.
If the unstructured information had been connected to the dashboard, all of this could have happened much faster. It is also likely that more use would be made of unstructured information. So, now that we have the basic idea, let me explain more about Attivio.
The Logic of Attivio
My understanding of Attivio began in January when I met with Ali Riaz, the company’s CEO. I am always eager to explain new technology that points to trends that are going to be meaningful for CITOs.
I was delighted to find a product and a company culture that I think really illustrate some of the trends in which I am most interested. Ali Riaz has held executive positions with Computer Sciences Corporation and Novartis.
He has a strong background in enterprise software, from both the buyer and the vendor perspectives. For several years he was President and COO of FAST, a search company that was eventually acquired and held by Microsoft.
In doing so, Ali got a firsthand view of how people are accessing information, and uncovered a variety of problems that can be boiled down into three observations:
1. Enterprise search only finds unstructured content. The whole story of anything in an enterprise is really contained in both structured data (databases, data warehouses and applications) and unstructured content (documents, PDFs, emails, etc.).
Enterprise search companies sometimes flatten out enterprise data, and make it into something approximating a document, but that’s quite a different approach from being able to handle structured data natively, as it’s being updated.
So, Ali’s first observation was that enterprise search is not enough. But neither is business intelligence, which ignores the estimated 80% of enterprise information that’s created in an unstructured format.
The ideal system would be able to aggregate all the unstructured data and all the structured data. Ali points out that this is important because the answers to questions come in two forms.
First, you ask “what happened?” Typically, that question is followed by a concrete objective such as, “Let’s look at our sales for the last quarter.” You may find that your sales for the last quarter went up or down based on the data you pull from your ERP system.
Or you might ask, “what does our sales pipeline look like?” And you see this numerical data that reflects a model of your enterprise, and it shows that something goes up or down, and that’s great.
The problem is, you don’t generally find out the answer to why that model changed, because the answer to why that data went up or down is not really in the structured data in the application. It’s in the unstructured data that surrounds your enterprise.
It’s in the e-mails, the documents, the meeting minutes, and the presentations. Ideally, you want a system that not only tells you what happened—“sales went up”—but also why it happened.
The convergence of results between structured and unstructured data is crucial to making that happen. That’s why Attivio set out to define a new kind of technology platform that would deliver unified information access (UIA), the target market segment the company is trying to address.
2. The second observation Ali made was that people are repeatedly frustrated when they find information but can’t act on it quickly. He calls the capability that is missing bi-directional workflow.
In a one-way workflow, the data goes from an enterprise application into a data warehouse, and from a variety of other places into unstructured forms. Then, when you do a search, you get a version of the data that is, in essence, “frozen behind glass.”
Suppose that you want to do something about that data by changing something in the underlying enterprise applications. If you were to do this, you’d have to figure out where the data came from, find your way into the right enterprise application, either by way of the document or through the structured data that you’re looking at, search around in that application for the data that you’re interested in, and then do whatever it is you’re going to do to change that data.
What is needed, according to Ali, is a method that will take you directly from that display of the query results to the application, and that’s what he calls bi-directional workflow.
3. The third observation that Ali made is that it takes a long time to implement enterprise software systems. He oversaw a successful implementation in a large company, where the company was actually able to implement an enterprise search system on time and on budget.
When Ali was talking with the CIO at the end of the job, he remarked, “Isn’t it great that we did what we said we were going to do?” And the CIO said, “Yes, it is good that we hit this on time. It is good that we hit it on budget. But we’ve taken about a year to do it, and now we have another silo of information that I need to worry about. And that’s all well and good, but it’s clear that we’ve now understood a lot more about what we want to do, but we’re all exhausted. We now have to understand how to incorporate this new system into our processes. And then, after we do that, we have to figure out how to improve it, and if we have the time, energy and budget to do so.”
At that point, Ali realized that the experience of implementing enterprise software, whether you succeeded or not, carried with it a sense of exhaustion.
Then we must ask: is it possible to implement enterprise systems without a death march?
It would be better if you were able to implement software in an Agile fashion and could get to a result in a few months.
Perhaps the result is smaller than what would have been achieved in the 12-month cycle, but the shorter-term Agile result would allow you to look at what you did, incorporate it into your systems, start understanding the data, and undertake the next iteration, in the same way Agile development suggests.
How Attivio Works
Attivio’s product seeks to resolve many of the issues discussed above. Attivio provides query results that combine structured and unstructured data, and queries can be run with simple keyword searches or with more complex and precise SQL.
Imagine a screen with a dashboard on top that summarizes the structured data in the applications. That’s a traditional business intelligence function.
Then imagine that, underneath that screen, there is a result screen that shows you all the unstructured data that is related to that structured data—all of the documents and presentations that typically contain the most relevant information but are the most difficult to find. That is basically Attivio’s product.
Here’s how the product functionality addresses the three problems above:
The challenge of Attivio’s technology is to allow the structured world of business intelligence data and the unstructured world of enterprise search to be connected, so that when you do a search in the business intelligence realm, you can see what unstructured content in the enterprise search realm might be related to it.
Conversely, you can do a search in the enterprise search realm and then find business intelligence data that’s related to it. Still, the first path, searching in business intelligence and linking to the appropriate unstructured content through enterprise search, is the one that happens most often.
So, how do you connect the structured and unstructured data worlds?
Let’s imagine that you have a typical business intelligence dashboard, or are looking at the output of a business intelligence report. What is in that report that might be relevant to your unstructured content?
It turns out that what’s on that report are various “things”, such as company names, product names, financial concepts, industry trends, and all sorts of other items that you can identify as entities or concepts, which could then match terms on the unstructured side.
A very ham-handed approach to this would consist of entering one of these terms in an enterprise search box, and seeing underneath that the search results related to that entity. The problem is, that approach just doesn’t work well enough.
While it might help a little bit, it confronts the user with an unsorted, relatively crude list of search results to look through.
Attivio takes the idea of entities and concepts, and applies those concepts to the unstructured data, so that, when you create a business intelligence dashboard, Attivio allows the entities that are related to that specific dashboard to be extracted.
Attivio retrieves relevant entities, such as company name, product name, and the like, that can then be joined to the structured data so you get a much more accurate listing of relevant content. The result of this dashboarding is that you can now go from having a much less valuable set of results to a much more valuable and comprehensive one.
That's the first way that Attivio connects the business intelligence world to the world of unstructured content.
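To make the idea of joining on entities concrete, here is a minimal sketch in Python. It is not Attivio's implementation; the sample rows, the document texts, the field names, and the function names are all hypothetical, and the matching is a plain whole-phrase lookup rather than real entity extraction.

```python
import re
from collections import defaultdict

# Hypothetical structured data, as a BI dashboard might summarize it.
sales_rows = [
    {"customer": "Acme Corp", "product": "Widget Pro", "quarter": "Q1", "revenue": 120000},
    {"customer": "Globex", "product": "Widget Lite", "quarter": "Q1", "revenue": 45000},
]

# Hypothetical unstructured content: emails, minutes, memos.
documents = {
    "email-17": "Acme Corp asked to delay the Widget Pro rollout until the audit is done.",
    "minutes-03": "Globex is evaluating a competitor to Widget Lite because of pricing.",
    "memo-42": "Reminder: the cafeteria is closed on Friday.",
}

# Assumption: these columns hold the entity names worth linking on.
ENTITY_FIELDS = ("customer", "product")


def extract_entities(rows, fields=ENTITY_FIELDS):
    """Collect the distinct entity names that appear in the structured rows."""
    return {row[field] for row in rows for field in fields}


def index_documents(docs, entities):
    """Map each entity to the ids of the documents that mention it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for entity in entities:
            # Whole-phrase, case-insensitive match.
            if re.search(r"\b" + re.escape(entity) + r"\b", text, re.IGNORECASE):
                index[entity].add(doc_id)
    return index


def related_documents(row, index, fields=ENTITY_FIELDS):
    """Rank documents by how many of the row's entities they mention."""
    counts = defaultdict(int)
    for field in fields:
        for doc_id in index.get(row[field], ()):
            counts[doc_id] += 1
    return sorted(counts, key=lambda d: (-counts[d], d))


entities = extract_entities(sales_rows)
entity_index = index_documents(documents, entities)
for row in sales_rows:
    print(row["customer"], row["product"], "->", related_documents(row, entity_index))
```

The point of the sketch is that the entity names act as the join key: each dashboard row pulls up only the documents that mention its customer or product, instead of a raw keyword result list.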
The second issue Attivio attacks is, how do you make the list of content more meaningful?
Attivio uses a process called key phrase extraction, which seeks to answer the question, "what is the unstructured data telling you?" Key phrase extraction filters unstructured data for key concepts that are being discussed.
Entity extraction helps with linkage, and concepts help you disambiguate and seek the answer to the question, “What is this business telling me; what am I trying to understand?”
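A rough way to picture key phrase extraction is to count which short phrases recur across many documents. The sketch below does exactly that with two-word phrases; it is a deliberately simple stand-in, not the method Attivio uses, and the stopword list, the sample posts, and the function names are assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a tiny stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "and", "but", "is", "are", "my", "i",
             "of", "to", "in", "on", "it", "with"}

# Hypothetical forum posts.
posts = [
    "I love my phone but the network problems are getting worse.",
    "Great camera. Network problems again in the city.",
    "Battery life is great, network problems keep dropping my calls.",
]


def candidate_phrases(text, size=2):
    """Yield every run of `size` consecutive words that contains no stopword."""
    words = re.findall(r"[a-z']+", text.lower())
    for i in range(len(words) - size + 1):
        gram = words[i:i + size]
        if not any(word in STOPWORDS for word in gram):
            yield " ".join(gram)


def key_phrases(texts, top_n=3, min_docs=2):
    """Return phrases that occur in at least `min_docs` texts, most common first."""
    doc_freq = Counter()
    for text in texts:
        # Count each phrase once per document, however often it repeats.
        doc_freq.update(set(candidate_phrases(text)))
    frequent = [(phrase, n) for phrase, n in doc_freq.items() if n >= min_docs]
    return sorted(frequent, key=lambda pn: (-pn[1], pn[0]))[:top_n]


print(key_phrases(posts))
```

On these three sample posts the only phrase that recurs is "network problems", which is the kind of link a user could then drill into.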
As you drill deeper, and more information surfaces, you can ask more questions and get more granular still. For example, if you were an executive at Verizon and you had a dashboard of your Apple sales, and you also had a list of all the forum content in your view, you might see that a lot of the content was positive, but then you might see "network problems" as a link.
You drill into the "network problems," and you find that people love their iPhones, but they don’t like the AT&T network.
You can also use sentiment analysis, where you can quantitatively evaluate whether people are saying positive or negative things about one of your search terms, to analyze unstructured data. Of course, it's likely that both positive and negative comments will be made about a single search term.
This is what Attivio CTO Sid Probstein calls "entanglement."
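As an illustration of scoring sentiment around a search term, here is a minimal lexicon-based sketch. The word lists, the sample comments, and the "mixed" bucket are my own assumptions; production sentiment analysis is far more sophisticated, and nothing here reflects how Attivio scores content.

```python
import re

# Assumption: tiny hand-picked word lists, for illustration only.
POSITIVE = {"love", "great", "fast", "reliable"}
NEGATIVE = {"hate", "slow", "dropped", "problems", "terrible"}

# Hypothetical customer comments.
comments = [
    "I love my iPhone, the screen is great.",
    "My iPhone is great but calls get dropped all the time.",
    "Terrible coverage on my iPhone, constant network problems.",
    "The store was closed today.",
]


def sentiment_counts(text):
    """Count positive and negative lexicon words in one piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return positive, negative


def summarize(term, texts):
    """Bucket every text that mentions `term` as positive, negative, or mixed."""
    mentions = [text for text in texts if term.lower() in text.lower()]
    summary = {"mentions": len(mentions), "positive": 0, "negative": 0, "mixed": 0}
    for text in mentions:
        positive, negative = sentiment_counts(text)
        if positive and negative:
            summary["mixed"] += 1  # both kinds of words in the same comment
        elif positive:
            summary["positive"] += 1
        elif negative:
            summary["negative"] += 1
    return summary


print(summarize("iPhone", comments))
```

The "mixed" bucket is there because a single comment can praise the device and criticize the network in the same breath, so a single positive-or-negative score per term would hide part of the story.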
Once you identify issues, you can then put those issues into workflows that automatically get triggered once a key phrase is found. Then, at least some of your business intelligence analysis can happen automatically when it interacts with unstructured data.
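The idea of a workflow that fires when a key phrase shows up can be sketched as a small rule engine. Everything in this example is hypothetical: the `WorkflowEngine` class, the `open_ticket` action, and the document id are invented for illustration and are not part of any Attivio API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# An action receives the document id and the phrase that matched.
Action = Callable[[str, str], None]


@dataclass
class WorkflowEngine:
    """Hypothetical engine that runs an action whenever a watched phrase appears."""
    triggers: List[Tuple[str, Action]] = field(default_factory=list)

    def register(self, phrase: str, action: Action) -> None:
        """Watch for `phrase` (case-insensitive) and run `action` on a match."""
        self.triggers.append((phrase.lower(), action))

    def process(self, doc_id: str, text: str) -> List[str]:
        """Run every matching trigger and return the phrases that fired."""
        lowered = text.lower()
        fired = []
        for phrase, action in self.triggers:
            if phrase in lowered:
                action(doc_id, phrase)
                fired.append(phrase)
        return fired


# Stand-in for a real workflow step such as opening a ticket.
review_queue = []


def open_ticket(doc_id: str, phrase: str) -> None:
    review_queue.append((phrase, doc_id))


engine = WorkflowEngine()
engine.register("network problems", open_ticket)
engine.register("billing error", open_ticket)

fired = engine.process("forum-881", "Lots of Network Problems downtown this week.")
print(fired, review_queue)
```

In practice the action would hand the document to a person or another system; the sketch only shows that once key phrases are identified, the routing step can be automated.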
Attivio's technology revolves around the idea that there is a structured world of business intelligence data and an unstructured world of enterprise content, and that one can use entities to link the two together. This allows you to use a variety of techniques, such as key phrase extraction and sentiment analysis, to analyze and gain business insights from unstructured data.
The technology builds on itself, so that the analysis of unstructured data becomes more and more sophisticated, and more and more vectors, in addition to sentiment analysis and key phrase extraction, will be applied.
How is Attivio different from other competing products and approaches?
Attivio is unique in its approach to unified information access. They are not trying to replicate the functionality of the BI products. They have the ability to create dashboards, but Attivio happily integrates with players like Jaspersoft iReports and data visualization products like TIBCO Spotfire or QlikView.
The idea of using entity extraction on structured data is something that I haven't run across before. Once the entities are extracted from the structured data, they become a bridge to the relevant unstructured data.
Which CITO Research Technology Perspectives is Attivio aimed at?
It is unlikely that small organizations will get much out of Attivio. In order for the product to work, a company must have many active users of lots of data and also have relevant unstructured data in repositories that can be searched.
Attivio is primarily relevant to the large departmental user of technology and to the corporate and multi-organizational level.
What compelling value does Attivio provide to a CITO?
Attivio will help CITOs who are supporting end-users involved in processes and decision making that are heavily data-oriented. If that data also leads to questions that can be explored using the unstructured data in a company, then Attivio will speed the way to more informed decisions.
CITOs can discover this by doing a survey of the highest value processes in their company, then interviewing the process owners and participants and asking the following questions:
- What are the most important questions that are relevant to increasing your performance?
- What data or documents would be needed to answer these questions?
- What follow-on questions do these questions lead to?
- How fast must these questions be answered to provide value?
- What is the economic impact of answering these questions in a timely manner?
One thing for the intrepid CITO to keep in mind is that neither the process owners nor the end-users will be able to answer these questions fully. The CITO will have to help them manage the complexity of doing the research to find out what sources of data are available in both structured and unstructured forms.
The CITO will likely learn quite a bit about the business by answering these questions. With this information in hand, it should be possible to identify low-hanging opportunities. The budget won't be hard to get because the ROI comes right from the end-users.
What questions should be explored in further research?
I'm sure over the next few months I will be able to chat with more end-users of Attivio. What I'm interested in is finding out answers to the following questions:
- How often does the answer or insight sought show up in the unstructured content provided by Attivio?
- How often does the unstructured information provided by Attivio become the starting point for a set of questions that are answered inside a set enterprise search environment?
- How much do Attivio's entity extraction and advanced analysis methods help make the process of searching unstructured documents better?
The trick of using entity extraction to connect structured and unstructured data should apply in many contexts in addition to BI reports and dashboards.
How can Attivio enhance other applications?