
Big Data, Discovery, Extraction and Contract Management

A Data-Driven World

Never in the history of our world has the ability to discover, extract, interpret and intelligently act on data been so important. And never before has the amount of data been more overwhelming. According to Eric Schmidt, Google’s Chief Executive Officer, the world creates 5 exabytes of data every two days. That is roughly the same amount created between the start of civilization and 2003. To be fair, we are talking about digital data only. Data has been generated for as long as people have been around to create it. Until relatively recently (i.e. the last 50 years), however, it was siloed in minds, file folders, file cabinets and ledgers, where it was comparatively inaccessible from a discovery and analysis perspective.

Big Data

The term “big data” has taken on special significance in describing the mammoth amount of information being acquired and stored in our digital society: from buying habits, to website visits, to social media, to plant floor data, to almost anything you can imagine. Its value is limited without the ability to separate what is valid from what is not, and to extract a meaningful, manageable data set for analysis. Equipped with this information, business intelligence and other solutions can then work their predictive and trend analysis magic and arrive at useful conclusions.

Contracts: Enterprise Life-Blood

So how do big data, data discovery, data extraction and contract management play together? Allow me to explain. Whether a company is involved in software development, manufacturing, agriculture, finance, insurance, consumer packaged goods, retail or some other enterprise, its contracts are its life-blood. They assure a predictable and orderly revenue flow as the basis for day-to-day and future operations and planning, raw materials purchasing, capital equipment acquisition, capital expansion, employment and a host of other critical business functions.

Big Data or Just Extremely Important Data?

When we look at data and data complexity from a contract perspective, a large, complex contract might have 3,000-4,000 obligations that must be managed and tracked. This can be a formidable task for any company. Similarly, the complexity of managing 300-500 contracts, each with 50 obligations, can also be overwhelming for the small company. Would this be considered “Big Data”? Probably not, but it is important data nonetheless. So while big data implies a quantity of data that is difficult to manage due to sheer size, there is just as much benefit to be realized from more intelligently discovering and managing the important data you already have, even if it is “small data” by comparison.
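
To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. The figures are the ones quoted above; the function name is purely illustrative.

```python
def obligation_range(contracts_low, contracts_high, obligations_per_contract):
    """Return the (low, high) total number of obligations to track."""
    return (contracts_low * obligations_per_contract,
            contracts_high * obligations_per_contract)


# A portfolio of 300-500 contracts with 50 obligations each
low, high = obligation_range(300, 500, 50)
print(f"{low:,} to {high:,} obligations")  # 15,000 to 25,000 obligations
```

In other words, the "small" portfolio can carry several times as many obligations as a single large contract.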

A Proper Level of Rigorousness

If we can agree that contracts are incredibly important across enterprises world-wide, why are companies less than rigorous when it comes to managing contractual obligations and opportunities? Why do we find billion-dollar companies without centralized contract repositories, managing billions of dollars in contract revenues manually with people and spreadsheets? No matter how rigorous, no matter how conscientious an individual might be, with increasing complexity the human mind reaches a point where it can no longer process information consistently or accurately. In this age of specialized tools, it is a question that bears answering.


Sizable companies have hundreds, thousands or tens of thousands of contracts at a minimum: some inactive, some of one-year duration, some auto-renewing and some multi-year. Most are managed using meta-data kept in spreadsheets or informal databases, without comprehensive notifications or reminders of impending key activities or dates. Most of these enterprises have no ability to search and report on details inside the documents themselves. In this age of SOX and corporate compliance and accountability, this leaves CEOs highly exposed. It leaves them with incomplete knowledge of what their contractual obligations truly are. Does this describe your contract management environment?

When we talk about contracts, we are talking about the company jewels. All the information is there, but without the right tool, companies have no easy way to comprehensively understand what their contracts contain and when they need to take certain actions, except by reading every contract individually. If you have hundreds or thousands of contracts, not only is this process long and tedious, but it is also error-prone.
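
As a small illustration of the reminders that spreadsheets typically lack, here is a minimal Python sketch that flags contracts whose renewal date falls within a notice window. The field names, sample rows and the 60-day window are all assumptions made for the example; they do not describe any particular CLM product.

```python
from datetime import date, timedelta

# Hypothetical metadata rows, as they might be kept in a tracking spreadsheet
contracts = [
    {"name": "Supplier A", "renewal_date": date(2024, 3, 1)},
    {"name": "Customer B", "renewal_date": date(2024, 9, 1)},
    {"name": "Lease C", "renewal_date": date(2023, 12, 1)},
]


def upcoming_renewals(contracts, today, notice_days=60):
    """Return the contracts whose renewal date is within notice_days of today."""
    cutoff = today + timedelta(days=notice_days)
    return [c for c in contracts if today <= c["renewal_date"] <= cutoff]


for contract in upcoming_renewals(contracts, today=date(2024, 1, 15)):
    print(f"Renewal coming up: {contract['name']} on {contract['renewal_date']}")
```

Even a check this simple only works if the renewal dates have been captured as meta-data in the first place, which is exactly the gap discussed next.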

Two Levels of Discovery

So where do document discovery and extraction come into play? If you are like many companies, your meta-data is non-existent or limited in scope. Document discovery and extraction allow you to reach inside your data, extract what is meaningful and create data that is useful. There are two ways to complete discovery: ad hoc and structured. If you have a deep-search-enabled CLM solution, then in an ad hoc fashion you have the ability to search within your contract documents to discover any word, term, phrase, language or value. This is important when business conditions require you to assess your liability as related to certain contractual obligations.
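
To make the ad hoc case concrete, here is a minimal Python sketch of a phrase search across contract text. It assumes the documents have already been converted to plain text; the file names, sample text and the `search_contracts` function are hypothetical, and a real deep search feature would be considerably more capable.

```python
import re

# Hypothetical contract text, keyed by document name
documents = {
    "msa.txt": "Term of three years.\nEither party may terminate for Force Majeure events.",
    "nda.txt": "Confidentiality obligations survive termination.",
}


def search_contracts(documents, phrase):
    """Return (document, line number, line) for each line containing phrase."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    hits = []
    for name, text in documents.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((name, line_no, line.strip()))
    return hits


for name, line_no, line in search_contracts(documents, "force majeure"):
    print(f"{name}, line {line_no}: {line}")
```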

The second way to do discovery is via a specialized technology called Document Structure Analysis (DSA). DSA, in combination with Optical Character Recognition (OCR), uses proprietary algorithms to analyze document structure, perform pattern recognition and identify key information so that it can be extracted. Imagine, if you will, a database of 5,000 documents, each 40 pages in length, or 200,000 pages in total. Without contract meta-data, or with incomplete contract meta-data, how do you discover where important information resides? By manually searching 200,000 pages? Of course not.

What You're Probably Missing

When contract meta-data is missing or incomplete, discovery and extraction allow key information (contract assignment clauses, venue clauses, renewal clauses, price and discount structures, etc.) to be identified, extracted and then used to automatically populate a comprehensive meta-data description of each document. With this meta-data in place, executives then have the ability to slice, dice and report on metrics and complete trend analysis to their heart's content. Problem solved.
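
The idea of turning contract text into meta-data can be sketched in a few lines of Python. To be clear, this is not DSA: it is a simple keyword-matching stand-in, and the patterns, field names and sample text are all assumptions made for illustration.

```python
import re

# Illustrative keyword patterns only; real extraction relies on far richer analysis
CLAUSE_PATTERNS = {
    "assignment": re.compile(r"\bassign\w*", re.IGNORECASE),
    "venue": re.compile(r"\bvenue\b", re.IGNORECASE),
    "renewal": re.compile(r"\brenew\w*", re.IGNORECASE),
}


def build_metadata(name, text):
    """Record, per clause type, the first sentence that mentions it (or None)."""
    sentences = re.split(r"(?<=[.;])\s+", text)
    metadata = {"document": name}
    for clause, pattern in CLAUSE_PATTERNS.items():
        metadata[clause] = next(
            (s.strip() for s in sentences if pattern.search(s)), None
        )
    return metadata


# Hypothetical contract excerpt
sample = (
    "This Agreement shall automatically renew for successive one-year terms. "
    "Neither party may assign this Agreement without prior written consent."
)
for field, value in build_metadata("supplier_a.txt", sample).items():
    print(f"{field}: {value}")
```

A field that comes back empty, such as the venue clause in this sample, is itself useful to know about: it flags a contract that deserves a closer look.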

The Value of Introspection

I invite you to think about the real value of your contracts, how little you know about their details, the limitations you face in discovering non-standard contract items and the possibility of making changes to your contract management process to increase transparency and efficiency. If contract meta-data doesn’t exist today or is incomplete, document discovery and extraction provide the opportunity to quickly become knowledgeable about where there is additional revenue opportunity and where as-yet-unidentified risk can be found.

To Learn More

openSourceCM is a pre-eminent provider of Contract Life-Cycle Management (CLM) solutions and document discovery and extraction services. To learn more about how to discover the hidden jewels in your document data, please call (866) 673-6768 or visit us at

Tags: CM News
