
For decades, insurers have been amassing enormous databases of claim information, and the amount of data available to them today is overwhelming. Millions of untapped records, reports and documents sit in insurance data centers, enough information to predict likely claim outcomes. The big question for insurers is, "What exactly do we do with this data?" More important still is, "How do we turn this data into knowledge that can be acted upon?"
The ability to effectively mine and analyze this data can result in faster claims resolution, better data modeling and forecasting, and considerable loss cost savings for insurers. Because a small number of claims constitutes a significant portion of a carrier's total claims payout, even small improvements in the claims payout process can offer significant financial benefits to insurance companies.
Claims professionals now have the ability to mine actionable information from both structured and unstructured data sources through the use of semantic technologies and text data mining, a first step in the effective use of predictive analytics. In fact, the primary objective of text mining is to create new information that has predictive value.
Characteristics of Claims Data
Carriers have accumulated raw claim or loss data that amounts to millions of text records. Examples of text data include claim description fields in claim files, the content of e-mails, underwriters' written evaluations of prospective policyholders contained in underwriting files and responses to open-ended questions on customer satisfaction surveys. Until recently, this information was available, yet extracting knowledge from it was not financially feasible.
Industry research suggests that 80 percent of any company's data is unstructured and that 90 percent of that information is unmanaged. In the insurance industry, claims data is actually 97 percent unstructured, with the richest information residing in the unique set of words, acronyms and abbreviations that make up the adjuster's notes. So, if insurers are not effectively using text mining, it follows that important business decisions are being made based upon only about three percent of the information available. When you consider the size of the insurance industry, that percentage is a staggering revelation.
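To make the idea concrete, here is a minimal sketch of how the shorthand in an adjuster's note might be turned into structured indicators that a predictive model could use. It is an illustration only: the abbreviation dictionary, the indicator names and the sample note are all hypothetical, and a production system would rely on a far larger vocabulary built from a carrier's own notes.

```python
import re

# Hypothetical shorthand dictionary (an assumption for illustration);
# a real one would be compiled from a carrier's own adjuster notes.
ABBREVIATIONS = {
    "clmt": "claimant",
    "atty": "attorney",
    "surg": "surgery",
}

# Hypothetical indicators: each flag is set when any of its terms appears.
INDICATORS = {
    "attorney_involved": {"attorney", "lawsuit"},
    "surgery_mentioned": {"surgery"},
}


def normalize(note):
    """Lowercase the note, split it into tokens and expand known shorthand."""
    tokens = re.findall(r"[a-z0-9]+", note.lower())
    return [ABBREVIATIONS.get(token, token) for token in tokens]


def extract_flags(note):
    """Turn a free-text note into a dictionary of structured true/false flags."""
    words = set(normalize(note))
    return {name: bool(words & terms) for name, terms in INDICATORS.items()}


if __name__ == "__main__":
    sample = "Clmt retained atty; surg scheduled."
    print(normalize(sample))
    print(extract_flags(sample))
```

The resulting flags are structured fields derived from text that would otherwise go unused, which is the kind of new information with predictive value that text mining aims to create.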
While the richest information may reside in the adjuster notes, those notes are just a part of a claim department's unstructured data inventory. There also are e-mails with attachments, imaged documents, case manager notes, Web-based information, recorded statement transcriptions (or digital audio files) and digital photos. All of this valuable content currently exists within insurance companies' data inventories: the information gold that the industry has yet to mine effectively.