Health insurance companies face a staggering number of fraud, waste and abuse cases. While today’s technologies promise to simplify detection, those tools are only effective when companies feed them clean data. Let’s explore why this is complicated and what it takes to prepare data.
The state of health insurance data
No matter how your company is detecting fraud, one principle holds true: the results you produce can only be as good as the data you call upon. In other words, if you put poor-quality data into your fraud model, you’ll end up with poor results.
Ensuring your data is up to par is no simple matter, given the number and types of data sources involved. On average, health insurance companies process claims containing 30-38 services per member annually. Multiply that by the number of members and providers, and by the number of internal and external data sources that go into every claim processing transaction, and the volume is tremendous. As a result, it takes significant time and effort to prepare the data for detecting fraudulent claims.
Let’s look more closely at all the permutations of that data:
- Internal data sources. Health care companies often pull information from multiple legacy systems in order to detect and prevent fraud, waste and abuse. In addition to confirming code sets and policies – and consulting medical records and documentation – the company needs to arrive at a comprehensive view of the provider making the claim.
- External data sources. It’s also necessary to pull in data from third-party sources. These might include aggregated information on historical claims data, fraud watch and sanction lists, business and credit bureaus, news and social media, and medical billing data, to name a few.
- Structured and unstructured data. Every company generates and works with a combination of structured and unstructured data. Structured data follows a defined format (think rows and columns within a relational database), which makes it easy for software programs to analyze. Unstructured data, by contrast, lacks any consistent or defined organization. Examples include emails, images, and medical records. It’s far more difficult for a software program to understand data in this form. The additional challenge is that over 80% of health insurance data is considered unstructured.
Going from raw to clean data
In addition to aggregating and normalizing all this data so it’s consistent, your company needs to make sure the data is accurate and usable. This involves steps such as confirming entities are correct and determining if data is missing. These checks are essential since it’s easy for people to enter data incorrectly, or to deliberately falsify details such as date of birth, when information goes into a system.
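The checks described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular product: the field names (`member_id`, `provider_id`, `date_of_birth`, `service_code`), the ISO date format, and the 120-year age cutoff are all assumptions made for the example.

```python
from datetime import date, datetime

# Hypothetical required fields for a claim record (an assumption for this sketch).
REQUIRED_FIELDS = ("member_id", "provider_id", "date_of_birth", "service_code")


def normalize(record):
    """Trim whitespace, uppercase strings, and treat empty strings as missing."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip()
            cleaned[key] = value.upper() if value else None
        else:
            cleaned[key] = value
    return cleaned


def find_issues(record, today=None):
    """Return a list of data-quality issues found in a normalized record."""
    today = today or date.today()
    issues = []

    # Flag required fields that are absent or empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            issues.append(f"missing:{field}")

    # Flag dates of birth that cannot be parsed or are not plausible.
    dob = record.get("date_of_birth")
    if dob is not None:
        try:
            parsed = datetime.strptime(dob, "%Y-%m-%d").date()
        except (TypeError, ValueError):
            issues.append("invalid:date_of_birth")
        else:
            # Assumed plausibility rule: not in the future, not over 120 years ago.
            if parsed > today or today.year - parsed.year > 120:
                issues.append("implausible:date_of_birth")
    return issues


raw = {
    "member_id": " m123 ",
    "provider_id": "",
    "date_of_birth": "2090-01-01",
    "service_code": "99213",
}
record = normalize(raw)
print(find_issues(record, today=date(2024, 1, 1)))
```

In practice, records with issues would likely be routed for correction or review before they reach the fraud detection model, rather than silently dropped.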
Experience has shown that it takes a significant amount of upfront work to prepare your data for your fraud detection model. In fact, most companies spend about 80% of their time preparing their data for use in the model, and the other 20% of their time designing and launching the model.
However, preparing the data is time well spent. The more clean data you feed into your fraud detection model, the better the results and insights. At Shift we have found that an effective fraud, waste and abuse reduction program powered by clean, accurate data can significantly help your organization by enabling you to:
- Identify more fraud, waste and abuse cases
- Reduce fraud losses and improper payments
- Prioritize actionable alerts
To learn more about how Shift can help reduce fraud, waste and abuse for health insurers, please visit our Force for Healthcare solutions page.