Machine Learning (ML) is a computer science field that enables computer systems to learn by automatically building statistical models with the use of available data. Instead of following static program instructions, ML algorithms learn from the data to identify patterns, discover knowledge and make decisions with minimal human intervention.
ML is part of the broader field of Artificial Intelligence (AI). Other areas of AI frequently intertwine with ML, such as Knowledge Reasoning, Natural Language Processing and Artificial General Intelligence. Recently, we have witnessed a wave of new products and services built within several AI spaces. Many of these initiatives have faced a strong backlash because they failed to meet expectations. However, ML algorithms prove useful when applied to a specific, limited problem, such as detecting an anomaly in business transactions.
Healthcare insurance and the issue of waste, abuse and fraud
Healthcare insurance is a multifaceted industry that brings together care providers, insurance companies and patients. As the industry is expected to create social benefits, there is constant pressure to contain costs while providing security and improving the health of the general population.
Misuse of health insurance is an ongoing issue. Motivated by financial incentives, different stakeholders create waste, abuse the market or even commit fraud. The volume of waste, abuse and fraud (WAF) is estimated to be in the range of 5-10% of the yearly healthcare expenditure. This makes WAF a significant contributor to medical inflation. Because insurance claims are one of the key tools for controlling healthcare spending, healthcare payers keep them under continuous scrutiny.
From an insurance perspective, WAF is generated by both healthcare providers and insured members. In the worst cases, conspiracy-type fraud involves several parties colluding in the misuse. By complexity, WAF can be categorized into seven levels, from a single transaction as the simplest up to multiparty criminal conspiracies.
The most prevalent types of fraud carried out by policyholders are gaining access to, or being reimbursed for, services typically not covered by the policy. For clinicians and healthcare providers, financial gain is the main motivation, pursued through up-coding, service unbundling, and billing for services that were unnecessary or never rendered.
Machine Learning in combating WAF
Insurance companies are already applying rule-based systems to detect WAF in insurance claims. These systems are very similar to other fraud prevention systems for financial transactions, e.g. credit card transactions, where the system checks the validity of a transaction against a predefined set of business rules. These business rules require continuous management. Even then, they are only useful as long as the person managing the rules can create a comprehensive mental model for the complete rule set.
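To make the rule-based approach concrete, here is a minimal sketch of a claim being checked against a predefined rule set. The Claim fields, the rule names and the thresholds are all hypothetical and chosen only for illustration; they are not taken from any real claim system.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Claim:
    # Hypothetical, simplified claim record (illustrative fields only).
    claim_id: str
    procedure_code: str
    amount: float
    member_age: int


# A rule is a name plus a predicate that returns True when the claim violates it.
Rule = Tuple[str, Callable[[Claim], bool]]

# Illustrative rules with made-up thresholds and codes.
RULES: List[Rule] = [
    ("amount_above_limit", lambda c: c.amount > 10_000),
    (
        "pediatric_code_for_adult",
        lambda c: c.procedure_code.startswith("PED") and c.member_age >= 18,
    ),
]


def violated_rules(claim: Claim, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules that the claim violates."""
    return [name for name, is_violated in rules if is_violated(claim)]
```

Every new misuse pattern requires another hand-written entry in RULES, which is the maintenance burden described above.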
This is where ML fits in perfectly and elegantly solves a complex problem. Once the ML models have been trained with historical transactions, they can quantify how anomalous each new transaction is compared to the history and assign a potential risk to it. Furthermore, ML models adapt as the system processes new transactions, which means they improve while operating and manual rule management is no longer needed.
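The idea of scoring a new transaction against history and adapting over time can be sketched with a deliberately simple stand-in: a running z-score over a single numeric value such as the claim amount. This is an assumption made for illustration, not the model used in practice; the class name and its methods are hypothetical. The running mean and variance are maintained with Welford's online algorithm.

```python
import math


class RunningAnomalyScorer:
    """Scores a value by its distance from the history seen so far.

    Illustrative stand-in for a trained model: it tracks a running mean and
    variance (Welford's algorithm) and updates with every new observation.
    """

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, value: float) -> None:
        """Fold one processed transaction into the history."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    def score(self, value: float) -> float:
        """Return the absolute z-score of value; higher means more anomalous."""
        if self.count < 2:
            return 0.0  # not enough history to judge
        variance = self._m2 / (self.count - 1)
        if variance == 0.0:
            return 0.0 if value == self.mean else math.inf
        return abs(value - self.mean) / math.sqrt(variance)
```

A typical loop would call score on each incoming transaction and then update, so the notion of "normal" follows the data without anyone editing a rule.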
One important distinguishing factor is the ability of the ML algorithms to learn from the judgments of the human claim processors. Typical rule-based systems have a set of predefined medical rules that determine if a treatment should be approved for a given medical condition. However, claim processors make their decisions using additional information, such as understanding of the specific care provider or history of the insured member. They also have additional information from outside the claim system or even make professional judgments based on their medical experience. ML models trained with these decisions will adapt their risk prediction based on the judgment made by the processor. They are implicitly implementing rules that are not only medical but arise from the daily practice.
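As a rough illustration of learning from processor decisions, the sketch below estimates a rejection risk per care provider and procedure from past decisions, with a smoothing prior for combinations that have little history. The class, its parameters and the choice of provider and procedure as keys are assumptions made for this example; a production model would use far richer inputs.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple


class RejectionRateModel:
    """Estimates the rejection risk from past claim-processor decisions.

    Hypothetical sketch: a smoothed rejection frequency per
    (provider, procedure) pair stands in for a trained ML model.
    """

    def __init__(self, prior_rejections: float = 1.0, prior_total: float = 2.0) -> None:
        # The prior acts like pseudo-observations, so unseen pairs get 0.5.
        self.prior_rejections = prior_rejections
        self.prior_total = prior_total
        self._rejections: DefaultDict[Tuple[str, str], int] = defaultdict(int)
        self._totals: DefaultDict[Tuple[str, str], int] = defaultdict(int)

    def learn(self, provider_id: str, procedure_code: str, rejected: bool) -> None:
        """Record one decision made by a human claim processor."""
        key = (provider_id, procedure_code)
        self._totals[key] += 1
        if rejected:
            self._rejections[key] += 1

    def risk(self, provider_id: str, procedure_code: str) -> float:
        """Return the smoothed share of rejected claims for this pair."""
        key = (provider_id, procedure_code)
        rejected = self._rejections.get(key, 0) + self.prior_rejections
        total = self._totals.get(key, 0) + self.prior_total
        return rejected / total
```

No rule about a specific provider is ever written down here; the risk shifts only because processors kept rejecting or approving that provider's claims.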
Netcetera designed and built RiSIC, an ML-based system that quantifies the WAF risk of insurance claims. The following is a high-level overview of the approach.
Problem representation
Healthcare claims contain cleanly structured data elements that can be used as input for ML model training. These elements include information about the insured member with their medical condition, the medical procedures and services performed on the patient, the prescribed medications, time, date and location of the services, and others.
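One common way to turn such structured elements into model input is to one-hot encode the categorical fields and append the numeric ones. The sketch below assumes a claim is a plain dictionary; the field names (diagnosis_code, procedure_code, location, member_age, amount) are hypothetical and do not reflect an actual claim schema.

```python
from typing import Dict, List, Sequence

# Hypothetical field names, chosen only for this illustration.
CATEGORICAL_FIELDS = ("diagnosis_code", "procedure_code", "location")
NUMERIC_FIELDS = ("member_age", "amount")


def build_vocabularies(claims: Sequence[dict]) -> Dict[str, List[str]]:
    """Collect the sorted set of observed values for each categorical field."""
    return {
        field: sorted({claim[field] for claim in claims})
        for field in CATEGORICAL_FIELDS
    }


def encode_claim(claim: dict, vocabularies: Dict[str, List[str]]) -> List[float]:
    """One-hot encode categorical fields, then append numeric fields."""
    vector: List[float] = []
    for field in CATEGORICAL_FIELDS:
        # A value never seen in the history encodes as all zeros for that field.
        vector.extend(
            1.0 if claim[field] == value else 0.0 for value in vocabularies[field]
        )
    vector.extend(float(claim[field]) for field in NUMERIC_FIELDS)
    return vector
```

The vocabularies are built once from historical claims and reused for every new claim, so that each position in the vector always carries the same meaning.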