Insurance codes are used by health plans to decide how much doctors and other healthcare providers should be paid. Several coding systems are currently in use [1]:
Current Procedural Terminology (CPT) codes, used by physicians to describe the services they provide.
Healthcare Common Procedure Coding System (HCPCS), used by Medicare. It is subdivided into Level I codes (equal to CPT codes) and Level II codes. The latter identify products, supplies, and services not included in the CPT codes (e.g. prosthetics and ambulance services).
International Classification of Diseases (ICD), developed by the World Health Organization (WHO) to identify the patient’s health condition/diagnosis. These codes are typically combined with CPT codes to make sure that the patient’s health condition and the received services match (i.e., for matching billing and documenting diagnosis).
After a given procedure, healthcare professionals list the procedure’s code in an insurance claim form so that the hospital is partially or fully reimbursed for that procedure.
However, not all submitted claims correspond to the procedures that were actually performed, due to fraud or submission errors, for instance.
Here’s an example (adapted from [1]): if you fall, sprain your ankle and go to the emergency services as a consequence, you might end up getting an X-ray of the ankle. If the healthcare professionals mistakenly label the ankle X-ray as an elbow X-ray but still give you the diagnosis of a sprained ankle, the procedure and the diagnosis are not consistent, and the insurance claim might end up being rejected.
Which issues resulting from this process have monetary consequences for the stakeholders involved?
What does each of them lose?
Patient
Miscoding the received services (diagnosis or procedures) can lead to the patient being labeled with a condition they do not have.
Expenses can increase, with additional costs for the patient, the insurance company or both.
Pre-existing conditions on the patient’s record (which can be the result of a misdiagnosis) can become obstacles to obtaining health coverage.
Often the patient has no access to their medical records, so these issues go unnoticed.
Hospital
Healthcare professionals might be overpaid or underpaid for a procedure.
Insurance companies can even deny the claim and pay nothing, which also increases costs for the hospital.
Hospitals can forget to include some expenses, losing the opportunity to be reimbursed for procedures they performed or materials they used.
Delays in corrections might lead to extra administrative costs and delay in payments.
Insurance Company
Besides coding errors, hospitals could try to submit extra items associated with a procedure that they did not really use, to receive more funding (fraud).
Delays in corrections might lead to extra administrative costs and delays in payments.
So, how can we use AI to assist in this area?
Modeling Anomaly Detection in Insurance Claims
We will show how to approach this problem with several techniques: supervised, unsupervised and weakly supervised. To learn more, enroll in our online course, where we discuss all of these concepts in depth.
We can identify problems in insurance claims’ submissions at a certain granularity level:
A group of claim codes does not make sense
This can happen because the healthcare professional clicked an adjacent code in the interface or used a code from the wrong category (the codes follow a hierarchy, e.g. local versus general anesthesia).
This use case of insurance claim error detection can be applied to both hospitals and insurance companies, as there is interest in understanding which claims are not correct. Basically, the possible options are to: identify a claim as wrong, correct an error in claim codes and/or try to explain why or where it occurred.
We will now start using letters to refer to the claim codes, as a simplification. The following figure represents the possible inputs and outputs of an insurance claim model.
Regarding the third case, the claim sent by the hospital can contain extra codes. An insurance company wants the claim to contain as few codes as possible, so the model should delete the unnecessary ones.
For the fourth case, a claim sent by a hospital can be missing codes for some procedures (e.g. by mistake or for lack of knowledge about a specific code). A model applied in a hospital should be able to add codes when they are missing.
What are the restrictions?
A model used to detect errors in insurance claims should be invariant to the order in which the healthcare professional places the codes (e.g. A-B-C or C-A-B). We can add this invariance in different ways:
Augmenting the input data with random shuffling.
Ordering both input and outputs in alphabetical/numerical order.
Using a position-invariant representation. For instance, since claim sequences can be treated as text, we could use Bag-of-Words to count the occurrences of each code regardless of order.
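As a minimal sketch of the last two options, the snippet below sorts the codes into a canonical order and builds a Bag-of-Words style count vector. The vocabulary and the example claims are hypothetical, and the function names are chosen for illustration only.

```python
from collections import Counter

def canonical_order(codes):
    """Sort the codes so that A-B-C and C-A-B map to the same sequence."""
    return sorted(codes)

def bag_of_codes(codes, vocabulary):
    """Count how often each vocabulary code appears, ignoring order."""
    counts = Counter(codes)
    return [counts[code] for code in vocabulary]

# Hypothetical vocabulary and claims, using letters as in the text.
vocabulary = ["A", "B", "C", "D"]
claim_1 = ["A", "B", "C"]
claim_2 = ["C", "A", "B"]

# Both representations are identical for the two orderings.
print(canonical_order(claim_1) == canonical_order(claim_2))
print(bag_of_codes(claim_1, vocabulary) == bag_of_codes(claim_2, vocabulary))
```

Sorting keeps a sequence that sequence models can still consume, while the count vector discards order entirely and suits classical models.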
How can we do this?
There are several ways to approach this problem, depending on the amount of labels and data available. We will give some examples of supervised and unsupervised approaches, with a more detailed focus on the unsupervised ones.
Available data
Claim Code
Date of claim
Possibly: Result (used as a label)
Assumed labels
Positive: Wrong claim. Insurance company reported issues with a certain hospital’s claim, and the hospital backed down and agreed with the error.
Negative: Cases in which the insurance company found no errors in the claim, did not want to spend time and money on legal processes for that claim, or failed to detect an existing error. As such, negative labels are a mix of truly correct and truly wrong claims.
None: Remaining claims which were not evaluated yet
Two major learning mechanisms can be used:
Supervised Approaches
If we have both positive and negative labels, this is a classical supervised learning problem framed as binary classification. We can then manually extract features from the codes, such as the co-occurrence of code pairs, or use a deep model (e.g. an RNN) to infer the relationships between codes from the input.
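To make the hand-crafted option concrete, here is a small sketch that turns each claim into a binary vector of co-occurring code pairs, which could then be fed to any binary classifier. The claims are hypothetical and the function names are illustrative, not part of an existing library.

```python
from itertools import combinations

def pair_features(codes):
    """Return the set of unordered code pairs that co-occur in one claim."""
    unique_codes = sorted(set(codes))
    return set(combinations(unique_codes, 2))

def featurize(claims):
    """Build one binary vector per claim over all pairs seen in the data."""
    all_pairs = sorted(set().union(*(pair_features(c) for c in claims)))
    vectors = [
        [1 if pair in pair_features(claim) else 0 for pair in all_pairs]
        for claim in claims
    ]
    return all_pairs, vectors

# Hypothetical claims, using letters as code placeholders.
claims = [["A", "B", "C"], ["C", "A"], ["B", "D"]]
all_pairs, vectors = featurize(claims)
for claim, vector in zip(claims, vectors):
    print(claim, vector)
```

Because the pairs are built from a sorted set of codes, these features are also invariant to the order in which the codes were entered.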
However, as negative labels can also contain the positive target, we can instead treat this problem as weakly supervised and use Positive-Unlabeled learning (PU Learning), in which the class that is not positive is considered a mixed set of negative and positive examples. Within PU Learning, there are several algorithms that can be used, some of which are described or referenced in the literature [2].
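As an illustration of one well-known PU technique (the calibration proposed by Elkan and Noto in 2008, which is not necessarily among the algorithms covered in [2]), the sketch below rescales the scores of a classifier trained to separate confirmed-wrong claims from all other claims. It assumes such a classifier already exists; the scores used here are hypothetical.

```python
def estimate_label_frequency(scores_on_known_positives):
    """Estimate c = P(flagged | truly wrong) as the mean classifier score
    on held-out claims that are confirmed to be wrong."""
    return sum(scores_on_known_positives) / len(scores_on_known_positives)

def corrected_probability(score, c):
    """Rescale P(flagged | claim) into an estimate of P(truly wrong | claim),
    capped at 1."""
    return min(1.0, score / c)

# Hypothetical classifier scores on held-out confirmed-wrong claims.
held_out_positive_scores = [0.6, 0.4, 0.5]
c = estimate_label_frequency(held_out_positive_scores)

# Hypothetical scores for claims from the mixed (not positive) set.
for score in [0.1, 0.2, 0.45]:
    print(score, corrected_probability(score, c))
```

The idea is that only a fraction c of the truly wrong claims ever gets flagged, so the raw scores underestimate the true probability by that factor.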
Unsupervised approaches
If there are no labels available at all, we then need to follow an unsupervised approach. We will describe a few examples next:
Code embeddings
We can train a word2vec model that, given two codes, estimates the most likely adjacent code. Note that we are adding position-invariance.
This way, we can train code embeddings (similar to word embeddings) that learn the relationships between different codes. Then, we check whether a given code is the one whose embedding has the smallest distance to its neighbours. If not, we replace it with the code that does.
This is more error-prone as we can have codes for common operations with similar embedding distance.
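A minimal sketch of this check is shown below, assuming the embeddings have already been trained. The two-dimensional vectors are made up for illustration (real embeddings would come from the word2vec model), and the function names are hypothetical.

```python
import math

def cosine_distance(u, v):
    """Cosine distance between two vectors (0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def mean_distance_to_context(code, context, embeddings):
    """Average distance from one code to the other codes in the claim."""
    distances = [cosine_distance(embeddings[code], embeddings[c]) for c in context]
    return sum(distances) / len(distances)

def suggest_replacement(claim, index, embeddings):
    """Return the vocabulary code closest to the rest of the claim."""
    context = claim[:index] + claim[index + 1:]
    return min(embeddings, key=lambda code: mean_distance_to_context(code, context, embeddings))

# Made-up embeddings: A, G, K and D are close together, M is far away.
embeddings = {
    "A": (1.0, 0.0), "G": (0.9, 0.1), "K": (0.8, 0.2),
    "D": (0.9, 0.2), "M": (0.0, 1.0),
}
claim = list("AAGKM")
suggestion = suggest_replacement(claim, 4, embeddings)
print("flag last code:", suggestion != claim[4], "suggested:", suggestion)
```

With these toy vectors the last code is flagged, but the suggested code is simply whichever one lies closest to the rest of the claim, which need not be the true code. This is the error-proneness described above: several common codes can sit at similar distances.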
Generative model
Using a generative model, we can fit a density function to our claims. The most common claims will lie close together in the learned probability space. Examples of such models are variational autoencoders and Gaussian mixture models. We can then estimate the probability of each claim being an outlier.
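To show the density idea without training a neural network, the sketch below uses a much simpler stand-in: a smoothed model that treats codes as independent and scores a claim by its average log-likelihood. This is not a variational autoencoder or a Gaussian mixture model; it only illustrates how a low likelihood can flag an outlier. The training claims and vocabulary are hypothetical.

```python
import math
from collections import Counter

def fit_code_frequencies(claims, vocabulary, smoothing=1.0):
    """Estimate a smoothed probability for each code in the vocabulary."""
    counts = Counter(code for claim in claims for code in claim)
    total = sum(counts.values()) + smoothing * len(vocabulary)
    return {code: (counts[code] + smoothing) / total for code in vocabulary}

def average_log_likelihood(claim, probabilities):
    """Average per-code log-probability; lower values are more unusual.
    Assumes every code in the claim belongs to the vocabulary."""
    return sum(math.log(probabilities[code]) for code in claim) / len(claim)

# Hypothetical historical claims and vocabulary.
training_claims = [list("AAGKD"), list("AGKD"), list("AGD")]
vocabulary = ["A", "G", "K", "D", "M"]
probabilities = fit_code_frequencies(training_claims, vocabulary)

for claim in (list("AAGKD"), list("AAGKM")):
    print("".join(claim), round(average_log_likelihood(claim, probabilities), 3))
```

A real generative model would also capture which codes tend to appear together, which this independent model ignores by design.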
With a denoising autoencoder, we try to reconstruct a claim sequence: we add noise, and the model tries to identify what is wrong and correct it. We can then calculate a reconstruction error, which tells us that a claim should contain more of one code and fewer of another. This gives us an informative explanation of what is wrong in the claim.
Seq2Seq inspired: probability of a sequence being wrong
Alternatively, we can have a single model which tells us the probability of each element being wrong (and therefore, we know the probability of the whole claim being wrong).
To do this, we randomly add noise by adding, removing and swapping codes. We then train an autoencoder whose sigmoid output layer predicts the probability of each code being wrong.
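The noise-injection step could be sketched as follows. This is only the data-generation part, not the autoencoder itself, and it makes some assumptions: "swap" is interpreted as replacing one code with a different one, exactly one corruption is applied per claim, and the vocabulary (which must contain at least two codes) is hypothetical.

```python
import random

def corrupt_claim(codes, vocabulary, rng):
    """Apply one random corruption to a non-empty claim and return
    (operation, noisy_codes, labels), where labels[i] == 1 marks a wrong code."""
    noisy = list(codes)
    labels = [0] * len(noisy)
    operation = rng.choice(["add", "remove", "swap"])
    if operation == "remove" and len(noisy) < 2:
        operation = "swap"  # never produce an empty claim
    if operation == "add":
        position = rng.randrange(len(noisy) + 1)
        noisy.insert(position, rng.choice(vocabulary))
        labels.insert(position, 1)
    elif operation == "remove":
        position = rng.randrange(len(noisy))
        del noisy[position]
        del labels[position]  # a missing code leaves no per-code label
    else:
        position = rng.randrange(len(noisy))
        alternatives = [c for c in vocabulary if c != noisy[position]]
        noisy[position] = rng.choice(alternatives)
        labels[position] = 1
    return operation, noisy, labels

rng = random.Random(0)  # fixed seed so runs are repeatable
vocabulary = ["A", "G", "K", "M", "D"]
for _ in range(3):
    print(corrupt_claim(list("AAGKD"), vocabulary, rng))
```

Note that a removed code cannot be marked at any position of the noisy claim, so per-code labels alone cannot express that something is missing.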
Because the model outputs probabilities, we have a higher degree of confidence in it (and can measure its uncertainty), so we can decide better which claims we should analyze manually. On the other hand, we know that a certain code has a high probability of being wrong, but we don’t know whether a code should be added or deleted.
To solve this issue, we could add a network with three extra tasks: probability of the claim code being wrong because it needs to be edited, deleted or added.
Mixed generative and reconstructive model
We can also have a generative model which shares weights with a denoising autoencoder (or other reconstructive model). This way, a generative model tells us which claim is wrong, and the reconstructive model tells us why it’s wrong (i.e., what part of the sequence is wrong), returning also the corrected sequence.
What to do with model results?
So, how can we act on our model’s results? Let us assume we are an insurance company with these two tools:
Probability of the claim being wrong.
Suggestions of what is wrong.
If we want to select N cases to manually evaluate, how could we optimize this to determine which are the most cost-effective claims?
Suppose the insurance company has certain costs associated with this review, and take an example claim with the codes AAGKM, which should be AAGKD. Each code is a procedure/item with a certain cost.
Positive cases are fraud cases, which we want to manually evaluate.
If we are applying this model in an insurance company, we want to maximize both True Positives (TP) and True Negatives (TN). By maximizing True Negatives, we save analysis time, and by maximizing True Positives, we are reducing the number of cases which the insurance company should not be paying, but actually is.
On the other hand, if we apply this in a hospital, we want to minimize FP (cases that are flagged as fraud but are not, costing man-hours to evaluate manually) and FN (cases that are flagged as negative but are actually fraud, costing money due to errors).
How can we optimize this for an insurance company?
There is a certain cost associated with correcting something in a claim and a price difference between the reconstructed claim and the original claim.
For each claim, X, we can calculate a score and choose the N claims with the highest scores for manual evaluation.
This score needs to be composed of two terms. In the first term, containing the expected value in case fraud is detected, we multiply the probability of fraud by the money saved by the insurance company when fraud is detected. Here, the money inflow is dependent on the cost of the corrected claim subtracted from the original claim price and the man-hour rate required for correcting that claim manually.
In the second term, we multiply the probability of non-fraud by the man hour rate required for analyzing that sample, because even if there’s no fraud, there’s a cost associated with analyzing that claim manually.
Score(X) = P(Fraud) x (Price(X) - PriceCorrectedClaim(X) - ManHourRate(Corrected(X) - X)) - (1 - P(Fraud)) x ManHourRate(Corrected(X) - X)
So, for the above example, if the model corrects the sequence AAGKM to AAGKD with 90% confidence that the original is anomalous, the original claim costs 1510€, the corrected claim costs 1500€, and we assume a fixed price of 5€ per claim analysis:
Score(AAGKM) = 0.9 x (1510 - 1500 - 5) - 0.1 x 5 = 4.5 - 0.5 = 4.0
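The scoring and selection step can be sketched in a few lines. This follows the prose description of the two terms: the saving is the original claim price minus the corrected claim price, and the review cost is subtracted in both the fraud and the non-fraud case. The claims other than AAGKM, their prices and probabilities are hypothetical, and a fixed review cost per claim is assumed.

```python
def review_score(p_fraud, original_price, corrected_price, review_cost):
    """Expected net gain of manually reviewing one claim."""
    gain_if_fraud = original_price - corrected_price - review_cost
    loss_if_no_fraud = review_cost
    return p_fraud * gain_if_fraud - (1.0 - p_fraud) * loss_if_no_fraud

def top_n_claims(claims, n):
    """Return the n claims with the highest review score.
    Each claim is (claim_id, p_fraud, original_price, corrected_price, review_cost)."""
    return sorted(claims, key=lambda claim: review_score(*claim[1:]), reverse=True)[:n]

# AAGKM uses the values from the example; the other two rows are made up.
claims = [
    ("AAGKM", 0.9, 1510.0, 1500.0, 5.0),
    ("ABC", 0.2, 300.0, 250.0, 5.0),
    ("XYZ", 0.5, 100.0, 98.0, 5.0),
]
for claim in top_n_claims(claims, 2):
    print(claim[0], round(review_score(*claim[1:]), 2))
```

A claim with a low fraud probability can still rank high when the potential saving is large, and a negative score means the review is expected to cost more than it recovers.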
Conclusion
In this blog post, we presented the issue of automatically detecting errors/anomalies in insurance claims, a use case which can affect several stakeholders: patients, hospitals and insurance companies.
This approach can be done in a supervised or unsupervised way, depending on the available data. Even with no labels available, it is possible to create an interpretable and actionable model for optimizing the process of manually reviewing claims.
Let us know if you have any more ideas for solving this issue!