In this blog post, you’ll learn about some decision processes you can use in recommender systems. Can recommending less popular products be a way to improve your business? Let’s find out!
The Use Case
Let’s imagine a use case where you are building a MOOC platform (like Udemy/Coursera). Your CEO wants to increase student engagement and recurrence on the platform, and wants to know each student’s preferred courses.
Your team of Data Scientists, after brainstorming, decides to build a model that predicts the affinity each user has for each course, and to build a decision process on top of it. Your marketing team can send a fixed number of newsletters per week, each with a limited number of slots for the best courses for each user, at no associated cost.
Data
MOOC websites, being online services, can easily generate a lot of user data with no major acquisition effort. For this example, the most relevant characteristics are:
Person: the student and their characteristics, such as age, location, interests, gaps in knowledge, …
Product: course and its metadata (category, duration, price, knowledge area)
User-course interactivity (clicks on other courses)
External Market Indicators: how popular is each area and what is the coverage of that type of content in other competitor services
Model
The focus of this blog post is not to explore data sources, so let’s assume we have a model that takes in this data.
There are three major groups of models for recommendation systems: content-based, collaborative filtering, and hybrid. At NILG.AI, we have built content-based algorithms for a lot of our clients, as they allow you to compare products by their characteristics, which reduces the cold-start problem for new products.
For more details when comparing these types of models, we recommend an in-depth article such as this one.
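To make the content-based idea concrete, here is a minimal sketch that compares courses by their characteristics using cosine similarity. The course names and the three-number feature encoding are assumptions made up for illustration; a real system would derive features from the course metadata listed in the Data section.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no information
    return dot / (norm_a * norm_b)

# Hypothetical features: [is_data_science, is_design, duration_hours / 10]
course_features = {
    "intro_ml": [1.0, 0.0, 0.8],
    "deep_learning": [1.0, 0.0, 1.2],
    "ux_basics": [0.0, 1.0, 0.5],
}

def most_similar(course_id, features):
    """Rank every other course by similarity to `course_id`."""
    target = features[course_id]
    scores = {
        other: cosine_similarity(target, vector)
        for other, vector in features.items()
        if other != course_id
    }
    return sorted(scores, key=scores.get, reverse=True)

ranking = most_similar("intro_ml", course_features)
```

Because the comparison only needs course features, a course with no clicks yet can still be ranked against courses a student already liked.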
Decision Process
Now towards the main topic of the blog post: how do you go from a list of affinities per user to an actual recommendation in the newsletters? There are multiple options for creating the final newsletter:
Rule-based approaches:
This is typically the first step in organizations that don’t use machine learning: for example, sort courses for each user based on their overall popularity in the areas in which the user has expressed interest.
Pros: easy to implement, can add business knowledge explicitly
Cons: non-personalized. The implemented business rules also need to be maintained and changed over time as the business evolves.
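A rule like the one described above can be sketched in a few lines. The catalogue, the area names, and the enrolment counts below are invented for illustration only.

```python
# Hypothetical catalogue: course -> (knowledge area, number of enrolments)
catalogue = {
    "intro_ml": ("data", 1200),
    "sql_basics": ("data", 800),
    "ux_basics": ("design", 950),
    "figma_101": ("design", 400),
}

def rule_based_recommendations(interests, catalogue, k):
    """Return the k most popular courses within the user's declared areas."""
    candidates = [
        (course, popularity)
        for course, (area, popularity) in catalogue.items()
        if area in interests
    ]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [course for course, _ in candidates[:k]]

picks = rule_based_recommendations({"data"}, catalogue, k=2)
```

Every user who declares the same interests gets the same list, which is the non-personalized behaviour mentioned in the cons.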
Recommend Top-K courses to each student (Course to Student).
In this method, each student is recommended their top-K courses, according to the calculated affinity.
Pros: we maximize opening rates.
Cons: some courses may not be at the top for any user, so they get low visibility. This also affects new courses, which won’t get as much attention.
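As a minimal sketch of Course to Student, assume the model output is a nested dictionary `affinity[student][course]` (a hypothetical layout with made-up scores). The last lines also list the courses that no student would see.

```python
import heapq

# Hypothetical model output: affinity[student][course], higher means a better match.
affinity = {
    "ana": {"intro_ml": 0.9, "sql_basics": 0.7, "ux_basics": 0.2},
    "bob": {"intro_ml": 0.8, "sql_basics": 0.3, "ux_basics": 0.4},
}

def top_k_courses_per_student(affinity, k):
    """For each student, keep the k courses with the highest affinity."""
    return {
        student: heapq.nlargest(k, scores, key=scores.get)
        for student, scores in affinity.items()
    }

newsletters = top_k_courses_per_student(affinity, k=1)

# Courses that appear in no newsletter: the visibility problem listed in the cons.
all_courses = {course for scores in affinity.values() for course in scores}
sent = {course for courses in newsletters.values() for course in courses}
never_recommended = all_courses - sent
```

With one slot per newsletter, both students receive the same course and the other two courses are never shown.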
Recommend Top-K students to each course (Student to Course).
In this strategy, each course is assigned to the top-K students, according to their affinity.
Pros: we ensure each course receives a relevant number of leads. This is a more business-oriented approach
Cons: overall opening rate per user can be reduced.
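Student to Course is the same ranking run along the other axis. The sketch below again assumes a hypothetical `affinity[student][course]` dictionary with made-up scores, and it does not enforce the per-newsletter slot limit, which a real implementation would need to handle.

```python
import heapq

# Hypothetical model output: affinity[student][course], higher means a better match.
affinity = {
    "ana": {"intro_ml": 0.9, "sql_basics": 0.7, "ux_basics": 0.2},
    "bob": {"intro_ml": 0.8, "sql_basics": 0.3, "ux_basics": 0.4},
}

def top_k_students_per_course(affinity, k):
    """For each course, keep the k students with the highest affinity."""
    # Invert the nested dict into course -> {student: score}.
    by_course = {}
    for student, scores in affinity.items():
        for course, score in scores.items():
            by_course.setdefault(course, {})[student] = score
    return {
        course: heapq.nlargest(k, scores, key=scores.get)
        for course, scores in by_course.items()
    }

leads = top_k_students_per_course(affinity, k=1)
```

Every course gets a lead here, but one student ends up with a course that scores only 0.4 for them, which is how the opening rate per user can drop.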
Urgency trade-off of individual vs. overall affinity
In this strategy, you calculate how much you would lose if you don’t send a course to a user. This is based, for instance, on the difference between affinity and the maximum affinity from the rest of the users. The maximum affinity is the best you can do for that user. By recommending matches where the difference between affinity and maximum affinity is largest, you pick the user who is the most relevant one for the course.
Pros: promotes a better distribution of leads across courses.
Cons: we may be missing top matches.
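One possible reading of this urgency score, sketched below, is the user’s affinity for a course minus the best affinity any other user has for that same course. This formula, the function names, and the scores are assumptions made for illustration, not a definitive implementation of the strategy.

```python
# Hypothetical model output: affinity[student][course], higher means a better match.
affinity = {
    "ana": {"intro_ml": 0.9, "sql_basics": 0.7, "ux_basics": 0.2},
    "bob": {"intro_ml": 0.8, "sql_basics": 0.3, "ux_basics": 0.4},
}

def urgency_scores(affinity):
    """urgency[(student, course)] = affinity minus the best affinity among other students."""
    scores = {}
    for student, courses in affinity.items():
        for course, value in courses.items():
            others = [
                other_scores[course]
                for other, other_scores in affinity.items()
                if other != student and course in other_scores
            ]
            best_other = max(others, default=0.0)
            scores[(student, course)] = value - best_other
    return scores

def most_urgent_course_per_student(affinity):
    """Pick, for each student, the course where they stand out most from other students."""
    urgency = urgency_scores(affinity)
    return {
        student: max(courses, key=lambda course: urgency[(student, course)])
        for student, courses in affinity.items()
    }

picks = most_urgent_course_per_student(affinity)
```

In this toy data neither student receives the course they have the highest affinity for, because another student is almost as good a match for it. That is the "missing top matches" drawback.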
Randomly recommend courses
In this strategy, you randomly recommend courses (or semi-randomly, within a certain group). It can be a viable strategy for a small percentage of the recommendations, since it keeps feeding your model new data and exposes the user to out-of-the-box recommendations, unless it hurts your sales by a lot.
Pros: more diverse data for the algorithm to learn from.
Cons: less relevant recommendations.
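Keeping randomness to a small share of the slots can be sketched as an epsilon-greedy fill: each slot is random with probability `epsilon`, otherwise it takes the best remaining course. The parameter name `epsilon`, its value, and the scores are assumptions for illustration.

```python
import random

def explore_exploit_slots(scores, k, epsilon, rng):
    """Fill k slots; each is random with probability epsilon, else the best remaining course."""
    remaining = sorted(scores, key=scores.get, reverse=True)
    slots = []
    while remaining and len(slots) < k:
        if rng.random() < epsilon:
            choice = rng.choice(remaining)  # explore: any course not yet placed
        else:
            choice = remaining[0]  # exploit: highest affinity left
        remaining.remove(choice)
        slots.append(choice)
    return slots

# Hypothetical affinities of one student for three courses.
scores = {"intro_ml": 0.9, "sql_basics": 0.7, "ux_basics": 0.2}
slots = explore_exploit_slots(scores, k=2, epsilon=0.2, rng=random.Random(42))
```

Setting `epsilon` to 0 reduces this to plain top-K, and setting it to 1 gives fully random recommendations.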
There is no single best rule for this: one approach we have followed in the past is a hybrid, where different sections of the newsletter are filled using different strategies. You need to test out these proportions in different AB Tests.
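A hybrid newsletter can be sketched as a slot budget per strategy. The strategy names, the rankings, and the slot split below are hypothetical. Each ranking is assumed to come from one of the strategies described above.

```python
def hybrid_newsletter(strategy_rankings, slots_per_strategy):
    """Build one newsletter section by section, skipping courses already placed."""
    newsletter = []
    for name, n_slots in slots_per_strategy.items():
        added = 0
        for course in strategy_rankings[name]:
            if added == n_slots:
                break
            if course not in newsletter:
                newsletter.append(course)
                added += 1
    return newsletter

# Hypothetical per-strategy rankings for a single student.
strategy_rankings = {
    "top_k": ["intro_ml", "sql_basics", "ux_basics"],
    "urgency": ["sql_basics", "intro_ml", "ux_basics"],
    "random": ["ux_basics", "figma_101"],
}
# The proportions are what you would vary between AB test groups.
slots_per_strategy = {"top_k": 2, "urgency": 1, "random": 1}

newsletter = hybrid_newsletter(strategy_rankings, slots_per_strategy)
```

Skipping duplicates means a later section falls back to its next-best course instead of wasting a slot on one the student already sees.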
So… it does make sense that you can end up recommending less popular products to improve your business: you are giving exposure to less popular courses, preventing them from ending up in a feedback loop. If the courses are rarely recommended, they don’t get a lot of sales/clicks. If they don’t get a lot of sales/clicks, they are rarely recommended!
Model Evaluation
To measure if the model is doing well, besides the technical metrics (Lift, ROC AUC, Average Precision, …) there are some business KPIs you can measure. These KPIs, for these types of businesses, are typically based on positive feedback signals, such as click rates and sales. Some of them, such as clicks, are available in the short term but are a less relevant signal of intent.
For instance, you can focus on:
User-centric: such as the clicks per newsletter and the unsubscription rates.
Business-centric: click rates per course and course exposure. Course exposure is a measure of the diversity of the sent recommendations – if there’s a distribution where the initially more popular courses get more exposure, but the remaining ones also get some exposure, this will be highlighted within this metric.
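Course exposure can be quantified in several ways. The sketch below uses two simple ones, chosen here as assumptions: the share of all sent slots each course received, and the fraction of the catalogue that was recommended at least once. The newsletters and catalogue are made up.

```python
from collections import Counter

def course_exposure(newsletters, catalogue):
    """Share of all sent slots that each course in the catalogue received."""
    counts = Counter(course for courses in newsletters.values() for course in courses)
    total = sum(counts.values())
    return {course: (counts[course] / total if total else 0.0) for course in catalogue}

def catalogue_coverage(exposure):
    """Fraction of courses that were recommended at least once."""
    return sum(1 for share in exposure.values() if share > 0) / len(exposure)

# Hypothetical newsletters sent this week: student -> recommended courses.
newsletters = {
    "ana": ["intro_ml", "sql_basics"],
    "bob": ["intro_ml", "ux_basics"],
}
catalogue = ["intro_ml", "sql_basics", "ux_basics", "figma_101"]

exposure = course_exposure(newsletters, catalogue)
coverage = catalogue_coverage(exposure)
```

Here the most popular course takes half of the slots while one course gets none, so coverage is 0.75.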
Keep in mind that, while some KPIs might increase, others might decrease. For instance, you can dramatically increase the course exposure rate while degrading the overall user experience, which is what would happen if you recommended courses at random.
Using strategies such as Student to Course recommendations, you should be improving your business-centric KPIs and decreasing your user-centric KPIs, when compared to a baseline. On the other hand, Course to Student is an approach that is more user-centric. There’s always a trade-off you need to consider and discuss with the business teams.
AB Testing
Regarding AB Testing Strategies: you will need to experiment with the proportion of each strategy if you want to use the hybrid strategy mentioned above. The number of groups should depend on the number of active users you have and how long you want to run these experiments.
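One common way to split users into test groups is to hash the user ID, so that a user stays in the same group across weekly newsletters. The group names and the experiment label below are hypothetical, and this sketch does not address how many groups your traffic can support.

```python
import hashlib

def assign_group(user_id, groups, experiment="newsletter_mix_v1"):
    """Deterministically map a user to one of the groups for a given experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return groups[int(digest, 16) % len(groups)]

# Hypothetical groups, each standing for a different mix of strategies.
groups = ["mostly_top_k", "balanced", "mostly_urgency"]
assignments = {user: assign_group(user, groups) for user in ["ana", "bob", "carla"]}
```

Including the experiment label in the hash means a new experiment reshuffles users instead of reusing the previous split.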
Note that in many cases, there’s a budget associated with course recommendation (for instance, you can only recommend N courses per week, due to a maximum number of sales, caused by digital license issues). In this case, AB Testing can lead to unreliable conclusions due to cannibalization bias: the recommendations in one strategy are influencing the others, breaking the rules of standard AB Testing. We’ll talk more about this later in another blog post!
Conclusions
There’s a lot to be done after having a prediction of how much a student likes a certain product. It’s up to the Data Scientists to help define these constraints, together with the rest of the team. With a huge diversity of approaches to follow, there’s really not a “best solution”. You should quickly iterate to generate value, while continuously running R&D activities and monitoring the performance of the current approach.
Is this the solution you need? If so, contact us. Otherwise, contact us anyway, and let’s find a better solution together!