Modern recommender systems select not only films for each viewer, but also the previews most likely to attract them. We explain how the algorithms behind content recommendations are built and how they work.
About the expert: Stanislav Zhuravlev, Chief Product Officer Media/KION
What is a recommender algorithm
For an online cinema, a recommendation system is a way to find the most engaged and interested viewers; for the user, it is a form of personalization. To draw a historical parallel, the recommendation system can be compared to the division of labor: the one who sculpted pots became a potter, and the one who bent horseshoes became a blacksmith. Content recommendations work the same way, but in reverse: they reinforce preferences and can even shape tastes.
Modern recommendation systems are built on algorithms that infer the viewer's preferences. The algorithm analyzes a large amount of information about the user and compares it with data about the movie, series, or program, as well as with data about other viewers who have watched and rated that content.
Ideally, artificial intelligence selects the right content for each individual user.
What data does the algorithm use
Two types of data matter to the recommendation algorithm: historical and demographic. Historical data describes content views by a specific user or by the people most similar to that user (who watched, when and where, for how long, how many times, and whether they watched to the end). Demographic data is socio-demographic information about users: gender, age, field of activity, and place of residence.
Most of the user data in the KION online cinema is predicted by machine learning models from aggregated, anonymized data. This data is collected in HDFS (Hadoop Distributed File System), a distributed file system designed to store large files.
How the recommender algorithm works
Typically, recommendations include three types of filtering:
- collaborative filtering (CF). The user is assigned to a category based on their actions. The algorithm then looks at what other people in the same or a similar category have done and suggests content to watch;
- content-based filtering. These mechanisms rely on item descriptions and user preferences. Recommendations reflect the key attributes of content the user has previously searched for or watched. For example, if they watched comedy shows, other programs in that genre will be recommended to them;
- hybrid. This model combines the previous two methods. The most popular hybrid approach is the two-tier model: collaborative filtering first selects a small number of candidates, which are then ranked by a much more powerful content model. This type of recommendation is used by services such as YouTube and Netflix.
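The two-tier hybrid approach can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the viewing history, the genre tags, the overlap-based neighbour weighting, and the function names are invented, and a production system would use far richer models at both stages.

```python
from collections import Counter

# Hypothetical viewing history: user -> set of watched titles.
HISTORY = {
    "ann": {"Alien", "Heat", "Se7en"},
    "bob": {"Alien", "Heat", "Up"},
    "cat": {"Alien", "Se7en", "Zodiac"},
    "dan": {"Up", "Coco"},
}

# Hypothetical content metadata: title -> set of genre tags.
GENRES = {
    "Alien": {"sci-fi", "horror"},
    "Heat": {"crime", "thriller"},
    "Se7en": {"crime", "thriller"},
    "Zodiac": {"crime", "thriller"},
    "Up": {"animation", "family"},
    "Coco": {"animation", "family"},
}


def collaborative_candidates(user, history, limit=10):
    """Stage 1: collect unseen titles watched by users who overlap with `user`."""
    seen = history[user]
    votes = Counter()
    for other, titles in history.items():
        if other == user:
            continue
        overlap = len(seen & titles)
        if overlap == 0:
            continue
        for title in titles - seen:
            votes[title] += overlap  # weight by how similar the neighbour is
    return [title for title, _ in votes.most_common(limit)]


def content_score(user, title, history, genres):
    """Stage 2: share of the title's genres that the user has already watched."""
    liked = set().union(*(genres[t] for t in history[user]))
    return len(genres[title] & liked) / len(genres[title])


def recommend(user, history=HISTORY, genres=GENRES, top_n=3):
    """Select candidates collaboratively, then rank them by content similarity."""
    candidates = collaborative_candidates(user, history)
    ranked = sorted(
        candidates,
        key=lambda t: content_score(user, t, history, genres),
        reverse=True,
    )
    return ranked[:top_n]


print(recommend("ann"))
```

For "ann", the first stage proposes the two titles her neighbours watched, and the second stage puts the crime thriller ahead of the family animation because it matches genres she has already watched.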
First, the recommendation system describes each content unit by several groups of features: content meta-information (genres, director, year, country, tags), collaborative features of user interaction with the content (clicks, views, etc.), and features of the video itself (computer vision finds objects in the footage and tags them). These features are then combined into vectors and stored as templates for further calculations. A similar process applies to users: each person can be represented in a vector space through their interactions with content (what they watched, where they clicked, what they watched to the end) and through a probabilistic model that estimates gender, age, income, and region.
When a user opens a KION storefront, the system matches their vector against the content vectors. Content that is “closer” to the user is ranked higher. A single user can have several vectors at once.
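The matching step can be sketched as a ranking by vector similarity. This is an illustration under stated assumptions, not KION's method: the source does not say which distance measure is used, so cosine similarity is assumed here, and the three-dimensional vectors, the titles, and the rule of taking the best match across a user's vectors are all invented.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_storefront(user_vectors, content_vectors):
    """Order titles by their best similarity to any of the user's vectors."""
    scores = {
        title: max(cosine(u, vec) for u in user_vectors)
        for title, vec in content_vectors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical embeddings with axes (thriller, comedy, kids).
content = {
    "Dark Alley": (0.9, 0.1, 0.0),
    "Laugh Hour": (0.1, 0.9, 0.1),
    "Toy Parade": (0.0, 0.2, 0.9),
}
# One user with two vectors, e.g. for different viewing contexts.
user = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]

print(rank_storefront(user, content))
```

Taking the maximum over the user's vectors lets both the thriller and the kids title rank above the comedy, even though no single vector is close to both.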
Each storefront is refreshed once a day. Depending on triggers, a storefront may change more often; the purchase of a subscription is one such trigger. Storefronts also vary with the user's device, which screens out irrelevant content such as 4K video on smartphones.
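The refresh and device rules can be expressed as two small functions. This is a sketch under assumptions: the event name, the format labels, and the device capability table are hypothetical, and only the daily interval and the subscription trigger come from the text.

```python
from datetime import datetime, timedelta

REFRESH_INTERVAL = timedelta(days=1)          # "once a day" from the text
TRIGGER_EVENTS = {"subscription_purchased"}   # hypothetical event name
# Hypothetical capability table: which video formats each device class gets.
DEVICE_FORMATS = {
    "smartphone": {"SD", "HD"},
    "smart_tv": {"SD", "HD", "4K"},
}


def needs_refresh(last_built, now, events):
    """True if the daily interval has passed or a trigger event occurred."""
    return now - last_built >= REFRESH_INTERVAL or bool(TRIGGER_EVENTS & set(events))


def filter_for_device(titles, device):
    """Keep only titles whose format the device class is assumed to support."""
    allowed = DEVICE_FORMATS.get(device, {"SD", "HD"})
    return [name for name, fmt in titles if fmt in allowed]


storefront = [("Dark Alley", "4K"), ("Laugh Hour", "HD"), ("Toy Parade", "SD")]
print(filter_for_device(storefront, "smartphone"))
print(needs_refresh(datetime(2024, 1, 1, 12), datetime(2024, 1, 1, 13),
                    ["subscription_purchased"]))
```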
The main thing is training
Any recommendation algorithm relies on a predictive model, which forecasts how the viewer will behave under given parameters. To train the model, companies study large numbers of viewers and their behavior. The task then becomes collecting as much data about users as possible across a variety of parameters.
For example, if 90% of the thriller fans in a sample are also hard rock fans, then a thriller is a plausible recommendation for a hard rock listener on the platform, who is likely to be interested in it.
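The kind of statistic behind this example can be computed by counting co-occurrences in a sample. The sketch below uses invented data and a hypothetical helper name. It reports both the share of thriller fans who like hard rock and the share of hard rock fans who like thrillers, because a recommendation to a hard rock listener rests on the second figure, and the two need not be equal.

```python
def conditional_share(users, given, target):
    """Share of users with the `given` interest who also have the `target` interest."""
    base = [u for u in users if given in u]
    if not base:
        return 0.0
    return sum(target in u for u in base) / len(base)


# Invented sample: each user is a set of interests.
sample = (
    [{"thriller", "hard rock"}] * 9
    + [{"thriller"}] * 1
    + [{"hard rock"}] * 3
    + [{"comedy"}] * 7
)

rock_among_thriller = conditional_share(sample, "thriller", "hard rock")
thriller_among_rock = conditional_share(sample, "hard rock", "thriller")
print(rock_among_thriller, thriller_among_rock)
```

In this invented sample, 9 of 10 thriller fans like hard rock, while 9 of 12 hard rock fans like thrillers.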
However, a single parameter is not enough to build recommendations. For recommendations to be relevant, the service needs to collect a significant amount of anonymized data to train its artificial intelligence models.
Training and retraining the models takes 90% of the working time in developing the technology, while writing the code for artificial intelligence takes only 10%.
KION has a quality assessment system that ensures recommendations comply with business requirements and regulations. For example, it checks that the recommender does not return duplicate content or too few titles to customers. The share of children's content or series is also monitored, since with some models it can grow uncontrollably.
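Checks of this kind can be sketched as a simple validation function. The thresholds, the category labels, and the function name below are hypothetical; the source names the checks but not how they are implemented.

```python
from collections import Counter


def check_storefront(titles, min_titles=5, max_kids_share=0.3):
    """Return a list of problems; an empty list means all checks passed.

    `titles` is a list of (title_id, category) pairs. Both thresholds are
    illustrative defaults, not values taken from the source.
    """
    problems = []
    ids = [title_id for title_id, _ in titles]

    # Check 1: the same title must not appear twice.
    duplicates = sorted(t for t, n in Counter(ids).items() if n > 1)
    if duplicates:
        problems.append(f"duplicate titles: {duplicates}")

    # Check 2: the storefront must offer enough distinct titles.
    if len(set(ids)) < min_titles:
        problems.append(f"too few titles: {len(set(ids))} < {min_titles}")

    # Check 3: children's content must not dominate the storefront.
    kids = sum(1 for _, category in titles if category == "kids")
    if titles and kids / len(titles) > max_kids_share:
        problems.append(f"kids share too high: {kids / len(titles):.0%}")

    return problems


good = [("a", "film"), ("b", "film"), ("c", "series"), ("d", "kids"), ("e", "film")]
bad = [("a", "kids"), ("a", "kids"), ("b", "kids")]
print(check_storefront(good))
print(check_storefront(bad))
```

Returning a list of problems rather than a single pass/fail flag makes it easier to see which requirement a given model violates.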
Other quality metrics are diversity, accuracy, and novelty. Finally, the platform uses “avatars”: a group of “typical” users and a group of “atypical” users against which the system is tested. It should work well for “typical” users while still taking the interests of “atypical” ones into account.
Can algorithms work successfully without a human?
Artificial intelligence is logical but not creative: it can solve problems efficiently, but it still needs a human to define them. Editorial selections in online cinemas are no less important than automated recommendation systems. When compiling them, editors, just like algorithms, analyze accumulated data, experience, and other indicators, but they also rely on creative intuition, so their choices carry more emotion and more understanding of people and human nature.
Most often, the need for manual control arises when “event” collections are put together. For example, a model cannot build selections for the Oscars, actors' birthdays, film festivals, and so on: such occasions are too varied.
In such cases, suggestions generated by AI, combined with the editor's creative intuition, give the best results.