Application of boosting in recommender systems

Abstract

Recommender systems have become firmly established in today's digital landscape as an important tool for managing information flows. Demand for them is driven largely by information overload and the need to personalize data. As recommendation algorithms spread to new domains, non-standard cases arise for which classical approaches are less effective. This paper examines one such case: a small number of items and a comparatively large number of users, with high correlation between some of the items. For modeling, it is proposed to use gradient boosting, a machine learning algorithm based on an ensemble of decision trees.
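As a rough illustration of the setting described above, the sketch below frames the few-items/many-users case as a tabular classification problem and fits a gradient-boosted tree ensemble with scikit-learn. The synthetic data, feature layout, and hyperparameters are illustrative assumptions, not the pipeline actually studied in the paper.

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_users, n_items, n_pairs = 10_000, 20, 50_000  # many users, few items

    # One row per (user, item) pair; the target is whether the user
    # accepted the item. The preference feature and the interaction
    # rule below are synthetic stand-ins for real user/item attributes.
    users = rng.integers(0, n_users, size=n_pairs)
    items = rng.integers(0, n_items, size=n_pairs)
    pref = rng.normal(size=n_pairs)
    y = (pref + 0.1 * items > 0.5).astype(int)

    X = np.column_stack([users, items, pref])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Gradient boosting: an ensemble of shallow decision trees fit
    # sequentially, each one correcting the residual errors of the sum
    # of its predecessors.
    model = GradientBoostingClassifier(n_estimators=200, max_depth=3)
    model.fit(X_tr, y_tr)
    print("held-out accuracy:", model.score(X_te, y_te))

In practice a library such as LightGBM, XGBoost, or CatBoost would typically stand in for the scikit-learn estimator, and recommendations for a user can be obtained by ranking predict_proba scores over the candidate items.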

About the authors

M. A. Zharova

Moscow Institute of Physics and Technology (MIPT)

Author for correspondence.
Email: zharova.ma@phystech.edu
Russian Federation, Dolgoprudny, Moscow oblast

V. I. Tsurkov

Federal Research Center “Computer Science and Control”, Russian Academy of Sciences

Email: v.tsurkov@frccsc.ru
Russian Federation, Moscow

Copyright (c) 2024 Russian Academy of Sciences