INFORMATION ABOUT PROJECT,
SUPPORTED BY RUSSIAN SCIENCE FOUNDATION

The information is prepared on the basis of data from the RSF information-analytical system; the content part is presented in the authors' edition. All rights belong to the authors; the use or reprinting of materials is permitted only with the prior consent of the authors.

 

COMMON PART


Project Number 17-71-30029

Project title BigData intellectual technologies for financial decision support systems based on predictive modeling

Project Lead Boukhanovsky Aleksander

Affiliation ITMO University

Implementation period 2017-2020, extension for 2021-2023

PROJECT EXTENSION CARD

Research area 01 - MATHEMATICS, INFORMATICS, AND SYSTEM SCIENCES, 01-721 - Systems and technologies of mathematical simulation of social and economic processes

Keywords Financial system, big data, cloud computing, predictive models, forecasting, direct simulation, machine learning, evolutionary computing, complex networks


 

PROJECT CONTENT


Annotation
The project is aimed at researching and developing methods, models, scalable parallel algorithms, and software tools that implement them, in order to support decision making in the organization and management of financial processes. The scientific uniqueness of the project is determined by its focus on supporting the entire life cycle of creating and applying data-driven predictive models (Data Driven Approach, DDA) for financial processes of different scales, built with scalable machine learning methods. Within the framework of the project the following tasks are solved:
- developing infological models of data and processes in the financial area that account for the diversity of market players’ strategies, consumer segmentation (including social and mental characteristics), and the influence of processes in other global real-world systems;
- developing theoretical foundations and creating scalable machine learning algorithms to design and identify DDA-based predictive models of financial processes;
- developing and researching methods for predictive modeling, forecasting, and evaluating and interpreting solutions for global financial systems, taking into account their hierarchical and multiscale nature;
- developing high-performance computing technologies and infrastructure for big data collection, storage and processing that ensure the effective use of the developed models, methods and algorithms.
The practical result of the project is a cloud-based software platform for financial simulation and data processing that provides services for data collection, aggregation and processing, building predictive models, and simulating financial processes. It forms the core for creating various applied financial services that are in demand among commercial banks, consulting and investment companies, retail, brokers and hedge funds, rating agencies, insurance companies and regulators.
The social importance of the project is to create a public Internet service allowing consumers to rate various financial products based on their own individual characteristics and preferences, which will improve the financial security of Russians in the face of aggressive and unfair advertising.

Expected results
1. A complex of infological models of data and processes in the financial sphere that takes into account the diversity of market players’ strategies, the interaction of global systems, and the temporal evolution of the aggregated profiles of participants in the financial environment; and a method of constructing the aggregated profiles that supports the semantic, psychometric, spatio-temporal and topological levels of participant description;
2. A family of scalable machine learning algorithms for constructing and identifying predictive models of financial processes; a meta-learning method for automatically selecting and adjusting the parameters of the predictive model, taking into account the accuracy and resource cost of calculations; a method for reusing previously trained models to solve similar tasks, taking into account the resource cost of re-training;
3. A method of multiscale modeling of a hierarchical financial system that supports at least three levels of detail (micro-, meso-, macro-), and an ensemble technology for deterministic and interval prediction of the evolution of the financial system;
4. A family of efficient parallel algorithms implementing data processing procedures and the execution of DDA models on various computing architectures while maintaining the local connections between data structures; a set of algorithms for the joint planning of data placement and computational operations, taking into account the data required in calculations, adaptive scheduling, and the balancing and restructuring of the cloud infrastructure;
5. An open cloud software platform for simulation and prediction of financial processes, including services for collecting, aggregating and processing data, creating predictive models and simulating financial processes;
6. A complex of multiscale predictive models and a set of application services in the form of composite applications for credit, behavioral and collection scoring, portfolio management, increasing the conversion of marketing campaigns, operational efficiency, and acquiring;
7. A public Internet service for rating the financial products of domestic banks based on microsegmentation; an open virtual testing area for the collaborative development, testing, launching and coupling of predictive models of financial processes; and a simulator of the global financial environment with various levels of detail.
The practical results of the project (the platform and the set of applications developed) will be used by the industrial partner (PJSC “Bank of Saint Petersburg”) to create application services for various participants of the financial market.


 

REPORTS


Annotation of the results obtained in 2020
The goals of this stage of the project are: a) enriching the platform for multi-scale modeling and forecasting of financial processes with new modules and services of various scales; b) developing tools and technologies for the collaborative development, use, and testing of platform models, including in public access mode; c) implementing experimental samples of decision support technologies based on the method of integrating DDA models of various scales. In 2020, 20 modules were developed and tuned for classification, regression, clustering, and object generation tasks based on platform data, bringing the total number of modules of the platform for multi-scale modeling and forecasting of financial processes to 41. The modules of 2020 are classified according to their applicability to different scales of the financial environment and include: 1) for the micro-scale, models for identifying and restoring components of dynamic digital profiles of financial entities; 2) for the meso-scale, simulation models of consumer choice in the financial sector; 3) for the macro-scale, simulators of marketing strategies. A virtual polygon (testing area) has been developed as an add-on to the platform for the collaborative development, testing, launching, and coupling of predictive models. Polygon projects are implemented on the basis of React, Kubernetes and JupyterHub technologies, forming an isolated computing area in a virtual container with access to both the CPU and GPU for computational tasks. The functionality of the polygon is demonstrated by a number of public services developed and integrated into it: a service for rating financial products of domestic banks, and services for developing "Next Best Action" (NBA) strategies for clients and employees.
The platform also supports the development of public services outside the virtual polygon, as demonstrated by a standalone web service for predictive analysis of socio-economic data and a service for searching for potential corporate clients (legal entities) in counterparty networks, deployed in the technological sandbox of the industrial partner bank. To increase the efficiency of the virtual polygon during collaborative work on models, the platform's data storage subsystem has been improved with respect to the component for distributed storage and processing of aggregated profiles (AP). Performance experiments on this component have shown that the platform is suitable for efficient storage of and access to AP with various components for AP databases of up to 10 million records. Based on the method of combining models of various scales, a simulator of the global financial environment was created following the approach of distributed intelligent systems, in the form of a set of software agents that iteratively: a) select an activity in the environment; b) receive feedback from the environment (agents of natural and / or artificial intelligence); c) update the parameters of their reaction models (AI agent reflection). Based on the simulator, a demonstrator of the ecosystem of digital identities of bank clients was created (micro-scale: client; meso-scale: client network; macro-scale: urban environment). In order to develop the logic of CSPPR as personal digital assistants of financial environment subjects, a cognitive interface was created for working with the virtual polygon and simulator, based on the concept of "digital avatars": intelligent software agents that evolve in accordance with the needs and goals of the owner. An example of a cognitive interface is implemented for the task of a personal digital assistant in choosing consumer activity strategies.
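The iterative agent cycle described above (select an activity, receive feedback, update the reaction model) can be illustrated with a minimal sketch. It is not the project's simulator: the `Agent` class, the epsilon-greedy selection rule, the activity names and the reward values in `environment_feedback` are all assumptions chosen only to make the loop concrete.

```python
import random


class Agent:
    """Hypothetical software agent with a simple value-based reaction model."""

    def __init__(self, activities, learning_rate=0.1, explore=0.1, seed=0):
        self.values = {a: 0.0 for a in activities}  # estimated payoff per activity
        self.learning_rate = learning_rate
        self.explore = explore
        self.rng = random.Random(seed)

    def select_activity(self):
        # a) select an activity: mostly the best-known one, sometimes a random one
        if self.rng.random() < self.explore:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def update(self, activity, feedback):
        # c) update the reaction model: move the estimate towards the feedback
        old = self.values[activity]
        self.values[activity] = old + self.learning_rate * (feedback - old)


def environment_feedback(activity, rng):
    # b) toy environment: assumed mean payoffs plus Gaussian noise
    means = {"save": 0.2, "spend": 0.5, "invest": 0.8}
    return means[activity] + rng.gauss(0.0, 0.1)


def simulate(steps=2000, seed=1):
    env_rng = random.Random(seed)
    agent = Agent(["save", "spend", "invest"], seed=seed)
    for _ in range(steps):
        activity = agent.select_activity()
        feedback = environment_feedback(activity, env_rng)
        agent.update(activity, feedback)
    return agent


if __name__ == "__main__":
    print(simulate().values)
```

In a multi-agent setting the feedback function would depend on the actions of other agents rather than on fixed mean payoffs; the loop structure stays the same.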
In general, the results demonstrate applicability of the platform for organizing the full life cycle of working with financial process models on large data sets, including in the mode of collaborative development and access. Based on the practical result of the research – a platform for multi-scale modeling and forecasting of financial sector processes – it is advisable to implement the following classes of systems for the real economy: a) analytical monitoring systems and decision support systems for managing financial processes of various scales in the interests of various categories of stakeholders (from government authorities to networks of retail and service points and banks); b) ecosystems of personal digital assistants for clients/employees/managers based on cognitive interfaces to the platform core; c) simulators ("aquariums") of customer / employee / counterparty behavior based on meso-layer models of the Platform, generative models of aggregated profiles and contact networks; d) descriptive and normative macro-models of financial systems development of various scales (from the level of a city / country subject to cross-border environments); e) quality control systems and generative design of intelligent models for the financial sector based on the adaptation of meta-learning methods and model composition within the platform in the logic of incremental learning.

 

Publications

1. Atkisson C., Gorski P.J., Jackson M., Holyst J.A., D'Souza R. Why understanding multiplex social network structuring processes will help us better understand the evolution of human behavior Evolutionary anthropology, 2020, Vol. 29, No. 3, pp. 102-107. (year - 2020) https://doi.org/10.1002/evan.21850

2. Bardina M., Vaganov D., Guleva V. Socio-demographic features meet interests: on subscription patterns and attention distribution in online social media Procedia Computer Science, 2020, Vol. 178, pp. 162-171 (year - 2020) https://doi.org/10.1016/j.procs.2020.11.018

3. Boytsov A.A., Gladilin P.E. Separating real-world photos from computer graphics: comparative study of classification algorithms Procedia Computer Science, 2020, Vol. 178, pp. 320-327 (year - 2020) https://doi.org/10.1016/j.procs.2020.11.046

4. Buzdalov M., Mishra S. If unsure, shuffle: deductive sort is Θ(MN³), but O(MN²) in expectation over input permutations GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference, 2020, pp. 516-523. (year - 2020) https://doi.org/10.1145/3377930.3390246

5. Buzdalov M., Mishra S. Filter Sort Is Ω(N³) in the Worst Case Lecture Notes in Computer Science, 2020, pp. 675-685 (year - 2020) https://doi.org/10.1007/978-3-030-58115-2_47

6. Choloniewski, J., Sienkiewicz, J., Dretnik, N., Leban, G., Thelwall, M., Holyst, J.A. A calibrated measure to compare fluctuations of different entities across timescales Scientific Reports, 2020, Vol. 10, No. 1, pp. 1-16 (year - 2020) https://doi.org/10.1038/s41598-020-77660-4

7. Chunaev P. Interpolation by generalized exponential sums with equal weights Journal of Approximation Theory, 2020, p. 105397. (year - 2020) https://doi.org/10.1016/j.jat.2020.105397

8. Chunaev P.V., Gradov T.A., Bochenina K.O. Community detection in node-attributed social networks: How structure-attributes correlation affects clustering quality Procedia Computer Science, 2020, Vol. 178, pp. 355-364 (year - 2020) https://doi.org/10.1016/j.procs.2020.11.037

9. Deeva I., Andriushchenko P.D., Kalyuzhnaya A.V., Boukhanovsky A.V. Bayesian Networks-based personal data synthesis ACM International Conference Proceeding Series, 2020, pp. 6-11 (year - 2020) https://doi.org/10.1145/3411170.3411243

10. Derevitskii I., Kogtikov N., Lees M.H., Cai W., Ong M. Risk-based AED placement - Singapore case Lecture Notes in Computer Science, 2020, pp. 577-590 (year - 2020) https://doi.org/10.1007/978-3-030-50423-6_43

11. Egorov A., Sokhin T., Butakov N. Towards a Retrospective One-Class Oriented Approach to Parents Detection in Social Media 2020 27th Conference of Open Innovations Association (FRUCT), Article number 9211021, pp. 54-60 (year - 2020) https://doi.org/10.23919/FRUCT49677.2020.9211021

12. Gladilin P.E., Levina P.V. Method for analyzing the location, assortment and success of outlets based on transactional data ACM, - (year - 2021)

13. Gladysheva E.A., Derevitskii I.V., Severiukhina O.A. A trust and relevance-based Point-Of-Interest recommendations method with inaccessible user location Procedia Computer Science, 2020, Vol. 178, pp. 153-161 (year - 2020)

14. Gorski P.J., Bochenina K., Holyst J.A., D'Souza R. Homophily Based on Few Attributes Can Impede Structural Balance Physical Review Letters, Volume 125, Issue 7, Article number 078302 (year - 2020) https://doi.org/10.1103/PhysRevLett.125.078302

15. Jedrzejewski A., Toruniewska J., Suchecki K., Zaikin O., Holyst J.A. Spontaneous symmetry breaking of active phase in coevolving nonlinear voter model Physical Review E, - (year - 2020) https://doi.org/10.1103/PhysRevE.102.042313

16. Kalinin A., Vaganov D.A., Bochenina K.O. Discovering patterns of customer financial behavior using social media data Social Network Analysis and Mining, 2020, Vol. 10, No. 1, pp. 1-14. (year - 2020) https://doi.org/10.1007/s13278-020-00690-3

17. Kudinov S., Antonov A., Ilina E. Specifying spatial and temporal characteristics of increased activity of users of e-participation services Communications in Computer and Information Science, - (year - 2021)

18. Kudinov S., Smirnov E., Dunaenko S. Using multi-agent simulation to predict natural crossing points for pedestrians and choose locations for mid-block crosswalks Geo-Spatial Information Science, 2020, pp. 1-13. (year - 2020) https://doi.org/10.1080/10095020.2020.1847003

19. Merezhnikov M.V., Hvatov A. Closed-form algebraic expressions discovery using combined evolutionary optimization and sparse regression approach Procedia Computer Science, 2020, Vol. 178, pp. 424-433 (year - 2020) https://doi.org/10.1016/j.procs.2020.11.044

20. Mishra S., Buzdalov M., Senwar R. Time complexity analysis of the dominance degree approach for non-dominated sorting GECCO 2020 Companion - Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, 2020, pp. 169-170. (year - 2020) https://doi.org/10.1145/3377929.3389900

21. Saitov I., Polevaya T., Filchenkov A. Dermoscopic attributes classification using deep learning and multi-task learning Procedia Computer Science, 2020, Vol. 178, pp. 328-336 (year - 2020) https://doi.org/10.1016/j.procs.2020.11.034

22. Severiukhina O., Kesarev S.A., Bochenina K.O., Boukhanovsky A.V., Lees M.H., Sloot P. Large-scale forecasting of information spreading Journal of Big Data, 2020, Vol. 7, No. 1, pp. 1-17. (year - 2020) https://doi.org/10.1186/s40537-020-00350-5

23. Sigova M., Koniukhov S., Soliar E. Banks needs for predictive analytics and predictive models in cost management system Proceedings of the 36th International Business Information Management Association Conference, IBIMA 2020, - (year - 2021)

24. Surikov A.G., Egorova E.V. Emotional analysis of Russian texts using emojis in social networks Lecture Notes in Computer Science, - (year - 2021)

25. Surikov A.G., Egorova E.V. Alternative method sentiment analysis using emojis and emoticons Procedia Computer Science, 2020, Vol. 178, pp. 182-193 (year - 2020) https://doi.org/10.1016/j.procs.2020.11.020

26. Vaganov D., Bardina M.G., Guleva V. From Generality to Specificity: On Matter of Scale in Social Media Topic Communities Lecture Notes in Computer Science, 2020, pp. 305-318. (year - 2020) https://doi.org/10.1007/978-3-030-50423-6_23

27. Volokha V.D., Gladilin P.E. Identifying user interests and habits using object detection and semantic segmentation models Lecture Notes in Computer Science, - (year - 2021)

28. Zamiralov A., Khodorchenko M.A., Nasonov D.A. Detection of housing and utility problems in districts through social media texts Procedia Computer Science, 2020, Vol. 178, pp. 213-223 (year - 2020) https://doi.org/10.1016/j.procs.2020.11.023

29. Chunaev P. Community detection in node-attributed social networks: A survey Computer Science Review, 2020, Vol. 37, p. 100286. (year - 2020) https://doi.org/10.1016/j.cosrev.2020.100286

30. - Software library for recovering the geocoordinates of events from partially labeled transactional data -, No. 2020619733 of 21.08.2020 (year - )

31. - Service for personalized recommendations of catering establishments -, No. 2020663030 of 21.10.2020 (year - )

32. - Development and implementation of a software suite for forecasting the dynamics of the use of financial instruments -, No. 2020666032 of 03.12.2020 (year - )

33. - Software library for forecasting customer flow at retail and service outlets in the urban environment -, No. 2020660069 (year - )

34. - An assistant for choosing online courses, a game about cats, and more: the best ideas of the participants of the school run by the ITMO Avatar team ITMO.NEWS, - (year - )

35. - Scientists have shown why it is hard to reach the "paradise state" ITMO.NEWS, - (year - )

36. - Financial information technologies for decision support based on big data Control Engineering Russia, June 2019, 2019, pp. 20-24 (year - )


Annotation of the results obtained in 2017
The goal of the study for this stage of the project is to substantiate the direction of research in the field of intelligent technologies for financial decision support based on big data and predictive modeling. The financial sphere is traditionally understood as the aggregate of existing banking and financial products on the market, as well as mechanisms for their creation and distribution. Support for decision-making in the financial sphere requires solving the problems of forecasting financial markets, estimating the demand for new financial products, determining the reliability of potential users of credit products, and identifying other sources of profit for market participants. Based on the analysis of literature sources, the basic requirements for decision support technologies were formulated that consider the fragmented observability of the financial market, the adjustment of financial products and services to the individual characteristics of customers, and the transition to omnichannel financial agent interaction systems. The cumulative consideration of the described factors requires the development of new problem statements for decision support in the financial sphere, including not only functional elements, links between them, limitations and target criteria, but also the law of their evolution for different scales of the financial system. The comparative analysis of existing methods, models and algorithms for financial decision support, based on four groups of criteria (purpose, level, model type, model concept), has demonstrated a tendency to include in the traditional models of financial mathematics the advances of related areas, as well as non-financial data, to combine individual financial data with cyberspace data, especially geospatial and media (audio, video) data. 
Proceeding from this, the project developed a method for constructing aggregated profiles (AP) of financial entities. The method transforms a set of unstructured data collected from various sources (social media, databases of banks and service enterprises, state statistical databases) into a structured form that allows the subject's behavior to be interpreted at the semantic, psychometric, spatio-temporal and topological levels. Based on this, a complex of infological models was constructed that map the subject's AP to the parameter subspace of a specific applied task. They were designed to solve the problems of (a) identifying preferred patterns of payment and purchasing activity from the customer's financial transactions, (b) online scoring by user profile in a social network, and (c) modeling the inter-bank lending market. Since different infological models can belong to different levels of a generalized multiscale model, a mechanism of transition between models of different levels was additionally developed. The AP itself and the infological data models are used as the basis for building predictive Data Driven Approach (DDA) models. At this stage of the project, a set of separate methods and procedures for probabilistic analysis and machine learning, applied sequentially or in parallel to individual AP elements, is considered for designing and identifying predictive DDA models. Thus, for the identification of models in the form of complex networks, methods were developed for restoring network topologies from a set of macro characteristics; they are based on solving a multicriteria optimization problem that takes empirical data into account (fixing the bow-tie structure of the network).
To design and identify the parameters of DDA models reflecting the statistical interrelations between the elements of the AP, variants of the following were implemented: (a) regression and supervised classification methods, (b) classification by unsupervised machine learning methods, (c) mixed methods based on clustering and regression models, and (d) decision trees. To design and identify balance models (for example, of the investment development of territories), methods for calibrating model parameters were developed within the framework of evolutionary computation, with convergence accelerated by (1) extracting useful information from individuals with a relatively low fitness value; (2) increasing the range of attainable fitness function values through greater use of mutations with an increased mutation rate; (3) using variation operators that take more than one individual as input. Taken together, these methods make it possible to assemble from such fragments a general mathematical model of a hierarchical financial system, which includes a descriptive metamodel of the evolution of the financial system, a set of isolated content models at different levels of detail, and a set of motivational models for analyzing the qualitative properties of the system. A descriptive metamodel in the form of Liouville-type equations reflects the evolution of financial system invariants, which are used in various interpretations in the isolated models representing individual levels. The macro level is represented by a model of interbank lending based on a multiplex complex network. The meso level is represented by a financial model of the investment development of territories. The micro level is represented by a risk forecasting model for a system of transnational p2p lending: the simulation model of p2p lending in a network community is combined with a model for assessing the risk of default based on social media data and the customer's payment history.
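As an illustration of calibrating a balance model by evolutionary computation, the following is a minimal sketch of a generic elitist evolutionary algorithm. It is not the project's calibration method: the toy `balance_model` (a stock with constant inflow `a` and proportional outflow `b`), the squared-error fitness, the blend crossover and all numeric settings are assumptions. The crossover step is one example of a variation operator that takes more than one individual as input.

```python
import random


def balance_model(params, steps):
    """Toy balance model (assumed): stock x gains inflow a and loses share b per step."""
    a, b = params
    x = 1.0
    series = []
    for _ in range(steps):
        x = x + a - b * x
        series.append(x)
    return series


def fitness(params, observed):
    """Negative sum of squared errors between simulated and observed series."""
    simulated = balance_model(params, len(observed))
    return -sum((s - o) ** 2 for s, o in zip(simulated, observed))


def calibrate(observed, pop_size=30, generations=200, sigma=0.05, seed=0):
    rng = random.Random(seed)
    pop = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: the better half survives unchanged.
        pop.sort(key=lambda p: fitness(p, observed), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            # Variation operator with two inputs: blend crossover, then mutation.
            p1, p2 = rng.sample(parents, 2)
            w = rng.random()
            child = tuple(
                w * g1 + (1 - w) * g2 + rng.gauss(0, sigma)
                for g1, g2 in zip(p1, p2)
            )
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda p: fitness(p, observed))


if __name__ == "__main__":
    target = balance_model((0.5, 0.2), 20)  # synthetic "observations"
    print(calibrate(target))
```

Because the better half of the population always survives, the best fitness never decreases between generations; the acceleration techniques listed in the text would modify how children are produced and which individuals contribute information.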
A set of motivational models of hierarchical systems of a more general type has also been developed, which makes it possible to explore various aspects of the creation, transformation and transfer of material and information flows in financial environments. To implement these methods and models in practice, and to create an infrastructure for collective access to financial modeling tools, a number of applied studies were carried out to develop the information and technological base of the project. A method has been developed for collecting, storing, and processing semi-structured data for DDA modeling on various computing architectures while preserving local relationships between data structures. Using the infological model for a specific task, it provides data indexing that optimizes access to the data from computing procedures in a distributed cloud environment. This method was applied to social media data (Vkontakte, Twitter, Instagram) collected by a crawler on potential customers of the industrial partner bank. To implement data placement planning, a modular hybrid procedure has been developed that uses both heuristic (HEFT, DCPG) and meta-heuristic (GA, PSO, NN) algorithms, as well as their hybrid modifications used in constructing coevolutionary schemes. These schemes simultaneously optimize the configuration of virtual computing resources, the distribution of calculations and the placement of data in a distributed environment, based on the specifics of the applied problem as given by the infological model. The architecture of an open cloud software platform for the simulation and forecasting of financial processes is proposed. It includes services for data collection, aggregation and processing, construction of forecast models, and simulation of financial processes, implemented as an extension of the CLAVIRE cloud platform.
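The idea of planning computations together with data placement can be sketched with a simple greedy earliest-finish-time heuristic. This is a simplification and not HEFT itself (HEFT ranks tasks of a dependency graph, whereas the tasks here are independent), and it is not the project's hybrid procedure. The function name `place_tasks`, the task tuples and the flat `transfer_cost` penalty are assumptions for illustration.

```python
def place_tasks(tasks, nodes, transfer_cost):
    """Greedy earliest-finish-time placement of independent tasks.

    tasks: list of (name, compute_time, data_node) tuples, where data_node
           is the node that already stores the task's input data.
    nodes: list of node names.
    transfer_cost: extra time paid when a task runs away from its data.
    Returns (schedule, makespan), where schedule maps task name to node.
    """
    ready_at = {node: 0.0 for node in nodes}
    schedule = {}
    # Longest tasks first, a common list-scheduling priority rule.
    for name, compute_time, data_node in sorted(tasks, key=lambda t: -t[1]):

        def finish_time(node):
            penalty = 0.0 if node == data_node else transfer_cost
            return ready_at[node] + penalty + compute_time

        best = min(nodes, key=finish_time)
        ready_at[best] = finish_time(best)
        schedule[name] = best
    return schedule, max(ready_at.values())


if __name__ == "__main__":
    demo_tasks = [("t1", 4.0, "A"), ("t2", 3.0, "A"), ("t3", 2.0, "B")]
    print(place_tasks(demo_tasks, ["A", "B"], transfer_cost=1.0))
```

In this toy run, task `t2` is moved away from its data because paying the transfer penalty on an idle node finishes earlier than waiting for the busy one; meta-heuristics such as GA or PSO would instead search over whole assignments.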
The main elements of the platform were prototyped in Scala, C# and Python, covering data collection and aggregation, modeling at different levels of detail, and the coupling of models of different scales. A set of demonstration tasks of multiscale modeling in the financial sphere was prepared to form model scenarios based on historical data. The data cover the evolution of the interbank lending market (the banking network of the Russian Federation for 2008-2017), the investment development of territories (the satellite city of Southern, based on the master plan of 2017), and financial behavior in the case of credit defaults (data of the industrial partner bank for 2013-2016, about 60 thousand loan agreements). The models constructed on these data are logically interrelated: inter-bank lending reflects the conditions on the mortgage market, the development of territories reflects the population's demand for mortgage loans, and the scoring model reflects the possibility of obtaining these loans and the risk of default for banks. In general, the results demonstrate the fundamental possibility of solving the problems of modeling the financial system by integrating models of different levels, considering individual non-financial data of the subjects.

 

Publications

1. Dzhafarov B., Voloshin D., Petrov M., Butakov N. Modelling multistage information spreading in dynamic complex networks Procedia Computer Science, 2017. — Vol. 119. — pp. 376–385 (year - 2017) https://doi.org/10.1016/j.procs.2017.11.197

2. Górski P.J., Kułakowski K., Gawroński P., Hołyst J.A. Destructive influence of interlayer coupling on Heider balance in bilayer networks Scientific Reports, 2017.— Vol. 7.—Article number: 16047 (year - 2017) https://doi.org/10.1038/s41598-017-15960-y

3. Guleva V.Y., Bochenina K.O. Graph Theoretical Approach to Bow-Tie Interbank Networks Reconstruction Studies in Computational Intelligence, 2017. — Volume 689. — pp. 1184-1194. (year - 2017) https://doi.org/10.1007/978-3-319-72150-7_96

4. Kuvshinov K., Bochenina K., Gorski P.J, Holyst J.A. Hybrid CPU-GPU Simulation of Hierarchical Adaptive Random Boolean Networks Lecture Notes in Computer Science, - (year - 2017)

5. Muravyov S., Filchenkov A. Meta-learning system for automated clustering Proceedings of Workshop AutoML 2017 @ ECML-PKDD: Automatic selection, configuration and composition of machine learning algorithms, 2017. - № - Access mode: https://sites.google.com/site/automl2017ecmlpkdd/workshop/accepted-papers (year - 2017)

6. Severiukhina O.A., Smirnov P.A., Bochenina K.O., Nasonov D., Butakov N.A. Adaptive load balancing of distributed multi-agent simulations on heterogeneous computational infrastructures Procedia Computer Science, 2017. — Vol. 119. — pp. 139-146 (year - 2017) https://doi.org/10.1016/j.procs.2017.11.170

7. Toruniewska J., Kułakowski K., Suchecki K., Hołyst J.A. Coupling of link- and node-ordering in the coevolving voter model Physical Review E - Statistical, Nonlinear, and Soft Matter Physics, — 2017.— Vol. 96.—Article number: 042306 (year - 2017) https://doi.org/10.1103/PhysRevE.96.042306

8. Zabashta A., Filchenkov A. NDSE: Instance Generation for Classification by Given Meta-Feature Description Proceedings of Workshop AutoML 2017 @ ECML-PKDD: Automatic selection, configuration and composition of machine learning algorithms, IET, 2017 (year - 2017)

9. - Scientific school and workshop for young scientists and specialists "New intelligent technologies: finance, healthcare, communications" News section of the website of the Institute of Financial Cyber Technologies, 28.11.2017 (year - )

10. - ITMO University to develop a decision support system for the financial sector news.ifmo.ru, 28 July 2017 (year - )

11. - Interview with the head of the RSF project, Prof. J. Holyst, and key project participant K.O. Bochenina NewTone magazine (Megabyte), 20.12.2017 (year - )

12. - Prof. Janusz Holyst presented the RSF project to ITMO.Fintech students News section of the website of the Institute of Financial Cyber Technologies, 13.10.2017 (year - )


Annotation of the results obtained in 2018
An ontological data model of the financial environment has been developed for describing relationships of entity attributes of various scales and relationships between them, and a method for reducing the dimension and visualizing semantic fields for the automated construction of concept spaces associated with the subject. A discrete-event model of the evolution of an aggregated profile of an entity of a financial environment is developed on the basis of a context-dependent modeling approach linking the change in the state of agents of the financial environment with informational messages about the state of the environment, opinions and actions of other agents. The complex of infological models of data of entities and processes in the financial sector (developed in 2017) was refined to take into account the temporal variability of aggregated profiles by highlighting the basic values and state variables of agents (identified based on aggregated profile data and discrete event modeling, respectively). A family of scalable machine learning algorithms has been developed and investigated, including methods for improving the performance of algorithms with a large search space based on the use of efficient data structures for implementing internal procedures; methods for establishing connectivity taking into account the target criterion of similarity for multiplex weighted networks of agents of the financial environment, methods of dimensionality reduction and identification of heterogeneity of complex networks with attributes of nodes. A meta-learning procedure has been developed and investigated. It is intended for automatic selection of a machine learning model with simultaneous adjustment of its hyperparameters and a model selection procedure for financial time series. 
An ensemble technology for deterministic and interval forecasting of the evolution of the financial system, taking into account the impact of external factors, was developed, investigated, and tested on micro- and meso-scale financial system tasks. The following forecast quality assessment procedures have been developed and investigated: a composition of quality metrics for the validation stage; an approach to assessing the stability of forecast quality metrics based on the Monte Carlo method; and an approach to optimizing the quality metrics of predictive models subject to constraints on reactivity characteristics. A set of methods for identifying parameters of predictive models of financial processes has been developed, including methods for: 1) identifying and spatially clustering points of interest based on transactional activity, which makes it possible to identify zones of customer payment interest in the city; 2) identifying the travel segment and preferred overseas travel destinations; 3) identifying customer interests from open sources; 4) identifying the functional roles of subscribers in an organization’s community; 5) identifying the inflow/outflow parameters of the subscribers of an organization’s community; 6) identifying the characteristics of the borrower profile based on open-source data and transactions; 7) clustering networks of transitions between expenditure categories; 8) clustering multiplex similarity networks built from different sources of financial and open data, which identifies groups of clients that are similar in a set of components of the aggregated profile; 9) identifying hidden characteristics of subscribers from the topological component of the aggregated profile of the organization.
A set of efficient parallel algorithms that implement data processing and the execution of DDA models on various computational architectures has been developed and investigated, including an algorithm for the efficient redistribution of large data arrays and a family of parallel predictive models and financial data processing procedures. An experimental prototype of a platform for multiscale simulation and forecasting of financial processes has been developed, with components for data collection, aggregation and processing, predictive modeling of financial processes, implementation of multiscale applications, and optimization of data placement. The software platform was tested according to the developed program and experimental research methodology in order to assess the functionality, scalability and performance of its modules and components. The test results confirmed that the modules and components of the platform are applicable to the stated tasks and that their scalability and performance characteristics comply with the stated nominal values. Model scenarios were formed and experimental studies of the prognostic capabilities of the developed system were carried out. They showed that, for a number of substantive tasks of the financial sphere on the micro, meso and macro scales, the system produces forecasts that fall into the “good” and “very good” categories according to forecast quality metrics, as well as interpretable clusters of subjects of the financial environment. The applicability of the modeling results to the substantive tasks of credit, behavioral and collection scoring, portfolio management, increasing the conversion of marketing campaigns, operational efficiency, and acquiring has been substantiated. The practical result of the study was the creation of a software platform for multiscale simulation and forecasting of financial processes.
On the basis of this platform, it is advisable to implement analytical monitoring systems and decision support systems for managing financial processes of various scales in the interests of various categories of stakeholders (from public authorities to networks of retail and service outlets and banks).
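
The Monte Carlo assessment of forecast metric stability mentioned in the annotation above can be illustrated with a minimal sketch. The report does not specify the sampling scheme or the metrics used, so the bootstrap resampling of (actual, forecast) pairs, the choice of MAPE, and the names `mape` and `metric_stability` are all assumptions made for illustration only.

```python
import random
import statistics


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * statistics.fmean(
        abs(a - f) / abs(a) for a, f in zip(actual, forecast)
    )


def metric_stability(actual, forecast, metric=mape, n_trials=1000, seed=0):
    """Estimate the spread of a quality metric by Monte Carlo resampling.

    Assumed scheme: (actual, forecast) pairs are resampled with replacement
    (a bootstrap), and the metric is recomputed on every resample.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    pairs = list(zip(actual, forecast))
    values = []
    for _ in range(n_trials):
        sample = rng.choices(pairs, k=len(pairs))
        resampled_actual, resampled_forecast = zip(*sample)
        values.append(metric(resampled_actual, resampled_forecast))
    values.sort()
    return {
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),
        # Empirical bounds of the central 95% of the resampled metric values.
        "low": values[int(0.025 * (n_trials - 1))],
        "high": values[int(0.975 * (n_trials - 1))],
    }


if __name__ == "__main__":
    actual = [100, 110, 120, 130, 140, 150]
    forecast = [98, 113, 118, 135, 137, 155]
    print(metric_stability(actual, forecast))
```

A narrow interval between `low` and `high` suggests that the metric value depends little on which observations fall into the validation sample; a wide one suggests that a single validation score should be treated with caution.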

 

Publications

1. Abdrashitova Y.S., Zabashta A.S., Filchenkov A.A. Spanning of Meta-Feature Space for Travelling Salesman Problem Procedia Computer Science, Vol.136, pp.174-182. (year - 2018) https://doi.org/10.1016/j.procs.2018.08.250

2. Abubakirov A.R., Nikitin N.O., Kalyuzhnaya A.V. Model for credit scoring on a base of behavioural and macroeconomic predictors 2018 IEEE Northwest Russia Conference on Mathematical Methods in Engineering and Technology (MMET NW 2018), - (year - 2018)

3. Buzdalov M.V. Make Evolutionary Multiobjective Algorithms Scale Better with Advanced Data Structures: Van Emde Boas Tree for Non-Dominated Sorting Lecture Notes in Computer Science, - (year - 2018)

4. Derevitskii I.V., Nuzhdenko I.B., Bochenina K.O. Identifying places of financial interest using open data Procedia Computer Science, Vol. 136, pp. 265–273 (year - 2018) https://doi.org/10.1016/j.procs.2018.08.268

5. Gajewski L.G., Suchecki K., Hołyst J.A. Multiple propagation paths enhance locating the source of diffusion in complex networks Physica A: Statistical Mechanics and its Applications, - (year - 2018)

6. Golubev K.A., Zagarskikh A.S., Karsakov A.S. A framework for a multi-agent traffic simulation using combined behavioural models Procedia Computer Science, Vol. 136, pp. 443-452 (year - 2018) https://doi.org/10.1016/j.procs.2018.08.267

7. Grigorev A.K., Derevitskii I.V., Bochenina K. Analysis of Special Transport Behavior Using Computer Vision Analysis of Video from Traffic Cameras Communications in Computer and Information Science, Vol 858., pp. 289-301 (year - 2018) https://doi.org/10.1007/978-3-030-02843-5_23

8. Krawczyk M.J., Kułakowski K., Hołyst J.A. Hierarchical partitions of social networks between rivaling leaders PLoS ONE, Vol. 13. – №. 3. – Article № e0193715 (year - 2018) https://doi.org/10.1371/journal.pone.0193715

9. Lantseva A.A., Ivanova S.V. Assessment of pedestrian flow volumes through public transport modelling Procedia Computer Science, Vol. 136, pp. 463-471. (year - 2018) https://doi.org/10.1016/j.procs.2018.08.265

10. Nikitin N.O., Kalyuzhnaya A.V., Bochenina K.O., Kudryashov A.A., Uteuov A.K., Derevitskii I.V. Evolutionary Ensemble Approach for Behavioral Credit Scoring. Lecture Notes in Computer Science, Vol. 10862, pp. 825-831 (year - 2018) https://doi.org/10.1007/978-3-319-93713-7_81

11. Nuzhdenko I., Uteuov A., Bochenina K. Detecting Influential Users in Customer-Oriented Online Communities Lecture Notes in Computer Science, Vol. 10862, pp.832-838 (year - 2018) https://doi.org/10.1007/978-3-319-93713-7_82

12. Paluch R., Lu X., Suchecki K., Szymański B.K., Hołyst J.A. Fast and accurate detection of spread source in large complex networks Scientific Reports, Volume 8, Issue 1, Article N2508 (year - 2018) https://doi.org/10.1038/s41598-018-20546-3

13. Shalamova V.V.; Efimova V.A.; Muravyova S.B.; Filchenkov A.A. Reinforcement-based Method for Simultaneous Clustering Algorithm Selection and its Hyperparameters Optimization Procedia Computer Science, Vol.136, pp. 144-153. (year - 2018) https://doi.org/10.1016/j.procs.2018.08.247

14. Sigova M.V., Klioutchnikov I.K., Zatevakhina A.V., Klioutchnikov O.I. Approaches to evaluating the function of prediction of decentralized applications. Proceedings of the 32nd International Business Information Management Association Conference, - (year - 2018)

15. Sigova M.V., Klyutchnikov I.K., Zatevakhina A.V., Lobanova I.A. Outlook for the development of self-adjusting models for forecasting prices in financial markets. Proceedings of the 32nd International Business Information Management Association Conference, - (year - 2018)

16. Simonov A.O., Lebin A.E., Shcherbak B.D., Zagarskikh A.S., Karsakov A.S. Multi-agent crowd simulation on large areas with utility-based behavior models: Sochi Olympic Park Station use case Procedia Computer Science, Vol. 136, pp. 453-462 (year - 2018) https://doi.org/10.1016/j.procs.2018.08.266

17. Siudem G., Hołyst J.A. Diffusion on hierarchical systems of weakly-coupled networks Physica A: Statistical Mechanics and its Applications, Vol. 513, pp. 675-686 (year - 2018) https://doi.org/10.1016/j.physa.2018.08.078

18. Vaganov D.A., Funkner A.A., Kovalchuk S.V., Guleva V.Y., Bochenina K.O. Forecasting Purchase Categories with Transition Graphs Using Financial and Social Data Lecture Notes in Computer Science, Vol. 11185, pp. 439-454 (year - 2018) https://doi.org/10.1007/978-3-030-01129-1_27

19. Vaganov D.A., Guleva V.Y., Bochenina K.O. Social Media Group Structure and Its Goals: Building an Order Studies in Computational Intelligence, Vol 813, pp.473-483. (year - 2018) https://doi.org/10.1007/978-3-030-05414-4_3

20. Vaganov D.A., Sheina E.S., Bochenina K.O. A comparative study of social data similarity measures related to financial behavior Procedia Computer Science, Vol. 136, pp. 274-283. (year - 2018) https://doi.org/10.1016/j.procs.2018.08.270

21. Varvara I.D., Ivanov S.V. Crime rate prediction in the urban environment using social factors Procedia Computer Science, Vol. 136, pp. 472-478. (year - 2018) https://doi.org/10.1016/j.procs.2018.08.261

22. Volotskiy T., Smirnov J., Ziemke D., Kaddoura I. An Accessibility Driven Evolutionary Transit Network Design Approach in the Multi-agent Simulation Environment Procedia Computer Science, Vol. 136, pp. 499-510 (year - 2018) https://doi.org/10.1016/j.procs.2018.08.255


Annotation of the results obtained in 2019
The purpose of this stage’s work is to develop multiscale predictive models of processes in the financial sphere, to integrate them into a platform for multiscale modelling and forecasting of financial processes, and to develop the mathematical, model, algorithmic, and technological support needed to solve applied problems of classification, regression and clustering of financial data effectively with this platform. We have developed a method of applying the platform for multiscale modelling and forecasting to applied problems of forecasting, optimisation, simulation modelling and pattern recognition on financial sphere data. We have developed and corrected 12 modules: a) forecasting the next category in the chain of a client’s expenses; b) forecasting expense categories for the next time period; c) transactional and social media data scoring; d) portfolio management; e) increasing conversion rates of marketing campaigns (developed); f) optimising operational effectiveness; g) analysis of acquiring rates; h) optimising acquiring rates; i) forecasting travel using transactional data; j) simulation modelling of consumer activity in retail networks; k) clustering complex networks with node attributes; l) recommending counterparties; m) recovering attributes in complex networks; n) recovering the source of spreading in a complex network; o) analysis of client affinity networks. In total, the platform now provides 24 modules that address 43 applied problems of modelling and optimisation in the financial sphere. We studied the predictive capabilities of the developed and corrected modules of the platform; the study showed the high effectiveness of the proposed methods and algorithms and their applicability to applied tasks of predictive modelling in the financial sphere.
We have developed a procedure for adaptively rebuilding the data model according to the specifics of the tasks being solved, the semantic content of the aggregated data, and the evolution of the aggregated profiles, using a mechanism of adaptive aggregation functions. This mechanism is based on automated analysis of the composition, structure, and run time of user data-extraction queries, followed by the generation of aggregated data representations. The procedure reduces query processing time by means of a data storage reconfiguration scheme that uses a genetic algorithm with an adaptive population. The effectiveness of the developed procedure was studied for the problem of time series storage and processing. The modified method for the automated selection of models and of algorithms for tuning structural parameters of models generalises the method proposed in 2018. It aims at finding optimal parameters of predictive models, balancing accuracy against the resource intensity of calculations, through: a) meta-feature analysis of the initial data and the target domain data; b) enrichment of the training sample using generative adversarial networks; c) optimisation of the computational processes based on co-evolutionary algorithms. The developed procedure for reusing previously trained models includes: a) training a model on large samples consisting of data from different domains; b) additional training of the model from point (a) on a target sample (generally, by freezing some of the layers of a neural network while updating the weights of the other layers). For financial time series, such transfer learning yields a significant increase in forecasting accuracy when only a small volume of data is available (less than 3% relative to the initial domain).
When knowledge is transferred between financial exchange points in the stock market, fully connected and recurrent neural networks show more stable results than a model trained on the target sample only. To optimise the organisation of the computational process in the algorithms for balancing and reconfiguring data storage schemes, we have developed theoretical models of application performance and validated them on a number of applied financial sphere tasks. These performance models evaluate the expected run time of sequential and parallel applications as a function of the computational complexity of the algorithm and the parameters of the computing environment. Validation of the performance models on forecasting and pattern recognition tasks for financial time series showed that the experimentally measured run times of multithreaded and multiprocessor models agree with the theoretical values. Based on the mechanism of co-evolution, we developed an adaptive algorithm for planning, balancing and reorganising the cloud infrastructure. We studied the performance and scalability of various platform components as well as of the models for solving applied tasks developed in 2019. In general, the results show that the developed platform solution is applicable to a wide range of fintech tasks, including tasks that take into account the temporal variability of the aggregated profiles of financial sphere subjects and that require connecting the scales of modelling.
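
The idea of a theoretical performance model that estimates sequential and parallel run time from algorithmic complexity and environment parameters can be sketched as follows. The report does not give the actual formulas, so the Amdahl-style split into serial and parallel parts, the linear coordination overhead, and all names (`Environment`, `sequential_time`, `parallel_time`, `parallel_fraction`) are illustrative assumptions rather than the project's models.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Environment:
    """Hypothetical parameters of the computing environment."""
    ops_per_second: float       # sustained throughput of one worker
    workers: int                # number of threads or processes
    overhead_per_worker: float  # coordination cost in seconds per extra worker


def sequential_time(n: int, complexity: Callable[[int], float], env: Environment) -> float:
    """Expected run time of the sequential application for input size n."""
    return complexity(n) / env.ops_per_second


def parallel_time(n: int, complexity: Callable[[int], float], env: Environment,
                  parallel_fraction: float) -> float:
    """Expected run time of the parallel application (assumed Amdahl-style model).

    The serial part runs on one worker, the parallel part is divided evenly
    among all workers, and each extra worker adds a fixed coordination cost.
    """
    t_seq = sequential_time(n, complexity, env)
    serial_part = (1.0 - parallel_fraction) * t_seq
    parallel_part = parallel_fraction * t_seq / env.workers
    overhead = env.overhead_per_worker * (env.workers - 1)
    return serial_part + parallel_part + overhead


if __name__ == "__main__":
    env = Environment(ops_per_second=1e6, workers=4, overhead_per_worker=0.01)
    quadratic = lambda n: float(n * n)  # assumed O(n^2) algorithm
    t1 = sequential_time(1000, quadratic, env)
    tp = parallel_time(1000, quadratic, env, parallel_fraction=0.8)
    print(f"sequential: {t1:.3f} s, parallel: {tp:.3f} s, speedup: {t1 / tp:.2f}")
```

Validation in the sense described above would then consist of comparing such predicted times with measured run times for several input sizes and worker counts.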

 

Publications

1. Chołoniewski J., Sienkiewicz J., Leban G., Hołyst J.A. Modelling of temporal fluctuation scaling in online news network with independent cascade model Physica A: Statistical Mechanics and its Applications, Vol. 523. P. 129-144 (year - 2019) https://doi.org/10.1016/j.physa.2019.02.035

2. Chunaev P., Nuzhdenko I., Bochenina K. Community Detection in Attributed Social Networks: a Unified Weight-Based Model and Its Regimes IEEE International Conference on Data Mining Workshops 2019.The Ninth IEEE ICDM Workshop on Data Mining in Networks (November 8, 2019, Beijing, China), - (year - 2019)

3. Deeva I. Computational Personality Prediction Based on Digital Footprint of A Social Media User Procedia Computer Science, Vol. 156. P. 185-193 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.194

4. Derevitskii I., Severiukhina O., Bochenina K. Clustering interest graphs for customer segmentation problems Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS2019), - (year - 2019)

5. Grigorev A., Severiukhina O., Derevitskii I. Anomaly Detection Using Adaptive Suppression Procedia Computer Science, Vol. 156. P. 274-282 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.203

6. Kachalsky I., Zabashta A., Filchenkov A., Korneev G. Generating Datasets for Classification Task and Predicting Best Classifiers with Conditional Generative Adversarial Networks Conference proceedings of ICAAI 2019, - (year - 2019)

7. Kalinin A., Vaganov D., Bochenina K. Improving statistical relational learning with graph embeddings for socio-economic data retrieval Procedia Computer Science, Vol. 156. P. 235-244 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.199

8. Khodorchenko M. Distant supervision and knowledge transfer for domain-oriented text classification in online social networks Procedia Computer Science, Vol. 156. P. 166-175 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.192

9. Klioutchnikov I., Sigova M., Klioutchnikova A. Big data in digital-banks Proceedings of the 33rd International Business Information Management Association Conference, IBIMA 2019: Education Excellence and Innovation Management through Vision 2020, P. 9594-9601 (year - 2019)

10. Lysenko A., Shikov E., Bochenina K. Combination of individual and group patterns for time-sensitive purchase recommendation Proceedings of MoST-Rec Workshop CIKM 2019 (Beijing, China, 2019), - (year - 2019)

11. Lysenko A., Shikov E., Bochenina K. Temporal point processes for purchase categories forecasting Procedia Computer Science, Vol. 156. P. 255-263 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.201

12. Maslyaev M., Hvatov A., Kalyuzhnaya A. Data-Driven Partial Derivative Equations Discovery with Evolutionary Approach Lecture Notes in Computer Science, Computational Science ICCS 2019. Vol. 11540. Springer, Cham. P. 635-641 (year - 2019) https://doi.org/10.1007/978-3-030-22750-0_61

13. Muravyov S., Antipov D., Buzdalova A., Filchenkov A. Efficient Computation of Fitness Function for Evolutionary Clustering Mendel, Vol. 25. N1. P. 87-94 (year - 2019) https://doi.org/10.13164/mendel.2019.1.087

14. Pershutkin A., Dukhanov A., Gladilin P. An Approach to Terrain Trafficability Evaluation Based on Neural Network for Emergency Decision Support Systems The 13th IEEE International Conference Application of Information and Communication Technologies (AICT) Conference Proceedings, P. 270-275 (year - 2019)

15. Petukhov A., Zaikin O., Bochenina K. Analysis of the geospatial activity profiles of bank customers Procedia Computer Science, Vol. 156. P. 245-254 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.200

16. Safiulin I., Butakov N., Alexandrov D., Nasonov D. Ensemble-based method of answers retrieval for domain specific questions from text-based documentation Procedia Computer Science, Vol. 156. P. 158-165 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.191

17. Shakhova M., Zagarskikh A. Dynamic Difficulty Adjustment with a simplification ability using neuroevolution Procedia Computer Science, Vol. 156. P. 395-403 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.219

18. Shchepin N., Zagarskikh A. Building behavioral AI using trust and reputation model based on mask model Procedia Computer Science, Vol. 156. P. 387-394 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.216

19. Shikov E., Bochenina K. Forecasting Purchase Categories by Transactional Data: A Comparative Study of Classification Methods Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 11538. P. 249-262 (year - 2019) https://doi.org/10.1007/978-3-030-22744-9_19

20. Sigova M., Dolbezhkin V., Koltsov A. Objective contradictions in the integration of social networks, payments services and distributed ledger technology Proceedings of the International Conference on Artificial Intelligence: Applications and Innovations (IC-AIAI 2019, Vrdnik Banja, Serbia, 30 September-4 October 2019), P. 10-14 (year - 2019) https://doi.org/10.1109/IC-AIAI.2019.00009

21. Sigova M., Klyuchnikov I., Vasilev S., Zatevakhina A. The impact of the digitization of the financial industry on the modeling and pricing of financial assets International Journal of Risk Assessment and Management, - (year - 2019)

22. Simonov A., Zagarskikh A., Fedorov V. Applying Behavior characteristics to decision-making process to create believable game AI Procedia Computer Science, Vol. 156. P. 404-413 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.222

23. Stavinova E., Bochenina K. Forecasting of foreign trips by transactional data: a comparative study Procedia Computer Science, Vol. 156. P. 225-234 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.198

24. Tomp D., Muravyov S., Filchenkov A., Parfenov V. Meta-learning Based Evolutionary Clustering Algorithm Lecture Notes in Computer Science, International Conference on Intelligent Data Engineering and Automated Learning IDEAL 2019. Vol. 11871. Springer, Cham. P. 502-513 (year - 2019) https://doi.org/10.1007/978-3-030-33607-3_54

25. Uteuov A. Topic model for online communities’ interests prediction Procedia Computer Science, Vol. 156. P. 204-213 (year - 2019) https://doi.org/10.1016/j.procs.2019.08.196

26. Vaganov D., Kalinin A., Bochenina K. On Inferring Monthly Expenses of Social Media Users: Towards Data and Approaches Studies in Computational Intelligence, Complex Networks and Their Applications VIII. COMPLEX NETWORKS 2019. Vol. 881. Springer, Cham. P. 854-865 (year - 2020) https://doi.org/10.1007/978-3-030-36687-2_71

27. Zabashta A., Filchenkov A. Active Dataset Generation for Meta-learning System Quality Improvement Lecture Notes in Computer Science, International Conference on Intelligent Data Engineering and Automated Learning IDEAL 2019. Vol. 11871. Springer, Cham. P. 394-401. (year - 2019) https://doi.org/10.1007/978-3-030-33607-3_43

28. Zaikin O., Derevitskii I., Bochenina K., Holyst J. Optimizing Spatial Accessibility of Company Branches Network with Constraints International Conference on Computational Science. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 11537. P. 332-345 (year - 2019) https://doi.org/10.1007/978-3-030-22741-8_24

29. Zaikin O., Petukhov A., Bochenina K. Bank Branch Network Optimization Based on Customers Geospatial Profiles Lecture Notes in Computer Science, On the Move to Meaningful Internet Systems: OTM 2019 Conferences. Vol. 11877. Springer, Cham. P. 201-208 (year - 2019) https://doi.org/10.1007/978-3-030-33246-4_13