Intellectual Data Mining in Socio-Geographic Research

Occupation: Leading Researcher
Affiliation: V.B. Sochava Institute of Geography, Siberian Branch of the Russian Academy of Sciences
Address: 1 Ulan-Batorskaya St., Irkutsk, 664033, Russian Federation

Journal name

Obshchestvennye nauki i sovremennost

Edition

Issue 6

Pages

150-164

Abstract

Social geography, which aims to understand the territorial organization of society, uses a variety of methods, including data mining. However, the experience of applying such methods has not yet been generalized in the world literature. The purpose of this article is therefore to analyze the global body of scientific articles on this topic in order to identify priorities, algorithms and thematic areas, together with their capabilities and limitations. Using the author's method of semantic search based on machine learning, about two hundred articles published over the last two decades were identified in eight bibliographic databases. Their generalization made it possible to identify chronological and chorological priorities and to establish that only a limited number of algorithms have been used for geospatial data mining; these can be grouped into neural network, evolutionary, decision tree, swarm intelligence and support vector methods. The algorithms were applied in five thematic areas (spatial-urban, regional-typological, area-based, geo-indicative and territorial-connective). The main features and limitations of each area are described.

Keywords

artificial neural network, genetic algorithm, swarm intelligence, random forest, support vector machine, urban spatial expansion, regional typology, socio-economic regionalization, geo-indication, spatial interaction

Acknowledgment

The work was carried out at the V.B. Sochava Institute of Geography of the Siberian Branch of the Russian Academy of Sciences with funding under the state assignment (topic registration number: AAAA-A17-117041910166-3).

Received

30.08.2021

Publication date

20.12.2021

Number of characters

26999

Cite

GOST	Blanutsa V. Intellectual Data Mining in Socio-Geographic Research // Obshchestvennye nauki i sovremennost. – 2021. – Issue 6. – P. 150-164 [Electronic resource]. URL: https://ons-journal.ru/S086904990017878-7-1 (accessed: 22.07.2024). DOI: 10.31857/S086904990017878-7
MLA	Blanutsa, Viktor "Intellectual Data Mining in Socio-Geographic Research." Obshchestvennye nauki i sovremennost 6 (2021):150-164. DOI: 10.31857/S086904990017878-7
APA	Blanutsa V. (2021). Intellectual Data Mining in Socio-Geographic Research. Obshchestvennye nauki i sovremennost (6), pp.150-164 DOI: 10.31857/S086904990017878-7

The text below is a preview version and may differ from the printed version.


Introduction

Intellectual data mining is understood here as the application of artificial intelligence algorithms to extract hidden patterns (structures) from source data. It should be borne in mind that not all artificial intelligence algorithms are capable of discovering new knowledge. Moreover, working with geospatial data, which are characterized by territorial localization, spatial autocorrelation, hierarchical organization, geographic routing and spatio-temporal transformation, further limits the possibilities of data mining [Atluri et al. 2017; Li et al. 2016; Wang, Eick 2018; Wylie et al. 2019]. Because of this, the geographical sciences have not yet formed a complete picture of which artificial intelligence algorithms can be used, to what extent and in which specific thematic areas, to extract hidden spatio-temporal structures from geodata. A first step towards solving this problem could be a generalization of world experience in intellectual data mining. To date, no such generalization has been carried out in social geography, which is aimed at understanding the territorial organization of society. For comparison, such generalizations have begun to appear in related disciplines, for example in regional economics [Blanutsa 2020].
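Spatial autocorrelation, one of the geodata properties named in the paragraph above, can be made concrete with a small sketch. The code computes global Moran's I, a standard autocorrelation statistic that is not taken from the article. The function name `morans_i`, the four-region chain and the binary adjacency weights are illustrative assumptions only.

```python
def morans_i(values, weights):
    """Global Moran's I for region values and a spatial weight matrix.

    values  -- one number per region (hypothetical indicator values)
    weights -- square matrix; weights[i][j] > 0 when regions i and j
               are treated as neighbours (an assumption of this sketch)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Total weight and the weighted cross-products of deviations.
    w_sum = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)

    if w_sum == 0 or variance_sum == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")
    return (n / w_sum) * (cross / variance_sum)


if __name__ == "__main__":
    # Four hypothetical regions arranged in a line: 0 - 1 - 2 - 3.
    chain = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    print(morans_i([1, 2, 3, 4], chain))  # neighbours similar: positive
    print(morans_i([1, 4, 1, 4], chain))  # neighbours dissimilar: negative
```

A positive value indicates that neighbouring regions tend to have similar values, which is why observations in geodata cannot be treated as independent samples. A negative value indicates a checkerboard-like alternation.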
The purpose of this study is to generalize world experience in applying intellectual data mining in socio-geographic research in order to identify priorities, algorithms and thematic areas, together with their capabilities and limitations. Achieving this goal required solving the following tasks: identifying the array (corpus) of publications that report empirical results of studying the territorial organization of society by means of intellectual data mining; determining the chronological and chorological (by country) priorities in the identified studies; compiling a list of the algorithms used and noting their strengths and weaknesses; grouping the identified publications into several thematic areas and stating their capabilities and limitations.

The understanding of the essence of intellectual data mining, and of artificial intelligence in general, has changed constantly since the middle of the last century [Haenlein, Kaplan 2019]. At present, artificial intelligence algorithms are taken to be methods that rely on machine learning [Cristianini 2014]. Machine learning was first applied in socio-geographic research in the construction of an artificial neural network (ANN) that modelled interregional telecommunication flows in Austria [Fischer, Gopal 1994]. Isolated experiments gave way to a substantial increase in the number of geographical studies in the 21st century (for example, in urban geography 2 articles on the application of ANN were published before 2001, and 138 in 2001–2016 [Grekousis 2019]). Theoretical reflection on the capabilities of machine learning progressed from the neural network paradigm of spatial analysis [Fischer 1998] to the concept of geographic artificial intelligence [Janowicz et al. 2020].

1. Adamatzky A. (Ed.) (2010) Game of Life Cellular Automata. London: Springer-Verlag.

2. Atluri G., Karpatne A., Kumar V. (2017) Spatio-Temporal Data Mining: A Survey of Problems and Methods. ACM Computing Surveys. vol. 1, no. 1, pp. 1–37 (https://doi.org/10.1145/3161602).

3. Basse R. M., Charif O., Bódis K. (2016) Spatial and Temporal Dimensions of Land Use Change in Cross-Border Region of Luxemburg. Development of a Hybrid Approach Integrating GIS, Cellular Automata and Decision-Learning Tree Models. Applied Geography. vol. 67, pp. 94–108 (https://doi.org/10.1016/j.apgeog.2015.12.001).

4. Blanutsa V.I. (2018) Social'no-ekonomicheskoe rajonirovanie v epohu bol'shih dannyh [Socio-Economic Regionalization in the Era of Big Data]. Moscow: INFRA-M.

5. Blanutsa V.I. (2020) Regional'nye ekonomicheskie issledovaniya s ispol'zovaniem algoritmov iskusstvennogo intellekta: sostoyanie i perspektivy [Regional Economic Research Using Artificial Intelligence Algorithms: State and Prospects]. Vestnik Zabajkal'skogo gosudarstvennogo universiteta. vol. 26, no. 8, pp. 100–111 (https://doi.org/10.21209/2227-9245-2020-26-8-100-111).

6. Brabyn L., Jackson N. O. (2019) A New Look at Population Change and Regional Development in Aotearoa New Zealand. New Zealand Geographer. vol. 75, pp. 116–129 (https://doi.org/10.1111/nzg.12234).

7. Breiman L. (2001) Random Forests. Machine Learning. vol. 45, no. 1, pp. 5–32 (https://doi.org/10.1023/A:1010933404324).

8. Cao M., Bennett S. J., Shen Q., Xu R. (2016) A Bat-Inspired Approach to Define Transition Rules for a Cellular Automaton Model Used to Simulate Urban Expansion. International Journal of Geographical Information Science. vol. 30, no. 10, pp. 1961–1979 (https://doi.org/10.1080/13658816.2016.1151521).

9. Carlei V., Nuccio M. (2014) Mapping Industrial Patterns in Spatial Agglomeration: A SOM Approach to Italian Industrial Districts. Pattern Recognition Letters. vol. 40, pp. 1–10 (https://doi.org/10.1016/j.patrec.2013.11.023).

10. Colantonio E., Cialfi D. (2016) Smart Regions in Italy: A Comparative Study through Self-Organizing Maps. European Journal of Business and Social Science. vol. 5, no. 9, pp. 84–99.

11. Cristianini N. (2014) On the Current Paradigm in Artificial Intelligence. AI Communications. vol. 27, no. 1, pp. 37–43 (https://doi.org/10.3233/AIC-130582).

12. De Castro L. N., Timmis J. (2002) Artificial Immune Systems: A New Computational Approach. London: Springer-Verlag.

13. Dorigo M., Di Caro G., Gambardella L. M. (1999) Ant Algorithms for Discrete Optimization. Artificial Life. vol. 5, no. 2, pp. 137–172 (https://doi.org/10.1162/106454699568728).

14. Fischer M.M., Gopal S. (1994) Artificial Neural Networks: A New Approach to Modeling Interregional Telecommunication Flows. Journal of Regional Science. vol. 34, no. 4, pp. 503–527 (https://doi.org/10.1111/j.1467-9787.1994.tb00880.x).

15. Fischer M.M. (1998) Computational Neural Networks: A New Paradigm for Spatial Analysis. Environment and Planning A: Economy and Space. vol. 30, no. 10, pp. 1873–1891 (https://doi.org/10.1068/a301873).

16. Gounaridis D., Chorianopoulos I., Symeonakis E., Koukoulas S. (2019) A Random Forest – Cellular Automata Modelling Approach to Explore Future Land Use/Cover Change in Attica (Greece), Under Different Socio-Economic Realities and Scales. Science of the Total Environment. vol. 646, pp. 320–335.

17. Grekousis G. (2019) Artificial Neural Networks and Deep Learning in Urban Geography: A Systematic Review and Meta-Analysis. Computers, Environment and Urban Systems. vol. 74, pp. 244–256 (https://doi.org/10.1016/j.compenvurbsys.2018.10.008).

18. Haenlein M., Kaplan A.A. (2019) A Brief History of Artificial Intelligence: On the Past, Present, and Future of Artificial Intelligence. California Management Review. vol. 61, no. 4, pp. 5–14 (https://doi.org/10.1177/0008125619864925).

19. Hajek P., Henriques R., Hajkova V. (2014) Visualising Components of Regional Innovation Systems Using Self-Organizing Maps – Evidence from European Regions. Technological Forecasting and Social Change. vol. 84, pp. 197–214 (https://doi.org/10.1016/j.techfore.2013.07.013).

20. He Y., Ai B., Yao Y., Zhong F. (2015) Deriving Urban Dynamics Evolution Rules from Self-Adaptive Cellular Automata with Multi-Temporal Remote Sensing Images. International Journal of Applied Earth Observation and Geoinformation. vol. 38, pp. 164–174 (https://doi.org/10.1016/j.jag.2014.12.014).

21. Henriques R., Bacao F., Lobo V. (2012) Exploratory Geospatial Data Analysis Using the GeoSOM Suite. Computers, Environment and Urban Systems. vol. 36, no. 3, pp. 218–232 (https://doi.org/10.1016/j.compenvurbsys.2011.11.003).

22. Janowicz K., Gao S., McKenzie G., Hu Y., Bhaduri B. (2020) GeoAI: Spatially Explicit Artificial Intelligence Techniques for Geographic Knowledge Discovery and Beyond. International Journal of Geographical Information Science. vol. 34, no. 4, pp. 625–636 (https://doi.org/10.1080/13658816.2019.1684500).

23. Karimi F., Sultana S., Bakakan A. S., Suthaharan S. (2019) An Enhanced Support Vector Machine Model for Urban Expansion Prediction. Computers, Environment and Urban Systems. vol. 75, pp. 61–75 (https://doi.org/10.1016/j.compenvurbsys.2019.01.001).

24. Kohonen T. (2001) Self-Organizing Maps. 3rd ed. Berlin, Heidelberg: Springer-Verlag.

25. LeCun Y., Boser B., Denker J. S., Henderson D., Howard R. E., Hubbard W., Jackel L. D. (1989) Backpropagation Applied to Handwritten Zip Code Recognition. Neural Computation. vol. 1, no. 4, pp. 541–551.

26. Li D., Wang S., Yuan H., Li D. (2016) Software and Applications of Spatial Data Mining. WIREs: Data Mining and Knowledge Discovery. vol. 6, no. 3, pp. 84–114 (https://doi.org/10.1002/widm.1180).

27. Liu D., Tang W., Liu Y., Zhao X., He J. (2017) Optimal Rural Land Use Allocation in Central China: Linking the Effect of Spatiotemporal Patterns and Policy Interventions. Applied Geography. vol. 86, pp. 165–182 (https://doi.org/10.1016/j.apgeog.2017.05.012).

28. Liu X., Ou J., Li X., Ai B. (2013) Combining System Dynamics and Hybrid Particle Swarm Optimization for Land Use Allocation. Ecological Modelling. vol. 257, no. 5, pp. 11–24 (https://doi.org/10.1016/j.ecolmodel.2013.02.027).

29. Liu Y., Feng Y., Pontius R. G. (2014) Spatially-Explicit Simulation of Urban Growth through Self-Adaptive Genetic Algorithm and Cellular Automata Modelling. Land. vol. 3, no. 3, pp. 719–738 (https://doi.org/10.3390/land3030719).

30. Liu Y. L., Tang D. W., Kong X., Liu Y. F., Ai T. (2014) A Land-Use Spatial Allocation Model Based on Modified Ant Colony Optimization. International Journal of Environmental Research. vol. 8, no. 4, pp. 1115–1126 (https://doi.org/10.22059/IJER.2014.805).

31. López-Iturriaga F. J., Sanz I. P. (2018) Predicting Public Corruption with Neural Networks: An Analysis of Spanish Provinces. Social Indicators Research. vol. 140, pp. 975–998 (https://doi.org/10.1007/s11205-017-1802-2).

32. Lu Y., Laffan S., Pettit C., Cao M. (2020) Land Use Change Simulation and Analysis Using a Vector Cellular Automata (CA) Model: A Case Study of Ipswich City, Queensland, Australia. Environment and Planning B: Urban Analysis and City Science. vol. 47, no. 9, pp. 1605–1621 (https://doi.org/10.1177/2399808319830971).

33. Ma X., Zhao X. (2015) Land Use Allocation Based on a Multi-Objective Artificial Immune Optimization Model: An Application in Anlu County, China. Sustainability. vol. 7, no. 11, pp. 15632–15651 (https://doi.org/10.3390/su71115632).

34. Mitchell M. (1996) An Introduction to Genetic Algorithms. Cambridge, MA: MIT Press.

35. Naghibi F., Delavar M. R., Pijanowski B. (2016) Urban Growth Modeling Using Cellular Automata with Multi-Temporal Remote Sensing Images Calibrated by the Artificial Bee Colony Optimization Algorithm. Sensors. vol. 16, no. 12, e2122 (https://doi.org/10.3390/s16122122).

36. Nijkamp P., Reggiani A., Tsang W. F. (2004) Comparative Modelling of Interregional Transport Flows: Applications to Multimodal European Freight Transport. European Journal of Operational Research. vol. 155, no. 3, pp. 584–602 (https://doi.org/10.1016/j.ejor.2003.08.007).

37. Poletaeva N.G. (2020) Klassifikaciya sistem mashinnogo obucheniya [Classification of Machine Learning Systems]. Vestnik Baltijskogo federal'nogo universiteta im. I. Kanta. Seriya: Fiziko-matematicheskie i tekhnicheskie nauki. no. 1, pp. 5–22.

38. Psyllidis A., Yang J., Bozzon A. (2018) Regionalization of Social Interactions and Points-Of-Interest Location Prediction with Geosocial Data. IEEE Access. vol. 6, pp. 34334–34353 (https://doi.org/10.1109/ACCESS.2018.2850062).

39. Qian Y., Xing W., Guan X., Yang T., Wu H. (2020) Coupling Cellular Automata with Area Partitioning and Spatiotemporal Convolution for Dynamic Land Use Change Simulation. Science of the Total Environment. vol. 722, e137738 (https://doi.org/10.1016/j.scitotenv.2020.137738).

40. Qiu R., Xu W., Zhang J., Staenz K. (2018) Modelling and Simulating Urban Residential Land Development in Jiading New City, Shanghai. Applied Spatial Analysis and Policy. vol. 11, pp. 753–777 (https://doi.org/10.1007/s12061-017-9244-4).

41. Sharygin M.D., Stolbov V.A. (2020) Teoretiko-metodologicheskie aspekty poiska zakonov i zakonomernostej v obshchestvennoj geografii [Theoretical and Methodological Aspects of the Search for Laws and Regularities in Public Geography]. Geograficheskij vestnik. no. 1, pp. 22–32 (https://doi.org/10.17072/2079-7877-2020-1-22-32).

42. Su S., Sun Y., Lei C., Weng M., Cai Z. (2017) Reorienting Paradoxical Land Use Policies Towards Coherence: A Self-Adaptive Ensemble Learning Geo-Simulation of Tea Expansion under Different Scenarios in Subtropical China. Land Use Policy. vol. 67, pp. 415–425 (https://doi.org/10.1016/j.landusepol.2017.06.011).

43. Triantakonstantis D., Mountrakis G. (2012) Urban Growth Prediction: A Review of Computational Models and Human Perceptions. Journal of Geographic Information System. vol. 4, pp. 555–587 (https://doi.org/10.4236/jgis.2012.46060).

44. Vapnik V. N. (1998) Statistical Learning Theory. New York: John Wiley and Sons.

45. Wang S., Eick C. F. (2018) A Data Mining Framework for Environmental and Geospatial Data Analysis. International Journal of Data Science and Analytics. vol. 5, pp. 83–98 (https://doi.org/10.1007/s41060-017-0075-9).

46. Wang W., Jiao L., Zhang W., Jia Q., Su F., Xu G., Ma S. (2020) Delineating Urban Growth Boundaries under Multi-Objective and Constraints. Sustainable Cities and Society. vol. 61, pp. 1–12 (https://doi.org/10.1016/j.scs.2020.102279).

47. Wu P., Tan Y. (2019) Estimation of Poverty Based on Remote Sensing Image and Convolutional Neural Network. Advances in Remote Sensing. vol. 8, no. 4, pp. 89–98 (https://doi.org/10.4236/ars.2019.84006).

48. Wylie B. K., Pastick N. J., Picotte J. J., Deering C. A. (2019) Geospatial Data Mining for Digital Raster Mapping. GIScience and Remote Sensing. vol. 56, no. 3, pp. 406–429 (https://doi.org/10.1080/15481603.2018.1517445).

49. Yan J., Thill J.-C. (2009) Visual Data Mining in Spatial Interaction Analysis with Self-Organizing Maps. Environment and Planning B: Planning and Design. vol. 36, no. 3, pp. 466–486 (https://doi.org/10.1068/b34019).

50. Yao J., Mitran T., Kong X., Lal R., Chu Q., Shaukat M. (2020) Land Use and Land Cover Identification and Disaggregating Socio-Economic Data with Convolutional Neural Network. Geocarto International. vol. 35, no. 10, pp. 1109–1123 (https://doi.org/10.1080/10106049.2019.1568587).