This chapter gives a brief overview of the prediction methods used and how they are constructed. A database is then built, and on top of it a new machine learning model for predicting track vibration in metro transportation is constructed. Combining machine learning algorithms with a database model makes the new model better suited to the analysis and prediction of metro rail data.

### Database theory and machine learning algorithm theory research

In the broad sense, a database refers to the entire database system, which includes the system services, program services, database administrators, and computer applications. A database is generally designed from a logical language and a set of information parameters, and the design language is usually SQL. Fig. 1 shows the basic system construction model of a database^{16}.

The database management system in Fig. 1 comprises the data collections in the database, the software that manages the database, and the related programs, all of which are administered by the data administrator; user information and parameters are stored by the computer. The database manages and stores data by maintaining and controlling it. Building the database starts with a requirements analysis that identifies the target data and the information and entities to be stored. Predicting vibration in the subway traffic environment requires storing subway traffic and environmental vibration data, so the database mainly serves vibration measurement data and numerical calculation data^{17}. The data structure covers several kinds of stored information: construction information, data detection information, vibration information, model building information, and so on. Construction information collects the structure, soil, level, distance, and building information for the subway track; vibration information and modeling information collect the data actually gathered on site. Database design usually relies on the E–R (entity–relationship) model, a conceptual model for describing real-world data. It is shown in Fig. 2.
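
As a minimal sketch of how the storage categories above might map onto SQL tables, the following uses Python's built-in `sqlite3` module. The table names, column names, and inserted values are hypothetical and chosen only for illustration; the source does not specify the actual schema.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the schema used in the study.
SCHEMA = """
CREATE TABLE construction_info (
    site_id      INTEGER PRIMARY KEY,
    tunnel_type  TEXT,   -- tunnel structure
    soil_type    TEXT,   -- soil conditions
    depth_m      REAL,   -- construction depth
    distance_m   REAL    -- construction distance
);
CREATE TABLE vibration_record (
    record_id    INTEGER PRIMARY KEY,
    site_id      INTEGER NOT NULL REFERENCES construction_info(site_id),
    origin       TEXT CHECK (origin IN ('measured', 'simulated')),
    frequency_hz REAL,
    level_db     REAL
);
"""


def create_database(path=":memory:"):
    """Create the example database and return an open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


conn = create_database()
# Placeholder rows, not data from the study.
conn.execute(
    "INSERT INTO construction_info VALUES (1, 'shield', 'soft clay', 15.0, 30.0)"
)
conn.execute(
    "INSERT INTO vibration_record VALUES (1, 1, 'measured', 63.0, 72.5)"
)

# Join each vibration record to the construction conditions of its site.
rows = conn.execute(
    """SELECT c.soil_type, v.frequency_hz, v.level_db
       FROM vibration_record AS v
       JOIN construction_info AS c ON c.site_id = v.site_id"""
).fetchall()
print(rows)
```

The foreign key from `vibration_record` to `construction_info` is one way to express the E–R relationship between construction conditions and vibration data.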

As shown in Fig. 2, the overall construction conditions summarize the tunnel structure and soil conditions, together with the construction distance and construction depth. The basic simulation model and the vibration conditions are also grouped under the construction conditions: the vibration conditions specify the vibration frequency, spectrum, and evaluation indexes, while experimental testing specifies the target, location, instrumentation, time, and so on. The model entity records the software used to build the model, the build time, and the model name.

In experiments with the subway traffic environment model, administrators use the database for data management, data retrieval, vibration state prediction, and related analysis. Data management mainly covers querying and retrieving the measured and numerical data on environmental vibration in the subway traffic environment. Vibration prediction applies machine learning algorithms to the environmental vibration data so that working conditions can be simulated, trained on, and predicted. Data visualization uses charts and graphs to analyze and display the data^{18}. Fig. 3 introduces the functions of the database.

As shown in Fig. 3, the database system consists mainly of three modules: data management, vibration model prediction and analysis, and data visualization and analysis. Together these modules realize the functions of the database, so that learning algorithms can then be used to calculate and analyze the data.

Machine learning is a method by which computers learn from data rather than following explicitly programmed instructions. Traditional machine learning algorithms include decision trees, clustering, Gaussian methods, and support vector machines. Deep learning is a subfield of machine learning. To analyze vibration in the subway traffic environment, the factors that affect the vibration response, such as train speed, depth of the vibration source, horizontal distance, shear wave velocity, and damping ratio, are taken as the sample features, and the vibration response parameters are taken as the dependent variables. The parameter analysis is therefore carried out with deep learning algorithms. A neural network consists of a large number of neurons that transmit and process information, and training strengthens the connections between these neurons so that the network responds more strongly to specific information. This supports better data analysis and prediction of vibration in the metro traffic environment, which suggests that neural network models are a better method for model prediction and analysis. The neural network model is mainly divided into an input layer, a hidden layer, and an output layer.
The formula for the input layer is shown in Eq. (1)^{19}.

In Eq. (1), \(x_i\) denotes the value passed to the next layer and \(a_i\) denotes the input data. The hidden layer is calculated as shown in Eq. (2).

$$y_j = \sum\limits_{i = 1}^{M} w_{ij} x_i$$

(2)

In Eq. (2), \(y_j\) denotes the value received by hidden neuron \(j\) and \(w_{ij}\) denotes the weights between the input layer and the hidden layer. The formula for the output layer is shown in Eq. (3).

$$o_k = \sum\limits_{j = 1}^{o} w_{kj} s_j$$

(3)

In Eq. (3), \(w_{kj}\) denotes the weights between the hidden layer and the output layer, and \(s_j\) denotes the output of the hidden layer after the activation function, where \(s_j = f(y_j)\). The number of samples is \(m\), the number of neurons in the neural network is \(M\), and the number of hidden layers is \(Q\)^{20}. The basic model of the machine learning algorithm obtained from these calculations is shown in Fig. 4.
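
The layer equations above can be sketched as a forward pass in plain Python. The sigmoid activation, the input layer passing its data on unchanged, and the function name `forward` are assumptions made for illustration; the source does not name the activation function \(f\).

```python
import math


def sigmoid(y):
    """One common choice for the activation f (an assumption, see lead-in)."""
    return 1.0 / (1.0 + math.exp(-y))


def forward(a, w_hidden, w_out, f=sigmoid):
    """Forward pass of a network with one hidden layer.

    a        : list of input values a_i
    w_hidden : w_hidden[j][i] is the weight w_ij from input i to hidden unit j
    w_out    : w_out[k][j] is the weight w_kj from hidden unit j to output k
    """
    # Input layer: assumed to pass the data on unchanged.
    x = list(a)
    # Hidden layer, Eq. (2): weighted sum of the inputs for each hidden unit.
    y = [sum(w * xi for w, xi in zip(row, x)) for row in w_hidden]
    # Activation: s_j = f(y_j).
    s = [f(yj) for yj in y]
    # Output layer, Eq. (3): weighted sum of the hidden outputs.
    return [sum(w * sj for w, sj in zip(row, s)) for row in w_out]


# Two inputs, two hidden units, one output; the weights are placeholders.
outputs = forward([1.0, 2.0], [[0.5, 0.5], [1.0, -1.0]], [[1.0, 2.0]])
print(outputs)
```

Passing `f=lambda y: y` turns the network into a purely linear map, which is convenient for checking the weighted sums by hand.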

In Fig. 4, the machine learning algorithm is visualized as layers of neurons. The input layer receives the data and passes it to the next layer; the hidden layer processes the data through computation and analysis and passes the result on to the output layer; the output layer takes the processed data and produces the final model output. Once the machine learning algorithm has been constructed, some necessary quantities must be calculated, such as the governing equation of the rail, the basic unit of the subway track. This is shown in Eq. (4).

$$E_r^* I_r \frac{\partial^4 u_r}{\partial x^4} + m\frac{\partial^2 u_r}{\partial t^2} = F e^{i w_0 t} \delta (x - \overline{x}_0 - vt)$$

(4)

In Eq. (4), \(E_r^*\) represents the elastic modulus of the rail with material damping included, \(v\) represents the speed of the moving load, \(w_0\) represents the angular frequency of the load, \(u_r\) represents the vertical displacement of the rail, \(m\) represents the mass of the rail per unit length, \(\overline{x}_0\) represents the position of the load at the initial moment of loading, and \(x\) represents the coordinate along the track. The damped elastic modulus is given in Eq. (5).

$$E^* = E(1 + 2i\beta )$$

(5)

In Eq. (5), \(\beta\) denotes the damping ratio of the medium, \(E\) denotes the modulus of elasticity, and \(E^*\) denotes the modulus of elasticity after accounting for the damping of the medium. The corresponding shear modulus is shown in Eq. (6)^{21}.

$$G^* = G(1 + 2i\beta )$$

(6)

In Eq. (6), \(G\) denotes the shear modulus and \(G^*\) denotes the shear modulus after accounting for the damping of the medium. These two formulas allow the track operation data in the subway environment to be calculated, yielding the prediction data and experimental data that the machine learning algorithm requires. Fig. 5 shows the flow chart of the basic operation structure of the machine learning algorithm^{22}.
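
The complex moduli in Eqs. (5) and (6) can be evaluated directly with Python's built-in complex numbers. This is a small sketch; the function name and the numerical values are placeholders rather than parameters from the study.

```python
def complex_modulus(modulus, beta):
    """Return modulus * (1 + 2i * beta), the form shared by Eqs. (5) and (6)."""
    return modulus * complex(1.0, 2.0 * beta)


# Placeholder values: shear modulus in Pa and a 5 % damping ratio.
G = 8.0e7
beta = 0.05
G_star = complex_modulus(G, beta)

# The real part is the undamped modulus; the imaginary part carries the damping.
print(G_star.real, G_star.imag)
```

The ratio of the imaginary part to the real part equals \(2\beta\), so the damping ratio can be recovered from a complex modulus as `G_star.imag / (2 * G_star.real)`.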

As shown in Fig. 5, the machine learning workflow on the subway track data begins with initialization and data processing: the prepared data set is split into a training set and a test set. A suitable machine learning algorithm is then selected for the data at hand and trained on the training set. The split also determines the size and features of the training data. Finally, the model is checked to decide whether further analysis and calculation on the data set are needed before the algorithm model is output^{23}.
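
The splitting step in this workflow can be sketched with the standard library alone. The 80/20 ratio and the fixed random seed are assumptions for illustration; the source does not state the proportions it used.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Ten placeholder sample identifiers standing in for vibration records.
train_set, test_set = train_test_split(list(range(10)))
print(len(train_set), len(test_set))
```

Shuffling before the split avoids a test set made up only of the most recently stored records.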

### Construction of environmental vibration prediction for metro transport based on machine learning and SQL database

Using machine learning algorithms and database technology to predict the subway transportation environment can improve the accuracy of the predictions and reduce their cost, by predicting from data on the construction scheme, drawings, operation period, and scheme feasibility. Predictive modeling applications built on databases are generally programmed in Python. Saved as a .py file, the model code forms a single Python module that a different application can load with an import statement. Saving the algorithmic model and then calling it from the program that handles new data ensures that the current data can be used by the model, which in turn enables data to flow between the model predictions and the database.
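
The idea of saving a model as a .py module and loading it from another program can be sketched as follows. The module name `vibration_model.py`, its `predict` function, and the toy linear coefficients are hypothetical; a temporary directory stands in for wherever the real model file would be stored.

```python
import importlib.util
import tempfile
from pathlib import Path

# Source code of a hypothetical saved model module.
MODULE_SOURCE = '''
WEIGHTS = [0.4, -0.1]   # placeholder coefficients, not fitted values
BIAS = 60.0


def predict(features):
    """Return a vibration level for one feature vector (toy linear model)."""
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
'''


def load_model_module(path, name="vibration_model"):
    """Import a .py file from an explicit path and return the module object."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "vibration_model.py"
    model_path.write_text(MODULE_SOURCE)
    model = load_model_module(model_path)

# The loaded module stays usable after the file has been cleaned up.
prediction = model.predict([10.0, 20.0])
print(prediction)
```

When the module sits on the import path, a plain `import vibration_model` has the same effect; loading by explicit path is used here only to keep the sketch self-contained.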

The algorithmic model and the subway vibration SQL database work together as follows. First, a data prediction request is sent when the parameters of the subway traffic environment are queried. The SQL database retrieves the requested data through its query engine and transmits it to the server. On receipt, the view tool of the Python program converts the data into a data module and passes it to the algorithmic prediction model. The program then receives the resulting prediction parameters and, finally, converts them for display in the front-end interface, giving the final prediction results. This completes the construction of the database-backed algorithmic prediction model.
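
The request flow described above (query, convert, predict, return for display) can be sketched end to end. The table `site_params`, its columns, the function names, and the coefficients in `toy_predict` are all hypothetical stand-ins; the real system would call the trained model instead.

```python
import json
import sqlite3


def toy_predict(features):
    """Stand-in for the trained model; the coefficients are placeholders."""
    speed_kmh, depth_m, distance_m = features
    return 80.0 + 0.1 * speed_kmh - 0.2 * depth_m - 0.3 * distance_m


def handle_prediction_request(db, site_id, predict=toy_predict):
    """Query one site's parameters, run the model, and return a JSON string."""
    row = db.execute(
        "SELECT speed_kmh, depth_m, distance_m FROM site_params WHERE site_id = ?",
        (site_id,),
    ).fetchone()
    if row is None:
        raise KeyError(f"no parameters stored for site {site_id}")
    # Convert the database row into the feature list the model expects.
    level = predict(list(row))
    # JSON is one convenient format for handing results to a front end.
    return json.dumps({"site_id": site_id, "predicted_level_db": round(level, 2)})


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE site_params ("
    "site_id INTEGER PRIMARY KEY, speed_kmh REAL, depth_m REAL, distance_m REAL)"
)
db.execute("INSERT INTO site_params VALUES (1, 60.0, 15.0, 30.0)")  # placeholder row
response = handle_prediction_request(db, 1)
print(response)
```

Passing the predictor as a parameter keeps the database code independent of any particular model, so a retrained model can be swapped in without touching the query logic.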

Vibration modeling uses formulas and measurements of vibration targets along the construction line to simulate the vibration velocity, frequency, and magnitude at those targets. However, the prediction results must be interpreted with reference to the actual subway construction environment^{24}. Traditional subway vibration prediction relies on empirical prediction methods, experimental prediction methods, and other methods. The empirical prediction method is the more widely used; it often predicts vibration at a target in the subway environment with a fitted formula, as shown in Eq. (7).

$$L_a(\text{room}) = L_t(\text{tunnel wall}) - C_g - C_{gb} - C_b$$

(7)

In Eq. (7), \(L_a(\text{room})\) denotes the vibration acceleration level in a room of the building, and \(L_t(\text{tunnel wall})\) denotes the vibration acceleration level at the tunnel wall. \(C_g\), \(C_{gb}\), and \(C_b\) denote the vibration attenuation in the ground, at the transition between the ground and the building, and within the building, respectively. Eq. (8) gives the prediction formula for vibration-induced noise as a sound pressure level.

$$L_B = L_r + R_{tr} + R_{tu} + R_g + R_b$$

(8)

In Eq. (8), \(L_B\) denotes the predicted sound pressure level in the subway environment, \(L_r\) denotes the velocity level of the subway track, \(R_{tr}\) denotes the vibration energy lost in the subway track, \(R_{tu}\) denotes the reduction of the vibration energy transmitted through the subway tunnel, \(R_g\) denotes the energy lost as the vibration propagates through the ground, and \(R_b\) denotes the loss of vibration energy in the inner wall of the subway tunnel. The vibration prediction formula for a subway passing through a soft soil layer is shown in Eq. (9)^{25}.

$$V = F_v F_R F_B = [V_T F_S F_D ]F_R F_B$$

(9)

In Eq. (9), \(F_v\) represents the vibration function in the subway tunnel, \(F_R\) represents the factor for the quality of the subway track, \(F_B\) represents the amplification factor of the building, \(V_T\) represents the factor for the type of train passing through the subway, \(F_S\) represents the factor for the speed of the train, and \(F_D\) represents the factor for the distance of the train. Vibration prediction data can also be obtained by hammer impact testing, as shown in Eq. (10)^{26}.

$$L_v = L_F + TM_{line} + C_{building}$$

(10)

In Eq. (10), \(L_v\) denotes the predicted vibration velocity level, \(L_F\) denotes the force density level of the vibration source, \(TM_{line}\) denotes the line transfer mobility of the train vibration, and \(C_{building}\) denotes the correction for the vibration transmitted from the ground into the building. The formula for the environmental vibration impact during train operation is shown in Eq. (11).

$$VL_{Z\max} = VL_{Z0\max} + C_{VB}$$

(11)

In Eq. (11), \(VL_{Z\max}\) denotes the predicted maximum vibration level, \(VL_{Z0\max}\) denotes the vibration source intensity, and \(C_{VB}\) denotes the vibration correction term. The correction term \(C_{VB}\) is expanded in Eq. (12).

$$C_{VB} = C_V + C_W + C_R + C_T + C_D + C_B + C_{TD}$$

(12)

In Eq. (12), \(C_V\) represents the correction for train speed, \(C_W\) represents the correction for the axle load and unsprung mass of the train, \(C_R\) represents the correction for track condition, \(C_T\) represents the correction for track type, \(C_D\) represents the correction for distance attenuation, \(C_B\) represents the correction for the building, and \(C_{TD}\) represents the correction for traffic density. The vibration formula differs from region to region; Eq. (13) gives the form used in another region.
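
As a worked example of Eqs. (11) and (12), the correction terms can be collected in a small data structure and summed. The field names follow the term definitions given in the text; the class name, function name, and numerical values are illustrative assumptions, not values from any standard or measurement.

```python
from dataclasses import dataclass, fields


@dataclass
class Corrections:
    """Correction terms of Eq. (12), all in dB; unset terms default to zero."""
    C_V: float = 0.0    # train speed
    C_W: float = 0.0    # axle load and unsprung mass
    C_R: float = 0.0    # track condition
    C_T: float = 0.0    # track type
    C_D: float = 0.0    # distance attenuation
    C_B: float = 0.0    # building
    C_TD: float = 0.0   # traffic density

    def total(self):
        """Eq. (12): C_VB is the sum of all correction terms."""
        return sum(getattr(self, f.name) for f in fields(self))


def predict_max_level(source_level_db, corrections):
    """Eq. (11): predicted maximum level = source intensity + C_VB."""
    return source_level_db + corrections.total()


# Placeholder inputs: an 80 dB source with three non-zero corrections.
example = Corrections(C_V=-2.0, C_D=-5.0, C_B=1.5)
print(predict_max_level(80.0, example))
```

With these placeholder numbers the corrections sum to -5.5 dB, so the predicted maximum level is 74.5 dB.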

$$VL_{Z\max} = VL_{Z\max,0} + C$$

(13)

In Eq. (13), \(VL_{Z\max,0}\) denotes the maximum measured source vibration level of a train passing through the tunnel, and \(C\) denotes the vibration correction value.

$$C = C_v + C_g + C_l + C_b + C_L + C_B$$

(14)

Equation (14) gives the expression for \(C\). In Eq. (14), \(C_v\) represents the correction for the train's traveling speed, \(C_g\) represents the correction for the mass of the train's bearings, \(C_l\) represents the correction for the curve the train travels on, \(C_b\) represents the correction for the train's track, \(C_L\) represents the correction for the train's traveling distance, and \(C_B\) represents the correction for the track's construction. Because the empirical method only requires evaluating a fitted formula, prediction is fast and inexpensive, so using it during the construction phase of a subway keeps project costs low. Its main drawback is a lack of accuracy. Using a neural network machine learning method instead can greatly reduce the construction cost while improving the accuracy of prediction.

Test prediction assesses vibration at the actual targets in the field: field data are measured and then used for prediction. In practice, however, because the subway has not yet been completed, most of the test data are simulation results. The hybrid prediction method couples different simulation and experimental methods, chosen according to the required accuracy and the multi-parameter system, so that the experimental prediction method becomes a hybrid one. It can usually handle complex parameter uncertainty and the systematic errors that arise when data from multiple sources are mixed. Fig. 6 shows the prediction and evaluation process for the subway construction environment.

Figure 6 shows that, at the feasibility stage, the subway line and construction phases are analyzed for feasibility, and the empirical prediction method is applied to the subway environment model to predict its parameters, giving a preliminary assessment of the environment along the full line. After this feasibility analysis, field testing is used to analyze the sensitivity of the environment on actual data, giving local predictions of the subway environmental parameters. In the construction design stage, field testing and in-tunnel measurements are used to judge whether the sensitivity meets the construction requirements. Finally, an overall evaluation conclusion for the construction process is reached. In the prediction model combining machine learning and database technology, the selected characteristic parameters include train speed, vibration source depth, distance, damping ratio, density, Poisson’s ratio, shear wave speed, and other influencing factors. Multiple neurons therefore need to be designed when building the data for the model. The number of neurons determines the number of parameters selected: the more parameters, the more comprehensive the experimental data and the more accurate the overall prediction results.
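
The link between the selected characteristic parameters and the number of input neurons can be sketched as a fixed feature ordering. The key names, units, and sample values below are hypothetical; only the list of parameters comes from the text.

```python
# One input neuron per characteristic parameter, in a fixed order.
FEATURE_ORDER = (
    "train_speed",        # km/h
    "source_depth",       # m
    "distance",           # m
    "damping_ratio",      # dimensionless
    "density",            # kg/m^3
    "poisson_ratio",      # dimensionless
    "shear_wave_speed",   # m/s
)
N_INPUT_NEURONS = len(FEATURE_ORDER)


def to_feature_vector(sample):
    """Turn a dict of named parameters into an ordered list of floats."""
    missing = [name for name in FEATURE_ORDER if name not in sample]
    if missing:
        raise KeyError(f"sample is missing features: {missing}")
    return [float(sample[name]) for name in FEATURE_ORDER]


# Placeholder sample, not a measured working condition.
sample = {
    "train_speed": 60,
    "source_depth": 15,
    "distance": 30,
    "damping_ratio": 0.05,
    "density": 1900,
    "poisson_ratio": 0.3,
    "shear_wave_speed": 250,
}
vector = to_feature_vector(sample)
print(N_INPUT_NEURONS, vector)
```

Fixing the order in one place guards against features being fed to the wrong input neuron when records come from different database tables.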
