2D Simplified Wildfire Spreading Model in Python: From NumPy to CuPy

Wildfires are a problem of great interest in the world because of the damage they cause every year to forest, wild fauna and flora, an also threatening human lives, among others. There exist some computational models for numerical simulations to address this phenomena, and a few of them have an open-source implementation. The goal of this work is to extend a CPU Python ’s implementation of wildfire open-source framework developed in NumPy , to a GPU improved version using CuPy . The algorithm used is based on a numerical discretization of a wildfire spreading mathematical model described by a system of partial differential equations. Computational and mathematical components, numerical simulations and applications are described in details in the document. In addition, this work includes a brief performance comparison between both implementations, pointing out that we can achieve lower execution times using the CuPy GPU implementation without spending enough programming effort.


Introduction
Wildfires are ubiquitous phenomena of global interest since they may occur in almost every zone of the world [1], and have the potential to damage large extensions of land [2]. Fire can start for different causes, including natural causes such as lightning, spontaneous combustion, and volcanic eruption, but about 75% of worldwide fires are caused by humans, either deliberately or by negligence [3,4]. Due to the devastating effects of this type of disaster and their dynamics behavior, it is crucial to have open-source implementations as an active tools for analysis of wildfires, specifically, based on fire models.
According to [5] fire models can be grouped into three categories: risk assessment, propagation and effect models. Risk assessment is associated to quantification of possible fire episodes. The second category study the propagation of fire, starting from physical laws up to regression models. Finally, effect models is a high level analysis, including fuel management planning, tree mortality up to analysis of ecosystems, among others. These categories are not mutually exclusive. Propagation models can be sub-classified into three types: empirical, physical and semi-empirical. The empirical models use a statistical approach to describe the fire behavior. Physical models are based exclusively in physical laws. And finally, semi-empirical models combine both empirical and physical approaches.
The goal of this work is extend a CPU implementation developed in Python using Numpy and described in [6], to a GPU implementation prototype exploring the use of CuPy.
During the last decade, two of the most used approaches in Wildfire Modeling are Partial Differential Equations (PDE) and Cellular Automata (CA) based models. The first class of models is based on physical laws, describing physical and chemical processes using differential equations [7]. In the second kind of models, the fire dynamic is studied on a landscape grid following transition rules. These rules determine the state of each cell and they are based on theoretical and semi-empirical mathematical fire behavior models [7]. Initial efforts of PDE -based modeling for fire spreading were the work of Albini [8] and Weber [9]. In the area of coupled atmosphere-fire models are the work of Grishin [10], Linn with FIRETEC [11,12], and McGrattan with FDS applied to wildland fires [13,14], among others. Regarding empirical models we can mention FARSITE [15] or ELMFire [16], but as well as coupled models, they are outside the scope of this article. For more details of forest fire models see [7] or [17].
This article presents a PDE -based spreading model which uses a similar mathematical formulation as presented in [18], [19], and [20]. Asensio & Ferragut [18] presented a 2-dimensional model which describes the dynamic of temperature and solid fuel. They solved the equations using a Finite Element Method (FEM ) for the spatial discretization and a semi-implicit Euler scheme for time integration. This work presented some numerical results qualitatively consistent with fires under controlled conditions. Mandel et al. [19] derived a similar model than Asensio & Ferragut, based on the conservation of energy equations and the balance of fuel supply. They used Finite Difference Method (FDM ) for the spatial derivatives and Explicit Euler Method for time integration. As an active fire simulation tool, this work introduces data assimilation techniques to correct fire dynamics using real measurements. Finally, Eberle et al. [20] also derived a similar model to the previous ones, using the conservation of mass, energy and fuel supply balance equation. The equations were solved using a Radial Basis Function (RBF ) Collocation Method for the spatial discretization, and a Crank-Nicolson scheme for time integration. This work shows the ability of the model to simulate fire for 2 different fuel types and how the propagation speed varies according to the parameters used.
A selection of related work using CA and PDE approaches and their code availability is presented in Table 1.

Wildfire mathematical model
The mathematical model used for this work is based on the proposed by [18] with the simplification introduced by [20]. Assuming a spatial domain x = (x, y) ∈ Ω = [x min , x max ] × [y min , y max ] ⊂ R 2 with boundary Γ = ∂Ω, and a temporal domain t ∈ [0, t max ] ⊂ R + 0 . Let u(x, t) be the numerical value of the temperature and β(x, t) the amount of fuel, both at spatial coordinate x and at time t. The vector field that models the effect of wind w(t) and topography Z(x) is v(x, t) = w(t) + ∇Z(x), and f (u, β) is a nonlinear heat source that describes the interaction between fuel and temperature. The simplified mathematical model of wildfire spreading, is defined as where, Since the model is defined in a 2-dimensional domain the gradient is defined as ∇ = ∂ ∂x , ∂ ∂y and the Laplace operator ∆ = ∂ 2 ∂x 2 + ∂ 2 ∂y 2 . The nondimensional parameters of the model are diffusion coefficient κ, the inverse of activation energy of fuel ε, natural convection α, reaction heat q, and the phase change threshold u pc .
An assumption used in this work is that the spatial domain is large enough to avoid fire spreading at the boundary of Ω, then the following Dirichlet boundary conditions are used: . Nevertheless, the model can be used with Neumann boundary conditions. The components used by the model are presented in Figure 1.

Numerical Algorithm
Numerical implementation of (1) requires the approximation of spatial and time partial derivatives involved in the problem formulation. To address these approximations, this work uses the Method of Lines, technique for numerical solving of PDEs in which all but one dimension is discretized. The following subsections will describe the methods used to approximate the partial derivatives indicated previously.

Method of Lines
Method of Lines (MOL) is an important approach for the numerical solving of PDEs. This method consists of two parts, space discretization and time integration [31]. The space discretization approximates the PDE by a system of Ordinary Differential Equations (ODEs) by discretizing the space domain using finite difference, finite elements, spectral methods, among others. Here, time is the independent variable of the ODE system. In the second step, the system is solved using some numerical method for time integration, discretizing the time domain to get a fully discrete scheme. There are several works, such as [32,33,31,34], discussing the stability and convergence of the method for some particular problems. This work uses MOL as a mathematical framework to solve (1) numerically.

Spatial Approximation
In the mathematical model used for this work, temperature dynamics needs the computation of ∆u and ∇u. To deal with these space partial derivatives, a discretization on a rectangular grid of N x + 1 by N y + 1 nodes is established. Let x i and y j the discrete version of the spatial domain, defined as where ∆x = (x max − x min )/N x and ∆y = (y max − y min )/N y . Now, temperature component is defined as u i,j (t) = u(x i , y j , t) and stored in the matrix U (t) described by Following this ordering, fuel component β i, and the vector field v( is stored on the grid by the matrices V 1 (t), V 2 (t) defined as follows Analogous for functions f (x i , y j , t) = f i,j (t) and g(x i , y j , t) = g i,j (t), Now, partial derivatives will be computed using the Finite Differences Method by means of computing the product between a Differentiation Matrices and the corresponding discretized variable. Following the approach of [35], a Central Finite Difference Method (CD FDM) approximation for the first and second spatial partial derivative [36] is used. An extension of Forward and Backward Difference is used to keep the order of accuracy at the boundary of the domain. For more details, we included the subsection Finite Difference formulas in the Appendix to describe the expressions used.
For simplicity, time dependency was omitted in u i,j (t) and will be omitted for U (t) as well. Taking into account the storage order of matrices, the boundary conditions described before, and dropping the error terms O(∆x 2 ) and O(∆y 2 ) spatial partial derivatives can be approximated as presented below.
• The first partial derivative with respect to x using (40), (41), and (42) is: with, • The second partial derivative with respect to x using (43), (44), and (45) is: with, • The first partial derivative with respect to y using (46), (47), and (48) is: with, • The second partial derivative with respect to y using (49), (50), and (51) is: with, Defined these expressions, we can approximate Laplacian and gradient of u as follows, and, Using (15) and (16), the system of PDEs defined in (1) becomes to the following dynamical system where F (t) = f (U (t), B(t)), G(t) = g(U (t), B(t)) and ⊙ denotes the Hadamard product or element-wise multiplication. Notice that the values of U on Γ ensures that the boundary conditions are satisfied for each t. In particular, the Dirichlet boundary conditions u(x, t) = β(x, t) = 0, x ∈ Γ, are imposed setting the values of u i,j (t) = β i,j (t) = 0 when i = 0, N x , j = 0, . . . , N y and when i = 0, . . . , N x , j = 0, N y . The next step is to solve (17) by some numerical integration method.

Time Approximation
This section aims to represent (17) as an Initial Value Problem (IVP) and solve it numerically. Rewriting (17) as dt , a vector version of (18) is presented due to the existence of several libraries that solve IVPs but require a vector representation of the problem (see details in the Vectorization subsection from the Appendix). Let us define y(t) as, now, (18) is redefined by the following IVPẏ where,ẏ and There is a wide range of numerical methods to solve (20) and it is necessary to specify a discrete version of time. This domain will be defined as, where ∆t = (t max − t min )/N t . The numerical methods implemented by this work will be detailed below. The first method used to solve (20) is Euler Method, mainly for first experiments and testing the convergence of spatial approximation. Defining y n = y(t n ), the approximation of y(t) at time t n , the computation of the IVP using this method is y n+1 = y n + ∆t Φ (t n , y n ) .
This method has a temporal truncation error of O(∆t) [36]. Furthermore, this work also uses Fourth Order Runge-Kutta Method (RK4) to solve (20), since its stability region give the flexibility of a ∆t choice [35]. Defining y n = y(t n ), the approximation of y(t) at time t n , the computation of the IVP using this method is where, This method has a temporal truncation error of O(∆t 4 ) [36].

Algorithm Description
Using the approach of MOL described in section 3.2.1, the algorithm proposed integrates spatial and time approximation presented in sub-subsections 3.2.2 and 3.2.3. The application of RK4 with expression (25) and (22) is detailed in Algorithm 1. The approximation of Φ using expression (22) is presented in Algorithm 2. Finally, Algorithm 3 summarize the application of MOL to approximate the simplified wildfire mathematical model described in subsection 3.1.
The computational complexity of the entire Algorithm is ∼ 4 N t (8 N 3 + 6 N 2 + C f N 2 + C g N 2 ), where N = N x = N y is the number of nodes in x and y direction respectively, N t is the number of nodes in t, C f and C g are the costs of evaluating the function f and g. See [6] to get a detailed description of the complexity derivation.

Stability Analysis
To ensure the operation of the algorithm, a stability analysis to solve the problemẏ(t) = Φ(t, y(t)) is performed. Expanding the expression 18, where,U The stability analysis requires the Jacobian Matrix of (27), that is, the computation of the derivative of each vector element with respect to U i,j and B i,j , i = 0, . . . , N x , j = 0, . . . , N y as follows, Then, the elements of (28) are: Notice that H pc (U i,j ) is not differentiable at u pc but ∂Hpc(Ui,j ) ∂Ui,j = 0 for U i,j ̸ = u pc . Finally, J Φ is defined by where, Now, the stability analysis requires the computation of the eigenvalues λ of J Φ (t n , y n ) for n = 0, ..., N t , and they must be inside the stability region of the IVP solver. Letz = λ ∆t ∈ C, the Euler method is stable if |1 +z| ≤ 1 and RK4 is stable if |1 +z + 1 Notice that the stability region of RK4 is larger than the Euler Method, allowing the use of larger values of ∆t.

Numerical Implementation
The implementation of the Algorithms 1, 2, and 3 was developed using Python. Two libraries were used to performed both implementations for CPU and GPU.
The first one using NumPy, which performs optimized operations over the RAM by means the use of the multithreading framework OpenMP. Notice that this library allows the storage of vector data structures, and it works as a wrapper of all the necessary linear algebra algorithms implemented in C/C++ [37].
The second implementation was developed using CuPy, an extension of NumPy but works with GPU, specifically, by means the use of CUDA libraries [38]. A CUDA compatible GPU is required for its operation due to the data manipulation is performed over the video RAM (vRAM). It has almost a full compatibility with the functions used in NumPy, so, a very simple way to describe its use is changing the routine from numpy.routine to cupy.routine.
For both implementations, the spatial partial derivatives in Φ are computed using array slices, which are optimized to achieve short execution times. The original framework described in [6] also includes the matrix-matrix product using numpy.dot, but due to the limited memory for GPU implementation, the slice approach was selected since we do not need to storage the differentiation matrices. To solve the IVP problem (20), the Euler Method and RK4 method were implemented from scratch using for loops.
It is important to remark that once having the first implementation in NumPy, the extension to CuPy is almost straightforward by making the changes mentioned above. This allows to obtain a GPU implementation with a higher performance than the CPU with very little programming effort.

Algorithm Analysis
To evaluate the properties of the algorithm as a valid tool for approximation, this section presents a brief description of the convergence and complexity, and a deep detail the stability analysis of the numerical method. All the analyses elaborated in this section were developed with Python CPU implementation.

Stability
Stability analysis is performed using N x = N y = 63, t max = 25, N t = 500 for the Euler Method, and N t = 100 for RK4. Figure 2 presents the stability region of the Euler Method and eigenvalues of J Φ times ∆t for the experiment performed. For the Euler Method, except in some specific cases for intermediate time-steps, thē z i values are slightly outside the stability region when ℜ(z i ) < 0. Although, this did not produce an unstable numerical simulation, we strongly suggests to use RK4 to avoid any numerical concern.
Analysis of experiment for RK4 is presented in Figure 3. It contains the stability region of the method and eigenvalues of J Φ times ∆t.
Unlike the Euler method, in this case thez i are within the stability region for all intermediate time steps. According to sub-subsection 3.2.5, almost allz i computed for Euler Method are inside its stability region but for RK4 all these values are inside the stability region. These experimental results enable the proposed algorithm as a valid tool for numerically solving the model (1), but it is convenient the use of RK4 over the Euler method.

Convergence and Complexity
A convergence and complexity analysis of the algorithm is described in details in [6]. The curves of convergence is presented in Figure 4 and the complexities in Figure 5.   These analyses are important because they show that the numerical schemes are consistent because the error for both approximations goes to 0 when ∆x = ∆y → 0 and ∆t → 0, and also the complexity guarantees that the numerical algorithm can compute numerical simulations in affordable times because it is polynomial with respect to the problem size.

Numerical Simulation
The numerical experiment presented in Figure 6 is performed for a qualitative fire spreading analysis. It tries to reproduce qualitatively the output shown in [18]. The high-temperature zone, or fire front, is formed following the wind direction. Also, the fuel is consumed in the same shape as the fire front. The maximum temperature value reached is around u = 9, which in physical variables corresponds to T = 1110 [K]. This result is within the values measured in a forest fire and and also agrees with the output reported by [18].

Implementations performance
In order to compare the computational capacity of each of both implementations, we performed 4 simulations for grids size of N x = N y = {2 6 , 2 7 , 2 8 , 2 9 }. The idea is to show the capabilities of the GPU implementation, but taking care of the limited vRAM availability. The codes were executed in a workstation provided by the Scientific Computing Team (SCT) of the Universidad Técnica Federico Santa María. This computer has an Intel(R) Core(TM) i9-10900 CPU with 2.80 GHz frequency, 10 physical cores, 20 logical cores and 64 GB of RAM. Also, it includes a NVIDIA GeForce RTX 3060 graphic card, which has 3584 cores with 1.32 GHz frequency and 12 GB of vRAM. Figure 7 presents the result of the experiments for the Numpy and CuPy implementations.   From Figure 7a we can figure out that CuPy has slower times for experiments of N ≥ 2 8 . It is very important when using Finite Differences due to the convergence of the method, requiring a finer mesh for better results. Additionally, Figure 7b shows that for N = 2 9 the CuPy implementation can be executed more than 8 times faster comparatively with NumPy just changing the name of the library in the code. However, it is important to mention that the GPU version will be limited according to the graphics card resources available to execute the code.

Discussion
Compared to the related work presented in Section 2, this article presents an implementation using a similar model than [18], [19], and [20], but solved using FDM for spatial discretization and RK4 for time integration. Table 2 shows a comparison of the numerical methods used by the related work presented in section 2 and this work. The main advantage of FDM is the simplicity of the mathematical formulation versus the FEM or a RBF collocation method, but it requires finer meshes for complex geometries. As disadvantage of the FEM, it requires a special data structure to handle the computation in addition to the pre-processing related to the computer grid generation. For RBF, it depends on the radial basis function used, in addition to the selected parameters according to the problem nature. In terms of implementation FDM is easier than the others methods, and also is suitable for the data structure provided by NumPy and CuPy. On the other hand, for time integration, both semi-implicit Euler and Crank-Nicolson are more stables than explicit methods such as Euler or RK4 methods allowing the use of larger ∆t, but they require solving a system of equations for each time step, which may result in longer computation times. Although this work uses a method that requires evaluating 4 times the right-hand side of the PDE, it allows to achieve a better order of convergence.

Conclusions
This work has successfully presented two Python implementations to support wildfire spreading and effects analysis. Using the numerical implementation of a two-dimensional simplified wildfire model, this work allows the numerical simulation of several fire scenarios. These simulations depend on components such as temperature, fuel fraction, wind, and topography which are parts of the mathematical developed by [18], using the assumptions of [20] . For the numerical approximation of the mathematical model, the proposed algorithm is based on Method of Lines, which splits the space and time dependency allowing to approximate independently using different classes of numerical methods. In this implementation was used FDM for space and RK4 for time. Regarding the contribution of this work, this article: 1. Provided analysis and description of the numerical model used.
2. Presented an algorithm with the numerical method used, the convergence, temporal complexity, stability and simulation capabilities. Since the results were positive, we can experimentally assure that the proposed algorithm is valid for the approximation of the mathematical model.

Extended a first
NumPy implementation to CuPy, in order to take advantage of the GPU performance.
4. Showed the performance and comparison of both CPU and GPU implementations.
It is important to point out that the CuPy implementation allows a fast GPU prototype, improving the performance compared to a CPU implementation without the necessary effort of directly using frameworks such as CUDA or OpenCL. This performance improvement is only valid for larger problems mainly because of the fixed cost required in moving memory from RAM to vRAM.
Satisfactorily, this work goes in the direction to keep improving the current open-source computational tools for wildfire spreading analysis. As results, the developed framework is publicly available in https: //github.com/dsanmartin/ngen-kutral.

Future Work
Following the guidelines developed in this work, as future work at least three possible routes can be detached: 1. Explore the use of spectral methods in the approximation of partial derivatives with respect to spatial variables. For instance, the use of Chebyshev's differentiation matrices [35], which require a smaller number of nodes to obtain better approximations when computing the derivatives. Also, the use of the Fourier Transform can be explored, taking advantage of the optimized libraries for the computation of the Fast Fourier Transform in both CPU and GPU. A prototype CPU version was developed in [39].
2. Use a GPU framework such as CUDA or OpenCL, in order to improve the performance achieved by CuPy. These specialized software allow to take better advantage of the resources provided by the GPU, due to the memory hierarchy model. In particular, the use of shared memory can significantly accelerate the performance of some operations.
3. Improve the quality of the mathematical model used, coupling a weather model that allows a more realistic wind behavior. For example, the use of the Navier-Stokes equations or some approximation of it that enables the consideration of the atmospheric variation generated by the increase of temperature in a wildfire episode. It is necessary to consider that the numerical methods involved in fluid dynamics are computationally expensive, so they require the use of parallel programming techniques and the access to High Performance Computing (HPC) resources. Inverse vectorization operator.
For the second partial derivative with respect to y: