Simulating Severe Weather

Dr. Dobb's Journal March 1999

Powerful hardware and efficient algorithms make the difference

By Louis J. Wicker

Louis is an associate professor in the Department of Meteorology at Texas A&M University. He can be contacted atl-wicker@tamu.edu.

Simulations of severe thunderstorms and tornadoes were closely linked to the development of supercomputers during the 1970s, as well as the ongoing explosion in microprocessor power over the last decade. What made three-dimensional (3D) simulations possible was the arrival of the Cray-1A supercomputer at the National Center for Atmospheric Research in 1976. This provided researchers with enough computational power to perform the first simulations of rotating thunderstorms, which require 3D domains. The first simulations, which used approximately 15,000 grid points evenly distributed within a 50×50×15-kilometer box, required nearly 30 hours of CPU time and all of the Cray's memory to simulate two hours of a thunderstorm's life cycle. The model output from these simulations led to our first concrete understanding of the flow dynamics of tornadic thunderstorms and began a new era in severe storm research where "cloud models" and supercomputers were the primary tools of discovery. (Cloud models are specialized fluid-dynamics programs that represent the complex airflow and thermodynamic processes that occur within a wide range of convection, from small fair-weather cumulus clouds to severe thunderstorms and tornadoes.)

The explicit simulation of tornadoes within these thunderstorm simulations took almost another 10 years of research and technology advances. My work in this area, while a Ph.D. student at the National Center for Supercomputing Applications at the University of Illinois, was greatly facilitated by the delivery of the first large memory (1-gigabyte) supercomputer -- the Cray-2 -- in 1988. Having large in-core memory greatly simplified the development of new cloud models that had multiple grid-mesh capability.

The simultaneous simulation of both thunderstorms and tornadoes requires a model to include motions on scales of tens of kilometers down to hundreds of meters or smaller. The first 3D thunderstorm simulations in 1977 used a 2-kilometer grid mesh. Tornado-producing simulations require a grid mesh having 100-meter spacing. Increasing the mesh resolution by a factor of 20 in each spatial direction was not feasible, because it would have increased CPU time by a factor of 160,000. Therefore, a new model had to be designed and developed; one that could locally enhance the grid resolution in regions of the storm where the tornado might form. This multiple grid technique, called "adaptive gridding," still required nearly 10 million grid points and almost 100 hours of Cray-2 time to properly capture the thunderstorm and its tornado for 30 minutes of real time. Even with today's powerful workstations and supercomputers, tornado simulations require multiple grid-mesh simulations and dozens of hours of computation.

Numerical Methods of Cloud Models

Simulation of thunderstorms and tornadoes require three classes of physical processes to be represented within the cloud model.

All cloud models start from a form of the equations of motion for a fluid, called the "Reynolds-averaged Navier-Stokes equations." In a Cartesian rotating-coordinate system, these partial differential equations (PDEs) can be written as in Example 1, where (u,v,w) are the velocity components associated with the (x,y,z) axes, T the temperature, qi the water substances (vapor, cloud, rain, ice, and the like), p the pressure, r the density, R the gas constant for dry air, Cp the specific heat of air at constant pressure, f the coriolis parameter (which includes the effects of the earth's rotation), D the turbulent mixing, H the heating rate of air, M the microphysical processes, and g the gravitational acceleration.

Together, the equations in Example 1 describe the transport and mixing of heat, mass, momentum, and water substance in the atmosphere, as well as the effects of microphysical processes.

Approximating the PDEs on a grid using numerical methods is generally a straightforward, yet detailed, process. In atmospheric models, it is conventional to decompose the dependent variables of the governing equations into amplitudes associated with a Fourier series representation of the model data, or simply represent the values themselves using a set of grid points. In cloud models, the latter approach is used because it is more computationally efficient when including the microphysics or turbulent mixing parameterizations. Approximating the equations in Example 1 on a grid mesh is done using the finite difference method (FDM). For example, one important process in the Example 1 PDEs is advection (transport) of temperature by the winds. For a 2D plane, the change in temperature at a given point due to advection is given in Example 2, where the local change of T at a point is minus each velocity component multiplied by the temperature change in the direction of the wind.

Now we must discretize each term in the equation to formulate the PDE for advection on a grid mesh. For our hypothetical 2D problem, this mesh would look like Figure 1, with all the variables being defined at each grid point and at time t. Here, T(x,y) is identical to Ti,j, which begins to look like a 2D array element. Derivatives must now be represented in terms of the values of temperature at each grid point. You do this by using Taylor series expansions as in Example 3, where x, y, and t represent the spatial and temporal grid spacing associated with the computational mesh (space and time). By rewriting these as in Example 4, a representation for PDE in Example 2 in terms of a temperature at each grid point and at time t has been found. The higher-order terms (in brackets) are typically ignored. This "truncation error" in the finite difference approximations then appears as inaccuracies in the solution. For example, Figure 2 shows the advection of a square block of temperature in one dimension. The exact solution to this problem is the square traveling at speed u to the right, and retaining its initial shape. An FDM applied to this problem shows errors in both shape and speed. The initial square pulse becomes deformed and moves slower than the actual speed.

This error is associated with truncating the Taylor-series equations. Increasing the number of grid points representing the temperature field will reduce the error. However, some truncation error will always exist. Much of the art of numerical modeling is in trying to create more accurate versions of the finite difference equations (such as in Example 2) while not significantly increasing the computational cost.

One important aspect of the "discretation" process is the inherent numerical behavior of the FDM for a given PDE. Not all Taylor-series representations of the spatial and temporal derivatives can be combined together to yield an approximately correct answer. Some combinations actually blow-up -- that is, the value at each grid point grows exponentially with time, quickly generating an nonphysical solution. Mathematical analysis of the FDMs is used to determine which combination of the derivative approximations are stable. Often these stable schemes have restrictions on the size of the time step used relative to the spatial grid spacing and the flow velocity. This time limit is called the "Courant-Freidrichs-Levy (CFL) stability condition," which can be stated for one-dimensional advection as in Example 5(a). This means that any feature cannot be transported more than one grid length per time step, or else the scheme loses track of where things are. If this stability limit is violated, the finite difference scheme blows up. Another important implication of the CFL criteria is that as the grid-mesh size decreases, the time step must also decrease (assuming the velocity is constant). Therefore, doubling the resolution of a 3D mesh actually increases the amount of computation by a factor of 16, not a factor of 8, if only the number of grid points are considered.

Example 5(b), an often-used stable FDM approximation, referred to as a "centered-in-time" and "centered-in-space" scheme, is sometimes called the "Leap-Frog" scheme. The numerical solution to Example 2 is stepped forward in time by computing the right side of Example 5(b). This computation is explicit; that is, the right side equations have no dependencies on variables at time t+t. Therefore, these types of FDM are much faster to solve than implicit methods such as those used for finite element simulations. This also implies that such schemes should be easily represented as vectors with parallel solutions, since there are no data dependencies associated with stepping the computation forward in time.

The discretation of all partial derivatives in Example 1 generates a large system of linear equations, which we then solve at each grid point at every time step. In addition to these FDM equations is the inclusion of the discretation of the turbulent mixing and microphysical processes. Turbulent mixing terms have a large set of additional partial derivatives to be computed. The microphysical terms are also expensive to compute because they usually involve exponentiation and other complex logical and floating-point operations. Depending on the complexity of the turbulent mixing and the microphysical parameterizations, a 3D cloud model requires between 2000 and 5000 floating-point operations to advance the dependent variables at a single grid point for each time step. Therefore, integrating these simulations for thousands of time steps requires large computational resources.

Software Design and Implementation

The design and implementation of cloud-modeling software has steadily improved over the last 25 years. Most of the original cloud modeling code was written in undocumented and often obtuse code, partly because the code needed a work-around for the small in-core memory machines that were available and partly because many of those doing the programming were scientists who had little formal training in computer programming. Add to this the use of Fortran-77, a useful but unstructured language with no memory management capability, and you begin to get the picture. However, things have rapidly improved during the last 15 years. The advent of large core memories and supercomputing workstations has greatly facilitated code development. Researchers have started to incorporate modern programming practices into their code. The lifetime of any given software has also steadily decreased so that new software is written more frequently. The original cloud-model software written for the Cray-1A remained unchanged for almost five years. Current software, such as my research program, is now improved every few months. This has improved the readability and organization of most codes. Even so, the biggest advancement has yet to be realized -- that is, the introduction of Fortran-90 -- because most code probably should be rewritten completely to take advantage of Fortran-90's structures and memory management. My own software will be rewritten sometime this year to take advantage of these new features.

Figure 3 shows the design of a typical cloud-model software system. The function of the solver module is simply to take the initial conditions (the dependent variables at time t) and, with the specified boundary conditions, solve the PDEs from Example 1 for a single time step. This would also include the effects associated with microphysics and the sub-grid scale motions. The solver then returns the solution, at a time t+t, to the management module. The I/O modules are responsible for reading and writing the dependent variables at specified times and parsing the data into subsets that are easier to manage and store. For example, most cloud models store the entire 3D grid data to disk every 5 to 10 minutes. This produces a file or set of files that can be as large as 10 gigabytes. Examining all this data is difficult. Therefore, other I/O streams are generated, which make it easier to manage the simulation. Most cloud models produce a textual output file that shows the maximum and minimum of each variable every few seconds and also outputs smaller data files that contain a subset of the 3D data. All of these data are used by researchers to track and understand what is happening to the thunderstorm within the simulation. The textual data is often used to monitor the simulation for solution stability and accuracy, while other small data files are used to get a quick overview of the thunderstorm's evolution.

Until recently, most software combined the driver and solver modules, since there was little to do after initialization but integrate the model forward in time. However, the incorporation of adaptive gridding requires a management routine that not only controls the time stepping and data I/O, but also examines the solution to determine when and where new grids are needed. This driver also makes sure that numerical grids communicate their data to the other grids, which is necessary for accurate simulation. For example, Figure 4 shows simulation using adaptive grids for a line of thunderstorms. The numerical domain is 1000×1000×20 kilometers in size in order to capture the entire line of thunderstorms. However, the individual thunderstorms need to be resolved using 2-kilometer grid meshes. Here the driver module, using criteria specified by the user, places 6-kilometer and 2-kilometer grid meshes automatically around the line of thunderstorms. The areas of active thunderstorms (the black contours) all are contained within 2-kilometer grids. As the solution evolves, the adaptive grid meshes also evolve to maintain solution accuracy.

Parallel Processing

As with many other disciplines, parallel processing has become an integral part of simulating thunderstorms and tornadoes. Many problems require simulations that would take hundreds of hours on single-processor systems. Considerable effort has been spent on porting software to various parallel systems. Initially, versions of my cloud model were written for single instruction, multiple data (SIMD) systems such as Thinking Machines, CM-2, or CM-5 platforms. The use of distributed memory in these systems required a significant redesign and rewrite of the code. After considerable effort, the code was ported. However, its parallel efficiency was so low due to a fundamental aspect of cloud model code (and other atmospheric codes as well), that the ratio between floating-point operations to communication was close to unity. Therefore, a machine having fast processors but slow interprocessor communication would not perform well on this type of code. Indeed, many of the first parallel platforms had this limitation.

In more recent architectures, such as the SGI Origin systems or the HP Exempler systems, this problem has been somewhat overcome. These computers have a single memory image employing a fast bus or crossbar communication system. Communication costs are much reduced. Parallelization of cloud model code is often done automatically by the compiler. This proved to be a sufficient way to get 80 to 90 percent speedups on 10 to 30 processors, and for many simulations, this is certainly adequate. However, experience has shown that this methodology will not allow the efficient use of 100-1000 processors. Other parallelization methods, such as message passing, are needed for cutting-edge simulations like those used to study tornadoes.

To use clusters of SIMD systems, such as 1024 SGI Origin processors clustered as four 256-node machines, communication between grid points again must be reduced. There are several ways to do this. Figure 5 shows commonly used methods to reduce the communication load between processors or clustered systems. In two dimensions, Figure 6(a), a domain decomposition is done such that a portion of the computational grid is running on each processor (or cluster). Communication is reduced, because only values along the edges of each subgrid need to be communicated to neighboring subgrids running on other processors (clusters). Figure 6(b) shows the extension of domain decomposition to three dimensions, by dividing the domains vertically as well as horizontally. It can be shown that for 3D domains, the 3D domain decomposition is the most efficient method of reducing the amount of communication per floating-point operation. Using these methods, speedups of 70 to 80 percent can be obtained. This methodology, which is widely used in fluid-dynamic software, has allowed simulation speeds to approach 100 gigaflops.

Again, large parallel simulations of thunderstorms and tornadoes produce tens of gigabytes of data. Thunderstorm flows are complex and continually evolve over the entire simulation, which can be up to five hours long. Therefore, sophisticated analysis and visualization tools are needed to explore and understand the dynamics of the storm.

Visualization

Perhaps the greatest challenge facing atmospheric scientists today -- both in thunderstorm modeling and elsewhere -- is to be able to access, analyze, and understand the billions of numbers that can now be routinely generated by current supercomputers or even workstations.

Scientific visualization is therefore a crucial tool for a researcher. However, while cloud models are a fairly mature technology, it seems clear that the tools needed to examine such data are only just beginning to mature. Early scientific visualization of clouds (see Figure 6) used visual artists and animators to generate the images using sophisticated modeling software such as Wavefront (http://www.wavefrontsciences .com/). These images were relatively expensive to create in terms of time and effort, and clearly were not practical for day-to-day access and analysis. As workstation power has increased, more interactive tools have become available. One of the most popular and useful tools used today to visualize cloud model and other atmospheric model data is a package called Vis5D, developed by Bill Hibbard at the University of Wisconsin-Madison (http://www.ssec.wisc.edu/software/ software.html). By compressing the data, this software takes a series of 3D data files generated from the model and combines them to produce a 4D data set (space and time) that can be quickly examined in a variety of ways. The visualization methods include isosurfaces, vector planes, slice images, and trajectories (weightless particles following the flow), all of which can be manipulated interactively. Researchers can quickly modify the parameters associated with each tool to bring out certain features within the date. This type of capability -- especially that of time animation -- enables researchers to grasp the development evolution of the thunderstorm and/or tornado. The interactivity is crucial for rapid and comprehensive analysis of the storm simulation. For example, the complex image in Figure 7 was generated in only a few minutes, yet the information communicated to a researcher can be very significant.

The Future

Beginning in the mid 1970s, the first simulations of thunderstorms by researchers significantly increased our understanding of thunderstorm behavior. During the last 20 years, our new understanding of these phenomena has improved forecasts of thunderstorms and tornadoes to the public. Without the development of supercomputers and workstations during this period, much of this new knowledge would not have been discovered. Therefore, the continued development of computational tools, both hardware and software, is vital to increasing our understanding and improving forecasts of these phenomena.

DDJ


Copyright © 1999, Dr. Dobb's Journal