Raster Tools User’s Guide

Spatial Allocator Version 3.6:

 

Work Assignment: 4-01

Contract: Operation of the Center for Community Air Quality Modeling and Analysis

Contract No: EP-W-05-045

 

 

 

 

 

 

 


Prepared for:       William Benjey

                              U.S. EPA, ORD/NERL/AMD/APMB

                              E243-04

                              USEPA Mailroom

                              Research Triangle Park, NC 27711

 

 

Prepared by:       Center for Environmental Modeling for Policy Development

                              Institute for the Environment

                              The University of North Carolina at Chapel Hill

                              137 E. Franklin St., CB 6116

                              Chapel Hill, NC 27599-6116

 

 

Date due:             March 12, 2009

 


Contents

 

1     Introduction
1.1     Background on the Spatial Allocator
1.2     The Need for Raster Tools in Spatial Allocator 3.6
2     Input Data
2.1     2001 USGS and NOAA NLCD Data
2.2     MODIS Data
3     Raster Tools – Spatial Allocator v3.6
3.1     Tool for Removing Overlapping Data from NLCD Images: preProcessNLCD.exe
3.2     Tool for Creating Modeling Grid Domain Shapefiles: create_gridPolygon.exe
3.3     Tool to Project and Rasterize Shapefile and MODIS Data to NLCD: toNLCDRaster.exe
3.4     Tool to Compute Grid Land Use Information at 30-meter resolution: computeGridLandUse.exe
3.5     Tool to Convert CSV Grid Land Use Output into WRF netCDF output: txt2ncf.exe
3.6     Tool for Housing Density File Processing: rasterWtoPolygons.exe
4     Obtaining Raster Tools – Spatial Allocator v3.6
4.1     Where to Obtain Software
4.2     Where to Obtain Documentation
4.3     Help Desk Support for Raster Tools – Spatial Allocator
5     Installing Spatial Allocator
Location of Raster Tools within the Spatial Allocator
5.1     Third-party software
5.2     Setting up the User Environment
6     Downloading Input Data
6.1     USGS NLCD  Input Data
6.2     Downloading 2001 NOAA NLCD data for the U.S. Coastal Regions
6.3     MODIS 2001 Land Cover Data
6.4     Specifying Location of Input Data
7     Running Raster Tools Using Scripts
7.1     preProcessLanduseImages.csh
7.2     generateGridShapefile.csh
7.3     allocateRasterLanduse2WRFGrids.csh
7.4     convertLanduseTxt2WRFNetCDF.csh
7.5     allocateRasterW2Polys.csh
8     Visualization of netCDF outputs
9     Software Performance
10   Future Enhancements


1            Introduction

1.1    Background on the Spatial Allocator

The Spatial Allocator was originally developed to provide tools that could be used by the air quality modeling community to perform commonly needed spatial tasks without requiring the use of a commercial Geographic Information System (GIS).  GISs can be expensive and also require a fair amount of specialized expertise in their use.  Prior to 2008, the Spatial Allocator focused on the use of vector GIS data to create modeling input files such as spatial surrogates needed to allocate emissions from county data to grid cells.  Vector GIS data is provided as sets of vertices that represent points (e.g., gas stations), lines (e.g., roads), or polygons (e.g., county boundaries, water bodies). Vector GIS data typically contains information about the location of the items represented and one or more attributes (e.g., population in each county, area of a water body). 

In 2007, there was a desire to update the land use data used by air quality and meteorological models from 1990s and earlier vintage data to more recent data, such as the United States Geological Survey (USGS) National Land Cover Data (NLCD). The NLCD was prepared from satellite images taken in 2001 (more information is provided below).  The NLCD is provided as raster data, or images. Raster data consists of a grid of pixels, each of which has an integer value; it is similar to an image displayed on a computer screen, in which the pixels taken together form the picture. The September 2008 release of the Spatial Allocator is the first release to contain tools that process raster data into a form that can be used by air quality and meteorological models, for example by computing the percentage of each land use category in each grid cell.

Because of the need to support these fundamentally different aspects of Spatial Allocation, we have divided the Spatial Allocator documentation into three components:

  1. Vector Tools: these tools process vector GIS data (e.g., to perform functions such as creating spatial surrogates and mapping data from counties to grids and vice versa)
  2. Raster Tools: these tools process raster data into forms needed by modeling such as gridded land use
  3. Surrogate Tools: these tools use the Vector Tools and additional Java tools to help manage the creation and manipulation of spatial surrogates used in emissions modeling.

This user’s guide provides information on how to use the Raster Tools component of the Spatial Allocator. The document formerly titled the “MIMS Spatial Allocator User’s Guide” has been renamed “Vector Tools, Spatial Allocator Version 3.6.”  The Surrogate Tools User’s Guide is also available as a component of Spatial Allocator Version 3.6. All three of these user’s guides are available on the CMAS Center web site (www.cmascenter.org).

1.2    The Need for Raster Tools in Spatial Allocator 3.6

High-resolution land-cover datasets such as the NLCD are available for use in meteorology and air quality modeling. However, the need to set up land use data for different modeling grid domains and different grid projections, and the need to convert between different datums, have made it difficult to create consistent high-resolution land use data sets for use in modeling. [Note that a geodetic datum defines the size and shape of the earth and the origin and orientation of the coordinate systems used to map the earth (see http://www.colorado.edu/geography/gcraft/notes/datum/datum.html).] Also, meteorology and air quality modeling domains applied for the U.S. can be larger than the contiguous 48 U.S. states and, for example, include parts of Canada and Mexico. These multinational domains can add the further complication of requiring modelers to obtain land use data from multiple sources.

High-resolution raster data sets (also known as “images”) containing land-cover data, percent imperviousness, and tree canopy estimates are available for all parts of the multinational air quality and meteorological modeling domains and can be downloaded. In developing the Raster Tools component discussed in this user’s guide we have targeted three datasets that can be used to create improved land-cover inputs for modeling:

·         30-m-resolution 2001 United States Geological Survey (USGS) National Land Cover Data (NLCD) for the United States

·         more refined 2001-2006 National Oceanic and Atmospheric Administration (NOAA) NLCD for the U.S. coastal regions

·         1-km-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) 2001 land cover data for the globe

These datasets are quite large—over 60 gigabytes (GB) altogether—and so require significant processing resources. Also, before the data can be used in support of meteorology and air quality modeling, issues must be resolved regarding data overlaps between the raster images, the different map projections and different datums used in the image data and the models, and the use of different land use classification schemes by the various data providers. A set of tools for addressing these difficulties was needed.

In 2007, the Spatial Allocator (http://www.ie.unc.edu/cempd/projects/mims/spatial/) was enhanced so that it could use a new version of the Cartographic Projections Library PROJ.4, which supports datum transformation. This updated Spatial Allocator allows users to re-project spatial data to different geographic datums (for example, 1983 North American Datum - NAD83 to a perfect sphere used by meteorological and air quality models). The addition of Raster Tools to the Spatial Allocator in 2008 enabled users to compute modeling grid land use information from high-resolution satellite raster data. This allows meteorological and air quality modelers to import NLCD and MODIS land use data and convert them to land use percentage information for each grid cell of the user’s specified grid. After the user specifies the modeling domain of interest, the tool assigns the data to the appropriate domain grid cells. It then outputs land use percentages for each grid cell in a comma-separated-value (CSV) format text file or a Weather Research and Forecasting (WRF)-format netCDF file so that they can be used for meteorological or air quality modeling.

2            Input Data

2.1    2001 USGS and NOAA NLCD Data

The NLCD used by the new Raster Tools can be downloaded from the following sites:

·         30-m-resolution 2001 USGS NLCD for the United States: http://www.mrlc.gov

·         2001 NOAA NLCD for the U.S. coastal regions: http://www.csc.noaa.gov/crs/lca/locateftp.html

Specific instructions on how to download this data for use in Raster Tools will be given in Section 6.

The 2001 USGS land use classes can be viewed at http://www.mrlc.gov/nlcd_definitions.php. The 2001 NOAA coastal NLCD classification scheme can be viewed at http://www.csc.noaa.gov/crs/lca/tech_cls.html. Additional details on these data sets are available at those web sites.

The two classification schemes match each other well, except that the numbering systems used are different. Also, NOAA’s NLCD classification has a Tundra class (number 24) but the USGS NLCD data do not have that class, so we added “Tundra” to the USGS classification scheme and assigned it the number 75. The output land use information for modeling grids uses the updated USGS land use classification, which has a total of 31 classes (Table 1).

Table 1. USGS NLCD Classifications

Number and class                      Number and class
11 Open Water                         73 Lichens
12 Perennial Ice/Snow                 74 Moss
21 Developed Open Space               81 Pasture/Hay
22 Developed Low Intensity            82 Cultivated Crops
23 Developed Medium Intensity         90 Woody Wetlands
24 Developed High Intensity           91 Palustrine Forested Wetland
31 Barren Land (Rock/Sand/Clay)       92 Palustrine Scrub/Shrub Wetland
32 Unconsolidated Shore               93 Estuarine Forested Wetland
41 Deciduous Forest                   94 Estuarine Scrub/Shrub Wetland
42 Evergreen Forest                   95 Emergent Herbaceous Wetlands
43 Mixed Forest                       96 Palustrine Emergent Wetland
51 Dwarf Scrub                        97 Estuarine Emergent Wetland
52 Shrub/Scrub                        98 Palustrine Aquatic Bed
71 Grassland/Herbaceous               99 Estuarine Aquatic Bed
72 Sedge/Herbaceous                   75 Tundra (class added for this project)
                                      127 No Data

Please be aware that the 2001 USGS image data use the value 127 to represent places where there are no data, while NOAA NLCD images use a value of 0 where there are no data and a value of 1 for unclassified values.

2.2    MODIS Data

The 2001 MODIS land use data for the globe can be downloaded from: http://www-modis.bu.edu/landcover.  In viewing the downloaded MODIS data sets, we found that the provided North American data set had some position shifting relative to defined coastal and other boundaries, so we decided to use the global MODIS data set instead.  To use the global data set we needed to clip it to the North American area and project it into the NLCD projection for consistency. The resulting input MODIS data set therefore covers the North American region and is in the NLCD projection. This dataset is provided with the Raster Tools software; see Section 6 for the location of this data.  The MODIS classes are shown in Table 2.

Table 2. MODIS Classifications

Number and class                      Number and class
0 water                               10 grasslands
1 evergreen needleleaf forest         11 permanent wetlands
2 evergreen broadleaf forest          12 croplands
3 deciduous needleleaf forest         13 urban and built up
4 deciduous broadleaf forest          14 cropland / natural vegetation mosaic
5 mixed forests                       15 permanent snow and ice
6 closed shrublands                   16 barren or sparsely vegetated
7 open shrublands                     17 IGBP* water
8 woody savannas                      254 unclassified
9 savannas                            255 fill value (normally ocean water)
                                      150 no data (class added for this project)

* IGBP stands for International Geosphere-Biosphere Programme (http://www.igbp.net/)

Note that as provided, the MODIS data do not have a “no data” class (like class 127 in the USGS land use classification). When these data are processed with the Geospatial Data Abstraction Library (GDAL), however, any cells for which there is no data available are assigned a value of 0 if a NODATA value is not specifically defined during the processing. This creates a problem because in the MODIS data a value of 0 indicates the water class. Therefore, we added the class “no data” to the MODIS classification (and assigned it the number 150), so that cells with no data are not erroneously assigned to the water class during GDAL processing. The NODATA cell values and all image classes are explicitly defined in the Raster Spatial Allocator Tools. In addition, MODIS data have many large areas with the value 255; this is defined as “fill value,” but is treated as ocean area. When using the tools, it is important for users to verify that the cells in their region that are assigned to the 0 and 255 classes really do contain water.
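The effect of defining an explicit NODATA value can be illustrated with a minimal Python sketch. It does not call GDAL; the function name clip_to_window and the toy arrays are hypothetical, and only the values 0 (water), 150 (no data), and 255 (fill) are taken from the text above.

```python
MODIS_WATER = 0      # MODIS water class
MODIS_NODATA = 150   # "no data" class added for this project
MODIS_FILL = 255     # fill value, treated as ocean

def clip_to_window(image, row_off, col_off, n_rows, n_cols, nodata=MODIS_NODATA):
    """Copy a window out of `image` (a list of rows).

    Cells of the window that fall outside the image are set to `nodata`
    instead of 0, so they cannot be mistaken for the water class.
    """
    out = [[nodata] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        src_r = row_off + r
        if not 0 <= src_r < len(image):
            continue
        for c in range(n_cols):
            src_c = col_off + c
            if 0 <= src_c < len(image[src_r]):
                out[r][c] = image[src_r][src_c]
    return out

# Toy 2 x 2 image; the requested window extends one column past its right edge.
image = [[0, 12],
         [5, 255]]
window = clip_to_window(image, row_off=0, col_off=1, n_rows=2, n_cols=2)
# window == [[12, 150], [255, 150]]
```

Had the window been initialized with 0, the two cells outside the image would have been indistinguishable from MODIS water.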

All land class variables from both the NLCD and MODIS datasets are included in the results of the Raster Spatial Allocator Tools. Lookup tables are used within the tools to account for both NLCD and MODIS data during processing. If the land use classes are changed in subsequent versions of these data sets, users will have to change the land use classification tables defined in the Raster Spatial Allocator Tools program computeGridLandUse.cpp.

3            Raster Tools – Spatial Allocator v3.6

The Raster Tools component of the Spatial Allocator v3.6 consists of a set of raster image file processing programs that compute gridded land use information for the user’s modeling domain based on input image data that were discussed in Section 2: the 2001 USGS NLCD (including land use, imperviousness, and canopy), the 2001 to 2006 NOAA coastal NLCD land use images, and the 2001 NASA MODIS land use image data. The processing programs were developed in the C++ language with the GDAL, Cartographic Projections Library PROJ.4, and Unidata netCDF library to process shapefiles and raster land use images. Five programs were developed to accomplish the steps required to process the land use data:

·         preProcessNLCD.exe

·         create_gridPolygon.exe

·         toNLCDRaster.exe

·         computeGridLandUse.exe

·         txt2ncf.exe

These programs are described in the following subsections, and a flowchart showing their relationships is given in Figure 1. In addition, a program that was developed under a different project to compute housing units at census blocks from a housing density image is also included in this Raster Tools release.

 

Figure 1. Flowchart of Software Functions for Grid Land Use Computations

 

3.1    Tool for Removing Overlapping Data from NLCD Images: preProcessNLCD.exe

The preProcessNLCD.exe program preprocesses NLCD images. USGS land use data in the contiguous 48 states are divided into 14 zones, and the data for each zone are separately downloadable (http://www.mrlc.gov/nlcd_multizone_map.php). In processing the NLCD, we found that there are data overlaps among the zone images. Because the air quality modeling domain can be bigger than the 48 contiguous states, keeping track of overlapping cells for a modeling domain during the computation can take a lot of space at the 30-m resolution. The preProcessNLCD program was developed to remove the overlapping areas among each set of NLCD land use data. There are four sets of NLCD land use data sets that can be used in preprocessing: USGS NLCD Landuse Files, USGS NLCD Urban Imperviousness Files, USGS NLCD Tree Canopy Files, and NOAA Coastal Change Analysis Program (C-CAP) NLCD Landuse Files. Users need to preprocess the downloaded NLCD images only one time. After that, land use information can be computed for any modeling domain from the preprocessed image data sets. The processed images are available for download from the CMAS ftp site. This program may be run using the test script called preProcessLanduseImages.csh (see Section 7.1).

3.2    Tool for Creating Modeling Grid Domain Shapefiles: create_gridPolygon.exe

The create_gridPolygon.exe program is used to generate a modeling domain grid polygon shapefile with GRIDID starting from 1 and going to the total number of cells in the domain, counting from the lower left corner cell to the right and upward to the last cell at the upper right corner, as shown in Figure 2. Users can apply this program to create any domain grid shapefile for their needs. This program may be run using the test script called generateGridShapefile.csh (see Section 7.2).

Figure 2. Example of Numbering of GRIDID in the Generated Shapefile

     9    10    11    12
     5     6     7     8
     1     2     3     4
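The numbering convention shown in Figure 2 can be expressed as a short Python sketch. The function names are hypothetical, and the use of 1-based row and column indices (row 1 at the bottom, column 1 at the left) is an assumption made for illustration.

```python
def gridid_from_rowcol(row, col, n_cols):
    """GRIDID of the cell in 1-based `row` (counted from the bottom)
    and 1-based `col` (counted from the left)."""
    return (row - 1) * n_cols + col

def rowcol_from_gridid(gridid, n_cols):
    """Inverse mapping: GRIDID -> (row, col), both 1-based."""
    row, col = divmod(gridid - 1, n_cols)
    return row + 1, col + 1

# Reproduce the 3-row by 4-column layout of Figure 2, top row first.
n_rows, n_cols = 3, 4
layout = [[gridid_from_rowcol(r, c, n_cols) for c in range(1, n_cols + 1)]
          for r in range(n_rows, 0, -1)]
# layout == [[9, 10, 11, 12], [5, 6, 7, 8], [1, 2, 3, 4]]
```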

 

3.3    Tool to Project and Rasterize Shapefile and MODIS Data to NLCD: toNLCDRaster.exe

The toNLCDRaster.exe program was developed to process the grid domain shapefile generated from the above create_gridPolygon.exe program and the MODIS image to put them in the same projection as the NLCD. The program will project the modeling domain grid polygon shapefile into the NLCD projection, which is automatically obtained from an image in the preprocessed NLCD image directory. Then, the projected shapefile is rasterized, using GRIDID as the raster value, into the NLCD grid format according to the following conditions:

  1. Same cell size, and
  2. Corner points have a remainder of half a cell when divided by the cell size: fabs(modf(xUL/xCellSize, &intpart)) = 0.5 or fabs(modf(yUL/yCellSize, &intpart)) = 0.5
    where fabs is the absolute value function and modf is the C library function that returns the fractional part of its argument and stores the integer part in intpart.
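The half-cell condition can be checked with the following Python sketch, which uses math.modf in the same way the C expression uses modf. The function names, the tolerance, and the snapping helper are hypothetical additions for illustration; they are not taken from toNLCDRaster.exe.

```python
import math

def has_half_cell_offset(coord, cell_size, tol=1e-9):
    """True when coord / cell_size has a fractional part of +/-0.5."""
    frac, _intpart = math.modf(coord / cell_size)
    return abs(abs(frac) - 0.5) < tol

def snap_to_half_cell(coord, cell_size):
    """Move coord to the half-cell point of the cell that contains it
    (one possible way to satisfy the condition; an assumption)."""
    return (math.floor(coord / cell_size) + 0.5) * cell_size

# With 30-m cells, 2265.0 / 30 = 75.5, so the condition holds.
aligned = has_half_cell_offset(2265.0, 30.0)
# 2280.0 / 30 = 76.0, so the condition does not hold.
not_aligned = has_half_cell_offset(2280.0, 30.0)
```

A tolerance is used instead of an exact comparison with 0.5 because the division is carried out in floating point.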

In addition, the input MODIS image in the NLCD projection is clipped to cover the modeling domain area for future processing. This program may be run by using the test script called allocateRasterLanduse2WRFGrids.csh (see Section 7.3).

3.4    Tool to Compute Grid Land Use Information at 30-meter resolution: computeGridLandUse.exe

The computeGridLandUse.exe program computes the modeling domain grid land use information table from the modeling domain grid raster data set, clipped MODIS image, and preprocessed NLCD image data sets. The program generates a CSV text format output table with GRIDID, ROW, COL, IMPERV (imperviousness), CANOPY, in addition to NLCD land use classes and/or MODIS land use classes according to the user’s selection. In addition, the program outputs the domain land use information in a WRF-format netCDF file, so the generated land use information can easily be incorporated into WRF modeling. In the output NLCD land use table, the class numbers consist of the original NLCD class number with a “1” prepended to them (see Table 3). In the output MODIS land use table, the class numbers consist of the original MODIS class number with a “2” prepended to them (see Table 4). This program may be run using the test script called allocateRasterLanduse2WRFGrids.csh (see Section 7.3).
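The idea of deriving per-cell land use percentages from two aligned rasters can be sketched in Python as follows. This is an illustration under stated assumptions, not the algorithm in computeGridLandUse.exe: the function name is hypothetical, a GRIDID raster value of 0 is assumed to mean "outside the domain", and every remaining pixel is assumed to count toward the cell total.

```python
from collections import Counter, defaultdict

def grid_landuse_percentages(gridid_raster, landuse_raster, outside=0):
    """Return {GRIDID: {land use class: percent of the cell's pixels}}.

    Both rasters are lists of rows with identical shape; each pixel of
    `gridid_raster` holds the GRIDID of the modeling cell covering it.
    """
    counts = defaultdict(Counter)
    for id_row, lu_row in zip(gridid_raster, landuse_raster):
        for gridid, lu_class in zip(id_row, lu_row):
            if gridid != outside:
                counts[gridid][lu_class] += 1
    result = {}
    for gridid, class_counts in counts.items():
        total = sum(class_counts.values())
        result[gridid] = {c: 100.0 * n / total for c, n in class_counts.items()}
    return result

# Two modeling cells, each covered by four land use pixels.
gridid_raster = [[1, 1, 2, 2],
                 [1, 1, 2, 2]]
landuse_raster = [[11, 11, 41, 42],
                  [11, 21, 41, 41]]
pct = grid_landuse_percentages(gridid_raster, landuse_raster)
# pct[1] == {11: 75.0, 21: 25.0}
# pct[2] == {41: 75.0, 42: 25.0}
```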

Table 3. Output NLCD Classes

Number and class                      Number and class
111 Open Water                        173 Lichens
112 Perennial Ice/Snow                174 Moss
121 Developed Open Space              181 Pasture/Hay
122 Developed Low Intensity           182 Cultivated Crops
123 Developed Medium Intensity        190 Woody Wetlands
124 Developed High Intensity          191 Palustrine Forested Wetland
131 Barren Land (Rock/Sand/Clay)      192 Palustrine Scrub/Shrub Wetland
132 Unconsolidated Shore              193 Estuarine Forested Wetland
141 Deciduous Forest                  194 Estuarine Scrub/Shrub Wetland
142 Evergreen Forest                  195 Emergent Herbaceous Wetlands
143 Mixed Forest                      196 Palustrine Emergent Wetland
151 Dwarf Scrub                       197 Estuarine Emergent Wetland
152 Shrub/Scrub                       198 Palustrine Aquatic Bed
171 Grassland/Herbaceous              199 Estuarine Aquatic Bed
172 Sedge/Herbaceous                  175 Tundra

Table 4. Output MODIS Classes

Number and class                      Number and class
20 water                              29 savannas
21 evergreen needleleaf forest        210 grasslands
22 evergreen broadleaf forest         211 permanent wetlands
23 deciduous needleleaf forest        212 croplands
24 deciduous broadleaf forest         213 urban and built up
25 mixed forests                      214 cropland / natural vegetation mosaic
26 closed shrublands                  215 permanent snow and ice
27 open shrublands                    216 barren or sparsely vegetated
28 woody savannas                     2254 unclassified
217 IGBP water                        2255 fill value (normally ocean water)
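The output class numbers in Tables 3 and 4 follow from prepending a digit to the original class number, which can be sketched in Python as below. The function name and the "NLCD"/"MODIS" keys are hypothetical; the prefixes "1" and "2" come from the text.

```python
def output_class_code(source, class_number):
    """Prepend "1" (NLCD) or "2" (MODIS) to the original class number."""
    prefix = {"NLCD": "1", "MODIS": "2"}[source]
    return int(prefix + str(class_number))

nlcd_open_water = output_class_code("NLCD", 11)       # 111
modis_water = output_class_code("MODIS", 0)           # 20
modis_unclassified = output_class_code("MODIS", 254)  # 2254
```

Because the digit is prepended as text rather than added as an offset, the output codes differ in length: MODIS class 0 becomes 20, while MODIS class 254 becomes 2254.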

 

3.5    Tool to Convert CSV Grid Land Use Output into WRF netCDF output: txt2ncf.exe

The txt2ncf.exe program converts a CSV output file created from the above computeGridLandUse.exe program into the WRF land use netCDF format. The netCDF array starts from the lower left corner and goes up from left to right (in the same way as ArcGIS fishnet grids as shown in Figure 2). We developed this program before we added the direct netCDF output capability to the computeGridLandUse.exe program (Section 3.4). The txt2ncf.exe program may be run using the script file called convertLanduseTxt2WRFNetCDF.csh (see Section 7.4).

3.6    Tool for Housing Density File Processing: rasterWtoPolygons.exe

The rasterWtoPolygons.exe program was developed for a project related to future emissions allocation. In order to allocate future-year emissions, we need future-year population and housing units at census-block level. The Spatially Explicit Regional Growth Model (SERGoM) produces future-year housing density images at 100-meter resolution. Because using ArcGIS to generate census-block housing units from SERGoM 100-m cell housing density raster data requires more than one processing step, we developed rasterWtoPolygons.exe, which computes future-year housing units at census-block level from future-year housing density image data in a single step. This C++ program uses GDAL to process 100-m housing density raster data and outputs a text file table with census block ID and housing units for that block. This program may be run using a test script called allocateRasterW2Polys.csh (see Section 7.5).

4             Obtaining Raster Tools – Spatial Allocator v3.6

4.1    Where to Obtain Software

  1. The Spatial Allocator v 3.6 is available for download from the CMAS web site, http://www.cmas.org
  2. On the left-hand side of the web site, in the Download Center panel, click on MODELS.
  3. Log in using an existing CMAS account, or create a new CMAS account.
  4. Use the pull-down list to select Spatial Allocator as the model you wish to download, then click Submit.
  5. Specify the product you wish to download as:  SPATIAL ALLOCATOR 3-6, specify the type of computer as Linux PC, and the compiler as the GNU Compilers and click Submit.
  6. In the table that appears, follow the links to the gzipped tar archive for Linux.
  7. For other Unix platforms, makefiles are included to allow the user to build the third-party libraries and the Raster Tools executables.
  8. The table also includes links to the User’s Guides and to the NLCD and BELD3 Input Data.

4.2    Where to Obtain Documentation

A documentation link is found on the left-hand side of the www.cmascenter.org web site in the Help Desk panel. Click on the DOCUMENTATION link, and then use the drop-down list to select Spatial Allocator and click on Submit. Use the pull-down menu to select Spatial Allocator 3.6 and then click search. The documentation pane shows the available documentation for this release of Spatial Allocator. There are three different user’s guides within the Spatial Allocator: one for Raster Tools, one for Vector Tools, and one for Surrogate Tools.

4.3    Help Desk Support for Raster Tools – Spatial Allocator

If you already have a bugzilla account, please use the bugzilla site for the Spatial Allocator to submit questions, report errors, and request enhancements to programs within the Raster Tools: http://bugz.unc.edu/enter_bug.cgi?product=Spatial%20Allocator. Use the pull-down menu on the right-hand side of the bugzilla web site to specify the component for which you wish to receive help.

If you do not have a bugzilla account, please visit the http://bugz.unc.edu/ website and send an e-mail requesting a new account from the bugzilla administrator, and once you have the account use the above link to access the bugzilla site for the Spatial Allocator.

5            Installing Spatial Allocator

The Spatial Allocator is distributed as a gzipped tar file that contains the executable programs, scripts, and the third-party libraries for both the Vector Tools and the Raster Tools. The input data is provided in a separate tar file due to the large size (> 30 GB) of the datasets.  If you have access to a compute server with scratch space for large data files, you might consider installing the data files there.

Choose a directory on your Linux computer to unzip and expand the files. The directory where you expand the code is denoted as <local directory> in this documentation. Use the following command to unzip and expand the tar file:

  • tar -xzvf sa_03_2009.tar.gz

Set the SA_HOME environment variable by adding the following line to your .cshrc file:

  • setenv SA_HOME <local directory>/sa_03_2009

Location of Raster Tools within the Spatial Allocator

The Raster Tools are stored under the $SA_HOME/src/gdal_apps directory. These programs are provided for users who are running on a Red Hat Linux computer using 32-bit compilers. If a different version of Linux is used, users will have to build the third-party software packages (described in the next section) in addition to rebuilding the Raster Tools programs using the Makefile in this directory.

5.1    Third-party software

When running the Raster Tools programs, compiled GDAL, PROJ.4, and netCDF libraries are needed. If users need to rebuild the third-party libraries to run on a different Linux platform than the one cited above (Red Hat Linux with 32-bit compilers), they can use the freely available GNU g++ compiler available at http://gcc.gnu.org/. The source code for each of the third-party libraries is provided under $SA_HOME/src/libs. The README file provides instructions on how to build the third-party libraries, and there is also support documentation provided with each software package.

The Raster Tools use version 4.6 of PROJ.4 with datum transformation support. The README describes the set of datum shift files (proj-datumgrid zip) that were downloaded from the PROJ.4 web site: http://trac.osgeo.org/proj/.  Raster Tools requires a version of the PROJ.4 library that is compiled with datum transformation support so that it can convert data between datums as needed. The datum transformation capability was first added to the Spatial Allocator in Version 3.4.

5.2    Setting up the User Environment

The six programs provided in the Raster Tools were not developed using the original Spatial Allocator code base. The C++ programs do not use any I/O API libraries and do not use the GRIDDESC.txt file to define a grid domain. All information needed by the programs is provided through environment variables. GDAL was not statically built because it includes interfaces to many other packages (e.g., Oracle, GRASS). Building GDAL statically would have required many other libraries (ones that are not used by the Raster Tools) to be available to resolve those references. Because the programs were not statically compiled, the following lines must be added to the user’s .cshrc file in order to use the Raster Tools scripts and programs:

 

# specify the directory containing the Spatial Allocator software
setenv SA_HOME <local directory>/sa_03_2009

# initialize LD_LIBRARY_PATH if it is not yet defined (csh stops on an undefined variable)
if (! $?LD_LIBRARY_PATH) setenv LD_LIBRARY_PATH ""

# set library and include directory for PROJ.4 datum transformation
setenv PROJDIR  ${SA_HOME}/src/libs/proj-4.6.0/local
setenv PROJ_LIBRARY  ${PROJDIR}/lib
setenv PROJ_INCLUDE  ${PROJDIR}/include
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:${PROJ_LIBRARY}

# set library for GDAL
setenv GDALHOME ${SA_HOME}/src/libs/gdal-1.5.2/local/lib
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:${GDALHOME}

# set directory from which to run GDAL application programs
setenv GDALBIN ${SA_HOME}/src/libs/gdal-1.5.2/local/bin/

# set netCDF library and include directory
setenv NETCDF  ${SA_HOME}/src/libs/netcdf-4.0/local
setenv NETCDF_LIB  ${NETCDF}/lib
setenv NETCDF_INC  ${NETCDF}/include
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:${NETCDF_LIB}
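
Before running the scripts, it can be useful to confirm that these settings took effect in the current shell. The following Python sketch is one hypothetical way to do so; the function name check_environment is an assumption and the script is not part of the Spatial Allocator distribution. The variable names are the ones set in the .cshrc lines above.

```python
import os

# Variables set in the .cshrc lines above.
REQUIRED_VARS = ["SA_HOME", "PROJ_LIBRARY", "GDALHOME", "GDALBIN", "NETCDF_LIB"]
# Library directories that must also appear in LD_LIBRARY_PATH.
LIB_VARS = ["PROJ_LIBRARY", "GDALHOME", "NETCDF_LIB"]

def check_environment(env=None):
    """Return a list of problems found; an empty list means the setup looks complete."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    search_path = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    for name in LIB_VARS:
        lib_dir = env.get(name)
        if lib_dir and lib_dir not in search_path:
            problems.append(f"{name} ({lib_dir}) is missing from LD_LIBRARY_PATH")
    return problems

if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

The check only compares strings, so a directory that is listed in LD_LIBRARY_PATH under a different spelling (for example, with a trailing slash) would be reported as missing.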

6            Downloading Input Data

The USGS NLCD Input data has been provided in the  sample_nlcdmodis_data.tar.gz file for Region 14, which includes North Carolina, as this is the domain that the scripts are currently configured to run.  The steps that were used to acquire input data for Region 14 are shown below, and may be used to acquire additional image data.

6.1    USGS NLCD  Input Data

 

The input data must be downloaded by the user according to the size of the modeling domain.  The following example is given for a modeling domain centered over North Carolina.

 

The 30-m-resolution 2001 USGS NLCD data for the United States used by the Raster Tools can be downloaded from http://www.mrlc.gov. From that site:

  • Select Access Data
  • Select NLCD 2001 Data
  • Select Download NLCD 2001 Data
  • Click the Download NLCD 2001 Data link, which redirects to http://www.mrlc.gov/nlcd_multizone_map.php
  • Click on the region of interest on the map of the United States

North Carolina is in Region 14. Clicking on Region 14 gives the option to download three files for this region:

  • Tree Canopy Zip
  • Urban Imperviousness Zip
  • Land Cover Zip

Linux users can download the zip files with the wget command: right-click the link to the zip file, select Copy Link Location, and paste the link after the wget command at the Linux prompt.

Example:

wget http://www.mrlc.gov/multizone/landcover/area_14_landcover.zip

area_14_landcover.zip contained the following:
emerald$ ls -rlt
total 965064
-rw-r----- 1 lizadams cep-emc  46367990 2007-03-06 13:00 landcover14_3k_022007.rrd
-rw-r----- 1 lizadams cep-emc 304563746 2007-04-25 10:18 landcover14_3k_022007.img
drwxr-x--- 8 lizadams cep-emc      4096 2008-09-23 17:35 area_14_landcover_metadata/

 

Once the user has downloaded and unzipped the files for each region of interest, the files should be placed in the following subdirectories if the paths specified in nlcd_files.txt are to be used unchanged:

 

$SA_HOME/data/nlcd/images/landcover

$SA_HOME/data/nlcd/images/canopy

$SA_HOME/data/nlcd/images/imperv

 

6.2    Downloading 2001 NOAA NLCD data for the U.S. Coastal Regions

 

Visit http://www.csc.noaa.gov/crs/lca/locateftp.html

The user is asked to select a region from the U.S. map. Click on North Carolina to see the data available for download.

To download the NC 2001 Land Cover Data, right-click on the link, select Copy Link Location, and paste the link after the wget command. Enclose the link in quotation marks, because it contains a ? character that csh would otherwise treat as a file name wildcard.

Example:

 

wget "http://www.csc.noaa.gov/cgi-bin/crs/BouncePage.cgi?Bounce2=ftp://ftp.csc.noaa.gov/pub/crs/lca/data/nc_nc2001.zip"

 

unzip nc_nc2001.zip

 

The zip file contained:

ls -rlt

total 111512

-rw-r----- 1 lizadams cep-emc    40297 2007-08-31 09:52 z5558_2001.txt

-rw-r----- 1 lizadams cep-emc    30619 2007-08-31 10:26 z60_2001.txt

-rw-r----- 1 lizadams cep-emc 39079856 2008-02-27 14:06 nc_01.img

-rw-r----- 1 lizadams cep-emc 17875591 2008-09-24 12:10 nc_nc2001.zip

 

Once you have downloaded the files for the regions that you are interested in, move the files to the following directory:

$SA_HOME/data/nlcd/images/noaa_landcover/

6.3    MODIS 2001 Land Cover Data

 

The global MODIS dataset, clipped to the North American area and projected into the NLCD projection, is provided under $SA_HOME/data/nlcd/

The files include: na_modis_nlcd.bil, na_modis_nlcd.blw, na_modis_nlcd.hdr, na_modis_nlcd.stx

 

6.4    Specifying Location of Input Data

 

The nlcd_files.txt file must be edited to specify the type of each image file and its name, including the path to the location where the file is stored on your computer.  To examine the nlcd_files.txt file, do the following:

  • cd $SA_HOME/data
  • cat nlcd_files.txt

 

In the nlcd_files.txt file, each identification label line names the source of the images and is followed by the path and name of the image files.  The available labels are:

  • USGS NLCD Landuse Files:
  • USGS NLCD Urban Imperviousness Files:
  • USGS NLCD Tree Canopy Files:
  • NOAA CGAP NLCD Landuse Files:
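
The layout of the file list can be illustrated with a short Python sketch that groups image paths under their labels. This is only an illustration of the format described above, not the parser used by the Raster Tools; the function name parse_image_list is hypothetical, and the sketch assumes that every non-blank line that is not a label is an image path belonging to the most recent label.

```python
KNOWN_LABELS = (
    "USGS NLCD Landuse Files:",
    "USGS NLCD Urban Imperviousness Files:",
    "USGS NLCD Tree Canopy Files:",
    "NOAA CGAP NLCD Landuse Files:",
)


def parse_image_list(text):
    """Map each identification label to the image paths listed beneath it."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines carry no information
        if line in KNOWN_LABELS:
            current = line
            groups.setdefault(current, [])
        elif current is None:
            raise ValueError(f"image path appears before any label: {line}")
        else:
            groups[current].append(line)
    return groups


sample = """\
USGS NLCD Landuse Files:
../data/nlcd/images/landcover/landcover14_3k_022007.img
USGS NLCD Urban Imperviousness Files:
../data/nlcd/images/imperv/impervious14_091406.img
USGS NLCD Tree Canopy Files:
../data/nlcd/images/canopy/canopy14_020207.img
NOAA CGAP NLCD Landuse Files:
../data/nlcd/images/noaa_landcover/nc_01.img
"""

for label, paths in parse_image_list(sample).items():
    print(label, len(paths))
```

A check of this kind can catch a misspelled label or a path placed above the first label before a long preprocessing run is started.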

7            Running Raster Tools Using Scripts

The five scripts listed in Sections 7.1 through 7.5 are provided with the software to assist users with getting started. The scripts are provided to serve as examples only; the user is responsible for providing actual input data for his/her application and customizing the scripts accordingly. The example scripts are located under the directory $SA_HOME/raster_scripts.  These scripts are configured to process images required for a WRF 12 km domain covering North Carolina.

7.1    preProcessLanduseImages.csh

Input:

  • Images (downloaded by the user, in *.img format), location specified in nlcd_files.txt
  • Input Text File (contains the identification label(s), and location/name of input image files)

Environment Variable        Path + File Name
INPUT_NLCDFILES_LIST        ../data/nlcd_files.txt

  • Contents of a sample nlcd_files.txt for North Carolina:

 

USGS NLCD Landuse Files:

../data/nlcd/images/landcover/landcover14_3k_022007.img

USGS NLCD Urban Imperviousness Files:

../data/nlcd/images/imperv/impervious14_091406.img

USGS NLCD Tree Canopy Files:

../data/nlcd/images/canopy/canopy14_020207.img

NOAA CGAP NLCD Landuse Files:

../data/nlcd/images/noaa_landcover/nc_01.img

 

Output:

  • Images (in *.bil format with associated *.bil.aux.xml, *.clr, *.hdr, *.prj files) written to $DATADIR
  • Output Text File (contains the identification labels(s), and location/name of output image files)

Environment Variable        Path + File Name
OUTPUT_NLCDFILES_LIST       ../data/pp_nlcd_files.txt

 

  • Contents of a sample pp_nlcd_files.txt for North Carolina

 

USGS NLCD Landuse Files:

../data/nlcd/landcover14_3k_022007.bil

USGS NLCD Urban Imperviousness Files:

../data/nlcd/impervious14_091406.bil

USGS NLCD Tree Canopy Files:

../data/nlcd/canopy14_020207.bil

NOAA CGAP NLCD Landuse Files:

../data/nlcd/nc_01.bil

Executable:

  • ../bin/preProcessNLCD.exe

 

Run time estimates:

Preprocessing all of the 30-m-resolution NLCD image files, which together cover the contiguous United States, takes about 1 hour 11 minutes:

881.082u 749.211s 1:10:51.93 38.3%      0+0k 0+0io 43pf+0w

Processing the 30 m resolution NLCD images for North Carolina took 2 minutes and 44 seconds to complete:

53.296u 14.501s 2:43.67 41.4%   0+0k 0+0io 29pf+0w
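
The two timing lines above are in the default format of the csh time command: user CPU seconds, system CPU seconds, elapsed wall-clock time, and the CPU time as a percentage of the elapsed time. The Python sketch below shows how the percentage follows from the other three fields; the function names are hypothetical. A CPU percentage well below 100 suggests, but does not prove, that much of the elapsed time was spent waiting on disk I/O.

```python
def elapsed_seconds(clock):
    """Convert 'h:mm:ss.ss' or 'm:ss.ss' to seconds."""
    seconds = 0.0
    for part in clock.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def summarize_time_line(line):
    """Return (cpu_seconds, wall_seconds, cpu_percent) from a csh 'time' line."""
    fields = line.split()
    user = float(fields[0].rstrip("u"))    # e.g. '881.082u'
    system = float(fields[1].rstrip("s"))  # e.g. '749.211s'
    wall = elapsed_seconds(fields[2])      # e.g. '1:10:51.93'
    cpu = user + system
    return cpu, wall, 100.0 * cpu / wall


for line in ("881.082u 749.211s 1:10:51.93 38.3%      0+0k 0+0io 43pf+0w",
             "53.296u 14.501s 2:43.67 41.4%   0+0k 0+0io 29pf+0w"):
    cpu, wall, percent = summarize_time_line(line)
    print(f"cpu={cpu:.1f}s wall={wall:.1f}s cpu%={percent:.1f}")
```

For the first line, 881.082 + 749.211 = 1630.293 CPU seconds over 4251.93 elapsed seconds gives the 38.3% shown in the timing output.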

Analyzing Results:

Image files in the *.img and *.bil formats can be viewed using Global Mapper, available at:

http://mcmcweb.er.usgs.gov/drc/dlgv32pro/index.html

Contents of the preProcessLanduseImages.csh script:

 

#!/bin/csh -f

#****************************************************************************

# Purpose: to get rid of image overlapping areas among downloaded USGS NLCD

#           images (landuse, imperviousness, canopy) and NOAA coastal NLCD

#           images.

#           All image names and locations are stored in a file defined by

#           INPUT_NLCDFILES_LIST.  Users only need to run this program once

#           if they obtain original image data.  

#

# Output: All preprocessed images will be stored in directory defined by

#          DATADIR.  New image file names are listed in file defined by

#          OUTPUT_NLCDFILES_LIST.

#

# Written by:  L. Ran, August 2008

#              the Institute for the Environment at UNC, Chapel Hill

#              in support of the EPA NOAA CMAS Modeling, 2007-2008.

#

# Usage: preProcessLanduseImages.csh

#

#****************************************************************************

 

#

#  Preprocessing image files to get rid of overlapping pixels

#  Important: INPUT_NLCDFILES_LIST file must have file type lines provided in the release sample file.

#

setenv INPUT_NLCDFILES_LIST ../data/nlcd_files.txt

setenv DATADIR "../data/nlcd"

setenv OUTPUT_NLCDFILES_LIST ../data/pp_nlcd_files.txt

../bin/preProcessNLCD.exe

 

7.2    generateGridShapefile.csh

Input Files: None (the modeling grid coordinates are specified in generateGridShapefile.csh)

Executable:

  • ../bin/create_gridPolygon.exe

Output:

  • Output Shape File

 

Environment Variable        Path + File Name
GRID_SHAPEFILE_NAME         ../output/wrf12km_nc.shp

 

Run time estimates:

This script completes in a few seconds.
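
The extent of the grid described by the grid definition variables can be checked by hand before the shapefile is created. The Python sketch below uses the North Carolina values that appear in the example scripts (GRID_XMIN, GRID_YMIN, GRID_XCELLSIZE, GRID_YCELLSIZE, GRID_COLUMNS, GRID_ROWS). It assumes that GRID_XMIN and GRID_YMIN give the lower-left corner of the domain in projection units; the function names and the zero-based cell indexing are hypothetical and do not describe the GRIDID numbering written by create_gridPolygon.exe.

```python
def grid_extent(xmin, ymin, xcellsize, ycellsize, columns, rows):
    """Return (xmin, ymin, xmax, ymax) of the whole domain in projection units."""
    return (xmin, ymin, xmin + columns * xcellsize, ymin + rows * ycellsize)


def cell_corners(xmin, ymin, xcellsize, ycellsize, column, row):
    """Corners of one cell, counterclockwise from its lower-left corner.

    column and row are counted from 0 at the lower-left cell (an assumption
    made for this sketch only).
    """
    x0 = xmin + column * xcellsize
    y0 = ymin + row * ycellsize
    x1 = x0 + xcellsize
    y1 = y0 + ycellsize
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


# Values from the example scripts for the WRF 12-km North Carolina domain.
extent = grid_extent(1392000.0, -552000.0, 12000, 12000, 50, 35)
print("extent:", extent)
print("cells:", 50 * 35)
print("size (km):", (extent[2] - extent[0]) / 1000, "x", (extent[3] - extent[1]) / 1000)
```

With these values the domain contains 50 x 35 = 1750 cells and spans 600 km by 420 km.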

The script sets the grid definition environment variables and then runs create_gridPolygon.exe. The same settings (GRID_PROJ, GRID_ROWS, GRID_COLUMNS, GRID_XMIN, GRID_YMIN, GRID_XCELLSIZE, GRID_YCELLSIZE, and GRID_SHAPEFILE_NAME) appear in the first block of allocateRasterLanduse2WRFGrids.csh, listed in Section 7.3.

7.3    allocateRasterLanduse2WRFGrids.csh

Input:

  • Preprocessed NLCD Images (created by preProcessLanduseImages.csh, in *.bil format), location specified in pp_nlcd_files.txt
  • MODIS Image (provided)
  • Input Text File (contains the identification label(s), and location/name of input image files)

Environment Variable        Path + File Name
OUTPUT_NLCDFILES_LIST       ../data/pp_nlcd_files.txt

 

Output:

  • Output Files

Environment Variable          Path + File Name
OUTPUT_LANDUSE_TEXT_FILE      ../output/nc_wrf12km_landuse.txt
OUTPUT_LANDUSE_NETCDF_FILE    ../output/nc_wrf12km_landuse.nc

 

Executables:

  • ../bin/create_gridPolygon.exe
  • ../bin/toNLCDRaster.exe
  • ../bin/computeGridLandUse.exe

Run time estimates:

For a 12km WRF domain for North Carolina this script takes less than 2 minutes to complete.

Analysis of Results:

  • netCDF format files can be inspected using ncdump, a tool found in the netCDF library.

Example:

cd $SA_HOME/output

$SA_HOME/src/libs/netcdf-4.0/ncdump/ncdump nc_wrf12km_landuse.nc | more

  • To view the netCDF formatted output files, users may use ncview, a visual netCDF browser that can be downloaded from: http://meteora.ucsd.edu/~pierce/ncview_home_page.html

Contents of the allocateRasterLanduse2WRFGrids.csh script:

 

#!/bin/csh -f

#*******************************************************************************

# Purpose:  to generate landuse information for a given modeling domain grid from:

#     1. USGS NLCD 30m Landuse Files

#     2. USGS NLCD 30m Urban Imperviousness files 

#     3. USGS NLCD 30m Tree Canopy Files

#     4. NOAA CGAP Coastal NLCD 30m Landuse Files

#     5. MODIS 1km IGBP Landcover file

#

#     There are three programs involved in computation:

#     1. ../src/gdal_apps/create_gridPolygon.exe -- first users have to

#        create a regular domain grid shapefile with

#        GRIDID item.  Users only need to create the shapefile once. 

#

#     2. ../src/gdal_apps/toNLCDRaster.exe  -- needed to run each time for

#        computing a grid domain landuse information.  It is

#        used to prepare domain grid shapefile and MODIS data for computation

#        in the following program.  The program

#        will create temp_grdshape_nlcd.shp which is the projected grid

#        polygon shapefile in NLCD projection. 

#

#     3. ../src/gdal_apps/computeGridLandUse.exe -- needed to run each time

#        for computing a grid domain landuse information.

#        It computes grid landuse information based on users' selections.  It

#        outputs grid landuse information into a text file

#        and a WRF netcdf file.

#

#

# Written by the Institute for the Environment at UNC, Chapel Hill

# in support of the EPA NOAA CMAS Modeling, 2007-2008.

#

# Written by:   L. Ran, August 2008

#

# Usage: allocateRasterLanduse2WRFGrids.csh

#        We set all needed environment variables for each program even though

#        some share the same

#        environment variables.  So, users can run each program as needed.

#****************************************************************************

 

#===================================================================

#

# Purpose: Create a polygon shapefile for a given modeling domain grids

#

setenv GRID_PROJ "+proj=lcc +a=6370000.0 +b=6370000.0 +lat_1=33 +lat_2=45 +lat_0=40 +lon_0=-97"

 

setenv GRID_ROWS 35

setenv GRID_COLUMNS 50

 

setenv GRID_XMIN 1392000.0

setenv GRID_YMIN -552000.0

 

setenv GRID_XCELLSIZE 12000

setenv GRID_YCELLSIZE 12000

 

setenv GRID_SHAPEFILE_NAME ../output/wrf12km_nc.shp

 

#if you already created this shapefile comment it out

../bin/create_gridPolygon.exe

#===================================================================

 

 

#===================================================================

#

# The following two programs are used to compute landuse information

# from preprocessed NLCD and MODIS images for the modeling domain

# grid shapefile

#

#

# Purpose: Prepare domain grid shapefile and MODIS file to NLCD format and

#          projection

#     1. Project grid shapefile to NLCD projection. 

#        The program will create temp_grdshape_nlcd.shp in current running

#        directory.  

#     2. Rasterize grid shapefile to 30m NLCD grids.

#     3. Project and clip MODIS into grid domain NLCD projection

#

#     GDALBIN: GDAL bin directory, which contains the GDAL programs that are run.

#     GRID_SHAPEFILE_NAME:  the domain grid polygon shapefile which can be

#     created using generateGridShapefile.csh or above scripts.

#     POLYGON_ID: grid ID for each grid in the shapefile

#     GRID_RASTERFILE_NAME: name to store rasterized grid polygons in 30m

#     cell.  It will be deleted when computeGridLandUse.exe is finished.

#     INPUT_MODISFILE: Clipped and processed North American MODIS IGBP

#     landuse image.

#     OUTPUT_MODISFILE: Clipped and regridded MODIS IGBP images in the

#     modeling domain.

#     DATADIR: dir containing all preprocessed NLCD images

#

setenv GDALBIN "../src/libs/gdal-1.5.2/local/bin"

setenv GRID_SHAPEFILE_NAME           ../output/wrf12km_nc.shp   

setenv POLYGON_ID                     GRIDID  

setenv GRID_RASTERFILE_NAME          ../output/wrf12km_nc_30m.bil

setenv DATADIR                       "../data/nlcd"

setenv INPUT_MODISFILE               "../data/nlcd/na_modis_nlcd.bil"

setenv OUTPUT_MODISFILE              ../output/wrf12km_nc_modis.bil

../bin/toNLCDRaster.exe

 

 

#

# Purpose: Compute domain grid landuse information based on gridded shapefile

#          and processed MODIS image and output landuse information into text

#          and netcdf files

#

#grid domain description

setenv GRID_ROWS 35

setenv GRID_COLUMNS 50

setenv GRID_XMIN 1392000.0

setenv GRID_YMIN -552000.0

setenv GRID_XCELLSIZE 12000

setenv GRID_YCELLSIZE 12000

setenv GRID_PROJ "+proj=lcc +a=6370000.0 +b=6370000.0 +lat_1=33 +lat_2=45 +lat_0=40 +lon_0=-97"

setenv POLE_LATITUDE  90

setenv POLE_LONGITUDE  0

 

#input files

setenv GRID_RASTERFILE_NAME          ../output/wrf12km_nc_30m.bil

 

#File contains all preprocessed NLCD image files. It was created by preProcessLanduseImages.csh

setenv OUTPUT_NLCDFILES_LIST         ../data/pp_nlcd_files.txt

setenv OUTPUT_MODISFILE              ../output/wrf12km_nc_modis.bil

 

#INCLUDE data selection

setenv INCLUDE_USGS_LANDUSE          YES

setenv INCLUDE_USGS_IMPERVIOUSNESS   YES

setenv INCLUDE_USGS_CANOPY           YES

setenv INCLUDE_NOAA_LANDUSE          YES

setenv INCLUDE_MODIS                 YES

 

# Output files

setenv OUTPUT_LANDUSE_TEXT_FILE      ../output/nc_wrf12km_landuse.txt

setenv OUTPUT_LANDUSE_NETCDF_FILE    ../output/nc_wrf12km_landuse.nc

../bin/computeGridLandUse.exe

 

#===================================================================
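
The idea behind the last step, combining a rasterized grid with a land cover image, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions and not the algorithm implemented in computeGridLandUse.exe: both rasters are assumed to be aligned 2-D arrays of equal size, a grid ID of 0 is assumed to mark pixels outside the domain, and the function name landuse_percentages is hypothetical.

```python
from collections import Counter, defaultdict


def landuse_percentages(grid_ids, classes, nodata_id=0):
    """Percent of each land cover class inside every grid cell.

    grid_ids and classes are equally sized 2-D lists (rows of pixels):
    grid_ids holds the grid cell ID of each pixel, classes its class code.
    """
    counts = defaultdict(Counter)
    for id_row, class_row in zip(grid_ids, classes):
        for gid, cls in zip(id_row, class_row):
            if gid != nodata_id:  # skip pixels outside the domain
                counts[gid][cls] += 1
    result = {}
    for gid, per_class in counts.items():
        total = sum(per_class.values())
        result[gid] = {cls: 100.0 * n / total for cls, n in per_class.items()}
    return result


# Tiny example: two grid cells; 11, 41 and 82 are NLCD 2001 class codes
# (open water, deciduous forest, cultivated crops).
grid_ids = [[1, 1, 2, 2],
            [1, 1, 2, 0]]
classes = [[41, 41, 11, 11],
           [41, 82, 11, 11]]
print(landuse_percentages(grid_ids, classes))
```

In the example, grid cell 1 is 75 percent class 41 and 25 percent class 82, and grid cell 2 is entirely class 11; the pixel with grid ID 0 is ignored.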

7.4    convertLanduseTxt2WRFNetCDF.csh

 

#!/bin/csh -f

########################################################################

#

# Purpose: This script converts the WRF gridded landuse text file generated by

# allocateRasterLanduse2WRFGrids.csh to netCDF format.  This script

# was created before allocateRasterLanduse2WRFGrids.csh could generate WRF

# netCDF output format.

#

#

# Written by:  Craig A. Mattocks, May 2008

#              Limei Ran, Modified June 2008

#              the Institute for the Environment at UNC, Chapel Hill

#              in support of the EPA NOAA CMAS Modeling, 2007-2008.

#

# Usage:  convertLanduseTxt2WRFNetCDF.csh

#

########################################################################

 

###################

# File attributes #

###################

setenv INPUT_LANDUSE_TEXT_FILENAME ../output/nc_wrf12km_landuse.txt

setenv OUTPUT_LANDUSE_NETCDF_FILENAME ../output/nc_wrf12km_landuse_new.nc

 

#################################################

# WRF grid domain and map projection parameters #

#################################################

setenv GRID_ROWS     35

setenv GRID_COLUMNS  50

 

setenv GRID_XMIN 1392000.0

setenv GRID_YMIN -552000.0

 

setenv GRID_XCELLSIZE 12000

setenv GRID_YCELLSIZE 12000

 

setenv GRID_PROJ "+proj=lcc +a=6370000.0 +b=6370000.0 +lat_1=33 +lat_2=45 +lat_0=40 +lon_0=-97"

 

setenv POLE_LATITUDE  90

setenv POLE_LONGITUDE  0

 

######################################

# Run text --> netCDF file converter #

######################################

../bin/txt2ncf.exe

 

7.5    allocateRasterW2Polys.csh

 

#!/bin/csh -f

#****************************************************************************

# Purpose: to allocate raster weight data to polygons.

# This script file was created to allocate ICLUS SERGOM housing density

# raster to housing units in each census block.

#

# Written by:  L. Ran  06/2008

#****************************************************************************

#

# Allocate raster image weight to polygons:

#     1. Project polygon shapefile into raster file projection and extent

#        grids.

#     2. Rasterize polygon shapefile into raster grid format.

#     3. compute each polygon's raster weight total

#

setenv GDALBIN "../src/libs/gdal-1.5.2/local/bin"

setenv POLYGON_SHAPEFILE_NAME        "../data/tn_pophu2k_aea.shp"

setenv POLYGON_ID                    BLOCK_ID

setenv POLYGON_RASTERFILE_NAME       "../output/tn_pophu2k_rst.bil"

setenv WEIGHT_RASTER_FILE            "../data/tn_hd2k"

setenv WEIGHT_TYPE                   ICLUS_HOUSING_DENSITY

setenv OUTPUT_WEIGHT_NAME            "HU00"

setenv OUTPUT_TEXT_FILE              "../output/tn_allocated_rasterHD2polyHU.csv"

../bin/rasterWtoPolygons.exe
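
Step 3 in the script comments, computing each polygon's raster weight total, amounts to summing the weight raster over the pixels that carry each polygon ID. The Python sketch below illustrates that idea only; it is not the implementation in rasterWtoPolygons.exe. It assumes two aligned 2-D arrays of equal size and an ID of 0 for pixels outside every polygon. The function names and the two-column CSV layout, which borrows the POLYGON_ID and OUTPUT_WEIGHT_NAME values from the script as column headers, are assumptions.

```python
import csv
import io
from collections import defaultdict


def polygon_weight_totals(polygon_ids, weights, nodata_id=0):
    """Sum the weight raster over the pixels belonging to each polygon ID."""
    totals = defaultdict(float)
    for id_row, weight_row in zip(polygon_ids, weights):
        for pid, weight in zip(id_row, weight_row):
            if pid != nodata_id:  # skip pixels outside every polygon
                totals[pid] += weight
    return dict(totals)


def totals_to_csv(totals, id_name="BLOCK_ID", weight_name="HU00"):
    """Format the totals as CSV text, one polygon per line, sorted by ID."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([id_name, weight_name])
    for pid in sorted(totals):
        writer.writerow([pid, totals[pid]])
    return buffer.getvalue()


# Tiny example: two polygons (IDs 10 and 20) on a 2 x 3 pixel raster.
polygon_ids = [[10, 10, 20],
               [10, 20, 0]]
weights = [[1.5, 0.5, 2.0],
           [1.0, 3.0, 9.0]]
print(totals_to_csv(polygon_weight_totals(polygon_ids, weights)))
```

In the example, polygon 10 receives a total of 3.0 and polygon 20 a total of 5.0; the weight of the pixel with ID 0 is not allocated to any polygon.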

8            Visualization of netCDF outputs

 

The netCDF output files can be visualized using ncview.

Figure 2 shows the variables contained in a netCDF output file that was generated by running allocateRasterLanduse2WRFGrids.csh with the following environment variable settings:

setenv INCLUDE_USGS_LANDUSE          YES

setenv INCLUDE_USGS_IMPERVIOUSNESS   NO

setenv INCLUDE_USGS_CANOPY           NO

setenv INCLUDE_NOAA_LANDUSE          NO

setenv INCLUDE_MODIS                 NO

 

Figure 2

Figure 3 shows the variables contained in a netCDF output file that was generated by running allocateRasterLanduse2WRFGrids.csh with the following environment variable settings:

setenv INCLUDE_USGS_LANDUSE          YES

setenv INCLUDE_USGS_IMPERVIOUSNESS   YES

setenv INCLUDE_USGS_CANOPY           YES

setenv INCLUDE_NOAA_LANDUSE          YES

setenv INCLUDE_MODIS                 YES

Figure 3

9            Software Performance

The programs developed in C++ with the GDAL are quite efficient in processing the high number of very large images required to provide updated land use data for air quality and meteorological modeling. It took five to six hours to preprocess all 71 downloaded USGS and NOAA NLCD image files on the UNC Linux server. It took around five hours to compute the WRF 12-km small domain, and around 20 hours to compute WRF 12-km large domain. The rasterized 30-m WRF 12-km Eastern U.S. domain image is more than 40 GB, and the rasterized 30-m WRF 12-km continental U.S. domain image is more than 130 GB. Given that the programs are working at 30-m resolution for such huge domains, we are quite satisfied with the amount of time the programs required to run.

10        Future Enhancements

The GDAL is a set of well-maintained vector and raster spatial data processing libraries. Many well-known GIS packages use it, including GRASS, ArcGIS 9.2, and Google Earth. With additional funding, we could develop a simple GIS graphics package that would use a Java-based interface and would have some simple functions for viewing shapefiles, raster (image) files, or netCDF files. This would support overlaying modeling outputs on top of shapefiles (such as political boundaries, cities, population, roads) and on top of image files such as satellite images or land use images.

 

We could also add to the interface some basic GIS functions that air quality modelers need, such as the ability to create domain grids, to generate surrogates, and to generate land use information for a domain. Air quality modelers could then use this package to view their data and perform the spatial data functions they need without purchasing and using commercial GIS packages.

 

Additional enhancements include: 1) developing a program to extract preprocessed GEOS satellite data for a modeling grid with a user-defined domain and to output the data in WRF netCDF format, and 2) adding a variable grid option for land cover computation, surrogate computation, and BELD3 data processing.