DBF, SHP, SHX (index file, topological relationship, order; when join to other files); PRJ, XML (metadata)
Data Models & Structure
Data Models: the logical means of data storage and organization for use in an information system → vector, raster, TIN
Data Structures: the logical and physical means by which a map feature or an attribute is digitally encoded (i.e., file type) → e.g., GDB, shapefile, coverage, ESRI grid, ERDAS image file
Vector
point, line (2 or more points), polygon (3 or more points, closed line)
georelational: links what with where
data structure:
shape file (spaghetti): Each contains 1 feature class (point, line, polygon); geometric data is linked with a separate attribute table in the “Shape” field in a binary format
coverage (topological): Each contains 1+ feature class (point “node” + tics, line “arc”, polygon, annotation); arc: does not store the coordinates of each point, but lists, for each point, the points connected to it
geodatabase: links what with where by how
topology: adjacency (by arc; left, right polygon), connectivity (by polygon; # of arcs, list of arcs)
arc: lines of common boundary between 2 polygons
node: starting/ending point of arcs
critical points: any points within an arc that define the angle and distance of an arc segment
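The arc-node topology above can be sketched as a small table of arcs. This is a minimal illustration, not how any particular coverage format stores its data: the `Arc` class, the two polygons "A" and "B", and the "outside" label for the surrounding universe polygon are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    arc_id: int
    from_node: int   # starting node of the arc
    to_node: int     # ending node of the arc
    left_poly: str   # polygon on the left when walking from_node -> to_node
    right_poly: str  # polygon on the right


# Hypothetical coverage: polygons A and B share arc 1.
arcs = [
    Arc(1, from_node=1, to_node=2, left_poly="A", right_poly="B"),
    Arc(2, from_node=2, to_node=1, left_poly="A", right_poly="outside"),
    Arc(3, from_node=1, to_node=2, left_poly="B", right_poly="outside"),
]


def adjacency(arcs):
    """Adjacency by arc: each arc names its left and right polygon."""
    pairs = set()
    for arc in arcs:
        if "outside" not in (arc.left_poly, arc.right_poly):
            pairs.add(tuple(sorted((arc.left_poly, arc.right_poly))))
    return pairs


def connectivity(arcs):
    """Connectivity by polygon: the list of arcs that bound each polygon."""
    poly_arcs = {}
    for arc in arcs:
        for poly in (arc.left_poly, arc.right_poly):
            if poly != "outside":
                poly_arcs.setdefault(poly, []).append(arc.arc_id)
    return poly_arcs


print(adjacency(arcs))
print(connectivity(arcs))
```

Because each shared boundary is stored once, adjacency falls out of a lookup rather than a geometric comparison.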
Raster
data structure
single layer: uncompressed, run-length encoding, Quadtree compression
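Run-length encoding can be illustrated on a single raster row. A minimal sketch, assuming each run is stored as a (value, count) pair; real formats differ in how they pack the runs.

```python
def rle_encode(row):
    """Collapse consecutive equal cell values into (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    row = []
    for value, count in runs:
        row.extend([value] * count)
    return row


row = [1, 1, 1, 2, 2, 3]
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)
```

The saving depends on how homogeneous the row is: a row with no repeated neighbors gets longer, not shorter.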
multiple layers: BSQ, BIP, BIL
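The three multi-band layouts differ only in the order in which band, row, and column are looped over. A small sketch on a made-up 2-band, 2x2 image (the values are arbitrary):

```python
# bands[b][r][c]: hypothetical 2-band image, 2 rows x 2 columns
bands = [
    [[1, 2], [3, 4]],   # band 1
    [[5, 6], [7, 8]],   # band 2
]


def to_bsq(bands):
    """Band sequential: all of band 1, then all of band 2, ..."""
    return [v for band in bands for row in band for v in row]


def to_bil(bands):
    """Band interleaved by line: row 1 of every band, then row 2 of every band, ..."""
    n_rows = len(bands[0])
    return [v for r in range(n_rows) for band in bands for v in band[r]]


def to_bip(bands):
    """Band interleaved by pixel: all band values of pixel 1, then pixel 2, ..."""
    n_rows, n_cols = len(bands[0]), len(bands[0][0])
    return [band[r][c] for r in range(n_rows) for c in range(n_cols) for band in bands]


print(to_bsq(bands))
print(to_bil(bands))
print(to_bip(bands))
```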
pixel classification problem
TIN
DEM: digital representation of the topography
TIN is generated by (source: https://www.researchgate.net/publication/325271931_Comparison_of_Inverse_Distance_Weighted_and_Natural_Neighbor_Interpolation_Method_at_Air_Temperature_Data_in_Malang_Region):
Thiessen polygons
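Given a set of points, the polygon that defines, for each point, the region that is closer to that point than to any other.
In other words, a partition of space that lets you find the nearest point (observation site) for any given location.
Delaunay triangulation (a network of triangles is drawn such that the circle that surrounds the three points of the triangle contains no other points)
For the same set of points, a triangular network is built subject to one condition:
No other point lies inside the circumcircle of any triangle.
Satisfying this condition produces well-shaped triangles (good triangle shape quality).
TIN generation actually uses Delaunay triangulation, which is intrinsically linked to the Voronoi (Thiessen) structure.
Thiessen polygons and the Delaunay triangulation are duals of each other:
Connecting the generating points of Thiessen polygons that share an edge produces the Delaunay triangles.
Conversely, connecting the circumcenters of adjacent Delaunay triangles produces the Voronoi (Thiessen) polygons.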
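The empty-circumcircle condition can be checked directly. The sketch below is only the test for one candidate triangle, not a full triangulation algorithm; the function names and sample coordinates are made up for illustration.

```python
import math


def circumcircle(a, b, c):
    """Return (center_x, center_y, radius) of the circle through points a, b, c."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        raise ValueError("points are collinear; no circumcircle exists")
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, math.hypot(ax - ux, ay - uy)


def satisfies_delaunay(triangle, other_points):
    """True if no other point lies strictly inside the triangle's circumcircle."""
    ux, uy, radius = circumcircle(*triangle)
    return all(math.hypot(px - ux, py - uy) >= radius for px, py in other_points)


triangle = ((0, 0), (4, 0), (0, 4))
print(circumcircle(*triangle))
print(satisfies_delaunay(triangle, [(5, 5)]))   # point outside the circle
print(satisfies_delaunay(triangle, [(3, 3)]))   # point inside the circle
```

The circumcenter returned here is also the Voronoi (Thiessen) vertex associated with the triangle, which is the duality described above.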
good for pseudo-3D rendering; bad in that it is more computationally intensive than raster
XML approach for distributing and storing geographic information
neither a presentation language nor a programming language
trending: KML, GeoJSON
Paper discussion
Zhou, M., Chen, J., & Gong, J. (2016). Rendering interior-filled polygonal vector data in a virtual globe. International Journal of Geographical Information Science, 30(11), 2208-2229. http://dx.doi.org/10.1080/13658816.2016.1165819
value at each grid cell location is equivalent to the value at the nearest observation (only one value)
using Thiessen Polygons
Advantage: most appropriate for qualitative data
Disadvantage: resulting surface is discontinuous with step-wise gradation between observed values, how to treat data on boundary?
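The nearest-observation rule described above is short enough to write out. A minimal sketch, assuming observations are (x, y, value) tuples; the land cover labels are invented to show why the method suits qualitative data.

```python
import math


def nearest_neighbor(obs, x, y):
    """Return the value of the observation closest to (x, y)."""
    nearest = min(obs, key=lambda o: math.hypot(o[0] - x, o[1] - y))
    return nearest[2]


# Hypothetical qualitative observations: no averaging is needed or possible.
obs = [(0, 0, "forest"), (10, 0, "urban")]
print(nearest_neighbor(obs, 3, 0))
print(nearest_neighbor(obs, 6, 0))
```

A location exactly halfway between two observations is the boundary case the notes ask about: here `min` simply keeps the first observation in the list, which is an arbitrary tie-break.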
Average Neighbor Interpolation
value at each grid cell location is an average of the value at the nearby observations
Advantages: Fast, Smooths the data
Disadvantages: Requires subjective selection of r (and hence k), Does not extrapolate (=estimate) beyond the minimum and maximum of observed values, Does not consider the distance and direction of observations, Not an exact interpolator (results differ depending on the window, neighborhood, …)
can do sensitivity test to justify results
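A sketch of average neighbor interpolation with a search radius r, assuming observations are (x, y, value) tuples. Varying r in a loop is one simple form of the sensitivity test mentioned above; the sample values are invented.

```python
import math


def average_neighbor(obs, x, y, r):
    """Unweighted mean of all observations within radius r of (x, y)."""
    near = [z for ox, oy, z in obs if math.hypot(ox - x, oy - y) <= r]
    if not near:
        return None  # no observation falls inside the window
    return sum(near) / len(near)


obs = [(0, 0, 10.0), (4, 0, 20.0), (20, 0, 90.0)]
for r in (1, 5, 30):
    # The estimate at the same location changes with the chosen radius.
    print(r, average_neighbor(obs, 2, 0, r))
```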
Inverse Distance Weighted (IDW) Interpolation
value at each grid cell location is a distance weighted average of the values at the nearby observations
p parameter
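The smaller p is (e.g., p=1) → weight decreases gradually with distance → distant points still have a relatively large influence
The larger p is (e.g., p=3, 4) → weight decreases sharply with distance → nearby points have far greater influence and distant points have almost none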
Advantages: Results in a continuous and smooth surface, Is an exact interpolator (i.e. derived surface passes through observed values)
Disadvantages: Requires subjective selection of parameters (k and p), Does not extrapolate beyond the minimum and maximum of observed values, Does not consider direction of observations
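The effect of p is easiest to see numerically. A minimal IDW sketch with weights 1/d^p; it uses every observation rather than only the k nearest, which is a simplification, and the two sample observations are invented.

```python
import math


def idw(obs, x, y, p=2.0):
    """Inverse distance weighted estimate at (x, y) from (x, y, value) observations."""
    num = den = 0.0
    for ox, oy, z in obs:
        d = math.hypot(ox - x, oy - y)
        if d == 0.0:
            return z  # exact interpolator: return the observed value itself
        w = 1.0 / d ** p
        num += w * z
        den += w
    return num / den


obs = [(0, 0, 10.0), (10, 0, 20.0)]
for p in (1, 2, 4):
    # Larger p pulls the estimate toward the nearer observation (value 10).
    print(p, round(idw(obs, 2, 0, p), 3))
```

Because the result is a weighted average with positive weights, it always stays between the minimum and maximum observed values, which is why IDW cannot extrapolate beyond them.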
Polynomial
Best fit a smooth surface that is defined by a mathematical function to the observed locations
Order: a coefficient in determining how many “bends” the surface can have in fitting the observations
First: a flat surface (e.g. linear)
Second: a surface with one bend (e.g. quadratic)
Third: a surface with two bends (e.g. cubic)
Nth order polynomial
Global
Fit through all the points. The global surface changes gradually and captures coarse-scale pattern in the data.
Advantages: Allow extrapolation, Less subjective selection of parameter (only order of polynomial), Simple and easy to use
Disadvantages: Not an exact interpolator, Less applicable to a complex surface
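A first-order global polynomial is a plane z = b0 + b1*x + b2*y fitted to all observations by least squares. The sketch below solves the normal equations with plain Gaussian elimination; it is one illustrative way to do the fit, and the helper names and sample points are hypothetical.

```python
def solve_linear(matrix, vector):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution


def fit_plane(obs):
    """Least-squares coefficients [b0, b1, b2] of z = b0 + b1*x + b2*y."""
    rows = [(1.0, x, y) for x, y, _ in obs]
    zs = [z for _, _, z in obs]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    return solve_linear(normal, rhs)


def predict(coeffs, x, y):
    b0, b1, b2 = coeffs
    return b0 + b1 * x + b2 * y


# Sample points lying exactly on z = 1 + 2x + 3y
obs = [(0, 0, 1.0), (1, 0, 3.0), (0, 1, 4.0), (1, 1, 6.0)]
coeffs = fit_plane(obs)
print(coeffs)
print(predict(coeffs, 2, 2))  # a location outside the observed extent
```

The prediction at (2, 2) lies outside the observed value range, which is the extrapolation that the distance-weighted methods cannot provide.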
Local
Seek a balance between the global trend and local influence. Fit multiple surfaces, one for each overlapping neighborhood
Advantages: Allow extrapolation, A “pseudo” exact interpolator, Can model complex surface
Disadvantages: Subjective selection of parameters (order of polynomial and global/local trend), More complicated procedure in selecting neighborhood size/shape
Radial Basis Function (RBF)
Fit a smooth surface that passes through the observed locations by minimizing the total curvature of the surface (i.e. change of slope); often referred to as a spline
Advantage: Exact Interpolator, Allow extrapolation, Results in a smooth surface with gentler variation than IDW
Disadvantage: Does not work well with complex surface with abrupt changes, Mathematically complex
Kriging
Semivariance
First Law of Geography
Spatial autocorrelation: similarity of spatial pattern based on proximity
Semivariance: dissimilarity between observations (= half of the squared difference)
sill (height), range (distance where sill occurs), nugget (initial semivariance where distance is zero)
gives an idea to set neighborhood
An anisotropic surface has observations varying with direction
An isotropic surface has the same variation at each direction
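An empirical semivariogram averages half the squared difference over all pairs of observations that fall in the same distance (lag) class. A minimal isotropic sketch, assuming (x, y, value) tuples and a fixed lag width; the transect values are invented.

```python
import math
from collections import defaultdict


def empirical_semivariogram(obs, lag_width):
    """Return {lag_distance: semivariance}, ignoring direction (isotropic)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(obs)):
        for j in range(i + 1, len(obs)):
            xi, yi, zi = obs[i]
            xj, yj, zj = obs[j]
            h = math.hypot(xi - xj, yi - yj)
            lag = int(h / lag_width + 0.5)  # nearest lag class
            sums[lag] += (zi - zj) ** 2
            counts[lag] += 1
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return {lag * lag_width: sums[lag] / (2 * counts[lag]) for lag in sorted(sums)}


# Hypothetical transect: values diverge as separation grows
obs = [(0, 0, 1.0), (1, 0, 2.0), (2, 0, 4.0), (3, 0, 7.0)]
for h, gamma in empirical_semivariogram(obs, lag_width=1.0).items():
    print(h, round(gamma, 3))
```

Plotting these values against lag distance is what reveals the nugget, sill, and range; an anisotropic analysis would also bin the pairs by direction.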
Kriging is a geostatistical method for spatial interpolation that is composed of 3 components: Spatial autocorrelation, A trend, Random errors
Types of Kriging
Ordinary Kriging – assumes the trend is an unknown constant
Simple Kriging – assumes the trend is a known constant
Universal Kriging – assumes the trend is a known deterministic function
Cokriging – uses the correlation of additional variable in calculating spatial autocorrelation
Evaluation
Cross Validation: assess quality of interpolation function
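Leave-one-out cross validation removes each observation in turn, predicts it from the rest, and summarizes the errors. The sketch below uses a simple IDW function as the interpolator under test and RMSE as the summary; both choices, and the sample data, are assumptions made for illustration.

```python
import math


def idw(obs, x, y, p=2.0):
    """Inverse distance weighted estimate (all observations, weights 1/d^p)."""
    num = den = 0.0
    for ox, oy, z in obs:
        d = math.hypot(ox - x, oy - y)
        if d == 0.0:
            return z
        w = 1.0 / d ** p
        num += w * z
        den += w
    return num / den


def loo_rmse(obs, interpolate, **params):
    """Leave-one-out RMSE of an interpolation function."""
    squared_errors = []
    for i, (x, y, z) in enumerate(obs):
        rest = obs[:i] + obs[i + 1:]          # hold out observation i
        predicted = interpolate(rest, x, y, **params)
        squared_errors.append((predicted - z) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


obs = [(0, 0, 10.0), (10, 0, 20.0), (5, 0, 15.0)]
for p in (1, 2, 3):
    # Comparing RMSE across p is one way to justify a parameter choice.
    print(p, round(loo_rmse(obs, idw, p=p), 3))
```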
How to pick an algorithm? Data type, error at observations, error between observations, general form of error?
Paper discussion
Comber, A., & Zeng, W. (2019). Spatial interpolation using areal features: A review of methods and opportunities using new forms of data with coded illustrations. Geography Compass, 13(10), e12465. https://doi.org/10.1111/gec3.12465
Santangelo, M., Marchesini, I., Bucci, F., Cardinali, M., Fiorucci, F., & Guzzetti, F. (2015). An approach to reduce mapping errors in the production of landslide inventory maps. Natural Hazards and Earth System Science, 15(9), 2111-2126. https://doi.org/10.5194/nhess-15-2111-2015
Xie, P., Liu, Y., He, Q., Zhao, X., & Yang, J. (2017). An efficient vector-raster overlay algorithm for high-accuracy and high-efficiency surface area calculations of irregularly shaped land use patches. ISPRS International Journal of Geo-Information, 6(6), 156. http://dx.doi.org/10.3390/ijgi6060156
statistical, others (e.g. area, perimeter, region group[values→zone index])
global
(e.g. how far each cell is from a specific cell)
Data Flow Diagram
graphical representation of the logical steps in linking between an input and an output
Paper discussion
Pronk, M., Hooijer, A., Eilander, D., Haag, A., de Jong, T., Vousdoukas, M., ... & Eleveld, M. (2024). DeltaDTM: A global coastal digital terrain model. Scientific Data, 11(1), 273. https://doi.org/10.1038/s41597-024-03091-9
Devine, J. A., Currit, N., Reygadas, Y., Liller, L. I., & Allen, G. (2020). Drug trafficking, cattle ranching and Land use and Land cover change in Guatemala’s Maya Biosphere Reserve. Land Use Policy, 95, 104578. https://doi.org/10.1016/j.landusepol.2020.104578
Surface: a continuous field of values that may vary over space
Surface Modeling: to extract useful information about the surface (E.g. spatial interpolation)
Topographic Modeling: to extract useful topographic information about the terrain landscape
Topographic Modeling
Contour
a line that connects points of equal value over a surface
Contour interval represents the break value of the vertical distance between contour lines
Base contour is the minimum value of contouring
Slope (focal operation)
the gradient of elevation change over a certain distance (where/how fast the surface is changing)
Rise over run
Expressed as degree or percent
slope of slope: can detect slope change (high value: big change of slope, edge of slope)
Aspect
the direction of the sloping surface (which direction the surface is changing)
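Slope and aspect are both focal operations on a 3x3 window. The sketch below uses Horn's weighted finite differences, one common choice; it assumes row 0 of the window is the northern row and reports aspect as a compass bearing of the downslope direction. The sample windows are invented.

```python
import math


def slope_aspect(window, cellsize):
    """Return (slope_degrees, slope_percent, aspect_degrees) for the center cell.

    window is a 3x3 list of elevations with row 0 to the north.
    aspect is the downslope compass bearing (0 = north, 90 = east), None if flat.
    """
    (a, b, c), (d, _, f), (g, h, i) = window
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cellsize)  # change toward east
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cellsize)  # change toward south
    rise_over_run = math.hypot(dz_dx, dz_dy)
    slope_deg = math.degrees(math.atan(rise_over_run))
    slope_pct = rise_over_run * 100.0
    if rise_over_run == 0:
        return slope_deg, slope_pct, None
    # Downslope vector: east component -dz_dx, north component dz_dy
    aspect = math.degrees(math.atan2(-dz_dx, dz_dy)) % 360
    return slope_deg, slope_pct, aspect


east_facing = [[3, 2, 1], [3, 2, 1], [3, 2, 1]]    # elevation drops toward the east
south_facing = [[3, 3, 3], [2, 2, 2], [1, 1, 1]]   # elevation drops toward the south
print(slope_aspect(east_facing, cellsize=1.0))
print(slope_aspect(south_facing, cellsize=1.0))
```

Running the same function on a slope raster instead of an elevation raster gives the "slope of slope" mentioned above.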
Shaded Relief
Hillshading simulates how the terrain looks with the interaction between a hypothetical light source and surface features
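The interaction between light source and surface is usually computed from slope and aspect with the standard illumination formula. A sketch under the assumption of a light source at azimuth 315° and altitude 45° (a common default, chosen here for illustration) and output scaled to 0-255:

```python
import math


def hillshade(slope_deg, aspect_deg, azimuth_deg=315.0, altitude_deg=45.0):
    """Illumination (0-255) of a surface cell under a hypothetical light source."""
    zenith = math.radians(90.0 - altitude_deg)
    slope = math.radians(slope_deg)
    relative = math.radians(azimuth_deg - aspect_deg)  # angle between sun and aspect
    value = (math.cos(zenith) * math.cos(slope)
             + math.sin(zenith) * math.sin(slope) * math.cos(relative))
    return max(0.0, 255.0 * value)  # cells facing away from the light are clamped to 0


print(round(hillshade(0, 0), 1))       # flat ground
print(round(hillshade(45, 315), 1))    # 45-degree slope facing the light
print(round(hillshade(45, 135), 1))    # 45-degree slope facing away from the light
```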
Viewshed
Line of sight determines the visibility of a specific target from an observation point (visibility)
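A line-of-sight test can be sketched on a one-dimensional elevation profile between observer and target. This assumes equally spaced samples and ignores earth curvature and refraction; the function name and profiles are hypothetical.

```python
def is_visible(profile, observer_offset=0.0):
    """True if the last cell of the profile is visible from the first.

    profile: elevations sampled at equal spacing from observer to target.
    observer_offset: eye height above the ground at the observer cell.
    """
    eye = profile[0] + observer_offset
    target = profile[-1]
    n = len(profile) - 1
    for i in range(1, n):
        sight_line = eye + (target - eye) * i / n  # height of the sight line at cell i
        if profile[i] > sight_line:
            return False  # terrain rises above the line and blocks the view
    return True


print(is_visible([10, 5, 5, 8]))
print(is_visible([10, 12, 5, 8]))
print(is_visible([10, 12, 5, 8], observer_offset=5))
```

A viewshed repeats this test from the observation point to every cell of the raster.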
Cut/Fill
summarizes the areas and volumes of change between two surfaces. It identifies the areas and volume of the surface that have been modified by the addition or removal of surface material
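Cut/fill is a cell-by-cell difference of two surfaces multiplied by the cell area. A minimal sketch that reports cut and fill separately to avoid any sign convention; the grids and the 2 m cell size are invented.

```python
def cut_fill(before, after, cell_area):
    """Summarize areas and volumes of change between two elevation grids."""
    result = {"cut_area": 0.0, "cut_volume": 0.0, "fill_area": 0.0, "fill_volume": 0.0}
    for row_before, row_after in zip(before, after):
        for z_before, z_after in zip(row_before, row_after):
            diff = z_after - z_before
            if diff > 0:      # surface raised: material added
                result["fill_area"] += cell_area
                result["fill_volume"] += diff * cell_area
            elif diff < 0:    # surface lowered: material removed
                result["cut_area"] += cell_area
                result["cut_volume"] += -diff * cell_area
    return result


before = [[10, 10], [10, 10]]
after = [[12, 10], [9, 10]]
print(cut_fill(before, after, cell_area=4.0))  # 2 m x 2 m cells
```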
Paper discussion
Ruzickova, K., Ruzicka, J., & Bitta, J. (2021). A new GIS-compatible methodology for visibility analysis in digital surface models of earth sites. Geoscience Frontiers, 12(4), 101109. https://doi.org/10.1016/j.gsf.2020.11.006
Camelli, F., Lien, J. M., Shen, D., Wong, D. W., Rice, M., Löhner, R., & Yang, C. (2012). Generating seamless surfaces for transport and dispersion modeling in GIS. Geoinformatica, 16(2), 307-327. https://doi.org/10.1007/s10707-011-0138-3
Wang, S., Wang, M., & Liu, Y. (2021). Access to urban parks: Comparing spatial accessibility measures using three GIS-based approaches. Computers, Environment and Urban Systems, 90, 101713. https://doi.org/10.1016/j.compenvurbsys.2021.101713
Zou, H., Yue, Y., Li, Q., & Yeh, A. G. O. (2012). An improved distance metric for the interpolation of link-based traffic data using kriging: A case study of a large-scale urban road network. International Journal of Geographical Information Science, 26(4), 667–689. https://doi.org/10.1080/13658816.2011.609488
Atkinson, D. M., Deadman, P., Dudycha, D., & Traynor, S. (2005). Multi-criteria evaluation and least cost path analysis for an arctic all-weather road. Applied Geography, 25(4), 287–307. https://doi.org/10.1016/j.apgeog.2005.08.001
lab 6: Distances, Least Cost Path and Exposure Modeling
Euclidean Distance, Polyline to Raster, Cost Distance, Cost Path, Raster Calculator, Raster to Polyline, Calculate Geometry
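The idea behind cost distance and cost path can be sketched with Dijkstra's algorithm on a cost raster. This is an illustration of the concept, not the lab tools' implementation: it allows only 4-neighbor moves, and charging each move the average cost of the two cells is an assumption of the sketch.

```python
import heapq


def cost_distance(cost, source):
    """Accumulated least cost from source to every cell, plus back-links."""
    rows, cols = len(cost), len(cost[0])
    acc = [[float("inf")] * cols for _ in range(rows)]
    back = {}
    acc[source[0]][source[1]] = 0.0
    heap = [(0.0, source)]
    while heap:
        dist, (r, c) = heapq.heappop(heap)
        if dist > acc[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                step = (cost[r][c] + cost[nr][nc]) / 2
                if dist + step < acc[nr][nc]:
                    acc[nr][nc] = dist + step
                    back[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (dist + step, (nr, nc)))
    return acc, back


def cost_path(back, source, target):
    """Trace the back-links from target to source and return the path in order."""
    path = [target]
    while path[-1] != source:
        path.append(back[path[-1]])
    return path[::-1]


cost = [[1, 1, 1],
        [1, 9, 1],   # expensive cell in the middle
        [1, 1, 1]]
acc, back = cost_distance(cost, source=(0, 0))
print(acc[2][2])
print(cost_path(back, (0, 0), (2, 2)))
```

The least cost path goes around the expensive center cell even though a straight line would cross it, which is the difference from Euclidean distance.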
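Data used in GIS (terrain, soil, climate, etc.) contain uncertainty because of measurement error, resolution, and model simplification.
The goal is to understand quantitatively how this uncertainty affects analysis results (uncertainty propagation).
Emphasizes that one must consider not only errors in the input data but also model structure and spatial interactions.
⚙️ Why uncertainty propagation is complex
Non-linearity of GIS models
Even a small change in the inputs changes the result non-linearly → simple error calculation is difficult.
Spatial correlation
Data from neighboring areas are not independent → probabilistic modeling becomes complex.
Model coupling
When several GIS tools are chained, the uncertainty of one step carries over into the next.
Shape of the probability distribution
Input data may not follow a normal distribution → simple statistical approaches are limited.
🔍 Main approaches
Error propagation formulas
Compute mathematically how error is transmitted, but suit only linear models.
Monte Carlo simulation
Repeatedly generates input values at random to estimate the distribution of results.
Computationally heavy, but applicable to complex models.
Geostatistical methods (geostatistics)
Estimate the spatial distribution of uncertainty while accounting for spatial correlation.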
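Monte Carlo propagation can be sketched for a single slope value computed from two elevations. The elevations, the error standard deviation, and the Gaussian error model are all hypothetical, and the two errors are drawn independently, so this sketch deliberately ignores the spatial correlation that the notes warn about.

```python
import math
import random
import statistics


def slope_deg(z1, z2, distance):
    """Slope in degrees between two cells; a non-linear function of the inputs."""
    return math.degrees(math.atan((z2 - z1) / distance))


def monte_carlo_slope(z1, z2, distance, sd, n=5000, seed=42):
    """Perturb both elevations n times and summarize the resulting slopes."""
    rng = random.Random(seed)
    sims = [
        slope_deg(z1 + rng.gauss(0, sd), z2 + rng.gauss(0, sd), distance)
        for _ in range(n)
    ]
    return statistics.mean(sims), statistics.stdev(sims)


print(slope_deg(100.0, 110.0, 10.0))  # slope from the error-free inputs
mean_slope, sd_slope = monte_carlo_slope(100.0, 110.0, 10.0, sd=1.0)
print(round(mean_slope, 2), round(sd_slope, 2))
```

The spread of the simulated slopes is the propagated uncertainty; replacing the independent draws with spatially correlated error fields would be the geostatistical refinement.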
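🧩 Case study summary
Real GIS analysis examples (e.g., terrain slope analysis, aluminum concentration prediction) show that errors in the input data strongly affect the spatial pattern of the results.
In particular, ignoring spatial correlation leads to over- or underestimation of uncertainty.
📊 Conclusion
Uncertainty analysis in GIS is not a simple “error calculation” but an analysis of a compound system with spatial, statistical, and model components.
Researchers and practitioners should treat uncertainty explicitly throughout the modeling process.
Future research should consider computational efficiency together with the accuracy of spatial representation.
🧠 Key points
Uncertainty analysis is essential for making GIS results more reliable.
Approaches are needed that go beyond linear models to account for non-linearity, spatial correlation, and compound inputs.
The Monte Carlo method is the most general and practical.
"Why isn't it simple?" → Because real-world data and models are all intricately intertwined.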
Keil, J., O'Meara, D., Korte, A., Edler, D., Dickmann, F., & Kuchinke, L. (2024). How to visualize the spatial uncertainty of landmark representations in maps?. Journal of Environmental Psychology, 99, 102441. https://doi.org/10.1016/j.jenvp.2024.102441