Electronics and Control

Remote sensing image semantic labeling based on conditional random field

  • YANG Junli ,
  • JIANG Zhiguo ,
  • ZHOU Quan ,
  • ZHANG Haopeng ,
  • SHI Jun
  • 1. School of Astronautics, Beihang University, Beijing 100191, China;
    2. Beijing Key Laboratory of Digital Media, Beijing 100191, China;
    3. School of Communication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
YANG Junli, female, Ph.D. candidate. Research interests: pattern recognition, machine learning, computer vision, and remote sensing image classification and recognition. Tel: 010-82338061 E-mail: junliyang0406@gmail.com

Received date: 2014-08-29

  Revised date: 2014-12-18

  Online published: 2015-01-12

Supported by

National Natural Science Foundation of China (61371134, 61071137, 60776793); Fundamental Research Funds for the Central Universities


Abstract

Remote sensing images contain abundant information and complex textures, and semantic labeling of remote sensing images provides important information and clues for subsequent object recognition, detection, scene analysis, and high-level semantic extraction, which makes it a key and highly challenging task in remote sensing image understanding. To address this task, we propose to model the low-level features and context information of remote sensing images within the conditional random field (CRF) framework. Texton texture features are combined with the association potential of the CRF to capture the texture layout and its contextual distribution in remote sensing images, and Texton feature selection and model parameter learning are carried out with the joint Boosting algorithm. Color information in the Lab color space is then incorporated into the interaction potential of the CRF to describe the color context. Finally, a Graph Cut algorithm is used to infer the CRF model and obtain automatic semantic labeling results. In addition, we establish an optical remote sensing image database, Google-4, and annotate all of its images manually. Experimental results on Google-4 show that modeling remote sensing images with the CRF framework combined with Texton texture features and color features yields more accurate semantic labeling than support vector machine (SVM) based methods.
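The labeling model described in the abstract, a pairwise CRF whose energy sums an association (unary) potential per pixel and a contrast-sensitive interaction (pairwise) potential per neighboring pair, can be illustrated with a deliberately tiny sketch. This is not the paper's implementation: the paper learns the unary term from Texton features with joint Boosting and infers with Graph Cut, whereas the snippet below uses made-up unary costs, a scalar intensity in place of Lab color, and brute-force enumeration of labelings, which is feasible only on a toy 2x2 "image". All names and values are illustrative.

```python
import itertools
import numpy as np

# Toy 2x2 "image": per-pixel unary costs for 2 classes (e.g. water / land).
# In the paper the unary (association) term is learned from Texton features
# by joint Boosting; here the costs are made up for illustration.
unary = np.array([
    [[0.2, 1.5], [0.3, 1.2]],   # row 0: both pixels prefer class 0
    [[1.4, 0.1], [1.1, 0.4]],   # row 1: both pixels prefer class 1
])  # shape (H, W, num_labels)

# Per-pixel "color" feeding the contrast-sensitive pairwise term
# (the paper uses Lab color; a scalar intensity stands in for it here).
color = np.array([[0.10, 0.15],
                  [0.90, 0.85]])

H, W, L = unary.shape
beta = 1.0   # pairwise weight
sigma = 0.2  # contrast scale

def pairwise(ci, cj, yi, yj):
    """Contrast-sensitive Potts term: differing labels are penalized,
    but less so across a strong color edge."""
    if yi == yj:
        return 0.0
    return beta * np.exp(-((ci - cj) ** 2) / (2 * sigma ** 2))

def energy(labels):
    """Total CRF energy: unary costs plus 4-connected pairwise costs."""
    e = sum(unary[i, j, labels[i, j]] for i in range(H) for j in range(W))
    for i in range(H):
        for j in range(W):
            if j + 1 < W:  # right neighbor
                e += pairwise(color[i, j], color[i, j + 1],
                              labels[i, j], labels[i, j + 1])
            if i + 1 < H:  # down neighbor
                e += pairwise(color[i, j], color[i + 1, j],
                              labels[i, j], labels[i + 1, j])
    return e

# Brute-force MAP labeling; at real image sizes Graph Cut replaces this.
best = min((np.array(lab).reshape(H, W)
            for lab in itertools.product(range(L), repeat=H * W)),
           key=energy)
print(best.tolist())  # top row labeled class 0, bottom row class 1
```

The strong vertical color contrast (0.1 vs. 0.9) makes the pairwise penalty across the row boundary nearly vanish, so the labeling can follow the unary evidence and split cleanly into two regions; with uniform color, the same Potts term would instead smooth the labels.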

How to cite this article

YANG Junli, JIANG Zhiguo, ZHOU Quan, ZHANG Haopeng, SHI Jun. Remote sensing image semantic labeling based on conditional random field[J]. Acta Aeronautica et Astronautica Sinica, 2015, 36(9): 3069-3081. DOI: 10.7527/S1000-6893.2014.0356


