Experimental Study on Feature Selection and Combination for Pigmented Skin Lesion Detection and Segmentation

Wei Xiong, Ai Jia Wang, S. H. Ong

Wei Xiong is with the Institute for Infocomm Research, A*STAR, Singapore 138632. Ai Jia Wang and S. H. Ong are with the National University of Singapore, Singapore 119077. Corresponding author: Wei Xiong, wxiong@i2r.a-star.edu.sg.

Abstract— Skin lesion detection and segmentation in dermatoscopic images is difficult because there are large variations in shape, size, color, and texture across lesions and skin types. Detection and segmentation involve feature extraction and selection. Feature ranking is applied to find the most discriminative features, and a predetermined optimal number of the top-ranking features is used in classification. Many published works have adopted this methodology, but the results are insufficient for real-world applications. We suggest that the issue could lie in the lack of understanding of the relationships among the higher-ranking features. In this paper, an experimental study is presented to examine the relationships between top-ranking features and their performance when combined in different ways. Our results support the idea that by understanding the nature of the extracted features and applying them synergistically, better performance can be achieved than by simply using the highest-ranking features in order (e.g., using the top three features) with a classifier.

I. INTRODUCTION

Over the past 10 years, the incidence and mortality rates of the most lethal type of skin cancer, malignant melanoma, have been increasing worldwide [1]. Since this skin lesion can be treated with a simple excision if identified early, early diagnosis is important [1]. However, even for an experienced dermatologist, the diagnostic accuracy for melanoma using digital dermoscopy is below 90% [2]. Because of the difficulty and subjectivity of human judgment, automated or semi-automated analysis of the images is an important research area [3].

Fig. 1 shows two typical lesion images. Automated skin lesion detection is difficult because lesions differ in color, texture, and boundary.

Fig. 1. Two skin lesion images (a, b) with different colors, lesion textures, and boundaries.

Automated skin lesion detection and segmentation involves using image descriptors to extract significant features and applying these features to an appropriate classification algorithm, e.g., the K-nearest neighbor (KNN) technique. Often, numerous features are available for each descriptor, and employing all of them increases the dimensionality of the feature space. This may lead to a problem known as over-representation, which instead decreases the performance of the classifiers [4].

Recent studies attempt to circumvent this problem by using a feature ranking technique to select features with better discriminating power and to eliminate redundant, irrelevant, or noisy features [5]. For instance, Ganster et al. extracted shape and radiometric features as well as local and global parameters to describe the malignancy of a lesion [6]. Significant features were then selected by statistical feature subset selection methods, and KNN was used for the final classification. While recent methodological papers have shown good promise, their performance was not sufficient for real-world application [7]. One key issue could lie in the lack of understanding of the relationships among the highest-ranking features that are used for classification.

This paper provides a comprehensive study to unravel the relationships between the top-ranking features experimentally by applying them in different combinations. The results show that, although all features used in the final classification are top-ranking features, how they are combined has a great impact on performance. This suggests that top-ranking features may not always be complementary. As such, by clarifying the relationships between the discriminating features, synergistic features may be applied together to achieve even better classification performance for skin lesion detection.

II. METHODOLOGY

The proposed algorithm consists of four steps: (1) feature selection, (2) feature extraction, (3) feature ranking, and (4) classification with different feature combinations.

2.1 Feature Selection

Skin lesion images contain a variety of color and texture information. Specifically, malignant melanoma exhibits a rich combination of colors as well as geometrical structures of pigmentation [8]. It has been shown that color and texture descriptors are complementary in the detection and segmentation process [8]. Therefore, the gray-level co-occurrence matrix (GLCM, a texture descriptor) and color features in the RGB and HSV channels are the subjects of interest in this work.

GLCM is one of the most popular and powerful tools for describing texture [9]. Since its proposal in 1973 by Haralick et al., it has been widely used in many texture classification applications and continues to be important in this domain [10]. Texture information plays a large role in skin lesion diagnosis for the dermatologist, and intuitively, humans derive texture information from the tonal relationships in an image. GLCM leverages this implicit relationship, which makes it a powerful descriptor. While the GLCM is a commonly used descriptor, few studies have examined the relationships between the many features derived from it for skin lesion detection.

The GLCM [13] is, in general, a tabulation of how often different combinations of pixel brightness values (grey levels) occur in an image. The texture information is specified by the matrix of relative frequencies with which two neighboring pixels, separated by displacement d and angle θ (0°, 45°, 90°, 135°), occur in the image. See Fig. 2 for an example of how the matrix is computed.

Fig. 2. Example of computing the 0° GLCM.

Color feature extraction is done in two color spaces: Hue, Saturation, and Value (HSV) and Red, Green, and Blue (RGB). HSV, being a perceptually uniform color space, has been shown to be more robust than RGB in lesion segmentation [11]. However, RGB is still the primary color space of all lesion images and has also been used in numerous published works [12]. Therefore, the RGB color space is also considered.

2.2 Feature Extraction

A set of 319 lesion images with ground truth approved by five dermatologists was used in this study. Feature extraction is carried out over 3×3 non-overlapping pixel patches. The GLCM was extracted at 0° (Eq. (1)) and 45° (Eq. (2)):

$P(i,j,d,0^{\circ}) = \#\{((k,l),(m,n)) \in (L_y \times L_x) \times (L_y \times L_x) : k-m = 0,\ |l-n| = d,\ I(k,l) = i,\ I(m,n) = j\}$  (1)

$P(i,j,d,45^{\circ}) = \#\{((k,l),(m,n)) \in (L_y \times L_x) \times (L_y \times L_x) : (k-m = d,\ l-n = -d) \text{ or } (k-m = -d,\ l-n = d),\ I(k,l) = i,\ I(m,n) = j\}$.  (2)

Here # denotes the number of elements in the set, (k,l) denotes the position of one pixel within a patch, (m,n) denotes the position of a neighboring pixel, $L_y$ and $L_x$ denote the sets of column and row indices of the patch, respectively, $I(\cdot,\cdot)$ denotes the grey level at a pixel, and d denotes the spatial distance for the GLCM computation.
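To make Eq. (1) concrete, the following is a minimal sketch (not the authors' code) of how the 0° co-occurrence counts could be accumulated for a single grey-level patch in Python; the toy patch values, the symmetric-counting convention, and the number of grey levels are illustrative assumptions.

import numpy as np

def glcm_0deg(patch, d=1, levels=256):
    """Accumulate co-occurrence counts P(i, j, d, 0 deg) as in Eq. (1):
    pairs of pixels on the same row, d columns apart."""
    P = np.zeros((levels, levels), dtype=np.int64)
    rows, cols = patch.shape
    for k in range(rows):
        for l in range(cols - d):
            i, j = patch[k, l], patch[k, l + d]
            P[i, j] += 1
            P[j, i] += 1  # symmetric counting (a common, assumed convention)
    return P

# Toy 3x3 patch with 4 grey levels (illustrative values only)
patch = np.array([[0, 0, 1],
                  [1, 2, 2],
                  [2, 3, 3]])
P = glcm_0deg(patch, d=1, levels=4)
p = P / P.sum()  # normalized relative frequencies p(i, j)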

Based on the GLCM, features such as energy, inverse difference normalized, dissimilarity, and contrast were computed, giving up to 22 GLCM features per pixel patch [13-15].
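For illustration, a few of these GLCM features can be computed from the normalized matrix p(i, j) as below; this is a sketch using the standard textbook definitions [13-15], and the exact normalization used by the authors is an assumption.

import numpy as np

def glcm_features(p):
    """A few standard GLCM features from a normalized co-occurrence matrix p."""
    levels = p.shape[0]
    i, j = np.indices(p.shape)
    return {
        "energy": np.sum(p ** 2),                  # angular second moment
        "contrast": np.sum(p * (i - j) ** 2),
        "dissimilarity": np.sum(p * np.abs(i - j)),
        # inverse difference normalized (Clausi [15]); levels = number of grey levels
        "inv_diff_normalized": np.sum(p / (1.0 + np.abs(i - j) / levels)),
    }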

For color feature extraction, the median, skewness, mean, standard deviation, and kurtosis were computed over the RGB and HSV channels separately. By comparison with the ground truth, each pixel patch and its feature values are then labeled as one of two classes: lesion or non-lesion.
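A minimal sketch of these per-patch colour statistics is given below; the RGB-to-HSV conversion routine and the 8-bit input range are assumptions of the sketch, not details given in the paper.

import numpy as np
from scipy.stats import skew, kurtosis
from matplotlib.colors import rgb_to_hsv

def color_features(patch_rgb):
    """Median, skewness, mean, std and kurtosis per RGB and HSV channel
    of one patch (patch_rgb: HxWx3, 8-bit range assumed)."""
    feats = {}
    patch_hsv = rgb_to_hsv(patch_rgb.astype(float) / 255.0)
    for space, img in (("RGB", patch_rgb.astype(float)), ("HSV", patch_hsv)):
        for c, ch in enumerate(space):
            v = img[..., c].ravel()
            feats[f"{ch}_median"] = np.median(v)
            feats[f"{ch}_skewness"] = skew(v)
            feats[f"{ch}_mean"] = v.mean()
            feats[f"{ch}_std"] = v.std()
            feats[f"{ch}_kurtosis"] = kurtosis(v)
    return feats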

2.3 Feature Ranking

The K-nearest neighbor (KNN) algorithm was used for feature ranking. First, the entire database was divided into a training set (249 randomly selected images) and a testing set (the remaining 70 images); the testing set was excluded from feature ranking and classifier training. Second, the performance of each feature in the training set was computed using the KNN leave-one-out validation method. To achieve this, the values of a single feature from all patches are placed in a 1-D space. Each patch is then classified according to the class of its nearest neighbor in terms of Euclidean distance; in this case, K is chosen to be 1. Classification is applied to all pixel patches. The algorithm returns an error index for each feature (e.g., 0.1 means 10% of the patches were classified incorrectly), which is used to measure the performance of that feature.
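A sketch of this single-feature ranking step is shown below, using scikit-learn rather than the authors' own tooling; the leave-one-out 1-NN error per feature is the quantity described above, while tie-breaking and other implementation details are assumptions.

import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def rank_features(X, y):
    """Rank features by their 1-NN leave-one-out error on the training patches.
    X: (n_patches, n_features) feature matrix, y: lesion (1) / non-lesion (0)."""
    knn = KNeighborsClassifier(n_neighbors=1)
    errors = []
    for f in range(X.shape[1]):
        # each left-out patch is classified by its nearest neighbour in 1-D
        acc = cross_val_score(knn, X[:, [f]], y, cv=LeaveOneOut()).mean()
        errors.append(1.0 - acc)  # error index, e.g. 0.1 = 10% misclassified
    order = np.argsort(errors)    # lowest error = rank 1
    return order, np.asarray(errors)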

2.4 Classification

The support vector machine (SVM) classifier was chosen for this study because of its ease of training and strong performance on data with many features. The classifiers were run using Matlab built-in functions. Only the top five features obtained from feature ranking (see Table 1) were used as classification inputs, to avoid over-representation and to reduce computational complexity. To train the SVM, the Sequential Minimal Optimization learning algorithm was chosen because it is one of the faster learning algorithms available, with reliable error-reducing capability [16]. The Gaussian radial basis function (RBF) $K(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / (2\sigma^2))$ was chosen as the kernel. The SVM was trained with different parameters, using from two to five features in different combinations. Each combination was tested with different values of σ (0.1, 0.3, 0.5, and 0.7) in order to determine the best.
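As a rough illustration of this step, the sketch below trains an RBF-kernel SVM on every combination of two to five of the top-ranked features and every σ value; scikit-learn's SVC is used here in place of the Matlab SMO trainer, and the mapping gamma = 1/(2σ²) onto sklearn's kernel parameterisation is an assumption of this sketch.

from itertools import combinations
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def evaluate_combinations(X_tr, y_tr, X_te, y_te, sigmas=(0.1, 0.3, 0.5, 0.7)):
    """Train an RBF-kernel SVM for each subset of 2-5 features and each sigma.
    Columns of X_tr / X_te are assumed to be the top-5 ranked features."""
    results = {}
    n = X_tr.shape[1]
    for k in range(2, min(n, 5) + 1):
        for combo in combinations(range(n), k):
            cols = list(combo)
            for sigma in sigmas:
                clf = SVC(kernel="rbf", gamma=1.0 / (2.0 * sigma ** 2))
                clf.fit(X_tr[:, cols], y_tr)
                acc = accuracy_score(y_te, clf.predict(X_te[:, cols]))
                results[(combo, sigma)] = acc
    return results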

III. EXPERIMENTS AND RESULTS

Table 1 shows the feature ranking results; the five features with the greatest discriminating power for distinguishing the normal and lesion classes come from the 0° GLCM and the HSV color features. They are used to train the classifiers for lesion segmentation.

TABLE I. SINGLE FEATURE RANKING USING KNN.

Feature Group          Feature Name                                  Rank
F1: 0° GLCM            Energy                                        1
F2: Colour features    Median for the Value channel of HSV           2
F3: Colour features    Skewness for the Saturation channel of HSV    3
F4: 0° GLCM            Inverse difference normalized                 4
F5: 0° GLCM            Information measure of correlation            5

In the table, F1 (energy) is given by $F_1 = \sum_i \sum_j p(i,j)^2$, where $p(i,j)$ denotes the normalized GLCM entry and d = 1. F3 is given by

$F_3 = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{\sigma}\right)^{3}$.   (3)

Here x is the Saturation matrix of the pixel patch, $\bar{x}$ and $\sigma$ are its mean and standard deviation, and n = 9 is the total number of pixels. F4 is defined by

$F_4 = \sum_i \sum_j \frac{p(i,j)}{1 + |i-j|/G}$,   (4)

where G is the number of grey levels.

The SVM results at different σ values show that the optimal value of σ is 0.1 (see Fig. 3).

Fig. 3. SVM ROC-space plot (sensitivity vs. 1 − specificity) for different σ values (0.1, 0.3, 0.5, and 0.7). The top-left-most point corresponds to σ = 0.1.

Table 2 lists the lesion detection performance using different feature combinations in terms of accuracy, specificity, and sensitivity for σ = 0.1. We make the following observations. First, the results indicate that F5 may possess the best discriminating power, as every combination that includes F5 reached an accuracy of greater than 80%, while the rest failed to do so. F5 is the information measure of correlation, which is derived from the correlation function. In essence, this feature measures the grey-tone linear dependencies in the image. As such, it is capable of discriminating homogeneous areas from non-homogeneous areas [17]. In skin melanoma, where the lesion is often less homogeneous in texture and color tone than the normal skin, F5 is able to describe the images aptly. This might be the reason why it produces better accuracy when combined with other descriptors. For example, from Table 2, F1 with F4 produced the worst accuracy of all (56.94%), yet F1 and F4 together with F5 achieved an accuracy of 86.14%, emphasizing the importance of F5.

Second, using the same number of features in different combinations produces different results. From Table 2, F1 and F4 give an accuracy of 56.94%, while F1 and F3 give an accuracy of 72.80%. This suggests that, although all of these features are top-ranking features, some combinations are complementary while others are not. In addition, selecting the top features in ascending order of their rank may not always produce the best accuracy. For instance, F1, F2, and F3 give an accuracy of 73.10%, while F3, F4, and F5 give an accuracy of 89.00%.

TABLE II. ACCURACY, SPECIFICITY AND SENSITIVITY (%) OF FEATURE COMBINATIONS (σ = 0.1).

Feature combination    Accuracy    Specificity    Sensitivity

F1,F2 66.48 68.83 64.13

F1,F3 72.80 74.08 71.52

F1,F4 56.94 30.18 83.70

F1,F5 86.03 93.83 78.23

F2,F3 73.073 74.05 72.10

F2,F4 66.47 66.98 65.95

F2,F5 87.16 92.61 81.71

F3,F4 72.90 74.22 71.57

F3,F5 88.28 92.45 84.11

F4,F5 85.99 93.97 78.01

F1,F2,F4 66.59 67.35 65.83

F1,F2,F3 73.10 74.16 72.03

F1,F2,F5 87.15 92.85 81.45

F1,F3,F4 72.95 73.46 72.44

F1,F3,F5 88.36 92.47 84.25

F1,F4,F5 86.14 93.98 78.30

F2,F3,F4 73.83 76.53 71.12

F2,F3,F5 89.83 93.58 86.07

F2,F4,F5 87.51 92.58 82.44

F3,F4,F5 89.00 93.10 84.90

F1,F2,F3,F4 74.19 76.40 71.99

F1,F2,F3,F5 88.76 92.82 84.70

F1,F2,F4,F5 87.66 92.67 82.65

F1,F3,F4,F5 89.06 93.22 84.91

F2,F3,F4,F5 90.51 94.29 86.74

F1,F2,F3,F4,F5 90.84 94.61 87.06
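For reference, the accuracy, specificity, and sensitivity values above follow the usual definitions over the patch-level confusion matrix; a minimal sketch (treating lesion as the positive class, which is an assumption about the authors' convention) is:

import numpy as np

def detection_metrics(y_true, y_pred):
    """Accuracy, specificity and sensitivity with lesion = positive class (1)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "sensitivity": 100.0 * tp / (tp + fn),
    }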

Third, an optimal number of features should not be predetermined; rather, it depends on the discriminating power of the extracted features and their complementary properties. For example, a combination of three features (F1, F4, and F5) may perform better than a combination of two (F2 and F4).


Conversely, a two-feature combination that includes F5, such as F3 and F5, may outperform a three-feature combination such as F2, F3, and F4. In this case, fixing the number of features prior to SVM training could have limited the performance.

As an example, Fig. 4 illustrates the lesion detection and segmentation results for two images using the same feature combination (F1 and F5). The automatic results are very close to the respective ground truths.

Fig. 4. Segmentation results using the feature combination of F1 and F5. The red contour is the segmentation produced by the SVM, while the green contour is the ground truth agreed upon by five experts.

IV. DISCUSSION AND CONCLUSION

Our work provides some insights into the relationships between the top-ranking features obtained from the KNN feature ranking technique. Three key observations can be made from the results: 1) F5 has the best discriminating power; 2) different combinations of the same number of top-ranking features give different results; and 3) an optimal number of features should not be predetermined but depends on the discriminating power of the extracted features and their complementary properties. Our experiments therefore demonstrate that feature ranking followed by the use of a predetermined optimal number of features in the classifier is insufficient. Only through a good understanding of the synergistic relationships among the features can a more accurate algorithm be proposed.

Acknowledgment

We would like to thank Dr. Hitoshi Iyatomi of Hosei University, Japan, who kindly provided the raw dermatoscopic images with expert annotations.

V. REFERENCES

[1] J. H. Jaseema Yasmin, M Mohamed Sadiq: An

Improved Iterative Segmentation Algorithm using

Canny Edge Detector with Iterative Median Filter for

Skin Lesion Border Detection. Comput Med Imaging

Graph 2012, 50(6):37-42.

[2] Argenziano G, et al.: Dermoscopy of pigmented skin

lesions: results of a consensus meeting via the Internet.

J Am Acad Dermatol 2003, 48(5):679-693.

[3] Celebi ME, Iyatomi H, Schaefer G, Stoecker WV:

Lesion border detection in dermoscopy images. Comput

Med Imaging Graph 2009, 33(2):148-153.

[4] Jain A, Waller W: On the optimal number of features in

the classification of multivariate Gaussian data. Pattern

Recognition 1978, 10(5-6):365-374.

[5] Huan Liu, Setiono R: Feature selection via

discretization. Knowledge and Data Engineering, IEEE

Transactions on 1997, 9:642-645.

[6] Ganster H, Pinz P, Rohrer R, Wildling E, Binder M,

Kittler H: Automated melanoma recognition. Medical

Imaging, IEEE Transactions on 2001, 20(3): 233-239.

[7] Razeghi O, Qiu G, Williams H, Thomas K: Computer

Aided Skin Lesion Diagnosis with Humans in the Loop.

Machine Learning in Medical Imaging 2012, vol. 7588,

pp. 266-274

[8] Dhawan AP, Sim A: Segmentation of images of skin

lesions using color and texture information of surface

pigmentation. Comput Med Imaging Graph 1992,

16(3):163-177.

[9] Sonka M, Hlavac V, Boyle R: Image Processing,

Analysis, and Machine Vision. 2nd edition, 1999.

[10] Prasad P, Varma V, Harish V, Kumar K: Classification

of Different Textures Using SVM and Fuzzy logic.

International Journal 2012, 2(4):463-466

[11] Nowak L, Ogorzalek M, Pawlowski M: Pigmented

Network Structure Detection Using Semi-Smart

Adaptive Filters. In Systems Biology (ISB), 2012 IEEE

6th International Conference on. 2012:310-314.

[12] Iyatomi H, Oka H, Celebi M, Ogawa K, Argenziano G,

Soyer H, Koga H, Saida T, Ohara K, Tanaka M:

Computer-based classification of dermoscopy images of

melanocytic lesions on acral volar skin. Journal of

Investigative Dermatology 2008, 128(8):2049-2054

[13] Haralick R, Shanmugam K, Dinstein I: Textural features

for image classification. Systems, Man and Cybernetics,

IEEE Transactions on 1973, SMC-3(6):610-621

[14] Soh L, Tsatsoulis C: Texture analysis of SAR sea ice

imagery using gray level co-occurrence matrices.

Geoscience and Remote Sensing, IEEE Transactions on

1999, 37(2):780-795.

[15] Clausi DA: An analysis of co-occurrence texture

statistics as a function of grey level quantization.

Canadian Journal of Remote Sensing 2002, 28(1):45-62

[16] Platt JC: Sequential Minimal Optimization: a fast

algorithm for training support vector machines. Microsoft

Research Technical Report MSR-TR-98-14, 1998.

[17] Baraldi A, Parmiggiani F: An investigation of the

textural characteristics associated with gray level

cooccurrence matrix statistical parameters. Geoscience

and Remote Sensing, IEEE Transactions on 1995,

33(2):293-304.
