Special Session 46: 

Informative SNP selection based on fuzzy clustering and improved binary particle swarm optimization algorithm

Zejun Li
Hunan Institute of Technology, Hengyang, China
People's Republic of China
Abstract:
Single-nucleotide polymorphisms (SNPs) are genomic loci at which more than one nucleotide is observed in a population. Genotyping every SNP locus in a sample provides accurate genetic information for disease association studies, but it is very costly. Selecting a representative subset of SNP loci (tag SNPs) reduces the cost of genotyping while preserving as much of the original mutation information as possible. Existing tag SNP selection methods still have limitations, such as low prediction accuracy, high time complexity, and an excessive number of selected tag SNPs. This study proposes an informative SNP selection method based on fuzzy clustering and improved binary particle swarm optimization (FCBPSO). In FCBPSO, a fuzzy clustering method based on the equivalence relation is first used to obtain a candidate tag SNP set, which reduces redundancy between loci. The candidate set is then optimized with the improved binary particle swarm optimization algorithm to obtain the final tag SNP set. FCBPSO reduces the size and dimensionality of the optimization problem and simplifies training of the prediction model, thereby reducing the running time. Experiments show that this method is superior to other methods in terms of accuracy and efficiency.
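The two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the similarity between loci is taken to be the absolute Pearson correlation of genotype columns, each cluster is represented by its first locus, the fitness function (correlation coverage minus a size penalty) is a stand-in for the prediction-accuracy objective, and the swarm update is the standard sigmoid-transfer binary PSO rather than the improved variant the abstract refers to. All function names and parameters (`select_tag_snps`, `lam`, `penalty`, and so on) are hypothetical.

```python
import numpy as np

def transitive_closure(R):
    """Max-min transitive closure of a reflexive, symmetric fuzzy relation."""
    R = R.copy()
    while True:
        # (R o R)[i, j] = max_k min(R[i, k], R[k, j])
        R2 = np.max(np.minimum(R[:, :, None], R[None, :, :]), axis=1)
        R2 = np.maximum(R, R2)
        if np.array_equal(R2, R):
            return R2
        R = R2

def lambda_cut_clusters(R_star, lam):
    """Equivalence classes of the lambda-cut of a fuzzy equivalence relation."""
    cut = R_star >= lam
    labels = -np.ones(len(cut), dtype=int)
    next_label = 0
    for i in range(len(cut)):
        if labels[i] < 0:
            labels[cut[i] & (labels < 0)] = next_label
            next_label += 1
    return labels

def make_fitness(S, candidates, penalty=0.05):
    """Stand-in fitness (assumption): how well chosen loci cover all loci."""
    def fitness(bits):
        chosen = candidates[bits.astype(bool)]
        if chosen.size == 0:
            return -1.0
        coverage = S[:, chosen].max(axis=1).mean()
        return coverage - penalty * chosen.size / candidates.size
    return fitness

def binary_pso(fitness, n_bits, n_particles=20, n_iter=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard binary PSO with a sigmoid transfer function (maximization)."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n_particles, n_bits))
    V = rng.uniform(-1.0, 1.0, size=(n_particles, n_bits))
    pbest = X.copy()
    pbest_fit = np.array([fitness(x) for x in X])
    g = pbest[np.argmax(pbest_fit)].copy()
    g_fit = pbest_fit.max()
    for _ in range(n_iter):
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (g - X)
        V = np.clip(V, -6.0, 6.0)
        # Each bit is set to 1 with probability sigmoid(velocity).
        X = (rng.random(X.shape) < 1.0 / (1.0 + np.exp(-V))).astype(int)
        fit = np.array([fitness(x) for x in X])
        better = fit > pbest_fit
        pbest[better] = X[better]
        pbest_fit[better] = fit[better]
        if pbest_fit.max() > g_fit:
            g_fit = pbest_fit.max()
            g = pbest[np.argmax(pbest_fit)].copy()
    return g, g_fit

def select_tag_snps(G, lam=0.8, seed=0):
    """G: samples x loci genotype matrix (no constant columns assumed)."""
    S = np.abs(np.corrcoef(G, rowvar=False))
    np.fill_diagonal(S, 1.0)
    labels = lambda_cut_clusters(transitive_closure(S), lam)
    # Stage 1: one representative locus per fuzzy cluster (assumption).
    candidates = np.array([np.flatnonzero(labels == c)[0]
                           for c in np.unique(labels)])
    # Stage 2: binary PSO searches subsets of the candidate set only.
    bits, _ = binary_pso(make_fitness(S, candidates), candidates.size, seed=seed)
    return labels, candidates, candidates[bits.astype(bool)]
```

The point of the sketch is the reduction in search space: the swarm optimizes a bit vector whose length equals the number of clusters rather than the number of loci, which is the mechanism the abstract credits for the shorter running time.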