Research Article Open Access

Secured Disclosure of Sensitive Data in Data Mining Techniques

Kirubhakar Gurusamy1 and Venkatesh Chakrapani2
  • 1 Surya Engineering College, India
  • 2 , India

Abstract

Recent advances in data collection, data dissemination and related technologies have inaugurated a new era of research in which existing data mining algorithms should be reconsidered from the point of view of securing sensitive data. People have become increasingly unwilling to share their data, which frequently results in individuals either refusing to share it or providing incorrect data. Such problems in data collection can in turn affect the success of data mining, which relies on sufficient amounts of accurate data to produce meaningful results. Based on an analysis of the shortcomings of earlier technologies, this study proposes a new method for securing numerical and categorical data. In this method, categorical data is converted into binary form and perturbation-based noise is introduced as a security measure according to the anticipated security level. Several types of noise addition methods were employed, and the generalized results were evaluated in terms of misclassification error and privacy level. The average misclassification error was below 50% for security levels of 75-90%, which is better than earlier methods that did not handle categorical data. The results show that the proposed method outperforms some currently existing methods, making it possible to secure sensitive data irrespective of whether it is numerical or categorical.
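
The general idea summarized in the abstract (binary encoding of categorical values followed by noise perturbation) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the encoding or the noise model, so the one-hot encoding, the zero-mean Gaussian noise, the mapping of a `security_level` in [0, 1] to the noise standard deviation, and the decoding-based error measure are all assumptions made for illustration, as are the function names.

```python
import random

def one_hot(values):
    """Encode categorical values as 0/1 vectors, one column per category.
    (One-hot is an assumed stand-in for the paper's binary conversion.)"""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return rows, categories

def perturb(rows, security_level, seed=0):
    """Add zero-mean Gaussian noise to every cell.
    Assumption: security_level in [0, 1] is used directly as the noise
    standard deviation, so a higher level means stronger perturbation."""
    if not 0.0 <= security_level <= 1.0:
        raise ValueError("security_level must be in [0, 1]")
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, security_level) for x in row] for row in rows]

def decode(row, categories):
    """Guess the category of a perturbed row from its largest component."""
    return categories[max(range(len(row)), key=row.__getitem__)]

def misclassification_error(original, noisy_rows, categories):
    """Fraction of records whose decoded category differs from the original.
    (A simple proxy; the paper's own error measure may be defined differently.)"""
    wrong = sum(decode(r, categories) != v for r, v in zip(noisy_rows, original))
    return wrong / len(original)

if __name__ == "__main__":
    values = ["red", "blue", "red", "green", "blue", "red"]
    rows, categories = one_hot(values)
    for level in (0.0, 0.5, 0.9):
        noisy = perturb(rows, level)
        print(level, misclassification_error(values, noisy, categories))
```

Under these assumptions, raising `security_level` trades utility for privacy: with no noise every record decodes correctly, and the error tends to grow as the noise gets stronger.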

Journal of Computer Science
Volume 8 No. 12, 2012, 2042-2052

DOI: https://doi.org/10.3844/jcssp.2012.2042.2052

Submitted On: 13 July 2012 Published On: 18 December 2012

How to Cite: Gurusamy, K. & Chakrapani, V. (2012). Secured Disclosure of Sensitive Data in Data Mining Techniques. Journal of Computer Science, 8(12), 2042-2052. https://doi.org/10.3844/jcssp.2012.2042.2052

Keywords

  • Security
  • Privacy
  • Data Dissemination
  • Clustering
  • Quantification