User-representative feature selection for keystroke dynamics

Alsolami, Eesa, Boyd, Colin, Clark, Andrew J., & Ahmed, Irfan (2011) User-representative feature selection for keystroke dynamics. In De Capitani di Vimercati , Sabrina & Samarati , Pierangela (Eds.) International Conference on Network and System Security, 6-8 September 2011, Università degli Studi di Milano, Milan.

View at publisher


Continuous user authentication with keystroke dynamics uses characters sequences as features. Since users can type characters in any order, it is imperative to find character sequences (n-graphs) that are representative of user typing behavior. The contemporary feature selection approaches do not guarantee selecting frequently-typed features which may cause less accurate statistical user-representation.

Furthermore, the selected features do not inherently reflect user typing behavior. We propose four statistical based feature selection techniques that mitigate limitations of existing approaches. The first technique selects the most frequently occurring features. The other three consider different user typing behaviors by selecting: n-graphs that are typed quickly; n-graphs that are typed with consistent time; and n-graphs that have large time variance among users.

We use Gunetti’s keystroke dataset and k-means clustering algorithm for our experiments. The results show that among the proposed techniques, the most-frequent feature selection technique can effectively find user representative features. We further substantiate our results by comparing the most-frequent feature selection technique with three existing approaches (popular Italian words, common n-graphs, and least frequent ngraphs). We find that it performs better than the existing approaches after selecting a certain number of most-frequent n-graphs.

Impact and interest:

Citation counts are sourced monthly from Scopus and Web of Science® citation databases.

These databases contain citations from different subsets of available publications and different time periods and thus the citation count from each is usually different. Some works are not in either database and no count is displayed. Scopus includes citations from articles published in 1996 onwards, and Web of Science® generally from 1980 onwards.

Citations counts from the Google Scholar™ indexing service can be viewed at the linked Google Scholar™ search.

Full-text downloads:

189 since deposited on 13 Oct 2011
9 in the past twelve months

Full-text downloads displays the total number of times this work’s files (e.g., a PDF) have been downloaded from QUT ePrints as well as the number of downloads in the previous 365 days. The count includes downloads for all files if a work has more than one.

ID Code: 46474
Item Type: Conference Paper
Refereed: No
Keywords: feature selection, keystroke dynamics, 2-graphs
Subjects: Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > COMPUTER SOFTWARE (080300) > Computer System Security (080303)
Divisions: Past > Schools > Computer Science
Past > QUT Faculties & Divisions > Faculty of Science and Technology
Past > Institutes > Information Security Institute
Copyright Owner: Copyright 2011 [please consult the authors]
Deposited On: 13 Oct 2011 22:22
Last Modified: 15 Oct 2011 01:52

Export: EndNote | Dublin Core | BibTeX

Repository Staff Only: item control page