N. Anquetil and T. Lethbridge, Extracting concepts from file names; a new file clustering criterion, Proceedings of the 20th International Conference on Software Engineering, pp.84-93, 1998.
DOI : 10.1109/ICSE.1998.671105

D. Binkley, M. Hearn, and D. Lawrie, Improving identifier informativeness using part of speech information, Proceedings of the 8th Working Conference on Mining Software Repositories, MSR '11, pp.203-206, 2011.
DOI : 10.1145/1985441.1985471

S. Butler, Mining Java class identifier naming conventions, 2012 34th International Conference on Software Engineering (ICSE), pp.1641-1643, 2012.
DOI : 10.1109/ICSE.2012.6227216

S. Butler, M. Wermelinger, Y. Yu, and H. Sharp, Improving the Tokenisation of Identifier Names, Proceedings of the 25th European conference on Object-oriented programming, pp.130-154, 2011.
DOI : 10.1017/CBO9780511585852

B. Caprile and P. Tonella, Restructuring program identifier names, Proceedings International Conference on Software Maintenance ICSM 2000, p.97, 2000.
DOI : 10.1109/ICSM.2000.883022

W. B. Cavnar and J. M. Trenkle, N-Gram-Based Text Categorization, Symposium On Document Analysis and Information Retrieval, pp.161-175, 1994.

M. Ceccato, M. Marin, K. Mens, L. Moonen, P. Tonella et al., A Qualitative Comparison of Three Aspect Mining Techniques, 13th International Workshop on Program Comprehension (IWPC'05), pp.13-22, 2005.
DOI : 10.1109/WPC.2005.2

F. Deissenboeck and M. Pizka, Concise and consistent naming, Software Quality Journal, vol.14, issue.3, pp.261-282, 2006.
DOI : 10.1007/s11219-006-9219-1

B. Dit, L. Guerrouj, D. Poshyvanyk, and G. Antoniol, Can Better Identifier Splitting Techniques Help Feature Location?, ICPC, pp.11-20, 2011.

P. Domingos and M. Pazzani, On the Optimality of the Simple Bayesian Classifier under Zero-One Loss, Machine Learning, vol.29, 1997.

S. T. Dumais and H. Chen, Hierarchical classification of Web content, Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '00, pp.256-263, 2000.
DOI : 10.1145/345508.345593

E. Enslen, E. Hill, L. Pollock, and K. Vijay-Shanker, Mining source code to automatically split identifiers for software analysis, 2009 6th IEEE International Working Conference on Mining Software Repositories, pp.71-80, 2009.
DOI : 10.1109/MSR.2009.5069482

J. Falleri, M. Huchard, M. Lafourcade, C. Nebut, V. Prince et al., Automatic Extraction of a WordNet-Like Identifier Network from Software, 2010 IEEE 18th International Conference on Program Comprehension, pp.4-13, 2010.
DOI : 10.1109/ICPC.2010.12
URL : https://hal.archives-ouvertes.fr/lirmm-00531807

T. Fawcett, An introduction to ROC analysis, Pattern Recognition Letters, vol.27, issue.8, pp.861-874, 2006.
DOI : 10.1016/j.patrec.2005.10.010

H. B. Feild and D. Lawrie, Identifier Splitting: A Study of Two Techniques, Proceedings of MASPLAS'06 Mid-Atlantic Student Workshop on Programming Languages and Systems, Rutgers University, 2006.

G. Guo, H. Wang, D. A. Bell, Y. Bi, and K. Greer, An kNN Model-Based Approach and Its Application in Text Categorization, CICLing'04, pp.559-570, 2004.
DOI : 10.1007/978-3-540-24630-5_69

M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann et al., The WEKA data mining software, ACM SIGKDD Explorations Newsletter, vol.11, issue.1, pp.10-18, 2009.
DOI : 10.1145/1656274.1656278

E. W. Høst and B. M. Østvold, Debugging Method Names, Proceedings of the 23rd European Conference on Object-Oriented Programming (ECOOP 2009), pp.294-317, 2009.
DOI : 10.1109/SCAM.2008.23

C. X. Ling, J. Huang, and H. Zhang, AUC: a statistically consistent and more discriminating measure than accuracy, Proceedings of the 18th international joint conference on Artificial intelligence, pp.519-524, 2003.

A. McCallum, R. Rosenfeld, T. M. Mitchell, and A. Y. Ng, Improving Text Classification by Shrinkage in a Hierarchy of Classes, Proc. of the Int. Conf. on Machine Learning, pp.359-367, 1998.

T. M. Mitchell, Machine learning, 1996.

I. C. Mogotsi, Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze: Introduction to information retrieval, Information Retrieval, vol.13, pp.252-253, 2010.

F. Provost and T. Fawcett, Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions, Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, pp.43-48, 1997.

F. Sebastiani, Machine learning in automated text categorization, ACM Computing Surveys, vol.34, issue.1, 2002.
DOI : 10.1145/505282.505283

P. Warintarawej, A. Laurent, P. Pompidor, A. Cassanas, and B. Laurent, Classifying Words: A Syllables-Based Model, 2011 22nd International Workshop on Database and Expert Systems Applications, pp.208-212, 2011.
DOI : 10.1109/DEXA.2011.21
URL : https://hal.archives-ouvertes.fr/lirmm-00671499

P. Warintarawej, A. Laurent, P. Pompidor, and B. Laurent, Classification of brand names based on n-grams, 2010 International Conference of Soft Computing and Pattern Recognition, pp.12-17, 2010.
DOI : 10.1109/SOCPAR.2010.5685842
URL : https://hal.archives-ouvertes.fr/lirmm-00582626

Y. Yang and X. Liu, A re-examination of text categorization methods, Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '99, pp.42-49, 1999.
DOI : 10.1145/312624.312647

Y. Yang and J. Pedersen, A comparative study on feature selection in text categorization, Proceedings of the Fourteenth International Conference on Machine Learning (ICML'97), pp.412-420, 1997.