Is it justified to use entropy measures in machine learning applications?
Keywords
Entropy measures, machine learning
Abstract
Many machine learning algorithms use entropy measures as a construction criterion that they then seek to optimize. Among the most widely applied measures, Shannon's entropy is certainly the best known. However, in real-world applications, the use of entropy measures turns out to be totally inadequate both in theory and in practice. Indeed, many hypotheses are implicitly assumed even though they are unfounded, and therefore unjustified. In this paper, we try to identify these underlying hypotheses and show that they are unsuitable for machine learning on real data. We then introduce, intuitively at first, a set of new properties that should be required of measures intended to lead to more efficient algorithms.
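To make the criterion concrete: for a class distribution p = (p_1, ..., p_k), Shannon's entropy is H(p) = -sum_i p_i log2(p_i), and tree-induction algorithms such as C4.5 (Quinlan, 1993) grow trees by choosing, at each node, the split that most reduces it. The sketch below is an illustration, not code from the paper; the function names and toy data are our own. It shows the implicit assumption the abstract objects to: H reaches its maximum at the uniform distribution, so on imbalanced data the criterion treats a 99:1 node as nearly pure and barely rewards a split that isolates the rare class.

```python
from collections import Counter
from math import log2

def shannon_entropy(labels):
    # H(p) = -sum_i p_i * log2(p_i), computed on the empirical class distribution.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    # Entropy reduction used as the split-selection criterion in tree induction.
    n = len(parent)
    weighted = sum(len(ch) / n * shannon_entropy(ch) for ch in children)
    return shannon_entropy(parent) - weighted

# Maximal uncertainty is implicitly placed at the uniform (50/50) distribution:
print(shannon_entropy([0] * 50 + [1] * 50))  # 1.0, the maximum
print(shannon_entropy([0] * 99 + [1] * 1))   # ~0.081: a 99:1 node already looks "almost pure"

# On imbalanced data, a split that isolates minority examples earns almost nothing:
parent = [0] * 99 + [1] * 1
left, right = [0] * 90, [0] * 9 + [1] * 1
print(information_gain(parent, [left, right]))  # ~0.034
```

The off-centered and asymmetric entropies cited in the references (Ritschard et al., 2007; Zighed et al., 2007; Lenca et al., 2008) address precisely this point by moving the point of maximal uncertainty away from the uniform distribution, for instance toward the observed class prior.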
References
Aczél, J., Daróczy, Z.: On Measures of Information and Their Characterizations. Academic Press, New York, San Francisco, London (1975)
Barandela, R., Sánchez, J.S., García, V., Rangel, E.: Strategies for learning in class imbalance problems. Pattern Recognition 36(3) (2003) 849–851
Chai, X., Deng, L., Yang, Q., Ling, C.X.: Test-cost sensitive naive Bayes classification. In: Proceedings of the Fourth IEEE International Conference on Data Mining (ICDM '04), IEEE (2004) 973–978
Chen, C., Liaw, A., Breiman, L.: Using random forest to learn imbalanced data. Technical Report 666, Department of Statistics, University of California, Berkeley (2004)
Domingos, P.: MetaCost: A general method for making classifiers cost-sensitive. In: Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining (KDD-99) (1999) 155–164
Elkan, C.: The foundations of cost-sensitive learning. In Nebel, B., ed.: IJCAI, Morgan Kaufmann (2001) 973–978
Forte, B.: Why Shannon's entropy. Conv. Inform. Teor. 15 (1973) 137–152
Hartley, R.V.: Transmission of information. Bell System Tech. J. 7 (1928) 535–563
Hencin, A.J.: The concept of entropy in the theory of probability. Math. Found. of Information Theory (1957) 1–28
Provost, F.: Learning with imbalanced data sets. Invited paper, AAAI'2000 Workshop on Imbalanced Data Sets (2000)
Rényi, A.: On measures of entropy and information. 4th Berkeley Symp. Math. Statist. Probability 1 (1960) 547–561
Ritschard, G., Zighed, D., Marcellin, S.: Données déséquilibrées, entropie décentrée et indice d'implication. In Gras, R., Orús, P., Pinaud, B., Gregori, P., eds.: Nouveaux apports théoriques à l'analyse statistique implicative et applications (actes des 4èmes rencontres ASI4, 18–21 octobre 2007), Castellón de la Plana (España), Departament de Matemàtiques, Universitat Jaume I (2007) 315–327
Shannon, C.E.: A mathematical theory of communication. Bell System Tech. J. 27 (1948) 379–423
Shannon, C.E., Weaver, W.: The Mathematical Theory of Communication. University of Illinois Press (1949)
Zighed, D.A., Marcellin, S., Ritschard, G.: Mesure d'entropie asymétrique et consistante. In Noirhomme-Fraiture, M., Venturini, G., eds.: EGC. Volume RNTI-E-9 of Revue des Nouvelles Technologies de l'Information, Cépaduès-Éditions (2007) 81–86
Zighed, D., Rakotomalala, R.: Graphe d'induction: Apprentissage et Data Mining. Hermès, Paris (2000)
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Chapman and Hall, New York (1984)
Domingos, P.: The role of Occam's razor in knowledge discovery. Data Mining and Knowledge Discovery 3(4) (1999) 409–425
Lenca, P., Lallich, S., Do, T.N., Pham, N.K.: A comparison of different off-centered entropies to deal with class imbalance for decision trees. In: Advances in Knowledge Discovery and Data Mining. Springer (2008) 634–643
Marcellin, S., Zighed, D.A., Ritschard, G.: Evaluating decision trees grown with asymmetric entropies. In: Foundations of Intelligent Systems. Springer (2008) 58–67
Provost, F.J., Fawcett, T.: Analysis and visualization of classifier performance: Comparison under imprecise class and cost distributions. Knowledge Discovery and Data Mining (1997) 43–48
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo (1993)
Sebban, M., Nock, R., Chauchat, J., Rakotomalala, R.: Impact of learning set quality and size on decision tree performances. IJCSS 1(1) (2000) 85