Obesity is a global epidemic affecting over 1.5 billion people and is a risk factor for several diseases, such as Type 2 diabetes mellitus and hypertension. Despite measures taken by public health authorities to control obesity, the number of people affected by obesity and its related disorders is increasing at an alarming rate. It is therefore imperative to understand the molecular pathophysiology of obesity before undertaking any containment measures. This book summarizes our research on developing a system capable of mining an extensive corpus of scientific articles and producing a network of genes implicated in a given disease. The system has been developed with obesity as a test case. The constructed network holds a central place in understanding the pathophysiology of obesity and will serve as an important tool for the identification of drug targets. This work has been further extended to seven other obesity-related disorders (Diabetes Mellitus II, Cholelithiasis, Hypertension, Hyperlipidemia, Polycystic Ovarian Syndrome, Osteoarthritis and Fatty Liver) and four control sets (Rheumatoid Arthritis, Contact Dermatitis, Asthma and Urticaria).
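The blurb above does not say how the gene network is built. As a minimal sketch of one common text-mining approach, the Python below links two genes whenever they are mentioned in the same abstract. The co-occurrence rule, the function name `cooccurrence_network`, the crude tokenization, and the three toy sentences are all assumptions made for illustration; they are not the authors' system or data.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(abstracts, gene_names, min_count=1):
    """Return {(gene_a, gene_b): count} for genes mentioned in the same abstract."""
    genes = {g.upper() for g in gene_names}
    edges = Counter()
    for text in abstracts:
        # Crude tokenization (assumption): split on whitespace, strip punctuation.
        tokens = {tok.strip(".,;:()[]").upper() for tok in text.split()}
        mentioned = sorted(tokens & genes)
        # Every pair of genes found in this abstract gets one more co-mention.
        for pair in combinations(mentioned, 2):
            edges[pair] += 1
    return {pair: n for pair, n in edges.items() if n >= min_count}


# Toy sentences written for this example, not taken from any corpus.
abstracts = [
    "LEP and LEPR variants are associated with obesity.",
    "FTO expression correlates with LEP levels in adipose tissue.",
    "Mutations in LEPR alter LEP signalling.",
]
network = cooccurrence_network(abstracts, ["LEP", "LEPR", "FTO"])
```

Here `network` maps each gene pair to the number of abstracts that mention both genes; raising `min_count` keeps only the more frequently co-mentioned pairs.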
About the book: This educational reference explains data mining algorithms for those who are learning data mining. It covers the most important data mining techniques, including neural networks, starting with data preprocessing and continuing through classification, association analysis, and clustering. The explanations are reinforced with examples that illustrate key concepts and give the reader the necessary background knowledge. To understand the techniques, the reader needs only a little knowledge of mathematics and statistics, and the book includes references that cover this material.
Thanks to increasingly powerful storage media, multimedia resources have become essential, and the challenge is how to find relevant information quickly. To accomplish this task, the text within images and videos can be a relevant key. In this work we focus on recognizing the content of the text, and we assume that the text box has been detected and located correctly. We focus on a particular class of machine learning models called convolutional neural networks (CNNs). These are networks of neurons whose topology resembles that of the mammalian visual cortex. CNNs were initially used for recognition of handwritten digits. They were then applied successfully to many pattern recognition problems. We propose in this work a new method for binarization of text images, a new method for segmentation of text images, a study of a convolutional neural network for character recognition in images, a discussion of the relevance of the binarization step in machine-learning-based recognition of text in images, and a new method of text recognition in images based on graph theory.
Software vulnerabilities can lead to monetary and information losses. Moreover, detecting vulnerabilities is very resource intensive. Because human and financial resources are limited, prioritizing vulnerabilities is a crucial task. Very little attention has been paid to the textual information hidden in vulnerability databases such as CVE and OSVDB. This research explores how to use the information hidden in these texts for vulnerability prediction and clustering.
The prevalence of overweight and obesity is increasing not only among adults but also among adolescents. Not much information is available regarding the prevalence of overweight and obesity among adolescents in a developing country like India. This work was done in Ahmedabad, a city in India. The book can provide valuable information to researchers regarding the prevalence of and risk factors for overweight and obesity among adolescents in a developing country.
Text mining has drawn increasing attention recently and has been applied in different domains, including web mining and sentiment analysis. Text preprocessing is an important stage in text mining. The main problems in text mining are structuring text data and the very high dimensionality of text data. Natural language processing and morphological tools can be employed to reduce the dimensionality of text data. In addition, term weighting schemes can be used to enhance the representation of text as feature vectors. Research in the field of Arabic text mining is still fairly limited. This book presents and compares the impact of text preprocessing on Arabic text classification using popular text classification algorithms. Text preprocessing includes applying different term weighting schemes and Arabic morphological analysis (stemming and light stemming). Text classification algorithms are applied to 7 Arabic corpora. Results show that light stemming with term pruning is the best feature reduction technique; Support Vector Machines and Naive Bayes variations outperform other algorithms; and weighting schemes affect the performance of distance-based classifiers.
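To make the ideas of term weighting and term pruning concrete, here is a minimal Python sketch. It assumes TF-IDF as the weighting scheme (the blurb does not name the schemes compared) and assumes documents are already tokenized and stemmed. The function names `tf_idf` and `prune_terms` and the English toy tokens are illustrative only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of non-empty token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)
        weighted.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weighted


def prune_terms(docs, min_df=2):
    """Term pruning: drop terms that occur in fewer than min_df documents."""
    df = Counter(term for doc in docs for term in set(doc))
    return [[t for t in doc if df[t] >= min_df] for doc in docs]


# Toy documents (assumed to be already stemmed tokens).
docs = [["market", "price", "oil"], ["oil", "export"], ["match", "goal", "team"]]
weights = tf_idf(docs)
pruned = prune_terms(docs)
```

Terms that appear in many documents, such as `oil` here, receive lower weights than terms unique to one document, and pruning shrinks the vocabulary before classification.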
In Communities and Networks, Katherine Giuffre takes the science of social network analysis and applies it to key issues of living in communities, especially in urban areas, exploring questions such as: How do communities shape our lives and identities? How do they foster either conformity or innovation? What holds communities together and what happens when they fragment or fall apart? How is community life changing in response to technological advances? Refreshingly accessible and built on fascinating case examples, this unique book provides not only the theoretical grounding necessary to understand how and why the burgeoning area of social network analysis can be useful in studying communities, but also clear technical explanations of the tools of network analysis and how to gather and analyze real-world network data. Network analysis allows us to see community life in a new perspective, with sometimes surprising results and insights, and this book enables readers to gain a deeper understanding of social life and the relationships that build (and break) communities. This engaging text will be an exciting new resource for upper-level undergraduate and beginning graduate students in a wide range of courses including social network analysis, community studies, urban studies, organizational studies, and quantitative methods.
The inherent resource constraints in SMEs make their quality management highly challenging. SMEs in developing economies face even tougher hurdles in managing their quality. With the development of the concept of the Extended Enterprise, SMEs can capitalize on the institutional strength of their larger supply chain partners by building upon those partners' strengths in quality management. This book demystifies supply chain networks in high-context developing economies, using the strengths of these networks to accelerate the positive diffusion of quality management practices from large buyers to small suppliers. This work also helps develop an understanding of business cultures in developing economies, along with examining the current health of quality management and supplier development activities through empirical research. On this basis, the book investigates the determinants of quality diffusion in supply chains.
Nearly 25 years ago, large-scale mining began in the Rajsamand area. Mining and smelting of the base metal deposits in this region are among the oldest in the world, dating back more than 2,500 years. The mining industry in the region today derives its profitability purely from the sustainable exploitation of scarce natural resources. Thus, monitoring mining activity in this area over the years has become important. Remote sensing satellite data provide wide-area coverage for such monitoring. The present study addresses the monitoring of mining activity and its impact on natural resources using GIS data packages. The study is significant because assessing erratic mining activities would help minimize the hazard they pose to the ecology and other natural resources, by checking those activities and employing suitable measures to combat the damage they cause.
Revision with unchanged content. While many text mining projects emphasize retrieval and extraction, text mining can be leveraged to discover new and previously unknown information. Nowhere is the potential more apparent than in pharmacogenomics-based drug discovery. Text mining can help pharmaceutical researchers reduce the vast information overload hindering pharmacogenomics-based drug discovery because it can aid in the generation of rich, novel information from large collections of diverse scientific literature and research data. However, the pharmaceutical industry appears to be reluctant to innovate with bleeding-edge text mining technologies for drug discovery. The present book re-frames text mining as an approach to automate the generation of novel and interesting information, reviews successful exemplary text mining applications, and examines a case study of a leading pharmaceutical company within the book’s proposed novelty-generation paradigm. The present book is written for a wide range of professionals and scholars, not only for information scientists, industry analysts, and pharmaceutical executives, but also for those interested in innovation studies and the automated acceleration of discovery.
Neural networks are processing devices that are implemented either as algorithms or in actual hardware. Their designs are motivated by the design and functioning of the human brain and its components. Neural networks can provide improved performance over conventional technologies in areas such as machine vision, robust pattern detection, signal filtering, virtual reality, data segmentation, data compression, data mining, text mining, artificial life, adaptive control, optimization and scheduling, complex mapping and many more. In this book, the fundamental simulation methodologies of neural networks are described and illustrated with the help of algorithms, MATLAB source code and outputs: the McCulloch-Pitts neuron model, Hebb's network, perceptron network, ADALINE neuron model, MADALINE neuron model, hetero associative memory network, auto associative memory network, bidirectional associative memory network, discrete Hopfield network, back propagation network, self organizing map network, learning vector quantization network, max net, Mexican hat network, Hamming net and counter propagation network.
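As an illustration of one of the models listed above, the following is a minimal sketch of the perceptron learning rule, written in Python rather than the MATLAB used in the book. The function names, the learning rate, the step activation with threshold 0, and the AND-gate training set are assumptions chosen for this example and are not taken from the book's code.

```python
def predict(weights, bias, x):
    """Step activation: fire (1) when the weighted sum plus bias is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """samples: list of (inputs, target) pairs with targets in {0, 1}."""
    n_inputs = len(samples[0][0])
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(weights, bias, x)
            if delta != 0:
                # Perceptron rule: move the weights toward the misclassified input.
                weights = [w + lr * delta * xi for w, xi in zip(weights, x)]
                bias += lr * delta
                errors += 1
        if errors == 0:  # every sample classified correctly, so stop early
            break
    return weights, bias


# Logical AND is linearly separable, so the rule converges on it.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
outputs = [predict(w, b, x) for x, _ in AND]
```

After training, `outputs` reproduces the AND truth table; a problem that is not linearly separable, such as XOR, would not converge under this rule.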
Evaluations of subjective answers performed on a large scale, for example in academia, show that human evaluations remain biased and depend on the specific guidelines given to human assessors. This work attempts to build an unbiased system. The overall system’s functionality relies on three main components, namely, the question-processing component, the search component and the answer-evaluation component. The parts of the e-book in the knowledge sources are annotated and classified according to the bootstrapping ontologies of the subject domain. Continuing the efforts of computational linguistics communities toward automatic retrieval of text semantics, an algorithm is designed to automatically generate a ‘model answer’ for any keyed-in question or series of related questions. An attempt is made to extract the most meaningful textual fragments from natural language sentences, highlighting the semantic sense of the explained discourse. The research focuses on extracting the noun phrases that occupy the dominant subject and object roles and connect to the verb phrases, forming triplets for a whole sentence or a part of it.
Malnutrition, in every form, presents significant threats to human health. The world today is facing a double burden of malnutrition that includes both undernutrition and overweight, especially in developing countries. Childhood obesity is one of the most serious public health challenges of the 21st century, and obesity is emerging among all ages and socio-economic groups. The belief that obesity or over-nutrition is a problem of developed countries alone no longer holds. The prevalence of obesity seems to be increasing in most parts of the world, even where it used to be rare. Obesity is not a disease in itself but rather a complex symptom. The cause of obesity is complex and multi-factorial, arising within the context of environmental, social and genetic factors. Hence, this study was conducted to determine the prevalence of and the factors contributing to overweight/obesity, as the findings would be useful in planning and developing appropriate intervention methods.
The world today can be described as a set of interactions among many entities, such as humans, animals, and smartphones. Interactions that occur regularly typically correspond to significant, yet often infrequent and hard-to-detect, interaction patterns that are worth knowing in order to understand and predict the behavior of entities. To identify these regular behaviors, the book presents the periodic subgraph mining problem in a dynamic network and an efficient algorithm to solve it. A dynamic network is a temporal sequence of graphs that represents interactions among individuals of a population over time. Social network analysis is probably the most famous example of dynamic network analysis. The book applies the problem to some real-world networks and shows that analyzing interesting and insightful periodic interaction patterns uncovers and characterizes the natural periodicities of systems.
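The following is a deliberately simplified Python sketch of the idea of periodic patterns in a dynamic network; it is not the book's algorithm. It assumes the network is given as a list of edge sets, one per timestep, and it groups edges that recur at the same start time and period. The function name `periodic_edges` and the parameter `min_support` are hypothetical, and the sketch does not filter out non-maximal patterns as a full periodic subgraph miner would.

```python
from collections import defaultdict


def periodic_edges(snapshots, min_support=3):
    """Group edges by the (start, period) at which they recur.

    snapshots: list of sets of edges, one set per timestep.
    Returns {(start, period): edges} where each edge is present at
    start, start + period, ... for at least min_support timesteps.
    Assumes min_support >= 2.
    """
    n = len(snapshots)
    patterns = defaultdict(set)
    all_edges = set().union(*snapshots)
    for edge in all_edges:
        times = [t for t, snap in enumerate(snapshots) if edge in snap]
        present = set(times)
        for start in times:
            # Largest period that still fits min_support occurrences.
            max_period = (n - 1 - start) // (min_support - 1)
            for period in range(1, max_period + 1):
                if start - period in present:
                    continue  # the run began earlier; report it from there
                if all(start + k * period in present for k in range(min_support)):
                    patterns[(start, period)].add(edge)
    return dict(patterns)


# Toy dynamic network: edges a-b and b-c recur together every 2 timesteps.
snapshots = [
    {("a", "b"), ("b", "c")},
    {("a", "c")},
    {("a", "b"), ("b", "c")},
    {("c", "d")},
    {("a", "b"), ("b", "c"), ("a", "c")},
    set(),
]
patterns = periodic_edges(snapshots)
```

Edges that share the same `(start, period)` key form a subgraph that recurs as a unit, which is the kind of regular interaction pattern the blurb describes.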
The discovery of interesting patterns from database transactions is one of the major problems in knowledge discovery in databases. One such kind of pattern is the association rules extracted from these transactions. The goal of this research was to develop and implement a parallel algorithm for mining association rules. We implemented a parallel algorithm that uses a lattice approach for mining association rules. Dynamic Distributed Rule Mining (DDRM) is a lattice-based algorithm that partitions the lattice into sublattices, which are assigned to processors for processing and identification of frequent itemsets. We implemented DDRM using a dynamic load balancing approach that assigns classes to processors, which analyze these classes to determine whether any rules are present in them. Experimental results show that DDRM utilizes the processors efficiently and performs better than the prefix-based and Partition algorithms, which use a static approach to assign classes to the processors. The DDRM algorithm scales well and shows good speedup.
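For readers unfamiliar with frequent itemsets and association rules, here is a minimal sequential sketch in Python. It uses a plain level-wise (Apriori-style) search, not DDRM: there is no lattice partitioning, parallelism, or load balancing. The function names, thresholds, and toy transactions are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support_count} for itemsets in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    level = [frozenset([i]) for i in items]
    while level:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        # Join surviving k-itemsets into (k+1)-itemset candidates.
        level = list({a | b for a in survivors for b in survivors
                      if len(a | b) == len(a) + 1})
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples from frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, k)):
                # Every subset of a frequent itemset is itself frequent.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Toy transactions invented for this example.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = frequent_itemsets(transactions, min_support=2)
rules = association_rules(frequent, min_confidence=0.9)
```

With these toy transactions the only rule that reaches 90% confidence is "butter implies bread". A parallel algorithm such as the one described above would distribute the counting work across processors instead of running it in one loop.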