Bioinformatics Special Interest Group (BI SIG)

First International BioInfo'2005 Workshop

Plovdiv, Bulgaria, 20th September 2005

Abstracts

Local and "Personalised" Modeling and Knowledge Discovery in Bioinformatics

Prof. Nikola Kasabov

Director, Knowledge Engineering and Discovery Research Institute (KEDRI), Auckland University of Technology (http://www.kedri.info/), nkasabov@aut.ac.nz

Abstract - This presentation first introduces some challenging problems in Bioinformatics (BI) and then applies methods of Computational Intelligence (CI) to illustrate possible solutions to them. The main focus of the talk is on how CI can facilitate discoveries from biological data and the extraction of knowledge.

Methods of evolving knowledge-based neural networks and hybrid neuro-evolutionary systems, characterised by adaptive learning, rule extraction and evolutionary optimization [1], are highlighted among the other traditional CI methods [2].

CI solutions to BI problems such as: DNA sequence analysis, microarray gene expression analysis and profiling, protein structure prediction, gene regulatory network discovery, medical prognostic systems, modeling gene-neuronal relationship [3] and others are presented and illustrated.

Fundamental issues in CI such as: dimensionality reduction and feature extraction, model creation and model validation, model adaptation, model optimization, knowledge extraction, inductive versus transductive reasoning, global versus local models, kernel methods and others are addressed and illustrated on BI problems.

While inductive modelling develops a set of local models that cover the whole problem space and then recalls them for a new data vector, transductive modelling creates a single, "personalised" model for every new input vector, based on the closest vectors from the existing problem space. Issues of feature selection, dimensionality reduction, neighbourhood selection, model optimization, model verification, personalised profiling, and knowledge discovery are discussed and illustrated on the case study problems using the NeuCom software environment (http://www.theneucom.com/). A comparative analysis of different CI methods applied to the same problems is presented in an attempt to identify the generic and specific applicability of the CI methods.
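The transductive idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Euclidean distance selects the neighbourhood, and a distance-weighted average of the k nearest stored vectors stands in for the "personalised" model. The function name, its parameters and the choice of model are hypothetical and are not taken from the talk or from NeuCom.

```python
import math


def personalised_predict(train_x, train_y, query, k=3):
    """Predict the output for `query` from a model built only on its k nearest neighbours."""
    # Rank the stored vectors by Euclidean distance to the new input vector.
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    neighbours = ranked[:k]
    # Assumed "personalised model": a distance-weighted average of the
    # neighbours' outputs, so that closer vectors count more.
    weights = [1.0 / (math.dist(x, query) + 1e-9) for x, _ in neighbours]
    total = sum(weights)
    return sum(w * y for w, (_, y) in zip(weights, neighbours)) / total
```

A new model is built for every query, so nothing is trained in advance; the cost is a neighbourhood search per prediction.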

Keywords: Computational Intelligence, Adaptive knowledge-based neural networks, Bioinformatics, Neuroinformatics, Local modelling, Transductive reasoning, Personalised modelling.

References

[1] N. Kasabov, Evolving Connectionist Systems: Methods and Applications in Bioinformatics, Brain Study and Intelligent Machines, Springer Verlag, 2002 (http://www.springer.de/)

[2] N. Kasabov, Foundations of Neural Networks, Fuzzy Systems and Knowledge Engineering, MIT Press, 1996 (http://www.mitpress.edu/)

[3] N. Kasabov and L. Benuskova, Computational Neurogenetics, Journal of Computational and Theoretical Nanoscience, vol. 1, no. 1, American Scientific Publishers, 2004 (http://www.aspbs.com/)

Viewing the Phenomenon of Heterosis as a Network of Interacting Parallel Aggregation Processes

Elena Tsiporkova¹ and Veselka Boeva²

¹Computational Biology Division, Dept. of Plant Systems Biology

Flanders Institute for Biotechnology, Ghent University, (http://www.psb.ugent.be/cbd/), elena.tsiporkova@psb.UGent.be

²Department of Computer Systems, Technical University of Plovdiv, veselka_boeva@hotmail.com

Abstract - This contribution develops a mathematical model that allows the phenomenon of heterosis to be interpreted and simulated as a network of interacting parallel aggregation processes. Heterosis ('hybrid vigour') refers to the improved performance of F1 hybrids with respect to their parents. It has been observed that a cross between quasi-homozygous parents can in some cases lead to an offspring (F1) that is better than the parents in terms of yield, stress resistance, speed of development, etc. Heterosis is of great commercial importance since it enables the breeder to generate a product (F1 hybrid seed) with preserved values, which in turn allows the farmer to grow uniform plants expressing these heterosis features. Besides the commercial interest, there is a more fundamental scientific interest in the biological phenomenon of heterosis, as an excellent example of what complex genetic interactions can lead to. Two main models have been considered in attempts to explain heterosis [1]: the additive-dominance model and the epistatic model. The present work focuses on the former.

We first set out to express the overall heterosis potential in terms of the heterosis potentials of the individual genes controlling the trait of interest. This has allowed us to gain a better understanding of the biological mechanisms behind the phenomenon of heterosis. According to the additive-dominance model, the net heterosis potential can be expressed as a weighted mean of the heterosis potentials of the individual genes, each weighted by its relative additive effect. Whenever the alleles are dispersed between the parents, this weighted mean is further rescaled according to their association-dispersion coefficient.
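The weighted mean described above can be written out as a short Python sketch. It assumes that each gene's relative additive effect is its additive effect divided by the sum over all genes; the function and argument names are hypothetical. The rescaling by the association-dispersion coefficient is left out because the abstract does not give its form.

```python
def net_heterosis_potential(potentials, additive_effects):
    """Weighted mean of per-gene heterosis potentials.

    Each gene is weighted by its relative additive effect, assumed here to be
    its additive effect divided by the total additive effect of all genes.
    """
    total = sum(additive_effects)
    weights = [a / total for a in additive_effects]  # relative effects, sum to 1
    return sum(w * h for w, h in zip(weights, potentials))
```

For example, two genes with potentials 1.0 and 0.5 and additive effects 3.0 and 1.0 receive weights 0.75 and 0.25.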

Next, we have developed a mathematical formalism that allows the sub-processes building up the additive-dominance heterosis to be interpreted as interacting parallel aggregations. The individual genes controlling the trait of interest are viewed as interacting agents involved in the process of achieving a trade-off between their individual contributions to the overall heterosis potential [2]. Each agent is initially assigned a vector of interaction coefficients, representing the relative degrees of influence this agent accepts from the other agents. Then the individual heterosis potentials of the different agents are combined in parallel with weighted mean aggregations, one for each agent (i.e. taking into account the degrees of influence of each agent). Consequently, a new heterosis potential is obtained for each agent. These parallel aggregations are repeated until a consensus between the agents is attained.
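The repeated parallel aggregation can be illustrated with the following Python sketch. It rests on assumptions not stated in the abstract: each agent's interaction coefficients are positive and sum to 1, and "consensus" means that all agents' values agree within a small tolerance. The names `aggregate_until_consensus`, `influence`, `tol` and `max_rounds` are hypothetical.

```python
def aggregate_until_consensus(potentials, influence, tol=1e-9, max_rounds=10_000):
    """Repeat parallel weighted-mean aggregations until all agents agree.

    potentials: initial heterosis potential of each agent (gene).
    influence:  influence[i][j] is the weight agent i gives to agent j;
                each row is assumed to be positive and to sum to 1.
    Returns the final values and the number of aggregation rounds performed.
    """
    values = list(potentials)
    rounds = 0
    while max(values) - min(values) > tol:
        if rounds == max_rounds:
            raise RuntimeError("no consensus reached within max_rounds")
        # One weighted mean per agent, all computed from the same old values.
        values = [sum(w * v for w, v in zip(row, values)) for row in influence]
        rounds += 1
    return values, rounds
```

With strictly positive rows the spread between the largest and smallest value shrinks in every round, so the loop terminates; other weight patterns may need the `max_rounds` guard.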

Keywords: Heterosis, Additive-dominance model, Bioinformatics, Recursive aggregation.

References

[1] J.A. Birchler, D.L. Auger, N.C. Riddle, In Search of the Molecular Basis of Heterosis, The Plant Cell 15 (2003) 2236-2239.

[2] E. Tsiporkova, V. Boeva, Nonparametric Recursive Aggregation Process, Kybernetika, Journal of the Czech Society for Cybernetics and Inf. Sciences, 40(1) (2004) 51-70.

About a New Method for Bioimpedance Measurement

Assoc. Prof. Dr.-Eng. Stanislav Dimov

Faculty for Engineering in German Language, TU Sofia; Department of Informatics, University of Mining & Geology, Sofia, stani@mgu.bg

Abstract - This article proposes a new electrical method for bioimpedance measurement [1]. The method is based on indirect measurement of the bioimpedance of human skin by means of an electrical generator. Electrical power is applied to the human body through the active electrode and returned to the generator through the neutral electrode. A microprocessor unit calculates the active part of the bioimpedance according to Ohm's law, and the capacitive part from empirically determined curves of skin capacitance as a function of the applied voltage. The proposed method measures bioimpedance with good accuracy and is widely applicable in electrosurgery [2]. The second topic of the article is the design of a control program for the microprocessor unit that handles the varying bioimpedance of the human body during an electrosurgical intervention: the instantaneous value of the bioimpedance is calculated and the output power of the RF generator is regulated accordingly. A further task considered in the article is the a priori determination of the bioimpedance of human skin for different physiological conditions and ages of the patient (for example children and adults) [3].
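The two-part calculation can be sketched in Python as follows. Everything numeric here is a placeholder: the calibration table `CAPACITANCE_CURVE` is hypothetical and stands in for the measured capacitance-versus-voltage curves, linear interpolation between its points is an assumption, and the function names are illustrative. The resistive part follows Ohm's law as stated in the abstract; the capacitive part is reported as the reactance 1 / (2πfC).

```python
import math

# Hypothetical calibration points: (applied voltage in V, skin capacitance in F).
# Real curves would come from measurements.
CAPACITANCE_CURVE = [(10.0, 40e-9), (50.0, 30e-9), (100.0, 20e-9)]


def interpolate_capacitance(voltage, curve=CAPACITANCE_CURVE):
    """Linearly interpolate the capacitance curve, clamping outside its range."""
    if voltage <= curve[0][0]:
        return curve[0][1]
    for (v0, c0), (v1, c1) in zip(curve, curve[1:]):
        if voltage <= v1:
            return c0 + (c1 - c0) * (voltage - v0) / (v1 - v0)
    return curve[-1][1]


def bioimpedance(voltage, current, frequency_hz):
    """Return (resistive part, capacitive reactance) in ohms."""
    resistance = voltage / current                      # Ohm's law
    capacitance = interpolate_capacitance(voltage)      # from the assumed curve
    reactance = 1.0 / (2.0 * math.pi * frequency_hz * capacitance)
    return resistance, reactance
```

How the two parts combine into a single impedance depends on whether the skin is modelled as a series or a parallel RC circuit, which the abstract does not specify, so the sketch returns them separately.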

Keywords: bioimpedance, electrical measurements, electrosurgery, computer modelling, electrical properties of tissue

References

[1] Melab, Lecture in Biomedical Engineering, Department of Biomedical Engineering, Seoul National University, Korea, 2004.

[2] Li Wing, Feasibility of using an Implantable to measure thoracic Congestion, Pacing and Clinical Electrophysiology, vol. 28, Issue 5, 2005.

[3] Multi-frequency bioimpedance measurement of children in intensive care, Medical & Biological Engineering & Computing, 2001.

Dissolution of Bi-component Fibrin Clots with Plasmin: Quantitation of the Modulating Effect of Myosin

Kiril Tenekedjiev*, Balázs Váradi** and Krasimir Kolev**

* Department of Economics and Management, Technical University - Varna, Bulgaria (corresponding author), kiril@dilogos.com

** Department of Medical Biochemistry, Semmelweis University, Budapest, Hungary

Abstract - The effect of myosin on the dissolution of fibrin by plasmin is studied. Turbidimetric data show that myosin retards the degradation of fibrin by plasmin. Semi-quantitative evaluation of the plasmin efficiency defines myosin as a potent modulator of fibrinolysis: at a molar ratio of myosin to fibrin monomers of 0.5, an 8-fold higher plasmin concentration is needed to yield the same rate of dissolution as in pure fibrin clots. By applying a dynamic non-steady-state kinetic model and a multifactorial optimization procedure to the experimental turbidimetric data obtained for various myosin-fibrin ratios and plasmin concentrations, we have determined the rate constants for the interaction of plasmin with myosin and fibrin-myosin complexes. The kinetic parameters of our new bi-component model system suggest that the complex of myosin and fibrin differs markedly from its separate constituents as a substrate of plasmin. A 50-fold decrease is detected in the apparent catalytic constant for fibrin in the complex (0.14 s^-1 versus 0.003 s^-1), whereas myosin degradation is less affected (0.91 s^-1 versus 0.04 s^-1). Thus, in the examined bi-component system, the dissolution of the fibrin matrix by plasmin can proceed only after removal of the myosin.

Keywords: thrombolysis, proteolysis, kinetics, heterogeneous phase catalysis

One Way of Protein Structure Representation for Determining Protein Structure Similarity

Prof. Stoicho D. Stoichev¹ and Dobrinka Petrova²

¹Department of Computer Systems, Technical University - Sofia, stoi@tu-sofia.bg

²Department of Computer Systems, Technical University of Plovdiv, dobi_l5@yahoo.com

Abstract - As a result of many projects, the number of experimentally determined protein structures is growing at an accelerating rate. However, it is impossible to determine all of them by experiment. The need for computational methods of protein structure determination is evident.

This paper considers a technique for specifying super-secondary and tertiary protein structure by analysing a PDB file. This way of representing protein structure can be used for determining protein structure similarity and for classification. The proposed technique uses rules, which are created to describe newly extracted structures and to compare them with already known structures.
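A rough idea of rule-based structure description from a PDB file is given by the Python sketch below. It is not the authors' technique: it only reads the fixed-column HELIX and SHEET records of the PDB format, writes each chain as a string of H (helix) and E (strand) in sequence order, and applies one deliberately crude, hypothetical rule (a strand-helix-strand pattern) as an example of a super-secondary motif. All function names and the example records are illustrative.

```python
# Illustrative records laid out in PDB fixed columns (not taken from a real entry).
EXAMPLE_RECORDS = [
    "HELIX    1  H1 ALA A   20  LEU A   30  1",
    "SHEET    1   A 2 VAL A  10  ILE A  15  0",
    "SHEET    2   A 2 THR A  35  VAL A  40  1",
]


def secondary_structure_elements(pdb_lines):
    """Return sorted (chain, first residue, last residue, kind) tuples."""
    elements = []
    for line in pdb_lines:
        if line.startswith("HELIX "):
            # HELIX: chain ID in column 20, residue numbers in columns 22-25 and 34-37.
            elements.append((line[19], int(line[21:25]), int(line[33:37]), "H"))
        elif line.startswith("SHEET "):
            # SHEET: chain ID in column 22, residue numbers in columns 23-26 and 34-37.
            elements.append((line[21], int(line[22:26]), int(line[33:37]), "E"))
    return sorted(elements)


def topology_string(elements, chain):
    """Write one chain as a string of H and E symbols in sequence order."""
    return "".join(kind for ch, _, _, kind in elements if ch == chain)


def has_strand_helix_strand(topology):
    """Hypothetical rule: a helix flanked by two strands along the sequence."""
    return "EHE" in topology
```

Two proteins could then be compared by matching their topology strings or the sets of rules they satisfy; a realistic rule would also check strand orientation and spatial contacts, which this sketch ignores.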

Keywords: protein structure representation, protein structure similarity, classification, super-secondary, tertiary structure

References

[1] Alexandrov, N.N. and Fischer, D. Analysis of topological and nontopological structural similarities in the PDB: new examples with old structures.

[2] Rastall, Protein Architecture - http://www.food.rdg.ac.uk/online/fs916/lect3/lect3.htm

[3] http://www.ncbi.nlm.nih.gov/Structure/VAST/vastsearch.html

An Algorithm for Determining Gene Activity Network

Stoicho D. Stoichev¹, Hristina Dinkova² and Nikola Kasabov³

¹Department of Computer Systems, Technical University - Sofia, stoi@tu-sofia.bg

²Department of Computer Systems, Technical University - Sofia, chrissy_p_d@yahoo.com

³Director, Knowledge Engineering and Discovery Research Institute (KEDRI), Auckland University of Technology (http://www.kedri.info/), nkasabov@aut.ac.nz

Abstract - The correlation between the activities of different genes in an organism is of great importance for science (molecular biology, bioinformatics, medicine, etc.). We represent these dependences as a directed weighted graph: each vertex represents a gene, and the directed edge between genes g2 and g1 has a weight equal to a polynomial (of some degree) giving the partial dependence of the activity of g1 on the activity of g2. We propose an algorithm for determining the coefficients of these polynomials. The input data for the algorithm are the activity values of each gene at several (m) time moments (the intervals are hours or days).

The activity of gene i is represented as a sum of such polynomials in the activities of the other genes, where n is the number of genes and the polynomials are of degree 4.

Before computing the coefficients of the polynomials, we exclude the genes with slight variation and consider only the smaller set of genes whose activities have greater derivatives.

The function g(t), given at several instants g(t₁), g(t₂), . . . , g(t_m), is approximated by linear segments. The number of unknown coefficients for each gene is 5(n-1), so that many linear equations are needed per gene, i.e. that many activity values, taken at moments whose number is distributed among the segments of the approximating function in proportion to their derivatives.
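The preprocessing and the equation set-up can be sketched in Python under several assumptions. The model is assumed to have the form g_i(t) = sum over j ≠ i of P_ij(g_j(t)), with each P_ij a degree-4 polynomial, which gives the 5(n-1) unknowns mentioned above. "Slight variation" is interpreted as a steepest segment slope below a hypothetical threshold, and samples are split between segments in proportion to the absolute slope. All function names are illustrative.

```python
def design_row(activities, i, degree=4):
    """Powers g_j**k (k = 0..degree) of every gene j != i at one time moment.

    The row has (degree + 1) * (n - 1) entries, one per unknown coefficient.
    """
    return [activities[j] ** k
            for j in range(len(activities)) if j != i
            for k in range(degree + 1)]


def segment_slopes(times, values):
    """Slopes of the linear segments joining consecutive measurements."""
    return [(values[s + 1] - values[s]) / (times[s + 1] - times[s])
            for s in range(len(times) - 1)]


def keep_variable_genes(times, series, min_slope):
    """Indices of genes whose steepest segment reaches min_slope (assumed criterion)."""
    return [g for g, values in enumerate(series)
            if max(abs(s) for s in segment_slopes(times, values)) >= min_slope]


def allocate_samples(times, values, total):
    """Split `total` sample moments between segments in proportion to |slope|.

    Assumes the gene is not completely flat (at least one non-zero slope).
    """
    slopes = [abs(s) for s in segment_slopes(times, values)]
    norm = sum(slopes)
    shares = [total * s / norm for s in slopes]
    counts = [int(share) for share in shares]
    # Hand out the samples lost to rounding down, largest fractional part first.
    order = sorted(range(len(shares)), key=lambda s: shares[s] - counts[s], reverse=True)
    for s in order[: total - sum(counts)]:
        counts[s] += 1
    return counts
```

Stacking one `design_row` per sampled moment gives the linear system for gene i; solving it (for example by least squares) is left out of the sketch.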

Hybrid Neural Network Applications

Albena Taneva and Michail Petrov*

Technical University Sofia, Branch Plovdiv, Control Systems Department

*Head of the Department

Abstract - Many problems in a variety of areas still need new solutions. Scientists very often investigate approaches and methods found in Nature and then attempt to copy them. This paper presents algorithms based on human knowledge and computer programming for designing and improving control strategies.

Takagi-Sugeno fuzzy logic, a feed-forward neural network and simplified gradient descent learning are used as the basic tools for adaptive control [1] and predictive strategy design [2, 3].

The main focus is an Adaptive Neuro Fuzzy Architecture (ANFA) with a Takagi-Sugeno engine, designed as a model and as an optimizer for different control tasks. The programmed algorithms can be viewed as examples of Computational Intelligence subject to natural laws. The common ANFA model has adaptive features and architecture and is based on human knowledge of the particular task.

For real-time implementation, a simplified gradient descent algorithm, the Recurrent Two steps Gradient Algorithm (RTGA), was developed for on-line training and updating of the model parameters.

The goals of the work were to evaluate and adapt the plant model parameters on-line, to optimize the control, and to carry out real-time experiments. The results show that the developed ANFA architecture with RTGA learning is a promising tool and can be applied to some tasks in the Bioinformatics area.
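For readers unfamiliar with Takagi-Sugeno models, the Python sketch below shows a generic first-order Takagi-Sugeno inference for one input and one plain gradient descent step on the rule consequents. It is not the ANFA architecture and not the RTGA algorithm, whose details the abstract does not give; Gaussian membership functions, the rule layout and all names are assumptions made for illustration.

```python
import math


def ts_predict(x, rules):
    """First-order Takagi-Sugeno output for a scalar input x.

    Each rule is a dict with a Gaussian membership (centre "c", width "s")
    and a linear consequent y = a * x + b.
    Returns the model output and the firing strength of every rule.
    """
    firing = [math.exp(-0.5 * ((x - r["c"]) / r["s"]) ** 2) for r in rules]
    total = sum(firing)
    output = sum(w * (r["a"] * x + r["b"]) for w, r in zip(firing, rules)) / total
    return output, firing


def gradient_step(x, target, rules, rate=0.1):
    """One gradient descent update of the consequents on the loss 0.5 * error**2."""
    output, firing = ts_predict(x, rules)
    total = sum(firing)
    error = output - target
    for w, r in zip(firing, rules):
        share = w / total                  # normalised firing strength
        r["a"] -= rate * error * share * x
        r["b"] -= rate * error * share
    return 0.5 * error ** 2
```

Calling `gradient_step` once per incoming sample gives a simple on-line update; membership centres and widths are kept fixed here to keep the sketch short.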

Keywords: Adaptive Neural Network, Sugeno Fuzzy Logic, Local modeling, Optimization

References

[1] Petrov M., I.Ganchev, and A.Taneva. Fuzzy PID Control of Nonlinear Plants. First International IEEE Symposium "INTELLIGENT SYSTEMS", Varna, Bulgaria, September, 2002, pp. 30-35

[2] Taneva A. Predictive controller based on fuzzy neural model. Journal of the Technical University of Plovdiv "Fundamental Sciences and Applications", vol.9, 2002, series B

[3] Petrov M., I. Ganchev, A. Taneva. Fuzzy Model Predictive Control of Nonlinear Processes. Preprints of the International Conference on "Automation and Informatics'2002", Sofia, Bulgaria, 5-6 November, 2002, pp. 77-80

Learning by Function Minimization Applied to Breast Cancer Diagnosis

Ludmil Dakovski¹ and Zekie Shevked²

¹Department of Computer Systems and Technologies, Technical University - Sofia

²Department of Computer Systems and Technologies, Technical University of Plovdiv

Abstract - Machine learning offers helpful techniques for today's challenges of diagnosis and prediction. Our work is concerned with learning from examples and its application to medical diagnosis. We propose representing the available training instances as logical functions and applying a new minimization strategy. The main goal is to find a more compact representation of the classification function and to use it to predict unknown cases. This approach performs well on the problem of breast cancer diagnosis.
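The general idea of learning by minimizing a logical function can be illustrated with the Python sketch below. The abstract does not describe the authors' new strategy, so a simple greedy stand-in is used instead: each positive instance is treated as a conjunction of literals, and literals are dropped as long as the shortened term covers no negative instance. Binary features are assumed, and all names are hypothetical.

```python
def covers(term, instance):
    """A term covers an instance if every kept literal matches (None = dropped)."""
    return all(t is None or t == v for t, v in zip(term, instance))


def generalise(instance, negatives):
    """Drop literals one by one while no negative instance becomes covered."""
    term = list(instance)
    for pos in range(len(term)):
        saved = term[pos]
        term[pos] = None
        if any(covers(term, neg) for neg in negatives):
            term[pos] = saved              # this literal is needed, restore it
    return tuple(term)


def minimise(positives, negatives):
    """Build a compact list of terms that covers every positive instance."""
    terms = []
    for instance in positives:
        if not any(covers(term, instance) for term in terms):
            terms.append(generalise(instance, negatives))
    return terms


def classify(terms, instance):
    """Predict the positive class if any term covers the instance."""
    return any(covers(term, instance) for term in terms)
```

The resulting terms form a disjunction of conjunctions that is usually much shorter than the list of training instances, and unseen cases are classified by checking whether any term covers them. The result depends on the order in which literals and instances are visited.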

Keywords: Learning from Examples, Function Minimization, Classification, Breast Cancer Diagnosis.