List of accepted submissions

  • Open access
  • 19 Reads
Scheduler for SANN Analysis of U.S. Supreme Court Network Based on Markov-Shannon Entropy

Many complex systems may be represented as complex networks of ith parts or nodes (ni) interconnected by some kind of bonds, ties, relationships, or links (Lij). For instance, Fowler et al. represented all case citations (Lij) in the U.S. Supreme Court as a network of nj cases citing and/or cited by others. These huge collections of nodes and links are impossible for a single person to remember and rationalize in order to assign correct links in new situations. Fortunately, Artificial Neural Networks (ANNs) can help us in this task. If we want to use ANNs to predict links in complex networks, we first need to transform all the information into numerical input parameters to feed the ANNs, and second, we need to find the best ANN to predict our network. We can solve the first problem by quantifying the structural information of the complex system (Brain, Ecological, Social, etc.) with universal information measures known as Shannon entropy (Sh). We can quantify the topological (connectivity) information of both the complex networks under study and a set of trained ANNs using Shannon measures. Then, using both sets of information parameters as inputs, we can develop a dual QSPR model to discriminate between SANNs and inefficient ANN topologies. Here we used this QSPR method to develop potential HPC schedulers for complex systems. We studied 663,072 citations to majority opinions in 43 sub-networks, each one with 5,000 (5K) citations to U.S. Supreme Court decisions (5KCNs). The overall accuracy of the ANN found was >85% for 5KCNs in both training and validation series.
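
The idea of turning network connectivity into Shannon entropy inputs can be illustrated with a minimal sketch. It assumes a row-normalised adjacency matrix as the Markov transition matrix and a uniform starting distribution; the function name and these choices are illustrative assumptions, not the authors' exact procedure.

```python
import math


def markov_shannon_entropy(adjacency, k):
    """Shannon entropy Sh_k of the node distribution after k Markov steps.

    Assumptions (not taken from the abstract): the transition matrix is the
    row-normalised adjacency matrix and the walk starts from a uniform
    distribution over nodes. Nodes without outgoing links lose their
    probability mass, so the input should ideally have none.
    """
    n = len(adjacency)
    # Row-normalise the adjacency matrix into transition probabilities.
    transition = []
    for row in adjacency:
        degree = sum(row)
        transition.append([a / degree if degree else 0.0 for a in row])
    # Uniform starting distribution over the n nodes.
    probs = [1.0 / n] * n
    # Propagate the distribution k steps along the links.
    for _ in range(k):
        probs = [
            sum(probs[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    # Shannon entropy (natural logarithm) of the resulting distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical toy networks: a triangle and a three-node star.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
star = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
entropies = [markov_shannon_entropy(star, k) for k in range(4)]
```

Computing the entropy for several values of k gives a small numerical vector per network, which is the kind of input an ANN can consume.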

  • Open access
  • 23 Reads
Chemoinformatics Profiling of Ionic Liquids Cytotoxicity—From Machine Learning to Network-Like Similarity Graphs

Ionic liquids (ILs) possess a unique physicochemical profile that provides a wide range of applications. However, their “greenness”, specifically their claimed relative non-toxicity, has been frequently questioned, hindering their REACH registration processes and thus their final application. In this work we introduce a reliable, predictive, simple and chemically interpretable classification and regression tree (CART) classifier enabling the prioritization of ILs with a favourable cytotoxicity profile. By inspecting the structure of the CART, several moieties that can be regarded as “cytotoxicophores” were identified and used to establish a set of SAR trends specifically aimed at prioritising low-cytotoxicity ILs. We also demonstrated the suitability of the joint use of the CART classifier and a group fusion similarity search as a virtual screening strategy for the automatic prioritisation of safe ILs dispersed in a data set of ILs of moderate to very high cytotoxicity. Additionally, we complemented the quantitative results already obtained by applying the network-like similarity graphs (NSG) approach to the mining of relevant structure-cytotoxicity relationship (SCR) trends. Finally, the SCR information concurrently gathered by both the quantitative (CART classifier) and the qualitative (NSG) approaches was used to design a focused combinatorial library enriched with potentially safe ILs.

  • Open access
  • 35 Reads
Complex networks of anti-HIV drugs activity vs. prevalence of AIDS in US Counties using symmetry information indices

Different aspects of the epidemiology, drugs, targets, chem-bioinformatics, and systems biology methods related to AIDS/HIV have been reviewed. Next, we developed a new model to predict complex networks of AIDS prevalence in U.S. counties, taking into consideration the Gini coefficient (income inequality) and activity/structure data of anti-HIV drugs in preclinical assays. First, we trained different Artificial Neural Networks (ANNs) using as input Markov and Symmetry information indices of social networks and of molecular graphs, respectively. We obtained the data about AIDS prevalence and the Gini coefficient from the AIDSVu database of the Rollins School of Public Health at Emory University, and the data about anti-HIV compounds from the ChEMBL database. To train/validate the model and predict the complex network, we needed to analyze 43,249 data points including values of AIDS prevalence in 2,310 U.S. counties vs. ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4,856 protocols, and 10 possible experimental measures. The best model found was a Linear Neural Network (LNN) with Accuracy, Specificity, Sensitivity, and AUROC above 0.72-0.73 in training and external validation series. The new linear equation was shown to be useful for generating complex network maps of drug activity vs. AIDS/HIV epidemiology in the U.S. at the county level.
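
For readers unfamiliar with the statistics quoted above, the following minimal sketch shows how Accuracy, Sensitivity, and Specificity are conventionally computed from binary labels. The function name and the toy labels are hypothetical, and AUROC is not covered because it needs continuous scores rather than class labels.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical toy labels, not data from the study.
metrics = classification_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

Reporting the three values separately for training and validation series, as the abstract does, helps show whether a model generalises beyond the cases it was fitted on.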

  • Open access
  • 37 Reads
Kinetic study of activated carbon synthesis from Marabou Wood

In recent years the demand for activated carbons for environmental remediation and medical applications has been growing. This situation has stimulated the study of new precursors for the synthesis of these adsorbents. This work reports the kinetic parameters of the activation process of Marabou Wood (Leptoptilus Crumeniferus) obtained with a simple mathematical model. These parameters were compared with those of other tropical biomasses studied under similar conditions. To conduct the study, a thermo-gravimetric analysis was carried out in steam. The analysis ran from room temperature to 1000°C with a heating rate of 10°C/min; additionally, the crystallinity was determined by X-ray diffraction analysis. The activated carbon was characterized through parameters that provide an indirect measure of the mechanical resistance. Interesting correlations for the analyzed thermal conversion processes were also obtained.

  • Open access
  • 24 Reads
Computational study of mycobacterial promoters with low sequence homology.

This communication presents a classification model for the prediction of mycobacterial promoter sequences (mps), which constitute a very low sequence homology problem. The model developed, $mps = -4.664\,{}^{0}\xi_{M} + 0.991\,{}^{1}\xi_{M} - 2.432$, was intended to predict whether a naturally occurring sequence is an mps or not on the basis of the calculated ${}^{k}\xi_{M}$ values for the corresponding RNA secondary structure. The model predicted 115/135 mps (85.2%) and 100% of control sequences (cs). The detailed results have been published in Bioorg Med Chem Lett. 2006 Feb;16(3):547-53; the present text is a short communication.
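
The reported equation can be evaluated directly, as in the minimal sketch below. The coefficients come from the abstract; the decision threshold of zero, the function names, and the example input values are assumptions made only for illustration.

```python
def mps_score(xi0, xi1):
    """Score of the linear model quoted in the abstract.

    xi0 and xi1 stand for the order-0 and order-1 xi_M values of a sequence.
    """
    return -4.664 * xi0 + 0.991 * xi1 - 2.432


def is_mps(xi0, xi1, threshold=0.0):
    """Classify a sequence as mps when the score exceeds the threshold.

    The threshold of 0.0 is an assumption; the abstract does not state
    the cut-off that was actually used.
    """
    return mps_score(xi0, xi1) > threshold


# Hypothetical input values, chosen only to exercise both outcomes.
positive_example = is_mps(0.0, 3.0)
negative_example = is_mps(1.0, 0.0)
```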

  • Open access
  • 21 Reads
Pairwise Ortholog Detection in Related Yeast Species by Using Big Data Supervised Classifications

Orthology detection still requires more effective scaling algorithms. Combinations of alignment, synteny, evolutionary distances and protein interactions have been used in different unsupervised algorithms to improve effectiveness, while many available databases are concerned with the scaling problem. In this paper, a set of gene pair features based on similarity measures, such as alignment scores, sequence length, gene membership in conserved regions and physicochemical profiles, is combined in a supervised Pairwise Ortholog Detection (POD) approach to improve effectiveness, considering the low ortholog ratios in relation to all possible pairwise comparisons between two genomes. In this POD scenario, big data supervised classifiers that manage the imbalance between ortholog and non-ortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs.

The supervised approach for POD was compared with the Reciprocal Best Hits (RBH), Reciprocal Smallest Distance (RSD) and a Comprehensive, Automated Project for the Identification of Orthologs from Complete Genome Data (OMA) algorithms by using (i) Saccharomyces cerevisiae - Kluyveromyces lactis, (ii) Saccharomyces cerevisiae - Candida glabrata and (iii) Saccharomyces cerevisiae - Schizosaccharomyces pombe yeast genome pairs as benchmark datasets. Four datasets derived from each genome pair comparison with different alignment settings were used. Because of the large number of instances (gene pairs) and the data imbalance, building and testing the supervised model was only possible by using big data supervised classifiers that manage imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark Support Vector Machines outperformed RBH, RSD and OMA, probably because gene pair features beyond alignment similarities were considered, combined with the advances in big data supervised classification.
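
As a point of reference for the RBH baseline named above, here is a minimal sketch of the reciprocal best hits rule. The nested-dictionary score layout, the function name, and the toy scores are assumptions for illustration; real pipelines derive the scores from alignment tools and handle ties more carefully.

```python
def reciprocal_best_hits(scores_ab, scores_ba):
    """Return gene pairs (a, b) that are each other's best-scoring hit.

    scores_ab[a][b] is the score of gene a (genome A) searched against
    gene b (genome B); scores_ba holds the reverse searches. Ties are
    resolved by dictionary order, which is a simplification.
    """
    best_ab = {a: max(hits, key=hits.get) for a, hits in scores_ab.items() if hits}
    best_ba = {b: max(hits, key=hits.get) for b, hits in scores_ba.items() if hits}
    # Keep a pair only when the best hit relation holds in both directions.
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical alignment scores for two tiny genomes.
scores_ab = {"a1": {"b1": 90, "b2": 40}, "a2": {"b1": 70, "b2": 60}}
scores_ba = {"b1": {"a1": 88, "a2": 65}, "b2": {"a1": 35, "a2": 62}}
pairs = reciprocal_best_hits(scores_ab, scores_ba)
```

Here a2 is not paired, because its best hit b1 prefers a1. This shows why RBH depends on alignment similarity alone, whereas the supervised approach adds further gene pair features.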

  • Open access
  • 18 Reads
Fatty Acids Distribution Networks in Ruminal Membrane by Computational and Experimental Studies

The present communication introduces a new classification model for fatty acid (FA) distribution networks in ruminal microbe membrane based on experimental and computational studies. In the experimental part, long chain fatty acids and volatile fatty acids in ruminal microbe membrane or liquid phase were investigated under supplementation with different ratios of Omega-6 / Omega-3 and in the processes of base- / acid-methylation. In the computational part, Perturbation Theory (PT) and Linear Free-Energy Relationships (LFER), combined with the corresponding Box-Jenkins (ΔVkj) and PT Operators (ΔΔVkj), were applied to the calculation of physicochemical parameters (Vk) of fatty acids. The best PT-LFER model found predicted the effects of perturbations over the FA distribution network with Sensitivity, Specificity, and Accuracy > 80% for 407,655 cases. Finally, the PT-LFER model based on LDA was used to reconstruct the complex networks of perturbations in the FA distribution, which were compared with random Erdős–Rényi network models. The detailed results have been published in Mol. BioSyst., 2015, Aug.; the present text is a short communication.
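
For context on the random baseline mentioned above, the sketch below generates an Erdős–Rényi graph in its G(n, p) form, where each possible link is included independently with probability p. Whether the authors used this variant or the fixed-edge-count variant is not stated, so the choice, the function name, and the parameters are assumptions.

```python
import random


def erdos_renyi_edges(n, p, seed=None):
    """Edge list of an undirected G(n, p) random graph on nodes 0..n-1."""
    rng = random.Random(seed)  # local generator so results are reproducible
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if rng.random() < p  # include each possible link with probability p
    ]


# Hypothetical parameters: 50 nodes, 10% link probability.
random_edges = erdos_renyi_edges(50, 0.1, seed=42)
```

Comparing topological measures of a reconstructed network against such random graphs of similar size and density indicates whether the observed structure departs from chance.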

  • Open access
  • 20 Reads
In Silico Design of New Drugs for Myeloid Leukemia Treatment

In this work we use in silico tools such as de novo drug design, molecular docking and absorption, distribution, metabolism and excretion (ADME) studies in order to develop new inhibitors for the tyrosine-kinase protein (including its mutated forms) involved in myeloid leukemia. This disease is the first cancer directly associated with a genetic abnormality and is associated with hematopoietic stem cells, manifesting primarily as expanded myelopoiesis. Starting from a family of fragments and seeds from known reference drugs, a set of more than 6k molecules was generated. This first set was filtered using the Tanimoto similarity coefficient as the criterion. The second set of more dissimilar molecules was then used in the docking and ADME studies. As a result, we obtained a group of molecules that inhibit the tyrosine-kinase family and have better ADME properties than the reference drugs used in the treatment of myeloid leukemia.
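
The Tanimoto filtering step can be sketched as follows, assuming each molecule is described by a set of fingerprint feature identifiers. The greedy diversity rule, the 0.5 cut-off, and all names are illustrative assumptions; the abstract does not specify the fingerprints or the threshold used.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def filter_dissimilar(fingerprints, max_similarity=0.5):
    """Greedily keep molecules that are dissimilar to all molecules kept so far.

    fingerprints maps a molecule name to its set of feature identifiers.
    The greedy rule and the default cut-off are assumptions.
    """
    kept = []
    for name, fp in fingerprints.items():
        if all(tanimoto(fp, fingerprints[k]) <= max_similarity for k in kept):
            kept.append(name)
    return kept


# Hypothetical fingerprints: m1 and m2 share three of five distinct features.
library = {"m1": {1, 2, 3, 4}, "m2": {1, 2, 3, 5}, "m3": {7, 8, 9}}
diverse = filter_dissimilar(library)
```

With these toy fingerprints, m2 is discarded because its similarity to m1 is 0.6, leaving a smaller and more diverse set for the costlier docking and ADME steps.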

  • Open access
  • 17 Reads
Interdependence of Influenza HA and NA and possibilities of new reassortments

The influenza virion is characterized by two surface proteins, hemagglutinin (HA) and neuraminidase (NA). The changes in their surface antigenic sites have given rise to several subtypes, H1 to H16 for the hemagglutinin and N1 to N9 for the neuraminidase, and each influenza strain is identified by these subtypes, such as H5N1, H7N9, etc. Of the 16 × 9 possible combinations, only certain ones are observed to proliferate in the wild, such as H1N1, H3N2, H5N1, etc. This interdependence of the HA and NA on certain subtypes has been noticed and experimentally demonstrated, but the underlying cause and its systematics have remained unknown.

We have hypothesized that the base distribution characteristics of the HA and NA constitute a coupling between them. We estimate the coupling strength by measuring the distance in graph radii between the two genes in a graphical representation scheme. We found that this distance was characteristic of each subtype combination, and that forced combinations with a different HA or NA subtype led to widely different values, which, by our hypothesis and the experimental findings of Zhang et al., implied unstable combinations.

This hypothesis implies that, given a stable subtype of pathogenic influenza, we can use the coupling constants to estimate which other subtype combinations could emerge through reassortment. Thus, in the case of the H5N2 strain, which had an epidemic form in North America in 2015, we have calculated the consequences of altering the NA component. We found that only the H5N4 and H5N9 combinations could match the coupling strength of H5N2, implying that the next epidemics could arise from these combinations rather than from other subtypes of H5. This allows for more focused monitoring of emerging flu strains for epidemic potential.

  • Open access
  • 25 Reads
QSPR-perturbation models for the prediction of B-epitopes from immune epitope database: an interesting route for predicting “in silico” new optimal peptide sequences and/or boundary conditions for vaccine development.

In the present study, three different physicochemical molecular properties for peptides were calculated using the program MARCH-INSIDE: atomic polarizability, partition coefficient, and polarity. These measures were used as input parameters of a Linear Discriminant Analysis (LDA) in order to develop three different quantitative structure-property relationship (QSPR)-perturbation models for the prediction of B-epitopes reported in the immune epitope database (IEDB) given perturbations in peptide sequence, in vivo process, experimental techniques, and source or host organisms. The accuracy, sensitivity and specificity of the models were >90% for both training and cross-validation series. The statistical parameters of the models were compared to the results achieved with the electronegativity QSPR-perturbation model previously reported. The results indicate that this type of approach may constitute an interesting route for predicting “in silico” new optimal peptide sequences and/or boundary conditions for vaccine development.