In this book chapter, we introduce a framework that leverages the intelligence of the crowd to improve the quality, credibility, inclusiveness, long-term impact and adoption of research, particularly in the academic space. This integrated platform revolves around a central knowledge graph (KG) which interacts with the community through artificial intelligence (AI) algorithms. In combination with Internet of Things (IoT) technology and blockchain, a highly productive environment including liquid governance and arbitration is created to fairly acknowledge and attractively incentivize contributions of valuable intellectual property (IP) to this knowledge base. In the proposed platform, various stakeholders customize their terms of agreement as smart contracts, which are enforced while transactions are validated on the blockchain. Through the interaction of smart contracts and stakeholders, agreement based on objective (scientific) criteria will gradually emerge from the simulated interaction and, if applicable, its experimental/empirical verification. © The Institution of Engineering and Technology 2023.
The ever-increasing urban population and the latest technological advances, including the IoT, sensors, big data, cloud computing and data analytics, have replaced the standard methods of delivering services to citizens. IoT devices collect real-time, integrated data by monitoring an individual's daily activities with the aim of providing efficient services including, but not restricted to, smart transportation, waste management, personalized healthcare and recommendations. As personal and sensitive information is collected by these devices, security and privacy become critical concerns. While safety and privacy have always been significant study areas, evolving technological challenges call for a broader perspective on protecting personal data. This chapter introduces the security and privacy issues faced by the existing infrastructure. Some case studies are discussed along with the measures undertaken for data privacy and security. The chapter concludes with open research challenges in security and privacy. © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021.
Construction of smart cities is no longer a future endeavor. Even though the implementation of smart cities brings enormous conveniences, realistic implementation is challenged in different aspects. Along with design, maintenance and implementation costs, two of the major aspects are privacy and security. The frameworks introduced for smart cities pose many challenges regarding the privacy and security of citizens. Open networks, smartphones, computers, etc. are used for communication in the smart city, making sensitive data vulnerable to attacks, and it is equally vital to deal with privacy issues. Thus, maintaining security and ensuring privacy in the smart city is necessary and remains an open challenge. The present paper proposes the Cloud Data Security Model (CDSM) for better security of data using the cloud storage mechanism. The CDSM defines four different categories of cloud accounts with special permissions to access the data. Moreover, with the data access record, the owner is completely aware of who is accessing the data. © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021.
Computational cost in metaheuristics such as Evolutionary Algorithms (EAs) is often a major concern, particularly with their ability to scale. In data-based training, traditional EAs typically use a significant portion, if not all, of the dataset for model training and fitness evaluation in each generation. This makes EAs suffer from high computational costs incurred during the fitness evaluation of the population, particularly when working with large datasets. To mitigate this issue, we propose a Machine Learning (ML)-driven Distance-based Selection (DBS) algorithm that reduces fitness evaluation time by optimizing test cases. We test our algorithm on 24 benchmark problems from the Symbolic Regression (SR) and digital circuit domains, using Grammatical Evolution (GE) to train models on the reduced datasets; applying GE to SR first produces a system flexible enough to be tested further on digital circuit problems. The quality of the solutions is compared against state-of-the-art and conventional training methods to measure the coverage of the training data selected using DBS, i.e., how well the subset matches the statistical properties of the entire dataset. Moreover, the effect of the optimized training data on run time and on the effective size of the evolved solutions is analyzed. Experimental and statistical evaluations show that our method empowers GE to yield solutions that are superior or comparable to the baseline (using the full datasets), with smaller sizes, and demonstrates computational efficiency in terms of speed. Copyright © 2024 Gupt, Kshirsagar, Dias, Sullivan and Ryan.
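A minimal sketch of the distance-based idea behind such subset selection, assuming Euclidean distance and a greedy farthest-point strategy; the function, the choice of k, and the random start are illustrative stand-ins, not the authors' implementation:

```python
import numpy as np

def distance_based_subset(X, k, seed=0):
    # Greedy farthest-point sampling: pick k cases that are mutually far
    # apart so the subset spans the input space of the full dataset.
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(X)))]            # random starting case
    d = np.linalg.norm(X - X[chosen[0]], axis=1)    # distance to the subset
    for _ in range(k - 1):
        nxt = int(np.argmax(d))                     # farthest remaining case
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(chosen)

X = np.random.rand(10_000, 5)                       # synthetic training inputs
idx = distance_based_subset(X, k=200)               # fitness then uses X[idx] only
```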
Grammar is a key input in grammar-based genetic programming. Grammar design not only influences performance but also program size. However, grammar design and the choice of productions often require expert input, as no automatic approach exists. This research work discusses our approach to automatically reducing a bloated grammar. By utilizing a simple Production Ranking mechanism, we identify productions which are less useful and dynamically prune them to channel evolutionary search towards better (smaller) solutions. Our objective in this work was program size reduction without compromising generalization performance. We tested our approach on 13 standard symbolic regression datasets with Grammatical Evolution. Using a grammar embodying a well-defined function set as a baseline, we compare effective genome length and test performance with our approach. Dynamic grammar pruning achieved significantly better genome lengths for all datasets, while significantly improving generalization performance on three datasets, although it worsened on five. When we utilized linear scaling during the production ranking stages (the first 20 generations), the results dramatically improved. Not only were the programs smaller for all datasets, but generalization scores were also significantly better than the baseline on 6 of the 13 datasets, and comparable on the rest. When the baseline was linearly scaled as well, the program size was still smaller with the Production Ranking approach, while generalization scores dropped on only three datasets without any significant compromise on the rest. © 2023, The Author(s).
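The production-ranking idea can be sketched as frequency counting over the productions used by fitter individuals, then pruning productions that rarely appear; the toy population, thresholds, and data layout below are hypothetical stand-ins for the paper's mechanism:

```python
from collections import Counter

# Toy population: (fitness, productions used by the individual); lower = better.
population = [
    (0.12, ["<e>+<e>", "x", "x"]),
    (0.30, ["sin(<e>)", "x"]),
    (0.80, ["exp(<e>)", "x"]),
    (0.95, ["exp(<e>)", "1.0"]),
]
grammar = {"<e>": ["<e>+<e>", "x", "sin(<e>)", "exp(<e>)", "1.0"]}

def rank_productions(population, top_frac=0.5):
    # Count production usage among the fitter half of the population.
    elite = sorted(population)[: max(1, int(len(population) * top_frac))]
    return Counter(p for _, prods in elite for p in prods)

def prune_grammar(grammar, usage, min_count=1):
    # Drop productions the elite rarely uses; never empty a rule entirely.
    return {nt: [p for p in prods if usage[p] >= min_count] or prods[:1]
            for nt, prods in grammar.items()}

print(prune_grammar(grammar, rank_productions(population)))
# -> {'<e>': ['<e>+<e>', 'x', 'sin(<e>)']}
```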
Neural networks have revolutionised the way we approach problem solving across multiple domains; however, their effective design and efficient use of computational resources remain challenging tasks. One of the most important factors influencing this process is the model hyperparameters, which vary significantly across models and datasets. Recently, there has been an increased focus on automatically tuning these hyperparameters to reduce complexity and to optimise resource utilisation. From traditional human-intuitive tuning methods to random search, grid search, Bayesian optimisation, and evolutionary algorithms, significant advancements have been made in this direction that promise improved performance while using fewer resources. In this article, we propose HyperGE, a two-stage model for automatically tuning hyperparameters, driven by grammatical evolution (GE), a bioinspired population-based machine learning algorithm. GE provides the advantage of allowing users to define their own grammar for generating solutions, making it ideal for defining search spaces across datasets and models. We test HyperGE by fine-tuning the VGG-19 and ResNet-50 pre-trained networks on three benchmark datasets. We demonstrate that the search space is reduced by ~90% in Stage 2, with fewer trials. HyperGE could become an invaluable tool within the deep learning community, allowing practitioners greater freedom when exploring complex problem domains for hyperparameter fine-tuning. © 2023 by the authors.
Health care interoperability paves the way for personalized health care services at a reduced cost. Furthermore, a decentralized system holds the promise of preventing compromises such as cyber-attacks due to data breaches. Hence, there is a need for a framework that seamlessly integrates and shares data across the system stakeholders. We propose the SEquestered aNd SynergIstic BLockchain Ecosystem (SENSIBLE), a blockchain-powered, knowledge-driven data-sharing framework that gives patients complete control of their medical history and can extract the rich information hidden in it using knowledge graphs (KGs). By incorporating both blockchain and KGs, we can provide a platform for secure data sharing among stakeholders, maintaining data privacy and integrity through data authentication and robust data integration. We present a Proof-of-Concept of the SENSIBLE network with Ethereum to share dynamic knowledge across stakeholders. Dynamic knowledge generation on the blockchain provides the two-fold advantage of cooperation and communication amongst the stakeholders in the health care ecosystem. This leads to operational ease by sharing relevant portions of complex information while also ensuring the isolation of sensitive medical data. © 2022 The Authors. Engineering Reports published by John Wiley & Sons Ltd.
This work investigates the potential of using Grammatical Evolution (GE) to generate an initial seed for the construction of a pseudo-random number generator (PRNG) and a cryptographically secure (CS) PRNG. We demonstrate the suitability of GE as an entropy source and show that the initial seeds exhibit an average entropy value of 7.940560934 for 8-bit entropy, which is close to the ideal value of 8. We then construct two random number generators, GE-PRNG and GE-CSPRNG, both of which employ these initial seeds. We use Monte Carlo simulations to establish the efficacy of GE-PRNG, with an experimental setup designed to estimate the value of pi, in which 100,000,000 random numbers were generated by our system. This returned an estimate for pi of 3.146564000, reported to six decimal digits. We propose a new approach, called control_flow_incrementor, to generate cryptographically secure random numbers. The random numbers generated with CSPRNG meet the prescribed National Institute of Standards and Technology SP800-22 and Diehard statistical test requirements. We also present a computational performance analysis of GE-CSPRNG, demonstrating its potential for use in industrial applications. © 2022, The Author(s).
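The Monte Carlo experiment follows the classic quarter-circle construction; a minimal sketch, with Python's built-in generator standing in for GE-PRNG:

```python
import random

def estimate_pi(n, rand=random.random):
    # Classic quarter-circle Monte Carlo: the fraction of uniform points in
    # the unit square with x^2 + y^2 <= 1 converges to pi/4.
    inside = sum(1 for _ in range(n) if rand() ** 2 + rand() ** 2 <= 1.0)
    return 4.0 * inside / n

print(estimate_pi(1_000_000))   # the paper's setup plugs GE-PRNG in as `rand`
```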
Over the past seven decades since the advent of artificial intelligence (AI), researchers have demonstrated and deployed systems incorporating AI in various domains. The absence of model explainability in critical systems, such as medical AI and credit risk assessment among others, has led to the neglect of key ethical and professional principles, which can cause considerable harm. With explainability methods, developers can check their models beyond mere performance and identify errors. This leads to increased time efficiency and reduced development costs. The article argues that steering traditional AI systems toward responsible AI engineering can address the concerns raised in the deployment of AI systems and mitigate them by incorporating explainable AI methods. Finally, the article concludes with the societal benefits of futuristic AI systems and the market share for revenue generation possible through the deployment of trustworthy and ethical AI systems.
In this article, we discuss a data sharing and knowledge integration framework that uses autonomous agents with blockchain for implementing Electronic Health Records (EHRs), enabling us to augment existing blockchain-based EHR systems. We discuss how major concerns in the health industry, i.e., trust, security and scalability, can be addressed by transitioning from existing models to a convergence of three technologies – blockchain, agent-based modeling, and knowledge graphs – in a decentralized ecosystem. Each autonomous agent is responsible for instantiating key processes, such as user authentication and authorization, smart contracts, and knowledge graph generation through data integration among the participating stakeholders in the network. We discuss a layered approach to the design of the proposed system, leading to an enhanced, safer clinical decision-making system. This can pave the way toward more informed and engaged patients and citizens by delivering personalized healthcare. Copyright © 2021 Yao, Kshirsagar, Vaidya, Ducrée and Ryan.
Most plastic goods manufacturers prefer virgin polymers based on petrochemical feedstock over recycled plastic feedstock. A major reason for this is the lack of reliable information about the quality, suitability, and availability of recycled plastics, which is partly due to the lack of proper segregation techniques. In this paper, we present our ongoing efforts to segregate plastics by type and improve the reliability of information about recycled plastics using first-of-its-kind blockchain smart contracts powered by multi-sensor data-fusion algorithms using artificial intelligence. We demonstrate how different data-fusion modes can be employed to retrieve various physico-chemical parameters of plastic waste for accurate segregation. We discuss how these smart tools help in efficiently segregating commingled plastics and can be reliably used in the circular economy of plastic. Using these tools, segregators, recyclers, and manufacturers can reliably share data, plan the supply chain, execute purchase orders and, finally, increase the use of recycled plastic feedstock. © 2020 by the authors. Licensee MDPI, Basel, Switzerland.
Machine comprehension is a broad research area from the Natural Language Processing domain, which deals with making a computerised system understand given natural language text. A question answering system is one such variant, used to find the correct 'answer' for a 'query' using the supplied 'context'. Using a sentence instead of the whole context paragraph to determine the 'answer' is quite useful in terms of computation as well as accuracy. Sentence selection can, therefore, be considered a first step in getting the answer. This work devises a method for sentence selection that uses the cosine similarity and common word count between each sentence of the context and the question. This removes the extensive training overhead associated with other available approaches, while still giving comparable results. The SQuAD dataset is used for accuracy-based performance comparison. © BEIESP.
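A minimal sketch of such a scoring scheme, assuming a simple bag-of-words representation and an illustrative weighting `w` between the two signals (the paper's exact combination may differ):

```python
import math
from collections import Counter

def cosine_sim(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def select_sentence(context_sentences, question, w=0.5):
    # Score each sentence by a blend of cosine similarity and the number of
    # words it shares with the question; return the best-scoring sentence.
    q = Counter(question.lower().split())
    def score(sent):
        s = Counter(sent.lower().split())
        common = len(set(s) & set(q))
        return w * cosine_sim(s, q) + (1 - w) * common / max(len(q), 1)
    return max(context_sentences, key=score)

ctx = ["Paris is the capital of France.",
       "France is in Europe.",
       "The Louvre is in Paris."]
print(select_sentence(ctx, "What is the capital of France?"))
```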
Assessing the quality of food is a major challenge the food industry faces today. It is of utmost importance to test food for contaminants and non-edible material that may be present. To overcome these challenges, metagenomic classification is particularly useful. Several studies have investigated various classification techniques. Difficulties in metagenomic classification include the ever-increasing number of genomes, which requires computational methods that compare DNA sequences to genomes with both high speed and high accuracy. Centrifuge is a classification tool for quantifying the species present in a sample so as to monitor its quality. Given a food sample, Centrifuge effectively classifies the species present in it, enabling a timely and accurate analysis. © BEIESP.
Prostate cancer (PCa) is the second most prevalent cancer among men worldwide, with the majority of cases affecting those over the age of 65. The Gleason Score (GS) remains the gold standard for diagnosing clinically significant prostate cancer (csPCa); however, traditional biopsy can lead to patient discomfort. Algorithmic bias in medical diagnostic models remains a critical challenge, impacting model reliability and generalizability across diverse patient populations. This study explores the potential of Machine Learning (ML) models—Logistic Regression (LR) and multiple Deep Learning (DL) models—as non-invasive alternatives for predicting the GS using the Prostate Imaging Cancer AI challenge dataset. To the best of our knowledge, this is the first attempt to use two modalities with this dataset for risk stratification. We developed an LR model, excluding biopsy-derived features like the GS, to predict clinically significant prostate cancer, alongside an image triage approach with convolutional neural networks to reduce biases in the ML workflow. Preliminary results from LR and ResNet50 showed test accuracies of 69.79% and 60%, respectively. These findings demonstrate the potential for explainable, trustworthy, and responsible risk stratification, enhancing the robustness and generalizability of the prostate cancer risk stratification model.
Machine learning has diverse applications in various domains, including disease diagnosis in healthcare, user behavior analysis, and algorithmic trading. However, machine learning's use in portfolio volatility prediction and optimization has only recently been explored and requires further investigation to prove valuable in real-world settings. We thus propose an effective method that accomplishes both these tasks and is targeted at people who are new to the realm of finance. This paper explores (a) a novel approach of using supervised machine learning with the Random Forest algorithm to predict a portfolio's volatility value and category and (b) a flexible method that takes into account users' restrictions on stock allocations to build an optimized and customized portfolio. Our framework also allows a diversified number of assets to be included in the portfolio. We train our model using historical asset prices collected over 8 years for six mutual funds and one cryptocurrency. We validate our results by comparing the volatility predictions against recent asset prices obtained from Yahoo Finance. The research underlines the importance of harnessing the power of machine learning to improve portfolio performance. © 2024 by SCITEPRESS - Science and Technology Publications, Lda.
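A hedged sketch of the supervised setup using scikit-learn's RandomForestRegressor on synthetic prices; the trailing-window features and the use of the absolute next-step return as a volatility proxy are illustrative assumptions, not the paper's exact target:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def make_features(prices, window=30):
    # Build trailing-window return statistics as features; target is the
    # absolute next-step log return, a simple stand-in for volatility.
    rets = np.diff(np.log(prices))
    X, y = [], []
    for i in range(window, len(rets)):
        past = rets[i - window:i]
        X.append([past.mean(), past.std(), past.min(), past.max()])
        y.append(abs(rets[i]))
    return np.array(X), np.array(y)

prices = np.cumprod(1 + np.random.normal(0, 0.01, 2000)) * 100  # synthetic series
X, y = make_features(prices)
model = RandomForestRegressor(n_estimators=200, random_state=42).fit(X[:-250], y[:-250])
print("held-out R^2:", model.score(X[-250:], y[-250:]))
```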
The analysis of time efficiency and solution size has recently gained huge interest among researchers of Grammatical Evolution (GE). Voluminous data have slowed GE's learning when finding innovative solutions to complex problems. Few works incorporate machine learning techniques to extract samples from big datasets; most work in the field focuses on optimizing GE hyperparameters. This motivates our work, Adaptive Case Selection (ACS), a diversity-preserving test case selection method that adaptively selects test cases during the evolutionary process of GE. We used six symbolic regression synthetic datasets with diverse features and samples in the preliminary experimentation and trained the models using GE. Statistical validation of the results demonstrates that ACS enhances the efficiency of the evolutionary process. ACS achieved higher accuracy on all six problems when compared to the conventional 'train/test split.' It outperforms the recently proposed Distance-Based Selection (DBS) method on four out of six problems while remaining competitive on the other two. ACS accelerated the evolutionary process by factors of 14X and 11X against the two methods, respectively, and resulted in simpler solutions. These findings suggest ACS can potentially speed up the evolutionary process of GE when solving complex problems. © 2023 by SCITEPRESS – Science and Technology Publications, Lda.
Breast cancer is the most prevalent cancer among females worldwide. Early detection is key to a good prognosis, and mammography is the most widely used technique, particularly in screening programs. However, reading mammograms is a highly skilled and often time-consuming task. Deep learning methods can facilitate the detection process and assist clinicians in disease diagnosis. Much research has shown Deep Neural Networks' successful use in medical imaging for early and accurate diagnosis. This paper proposes a patch-based Convolutional Neural Network (CNN) classification approach to classify patches (small sections) obtained from mammogram images into either benign or malignant cases. A novel patch extraction method, which we call Overlapping Patch Extraction, is developed and compared with two other techniques: Non-Overlapping Patch Extraction and Region-Based Extraction. Experimentation is conducted using images from the Curated Breast Imaging Subset of the Digital Database for Screening Mammography. Five deep learning models are trained on the patches extracted using the discussed methods: three configurations of EfficientNet-V2 (B0, B2, and L), ResNet-101, and MobileNet-V3L. Preliminary results indicate that the proposed patch extraction approach, Overlapping, produces a more robust patch dataset. Promising results are obtained using the Overlapping patch extraction technique with the EfficientNet-V2L model, achieving an AUC of 0.90. © 2023 by SCITEPRESS – Science and Technology Publications, Lda.
The ever-present challenge in the domain of digital devices is how to test their behavior efficiently. We tackle the issue in two ways. We switch to automated circuit design using Grammatical Evolution (GE), and we provide two diversity-based methodologies to improve testing efficiency. The first approach extracts a minimal number of test cases from subsets formed through clustering; moreover, the way we perform clustering is problem-agnostic and can easily be used in other domains. The second uses the complete test set and introduces a novel fitness function, hitPlex, which incorporates a test case diversity measure to speed up the evolutionary process. Experimental and statistical evaluations on six benchmark circuits establish that the automatically selected test cases result in good coverage and enable the system to evolve highly accurate digital circuits. Evolutionary runs using hitPlex indicate promising improvements, with up to 16% improvement in convergence speed and up to 30% in success rate for complex circuits when compared to the system without the diversity extension. © 2022 Owner/Author.
This article assesses the impact of decarbonising the transport sector using an evidence-based approach incorporating data analysis and advanced machine learning (ML) modelling. We investigate the radical behavioural and societal changes needed for the decarbonisation of the transport sector in Ireland. We perform a study through our system DECArbonisation in Road Transport (DECART), a suite of statistical and time series ML models for facilitating policy making, monitoring, and advising governments, companies and organisations in the transport sector. Based on data analysis and scenario-modelling approaches, we present alternatives to policy and decision makers for achieving carbon emission mitigation goals in road transport. The models depict how changes in mobility patterns in road transport affect CO2 emissions. Through insights obtained from the models, we infer that renewable energy in Ireland has the potential to meet the growing electricity needs of electric vehicles. Experimentation is conducted on real-world datasets, such as traffic, motor registrations, and data from renewable sources such as wind farms, to build efficient ML models. The models are validated in terms of accuracy, based on their potential to capture hidden insights from real-world events and domain knowledge. Copyright © 2022 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.
Rapid growth in vehicular congestion increases the challenges of traffic management with respect to pollution and infrastructure. Efficient traffic governance can have a significant impact on a country's economy. To alleviate these challenges, we propose an intelligent integrated traffic management system that manages congestion through cost pricing models to achieve smooth traffic flow. We propose a novel rerouting algorithm and ensemble architecture for vehicle detection and classification, tested on live traffic captured in several Indian cities. The ensemble architectures are designed from combinations of existing pre-trained models, and the choice of ensembles is based on accuracy, model interpretability, and energy efficiency. We show that the second-best ensemble produced operates with significantly less energy and better explainability than our best performer and is still within 3% accuracy of the best performer. Based on predefined road priorities, these ensemble models provide traffic and individual vehicle counts, which are then fed to our proposed rerouting algorithm as input. The rerouting algorithm then recommends alternative routes and estimated journey times to the user. The paper also presents the results obtained by testing the models on real-time traffic videos from Aurangabad (India) on a GPU/CPU cluster consisting of machines incorporating different GPU hardware. © 2022 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved.
Deep learning (DL) networks have the dual benefits of over-parameterization and regularization, rendering them more accurate than conventional Machine Learning (ML) models. However, they consume massive amounts of resources in training and are thus computationally expensive. A single experimental run can consume so many computational resources that it may cost millions of dollars, dramatically inflating project costs. Some of the factors behind the vast expense of DL models are the computational costs incurred during training and the massive storage requirements, along with specialized hardware such as Graphics Processing Units (GPUs). This research seeks to address some of the challenges mentioned above. Our approach, HyperEstimator, estimates the optimal values of hyperparameters for a given Convolutional Neural Network (CNN) model and dataset using a suite of Machine Learning algorithms. Our approach consists of three stages: (i) obtaining candidate values for hyperparameters with Grammatical Evolution; (ii) prediction of optimal values of hyperparameters with supervised ML techniques; (iii) training the CNN model for object detection. As a case study, the CNN models are validated using a real-time video dataset representing road traffic captured in some Indian cities. The results are also compared against the CIFAR10 and CIFAR100 benchmark datasets. Copyright © 2022 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.
With the growing popularity of machine learning (ML), regression problems in many domains are becoming increasingly high-dimensional. Identifying relevant features in a high-dimensional dataset remains a significant challenge in building highly accurate machine learning models. Evolutionary feature selection has been used for high-dimensional symbolic regression using Genetic Programming (GP). While grammar-based GP, especially Grammatical Evolution (GE), has been extensively used for symbolic regression, no systematic grammar-based feature selection approach exists. This work presents a grammar-based feature selection method, Production Ranking based Feature Selection (PRFS), and reports the results of its application to symbolic regression. The main contribution of our work is to demonstrate that the proposed method not only consistently selects the most relevant features, but also significantly improves the generalization performance of GE when compared with several state-of-the-art ML-based feature selection methods. Experimental results on benchmark symbolic regression problems show that the generalization performance of GE using PRFS was significantly better than that of a state-of-the-art Random Forest based feature selection in three out of four problems, while in the fourth the performance was the same. © 2022 Owner/Author.
Symbolic Regression is sometimes treated as a multi-objective optimization problem in which two objectives (Accuracy and Complexity) are optimized simultaneously. In this paper, we propose a novel approach, Hierarchical Multi-objective Symbolic Regression (HMS), in which we investigate the effect of imposing a hierarchy on the objectives. HMS works on two levels. In the first level, an initial random population is evolved using a single objective (accuracy); then, when a simple trigger occurs (the current best fitness is five times better than the best fitness of the initial random population), half of the population is promoted to the next level, where a second objective (complexity) is incorporated. This new, smaller population subsequently evolves using a multi-objective fitness function. Various complexity measures are tested, each explicitly defined as an objective alongside performance (accuracy). The validation of HMS is performed on four benchmark Symbolic Regression problems of varying difficulty. The evolved Symbolic Regression models are competitive with or better than models produced with standard approaches in terms of performance, where performance is accuracy measured as Root Mean Square Error. The solutions are better in terms of size, effectively scaling down the computational cost. © 2022 Owner/Author.
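The two-level control flow can be sketched as follows, assuming a minimising error measure; the dictionary-based individuals and the weighted-sum stand-in for the level-2 multi-objective fitness are illustrative, not the paper's exact formulation:

```python
def hms_trigger(best_now, best_initial):
    # Promote when the current best fitness is five times better than the
    # initial random population's best (minimising: lower error is better).
    return best_now <= best_initial / 5.0

def promote(population):
    # Keep the better half of the population for the second level.
    return sorted(population, key=lambda ind: ind["error"])[: len(population) // 2]

def level2_fitness(ind, w=0.1):
    # Weighted stand-in for the level-2 objectives: accuracy (RMSE) plus
    # one of the tested complexity measures (here, solution size).
    return ind["error"] + w * ind["size"]

pop = [{"error": e, "size": s} for e, s in [(2.0, 9), (0.3, 40), (1.1, 5), (0.7, 12)]]
if hms_trigger(best_now=0.3, best_initial=2.0):
    level2 = promote(pop)
    print(sorted(level2, key=level2_fitness))  # size now changes the ranking
```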
This research work proposes a framework for checking the correctness of Galois field arithmetic operations in digital circuits. The authors propose to automatically generate test cases from the user input, comprising the Galois field width and the respective choice of irreducible polynomial, avoiding reliance upon predesigned test cases. We do this through the use of polynomial arithmetic to verify the circuits. To the best of the authors' knowledge, though extensive work has been carried out in optimising the performance of Galois field arithmetic operations, no testbench exists to evaluate the efficacy of hardware circuits incorporating this concept. By automating the process of generating test cases, the work can be scaled to test circuits of arbitrarily large field widths, providing a flexible architecture that guarantees the correctness of the underlying design under test. We present simulation results for Galois fields of width GF(2^2), GF(2^4) and GF(2^8). This work can be applied to test for and prevent intentional tampering of data bit streams, safeguarding them against malicious activities, especially in applications such as cryptography that rely heavily on Galois field arithmetic. © 2021 IEEE.
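Such a testbench needs a software reference for GF(2^n) arithmetic; below is a standard shift-and-reduce multiplier with a worked check value over GF(2^4) and x^4 + x + 1, a sketch of how expected outputs could be computed for a user-supplied field width and irreducible polynomial:

```python
def gf_mult(a, b, poly, n):
    # Shift-and-reduce multiplication in GF(2^n) modulo the irreducible
    # polynomial `poly` (given as a bit mask that includes the x^n term).
    result = 0
    for _ in range(n):
        if b & 1:
            result ^= a            # addition in GF(2^n) is XOR
        b >>= 1
        a <<= 1
        if a & (1 << n):           # degree reached n: reduce modulo poly
            a ^= poly
    return result

# Worked check over GF(2^4) with x^4 + x + 1 (mask 0x13):
# (x^2 + x + 1)(x^3 + x + 1) reduces to x^2.
assert gf_mult(0x7, 0xB, 0x13, 4) == 0x4
```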
The advent of the Covid-19 pandemic has resulted in a global crisis, making health systems vulnerable and challenging the research community to find novel approaches to facilitate early detection of infections. This opens up a window of opportunity to exploit machine learning and artificial intelligence techniques to address some of the issues related to this disease. In this work, we address the classification of ten SARS-CoV-2 protein sequences related to Covid-19 using k-mer frequencies as features, considering two objectives: classification performance and feature selection. The first set of experiments considered the objectives one at a time; four techniques were used for feature selection, and twelve well-known machine learning methods, three of them neural-network-based, were used for classification. The second set of experiments considered a multi-objective approach, in which we tested the well-known Non-dominated Sorting Genetic Algorithm II (NSGA-II) and the Multi-dimensional Archive of Phenotypic Elites (MAP-Elites), which uses quality+diversity containers to guide the search through elite solutions. The experimental results show that ResNet with PCA is the best combination using single objectives, whereas, for the multi-objective approach, NSGA-II outperforms MAP-Elites with two out of three classifiers, while MAP-Elites obtains competitive results and brings a more diverse set of solutions. © 2021 by SCITEPRESS - Science and Technology Publications, Lda.
AutoGE (Automatic Grammatical Evolution) is a tool designed to aid users of GE with the automatic estimation of Grammatical Evolution (GE) parameters, a key one being the grammar. The tool comprises a rich suite of algorithms to assist in fine-tuning a BNF (Backus-Naur Form) grammar to make it adaptable across a wide range of problems. It primarily facilitates the identification of better grammar structures and the choice of function sets to enhance existing fitness scores at a lower computational overhead. This research work discusses and reports experimental results for our Production Rule Pruning algorithm from AutoGE, which employs a simple frequency-based approach for eliminating less useful productions. It captures the relationship between production rules and the function sets involved in the problem domain to identify better grammar. The experimental study incorporates an extended function set and common grammar structures for grammar definition. Preliminary results based on ten popular real-world regression datasets demonstrate that the proposed algorithm not only identifies suitable grammar structures, but also prunes the grammar, resulting in a shorter genome length for every problem, thus optimizing memory usage. Despite utilizing a fraction of the budget in pruning, AutoGE was able to significantly enhance test scores for three problems. Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.
AutoGE (Automatic Grammatical Evolution), a new tool for the estimation of Grammatical Evolution (GE) parameters, is designed to aid users of GE. The tool comprises a rich suite of algorithms to assist in fine-tuning a BNF grammar to make it adaptable across a wide range of problems. It primarily facilitates the identification of optimal grammar structures and the choice of function sets to achieve improved or equivalent fitness at a lower computational overhead compared to existing GE setups. This research work discusses and reports initial results with one of the key algorithms in AutoGE, Production Rule Pruning, which employs a simple frequency-based approach for identifying less useful productions. It captures the relationship between production rules and the function sets involved in the problem domain to identify optimal grammar structures. Preliminary studies on a set of fourteen standard Genetic Programming benchmark problems in the symbolic regression domain show that the algorithm removes less useful terminals and production rules, resulting in individuals with shorter genome lengths. The results show that the proposed algorithm identifies arity-based grammar as the optimal grammar structure for the symbolic regression problem domain. They also establish that the proposed algorithm yields enhanced fitness for some of the benchmark problems. © 2021 by SCITEPRESS - Science and Technology Publications, Lda.
The desire of human intelligence to surpass its potential has triggered the emergence of artificial intelligence and machine learning. Over the last seven decades, these terms have gained much prominence in the digital arena due to the wide adoption of their techniques for designing rich industry-enabled solutions. In this comprehensive survey of artificial intelligence, the authors provide insights from the evolution of machine learning and artificial intelligence to the present state of the art, and discuss how the technology can be exploited in the future to yield solutions to some of the most challenging global problems. The discussion centers around the successful deployment of diverse use cases in the present state of affairs. The rising interest among researchers and practitioners has led to the unfolding of AI into the many popular subfields we know today. Through the course of this article, the authors provide brief highlights of techniques for supervised as well as unsupervised learning. AI has paved the way for cutting-edge research in complex competitive domains ranging from autonomous driving, climate change, and cyber-physical security systems to healthcare diagnostics. The study concludes by depicting the growing share of market revenues from artificial intelligence-powered products and the forecasted billions of dollars' worth of market share ahead in the coming decade. © 2021, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Pyramid is a hierarchical approach to Evolutionary Computation that decomposes problems by first tackling simpler versions of them before scaling up to increasingly more difficult versions with smaller populations. Previous work showed that Pyramid was mostly as good as or better than a standard GA approach, but that it did so with a fraction of the individuals processed. Pyramid requires two key parameters to manage the problem complexity: (i) a threshold α as the performance bar, and (ii) β as the container with the maximum number of individuals to survive to the next level down. Pyramid-Z addressed the shortcomings of Pyramid by automating the choice of α (to ensure that the top individuals are highly significantly better than the original population at the current level) and making β less aggressive (to maintain a moderately sized population at the final level). In cases where evolution starts to stagnate at the final level, the population enters a different form of evolution, driven by a form of hyper-mutation that runs until either a satisfactory fitness has been found or the total evaluation budget has been exhausted. The experimental results show that Pyramid-Z consistently outperforms both the previous version and the baseline. Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.
The growing interest in the search for and use of alternative resources for renewable energy can lead the future towards a substantially decreased carbon footprint and reduced effects of global warming. The proposed research explores the possibility of harnessing piezoelectric energy from the environment of moving vehicles on the road. Although the technology is still immature, it has the advantage of a zero carbon footprint, making it ideal for investigating the potential for green energy generation. The main objective is to develop regression models that can estimate the energy generated from vehicular traffic. Energy is generated when force is applied to piezoelectric transducers, and it depends on significant factors such as the number of piezoelectric transducers and their arrangement, the load applied, and the frequency. We design Support Vector Machine (SVM) and Generalised Linear Model (GLM) regressors for predicting energy. The best features for training the models were selected by incorporating feature selection techniques such as Pearson's correlation coefficient and Mutual Information Statistics. The experimental setup makes use of simulated data which takes into account vehicle counts of different vehicles with and without load. The accuracies achieved with SVM and GLM are 99.6% and 99.7%, respectively. The energy savings achieved by making use of the generated piezoelectric energy are discussed with a sample scenario of Motorway50 in Dublin, the Irish capital city. Through this work, we propose to investigate more deeply the feasibility and cost-effectiveness of utilizing energy that is otherwise wasted by human and vehicular locomotion. Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.
Elliptic curves are a major area of research due to their application in elliptic curve cryptography (ECC). Due to their small key sizes, they offer the twofold advantage of reduced storage and transmission requirements, which also results in faster execution times. The authors propose an architecture to automatically generate test cases for the verification of elliptic curve operational circuits, based on a user-defined prime field and the parameters used in the circuit under test. The ECC test case generation builds on the Galois field arithmetic operations which were the subject of previous work by the authors. One of the strengths of elliptic curve mathematics is its simplicity, involving just three points (P, Q, and R) that lie on a line intersecting the curve. The generated test cases use points from the user-defined prime field, sequentially selecting the input vector points (P and/or Q) to easily calculate the resultant output vector (R). The testbench proposed here targets field-programmable gate array (FPGA) platforms, and experimental results for ECC test case generation on different prime fields are presented, while ModelSim is used to validate the correctness of the ECC operations. © 2021 IEEE.
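A software reference for the three-point relation is standard affine point addition over a prime field; a minimal sketch (the toy curve and test vector below are illustrative, not the paper's parameters):

```python
def ec_add(P, Q, a, p):
    # Add points P and Q on y^2 = x^3 + ax + b over GF(p).
    # None represents the point at infinity; b is implicit in the points.
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                        # P + (-P) = infinity
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p   # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p          # chord slope
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

# Toy check on y^2 = x^3 + 2x + 2 over GF(17) with the generator P = (5, 1):
print(ec_add((5, 1), (5, 1), a=2, p=17))                   # 2P = (6, 3)
```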
The quality assurance of circuits is of major importance as the complexity of circuits rises with their capabilities. A high degree of testing is thus required to guarantee proper operation; if, on the other hand, too much time is spent in testing, development time is prolonged. The work presented in this paper proposes a methodology for selecting a minimal set of test cases for validating digital circuits with respect to their functional specification. We do this by employing hierarchical clustering algorithms to group test cases using a Hamming distance similarity measure. The test cases are then selected from the clusters by our proposed approach of distance-based selection. Results are reported for two circuits, viz. a Multiplier and a Galois Field multiplier, which exhibit similar behaviour but differ in the number of test cases and their implementation. It is shown that, for small fraction values, distance-based selection can outperform traditional random selection by preserving diversity among the chosen test cases. Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
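A hedged sketch of the pipeline using SciPy's hierarchical clustering over Hamming distances; picking each cluster's medoid is one plausible reading of distance-based selection, not necessarily the authors' exact rule:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def select_test_cases(tests, n_clusters):
    # Cluster binary test vectors by Hamming distance, then pick from each
    # cluster the medoid (smallest summed distance to its cluster mates).
    d = pdist(tests, metric="hamming")            # condensed distance matrix
    labels = fcluster(linkage(d, method="average"), n_clusters,
                      criterion="maxclust")
    selected = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        sub = tests[idx]
        sums = (sub[:, None, :] != sub[None, :, :]).sum(axis=(1, 2))
        selected.append(int(idx[np.argmin(sums)]))
    return selected

tests = np.random.randint(0, 2, size=(64, 8))     # 8-bit input vectors
print(select_test_cases(tests, n_clusters=6))     # indices of chosen cases
```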
The objective of the proposed research is to design a system called Green Artificial Intelligence Powered Cost Pricing Models for Congestion Control (GREE-COCO) for road vehicles that addresses the issue of congestion control through the concept of cost pricing. The motivation is to facilitate smooth traffic flow on densely congested roads by incorporating static and dynamic cost pricing models. A further objective of the study is to reduce pollution and fuel consumption and to encourage people towards positive usage of the public transport system (e.g., bus, train, metro, and tram). The system will be implemented by charging vehicles driven on a particular congested road during a specific time. The pricing will differ according to the location, type of vehicle, and vehicle count. The cost pricing model incorporates an incentive approach that rewards the usage of electric/non-fuel vehicles. The system will be tested with analytics gathered from cameras installed for testing purposes in some Indian and Irish cities. One of the challenges to be addressed is developing sustainable and energy-efficient Artificial Intelligence (AI) models that consume less power, resulting in lower carbon emissions. The GREE-COCO model consists of three modules: vehicle detection and classification, license plate recognition, and the cost pricing model. The AI models for vehicle detection and classification are implemented with You Only Look Once (YOLO) v3, Faster Region-based Convolutional Neural Network (F-RCNN), and Mask Region-based Convolutional Neural Network (Mask RCNN). The selection of the best model depends upon performance with respect to accuracy and energy efficiency. The dynamic cost pricing model is tested with both the Support Vector Machine (SVM) classifier and the Generalised Linear Regression Model (GLM). The experiments are carried out on a custom-made video dataset of 103 videos of different durations. The initial results obtained from the experimental study indicate that YOLOv3 is best suited for the system, as it has the highest accuracy and is the most energy-efficient. © 2021 by SCITEPRESS - Science and Technology Publications, Lda.
Electronic health records, riding the wave of digitalization, are currently booming in many hospitals. Despite advancements, a plethora of challenges, such as data interconnectivity, interoperability, and data sharing, arises because hospitals running their own hospital management information systems form isolated clusters of data. These challenges can be solved by effectively employing a blockchain platform. The authors of this work propose a novel consensus algorithm, titled Proof of Authenticity, over a distributed platform for all medical stakeholders. Unlike previous approaches, in which researchers were the miners, this work illustrates a methodology for implementing blockchain in health care where the hospitals and clinics assume the roles of both miners and validators. The peer-to-peer network is leveraged with a designed smart contract that follows the proof of authenticity mechanism. The medical stakeholders access the medical data under security protocols and with the patient's consent in a tamper-proof network. The proposed work aims for more patient-centric and transparent health care. © 2020, Springer Nature Singapore Pte Ltd.
One of the challenges in biomedical research and clinical practice is that tremendous efforts need to be consolidated in order to use all kinds of medical data to improve work processes, increase capacity while lessening costs, and enhance efficiencies. Very few medical centers in India have digitized their patient records, and because of poor interoperability among themselves, they end up with scattered and incomplete data. Health data is proprietary and, being a personal asset of the patient, its distribution or use should be accomplished only with the patient's consent and for a specific duration. This research proposes MultiChain as a secure, decentralized network for storing Electronic Health Records. The architecture provides users with a holistic, transparent view of their medical history through disintermediation of trust while ensuring data integrity among medical facilities. This will open up new horizons of vital trends and insights for research, innovation, and development through robust analysis. The platform focuses on an interactive dashboard containing year-, month-, and season-wise statistics of various diseases, which are used to notify the users and the medical authorities on a timely basis. Prediction of epidemics using machine learning techniques will facilitate users by providing personalized care, and will help medical institutions manage inventory and procure medicines. Vital insights like the patient-to-doctor ratio, infant mortality rates, and prior knowledge of forthcoming epidemics will help government institutions to analyze and plan infrastructural requirements and services. © 2020, Springer Nature Singapore Pte Ltd.
Time series forecasting is a technique that predicts future values using time as one of the dimensions. The learning process is strongly controlled by the fine-tuning of various hyperparameters, which is often resource-intensive and requires domain knowledge. This research work focuses on automatically evolving suitable time series hyperparameters for the level, trend and seasonality components using Grammatical Evolution. The proposed Grammatical Evolution Time Series framework can accept datasets from various domains and selects the appropriate parameter values based on the nature of the dataset. The forecasted results are compared with a traditional grid search algorithm on the basis of error metrics, efficiency and scalability. © 2020 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved
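For the level and trend components, the underlying model is exponential smoothing; below is a self-contained sketch of the grid-search baseline over Holt's smoothing parameters on a synthetic series (GE would evolve these values instead; the seasonal component is omitted for brevity):

```python
import itertools
import numpy as np

def holt_forecast_error(y, alpha, beta):
    # One-step-ahead RMSE of Holt's linear method (level + trend); a
    # seasonal component is handled analogously in the full framework.
    level, trend = y[0], y[1] - y[0]
    errs = []
    for t in range(1, len(y)):
        errs.append(y[t] - (level + trend))              # one-step forecast error
        new_level = alpha * y[t] + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return float(np.sqrt(np.mean(np.square(errs))))

y = np.cumsum(np.random.normal(0.5, 1.0, 200))           # synthetic trending series
grid = np.linspace(0.05, 0.95, 10)
best = min(itertools.product(grid, grid),
           key=lambda ab: holt_forecast_error(y, *ab))
print("best (alpha, beta):", best)
```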
In Grammatical Evolution (GE), individuals occupy more space than required; that is, the Actual Length of an individual is longer than its Effective Length. This has major implications for scaling GE to complex problems that demand larger populations and more complex individuals. We show how these two lengths vary for different population sizes, demonstrating that Effective Length is relatively independent of population size, while Actual Length is proportional to it. We introduce Grammatical Evolution Memory Optimization (GEMO), a two-stage evolutionary system that uses a multi-objective approach to identify the optimal, or at least near-optimal, genome length for the problem being examined. In Stage 1, it uses a single run with a multi-objective fitness function defined to minimize the error for the problem being tackled while maximizing the ratio of Effective to Actual Genome Length, leading to better memory utilization and hence computational speedup. Then, in Stage 2, standard GE runs are performed with the genome length restricted to the length obtained in Stage 1. We demonstrate this technique on different problem domains and show that, in all cases, GEMO produces individuals with the same fitness as standard GE but significantly improves memory usage and reduces computation time. © 2020 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.
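The Stage 1 objective can be sketched as trading error against the effective-to-actual length ratio; the weighted form and the dictionary-based individuals below are illustrative stand-ins for the multi-objective fitness:

```python
def gemo_stage1_fitness(ind, w=0.5):
    # Illustrative weighted form of the Stage 1 objectives: minimise error
    # while maximising the effective/actual genome-length ratio.
    return ind["error"] - w * ind["effective_len"] / ind["actual_len"]

def stage2_genome_cap(stage1_population):
    # Cap Stage 2 genome length at the effective length of the best
    # Stage 1 individual (the near-optimal length found in Stage 1).
    best = min(stage1_population, key=gemo_stage1_fitness)
    return best["effective_len"]

pop = [{"error": 0.20, "effective_len": 60, "actual_len": 400},
       {"error": 0.21, "effective_len": 58, "actual_len": 70}]
print(stage2_genome_cap(pop))   # -> 58: the second individual wastes far less genome
```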
Emotion is a state of mind affected by many external parameters, one of which is text, either read or spoken by others or oneself. Recognition of emotion from facial expressions, sound intensity, or text is becoming an interesting research area. Extracting emotions from text is a relatively unexplored but important research problem in the natural language processing domain. It requires the construction of an emotional lexicon in the respective natural language for the classification of text/documents into emotional classes. In this paper, an overview of the state-of-the-art techniques used to construct emotional lexicons for different languages is given. These methods are in the initial stages of research, as much of the work is aimed at optimizing results, and the field is hence open to a wide range of innovative contributions. The author concludes with a proposal for developing a language-independent emotional lexicon. The main challenges in implementing this are discussed, and promising applications in various fields are elaborated. © 2019, Springer Nature Singapore Pte Ltd.
The way to proficient waste management is to guarantee appropriate segregation of waste to ensure its proper reuse. The objective of this paper is to identify the types of waste generated in India, the nature of waste coming from different cities, the current disposal methods employed there, and the amount of waste that gets dumped in landfills. These statistics are useful for identifying efficient methods of segregating waste to enhance the efficiency of the reusing and recycling process. The paper may justify the reason behind the expanding number of landfills in India. Valuable insights can be obtained after classifying waste into biodegradable/non-biodegradable classes and assessing its suitability for disposal. These insights can help in choosing proper disposal techniques for different categories of waste. © 2019 IEEE.
The conversion of unstructured big data into knowledgeable information is a hotspot of search applications today. Nearly 75% of queries issued to Web search engines aim at finding information about entities. Ideally, the user wants to know the relations existing between data objects. A conceptual knowledge graph provides an efficient way of exploring such relations. Past research relied on knowledge bases like DBpedia to build such graphs. In this paper, we introduce a method that automatically extracts the key aspects of a search query from the Wikipedia corpus. The extracted relations are dynamically expressed as a knowledge graph. Additionally, the system returns a list of results, i.e., Wikipedia documents, ranked in order of their relevance to the search query. Thus, the proposed system can be viewed as an information retrieval system that leverages a knowledge graph to provide more promising information to the user. © 2018 IEEE.
Fertility of the soil is considered the most important criterion in any agricultural practice. Nutrients present in the soil define its fertility. Mineral nutrients such as Nitrogen (N), Potassium (K) and Phosphorous (P) are vital for plant growth and food production. Lack of adequate knowledge among farmers about various parameters in farming, like soil fertility and the amount of fertilizer to be used, leads to degradation of overall soil quality. In this paper, we present a system to test soil fertility using the principle of colorimetry. Colorimetry is a technique in which we measure the amount of light absorbed by the color developed in a sample. An aqueous solution of the soil sample is prepared using extracting agents and is subjected to the photodiodes of a color sensor. The solution develops a color due to the reaction of nutrients in the soil with chemicals. The output of the color sensor is calibrated against standard values present in the database. To verify the results obtained by the color sensor, we use the Naive Bayes classification algorithm, which classifies the intensity values of the soil solutions into three class labels, namely low, medium and high. After applying the Naive Bayes classifier, we can estimate the accuracy of the intended system. The intended system is thus beneficial in reducing the time required for testing soil fertility and determining the accuracy of our results. © 2018 IEEE.
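A minimal sketch of the verification step using scikit-learn's GaussianNB; the RGB intensity readings and calibrated low/medium/high labels below are made up for illustration:

```python
from sklearn.naive_bayes import GaussianNB

# Hypothetical RGB intensity readings from the color sensor and their
# calibrated fertility labels (0 = low, 1 = medium, 2 = high).
X_train = [[180, 60, 40], [175, 66, 44], [120, 90, 70],
           [115, 95, 75], [60, 130, 110], [55, 135, 120]]
y_train = [0, 0, 1, 1, 2, 2]

clf = GaussianNB().fit(X_train, y_train)
print(clf.predict([[118, 92, 72]]))      # -> [1], i.e. "medium"
```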
This paper presents work done on an image and video retrieval system. Content-Based Image Retrieval (CBIR) and Content-Based Video Retrieval (CBVR) have attracted researchers from various research fields such as computer vision, artificial intelligence, human factors, machine learning, image processing, and man-machine modeling, to name a few. Existing methods have revealed certain flaws, like noisy data, often leading to the display of irrelevant images or videos. In our system, we use Hypergraph Learning for images and Similarity Matching for videos. All the above challenges are addressed by retrieving relevant images and videos in response to a user's keyword-based search. Users can search by attributes present during the search, or by a new attribute, which gets added to the attribute list in the database; the retrieved results can then be ranked to obtain relevant data. Experimentation is carried out on a Flickr database for images, while the videos under consideration are those available on YouTube. A database of images and videos covering users' interests in diverse domains is designed and used in the experimentation. © 2018 IEEE.
Recommender systems have grown rapidly over the last two decades, providing users with rich insights in diverse applications such as healthcare, e-commerce, education and tourism. There is therefore a growing demand to accurately analyze the reviews posted by users on different social media sites; the tourism industry's economy relies heavily on analysis of such data. We have therefore pursued the idea of building a recommender system that provides users with valuable insights and helps them make the right choice. In our approach, we experimented on data collected for 150 locations in and around Pune city, Maharashtra, India. We first categorized the reviews into location-specific details under the categories 'Temple', 'Historical', 'Hill Station' and 'Educational'. We then integrated the ratings provided by previous users under each category with the user's preferences, such as expense, total number of days and trip distance. The combined approach can recommend a set of tours that most closely matches the user's interests, enabling them to make the best choice. © 2018 IEEE.
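The combined scoring idea can be sketched as follows; the location records, preference fields and the feasibility rule are illustrative assumptions, not the paper's exact formulation:

```python
# Minimal sketch: weight each location's category rating by whether it fits
# the user's stated constraints (expense, days, distance). Placeholder data.
locations = [
    {"name": "Shaniwar Wada", "category": "Historical", "rating": 4.2,
     "expense": 500, "days": 1, "distance_km": 5},
    {"name": "Sinhagad", "category": "Hill Station", "rating": 4.5,
     "expense": 800, "days": 1, "distance_km": 30},
]

user = {"category": "Historical", "max_expense": 1000,
        "max_days": 2, "max_distance_km": 50}

def score(loc):
    """Rating of a category-matching location, zeroed if it is infeasible."""
    if loc["category"] != user["category"]:
        return 0.0
    feasible = (loc["expense"] <= user["max_expense"]
                and loc["days"] <= user["max_days"]
                and loc["distance_km"] <= user["max_distance_km"])
    return loc["rating"] if feasible else 0.0

recommended = sorted(locations, key=score, reverse=True)
print([l["name"] for l in recommended if score(l) > 0])
```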
There is a growing need for innovation in displaying important announcements through wireless notification systems. With Internet use on the rise and most users spending much of their time on mobile phones, wireless communication is preferred over manual methods. The traditional method of posting notices on boards does not serve everyone, because users are not notified individually. Our approach therefore reduces manual effort by sending push notifications to each user individually over wireless communication, so that a user who is not physically present can still see important notices. This approach can be applied in a college, where pattern matching on the notice content automatically routes each notice to the appropriate users; the system does not need the recipients to be selected manually each time. © 2018 IEEE.
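One plausible reading of the pattern-matching step, sketched with invented user records and a simple regular-expression match on the notice text (the paper's actual matching rules are not specified here):

```python
# A sketch of the pattern-matching routing step: a notice is matched
# against each user's department, and only matching users receive a push
# notification. User records and the matching rule are invented examples.
import re

users = [
    {"name": "asha", "dept": "CS"},
    {"name": "ravi", "dept": "IT"},
]

def recipients(notice):
    """Return users whose department appears as a word in the notice."""
    return [u["name"] for u in users
            if re.search(rf"\b{u['dept']}\b", notice, re.IGNORECASE)]

print(recipients("Exam schedule for CS students"))  # -> ['asha']
```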
The entire globe generates large amounts of information every day through diverse social media. Given the availability of such unstructured data, it becomes increasingly difficult to fetch relevant information that interests a user. One cause is the ever-present homographs: words that carry multiple meanings in different contexts. An approach is therefore needed to organize the conflicting information generated by homographs, and topic modeling is one such approach. Twitter is a social media site that greatly challenges researchers to interpret information accurately, since tweets may convey conflicting and contradictory information depending on each user's interpretation. The authors therefore propose using the Latent Dirichlet Allocation algorithm to generate all meaningful combinations, through which users can analyze their peers' opinions by choosing the appropriate homograph models. © 2018 IEEE.
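A small sketch of applying Latent Dirichlet Allocation with scikit-learn; the example tweets around the homograph "bank" are invented, and parameters such as the number of topics are arbitrary choices:

```python
# Hedged sketch: LDA on short texts to separate the senses of a homograph.
# The tweets are fabricated examples, not a real Twitter dataset.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = ["deposited money at the bank today",
          "the river bank was flooded after rain",
          "bank approved my loan application",
          "fishing on the bank of the river"]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(tweets)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vec.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-4:][::-1]]  # top-4 words
    print(f"topic {i}: {top}")
```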
Hadoop is a distributed master-slave platform comprising two main components: the Hadoop Distributed File System (HDFS) and MapReduce. HDFS provides distributed storage, whereas MapReduce handles computational processing. When a MapReduce cluster receives multiple jobs simultaneously, overall system performance can deteriorate seriously because of poor job response times; efficient job scheduling is therefore a real challenge in the MapReduce world. Moreover, the traditional scheduling algorithms that ship with Hadoop do not always ensure good average job response times under distinct workloads. To address this problem, we put forward an efficient Hadoop scheduler that collects information on workload patterns and distributes jobs according to our hybrid scheduling technique. The experimental results show that our scheduler improves the average job response time of MapReduce systems under different workload patterns. © 2017 IEEE.
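As a toy stand-in for a workload-aware hybrid policy (not the paper's scheduler), the sketch below favours short jobs while an ageing term keeps long jobs from starving; the runtime estimates and the ageing weight are assumptions:

```python
# Illustrative sketch: dispatch the pending job with the best priority,
# where short estimated runtimes win but waiting time (ageing) gradually
# raises the priority of long jobs to avoid starvation.
import time

pending = []

def submit(job):
    job["submitted"] = time.time()
    pending.append(job)

def priority(job, now):
    # Lower is better; 0.1 is an arbitrary ageing weight.
    return job["est_runtime"] - 0.1 * (now - job["submitted"])

def next_job():
    if not pending:
        return None
    now = time.time()
    best = min(pending, key=lambda j: priority(j, now))
    pending.remove(best)
    return best

submit({"name": "short-map", "est_runtime": 5})
submit({"name": "long-sort", "est_runtime": 600})
print(next_job()["name"])  # -> short-map
```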
Data mining helps its users deduce important information from huge databases. In the medical field, practitioners make use of large volumes of patient data: effective medical treatment is achieved only after a complete survey of ample patient data. Practitioners are, however, usually faced with the obstacle of deducing pertinent information and finding trends or patterns that may help in the analysis or treatment of a disease. Data mining is a tool that sifts through such voluminous data and presents its essential content. In this paper, we design a five-step data mining model that helps medical practitioners determine the appropriate drug for the treatment of epilepsy. Most epileptic seizures are managed through drug therapy, particularly anticonvulsant drugs, and the choice is most often related to aspects particular to each patient. The key to building a successful predictive model is to include data in the database that describes what has happened in the past. A wide range of both older and recent anticonvulsants is on the market; our paper considers both, along with other factors, to justify the choice of a drug suitable for epilepsy treatment. To determine the drug of choice for different types of epilepsy, we selected the classification method. Decision trees, a data mining technique that has been in use for almost 20 years, are increasingly being used for prediction. © Springer International Publishing Switzerland 2016.
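A minimal decision-tree sketch with scikit-learn standing in for the classification step; the patient features, encodings and drug labels are fabricated placeholders, not clinical data or the paper's model:

```python
# Hedged sketch: a decision tree that maps simple patient attributes to an
# anticonvulsant suggestion. All values below are invented for illustration.
from sklearn.tree import DecisionTreeClassifier

# features: [age, seizure_type (0=focal, 1=generalized), prior_drug_count]
X = [[25, 0, 0], [60, 1, 2], [30, 1, 0], [45, 0, 1]]
y = ["carbamazepine", "valproate", "valproate", "carbamazepine"]

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(tree.predict([[35, 1, 1]]))  # predicted drug for a new patient
```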
The brain, an amazing organ of the human body, carries electrical signals that support interaction between its various regions. Functional magnetic resonance imaging (fMRI) is a specialized type of magnetic resonance imaging scan. Although the nature of fMRI data poses various challenges for analysis, it remains an effective method for diagnosing diseases and studying the relationships between brain regions. In this paper, we propose a model that yields better fMRI data analysis. The effective interactions among brain regions can be explored using dynamic causal modeling (DCM), which helps us understand the functionality of the brain to some extent. Bayesian networks, together with Markov blankets, can be used for causal discovery and evaluated with suitable evaluation metrics. © (2012) Trans Tech Publications, Switzerland.
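As a much-simplified stand-in for the causal-discovery step, the sketch below derives candidate edges between regions from correlations of (randomly generated) BOLD time series; the 0.2 threshold is an arbitrary assumption:

```python
# Hedged sketch: a correlation-based connectivity matrix between brain
# regions as a crude precursor to causal modelling. Random placeholder data.
import numpy as np

rng = np.random.default_rng(0)
n_timepoints, n_regions = 200, 4
bold = rng.standard_normal((n_timepoints, n_regions))  # region time series

connectivity = np.corrcoef(bold, rowvar=False)  # region-by-region correlation
candidate_edges = np.abs(connectivity) > 0.2    # keep only strong links
np.fill_diagonal(candidate_edges, False)
print(candidate_edges.astype(int))
```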
Data mining makes it possible to extract hidden predictive information from large databases. Anemia is the most common disorder of the blood and can be classified in a variety of ways, for example based on the morphology of RBCs or on etiology. In this paper we present an analysis of the prediction and classification of anemia in patients using data mining techniques. The dataset is constructed from complete blood count (CBC) test data from various hospitals. We applied the C4.5 decision tree algorithm and support vector machines, implemented as J48 and SMO (sequential minimal optimization) in Weka, and conducted several experiments with these algorithms. A decision tree for the classification of anemia is generated that gives the best possible classification of anemia, along with its severity, based on CBC reports. We observed that the C4.5 algorithm performs best, with the highest accuracy. © 2011 Springer-Verlag.
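The two classifiers can be sketched in scikit-learn, with an entropy-based decision tree as a rough C4.5/J48 analogue and a linear SVC in place of Weka's SMO; the CBC values and labels below are invented placeholders:

```python
# Hedged sketch of the two classifiers on CBC-style features.
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

# features: [haemoglobin g/dL, MCV fL, RBC count 10^6/uL] (made-up values)
X = [[9.0, 70, 3.8], [13.5, 88, 4.9], [7.5, 65, 3.2], [14.2, 90, 5.1]]
y = ["anemic", "normal", "anemic", "normal"]

tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
svm = SVC(kernel="linear").fit(X, y)

sample = [[10.0, 72, 4.0]]
print(tree.predict(sample), svm.predict(sample))
```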
Schizophrenia is a complex psychiatric disorder that leads to local abnormalities in brain activity. Functional magnetic resonance imaging (fMRI) technology enables medical doctors to observe brain activity patterns that represent the execution of subject tasks, both physical and mental. In general, each subject exhibits their own activation pattern for a given task, whose intensity is affected by the physiology of the subject's brain, the use of medications, and the parameters of the scanner used for image acquisition. Since the resulting activation map can be co-registered to a standard brain, activation patterns from different individuals can be analyzed for consistency over the brain sections or brain coordinates where activation is observed. The dynamic causal model using Bayesian networks (DBNs) extracts causal relationships from fMRI data by applying HITON-PC, a local causal discovery algorithm. Based on these relationships, a dynamic causal model is built and used to classify patient data as belonging to healthy or ill subjects. Causal Explorer is a Matlab library of computational causal discovery and variable selection algorithms. © 2011 IEEE.
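HITON-PC itself is not reproduced here; as a loose stand-in for its local (Markov-blanket-style) feature selection, the sketch below keeps the top mutual-information features before classification, on random placeholder data:

```python
# Illustrative sketch only: select the most informative features (a crude
# stand-in for HITON-PC's local causal selection) and classify subjects as
# healthy or ill. Data are random placeholders, not fMRI scans.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
X = rng.standard_normal((40, 100))   # 40 subjects x 100 features
y = rng.integers(0, 2, size=40)      # 0 = healthy, 1 = ill

clf = make_pipeline(
    SelectKBest(mutual_info_classif, k=10),  # keep 10 informative features
    LogisticRegression(max_iter=1000),
)
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
```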