publications
Peer-reviewed journal and conference publications.
2024
- eScience: “TaPS: A Performance Evaluation Suite for Task-based Execution Frameworks”. Pauloski, J. Gregory, Hayot-Sasson, Valere, Gonthier, Maxime, and 5 more authors. In 2024 IEEE International Conference on e-Science, 2024.
Task-based execution frameworks, such as in parallel programming libraries, computational workflow systems, and function-as-a-service platforms, enable the composition of distinct tasks into a single, unified application designed to achieve a computational goal. Task-based execution frameworks abstract the parallel execution of an application’s tasks on arbitrary hardware. Research into these task executors has accelerated as computational sciences increasingly need to take advantage of parallel compute and/or heterogeneous hardware. However, the lack of evaluation standards makes it challenging to compare and contrast novel systems against existing implementations. Here, we introduce TaPS, the Task Performance Suite, to support continued research in parallel task executor frameworks. TaPS provides (1) a unified, modular interface for writing and evaluating applications using arbitrary execution frameworks and data management systems and (2) an initial set of synthetic and real-world science applications available within TaPS. We discuss how the design of TaPS supports the reliable evaluation of frameworks and demonstrate TaPS through a survey of benchmarks using the provided reference applications.
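As a rough illustration of what a unified, executor-agnostic interface enables, here is a minimal sketch built on Python’s standard `concurrent.futures.Executor` protocol. It is not TaPS’s actual API; the `run_app` harness and the synthetic sleep task are invented for this example.

```python
# Hypothetical sketch of an executor-agnostic benchmark harness (not TaPS's API).
# The idea: applications submit tasks through one interface, and the concrete
# framework (process pool, Dask, Parsl, ...) is swapped in behind it.
import time
from concurrent.futures import Executor, ProcessPoolExecutor

def noop_task(duration: float) -> float:
    """Synthetic task that sleeps to emulate compute."""
    time.sleep(duration)
    return duration

def run_app(executor: Executor, num_tasks: int = 8, duration: float = 0.1) -> float:
    """Run a bag-of-tasks app on any Executor and return the makespan."""
    start = time.perf_counter()
    futures = [executor.submit(noop_task, duration) for _ in range(num_tasks)]
    for f in futures:
        f.result()
    return time.perf_counter() - start

if __name__ == "__main__":
    # Any Executor-compatible framework could be dropped in here.
    with ProcessPoolExecutor(max_workers=4) as pool:
        print(f"makespan: {run_app(pool):.3f}s")
```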
- eScience: “An Empirical Investigation of Container Building Strategies and Warm Times to Reduce Cold Starts in Scientific Computing Serverless Functions”. Bauer, André, Gonthier, Maxime, Pan, Haochen, and 9 more authors. In 2024 IEEE International Conference on e-Science, 2024.
Serverless computing has revolutionized application development and deployment by abstracting infrastructure management, allowing developers to focus on writing code. To do so, serverless platforms dynamically create execution environments, often using containers. The cost to create and deploy these environments is known as “cold start” latency, and this cost can be particularly detrimental to scientific computing workloads characterized by sporadic and dynamic demands. We investigate methods to mitigate cold start issues in scientific computing applications by pre-installing Python packages in container images. Using data from Globus Compute and Binder, we empirically analyze cold start behavior and evaluate four strategies for building containers, including fully pre-built environments and dynamic, on-demand installations. Our results show that pre-installing all packages reduces initial cold start time but requires significant storage. Conversely, dynamic installation offers lower storage requirements but incurs repetitive delays. Additionally, we implemented a simulator and assessed the impact of different warm times, finding that moderate warm times significantly reduce cold starts without the excessive overhead of maintaining always-hot states.
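The warm-time trade-off the simulator explores can be captured in a few lines. The sketch below counts cold starts in a synthetic arrival trace for different warm windows; the trace and the constants are made up for illustration, not the paper’s data.

```python
# Illustrative warm-time simulator: given request arrival times, count cold starts
# when the gap since the last request exceeds the warm window.
def count_cold_starts(arrivals: list[float], warm_time: float) -> int:
    cold, last = 0, None
    for t in sorted(arrivals):
        if last is None or t - last > warm_time:
            cold += 1  # container expired (or never existed): pay a cold start
        last = t
    return cold

arrivals = [0.0, 5.0, 6.0, 120.0, 125.0, 600.0]  # seconds (synthetic trace)
for warm in (0, 60, 300):
    print(f"warm_time={warm:>3}s -> cold starts: {count_cold_starts(arrivals, warm)}")
```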
- Preprint: “Causal Discovery over High-Dimensional Structured Hypothesis Spaces with Causal Graph Partitioning”. Shah, Ashka, DePavia, Adela, Hudson, Nathaniel, and 2 more authors. arXiv preprint arXiv:2406.06348, 2024.
The aim in many sciences is to understand the mechanisms that underlie the observed distribution of variables, starting from a set of initial hypotheses. Causal discovery allows us to infer mechanisms as sets of cause and effect relationships in a generalized way – without necessarily tailoring to a specific domain. Causal discovery algorithms search over a structured hypothesis space, defined by the set of directed acyclic graphs, to find the graph that best explains the data. For high-dimensional problems, however, this search becomes intractable and scalable algorithms for causal discovery are needed to bridge the gap. In this paper, we define a novel causal graph partition that allows for divide-and-conquer causal discovery with theoretical guarantees. We leverage the idea of a superstructure – a set of learned or existing candidate hypotheses – to partition the search space. We prove under certain assumptions that learning with a causal graph partition always yields the Markov Equivalence Class of the true causal graph. We show our algorithm achieves comparable accuracy and a faster time to solution for biologically-tuned synthetic networks and networks with up to 10^4 variables. This makes our method applicable to gene regulatory network inference and other domains with high-dimensional structured hypothesis spaces.
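To make the divide-and-conquer structure concrete, here is a schematic sketch: split a superstructure into overlapping blocks, run a (stubbed) local discovery routine on each, and merge. The community-based split below is only illustrative; it does not carry the causal graph partition’s theoretical guarantees.

```python
# Schematic divide-and-conquer loop over a superstructure. Illustrative only:
# the local learner here is a stub that just keeps superstructure edges.
import networkx as nx
from networkx.algorithms.community import louvain_communities

def local_discovery(subgraph: nx.Graph) -> set:
    """Stand-in for a real causal discovery algorithm run on one block."""
    return set(subgraph.edges())

superstructure = nx.karate_club_graph()  # toy superstructure of candidate edges
blocks = louvain_communities(superstructure, seed=0)
learned = set()
for block in blocks:
    # Expand each block by its neighbors so blocks overlap (divide step).
    expanded = set(block) | {n for v in block for n in superstructure.neighbors(v)}
    learned |= local_discovery(superstructure.subgraph(expanded))  # conquer/merge
print(f"{len(blocks)} blocks, {len(learned)} candidate edges after merging")
```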
- JDIQ: “Thinking in Categories: A Survey on Assessing the Quality for Time Series Synthesis”. Stenger, Michael, Bauer, André, Prantl, Thomas, and 5 more authors. Journal of Data and Information Quality, May 2024.
Time series data are widely used and provide a wealth of information for countless applications. However, some applications are faced with a limited amount of data, or the data cannot be used due to confidentiality concerns. To overcome these obstacles, time series can be generated synthetically. For example, electrocardiograms can be synthesized to make them available for building models to predict conditions such as cardiac arrhythmia without leaking patient information. Although many different approaches to time series synthesis have been proposed, evaluating the quality of synthetic time series data poses unique challenges and remains an open problem, as there is a lack of a clear definition of what constitutes a “good” synthesis. To this end, we present a comprehensive literature survey to identify different aspects of synthesis quality and their relationships. Based on this, we propose a definition of synthesis quality and a systematic evaluation procedure for assessing it. With this work, we aim to provide a common language and criteria for evaluating synthetic time series data. Our goal is to promote more rigorous and reproducible research in time series synthesis by enabling researchers and practitioners to generate high-quality synthetic time series data.
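By way of illustration, a quality-assessment procedure of this kind aggregates concrete fidelity checks between real and synthetic series. The two checks below (a marginal-distribution distance and an autocorrelation gap) are plausible ingredients of such a procedure, not the paper’s specific criteria; the series themselves are synthetic toys.

```python
# Two simple fidelity checks of the kind a synthesis-quality procedure might
# aggregate: marginal-distribution distance and lag-1 autocorrelation gap.
import numpy as np
from scipy.stats import wasserstein_distance

def acf(x: np.ndarray, lag: int = 1) -> float:
    """Lag-k autocorrelation of a 1-D series."""
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]

rng = np.random.default_rng(0)
real = np.sin(np.linspace(0, 20, 500)) + 0.1 * rng.normal(size=500)
synth = np.sin(np.linspace(0, 20, 500) + 0.3) + 0.15 * rng.normal(size=500)

print("marginal distance:", round(wasserstein_distance(real, synth), 4))
print("ACF(1) gap:", round(abs(acf(real) - acf(synth)), 4))
```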
- Preprint: “Deep Learning for Molecular Orbitals”. King, Daniel, Grzenda, Daniel, Zhu, Ray, and 3 more authors. May 2024.
The advancement of deep learning in chemistry has resulted in state-of-the-art models that incorporate an increasing number of concepts from standard quantum chemistry, such as orbitals and Hamiltonians. With an eye towards the future development of these deep learning approaches, we present here what we believe to be the first work focused on assigning labels to orbitals, namely energies and characterizations, given the real-space descriptions of these orbitals from standard electronic structure theories such as Hartree-Fock. In addition to providing a foundation for future development, we expect these models to have immediate impact in automating and interpreting the results of advanced electronic structure approaches for chemical reactivity and spectroscopy.
- Sensor Letters: “RuralAI in Tomato Farming: Integrated Sensor System, Distributed Computing and Hierarchical Federated Learning for Crop Health Monitoring”. Devaraj, Harish, Sohail, Shaleeza, Li, Boyang, and 7 more authors. IEEE Sensors Letters, May 2024.
Precision horticulture is evolving due to scalable sensor deployment and machine learning integration. These advancements boost the operational efficiency of individual farms, balancing the benefits of analytics with autonomy requirements. However, given concerns that affect wide geographic regions (e.g., climate change), there is a need to apply models that span farms. Federated Learning (FL) has emerged as a potential solution. FL enables decentralized machine learning (ML) across different farms without sharing private data. Traditional FL assumes simple 2-tier network topologies and thus falls short of operating on more complex networks found in real-world agricultural scenarios. Networks vary across crops and farms, and encompass various sensor data modes, extending across jurisdictions. New hierarchical FL (HFL) approaches are needed for more efficient and context-sensitive model sharing, accommodating regulations across multiple jurisdictions. We present the RuralAI architecture deployment for tomato crop monitoring, featuring sensor field units for soil, crop, and weather data collection. HFL with personalization is used to offer localized and adaptive insights. Model management, aggregation, and transfers are facilitated via a flexible approach, enabling seamless communication between local devices, edge nodes, and the cloud.
- FGCS: “QoS-aware edge AI placement and scheduling with multiple implementations in FaaS-based edge computing”. Hudson, Nathaniel, Khamfroush, Hana, Baughman, Matt, and 3 more authors. Future Generation Computer Systems, May 2024.
Resource constraints on the computing continuum require that we make smart decisions for serving AI-based services at the network edge. AI-based services typically have multiple implementations (e.g., image classification implementations include SqueezeNet, DenseNet, and others) with varying trade-offs (e.g., latency and accuracy). The question then is how should AI-based services be placed across Function-as-a-Service (FaaS) based edge computing systems in order to maximize total Quality-of-Service (QoS). To address this question, we propose a problem that jointly aims to solve (i) edge AI service placement and (ii) request scheduling. These are done across two time-scales (one for placement and one for scheduling). We first cast the problem as an integer linear program. We then decompose the problem into separate placement and scheduling subproblems and prove that both are NP-hard. We then propose a novel placement algorithm that places services while considering device-to-device communication across edge clouds to offload requests to one another. Our results show that the proposed placement algorithm is able to outperform a state-of-the-art placement algorithm for AI-based services, and other baseline heuristics, with regard to maximizing total QoS. Additionally, we present a federated learning-based framework, FLIES, to predict the future incoming service requests and their QoS requirements. Our results also show that our FLIES algorithm is able to outperform a standard decentralized learning baseline for predicting incoming requests and show comparable predictive performance when compared to centralized training.
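For intuition, a toy version of the placement subproblem can be written as a small integer program, as sketched below with the PuLP package. The service sizes, QoS values, and capacities are invented, and the paper’s full formulation also couples placement with two-time-scale request scheduling.

```python
# Toy placement subproblem as an integer linear program (requires PuLP).
import pulp

services = {"squeezenet": (1, 3), "densenet": (4, 5)}  # name: (size, QoS value)
capacity = {"edge1": 4, "edge2": 2}                    # node: storage capacity

prob = pulp.LpProblem("edge_ai_placement", pulp.LpMaximize)
x = {(s, n): pulp.LpVariable(f"x_{s}_{n}", cat="Binary")
     for s in services for n in capacity}
prob += pulp.lpSum(services[s][1] * x[s, n] for s, n in x)            # total QoS
for n, cap in capacity.items():                                       # capacity
    prob += pulp.lpSum(services[s][0] * x[s, n] for s in services) <= cap
for s in services:                                                    # place once
    prob += pulp.lpSum(x[s, n] for n in capacity) <= 1

prob.solve(pulp.PULP_CBC_CMD(msg=0))
print({k: int(v.value()) for k, v in x.items()})
```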
- BDCAT: “Trillion Parameter AI Serving Infrastructure for Scientific Discovery: A Survey and Vision”. Hudson, Nathaniel, Pauloski, J. Gregory, Baughman, Matt, and 13 more authors. In Proceedings of the IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, May 2024.
Deep learning methods are transforming research, enabling new techniques, and ultimately leading to new discoveries. As the demand for more capable AI models continues to grow, we are now entering an era of Trillion Parameter Models (TPM), or models with more than a trillion parameters—such as Huawei’s PanGu-Σ. We describe a vision for the ecosystem of TPM users and providers that caters to the specific needs of the scientific community. We then outline the significant technical challenges and open problems in system design for serving TPMs to enable scientific research and discovery. Specifically, we describe the requirements of a comprehensive software stack and interfaces to support the diverse and flexible requirements of researchers.
2023
- SC Workshop: “Tournament-Based Pretraining to Accelerate Federated Learning”. Baughman, Matt, Hudson, Nathaniel, Chard, Ryan, and 3 more authors. In Proceedings of the SC ’23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, May 2023.
Advances in hardware, proliferation of compute at the edge, and data creation at unprecedented scales have made federated learning (FL) necessary for the next leap forward in pervasive machine learning. For privacy and network reasons, large volumes of data remain stranded on endpoints located in geographically austere (or at least austere network-wise) locations. However, challenges exist to the effective use of these data. To solve these system- and functional-level challenges, we present three novel variants of a serverless federated learning framework. We also present tournament-based pretraining, which we demonstrate significantly improves model performance in some experiments. Overall, these extensions to FL and our novel training method enable greater focus on science rather than ML development.
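For flavor, a tournament over pretrained candidates might look like the sketch below, where candidate models compete by validation score and the winner seeds federated training. The single-elimination bracket and the random scores are illustrative assumptions, not the paper’s exact procedure.

```python
# Schematic tournament-based selection of a pretrained starting model for FL.
import random

random.seed(0)
candidates = {f"pretrain_{i}": random.random() for i in range(8)}  # name -> val score

def tournament(pool: list) -> str:
    """Single-elimination bracket (assumes a power-of-two pool):
    the higher validation score wins each match."""
    while len(pool) > 1:
        pool = [max(pool[i], pool[i + 1], key=candidates.get)
                for i in range(0, len(pool), 2)]
    return pool[0]

winner = tournament(list(candidates))
print("seed model for federated training:", winner, round(candidates[winner], 3))
```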
- IMM: “Measurement and Applications: Exploring the Challenges and Opportunities of Hierarchical Federated Learning in Sensor Applications”. Ooi, Melanie Po-Leen, Sohail, Shaleeza, Huang, Victoria Guiying, and 9 more authors. IEEE Instrumentation & Measurement Magazine, May 2023.
Sensor applications have become ubiquitous in modern society as the digital age continues to advance. AI-based techniques (e.g., machine learning) are effective at extracting actionable information from large amounts of data. An example would be an automated water irrigation system that uses AI-based techniques on soil quality data to decide how to best distribute water. However, these AI-based techniques are costly in terms of hardware resources, and Internet-of-Things (IoT) sensors are resource-constrained with respect to processing power, energy, and storage capacity. These limitations can compromise the security, performance, and reliability of sensor-driven applications. To address these concerns, cloud computing services can be used by sensor applications for data storage and processing. Unfortunately, cloud-based sensor applications that require real-time processing, such as medical applications (e.g., fall detection and stroke prediction), are vulnerable to issues such as network latency due to the sparse and unreliable networks between the sensor nodes and the cloud server [1]. As users approach the edge of the communications network, latency issues become more severe and frequent. A promising alternative is edge computing, which provides cloud-like capabilities at the edge of the network by pushing storage and processing capabilities from centralized nodes to edge devices that are closer to where the data are gathered, resulting in reduced network delays [2], [3].
- Preprint: “Attention Lens: A Tool for Mechanistically Interpreting the Attention Head Information Retrieval Mechanism”. Sakarvadia, Mansi, Khan, Arham, Ajith, Aswathy, and 5 more authors. May 2023.
Transformer-based Large Language Models (LLMs) are the state-of-the-art for natural language tasks. Recent work has attempted to decode, by reverse engineering the role of linear layers, the internal mechanisms by which LLMs arrive at their final predictions for text completion tasks. Yet little is known about the specific role of attention heads in producing the final token prediction. We propose Attention Lens, a tool that enables researchers to translate the outputs of attention heads into vocabulary tokens via learned attention-head-specific transformations called lenses. Preliminary findings from our trained lenses indicate that attention heads play highly specialized roles in language models. The code for Attention Lens is available at github.com/msakarvadia/AttentionLens.
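Conceptually, a lens is a learned map from a single attention head’s output to vocabulary logits. The PyTorch sketch below shows the shape of the idea with placeholder dimensions and random inputs; see the linked repository for the real implementation.

```python
# Minimal sketch of a per-head "lens": a learned projection from one attention
# head's output to vocabulary logits. Dimensions and inputs are placeholders.
import torch
import torch.nn as nn

d_head, vocab = 64, 50257  # GPT-2-like sizes (illustrative)

class AttnLens(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(d_head, vocab)  # one lens per (layer, head)

    def forward(self, head_out: torch.Tensor) -> torch.Tensor:
        return self.proj(head_out)  # logits over the vocabulary

lens = AttnLens()
head_out = torch.randn(1, d_head)      # captured activation of one head
top = lens(head_out).topk(5).indices   # tokens this head "writes" toward
print(top)
```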
- BlackBoxNLP: “Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models”. Sakarvadia, Mansi, Ajith, Aswathy, Khan, Arham, and 5 more authors. May 2023.
Answering multi-hop reasoning questions requires retrieving and synthesizing information from diverse sources. Large Language Models (LLMs) struggle to perform such reasoning consistently. Here we propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LLM attention heads. First, we analyze the per-layer activations of GPT-2 models in response to single and multi-hop prompts. We then propose a mechanism that allows users to inject pertinent prompt-specific information, which we refer to as "memories," at critical LLM locations during inference. By thus enabling the LLM to incorporate additional relevant information during inference, we enhance the quality of multi-hop prompt completions. We show empirically that a simple, efficient, and targeted memory injection into a key attention layer can often increase the probability of the desired next token in multi-hop tasks, by up to 424%.
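The mechanism can be pictured as a forward hook that adds an embedded “memory” into a layer’s hidden states during inference. The sketch below does this on a stub module with invented sizes; the paper targets specific attention layers of GPT-2 models.

```python
# Toy illustration of a memory injection: add the embedding of a "memory"
# token into a chosen layer's output during the forward pass.
import torch
import torch.nn as nn

d_model, vocab = 32, 100
embed = nn.Embedding(vocab, d_model)
layer = nn.Linear(d_model, d_model)  # stand-in for one transformer sublayer

memory_token = torch.tensor(42)      # id of the fact we want the model to use
scale = 4.0                          # injection magnitude (a tunable knob)

def inject_memory(module, inputs, output):
    # Returning a value from a forward hook replaces the layer's output.
    return output + scale * embed(memory_token)

handle = layer.register_forward_hook(inject_memory)
hidden = torch.randn(1, d_model)
patched = layer(hidden)              # hook fires here, adding the memory
handle.remove()
print(patched.shape)
```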
- WF-IoT: “Adversarial Predictions of Data Distributions Across Federated Internet-of-Things Devices”. Rajani, Samir, Dematties, Dario, Hudson, Nathaniel, and 4 more authors. In 2023 IEEE World Forum on Internet of Things (WF-IoT), Oct 2023.
Federated learning (FL) is increasingly becoming the default approach for training machine learning models across decentralized Internet-of-Things (IoT) devices. A key advantage of FL is that no raw data are communicated across the network, providing an immediate layer of privacy. Despite this, recent works have demonstrated that data reconstruction can be done with the locally trained model updates which are communicated across the network. However, many of these works have limitations with regard to how the gradients are computed in backpropagation. In this work, we demonstrate that the model weights shared in FL can expose revealing information about the local data distributions of IoT devices. This leakage could expose sensitive information to malicious actors in a distributed system. We further discuss results which show that injecting noise into model weights is ineffective at preventing data leakage without seriously harming the global model accuracy.
- TECS: “Deadline-Aware Task Offloading for Vehicular Edge Computing Networks Using Traffic Lights Data”. Oza, Pratham, Hudson, Nathaniel, Chantem, Thidapat, and 1 more author. ACM Transactions on Embedded Computing Systems, Apr 2023.
As vehicles become increasingly automated, novel vehicular applications emerge to enhance the safety and security of the vehicles and improve user experience. This brings ever-increasing data and resource requirements for timely computation on the vehicle’s on-board computing systems. To alleviate these demands, prior work proposes deploying vehicular edge computing (VEC) resources on the road-side units (RSUs) in the traffic infrastructure to which the vehicles can communicate and offload compute-intensive tasks. Due to the limited communication range of these RSUs, the communication link between the vehicles and the RSUs, and therefore the response times of the offloaded applications, are significantly impacted by the vehicle’s mobility through road traffic. Existing task offloading strategies do not consider the influence of traffic lights on vehicular mobility while offloading workloads on the RSUs, and thereby cause deadline misses and quality-of-service (QoS) reduction for the offloaded tasks. In this paper, we present a novel task model that captures time and location-specific requirements for vehicular applications. We then present a deadline-based strategy that incorporates traffic light data to opportunistically offload tasks. Our approach allows up to 33% more tasks to be offloaded onto the RSUs, compared to existing work, without causing any deadline misses and thereby maximizing the resource utilization on the RSUs.
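The core signal is simple to state in code: a red light extends how long a vehicle dwells in an RSU’s coverage, which in turn bounds which task deadlines can be met. The sketch below is a toy decision rule with invented numbers, not the paper’s full strategy.

```python
# Toy version of the key signal: use the remaining red-light time to estimate
# how long a vehicle stays in an RSU's range, then offload only tasks that fit.
def dwell_time(red_remaining_s: float, cruise_dwell_s: float) -> float:
    """Stopped at a red light, the vehicle stays in range longer."""
    return cruise_dwell_s + red_remaining_s

def can_offload(task_exec_s: float, deadline_s: float, dwell_s: float) -> bool:
    # The task must finish both before its deadline and before the vehicle leaves.
    return task_exec_s <= min(deadline_s, dwell_s)

dwell = dwell_time(red_remaining_s=12.0, cruise_dwell_s=4.0)
print(can_offload(task_exec_s=10.0, deadline_s=15.0, dwell_s=dwell))
```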
- ICPE: “Searching for the Ground Truth: Assessing the Similarity of Benchmarking Runs”. Bauer, André, Straesser, Martin, Leznik, Mark, and 6 more authors. In 2023 ACM/SPEC International Conference on Performance Engineering, Data Challenge Track, Apr 2023.
Stable and repeatable measurements are essential for comparing the performance of different systems or applications, and benchmarks are used to ensure accuracy and replication. However, if the corresponding measurements are not stable and repeatable, wrong conclusions can be drawn. To facilitate the task of determining whether the measurements are similar, we used a data set of 586 micro-benchmarks to (i) analyze the data set itself, (ii) examine an approach from related work, and (iii) propose and evaluate a heuristic. To evaluate the different approaches, we perform a peer review to assess the dissimilarity of the benchmark runs. Our results show that this task is challenging even for humans and that our heuristic exhibits a sensitivity of 92%.
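As one plausible building block for such a heuristic, two runs’ measurement distributions can be compared with a two-sample Kolmogorov-Smirnov test, as sketched below on synthetic latencies. The paper’s actual heuristic (with its reported 92% sensitivity) is more involved than this single test.

```python
# Compare two benchmark runs' measurement distributions with a KS test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
run_a = rng.normal(100, 5, size=300)   # latencies from run A (synthetic)
run_b = rng.normal(101, 5, size=300)   # latencies from run B (synthetic)

stat, p = ks_2samp(run_a, run_b)
verdict = "similar" if p > 0.05 else "dissimilar"
print(f"KS statistic={stat:.3f}, p={p:.3f} -> {verdict}")
```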
- PerCom: “Balancing federated learning trade-offs for heterogeneous environments”. Baughman, Matt, Hudson, Nathaniel, Foster, Ian, and 1 more author. In 2023 IEEE International Conference on Pervasive Computing and Communications (PerCom), Work in Progress, Apr 2023.
Federated Learning (FL) is an enabling technology for supporting distributed machine learning across several devices on decentralized data. A critical challenge when deploying FL in practice is the system resource heterogeneity of the worker devices that train the ML model locally. FL workflows can be run across diverse computing devices, from sensors to High Performance Computing (HPC) clusters; however, these resource disparities may result in some devices being too burdened by the task of training, and thus struggling to perform robust training when compared to more high-powered devices (or clusters). Techniques can be applied to reduce the cost of training on low-power devices, such as reducing the number of epochs to perform during training. However, such techniques may also harm the performance of the locally-trained model, introducing a resource-model performance trade-off. In this work, we perform robust experimentation with the aim of balancing this resource-model performance trade-off in FL. Our results provide intuition for how training hyper-parameters can be tuned to improve this trade-off in FL.
2022
- Cloud Continuum: “Hierarchical and Decentralised Federated Learning”. Rana, Omer, Spyridopoulos, Theodoros, Hudson, Nathaniel, and 4 more authors. In 2022 Cloud Computing, Apr 2022.
Federated learning has shown enormous promise as a way of training ML models in distributed environments while reducing communication costs and protecting data privacy. However, the rise of complex cyber-physical systems, such as the Internet-of-Things, presents new challenges that are not met by traditional FL methods. Hierarchical Federated Learning (H-FL) extends the traditional FL process to enable more efficient model aggregation based on application needs or characteristics of the deployment environment (e.g., resource capabilities and/or network connectivity). It illustrates the benefits of balancing processing across the cloud-edge continuum. H-FL is likely to be a key enabler for a wide range of applications, such as smart farming and smart energy management, as it can improve performance and reduce costs, whilst also enabling FL workflows to be deployed in environments that are not well-suited to traditional FL. Model aggregation algorithms, software frameworks, and infrastructures will need to be designed and implemented to make such solutions accessible to researchers and engineers across a growing set of domains. H-FL also introduces a number of new challenges. For instance, there are implicit infrastructural challenges. There is also a trade-off between having generalised models and personalised models. If there exist geographical patterns for data (e.g., soil conditions in a smart farm likely are related to the geography of the region itself), then it is crucial that models used locally can consider their own locality in addition to a globally-learned model. H-FL will be crucial to future FL solutions as it can aggregate and distribute models at multiple levels to optimally serve the trade-off between locality dependence and global anomaly robustness.
- eScience: “FLoX: Federated learning with FaaS at the edge”. Kotsehub, Nikita, Baughman, Matt, Chard, Ryan, and 5 more authors. In 2022 IEEE International Conference on e-Science, Dec 2022.
Federated learning (FL) is a technique for distributed machine learning that enables the use of siloed and distributed data. With FL, individual machine learning models are trained separately and then only model parameters (e.g., weights in a neural network) are shared and aggregated to create a global model, allowing data to remain in its original environment. While many applications can benefit from FL, existing frameworks are incomplete, cumbersome, and environment-dependent. To address these issues, we present FLoX, an FL framework built on the funcX federated serverless computing platform. FLoX decouples FL model training/inference from infrastructure management and thus enables users to easily deploy FL models on one or more remote computers with a single line of Python code. We evaluate FLoX using three benchmark datasets deployed on ten heterogeneous and distributed compute endpoints. We show that FLoX incurs minimal overhead, especially with respect to the large communication overheads between endpoints for data transfer. We show how balancing the number of samples and epochs with respect to the capacities of participating endpoints can significantly reduce training time with minimal reduction in accuracy. Finally, we show that global models consistently outperform any single model on average by 8%.
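To show the round structure FLoX automates, here is a self-contained, pure-Python stand-in: each “endpoint” is just a local function and the model is a plain weight vector. This deliberately avoids guessing at FLoX’s real API, which dispatches functions to remote endpoints via funcX rather than running them locally.

```python
# Stand-in for the FL round structure: fan out training, then aggregate.
import random

def local_train(weights: list, seed: int) -> list:
    """Fake local update: perturb the global weights (real endpoints train on data)."""
    rng = random.Random(seed)
    return [w + rng.uniform(-0.1, 0.1) for w in weights]

def fed_avg(updates: list) -> list:
    """Coordinate-wise average of client updates (FedAvg with equal weights)."""
    return [sum(ws) / len(ws) for ws in zip(*updates)]

global_model = [0.0] * 4
for rnd in range(3):  # each round: dispatch, train locally, aggregate
    updates = [local_train(global_model, seed=e) for e in range(10)]  # 10 "endpoints"
    global_model = fed_avg(updates)
print(global_model)
```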
- SMARTCOMP: “Smart Edge-Enabled Traffic Light Control: Improving Reward-Communication Trade-offs with Federated Reinforcement Learning”. Hudson, Nathaniel, Oza, Pratham, Khamfroush, Hana, and 1 more author. In 2022 IEEE International Conference on Smart Computing (SMARTCOMP), Jul 2022.
Traffic congestion is a costly phenomenon of everyday life. Reinforcement Learning (RL) is a promising solution due to its applicability to solving complex decision-making problems in highly dynamic environments. To train smart traffic lights using RL, large amounts of data are required. Recent RL-based approaches consider training to occur on some nearby server or a remote cloud server. However, this requires that traffic lights all communicate their raw data to some central location. For large road systems, communication cost can be impractical, particularly if traffic lights collect heavy data (e.g., video, LIDAR). As such, this work pushes training to the traffic lights directly to reduce communication cost. However, completely independent learning can reduce the performance of trained models. As such, this work considers the recent advent of Federated Reinforcement Learning (FedRL) for edge-enabled traffic lights so they can learn from each other’s experience by periodically aggregating locally-learned policy network parameters rather than share raw data, hence keeping communication costs low. To do this, we propose the SEAL framework which uses an intersection-agnostic representation to support FedRL across traffic lights controlling heterogeneous intersection types. We then evaluate our FedRL approach against Centralized and Decentralized RL strategies. We compare the reward-communication trade-offs of these strategies. Our results show that FedRL is able to reduce the communication costs associated with Centralized training by 36.24%, while only seeing a 2.11% decrease in average reward (i.e., decreased traffic congestion).
- CCNC: “Communication-Loss Trade-Off in Federated Learning: A Distributed Client Selection Algorithm”. Hosseinzadeh, Minoo, Hudson, Nathaniel, Heshmati, Sam, and 1 more author. In 2022 IEEE 19th Annual Consumer Communications & Networking Conference (CCNC), May 2022.
Mass data generation occurring in the Internet-of-Things (IoT) requires processing to extract meaningful information. Deep learning is commonly used to perform such processing. However, due to the sensitive nature of these data, it is important to consider data privacy. As such, federated learning (FL) has been proposed to address this issue. FL pushes training to the client devices and tasks a central server with aggregating collected model weights to update a global model. However, the repeated transmission of these model weights can gradually become costly. The trade-off between communicating model weights for aggregation and the loss provided by the global model remains an open problem. In this work, we cast this trade-off problem of client selection in FL as an optimization problem. We then design a Distributed Client Selection (DCS) algorithm that allows client devices to decide to participate in aggregation in hopes of minimizing overall communication cost — while maintaining low loss. We evaluate the performance of our proposed client selection algorithm against standard FL and a state-of-the-art client selection algorithm, called Power-of-Choice (PoC), using CIFAR-10, FMNIST, and MNIST datasets. Our experimental results confirm that our DCS algorithm is able to closely match the loss provided by the standard FL and PoC, while on average reducing the overall communication cost by nearly 32.67% and 44.71% in comparison to standard FL and PoC, respectively.
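The distinguishing idea, clients deciding locally whether to participate, can be sketched with a simple threshold rule on the update’s magnitude. This rule is an illustrative stand-in, not the DCS algorithm itself.

```python
# Sketch of a distributed client-selection rule: each client decides locally
# whether its update is worth the communication cost (illustrative rule only).
import numpy as np

def should_participate(old_w: np.ndarray, new_w: np.ndarray, tau: float) -> bool:
    """Send the update only if it moved enough to plausibly help the global model."""
    return float(np.linalg.norm(new_w - old_w)) > tau

rng = np.random.default_rng(2)
old = rng.normal(size=10)
# Three clients whose local training moved the weights by different amounts.
clients = [old + rng.normal(scale=s, size=10) for s in (0.01, 0.1, 0.5)]
print([should_participate(old, w, tau=0.5) for w in clients])  # small updates stay local
```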
2021
- ICCCN: “QoS-Aware Placement of Deep Learning Services on the Edge with Multiple Service Implementations”. Hudson, Nathaniel, Khamfroush, Hana, and Lucani, Daniel E. In 2021 IEEE International Conference on Computer Communications and Networks (ICCCN), Big Data and Machine Learning for Networking (BDMLN) Workshop, May 2021.
Mobile edge computing pushes computationally-intensive services closer to the user to provide reduced delay due to physical proximity. This has led many to consider deploying deep learning models on the edge – commonly known as edge intelligence (EI). EI services can have many model implementations that provide different QoS. For instance, one model can perform inference faster than another (thus reducing latency) while achieving less accuracy when evaluated. In this paper, we study joint service placement and model scheduling of EI services with the goal of maximizing Quality-of-Service (QoS) for end users, where EI services have multiple implementations to serve user requests, each with varying costs and QoS benefits. We cast the problem as an integer linear program and prove that it is NP-hard. We then prove the objective is equivalent to maximizing a monotone increasing, submodular set function and thus can be solved greedily while maintaining a (1 – 1/e)-approximation guarantee. We then propose two greedy algorithms: one that theoretically guarantees this approximation and another that empirically matches its performance with greater efficiency. Finally, we thoroughly evaluate the proposed algorithm for making placement and scheduling decisions in both synthetic and real-world scenarios against the optimal solution and some baselines. In the real-world case, we consider real machine learning models using the ImageNet 2012 dataset for requests. Our numerical experiments empirically show that our more efficient greedy algorithm is able to approximate the optimal solution with a 0.904 approximation on average, while the next closest baseline achieves a 0.607 approximation on average.
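The (1 – 1/e) guarantee comes from the classic greedy pattern for maximizing a monotone submodular set function under a cardinality budget, sketched below on a toy coverage objective rather than the paper’s placement objective.

```python
# Generic greedy maximization of a monotone submodular function f under a
# cardinality budget k: repeatedly add the element with the largest marginal gain.
def greedy(ground: set, k: int, f) -> set:
    chosen = set()
    for _ in range(k):
        best = max(ground - chosen, key=lambda e: f(chosen | {e}) - f(chosen))
        chosen.add(best)
    return chosen

# Toy objective: weighted coverage (a standard monotone submodular function).
coverage = {"a": {1, 2}, "b": {2, 3}, "c": {4}, "d": {1, 4, 5}}
f = lambda S: len(set().union(*(coverage[e] for e in S))) if S else 0
print(greedy(set(coverage), k=2, f=f))  # e.g. {'d', 'b'}, covering all 5 items
```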
- ICCCN: “A Framework for Edge Intelligent Smart Distribution Grids via Federated Learning”. Hudson, Nathaniel, Hossain, Md Jakir, Hosseinzadeh, Minoo, and 3 more authors. In 2021 IEEE International Conference on Computer Communications and Networks (ICCCN), May 2021.
Recent advances in distributed data processing and machine learning provide new opportunities to enable critical, time-sensitive functionalities of smart distribution grids in a secure and reliable fashion. Combining the recent advances in edge computing (EC) and edge intelligence (EI) with existing advanced metering infrastructure (AMI) has the potential to reduce overall communication cost, preserve user privacy, and provide improved situational awareness. In this paper, we provide an overview of how EC and EI can supplement applications relevant to AMI systems. Additionally, using such systems in tandem can enable distributed deep learning frameworks (e.g., federated learning) to empower distributed data processing and intelligent decision making for AMI. Finally, to demonstrate the efficacy of this considered architecture, we approach the non-intrusive load monitoring (NILM) problem using federated learning to train a deep recurrent neural network architecture in a 2-tier and 3-tier manner. In this approach, smart homes locally train a neural network using their metering data and only share the learned model parameters with AMI components for aggregation. Our results show this can reduce communication cost associated with distributed learning, as well as provide an immediate layer of privacy, due to no raw data being communicated to AMI components. Further, we show that FL is able to closely match the model loss provided by standard centralized deep learning, where raw data is communicated for centralized training.
- DySPAN: “Joint Compression and Offloading Decisions for Deep Learning Services in 3-Tier Edge Systems”. Hosseinzadeh, Minoo, Hudson, Nathaniel, Zhao, Xiaobo, and 2 more authors. In 2021 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), Jan 2021.
Task offloading in edge computing infrastructure remains a challenge for dynamic and complex environments, such as Industrial Internet-of-Things. The hardware resource constraints of edge servers must be explicitly considered to ensure that system resources are not overloaded. Many works have studied task offloading while focusing primarily on ensuring system resilience. However, in the face of deep learning-based services, model performance with respect to loss/accuracy must also be considered. Deep learning services with different implementations may provide varying amounts of loss/accuracy while also being more complex to run inference on. That said, communication latency can be reduced to improve overall Quality-of-Service by employing compression techniques. However, such techniques can also have the side-effect of reducing the loss/accuracy provided by deep learning-based services. As such, this work studies a joint optimization problem for task offloading decisions in 3-tier edge computing platforms where decisions regarding task offloading are made in tandem with compression decisions. The objective is to optimally offload requests with compression such that the latency-accuracy trade-off is not greatly jeopardized. We cast this problem as a mixed integer nonlinear program. Due to its nonlinear nature, we then decompose it into separate subproblems for offloading and compression. An efficient algorithm is proposed to solve the problem. Empirically, we show that our algorithm attains roughly a 0.958-approximation of the optimal solution provided by a block coordinate descent method for solving the two sub-problems back-to-back.
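The decomposition amounts to alternating between the two subproblems until the objective stops improving. The skeleton below shows that control flow with stubbed solvers and an invented objective; it captures the shape of a block-coordinate-descent loop, not the paper’s actual solvers.

```python
# Skeleton of alternating (block coordinate descent style) optimization over
# the offloading and compression subproblems. All three functions are stubs.
def solve_offloading(compression: dict) -> dict:
    """Stub: choose offload targets given fixed compression ratios."""
    return {"task1": "edge", "task2": "cloud"}

def solve_compression(offloading: dict) -> dict:
    """Stub: choose compression ratios given a fixed placement."""
    return {"task1": 0.8, "task2": 0.5}

def objective(off: dict, comp: dict) -> float:
    """Stub latency/accuracy trade-off score (lower is better)."""
    return sum(comp.values()) + (1.0 if off["task1"] == "edge" else 2.0)

comp = {"task1": 1.0, "task2": 1.0}
best = float("inf")
for _ in range(10):
    off = solve_offloading(comp)
    comp = solve_compression(off)
    score = objective(off, comp)
    if score >= best - 1e-9:
        break  # no further improvement: converged
    best = score
print("converged objective:", best)
```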
2020
- TNSE: “Behavioral Information Diffusion for Opinion Maximization in Online Social Networks”. Hudson, Nathaniel, and Khamfroush, Hana. IEEE Transactions on Network Science and Engineering (TNSE), Oct 2020.
Online social networks provide a platform to diffuse information and influence people’s opinion. Conventional models for information diffusion do not take into account the specifics of each user’s personality, behavior, and opinion. This work adopts the “Big Five” model from the social sciences to ascribe each user node with a personality. We propose a behavioral independent cascade (BIC) model that considers the personalities and opinions of user nodes when computing propagation probabilities for diffusion. We use this model to study the opinion maximization (OM) problem and prove it is NP-hard under our BIC model. Under the BIC model, we show that the objective function of the proposed OM problem is not submodular. We then propose an algorithm to solve the OM problem in linear-time based on a state-of-the-art influence maximization (IM) algorithm. We run extensive simulations under four cases where initial opinion is distributed in polarized/non-polarized and community/non-community cases. We find that when communities are polarized, activating a large number of nodes is ineffective towards maximizing opinion. Further, we find that our proposed algorithm outperforms state-of-the-art IM algorithms in terms of maximizing opinion under a uniform opinion distribution, despite activating fewer nodes to be spreaders.
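For concreteness, a BIC-style propagation probability might combine personality traits with the opinion gap between neighbors, as in the toy function below. The trait weights and functional form are illustrative assumptions, not the paper’s calibrated model.

```python
# Sketch of a personality-aware propagation probability in the spirit of the
# BIC model (toy weights; not the paper's parameterization).
def propagation_prob(big5: dict, opinion_gap: float) -> float:
    """Probability that an active neighbor influences this user (toy form)."""
    openness = big5.get("openness", 0.5)
    agreeable = big5.get("agreeableness", 0.5)
    base = 0.5 * openness + 0.5 * agreeable        # receptive personalities spread more
    return max(0.0, base * (1.0 - opinion_gap))    # closer opinions spread more easily

print(propagation_prob({"openness": 0.9, "agreeableness": 0.4}, opinion_gap=0.2))
```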
- GC: “Improving the Accuracy-Latency Trade-off of Edge-Cloud Computation Offloading for Deep Learning Services”. Zhao, Xiaobo, Hosseinzadeh, Minoo, Hudson, Nathaniel, and 2 more authors. In 2020 IEEE Globecom Workshops, Dec 2020.
Offloading tasks to the edge or the Cloud has the potential to improve accuracy of classification and detection tasks as more powerful hardware and machine learning models can be used. The downside is the added delay introduced for sending the data to the Edge/Cloud. In delay-sensitive applications, it is usually necessary to strike a balance between accuracy and latency. However, the state of the art typically considers offloading all-or-nothing decisions, e.g., process locally or send all available data to the Edge (Cloud). Our goal is to expand the options in the accuracy-latency trade-off by allowing the source to send a fraction of the total data for processing. We evaluate the performance of image classifiers when faced with images that have been purposely reduced in quality in order to reduce traffic costs. Using three common models (SqueezeNet, GoogleNet, ResNet) and two data sets (Caltech101, ImageNet) we show that the Gompertz function provides a good approximation of a model’s accuracy given the fraction of the image data actually conveyed to the model. We formulate the offloading decision process using this new flexibility and show that a better overall accuracy-latency trade-off is attained: 58% traffic reduction, 25% latency reduction, as well as 12% accuracy improvement.
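The Gompertz approximation is easy to reproduce in spirit: fit a(x) = a * exp(-b * exp(-c * x)) to accuracy-versus-data-fraction points. The sample points below are synthetic, not the paper’s measurements.

```python
# Fit a Gompertz curve mapping "fraction of image data sent" to accuracy.
import numpy as np
from scipy.optimize import curve_fit

def gompertz(x, a, b, c):
    return a * np.exp(-b * np.exp(-c * x))

frac = np.array([0.1, 0.25, 0.5, 0.75, 1.0])
acc = np.array([0.30, 0.55, 0.68, 0.71, 0.72])  # synthetic accuracies
(a, b, c), _ = curve_fit(gompertz, frac, acc, p0=(0.7, 2.0, 5.0))
print(f"predicted accuracy at 40% of data: {gompertz(0.4, a, b, c):.3f}")
```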
- ICNC: “A Proximity-Based Generative Model for Online Social Network Topologies”. Hufbauer, Emory, Hudson, Nathaniel, and Khamfroush, Hana. In 2020 International Conference on Computing, Networking and Communications (ICNC), Feb 2020.
Online social networks (OSN) are an increasingly powerful force for information diffusion and opinion sharing in society. Thus, understanding and modeling their structure and behavior is critical. Researchers need vast databases of self-contained, appropriately-sized OSN topologies in order to test and train new algorithms and models to solve problems related to these platforms. In this paper, we present a flexible, robust, and novel model for generating synthetic networks which closely resemble real OSN network systems (e.g., Facebook and Twitter) that include community structures. We also present an automated parameter tuner which can match the model’s output to a given OSN topology. The model can then be used as a data factory to generate testbeds of synthetic topologies which closely resemble the given sample. We compare our model, tuned to match two large real-world OSN network samples, with the Barabási-Albert model and the Lancichinetti-Fortunato-Radicchi benchmark used as baselines. We find that the output of our proposed generative model matches the target topologies more closely than either baseline on a variety of important metrics, including clustering coefficient, modularity, assortativity, and average path length. Our model also organically generates robust, realistic communities, with non-trivial inter- and intra-community structure.
- SMARTCOMP: “Smart Advertisement for Maximal Clicks in Online Social Networks Without User Data”. Hudson, Nathaniel, Khamfroush, Hana, Harrison, Brent, and 1 more author. In 2020 IEEE International Conference on Smart Computing (SMARTCOMP), Sep 2020.
Smart cities are a growing paradigm in the design of systems that interact with one another for informed and efficient decision making, empowered by data and technology, of resources in a city. The diffusion of information to citizens in a smart city will rely on social trends and smart advertisement. Online social networks (OSNs) are prominent and increasingly important platforms to spread information, observe social trends, and advertise new products. To maximize the benefits of such platforms in sharing information, many groups invest in finding ways to maximize the expected number of clicks as a proxy of these platforms’ performance. As such, the study of click-through rate (CTR) prediction of advertisements, in environments like online social media, is of much interest. Prior works build machine learning (ML) models using user-specific data to classify whether a user will click on an advertisement or not. For our work, we consider a large set of Facebook advertisement data (with no user data) and categorize targeted interests into thematic groups we call conceptual nodes. ML models are trained using the advertisement data to perform CTR prediction with conceptual node combinations. We then cast the problem of finding the optimal combination of conceptual nodes as an optimization problem. Given a certain budget k, we are interested in finding the optimal combination of conceptual nodes that maximizes the CTR. We discuss the hardness and possible NP-hardness of the optimization problem. Then, we propose a greedy algorithm and a genetic algorithm to find near-optimal combinations of conceptual nodes in polynomial time, with the genetic algorithm nearly matching the optimal solution. We observe that simple ML models can exhibit high Pearson correlation coefficients between click predictions and real click values. Additionally, we find that the conceptual nodes of “politics”, “celebrity”, and “organization” are notably more influential than other considered conceptual nodes.
2019
- ASN: “Influence spread in two-layer interdependent networks: designed single-layer or random two-layer initial spreaders?”. Khamfroush, Hana, Hudson, Nathaniel, Iloo, Samuel, and 1 more author. Springer Applied Network Science, Dec 2019.
Influence spread in multi-layer interdependent networks (M-IDN) has been studied in the last few years; however, prior works mostly focused on the spread that is initiated in a single layer of an M-IDN. In real world scenarios, influence spread can happen concurrently among many or all components making up the topology of an M-IDN. This paper investigates the effectiveness of different influence spread strategies in M-IDNs by providing a comprehensive analysis of the time evolution of influence propagation given different initial spreader strategies. For this study we consider a two-layer interdependent network and a general probabilistic threshold influence spread model to evaluate the evolution of influence spread over time. For a given coupling scenario, we tested multiple interdependent topologies, composed of layers A and B, against four cases of initial spreader selection: (1) random initial spreaders in A, (2) random initial spreaders in both A and B, (3) targeted initial spreaders using degree centrality in A, and (4) targeted initial spreaders using degree centrality in both A and B. Our results indicate that the effectiveness of influence spread highly depends on network topologies, the way they are coupled, and our knowledge of the network structure — thus an initial spread starting in only A can be as effective as initial spread starting in both A and B concurrently. Similarly, random initial spread in multiple layers of an interdependent system can be more severe than a comparable initial spread in a single layer. Our results can be easily extended to different types of event propagation in multi-layer interdependent networks such as information/misinformation propagation in online social networks, disease propagation in offline social networks, and failure/attack propagation in cyber-physical systems.
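A minimal probabilistic-threshold spread on a two-layer coupled graph can be simulated in a few lines, as below, with toy layers, sparse coupling, and seeds in layer A only. The graph sizes, threshold, and probability are invented; the paper sweeps many topologies, couplings, and seeding strategies.

```python
# Minimal probabilistic-threshold spread on a two-layer interdependent graph.
import random
import networkx as nx

random.seed(3)
A = nx.erdos_renyi_graph(30, 0.1, seed=1)
B = nx.erdos_renyi_graph(30, 0.1, seed=2)
G = nx.union(A, B, rename=("a", "b"))                      # nodes "a0".."a29", "b0".."b29"
G.add_edges_from((f"a{i}", f"b{i}") for i in range(0, 30, 3))  # sparse coupling

active = {"a0", "a1", "a2", "a3"}  # initial spreaders in layer A only
for _ in range(10):                # synchronous rounds
    frontier = {v for v in G if v not in active
                and sum(u in active for u in G[v]) >= 2    # threshold of 2 active neighbors
                and random.random() < 0.8}                 # probabilistic activation
    if not frontier:
        break
    active |= frontier
print(f"{len(active)} of {G.number_of_nodes()} nodes activated")
```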
- ICNC: “On the Effectiveness of Standard Centrality Metrics for Interdependent Networks”. Hudson, Nathaniel, Turner, Matthew, Nkansah, Asare, and 1 more author. In 2020 IEEE International Conference on Computing, Networking, and Communications (ICNC), Feb 2019.
This paper investigates the effectiveness of standard centrality metrics for interdependent networks (IDN) in identifying important nodes in preventing catastrophic failure propagation. To show the need for designing specialized centrality metrics for IDNs, we compare the performance of these metrics in an IDN under two different scenarios: i) the nodes with the highest centrality in the networks composing an IDN are selected separately, and ii) the nodes with the highest centrality in the entire IDN, represented as one single network, are calculated. To investigate the resiliency of an IDN, a threshold-based failure propagation model is used to simulate the evolution of failure propagation over time. The nodes with the highest centrality are chosen and are assumed to be resistant w.r.t. failure. Extensive simulation is conducted to compare the usefulness of standard metrics to stop or slow down the failure propagation in an IDN. Finally, a new metric of centrality tailored for interdependent networks is proposed and evaluated. Also, useful guidelines on designing new metrics are presented.