The exponential growth of information in cyberspace, alongside rapid advancements in its related technologies, has created a new mode of competition between societies to gain information dominance in this critical and invaluable space. It has thus become critical for all stakeholders to play a leading role in both the generation of information and the monitoring of the voluminous information uploaded to this space. Monitoring large amounts of information in cyberspace requires real-time techniques and approaches rather than traditional ones. Concerned with the monitoring problem, we limit our focus in this paper to blogs, an important part of cyberspace, and propose a novel notification system for the quick reporting of changes made to blogs. This is achieved by restricting the search for changes across high volumes of blogs to changes in the abstracts derived from those blogs. We show that this system compares favourably with systems that require cooperation and synchronization between information providers.
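The core idea can be sketched as follows; this is a minimal, hypothetical Python illustration (the paper's actual abstract-derivation method is not reproduced here), where a change is reported only when the digest of a blog's derived abstract differs from the last one seen:

```python
import hashlib

def derive_abstract(post_text: str, length: int = 280) -> str:
    """Derive a short abstract; a real system would use summarization."""
    return " ".join(post_text.split())[:length]

def abstract_digest(post_text: str) -> str:
    return hashlib.sha256(derive_abstract(post_text).encode("utf-8")).hexdigest()

class ChangeNotifier:
    def __init__(self):
        self.known = {}  # blog_url -> digest of the last seen abstract

    def check(self, blog_url: str, post_text: str) -> bool:
        """Return True (i.e., notify) only if the abstract changed."""
        digest = abstract_digest(post_text)
        if self.known.get(blog_url) != digest:
            self.known[blog_url] = digest
            return True
        return False

notifier = ChangeNotifier()
print(notifier.check("http://example.org/blog", "First version of the post."))  # True
print(notifier.check("http://example.org/blog", "First version of the post."))  # False
```

Because only short abstracts are hashed and compared, the volume of data inspected per blog stays small, which is what makes frequent real-time polling feasible.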
Cloud computing environments have introduced a new model of computing by shifting the location of computing infrastructure to the Internet to reduce the costs associated with the management of hardware and software resources. The Cloud model uses virtualization technology to effectively consolidate virtual machines (VMs) onto physical machines (PMs) to improve the utilization of PMs. Studies have shown, however, that the average utilization of PMs in many Cloud data centers is still lower than expected, and the Cloud model is expected to improve on this level of utilization through new consolidation mechanisms. In this paper we propose a new approach for the dynamic consolidation of VMs that maximizes the utilization of PMs. This is achieved by a dynamic programming algorithm that selects the best VMs for migration from an overloaded PM while considering the migration overhead of each VM. Evaluation results demonstrate that our algorithm achieves good performance.
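The selection step lends itself to a knapsack-style dynamic program. The sketch below is a hedged illustration, not the paper's algorithm: given the integer load that must be shed from an overloaded PM, it selects the VMs whose migration covers that excess at minimum total migration cost:

```python
def select_vms_for_migration(vms, excess):
    """
    vms: list of (load, migration_cost) with integer loads (e.g. CPU %).
    excess: integer load to shed from the overloaded PM (<= total load).
    Returns indices of VMs whose migration removes at least `excess`
    load at minimum total migration cost (0/1 knapsack-style DP).
    """
    n, total = len(vms), sum(load for load, _ in vms)
    INF = float("inf")
    # best[i][r]: min cost to remove exactly r load using the first i VMs
    best = [[INF] * (total + 1) for _ in range(n + 1)]
    best[0][0] = 0
    for i, (load, cost) in enumerate(vms):
        for r in range(total + 1):
            best[i + 1][r] = best[i][r]                       # keep VM i
            if r >= load and best[i][r - load] + cost < best[i + 1][r]:
                best[i + 1][r] = best[i][r - load] + cost     # migrate VM i
    r = min(range(excess, total + 1), key=lambda k: best[n][k])
    selected = []
    for i in range(n, 0, -1):                                 # backtrack
        if best[i][r] != best[i - 1][r]:
            selected.append(i - 1)
            r -= vms[i - 1][0]
    return selected

# Shed at least 40 load units: migrating VMs 1 and 2 costs 1 + 2 = 3.
print(select_vms_for_migration([(30, 5), (20, 1), (25, 2)], excess=40))
```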
In this article, the authors investigate the development of the Multiplexed Information and Computing Service (Multics) and Plan 9, and illustrate how these approaches have influenced today’s computing systems. “Computing service as an electricity service” was the core mission of the distributed computing systems introduced by the Multics project in 1965. In developing new generations of distributed computing systems, researchers have faced numerous obstacles, many of which were already addressed and dispelled by these pioneer systems. Security, scalability, access transparency, resource sharing, and dynamic reconfiguration are examples of concerns originally introduced and considered by both Multics and Plan 9, and many novel approaches have employed the basic ideas of these systems. However, there are further innovations in them that could be helpful in designing prospective distributed computing systems. Studying previous systems’ objectives and current statuses can therefore facilitate new solutions and help point to possible failures.
Several methods have already been developed for weight assignment in multi-criteria decision making (MCDM). We use linear algebra to propose a new method, named linear algebra projection (LAP), for weight assignment in MCDM. Our method presents a geometric view of the MCDM problem and uses matrix computations to simplify the calculations. The squared-distance error is defined as a new criterion for evaluating the accuracy of answers; this criterion permits a geometric interpretation of the accuracy of results. The LAP method generates the best answer by minimizing the squared-distance error.
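The paper's exact formulation is not reproduced here, but the idea of minimizing a squared-distance error by projection can be sketched with a least-squares solve; the decision matrix A and target vector b below are purely hypothetical:

```python
import numpy as np

# Hypothetical decision matrix: rows are alternatives, columns are criteria.
A = np.array([[7.0, 3.0, 5.0],
              [6.0, 8.0, 2.0],
              [4.0, 5.0, 9.0]])
# Hypothetical ideal scores for the alternatives.
b = np.array([6.0, 7.0, 8.0])

# Least-squares projection: find w minimizing ||A w - b||^2,
# i.e. project b onto the column space of A.
w, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
w = np.clip(w, 0, None)
w = w / w.sum()                      # normalize to a valid weight vector
error = np.sum((A @ w - b) ** 2)     # squared-distance error after normalization
print("weights:", w, "squared-distance error:", error)
```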
Load balancing is one of the main challenges in every structured peer-to-peer (P2P) system that uses distributed hash tables to map and distribute data items (objects) onto the nodes of the system. In a typical P2P system with N nodes, the use of random hash functions for distributing keys among peer nodes can lead to O(log N) imbalance. Most existing load balancing algorithms for structured P2P systems are not adaptable to objects’ varying loads under different system conditions, assume a uniform distribution of objects in the system, and often ignore node heterogeneity. In this paper we propose a load balancing algorithm that addresses these issues by applying node movement and replication mechanisms during load balancing. Given the high overhead of replication, we postpone this mechanism as long as possible and use it only when necessary. Simulation results show that our algorithm is able to balance the load to within 85% of the optimal value.
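The O(log N) imbalance claim can be illustrated with a small simulation: hash N nodes uniformly onto the unit ring and compare the largest arc (a proxy for the heaviest node's share of the key space) with the average arc; the ratio grows roughly like ln N:

```python
import random

def max_to_avg_arc(n_nodes: int, trials: int = 20) -> float:
    """Ratio of the largest arc to the average arc when n_nodes
    are hashed uniformly at random onto the unit ring."""
    ratios = []
    for _ in range(trials):
        points = sorted(random.random() for _ in range(n_nodes))
        arcs = [b - a for a, b in zip(points, points[1:])]
        arcs.append(1 - points[-1] + points[0])   # wrap-around arc
        ratios.append(max(arcs) / (1.0 / n_nodes))
    return sum(ratios) / trials

for n in (100, 1000, 10000):
    print(n, round(max_to_avg_arc(n), 1))   # grows roughly like ln(n)
```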
Resource overloading is one of the main challenges in computing environments. When a resource becomes overloaded, a new resource should be discovered to which the extra load can be transferred; choosing an inappropriate resource, however, results in drastic performance degradation. It is thus of high importance to discover the appropriate resource in the first place. So far, several resource discovery mechanisms have been introduced to overcome this challenge, the majority of which neglect the fact that this important decision should be made in cooperation with other units in a computing environment, one of which is the load balancing unit. In this paper, we propose a model for communication between the resource discovery and load balancing units in a computing environment. Based on the model, resource discovery and load balancing decisions are made cooperatively, considering the behavior of running processes and the capacities of resources. These considerations make decisions more precise. In addition, the model presents the loosest type of coupling between the resource discovery and load balancing units, i.e., message coupling, which provides better size scalability for the model. Comparative results show that the proposed model increases size scalability by 7 to 15%, cuts the message transmission rate by 15%, and improves the hit rate by 51%.
Not very long ago, organizations used to identify their customers by means of one-factor authentication mechanisms. In today’s world, however, these mechanisms cannot overcome the new security threats, at least in high-risk situations. Hence, identity providers have introduced a variety of two-factor authentication mechanisms. It may be argued that users of systems with two-factor authentication may experience difficulties at the time of authentication, for example because they may be forced to carry extra devices to be authenticated more accurately. This, however, is a tradeoff between ease of use and having a secure system, and it should be decided by the users, not the security providers. In this paper we present a new two-factor authentication mechanism that secures systems and at the same time is easier to use. We use mnemonic features and the cache concept to achieve ease of use and security, respectively. We have also tested our method with almost 6500 users in the real world using the Mechanical Turk Developer Sandbox.
Service orientation is a promising paradigm that enables the engineering of large-scale distributed software systems using rigorous software development processes. The problem is that every service-oriented software development project often requires a customized development process providing specific service-oriented software engineering tasks in support of requirements unique to that project. To resolve this problem and enable situational method engineering, we have defined a set of method fragments in support of the engineering of project-specific service-oriented software development processes. We have derived the proposed method fragments from the recurring features of 11 prominent service-oriented software development methodologies using a systematic mining approach. We have added these new fragments to the repository of the OPEN Process Framework to make them available to software engineers as reusable fragments through this well-known method repository.
Peer-to-peer (P2P) systems have been developed to support transparent and efficient sharing of scalable distributed resources. In such systems, size scalability is limited by the costs of all types of transparency, especially data access transparency, which stem from the need for frequent data exchanges between peers and other related communication overheads. We present a model that formulates the relationship between scalability and data access transparency in P2P distributed systems to determine how large these systems can be scaled, given the overheads of establishing data access transparency. To validate our model and show how it can be deployed in real life, we consider a real P2P distributed system as a case study and evaluate how the CPU utilization, bandwidth, and data request frequency parameters of our model relate to the amount of effort required by system management to establish data access transparency. We then calculate the strength of the coefficient of correlation between scalability and data access transparency in the system. The strength of this coefficient allows the system designer to decide at design time whether to allow the use of the model in the management of the system at runtime.
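The correlation step can be illustrated with a short sketch; the paired observations below are placeholders for the measured scalability and transparency-overhead values, not data from the paper:

```python
import statistics

# Hypothetical paired observations: the system size a deployment scaled to,
# and the measured overhead of establishing data access transparency.
scalability = [480, 410, 350, 300, 240, 200]
transparency_overhead = [0.10, 0.15, 0.22, 0.30, 0.41, 0.55]

r = statistics.correlation(scalability, transparency_overhead)
print(f"Pearson r = {r:.2f}")   # strongly negative: overhead limits scale
```

A strongly negative coefficient of this kind is what would justify using the model at runtime to cap system growth.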
In this paper, we propose an efficient resource discovery framework that allows pure unstructured peer-to-peer systems to respond to requests at run time with a high success rate while preserving the local autonomy of member machines. The proposed framework contains five units that respectively gather information about the status of resources, make decisions, detect the states of member machines, discover resources to respond to requests in normal and dynamic conditions, and balance the load of local machines. Efficient resource discovery is achieved by deploying a newly introduced mechanism on every machine that allows it to determine its state before and after accepting other machines’ requests for its resources, using a state model, and to decide whether to accept or reject those requests. The state model accurately estimates the machine’s state based on the resources and processes of the machine before and after accepting the request. We have experimentally compared the proposed mechanism with random, learning-based, and state-based search mechanisms with regard to the number of missed requests, the network bandwidth consumed by transferred messages, the number of machines involved in a discovery operation, the time required to process information in a discovery operation, the processing time on machines, and the number of faults per request. The results show significant improvement in several of these parameters, especially network bandwidth and the number of missed requests in dynamic conditions, under our framework.
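A minimal sketch of the accept/reject decision, assuming a hypothetical two-threshold state model (the paper's state model is richer than this):

```python
# Hypothetical state model: a machine is "normal" while projected CPU and
# memory usage stay under fixed thresholds; otherwise it is "overloaded".
THRESHOLDS = {"cpu": 0.85, "mem": 0.90}

def state(usage: dict) -> str:
    over = any(usage[k] > THRESHOLDS[k] for k in THRESHOLDS)
    return "overloaded" if over else "normal"

def accept_request(current: dict, demand: dict) -> bool:
    """Accept a remote request only if the machine is 'normal' now and
    stays 'normal' after adding the request's estimated resource demand."""
    projected = {k: current[k] + demand.get(k, 0.0) for k in current}
    return state(current) == "normal" and state(projected) == "normal"

print(accept_request({"cpu": 0.50, "mem": 0.60}, {"cpu": 0.20}))  # True
print(accept_request({"cpu": 0.50, "mem": 0.60}, {"cpu": 0.40}))  # False
```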
High Performance Cluster Computing Systems (HPCSs) deliver the best performance when their configuration is customized at design time to the features of the problem to be solved. If the problem has a static nature and static features, the best customized configuration can be produced. New generations of scientific and industrial problems, however, usually have a dynamic nature and behavior. A drawback of this dynamicity is that customized HPCSs face challenges at runtime and consequently show worse performance, because dynamic problems are not adapted to the configuration of the HPCS: the requests of a dynamic problem are not aligned with the HPCS configuration. The main proposed solutions to this challenge are dynamic load balancing and the use of reconfigurable platforms.
One of the benefits of virtualization technology is the provision of secure and isolated computing environments on a single physical machine. However, the use of virtual machines for this purpose often degrades overall system performance due to emulation costs, for example, packet filtering on every virtual machine. To allow virtual machines to be used as before for the provision of secure environments, but with comparably less performance degradation, we propose a new architecture called Alamut for restructuring any typical network intrusion detection system (NIDS) to run in a Xen-based virtual execution environment. In the proposed architecture, primitive mechanisms implementing the security concerns of typical NIDSs, such as signature matching, are placed at the kernel level of the driver domain (dom0), whereas security policies and management modules are kept in the user space of that domain. Separating mechanisms from policies allows network packets to be verified at the kernel level first, more efficiently, without requiring costly context switches to push them to user space for validation. In addition, system administrators can easily define new policies at the user level and determine on which virtual machines these policies should be enforced. A proof-of-concept implementation of Alamut has been prototyped on the Xen hypervisor using the Bro open-source NIDS. Experimental results show an approximately 3.5-fold increase in overall system performance when our prototype is run compared with when Bro is run, as well as a 19% improvement in network throughput. A comparison of Alamut with Snort using the same set of signatures and attacks shows that our prototyped NIDS has lower processor utilization and captures more packets under heavy network loads.
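The mechanism/policy split can be sketched as follows; the signatures, policy table, and VM names are hypothetical, and a real Alamut deployment performs the matching at the kernel level of dom0 rather than in Python:

```python
# Mechanism (kernel level in Alamut): raw signature matching on packets.
SIGNATURES = {"sig-001": b"/etc/passwd", "sig-002": b"\x90\x90\x90\x90"}

def match_signatures(payload: bytes):
    return [sid for sid, pat in SIGNATURES.items() if pat in payload]

# Policy (user level): which signatures are enforced for which VM.
POLICY = {"vm-web": {"sig-001"}, "vm-db": {"sig-001", "sig-002"}}

def verdict(dest_vm: str, payload: bytes) -> str:
    hits = set(match_signatures(payload)) & POLICY.get(dest_vm, set())
    return "drop" if hits else "forward"

print(verdict("vm-web", b"GET /etc/passwd HTTP/1.1"))  # drop
print(verdict("vm-web", b"\x90\x90\x90\x90"))          # forward (not in vm-web policy)
```

Because the mechanism never consults user space per packet, new policies can be installed without touching the fast path.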
Wireless Sensor Actor Networks (WSANs) have contributed to the advancement of ubiquitous computing, wherein the time and energy required to perform the tasks of ubiquitous applications are critical. Therefore, real-timeliness and energy-awareness are among the grand challenges of WSANs. In this paper, we present a context-aware task distribution approach for assigning real-time tasks to cost-effective actors and for scheduling these tasks at the actor level, subject to minimizing the energy consumption of actors and meeting the deadlines of tasks. The proposed approach comprises three protocols, namely a Market-based Task Assignment Protocol (MaTAP), an Energy Calculation Protocol (ECaP), and a Context-aware Task Scheduling Protocol (CaTSP). We present formal models of the proposed protocols using Timed Automata and prove their soundness to validate the correctness of the approach. We show that our proposed approach is more efficient in terms of both the total remaining energy of actors and the average task completion time compared to a stochastic approach. We also show that our approach guarantees the deadlines of all tasks.
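A hedged sketch of the market-based assignment idea behind a protocol like MaTAP; the actual protocol is distributed and message-based, whereas here a single function plays the auctioneer and the bid model is hypothetical:

```python
def assign_task(task, actors, now=0.0):
    """
    task: dict with 'work' (seconds of execution) and 'deadline'.
    actors: dicts with 'name', 'speed', 'queue' (busy seconds),
            'energy' (remaining), and 'cost_per_sec' (energy per second).
    Each actor 'bids' its energy cost; the cheapest bid that still
    meets the deadline wins.
    """
    bids = []
    for a in actors:
        finish = now + a["queue"] + task["work"] / a["speed"]
        cost = (task["work"] / a["speed"]) * a["cost_per_sec"]
        if finish <= task["deadline"] and cost <= a["energy"]:
            bids.append((cost, a["name"]))
    return min(bids)[1] if bids else None   # cheapest feasible bid

actors = [
    {"name": "A1", "speed": 1.0, "queue": 4.0, "energy": 50, "cost_per_sec": 2.0},
    {"name": "A2", "speed": 2.0, "queue": 9.0, "energy": 80, "cost_per_sec": 3.0},
]
print(assign_task({"work": 6.0, "deadline": 12.0}, actors))  # 'A2' bids cheaper
```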
The use of virtualization technology (VT) has become widespread in modern datacenters and Clouds in recent years. In spite of their many advantages, such as the provisioning of isolated execution environments and migration, current implementations of VT do not provide effective performance isolation between virtual machines (VMs) running on a physical machine (PM), due to workload interference between VMs. Generally, this interference stems from contention on physical resources and impacts performance in different workload configurations. To investigate the impacts of this interference, we formalize the concept of interference for a consolidated multi-tenant virtual environment. This formulation, represented as a mathematical model, can be used by schedulers to estimate the interference of a consolidated virtual environment in terms of the processing and networking workloads of the running VMs and the number of consolidated VMs. Based on the proposed model, we present a novel batch scheduler that reduces the interference of running tenant VMs by pausing VMs that have a higher impact on the proliferation of interference. The scheduler achieves this by selecting the set of VMs that produces the least interference using a 0–1 knapsack problem solver; the selected VMs are allowed to run and the other VMs are paused. Users are not troubled by the short-term pausing and resumption of VMs because the scheduler has been designed for the execution of batch-type applications such as scientific applications. Evaluation results on the makespan of VMs executed under the control of our scheduler show nearly 33% improvement in the best case and 7% improvement in the worst case compared to the case in which all VMs run concurrently. In addition, the results show that our scheduling algorithm outperforms serial and random scheduling of VMs as well.
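The scheduler's selection step maps naturally onto a 0/1 knapsack. In this hedged sketch, each VM's "weight" is a placeholder interference estimate and its "value" is the benefit of running it; the subset that fits under an interference budget runs, and the rest would be paused:

```python
def pick_vms_to_run(vms, interference_budget):
    """
    vms: list of (name, interference, value); interference is an integer
    on some fixed scale, value is the benefit of running the VM.
    Returns the subset maximizing total value with total interference
    within the budget (0/1 knapsack DP); the rest would be paused.
    """
    n = len(vms)
    best = [[0] * (interference_budget + 1) for _ in range(n + 1)]
    for i, (_, w, v) in enumerate(vms):
        for b in range(interference_budget + 1):
            best[i + 1][b] = best[i][b]
            if b >= w:
                best[i + 1][b] = max(best[i + 1][b], best[i][b - w] + v)
    run, b = [], interference_budget          # backtrack the chosen set
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            name, w, _ = vms[i - 1]
            run.append(name)
            b -= w
    return run

vms = [("vm1", 4, 10), ("vm2", 3, 7), ("vm3", 5, 9), ("vm4", 2, 6)]
print(pick_vms_to_run(vms, interference_budget=9))  # ['vm4', 'vm2', 'vm1']
```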
In recent years, Cloud Computing has been one of the top ten new technologies, providing services such as software, platform, and infrastructure for Internet users. Cloud Computing is a promising IT paradigm that enables the evolution of the Internet into a global market of collaborating services. In order to provide better services for cloud customers, cloud providers need services that cooperate with other services. Therefore, semantic interoperability plays a key role in Cloud Computing services. In this paper, we address interoperability issues in Cloud Computing environments. After describing Cloud Computing interoperability from different aspects and references, we describe two architectures of cloud service interoperability. Architecturally, we classify the existing interoperability challenges and describe them. Moreover, we use these aspects to discuss and compare several interoperability approaches.
One of the important aspects of decision making and management in distributed systems is collecting accurate information about the available resources of the peers. Previously proposed approaches for collecting such information depend completely on the system’s architecture. In the server-oriented architecture, servers assume the main role of collecting comprehensive information from the peers and the system. Based on information about the features of the basic activities and the system, an exact description of the peers’ status is produced, and accurate decisions are then made using this description. However, the amount of information gathered in this architecture is very large and requires massive processing. Moreover, updating the information takes time, causing delays and undermining the validity of the information. In addition, due to the limitations imposed by the servers, such an architecture is not sufficiently scalable or dynamic. The peer-to-peer architecture was introduced to address these concerns. However, due to the lack of complete knowledge of the peers and the system, decisions are made without a precise description of the peers’ status, based only on the hardware data collected from the peers. Such an abstract and general image of the peers is not adequate for decision making. In this paper, a 4-dimensional model is presented for information collection and the exact description of a peer’s status, covering the features of the peer, the basic activity, the time, and the specifications of the system. The proposed model targets the server-oriented architecture, but it also adapts to the serverless peer-to-peer architecture. Based on this model, a new approach is introduced for information collection and the exact description of the peers’ status in a peer-to-peer system using the Latin square concept. We evaluate the model in server-oriented and serverless situations, taking workload as the basic activity. Our evaluation demonstrates that in the server-oriented situation the collection time grows directly with the size of the system, whereas the serverless situation does not follow this behavior.
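The Latin square idea can be sketched as a reporting schedule in which, over n rounds, each of n peers refreshes a different one of n information dimensions, so every dimension of every peer is updated exactly once per cycle and no two peers report the same dimension in the same round. The cyclic construction below is a standard one and only a hypothetical stand-in for the paper's scheme:

```python
def latin_square(n: int):
    """Cyclic n x n Latin square: row r is the reporting schedule of
    peer r; column t is the round; entries are dimension indices."""
    return [[(r + t) % n for t in range(n)] for r in range(n)]

dimensions = ["peer features", "basic activity", "time", "system specs"]
square = latin_square(len(dimensions))
for peer, row in enumerate(square):
    plan = ", ".join(f"round {t}: {dimensions[d]}" for t, d in enumerate(row))
    print(f"peer {peer} -> {plan}")
```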
This paper presents a model for resolving two main issues of time in e-commerce. The first issue is the time value of e-commerce, which represents the value of each moment of the commerce time from the perspective of buyers and sellers. Buyers and sellers can use this model to calculate the time value at each moment and accordingly decide whether it is profitable to buy or sell at that moment. The second issue is allowing buyers or sellers to increase their savings or decrease their costs by changing the factors governing the time value model of the concerned e-commerce. We present the relevant model specifically for Amazon as a proof of concept of our proposed models.
This paper proposes two complementary virtual machine monitor (VMM) detection methods. These methods can be used to detect any VMM designed for the x86 architecture. The first method works by finding probable discrepancies in the hardware privilege levels of the guest operating system’s kernel on which user applications run. The second method works by measuring the execution times of a set of benchmark programs and comparing them with the stored execution times of the same programs previously run on a trusted physical machine. Unlike other methods, our proportional execution time technique cannot be easily thwarted by VMMs. In addition, using proportional execution times, there is no need for a trusted external source of time during detection. It is shown experimentally that deploying both methods together can detect the existence of four renowned VMMs, namely Xen, VirtualBox, VMware, and Parallels, on processors that support virtualisation technology (VT-enabled) as well as on those that do not (VT-disabled).
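A hedged sketch of the proportional-execution-time idea: time a set of benchmarks, normalize by a reference benchmark so that no trusted external clock is needed, and compare the resulting ratio with one recorded on a trusted physical machine. The benchmarks, baseline ratio, and tolerance below are hypothetical:

```python
import time

def cpu_benchmark(n):              # arithmetic-bound reference loop
    s = 0
    for i in range(n):
        s += i * i
    return s

def syscall_benchmark(n):          # loop dominated by clock/system calls
    for _ in range(n):
        time.monotonic()

def measure(fn, arg):
    t0 = time.perf_counter()
    fn(arg)
    return time.perf_counter() - t0

# Ratios relative to the CPU benchmark remove the absolute time scale,
# so no trusted external source of time is required.
cpu = measure(cpu_benchmark, 2_000_000)
ratio = measure(syscall_benchmark, 2_000_000) / cpu

TRUSTED_RATIO = 0.80   # hypothetical value recorded on bare metal
TOLERANCE = 0.25       # hypothetical acceptance band
print("VMM suspected" if abs(ratio - TRUSTED_RATIO) > TOLERANCE
      else "looks like bare metal")
```

The intuition is that a VMM inflates the cost of privileged operations much more than the cost of plain arithmetic, so the ratio shifts even when absolute timings are manipulated uniformly.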
Spelling errors in digital documents are often caused by operational and cognitive mistakes, or by a lack of full knowledge of the language of the written documents. Computer-assisted solutions can help to detect such errors and suggest replacements. In this paper, we present a new string distance metric for the Persian language to rank respelling suggestions for a misspelled Persian word, considering the effects of keyboard layout on typographical spelling errors as well as the homomorphic and homophonic aspects of words for orthographical misspellings. We also consider misspellings caused by disregarded diacritics. Since the proposed string distance metric is custom-designed for the Persian language, we present the spelling aspects of Persian, such as homomorphs, homophones, and diacritics. We then present a statistical analysis of a set of large Persian corpora to identify the causes and types of Persian spelling errors. We show that the proposed string distance metric achieves a higher mean average precision and a higher mean reciprocal rank in ranking respelling candidates for Persian misspellings than metrics such as the Hamming, Levenshtein, Damerau–Levenshtein, Wagner–Fischer, and Jaro–Winkler metrics.
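To illustrate the kind of metric involved, here is a hedged sketch of a Damerau–Levenshtein distance whose substitution cost is halved for keys that are adjacent on the keyboard. The adjacency table is a tiny hypothetical fragment, and the paper's metric additionally models homomorphs, homophones, and diacritics:

```python
# Tiny hypothetical fragment of a keyboard adjacency table.
ADJACENT = {("a", "s"), ("s", "d"), ("d", "f"), ("q", "w"), ("o", "p")}

def sub_cost(a: str, b: str) -> float:
    if a == b:
        return 0.0
    return 0.5 if (a, b) in ADJACENT or (b, a) in ADJACENT else 1.0

def weighted_damerau_levenshtein(s: str, t: str) -> float:
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = float(i)
    for j in range(n + 1):
        d[0][j] = float(j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + 1,                          # deletion
                          d[i][j - 1] + 1,                          # insertion
                          d[i - 1][j - 1] + sub_cost(s[i-1], t[j-1]))
            if i > 1 and j > 1 and s[i-1] == t[j-2] and s[i-2] == t[j-1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)         # transposition
    return d[m][n]

print(weighted_damerau_levenshtein("salam", "salsm"))  # 0.5: adjacent-key typo
```

Ranking respelling candidates by such a distance places plausible slips (adjacent keys, transpositions) above arbitrary one-letter differences.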
The combination of sensor and actor nodes in wireless sensor actor networks (WSANs) has created new challenges, notably in coordination. In this paper, we survey, categorize, and bring into perspective existing research on weak connectivity and its impacts on coordination, ranging from a single node failure to the permanent inability of actor nodes to communicate with other actors. We present the challenges in each category alongside existing provisions and approaches, in the context of the proposed coordination-oriented connectivity categorization. Alongside explanations of general concepts for a communication generalist, we compare the surveyed protocols using parameters related to weak connectivity and coordination. Powerful actors can help weaker sensors in many respects, such as routing and data forwarding, and many sensors can help a few actors in regions where actors are sparsely deployed. Actors can carry, move, and charge sensors, while sensors can detect partitions of the inter-actor network. Considering the lessons learned from the surveyed works, we show that actor and sensor nodes in a WSAN must cooperate to provide an integrated network when network connectivity is weak.
Power efficiency is one of the main challenges in large-scale distributed systems such as datacenters, Grids, and Clouds. The scheduling of applications in such systems can be studied by representing each application as a set of precedence-constrained tasks modeled by a Directed Acyclic Graph. In this paper we address the problem of scheduling a set of tasks with precedence constraints on a heterogeneous set of Computing Resources (CRs) with the dual objectives of minimizing the overall makespan and reducing the aggregate power consumption of CRs. Most related works in this area use the Dynamic Voltage and Frequency Scaling (DVFS) approach to achieve these objectives. However, DVFS requires special hardware support that may not be available on all processors in large-scale distributed systems. In contrast, we propose a novel two-phase solution called PASTA that does not require any special hardware support. In its first phase, PASTA uses a novel algorithm to select a subset of the available CRs for running an application, balancing lower overall power consumption of CRs against a shorter makespan of the application's task schedule. In its second phase, it uses a low-complexity power-aware algorithm to create a schedule for running the application's tasks on the selected CRs. We show that the overall time complexity of PASTA is O(p·v²), where p is the number of CRs and v is the number of tasks. Using simulative experiments on real-world task graphs, we show that the makespans of schedules produced by PASTA are approximately 20% longer than those produced by the well-known HEFT algorithm; however, the schedules produced by PASTA consume nearly 60% less energy than those produced by HEFT. Empirical experiments on a physical test-bed confirm the power efficiency of PASTA in comparison with HEFT as well.
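PASTA's first-phase algorithm is not reproduced here, but its balancing goal can be illustrated with a hypothetical greedy sketch: add CRs in order of power efficiency while the estimated time-plus-energy score keeps improving. All numbers and the scoring rule are illustrative only:

```python
def choose_crs(crs, total_work, alpha=1.0):
    """
    crs: list of (name, speed, power_watts).
    Greedily grow the CR subset, most power-efficient first, while a
    combined time-plus-energy score improves. Uses a simple perfectly
    divisible work model: makespan = total_work / total_speed.
    """
    crs = sorted(crs, key=lambda c: c[2] / c[1])   # power per unit speed
    chosen, speed_sum, power_sum = [], 0.0, 0.0
    best_score = float("inf")
    for name, speed, power in crs:
        speed_sum += speed
        power_sum += power
        makespan = total_work / speed_sum
        score = makespan + alpha * power_sum * makespan  # time + energy
        if score < best_score:
            best_score = score
            chosen.append(name)
        else:
            break
    return chosen

crs = [("cr1", 2.0, 40), ("cr2", 1.0, 30), ("cr3", 1.0, 60)]
print(choose_crs(crs, total_work=100.0))
```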
The deployment of sensors without enough coverage can result in unreliable outputs in wireless sensor networks (WSNs). Sensing coverage is thus one of the most important quality-of-service factors in WSNs. A useful metric for quantifying coverage reliability is the coverage rate, that is, the area covered by sensor nodes in a region of interest. The network sink can be informed about the locations of all nodes and calculate the coverage rate centrally. However, this approach imposes a huge load on the network nodes, which have to send their location information to the sink; a distributed approach to calculating the coverage rate is therefore required. This paper is among the very first to provide a localized approach for calculating the coverage rate. We provide two coverage rate calculation (CRC) protocols, namely distributed exact coverage rate calculation (DECRC) and distributed probabilistic coverage rate calculation (DPCRC). DECRC calculates the coverage rate precisely using the idealized disk graph model; precise calculation of the coverage rate is a unique property of DECRC compared to similar works that have used the disk graph model. In contrast, DPCRC uses a more realistic, probabilistic coverage model to determine an approximate coverage rate. DPCRC is in fact an extended version of DECRC that uses a set of localized techniques to make it a low-cost protocol. Simulation results show significant overall performance improvement of the CRC protocols compared to related works.
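For intuition about the quantity being computed, here is a hedged Monte Carlo estimate of the coverage rate under the idealized disk model. The protocols in the paper compute this in a localized, distributed way; this centralized sketch only serves as a reference definition:

```python
import random

def coverage_rate(sensors, radius, region=(0.0, 0.0, 1.0, 1.0), samples=100_000):
    """Fraction of the rectangular region covered by at least one
    sensing disk, estimated by uniform random sampling."""
    x0, y0, x1, y1 = region
    r2 = radius * radius
    covered = 0
    for _ in range(samples):
        px = random.uniform(x0, x1)
        py = random.uniform(y0, y1)
        if any((px - sx) ** 2 + (py - sy) ** 2 <= r2 for sx, sy in sensors):
            covered += 1
    return covered / samples

sensors = [(random.random(), random.random()) for _ in range(30)]
print(f"coverage rate ~ {coverage_rate(sensors, radius=0.12):.3f}")
```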
With the ever-increasing advancement of mobile device technology and its pervasive usage, users expect to run their applications on mobile devices and get the same performance as if they were running them on powerful non-mobile computers. The challenge is that mobile devices deliver lower performance than traditional, less-constrained, non-mobile computers because, in spite of all their advancements in recent years, they are constrained by weight, size, and mobility. One of the most common solutions for ameliorating this performance disparity is cyber foraging, wherein nearby non-mobile computers called surrogates are utilized to run the whole or parts of applications on behalf of mobile devices. In this paper, we present a survey of cyber foraging as a solution to the challenges of computing on resource-constrained mobile devices. We explain the most notable cyber foraging systems and present a categorization of existing cyber foraging approaches considering their type of dynamicity, granularity, metrics used, surrogate types and scale, location of the decision maker unit, remoteness of execution, migration support, and overheads.
In many applications of wireless sensor actor networks (WSANs), which often run in harsh environments, reducing the completion times of tasks is highly desirable. We present a new time-aware, energy-aware, and starvation-free algorithm called Scate for assigning tasks to actors while satisfying the scalability and distribution requirements of WSANs with a semi-automated architecture. The proposed algorithm allows concurrent execution of any mix of small and large tasks and yet prevents probable starvation of tasks. To achieve this, it estimates the completion times of tasks on each available actor and then takes the remaining energies and current workloads of these actors into account during task assignment. The results of our experiments with a prototyped implementation of Scate show a longer network lifetime, shorter makespans of the resulting schedules, and more balanced loads on actors compared to three well-known task-scheduling algorithms, namely the max-min, min-min, and opportunistic load balancing algorithms.
In spite of the fact that Cloud Computing Environments (CCEs) host many I/O-intensive applications such as Web services, big data, and virtual desktops, virtual machine monitors like Xen impose high overhead on the performance CCEs deliver when hosting such applications. Studies have shown that hypervisors such as Xen favor compute-intensive workloads, while their performance for I/O-intensive tasks is far from satisfactory. In this paper we present a new mechanism called cCluster to mitigate I/O processing delays in CCEs. To this end, cCluster classifies running virtual machines into I/O and computation VMs and, based on this classification, dynamically classifies existing physical cores into I/O and computation cores as well. It then schedules I/O virtual CPUs (vCPUs) on I/O cores and computation vCPUs on computation cores. Empirical results demonstrate that cCluster remarkably reduces I/O response time and thus improves network throughput.
Deitel and Deitel, C++ How to Program, 5th Edition, 2005; a university textbook translated into Farsi under Pearson’s legal license and printed by Ghazal Publications Co. in July 2006. (Farsi)
Cloud computing systems have emerged as a type of distributed system in which a multitude of interconnected machines are gathered and recruited over the Internet to help solve computation- or data-intensive problems. There are many cases in which Cloud techniques alone cannot solve the problem due to the nature of the tasks. To overcome this problem, a strong inclination has recently emerged towards enlisting human intelligence and the wisdom of crowds, a.k.a. crowdsourcing, in combination with machine-automated techniques. In this paper, the authors propose a model for integrating crowds of people into Cloud environments, enriching Cloud computing environments so they can provide hybrid human-machine services and thereby solve a wider variety of problems, some of which are studied here. The authors nickname these rich types of services Crowd-enhanced Cloud services. At the end, the modality and challenges of this convergence and its future trends are explored.
Cloud computing environments (CCEs) are expected to deliver their services with the qualities stated in service level agreements (SLAs). On the other hand, they typically employ virtualization technology to consolidate multiple workloads on the same physical machine, thereby enhancing the overall utilization of physical resources. Most existing virtualization technologies are, however, unaware of their delivered quality of service (QoS); for example, the Xen hypervisor merely focuses on fair sharing of processor resources. We believe that CCEs have been wedded to traditional virtualization technologies despite having few traits in common. To bridge the gap between these two technologies, we have designed and implemented Kani, a QoS-aware hypervisor-level scheduler. Kani dynamically monitors the quality of delivered services to quantify the deviation between the desired and delivered levels of QoS. Using this information, Kani determines how to allocate processor resources among running VMs so as to meet the expected QoS. Our evaluations of a prototype of the Kani scheduler in Xen show that Kani outperforms the default Xen scheduler, namely the Credit scheduler. For example, Kani reduces the average response time to requests to an Apache web server by up to 93.6%, improves its throughput by up to 97.9%, and mitigates the call setup time of an Asterisk media server by up to 96.6%.
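A hedged sketch of the feedback idea: quantify each VM's deviation between desired and delivered QoS and skew processor shares toward the VMs that are falling behind. The deviation metric and the proportional rule are hypothetical simplifications, not Kani's actual policy:

```python
def reallocate_shares(vms, total_shares=1000):
    """
    vms: dict name -> (desired_qos, delivered_qos), e.g. request rates.
    Gives each VM a base share plus extra shares proportional to its
    relative QoS deviation, so lagging VMs receive more CPU.
    """
    deviation = {
        name: max(0.0, (want - got) / want)
        for name, (want, got) in vms.items()
    }
    base = total_shares // (2 * len(vms))      # half spread evenly
    pool = total_shares - base * len(vms)      # the rest, by deviation
    total_dev = sum(deviation.values()) or 1.0
    return {
        name: base + int(pool * deviation[name] / total_dev)
        for name in vms
    }

vms = {"web": (1000.0, 600.0), "db": (500.0, 480.0), "batch": (100.0, 100.0)}
print(reallocate_shares(vms))   # 'web' is furthest behind, so it gains most
```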
Detection of stateful complex event patterns using parallel programming features is a challenging task because of the statefulness of event detection operators. Parallelization of event detection therefore needs to be implemented in a way that keeps track of the state changes caused by newly arriving events.
In this paper, we describe our implementation of a customized complex event detection engine using Open Multi-Processing (OpenMP), a shared-memory programming model. In our system, event detection is implemented using deterministic finite automata (DFAs). We implemented a data stream aggregator that merges four given event streams into a sequence of C++ objects in a buffer, which is used as the source event stream for event detection in the next processing step. We describe implementation details and three architectural variations for stream aggregation and the parallelization of event processing. We conducted performance experiments with each of the variations and report some of our experimental results. A comparison of our performance results shows that for event processing on a single machine with multiple cores and limited memory, using multiple threads with a shared buffer yields better stream processing performance than an implementation with multiple processes and shared memory.
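A hedged sketch of the DFA side of such an engine: a pattern "A then B then C" compiled into a transition table and run over an event stream. The event alphabet and pattern are hypothetical; the engine described above additionally feeds such automata from a shared buffer across OpenMP threads:

```python
# DFA for the complex event pattern "A then B then C" (other events
# leave the state unchanged); state 3 is accepting.
TRANSITIONS = {
    (0, "A"): 1,
    (1, "B"): 2,
    (2, "C"): 3,
}

def detect(stream):
    """Yield the index at which the pattern completes. A stateful
    operator like this must see events in order, which is exactly
    what makes naive parallelization hard."""
    state = 0
    for i, event in enumerate(stream):
        state = TRANSITIONS.get((state, event), state)
        if state == 3:
            yield i
            state = 0   # restart matching after a detection

stream = ["A", "X", "B", "B", "C", "A", "B", "C"]
print(list(detect(stream)))   # [4, 7]
```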
The considerable energy consumption of datacenters results in high service and maintenance costs and environmental pollution. Therefore, reducing the energy consumed in operating data centers has received a lot of attention in recent years. Although modern multi-core architectures provide power management techniques such as dynamic voltage and frequency scaling (DVFS) and per-core power gating (PCPG), as well as CPU consolidation techniques for energy saving, the joint deployment of these features has been less exercised. With the widespread use of chip multiprocessors (CMPs), power management that considers both multi-core chip and core-count management techniques can offer more efficient energy consumption in environments operating large datacenters. In this paper, we focus on dynamic power management in virtualized multi-core server systems used in cloud-based systems. We propose an algorithm, effectively equipped with power management techniques, that selects an efficient number of cores and a frequency level in CMPs within an acceptable level of performance. The paper also reports an extensive set of experimental results on a realistic multi-core server system set up with the RUBiS benchmark. Our algorithm demonstrates energy savings of up to 67% compared to the baseline and outperforms two existing consolidation algorithms for virtualized servers by 15% and 21%.
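The selection step can be illustrated with a hedged exhaustive search over (core count, frequency) pairs; the throughput and power models below are simple placeholders (cubic frequency-power scaling, fully gated idle cores), not the paper's models:

```python
def pick_config(core_counts, freqs_ghz, required_throughput):
    """
    Pick the (cores, frequency) pair meeting a throughput requirement
    at minimum modeled power. Toy models: throughput scales with
    cores * freq; active power grows with cores and freq^3 (a common
    DVFS approximation); power-gated cores cost nothing.
    """
    best = None
    for c in core_counts:
        for f in freqs_ghz:
            throughput = 100.0 * c * f          # requests/s (toy model)
            power = c * (5.0 + 8.0 * f ** 3)    # watts (toy model)
            if throughput >= required_throughput:
                if best is None or power < best[0]:
                    best = (power, c, f)
    return best

print(pick_config(core_counts=[1, 2, 4, 8],
                  freqs_ghz=[1.0, 1.5, 2.0],
                  required_throughput=800.0))
# (104.0, 8, 1.0): many slow cores beat few fast ones under cubic power
```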
This paper presents a high-performance technique for virtualization-unaware scheduling of compute-intensive, synchronized (i.e., tightly-coupled) jobs in virtualized high performance computing systems. Online tightly-coupled jobs are assigned and reassigned to clustered virtual machines based on synchronization costs; virtual machines are in turn assigned and reassigned to clustered physical machines based on CPU load. Our analytical study shows that it is possible to minimize the performance and scalability degradation of high performance computing systems and applications, such as ExaScale and PetaScale ones, that are recommended to use virtualization technology to achieve a higher degree of performability, namely higher utilization, energy efficiency, portability, flexibility, and configurability.
Sequence similarity, as a special case of data-intensive applications, is one of the applications most in need of parallelization. Clusters of commodity computers, as a cost-effective platform for distributed and parallel processing, can be leveraged to parallelize sequence similarity. However, manually designing and developing parallel programs on commodity computers is a time-consuming, complex, and error-prone process. In this paper, we present a sequence similarity parallelization technique using Apache Storm, a stream processing framework with a data-parallel programming model. Storm automatically parallelizes computations via a user-defined topology represented as a directed acyclic graph. The proposed technique collects streams of data from a disk and sends them sequence by sequence to clustered computers for parallel processing. We also present a dispatching policy for balancing the cluster workload and managing cluster heterogeneity to achieve more than 99 percent parallelism. An alignment-free method known as n-gram modeling is used to calculate similarities between the sequences. To show the cost-performance superiority of our method on clustered commodity computers over serial processing on powerful computers, we use the UniProtKB/SwissProt dataset to evaluate the performance of sequence similarity as an interesting large-scale bioinformatics application.
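As a reference for the alignment-free part, here is a hedged sketch of n-gram similarity between two sequences using cosine similarity over n-gram counts; the paper's exact similarity function and choice of n are not specified here:

```python
from collections import Counter
from math import sqrt

def ngrams(seq: str, n: int = 3) -> Counter:
    """Counts of overlapping n-grams in a sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))

def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Cosine similarity of n-gram count vectors; 1.0 = identical profile."""
    ca, cb = ngrams(a, n), ngrams(b, n)
    dot = sum(ca[g] * cb[g] for g in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values())) * \
           sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

print(ngram_similarity("MKTAYIAKQRQISFVKSHFSRQ", "MKTAYIAKQRQISFVKSHFSRL"))
```

Because each pairwise comparison is independent, such computations distribute naturally over a Storm topology, with each bolt instance scoring a partition of the sequence pairs.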
Cloud computing users face a wide variety of services to choose from. Consequently, a number of cloud service brokers (CSBs) have been presented to help users in their service selection process. This paper reviews recent approaches that have been introduced and used for cloud service brokerage and discusses their challenges. A set of attributes that a CSB must support to be considered effective is proposed. CSBs with different brokerage approaches are classified into two categories, namely single-service and multiple-service models, and are then assessed, analyzed, and compared with respect to the proposed attributes. Based on our studies, CSBs with multiple-service brokerage capability, which support more of the attributes of effective CSBs, would have wider application in cloud computing environments.
Extremely heterogeneous software stacks have encouraged the use of system virtualization technology for the execution of composite high performance computing (HPC) applications to enable full utilization of extreme-scale (ExaScale) HPC systems. Parts of composite applications, called loosely-coupled components, consist of sets of loosely-coupled CPU-intensive jobs. Jobs of loosely-coupled components run on a set of virtual machines (VMs), which are in turn distributed over physical machines. Co-location of VMs on physical machines is the main source of the interference that causes uncertainty in job completion times. Motivated by this challenge, our main goal is to introduce an adaptive job scheduling method for the VMs of loosely-coupled components that bounds the negative impact of interference. On the other hand, due to the abstraction of virtualization, job schedulers are unaware of the status of the underlying physical machines; introducing a scheme that dynamically reconfigures the job scheduler’s parameters to inform the scheduler about the true status of the physical machines is our second goal. This paper presents a combination of ASSIGN-ROUTE online job scheduling and a reconfiguration technique that allows a given loosely-coupled component to balance its resource usage load and thus improve the scaled execution of its loosely-coupled jobs. We prove that the reconfiguration compensates for virtualization unawareness in such a way that the whole technique balances the load comparably to the optimal load balancing for online deterministic unrelated parallel machine makespan minimization scheduling. We also show that the results of our experiments support the theoretical results, especially in the case of scaled execution.