Browsing by Author "Herkersdorf, Andreas"

1 - 20 of 50

Achieving scalability for job centric monitoring in a distributed infrastructure
(Gesellschaft für Informatik e.V., 2012) Hilbrich, Marcus; Müller-Pfefferkorn, Ralph; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
Job centric monitoring allows jobs on remote computing resources to be observed. It may offer visualisation of recorded monitoring data and helps to find faulty or misbehaving jobs. If installations like grids or clouds are observed, monitoring data of many thousands of jobs has to be handled. The challenge of job centric monitoring infrastructures is to store, search and access data collected in huge installations like grids or clouds. We address this challenge with a distributed, layer-based architecture which provides a uniform view of all monitoring data. The concept of this infrastructure, called SLAte, and an analysis of its scalability are provided in this paper.
Adaptive content distribution network for live and on-demand streaming
(Gesellschaft für Informatik e.V., 2012) Miyauchi, Yuta; Matsumoto, Noriko; Yoshida, Norihiko; Kamiya, Yuko; Shimokawa, Toshihiko; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
We have proposed an adaptive content distribution network (CDN), FCAN (Flash Crowds Alleviation Network), which changes its structure dynamically against a flash crowd, that is, a rapid increase in server load caused by a sudden access concentration. In our preceding studies, FCAN handled only static content delivery. In this paper, we extend FCAN to alleviate flash crowds in video streaming. Through experiments, we confirmed that FCAN for video streaming is effective in alleviating flash crowds.
An anonymous efficient private set intersection protocol for wireless sensor networks
(Gesellschaft für Informatik e.V., 2012) Moldovan, George; Ignat, Anda; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
We present an efficient protocol which, under certain assumptions, provides a suitable level of security and anonymity in the ideal cipher model when computing the intersection of two private data-sets containing lists of elements from a large domain. The assumptions are that each node is pre-loaded with a set of pseudonyms, signed by the network's trusted authority, and that the cardinality of each data-set is globally known. Our protocol first establishes a secure, trusted connection between two partners, then uses lightweight, symmetric-key operations for encoding and privately comparing the elements of two sets. Given a cryptographically secure symmetric encryption scheme, our protocol is safe against both semi-honest and malicious adversaries. The primary target platform for this protocol is Wireless Sensor Networks (WSNs), specifically those used in Ambient Assisted Living (AAL) scenarios, which almost entirely consist of a heterogeneous mix of devices, providers and manufacturers.
An architecture for runtime evaluation of SoC reliability
(Gesellschaft für Informatik e.V., 2006) Bernauer, Andreas; Bringmann, Oliver; Rosenstiel, Wolfgang; Bouajila, Abdelmajid; Stechele, Walter; Herkersdorf, Andreas; Hochberger, Christian; Liskowsky, Rüdiger
This paper presents an architecture to evaluate the reliability of a system-on-chip (SoC) during its runtime that also accounts for the system's redundancy. We propose to integrate an autonomic layer into the SoC to detect the chip's current condition and instruct appropriate countermeasures. In the autonomic layer, error counters are used to count the number of errors within a fixed time interval. The counters' values accumulate into a global register representing the system's reliability. The accumulation takes into account the series and parallel composition of the system.
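As a rough illustration of series and parallel composition, the following minimal sketch combines per-component reliability estimates using the textbook formulas (product for series, one minus the product of failure probabilities for parallel). It is not taken from the paper: the function names, the mapping from an error count to a reliability value, and the example topology are all assumptions made for illustration.

```python
from math import prod


def series(reliabilities):
    """All components must work: R = product of R_i."""
    return prod(reliabilities)


def parallel(reliabilities):
    """At least one redundant component must work: R = 1 - product of (1 - R_i)."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


def reliability_from_counter(errors, window_cycles):
    """Hypothetical mapping from an error count in a fixed window to a reliability estimate."""
    return max(0.0, 1.0 - errors / window_cycles)


# Hypothetical SoC: one bus in series with two redundant cores.
bus = reliability_from_counter(errors=1, window_cycles=1000)
core_a = reliability_from_counter(errors=50, window_cycles=1000)
core_b = reliability_from_counter(errors=100, window_cycles=1000)

# The redundant cores are composed in parallel, then in series with the bus.
system = series([bus, parallel([core_a, core_b])])
print(f"estimated system reliability: {system:.6f}")
```

In this sketch the redundant pair ends up more reliable than either core alone, which is the effect a redundancy-aware accumulation is meant to capture.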
ARCS 2012 Workshops
(Gesellschaft für Informatik e.V., 2012) Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
ART networks as flexible means to implement dependability properties in autonomous systems
(Gesellschaft für Informatik e.V., 2012) Großpietsch, Karl-Erwin; Silayeva, Tanya A.; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
In this paper, the potential of adaptive resonance theory (ART) networks for dependability issues is considered. The basic properties of ART architectures are described, and some strategies are discussed to enable a balanced combination of performance and dependability requirements by these networks.
Cellular location determination - reliability and trustworthiness of GSM location data
(Gesellschaft für Informatik e.V., 2012) Zahoransky, Richard M.; Rechert, Klaus; Meier, Konrad; Wehrle, Dennis; Suchodoletz, Dirk von; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
While using mobile telephony networks, the serving network infrastructure is able to determine the mobile station's location. Until now, cellular telephony has been built on self-contained infrastructure, i.e. all network components have been certified and, in particular, users have been unable to take control of their mobile equipment's behavior. With the rising awareness of privacy issues, software-based mobile phone network stacks became available, and thereby a new degree of freedom for mobile subscribers is introduced. While slight modifications to the mobile phone's behavior will not impair the general functionality of the network, cellular location determination becomes less reliable and trustworthy. We discuss user-imposed measures to detect external location determination attempts and to obfuscate generated location information. With a dedicated testbed setup, the effects of location obfuscation were evaluated.
A comparison of parallel programming models of network processors
(Gesellschaft für Informatik e.V., 2004) Albrecht, Carsten; Hagenau, Rainer; Maehle, Erik; Döring, Andreas; Herkersdorf, Andreas; Brinkschulte, Uwe; Becker, Jürgen; Fey, Dietmar; Großpietsch, Karl-Erwin; Hochberger, Christian; Maehle, Erik; Runkler, Thomas A.
Today's network processors utilize parallel processing in order to cope with the traffic growth and wire-speed of current and future network technologies. In this paper, we study two important parallel programming models for network processors: run to completion and pipelining. In particular, the packet flow of a standard network application, IPv4 Forwarding, through two examined network processors, IBM PowerNP NP4GS3 and Intel IXP1200, is reviewed and characterized with respect to their programming models. Based on a benchmark for PC-cluster SANs, their application throughput and latency for Gigabit Ethernet are investigated and compared to a commercial, ASIC-based switch. It is shown that in this scenario network processors can compete with hard-wired solutions.
Concurrent phase classification for accelerating MPSoC simulation
(Gesellschaft für Informatik e.V., 2012) Tawk, Melhem; Ibrahim, Khaled Z.; Niar, Smail; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
To rapidly evaluate performance and power consumption in design space exploration (DSE) of modern, highly complex embedded systems, new simulation tools are needed. The checkpointing technique, which consists in saving system states in order to simulate in detail only a small part of the application, is among the most viable simulation approaches. In this paper, a new method for generating and storing checkpoints for accelerating MPSoC simulation is presented. Experimental results demonstrate that our technique can reduce simulation time and the memory size required to store these checkpoints on a secondary memory. In addition, the time necessary to load checkpoints on the host processor at runtime is optimized. These advantages speed up simulations and allow exploration of a large space of alternative designs in the DSE.
Correction of faulty signal transmission for resilient designs of signed-digit arithmetic
(Gesellschaft für Informatik e.V., 2012) Neuhäuser, David; Zehendner, Eberhard; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
When arithmetic components are parallelized, fault-prone interconnections can corrupt results significantly. Advances in feature size shrinking lead to a steady increase of errors caused by faulty transmission. We suggest employing resilient data encoding schemes to offset these negative effects. Focusing on parallel signed-digit based arithmetic, frequently used in high-speed systems, we found that a suitable data encoding can reduce error rates by about 25% when using 2-bit encoding and about 62% when using 3-bit encoding. Data encoding should be driven by symbol occurrence probabilities. We develop a methodology to obtain these probabilities, show example fault-tolerant encodings, and discuss the impact on communicating parallel arithmetic circuits in example error scenarios.
Efficient memory allocations on a many-core accelerator
(Gesellschaft für Informatik e.V., 2012) Koutras, Ioannis; Bartzas, Alexandros; Soudris, Dimitrios; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
Memory management is one of the key challenges in the design of embedded systems where memory is a scarce resource. The problem scales disproportionately as new embedded systems incorporate many-core architectures where the cores have to compete for access to an even more limited amount of resources. In this paper we present a way of creating custom memory allocators for many-core accelerators. We evaluated our approach on the P2012 platform, a many-core accelerator from ST. It is shown that a custom memory allocator created by our framework could save on average 62% of the total cycles spent on memory resource management when compared with the platform's current memory allocator without increasing the allocator's overhead.
Enhanced reliability in tiled manycore architectures through transparent task relocation
(Gesellschaft für Informatik e.V., 2012) Rauchfuss, Holm; Wild, Thomas; Herkersdorf, Andreas; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
Manycore platforms with tens and even up to hundreds of processing cores per chip are becoming a commercial reality and are the subject of intensified research. This concept paper describes work in progress on the applicability of HW-supported communication and processing virtualization on regularly structured, tiled manycore architectures for the benefit of improved fault tolerance against transient and permanent perturbations. Temporarily unused, naturally redundant tiles are dynamically occupied during run time via transparent task relocation. This means that the execution of a task can be switched, proactively and transparently for the application, by distributed system management and virtualization services from a tile that is considered unreliable to a more reliable tile. In order to support different requirements regarding safety, timing integrity and minimized overhead for the relocation services, several established strategies can be enacted by the system management. The migration protocol for signaling during run configuration and actual relocation allows migration with minimal downtime and no communication loss. The actual migration is triggered by a configurable threshold on critical system parameters on a per-task basis.
Evaluating run-time resource management policies for multi-core embedded platforms with the EMME evaluation framework
(Gesellschaft für Informatik e.V., 2012) Mariani, Giovanni; Palermo, Gianluca; Zaccaria, Vittorio; Silvano, Cristina; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
Today's embedded computing electronic products are based on multi-core platforms and are capable of concurrently executing different applications. For these products it is of paramount importance that a Run-time Resource Management (RRM) system integrated in the Operating System (OS) arbitrates resource allocation among the active applications. The RRM should take decisions at run-time to maximize platform performance and minimize non-functional costs such as power consumption or memory requirements. However, embedded system design covers a broad range of applications, and customer requirements differ widely depending on the target device. In general, there is no single RRM that best fits all possible embedded scenarios. This paper presents the EMME Evaluation Framework, an open source tool that provides a methodology and the accompanying infrastructure to quickly explore the effects of different RRM systems for a target use case scenario. The tool aims at the analysis of different figures of merit of the system being designed, such as the applications' response time, system throughput and power consumption. Different RRM modules are released with the framework. These modules implement different RRM policies that define how to allocate computing resources to the active applications while fitting in a power budget that is assumed to be assigned by other layers of the OS.
Exploiting bit-level parallelism in GPGPUs: a case study on KeeLoq exhaustive key search attack
(Gesellschaft für Informatik e.V., 2012) Agosta, Giovanni; Barenghi, Alessandro; Pelosi, Gerardo; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
Graphics Processing Units (GPUs) are increasingly popular in the field of high-performance computing for their ability to provide computational power for massively parallel problems at a reduced cost. However, the programming model exposed by the GPGPU software development tools is often insufficient to achieve full performance, and a major rethinking of algorithmic choices is needed. In this paper, we showcase such an effect on a case study drawn from the cryptography application domain. The pervasive use of cryptographic primitives in modern embedded systems is a growing trend. Small, efficient cryptosystems have been effectively employed to design and implement keyless password-based access control systems in various wireless authentication applications. The security margin provided by these lightweight ciphers should be accurately examined in light of the speed and area constraints imposed by the target environment. We present a re-design of the ASIC-oriented KEELOQ implementation to perform efficient exhaustive key search attacks while fitting tightly the parallel programming model exposed by modern GPUs. Indeed, the bitslicing technique allows the intrinsic parallelism offered by word-oriented SIMD computations to be effectively exploited. Through proper adaptation of the algorithm implementation to a platform radically different from the one it was designed for, we achieved a 40× speedup in computation time with respect to a single-core CPU brute-force attack, employing only consumer-grade hardware. The outstanding speedup obtainable points to a significant weakening of the cipher's security margin, since it proves that anyone with off-the-shelf hardware is able to circumvent the security measures in place.
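To make the bitslicing idea concrete, here is a minimal sketch of the general technique: bit i of each word belongs to independent instance i, so a single bitwise operation evaluates one gate for all instances at once. This is not the paper's implementation and not the KeeLoq round function. The toy Boolean function, the lane count, and all names are assumptions chosen for illustration.

```python
import random

LANES = 64  # assumed number of independent instances packed into one word


def pack(bits):
    """Pack one bit per lane into a single integer (lane i maps to bit i)."""
    word = 0
    for lane, bit in enumerate(bits):
        word |= (bit & 1) << lane
    return word


def unpack(word, lanes=LANES):
    """Recover the per-lane bits from a packed word."""
    return [(word >> lane) & 1 for lane in range(lanes)]


def toy_gate(a, b, c):
    """Toy Boolean function, (a AND b) XOR c; works on single bits or packed words."""
    return (a & b) ^ c


rng = random.Random(0)
a_bits = [rng.getrandbits(1) for _ in range(LANES)]
b_bits = [rng.getrandbits(1) for _ in range(LANES)]
c_bits = [rng.getrandbits(1) for _ in range(LANES)]

# Bitsliced: three bitwise operations cover all 64 instances.
sliced = unpack(toy_gate(pack(a_bits), pack(b_bits), pack(c_bits)))

# Reference: evaluate each instance separately.
scalar = [toy_gate(a, b, c) for a, b, c in zip(a_bits, b_bits, c_bits)]

assert sliced == scalar
```

In a key search, each lane would carry a different candidate key, so the per-gate cost is shared across all lanes of the word.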
Fail*: towards a versatile fault-injection experiment framework
(Gesellschaft für Informatik e.V., 2012) Schirmeier, Horst; Hoffmann, Martin; Kapitza, Rüdiger; Lohmann, Daniel; Spinczyk, Olaf; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
Many years of research on dependable, fault-tolerant software systems yielded many tool implementations for vulnerability analysis and experimental validation of resilience measures. We identify two disjoint classes of fault-injection (FI) experiment tools in the field, and argue that both are plagued by inherent deficiencies, such as insufficient target state access, little or no means to switch to another target system, and non-reusable experiment code. In this article, we present a novel design approach for a FI infrastructure that aims at combining the strengths of both classes. Our FAIL* experiment framework provides carefully-chosen abstractions simplifying both the implementation of different simulator/hardware target backends and the reuse of experiment code, while retaining the ability for deep target-state access for specialized FI experiments. An exemplary report on first experiences with a prototype implementation based on existing X86 and ARM simulators demonstrates the tool's versatility.
Fast scenario-based design space exploration using feature selection
(Gesellschaft für Informatik e.V., 2012) Stralen, Peter van; Pimentel, Andy; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
This paper presents a novel approach to efficiently perform early system level design space exploration (DSE) of MultiProcessor System-on-Chip (MPSoC) based embedded systems. By modeling dynamic multi-application workloads using application scenarios, optimal designs can be quickly identified using a combination of a scenario-based DSE and a feature selection algorithm. The feature selection algorithm identifies a representative subset of scenarios, which is used to predict the fitness of the MPSoC design instances in the genetic algorithm of the scenario-based DSE. Results show that our scenario-based DSE provides a tradeoff between the speed and accuracy of the early DSE.
Fault tolerant and adaptive path planning in crowded environments for mobile robots based on hazard estimation via health signals
(Gesellschaft für Informatik e.V., 2012) Maas, Raphael; Maehle, Erik; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
Mobile robots are complex systems that tend to become even more complex. Since these systems often show graceful degradation in case of hardware faults, the applied control strategies should be taken into account. The overall fitness of a robot can be condensed into health signals. Depending on these signals, the control of the robot can be adjusted to counteract reduced sensing or acting capabilities by following the design principles of organic computing. This work introduces techniques to estimate possible hazards that are introduced by the presence of obstacles in the close proximity of a damaged robot and might threaten its task.
A Fault-Tolerant Processor Architecture
(Gesellschaft für Informatik e.V., 2010) Bouajila, Abdelmajid; Sommer, Thomas; Zeppenfeld, Johannes; Stechele, Walter; Herkersdorf, Andreas
Fault localization in NoCs by timed heartbeats
(Gesellschaft für Informatik e.V., 2012) Fechner, Bernhard; Garbade, Arne; Weis, Sebastian; Ungerer, Theo; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
Future computing systems will contain more and more cores on a single die. Permanent faults occur not only during manufacturing but may also arise at runtime. To detect these faults, a group of cores is monitored by a single unit, receiving heartbeats from all cores. In this paper, we present a simple method to localize permanent faults in a 2D mesh-based NoC by using heartbeats and by measuring the time from source (core) to destination (monitoring unit). We introduce a heartbeat network alongside the normal application message network to guarantee deterministic heartbeat timing and no interference with application messages. If the time for a heartbeat exceeds a given interval, it can be concluded that the heartbeat is missing or delayed, e.g. because of a faulty core, link or router. As this is not sufficient to localize a fault, we introduce the concept of Timed Heartbeats, which uses routing directions different from the intended routing to introduce a fixed, additional delay for rerouted heartbeats. The delay helps to localize the fault without any additional bandwidth consumption.
Flexible scheduling and thread allocation for synchronous parallel tasks
(Gesellschaft für Informatik e.V., 2012) Kessler, Christoph W.; Hansson, Erik; Mühl, Gero; Richling, Jan; Herkersdorf, Andreas
We describe a task model and a dynamic scheduling and resource allocation mechanism for synchronous parallel tasks to be executed on SPMD-programmed synchronous shared-memory MIMD parallel architectures with uniform, unit-time memory access and strict memory consistency, also known in the literature as PRAMs (Parallel Random Access Machines). Our task model provides a two-tier programming model for PRAMs that flexibly combines SPMD and fork-join parallelism within the same application. It offers flexibility through dynamic scheduling and late resource binding while preserving the PRAM execution properties within each task, the only limitation being that the maximum number of threads that can be assigned to a task is limited to what the underlying architecture provides. In particular, our approach opens the way for automatic performance tuning at run-time by controlling the thread allocation for tasks based on run-time predictions. With a prototype implementation of a synchronous parallel task API in the SPMD-based PRAM language Fork and an experimental evaluation with example programs on the SBPRAM simulator, we show that a realization of the task model on an SPMD-programmable PRAM machine is feasible with moderate runtime overhead per task.