
David A Kearney


david.kearney@unisa.edu.au

Journal articles

2010
Van Nguyen, David Kearney, Gianpaolo Gioiosa (2010)  An extensible, maintainable and elegant approach to hardware source code generation in Reconfig-P   JOURNAL OF LOGIC AND ALGEBRAIC PROGRAMMING 79: 6. 383-396 AUG 2010  
Abstract: Reconfig-P is a prototype hardware implementation for membrane computing applications. Currently there are two alternative designs for the implementation of P systems in Reconfig-P: the rule-oriented design and the region-oriented design. Driven by the goal of high performance, the rule-oriented design treats reaction rules as the primary computational entities and represents regions only implicitly. In contrast, the region-oriented design represents regions, rather than reaction rules, as the primary computational entities, and thereby more directly reflects the intuitive conceptual understanding of a P system and promotes the extensibility of Reconfig-P. To improve its practical usefulness and versatility, not only should Reconfig-P include both of the hardware designs, it should be maintainable and extensible so that it can easily and effectively incorporate new implementation strategies and implementations of additional types of P systems. To accomplish a seamless integration of the rule-oriented and region-oriented designs and other alternative implementation strategies in Reconfig-P, and to make Reconfig-P amenable to future integration of additional implementation strategies, we have developed a new version of P Builder, our intelligent hardware source code generator, in accordance with a novel design pattern called Content-Form-Strategy. In this paper, we describe the Content-Form-Strategy pattern and the implementation of the new version of P Builder. (C) 2010 Elsevier Inc. All rights reserved.
Notes: Times Cited: 1
Van Nguyen, David Kearney, Gianpaolo Gioiosa (2010)  A Region-Oriented Hardware Implementation for Membrane Computing Applications   MEMBRANE COMPUTING 5957: 385-409 2010  
Abstract: We have recently developed Reconfig-P, a prototype hardware implementation of membrane computing based on reconfigurable computing technology. The existing hardware design treats reaction rules as the primary computational entities and represents regions only implicitly. In this paper, we describe and evaluate an alternative hardware design that more directly reflects the intuitive conceptual understanding of a P system and therefore promotes the extensibility of Reconfig-P. A key feature of the design is the fact that regions, rather than reaction rules, are the primary computational entities. More specifically, in the design, regions are represented as loosely coupled processing units which communicate objects by message passing. Experimental results show that for many P systems the region-oriented and rule-oriented designs exhibit similar performance and hardware resource consumption.
Notes: Times Cited: 0
2009
Van Nguyen, David Kearney, Gianpaolo Gioiosa (2009)  An Algorithm for Non-deterministic Object Distribution in P Systems and Its Implementation in Hardware   MEMBRANE COMPUTING 5391: 325-354 2009  
Abstract: We have recently developed a prototype hardware implementation of membrane computing using reconfigurable computing technology. This prototype, called Reconfig-P, exhibits a good balance of performance, flexibility and scalability. However, it does not yet implement non-deterministic object distribution. One of our goals is to incorporate non-deterministic object distribution into Reconfig-P without compromising too significantly its performance, flexibility or scalability. In this paper, we (a) propose an algorithm for non-deterministic object distribution in P systems, and (b) describe and evaluate a prototype hardware implementation of this algorithm based on reconfigurable computing technology. The results of our evaluation of the prototype implementation show that our proposed algorithm can be efficiently implemented using reconfigurable computing technology. Therefore there is strong evidence that it is feasible to incorporate non-deterministic object distribution into Reconfig-P as desired.
Notes: Times Cited: 4
2008
Van Nguyen, David Kearney, Gianpaolo Gioiosa (2008)  AN IMPLEMENTATION OF MEMBRANE COMPUTING USING RECONFIGURABLE HARDWARE   COMPUTING AND INFORMATICS 27: 551-569 2008  
Abstract: Because of their inherent, large-scale parallelism, membrane computing models can be fully exploited only through the use of a parallel computing platform. We have fully implemented such a computing platform based on reconfigurable hardware that is intended to support the efficient execution of membrane computing models. This computing platform is the first of its type to implement parallelism at both the system and region levels. In this paper, we describe how our computing platform implements the core features of membrane computing models in hardware, and present a theoretical performance analysis of the algorithm it executes in hardware. The performance analysis suggests that the computing platform can significantly outperform sequential implementations of membrane computing as well as Petreska and Teuscher's hardware implementation, the only other complete hardware implementation of membrane computing in existence.
Notes: Times Cited: 3
Vinay Sriram, David Kearney (2008)  Multiple parallel FPGA implementations of a Kolmogorov phase screen generator   JOURNAL OF REAL-TIME IMAGE PROCESSING 3: 3. 195-200 SEP 2008  
Abstract: Modelling the effects of wavefront distortions over a finite aperture is an essential component in the simulation of adaptive optics configurations, prediction of the performance of laser designators and atmospheric imaging simulations such as the generation of infrared (IR) scenes in the presence of atmospheric turbulence. In all of these applications many thousands of phase screens need to be generated. The computation time required for a large number of iterations of algorithms that model this effect is an important issue, and for this reason there have been many previous attempts to improve the computation speed of such algorithms. In this paper, the computation performance of the best previous algorithm that models this phenomenon is substantially improved using high performance reconfigurable computing through acceleration of the key computationally intensive steps of the algorithm on a field programmable gate array (FPGA). Our best hardware implementation can provide a speedup of more than 60 times the original algorithm.
Notes: Times Cited: 0
2007
Vinay Sriram, David Kearney (2007)  Implementing a phase screen generator in hardware   EIGHTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS 141-145 2007  
Abstract: The computation time required for the modelling of wavefront distortions over a finite aperture has always been an important issue for applications like prediction of the performance of laser designators and simulation of infrared scenes in the presence of atmospheric turbulence. In this paper, we show that the computation performance of the best previous algorithm that models this phenomenon can be substantially improved using high performance reconfigurable computing through acceleration of the key computationally intensive steps of the algorithm on a field programmable gate array (FPGA). Our best hardware implementation provides an overall speedup of more than 8 times the original algorithm.
Notes: Times Cited: 0
Vinay Sriram, David Kearney (2007)  A FPGA implementation of variable kernel convolution   EIGHTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS 105-109 2007  
Abstract: Convolution is a basic signal and image processing application. In image processing, kernel coefficients of convolution commonly remain constant across the entire image. A less common situation is where the kernel coefficients change in value for each pixel in the image. We call this variable kernel convolution. In this paper we present what we believe are the first three FPGA implementations of variable kernel convolution. The first uses sequential streaming, the second uses pipelining, and the third uses what we call convolve and gather. The hardware implementation of convolve and gather has the highest area time rating (6.7x better than streaming and 3.4x better than the pipelining solution). Both pipelining and convolve and gather have the same throughput (25x that of streaming), but convolve and gather has a 71% smaller area footprint than the pipeline.
Notes: Times Cited: 0
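Illustrative note: the idea of variable kernel convolution described in the abstract above can be sketched in software as follows. This is a minimal sequential sketch, not the authors' FPGA design; the function name `variable_kernel_convolve`, the `kernel_for` callback and the zero-padding border policy are all assumptions made for illustration.

```python
def variable_kernel_convolve(image, kernel_for):
    """Convolve a 2D image where every output pixel uses its own kernel.

    image: list of rows of numbers.
    kernel_for: hypothetical callback (row, col) -> square 2D kernel of odd size.
    Pixels outside the image are treated as zero (an assumed border policy).
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            kernel = kernel_for(r, c)      # the kernel changes per pixel
            half = len(kernel) // 2
            acc = 0.0
            for kr, kernel_row in enumerate(kernel):
                for kc, weight in enumerate(kernel_row):
                    # True convolution: the kernel is flipped relative to the image.
                    rr = r - (kr - half)
                    cc = c - (kc - half)
                    if 0 <= rr < height and 0 <= cc < width:
                        acc += weight * image[rr][cc]
            output[r][c] = acc
    return output


# Example: a 1x1 kernel whose single weight depends on the pixel position.
image = [[1, 2], [3, 4]]
scaled = variable_kernel_convolve(image, lambda r, c: [[r + c + 1]])
```

With a constant `kernel_for` this reduces to ordinary convolution; the per-pixel callback is what makes the kernel "variable".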
Vinay Sriram, David Kearney (2007)  High throughput multi-port MT19937 uniform random number generator   EIGHTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS 157-158 2007  
Abstract: There have been many previous attempts to accelerate MT19937 using FPGAs, but we believe that we can substantially improve on the previous implementations to develop a higher throughput and more area time efficient design. In this paper we first present a single port design and then present an enhanced 624 port hardware implementation of the MT19937 algorithm that has a throughput of 119.6 x 10^9 32 bit random numbers per second, which is more than 17 times that of the previously best published uniform random number generator. Furthermore, it has the lowest area time metric of all the currently published FPGA based pseudo uniform random number generators.
Notes: Times Cited: 1
Van Nguyen, David Kearney, Gianpaolo Gioiosa (2007)  Balancing performance, flexibility, and scalability in a parallel computing platform for membrane computing applications   MEMBRANE COMPUTING 4860: 385-413 2007  
Abstract: It is an open question whether it is feasible to develop a parallel computing platform for membrane computing applications that significantly outperforms equivalent sequential computing platforms while still achieving acceptable flexibility and scalability. To move closer to an answer to this question, we have investigated a novel approach to the development of a parallel computing platform for membrane computing applications that has the potential to deliver a good balance between performance, flexibility and scalability. This approach involves the use of reconfigurable hardware and an intelligent software component that is able to configure the hardware to suit the specific properties of the membrane computing model to be executed. We have already developed a prototype computing platform called Reconfig-P based on the approach. Reconfig-P is the first computing platform of its type to implement parallelism at both the system and region levels. In this paper, we describe the functionality of the intelligent software component responsible for hardware configuration in Reconfig-P, and perform an empirical analysis of the performance, flexibility and scalability of Reconfig-P. The empirical results suggest that the implementation approach on which Reconfig-P is based is a viable means of attaining a good balance between performance, flexibility and scalability.
Notes: Times Cited: 4
Vinay Sriram, David Kearney (2007)  A high throughput area time efficient pseudo uniform random number generator based on the TT800 algorithm   2007 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS, PROCEEDINGS, VOLS 1 AND 2 529-532 2007  
Abstract: Many computer simulations require large quantities of uncorrelated random numbers to be generated quickly. Examples include all forms of Monte Carlo simulation, generating phase screens to simulate the effects of atmospheric turbulence and the simulation of electrical noise in sensors. A flexible way to generate random numbers of arbitrary distribution is to modify the distribution of a source of uniform random numbers. Thus it is of interest to have a fast uniform random number generator implemented in reconfigurable hardware. In this paper we present multiple hardware implementations of the TT800 algorithm. The best implementation achieved a throughput of 4.6 x 10^9 uniform random numbers per second using 24 parallel generators by making use of 253 Xilinx Virtex XC2VP70 slices. It has an area time rating of 0.05 x 10^-6 Xilinx slices x seconds per 32 bit random number. It has the lowest area time metric and only half the area requirement of the previously best published multi-port, single seed generator with at least a 2^800 period.
Notes: Times Cited: 2
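Illustrative note: the area time rating quoted in the abstract above can be checked from the other two figures it gives (253 slices and 4.6 billion numbers per second). The sketch below assumes the rating is simply area divided by throughput, which matches the stated units (slices x seconds per random number); the function name `area_time_per_number` is hypothetical.

```python
def area_time_per_number(slices, throughput_per_second):
    """Area-time cost of one output: slices x seconds per generated number.

    Assumed definition: area (in slices) divided by throughput (numbers/second).
    """
    return slices / throughput_per_second


# Figures taken from the abstract: 253 slices, 4.6e9 numbers per second.
tt800_rating = area_time_per_number(253, 4.6e9)
print(f"{tt800_rating:.3e} slices x seconds per number")
```

Under this assumed definition the result is about 0.055 millionths of a slice-second per number, consistent with the roughly 0.05 millionths reported in the abstract.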
Vinay Sriram, David Kearney (2007)  An ultra fast Kolmogorov phase screen generator suitable for parallel implementation   OPTICS EXPRESS 15: 21. 13709-13714 OCT 17 2007  
Abstract: Modelling phase fluctuations due to Kolmogorov turbulence is important in many areas of applied optics such as simulating adaptive optics configurations, prediction of the performance of laser designators and simulation of infrared (IR) scenes in the presence of atmospheric turbulence. The computational performance of algorithms implementing this model is an important issue because in many situations a large number of phase screens is required. For example, in IR scene simulation a different phase screen is required for each pixel in the scene, and in other situations there exists a need for many thousands of phase screens to be calculated to obtain a statistical average. Whilst there have been previous attempts to increase the computational speed of these algorithms, the computation time required for a large number of phase screens still remains an issue. In this paper, we apply linear and statistical properties to improve the performance of the previous best published algorithm by 60 times when implemented on a sequential processor in software. Because the new algorithm is now trivially parallelizable, a further 20 times speedup can easily be achieved through a parallel software or hardware implementation. (c) 2007 Optical Society of America.
Notes: Times Cited: 3
2006
Grant Wigley, David Kearney (2006)  Performance evaluations of ReconfigME   2006 IEEE International Conference on Field Programmable Technology, Proceedings 309-312 2006  
Abstract: With the development of reconfigurable computers containing FPGAs with in excess of 6 million system-gates, it is now feasible to consider the possibility of sharing the FPGA between multiple concurrently executing applications. This could potentially increase the resource usage of the expensive FPGA logic and decrease response times so users will not have to wait for the FPGA to be completely available. However, the system environment software required to support this may actually result in application performance much less than would be considered acceptable to many FPGA users. This paper uses a prototype to evaluate the performance of such an operating system, ReconfigME.
Notes: Times Cited: 0
David Kearney, John Hopf (2006)  Hardware join Java : A unified Hardware/Software language for dynamic partial runtime reconfigurable computing applications   2006 IEEE International Conference on Field Programmable Technology, Proceedings 277-280 2006  
Abstract: Reconfigurable computing is maturing rapidly as FPGAs combining hard core processors and high density logic block arrays become widely available at low cost. Application developers have been developing algorithms that cross the hardware software divide for some years but will in addition want to express the dynamic reconfiguration of FPGAs made available via an operating system for reconfigurable computing. Whilst there are many behavioural languages available for expressing reconfigurable computing applications, very few of them are comprehensive enough to address simultaneously these two requirements. In this paper we present an experimental language based on Java which aims to achieve the twin goals of a transparent hardware software interface and an integrated expression of dynamic reconfiguration. Hardware Join Java (HJJ) uses a common threading abstraction and synchronization based on the Join calculus to unify the semantics and interface between hardware and software. The language extends the dynamic class instantiation mechanism of Java (supported by the services of an operating system for reconfigurable computing) to express user initiated dynamic reconfiguration of the FPGA. In this paper we present basic syntax and semantics of HJJ and give our initial experience with the prototype compiler.
Notes: Times Cited: 0
2005
Mark Jasiunas, David Kearney, Richard Bowyer (2005)  Connectivity, resource integration, and high performance reconfigurable computing for autonomous UAVs   2005 IEEE Aerospace Conference, Vols 1-4 3020-3027 2005  
Abstract: In an investigation into the capabilities of small autonomous formations of unmanned aerial vehicles (UAVs), we identified connectivity, processing power, and lack of resource integration as three major limiting factors of current technology. In an endeavor to address these issues, we propose a novel hardware and software environment consisting of a traditional Von Neumann processor coupled with a field programmable gate array (FPGA) for high performance processing, along with support libraries to better manage the resources of a formation. The supporting software libraries have the primary functions of allowing any networked resource (such as processors and UAV sensors) to be accessed from any location in the UAV formation, and of providing support that allows algorithms implemented simultaneously on the reconfigurable and traditional processors to migrate between UAVs for better connectivity to resources or to balance processing loads. In this paper we present the issues we faced in the design of these systems, along with our preliminary results indicating the advantages and shortcomings of the system. We also describe in detail the construction of the prototype systems used to determine the correct software settings for the mobile algorithms.
Notes: Times Cited: 0