David A Kearney - Publications List

Journal articles

2010

Van Nguyen, David Kearney, Gianpaolo Gioiosa (2010) An extensible, maintainable and elegant approach to hardware source code generation in Reconfig-P JOURNAL OF LOGIC AND ALGEBRAIC PROGRAMMING 79: 6. 383-396 AUG 2010

Abstract: Reconfig-P is a prototype hardware implementation for membrane computing applications. Currently there are two alternative designs for the implementation of P systems in Reconfig-P: the rule-oriented design and the region-oriented design. Driven by the goal of high performance, the rule-oriented design treats reaction rules as the primary computational entities and represents regions only implicitly. In contrast, the region-oriented design represents regions, rather than reaction rules, as the primary computational entities, and thereby more directly reflects the intuitive conceptual understanding of a P system and promotes the extensibility of Reconfig-P. To improve its practical usefulness and versatility, not only should Reconfig-P include both of the hardware designs, it should be maintainable and extensible so that it can easily and effectively incorporate new implementation strategies and implementations of additional types of P systems. To accomplish a seamless integration of the rule-oriented and region-oriented designs and other alternative implementation strategies in Reconfig-P, and to make Reconfig-P amenable to future integration of additional implementation strategies, we have developed a new version of P Builder, our intelligent hardware source code generator, in accordance with a novel design pattern called Content-Form-Strategy. In this paper, we describe the Content-Form-Strategy pattern and the implementation of the new version of P Builder. (C) 2010 Elsevier Inc. All rights reserved.

Notes: Times Cited: 1

Van Nguyen, David Kearney, Gianpaolo Gioiosa (2010) A Region-Oriented Hardware Implementation for Membrane Computing Applications MEMBRANE COMPUTING 5957: 385-409 2010

Abstract: We have recently developed a prototype hardware implementation of membrane computing based on reconfigurable computing technology called Reconfig-P. The existing hardware design treats reaction rules as the primary computational entities and represents regions only implicitly. In this paper, we describe and evaluate an alternative hardware design that more directly reflects the intuitive conceptual understanding of a P system and therefore promotes the extensibility of Reconfig-P. A key feature of the design is the fact that regions, rather than reaction rules, are the primary computational entities. More specifically, in the design, regions are represented as loosely coupled processing units which communicate objects by message passing. Experimental results show that for many P systems the region-oriented and rule-oriented designs exhibit similar performance and hardware resource consumption.

Notes: Times Cited: 0

David Kearney, Van Nguyen, Gianpaolo Gioiosa (2010) A Special Issue on Bio-Inspired Computing (BIC-TA 2008) JOURNAL OF COMPUTATIONAL AND THEORETICAL NANOSCIENCE 7: 5. 803-805 MAY 2010

Abstract:

Notes: Times Cited: 0

2009

Van Nguyen, David Kearney, Gianpaolo Gioiosa (2009) An Algorithm for Non-deterministic Object Distribution in P Systems and Its Implementation in Hardware MEMBRANE COMPUTING 5391: 325-354 2009

Abstract: We have recently developed a prototype hardware implementation of membrane computing using reconfigurable computing technology. This prototype, called Reconfig-P, exhibits a good balance of performance, flexibility and scalability. However, it does not yet implement non-deterministic object distribution. One of our goals is to incorporate non-deterministic object distribution into Reconfig-P without compromising too significantly its performance, flexibility or scalability. In this paper, we (a) propose an algorithm for non-deterministic object distribution in P systems, and (b) describe and evaluate a prototype hardware implementation of this algorithm based on reconfigurable computing technology. The results of our evaluation of the prototype implementation show that our proposed algorithm can be efficiently implemented using reconfigurable computing technology. Therefore there is strong evidence that it is feasible to incorporate non-deterministic object distribution into Reconfig-P as desired.

Notes: Times Cited: 4

2008

Van Nguyen, David Kearney, Gianpaolo Gioiosa (2008) AN IMPLEMENTATION OF MEMBRANE COMPUTING USING RECONFIGURABLE HARDWARE COMPUTING AND INFORMATICS 27: 551-569 2008

Abstract: Because of their inherent, large-scale parallelism, membrane computing models can be fully exploited only through the use of a parallel computing platform. We have fully implemented such a computing platform based on reconfigurable hardware that is intended to support the efficient execution of membrane computing models. This computing platform is the first of its type to implement parallelism at both the system and region levels. In this paper, we describe how our computing platform implements the core features of membrane computing models in hardware, and present a theoretical performance analysis of the algorithm it executes in hardware. The performance analysis suggests that the computing platform can significantly outperform sequential implementations of membrane computing as well as Petreska and Teuscher's hardware implementation, the only other complete hardware implementation of membrane computing in existence.

Notes: Times Cited: 3

Vinay Sriram, David Kearney (2008) Multiple parallel FPGA implementations of a Kolmogorov phase screen generator JOURNAL OF REAL-TIME IMAGE PROCESSING 3: 3. 195-200 SEP 2008

Abstract: Modelling the effects of wavefront distortions over a finite aperture is an essential component in the simulation of adaptive optics configurations, prediction of performance of laser designators and atmospheric imaging simulations like generation of infrared (IR) scenes in the presence of atmospheric turbulence. In all of these applications many thousands of phase screens need to be generated. The computation time required for a large number of iterations of algorithms that model this effect is an important issue, and for this reason there have been many previous attempts to improve the computation speed of such algorithms. In this paper, the computation performance of the best previous algorithm that models this phenomenon is substantially improved using high performance reconfigurable computing through acceleration of the key computationally intensive steps of the algorithm on a field programmable gate array (FPGA). Our best hardware implementation can provide a speedup of more than 60 times over the original algorithm.

Notes: Times Cited: 0

2007

Vinay Sriram, David Kearney (2007) Implementing a phase screen generator in hardware EIGHTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS 141-145 2007

Abstract: The computation time required for the modelling of wavefront distortions over a finite aperture has always been an important issue for applications like prediction of performance of laser designators and simulation of infrared scenes in the presence of atmospheric turbulence. In this paper, we show that the computation performance of the best previous algorithm that models this phenomenon can be substantially improved using high performance reconfigurable computing through acceleration of the key computationally intensive steps of the algorithm on a field programmable gate array (FPGA). Our best hardware implementation provides an overall speedup of more than 8 times over the original algorithm.

Notes: Times Cited: 0

Vinay Sriram, David Kearney (2007) A FPGA implementation of variable kernel convolution EIGHTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS 105-109 2007

Abstract: Convolution is a basic signal and image processing application. In image processing, kernel coefficients of convolution commonly remain constant across the entire image. A less common situation is where the kernel coefficients change in value for each pixel in the image. We call this variable kernel convolution. In this paper we present what we believe are the first three FPGA implementations of variable kernel convolution. The first uses sequential streaming, the second uses pipelining, and the third uses what we call convolve and gather. The convolve and gather hardware implementation has the highest area time rating (6.7 x better than streaming and 3.4 x better than the pipelining solution). Both pipelining and convolve and gather have the same throughput (which is 25 x that of streaming), but convolve and gather has a 71% smaller area footprint than the pipeline.

Notes: Times Cited: 0
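
Illustrative note (not part of the cited paper): the abstract above defines variable kernel convolution as convolution in which the kernel coefficients change for each pixel. The following is a minimal software sketch of that definition only. It is not the streaming, pipelining or convolve and gather hardware design described in the paper. The function name, the list-of-lists data layout, the zero padding at the image border and the kernel-flipping convention are all assumptions made for illustration.

```python
def variable_kernel_convolve(image, kernels):
    """Filter `image` using a different kernel at every pixel.

    image   -- list of rows of numbers (H x W)
    kernels -- kernels[y][x] is the square, odd-sized kernel used at pixel (y, x)

    Pixels outside the image are treated as zero (an assumption).
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            kernel = kernels[y][x]          # the kernel changes per pixel
            radius = len(kernel) // 2
            acc = 0
            for ky in range(-radius, radius + 1):
                for kx in range(-radius, radius + 1):
                    sy, sx = y - ky, x - kx  # convolution flips the kernel
                    if 0 <= sy < height and 0 <= sx < width:
                        acc += kernel[ky + radius][kx + radius] * image[sy][sx]
            out[y][x] = acc
    return out


if __name__ == "__main__":
    identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    image = [[1, 2], [3, 4]]
    # Every pixel keeps its value except the bottom-right one,
    # which sums its whole in-image neighbourhood.
    kernels = [[identity, identity], [identity, box]]
    print(variable_kernel_convolve(image, kernels))
```

With a constant kernel at every pixel this reduces to ordinary 2D convolution, which is the common case the abstract contrasts against.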

Vinay Sriram, David Kearney (2007) High throughput multi-port MT19937 uniform random number generator EIGHTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS 157-158 2007

Abstract: There have been many previous attempts to accelerate MT19937 using FPGAs but we believe that we can substantially improve the previous implementations to develop a higher throughput and more area time efficient design. In this paper we first present a single port design and then present an enhanced 624 port hardware implementation of the MT19937 algorithm that has a throughput of 119.6 x 10^9 32 bit random numbers per second, which is more than 17 times that of the previously best published uniform random number generator. Furthermore it has the lowest area time metric of all the currently published FPGA based pseudo uniform random number generators.

Notes: Times Cited: 1
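
Illustrative note (not part of the cited paper): the figures quoted in the abstract above can be turned into a per-port rate with a small calculation. The sketch below uses only the numbers in the abstract (624 ports, 119.6 x 10^9 numbers per second, "more than 17 times" the previous best). The function names are hypothetical, and the reading of the per-port rate as a clock frequency assumes one number per port per clock cycle, which the abstract does not state.

```python
# Figures quoted in the abstract.
PORTS = 624
TOTAL_THROUGHPUT = 119.6e9   # 32 bit random numbers per second
SPEEDUP_OVER_PREVIOUS = 17   # "more than 17 times" the previous best


def per_port_rate(total, ports):
    """Average numbers per second delivered by each port."""
    return total / ports


def previous_best_upper_bound(total, speedup):
    """Upper bound on the previous best throughput implied by the speedup."""
    return total / speedup


rate = per_port_rate(TOTAL_THROUGHPUT, PORTS)
bound = previous_best_upper_bound(TOTAL_THROUGHPUT, SPEEDUP_OVER_PREVIOUS)

print(f"per-port rate: {rate / 1e6:.1f} million numbers per second")
print(f"previous best: below {bound / 1e9:.2f} x 10^9 numbers per second")
```

Under the one-number-per-cycle assumption, the per-port figure of roughly 192 million numbers per second would correspond to a clock of roughly 192 MHz.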

Van Nguyen, David Kearney, Gianpaolo Gioiosa (2007) Balancing performance, flexibility, and scalability in a parallel computing platform for membrane computing applications MEMBRANE COMPUTING 4860: 385-413 2007

Abstract: It is an open question whether it is feasible to develop a parallel computing platform for membrane computing applications that significantly outperforms equivalent sequential computing platforms while still achieving acceptable flexibility and scalability. To move closer to an answer to this question, we have investigated a novel approach to the development of a parallel computing platform for membrane computing applications that has the potential to deliver a good balance between performance, flexibility and scalability. This approach involves the use of reconfigurable hardware and an intelligent software component that is able to configure the hardware to suit the specific properties of the membrane computing model to be executed. We have already developed a prototype computing platform called Reconfig-P based on the approach. Reconfig-P is the first computing platform of its type to implement parallelism at both the system and region levels. In this paper, we describe the functionality of the intelligent software component responsible for hardware configuration in Reconfig-P, and perform an empirical analysis of the performance, flexibility and scalability of Reconfig-P. The empirical results suggest that the implementation approach on which Reconfig-P is based is a viable means of attaining a good balance between performance, flexibility and scalability.

Notes: Times Cited: 4

Vinay Sriram, David Kearney (2007) A high throughput area time efficient pseudo uniform random number generator based on the TT800 algorithm 2007 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS, PROCEEDINGS, VOLS 1 AND 2 529-532 2007

Abstract: Many computer simulations require large quantities of uncorrelated random numbers to be generated quickly. Examples include all forms of Monte Carlo simulation, generating phase screens to simulate the effects of atmospheric turbulence and the simulation of electrical noise in sensors. A flexible way to generate random numbers of arbitrary distribution is to modify the distribution of a source of uniform random numbers. Thus it is of interest to have a fast uniform random number generator implemented in reconfigurable hardware. In this paper we present multiple hardware implementations of the TT800 algorithm. The best implementation achieved a throughput of 4.6 x 10^9 uniform random numbers per second using 24 parallel generators by making use of 253 Xilinx Virtex XC2VP70 slices. It has an area time rating of 0.05 x 10^-6 Xilinx slices x seconds per 32 bit random number. It has the lowest area time metric and only half the area requirement of the previously best published multi-port, single seed generator with at least a 2^800 period.

Notes: Times Cited: 2
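
Illustrative note (not part of the cited paper): the area time rating quoted in the abstract above is expressed in slices x seconds per random number, so it can be cross-checked by dividing the slice count by the throughput. The sketch below does that with the abstract's own figures (253 slices, 4.6 x 10^9 numbers per second, 24 generators). The function name is hypothetical, and the definition "area divided by throughput" is an assumption inferred from the stated units.

```python
# Figures quoted in the abstract.
SLICES = 253            # Xilinx Virtex XC2VP70 slices
THROUGHPUT = 4.6e9      # uniform random numbers per second
GENERATORS = 24         # parallel generators


def area_time_rating(slices, throughput):
    """Slices x seconds per random number (area divided by throughput)."""
    return slices / throughput


area_time = area_time_rating(SLICES, THROUGHPUT)
per_generator = THROUGHPUT / GENERATORS

print(f"area time rating: {area_time / 1e-6:.3f} x 10^-6 slices x seconds per number")
print(f"per-generator rate: {per_generator / 1e6:.1f} million numbers per second")
```

The result is about 0.055 x 10^-6 slices x seconds per number, which agrees with the 0.05 x 10^-6 quoted in the abstract to the precision given there.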

Vinay Sriram, David Kearney (2007) An ultra fast Kolmogorov phase screen generator suitable for parallel implementation OPTICS EXPRESS 15: 21. 13709-13714 OCT 17 2007

Abstract: Modelling phase fluctuations due to Kolmogorov turbulence is important in many areas of applied optics such as simulating adaptive optics configurations, prediction of the performance of laser designators and simulation of infrared (IR) scenes in the presence of atmospheric turbulence. The computational performance of algorithms implementing this model is an important issue because in many situations a large number of phase screens is required. For example, in IR scene simulation a different phase screen is required for each pixel in the scene, and in other situations there exists a need for many thousands of phase screens to be calculated to obtain a statistical average. Whilst there have been previous attempts to increase the computational speed of these algorithms, the computation time required for a large number of phase screens still remains an issue. In this paper, we apply linear and statistical properties to improve the performance of the previous best published algorithm by 60 times when implemented on a sequential processor in software. Because the new algorithm is now trivially parallelizable, a further 20 times speedup can easily be achieved through a parallel software or hardware implementation. (c) 2007 Optical Society of America.

Notes: Times Cited: 3

2006

Grant Wigley, David Kearney (2006) Performance evaluations of ReconfigME 2006 IEEE International Conference on Field Programmable Technology, Proceedings 309-312 2006

Abstract: With the development of reconfigurable computers containing FPGAs with in excess of 6 million system-gates, it is now feasible to consider the possibility of sharing the FPGA between multiple concurrently executing applications. This could potentially increase the resource usage of the expensive FPGA logic and decrease response times so users will not have to wait for the FPGA to be completely available. However, the system environment software required to support this may actually result in application performance much less than would be considered acceptable to many FPGA users. This paper involves using a prototype to evaluate the performance of such an operating system, ReconfigME.

Notes: Times Cited: 0

David Kearney, John Hopf (2006) Hardware join Java : A unified Hardware/Software language for dynamic partial runtime reconfigurable computing applications 2006 IEEE International Conference on Field Programmable Technology, Proceedings 277-280 2006

Abstract: Reconfigurable computing is maturing rapidly as FPGAs combining hard core processors and high density logic block arrays become widely available at low cost. Application developers have been developing algorithms that cross the hardware software divide for some years but will in addition want to express the dynamic reconfiguration of FPGAs made available via an operating system for reconfigurable computing. Whilst there are many behavioural languages available for expressing reconfigurable computing applications, very few of them are comprehensive enough to address these two requirements simultaneously. In this paper we present an experimental language based on Java which aims to achieve the twin goals of a transparent hardware software interface and an integrated expression of dynamic reconfiguration. Hardware Join Java (HJJ) uses a common threading abstraction and synchronization based on the Join calculus to unify the semantics and interface between hardware and software. The language extends the dynamic class instantiation mechanism of Java (supported by the services of an operating system for reconfigurable computing) to express user initiated dynamic reconfiguration of the FPGA. In this paper we present basic syntax and semantics of HJJ and give our initial experience with the prototype compiler.

Notes: Times Cited: 0

2005

Mark Jasiunas, David Kearney, Richard Bowyer (2005) Connectivity, resource integration, and high performance reconfigurable computing for autonomous UAVs 2005 IEEE Aerospace Conference, Vols 1-4 3020-3027 2005

Abstract: In an investigation into the capabilities of small autonomous formations of unmanned aerial vehicles (UAVs), we identified connectivity, processing power, and lack of resource integration as three major limiting factors of current technology. In an endeavor to address these issues, we propose a novel hardware and software environment consisting of a traditional Von Neumann processor coupled with a field programmable gate array (FPGA) for high performance processing, along with support libraries to better manage the resources of a formation. The supporting software libraries have the primary functions of allowing any networked resource (such as processors and UAV sensors) to be accessed from any location in the UAV formation, and also provide support that allows algorithms implemented simultaneously on the reconfigurable and traditional processors to migrate between UAVs for better connectivity to resources or to balance processing loads. In this paper we present the issues we faced in the design of these systems, along with our preliminary results indicating the advantages and shortcomings of the system. We also describe in detail the construction of the prototype systems used to determine the correct software settings for the mobile algorithms.

Notes: Times Cited: 0


Journal articles

2010	DOI Van Nguyen, David Kearney, Gianpaolo Gioiosa (2010) An extensible, maintainable and elegant approach to hardware source code generation in Reconfig-P JOURNAL OF LOGIC AND ALGEBRAIC PROGRAMMING 79: 6. 383-396 AUG 2010 Abstract: Reconfig-P is a prototype hardware implementation for membrane computing applications. Currently there are two alternative designs for the implementation of P systems in Reconfig-P: the rule-oriented design and the region-oriented design. Driven by the goal of high performance, the rule-oriented design treats reaction rules as the primary computational entities and represents regions only implicitly. In contrast, the region-oriented design represents regions, rather than reaction rules, as the primary computational entities, and thereby more directly reflects the intuitive conceptual understanding of a P system and promotes the extensibility of Reconfig-P. To improve its practical usefulness and versatility, not only should Reconfig-P include both of the hardware designs, it should be maintainable and extensible so that it can easily and effectively incorporate new implementation strategies and implementations of additional types of P systems. To accomplish a seamless integration of the rule-oriented and region-oriented designs and other alternative implementation strategies in Reconfig-P, and to make Reconfig-P amenable to future integration of additional implementation strategies, we have developed a new version of P Builder, our intelligent hardware source code generator, in accordance with a novel design pattern called Content-Form-Strategy. In this paper, we describe the Content-Form-Strategy pattern and the implementation of the new version of P Builder. (C) 2010 Elsevier Inc. All rights reserved. Notes: Times Cited: 1
	Van Nguyen, David Kearney, Gianpaolo Gioiosa (2010) A Region-Oriented Hardware Implementation for Membrane Computing Applications MEMBRANE COMPUTING 5957: 385-409 2010 Abstract: We have recently developed a prototype hardware implementation of membrane computing based on reconfigurable computing technology called Reconfig-P. The existing hardware design treats reaction rules as the primary computational entities and represents regions only implicitly. In this paper, we describe and evaluate an alternative hardware design that more directly reflects the intuitive conceptual understanding of a P system and therefore promotes the extensibility of Reconfig-P. A key feature of the design is the fact that regions, rather than reaction rules, are the primary computational entities. More specifically, in the design, regions are represented as loosely coupled processing units which communicate objects by message passing. Experimental results show that for many P systems the region-oriented and rule-oriented designs exhibit similar performance and hardware resource consumption. Notes: Times Cited: 0
	DOI David Kearney, Van Nguyen, Gianpaolo Gioiosa (2010) A Special Issue on Bio-Inspired Computing (BIC-TA 2008) JOURNAL OF COMPUTATIONAL AND THEORETICAL NANOSCIENCE 7: 5. 803-805 MAY 2010 Abstract: Notes: Times Cited: 0
2009	Van Nguyen, David Kearney, Gianpaolo Gioiosa (2009) An Algorithm for Non-deterministic Object Distribution in P Systems and Its Implementation in Hardware MEMBRANE COMPUTING 5391: 325-354 2009 Abstract: We have recently developed a prototype hardware implementation of membrane computing using reconfigurable computing technology. This prototype, called Reconfig-P, exhibits a good balance of performance, flexibility and scalability. However, it does not yet implement non-deterministic object distribution. One of our goals is to incorporate non-deterministic object distribution into Reconfig-P without compromising too significantly its performance, flexibility or scalability. In this paper, we (a) propose an algorithm for non-deterministic object distribution in P systems, and (b) describe and evaluate a prototype hardware implementation of this algorithm based on reconfigurable computing technology. The results of our evaluation of the prototype implementation show that our proposed algorithm can be efficiently implemented using reconfigurable computing technology. Therefore there is strong evidence that it is feasible to incorporate non-deterministic object distribution into Reconfig-P as desired. Notes: Times Cited: 4
2008	Van Nguyen, David Kearney, Gianpaolo Gioiosa (2008) AN IMPLEMENTATION OF MEMBRANE COMPUTING USING RECONFIGURABLE HARDWARE COMPUTING AND INFORMATICS 27: 551-569 2008 Abstract: Because of their inherent, large-scale parallelism, membrane computing models can be fully exploited only through the use of a parallel computing platform. We have fully implemented such a computing platform based on reconfigurable hardware that is intended to support the efficient execution of membrane computing models. This computing platform is the first of its type to implement parallelism at both the system and region levels. In this paper, we describe how our computing platform implements the core features of membrane computing models in hardware, and present a theoretical performance analysis of the algorithm it executes in hardware. The performance analysis suggests that the computing platform can significantly outperform sequential implementations of membrane computing as well as Petreska and Teuscher's hardware implementation, the only other complete hardware implementation of membrane computing in existence. Notes: Times Cited: 3

	DOI Vinay Sriram, David Kearney (2008) Multiple parallel FPGA implementations of a Kolmogorov phase screen generator JOURNAL OF REAL-TIME IMAGE PROCESSING 3: 3. 195-200 SEP 2008 Abstract: Modelling the effects of wavefront distortions over a finite aperture is an essential component in the simulation of adaptive optics configurations, prediction of performance of laser designators and atmospheric imaging simulations like generation of infrared (IR) scenes in the presence of atmospheric turbulence. In all of these applications many thousands of phase screens need to be generated. The computation time required for a large iterations of algorithms that model this effect is important an issue and for this reason there have been many previous attempts to improve the computation speed such algorithms. In this paper, the computation performance of the best previous algorithm that models this phenomenon is substantially improved using high performance reconfigurable computing through acceleration of the key computationally intensive steps of the algorithm on a field programmable gate array (FPGA). Our best hardware implementation can provide a speedup of more than 60 times the original algorithm. Notes: Times Cited: 0
2007	DOI Vinay Sriram, David Kearney (2007) Implementing a phase screen generator in hardware EIGHTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS 141-145 2007 Abstract: The computation time required for the modelling of wavefront distortions over finite aperture has always been an important issue for applications like prediction of performance of laser designators and simulation of infrared scenes in the presence of atmospheric turbulence. In this paper, we show that the computation performance of the best previous algorithm that models this phenomenon can be substantially improved using high performance reconfigurable computing through acceleration of the key computationally intensive steps of the algorithm on a field programmable gate array (FPGA). Our best hardware implementation provides a overall speedup of more than 8 times the original algorithm. Notes: Times Cited: 0
	DOI Vinay Sriram, David Kearney (2007) A FPGA implementation of variable kernel convolution EIGHTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS 105-109 2007 Abstract: Convolution is a basic signal and image processing application. In image processing, kernel coefficients of convolution commonly remain constant across the entire image. A less common situation is where the kernel coefficients change in value for each pixel in the image. We call this variable kernel convolution. In this paper we present what we believe are the first three FPGA implementations of variable kernel convolution. The first uses sequential streaming, the second uses pipelining and the third solution uses what we call convolve and gather and its hardware implementation has the highest area time rating (6.7 x better than streaming and 3.4 x better than the pipelining solution). Both pipelining and convolve and gather have the same throughput (which is 25 x that of streaming), but convolve and gather has 71 % smaller area footprint than the pipeline. Notes: Times Cited: 0
	DOI Vinay Sriram, David Kearney (2007) High throughput multi-port MT19937 uniform random number generator EIGHTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS 157-158 2007 Abstract: There have been many previous attempts to accelerate MT19937 using FPGAs but we believe that we can substantially improve the previous implementations to develop a higher throughput and more area time efficient design. In this paper we first present a single port design and then present an enhanced 624 port hardware implementations of the MT19937 algorithm that has a through put of 119.6 x 10(9) 32 bit random numbers per second, which is more than 17 times that of the previously best published uniform random number generator. Furthermore it has the lowest area time metric of all the currently published FPGA based pseudo uniform random number generators. Notes: Times Cited: 1
	Van Nguyen, David Kearney, Gianpaolo Gioiosa (2007) Balancing performance, flexibility, and scalability in a parallel computing platform for membrane computing applications MEMBRANE COMPUTING 4860: 385-413 2007 Abstract: It is an open question whether it is feasible to develop a parallel computing platform for membrane computing applications that significantly outperforms equivalent sequential computing platforms while still achieving acceptable flexibility and scalability. To move closer to an answer to this question, we have investigated a novel approach to the development of a parallel computing platform for membrane computing applications that has the potential to deliver a good balance between performance, flexibility and scalability. This approach involves the use of reconfigurable hardware and an intelligent software component that is able to configure the hardware to suit the specific properties of the membrane computing model to be executed. We have already developed a prototype computing platform called Reconfig-P based on the approach. Reconfig-P is the first computing platform of its type to implement parallelism at both the system and region levels. In this paper, we describe the functionality of the intelligent software component responsible for hardware configuration in Reconfig-P, and perform an empirical analysis of the performance, flexibility and scalability of Reconfig-P. The empirical results suggest that the implementation approach on which Reconfig-P is based is a viable means of attaining a good balance between performance, flexibility and scalability. Notes: Times Cited: 4
	DOI Vinay Sriram, David Kearney (2007) A high throughput area time efficient pseudo uniform random number generator based on the TT800 algorithm 2007 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS, PROCEEDINGS, VOLS 1 AND 2 529-532 2007 Abstract: Many computer simulations require large quantities of uncorrelated random numbers to be generated quickly. Examples include all forms of Monte Carlo simulation, generating phase screens to simulate the effects of atmospheric turbulence and the simulation of electrical noise in sensors. A flexible way to generate random numbers of arbitrary distribution is to modify the distribution of a source of uniform random numbers. Thus it is of interest to have a fast uniform random number generator implemented in reconfigurable hardware. In this paper we present multiple hardware implementations of the TT800 algorithm. The best implementation achieved a throughput of 4.6 x 10(9) uniform random numbers per second using 24 parallel generators by making use of 253 Xilinx Virtex XC2VP70 slices. It has an area time rating of 0.05 x 10(-6) Xilinx slices x seconds per 32 bit random number. It has the lowest area time metric and only half the area requirement than the previously best published multi-port, single seed generator with at least a 2(800) period. Notes: Times Cited: 2
	DOI Vinay Sriram, David Kearney (2007) An ultra fast Kolmogorov phase screen generator suitable for parallel implementation OPTICS EXPRESS 15: 21. 13709-13714 OCT 17 2007 Abstract: Modelling phase fluctuations due to Kolmogorov turbulence is important in many areas of applied optics such as simulating adaptive optics configurations, prediction of the performance of laser designators and simulation of infrared (IR) scenes in the presence of atmospheric turbulence. The computational performance of algorithms implementing this model is an important issue because in many situations a large number of phase screens is required. For example, in IR scene simulation a different phase screen is required for each pixel in the scene, and in other situations there exists a need for many thousands of phase screens to be calculated to obtain a statistical average. Whilst there have been previous attempts to increase the computational speed of these algorithms, the computation time required for a large number of phase screens still remains an issue. In this paper, we apply linear and statistical properties to improve the performance of the previous best published algorithm by 60 times when implemented on a sequential processor in software. Because the new algorithm is now trivially parallelizable, a further 20 times speedup can easily be achieved through a parallel software or hardware implementation. (c) 2007 Optical Society of America. Notes: Times Cited: 3
2006	DOI Grant Wigley, David Kearney (2006) Performance evaluations of ReconfigME 2006 IEEE International Conference on Field Programmable Technology, Proceedings 309-312 2006 Abstract: With the development of reconfigurable computers containing FPGAs with in excess of 6 million system-gates, it is now feasible to consider the possibility of sharing the FPGA between multiple concurrently executing applications. This could potentially increase the resource usage of the expensive FPGA logic and decrease response times so users will not have to wait for the FPGA to be completely available. However, the system environment software required to support this may actually result in application performance much less than would be considered acceptable to many FPGA users. This paper involves using a prototype to evaluate the performance of such an operating system, ReconfigME. Notes: Times Cited: 0
	Vinay Sriram, David Kearney (2006) High speed high fidelity infrared scene simulation using reconfigurable computing 2006 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS, PROCEEDINGS 955-956 2006 Abstract: Notes: Times Cited: 0
	DOI David Kearney, John Hopf (2006) Hardware join Java : A unified Hardware/Software language for dynamic partial runtime reconfigurable computing applications 2006 IEEE International Conference on Field Programmable Technology, Proceedings 277-280 2006 Abstract: Reconfigurable computing is maturing rapidly as FPGAs combining hard core processors and high density logic block arrays become widely available at low cost. Application developers have been developing algorithms that cross the hardware software divide for some years but will in addition want to express the dynamic reconfiguration of FPGAs made available via an operating system for reconfigurable computing. Whilst there are many behavioural languages available for expressing reconfigurable computing applications, very few of them are comprehensive enough to address simultaneously these two requirements. In this paper we present an experimental language based on Java which aims to achieve the twin goals of a transparent hardware software interface and an integrated expression of dynamic reconfiguration. Hardware Join Java (HJJ) uses a common threading abstraction and synchronization based on the Join calculus to unify the semantics and interface between hardware and software. The language extends the dynamic class instantiation mechanism of Java (supported by the services of an operating system for reconfigurable computing) to express user initiated dynamic reconfiguration of the FPGA. In this paper we present basic syntax and semantics of HJJ and give our initial experience with the prototype compiler. Notes: Times Cited: 0
2005	Mark Jasiunas, David Kearney, Richard Bowyer (2005) Connectivity, resource integration, and high performance reconfigurable computing for autonomous UAVs 2005 IEEE Aerospace Conference, Vols 1-4 3020-3027 2005 Abstract: In an investigation into the capabilities of small autonomous formations of unmanned aerial vehicles (UAVs), we identified connectivity, processing power, and lack of resource integration as three major limiting factors of current technology. In an endeavor to address these issues, we propose a novel hardware and software environment consisting of a traditional Von Neumann processor coupled with a field programmable gate array (FPGA) for high performance processing, along with support libraries to better manage the resources of a formation. The supporting software libraries have the primary functions of allowing any networked resource (such as processors and UAV sensors) to be accessed from any location in the UAV formation, and also provide support that allows algorithms implemented simultaneously on the reconfigurable and traditional processors to migrate between UAVs for better connectivity to resources or to balance processing loads. In this paper we present the issues we faced in the design of these systems, along with our preliminary results indicating the advantages and shortcomings of the system. We also describe in detail the construction of the prototype systems used to determine the correct software settings for the mobile algorithms. Notes: Times Cited: 0