GPU-based parallel implementation of a growing self-organizing network
PARIGI, GIACOMO; PIASTRA, MARCO
2012-01-01
Abstract
Self-organizing systems are characterized by inherently local behavior, as their configuration is almost exclusively determined by the union of the states of the units composing the system. Moreover, all state changes are mutually independent and governed by the same laws. In this work we study the parallel implementation of a specific subset of this broader family, namely growing self-organizing networks, on parallel computing hardware based on Graphics Processing Units (GPUs), which are gaining popularity due to their favourable cost/performance ratio. To do so, we first define a new version of the standard sequential algorithm in which the intrinsic parallelism of the execution is made more explicit. We then perform comparative experiments against the standard algorithm and an optimized variant of it, in which a hash index is used for speed. Our experiments demonstrate that the parallel version outperforms both variants of the sequential algorithm, but they also reveal a few interesting differences in the overall behavior of the system that might be relevant for further investigation. Copyright © 2012 SciTePress.
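The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the kind of independence it refers to, not the authors' implementation. It assumes that a typical step in a growing self-organizing network is finding, for each input signal, the nearest and second-nearest unit. The names `two_nearest` and `find_winners`, the 2-D unit positions, and the sample signals are all hypothetical.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def two_nearest(units, signal):
    """Return the indices of the nearest and second-nearest unit to `signal`."""
    ranked = sorted(range(len(units)), key=lambda i: dist(units[i], signal))
    return ranked[0], ranked[1]


def find_winners(units, signals):
    """Find the two nearest units for every signal in a batch.

    Each signal is processed without reference to any other signal, so on a
    parallel device this loop could in principle be spread over threads,
    one signal per thread (an assumption made for illustration only).
    """
    return [two_nearest(units, s) for s in signals]


# Hypothetical data: four units on the corners of the unit square.
units = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
signals = [(0.1, 0.2), (0.9, 0.8)]
winners = find_winners(units, signals)
print(winners)
```

The brute-force search above scans every unit for every signal. A sequential variant could avoid that scan with a spatial index such as the hash index mentioned in the abstract, whereas a parallel variant would rely on processing many signals at once.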