%0 Conference Paper
%T Towards Privacy Requirements for Collaborative Development of AI Applications
%C Karlskrona
%U http://urn.kb.se/resolve?urn=urn:nbn:se:bth-16446
%X The use of data is essential for the capabilities of Data-driven Artificial Intelligence (AI), Deep Learning and Big Data analysis techniques. The use of data, however, raises intrinsically the co ...
%G eng
%B 14th Swedish National Computer Networking Workshop (SNCNW), 2018
%A Ahmadi Mehri, Vida
%A Ilie, Dragos
%A Tutschku, Kurt
%D 2018
%0 Conference Paper
%T Privacy and trust in cloud-based marketplaces for AI and data resources
%I Springer New York LLC
%P 223-225
%U http://urn.kb.se/resolve?urn=urn:nbn:se:bth-14841
%G eng
%A Ahmadi Mehri, Vida
%A Tutschku, Kurt
%D 2017
%0 Conference Paper
%T Pricing of Data Products in Data Marketplaces
%S Lecture Notes in Business Information Processing
%I Springer, Cham
%P 49-66
%W http://www.diva-portal.se/smash/get/diva2:1163530/FULLTEXT01.pdf
%@ 978-3-319-69190-9 978-3-319-69191-6
%U https://link.springer.com/chapter/10.1007/978-3-319-69191-6_4
%X Mobile computing and the Internet of Things promises massive amounts of data for big data analytic and machine learning. A data sharing economy is needed to make that data available for companies that wish to develop smart systems and services. While digital markets for trading data are emerging, there is no consolidated understanding of how to price data products and thus offer data vendors incentives for sharing data. This paper uses a combined keyword search and snowballing approach to systematically review the literature on the pricing of data products that are to be offered on marketplaces. The results give insights into the maturity and character of data pricing. They enable practitioners to select a pricing approach suitable for their situation and researchers to extend and mature data pricing as a topic.
%G en
%B Software Business
%A Fricker, Samuel A.
%A Maksimov, Yuliyan V.
%D 2017-06-12
%0 Journal Article
%T Three Factors Influencing Minima in SGD
%W http://arxiv.org/abs/1711.04623
%U http://arxiv.org/abs/1711.04623
%X We investigate the dynamical and convergent properties of stochastic gradient descent (SGD) applied to Deep Neural Networks (DNNs). Characterizing the relation between learning rate, batch size and the properties of the final minima, such as width or generalization, remains an open question. In order to tackle this problem we investigate the previously proposed approximation of SGD by a stochastic differential equation (SDE). We theoretically argue that three factors - learning rate, batch size and gradient covariance - influence the minima found by SGD. In particular we find that the ratio of learning rate to batch size is a key determinant of SGD dynamics and of the width of the final minima, and that higher values of the ratio lead to wider minima and often better generalization. We confirm these findings experimentally. Further, we include experiments which show that learning rate schedules can be replaced with batch size schedules and that the ratio of learning rate to batch size is an important factor influencing the memorization process.
%J arXiv:1711.04623 [cs, stat]
%A Jastrzębski, Stanisław
%A Kenton, Zachary
%A Arpit, Devansh
%A Ballas, Nicolas
%A Fischer, Asja
%A Bengio, Yoshua
%A Storkey, Amos
%D 2017-11-13
%K Computer Science - Artificial Intelligence
Computer Science - Computer Vision and Pattern Recognition
Computer Science - Machine Learning
Statistics - Machine Learning
%0 Journal Article
%T Low-memory GEMM-based convolution algorithms for deep neural networks
%W http://arxiv.org/abs/1709.03395
%U http://arxiv.org/abs/1709.03395
%X Deep neural networks (DNNs) require very large amounts of computation both for training and for inference when deployed in the field. A common approach to implementing DNNs is to recast the most computationally expensive operations as general matrix multiplication (GEMM). However, as we demonstrate in this paper, there are a great many different ways to express DNN convolution operations using GEMM. Although different approaches all perform the same number of operations, the size of temporary data structures differs significantly. Convolution of an input matrix with dimensions $C \times H \times W$, requires $O(K^2CHW)$ additional space using the classical im2col approach. More recently memory-efficient approaches requiring just $O(KCHW)$ auxiliary space have been proposed. We present two novel GEMM-based algorithms that require just $O(MHW)$ and $O(KW)$ additional space respectively, where $M$ is the number of channels in the result of the convolution. These algorithms dramatically reduce the space overhead of DNN convolution, making it much more suitable for memory-limited embedded systems. Experimental evaluation shows that our low-memory algorithms are just as fast as the best patch-building approaches despite requiring just a fraction of the amount of additional memory. Our low-memory algorithms have excellent data locality which gives them a further edge over patch-building algorithms when multiple cores are used. As a result, our low memory algorithms often outperform the best patch-building algorithms using multiple threads.
%J arXiv:1709.03395 [cs]
%A Anderson, Andrew
%A Vasudevan, Aravind
%A Keane, Cormac
%A Gregg, David
%D 2017-09-08
%K Computer Science - Computer Vision and Pattern Recognition
%0 Conference Paper
%T Optimal DNN primitive selection with partitioned boolean quadratic programming
%C Vienna, Austria
%I ACM Press
%P 340-351
%W http://arxiv.org/abs/1710.01079
%@ 978-1-4503-5617-6
%U http://dl.acm.org/citation.cfm?doid=3168805
%X Deep Neural Networks (DNNs) require very large amounts of computation both for training and for inference when deployed in the field. Many different algorithms have been proposed to implement the most computationally expensive layers of DNNs. Further, each of these algorithms has a large number of variants, which offer different trade-offs of parallelism, data locality, memory footprint, and execution time. In addition, specific algorithms operate much more efficiently on specialized data layouts and formats. We state the problem of optimal primitive selection in the presence of data format transformations, and show that it is NP-hard by demonstrating an embedding in the Partitioned Boolean Quadratic Assignment problem (PBQP). We propose an analytic solution via a PBQP solver, and evaluate our approach experimentally by optimizing several popular DNNs using a library of more than 70 DNN primitives, on an embedded platform and a general purpose platform. We show experimentally that significant gains are possible versus the state of the art vendor libraries by using a principled analytic solution to the problem of layout selection in the presence of data format transformations.
%G en
%B Proceedings of the 2018 International Symposium on Code Generation and Optimization - CGO 2018
%A Anderson, Andrew
%A Gregg, David
%D 2018
%0 Conference Paper
%T Flexible Privacy and High Trust in the Next Generation Internet : The Use Case of a Cloud-based Marketplace for AI
%I Halmstad university
%U http://urn.kb.se/resolve?urn=urn:nbn:se:bth-14963
%X Cloudified architectures facilitate resource access and sharing which is independent from physical locations. They permit high availability of resources at low operational costs. These advantages, ...
%G eng
%B DIVA
%A Ahmadi Mehri, Vida
%A Tutschku, Kurt
%D 2017
%0 Conference Paper
%T Parallel Multi Channel convolution using General Matrix Multiplication
%I IEEE
%P 19-24
%W http://arxiv.org/abs/1704.04428
%@ 978-1-5090-4825-0
%U http://ieeexplore.ieee.org/document/7995254/
%A Vasudevan, Aravind
%A Anderson, Andrew
%A Gregg, David
%D 2017-07
%0 Conference Paper
%T Accelerating Deep Neural Networks on Low Power Heterogeneous Architectures
%W https://www.semanticscholar.org/paper/Accelerating-Deep-Neural-Networks-on-Low-Power-Loukadakis-Cano/1b4ffd5eee8f49b3bfa059b4f54d6960247d2aa8
%X Deep learning applications are able to recognise images and speech with great accuracy, and their use is now everywhere in our daily lives. However, developing deep learning architectures such as deep neural networks in embedded systems is a challenging task because of the demanding computational resources and power consumption. Hence, sophisticated algorithms and methods that exploit the hardware of the embedded systems need to be investigated. This paper is our first step towards examining methods and optimisations for deep neural networks that can leverage the hardware architecture of low power embedded devices. In particular, in this work we accelerate the inference time of the VGG-16 neural network on the ODROID-XU4 board. More specifically, a serial version of VGG-16 is parallelised for both the CPU and GPU present on the board using OpenMP and OpenCL. We also investigate several optimisation techniques that exploit the specific hardware architecture of the ODROID board and can accelerate the inference further. One of these optimisations uses the CLBlast library specifically tuned for the ARM Mali-T628 GPU present on the board. Overall, we improve the inference time of the initial serial version of the code by 2.8X using OpenMP, and by 9.4X using the most optimised version of OpenCL.
%B 11th International Workshop on Programmability and Architectures for Heterogeneous Multicores (MULTIPROG-2018)
%A Loukadakis, Manolis
%A Cano, Jose
%A O'Boyle, Michael
%D 2018-01
%K Deep Neural Networks
Heterogeneous architectures
Low power embedded systems
%0 Journal Article
%T Moonshine: Distilling with Cheap Convolutions
%W http://arxiv.org/abs/1711.02613
%U http://arxiv.org/abs/1711.02613
%X Model distillation compresses a trained machine learning model, such as a neural network, into a smaller alternative such that it could be easily deployed in a resource limited setting. Unfortunately, this requires engineering two architectures: a student architecture smaller than the first teacher architecture but trained to emulate it. In this paper, we present a distillation strategy that produces a student architecture that is a simple transformation of the teacher architecture. Recent model distillation methods allow us to preserve most of the performance of the trained model after replacing convolutional blocks with a cheap alternative. In addition, distillation by attention transfer provides student network performance that is better than training that student architecture directly on data.
%J arXiv:1711.02613 [cs, stat]
%A Crowley, Elliot J.
%A Gray, Gavin
%A Storkey, Amos
%D 2017-11-07
%K Computer Science - Computer Vision and Pattern Recognition
Computer Science - Learning
Statistics - Machine Learning
%0 Conference Paper
%T BONSEYES: Platform for Open Development of Systems of Artificial Intelligence: Invited Paper
%S CF'17
%C New York, NY, USA
%I ACM
%P 299-304
%@ 978-1-4503-4487-6
%U http://doi.acm.org/10.1145/3075564.3076259
%X The Bonseyes EU H2020 collaborative project aims to develop a platform consisting of a Data Marketplace, a Deep Learning Toolbox, and Developer Reference Platforms for organizations wanting to adopt Artificial Intelligence. The project will be focused on using artificial intelligence in low power Internet of Things (IoT) devices ("edge computing"), embedded computing systems, and data center servers ("cloud computing"). It will bring about orders of magnitude improvements in efficiency, performance, reliability, security, and productivity in the design and programming of systems of artificial intelligence that incorporate Smart Cyber-Physical Systems (CPS). In addition, it will solve a causality problem for organizations who lack access to Data and Models. Its open software architecture will facilitate adoption of the whole concept on a wider scale. To evaluate the effectiveness, technical feasibility, and to quantify the real-world improvements in efficiency, security, performance, effort and cost of adding AI to products and services using the Bonseyes platform, four complementary demonstrators will be built. Bonseyes platform capabilities are aimed at being aligned with the European FI-PPP activities and take advantage of its flagship project FIWARE. This paper provides a description of the project motivation, goals and preliminary work.
%B Proceedings of the Computing Frontiers Conference
%A Llewellynn, Tim
%A Fernández-Carrobles, M. Milagro
%A Deniz, Oscar
%A Fricker, Samuel
%A Storkey, Amos
%A Pazos, Nuria
%A Velikic, Gordana
%A Leufgen, Kirsten
%A Dahyot, Rozenn
%A Koller, Sebastian
%A Goumas, Georgios
%A Leitner, Peter
%A Dasika, Ganesh
%A Wang, Lei
%A Tutschku, Kurt
%D 2017
%K Data marketplace
Deep Learning
Internet of things
Smart Cyber-Physical Systems
%0 Conference Paper
%T Performance Analysis and Optimization of Sparse Matrix-Vector Multiplication on Modern Multi- and Many-Core Processors
%I IEEE
%P 292-301
%W https://arxiv.org/abs/1711.05487
%@ 978-1-5386-1042-8
%U http://ieeexplore.ieee.org/document/8025303/
%A Elafrou, Athena
%A Goumas, Georgios
%A Koziris, Nectarios
%D 2017-08