U. A. Acar, A. Charguéraud, A. Guatto, M. Rainey, and F. Sieczkowski, Heartbeat Scheduling: Provable Efficiency for Nested Parallelism, PLDI, pp.769-782, 2018.

A. H. Ashouri, W. Killian, J. Cavazos, G. Palermo, and C. Silvano, A Survey on Compiler Autotuning Using Machine Learning, ACM Comput. Surv, vol.51, 2018.

C. Augonnet, S. Thibault, R. Namyst, and P.-A. Wacrenier, StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, Concurr. Comput.: Pract. Exper, vol.23, pp.187-198, 2011.

D. H. Bailey, E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter et al., NAS Parallel Benchmarks - Summary and Preliminary Results, Supercomputing, pp.158-165, 1991.

R. Barik, N. Farooqui, B. T. Lewis, C. Hu, and T. Shpeisman, A Black-box Approach to Energy-aware Scheduling on Integrated CPU-GPU Systems, CGO, pp.70-81, 2016.

E. Barrett, C. F. Bolz-Tereick, R. Killick, S. Mount, and L. Tratt, Virtual Machine Warmup Blows Hot and Cold, Proc. ACM Program. Lang, vol.1, p.27, 2017.

T. Bessa, G. Gull, P. Quintão, M. Frank, J. Nacif et al., JetsonLEAP: A framework to measure power on a heterogeneous system-on-a-chip device, Science of Computer Programming, vol.33, pp.1-37, 2017.

C. E. Bonferroni, Teoria statistica delle classi e calcolo delle probabilità, 1936.

S. Boyd and L. Vandenberghe, Convex Optimization, 2004.

P. Butcher, Seven Concurrency Models in Seven Weeks, 2014.

H. Cai, Q. Cao, F. Sheng, M. Zhang, C. Qi et al., Montgolfier: Latency-aware power management system for heterogeneous servers, pp.1-8, 2016.

T. Cao, S. M. Blackburn, T. Gao, and K. S. McKinley, The Yin and Yang of Power and Performance for Asymmetric Hardware and Managed Software, ISCA. IEEE, pp.225-236, 2012.

A. Cauchy, Méthode générale pour la résolution des systèmes d'équations simultanées, Comptes Rendus Hebd. Séances Acad. Sci, vol.25, pp.536-538, 1847.

A. Cohen, F. Finkelstein, A. Mendelson, R. Ronen, and D. Rudoy, On Estimating Optimal Performance of CPU Dynamic Thermal Management, IEEE Computer Architecture Letters, vol.2, issue.1, pp.6-6, 2003.

J. Cong and B. Yuan, Energy-efficient Scheduling on Heterogeneous Multi-core Architectures, pp.345-350, 2012.

K. D. Cooper, A. Grosul, T. J. Harvey, S. Reeves, D. Subramanian et al., ACME: Adaptive Compilation Made Efficient, LCTES, pp.69-77, 2005.

J. C. R. da Silva, F. M. Q. Pereira, M. Frank, and A. Gamatié, A Compiler-Centric Infra-Structure for Whole-Board Energy Measurement on Heterogeneous Android Systems, pp.1-8, 2018.
URL : https://hal.archives-ouvertes.fr/lirmm-01912850

F. David, G. Thomas, J. Lawall, and G. Muller, Continuously Measuring Critical Section Pressure with the Free-lunch Profiler, SIGPLAN Not, vol.49, pp.291-307, 2014.

F. de León and A. Semlyen, A simple representation of dynamic hysteresis losses in power transformers, IEEE Transactions on Power Delivery, vol.10, pp.315-321, 1995.

C. Delimitrou and C. Kozyrakis, Quasar: Resource-efficient and QoS-aware Cluster Management, ASPLOS, pp.127-144, 2014.

M. Ditty, J. Montrym, and C. M. Wittenbrink, NVIDIA's Tegra K1 system-on-chip, HCS. IEEE, pp.1-26, 2014.

A. F. Donaldson, P. Keir, and A. Lokhmotov, Compile-Time and Run-Time Issues in an Auto-Parallelisation System for the Cell BE Processor, Euro-Par Workshops, pp.163-173, 2008.

B. Donyanavard, T. Mück, S. Sarma, and N. Dutt, SPARTA: Runtime Task Allocation for Energy Efficient Heterogeneous Many-cores, CODES, vol.27, pp.1-27, 2016.

O. J. Dunn, Estimation of the Means for Dependent Variables, Annals of Mathematical Statistics, vol.29, pp.1095-1111, 1958.

R. A. Fisher, The Correlation Between Relatives on the Supposition of Mendelian Inheritance, Philosophical Transactions, vol.52, pp.399-433, 1918.

A. Garcia-Garcia, J. C. Saez, and M. Prieto, Contention-Aware Fair Scheduling for Asymmetric Single-ISA Multicore Systems, IEEE Trans. Computers, vol.67, pp.1703-1719, 2018.

M. Garland and D. B. Kirk, Understanding throughput-oriented architectures, Commun. ACM, vol.53, issue.3, p.32, 2010.

. Ribeiro,

F. Gaspar, L. Taniça, P. Tomás, A. Ilic, and L. Sousa, A Framework for Application-Guided Task Management on Heterogeneous Embedded Systems, ACM Trans. Archit. Code Optim, vol.12, p.25, 2015.

P. Greenhalgh, big.LITTLE processing with ARM Cortex-A15 & Cortex-A7, 2011.

U. Gupta, C. A. Patil, G. Bhat, P. Mishra, and U. Y. Ogras, DyPO: Dynamic Pareto-Optimal Configuration Selection for Heterogeneous MpSoCs, ACM Trans. Embed. Comput. Syst, vol.16, p.5, 2017.

M. Hähnel and H. Härtig, Heterogeneity by the Numbers: A Study of the ODROID XU+E big.LITTLE Platform, HotPower. USENIX Association, pp.3-3, 2014.

A. Jain, M. A. Laurenzano, L. Tang, and J. Mars, Continuous shape shifting: Enabling loop co-optimization via near-free dynamic code rewriting, pp.1-12, 2016.

B. Jeff, big.LITTLE Technology Moves Towards Fully Heterogeneous Global Task Scheduling, 2013.

J. A. Joao, M. A. Suleman, O. Mutlu, and Y. N. Patt, Bottleneck Identification and Scheduling in Multithreaded Applications, ASPLOS, pp.223-234, 2012.

A. Jundt, A. Cauble-Chantrenne, A. Tiwari, J. Peraza, M. A. Laurenzano et al., Compute Bottlenecks on the New 64-bit ARM, E2SC, vol.6, pp.1-6, 2015.

M. Kambadur and M. A. Kim, An experimental survey of energy management across the stack, OOPSLA, pp.329-344, 2014.

M. Kim, S. K. Seo, and S. W. Chung, Looking into heterogeneity: when simple is faster, 2014.

J. Krishna and R. Nasre, Optimizing Graph Algorithms in Asymmetric Multicore Processors, Trans. on CAD of Integrated Circuits and Systems, vol.37, pp.2673-2684, 2018.

R. Kumar, D. M. Tullsen, N. P. Jouppi, and P. Ranganathan, Heterogeneous Chip Multiprocessors, Computer, vol.38, pp.32-38, 2005.

C. Luk, S. Hong, and H. Kim, Qilin: Exploiting Parallelism on Heterogeneous Multiprocessors with Adaptive Mapping, pp.45-55, 2009.

A. Lukefahr, S. Padmanabha, R. Das, F. M. Sleiman, R. G. Dreslinski et al., Exploring Fine-Grained Heterogeneity with Composite Cores, Transactions on Computers, vol.65, pp.535-547, 2016.

G. Mendonça, B. Guimarães, P. Alves, M. Pereira, G. Araújo et al., DawnCC: Automatic Annotation for Data Parallelism and Offloading, Transactions on Architecture and Code Optimization, vol.14, p.25, 2017.

X. Meng, J. Bradley, B. Yavuz, E. Sparks, S. Venkataraman et al., MLlib: Machine Learning in Apache Spark, Journal of Machine Learning Research, vol.17, pp.1235-1241, 2016.

N. Mishra, C. Imes, J. D. Lafferty, and H. Hoffmann, CALOREE: Learning Control for Predictable Latency and Low Energy, ASPLOS, pp.184-198, 2018.

S. Mittal, A Survey of Techniques for Architecting and Managing Asymmetric Multicore Processors, ACM Comput. Surv, vol.48, 2016.

S. Mittal and J. S. Vetter, A Survey of CPU-GPU Heterogeneous Computing Techniques, ACM Comput. Surv, vol.47, p.35, 2015.

J. Nickolls and W. J. Dally, The GPU Computing Era, IEEE Micro, vol.30, pp.56-69, 2010.

P. Nie and Z. Duan, Efficient and Scalable Scheduling for Performance Heterogeneous Multicore Systems, J. Parallel Distrib. Comput, vol.72, pp.353-361, 2012.

R. Nishtala, P. M. Carpenter, V. Petrucci, and X. Martorell, Hipster: Hybrid Task Manager for Latency-Critical Cloud Workloads, pp.409-420, 2017.

A.-C. Orgerie, M. Dias de Assunção, and L. Lefèvre, A Survey on Techniques for Improving the Energy Efficiency of Large-scale Distributed Systems, ACM Comput. Surv, vol.46, p.31, 2014.

J. Park, S. Park, and W. Baek, RPPC: A Holistic Runtime System for Maximizing Performance Under Power Capping, CCGRID. IEEE, pp.41-50, 2018.

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion et al., Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research, vol.12, pp.2825-2830, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00650905

V. Petrucci, O. Loques, D. Mossé, R. Melhem, N. Abou-Gazala et al., Energy-Efficient Thread Assignment Optimization for Heterogeneous Multicore Systems, ACM Trans. Embed. Comput. Syst, vol.14, p.26, 2015.

Scheduling in Het. Archs. via Multivariate Linear Regression on Function Inputs, Proc. ACM Program. Lang, vol.1, p.33, 2017.

G. Piccoli, H. N. Santos, R. E. Rodrigues, C. Pousa, E. Borin, and F. M. Q. Pereira, Compiler Support for Selective Page Migration in NUMA Architectures, PACT, pp.369-380, 2014.

G. Pinto, F. Castor, and Y. D. Liu, Understanding Energy Behaviors of Thread Management Constructs, OOPSLA, pp.345-360, 2014.

J. C. Piquette, E. A. McLaughlin, W. Ren, and B. K. Mukherjee, Generalization of a model of hysteresis for dynamical systems, The Journal of the Acoustical Society of America, vol.111, pp.2671-2674, 2002.

G. Poesia, B. C. F. Guimarães, F. Ferracioli, and F. M. Q. Pereira, Static placement of computation on heterogeneous devices, PACMPL, vol.1, p.28, 2017.

A. Prokopec, A. Rosà, D. Leopoldseder, G. Duboscq, P. Tůma et al., Renaissance: Benchmarking Suite for Parallel Applications on the JVM, PLDI, pp.31-47, 2019.

K. K. Rangan, G.-Y. Wei, and D. Brooks, Thread Motion: Fine-grained Power Management for Multi-core Systems, ISCA, pp.302-313, 2009.

C. J. Rossbach, Y. Yu, J. Currey, J.-P. Martin, and D. Fetterly, Dandelion: A Compiler and Runtime for Heterogeneous Systems, SOSP, pp.49-68, 2013.

S. Seabold and J. Perktold, Statsmodels: Econometric and Statistical Modeling with Python, SciPy.org, vol.57, p.61, 2010.

G. Semeraro, G. Magklis, R. Balasubramonian, D. H. Albonesi, S. Dwarkadas et al., Energy-Efficient Processor Design Using Multiple Clock Domains with Dynamic Voltage and Frequency Scaling, HPCA. IEEE, p.29, 2002.

D. Shelepov, J. C. Saez Alcaide, S. Jeffery, A. Fedorova, N. Perez et al., HASS: A Scheduler for Heterogeneous Multicore Systems, SIGOPS Oper. Syst. Rev, vol.43, pp.66-75, 2009.

J. Shun, G. E. Blelloch, J. T. Fineman, P. B. Gibbons, A. Kyrola et al., Brief Announcement: The Problem Based Benchmark Suite, SPAA, 2012.

T. Somu Muthukaruppan, A. Pathania, and T. Mitra, Price Theory Based Power Management for Heterogeneous Multi-cores, ASPLOS, pp.161-176, 2014.

T. Sorensen, H. Evrard, and A. F. Donaldson, GPU Schedulers: How Fair Is Fair Enough?, CONCUR. Schloss Dagstuhl, Leibniz-Zentrum fuer Informatik, vol.23, p.17, 2018.

J. K. V. Sreelatha, S. Balachandran, and R. Nasre, CHOAMP: Cost Based Hardware Optimization for Asymmetric Multicore Processors, Trans. Multi-Scale Computing Systems, vol.4, pp.163-176, 2018.

L. Tang, J. Mars, W. Wang, T. Dey, and M. L. Soffa, ReQoS: Reactive Static/Dynamic Compilation for QoS in Warehouse Scale Computers, ASPLOS, pp.89-100, 2013.

Twitter, Open-Source Twitter Finagle Repository at GitHub, 2019.

S. Tzilis, P. Trancoso, and I. Sourdis, Energy-Efficient Runtime Management of Heterogeneous Multicores using Online Projection, TACO, vol.15, p.26, 2019.

R. Vallée-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam et al., Soot - a Java Bytecode Optimization Framework, CASCON. IBM Press, p.13, 1999.

K. Van Craeynest, A. Jaleel, L. Eeckhout, P. Narvaez, and J. Emer, Scheduling Heterogeneous Multi-cores Through Performance Impact Estimation (PIE), ISCA. IEEE Computer Society, pp.213-224, 2012.

Z. Wang and M. F. P. O'Boyle, Machine Learning in Compiler Optimization, Proc. IEEE, vol.106, pp.1879-1901, 2018.

F. Wilhelmstötter, Open-Source Java Jenetics Repository at GitHub. https://github.com/jenetics/jenetics, 2019.

A. Yazdanbakhsh, J. Park, H. Sharma, P. Lotfi-Kamran, and H. Esmaeilzadeh, Neural acceleration for GPU throughput processors, pp.482-493, 2015.

H. Zhang and H. Hoffmann, Maximizing Performance Under a Power Cap: A Comparison of Hardware, Software, and Hybrid Techniques, ASPLOS, pp.545-559, 2016.